Practical Real-Time Data Processing and Analytics PDF Download
Reposted from: https://download.csdn.net/psearch/0/10/0/2/1/Practical%20Real%20time%20Data%20Processing%20and%20Analytics
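Download notes:
The electronic edition is for preview only; please delete it within 24 hours of downloading. Support legitimate editions: if you like the book, please buy a genuine copy: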
http://e.dangdang.com/products/1900756734.html
Description:
A practical guide to help you tackle different real-time data processing and analytics problems using the best tools for each scenario

About This Book
• Learn about the various challenges in real-time data processing and use the right tools to overcome them
• This book covers popular tools and frameworks such as Spark, Flink, and Apache Storm to solve all your distributed processing problems
• A practical guide filled with examples, tips, and tricks to help you perform efficient Big Data processing in real-time

Who This Book Is For
If you are a Java developer who would like to be equipped with all the tools required to devise an end-to-end practical solution on real-time data streaming, then this book is for you. Basic knowledge of real-time processing would be helpful, and knowing the fundamentals of Maven, Shell, and Eclipse would be great.

What You Will Learn
• Get an introduction to the established real-time stack
• Understand the key integration of all the components
• Get a thorough understanding of the basic building blocks for real-time solution designing
• Garnish the search and visualization aspects for your real-time solution
• Get conceptually and practically acquainted with real-time analytics
• Be well equipped to apply the knowledge and create your own solutions

In Detail
With the rise of Big Data, there is an increasing need to process large amounts of data continuously, with a shorter turnaround time. Real-time data processing involves continuous input, processing and output of data, with the condition that the time required for processing is as short as possible. This book covers the majority of the existing and evolving open source technology stack for real-time processing and analytics. You will get to know about all the real-time solution aspects, from the source to the presentation to persistence. Through this practical book, you’ll be equipped with a clear understanding of how to solve challenges on your own.

We’ll cover topics such as how to set up components, basic executions, integrations, advanced use cases, alerts, and monitoring. You’ll be exposed to the popular tools used in real-time processing today such as Apache Spark, Apache Flink, and Storm. Finally, you will put your knowledge to practical use by implementing all of the techniques in the form of a practical, real-world use case. By the end of this book, you will have a solid understanding of all the aspects of real-time data processing and analytics, and will know how to deploy the solutions in production environments in the best possible manner.

Style and Approach
In this practical guide to real-time analytics, each chapter begins with a basic high-level concept of the topic, followed by a practical, hands-on implementation of each concept, where you can see the working and execution of it. The book is written in a DIY style, with plenty of practical use cases, well-explained code examples, and relevant screenshots and diagrams.
Table of contents:
Front matter
    Title Page; Copyright; Practical Real-Time Data Processing and Analytics; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Why subscribe?; Customer Feedback

Preface
    What this book covers; What you need for this book; Who this book is for; Conventions; Reader feedback; Customer support; Downloading the example code; Errata; Piracy; Questions

Introducing Real-Time Analytics
    What is big data?; Big data infrastructure; Real–time analytics – the myth and the reality; Near real–time solution – an architecture that works; NRT – The Storm solution; NRT – The Spark solution; Lambda architecture – analytics possibilities; IOT – thoughts and possibilities; Edge analytics; Cloud – considerations for NRT and IOT; Summary

Real Time Applications – The Basic Ingredients
    The NRT system and its building blocks; Data collection; Stream processing; Analytical layer – serve it to the end user; NRT – high-level system view; NRT – technology view; Event producer; Collection; Broker; Transformation and processing; Storage; Summary

Understanding and Tailing Data Streams
    Understanding data streams; Setting up infrastructure for data ingestion; Apache Kafka; Apache NiFi; Logstash; Fluentd; Flume; Taping data from source to the processor - expectations and caveats; Comparing and choosing what works best for your use case; Do it yourself; Setting up Elasticsearch; Summary

Setting up the Infrastructure for Storm
    Overview of Storm; Storm architecture and its components; Characteristics; Components; Stream grouping; Setting up and configuring Storm; Setting up Zookeeper; Installing; Configuring; Standalone; Cluster; Running; Setting up Apache Storm; Installing; Configuring; Running; Real-time processing job on Storm; Running job; Local; Cluster; Summary

Configuring Apache Spark and Flink
    Setting up and a quick execution of Spark; Building from source; Downloading Spark; Running an example; Setting up and a quick execution of Flink; Build Flink source; Download Flink; Running example; Setting up and a quick execution of Apache Beam; Beam model; Running example; MinimalWordCount example walk through; Balancing in Apache Beam; Summary

Integrating Storm with a Data Source
    RabbitMQ – messaging that works; RabbitMQ exchanges; Direct exchanges; Fanout exchanges; Topic exchanges; Headers exchanges; RabbitMQ setup; RabbitMQ — publish and subscribe; RabbitMQ – integration with Storm; AMQPSpout; PubNub data stream publisher; String together Storm-RMQ-PubNub sensor data topology; Summary

From Storm to Sink
    Setting up and configuring Cassandra; Setting up Cassandra; Configuring Cassandra; Storm and Cassandra topology; Storm and IMDB integration for dimensional data; Integrating the presentation layer with Storm; Setting up Grafana with the Elasticsearch plugin; Downloading Grafana; Configuring Grafana; Installing the Elasticsearch plugin in Grafana; Running Grafana; Adding the Elasticsearch datasource in Grafana; Writing code; Executing code; Visualizing the output on Grafana; Do It Yourself; Summary

Storm Trident
    State retention and the need for Trident; Transactional spout; Opaque transactional Spout; Basic Storm Trident topology; Trident internals; Trident operations; Functions; map and flatMap; peek; Filters; Windowing; Tumbling window; Sliding window; Aggregation; Aggregate; Partition aggregate; Persistence aggregate; Combiner aggregator; Reducer aggregator; Aggregator; Grouping; Merge and joins; DRPC; Do It Yourself; Summary

Working with Spark
    Spark overview; Spark framework and schedulers; Distinct advantages of Spark; When to avoid using Spark; Spark – use cases; Spark architecture - working inside the engine; Spark pragmatic concepts; RDD – the name says it all; Spark 2.x – advent of data frames and datasets; Summary

Working with Spark Operations
    Spark – packaging and API; RDD pragmatic exploration; Transformations; Actions; Shared variables – broadcast variables and accumulators; Broadcast variables; Accumulators; Summary

Spark Streaming
    Spark Streaming concepts; Spark Streaming - introduction and architecture; Packaging structure of Spark Streaming; Spark Streaming APIs; Spark Streaming operations; Connecting Kafka to Spark Streaming; Summary

Working with Apache Flink
    Flink architecture and execution engine; Flink basic components and processes; Integration of source stream to Flink; Integration with Apache Kafka; Example; Integration with RabbitMQ; Running example; Flink processing and computation; DataStream API; DataSet API; Flink persistence; Integration with Cassandra; Running example; FlinkCEP; Pattern API; Detecting pattern; Selecting from patterns; Example; Gelly; Gelly API; Graph representation; Graph creation; Graph transformations; DIY; Summary

Case Study
    Introduction; Data modeling; Tools and frameworks; Setting up the infrastructure; Implementing the case study; Building the data simulator; Hazelcast loader; Building Storm topology; Parser bolt; Check distance and alert bolt; Generate alert Bolt; Elasticsearch Bolt; Complete Topology; Running the case study; Load Hazelcast; Generate Vehicle static value; Deploy topology; Start simulator; Visualization using Kibana; Summary