Getting started with real-time
analytics with Kafka and
Spark in Microsoft Azure
Joe Plumb
Cloud Solution Architect – Microsoft UK
@joe_plumb
Alternative title: Everything I
know about real time analytics
in Microsoft Azure
Joe Plumb
Cloud Solution Architect – Microsoft UK
@joe_plumb
Agenda
• Fundamentals of streaming data
• What streaming data can be useful for
• What options are there to use data streams in Microsoft Azure?
• Demo
• Q&A
Streaming 101
What is streaming data?
• “Streaming data is data that is continuously generated by different sources.” https://en.wikipedia.org/wiki/Streaming_data
• Streaming system: a type of data processing engine that is designed with infinite datasets in mind. https://learning.oreilly.com/library/view/streaming-systems/9781491983867/ch01.html
Why bother?
• Batch processing can give great insights into things that happened in
the past, but it lacks the ability to answer the question of "what is
happening right now?”
• “Data is valuable only when there is an easy way to process and get
timely insights from data sources.”
Where is streaming data?
• Clickstream data (web clicks, advertising, application usage tracking)
• Sensors (environment monitoring, GPS)
• Smart machinery (e.g. production lines)
• ….
What is it good for?
• Website monitoring
• Network monitoring
• Fraud detection
• Recommendations
Streaming System architecture
Source: https://docs.microsoft.com/en-us/azure/architecture/data-guide/big-data/real-time-processing
Event vs Message
• It could be argued this is an issue of semantics, as they ‘look’ the same (e.g. a JSON object, CSV, etc.)
• Message is a catch-all term, as messages are just bundles of data
• Event message is a type of message
“When a subject has an event to announce, it will create an event object, wrap it in a message, and send it on a channel.”
https://www.enterpriseintegrationpatterns.com/patterns/messaging/EventMessage.html
It’s all about time
Cardinality is important because the unbounded nature of infinite datasets imposes additional burdens on data processing frameworks that consume them.
We need ways to reason about time
It’s all about time: Event time vs Processing time
• Event time: the time the event occurs
• Processing time: the time the system becomes aware of the event
• In an ideal world, the processing system receives the event when it happens.
• In reality, the skew between an event happening and the system processing that event can vary wildly
It’s all about time: Event time vs Processing time
• In an ideal world, the processing system receives the event when it happens.
• In reality, the skew between an event happening and the system processing that event can vary wildly
• Processing-time lag is the delay between the time an event occurs and the time the system processes it
• Event-time skew is how far behind the ideal (event time equals processing time) the processing pipeline is at that moment
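The lag idea above can be sketched in a few lines of Python; the event records and time units here are illustrative:

```python
def processing_time_lag(event_time, processing_time):
    # Delay between when the event occurred and when the system
    # actually processed it (same time unit for both, e.g. seconds).
    return processing_time - event_time

def max_lag(events):
    # Worst-case lag across a batch of (event_time, processing_time) pairs.
    return max(processing_time_lag(e, p) for e, p in events)

# A pipeline that sees three events 2s, 5s, and 1s after they occurred:
observed = [(100, 102), (103, 108), (110, 111)]
```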
It’s all about time: Watermarking
• An event time marker that indicates all events up to “a point” have
been fed to the streaming processor. By the nature of streams, the
incoming event data never stops, so watermarks indicate the progress
to a certain point in the stream.
• Watermarks can either be a strict guarantee (perfect watermark) or
an educated guess (heuristic watermark)
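A heuristic watermark can be sketched in plain Python; the allowed-lateness bound is an assumption you would tune per pipeline:

```python
def heuristic_watermark(observed_event_times, allowed_lateness):
    # Educated guess: assume no event will arrive more than
    # `allowed_lateness` time units behind the newest one seen so far.
    return max(observed_event_times) - allowed_lateness

def is_late(event_time, watermark):
    # Events behind the watermark are "late" and may be dropped or
    # routed to special handling by the streaming processor.
    return event_time < watermark
```

A perfect watermark would instead require complete knowledge of the input, which is rarely available in practice.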
It’s all about time: Windowing
Tumbling windows
Hopping windows
Sliding windows
Session windows
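To illustrate the difference between tumbling and hopping windows, here is a minimal Python sketch that assigns integer event times to windows (the window sizes are illustrative):

```python
from collections import defaultdict

def tumbling_counts(event_times, size):
    # Tumbling: fixed, non-overlapping windows; each event falls in
    # exactly one window, keyed by its start time.
    counts = defaultdict(int)
    for t in event_times:
        counts[(t // size) * size] += 1
    return dict(counts)

def hopping_counts(event_times, size, hop):
    # Hopping: windows of length `size` starting every `hop` units;
    # when hop < size the windows overlap, so one event can land in
    # several windows.
    counts = defaultdict(int)
    for t in event_times:
        start = (t // hop) * hop
        while start > t - size:
            if start >= 0:
                counts[start] += 1
            start -= hop
    return dict(counts)
```

A sliding window is the limiting case where a new window starts at every event; a session window instead closes after a gap of inactivity.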
It’s not just about time: Triggers
• Triggers determine when processing of the accumulated data is started.
• Repeated update triggers
• These periodically generate updated panes for a window as its contents
evolve.
• Completeness triggers
• These materialize a pane for a window only after the input for that window is
believed to be complete to some threshold
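A repeated-update trigger can be sketched by re-emitting early panes for a window as events accumulate; the count-based firing condition here is an illustrative choice (time-based firing is equally common):

```python
from collections import defaultdict

def repeated_update_panes(events, window_size, fire_every):
    # Maintain running per-window counts and emit a snapshot ("pane")
    # every `fire_every` events, instead of waiting for the window's
    # input to be believed complete.
    counts = defaultdict(int)
    panes = []
    for i, event_time in enumerate(events, start=1):
        counts[(event_time // window_size) * window_size] += 1
        if i % fire_every == 0:
            panes.append(dict(counts))
    return panes
```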
Delivery Guarantees
• At-most-once
• means that for each message handed to the mechanism, that message is
delivered zero or one times; in more casual terms it means that messages
may be lost.
• At-least-once
• means that for each message handed to the mechanism potentially multiple
attempts are made at delivering it, such that at least one succeeds; again, in
more casual terms this means that messages may be duplicated but not lost.
• Exactly-once
• means that for each message handed to the mechanism exactly one delivery
is made to the recipient; the message can neither be lost nor duplicated.
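In practice, exactly-once processing is often built on top of an at-least-once channel by deduplicating on a message id, as in this sketch (the id scheme and in-memory storage are assumptions):

```python
class DedupConsumer:
    """Make at-least-once delivery effectively-once by remembering
    which message ids have already been handled."""

    def __init__(self):
        self.seen_ids = set()
        self.processed = []

    def handle(self, message_id, payload):
        if message_id in self.seen_ids:
            return False  # redelivered duplicate: ignore it
        self.seen_ids.add(message_id)
        self.processed.append(payload)
        return True
```

A real consumer would keep the seen-id set in durable, bounded storage so deduplication survives restarts.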
Streaming + Batch?
• Lambda architecture
• Increasingly viewed as a
workaround, due to advances in
capabilities and reliability of
streaming data systems
Source: https://en.wikipedia.org/wiki/Lambda_architecture#/media/File:Diagram_of_Lambda_Architecture_(generic).png (by Textractor, own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=34963985)
Service Options
in Azure
Event Hubs
• Fully managed platform-as-a-service (PaaS)
• Big data streaming platform and event ingestion service.
• It can receive millions of events per second. Data sent to an event hub can be
transformed and stored by using any real-time analytics provider or
batching/storage adapters.
• Wide range of use cases
• Scalable
• Event Hubs for Apache Kafka (a Kafka-compatible endpoint)
• Data can be captured automatically in either Azure Blob Storage or Azure
Data Lake Store
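Because Event Hubs exposes a Kafka-compatible endpoint, a plain Kafka client can publish to it. The sketch below builds an illustrative JSON event and shows, without running it, the kafka-python producer configuration; the event schema and the namespace/hub names are assumptions:

```python
import json
import time
import uuid

def make_event(device_id, temperature):
    # Illustrative telemetry payload; the field names are assumptions.
    return json.dumps({
        "id": str(uuid.uuid4()),
        "device_id": device_id,
        "temperature": temperature,
        "event_time": time.time(),
    }).encode("utf-8")

def send_to_event_hub(namespace, connection_string, event_hub, payload):
    # Requires the kafka-python package. The Event Hubs Kafka endpoint
    # listens on port 9093 over SASL_SSL, with "$ConnectionString" as
    # the username and the connection string itself as the password.
    from kafka import KafkaProducer
    producer = KafkaProducer(
        bootstrap_servers=f"{namespace}.servicebus.windows.net:9093",
        security_protocol="SASL_SSL",
        sasl_mechanism="PLAIN",
        sasl_plain_username="$ConnectionString",
        sasl_plain_password=connection_string,
    )
    producer.send(event_hub, payload)  # the hub name acts as the topic
    producer.flush()
```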
Stream Analytics
• Event-processing engine that allows you to examine high volumes of data streaming from devices.
• Supports extracting information from data streams and identifying patterns and relationships.
• These patterns can then trigger other actions downstream, such as creating alerts, feeding information to a reporting tool, or storing it for later use
Integration with Azure Event Hubs and IoT Hub
• Azure Stream Analytics has built-in, first-class integration with Azure Event Hubs and IoT Hub
• Data from Azure Event Hubs and Azure IoT Hub can be sources of streaming data to Azure Stream Analytics
• The connections can be established through the Azure Portal without any coding
• Azure Blob Storage is supported as a source of reference data
• Azure Stream Analytics supports compression across all data stream input sources: Event Hubs, IoT Hub, and Blob Storage
[Diagram: streaming data from Azure Event Hubs and Azure IoT Hub, plus reference data from Azure Blob Storage, flowing into Azure Stream Analytics]
Azure HDInsight
Cloud Spark and Hadoop service for the Enterprise
• Fully managed Hadoop and Spark for the cloud, with a 99.9% SLA
• 100% open-source Hortonworks Data Platform
• Clusters up and running in minutes
• Familiar BI tools, interactive open-source notebooks
• 63% lower TCO than deploying your own Hadoop on-premises*
• Scale clusters on demand
• Secure Hadoop workloads via Active Directory and Ranger
• Compliance for open-source bits
• Best-in-class monitoring and predictive operations via OMS
• Native integration with leading ISVs
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
Azure HDInsight
Apache Storm on HDInsight
Apache Storm offered as a managed service on Azure HDInsight
• Scalable: can analyse millions of events per second
• One of seven HDInsight cluster types
• Dynamically scale up and scale down
• Integrates with Event Hubs
• SLA of 99.9 percent uptime
• Develop with Visual Studio using Java or C#
Azure Databricks
• Apache Spark-based analytics platform optimized for Microsoft Azure. Designed with the founders of Apache Spark, Databricks is integrated with Azure to provide one-click setup, streamlined workflows, and an interactive workspace for analytics.
Spark Structured Streaming overview
A unified system for end-to-end, fault-tolerant, exactly-once, stateful stream processing. The simplest way to perform streaming analytics is not having to think about streaming at all!

Develop
• Unifies streaming, interactive, and batch queries; uses a single API for both static bounded data and streaming unbounded data
• Supports streaming aggregations, event-time windows, windowed grouped aggregation, and stream-to-batch joins
• Features streaming deduplication, multiple output modes, and APIs for managing and monitoring streaming queries
• Also supports interactive and batch queries: aggregate data in a stream, then serve it using JDBC; change queries at runtime; build and apply machine learning models
• Built-in sources: Kafka, file source (JSON, CSV, text, and Parquet)
• App development in Scala, Java, Python, and R

Continuous applications
Applications that need to interact with batch data, interactive analysis, machine learning, and more.
[Diagram: a pure streaming system runs an input stream through a streaming computation into an output sink, with transactions and interactions with other systems left to the user; a continuous application runs the input stream through a continuous application into the output sink, consistent with ad-hoc queries, batch jobs, and static data]
Demo – Event Hubs and Stream Analytics
What we’re looking at
• Python Flask app using kafka-python
• Event Hubs (Kafka-enabled)
• Stream Analytics running a simple tumbling window
• Power BI
All running in Azure
So… what do I use?
INGESTION SERVICES – A COMPARISON
A side-by-side comparison of the capabilities and features

| | HDInsight (Apache Kafka) | Azure Event Hubs | Azure IoT Hub |
|---|---|---|---|
| Open source | Yes | No | No |
| Serverless service | No | Yes | Yes |
| Hybrid (cloud and on-prem) | Yes | No | No |
| Protocols supported | HTTP REST | AMQP, HTTP REST | MQTT, AMQP, HTTPS, and the Azure IoT protocol gateway for custom protocols |
| Replication and reliability | Manually configured with tools like MirrorMaker | Relies on underlying Azure Blob Storage | Relies on underlying Azure Blob Storage |
| SLA | 99.9% | 99.9% | 99.9% |
| Scaling | Limited by number and type of nodes in the HDInsight cluster provisioned | Limited by number of Throughput Units | Limited by number of IoT Hub Units |
| Throttling | No explicit throttling | Yes, when TU limits are reached | Yes, when IoT Hub Unit limits are reached |
| Message size | No limits | 1 MB | 256 KB |
| Message ordering | Yes, ordered within a partition | Yes, ordered within a partition | Yes, within a partition |
| Long-term storage | Can automatically store on Azure Managed Disks; the number of disks has to be explicitly specified during cluster creation | Can automatically store in Azure Blob Storage or Azure Data Lake Store | Can automatically store in Azure Blob Storage (using ABS as an endpoint) |
COMPARING STREAMING ANALYTICS SERVICES (1/2)
A side-by-side comparison of the capabilities and features

| | HDInsight (Apache Storm) | Azure Stream Analytics | Spark Streaming (Azure Databricks) |
|---|---|---|---|
| Open source | Yes | No | Yes |
| Serverless service | No | Yes | No |
| Hybrid (cloud and on-prem) | Yes | No | Yes |
| Exactly-once processing | No (cannot distinguish between new events and replays) | Yes | Yes |
| SQL as query language | No | Yes | Yes |
| Unified programming model | No | No | Yes – combines batch, interactive, machine learning, and streaming |
| Extensibility | Yes, custom code in Java, C#, etc. | No (partial support with JavaScript UDFs) | Yes, custom code in Java, Scala, Python |
| Windowing support | No, needs Trident for tumbling windows | Yes – sliding, hopping, and tumbling windows | Yes – sliding and tumbling windows |
| Azure ML integration | No built-in support; a trained model can be invoked through custom Storm Bolts | Yes; published Azure Machine Learning models can be configured as functions during job creation | No |
| Kafka integration | Yes, Kafka Spout available | No | Yes, Kafka connector available |
COMPARING STREAMING ANALYTICS SERVICES (2/2)
A side-by-side comparison of the capabilities and features

| | Apache Storm on HDInsight | Azure Stream Analytics | Spark Structured Streaming (Azure Databricks) |
|---|---|---|---|
| Pricing model | Pay for number and type of nodes in the HDInsight cluster and duration of use | Pay per Streaming Unit (SU) | Pay for number and type of nodes in the cluster and duration of use |
| Scaling model | Limited by number and type of nodes in the cluster provisioned | Limited by number of Streaming Units (SU); each SU = 1 MB/sec, with a maximum of 50 SUs and an upper limit of 1 GB/sec | Limited by number and type of nodes in the cluster provisioned |
| Input data format | Can be anything; custom code is needed to parse it | Avro, CSV, JSON | Text, CSV, JSON, Parquet |
| Input data sources | Can be anything, but needs custom code | Azure Event Hubs, Azure Blob Storage, Azure IoT Hub | File source, Kafka, socket (for testing) |
| Output data sinks | Azure Event Hubs, Azure Blob Storage, Azure Tables, Azure Cosmos DB, Azure SQL DB, Power BI, Azure Data Lake Store, HBase, custom | Azure Event Hubs, Azure Blob Storage, Azure Tables, Azure Cosmos DB, Azure SQL DB, Power BI, Azure Data Lake Store | Console, Kafka, memory, ForeachSink |
| Reference data | No limits on data size; connectors available for HBase, Azure Cosmos DB, Azure SQL DB, and custom sources | Azure Blob Storage only, with a maximum of 100 MB for the in-memory lookup cache | No limits on data size; can be stored in any source supported by Apache Spark |
| Dev experience | Users using .NET can develop, debug, and monitor through Visual Studio | Users can create, debug, and monitor jobs through the Azure portal, using sample data derived from a live stream | Use Azure Databricks notebooks |
Further reading
Hands on with Event Hubs and Python
https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-python
Hands on with streaming ETL with Azure Databricks
https://medium.com/microsoftazure/an-introduction-to-streaming-etl-on-azure-databricks-using-structured-streaming-databricks-16b369d77e34
Choosing the right service(s) for your use case
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/stream-processing
Further reading
http://shop.oreilly.com/product/0636920073994.do
Questions?
We'd love your feedback!
aka.ms/SQLBits19
Thanks!
Joe Plumb
Cloud Solution Architect – Microsoft UK
@joe_plumb