Apache Kafka Meetup Japan #7
Kafka on Azure
~ Trying out the managed Kafka services offered by Microsoft Azure ~
SATO Naoki / 佐藤 直生
Azure Technologist / Cloud Solution Architect, Microsoft
Twitter @satonaoki / https://satonaoki.wordpress.com/
© Microsoft Corporation
AI built-in | Most secure | Lowest TCO
Data warehouses
Data lakes
Operational databases
Industry leader 4 years in a row
#1 TPC-H performance
T-SQL query over any data
70 percent faster than Aurora
More global reach than any other
No Limits and 99.9 percent SLA
Easiest lift and shift with no code changes
The Microsoft offering
SQL Server
Hybrid
Azure Data Services
Security and performance
Flexibility of choice
Reason over any data, anywhere
Social, LOB, Graph, IoT, Image, CRM
Azure Data Factory, Azure Import/Export service, Azure SDK, Azure CLI
Cognitive Services, Bot service
Azure Search, Azure Data Catalog
Azure ExpressRoute, Azure network security groups
Azure Functions, Visual Studio, Operations Management Suite
Azure Active Directory, Azure key management service
Azure Blob Storage, Azure Data Lake Store
Azure IoT Hub, Azure Event Hubs, Kafka on Azure HDInsight
Azure SQL Data Warehouse, Azure SQL DB, Azure Cosmos DB, Azure Analysis Services, Power BI
Azure Data Lake Analytics, Azure HDInsight, Azure Databricks
Azure HDInsight, Azure Databricks, Azure Stream Analytics
Azure ML, Azure Databricks, ML Server
The Azure data landscape
The Azure Big Data landscape
Solution scenarios
Big Data and advanced analytics
Modern data warehousing: “We want to integrate all our data, including Big Data, with our data warehouse”
Advanced analytics: “We’re trying to predict when our customers churn”
Real-time analytics: “We’re trying to get insights from our devices in real-time”
Real-time analytics
Real-time analytics, also called stream analytics, is the practice of processing data as soon as it is generated in order to enable very quick analysis and insight for timely action
Stream analytics scenarios
Canonical operations (streaming)
Ingest: connect, collect, and store
Analytics: process and analyze
Actions: connect, collect, and store
Big Data streaming pattern with Azure
Data sources: sensors and IoT (unstructured); logs, files, and media (unstructured); business/custom apps (structured)
Stream ingestion: Event Hubs, IoT Hub, Kafka on HDInsight
Stream analytics: Azure Stream Analytics, Storm on HDInsight, Azure Databricks (Spark Streaming)
Machine learning: Azure ML Studio, R Server, Azure Databricks (Spark ML)
Long-term storage: Data Lake Store, SQL DB, Cosmos DB, Azure Blob Storage
Real-time applications, real-time dashboards, Power BI
Azure is the only public cloud to offer Apache Kafka as a managed service
Can be provisioned directly from the Azure Portal
Apache Kafka is one of the HDInsight cluster types
Clusters can be scaled within minutes
99.9 percent SLA
No additional charge for running Kafka clusters
Out-of-box management using Azure Monitor Logs
Apache Kafka on HDInsight
An open-source, scalable stream ingestion platform offered as a managed service on Azure HDInsight
Provisioning Apache Kafka on HDInsight
A typical HDInsight Kafka cluster consists of:
Three or more worker nodes (at least three for data high availability)
Two head nodes (for redundancy)
Three ZooKeeper nodes
Kafka is I/O heavy, so Azure Managed Disks are used for high throughput and more storage per node
Apache Kafka on HDInsight clusters with managed disks can be deployed straight from the Azure Portal
Disks or nodes can be configured during HDInsight cluster creation (up to 16 TB per node)
Kafka for Azure HDInsight
• Managed Kafka clusters with a 99.9% SLA
• Native integration with Azure Managed Disks, allowing for lower costs and higher scale
• Scalable on-demand clusters: Kafka clusters with 16 TB/node and ZooKeeper up and running in 15 minutes
• Rack awareness for Kafka on the Azure cloud
• Alerting and predictive cluster maintenance through Azure Monitor Logs
• Extensibility via one-click deployment of leading ISVs such as StreamSets
• Disaster recovery support via MirrorMaker
• End-to-end streaming pipelines with Storm, Spark, and Storage, deployed via automated ARM templates in the same VNet
Kafka is a distributed, horizontally-scalable, fault-tolerant pub-sub store
[Diagram labels: Producer 1, IoT Hub, Broker 1, Broker 2, Broker 3, ZK 1, ZK 2, ZK 3, Topic 1, Topic 2, Storm, Spark Streaming]
Data Ingestion using Kafka on HDInsight
Set up the broker configuration
Publish the message
The consumer reads the messages
Kafka: Producers and Consumers
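The three steps listed above (configure the broker, publish a message, let a consumer read it) can be illustrated without a cluster. What follows is a minimal in-memory sketch of the topic/partition/offset model, not Kafka client code. The class and method names (`MiniBroker`, `publish`, `poll`) are hypothetical, and the CRC-based partitioner only stands in for Kafka's own key partitioning.

```python
import zlib
from collections import defaultdict


class MiniBroker:
    """Toy stand-in for a Kafka broker: each topic is a set of append-only partitions."""

    def __init__(self, partitions_per_topic=3):
        # Step 1: "broker configuration" is reduced here to a partition count.
        self.partitions_per_topic = partitions_per_topic
        self.logs = defaultdict(lambda: [[] for _ in range(partitions_per_topic)])
        # Read positions, tracked per (consumer group, topic, partition).
        self.offsets = defaultdict(int)

    def publish(self, topic, key, value):
        """Step 2: append a message; the same key always maps to the same partition."""
        partition = zlib.crc32(key.encode("utf-8")) % self.partitions_per_topic
        log = self.logs[topic][partition]
        log.append((key, value))
        return partition, len(log) - 1  # (partition, offset of the new message)

    def poll(self, group, topic):
        """Step 3: return unread messages for a consumer group and advance its offsets."""
        messages = []
        for partition, log in enumerate(self.logs[topic]):
            start = self.offsets[(group, topic, partition)]
            messages.extend(log[start:])
            self.offsets[(group, topic, partition)] = len(log)
        return messages


broker = MiniBroker(partitions_per_topic=3)
broker.publish("telemetry", "device-1", "temp=21")
broker.publish("telemetry", "device-2", "temp=19")
broker.publish("telemetry", "device-1", "temp=22")

first_read = broker.poll("dashboard", "telemetry")
second_read = broker.poll("dashboard", "telemetry")  # nothing new for this group
other_group = broker.poll("archiver", "telemetry")   # a second group has its own offsets

print(len(first_read), len(second_read), len(other_group))
```

Two properties of the model are visible here: messages with the same key keep their relative order because they land in the same partition, and each consumer group reads the full stream independently because offsets are tracked per group.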
Choosing Apache Kafka on HDInsight
When you want… | Description
A proven ingestion service | Apache Kafka is the de facto leader in the Big Data stream ingestion space and is used by many of the best-known internet companies. The "Powered By" page of the Apache Kafka project lists companies using Apache Kafka.
A hybrid, multi-cloud solution with a choice of deployment models | You can run Apache Kafka in multiple ways: on-premises, as a managed service on Azure, as an IaaS solution on Azure VMs, or on other public clouds, including AWS and Google Cloud.
An open-source solution | Kafka is an open-source project licensed under the Apache License 2.0. It is implemented in Java and Scala.
A highly reliable, fault-tolerant, scalable service | Kafka is reported to handle ingestion rates of 1.1 trillion messages a day at LinkedIn. Kafka is a horizontally scalable service: you can scale Apache Kafka on HDInsight by dynamically adding more nodes to the cluster.
Extensibility, with support for a large number of data sources and sinks | Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems. It makes it simple to quickly define connectors that move large collections of data into and out of Kafka. Pre-built connectors to a number of data sources are available, and you can extend this list by building custom connectors.
When Apache Kafka can be a good option
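To make the Kafka Connect row more concrete, here is a minimal sketch of a connector definition. It assumes the `FileStreamSourceConnector` example connector that has shipped with Apache Kafka, and it only builds the JSON document that the Kafka Connect REST API accepts when creating a connector; nothing is sent to a cluster. The connector name, file path, and topic name are hypothetical placeholders.

```python
import json

# Connector definition in the shape expected by the Kafka Connect REST API
# (the body of a POST to /connectors). All values below are illustrative.
connector = {
    "name": "demo-file-source",          # hypothetical connector name
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/tmp/demo-input.txt",   # hypothetical input file to tail
        "topic": "demo-topic",           # hypothetical destination topic
    },
}

payload = json.dumps(connector, indent=2)
print(payload)
```

A connector for another system would follow the same pattern with a different `connector.class` and its own settings.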
[Diagram labels: Azure Gateway Services, IoT Hubs, open source stream processing on Azure HDInsight, long-term storage, real-time applications, real-time dashboards, Azure VNet boundary]
Connected Car Architecture Powered by HDInsight
Siphon on HDInsight Kafka
8 million events per second peak ingress
800 TB ingress per day (10 GB per second)
1,800 production Kafka brokers; 450 topics
15-second 99th percentile latency
Key customer scenarios:
Ads Monetization (Fast BI)
O365 Customer Fabric NRT – Tenant & User insights
Bing NRT Operational Intelligence
Presto (Fast SML) interactive analysis
Delve Analytics
[Chart: Siphon Data Volume (Ingress and Egress); y-axis: throughput in GBps]
[Chart: Siphon Events per second (Ingress and Egress); y-axis: throughput in millions of events per second]
Apache Spark 2.4 and Apache Kafka 2.1 support on Azure HDInsight
https://azure.microsoft.com/updates/apache-spark-2-4-and-apache-kafka-2-1-support-on-azure-hdinsight/
Azure Event Hubs
✓ Input Capacity: 1 MB/s per TU*
✓ Output Capacity: 2 MB/s per TU*
✓ Latency: 50 ms avg, 99% < 100ms
✓ Ingress events/second: 1,000 per TU*
✓ Max message size: 256 KB
*In Azure Event Hubs, capacity is purchased in throughput units (TU). Add TUs to increase capacity.
[Diagram: event publisher, Event Hubs partitions, readers, event consumer]
Azure Event Hubs: Scale and performance
Azure Event Hubs
A highly scalable, fully-managed telemetry ingestion service
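The per-TU figures on this slide can be turned into a rough sizing calculation. The sketch below assumes that the ingress, egress, and events-per-second limits each scale linearly with the number of throughput units and that the largest of the three requirements decides the result. The function name and the sample workload are hypothetical.

```python
import math


def throughput_units_needed(ingress_mb_per_s, egress_mb_per_s, events_per_s):
    """Estimate throughput units (TUs) from the per-TU limits quoted on the slide.

    Per TU: 1 MB/s in, 2 MB/s out, 1,000 events/s in. The result is the
    largest of the three requirements and never less than one TU.
    """
    by_ingress = math.ceil(ingress_mb_per_s / 1.0)
    by_egress = math.ceil(egress_mb_per_s / 2.0)
    by_events = math.ceil(events_per_s / 1000.0)
    return max(1, by_ingress, by_egress, by_events)


# Hypothetical workload: 3.5 MB/s in, 5 MB/s out, 6,200 events/s.
# Ingress needs 4 TUs, egress needs 3, and the event rate needs 7.
print(throughput_units_needed(3.5, 5.0, 6200))
```

In this example the event rate, not the byte volume, is the binding constraint, which is typical for workloads made of many small messages.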
Based on the concept of event producers and consumers
Producers send data to an event hub via AMQP 1.0 or HTTPS
Consumers read event data from an event hub via AMQP 1.0
SAS tokens identify and authenticate the event publisher
Data can be captured automatically in either Azure Blob Storage or Azure Data Lake Store (in Avro format)
Data is stored for 24 hours by default
84 GB storage included per throughput unit
Azure Event Hubs capabilities overview
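The 84 GB figure is consistent with one throughput unit ingesting at its full 1 MB/s for the default 24-hour retention window. This is an inference from the numbers on the slides rather than a documented derivation, and the check below assumes 1 GB = 1,024 MB.

```python
SECONDS_PER_DAY = 24 * 60 * 60     # default retention window of 24 hours
INGRESS_MB_PER_S_PER_TU = 1        # ingress limit per throughput unit

mb_per_day = INGRESS_MB_PER_S_PER_TU * SECONDS_PER_DAY
gb_per_day = mb_per_day / 1024     # assumes binary gigabytes

print(mb_per_day, "MB per day, about", int(gb_per_day), "GB")
```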
Partition consumer conceptual architecture
[Diagram labels: HTTP, AMQP, Kafka]
Event Hubs for Kafka Ecosystems
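In practice, using the Kafka endpoint of Event Hubs mostly means pointing an existing Kafka client at the namespace with SASL settings. The sketch below builds librdkafka-style settings as a plain dictionary, following the pattern in the Event Hubs for Kafka documentation: port 9093, `SASL_SSL`, the `PLAIN` mechanism, the literal user name `$ConnectionString`, and the namespace connection string as the password. The helper name, the namespace, and the connection string value are hypothetical placeholders, and no client library is imported or contacted.

```python
def event_hubs_kafka_config(namespace, connection_string):
    """Build Kafka client settings for the Kafka endpoint of an Event Hubs namespace."""
    return {
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.username": "$ConnectionString",  # literal string, not a variable
        "sasl.password": connection_string,
    }


# Placeholder values for illustration only.
config = event_hubs_kafka_config(
    "my-namespace",
    "Endpoint=sb://my-namespace.servicebus.windows.net/;"
    "SharedAccessKeyName=<key-name>;SharedAccessKey=<key>",
)
print(config["bootstrap.servers"])
```

A dictionary like this is the kind of configuration a librdkafka-based client accepts; with such a client, an event hub plays the role of a Kafka topic.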
When you want… | Description
To automatically scale capacity | Auto-inflate lets you start small with the minimum required throughput units. It then scales up automatically, to the maximum limit of throughput units, as traffic increases.
A serverless solution | Azure Event Hubs is a serverless service. Your ability to fine-tune performance is limited.
To integrate easily with Azure Stream Analytics | You can configure Azure Event Hubs as a streaming data input to Azure Stream Analytics via the Azure Portal without any coding.
A low-latency ingestion service | Azure Event Hubs latency can be less than 50 ms on average, with latency under 100 ms 99 percent of the time.*
To store ingested data in Azure Blob Storage or Azure Data Lake Store | Azure Event Hubs has built-in integration with these two Azure storage services.
* Note that other services might have a similar latency, but there are no publicly available numbers.
Choosing Event Hubs
When Azure Event Hubs can be a good option
Event Hubs in the real world: Halo 5
80 million requests per minute within 24 hours of release
All game telemetry and statistics run through Azure Event Hubs, are processed, and are sent back to the console
1 Dedicated Capacity cluster (3 CUs)
Zero administration by the Halo team
Azure provides everything you need for streaming data, no matter how you do it
• Azure Free Account: https://azure.microsoft.com/free/
• Azure Marketplace (VM Images, VM Cluster Templates, Container Images, Helm Charts): https://azuremarketplace.microsoft.com/en-us/marketplace/apps?search=Kafka
• Third-Party Managed Kafka Clusters
  • Confluent Cloud: https://confluent.jp/confluent-cloud/
  • Instaclustr: https://www.instaclustr.com/solutions/microsoft-azure/
• Azure HDInsight: https://docs.microsoft.com/azure/hdinsight/kafka/apache-kafka-introduction
• Azure Event Hubs: https://docs.microsoft.com/azure/event-hubs/event-hubs-for-kafka-ecosystem-overview
• Kafka Connect
  • Azure Blob Storage: https://docs.confluent.io/current/connect/kafka-connect-azure-blob-storage/
  • Azure SQL Database (SQL Server): https://docs.confluent.io/current/connect/kafka-connect-cdc-mssql/
  • Azure IoT Hub: https://docs.microsoft.com/en-us/azure/hdinsight/kafka/apache-kafka-connector-iot-hub
Additional Information
© Copyright Microsoft Corporation. All rights reserved.

Más contenido relacionado

Más de Naoki (Neo) SATO

[Developers Festa Sapporo 2020] Microsoft/GitHubが提供するDeveloper Cloud (Develop...
[Developers Festa Sapporo 2020] Microsoft/GitHubが提供するDeveloper Cloud (Develop...[Developers Festa Sapporo 2020] Microsoft/GitHubが提供するDeveloper Cloud (Develop...
[Developers Festa Sapporo 2020] Microsoft/GitHubが提供するDeveloper Cloud (Develop...Naoki (Neo) SATO
 
[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB ...
[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB ...[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB ...
[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB ...Naoki (Neo) SATO
 
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 UpdatesNaoki (Neo) SATO
 
[第43回 Machine Learning 15minutes! × 2] Azure AI Updates
[第43回 Machine Learning 15minutes! × 2] Azure AI Updates[第43回 Machine Learning 15minutes! × 2] Azure AI Updates
[第43回 Machine Learning 15minutes! × 2] Azure AI UpdatesNaoki (Neo) SATO
 
[Developers Festa Sapporo 2019] Azure Updates - Ignite 2019
[Developers Festa Sapporo 2019] Azure Updates - Ignite 2019[Developers Festa Sapporo 2019] Azure Updates - Ignite 2019
[Developers Festa Sapporo 2019] Azure Updates - Ignite 2019Naoki (Neo) SATO
 
[Serverless OpenHack Tokyo] Azure Serverless (Japanese)
[Serverless OpenHack Tokyo] Azure Serverless (Japanese)[Serverless OpenHack Tokyo] Azure Serverless (Japanese)
[Serverless OpenHack Tokyo] Azure Serverless (Japanese)Naoki (Neo) SATO
 
[Serverless OpenHack Tokyo] Azure Serverless (English)
[Serverless OpenHack Tokyo] Azure Serverless (English)[Serverless OpenHack Tokyo] Azure Serverless (English)
[Serverless OpenHack Tokyo] Azure Serverless (English)Naoki (Neo) SATO
 
[Azure Council Experts (ACE) 第37回定例会] Microsoft Azureアップデート情報 (2019/08/22-201...
[Azure Council Experts (ACE) 第37回定例会] Microsoft Azureアップデート情報 (2019/08/22-201...[Azure Council Experts (ACE) 第37回定例会] Microsoft Azureアップデート情報 (2019/08/22-201...
[Azure Council Experts (ACE) 第37回定例会] Microsoft Azureアップデート情報 (2019/08/22-201...Naoki (Neo) SATO
 
[db tech showcase Tokyo 2019] Azure Cosmos DB Deep Dive ~ Partitioning, Globa...
[db tech showcase Tokyo 2019] Azure Cosmos DB Deep Dive ~ Partitioning, Globa...[db tech showcase Tokyo 2019] Azure Cosmos DB Deep Dive ~ Partitioning, Globa...
[db tech showcase Tokyo 2019] Azure Cosmos DB Deep Dive ~ Partitioning, Globa...Naoki (Neo) SATO
 
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...Naoki (Neo) SATO
 
[Azure Council Experts (ACE) 第36回定例会] Microsoft Azureアップデート情報 (2019/06/14-201...
[Azure Council Experts (ACE) 第36回定例会] Microsoft Azureアップデート情報 (2019/06/14-201...[Azure Council Experts (ACE) 第36回定例会] Microsoft Azureアップデート情報 (2019/06/14-201...
[Azure Council Experts (ACE) 第36回定例会] Microsoft Azureアップデート情報 (2019/06/14-201...Naoki (Neo) SATO
 
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...Naoki (Neo) SATO
 
[第37回 Machine Learning 15minutes!] Microsoft AI - Build 2019 Updates ~ Azure ...
[第37回 Machine Learning 15minutes!] Microsoft AI - Build 2019 Updates ~ Azure ...[第37回 Machine Learning 15minutes!] Microsoft AI - Build 2019 Updates ~ Azure ...
[第37回 Machine Learning 15minutes!] Microsoft AI - Build 2019 Updates ~ Azure ...Naoki (Neo) SATO
 
[Azure Council Experts (ACE) 第35回定例会] Microsoft Azureアップデート情報 (2019/04/19-201...
[Azure Council Experts (ACE) 第35回定例会] Microsoft Azureアップデート情報 (2019/04/19-201...[Azure Council Experts (ACE) 第35回定例会] Microsoft Azureアップデート情報 (2019/04/19-201...
[Azure Council Experts (ACE) 第35回定例会] Microsoft Azureアップデート情報 (2019/04/19-201...Naoki (Neo) SATO
 
[de:code 2019 振り返り Night!] Data Platform
[de:code 2019 振り返り Night!] Data Platform[de:code 2019 振り返り Night!] Data Platform
[de:code 2019 振り返り Night!] Data PlatformNaoki (Neo) SATO
 
[de:code 2019] [DP10] Build 2019 Azure AI & Data Platform 最新アップデート
[de:code 2019] [DP10] Build 2019 Azure AI & Data Platform 最新アップデート[de:code 2019] [DP10] Build 2019 Azure AI & Data Platform 最新アップデート
[de:code 2019] [DP10] Build 2019 Azure AI & Data Platform 最新アップデートNaoki (Neo) SATO
 
[第35回 Machine Learning 15minutes!] Microsoft AI Updates
[第35回 Machine Learning 15minutes!] Microsoft AI Updates[第35回 Machine Learning 15minutes!] Microsoft AI Updates
[第35回 Machine Learning 15minutes!] Microsoft AI UpdatesNaoki (Neo) SATO
 
[Azure Council Experts (ACE) 第34回定例会] Microsoft Azureアップデート情報 (2019/02/15-201...
[Azure Council Experts (ACE) 第34回定例会] Microsoft Azureアップデート情報 (2019/02/15-201...[Azure Council Experts (ACE) 第34回定例会] Microsoft Azureアップデート情報 (2019/02/15-201...
[Azure Council Experts (ACE) 第34回定例会] Microsoft Azureアップデート情報 (2019/02/15-201...Naoki (Neo) SATO
 
[Azure Council Experts (ACE) 第33回定例会] Microsoft Azureアップデート情報 (2018/12/14-201...
[Azure Council Experts (ACE) 第33回定例会] Microsoft Azureアップデート情報 (2018/12/14-201...[Azure Council Experts (ACE) 第33回定例会] Microsoft Azureアップデート情報 (2018/12/14-201...
[Azure Council Experts (ACE) 第33回定例会] Microsoft Azureアップデート情報 (2018/12/14-201...Naoki (Neo) SATO
 
[Container X mas Party with flexy] Machine Learning Lifecycle with Kubeflow o...
[Container X mas Party with flexy] Machine Learning Lifecycle with Kubeflow o...[Container X mas Party with flexy] Machine Learning Lifecycle with Kubeflow o...
[Container X mas Party with flexy] Machine Learning Lifecycle with Kubeflow o...Naoki (Neo) SATO
 

Más de Naoki (Neo) SATO (20)

[Developers Festa Sapporo 2020] Microsoft/GitHubが提供するDeveloper Cloud (Develop...
[Developers Festa Sapporo 2020] Microsoft/GitHubが提供するDeveloper Cloud (Develop...[Developers Festa Sapporo 2020] Microsoft/GitHubが提供するDeveloper Cloud (Develop...
[Developers Festa Sapporo 2020] Microsoft/GitHubが提供するDeveloper Cloud (Develop...
 
[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB ...
[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB ...[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB ...
[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB ...
 
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
 
[第43回 Machine Learning 15minutes! × 2] Azure AI Updates
[第43回 Machine Learning 15minutes! × 2] Azure AI Updates[第43回 Machine Learning 15minutes! × 2] Azure AI Updates
[第43回 Machine Learning 15minutes! × 2] Azure AI Updates
 
[Developers Festa Sapporo 2019] Azure Updates - Ignite 2019
[Developers Festa Sapporo 2019] Azure Updates - Ignite 2019[Developers Festa Sapporo 2019] Azure Updates - Ignite 2019
[Developers Festa Sapporo 2019] Azure Updates - Ignite 2019
 
[Serverless OpenHack Tokyo] Azure Serverless (Japanese)
[Serverless OpenHack Tokyo] Azure Serverless (Japanese)[Serverless OpenHack Tokyo] Azure Serverless (Japanese)
[Serverless OpenHack Tokyo] Azure Serverless (Japanese)
 
[Serverless OpenHack Tokyo] Azure Serverless (English)
[Serverless OpenHack Tokyo] Azure Serverless (English)[Serverless OpenHack Tokyo] Azure Serverless (English)
[Serverless OpenHack Tokyo] Azure Serverless (English)
 
[Azure Council Experts (ACE) 第37回定例会] Microsoft Azureアップデート情報 (2019/08/22-201...
[Azure Council Experts (ACE) 第37回定例会] Microsoft Azureアップデート情報 (2019/08/22-201...[Azure Council Experts (ACE) 第37回定例会] Microsoft Azureアップデート情報 (2019/08/22-201...
[Azure Council Experts (ACE) 第37回定例会] Microsoft Azureアップデート情報 (2019/08/22-201...
 
[db tech showcase Tokyo 2019] Azure Cosmos DB Deep Dive ~ Partitioning, Globa...
[db tech showcase Tokyo 2019] Azure Cosmos DB Deep Dive ~ Partitioning, Globa...[db tech showcase Tokyo 2019] Azure Cosmos DB Deep Dive ~ Partitioning, Globa...
[db tech showcase Tokyo 2019] Azure Cosmos DB Deep Dive ~ Partitioning, Globa...
 
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
 
[Azure Council Experts (ACE) 第36回定例会] Microsoft Azureアップデート情報 (2019/06/14-201...
[Azure Council Experts (ACE) 第36回定例会] Microsoft Azureアップデート情報 (2019/06/14-201...[Azure Council Experts (ACE) 第36回定例会] Microsoft Azureアップデート情報 (2019/06/14-201...
[Azure Council Experts (ACE) 第36回定例会] Microsoft Azureアップデート情報 (2019/06/14-201...
 
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
How to work with technology to survive as an engineer (エンジニアとして生き残るためのテクノロジーと...
 
[第37回 Machine Learning 15minutes!] Microsoft AI - Build 2019 Updates ~ Azure ...
[第37回 Machine Learning 15minutes!] Microsoft AI - Build 2019 Updates ~ Azure ...[第37回 Machine Learning 15minutes!] Microsoft AI - Build 2019 Updates ~ Azure ...
[第37回 Machine Learning 15minutes!] Microsoft AI - Build 2019 Updates ~ Azure ...
 
[Azure Council Experts (ACE) 第35回定例会] Microsoft Azureアップデート情報 (2019/04/19-201...
[Azure Council Experts (ACE) 第35回定例会] Microsoft Azureアップデート情報 (2019/04/19-201...[Azure Council Experts (ACE) 第35回定例会] Microsoft Azureアップデート情報 (2019/04/19-201...
[Azure Council Experts (ACE) 第35回定例会] Microsoft Azureアップデート情報 (2019/04/19-201...
 
[de:code 2019 振り返り Night!] Data Platform
[de:code 2019 振り返り Night!] Data Platform[de:code 2019 振り返り Night!] Data Platform
[de:code 2019 振り返り Night!] Data Platform
 
[de:code 2019] [DP10] Build 2019 Azure AI & Data Platform 最新アップデート
[de:code 2019] [DP10] Build 2019 Azure AI & Data Platform 最新アップデート[de:code 2019] [DP10] Build 2019 Azure AI & Data Platform 最新アップデート
[de:code 2019] [DP10] Build 2019 Azure AI & Data Platform 最新アップデート
 
[第35回 Machine Learning 15minutes!] Microsoft AI Updates
[第35回 Machine Learning 15minutes!] Microsoft AI Updates[第35回 Machine Learning 15minutes!] Microsoft AI Updates
[第35回 Machine Learning 15minutes!] Microsoft AI Updates
 
[Azure Council Experts (ACE) 第34回定例会] Microsoft Azureアップデート情報 (2019/02/15-201...
[Azure Council Experts (ACE) 第34回定例会] Microsoft Azureアップデート情報 (2019/02/15-201...[Azure Council Experts (ACE) 第34回定例会] Microsoft Azureアップデート情報 (2019/02/15-201...
[Azure Council Experts (ACE) 第34回定例会] Microsoft Azureアップデート情報 (2019/02/15-201...
 
[Azure Council Experts (ACE) 第33回定例会] Microsoft Azureアップデート情報 (2018/12/14-201...
[Azure Council Experts (ACE) 第33回定例会] Microsoft Azureアップデート情報 (2018/12/14-201...[Azure Council Experts (ACE) 第33回定例会] Microsoft Azureアップデート情報 (2018/12/14-201...
[Azure Council Experts (ACE) 第33回定例会] Microsoft Azureアップデート情報 (2018/12/14-201...
 
[Container X mas Party with flexy] Machine Learning Lifecycle with Kubeflow o...
[Container X mas Party with flexy] Machine Learning Lifecycle with Kubeflow o...[Container X mas Party with flexy] Machine Learning Lifecycle with Kubeflow o...
[Container X mas Party with flexy] Machine Learning Lifecycle with Kubeflow o...
 

Último

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 

Último (20)

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Active Directory Penetration Testing, cionsystems.com.pdf

[Apache Kafka Meetup Japan #7] Kafka on Azure

  • 1. Apache Kafka Meetup Japan #7: Kafka on Azure ~ Let's try the managed Kafka services that Microsoft Azure provides ~ SATO Naoki / 佐藤 直生, Azure Technologist / Cloud Solution Architect, Microsoft. Twitter @satonaoki / https://satonaoki.wordpress.com/
  • 2.
  • 3. © Microsoft Corporation AI built-in | Most secure | Lowest TCO. Data warehouses; data lakes; operational databases. Industry leader 4 years in a row; #1 TPC-H performance; T-SQL query over any data; 70 percent faster than Aurora; more global reach than any other; no limits and 99.9 percent SLA; easiest lift and shift with no code changes. The Microsoft offering: SQL Server, Hybrid, Azure Data Services. Security and performance; flexibility of choice; reason over any data, anywhere. Social, LOB, Graph, IoT, Image, CRM
  • 4. © Microsoft Corporation The Azure data landscape: Azure Data Factory, Azure Import/Export service, Azure SDK, Azure CLI, Cognitive Services, Bot service, Azure Search, Azure Data Catalog, Azure ExpressRoute, Azure network security groups, Azure Functions, Visual Studio, Operations Management Suite, Azure Active Directory, Azure key management service, Azure Blob Storage, Azure Data Lake Store, Azure IoT Hub, Azure Event Hubs, Kafka on Azure HDInsight, Azure SQL Data Warehouse, Azure SQL DB, Azure Cosmos DB, Azure Analysis Services, Power BI, Azure Data Lake Analytics, Azure HDInsight, Azure Databricks, Azure Stream Analytics, Azure ML, ML Server
  • 5. © Microsoft Corporation The Azure Big Data landscape: Azure Data Factory, Azure Import/Export service, Azure SDK, Azure CLI, Cognitive Services, Bot service, Azure Search, Azure Data Catalog, Azure ExpressRoute, Azure network security groups, Azure Functions, Visual Studio, Operations Management Suite, Azure Active Directory, Azure key management service, Azure Blob Storage, Azure Data Lake Store, Azure IoT Hub, Azure Event Hubs, Kafka on Azure HDInsight, Azure SQL Data Warehouse, Azure SQL DB, Azure Cosmos DB, Azure Analysis Services, Power BI, Azure Data Lake Analytics, Azure HDInsight, Azure Databricks, Azure Stream Analytics, Azure ML, ML Server
  • 6. © Microsoft Corporation Solution scenarios Big Data and advanced analytics SQL Modern data warehousing “We want to integrate all our data—including Big Data—with our data warehouse” Advanced analytics “We’re trying to predict when our customers churn” Real-time analytics “We’re trying to get insights from our devices in real-time”
  • 7. © Microsoft Corporation Real-time analytics Real-time analytics—also called stream analytics—is the practice of processing data as soon as it’s generated in order to enable very quick analysis and insight for timely action SQL Modern data warehousing “We want to integrate all our data—including Big Data—with our data warehouse” Advanced analytics “We’re trying to predict when our customers churn” Real-time analytics “We’re trying to get insights from our devices in real-time”
  • 8. © Microsoft Corporation Stream analytics scenarios
  • 9. © Microsoft Corporation Canonical operations Streaming Connect, collect, and store Ingest Process and analyze Analytics Connect, collect, and store Actions A B C
  • 10. © Microsoft Corporation Big Data streaming pattern with Azure Real-time applications Real-time dashboards Sensors and IoT (unstructured) Event hubs IoT hub Kafka on HDInsight Azure Stream Analytics Storm on HDInsight Azure Databricks (Spark Streaming) Azure ML Studio R Server Azure Databricks (Spark ML) Machine learning Stream ingestion Long-term storage Stream analytics Data Lake Store SQL DB Cosmos DB Azure Blob Storage Business/custom apps (structured) Logs, files, and media (unstructured) Power BI
  • 11. © Microsoft Corporation Apache Kafka on HDInsight
  • 12. Apache Kafka on HDInsight: an open-source, scalable stream ingestion platform offered as a managed service on Azure HDInsight • Azure is the only public cloud to offer Apache Kafka as a managed service • Can be provisioned directly from the Azure Portal • Apache Kafka is one of the HDInsight cluster types • Clusters can be scaled within minutes • 99.9 percent SLA • No additional charge for running Kafka clusters • Out-of-box management using Azure Monitor Logs
  • 13. © Microsoft Corporation Provisioning Apache Kafka on HDInsight • A typical HDInsight Kafka cluster consists of: three or more worker nodes (at least three for data high availability); two head nodes (for redundancy); three ZooKeeper nodes • Kafka is I/O heavy, so Azure Managed Disks are used for high throughput and more storage per node • Apache Kafka on HDInsight clusters with managed disks can be deployed straight from the Azure Portal • Disks or nodes can be configured during HDInsight cluster creation, up to 16 TB per node
  • 14. Kafka for Azure HDInsight • Managed Kafka clusters with 99.9% service level SLA • Native integration with Azure Managed Disks. Allows for exponentially lower costs, and higher scale. • Scalable On Demand clusters - Kafka clusters with 16 TB/node and Zookeeper up and running in 15 minutes • Rack awareness for Kafka on the Azure cloud • Alerting and predictive cluster maintenance through Azure Monitor Logs • Extensibility via one click deploy of leading ISVs such as StreamSets • Disaster recovery support via MirrorMaker • Deploy End to End streaming pipelines with Storm, Spark, Storage via automated ARM templates in the same VNET.
  • 15. Data Ingestion using Kafka on HDInsight: Kafka is a distributed, horizontally scalable, fault-tolerant pub-sub store. [Diagram labels: Producer 1, IoT Hub; Broker 1, Broker 2, Broker 3, each shown with Topic 1 and Topic 2; ZK 1, ZK 2, ZK 3; Storm, Spark Streaming]
  • 16. Kafka: Producers and Consumers • Set up the broker configuration • Publish the message • The consumer reads the messages
  • 17. © Microsoft Corporation Choosing Apache Kafka on HDInsight: when Apache Kafka can be a good option (When you want… / Description) • A proven ingestion service: Apache Kafka is the de facto leader in the Big Data stream ingestion space. It's used by the who's who of modern internet companies. The "Powered by Apache Kafka" page lists companies using Apache Kafka. • A hybrid, multi-cloud solution with a choice of deployment models: You can run Apache Kafka in multiple ways: on-premises, as a managed service on Azure, as an IaaS solution on Azure VMs, or even on other public clouds, including AWS and Google Cloud Service. • An open-source solution: Kafka is an open-source product licensed under the Apache License 2.0. It's implemented in Java and Scala. • A highly reliable, fault-tolerant, scalable service: Kafka is reported to handle ingestion rates of 1.1 trillion messages a day at LinkedIn. Kafka is a horizontally scalable service: you can scale Apache Kafka on HDInsight by dynamically adding more nodes to the cluster. • Extensibility, with support for a large number of data sources and sinks: Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems. It makes it simple to quickly define connectors that move large collections of data into and out of Kafka. Pre-built connectors to a number of data sources are available. You can extend this list by building custom connectors.
  • 18. Azure Gateway Services Open source Stream Processing on Azure HDInsight Real-time applications Long term storage Real-time dashboards IoT Hubs Azure VNet Boundary Connected Car Architecture Powered by HDInsight
  • 19. Siphon on HDInsight Kafka • 8 million events per second peak ingress • 800 TB (10 GB per sec) ingress per day • 1,800 production Kafka brokers; 450 topics • 15 sec 99th percentile latency • Key customer scenarios: Ads Monetization (Fast BI); O365 Customer Fabric NRT (tenant and user insights); BingNRT Operational Intelligence; Presto (Fast SML) interactive analysis; Delve Analytics. [Charts: Siphon Data Volume (Ingress and Egress), throughput in GBps; Siphon Events per second (Ingress and Egress), throughput in millions of events per sec]
  • 20. © Microsoft Corporation Apache Spark 2.4 and Apache Kafka 2.1 support on Azure HDInsight https://azure.microsoft.com/updates/apache-spark-2-4-and-apache-kafka-2-1-support-on-azure-hdinsight/
  • 22. © Microsoft Corporation Big Data streaming pattern with Azure Real-time applications Real-time dashboards Sensors and IoT (unstructured) Event hubs IoT hub Kafka on HDInsight Azure Stream Analytics Storm on HDInsight Azure Databricks (Spark Streaming) Azure ML Studio R Server Azure Databricks (Spark ML) Machine learning Stream ingestion Long-term storage Stream analytics Data Lake Store SQL DB Cosmos DB Azure Blob Storage Business/custom apps (structured) Logs, files, and media (unstructured) Power BI
  • 23. © Microsoft Corporation Azure Event Hubs: a highly scalable, fully managed telemetry ingestion service. Scale and performance: ✓ Input capacity: 1 MB/s per TU* ✓ Output capacity: 2 MB/s per TU* ✓ Latency: 50 ms avg, 99% < 100 ms ✓ Events/second: 1,000 ✓ Max message size: 256 KB. *In Azure Event Hubs, capacity is purchased in throughput units (TU). Add TUs to increase capacity. [Diagram labels: event publisher; event hub with three partitions; three readers; event consumer]
  • 24. © Microsoft Corporation Azure Event Hubs capabilities overview • Based on the concept of event producers and consumers • Producers send data to an event hub via AMQP 1.0 or HTTPS • Consumers read event data from an event hub via AMQP 1.0 • SAS tokens identify and authenticate the event publisher • Data can be captured automatically in either Azure Blob Storage or Azure Data Lake Store (in Avro format) • Data is stored for 24 hours by default • 84 GB storage included per throughput unit
  • 25. © Microsoft Corporation Partition consumer conceptual architecture HTTP AMQP Kafka
  • 26. Event Hubs for Kafka Ecosystems
  • 27. © Microsoft Corporation Choosing Event Hubs: when Azure Event Hubs can be a good option (When you want… / Description) • To automatically scale capacity: Auto-inflate enables you to start small with the minimum required throughput units. It then scales automatically up to the maximum limit of throughput units as traffic increases. • A serverless solution: Azure Event Hubs is a serverless service. Your ability to fine-tune the performance is limited. • To integrate easily with Azure Stream Analytics: You can configure Azure Event Hubs as a streaming data input to Azure Stream Analytics via the Azure Portal without any coding. • A low-latency ingestion service: Azure Event Hubs latency can be less than 50 ms on average, with latency under 100 ms 99 percent of the time.* • To store ingested data in Azure Blob Storage or Azure Data Lake Store: Azure Event Hubs has built-in integration with these two Azure storage services. * Note that other services might have a similar latency, but there are no publicly available numbers.
  • 28. Event Hubs in the real world: Halo 5 • 80 million requests per minute within 24 hours of release • All game telemetry and statistics run through Azure Event Hubs, are processed, and are sent back to the console • 1 Dedicated Capacity cluster (3 CUs) • Zero administration by the Halo team
  • 29. Azure provides everything you need for streaming data – no matter how you do it
  • 30. © Microsoft Corporation Additional Information
      • Azure Free Account: https://azure.microsoft.com/free/
      • Azure Marketplace (VM Images, VM Cluster Templates, Container Images, Helm Chart): https://azuremarketplace.microsoft.com/en-us/marketplace/apps?search=Kafka
      • Third Party Managed Kafka Clusters
          • Confluent Cloud: https://confluent.jp/confluent-cloud/
          • Instaclustr: https://www.instaclustr.com/solutions/microsoft-azure/
      • Azure HDInsight: https://docs.microsoft.com/azure/hdinsight/kafka/apache-kafka-introduction
      • Azure Event Hubs: https://docs.microsoft.com/azure/event-hubs/event-hubs-for-kafka-ecosystem-overview
      • Kafka Connect
          • Azure Blob Storage: https://docs.confluent.io/current/connect/kafka-connect-azure-blob-storage/
          • Azure SQL Database (SQL Server): https://docs.confluent.io/current/connect/kafka-connect-cdc-mssql/
          • Azure IoT Hub: https://docs.microsoft.com/en-us/azure/hdinsight/kafka/apache-kafka-connector-iot-hub
  • 31. © Copyright Microsoft Corporation. All rights reserved.
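
Slides 15 and 16 describe Kafka as a pub-sub store in which producers publish messages to topics and consumers read them. The following is a toy, in-memory sketch of that idea, not Kafka itself: `ToyTopic` and `ToyConsumer` are hypothetical names, and CRC32 stands in for Kafka's real key partitioner purely for illustration. It shows two properties that the real system also has: records with the same key land in the same partition in order, and each consumer tracks its own read position.

```python
import zlib


class ToyTopic:
    """Hypothetical in-memory stand-in for a partitioned topic."""

    def __init__(self, name, num_partitions):
        self.name = name
        # Each partition is an append-only list of (key, value) records.
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        # Assumption: CRC32 is used here only as a stable hash;
        # it is not the hash function Kafka clients use.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset


class ToyConsumer:
    """Hypothetical consumer that remembers one offset per partition."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = [0] * len(topic.partitions)

    def poll(self):
        # Return every record not yet seen, then advance the offsets.
        records = []
        for index, log in enumerate(self.topic.partitions):
            records.extend(log[self.offsets[index]:])
            self.offsets[index] = len(log)
        return records
```

Because the offsets live in the consumer rather than in the topic, a second `ToyConsumer` created later still sees every record from the beginning, which mirrors how independent readers can replay the same log.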
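
Slide 23 quotes per-throughput-unit (TU) limits for Azure Event Hubs, and slide 24 mentions 84 GB of storage per TU with 24-hour default retention. A minimal sizing sketch under those quoted numbers follows; the function names are hypothetical, the 1,000 events/second figure is assumed to be a per-TU ingress limit, and actual quotas should be checked against current Azure documentation.

```python
import math

# Per-TU figures as quoted on slide 23 (assumed still applicable).
INGRESS_MB_PER_S_PER_TU = 1.0
EGRESS_MB_PER_S_PER_TU = 2.0
INGRESS_EVENTS_PER_S_PER_TU = 1000


def required_throughput_units(ingress_mb_per_s, egress_mb_per_s, events_per_s):
    """Smallest whole number of TUs that satisfies all three limits."""
    needed = max(
        ingress_mb_per_s / INGRESS_MB_PER_S_PER_TU,
        egress_mb_per_s / EGRESS_MB_PER_S_PER_TU,
        events_per_s / INGRESS_EVENTS_PER_S_PER_TU,
    )
    # At least one TU is always purchased.
    return max(1, math.ceil(needed))


def ingest_gb_per_tu(retention_hours=24):
    """Data one TU can ingest at 1 MB/s over the retention window,
    expressed in GB of 1024 MB."""
    seconds = retention_hours * 3600
    return INGRESS_MB_PER_S_PER_TU * seconds / 1024
```

For example, `required_throughput_units(3.5, 4.0, 2500)` returns 4, because ingress bandwidth (3.5 MB/s) is the binding limit. `ingest_gb_per_tu(24)` returns 84.375, which lines up with the 84 GB per TU on slide 24: one TU running at its ingress ceiling for the default 24 hours fills roughly that much storage.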
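
Slides 25 and 26 show Event Hubs exposing a Kafka endpoint alongside HTTP and AMQP, so an existing Kafka client can connect by changing configuration only. The sketch below builds such a configuration as a plain dictionary using librdkafka-style keys (as accepted by the `confluent-kafka` Python client). The helper name, the namespace, and the connection string are placeholders, and the endpoint format, port 9093, and the `$ConnectionString` user name reflect the Event Hubs for Kafka documentation as I understand it, so treat them as assumptions to verify.

```python
def event_hubs_kafka_config(namespace, connection_string, group_id=None):
    """Build client settings for the Kafka endpoint of an Event Hubs namespace.

    Hypothetical helper: it only assembles a dict and does not connect.
    """
    config = {
        # Assumed endpoint format: <namespace>.servicebus.windows.net on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        # TLS plus SASL PLAIN authentication.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The literal user name "$ConnectionString" tells the service that
        # the password field carries an Event Hubs connection string.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }
    if group_id is not None:
        # Only consumers need a consumer group.
        config["group.id"] = group_id
    return config
```

A dictionary like this could be passed to a `confluent_kafka.Producer` or `Consumer` constructor; the topic name used by the client then corresponds to an event hub inside the namespace.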