SlideShare una empresa de Scribd logo
1 de 35
Descargar para leer sin conexión
When One Data Center is not Enough
Guozhang Wang Strata San Jose, 2016
Building large-scale stream infrastructure across multiple data centers with Apache Kafka
2
• Why across Data Centers?
• Design patterns for Multi-DC
• Kafka for Multi-DC
• Conclusion
Agenda
3
Why across Data Centers?
4
Why across Data Centers
• Catastrophic / expected failures
• Routine maintenance
• Geo-locality (Example: CDNs)
5
Why NOT across Data Centers
• Low bandwidth (10Mbps - 1Gbps)
• High latency (50ms - 450ms)
• Much More $$$
6
Why NOT across Data Centers
• … is hard and expensive
7
Why NOT across Data Centers
• … is hard and expensive
• … with real-time writes? Harder
8
Why NOT across Data Centers
• … is hard and expensive
• … with real-time writes? Harder
• … consistently? Oh My!
9
Consistency
• Weak
• Eventual
• Strong
Latency Guarantee
10
Weak No Consistency
• Now you see my writes, now you don’t
• Best effort only, data can be stale
• Examples: think of “caches”, VoIP
11
Eventual Consistency
• You will see my writes, … eventually
• May need to resolve conflicts (manually)
• Examples: think of “emails”, SMTP
12
Strong Consistency
• You get what you write, for sure
• External > Sequential > Causal (Session)
• Examples: RDBMS, file systems
13
• LAN: consistency over latency
• WAN: latency over consistency
Latency vs. Consistency
14
• Why across Data Centers?
• Design patterns for Multi-DC
• Kafka for Multi-DC
• Conclusion
Agenda
15
Option I: Don’t do it
• Bunkerize the single data center
• Expect data loss at failures
• Examples: ??
16
Option II: Primary with Hot Standby
• Failover to hot standby (maybe inconsistent)
• Window of data loss at failures
• Examples: MySQL binlog
17
Option III: Active-Active
• Accepts writes in multi-DC
• Resolve conflicts (strong / week consistency)
• Examples: Amazon DynamoDB (vector clock)
Google Spanner (2PC), Mesa (Paxos)
18
Ordering is the Key!
19
Ordering is Key
• Vector clocks: partial ordering
• Paxos, 2PC: global ordering
• Log shipping: logical ordering (per-partition)
21
Apache Kafka
• A distributed messaging system
..that store messages as a log!
22
Store Messages as a Log
4 5 5 7 8 9 10 11 12...
Producer Write
Consumer1 Reads
(offset 7)
Consumer2 Reads
(offset 10)
Messages
3
23
Partition the Log across Machines
Topic 1
Topic 2
Partitions
Producers
Producers
Consumers
Consumers
Brokers
24
ACK mode Latency On Failures
“no" no network delay some data loss
“leader" 1 network roundtrip a few data loss
“all" ~2 network roundtrips no data loss
Configurable ISR Commits
25
• Why across Data Centers?
• Design patterns for Multi-DC
• Kafka for Multi-DC
• Conclusion
Agenda
26
Option I: Active-Passive Replication
Kafka
local
producers
consumer consumer
DC 1
MirrorMaker
DC 2
Kafka
replica
27
Option I: Active-Passive Replication
• Async- replication across DC
• May lose data on failover
• Example: ETL to data warehouse / HDFS
Kafka
local
producers
consumer consumer
DC 1
MirrorMaker
DC 2
Kafka
replica
28
Option II: Active-Active Replication
Kafka
local
Kafka
aggregate
Kafka
aggregate
producers producers
consumer consumer
MirrorMaker
Kafka
local
on DC1 failure
DC 1 DC 2
29
Option II: Active-Active Replication
• Global view on agg. cluster
• Require offsets to resume
• Example: store materialization, index updates
Kafka
local
Kafka
agg
Kafka
agg
producers producers
consumer consumer
MirrorMaker
Kafka
local
on DC1 failure
DC 1 DC 2
30
• Offsets not identical between Kafka clusters
• Duplicates during failover
• Partition selection may be different
• Solutions
• Resume from log end offset (suitable for real-time apps)
• Resume from a timestamp (ListOffsets, offset index: KIP-33)
Caveats: offsets across DCs
31
Option III: Deploy across DCs
Kafka
producers producers
consumer consumer
DC 1 DC 2
32
Option III: Deploy across DCs
• Multi-tenancy support
• Security (0.9)
• Quota Management (0.9)
• Latency optimization
• Rack-aware partition assignment (0.10)
• Read affinity (future?)
Kafka
producers producers
consumer consumer
DC 1 DC 2
33
• Same region: essentially same network
• asymmetric partitioning is rare, low latency
• Need at least 3 DCs for Zookeeper
• Reserved instance to reduce churns
• EIP for external clients, private IPs for internal communication
• Reserved instance, local storage
Example: EC2 multi-AZ Deployment
34
Take-aways
• Multi-DC: trade-off between latency and consistency
• Kafka: replicated log streams for multihoming
Thank you
Guozhang | guozhang@confluent.io | @guozhangwang
Meet Confluent in booth #838 

Confluent University ~ Kafka training ~ confluent.io/training
Join the Stream Data Hackathon Apr 25, SF

kafka-summit.org/hackathon/
Download Apache Kafka
& Confluent Platform
confluent.io/download

Más contenido relacionado

La actualidad más candente

Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaJiangjie Qin
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...DataWorks Summit/Hadoop Summit
 
Designing Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDesigning Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDatabricks
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large ScaleVerverica
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversScyllaDB
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafkaconfluent
 
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ... A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...HostedbyConfluent
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
Change Data Feed in Delta
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in DeltaDatabricks
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovAltinity Ltd
 
How to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata EnvironmentsHow to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata EnvironmentsSandesh Rao
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...confluent
 
Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path ForwardAlluxio, Inc.
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producerconfluent
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaDatabricks
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeDatabricks
 
How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!Databricks
 

La actualidad más candente (20)

Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
 
Designing Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDesigning Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things Right
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafka
 
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ... A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya ...
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Change Data Feed in Delta
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in Delta
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
How to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata EnvironmentsHow to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata Environments
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer...
 
Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path Forward
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producer
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
 
How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!How Adobe Does 2 Million Records Per Second Using Apache Spark!
How Adobe Does 2 Million Records Per Second Using Apache Spark!
 

Similar a Building Stream Infrastructure across Multiple Data Centers with Apache Kafka

Memory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationMemory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationBigstep
 
How is Kafka so Fast?
How is Kafka so Fast?How is Kafka so Fast?
How is Kafka so Fast?Ricardo Paiva
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHungWei Chiu
 
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Ontico
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Christopher Curtin
 
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Johnny Miller
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward
 
Open west 2015 talk ben coverston
Open west 2015 talk ben coverstonOpen west 2015 talk ben coverston
Open west 2015 talk ben coverstonbcoverston
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...confluent
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...confluent
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataacelyc1112009
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processingconfluent
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...ScyllaDB
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQLDon Demcsak
 
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...ScyllaDB
 
World of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaWorld of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaLevon Avakyan
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introductionkanedafromparis
 
As fast as a grid, as safe as a database
As fast as a grid, as safe as a databaseAs fast as a grid, as safe as a database
As fast as a grid, as safe as a databasegojkoadzic
 

Similar a Building Stream Infrastructure across Multiple Data Centers with Apache Kafka (20)

Memory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and VirtualizationMemory, Big Data, NoSQL and Virtualization
Memory, Big Data, NoSQL and Virtualization
 
How is Kafka so Fast?
How is Kafka so Fast?How is Kafka so Fast?
How is Kafka so Fast?
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User Group
 
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
 
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
 
Open west 2015 talk ben coverston
Open west 2015 talk ben coverstonOpen west 2015 talk ben coverston
Open west 2015 talk ben coverston
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsData
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processing
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
 
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQL
 
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
 
World of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaWorld of Tanks Experience of Using Kafka
World of Tanks Experience of Using Kafka
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
 
As fast as a grid, as safe as a database
As fast as a grid, as safe as a databaseAs fast as a grid, as safe as a database
As fast as a grid, as safe as a database
 

Más de Guozhang Wang

Consensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdfConsensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdfGuozhang Wang
 
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...Guozhang Wang
 
Exactly-Once Made Easy: Transactional Messaging Improvement for Usability and...
Exactly-Once Made Easy: Transactional Messaging Improvement for Usability and...Exactly-Once Made Easy: Transactional Messaging Improvement for Usability and...
Exactly-Once Made Easy: Transactional Messaging Improvement for Usability and...Guozhang Wang
 
Introduction to the Incremental Cooperative Protocol of Kafka
Introduction to the Incremental Cooperative Protocol of KafkaIntroduction to the Incremental Cooperative Protocol of Kafka
Introduction to the Incremental Cooperative Protocol of KafkaGuozhang Wang
 
Performance Analysis and Optimizations for Kafka Streams Applications
Performance Analysis and Optimizations for Kafka Streams ApplicationsPerformance Analysis and Optimizations for Kafka Streams Applications
Performance Analysis and Optimizations for Kafka Streams ApplicationsGuozhang Wang
 
Apache Kafka from 0.7 to 1.0, History and Lesson Learned
Apache Kafka from 0.7 to 1.0, History and Lesson LearnedApache Kafka from 0.7 to 1.0, History and Lesson Learned
Apache Kafka from 0.7 to 1.0, History and Lesson LearnedGuozhang Wang
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsGuozhang Wang
 
Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingGuozhang Wang
 
Building Realtim Data Pipelines with Kafka Connect and Spark Streaming
Building Realtim Data Pipelines with Kafka Connect and Spark StreamingBuilding Realtim Data Pipelines with Kafka Connect and Spark Streaming
Building Realtim Data Pipelines with Kafka Connect and Spark StreamingGuozhang Wang
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Building a Replicated Logging System with Apache Kafka
Building a Replicated Logging System with Apache KafkaBuilding a Replicated Logging System with Apache Kafka
Building a Replicated Logging System with Apache KafkaGuozhang Wang
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedInGuozhang Wang
 
Behavioral Simulations in MapReduce
Behavioral Simulations in MapReduceBehavioral Simulations in MapReduce
Behavioral Simulations in MapReduceGuozhang Wang
 
Automatic Scaling Iterative Computations
Automatic Scaling Iterative ComputationsAutomatic Scaling Iterative Computations
Automatic Scaling Iterative ComputationsGuozhang Wang
 

Más de Guozhang Wang (14)

Consensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdfConsensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdf
 
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
 
Exactly-Once Made Easy: Transactional Messaging Improvement for Usability and...
Exactly-Once Made Easy: Transactional Messaging Improvement for Usability and...Exactly-Once Made Easy: Transactional Messaging Improvement for Usability and...
Exactly-Once Made Easy: Transactional Messaging Improvement for Usability and...
 
Introduction to the Incremental Cooperative Protocol of Kafka
Introduction to the Incremental Cooperative Protocol of KafkaIntroduction to the Incremental Cooperative Protocol of Kafka
Introduction to the Incremental Cooperative Protocol of Kafka
 
Performance Analysis and Optimizations for Kafka Streams Applications
Performance Analysis and Optimizations for Kafka Streams ApplicationsPerformance Analysis and Optimizations for Kafka Streams Applications
Performance Analysis and Optimizations for Kafka Streams Applications
 
Apache Kafka from 0.7 to 1.0, History and Lesson Learned
Apache Kafka from 0.7 to 1.0, History and Lesson LearnedApache Kafka from 0.7 to 1.0, History and Lesson Learned
Apache Kafka from 0.7 to 1.0, History and Lesson Learned
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka Streams
 
Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream Processing
 
Building Realtim Data Pipelines with Kafka Connect and Spark Streaming
Building Realtim Data Pipelines with Kafka Connect and Spark StreamingBuilding Realtim Data Pipelines with Kafka Connect and Spark Streaming
Building Realtim Data Pipelines with Kafka Connect and Spark Streaming
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Building a Replicated Logging System with Apache Kafka
Building a Replicated Logging System with Apache KafkaBuilding a Replicated Logging System with Apache Kafka
Building a Replicated Logging System with Apache Kafka
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
Behavioral Simulations in MapReduce
Behavioral Simulations in MapReduceBehavioral Simulations in MapReduce
Behavioral Simulations in MapReduce
 
Automatic Scaling Iterative Computations
Automatic Scaling Iterative ComputationsAutomatic Scaling Iterative Computations
Automatic Scaling Iterative Computations
 

Último

DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxNadaHaitham1
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadhamedmustafa094
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Call Girls Mumbai
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesMayuraD1
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxSCMS School of Architecture
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdfKamal Acharya
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiessarkmank1
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesRAJNEESHKUMAR341697
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxmaisarahman1
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersMairaAshraf6
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueBhangaleSonal
 

Último (20)

DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptx
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal load
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planes
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 

Building Stream Infrastructure across Multiple Data Centers with Apache Kafka

  • 1. When One Data Center is not Enough Guozhang Wang Strata San Jose, 2016 Building large-scale stream infrastructure across multiple data centers with Apache Kafka
  • 2. 2 • Why across Data Centers? • Design patterns for Multi-DC • Kafka for Multi-DC • Conclusion Agenda
  • 3. 3 Why across Data Centers?
  • 4. 4 Why across Data Centers • Catastrophic / expected failures • Routine maintenance • Geo-locality (Example: CDNs)
  • 5. 5 Why NOT across Data Centers • Low bandwidth (10Mbps - 1Gbps) • High latency (50ms - 450ms) • Much More $$$
  • 6. 6 Why NOT across Data Centers • … is hard and expensive
  • 7. 7 Why NOT across Data Centers • … is hard and expensive • … with real-time writes? Harder
  • 8. 8 Why NOT across Data Centers • … is hard and expensive • … with real-time writes? Harder • … consistently? Oh My!
  • 9. 9 Consistency • Weak • Eventual • Strong Latency Guarantee
  • 10. 10 Weak No Consistency • Now you see my writes, now you don’t • Best effort only, data can be stale • Examples: think of “caches”, VoIP
  • 11. 11 Eventual Consistency • You will see my writes, … eventually • May need to resolve conflicts (manually) • Examples: think of “emails”, SMTP
  • 12. 12 Strong Consistency • You get what you write, for sure • External > Sequential > Causal (Session) • Examples: RDBMS, file systems
  • 13. 13 • LAN: consistency over latency • WAN: latency over consistency Latency vs. Consistency
  • 14. 14 • Why across Data Centers? • Design patterns for Multi-DC • Kafka for Multi-DC • Conclusion Agenda
  • 15. 15 Option I: Don’t do it • Bunkerize the single data center • Expect data loss at failures • Examples: ??
  • 16. 16 Option II: Primary with Hot Standby • Failover to hot standby (maybe inconsistent) • Window of data loss at failures • Examples: MySQL binlog
  • 17. 17 Option III: Active-Active • Accepts writes in multi-DC • Resolve conflicts (strong / week consistency) • Examples: Amazon DynamoDB (vector clock) Google Spanner (2PC), Mesa (Paxos)
  • 19. 19 Ordering is Key • Vector clocks: partial ordering • Paxos, 2PC: global ordering • Log shipping: logical ordering (per-partition)
  • 20.
  • 21. 21 Apache Kafka • A distributed messaging system ..that store messages as a log!
  • 22. 22 Store Messages as a Log 4 5 5 7 8 9 10 11 12... Producer Write Consumer1 Reads (offset 7) Consumer2 Reads (offset 10) Messages 3
  • 23. 23 Partition the Log across Machines Topic 1 Topic 2 Partitions Producers Producers Consumers Consumers Brokers
  • 24. 24 ACK mode Latency On Failures “no" no network delay some data loss “leader" 1 network roundtrip a few data loss “all" ~2 network roundtrips no data loss Configurable ISR Commits
  • 25. 25 • Why across Data Centers? • Design patterns for Multi-DC • Kafka for Multi-DC • Conclusion Agenda
  • 26. 26 Option I: Active-Passive Replication Kafka local producers consumer consumer DC 1 MirrorMaker DC 2 Kafka replica
  • 27. 27 Option I: Active-Passive Replication • Async- replication across DC • May lose data on failover • Example: ETL to data warehouse / HDFS Kafka local producers consumer consumer DC 1 MirrorMaker DC 2 Kafka replica
  • 28. 28 Option II: Active-Active Replication Kafka local Kafka aggregate Kafka aggregate producers producers consumer consumer MirrorMaker Kafka local on DC1 failure DC 1 DC 2
  • 29. 29 Option II: Active-Active Replication • Global view on agg. cluster • Require offsets to resume • Example: store materialization, index updates Kafka local Kafka agg Kafka agg producers producers consumer consumer MirrorMaker Kafka local on DC1 failure DC 1 DC 2
  • 30. 30 • Offsets not identical between Kafka clusters • Duplicates during failover • Partition selection may be different • Solutions • Resume from log end offset (suitable for real-time apps) • Resume from a timestamp (ListOffsets, offset index: KIP-33) Caveats: offsets across DCs
  • 31. 31 Option III: Deploy across DCs Kafka producers producers consumer consumer DC 1 DC 2
  • 32. 32 Option III: Deploy across DCs • Multi-tenancy support • Security (0.9) • Quota Management (0.9) • Latency optimization • Rack-aware partition assignment (0.10) • Read affinity (future?) Kafka producers producers consumer consumer DC 1 DC 2
  • 33. 33 • Same region: essentially same network • asymmetric partitioning is rare, low latency • Need at least 3 DCs for Zookeeper • Reserved instance to reduce churns • EIP for external clients, private IPs for internal communication • Reserved instance, local storage Example: EC2 multi-AZ Deployment
  • 34. 34 Take-aways • Multi-DC: trade-off between latency and consistency • Kafka: replicated log streams for multihoming
  • 35. Thank you Guozhang | guozhang@confluent.io | @guozhangwang Meet Confluent in booth #838 
 Confluent University ~ Kafka training ~ confluent.io/training Join the Stream Data Hackathon Apr 25, SF
 kafka-summit.org/hackathon/ Download Apache Kafka & Confluent Platform confluent.io/download