Slide 7: It is tradeoffs all the way down
● Retention - disk size
● Throughput - network, CPU
● Producer performance - disk IO
● Consumer performance - CPU, memory
Just performance requirements!
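The retention/disk tradeoff above can be turned into a back-of-envelope capacity estimate: bytes written per second, times retention, times replication, plus headroom. A minimal sketch (the 1.3 overhead factor is an assumption for indexes and imbalance, not a Kafka constant):

```python
def required_disk_gb(write_mb_per_sec, retention_hours, replication_factor,
                     overhead=1.3):
    """Rough disk capacity estimate for a Kafka cluster.

    `overhead` is an assumed headroom factor for index files, open
    segments, and imperfect partition balancing - tune to taste.
    """
    mb_retained = write_mb_per_sec * 3600 * retention_hours
    return mb_retained * replication_factor * overhead / 1024  # GB

# e.g. 50 MB/s of writes, 7 days of retention, replication factor 3
print(round(required_disk_gb(50, 7 * 24, 3)))  # ~115 TB across the cluster
```

The same exercise works in reverse for the other bullets: measure the network and CPU cost of your target throughput, then treat all of them as plain performance requirements.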
Slide 9: Separate clusters for...
● DR
● Geographical distribution
● Writing vs Reading
● Real-time vs Batch
● Dev / Test
● High throughput
● Highly reliable
● Security
Slide 12: Kafka is built to scale horizontally
● Largest cluster: 200+ nodes
● Lots of work on improving the controller in 1.1 and 2.0
● Larger / more loaded brokers mean longer restarts and recovery.
● Larger brokers require tuning to take full advantage.
Slide 13: Disk recommendations depend on version
- Before 1.1: RAID 10 recommended
- 1.1 and up (JBOD support):
  - KIP-112 - the broker survives the loss of a single disk
  - KIP-113 - replicas can be assigned to specific disks
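With the 1.1+ JBOD support, a broker can be pointed at several independent data directories instead of a single RAID volume. A sketch of the relevant broker setting (paths are placeholders):

```properties
# server.properties - one log dir per physical disk (JBOD, Kafka 1.1+)
log.dirs=/data/disk1/kafka-logs,/data/disk2/kafka-logs,/data/disk3/kafka-logs
```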
Slide 20: The 25K question
1. Read Jun Rao's blog post on this topic
2. More partitions == more scale
3. More partitions == more throughput
4. More partitions != more speed
5. Controller improvements in 1.1 allow more partitions per broker
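The rule of thumb from Jun Rao's post can be sketched as: measure per-partition producer and consumer throughput, then size the partition count for your target rate. A minimal sketch (the example numbers are placeholders, not benchmarks):

```python
import math

def partition_count(target_mb_s, producer_mb_s_per_partition,
                    consumer_mb_s_per_partition):
    """Rule-of-thumb partition count: whichever side (produce or
    consume) needs more partitions to hit the target wins."""
    return math.ceil(max(target_mb_s / producer_mb_s_per_partition,
                         target_mb_s / consumer_mb_s_per_partition))

# e.g. target 200 MB/s; producers measured at 20 MB/s per partition,
# consumers at 25 MB/s per partition
print(partition_count(200, 20, 25))  # -> 10
```

Note this sizes for throughput only - it says nothing about per-message latency, which extra partitions do not improve (point 4 above).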
Slide 27: Not all clients are the same
1. Producers have very high throughput
2. Especially when tuned
3. EOS / ordering require a single writer per entity
4. Supporting many consumer groups is where Kafka shines
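The "especially when tuned" point usually comes down to batching and compression. A sketch of commonly tuned producer properties (the values are illustrative, not recommendations):

```properties
# producer.properties - batching and compression (illustrative values)
linger.ms=20
batch.size=131072
compression.type=lz4
acks=all
# idempotence is required for EOS / the single-writer-per-entity pattern
enable.idempotence=true
```

Larger batches and a small linger trade a little latency for much higher throughput per producer.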
Slide 29: Benchmarks
● Kafka ships with performance tools
● And your favorite language's client has tools too
● Your own workload (or similar)
● Your own configuration
● Your own failure scenarios
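The perf tools that ship with Kafka let you plug in your own record sizes and producer configuration. An invocation sketch (topic name and broker address are placeholders):

```shell
# 1M records of 1 KB each, unthrottled, with your own producer config
bin/kafka-producer-perf-test.sh \
  --topic perf-test \
  --num-records 1000000 \
  --record-size 1024 \
  --throughput -1 \
  --producer-props bootstrap.servers=broker1:9092 acks=all
```

Swap in record sizes and producer properties that match your real workload - a benchmark with default settings tells you about the defaults, not about your system.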
Slide 30: Tuning
● Don’t fly blind
● Why is it slow?
● Where is the bottleneck?
● Version control for all configuration
● Automate the “change->test->observe” loop
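The “change->test->observe” loop can be sketched as a small harness. `run_benchmark` here is a hypothetical stand-in for whatever perf tool you drive; its fake response surface exists only to make the sketch runnable:

```python
def run_benchmark(config):
    """Hypothetical stand-in: apply `config`, drive load, return
    observed throughput in MB/s. A real version would render the
    config, redeploy, run the perf tool, and scrape metrics."""
    return 100 + config["linger.ms"] * 0.5  # fake response surface

def tune(candidate_linger_ms):
    """change -> test -> observe: try each candidate, keep the best."""
    results = {}
    for linger_ms in candidate_linger_ms:
        config = {"linger.ms": linger_ms}   # change
        results[linger_ms] = run_benchmark(config)  # test + observe
    best = max(results, key=results.get)
    return best, results

best, results = tune([0, 5, 20, 100])
print(best)
```

Keeping every candidate config in version control (the previous bullet) is what makes a loop like this reproducible rather than folklore.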
Slide 31: My broker is slow 101
● Are all brokers working?
● Did you saturate network capacity?
● Is CPU utilization high?
● Are you running an old version?
● Do you have HUGE messages?
● Is it really the broker?