SlideShare una empresa de Scribd logo
1 de 48
Descargar para leer sin conexión
HIGH PERFORMANCE MESSAGING
WITH APACHE PULSAR
http://pulsar.apache.org
WHAT IS PULSAR?
“Pub-Sub messaging backed by durable log storage”
WHAT IS APACHE PULSAR?
3
Multi-tenancy
A single cluster can
support many tenants
and use cases
Ordering
Guaranteed ordering
Durability
Data replicated and
synced to disk
Delivery Guarantees
At least once, at most
once and effectively
once
Highly scalable
Can support millions
of topics
Unified messaging model
Support both Topic &
Queue semantic in a
single model
Geo-replication
Out of box support for
geographically distributed
applications
High throughput
Can reach 1.8 M
messages/s in a single
partition
Low Latency
Low publish latency of
5ms at 99pct
MESSAGING MODEL
DEFINING PERFORMANCE
DISTRIBUTEDVSVERTICAL
• Focus is very different
• While both target overall system performance
• Distributed systems generally focus on distributed perf
6
DISTRIBUTED SYSTEMS PERFORMANCE
• Key factors:
• How different components interact
• How data is replicated
• Can each component make progress while waiting for other
7
STATEFUL SYSTEMS
• Macro optimizations typically yield orders of magnitude differences
• Ensure throughput is not bottlenecked by waiting
• In failure path (eg: how to replace a failed node)
• In ops tasks (eg: how to expand a cluster)
8
STATEFUL SYSTEMS
• Stateful systems can become unbalanced when traffic changes
• The system needs to be designed to allow for quick reaction,
distributing the load across all nodes
9
VERTICAL OPTIMIZATIONS
• Ensure a single machine can give the max throughput
• Optimize thread access
• Concurrent data structures
• Micro-Profiling
10
ARCHITECTURALVIEW
SEGMENT
CENTRIC
STORAGE
• In addition to partitioning,
messages are stored in segments
(based on time and size)
• Segments are independent from
each others and spread across
all storage nodes
SEGMENT CENTRIC
• Unbounded log storage
• Instant scaling without data rebalancing
• High write and read availability via maximized data placement options
• Fast replica repair — many-to-many read
13
SEGMENTSVS PARTITIONS
COMPARISON WITH APACHE KAFKA
• In Kafka, partitions are sticky to brokers
• A single partition is stored entirely in a single node
• Retention is limited by a single node storage capacity
• Failure recovery and capacity expansion require “rebalancing”
• Rebalancing has a big impact over the system, affecting regular traffic
15
DATA PATH
1 — Publisher sends message to broker
DATA PATH
2 — Broker writes in parallel to N replicas
DATA PATH
3 — Wait for a quorum of acks from bookies
DATA PATH
4 — Send ack to producer — Dispatch to consumer
BOOKKEEPER REPLICATION MODEL
• Single writer (Pulsar broker)
• Write in parallel to multiple storage nodes
• Wait for a configurable number of acks
• Supports quorum writes (eg: write 3 nodes — wait 2 acks)
• Perform recovery only after writer crashes
• Establish what was the last committed entry and “seal” the segment
20
KAFKA REPLICATION MODEL
ISR replication
LIMITATIONS OF KAFKA REPLICATION
• When followers are “in-sync”, the leader will have to wait for them —
cannot prune the slowest follower
• Leader election happens per partition
• To ensure ordering, only 1 message (or batch) can be outstanding —
limits throughput
• Reads can only happen from leader broker
22
STORAGE
STORAGE
• Disk access patterns can lead to order of magnitude differences
• Systems that rely on page-cache have unpredictable performance
• Page cache is RAM speed until the system is under stress
• After that, memory accesses can take 100s of millis
24
BOOKKEEPER INTERNAL
BOOKKEEPER STORAGE
• IO isolation between write and read operations
• Slow consumers won’t impact latency
• Very effective IO patterns:
• Journal — append only and no reads
• Storage device — bulk write and sequential reads
• Number of files is independent from number of topics
26
KAFKA STORAGE
• Multiple files per each partitions — each segment has data
• Lot of file descriptors needed
• Page cache only works well until active data set exceed RAM size
• IO is scattered throughout the disk
27
OPTIMIZATIONS
OPTIMIZATIONS
• Payload Buffer pooling — Direct memory — No heap pollution
• Object pooling in data path — minimize GC work
• Serialize operations to thread to avoid mutex contention
• Pulsar brokers acts as a “proxy” — Payloads are forwarded with
zero-copies from producers to storage and consumers
29
BENCHMARK
OPENMESSAGING
BENCHMARK
openmessaging.cloud
openmessaging.cloud/docs/benchmarks
BENCHMARK FRAMEWORK
• Designed to measure performance of distributed messaging systems
• Supports various “drivers” (Kafka, Pulsar, RocketMQ, RabbitMQ)
• Automated deployment in EC2
• Configure workloads through aYAML file
32
DISTRIBUTED EXECUTION
Coordinator will take the workload definition and propagate to multiple
workers — Collects and reports stats
BENCHMARK RESULTS
• Testing goals
• Throughput & latency under different conditions
• Min 2 guaranteed copies
• Running on 3 EC2VMs with local SSDs
34
KAFKA SETTINGS
• Topic settings
replicationFactor=3
min.insync.replicas=2
log.flush.interval.ms= # Using default: means no fsyncs
• Kafka producer config
acks=all
linger.ms=1
batch.size=131072
35
PULSAR / BOOKKEEPER SETTINGS
• Use ensemble=3 write=3 ack=2
• Write to 3 bookies and wait for 2 acks
• Data synced on disk before ack
36
MaxThroughput
1Topic
1 Partition
1KB payload
MaxThroughput
—
Exactly once
producer
1Topic
1 Partition
1KB payload
—
Kafka settings:

enable.idempotence=true
max.in.flight.requests.per.connection=1
retries=2147483647
Latency at fixed
throughput
50K msg/s
1Topic
1 Partition
1KB payload
Latency at fixed
throughput
—
99pct
50K msg/s
1Topic
1 Partition
1KB payload
Latency at fixed
throughput
—
(including Kafka-sync)
50K msg/s
1Topic
1 Partition
1KB payload
Latency at fixed
throughput
—
99pct
50K msg/s
1Topic
1 Partition
1KB payload
OPTIMIZING FOR LOW LATENCY
• Testing at a smaller throughput for sub-millisecond latency
• Tested on bare metal server
• Single machine to isolate impact of slow networks
43
HARDWARE SETUP
• 1 Machine — Bare metal
• 12 CPU cores — Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz
• 128 GB RAM
• 2 x 1.2TB NVMe disks
44
Latency at low
throughput
1K msg/s
1Topic
1 Partition
1KB payload
Latency at low
throughput
1K msg/s
1Topic
1 Partition
1KB payload
Latency at low
throughput
1K msg/s
1Topic
1 Partition
1KB payload
QUESTIONS ?

Más contenido relacionado

La actualidad más candente

October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...Yahoo Developer Network
 
Apache Pulsar Seattle - Meetup
Apache Pulsar Seattle - MeetupApache Pulsar Seattle - Meetup
Apache Pulsar Seattle - MeetupKarthik Ramasamy
 
Messaging, storage, or both? The real time story of Pulsar and Apache Distri...
Messaging, storage, or both?  The real time story of Pulsar and Apache Distri...Messaging, storage, or both?  The real time story of Pulsar and Apache Distri...
Messaging, storage, or both? The real time story of Pulsar and Apache Distri...Streamlio
 
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemStreamNative
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafkaemreakis
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overviewiamtodor
 
Apache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceApache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceSijie Guo
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...StreamNative
 
Apache Bookkeeper and Apache Zookeeper for Apache Pulsar
Apache Bookkeeper and Apache Zookeeper for Apache PulsarApache Bookkeeper and Apache Zookeeper for Apache Pulsar
Apache Bookkeeper and Apache Zookeeper for Apache PulsarEnrico Olivelli
 
Devoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaDevoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaLászló-Róbert Albert
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Christopher Curtin
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsKetan Gote
 
Apache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! JapanApache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! JapanStreamNative
 
How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...JinfengHuang3
 
Scaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsarScaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsarStreamNative
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache KafkaJoe Stein
 

La actualidad más candente (20)

October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
 
Apache Pulsar Seattle - Meetup
Apache Pulsar Seattle - MeetupApache Pulsar Seattle - Meetup
Apache Pulsar Seattle - Meetup
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
kafka
kafkakafka
kafka
 
Messaging, storage, or both? The real time story of Pulsar and Apache Distri...
Messaging, storage, or both?  The real time story of Pulsar and Apache Distri...Messaging, storage, or both?  The real time story of Pulsar and Apache Distri...
Messaging, storage, or both? The real time story of Pulsar and Apache Distri...
 
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data Ecosystem
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overview
 
Apache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceApache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage Service
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
 
Apache Bookkeeper and Apache Zookeeper for Apache Pulsar
Apache Bookkeeper and Apache Zookeeper for Apache PulsarApache Bookkeeper and Apache Zookeeper for Apache Pulsar
Apache Bookkeeper and Apache Zookeeper for Apache Pulsar
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Devoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaDevoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with Kafka
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka Streams
 
Apache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! JapanApache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! Japan
 
How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...
 
Scaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsarScaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsar
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 

Similar a High performance messaging with Apache Pulsar

Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018Otávio Carvalho
 
Kafka in action - Tech Talk - Paytm
Kafka in action - Tech Talk - PaytmKafka in action - Tech Talk - Paytm
Kafka in action - Tech Talk - PaytmSumit Jain
 
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache PulsarUnifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache PulsarKarthik Ramasamy
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPCMax Alexejev
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka IntroductionAmita Mirajkar
 
Apache Kafka - Free Friday
Apache Kafka - Free FridayApache Kafka - Free Friday
Apache Kafka - Free FridayOtávio Carvalho
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaAngelo Cesaro
 
World of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaWorld of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaLevon Avakyan
 
Building zero data loss pipelines with apache kafka
Building zero data loss pipelines with apache kafkaBuilding zero data loss pipelines with apache kafka
Building zero data loss pipelines with apache kafkaAvinash Ramineni
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache KafkaAmir Sedighi
 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...Lucas Jellema
 
Gfs google-file-system-13331
Gfs google-file-system-13331Gfs google-file-system-13331
Gfs google-file-system-13331Fengchang Xie
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
Multicore processor.pdf
Multicore processor.pdfMulticore processor.pdf
Multicore processor.pdfrajaratna4
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache KuduAndriy Zabavskyy
 
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBMAvailability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBMHostedbyConfluent
 
Distributed messaging with Apache Kafka
Distributed messaging with Apache KafkaDistributed messaging with Apache Kafka
Distributed messaging with Apache KafkaSaumitra Srivastav
 
Real-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormReal-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormJohn Georgiadis
 

Similar a High performance messaging with Apache Pulsar (20)

Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018
 
Kafka in action - Tech Talk - Paytm
Kafka in action - Tech Talk - PaytmKafka in action - Tech Talk - Paytm
Kafka in action - Tech Talk - Paytm
 
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache PulsarUnifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
Unifying Messaging, Queueing & Light Weight Compute Using Apache Pulsar
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPC
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Apache Kafka - Free Friday
Apache Kafka - Free FridayApache Kafka - Free Friday
Apache Kafka - Free Friday
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
World of Tanks Experience of Using Kafka
World of Tanks Experience of Using KafkaWorld of Tanks Experience of Using Kafka
World of Tanks Experience of Using Kafka
 
Building zero data loss pipelines with apache kafka
Building zero data loss pipelines with apache kafkaBuilding zero data loss pipelines with apache kafka
Building zero data loss pipelines with apache kafka
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache Kafka
 
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
AMIS SIG - Introducing Apache Kafka - Scalable, reliable Event Bus & Message ...
 
BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
 
Gfs google-file-system-13331
Gfs google-file-system-13331Gfs google-file-system-13331
Gfs google-file-system-13331
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Multicore processor.pdf
Multicore processor.pdfMulticore processor.pdf
Multicore processor.pdf
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
kafka
kafkakafka
kafka
 
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBMAvailability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
Availability of Kafka - Beyond the Brokers | Andrew Borley and Emma Humber, IBM
 
Distributed messaging with Apache Kafka
Distributed messaging with Apache KafkaDistributed messaging with Apache Kafka
Distributed messaging with Apache Kafka
 
Real-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormReal-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and Storm
 

Último

notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptNANDHAKUMARA10
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfRagavanV2
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...tanu pandey
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...soginsider
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projectssmsksolar
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 

Último (20)

notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 

High performance messaging with Apache Pulsar

  • 1. HIGH PERFORMANCE MESSAGING WITH APACHE PULSAR http://pulsar.apache.org
  • 2. WHAT IS PULSAR? “Pub-Sub messaging backed by durable log storage”
  • 3. WHAT IS APACHE PULSAR? 3 Multi-tenancy A single cluster can support many tenants and use cases Ordering Guaranteed ordering Durability Data replicated and synced to disk Delivery Guarantees At least once, at most once and effectively once Highly scalable Can support millions of topics Unified messaging model Support both Topic & Queue semantic in a single model Geo-replication Out of box support for geographically distributed applications High throughput Can reach 1.8 M messages/s in a single partition Low Latency Low publish latency of 5ms at 99pct
  • 6. DISTRIBUTEDVSVERTICAL • Focus is very different • While both target overall system performance • Distributed systems generally focus on distributed perf 6
  • 7. DISTRIBUTED SYSTEMS PERFORMANCE • Key factors: • How different components interact • How data is replicated • Can each component make progress while waiting for other 7
  • 8. STATEFUL SYSTEMS • Macro optimizations typically yield orders of magnitude differences • Ensure throughput is not bottlenecked by waiting • In failure path (eg: how to replace a failed node) • In ops tasks (eg: how to expand a cluster) 8
  • 9. STATEFUL SYSTEMS • Stateful systems can become unbalanced when traffic changes • The system needs to be designed to allow for quick reaction, distributing the load across all nodes 9
  • 10. VERTICAL OPTIMIZATIONS • Ensure a single machine can give the max throughput • Optimize thread access • Concurrent data structures • Micro-Profiling 10
  • 12. SEGMENT CENTRIC STORAGE • In addition to partitioning, messages are stored in segments (based on time and size) • Segments are independent from each others and spread across all storage nodes
  • 13. SEGMENT CENTRIC • Unbounded log storage • Instant scaling without data rebalancing • High write and read availability via maximized data placement options • Fast replica repair — many-to-many read 13
  • 15. COMPARISON WITH APACHE KAFKA • In Kafka, partitions are sticky to brokers • A single partition is stored entirely in a single node • Retention is limited by a single node storage capacity • Failure recovery and capacity expansion require “rebalancing” • Rebalancing has a big impact over the system, affecting regular traffic 15
  • 16. DATA PATH 1 — Publisher sends message to broker
  • 17. DATA PATH 2 — Broker writes in parallel to N replicas
  • 18. DATA PATH 3 — Wait for a quorum of acks from bookies
  • 19. DATA PATH 4 — Send ack to producer — Dispatch to consumer
  • 20. BOOKKEEPER REPLICATION MODEL • Single writer (Pulsar broker) • Write in parallel to multiple storage nodes • Wait for a configurable number of acks • Supports quorum writes (eg: write 3 nodes — wait 2 acks) • Perform recovery only after writer crashes • Establish what was the last committed entry and “seal” the segment 20
  • 22. LIMITATIONS OF KAFKA REPLICATION • When followers are “in-sync”, the leader will have to wait for them — cannot prune the slowest follower • Leader election happens per partition • To ensure ordering, only 1 message (or batch) can be outstanding — limits throughput • Reads can only happen from leader broker 22
  • 24. STORAGE • Disk access patterns can lead to order of magnitude differences • Systems that rely on page-cache have unpredictable performance • Page cache is RAM speed until the system is under stress • After that, memory accesses can take 100s of millis 24
  • 26. BOOKKEEPER STORAGE • IO isolation between write and read operations • Slow consumers won’t impact latency • Very effective IO patterns: • Journal — append only and no reads • Storage device — bulk write and sequential reads • Number of files is independent from number of topics 26
  • 27. KAFKA STORAGE • Multiple files per each partitions — each segment has data • Lot of file descriptors needed • Page cache only works well until active data set exceed RAM size • IO is scattered throughout the disk 27
  • 29. OPTIMIZATIONS • Payload Buffer pooling — Direct memory — No heap pollution • Object pooling in data path — minimize GC work • Serialize operations to thread to avoid mutex contention • Pulsar brokers acts as a “proxy” — Payloads are forwarded with zero-copies from producers to storage and consumers 29
  • 32. BENCHMARK FRAMEWORK • Designed to measure performance of distributed messaging systems • Supports various “drivers” (Kafka, Pulsar, RocketMQ, RabbitMQ) • Automated deployment in EC2 • Configure workloads through aYAML file 32
  • 33. DISTRIBUTED EXECUTION Coordinator will take the workload definition and propagate to multiple workers — Collects and reports stats
  • 34. BENCHMARK RESULTS • Testing goals • Throughput & latency under different conditions • Min 2 guaranteed copies • Running on 3 EC2VMs with local SSDs 34
  • 35. KAFKA SETTINGS • Topic settings replicationFactor=3 min.insync.replicas=2 log.flush.interval.ms= # Using default: means no fsyncs • Kafka producer config acks=all linger.ms=1 batch.size=131072 35
  • 36. PULSAR / BOOKKEEPER SETTINGS • Use ensemble=3 write=3 ack=2 • Write to 3 bookies and wait for 2 acks • Data synced on disk before ack 36
  • 38. MaxThroughput — Exactly once producer 1Topic 1 Partition 1KB payload — Kafka settings:
 enable.idempotence=true max.in.flight.requests.per.connection=1 retries=2147483647
  • 39. Latency at fixed throughput 50K msg/s 1Topic 1 Partition 1KB payload
  • 40. Latency at fixed throughput — 99pct 50K msg/s 1Topic 1 Partition 1KB payload
  • 41. Latency at fixed throughput — (including Kafka-sync) 50K msg/s 1Topic 1 Partition 1KB payload
  • 42. Latency at fixed throughput — 99pct 50K msg/s 1Topic 1 Partition 1KB payload
  • 43. OPTIMIZING FOR LOW LATENCY • Testing at a smaller throughput for sub-millisecond latency • Tested on bare metal server • Single machine to isolate impact of slow networks 43
  • 44. HARDWARE SETUP • 1 Machine — Bare metal • 12 CPU cores — Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz • 128 GB RAM • 2 x 1.2TB NVMe disks 44
  • 45. Latency at low throughput 1K msg/s 1Topic 1 Partition 1KB payload
  • 46. Latency at low throughput 1K msg/s 1Topic 1 Partition 1KB payload
  • 47. Latency at low throughput 1K msg/s 1Topic 1 Partition 1KB payload