Scalable Complex Event Processing On
Samza @Uber
Shuyi Chen
Uber Technologies Inc.
● 6 continents, 70 countries, 400+ cities
● Transportation as reliable as running water, everywhere,
for everyone
Uber
Outline
● Motivation
● Architecture
● Limitations
● Challenges
Uber is a data-driven company
Thousands of Kafka topics from different services
We can extract a lot of useful information from this
rich set of logs in real-time!
Window aggregation: multiple logins from the same IP within a short
interval
Pattern detection: partner accepted a trip
→ partner calls rider through the Uber app
→ rider cancels the trip
Filter: partners reject the second pickup of an UberPOOL
trip
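The window aggregation case can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Siddhi implements windows: the class name `LoginWindow`, the 60-second window, and the threshold of 3 are all hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque


class LoginWindow:
    """Counts logins per IP inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.logins = defaultdict(deque)  # ip -> timestamps still inside the window

    def on_login(self, ip, timestamp):
        """Record a login; return True when the IP has reached the threshold."""
        times = self.logins[ip]
        times.append(timestamp)
        # Evict timestamps that have fallen out of the window.
        while times and times[0] <= timestamp - self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold


window = LoginWindow(window_seconds=60, threshold=3)
alerts = [window.on_login("10.0.0.1", t) for t in (0, 10, 20, 200)]
print(alerts)  # [False, False, True, False]
```

The third login trips the alert; the login at t=200 does not, because the earlier ones have left the window by then.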
Can we use declarative semantics to specify this
stream processing logic?
Complex event processing
● Combines data from multiple sources to infer events or patterns that suggest
more complicated circumstances
● CEP is used across many industries for various use cases, including:
○ Finance: Trade analysis, fraud detection
○ Airlines: Operations monitoring
○ Healthcare: Claims processing, patient monitoring
○ Energy and Telecommunications: Outage detection
● CEP uses a declarative rule/query language to specify event processing logic
Siddhi: Complex event processing engine
● Lightweight, extensible, open source, released as a Java library
● Features supported
○ Filter
○ Join
○ Aggregation
○ Group by
○ Window
○ Pattern processing
○ Sequence processing
○ Event tables
○ Event-time processing
○ Declarative query language: SiddhiQL
How Siddhi works
● Specify processing logic declaratively with SiddhiQL
● The query is parsed at runtime into an execution plan runtime
● As events flow in, the execution plan runtime processes them inside the CEP
engine according to the query logic
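To make the "declarative spec in, executable plan out" idea concrete, here is a minimal Python sketch. It does not use Siddhi or SiddhiQL: `compile_sequence` and the event type names are hypothetical, and the matching is a strict sequence per key, which is a deliberate simplification of what a real CEP engine offers.

```python
def compile_sequence(steps):
    """Turn a declarative list of event types into a per-key matcher."""
    progress = {}  # key -> index of the next expected step

    def process(key, event_type):
        """Feed one event; return True when the full sequence has been seen."""
        i = progress.get(key, 0)
        if event_type == steps[i]:
            i += 1
        elif event_type == steps[0]:
            i = 1  # the pattern restarts from its first step
        else:
            i = 0  # anything else breaks the sequence
        if i == len(steps):
            progress.pop(key, None)
            return True
        progress[key] = i
        return False

    return process


# The rule is plain data; it is "compiled" when the job loads it.
rule = ["trip_accepted", "partner_called_rider", "rider_cancelled"]
matcher = compile_sequence(rule)

trip_1 = [matcher("trip-1", e) for e in rule]
trip_2 = [matcher("trip-2", e) for e in ("trip_accepted", "rider_cancelled")]
print(trip_1, trip_2)  # [False, False, True] [False, False]
```

Because the rule is data rather than code, it can be stored, edited, and reloaded without rebuilding the job that runs it.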
How can we make it scalable at Uber scale?
Samza
● A distributed stream processing framework
○ Scalable
○ Built-in state management
○ Built-in fault tolerance
○ At-least-once message processing
● Good support from our data infra team
How can we make the stream processing output
useful?
Actions
● Generalize a set of common action templates to make it easy for services and
humans to harness the power of real-time stream processing
● Currently we support
○ Make an RPC call
○ Invoke a Webhook endpoint
○ Index to Elasticsearch
○ Write to Cassandra
○ Kafka
○ StatsD
○ Chat service
○ Email
○ Push notification
Real-time Scalable Complex Event Processing
Outline
● Motivation
● Architecture
● Limitations
● Challenges
Preprocessor
● Enrich raw Kafka events with business information
Shuffler
● Re-shuffle events
● Prefiltering for predicate pushdown
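A rough Python sketch of what a shuffle stage with prefiltering does: drop events no downstream query needs, then route the rest so that all events with the same key land in the same partition. The function name `shuffle`, the `rider` key field, and the use of CRC32 for hashing are illustrative assumptions, not details from the talk.

```python
import zlib


def shuffle(events, key_field, num_partitions, prefilter=None):
    """Drop events the prefilter rejects, then route the rest by key."""
    partitions = [[] for _ in range(num_partitions)]
    for event in events:
        if prefilter is not None and not prefilter(event):
            continue  # predicate pushdown: filtered out before the shuffle
        key = str(event[key_field]).encode("utf-8")
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        partitions[zlib.crc32(key) % num_partitions].append(event)
    return partitions


events = [
    {"rider": "r1", "type": "login"},
    {"rider": "r2", "type": "trip"},
    {"rider": "r1", "type": "trip"},
    {"rider": "r1", "type": "trip"},
]
parts = shuffle(events, "rider", 4, prefilter=lambda e: e["type"] == "trip")
print([len(p) for p in parts])
```

Filtering before the shuffle means the rejected events never reach the intermediate topic, which is the point of pushing the predicate down.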
Complex event processor
● Parse Siddhi queries into an execution plan runtime
● Process events in the Siddhi execution plan runtime
● Checkpoint state regularly using RocksDB to ensure recovery after a
crash or restart
Action processor
● Execute actions upon the complex event output
● Support various kinds of actions for easy integration
● Implement a configurable, finite action retry mechanism using RocksDB
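One way to picture a finite retry mechanism backed by a local store is the sketch below. A plain dict stands in for RocksDB, and `run_action`, `max_attempts`, and the three outcome strings are hypothetical names chosen for the example.

```python
def run_action(action, event_id, payload, store, max_attempts=3):
    """Try an action once; return 'done', 'retry', or 'dropped'."""
    attempts = store.get(event_id, 0) + 1
    try:
        action(payload)
    except Exception:
        if attempts >= max_attempts:
            store.pop(event_id, None)  # give up: the retry budget is spent
            return "dropped"
        store[event_id] = attempts  # persist the count so it survives a restart
        return "retry"
    store.pop(event_id, None)
    return "done"


def always_fails(payload):
    raise ConnectionError("endpoint unavailable")


store = {}  # stand-in for a persistent key-value store
outcomes = [run_action(always_fails, "evt-1", {"x": 1}, store) for _ in range(3)]
print(outcomes)  # ['retry', 'retry', 'dropped']
```

Keeping the attempt count in the store rather than in memory is what makes the retry limit hold across restarts.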
No stream processing logic is hard-coded in the data
pipeline
REST API backend
● All queries, actions, shuffling logic and prefiltering logic are stored
externally in Cassandra
● RESTful API for CRUD operations
● The data pipeline automatically reloads the data upon update, without a job restart
○ Fast data exploration
○ Real-time feedback loop
○ Incremental DAG construction
● Decouples processing logic from the data pipeline
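A minimal sketch of reloading without a restart, assuming each running task receives create/update/delete messages for queries and applies them to an in-memory copy. The message shape (`op`, `rule_id`, `query`) and the class `RuleRegistry` are hypothetical.

```python
class RuleRegistry:
    """In-memory copy of the queries a running task should execute."""

    def __init__(self):
        self.rules = {}  # rule_id -> query text

    def on_update(self, update):
        """Apply one create/update/delete message without restarting."""
        if update["op"] == "delete":
            self.rules.pop(update["rule_id"], None)
        else:  # "create" and "update" are both upserts
            self.rules[update["rule_id"]] = update["query"]


registry = RuleRegistry()
registry.on_update({"op": "create", "rule_id": "r1", "query": "filter A"})
registry.on_update({"op": "update", "rule_id": "r1", "query": "filter B"})
registry.on_update({"op": "create", "rule_id": "r2", "query": "window C"})
registry.on_update({"op": "delete", "rule_id": "r2"})
print(registry.rules)  # {'r1': 'filter B'}
```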
Unified management and monitoring
● Every use case
○ Shares the same data pipeline architecture
○ Uses queries and actions to describe its processing logic
● A single monitoring template can be reused across different use cases
Applications
● Real-time fraud detection
● Real-time anomaly detection
● Real-time marketing campaign
● Real-time promotion
● Real-time monitoring
● Real-time feedback system
● Real-time analytics
● Real-time visualizations
● And more
Outline
● Motivation
● Architecture
● Limitations
● Challenges
Not a general-purpose stream processing system
No dynamic topology
● The DAG is not dynamic
● Cannot shuffle an arbitrary number of times
● In principle, we could chain multiple copies of the data pipeline to build an
arbitrary DAG, but
○ A large DAG can be difficult to manage and monitor
○ Samza uses Kafka as the intermediate message queue between jobs, so wide DAGs put a
heavy load on Kafka
○ Out of the 40+ use cases we run in production, none requires it
Out-of-order event handling
● Not a big concern
○ Events of the same rider/partner are usually seconds apart
● K-slack extension in Siddhi for out-of-order event processing
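The general K-slack idea is to hold events back for up to K time units so that late arrivals can be put back in timestamp order before processing. The sketch below uses a fixed K and a heap; it illustrates the technique only and makes no claim about how the Siddhi extension is implemented. `KSlackBuffer` is a hypothetical name.

```python
import heapq


class KSlackBuffer:
    """Reorders events that arrive at most k time units late."""

    def __init__(self, k):
        self.k = k
        self.max_ts = float("-inf")  # largest timestamp seen so far
        self.heap = []
        self.seq = 0  # tie-breaker so events themselves are never compared

    def add(self, ts, event):
        """Buffer an event; return the events that are now safe to emit."""
        self.max_ts = max(self.max_ts, ts)
        heapq.heappush(self.heap, (ts, self.seq, event))
        self.seq += 1
        ready = []
        while self.heap and self.heap[0][0] <= self.max_ts - self.k:
            ready.append(heapq.heappop(self.heap)[2])
        return ready

    def flush(self):
        """Emit everything still buffered, in timestamp order."""
        return [heapq.heappop(self.heap)[2] for _ in range(len(self.heap))]


buf = KSlackBuffer(k=5)
arrivals = [(1, "a"), (4, "b"), (3, "c"), (9, "d"), (7, "e"), (15, "f")]
emitted = []
for ts, name in arrivals:
    emitted.extend(buf.add(ts, name))
emitted.extend(buf.flush())
print(emitted)  # ['a', 'c', 'b', 'e', 'd', 'f']
```

The trade-off is latency: every event waits until the stream has advanced K past it, and anything arriving more than K late is still emitted out of order.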
Job deployment
● Samza job creation is semi-automated
○ Auto-generate standard job properties
○ JVM memory tuning
○ Samza parameter tuning, e.g. container count
● Integrated with our in-house cluster job management system to simplify
start/restart/stop/upgrade of Samza jobs
Predicate pushdown
● Allows prefiltering of streams in the shuffle stage
● Needs manual configuration through the web UI
● In the future, we can automate this through query analysis
Outline
● Motivation
● Architecture
● Limitations
● Challenges
Broadcast stream
● We need a broadcast stream to propagate updates in the storage backend to the
data pipeline
● No broadcast stream in Samza 0.9.1
● Workaround: override SystemStreamPartitionGrouper
● Samza 0.10.0 added broadcast support (SAMZA-676)
Unbalanced task workload
● Shufflers ingest multiple topics with different partition counts
● Default task partition assignment does not scale
● Override SystemStreamPartitionGrouper to balance the partitions across all
tasks
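The imbalance and the fix can be illustrated in Python without any Samza code. The first function groups by partition id, so task i gets partition i of every topic (this is assumed here to be what the default assignment amounts to); the second spreads all topic partitions round-robin over a fixed number of tasks. Topic names and counts are made up.

```python
def group_by_partition_id(topics):
    """Task i receives partition i of every topic that has one."""
    tasks = {}
    for topic, count in topics.items():
        for p in range(count):
            tasks.setdefault(p, []).append((topic, p))
    return tasks


def group_balanced(topics, num_tasks):
    """Spread every (topic, partition) pair round-robin over the tasks."""
    tasks = {t: [] for t in range(num_tasks)}
    pairs = [(topic, p) for topic, count in sorted(topics.items()) for p in range(count)]
    for i, pair in enumerate(pairs):
        tasks[i % num_tasks].append(pair)
    return tasks


topics = {"trips": 8, "logins": 2, "payments": 2}  # topic -> partition count
skewed = sorted(len(v) for v in group_by_partition_id(topics).values())
balanced = sorted(len(v) for v in group_balanced(topics, 4).values())
print(skewed)    # [1, 1, 1, 1, 1, 1, 3, 3]
print(balanced)  # [3, 3, 3, 3]
```

With mixed partition counts, grouping by id leaves tasks 0 and 1 with three inputs each while the rest get one; the balanced grouping evens that out.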
Large checkpointing state
● Samza uses Kafka to log state changes
● Kafka limits message size to 1 MB by default
● Solution: we built logic to slice state into smaller pieces and checkpoint
them into RocksDB
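A minimal sketch of slicing a large serialized state into chunks that each stay under a size limit, with a dict standing in for the key-value store. The key layout (`<key>/<index>` plus `<key>/count`) and the function names are hypothetical, and the writes are not atomic in this sketch.

```python
def checkpoint(store, key, state, max_chunk):
    """Write state as numbered chunks plus a chunk count."""
    chunks = [state[i:i + max_chunk] for i in range(0, len(state), max_chunk)]
    old_count = store.get(f"{key}/count", 0)
    for i, chunk in enumerate(chunks):
        store[f"{key}/{i}"] = chunk
    for i in range(len(chunks), old_count):
        del store[f"{key}/{i}"]  # remove chunks left by a larger, older state
    store[f"{key}/count"] = len(chunks)


def restore(store, key):
    """Reassemble the state from its chunks."""
    count = store[f"{key}/count"]
    return b"".join(store[f"{key}/{i}"] for i in range(count))


store = {}  # stand-in for a persistent key-value store
checkpoint(store, "task-0", b"x" * 2500, max_chunk=1000)
first_count = store["task-0/count"]
checkpoint(store, "task-0", b"y" * 1200, max_chunk=1000)
restored = restore(store, "task-0")
print(first_count, store["task-0/count"], len(restored))  # 3 2 1200
```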
Synchronous checkpointing
● If state is large, time to checkpoint can be long
● Samza uses a single-threaded model, so it is unsafe to checkpoint asynchronously
● Ongoing work on multi-thread support in Samza (SAMZA-863)
Exactly-once state processing?
● Cannot commit state and offset atomically
● No exactly-once state processing
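Why non-atomic commits rule out exactly-once can be shown with a toy simulation. State and offset are written in two separate steps; if the job crashes between them, the message is replayed after restart and counted twice. The `run` function and the `durable` dict are hypothetical stand-ins for a job and its persisted state.

```python
def run(messages, durable, crash_after=None):
    """Process from the committed offset; optionally crash before committing."""
    for i in range(durable["offset"], len(messages)):
        durable["count"] += messages[i]  # step 1: state is persisted
        if crash_after == i:
            return  # crash: the offset for message i was never committed
        durable["offset"] = i + 1  # step 2: offset is committed


messages = [1, 1, 1]
durable = {"offset": 0, "count": 0}
run(messages, durable, crash_after=1)  # crashes between the two steps
run(messages, durable)                 # restart replays message 1
print(durable)  # {'offset': 3, 'count': 4}
```

Three messages were sent but the count ends at 4: that is at-least-once delivery showing up in the state.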
Debugging
● Need to inspect multiple logs to diagnose Samza job problems
○ Application master log
○ Multiple container logs
○ Log size is huge
○ Container logs are difficult to locate after job failure
● Sometimes a Samza job gets stuck at launch and no log can be found
○ YARN problem
○ Binary downloading problem
Upgrading Samza jobs
● Upgrading Samza jobs requires a full restart, which can take minutes due to
○ Offset checkpointing topic being too large → set retention to hours
○ Changelog topic being too large → set retention or enable compaction in Kafka, or use
host affinity (SAMZA-617)
● To minimize the interruption during upgrade, it would be nice to have
○ Rolling restart
○ Per container restart
Our solution: non-interrupted handoff
● For critical jobs, we use replication during upgrade
○ Start a shadow job
○ Upgrade shadow
○ Switch primary and shadow
○ Upgrade primary
○ Switch back
● Downside: requires 2x capacity during upgrade
Manage complicated DAG
● Samza uses Kafka as the message queue for intermediate processing output
○ This enables sharing of shuffler or preprocessor output among multiple downstream Samza
jobs
○ Increases resource efficiency
● This gradually results in a large and complicated DAG
○ Complicated dependencies between jobs
○ Jobs closer to the sources of the DAG become more and more critical
● In practice, we isolate DAGs by logical groups
Thank you
