SlideShare una empresa de Scribd logo
1 de 26
Solving DEBS Grand
Challenge with WSO2
CEP
Srinath Perera, Suhothayan Sriskandarajah,
Mohanadarshan Vivekanandalingam,
Paul Fremantle, Sanjiva Weerawarana
WSO2 Inc.
Outline
• Grand
Challenge
• CEP
• Basic Solution
• Optimizations
• Beyond Grand
Challenge
Photo by John Trainoron Flickr
http://www.flickr.com/photos/trainor/2902023575/, Licensed under CC
Grand Challenge Problem
• Smart Home electricity data:
power and load
• 90 days, 40 houses, 120GB,
2125 sensors
• Event frequency per plug
– load values: once 2s.
– Work values: once 10-50s.
Value sent through when the
difference is more than 1Wh
• Queries
– Load prediction
– Outlier detection
Complex Event Processing
WSO2 CEP Operators
• Filters or transformations (process a single event)
– From MeterReadings[load>10] select .. insert into ..
• Windows + aggregation (track window of events: time,
length)
– from MeterReadings#window.time(30s) select avg(load) ..
• Joins (join two event streams to one)
– From MeterReadings#window.time(30s) as b join Cricket as c on ..
• Patterns (state machine implementation)
– From e1 = MeterReadings-> e2 = MeterReadings[load > e1.load + 10]
within 1h
select ..
• Event tables (join events against stored data)
– Define table ReadingSummary(v double) using .. db info ..
• WSO2 CEP Performance: (core i5 processor)
– 2-9 million events/sec on same JVM
– 300k events/sec over the network
How we did this?
• Basic version using CEP EQL
Queries
• Does it use machines fully?
(load average > 2X number
of cores)
• Find and fix bottlenecks
(Write extensions if
needed)
• Repeat until the deadline 
From http://seedtofeedme.blogspot.in/2013/09/watering-plants.html
Correcting Load Values
• We receive load values roughly once every 2 seconds from each
plug. Sensors send work values only when the difference is more
than 1Wh: this happens roughly every 10 to 50 seconds.
• Correcting function
– Simple linear fitting using last two work values and the calculate
load using that value
– Correct using the fact that work from t2 and t < 1kWh as we
have not seen the work value.
• Micro-benchmark with 100 million events.
– Errors < 16% of the load value 75% of the time.
– Error Percentiles 25%=1.9 50%=6.39 75%=15.41
Q1 Challenges
• CEP engine is single
threaded, hence cannot
use a multi-core machine
fully.
• Avoid calculating each
timeslot (1,5 .. 120m)
separately
• Batch windows. Median
calculations only keep <
129,600 events
– No memory problem
– Median calculated once for
30 sec, Min-Max Heap is
good enough
Query 1: Single Node Solution
Optimizations: Improve Parallelism
• Do we have enough
parallelism? Does load
average > 2X number of
cores
• Scale, parallelism => data
partitions
• Partition data and assign to
different CEP engines
– Partition by House (let us go
up to 40 threads)
• Not all houses would have
the same workload
– We use CEP engines and
threads independently and
use work stealing
Q2 Challenges
• Sliding window has to
remember each event in
the window to expire
them.
– (1h=3M events and
24h=74M events)
• Calculating mean over
window. Q2 is limited by
the Global median
• Per plug windows and the
global windows will keep
two copies of events
Query 2: Single Node Solution
Calculating Median
• MinMax Heap
• Buckets – track how
many fall on each bucket
• Buckets with variable
resolution
– Does not work when
median shift
• Reservoir Sampling
• Re-Median
Micro-benchmark: (1 million
values)
– Min-Max Heap 1000ms
– Reservoir Sampling 4ms
– Bucketing <1ms.
Disk backed Window
• Sliding window reads from one end and writes to
the other
– Write in batches
– Pre-fetch data in batches
– Micro-benchmark shows almost no latency addition
More Optimizations
• 1 Sec Slide
– Windows slide by 1second
– Rather than tracking all events in a window, we can
aggregate events in each second
– Then adjust the Buckets accordingly when they are
added or removed.
• Q2 only asks for how many plugs are greater or
less than the global median
– If you know the global median, then you can check
plug median < or > the global median without
calculating plug median.
Q1 Single Node Results
• 16 core, 24GB machine
• Fast (0.4M events/sec) with with 1s latency
• Stable with different number of houses
Q2 Single Node Results
• 16 core, 24GB machine
• Fast (3M to 1M events/sec) with < 100ms latency
• Sensitive to houses (number of events), because
that increases the number of events in windows
Query 1: Distributed Solution
• Embarrassingly parallel by house
• Partition by house
Distributed Communication
• WSO2 CEP uses a thrift based
binary protocol (can handle
variable size messages) that
can do 0.3M events/sec
• Too slow to scale up (only got
0.18M/sec with thrift)
• We did a custom fixed message size solution (using java
ByteBuffer) that can do up to 0.8M events/sec
• Then we got close 1M events/sec for Q1 distributed
• Also tried ZeroMQ, which was bit slower.
Query 2: Distributed Solution
Q1 Distributed Results
• Using Amazon c1.xlarge, 1 client to 2 CEP
nodes ratio
• Got close to 1M msg/sec 2X speedup
Beyond Grand Challenge
• Support for automatic parallelization on
partitions with WSO2 CEP
• Adding some custom functions as inbuilt
operators
• Describing distributed deployment via queries
and automating the deployment
• All new features (e.g. disk backed window,
median, scaling) released with WSO2 CEP
coming 2014 Q4.
Scaling CEP
• Think like MapReduce! ask user to define partitions:
parallel and non parallel parts of computations.
• Each node as Storm bolt, communication and HA via storm
WSO2 CEP
• WSO2 is an opensource
middleware company
– 350+ people
– Customers: Ebay, Boeing, Cisco …
– Funded by Intel Capital, Quest, Cisco
• WSO2 CEP is available opensource
under apache license from
https://github.com/wso2/siddhi
• We welcome contributions (both
code and ideas)!! and
collaborations
• If you want to build your research
on top of WSO2 CEP, let us know.
(architecture@wso2.org)
Conclusions
• Main Ideas
– Load correction via linear
prediction
– Partitioning data via house for
parallelism and distribution
– Did an study for median
calculation options
– Disk backed window with
paging
– Fixed message sized protocol
– About 400k single node
throughput and close to 1M
distributed throughput
• Try out WSO2 CEP/ Siddhi:
Open source under Apache
License.
Questions?

Más contenido relacionado

La actualidad más candente

ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...Srinath Perera
 
Introduction to WSO2 Data Analytics Platform
Introduction to  WSO2 Data Analytics PlatformIntroduction to  WSO2 Data Analytics Platform
Introduction to WSO2 Data Analytics PlatformSrinath Perera
 
WSO2 Big Data Platform and Applications
WSO2 Big Data Platform and ApplicationsWSO2 Big Data Platform and Applications
WSO2 Big Data Platform and ApplicationsSrinath Perera
 
Introduction to WSO2 Analytics Platform: 2016 Q2 Update
Introduction to WSO2 Analytics Platform: 2016 Q2 UpdateIntroduction to WSO2 Analytics Platform: 2016 Q2 Update
Introduction to WSO2 Analytics Platform: 2016 Q2 UpdateSrinath Perera
 
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeChris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeFlink Forward
 
Distributed Stream Processing - Spark Summit East 2017
Distributed Stream Processing - Spark Summit East 2017Distributed Stream Processing - Spark Summit East 2017
Distributed Stream Processing - Spark Summit East 2017Petr Zapletal
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLFlink Forward
 
Streaming computing: architectures, and tchnologies
Streaming computing: architectures, and tchnologiesStreaming computing: architectures, and tchnologies
Streaming computing: architectures, and tchnologiesNatalino Busa
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...Flink Forward
 
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)Julian Hyde
 
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Databricks
 
Apache Storm
Apache StormApache Storm
Apache StormEdureka!
 
Spark Summit - Stratio Streaming
Spark Summit - Stratio Streaming Spark Summit - Stratio Streaming
Spark Summit - Stratio Streaming Stratio
 
Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RRadek Maciaszek
 
Cloud-based Data Stream Processing
Cloud-based Data Stream ProcessingCloud-based Data Stream Processing
Cloud-based Data Stream ProcessingZbigniew Jerzak
 
Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache SparkCloudera, Inc.
 
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansRealtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansSpark Summit
 
Optimizing Terascale Machine Learning Pipelines with Keystone ML
Optimizing Terascale Machine Learning Pipelines with Keystone MLOptimizing Terascale Machine Learning Pipelines with Keystone ML
Optimizing Terascale Machine Learning Pipelines with Keystone MLSpark Summit
 
Apache Beam (incubating)
Apache Beam (incubating)Apache Beam (incubating)
Apache Beam (incubating)Apache Apex
 

La actualidad más candente (20)

ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
 
Introduction to WSO2 Data Analytics Platform
Introduction to  WSO2 Data Analytics PlatformIntroduction to  WSO2 Data Analytics Platform
Introduction to WSO2 Data Analytics Platform
 
WSO2 Big Data Platform and Applications
WSO2 Big Data Platform and ApplicationsWSO2 Big Data Platform and Applications
WSO2 Big Data Platform and Applications
 
Introduction to WSO2 Analytics Platform: 2016 Q2 Update
Introduction to WSO2 Analytics Platform: 2016 Q2 UpdateIntroduction to WSO2 Analytics Platform: 2016 Q2 Update
Introduction to WSO2 Analytics Platform: 2016 Q2 Update
 
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-timeChris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
Chris Hillman – Beyond Mapreduce Scientific Data Processing in Real-time
 
Distributed Stream Processing - Spark Summit East 2017
Distributed Stream Processing - Spark Summit East 2017Distributed Stream Processing - Spark Summit East 2017
Distributed Stream Processing - Spark Summit East 2017
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
 
Streaming computing: architectures, and tchnologies
Streaming computing: architectures, and tchnologiesStreaming computing: architectures, and tchnologies
Streaming computing: architectures, and tchnologies
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
 
So you think you can stream.pptx
So you think you can stream.pptxSo you think you can stream.pptx
So you think you can stream.pptx
 
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
Streaming SQL (at FlinkForward, Berlin, 2016/09/12)
 
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
Spark Summit - Stratio Streaming
Spark Summit - Stratio Streaming Spark Summit - Stratio Streaming
Spark Summit - Stratio Streaming
 
Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and R
 
Cloud-based Data Stream Processing
Cloud-based Data Stream ProcessingCloud-based Data Stream Processing
Cloud-based Data Stream Processing
 
Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache Spark
 
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansRealtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
 
Optimizing Terascale Machine Learning Pipelines with Keystone ML
Optimizing Terascale Machine Learning Pipelines with Keystone MLOptimizing Terascale Machine Learning Pipelines with Keystone ML
Optimizing Terascale Machine Learning Pipelines with Keystone ML
 
Apache Beam (incubating)
Apache Beam (incubating)Apache Beam (incubating)
Apache Beam (incubating)
 

Destacado

Analyzing a Soccer Game with WSO2 CEP
Analyzing a Soccer Game with WSO2 CEPAnalyzing a Soccer Game with WSO2 CEP
Analyzing a Soccer Game with WSO2 CEPSrinath Perera
 
Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand WSO2
 
Maximize Messaging and Performance and Lowering Infrastructure Footprint
Maximize Messaging and Performance and Lowering Infrastructure FootprintMaximize Messaging and Performance and Lowering Infrastructure Footprint
Maximize Messaging and Performance and Lowering Infrastructure FootprintWSO2
 
WSO2Con ASIA 2016: Catch Them in the Act: Fraud Detection with the WSO2 Analy...
WSO2Con ASIA 2016: Catch Them in the Act: Fraud Detection with the WSO2 Analy...WSO2Con ASIA 2016: Catch Them in the Act: Fraud Detection with the WSO2 Analy...
WSO2Con ASIA 2016: Catch Them in the Act: Fraud Detection with the WSO2 Analy...WSO2
 
Monitoring API Performance and Delivering a Scalable API Solution
Monitoring API Performance and Delivering a Scalable API SolutionMonitoring API Performance and Delivering a Scalable API Solution
Monitoring API Performance and Delivering a Scalable API SolutionWSO2
 
Runtime Governance with WSO2 Governance Registry integrated with WSO2 BAM and...
Runtime Governance with WSO2 Governance Registry integrated with WSO2 BAM and...Runtime Governance with WSO2 Governance Registry integrated with WSO2 BAM and...
Runtime Governance with WSO2 Governance Registry integrated with WSO2 BAM and...WSO2
 
Big Data in the Real World. Real-time Football Analytics
Big Data in the Real World. Real-time Football AnalyticsBig Data in the Real World. Real-time Football Analytics
Big Data in the Real World. Real-time Football AnalyticsWSO2
 
Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)Robert Evans
 
Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014P. Taylor Goetz
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationnathanmarz
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopDataWorks Summit
 
Data, Big Data and real time analytics for Connected Devices
Data, Big Data and real time analytics for Connected DevicesData, Big Data and real time analytics for Connected Devices
Data, Big Data and real time analytics for Connected DevicesSrinath Perera
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignMichael Noll
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureP. Taylor Goetz
 

Destacado (16)

Analyzing a Soccer Game with WSO2 CEP
Analyzing a Soccer Game with WSO2 CEPAnalyzing a Soccer Game with WSO2 CEP
Analyzing a Soccer Game with WSO2 CEP
 
Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand Right-size Deployment Instances to Meet Enterprise Demand
Right-size Deployment Instances to Meet Enterprise Demand
 
Maximize Messaging and Performance and Lowering Infrastructure Footprint
Maximize Messaging and Performance and Lowering Infrastructure FootprintMaximize Messaging and Performance and Lowering Infrastructure Footprint
Maximize Messaging and Performance and Lowering Infrastructure Footprint
 
WSO2Con ASIA 2016: Catch Them in the Act: Fraud Detection with the WSO2 Analy...
WSO2Con ASIA 2016: Catch Them in the Act: Fraud Detection with the WSO2 Analy...WSO2Con ASIA 2016: Catch Them in the Act: Fraud Detection with the WSO2 Analy...
WSO2Con ASIA 2016: Catch Them in the Act: Fraud Detection with the WSO2 Analy...
 
Monitoring API Performance and Delivering a Scalable API Solution
Monitoring API Performance and Delivering a Scalable API SolutionMonitoring API Performance and Delivering a Scalable API Solution
Monitoring API Performance and Delivering a Scalable API Solution
 
Runtime Governance with WSO2 Governance Registry integrated with WSO2 BAM and...
Runtime Governance with WSO2 Governance Registry integrated with WSO2 BAM and...Runtime Governance with WSO2 Governance Registry integrated with WSO2 BAM and...
Runtime Governance with WSO2 Governance Registry integrated with WSO2 BAM and...
 
Big Data in the Real World. Real-time Football Analytics
Big Data in the Real World. Real-time Football AnalyticsBig Data in the Real World. Real-time Football Analytics
Big Data in the Real World. Real-time Football Analytics
 
Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)Scaling Apache Storm (Hadoop Summit 2015)
Scaling Apache Storm (Hadoop Summit 2015)
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
Data, Big Data and real time analytics for Connected Devices
Data, Big Data and real time analytics for Connected DevicesData, Big Data and real time analytics for Connected Devices
Data, Big Data and real time analytics for Connected Devices
 
Yahoo compares Storm and Spark
Yahoo compares Storm and SparkYahoo compares Storm and Spark
Yahoo compares Storm and Spark
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm Architecture
 

Similar a Solving DEBS Grand Challenge with WSO2 CEP

The rice and fail of an IoT solution
The rice and fail of an IoT solutionThe rice and fail of an IoT solution
The rice and fail of an IoT solutionRadu Vunvulea
 
Introduction to Apache Apex - CoDS 2016
Introduction to Apache Apex - CoDS 2016Introduction to Apache Apex - CoDS 2016
Introduction to Apache Apex - CoDS 2016Bhupesh Chawda
 
Real-time Stream Processing using Apache Apex
Real-time Stream Processing using Apache ApexReal-time Stream Processing using Apache Apex
Real-time Stream Processing using Apache ApexApache Apex
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...EUDAT
 
The challenges of live events scalability
The challenges of live events scalabilityThe challenges of live events scalability
The challenges of live events scalabilityGuy Tomer
 
Resilience at Extreme Scale
Resilience at Extreme ScaleResilience at Extreme Scale
Resilience at Extreme ScaleMarc Snir
 
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...Amazon Web Services
 
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy FarkasVirtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy FarkasFlink Forward
 
Springone2gx 2014 Reactive Streams and Reactor
Springone2gx 2014 Reactive Streams and ReactorSpringone2gx 2014 Reactive Streams and Reactor
Springone2gx 2014 Reactive Streams and ReactorStéphane Maldini
 
Performance tuning the Spring Pet Clinic sample application
Performance tuning the Spring Pet Clinic sample applicationPerformance tuning the Spring Pet Clinic sample application
Performance tuning the Spring Pet Clinic sample applicationJulien Dubois
 
Gatling - Bordeaux JUG
Gatling - Bordeaux JUGGatling - Bordeaux JUG
Gatling - Bordeaux JUGslandelle
 
Solving some of the scalability problems at booking.com
Solving some of the scalability problems at booking.comSolving some of the scalability problems at booking.com
Solving some of the scalability problems at booking.comIvan Kruglov
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
 
Mininet: Moving Forward
Mininet: Moving ForwardMininet: Moving Forward
Mininet: Moving ForwardON.Lab
 
Cloudsim & greencloud
Cloudsim & greencloud Cloudsim & greencloud
Cloudsim & greencloud nedamaleki87
 
Storm 2012-03-29
Storm 2012-03-29Storm 2012-03-29
Storm 2012-03-29Ted Dunning
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Data Con LA
 
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...Amazon Web Services
 

Similar a Solving DEBS Grand Challenge with WSO2 CEP (20)

The rice and fail of an IoT solution
The rice and fail of an IoT solutionThe rice and fail of an IoT solution
The rice and fail of an IoT solution
 
Introduction to Apache Apex - CoDS 2016
Introduction to Apache Apex - CoDS 2016Introduction to Apache Apex - CoDS 2016
Introduction to Apache Apex - CoDS 2016
 
Real-time Stream Processing using Apache Apex
Real-time Stream Processing using Apache ApexReal-time Stream Processing using Apache Apex
Real-time Stream Processing using Apache Apex
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
 
The challenges of live events scalability
The challenges of live events scalabilityThe challenges of live events scalability
The challenges of live events scalability
 
Resilience at Extreme Scale
Resilience at Extreme ScaleResilience at Extreme Scale
Resilience at Extreme Scale
 
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
 
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy FarkasVirtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
Virtual Flink Forward 2020: Autoscaling Flink at Netflix - Timothy Farkas
 
Springone2gx 2014 Reactive Streams and Reactor
Springone2gx 2014 Reactive Streams and ReactorSpringone2gx 2014 Reactive Streams and Reactor
Springone2gx 2014 Reactive Streams and Reactor
 
Performance tuning the Spring Pet Clinic sample application
Performance tuning the Spring Pet Clinic sample applicationPerformance tuning the Spring Pet Clinic sample application
Performance tuning the Spring Pet Clinic sample application
 
Gatling - Bordeaux JUG
Gatling - Bordeaux JUGGatling - Bordeaux JUG
Gatling - Bordeaux JUG
 
Solving some of the scalability problems at booking.com
Solving some of the scalability problems at booking.comSolving some of the scalability problems at booking.com
Solving some of the scalability problems at booking.com
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
 
Mininet: Moving Forward
Mininet: Moving ForwardMininet: Moving Forward
Mininet: Moving Forward
 
03 performance
03 performance03 performance
03 performance
 
Cloudsim & greencloud
Cloudsim & greencloud Cloudsim & greencloud
Cloudsim & greencloud
 
Storm 2012-03-29
Storm 2012-03-29Storm 2012-03-29
Storm 2012-03-29
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
 
Lecture1
Lecture1Lecture1
Lecture1
 
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
More Nines for Your Dimes: Improving Availability and Lowering Costs using Au...
 

Más de Srinath Perera

Book: Software Architecture and Decision-Making
Book: Software Architecture and Decision-MakingBook: Software Architecture and Decision-Making
Book: Software Architecture and Decision-MakingSrinath Perera
 
Data science Applications in the Enterprise
Data science Applications in the EnterpriseData science Applications in the Enterprise
Data science Applications in the EnterpriseSrinath Perera
 
An Introduction to APIs
An Introduction to APIs An Introduction to APIs
An Introduction to APIs Srinath Perera
 
An Introduction to Blockchain for Finance Professionals
An Introduction to Blockchain for Finance ProfessionalsAn Introduction to Blockchain for Finance Professionals
An Introduction to Blockchain for Finance ProfessionalsSrinath Perera
 
AI in the Real World: Challenges, and Risks and how to handle them?
AI in the Real World: Challenges, and Risks and how to handle them?AI in the Real World: Challenges, and Risks and how to handle them?
AI in the Real World: Challenges, and Risks and how to handle them?Srinath Perera
 
Healthcare + AI: Use cases & Challenges
Healthcare + AI: Use cases & ChallengesHealthcare + AI: Use cases & Challenges
Healthcare + AI: Use cases & ChallengesSrinath Perera
 
How would AI shape Future Integrations?
How would AI shape Future Integrations?How would AI shape Future Integrations?
How would AI shape Future Integrations?Srinath Perera
 
The Role of Blockchain in Future Integrations
The Role of Blockchain in Future IntegrationsThe Role of Blockchain in Future Integrations
The Role of Blockchain in Future IntegrationsSrinath Perera
 
Blockchain: Where are we? Where are we going?
Blockchain: Where are we? Where are we going? Blockchain: Where are we? Where are we going?
Blockchain: Where are we? Where are we going? Srinath Perera
 
Few thoughts about Future of Blockchain
Few thoughts about Future of BlockchainFew thoughts about Future of Blockchain
Few thoughts about Future of BlockchainSrinath Perera
 
A Visual Canvas for Judging New Technologies
A Visual Canvas for Judging New TechnologiesA Visual Canvas for Judging New Technologies
A Visual Canvas for Judging New TechnologiesSrinath Perera
 
Privacy in Bigdata Era
Privacy in Bigdata  EraPrivacy in Bigdata  Era
Privacy in Bigdata EraSrinath Perera
 
Blockchain, Impact, Challenges, and Risks
Blockchain, Impact, Challenges, and RisksBlockchain, Impact, Challenges, and Risks
Blockchain, Impact, Challenges, and RisksSrinath Perera
 
Today's Technology and Emerging Technology Landscape
Today's Technology and Emerging Technology LandscapeToday's Technology and Emerging Technology Landscape
Today's Technology and Emerging Technology LandscapeSrinath Perera
 
An Emerging Technologies Timeline
An Emerging Technologies TimelineAn Emerging Technologies Timeline
An Emerging Technologies TimelineSrinath Perera
 
The Rise of Streaming SQL and Evolution of Streaming Applications
The Rise of Streaming SQL and Evolution of Streaming ApplicationsThe Rise of Streaming SQL and Evolution of Streaming Applications
The Rise of Streaming SQL and Evolution of Streaming ApplicationsSrinath Perera
 
Analytics and AI: The Good, the Bad and the Ugly
Analytics and AI: The Good, the Bad and the UglyAnalytics and AI: The Good, the Bad and the Ugly
Analytics and AI: The Good, the Bad and the UglySrinath Perera
 
Transforming a Business Through Analytics
Transforming a Business Through AnalyticsTransforming a Business Through Analytics
Transforming a Business Through AnalyticsSrinath Perera
 
SoC Keynote:The State of the Art in Integration Technology
SoC Keynote:The State of the Art in Integration TechnologySoC Keynote:The State of the Art in Integration Technology
SoC Keynote:The State of the Art in Integration TechnologySrinath Perera
 

Más de Srinath Perera (20)

Book: Software Architecture and Decision-Making
Book: Software Architecture and Decision-MakingBook: Software Architecture and Decision-Making
Book: Software Architecture and Decision-Making
 
Data science Applications in the Enterprise
Data science Applications in the EnterpriseData science Applications in the Enterprise
Data science Applications in the Enterprise
 
An Introduction to APIs
An Introduction to APIs An Introduction to APIs
An Introduction to APIs
 
An Introduction to Blockchain for Finance Professionals
An Introduction to Blockchain for Finance ProfessionalsAn Introduction to Blockchain for Finance Professionals
An Introduction to Blockchain for Finance Professionals
 
AI in the Real World: Challenges, and Risks and how to handle them?
AI in the Real World: Challenges, and Risks and how to handle them?AI in the Real World: Challenges, and Risks and how to handle them?
AI in the Real World: Challenges, and Risks and how to handle them?
 
Healthcare + AI: Use cases & Challenges
Healthcare + AI: Use cases & ChallengesHealthcare + AI: Use cases & Challenges
Healthcare + AI: Use cases & Challenges
 
How would AI shape Future Integrations?
How would AI shape Future Integrations?How would AI shape Future Integrations?
How would AI shape Future Integrations?
 
The Role of Blockchain in Future Integrations
The Role of Blockchain in Future IntegrationsThe Role of Blockchain in Future Integrations
The Role of Blockchain in Future Integrations
 
Future of Serverless
Future of ServerlessFuture of Serverless
Future of Serverless
 
Blockchain: Where are we? Where are we going?
Blockchain: Where are we? Where are we going? Blockchain: Where are we? Where are we going?
Blockchain: Where are we? Where are we going?
 
Few thoughts about Future of Blockchain
Few thoughts about Future of BlockchainFew thoughts about Future of Blockchain
Few thoughts about Future of Blockchain
 
A Visual Canvas for Judging New Technologies
A Visual Canvas for Judging New TechnologiesA Visual Canvas for Judging New Technologies
A Visual Canvas for Judging New Technologies
 
Privacy in Bigdata Era
Privacy in Bigdata  EraPrivacy in Bigdata  Era
Privacy in Bigdata Era
 
Blockchain, Impact, Challenges, and Risks
Blockchain, Impact, Challenges, and RisksBlockchain, Impact, Challenges, and Risks
Blockchain, Impact, Challenges, and Risks
 
Today's Technology and Emerging Technology Landscape
Today's Technology and Emerging Technology LandscapeToday's Technology and Emerging Technology Landscape
Today's Technology and Emerging Technology Landscape
 
An Emerging Technologies Timeline
An Emerging Technologies TimelineAn Emerging Technologies Timeline
An Emerging Technologies Timeline
 
The Rise of Streaming SQL and Evolution of Streaming Applications
The Rise of Streaming SQL and Evolution of Streaming ApplicationsThe Rise of Streaming SQL and Evolution of Streaming Applications
The Rise of Streaming SQL and Evolution of Streaming Applications
 
Analytics and AI: The Good, the Bad and the Ugly
Analytics and AI: The Good, the Bad and the UglyAnalytics and AI: The Good, the Bad and the Ugly
Analytics and AI: The Good, the Bad and the Ugly
 
Transforming a Business Through Analytics
Transforming a Business Through AnalyticsTransforming a Business Through Analytics
Transforming a Business Through Analytics
 
SoC Keynote:The State of the Art in Integration Technology
SoC Keynote:The State of the Art in Integration TechnologySoC Keynote:The State of the Art in Integration Technology
SoC Keynote:The State of the Art in Integration Technology
 

Solving DEBS Grand Challenge with WSO2 CEP

  • 1. Solving DEBS Grand Challenge with WSO2 CEP Srinath Perera, Suhothayan Sriskandarajah, Mohanadarshan Vivekanandalingam, Paul Fremantle, Sanjiva Weerawarana WSO2 Inc.
  • 2. Outline • Grand Challenge • CEP • Basic Solution • Optimizations • Beyond Grand Challenge Photo by John Trainoron Flickr http://www.flickr.com/photos/trainor/2902023575/, Licensed under CC
  • 3. Grand Challenge Problem • Smart Home electricity data: power and load • 90 days, 40 houses, 120GB, 2125 sensors • Event frequency per plug – load values: once 2s. – Work values: once 10-50s. Value sent through when the difference is more than 1Wh • Queries – Load prediction – Outlier detection
  • 5. WSO2 CEP Operators • Filters or transformations (process a single event) – From MeterReadings[load>10] select .. insert into .. • Windows + aggregation (track window of events: time, length) – from MeterReadings#window.time(30s) select avg(load) .. • Joins (join two event streams to one) – From MeterReadings#window.time(30s) as b join Cricket as c on .. • Patterns (state machine implementation) – From e1 = MeterReadings-> e2 = MeterReadings[load > e1.load + 10] within 1h select .. • Event tables (join events against stored data) – Define table ReadingSummary(v double) using .. db info .. • WSO2 CEP Performance: (core i5 processor) – 2-9 million events/sec on same JVM – 300k events/sec over the network
  • 6. How we did this? • Basic version using CEP EQL Queries • Does it use machines fully? (load average > 2X number of cores) • Find and fix bottlenecks (Write extensions if needed) • Repeat until the deadline  From http://seedtofeedme.blogspot.in/2013/09/watering-plants.html
  • 7. Correcting Load Values • We receive load values roughly once every 2 seconds from each plug. Sensors send work values only when the difference is more than 1Wh: this happens roughly every 10 to 50 seconds. • Correcting function – Simple linear fitting using last two work values and the calculate load using that value – Correct using the fact that work from t2 and t < 1kWh as we have not seen the work value. • Micro-benchmark with 100 million events. – Errors < 16% of the load value 75% of the time. – Error Percentiles 25%=1.9 50%=6.39 75%=15.41
  • 8. Q1 Challenges • CEP engine is single threaded, hence cannot use a multi-core machine fully. • Avoid calculating each timeslot (1,5 .. 120m) separately • Batch windows. Median calculations only keep < 129,600 events – No memory problem – Median calculated once for 30 sec, Min-Max Heap is good enough
  • 9. Query 1: Single Node Solution
  • 10. Optimizations: Improve Parallelism • Do we have enough parallelism? Does load average > 2X number of cores • Scale, parallelism => data partitions • Partition data and assign to different CEP engines – Partition by House (let us go up to 40 threads) • Not all houses would have the same workload – We use CEP engines and threads independently and use work stealing
  • 11. Q2 Challenges • Sliding window has to remember each event in the window to expire them. – (1h=3M events and 24h=74M events) • Calculating mean over window. Q2 is limited by the Global median • Per plug windows and the global windows will keep two copies of events
  • 12. Query 2: Single Node Solution
  • 13. Calculating Median • MinMax Heap • Buckets – track how many fall on each bucket • Buckets with variable resolution – Does not work when median shift • Reservoir Sampling • Re-Median Micro-benchmark: (1 million values) – Min-Max Heap 1000ms – Reservoir Sampling 4ms – Bucketing <1ms.
  • 14. Disk backed Window • Sliding window reads from one end and writes to the other – Write in batches – Pre-fetch data in batches – Micro-benchmark shows almost no latency addition
  • 15. More Optimizations • 1 Sec Slide – Windows slide by 1second – Rather than tracking all events in a window, we can aggregate events in each second – Then adjust the Buckets accordingly when they are added or removed. • Q2 only asks for how many plugs are greater or less than the global median – If you know the global median, then you can check plug median < or > the global median without calculating plug median.
  • 16. Q1 Single Node Results • 16 core, 24GB machine • Fast (0.4M events/sec) with with 1s latency • Stable with different number of houses
  • 17. Q2 Single Node Results • 16 core, 24GB machine • Fast (3M to 1M events/sec) with < 100ms latency • Sensitive to houses (number of events), because that increases the number of events in windows
  • 18. Query 1: Distributed Solution • Embarrassingly parallel by house • Partition by house
  • 19. Distributed Communication • WSO2 CEP uses a thrift based binary protocol (can handle variable size messages) that can do 0.3M events/sec • Too slow to scale up (only got 0.18M/sec with thrift) • We did a custom fixed message size solution (using java ByteBuffer) that can do up to 0.8M events/sec • Then we got close 1M events/sec for Q1 distributed • Also tried ZeroMQ, which was bit slower.
  • 21. Q1 Distributed Results • Using Amazon c1.xlarge, 1 client to 2 CEP nodes ratio • Got close to 1M msg/sec 2X speedup
  • 22. Beyond Grand Challenge • Support for automatic parallelization on partitions with WSO2 CEP • Adding some custom functions as inbuilt operators • Describing distributed deployment via queries and automating the deployment • All new features (e.g. disk backed window, median, scaling) released with WSO2 CEP coming 2014 Q4.
  • 23. Scaling CEP • Think like MapReduce! ask user to define partitions: parallel and non parallel parts of computations. • Each node as Storm bolt, communication and HA via storm
  • 24. WSO2 CEP • WSO2 is an opensource middleware company – 350+ people – Customers: Ebay, Boeing, Cisco … – Funded by Intel Capital, Quest, Cisco • WSO2 CEP is available opensource under apache license from https://github.com/wso2/siddhi • We welcome contributions (both code and ideas)!! and collaborations • If you want to build your research on top of WSO2 CEP, let us know. (architecture@wso2.org)
  • 25. Conclusions • Main Ideas – Load correction via linear prediction – Partitioning data via house for parallelism and distribution – Did an study for median calculation options – Disk backed window with paging – Fixed message sized protocol – About 400k single node throughput and close to 1M distributed throughput • Try out WSO2 CEP/ Siddhi: Open source under Apache License.