SlideShare una empresa de Scribd logo
1 de 20
From Gust To Tempest: Scaling Storm
P R E S E N T E D B Y B o b b y E v a n s
Hi I’m Bobby Evans
bobby@apache.org @bobbydata
2
 Low Latency Data Processing Architect @ Yahoo
 Apache Storm
 Apache Spark
 Apache Kafka
 Committer and PMC member for
 Apache Storm
 Apache Hadoop
 Apache Spark
 Apache TEZ
Agenda
3
 Apache Storm Architecture
 What Was Done Already
 Current/Future Work
background: https://www.flickr.com/photos/gsfc/15072362777
Storm Concepts
1. Streams
 Unbounded sequence of tuples
2. Spout
 Source of Stream
 E.g. Read from Twitter streaming API
3. Bolts
 Processes input streams and produces new
streams
 E.g. Functions, Filters, Aggregation, Joins
4. Topologies
 Network of spouts and bolts
Routing of tuples
 Shuffle grouping: pick a random task
(but with load balancing)
 Fields grouping: consistent hashing on
a subset of tuple fields
 All grouping: send to all tasks
 Global grouping: pick task with lowest
id
 Shuffle or Local grouping: If there is a
local bolt (in the same worker process)
use it otherwise use shuffle
 Partial Key grouping: Fields grouping
but with 2 choices for load balancing.
Storm Architecture
Master
Node
Cluster
Coordination
Worker
processes
Worker
Nimbus
Zookeeper
Zookeeper
Zookeeper
Supervisor
Supervisor
Supervisor
Supervisor Worker
Worker
Worker
Launches
workers
Worker
Task
(Spout A-1)
Task
(Spout A-5)
Task
(Spout A-9)
Task
(Bolt B-3)
Other
Workers
Task
(Acker)
Routing
Current State
w hat w as done alr eady
background: https://www.flickr.com/photos/maf04/14392794749
Largest Topology Growth at Yahoo
9
2013 2014 2015
Executors 100 3000 4000
Workers 40 400 1500
0
500
1000
1500
2000
2500
3000
3500
4000
4500
background: https://www.flickr.com/photos/68942208@N02/16242761551
Cluster Growth at Yahoo
10
0
500
1000
1500
2000
2500
Jun-12
Aug-12
Oct-12
Dec-12
Feb-13
Apr-13
Jun-13
Aug-13
Oct-13
Dec-13
Feb-14
Apr-14
Jun-14
Aug-14
Oct-14
Dec-14
Feb-15
Apr-15
Jun-15
Jun-12 Jan-13 Jan-14 Jan-15 Jun-15
Total Nodes 40 170 600 1100 2300
Largest Cluster 20 60 120 250 300
background: http://bit.ly/1KypnCN
In the Beginning…
11
 Mid 2011:
 Storm is released as open source
 Early 2012:
 Yahoo evaluation begins
 https://github.com/yahoo/storm-perf-test
 Mid 2012:
 Purpose built clusters 10+ nodes
 Early 2013:
 60-node cluster, largest topology 40 workers, 100 executors
 ZooKeeper config -Djute.maxbuffer=4194304
 May 2013:
 Netty messaging layer
 http://yahooeng.tumblr.com/post/64758709722/making-storm-fly-with-netty
 Oct 2013:
 ZooKeeper heartbeat timeout checks
background: https://www.flickr.com/photos/gedas/3618792161
So Far…
 Late 2013:
 ZooKeeper config -Dzookeeper.forceSync=no
 Storm enters Apache Incubator
 Early 2014:
 250-node cluster, largest topology 400 workers, 3,000 executors
 June 2014:
 STORM-376 – Compress ZooKeeper data
 STORM-375 – Check for changes before reading data from ZooKeeper
 Sep 2014
 Storm becomes an Apache Top Level Project
 Early 2015:
 STORM-632 Better grouping for data skew
 STORM-634 Thrift serialization for ZooKeeper data.
 300-node cluster (Tested 400 nodes, 1,200 theoretical maximum)
 Largest topology 1,500 workers, 4,000 executors
background: http://s0.geograph.org.uk/geophotos/02/27/03/2270317_7653a833.jpg
We still have a ways to go
13
Hadoop 5400
Storm 300
Nodes
Largest Cluster Size
We want to get to a
4,000-node Storm
cluster.
Hadoop 41000
Storm 2300
Nodes
Total Nodes
background: https://www.flickr.com/photos/68397968@N07/14600216228
Future and Current Work
how w e ar e going to get to 4,000
background: https://www.flickr.com/photos/12567713@N00/2859921414
Why Can’t Storm Scale?
It’s all about the data.
State Storage (ZooKeeper):
 Limited to disk write speed (80MB/sec typically)
 Scheduling
O(num_execs * resched_rate)
 Supervisor
O(num_supervisors * hb_rate)
 Topology Metrics (worst case)
O(num_execs * num_comps * num_streams * hb_rate)
On one 240-node Yahoo Storm cluster, ZK writes 16 MB/sec, about
99.2% of that is worker heartbeats
Theoretical Limit:
80 MB/sec / 16 MB/sec * 240 nodes = 1,200 nodes
background: http://cnx.org/resources/8ab472b9b2bc2e90bb15a2a7b2182ca45a883e0f/Figure_45_07_02.jpg
Pacemaker
heartbeat server
Simple Secure In-Memory Store for Worker Heartbeats.
 Removes Disk Limitation
 Writes Scale Linearly
(but nimbus still needs to read it all, ideally in 10 sec or less)
240 node cluster’s complete HB state is 48MB, Gigabit is about 125 MB/s
10 s / (48 MB / 125 MB/s) * 240 nodes = 6,250 nodes
1200
6250
Theoretical Maximum Cluster Size
Zookeeper PaceMaker Gigabit
Highly-connected
topologies dominate data
volume.
10 GigE helps
Why Can’t Storm Scale?
It’s all about the data.
All raw data serialized, transferred to UI, de-serialized and aggregated
per page load
Our largest topology uses about 400 MB in memory
Aggregate stats for UI/REST in Nimbus
 10+ min page load to 7 seconds
DDOS on Nimbus for jar download
Distributed Cache/Blob Store (STORM-411)
 Pluggable backend with HDFS support
background: https://www.flickr.com/photos/oregondot/15799498927
Why Can’t Storm Scale?
It’s all about the data.
Storm round-robin scheduling
 R-1/R % of traffic will be off rack where R is
the number of racks
 N-1/N % of traffic will be off node where N is
the number of nodes
 Does not know when resources are full (i.e.
network)
Resource & Network Topography Aware Scheduling
One slow node slows the entire topology.
Load Aware Routing (STORM-162)
Intelligent network aware routing
How does this compare to…
Heron (Twitter) and Apex (DataTorrent)?
 Code not released yet (June 9, 2015 at 6 am Pacific)
› So I have not seen it
 And we are not done yet either
 So, it is hard to tell
Google Cloud Dataflow?
 Open Source API, not implementation
 I have not tested it for scale
 Great stream processing concepts
background: http://www.publicdomainpictures.net/view-image.php?image=38889&picture=heron-2&large=1
Questions?
https://www.flickr.com/photos/51029297@N00/5275403364
bobby@apache.org

Más contenido relacionado

La actualidad más candente

Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
DataWorks Summit
 
Cassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market SceinceCassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market Sceince
P. Taylor Goetz
 

La actualidad más candente (20)

Storm and Cassandra
Storm and Cassandra Storm and Cassandra
Storm and Cassandra
 
Real-time streams and logs with Storm and Kafka
Real-time streams and logs with Storm and KafkaReal-time streams and logs with Storm and Kafka
Real-time streams and logs with Storm and Kafka
 
Storm
StormStorm
Storm
 
Spark vs storm
Spark vs stormSpark vs storm
Spark vs storm
 
Storm
StormStorm
Storm
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm Architecture
 
Real-time Big Data Processing with Storm
Real-time Big Data Processing with StormReal-time Big Data Processing with Storm
Real-time Big Data Processing with Storm
 
Learning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormLearning Stream Processing with Apache Storm
Learning Stream Processing with Apache Storm
 
Real-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using StormReal-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using Storm
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
Cassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market SceinceCassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market Sceince
 
Realtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQRealtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQ
 
Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014Scaling Apache Storm - Strata + Hadoop World 2014
Scaling Apache Storm - Strata + Hadoop World 2014
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
 
Introduction to Twitter Storm
Introduction to Twitter StormIntroduction to Twitter Storm
Introduction to Twitter Storm
 
Slide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache StormSlide #1:Introduction to Apache Storm
Slide #1:Introduction to Apache Storm
 
Storm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-DataStorm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-Data
 
Introduction to Apache Storm
Introduction to Apache StormIntroduction to Apache Storm
Introduction to Apache Storm
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Introduction to Storm
Introduction to StormIntroduction to Storm
Introduction to Storm
 

Similar a Scaling Apache Storm (Hadoop Summit 2015)

Apache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesdayApache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesday
Andrei Savu
 
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Kyle Hailey
 
sector-sphere
sector-spheresector-sphere
sector-sphere
xlight
 

Similar a Scaling Apache Storm (Hadoop Summit 2015) (20)

From Gust To Tempest: Scaling Storm
From Gust To Tempest: Scaling StormFrom Gust To Tempest: Scaling Storm
From Gust To Tempest: Scaling Storm
 
Strata Stinger Talk October 2013
Strata Stinger Talk October 2013Strata Stinger Talk October 2013
Strata Stinger Talk October 2013
 
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
 
Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?
 
DrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceDrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performance
 
Everything you wanted to know about writing async, concurrent http apps in java
Everything you wanted to know about writing async, concurrent http apps in java Everything you wanted to know about writing async, concurrent http apps in java
Everything you wanted to know about writing async, concurrent http apps in java
 
Apache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesdayApache ZooKeeper TechTuesday
Apache ZooKeeper TechTuesday
 
Hadoop ecosystem framework n hadoop in live environment
Hadoop ecosystem framework  n hadoop in live environmentHadoop ecosystem framework  n hadoop in live environment
Hadoop ecosystem framework n hadoop in live environment
 
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09
 
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...
Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While...
 
Resource planning on the (Amazon) cloud
Resource planning on the (Amazon) cloudResource planning on the (Amazon) cloud
Resource planning on the (Amazon) cloud
 
Clug 2011 March web server optimisation
Clug 2011 March  web server optimisationClug 2011 March  web server optimisation
Clug 2011 March web server optimisation
 
Zookeeper Introduce
Zookeeper IntroduceZookeeper Introduce
Zookeeper Introduce
 
User-space Network Processing
User-space Network ProcessingUser-space Network Processing
User-space Network Processing
 
Sector Sphere 2009
Sector Sphere 2009Sector Sphere 2009
Sector Sphere 2009
 
sector-sphere
sector-spheresector-sphere
sector-sphere
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
 
Redundancy for Big Hadoop Clusters is hard - Stuart Pook
Redundancy for Big Hadoop Clusters is hard  - Stuart PookRedundancy for Big Hadoop Clusters is hard  - Stuart Pook
Redundancy for Big Hadoop Clusters is hard - Stuart Pook
 
2014 sept 26_thug_lambda_part1
2014 sept 26_thug_lambda_part12014 sept 26_thug_lambda_part1
2014 sept 26_thug_lambda_part1
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Último (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

Scaling Apache Storm (Hadoop Summit 2015)

  • 1. From Gust To Tempest: Scaling Storm P R E S E N T E D B Y B o b b y E v a n s
  • 2. Hi I’m Bobby Evans bobby@apache.org @bobbydata 2  Low Latency Data Processing Architect @ Yahoo  Apache Storm  Apache Spark  Apache Kafka  Committer and PMC member for  Apache Storm  Apache Hadoop  Apache Spark  Apache TEZ
  • 3. Agenda 3  Apache Storm Architecture  What Was Done Already  Current/Future Work background: https://www.flickr.com/photos/gsfc/15072362777
  • 4. Storm Concepts 1. Streams  Unbounded sequence of tuples 2. Spout  Source of Stream  E.g. Read from Twitter streaming API 3. Bolts  Processes input streams and produces new streams  E.g. Functions, Filters, Aggregation, Joins 4. Topologies  Network of spouts and bolts
  • 5. Routing of tuples  Shuffle grouping: pick a random task (but with load balancing)  Fields grouping: consistent hashing on a subset of tuple fields  All grouping: send to all tasks  Global grouping: pick task with lowest id  Shuffle or Local grouping: If there is a local bolt (in the same worker process) use it otherwise use shuffle  Partial Key grouping: Fields grouping but with 2 choices for load balancing.
  • 7. Worker Task (Spout A-1) Task (Spout A-5) Task (Spout A-9) Task (Bolt B-3) Other Workers Task (Acker) Routing
  • 8. Current State w hat w as done alr eady background: https://www.flickr.com/photos/maf04/14392794749
  • 9. Largest Topology Growth at Yahoo 9 2013 2014 2015 Executors 100 3000 4000 Workers 40 400 1500 0 500 1000 1500 2000 2500 3000 3500 4000 4500 background: https://www.flickr.com/photos/68942208@N02/16242761551
  • 10. Cluster Growth at Yahoo 10 0 500 1000 1500 2000 2500 Jun-12 Aug-12 Oct-12 Dec-12 Feb-13 Apr-13 Jun-13 Aug-13 Oct-13 Dec-13 Feb-14 Apr-14 Jun-14 Aug-14 Oct-14 Dec-14 Feb-15 Apr-15 Jun-15 Jun-12 Jan-13 Jan-14 Jan-15 Jun-15 Total Nodes 40 170 600 1100 2300 Largest Cluster 20 60 120 250 300 background: http://bit.ly/1KypnCN
  • 11. In the Beginning… 11  Mid 2011:  Storm is released as open source  Early 2012:  Yahoo evaluation begins  https://github.com/yahoo/storm-perf-test  Mid 2012:  Purpose built clusters 10+ nodes  Early 2013:  60-node cluster, largest topology 40 workers, 100 executors  ZooKeeper config -Djute.maxbuffer=4194304  May 2013:  Netty messaging layer  http://yahooeng.tumblr.com/post/64758709722/making-storm-fly-with-netty  Oct 2013:  ZooKeeper heartbeat timeout checks background: https://www.flickr.com/photos/gedas/3618792161
  • 12. So Far…  Late 2013:  ZooKeeper config -Dzookeeper.forceSync=no  Storm enters Apache Incubator  Early 2014:  250-node cluster, largest topology 400 workers, 3,000 executors  June 2014:  STORM-376 – Compress ZooKeeper data  STORM-375 – Check for changes before reading data from ZooKeeper  Sep 2014  Storm becomes an Apache Top Level Project  Early 2015:  STORM-632 Better grouping for data skew  STORM-634 Thrift serialization for ZooKeeper data.  300-node cluster (Tested 400 nodes, 1,200 theoretical maximum)  Largest topology 1,500 workers, 4,000 executors background: http://s0.geograph.org.uk/geophotos/02/27/03/2270317_7653a833.jpg
  • 13. We still have a ways to go 13 Hadoop 5400 Storm 300 Nodes Largest Cluster Size We want to get to a 4,000-node Storm cluster. Hadoop 41000 Storm 2300 Nodes Total Nodes background: https://www.flickr.com/photos/68397968@N07/14600216228
  • 14. Future and Current Work how w e ar e going to get to 4,000 background: https://www.flickr.com/photos/12567713@N00/2859921414
  • 15. Why Can’t Storm Scale? It’s all about the data. State Storage (ZooKeeper):  Limited to disk write speed (80MB/sec typically)  Scheduling O(num_execs * resched_rate)  Supervisor O(num_supervisors * hb_rate)  Topology Metrics (worst case) O(num_execs * num_comps * num_streams * hb_rate) On one 240-node Yahoo Storm cluster, ZK writes 16 MB/sec, about 99.2% of that is worker heartbeats Theoretical Limit: 80 MB/sec / 16 MB/sec * 240 nodes = 1,200 nodes background: http://cnx.org/resources/8ab472b9b2bc2e90bb15a2a7b2182ca45a883e0f/Figure_45_07_02.jpg
  • 16. Pacemaker heartbeat server Simple Secure In-Memory Store for Worker Heartbeats.  Removes Disk Limitation  Writes Scale Linearly (but nimbus still needs to read it all, ideally in 10 sec or less) 240 node cluster’s complete HB state is 48MB, Gigabit is about 125 MB/s 10 s / (48 MB / 125 MB/s) * 240 nodes = 6,250 nodes 1200 6250 Theoretical Maximum Cluster Size Zookeeper PaceMaker Gigabit Highly-connected topologies dominate data volume. 10 GigE helps
  • 17. Why Can’t Storm Scale? It’s all about the data. All raw data serialized, transferred to UI, de-serialized and aggregated per page load Our largest topology uses about 400 MB in memory Aggregate stats for UI/REST in Nimbus  10+ min page load to 7 seconds DDOS on Nimbus for jar download Distributed Cache/Blob Store (STORM-411)  Pluggable backend with HDFS support background: https://www.flickr.com/photos/oregondot/15799498927
  • 18. Why Can’t Storm Scale? It’s all about the data. Storm round-robin scheduling  R-1/R % of traffic will be off rack where R is the number of racks  N-1/N % of traffic will be off node where N is the number of nodes  Does not know when resources are full (i.e. network) Resource & Network Topography Aware Scheduling One slow node slows the entire topology. Load Aware Routing (STORM-162) Intelligent network aware routing
  • 19. How does this compare to… Heron (Twitter) and Apex (DataTorrent)?  Code not released yet (June 9, 2015 at 6 am Pacific) › So I have not seen it  And we are not done yet either  So, it is hard to tell Google Cloud Dataflow?  Open Source API, not implementation  I have not tested it for scale  Great stream processing concepts background: http://www.publicdomainpictures.net/view-image.php?image=38889&picture=heron-2&large=1