SpaaS*
* Stream processing as a service with Apache Storm
Ernestas Vaiciukevičius
Birth of the platform
Legacy solution issues:
Delays
Resource utilization
Storage for temp data
Hard to scale
Not fault tolerant
Licenses
Batch based
Gradually refactoring old solution
[Diagram: Storm, Kafka, Mesos]
Our Storm cluster became generic enough to be
offered as a service to other teams.
We just needed to address a few points:
• Simpler scaling – Storm-Mesos integration
• Resource isolation – cgroups
Providing stream processing platform as a service
Storm cluster infrastructure
• 600 CPU cores, 3TB RAM
• Scala common library with reusable components
• Monitoring/alerting/logging for topologies
• Normal load - 0.7M messages/s
Our commons library
Storm basics
• Tuple – a record/message/item; a stream is a sequence of tuples
• Spout – the source of a stream
• Bolt – a step in the processing chain
• Topology – a graph of connected bolts and
spouts describing the data flow
• Worker – one of many distributed JVM
processes that execute a topology
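As a rough mental model of the terms above, the sketch below uses plain Scala collections rather than the Storm API. All names (SketchTuple, Spout, Bolt, Topology) are hypothetical stand-ins, and the topology is reduced to a linear chain running in one process, whereas Storm distributes an arbitrary graph across workers.

```scala
// Simplified, single-process model of the Storm vocabulary. Nothing here uses
// the real Storm API; every name is a hypothetical stand-in.
object StormBasicsSketch {
  // Tuple: one record in the stream, modelled as a plain list of values.
  type SketchTuple = List[Any]

  // Spout: a source of tuples.
  trait Spout { def emitAll(): List[SketchTuple] }

  // Bolt: one processing step; it may emit zero or more tuples per input tuple.
  trait Bolt { def execute(t: SketchTuple): List[SketchTuple] }

  // Topology: here just a linear chain spout -> bolt -> bolt.
  final case class Topology(spout: Spout, bolts: List[Bolt]) {
    def run(): List[SketchTuple] =
      bolts.foldLeft(spout.emitAll())((stream, bolt) => stream.flatMap(t => bolt.execute(t)))
  }

  val clicks: Spout = new Spout {
    def emitAll(): List[SketchTuple] =
      List(List(1, "click"), List(2, "view"), List(3, "click"))
  }

  // Filtering bolt: passes only tuples whose second field is "click".
  val onlyClicks: Bolt = new Bolt {
    def execute(t: SketchTuple): List[SketchTuple] = if (t(1) == "click") List(t) else Nil
  }

  // Mapping bolt: keeps only the first field of each tuple.
  val idOnly: Bolt = new Bolt {
    def execute(t: SketchTuple): List[SketchTuple] = List(List(t.head))
  }

  val result: List[SketchTuple] = Topology(clicks, List(onlyClicks, idOnly)).run()
}
```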
Storm basics – reliable processing
Spout types:
• Unreliable
• Reliable
Guarantees:
• At most once
• At least once
Bolts may emit tuples anchored to one or more input tuples.
If tuple B is emitted anchored to tuple A, B is a descendant of A.
Multiple anchorings form a tuple tree.
Bolts can either
• “acknowledge” or
• “fail”
their input tuples.
A failure in any bolt of the tuple tree fails the original tuple(s).
Spouts will then re-emit them.
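The tuple-tree bookkeeping can be sketched in a few lines. Storm's documentation describes its acker as keeping one 64-bit value per spout tuple and XOR-ing tuple ids into it, so the value returns to zero once every tuple in the tree has been both created and acked. The class below is a simplified model of that idea only: TreeTracker and its methods are hypothetical, and real Storm uses random 64-bit ids so that a premature zero is very unlikely.

```scala
object AckTrackingSketch {
  import scala.collection.mutable

  // Simplified model of tuple-tree tracking; names and method shapes are hypothetical.
  final class TreeTracker {
    private val ackVal = mutable.Map.empty[Long, Long] // root id -> XOR of pending tuple ids
    private val failed = mutable.Set.empty[Long]

    // The spout emits a root tuple: its own id is now pending.
    def emitRoot(rootId: Long): Unit = ackVal.update(rootId, rootId)

    // A bolt emits a child anchored to the tree: the child's id becomes pending.
    def anchor(rootId: Long, childId: Long): Unit =
      ackVal.get(rootId).foreach(v => ackVal.update(rootId, v ^ childId))

    // A bolt acks a tuple: XOR-ing the same id a second time cancels it out.
    def ack(rootId: Long, tupleId: Long): Unit =
      ackVal.get(rootId).foreach(v => ackVal.update(rootId, v ^ tupleId))

    // Failing any tuple fails the whole tree; the spout is expected to replay the root.
    def fail(rootId: Long): Unit = { ackVal.remove(rootId); failed += rootId; () }

    def isComplete(rootId: Long): Boolean = ackVal.get(rootId).contains(0L)
    def isFailed(rootId: Long): Boolean = failed.contains(rootId)
  }
}
```

In this model a bolt must anchor its output before acking its input; otherwise the value could reach zero while descendants are still in flight.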
Our commons library
Tiny layer on top of Storm API and ScalaStorm* DSL to make developing in
Scala more convenient
• Typed messages
• Unified exception handling
• Reusable components
* https://github.com/velvia/ScalaStorm
Our commons library – typed messages
Standard Storm tuples are positional lists of values:
{1, 2}
{2, "click"}
{1, "click", [1, 2, 3]}
Fields are read by index:
t.getInteger(0)
t.getString(1)
t.getValue(2)
Standard "execute" method:
override def execute(t: Tuple): Unit = {
  // What if the wrong kind of tuple arrives here?
  val click = t.getValue(0).asInstanceOf[Click] // a failed cast throws and crashes the worker
  val clickId = t.getInteger(0) // or worse: what if field 0 is not a click id at all?
}
We started to use typed Scala case classes:
case class ClickMessage(id: Int, url: String) extends BaseMessage
Example message: {1, "http://example.com"}
…
override def exec(t: Tuple) = {
  case ClickMessage(id, url) =>
    ...
    using anchor t emitMsg NextMessage(id)
}
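To show why matching on case classes is safer than positional access, here is a minimal sketch in plain Scala. BaseMessage, ClickMessage and NextMessage are named on the slides, but their definitions are not shown, so the ones below are assumptions; OtherMessage and handle are hypothetical and exist only for the illustration.

```scala
object TypedMessagesSketch {
  // Assumed definitions: the slides name these types but do not show them.
  sealed trait BaseMessage
  final case class ClickMessage(id: Int, url: String) extends BaseMessage
  final case class NextMessage(id: Int) extends BaseMessage
  final case class OtherMessage(text: String) extends BaseMessage // hypothetical

  // Known message types are matched and destructured with checked field types;
  // anything else is reported instead of being cast blindly.
  def handle(msg: BaseMessage): Either[String, BaseMessage] = msg match {
    case ClickMessage(id, _) => Right(NextMessage(id))
    case other               => Left(s"unexpected message: $other")
  }
}
```

An unexpected message type becomes an ordinary value the bolt can ack, fail or log, rather than a ClassCastException inside the worker.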
Many fine-grained bolts can lead to a high number of threads in worker processes and huge
heartbeat states stored in ZooKeeper. Each bolt brings an overhead of at least two threads.
Message transformation as standard functionality in the base bolt helps to avoid “mapper” bolts:
override def transformer(): BaseMessage = {
  case m: BaseMessage => MyNewMessage()
}
Our commons library – exception handling
class MyBolt … with FailTupleExceptionHandler
…
class MyOtherBolt … {
override def handleException(t: Tuple, tw: Throwable): Unit = …
}
• FailTupleExceptionHandler
• WorkerRestartExceptionHandler
• AckTupleExceptionHandler
• DeactivateTopologyExceptionHandler
• AckTupleWithLimitExceptionHandler
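One way such handlers could be wired up is with mix-in traits, sketched below in plain Scala. Only the handler names and the handleException signature come from the slides; SketchBolt, its log and ParsingBolt are hypothetical, and strings stand in for Storm tuples.

```scala
object ExceptionHandlingSketch {
  import scala.collection.mutable.ListBuffer
  import scala.util.control.NonFatal

  // Hypothetical base bolt: process() is user code, handleException decides what
  // happens to the input when process() throws.
  trait SketchBolt {
    val log: ListBuffer[String] = ListBuffer.empty
    def process(input: String): Unit
    def handleException(input: String, tw: Throwable): Unit
    final def execute(input: String): Unit =
      try { process(input); log += s"ack $input"; () }
      catch { case NonFatal(e) => handleException(input, e) }
  }

  // Policies mixed in as traits, mirroring two of the handler names on the slide.
  trait FailTupleExceptionHandler extends SketchBolt {
    def handleException(input: String, tw: Throwable): Unit = { log += s"fail $input"; () }
  }
  trait AckTupleExceptionHandler extends SketchBolt {
    def handleException(input: String, tw: Throwable): Unit = { log += s"ack $input (error dropped)"; () }
  }

  // A bolt picks its policy by mixing in one handler.
  class ParsingBolt extends SketchBolt with FailTupleExceptionHandler {
    def process(input: String): Unit = { input.toInt; () }
  }
}
```

Keeping the policy in a trait means bolt code contains no try/catch of its own, and a bolt with unusual needs can still override handleException directly.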
Our commons library – reusable components
• CacheBolt
• SyncBolt
• KafkaProducerBolt
• RestApiBolt
• HadoopApiUploaderBolt
• InMemoryJoinBolt
• DeduplicatorBolt
• common helpers for logging, metrics, calling REST APIs, etc.
Our commons library – stream join
Challenge 1: Data is not perfectly ordered
• out-of-order items in both streams might cause unjoined results
• increase join window to compensate for out-of-order items in left stream
• increase synchronization offset for out-of-order items in right stream
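The effect of the join window on out-of-order data can be illustrated with a small in-memory join. The semantics below are an assumption made for the sketch and are not taken from the library: left items are buffered, and a right item joins a buffered left item with the same key if their timestamps differ by at most the window. The synchronization offset for the right stream is not modelled, and Item and InMemoryJoin are hypothetical names.

```scala
object WindowJoinSketch {
  final case class Item(key: String, timestamp: Long, payload: String)

  // Assumed semantics for illustration only, not the library's implementation.
  final class InMemoryJoin(windowMs: Long) {
    private var buffer = Vector.empty[Item]

    def addLeft(item: Item): Unit = { buffer = buffer :+ item }

    // Join a right item with the first buffered left item that has the same key
    // and a timestamp within the window.
    def joinRight(right: Item): Option[(Item, Item)] =
      buffer
        .find(l => l.key == right.key && math.abs(right.timestamp - l.timestamp) <= windowMs)
        .map(l => (l, right))

    // Drop left items that are too old to be matched at stream time `now`.
    def expire(now: Long): Unit = { buffer = buffer.filter(_.timestamp >= now - windowMs) }

    def buffered: Int = buffer.size
  }
}
```

With a 5 ms window, a click arriving 8 ms away from its impression stays unjoined; widening the window to 10 ms joins it, at the cost of a larger buffer.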
Challenge 2: topic partitions not consumed evenly
• introduced PartitionAwareKafkaSpout – each item knows its source partition
trait PartitionAwareMessage extends BaseMessage {
def partition: Int
}
• use minimal timestamp across all partitions for window expiration and sync time
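The minimal-timestamp rule can be sketched as a small clock that remembers the latest timestamp seen per partition and reports the minimum across all of them. PartitionClock and its methods are hypothetical names, and the choice to report nothing until every partition has been seen is an assumption of this sketch.

```scala
object PartitionWatermarkSketch {
  import scala.collection.mutable

  // Tracks the latest timestamp per partition and exposes the minimum across
  // partitions, so a fast partition cannot expire items that a slow partition
  // has yet to deliver. Names are hypothetical.
  final class PartitionClock(partitions: Set[Int]) {
    private val latest = mutable.Map.empty[Int, Long]

    def observe(partition: Int, timestamp: Long): Unit =
      latest.update(partition, math.max(timestamp, latest.getOrElse(partition, Long.MinValue)))

    // None until every known partition has reported at least once.
    def minTimestamp: Option[Long] =
      if (partitions.nonEmpty && partitions.forall(p => latest.contains(p)))
        Some(partitions.map(p => latest(p)).min)
      else None
  }
}
```

Window expiration and sync time would then be driven by minTimestamp rather than by the timestamp of whichever item arrived last.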
Challenge 3: joins with huge join windows
• there are cases when join windows need to be minutes or even hours rather
than seconds – it may be difficult to hold these huge buffers in Storm worker's
RAM
• items are not acknowledged until they are joined and fully processed – so
a huge number of items stuck in the join buffer would not work with reliable Storm
topologies
Introduced another flavor of the join using external storage
• store join window items in Aerospike in-memory storage via a REST API
• the API allows storing and retrieving arbitrary data by key
• the API supports batching for performance
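A minimal sketch of this flavor of the join follows. The slides do not show the shape of the REST API, so KeyValueStore with putBatch and getBatch is a hypothetical interface, and an in-memory map stands in for Aerospike so the example runs on its own.

```scala
object ExternalJoinSketch {
  import scala.collection.mutable

  // Hypothetical key-value client with batched calls.
  trait KeyValueStore {
    def putBatch(entries: Map[String, String]): Unit
    def getBatch(keys: Seq[String]): Map[String, String]
  }

  // Stand-in for the external store, for illustration only.
  final class InMemoryStore extends KeyValueStore {
    private val data = mutable.Map.empty[String, String]
    def putBatch(entries: Map[String, String]): Unit = { data ++= entries; () }
    def getBatch(keys: Seq[String]): Map[String, String] =
      keys.flatMap(k => data.get(k).map(v => k -> v)).toMap
  }

  // Left items are written to the store instead of being held in worker memory.
  def feedLeft(store: KeyValueStore, left: Seq[(String, String)]): Unit =
    store.putBatch(left.toMap)

  // Right items are joined by looking up all their keys in one batched call;
  // right items without a match are simply dropped.
  def joinRight(store: KeyValueStore, right: Seq[(String, String)]): Seq[(String, String, String)] = {
    val found = store.getBatch(right.map(_._1))
    right.flatMap { case (key, r) => found.get(key).map(l => (key, l, r)) }
  }
}
```

Because left items leave the worker as soon as they are stored, they can be acknowledged right away instead of waiting in a join buffer.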
[Diagrams: feeding the data to the join window; doing the join; tracking data delays]
Challenge 3: joins with huge join windows
• fewer nuances than with in-memory join
• more external components
• supports huge join windows
• no handling for unjoined right stream items
• supports right stream with no continuous
throughput (allows pauses)
Thank you!

 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide Deck
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 

Storm - SpaaS

  • 1. SpaaS: stream processing as a service with Apache Storm. Ernestas Vaiciukevičius
  • 2. Birth of the platform
  • 3. Birth of the platform. Legacy solution issues: delays; resource utilization; storage for temp data; hard to scale; not fault tolerant; licenses; batch based
  • 4. Birth of the platform. Gradually refactoring the old solution
  • 5. Birth of the platform Storm Kafka
  • 6-9. Birth of the platform (Storm, Mesos). Our Storm cluster became generic enough to be offered as a service to other teams. We just needed to address a few points: • Simpler scaling: Storm-Mesos integration • Resource isolation: cgroups
  • 10. Birth of the platform (Storm, Mesos). Providing a stream processing platform as a service. Storm cluster infrastructure: • 600 CPU cores, 3 TB RAM • Scala common library with reusable components • Monitoring/alerting/logging for topologies • Normal load: 0.7M messages/s
  • 12. Storm basics • Tuple – a record/message/item of which a stream consists • Spout – the source of a stream • Bolt – a step in the processing chain • Topology – a graph of connected bolts and spouts describing the data flow • Worker – one of many distributed JVM processes that execute a topology
  • 15. Bolt
  • 16. Bolt
  • 20. Storm basics – reliable processing Spout types: • Unreliable • Reliable Guarantees: • At most once • At least once
  • 21. Storm basics – reliable processing Bolts may emit tuples anchored to one or more input tuples. Here tuple B is a descendant of A.
  • 22. Storm basics – reliable processing Multiple anchorings form a tuple tree.
  • 23. Storm basics – reliable processing Bolts can either • “acknowledge” or • “fail” their input tuples.
  • 24. Storm basics – reliable processing A failure in any bolt of the tuple tree fails the original tuple(s). Spouts will then re-emit them.
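The anchoring and ack/fail rules on slides 21 to 24 can be modelled in a few lines of plain Scala. This is a minimal sketch, not Storm's API: `TupleTreeTracker` and its methods are hypothetical names. It borrows the bookkeeping that Storm's documentation describes for its acker tasks, where each spout tuple has one 64-bit value that is XORed with every tuple id created or acked in its tree, and the tree is complete once that value returns to zero.

```scala
import scala.collection.mutable

// Hypothetical model of tuple-tree tracking; it does not use the Storm API.
object TupleTreeTracker {
  sealed trait Status
  case object Pending extends Status
  case object Completed extends Status
  case object Failed extends Status
}

class TupleTreeTracker {
  import TupleTreeTracker._

  // spout (root) tuple id -> XOR of all tuple ids created or acked in its tree
  private val ackVals = mutable.Map.empty[Long, Long]
  private val statuses = mutable.Map.empty[Long, Status]

  // A spout emits a root tuple; its own id is the first pending id.
  def emitRoot(rootId: Long): Unit = {
    ackVals(rootId) = rootId
    statuses(rootId) = Pending
  }

  // A bolt acks `tupleId` and reports, in the same update, the ids of the
  // tuples it emitted anchored to it (assumed unique and non-zero).
  def ack(rootId: Long, tupleId: Long, anchored: Seq[Long] = Nil): Unit =
    if (statuses.get(rootId).contains(Pending)) {
      val updated = anchored.foldLeft(ackVals(rootId) ^ tupleId)(_ ^ _)
      if (updated == 0L) { ackVals.remove(rootId); statuses(rootId) = Completed }
      else ackVals(rootId) = updated
    }

  // Failing any tuple of the tree fails the root, so the spout can re-emit it.
  def fail(rootId: Long): Unit =
    if (statuses.get(rootId).contains(Pending)) {
      ackVals.remove(rootId)
      statuses(rootId) = Failed
    }

  def status(rootId: Long): Option[Status] = statuses.get(rootId)
}
```

Reporting the anchored children together with the ack matters: if the parent were acked first and the children registered later, the value could reach zero while descendants were still in flight.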
  • 25. Our commons library. A tiny layer on top of the Storm API and the ScalaStorm DSL (https://github.com/velvia/ScalaStorm) to make developing in Scala more convenient: • Typed messages • Unified exception handling • Reusable components
  • 26. Our commons library – typed messages. Standard Storm tuples are untyped value lists such as {1, 2}, {2, "click"} or {1, "click", [1, 2, 3]}, read by position: t.getInteger(0), t.getString(1), t.getValue(2)
  • 27. Our commons library – typed messages. Standard "execute" method:
        override def execute(t: Tuple) = {
          // what if the wrong tuple arrives here?
          val click = t.getValue(0).asInstanceOf[Click]
          // it would crash the worker with an exception
          val clickId = t.getInteger(0)
          // or worse: what if that is not a clickId?
        }
  • 28. Our commons library – typed messages. case class ClickMessage(id: Int, url: String) extends BaseMessage, carried as the message {1, "http://example.com"}
  • 29. Our commons library – typed messages. We started to use typed Scala case classes:
        case class ClickMessage(id: Int, url: String) extends BaseMessage
        …
        override def exec(t: Tuple) = {
          case ClickMessage(id, url) => ... using anchor t emitMsg NextMessage(id)
        }
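The typed-message idea from slides 27 to 29 can be tried out without Storm. The following is a minimal sketch under stated assumptions: `BaseMessage`, `ClickMessage` and `NextMessage` mirror the names on the slides, while `TypedBolt`, `process`, `ClickBolt` and `OtherMessage` are hypothetical stand-ins, since the real base classes live in the authors' commons library and are not shown.

```scala
// Hypothetical stand-ins for the commons-library types; not the Storm API.
trait BaseMessage
case class ClickMessage(id: Int, url: String) extends BaseMessage
case class NextMessage(id: Int) extends BaseMessage
case class OtherMessage(text: String) extends BaseMessage

abstract class TypedBolt {
  // A bolt declares only the message types it understands.
  def exec: PartialFunction[BaseMessage, Seq[BaseMessage]]

  // Unhandled messages are reported as a Left value instead of failing
  // with a ClassCastException from a positional, untyped read.
  def process(msg: BaseMessage): Either[String, Seq[BaseMessage]] =
    exec.lift(msg).toRight(s"unexpected message: $msg")
}

object ClickBolt extends TypedBolt {
  override def exec: PartialFunction[BaseMessage, Seq[BaseMessage]] = {
    case ClickMessage(id, _) => Seq(NextMessage(id))
  }
}
```

For example, `ClickBolt.process(ClickMessage(1, "http://example.com"))` yields `Right(List(NextMessage(1)))`, while an `OtherMessage` comes back as a `Left` that the caller can log, ack or fail as it sees fit.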
  • 30. Our commons library – typed messages. Many fine-grained bolts can lead to a high number of threads in worker processes and huge heartbeat states stored in ZooKeeper: each bolt brings an overhead of at least two threads. Message transformation as standard functionality in the base bolt helps to avoid “mapper” bolts:
        override def transformer(): BaseMessage = { case m: BaseMessage => MyNewMessage() }
  • 31. Our commons library – exception handling. Available handlers: • FailTupleExceptionHandler • WorkerRestartExceptionHandler • AckTupleExceptionHandler • DeactivateTopologyExceptionHandler • AckTupleWithLimitExceptionHandler. A bolt mixes in a handler or overrides handleException itself:
        class MyBolt … with FailTupleExceptionHandler …
        class MyOtherBolt … {
          override def handleException(t: Tuple, tw: Throwable): Unit = …
        }
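The mix-in style of exception handling on slide 31 can be illustrated with ordinary Scala traits. This is a minimal sketch, assuming a simplified bolt that processes strings: the handler names `FailTupleExceptionHandler` and `AckTupleExceptionHandler` come from the slide, but their bodies, the `Outcome` type, `SafeBolt` and the two parsing bolts are hypothetical.

```scala
import scala.util.control.NonFatal

// Hypothetical outcome of handling one input; the real library works on Storm tuples.
sealed trait Outcome
case object Acked extends Outcome
case object FailedTuple extends Outcome

trait ExceptionHandler {
  def handleException(input: String, error: Throwable): Outcome
}

// Policy 1: fail the input so that a reliable spout can replay it.
trait FailTupleExceptionHandler extends ExceptionHandler {
  def handleException(input: String, error: Throwable): Outcome = FailedTuple
}

// Policy 2: ack the input anyway, dropping the bad record.
trait AckTupleExceptionHandler extends ExceptionHandler {
  def handleException(input: String, error: Throwable): Outcome = Acked
}

abstract class SafeBolt extends ExceptionHandler {
  def process(input: String): Unit

  // Every bolt gets the same try/catch; only the mixed-in policy differs.
  final def execute(input: String): Outcome =
    try { process(input); Acked }
    catch { case NonFatal(e) => handleException(input, e) }
}

class ParsingBolt extends SafeBolt with FailTupleExceptionHandler {
  def process(input: String): Unit = { input.toInt; () }
}

class LenientParsingBolt extends SafeBolt with AckTupleExceptionHandler {
  def process(input: String): Unit = { input.toInt; () }
}
```

Keeping the policy in a trait means the choice between replaying and dropping a bad record is visible in the class declaration rather than buried in each `execute` body.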
  • 32. Our commons library – reusable components • CacheBolt • SyncBolt • KafkaProducerBolt • RestApiBolt • HadoopApiUploaderBolt • InMemoryJoinBolt • DeduplicatorBolt • common helpers for logging, metrics, calling REST APIs, etc.
  • 33. Our commons library – stream join
  • 34. Our commons library – stream join
  • 35. Challenge 1: Data is not perfectly ordered • out-of-order items in both streams might cause unjoined results
  • 36. Challenge 1: Data is not perfectly ordered • increase join window to compensate for out-of-order items in left stream • increase synchronization offset for out-of-order items in right stream
  • 37. Challenge 2: topic partitions not consumed evenly
  • 38. Challenge 2: topic partitions not consumed evenly • introduced PartitionAwareKafkaSpout: each item knows its source partition (trait PartitionAwareMessage extends BaseMessage { def partition: Int }) • use the minimal timestamp across all partitions for window expiration and sync time
  • 39. Challenge 2: topic partitions not consumed evenly
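The "minimal timestamp across all partitions" rule from slide 38 can be sketched as a small clock that tracks the newest timestamp seen per partition. This is an illustration under stated assumptions, not the authors' implementation: `PartitionClock`, `observe`, `minTimestamp` and `isExpired` are hypothetical names, timestamps are assumed to be epoch milliseconds, and the set of partitions is assumed to be known up front.

```scala
// Hypothetical helper; only the idea of a per-item source partition comes from the slides.
final class PartitionClock(partitions: Set[Int]) {
  // newest timestamp seen so far on each partition
  private var latest = Map.empty[Int, Long]

  def observe(partition: Int, timestamp: Long): Unit = {
    require(partitions.contains(partition), s"unknown partition $partition")
    val newest = math.max(timestamp, latest.getOrElse(partition, Long.MinValue))
    latest = latest.updated(partition, newest)
  }

  // The slowest partition defines "now"; None until every partition has reported.
  def minTimestamp: Option[Long] =
    if (latest.size == partitions.size) Some(latest.values.min) else None

  // An item leaves the join window only once even the slowest partition
  // has moved more than `windowMs` past the item's timestamp.
  def isExpired(itemTimestamp: Long, windowMs: Long): Boolean =
    minTimestamp.exists(_ - itemTimestamp > windowMs)
}
```

With partitions 0 and 1, an item stamped 1000 and a 5000 ms window, the item stays in the window while partition 1 is still at 4000, even if partition 0 has already reached 10000. It expires only after partition 1 advances past 6000.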
  • 40. Challenge 3: joins with huge join windows • there are cases when join windows need to be minutes or even hours rather than seconds, and it may be difficult to hold such huge buffers in a Storm worker's RAM • items are not acknowledged until they are joined and fully processed, so a huge number of items stuck in the join buffer would not work with reliable Storm topologies
  • 41. Challenge 3: joins with huge join windows. Introduced another flavor of the join that uses external storage: • join window items are stored in Aerospike in-memory storage via a REST API • the API allows storing and retrieving arbitrary data by key • the API supports batching for performance
  • 42. Challenge 3: joins with huge join windows Feeding the data to join window
  • 43. Challenge 3: joins with huge join windows Doing the join
  • 44. Challenge 3: joins with huge join windows Tracking data delays
  • 45. Challenge 3: joins with huge join windows
  • 46. Challenge 3: joins with huge join windows • fewer nuances than with in-memory join • more external components • supports huge join windows • no handling for unjoined right stream items • supports right stream with no continuous throughput (allows pauses)