A journey into stream processing with
Reactive Streams
and
Akka Streams
What to expect
• An Introduction
• The Reactive Streams specification
• A deep-dive into Akka Streams
• Code walkthrough and demo
• Q&A
An Introduction
Part 1 of 4
What's an array?
• A series of elements arranged in memory
• Has a beginning and an end
What's a stream?
• A series of elements emitted over time
• Live data (e.g., events) or data at rest (e.g., partitions of a file)
• May not have a beginning or an end
Appeal of stream processing?
• Scaling business logic
• Processing real-time data (fast data)
• Batch processing of large data sets (big data)
• Monitoring, analytics, complex event processing, etc.
Challenges?
• Ephemeral
• Unbounded in size
• Potential "flooding" downstream
• Unfamiliar programming paradigm
You cannot step twice into the same
stream. For as you are stepping in, other
waters are ever flowing on to you.
— Heraclitus
Exploring two challenges of
stream processing
• An Rx-based approach for passing data across an
asynchronous boundary
• An approach for implementing back pressure
Synchrony
Asynchrony
Back pressure
Flow control options
Flow control
• We need a way to signal when a subscriber is able to
process more data
• Effectively push-based (dynamic pull/push)
A lack of back pressure will eventually lead to an Out of Memory
Exception (OOME), which is the worst possible outcome. Then
you lose not just the work that overloaded the system, but
everything, even the stuff that you were safely working on. 
— Jim Powers, Typesafe
Subscriber usually has some kind of buffer.
Fast publishers can overwhelm the buffer of a slow subscriber.
Option 1: Use bounded buffer and drop messages.
Option 2: Increase buffer size if memory available.
Option 3: Pull-based backpressure.
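As a rough illustration of Option 1, the sketch below models a subscriber-side bounded buffer in plain Java that rejects new elements once it is full. The class name DroppingBuffer and its methods are hypothetical and are not part of Reactive Streams or Akka Streams; dropping the newest element is only one possible policy.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical bounded buffer that drops the newest element when full (Option 1). */
final class DroppingBuffer<T> {
    private final Deque<T> queue = new ArrayDeque<>();
    private final int capacity;
    private long dropped = 0;

    DroppingBuffer(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    /** Returns true if the element was buffered, false if it was dropped. */
    boolean offer(T element) {
        if (queue.size() >= capacity) {
            dropped++;          // the buffer is full, so the element is lost
            return false;
        }
        queue.addLast(element);
        return true;
    }

    /** Removes and returns the oldest buffered element, or null if the buffer is empty. */
    T poll() {
        return queue.pollFirst();
    }

    int size() { return queue.size(); }

    long droppedCount() { return dropped; }

    public static void main(String[] args) {
        DroppingBuffer<Integer> buffer = new DroppingBuffer<>(3);
        for (int i = 1; i <= 5; i++) {
            buffer.offer(i);    // a fast publisher pushing without regard to the subscriber
        }
        System.out.println("buffered=" + buffer.size() + " dropped=" + buffer.droppedCount());
    }
}
```

Memory stays bounded here, but data is lost silently. Option 3 avoids that by letting the subscriber state how much it can accept before anything is sent.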
Reactive Streams
Part 2 of 4
Why Reactive Streams?
• Reactive Streams is a specification and low-level API for
library developers.
• Started as an initiative in late 2013 between engineers at
Netflix, Pivotal, and Typesafe
• Streaming was complex!
• Play had “iteratees”, Akka had Akka IO
What is Reactive Streams?
• TCK (Technology Compatibility Kit)
• API (JVM, JavaScript)
• Specifications for library developers
• Early conversation on future spec for IO
1. Flow control via back pressure
• Fast publisher responsibilities
1. Not generate elements, if it is able to control their
production rate
2. Buffer elements in a bounded manner until more
demand is signalled
3. Drop elements until more demand is signalled
4. Tear down the stream if unable to apply any of the above
strategies
2. An Rx-based approach to asynchrony
public interface Processor<T, R> extends Subscriber<T>, Publisher<R> {}
public interface Publisher<T> {
    public void subscribe(Subscriber<? super T> s);
}
public interface Subscriber<T> {
    public void onSubscribe(Subscription s);
    public void onNext(T t);
    public void onError(Throwable t);
    public void onComplete();
}
public interface Subscription {
    public void request(long n);
    public void cancel();
}
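To show how these four interfaces fit together, here is a minimal sketch of a subscriber that requests one element at a time. It assumes JDK 9 or later and uses java.util.concurrent.Flow, whose nested interfaces mirror the ones above, rather than the org.reactivestreams package itself. OneAtATimeSubscriber and its helper methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

/** Hypothetical subscriber that signals demand for exactly one element at a time. */
final class OneAtATimeSubscriber<T> implements Flow.Subscriber<T> {
    private final List<T> received = new ArrayList<>();
    private final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription;

    @Override
    public void onSubscribe(Flow.Subscription s) {
        subscription = s;
        s.request(1);               // initial demand: nothing is delivered before this call
    }

    @Override
    public void onNext(T item) {
        received.add(item);
        subscription.request(1);    // ask for the next element only after handling this one
    }

    @Override
    public void onError(Throwable t) {
        done.countDown();
    }

    @Override
    public void onComplete() {
        done.countDown();
    }

    /** Blocks until the stream terminates, then returns everything that was received. */
    List<T> awaitAll() {
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for completion");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    static List<Integer> demo() {
        OneAtATimeSubscriber<Integer> subscriber = new OneAtATimeSubscriber<>();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= 5; i++) {
                publisher.submit(i);    // buffered by the publisher until demand is signalled
            }
        }                               // close() completes the stream after buffered items
        return subscriber.awaitAll();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Requesting one element per onNext is the simplest possible policy. A real subscriber would more likely request in batches to reduce signalling overhead.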
Interoperability
• RxJava (Netflix)
• Reactor (Pivotal)
• Vert.x (Red Hat)
• Akka Streams and Slick (Typesafe)
Three main repositories
• Reactive Streams for the JVM
• Reactive Streams for JavaScript
• Reactive Streams IO (for network protocols such as TCP,
WebSockets and possibly HTTP/2)
• Early exploration kicked off by Netflix
• 2016 timeframe
Reactive Streams
Visit the Reactive Streams website for more information.
http://www.reactive-streams.org/
Akka Streams
Part 3 of 4
Akka Streams
Akka Streams provides a way to express and run a chain of
asynchronous processing steps acting on a sequence of
elements.
• DSL for async/non-blocking stream processing
• Default back pressure
• Conforms to the Reactive Streams spec for interop
Basics
• Source - A processing stage with exactly one output
• Sink - A processing stage with exactly one input
• Flow - A processing stage which has exactly one input and
one output
• RunnableFlow - A Flow that has both ends "attached" to a
Source and Sink
API design
Considerations
• Immutable, composable stream blueprints
• Explicit materialization step
• No magic at the expense of some extra code
Materialization
• Separate the what from the how
• Declarative Source/Flow/Sink to
create a blueprint
• FlowMaterializer turns blueprint
into actors
• Involves an extra step, but no magic
Error handling
• The element causing division by zero will be dropped
• Result will be a Future completed with Success(228)
// Resume drops the failing element and continues; Stop fails the stream
val decider: Supervision.Decider = {
  case _: ArithmeticException => Supervision.Resume
  case _ => Supervision.Stop
}
// ActorFlowMaterializer takes the list of transformations comprising an akka.stream.scaladsl.Flow
// and materializes them in the form of org.reactivestreams.Processor
implicit val mat = ActorFlowMaterializer(
  ActorFlowMaterializerSettings(system).withSupervisionStrategy(decider))
val source = Source(0 to 5).map(100 / _)
val result = source.runWith(Sink.fold(0)(_ + _))
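The arithmetic behind Success(228) can be checked with a small plain-Java sketch that imitates the Resume directive: the element that throws is skipped and processing continues. This only models the observable result and is not how Akka Streams implements supervision; ResumeOnErrorDemo is a hypothetical name.

```java
/** Hypothetical model of the Resume directive: skip the failing element, keep folding. */
final class ResumeOnErrorDemo {

    /** Sums 100 / x for every x in [from, to], dropping elements that fail. */
    static int sumSkippingFailures(int from, int to) {
        int sum = 0;
        for (int x = from; x <= to; x++) {
            try {
                sum += 100 / x;     // integer division, as in the Scala snippet
            } catch (ArithmeticException e) {
                // "Resume": drop this element (x == 0) and carry on with the next one
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // 100/1 + 100/2 + 100/3 + 100/4 + 100/5 = 100 + 50 + 33 + 25 + 20
        System.out.println(sumSkippingFailures(0, 5)); // prints 228
    }
}
```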
Dynamic push/pull backpressure
• Fast subscriber can issue more Request(n) even before more
data arrives
• Publisher can accumulate demand
• Conforming to "fast publisher" responsibilities
• Total demand of elements is safe to publish
• Subscriber's buffer will never overflow
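The bullets above can be sketched as a publisher-side demand counter: requests add up even before data exists, and the publisher may emit only while demand is outstanding. DemandGate is a hypothetical, single-threaded illustration, not the Akka Streams implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical single-threaded gate that lets elements through only while demand is outstanding. */
final class DemandGate<T> {
    private final Consumer<T> downstream;
    private long demand = 0;

    DemandGate(Consumer<T> downstream) {
        this.downstream = downstream;
    }

    /** Called by the subscriber; demand accumulates across calls. */
    void request(long n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        long sum = demand + n;
        demand = (sum < 0) ? Long.MAX_VALUE : sum;   // saturate instead of overflowing
    }

    /** Called by the publisher; returns false when there is no demand left. */
    boolean tryEmit(T element) {
        if (demand == 0) {
            return false;   // the publisher must now stop, buffer, drop, or tear down
        }
        demand--;
        downstream.accept(element);
        return true;
    }

    long outstandingDemand() { return demand; }

    public static void main(String[] args) {
        List<Integer> received = new ArrayList<>();
        DemandGate<Integer> gate = new DemandGate<>(received::add);
        gate.request(2);    // the subscriber signals demand before any data arrives
        gate.request(3);    // accumulated demand is now 5
        for (int i = 1; i <= 6; i++) {
            System.out.println("emit " + i + " accepted=" + gate.tryEmit(i));
        }
        System.out.println(received);
    }
}
```

Because the publisher never emits more than the accumulated demand, the subscriber can size its buffer to the demand it has signalled, and that buffer cannot overflow.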
In-depth
Fan out
• Broadcast[T] (1 input, n outputs)
• Signals each output given an input signal
• Balance[T] (1 input, n outputs)
• Signals one of its output ports given an input signal
• FlexiRoute[In] (1 input, n outputs)
• Write custom fan out elements using a simple DSL
Fan in
• Merge[In] (n inputs, 1 output)
• Picks signals randomly from inputs
• Zip[A,B,Out] (2 inputs, 1 output)
• Zipping into an (A,B) tuple stream
• Concat[T] (2 inputs, 1 output)
• Concatenate streams (first, then second)
Scala example
val g = FlowGraph.closed() { implicit builder: FlowGraph.Builder =>
  import FlowGraph.Implicits._
  val in = Source(1 to 10)
  val out = Sink.ignore
  val bcast = builder.add(Broadcast[Int](2))
  val merge = builder.add(Merge[Int](2))
  val f1, f2, f3, f4 = Flow[Int].map(_ + 10)
  in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out
  bcast ~> f4 ~> merge
}
Advanced flow control
// return only the freshest element when the subscriber signals demand
val droppyStream: Flow[Message, Message] =
Flow[Message].conflate(seed = identity)((lastMessage, newMessage) => newMessage)
• conflate can be thought of as a special fold operation that
collapses multiple upstream elements into one aggregate
element
• groupedWithin chunks up this stream into groups of
elements received within a time window, or limited by the
given number of elements, whichever happens first
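The conflate semantics described above can be modelled in a few lines of plain Java: elements that arrive while the subscriber is busy are folded into one pending aggregate, which is handed over when demand is signalled. Conflater is a hypothetical class that mimics the seed/aggregate shape of the Scala call; it is a sketch of the behaviour, not Akka's implementation, and it assumes neither function returns null.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical model of conflate: fold elements together until the subscriber pulls. */
final class Conflater<In, Out> {
    private final Function<In, Out> seed;
    private final BiFunction<Out, In, Out> aggregate;
    private Out pending;    // null means nothing is waiting (functions must not return null)

    Conflater(Function<In, Out> seed, BiFunction<Out, In, Out> aggregate) {
        this.seed = seed;
        this.aggregate = aggregate;
    }

    /** Called by a fast upstream; never blocks and never grows memory. */
    synchronized void push(In element) {
        pending = (pending == null)
                ? seed.apply(element)
                : aggregate.apply(pending, element);
    }

    /** Called when the subscriber signals demand; returns null if nothing arrived. */
    synchronized Out pull() {
        Out result = pending;
        pending = null;
        return result;
    }

    public static void main(String[] args) {
        // Same policy as droppyStream: keep only the freshest element
        Conflater<String, String> freshest =
                new Conflater<String, String>(s -> s, (last, next) -> next);
        freshest.push("a");
        freshest.push("b");
        freshest.push("c");
        System.out.println(freshest.pull()); // prints c
    }
}
```

Swapping the aggregate function for a sum or a list append turns the same structure into a batching stage instead of a dropping one.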
Other sinks and sources - simple
streaming from/to Kafka
implicit val actorSystem = ActorSystem("ReactiveKafka")
implicit val materializer = ActorMaterializer()
val kafka = new ReactiveKafka(host = "localhost:9092", zooKeeperHost = "localhost:2181")
val publisher = kafka.consume("lowercaseStrings", "groupName", new StringDecoder())
val subscriber = kafka.publish("uppercaseStrings", "groupName", new StringEncoder())
// consume lowercase strings from kafka and publish them transformed to uppercase
Source(publisher).map(_.toUpperCase).to(Sink(subscriber)).run()
A quick comparison with Java 8
Streams
• Pull-based, synchronous sequences of values
• Iterators with a more parallelism-friendly interface
• Intermediate operations are lazy (e.g., filter, map)
• Terminal operations are eager (e.g., reduce)
• Only high-level control (no next/hasNext)
• Similar to Scala Collections
Java 8 Streams
String concatenatedString = listOfStrings
    .stream()
    .peek(s -> listOfStrings.add("three")) // don't do this!
    .reduce((a, b) -> a + " " + b)
    .get();
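The lazy/eager split is easy to observe without mutating the source collection. In the sketch below, the peek side effect runs only once the terminal reduce pulls elements through the pipeline. LazyStreamDemo is a hypothetical name, and the stream is sequential.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical demo: intermediate operations do no work until a terminal operation runs. */
final class LazyStreamDemo {

    static List<String> trace() {
        List<String> log = new ArrayList<>();

        // Building the pipeline executes neither peek nor map
        Stream<String> pipeline = Stream.of("one", "two")
                .peek(s -> log.add("peek " + s))
                .map(String::toUpperCase);
        log.add("pipeline built");

        // reduce is terminal: it pulls every element through peek and map
        String joined = pipeline.reduce((a, b) -> a + " " + b).get();
        log.add("result " + joined);
        return log;
    }

    public static void main(String[] args) {
        // prints [pipeline built, peek one, peek two, result ONE TWO]
        System.out.println(trace());
    }
}
```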
Code review and demo
Part 4 of 4
Source code available at https://github.com/rocketpages
Thank you!
