Reactive streaming is becoming the best approach to handle data flows across asynchronous boundaries. Here, we present the implementation of a real-world application based on Akka Streams. After reviewing the basics, we will discuss the development of a data processing pipeline that collects real-time sensor data and sends it to a Kinesis stream. There are various possible point of failures in this architecture. What should happen when Kinesis is unavailable? If the data flow is not handled in the correct way, some information may get lost. Akka Streams are the tools that enabled us to build a reliable processing logic for the pipeline that avoids data losses and maximizes the robustness of the entire system.
5. The Collector Service: two months ago...
AMAZON
KINESIS
INBOX
ACTOR
TCP
Kinesis Client
INBOX
ACTOR Kinesis Client
INBOX
ACTOR Kinesis Client
TCP
TCP
chunks
chunks
chunks
events
events
events
put
put
put
6.
7. Data may be lost
COLLECTOR SVC AMAZON KINESIS
WI-FI/3GSENSORS
CONNECTION
ISSUES
MESSAGE
OVERLOAD
23. class PersistentBuffer[A] (...)
extends GraphStage[FlowShape[A, A]] {
val in =
Inlet[A]("PersistentBuffer.in")
val out =
Outlet[A]("PersistentBuffer.out")
override val shape = FlowShape.of(in,out)
When the available stages are not enough, write your own
24. Custom stages: State
override def createLogic(
attr: Attributes
): GraphStageLogic =
new GraphStageLogic(shape) {
var state: StageState[A] = initialState[A]
25. val cb = getAsyncCallback[Try[A]] {
case Success(elements:A) =>
state = state.copy(...)
...
}
queue.getCurrentQueue
.onComplete(cb.invoke)
26. Custom Stages: ports
setHandler(in, new InHandler {
override def onPush() = {
val element = grab(in)
pull(in)
...
}
...
})
setHandler(out, new OutHandler {
override def onPull(): Unit = {
push(out, something)
}
...
})
27. Custom Stages: Start and stop
override def postStop(): Unit = {
...
}
override def preStart(): Unit = {
...
}
33. Automatic Fusion and Async Stages
A stream is handled by 1 thread
NO CROSS-THREAD COMMUNICATIONS
NO PARALLELISM
FASTER CODE!
SLOWER CODE!
A boundary will split your stream on different threads
source
.via(doSomething).async
.via(doSomething).async
.runWith(Sink.foreach(println))
34. Materialized Values
val sink: Sink[Any, Future[Done]] = Sink.ignore
.runWith(sink) is a shortcut for .toMat(sink)(Keep.Right).run
STREAM RUN
Obtain a materialized
value per each stage
35. Materialized Values Example: TestKit
val myFlow = Flow[String].map { v => v.take(1) }
val (pub, sub) = TestSource.probe[String]
.via(myFlow)
.toMat(TestSink.probe[Any])(Keep.both)
.run()
sub.request(1)
pub.sendNext("Gathering")
sub.expectNext("G")
36. Stream Ending: Supervisioning
val flow = otherFlow
.withAttributes
ActorAttributes.supervisionStrategy {
case _ => Supervision.resume
}
)