MANCHESTER LONDON NEW YORK
Petr Zapletal @petr_zapletal
#scaladays
@cakesolutions
Top Mistakes When Writing Reactive Applications
Agenda
● Motivation
● Actors vs Futures
● Serialization
● Flat Actor Hierarchies
● Graceful Shutdown
● Distributed Transactions
● Longtail Latencies
● Quick Tips
Actors vs Futures
Constraints Liberate, Liberties Constrain
Pick the Right Tool for The Job
[Diagram: Scala Future[T], Akka Stream, Akka Typed, and Akka Actors arranged between Constraints and Power, spanning local abstractions to distribution]
Actor Use Cases
● State management
● Location transparency
● Resilience mechanisms
● Single writer
● In-memory lock-free cache
● Sharding
Future Use Cases
● Local Concurrency
● Simplicity
● Composition
● Typesafety
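The composition point can be illustrated with a small sketch using only the Scala standard library. The lookups `priceOf` and `quantityOf` are hypothetical stand-ins for real asynchronous calls and are not part of the talk.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FutureComposition {
  // Hypothetical lookups standing in for real asynchronous calls.
  def priceOf(item: String): Future[BigDecimal] = Future(BigDecimal(10))
  def quantityOf(item: String): Future[Int] = Future(3)

  // Both futures are created before the for-comprehension, so they run concurrently;
  // the result type Future[BigDecimal] is checked by the compiler.
  def totalFor(item: String): Future[BigDecimal] = {
    val price = priceOf(item)
    val quantity = quantityOf(item)
    for {
      p <- price
      q <- quantity
    } yield p * q
  }
}
```

Creating the futures inside the for-comprehension instead would run them one after the other, which is a common source of accidental latency.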
Avoid Java Serialization
Java serialization is the default in Akka because it is easy to get started with, but it is very slow and has a heavy footprint
Sending Data Through Network
[Diagram: two Akka actors exchanging data over the network, with serialization on each side]
Persisting Data
[Diagram: an Akka actor persisting data through serialization]
[Chart: Java Serialization - Round Trip]
[Chart: Java Serialization - Footprint]
Java Serialization - Footprint
case class Order (id: Long, description: String, totalCost: BigDecimal, orderLines: ArrayList[OrderLine], customer: Customer)
Java Serialization:
----sr--model.Order----h#-----J--idL--customert--Lmodel/Customer;L--descriptiont--Ljava/lang/String;L--orderLinest--Ljava/util/List;L--totalCostt--Ljava/math/BigDecimal;xp--------ppsr--java.util.ArrayListx-----a----I--sizexp----w-----sr--model.OrderLine--&-1-S----I--lineNumberL--costq-~--L--descriptionq-~--L--ordert--Lmodel/Order;xp----sr--java.math.BigDecimalT--W--(O---I--scaleL--intValt--Ljava/math/BigInteger;xr--java.lang.Number-----------xp----sr--java.math.BigInteger-----;-----I--bitCountI--bitLengthI--firstNonzeroByteNumI--lowestSetBitI--signum[--magnitudet--[Bxq-~----------------------ur--[B------T----xp----xxpq-~--xq-~--
XML:
<order id="0" totalCost="0"><orderLines lineNumber="1" cost="0"><order>0</order></orderLines></order>
JSON:
{"order":{"id":0,"totalCost":0,"orderLines":[{"lineNumber":1,"cost":0,"order":0}]}}
Java Serialization Implementation
● Serializes
○ Data
○ Class metadata (class name, serialVersionUID, field names and types)
○ Metadata of all referenced classes
● It just “works”
○ Serializes almost everything (anything that implements Serializable)
○ Works with different JVMs
● Performance was not the main requirement
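The footprint cost of that metadata can be checked with a rough sketch like the one below. The `OrderLine` type and the hand-rolled JSON rendering are assumptions made for illustration; they are not the model or the benchmark used in the talk.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.nio.charset.StandardCharsets

object FootprintSketch {
  // Hypothetical message type; Scala case classes are Serializable by default.
  final case class OrderLine(lineNumber: Int, cost: Long, description: String)

  // Size in bytes of the value when written with Java serialization.
  def javaSerializedSize(value: AnyRef): Int = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    buffer.size()
  }

  // Size in bytes of a naive JSON rendering of the same data, for comparison only.
  def jsonSize(line: OrderLine): Int = {
    val json =
      s"""{"lineNumber":${line.lineNumber},"cost":${line.cost},"description":"${line.description}"}"""
    json.getBytes(StandardCharsets.UTF_8).length
  }
}
```

For a small message most of the Java-serialized bytes are class and field descriptors rather than data, which is why even a verbose text format can come out smaller.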
Points of Interest
● Performance
● Footprint
● Schema evolution
● Implementation effort
● Human readability
● Language bindings
● Backwards & forwards compatibility
● ...
JSON
● Advantages:
○ Human readability
○ Simple & well known
○ Many good libraries for all platforms
● Disadvantages:
○ Slow
○ Large
○ Object names included
○ No schema (except e.g. JSON Schema)
○ Format and precision issues
● json4s, circe, µPickle, spray-json, argonaut, rapture-json, play-json, …
Binary formats [Schema-less]
● Metadata sent together with data
● Advantages:
○ Implementation effort
○ Performance
○ Footprint *
● Disadvantages:
○ No human readability
● Kryo, Binary JSON (MessagePack, BSON, ... )
Binary formats [Schema]
● Schema defined by some kind of DSL
● Advantages:
○ Performance
○ Footprint
○ Schema evolution
● Disadvantages:
○ Implementation effort
○ No human readability
● Protobuf (+ projects like Flatbuffers, Cap’n Proto, etc.), Thrift, Avro
Summary
● The default Java serialization should always be changed
● Depends on particular use case
● Quick tips:
○ json4s
○ kryo
○ protobuf
Flat Actor Hierarchies
Errors should be handled out of band in a
parallel process - they are not part of the
main app
The Actor Hierarchy
[Diagram: the root actor / has the children /user and /system; the top-level actors /a1 and /a2 sit under /user, with the child actors /b1, /b2 and /c1 to /c4 below them]
Two Different Battles to Win
● Separate business logic and failure handling
○ Less complexity
○ Better supportability
● Getting our application back to life after something bad happened
○ Failure isolation
○ Recovery
○ No more midnight calls :)
Errors & Failures
Errors
● Common events
● The current request is affected
● Communicated to the client/caller
● Incorrect requests, errors during validations, ...
Failures
● Unexpected events
● Service/actor is not able to operate normally
● Reported to the supervisor
● Client can’t do anything, might be notified
● Database failures, network partitions, hardware malfunctions, ...
Error Kernel Pattern
● Actor’s state is lost during restart and may not be recovered
● Delegate dangerous tasks to child actors and supervise them
[Diagram: the actor /user/a1 delegates the dangerous task to its child /user/a1/w1]
Backoff Supervisor
● Restarts actors each time with a growing time delay between restarts
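import akka.pattern.{Backoff, BackoffSupervisor}
import scala.concurrent.duration._

// childProps: the Props of the child actor to supervise
val supervisorProps = BackoffSupervisor.props(
  Backoff.onFailure(
    childProps,
    childName = "foo",
    minBackoff = 3.seconds,  // delay before the first restart
    maxBackoff = 30.seconds, // upper bound for the growing delay
    randomFactor = 0.2       // adds up to 20% random noise to each delay
  ))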
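How a growing, capped delay with some random noise can be computed is sketched below in plain Scala. This only approximates the idea and is not Akka's implementation; the helper `delay` and its `jitter` parameter are hypothetical.

```scala
import scala.concurrent.duration._

object BackoffSketch {
  // Hypothetical helper: delay before restart number `restartCount` (0-based).
  // `jitter` is expected in [0.0, 1.0); real code would pass a random value.
  def delay(restartCount: Int,
            minBackoff: FiniteDuration,
            maxBackoff: FiniteDuration,
            randomFactor: Double,
            jitter: Double): FiniteDuration = {
    // Double the delay on every restart, starting from minBackoff.
    val exponentialMillis = minBackoff.toMillis.toDouble * math.pow(2.0, restartCount.toDouble)
    // Never wait longer than maxBackoff (before noise is added).
    val cappedMillis = math.min(exponentialMillis, maxBackoff.toMillis.toDouble)
    // Stretch the delay by up to randomFactor so restarts do not synchronize.
    val withJitter = cappedMillis * (1.0 + jitter * randomFactor)
    withJitter.toLong.millis
  }
}
```

With the settings above and no noise, the delays in this sketch grow as 3 s, 6 s, 12 s, 24 s and then stay at the 30 s cap.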
Summary
● Create rich actor hierarchies
● Separate business logic and failure handling
● Backoff Supervisor
Graceful Shutdown
We have thousands of sharded actors on
multiple nodes and we want to shut one of
them down
High-level Procedure
1. JVM gets the shutdown signal
2. Coordinator tells all local ShardRegions to shut down gracefully
3. Node leaves cluster
4. Coordinator gives singletons a grace period to migrate
5. Actor System & JVM Termination
Integration with Sharded Actors
● Handling of added messages
○ Passivate() message for graceful stop
○ context.stop() for immediate stop
● Priority mailbox
○ Priority message handling
○ Message retrying support
CoordinatedShutdown Extension
● Stops actors/services in a specific order
● Allows tasks to be registered and executed during the shutdown
● More generic approach
● Added in Akka 2.5 (released about a week before this talk)
Summary
● We don’t want to lose data (usually)
● Shutdown coordinator on every node & integration with sharded actors
● Akka’s CoordinatedShutdown
Distributed Transactions
Any situation where a single event results in
the mutation of two separate sources of data
which cannot be committed atomically
What’s Wrong With Them
● Simple happy paths
● Fallacies of Distributed Programming
○ The network is reliable.
○ Latency is zero.
○ Bandwidth is infinite.
○ The network is secure.
○ Topology doesn't change.
○ There is one administrator.
○ Transport cost is zero.
○ The network is homogeneous.
Two-phase commit (2PC)
[Diagram: in Stage 1 (Prepare) the Transaction Manager sends Prepare to each Resource Manager and each replies Prepared; in Stage 2 (Commit) it sends Commit and each replies Committed]
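The two stages can be sketched in a few lines of plain Scala. The `ResourceManager` trait is a hypothetical in-process interface; real participants are remote, and the sketch deliberately covers only the happy path and a "no" vote, not crashes or lost messages between the stages.

```scala
object TwoPhaseCommitSketch {
  // Hypothetical participant interface; real resource managers would be remote.
  trait ResourceManager {
    def prepare(): Boolean // vote: true means "prepared"
    def commit(): Unit
    def rollback(): Unit
  }

  // Stage 1 collects a vote from every participant. Stage 2 commits only if
  // all voted yes, otherwise rolls everyone back. Returns true when committed.
  def run(participants: Seq[ResourceManager]): Boolean = {
    val votes = participants.map(_.prepare())
    val allPrepared = votes.forall(identity)
    if (allPrepared) participants.foreach(_.commit())
    else participants.foreach(_.rollback())
    allPrepared
  }
}
```

Everything this sketch leaves out, such as a coordinator failing after some participants have committed, is where the fallacies listed earlier start to bite.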
Saga Pattern
[Diagram: transactions T1 to T4 run in sequence; each has a compensating action C1 to C4]
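A minimal in-memory sketch of the idea follows: each step pairs a transaction Ti with a compensation Ci, and a failure triggers the compensations of the already completed steps in reverse order. The `Step` type and `run` function are hypothetical names; a production saga would also persist its progress and retry compensations, which is omitted here.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object SagaSketch {
  // One saga step: a transaction Ti and its compensating action Ci.
  final case class Step(name: String, transaction: () => Try[Unit], compensation: () => Unit)

  // Runs the steps in order. If one fails, compensates the completed steps
  // newest-first and returns the failure; otherwise returns the step names.
  def run(steps: List[Step]): Try[List[String]] = {
    @tailrec
    def loop(remaining: List[Step], done: List[Step]): Try[List[String]] =
      remaining match {
        case Nil => Success(done.reverse.map(_.name))
        case step :: rest =>
          step.transaction() match {
            case Success(_) => loop(rest, step :: done)
            case Failure(error) =>
              done.foreach(_.compensation()) // `done` is already newest-first
              Failure(error)
          }
      }
    loop(steps, Nil)
  }
}
```

If T3 fails in a saga of four steps, this sketch runs C2 and then C1, and T4 is never attempted.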
The Big Trade-Off
● Distributed transactions can usually be avoided
○ They are hard, expensive, fragile, and do not scale
● Every business event needs to result in a single synchronous commit
● Other data sources should be updated asynchronously
● Introducing eventual consistency
Longtail Latencies
Consider a system where each service
typically responds in 10ms but with a 99th
percentile latency of one second
[Chart: Latency Normal vs. Longtail; latency in ms (0 to 50) by percentile (25, 50, 75, 90, 99, 99.9), with the series Normal and Longtail]
Longtails really matter
● Latency accumulation
● Not just noise
● Don’t have to be power users
● Real problem
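Latency accumulation can be made concrete with a little arithmetic. The sketch below assumes, which the slides do not state, that one user request fans out to several services in parallel and that slow responses are independent of each other. The name `probabilitySlow` is hypothetical.

```scala
object TailMath {
  // Probability that at least one of `fanOut` independent calls lands in the
  // slow tail, where `tailFraction` is the share of slow calls (0.01 for p99).
  def probabilitySlow(fanOut: Int, tailFraction: Double): Double =
    1.0 - math.pow(1.0 - tailFraction, fanOut.toDouble)
}
```

Under these assumptions a request that touches 100 services, each slow 1% of the time, hits at least one slow response roughly 63% of the time, so the tail becomes the common case.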
Investigating Longtail Latencies
● Narrow the problem
● Isolate in a test environment
● Measure & monitor everything
● Tackle the problem
● Pretty hard job
Tolerating Longtail Latencies
● Hedging your bet
● Tied requests
● Selectively increase replication factors
● Put slow machines on probation
● Consider ‘good enough’ responses
● Hardware update
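The first technique, hedging, can be sketched with standard-library futures: send the request once and, if no reply has arrived after a short delay, send it again and take whichever reply comes first. The `hedged` helper and its parameters are hypothetical, and the sketch does not cancel the losing request, which tied requests would add.

```scala
import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration.FiniteDuration

object HedgingSketch {
  // Sends `request()` once. If no reply has arrived within `hedgeAfter`, sends it
  // a second time and completes with whichever reply finishes first.
  def hedged[T](request: () => Future[T], hedgeAfter: FiniteDuration)(
      implicit ec: ExecutionContext, scheduler: ScheduledExecutorService): Future[T] = {
    val result = Promise[T]()
    request().onComplete(result.tryComplete)
    scheduler.schedule(new Runnable {
      // Only issue the backup request if the primary has not answered yet.
      def run(): Unit =
        if (!result.isCompleted) request().onComplete(result.tryComplete)
    }, hedgeAfter.toMillis, TimeUnit.MILLISECONDS)
    result.future
  }
}
```

Choosing `hedgeAfter` near a high percentile of the expected latency keeps the extra load small, since only the slowest requests are ever sent twice.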
Quick Tips
● Monitoring
● Network partitions
○ Split Brain Resolver
● Blocking
● Too many actor systems
Questions
MANCHESTER LONDON NEW YORK
@petr_zapletal @cakesolutions
347 708 1518
petrz@cakesolutions.net
References
● http://www.reactivemanifesto.org/
● http://www.slideshare.net/ktoso/zen-of-akka
● http://eishay.github.io/jvm-serializers/prototype-results-page/
● http://java-persistence-performance.blogspot.com/2013/08/optimizing-java-serialization-java-vs.html
● https://github.com/romix/akka-kryo-serialization
● http://gotocon.com/dl/goto-chicago-2015/slides/CaitieMcCaffrey_ApplyingTheSagaPattern.pdf
● http://www.grahamlea.com/2016/08/distributed-transactions-microservices-icebergs/
● http://www.cs.duke.edu/courses/cps296.4/fall13/838-CloudPapers/dean_longtail.pdf
● https://engineering.linkedin.com/performance/who-moved-my-99th-percentile-latency
● http://doc.akka.io/docs/akka/rp-15v09p01/scala/split-brain-resolver.html
● http://manuel.bernhardt.io/2016/08/09/akka-anti-patterns-flat-actor-hierarchies-or-mixing-business-logic-and-failure-handling/
Backup Slides
Adding Shutdown Hook
val nodeShutdownCoordinatorActor = system.actorOf(Props(
  new NodeGracefulShutdownCoordinator(...)))

sys.addShutdownHook {
  nodeShutdownCoordinatorActor ! StartNodeShutdown(shardRegions)
}
Tell Local Regions to Shutdown
when(AwaitNodeShutdownInitiation) {
  case Event(StartNodeShutdown(shardRegions), _) =>
    if (shardRegions.nonEmpty) {
      // starts watching of every shard region and sends GracefulShutdown msg to them
      stopShardRegions(shardRegions)
      goto(AwaitShardRegionsShutdown) using ManagedRegions(shardRegions)
    } else {
      // registers OnMemberRemoved and leaves the cluster
      leaveCluster()
      goto(AwaitClusterExit)
    }
}
Node Leaves the Cluster
when(AwaitShardRegionsShutdown, stateTimeout = ... ){
  case Event(Terminated(actor), ManagedRegions(regions)) =>
    if (regions.contains(actor)) {
      val remainingRegions = regions - actor
      if (remainingRegions.isEmpty) {
        leaveCluster()
        goto(AwaitClusterExit)
      } else {
        goto(AwaitShardRegionsShutdown) using ManagedRegions(remainingRegions)
      }
    } else {
      stay()
    }
  case Event(StateTimeout, _) =>
    leaveCluster()
    goto(AwaitNodeTerminationSignal)
}
Wait for Singletons to Migrate
when(AwaitClusterExit, stateTimeout = ...) {
  case Event(NodeLeftCluster | StateTimeout, _) =>
    // Waiting on cluster singleton migration
    goto(AwaitClusterSingletonMigration)
}
when(AwaitClusterSingletonMigration, stateTimeout = ... ) {
  case Event(StateTimeout, _) =>
    goto(AwaitNodeTerminationSignal)
}
onTransition {
  case AwaitClusterSingletonMigration -> AwaitNodeTerminationSignal =>
    self ! TerminateNode
}
Actor System & JVM Termination
when(AwaitNodeTerminationSignal, stateTimeout = ...) {
  case Event(TerminateNode | StateTimeout, _) =>
    // This is NOT an Akka thread-pool (since we're shutting those down)
    val ec = scala.concurrent.ExecutionContext.global
    // Calls context.system.terminate with registered onComplete block
    terminateSystem {
      case Success(_) =>
        System.exit(...)
      case Failure(ex) =>
        System.exit(...)
    }(ec)
    stop(Shutdown)
}

 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 

Reactive mistakes - ScalaDays Chicago 2017

  • 2. Petr Zapletal @petr_zapletal #scaladays @cakesolutions Top Mistakes When Writing Reactive Applications
  • 3. Agenda ● Motivation ● Actors vs Futures ● Serialization ● Flat Actor Hierarchies ● Graceful Shutdown ● Distributed Transactions ● Longtail Latencies ● Quick Tips
  • 4. Actors vs Futures Constraints Liberate, Liberties Constrain
  • 8. Pick the Right Tool for The Job [diagram: Scala Future[T], Akka Stream, Akka Typed and Akka Actors placed between the axes Power and Constraints, spanning Local Abstractions to Distribution]
  • 9. Actor Use Cases ● State management ● Location transparency ● Resilience mechanisms ● Single writer ● In-memory lock-free cache ● Sharding Akka ACTOR
  • 10. Future Use Cases ● Local concurrency ● Simplicity ● Composition ● Type safety Scala Future[T]
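The composition point can be illustrated with a minimal sketch using only the Scala standard library. The names `fetchPrice`, `fetchQuantity` and `orderTotal` are hypothetical and do not come from the talk; they stand in for any two independent asynchronous calls.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureComposition {
  // Hypothetical lookups standing in for real asynchronous calls.
  def fetchPrice(item: String): Future[BigDecimal] =
    Future(if (item == "book") BigDecimal(12) else BigDecimal(3))

  def fetchQuantity(item: String): Future[Int] =
    Future(2)

  // Both futures are started before the for-comprehension, so they run
  // concurrently; the comprehension only combines their results.
  def orderTotal(item: String): Future[BigDecimal] = {
    val price = fetchPrice(item)
    val quantity = fetchQuantity(item)
    for {
      p <- price
      q <- quantity
    } yield p * q
  }

  // Blocking is used here only to make the sketch easy to try out.
  def totalBlocking(item: String): BigDecimal =
    Await.result(orderTotal(item), 5.seconds)
}
```

The result type `Future[BigDecimal]` is checked by the compiler, which is the type-safety benefit the slide lists next to composition.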
  • 11. Avoid Java Serialization Java serialization is the default in Akka because it is easy to start with, but it is very slow and has a heavy footprint
  • 12. Sending Data Through Network [diagram: two Akka actors exchanging data over the network, with a serialization step on each side]
  • 14. Java Serialization - Round Trip
  • 15. Java Serialization - Footprint
  • 16. Java Serialization - Footprint
        case class Order(id: Long, description: String, totalCost: BigDecimal, orderLines: ArrayList[OrderLine], customer: Customer)
        Java Serialization:
        ----sr--model.Order----h#-----J--idL--customert--Lmodel/Customer;L--descriptiont--Ljava/lang/String;L--orderLinest--Ljava/util/List;L--totalCostt--Ljava/math/BigDecimal;xp--------ppsr--java.util.ArrayListx-----a----I--sizexp----w-----sr--model.OrderLine--&-1-S----I--lineNumberL--costq-~--L--descriptionq-~--L--ordert--Lmodel/Order;xp----sr--java.math.BigDecimalT--W--(O---I--scaleL--intValt--Ljava/math/BigInteger;xr--java.lang.Number-----------xp----sr--java.math.BigInteger-----;-----I--bitCountI--bitLengthI--firstNonzeroByteNumI--lowestSetBitI--signum[--magnitudet--[Bxq-~----------------------ur--[B------T----xp----xxpq-~--xq-~--
        XML:
        <order id="0" totalCost="0"><orderLines lineNumber="1" cost="0"><order>0</order></orderLines></order>
        JSON:
        {"order":{"id":0,"totalCost":0,"orderLines":[{"lineNumber":1,"cost":0,"order":0}]}}
  • 17. Java Serialization Implementation ● Serializes ○ Data ○ Class metadata (class name, field names and types) ○ Metadata of all referenced classes ● It just “works” ○ Serializes almost anything that implements Serializable ○ Works across different JVMs ● Performance was not the main requirement
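The footprint overhead described above can be observed with a small sketch that uses only the JDK. It compares a boxed `Long` written with default Java serialization against the same value written as raw bytes. The object and method names are hypothetical, and the sketch makes no claim about how Akka itself invokes the serializer.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream, ObjectOutputStream}

object SerializationFootprint {
  // Number of bytes produced by default Java serialization for `value`.
  // The stream contains a header and class metadata in addition to the data.
  def javaSerializedSize(value: java.io.Serializable): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(value) finally out.close()
    bytes.size()
  }

  // Number of bytes needed when only the 64-bit value itself is written.
  def rawLongSize(value: Long): Int = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    try out.writeLong(value) finally out.close()
    bytes.size()
  }
}
```

Calling `SerializationFootprint.javaSerializedSize(java.lang.Long.valueOf(42L))` returns a size several times larger than the 8 bytes reported by `rawLongSize(42L)`, because the class name and field descriptors travel with the value.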
  • 18. Points of Interest ● Performance ● Footprint ● Schema evolution ● Implementation effort ● Human readability ● Language bindings ● Backwards & forwards compatibility ● ...
  • 19. JSON ● Advantages: ○ Human readability ○ Simple & well known ○ Many good libraries for all platforms ● Disadvantages: ○ Slow ○ Large ○ Object names included ○ No schema (except e.g. json schema) ○ Format and precision issues ● json4s, circe, µPickle, spray-json, argonaut, rapture-json, play-json, …
  • 20. Binary formats [Schema-less] ● Metadata sent together with data ● Advantages: ○ Implementation effort ○ Performance ○ Footprint * ● Disadvantages: ○ No human readability ● Kryo, Binary JSON (MessagePack, BSON, ... )
  • 21. Binary formats [Schema] ● Schema defined by some kind of DSL ● Advantages: ○ Performance ○ Footprint ○ Schema evolution ● Disadvantages: ○ Implementation effort ○ No human readability ● Protobuf (+ projects like Flatbuffers, Cap’n Proto, etc.), Thrift, Avro
  • 22. Summary ● The default Java serialization should always be replaced ● The right replacement depends on the particular use case ● Quick tips: ○ json4s ○ kryo ○ protobuf
  • 23. Flat Actor Hierarchies Errors should be handled out of band in a parallel process - they are not part of the main app
  • 27. Top Level Actors: The Actor Hierarchy [diagram: root actor / with children /user and /system; top-level actors /a1 and /a2 under /user, with descendants /b1, /b2 and /c1 to /c4]
  • 28. Two Different Battles to Win ● Separate business logic and failure handling ○ Less complexity ○ Better supportability ● Getting our application back to life after something bad happened ○ Failure isolation ○ Recovery ○ No more midnight calls :)
  • 29. Errors & Failures Errors ● Common events ● Only the current request is affected ● Communicated to the client/caller ● Incorrect requests, errors during validations, ... Failures ● Unexpected events ● Service/actor is not able to operate normally ● Reported to the supervisor ● Client can’t do anything, might be notified ● Database failures, network partitions, hardware malfunctions, ...
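One way to keep the two categories apart in plain Scala is sketched below. This is an illustration under assumptions, not code from the talk: expected errors are returned as values the caller can report to the client, while unexpected failures surface as exceptions that a supervising layer handles. All names (`ValidationError`, `validate`, `store`, `dbUp`) are hypothetical.

```scala
import scala.util.Try

object ErrorsVsFailures {
  // Errors are expected outcomes, so they are modelled as ordinary values.
  sealed trait ValidationError
  case object EmptyName extends ValidationError
  final case class NegativeAmount(amount: Int) extends ValidationError

  // Returns Left for an incorrect request; nothing is thrown.
  def validate(name: String, amount: Int): Either[ValidationError, (String, Int)] =
    if (name.isEmpty) Left(EmptyName)
    else if (amount < 0) Left(NegativeAmount(amount))
    else Right((name, amount))

  // Failures are unexpected. This hypothetical storage call throws when the
  // database is down; Try stands in for whatever supervises the call.
  def store(name: String, amount: Int, dbUp: Boolean): Try[String] =
    Try {
      if (!dbUp) throw new IllegalStateException("database unavailable")
      s"stored $name:$amount"
    }
}
```

In an actor system the `Try` would typically be replaced by letting the exception escape to the supervisor, which is the separation the next slides build on.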
  • 30. Error Kernel Pattern ● Actor’s state is lost during restart and may not be recovered ● Delegate dangerous tasks to child actors and supervise them [diagram: /user/a1 delegating work to its child /user/a1/w1]
  • 31. Backoff Supervisor ● Restarts the child actor after each failure, with a growing delay between restarts
        import akka.pattern.{Backoff, BackoffSupervisor}
        import scala.concurrent.duration._

        BackoffSupervisor.props(
          Backoff.onFailure(
            childProps,
            childName = "foo",
            minBackoff = 3.seconds,
            maxBackoff = 30.seconds,
            randomFactor = 0.2
          ))
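To make the effect of `minBackoff`, `maxBackoff` and `randomFactor` concrete, here is a minimal sketch of a capped exponential backoff with jitter. It illustrates the general technique and is not Akka's implementation; the `delay` function and its `jitter` parameter (a number in [0, 1) supplied by the caller instead of a random generator) are assumptions made to keep the example deterministic.

```scala
import scala.concurrent.duration._

object BackoffDelay {
  // Delay before restart number `restartCount` (starting at 0):
  // min * 2^restartCount, capped at max, then stretched by up to
  // `randomFactor` according to `jitter` in [0, 1).
  def delay(
      restartCount: Int,
      min: FiniteDuration,
      max: FiniteDuration,
      randomFactor: Double,
      jitter: Double
  ): FiniteDuration = {
    val exponential = min.toMillis.toDouble * math.pow(2, restartCount)
    val capped = math.min(exponential, max.toMillis.toDouble)
    (capped * (1.0 + jitter * randomFactor)).toLong.millis
  }
}
```

With the values from the slide and no jitter, the successive delays are 3 s, 6 s, 12 s, 24 s and then 30 s for every later restart. The random factor spreads restarts out so that many failed actors do not hit a recovering resource at the same instant.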
  • 36. Summary ● Create rich actor hierarchies ● Separate business logic and failure handling ● Backoff Supervisor
  • 37. Graceful Shutdown We have thousands of sharded actors on multiple nodes and we want to shut one of them down
  • 44. High-level Procedure 1. JVM gets the shutdown signal 2. Coordinator tells all local ShardRegions to shut down gracefully 3. Node leaves cluster 4. Coordinator gives singletons a grace period to migrate 5. Actor System & JVM Termination
  • 45. Integration with Sharded Actors ● Handling of added messages ○ Passivate() message for graceful stop ○ context.stop() for immediate stop ● Priority mailbox ○ Priority message handling ○ Message retrying support
  • 46. CoordinatedShutdown Extension ● Stops actors/services in a specific order ● Allows tasks to be registered and executed during shutdown ● More generic approach ● Added in Akka 2.5 (~ a week before this talk)
  • 47. Summary ● We don’t want to lose data (usually) ● Shutdown coordinator on every node & Integration with sharded actors ● Akka’s CoordinatedShutdown
  • 48. Distributed Transactions Any situation where a single event results in the mutation of two separate sources of data which cannot be committed atomically
  • 49. What’s Wrong With Them ● Simple happy paths ● Fallacies of Distributed Programming ○ The network is reliable. ○ Latency is zero. ○ Bandwidth is infinite. ○ The network is secure. ○ Topology doesn't change. ○ There is one administrator. ○ Transport cost is zero. ○ The network is homogeneous.
  • 50. Two-phase commit (2PC) ● Stage 1 - Prepare: the Transaction Manager sends Prepare to each Resource Manager, which replies Prepared ● Stage 2 - Commit: the Transaction Manager sends Commit to each Resource Manager, which replies Committed
  • 51. Saga Pattern [diagram: transactions T1 to T4, each paired with a compensating action C1 to C4]
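The saga idea can be sketched in a few lines of plain Scala: every step carries a compensating action, and when a step fails the compensations of the steps that already completed run in reverse order. This is a simplified, synchronous, in-memory illustration with hypothetical names (`Step`, `run`); a production saga would also persist its progress and retry compensations.

```scala
import scala.annotation.tailrec

object SagaSketch {
  // One saga step: a forward action (T) and its compensating action (C).
  final case class Step(
      name: String,
      action: () => Either[String, Unit],
      compensate: () => Unit
  )

  // Runs the steps in order. On the first failure, runs the compensations
  // of the completed steps, newest first, and returns the error.
  def run(steps: List[Step]): Either[String, Unit] = {
    @tailrec
    def loop(remaining: List[Step], done: List[Step]): Either[String, Unit] =
      remaining match {
        case Nil => Right(())
        case step :: rest =>
          step.action() match {
            case Right(_) => loop(rest, step :: done)
            case Left(error) =>
              done.foreach(_.compensate()) // `done` is already newest-first
              Left(error)
          }
      }
    loop(steps, Nil)
  }
}
```

If T1 and T2 succeed and T3 fails, the observable order of effects is T1, T2, C2, C1, leaving the system in a consistent state without a distributed lock.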
  • 52. The Big Trade-Off ● Distributed transactions can usually be avoided ○ Hard, expensive, fragile and do not scale ● Every business event needs to result in a single synchronous commit ● Other data sources should be updated asynchronously ● Introducing eventual consistency
  • 53. Longtail Latencies Consider a system where each service typically responds in 10ms but with a 99th percentile latency of one second
  • 54. Longtail Latencies [chart: Latency, Normal vs. Longtail; x-axis: Percentile (25 to 99.9), y-axis: Latency (ms)]
  • 55. Longtails really matter ● Latency accumulation ● Not just noise ● Don’t have to be power users ● Real problem
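Latency accumulation can be quantified with a short calculation. Assuming each call independently has a 1% chance of landing beyond the 99th percentile, the chance that a request fanning out to n services hits at least one slow call is 1 - 0.99^n. The function name below is hypothetical, and the independence assumption is a simplification.

```scala
object TailLatency {
  // Probability that at least one of `fanOut` independent calls is slower
  // than the percentile threshold, where each call is slow with
  // probability `slowFraction` (0.01 for the 99th percentile).
  def probabilityOfSlowRequest(fanOut: Int, slowFraction: Double = 0.01): Double =
    1.0 - math.pow(1.0 - slowFraction, fanOut)
}
```

For a fan-out of 100 services this gives roughly 0.63, so with the one-second 99th percentile from the earlier example about 63% of user requests would take more than a second.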
  • 56. Investigating Longtail Latencies ● Narrow the problem ● Isolate in a test environment ● Measure & monitor everything ● Tackle the problem ● Pretty hard job
  • 63. Tolerating Longtail Latencies ● Hedging your bet ● Tied requests ● Selectively increase replication factors ● Put slow machines on probation ● Consider ‘good enough’ responses ● Hardware update
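A hedged request, the first technique in the list, can be sketched with standard-library futures: the primary request is sent immediately, and a backup request is sent only if no answer has arrived after a short delay. The helper `hedged` and its parameters are hypothetical, and the sketch does not cancel the slower request, which a real client would usually want to do.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory, TimeUnit}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object HedgedRequest {
  // Daemon timer thread so the scheduler never keeps the JVM alive.
  private val scheduler: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "hedge-timer")
        t.setDaemon(true)
        t
      }
    })

  // Sends `primary` immediately. If no response has completed the result
  // after `hedgeAfter`, also sends `backup`. The first response to complete,
  // whether successful or failed, becomes the result.
  def hedged[A](
      primary: () => Future[A],
      backup: () => Future[A],
      hedgeAfter: FiniteDuration
  )(implicit ec: ExecutionContext): Future[A] = {
    val result = Promise[A]()
    primary().onComplete(result.tryComplete)
    val sendBackup = new Runnable {
      def run(): Unit =
        if (!result.isCompleted) backup().onComplete(result.tryComplete)
    }
    scheduler.schedule(sendBackup, hedgeAfter.toMillis, TimeUnit.MILLISECONDS)
    result.future
  }
}
```

Choosing `hedgeAfter` close to the 95th percentile of normal latency keeps the extra load small, since only the slowest few percent of requests trigger a backup.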
  • 69. Quick Tips ● Monitoring ● Network partitions ○ Split Brain Resolver ● Blocking ● Too many actor systems
  • 71. MANCHESTER LONDON NEW YORK @petr_zapletal @cakesolutions 347 708 1518 petrz@cakesolutions.net We are hiring http://www.cakesolutions.net/careers
  • 72. References ● http://www.reactivemanifesto.org/ ● http://www.slideshare.net/ktoso/zen-of-akka ● http://eishay.github.io/jvm-serializers/prototype-results-page/ ● http://java-persistence-performance.blogspot.com/2013/08/optimizing-java-serialization-java-vs.html ● https://github.com/romix/akka-kryo-serialization ● http://gotocon.com/dl/goto-chicago-2015/slides/CaitieMcCaffrey_ApplyingTheSagaPattern.pdf ● http://www.grahamlea.com/2016/08/distributed-transactions-microservices-icebergs/ ● http://www.cs.duke.edu/courses/cps296.4/fall13/838-CloudPapers/dean_longtail.pdf ● https://engineering.linkedin.com/performance/who-moved-my-99th-percentile-latency ● http://doc.akka.io/docs/akka/rp-15v09p01/scala/split-brain-resolver.html ● http://manuel.bernhardt.io/2016/08/09/akka-anti-patterns-flat-actor-hierarchies-or-mixing-business-logic-and-failure-handling/
  • 74. Adding Shutdown Hook
        val nodeShutdownCoordinatorActor = system.actorOf(Props(
          new NodeGracefulShutdownCoordinator(...)))

        sys.addShutdownHook {
          nodeShutdownCoordinatorActor ! StartNodeShutdown(shardRegions)
        }
  • 77. Tell Local Regions to Shutdown
        when(AwaitNodeShutdownInitiation) {
          case Event(StartNodeShutdown(shardRegions), _) =>
            if (shardRegions.nonEmpty) {
              // starts watching every shard region and sends each a GracefulShutdown message
              stopShardRegions(shardRegions)
              goto(AwaitShardRegionsShutdown) using ManagedRegions(shardRegions)
            } else {
              // registers OnMemberRemoved and leaves the cluster
              leaveCluster()
              goto(AwaitClusterExit)
            }
        }
  • 81. Node Leaves the Cluster
        when(AwaitShardRegionsShutdown, stateTimeout = ... ) {
          case Event(Terminated(actor), ManagedRegions(regions)) =>
            if (regions.contains(actor)) {
              val remainingRegions = regions - actor
              if (remainingRegions.isEmpty) {
                leaveCluster()
                goto(AwaitClusterExit)
              } else {
                goto(AwaitShardRegionsShutdown) using ManagedRegions(remainingRegions)
              }
            } else {
              stay()
            }
          case Event(StateTimeout, _) =>
            leaveCluster()
            goto(AwaitNodeTerminationSignal)
        }
  • 84. Wait for Singletons to Migrate
        when(AwaitClusterExit, stateTimeout = ...) {
          case Event(NodeLeftCluster | StateTimeout, _) =>
            // Waiting on cluster singleton migration
            goto(AwaitClusterSingletonMigration)
        }

        when(AwaitClusterSingletonMigration, stateTimeout = ... ) {
          case Event(StateTimeout, _) =>
            goto(AwaitNodeTerminationSignal)
        }

        onTransition {
          case AwaitClusterSingletonMigration -> AwaitNodeTerminationSignal =>
            self ! TerminateNode
        }
  • 88. Actor System & JVM Termination
        when(AwaitNodeTerminationSignal, stateTimeout = ...) {
          case Event(TerminateNode | StateTimeout, _) =>
            // This is NOT an Akka thread-pool (since we're shutting those down)
            val ec = scala.concurrent.ExecutionContext.global
            // Calls context.system.terminate with registered onComplete block
            terminateSystem {
              case Success(ex) => System.exit(...)
              case Failure(ex) => System.exit(...)
            }(ec)
            stop(Shutdown)
        }