Using futures as a basic building block for concurrent, async code has become pervasive in the past few years, and for good reason. However, when moving from traditional synchronous code to async code, a set of patterns that were obvious to implement before now seems more challenging. The aim of this talk is to show a few examples of these patterns implemented with Scala futures in an async, non-blocking manner. We will present both the usage pattern and the implementation in order to show the principles of properly handling async code.
About Us
Asy Ronen
* Independent consultant
* asy.ronen@gmail.com
* linkedin.com/in/asyronen
Michael Arenzon
* Application Infrastructure TL @ Outbrain
* github.com/marenzo
* linkedin.com/in/arenzon
Intro
▪ When writing async code, a few models can be used, such as:
▫ Actor
▫ CSP
▫ Future / Promise
▪ We will concentrate on futures, specifically Scala's Future
▪ We will introduce a set of useful patterns that will enhance the resiliency of your system
Motivation
▪ At Outbrain we started writing asynchronous services a few years back, using our own library called Ob1k.
▪ Ob1k includes a Future implementation in Java that is similar to the Scala one (or C#'s Task).
▪ We learned a few patterns that were useful when working with asynchronous code in a live production environment.
▪ These patterns have now been ported to a standalone library around Scala's Future.
Pattern #1 - schedule()
▪ When writing async code there is a need to relate to a real clock in order to:
▫ Schedule a future computation
▫ "Sleep" between two async actions
▫ Measure the time of async actions
▫ Etc.
▪ To do that we need a scheduler
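The deck goes straight to a usage example, so here is a minimal sketch of what a `schedule()` helper might look like - assuming a `Scheduler` backed by a JDK `ScheduledExecutorService`; the class name and shape here are illustrative assumptions, not the library's actual API:

```scala
import java.util.concurrent.{ScheduledExecutorService, TimeUnit}
import scala.concurrent.{Future, Promise}
import scala.concurrent.duration.FiniteDuration
import scala.util.Try

// Hypothetical Scheduler: runs a block after a delay and exposes the result as a Future.
class Scheduler(underlying: ScheduledExecutorService) {
  def schedule[T](delay: FiniteDuration)(block: => T): Future[T] = {
    val promise = Promise[T]()
    underlying.schedule(new Runnable {
      // Try(block) turns an exception thrown by the task into a failed Future
      def run(): Unit = promise.complete(Try(block))
    }, delay.toMillis, TimeUnit.MILLISECONDS)
    promise.future
  }
}
```

Completing a `Promise` from the timer thread keeps `schedule()` itself non-blocking; the caller only ever sees a `Future`.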
schedule() - Usage Example

"schedule" should "schedule execution in the future" in {
  val res = schedule(1 second) {
    System.currentTimeMillis()
  }
  val t1 = System.currentTimeMillis()
  val t2 = Await.result(res, 2 seconds)
  val time = t2 - t1
  assert(time >= 1000 && time <= 1100)
}
Pattern #2 - withTimeout()
▪ If not properly bounded, we can wait for a future forever.
▪ If we want to provide an SLA for our service, we must limit the total time allowed to process a request.
▪ An error returned on time is better than no answer at all.
withTimeout() - Implementation

implicit class FutureTimeout[T](future: Future[T]) {
  def withTimeout(duration: FiniteDuration)
                 (implicit scheduler: Scheduler,
                  executor: ExecutionContext): Future[T] = {
    val deadline = schedule(duration) {
      throw new TimeoutException("future timeout")
    }
    Future firstCompletedOf Seq(future, deadline)
  }
}

Note: the original future task is not interrupted!
withTimeout() - (Happy) Usage Example

"withTimeout" should "do nothing if result arrives on time" in {
  val scheduledFuture = schedule(1 second) {
    "hello"
  } withTimeout (2 seconds)
  val result = Await.result(scheduledFuture, 2 seconds)
  assert(result === "hello")
}
withTimeout() - Usage Example

"withTimeout" should "throw exception after timeout" in {
  val scheduledFuture = schedule(2 seconds) {
    "hello"
  } withTimeout (1 second)
  assertThrows[TimeoutException] {
    Await.result(scheduledFuture, 2 seconds)
  }
}
Pattern #3 - sequence()
▪ Scala's Future companion object contains a sequence method that transforms a List[Future[T]] => Future[List[T]]
▪ However, it has two main drawbacks:
a. If one future fails, the whole thing fails - but what if 90% of the results are good enough for us?
b. It doesn't fail fast, i.e. we wait for the slowest result/error to arrive
sequence() - Stop Conditions

sealed trait StopCondition
case object FailOnError extends StopCondition
case object StopOnError extends StopCondition
case object ContinueOnError extends StopCondition

Used to choose a strategy for handling errors in our execution.
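As an illustration only (not the library's actual code), here is a sketch of how a `sequence` honoring two of these strategies could look; `StopOnError` is left out for brevity, and the trait is repeated so the snippet is self-contained:

```scala
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.util.Success

sealed trait StopCondition
case object FailOnError extends StopCondition
case object StopOnError extends StopCondition
case object ContinueOnError extends StopCondition

def sequence[T](futures: Seq[Future[T]], policy: StopCondition)
               (implicit ec: ExecutionContext): Future[Seq[T]] = policy match {
  case FailOnError =>
    // fail fast: the first failure completes the result immediately,
    // without waiting for the slower futures
    val p = Promise[Seq[T]]()
    futures.foreach(_.failed.foreach(p.tryFailure))
    Future.sequence(futures).foreach(p.trySuccess)
    p.future
  case ContinueOnError =>
    // lift every future to a Try so it cannot fail, then keep only the successes
    Future.sequence(futures.map(_.transform(Success(_))))
      .map(_.collect { case Success(v) => v })
  case StopOnError =>
    sys.error("left out of this sketch")
}
```

The `FailOnError` branch is what gives fail-fast behavior: a racing `Promise` is failed by the first error instead of waiting for `Future.sequence` to finish.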
sequence() - #1 Usage Example

"sequence" should "fail immediately if error occurs" in {
  val f1 = schedule(1 second)("first")
  val f2 = Future failed new ServiceException("I'm down")
  val f3 = schedule(2 seconds)("second")
  val res = sequence(Seq(f1, f2, f3), FailOnError)
  assertThrows[ServiceException] {
    Await.result(res, 10 millis)
  }
}
collect() - #2 Usage Example

"collect" should "stop on first error" in {
  val f1 = schedule(1 second)("first")
  val f2 = Future failed new RuntimeException("failed") delay (2 seconds)
  val f3 = schedule(3 seconds)("second")
  val input = Map("1" -> f1, "2" -> f2, "3" -> f3)
  val res = collect(input, StopOnError)
  val finalRes = Await.result(res, 3 seconds)
  assert(finalRes.size === 1)
}
collect() - Usage Example (contd.)

"collect" should "collect all successful results" in {
  val f1 = Future("first")
  val f2 = Future("second")
  val f3 = Future failed new RuntimeException("failed result")
  val f4 = Future("third")
  val input = Map("1" -> f1, "2" -> f2, "3" -> f3, "4" -> f4)
  val res = collect(input, ContinueOnError)
  val finalRes = Await.result(res, 1 second)
  assert(finalRes.size === 3)
}
collectAll() - Usage Example

"collectAll" should "collect all results" in {
  val f1 = schedule(1 second)("first")
  val f2 = Future failed new IOException("failed")
  val f3 = schedule(2 seconds)("second")
  val res = collectAll(Map("1" -> f1, "2" -> f2, "3" -> f3))
  val finalRes = Await.result(res, 3 seconds)
  val (goodResults, badResults) = finalRes partition {
    case (_, Success(_)) => true
    case _ => false
  }
  assert(goodResults.size === 2)
  assert(badResults.size === 1)
}
Pattern #4 - parallelCollect()
▪ Executing too many operations concurrently can be overwhelming.
▪ Some services cap the concurrency of a single consumer.
▪ To throttle execution we need a tool that lets us define the maximum number of concurrent operations.
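One simple way to bound concurrency - a sketch only, not the library's implementation; it processes fixed-size batches and omits the StopCondition parameter shown in the usage example - is to chain batches of at most `parallelism` futures:

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical sketch: run inputs in batches of at most `parallelism`,
// starting a new batch only after the previous one completes.
// (A production version would keep a sliding window instead of fixed batches.)
def parallelCollect[A, B](input: Seq[A], parallelism: Int)
                         (f: A => Future[B])
                         (implicit ec: ExecutionContext): Future[Seq[B]] =
  input.grouped(parallelism).foldLeft(Future.successful(Vector.empty[B])) {
    (acc, batch) =>
      acc.flatMap(done => Future.sequence(batch.map(f)).map(done ++ _))
  }
```

Because `f` is only invoked when a batch starts, no more than `parallelism` operations are ever in flight at once.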
parallelCollect() - Usage Example

"Google Search" should "return all results without throttling" in {
  val queries: List[String] = createQueries(amount = 10000)
  val results = parallelCollect(queries, 10, ContinueOnError) {
    query => GoogleSearchClient.sendQuery(query)
  } withTimeout (30 seconds)
  val finalRes = Await.result(results, 30 seconds)
  assert(finalRes.size === 10000)
}
Pattern #5 - retry()
▪ Applications fail. Network connections drop. Connections time out. Bad things happen.
▪ You can give your application perseverance with retry.
retry() - (Naive) Implementation

def retry[T](retries: Int)(f: => Future[T]): Future[T] = f recoverWith {
  case _ if retries > 0 => retry(retries - 1)(f)
}
retry() - (Real) Implementation

sealed trait RetryPolicy
case object Immediate extends RetryPolicy
case class Fixed(duration: FiniteDuration) extends RetryPolicy
case class Exponential(duration: FiniteDuration) extends RetryPolicy

def retry[T](retries: Int, policy: RetryPolicy)
            (producer: Int => Future[T])
            (implicit executor: ExecutionContext,
             scheduler: Scheduler): Future[T]
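The slides only show the signature; below is a rough sketch of how the policies might drive the delay between attempts, using a plain JDK timer in place of the deck's implicit Scheduler (the `after` helper, the `attempt` parameter, and the backoff details are all assumptions):

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._

sealed trait RetryPolicy
case object Immediate extends RetryPolicy
case class Fixed(duration: FiniteDuration) extends RetryPolicy
case class Exponential(duration: FiniteDuration) extends RetryPolicy

// daemon timer thread so the JVM can still exit normally
val timer = Executors.newSingleThreadScheduledExecutor { (r: Runnable) =>
  val t = new Thread(r); t.setDaemon(true); t
}

// run `f` after `delay`, as a Future
def after[T](delay: FiniteDuration)(f: => Future[T]): Future[T] = {
  val p = Promise[T]()
  timer.schedule(new Runnable { def run(): Unit = p.completeWith(f) },
    delay.toMillis, TimeUnit.MILLISECONDS)
  p.future
}

def retry[T](retries: Int, policy: RetryPolicy, attempt: Int = 0)
            (producer: Int => Future[T])
            (implicit ec: ExecutionContext): Future[T] = {
  // how long to wait before the next attempt
  val delay = policy match {
    case Immediate      => Duration.Zero
    case Fixed(d)       => d
    case Exponential(d) => d * (1L << attempt) // d, 2d, 4d, ...
  }
  producer(attempt).recoverWith {
    case _ if retries > 0 =>
      after(delay)(retry(retries - 1, policy, attempt + 1)(producer))
  }
}
```

Passing the attempt index to the producer is what enables the attempt-dependent behavior in the usage examples that follow.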
retry() - Fixed Usage Example

"retry(fixed)" should "be called 3 times" in {
  val strategy = Fixed(1 second)
  val res = retry(3, strategy) {
    case 0 => Future failed new RuntimeException("not good enough...")
    case 1 => Future failed new RuntimeException("getting better...")
    case 2 => Future.successful("great success !")
  }
  val finalResult = Await.result(res, 3 seconds)
  assert(finalResult === "great success !")
}
retry() - Conditional Usage Example

"retry(conditional)" should "stop on IOException" in {
  val policy: PartialFunction[Throwable, RetryPolicy] = {
    case _: TimeoutException => Fixed(1 second)
  }
  val res = retry(3)(policy) {
    case 0 => Future failed new TimeoutException("really slow")
    case 1 => Future failed new IOException("something bad")
    case 2 => Future successful "great success"
  }
  assertThrows[IOException] {
    Await.result(res, 3 seconds)
  }
}
Pattern #6 - doubleDispatch()
▪ Slow responses at the single-instance level happen on a regular basis - GC pauses, unresponsive queries, network issues, etc.
▪ When we analyze our latencies over time, we see a long tail of latencies exceeding our SLA.
▪ It is possible to trade off between the average load of a system and the overall response time.
doubleDispatch() - (Naive) Algorithm
1. Send two requests immediately, assuming we have a load balancer that will distribute the requests across nodes.
2. Collect the first answer returned by either call.
(diagram: client sending both requests to Server1 and Server2 at once)
doubleDispatch() - A Better Approach
1. Send the first request to the first node.
2. After a predefined period of time (e.g. 20 msec), if no answer has arrived for the first request, dispatch the second one.
3. Collect the first answer returned by either call.
(diagram: client sending to Server1, then to Server2 after a 20 msec delay)
doubleDispatch() - Definition

def doubleDispatch[T](duration: FiniteDuration)
                     (producer: => Future[T])
                     (implicit executor: ExecutionContext,
                      scheduler: Scheduler): Future[T]

This mechanism can only be used with idempotent operations
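A sketch of one possible implementation - again using a raw JDK timer instead of the deck's implicit Scheduler, so the details here are assumptions. Whichever dispatch completes first (with success or failure) wins the race:

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration.FiniteDuration

// daemon timer thread used to fire the backup dispatch
val ddTimer = Executors.newSingleThreadScheduledExecutor { (r: Runnable) =>
  val t = new Thread(r); t.setDaemon(true); t
}

def doubleDispatch[T](duration: FiniteDuration)(producer: => Future[T])
                     (implicit ec: ExecutionContext): Future[T] = {
  val result = Promise[T]()
  producer.onComplete(result.tryComplete) // first dispatch
  ddTimer.schedule(new Runnable {
    def run(): Unit =
      if (!result.isCompleted)
        producer.onComplete(result.tryComplete) // backup dispatch
  }, duration.toMillis, TimeUnit.MILLISECONDS)
  result.future
}
```

Because `producer` is a by-name parameter, the second evaluation actually issues a fresh request rather than reusing the first future - which is also why idempotency is mandatory.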
doubleDispatch() - Usage Example

"doubleDispatch" should "return the short call" in {
  val switch = new AtomicBoolean()
  val res = doubleDispatch(1 second) {
    if (switch.compareAndSet(false, true)) {
      Future("slow response") delay (3 seconds)
    } else {
      Future("fast response") delay (1 second)
    }
  }
  val finalRes = Await.result(res, 4 seconds)
  assert(finalRes === "fast response")
}
doubleDispatch() - Choosing a Duration
▪ Without double dispatch, 99% of calls are under the SLA
▪ Double dispatch at 50ms (the 90th percentile)
▪ Of the remaining 1% of calls, 90% will now be under the 50ms SLA (0.9%)
▪ In total, 99.9% successful calls!
▪ Resource utilization is up by 10%
Future Plans
▪ Circuit breaker
▪ Resource (Object) Pool
▪ What else? We accept pull requests ;)