2. Agenda
• MSA Use cases
• What is Failsafe
• Usage in Coupang
• How it works
• Main Features
3. MSA - Key features
Service A
Service B
Service C
Our Service
Event Loop
Asynchronous
Non-Blocking
CompletableFuture ?
Observable (RxJava) ?
Cache
8. MSA - Key features
Service A
Service B
Service C
Our Service
Event Loop
Asynchronous
Non-Blocking
Non-Blocking
CompletableFuture ?
Observable (RxJava) ?
Cache
12. MSA - Key features
Service A
Service B
Service C
Our Service
Event Loop
Asynchronous
Non-Blocking
Async
Non-Blocking
CompletableFuture ?
Observable (RxJava) ?
Cache
15. MSA - Key features
Service A
Service B
Service C
Our Service
Event Loop
Asynchronous
Non-Blocking
Async
Fail? -> Retry? Fallback?
Non-Blocking
CompletableFuture ?
Observable (RxJava) ?
Cache
18. MSA - Key features
Service A
Service B
Service C
Our Service
Event Loop
Asynchronous
Non-Blocking
Async
Dashboard
Fail? -> Retry? Fallback?
Non-Blocking
CompletableFuture ?
Observable (RxJava) ?
Cache
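The diagram asks whether CompletableFuture could drive the asynchronous, non-blocking calls to Service A, B and C. A minimal sketch of that idea, using only the JDK, is shown here. The class name, method names and fallback strings are hypothetical and exist only for illustration; the suppliers stand in for real remote calls.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical aggregator; the service names mirror the diagram labels only.
class FanOutSketch {

    // Runs a call asynchronously and substitutes a fallback value if it fails.
    static CompletableFuture<String> callWithFallback(Supplier<String> call, String fallback) {
        return CompletableFuture.supplyAsync(call)
                .exceptionally(error -> fallback);
    }

    // Starts all three calls at once and combines the results when all complete.
    static CompletableFuture<String> aggregate(Supplier<String> serviceA,
                                               Supplier<String> serviceB,
                                               Supplier<String> serviceC) {
        CompletableFuture<String> a = callWithFallback(serviceA, "A-fallback");
        CompletableFuture<String> b = callWithFallback(serviceB, "B-fallback");
        CompletableFuture<String> c = callWithFallback(serviceC, "C-fallback");
        return a.thenCombine(b, (x, y) -> x + "," + y)
                .thenCombine(c, (xy, z) -> xy + "," + z);
    }
}
```

The caller's thread is not blocked while the three calls are in flight; it only blocks if it chooses to call join() or get() on the returned future. This sketch covers the fallback case only, not retries or a circuit breaker.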
19. Failsafe
• Latency and fault tolerance for distributed systems
• Realtime Operations
• Synchronous & Asynchronous
• Resiliency - Fallback, Retry, Circuit Breaker
20. Failsafe vs Hystrix
• Executable logic can be passed to Failsafe as a simple lambda expression
• Failsafe supports retries
• Asynchronous executions in Failsafe are performed on a user-supplied thread pool / scheduler
• Asynchronous executions can be observed via the event listener API and return a Future
• Hystrix circuit breakers are time sensitive; Failsafe circuit breakers base their decisions on the last executions, regardless of when they took place
• Failsafe circuit breakers support execution timeouts and configurable success thresholds. Hystrix only performs a single execution when in the half-open state
• Failsafe circuit breakers can be shared across different executions against the same component, so that if a failure occurs, all executions against that component will be halted by the circuit breaker
21. Usage in Coupang
• Used in Connect SDK
• Main Goals
• Retry policy (backoff, jitter)
• Circuit breaker (Resiliency)
23. Circuit Breaker Pattern
[State diagram: Closed, Open, Half-Open]
• Closed: calls pass through; count fail/success
• Closed -> Open: trip breaker if threshold reached
• Open -> Half-Open: try reset after timeout is reached
• Half-Open: calls pass through
• Half-Open -> Closed: on success, reset breaker
• Half-Open -> Open: trip breaker
24. Closed-State
• The circuit breaker executes operations as usual
• If a failure occurs, the circuit breaker records it
• If a specified error threshold (number of failures or frequency of failures) is reached, the circuit breaker trips and transitions to the open state
25. Open-State
• Calls to the circuit breaker in the open state fail immediately
• The underlying operation is not executed
• After a specified timeout is reached, the circuit breaker transitions to the half-open state
26. Half-Open-State
• In this state, one call is allowed to reach the underlying operation
• If this call fails, the circuit breaker transitions to the open state again until another timeout is reached
• If it succeeds, the circuit breaker resets and transitions to the closed state
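The three states described on the previous slides can be sketched as a small state machine. This is a single-threaded illustration using only the JDK, not Failsafe's implementation; the class name, the consecutive-failure threshold and the explicit nowMillis parameter (used so the timeout can be exercised without sleeping) are all assumptions made for the example.

```java
import java.util.function.Supplier;

// Minimal, single-threaded sketch of the closed / open / half-open states.
// Not Failsafe's implementation; all names here are hypothetical.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures that trip the breaker
    private final long openTimeoutMillis; // time to stay open before allowing a trial call
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAtMillis = 0;

    SimpleCircuitBreaker(int failureThreshold, long openTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.openTimeoutMillis = openTimeoutMillis;
    }

    State state() {
        return state;
    }

    // nowMillis is supplied by the caller so the timeout is easy to test.
    <T> T call(Supplier<T> operation, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAtMillis < openTimeoutMillis) {
                // Open state: fail immediately without running the operation.
                throw new IllegalStateException("circuit is open");
            }
            state = State.HALF_OPEN; // timeout reached: allow one trial call
        }
        try {
            T result = operation.get();
            // A success in the closed or half-open state resets the breaker.
            consecutiveFailures = 0;
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            // A failed trial call, or reaching the threshold, trips the breaker.
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAtMillis = nowMillis;
            }
            throw e;
        }
    }
}
```

A production breaker would also need thread safety and a real clock; both are left out here to keep the state transitions visible.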
37. Retry
RetryPolicy<Object> retryPolicy = new RetryPolicy<>()
.handle(ConnectException.class) // handle specific exception
.withDelay(Duration.ofSeconds(1)) // retry with delay
.withMaxRetries(3); // maximum retry count
// Run with retries
Failsafe.with(retryPolicy).run(() -> connect());
// Get with retries
Connection connection = Failsafe.with(retryPolicy).get(() -> connect());
// Run with retries asynchronously
CompletableFuture<Void> future = Failsafe.with(retryPolicy).runAsync(() -> connect());
// Get with retries asynchronously (distinct variable name so the snippets compile together)
CompletableFuture<Connection> connectionFuture = Failsafe.with(retryPolicy).getAsync(() -> connect());
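To make the behaviour of the policy above concrete, the following JDK-only sketch shows roughly what "get with retries" amounts to: retry only on a handled exception type, wait a fixed delay between attempts, and give up after a maximum number of retries. It is an illustration, not how Failsafe is implemented; the class and method names are hypothetical, and it is restricted to unchecked exceptions to keep the example short.

```java
import java.util.function.Supplier;

// Hypothetical helper illustrating a fixed-delay retry loop.
class RetrySketch {

    // Calls the operation and retries up to maxRetries times when it throws an
    // exception of the handled type. Any other exception is rethrown at once.
    static <T> T getWithRetries(Supplier<T> operation,
                                Class<? extends RuntimeException> handled,
                                int maxRetries,
                                long delayMillis) {
        int retries = 0;
        while (true) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                if (!handled.isInstance(e) || retries >= maxRetries) {
                    throw e; // not retryable, or retries exhausted
                }
                retries++;
                try {
                    Thread.sleep(delayMillis); // fixed delay between attempts
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", interrupted);
                }
            }
        }
    }
}
```

With maxRetries set to 3, the operation runs at most four times: the initial attempt plus three retries.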
38. Retry policies
// maximum number of attempts
retryPolicy.withMaxAttempts(3);
// fixed delay between attempts
retryPolicy.withDelay(Duration.ofSeconds(1));
// exponential backoff: start at 1 second, cap at 30 seconds
retryPolicy.withBackoff(1, 30, ChronoUnit.SECONDS);
// random delay between 1 and 10 seconds
retryPolicy.withDelay(1, 10, ChronoUnit.SECONDS);
// time-based jitter
retryPolicy.withJitter(Duration.ofMillis(100));
// conditions that stop retrying
retryPolicy
.abortWhen(false) // abort when the result is false
.abortOn(NoRouteToHostException.class) // abort on this exception
.abortIf(result -> Boolean.FALSE.equals(result)); // abort when the predicate matches
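The backoff and jitter settings above determine how long to wait before each retry. The sketch below shows one common way such delays are computed, using only the JDK. It is an assumption-laden illustration rather than Failsafe's own code: the doubling factor is passed in explicitly, jitter is modelled as a random offset of plus or minus the given amount, and all names are hypothetical.

```java
import java.util.Random;

// Hypothetical helper showing how backoff and jitter delays can be derived.
class BackoffSketch {

    // Delay before retry number `retry` (1-based):
    // initialMillis * factor^(retry - 1), capped at maxMillis.
    static long backoffMillis(int retry, long initialMillis, long maxMillis, double factor) {
        double delay = initialMillis * Math.pow(factor, retry - 1);
        return (long) Math.min(delay, (double) maxMillis);
    }

    // Shifts the delay by a random offset within +/- jitterMillis,
    // never returning a negative delay.
    static long withJitter(long delayMillis, long jitterMillis, Random random) {
        long offset = (long) ((random.nextDouble() * 2 - 1) * jitterMillis);
        return Math.max(0, delayMillis + offset);
    }
}
```

With an initial delay of 1 second, a cap of 30 seconds and a factor of 2, the delays grow as 1 s, 2 s, 4 s, 8 s, 16 s and then stay at 30 s. Jitter spreads out retries from many clients so they do not all hit a recovering service at the same instant.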
39. Circuit Breakers
Circuit breakers allow you to create systems that fail-fast by temporarily disabling execution as a way of preventing system overload.
CircuitBreaker<Object> breaker = new CircuitBreaker<>()
.handle(ConnectException.class) // count ConnectException as a failure
.withFailureThreshold(3, 10) // open the circuit when 3 of the last 10 executions fail
.withSuccessThreshold(5) // close the circuit after 5 consecutive successes in the half-open state
.withDelay(Duration.ofMinutes(1)); // after 1 minute in the open state, transition to half-open
breaker.withFailureThreshold(5); // open after 5 consecutive failed executions
breaker.withFailureThreshold(3, 5); // open when 3 of the last 5 executions have failed
breaker.withSuccessThreshold(3, 5); // close when 3 of the last 5 executions have succeeded
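A ratio threshold such as "3 of the last 5 executions" can be pictured as a fixed-size window of recent outcomes. The following JDK-only sketch shows that bookkeeping under simplifying assumptions: it checks the count after every recorded outcome, even before the window is full, which may differ from how the library treats a partially filled window. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical count-based window for an "N failures out of the last M" rule.
class RatioThresholdSketch {
    private final int failures;   // failures needed to trip, e.g. 3
    private final int executions; // window size, e.g. 5
    private final Deque<Boolean> window = new ArrayDeque<>();

    RatioThresholdSketch(int failures, int executions) {
        this.failures = failures;
        this.executions = executions;
    }

    // Records one outcome, drops the oldest if the window is full,
    // and reports whether the failure threshold is currently met.
    boolean recordAndCheck(boolean success) {
        window.addLast(success);
        if (window.size() > executions) {
            window.removeFirst();
        }
        long failed = window.stream().filter(ok -> !ok).count();
        return failed >= failures;
    }
}
```

Because the window is based on the number of executions rather than on wall-clock time, old failures only age out as new executions arrive, which matches the "regardless of when they took place" point made in the Hystrix comparison.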