SlideShare una empresa de Scribd logo
1 de 68
Descargar para leer sin conexión
Resilient Functional Service Design
The usually forgotten parts of resilient software design

Uwe Friedrichsen – codecentric AG – 2015-2017
@ufried
Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
What’s that “resilience” thing?
Business
Production
Availability
(Almost) every system is a distributed system

Chas Emerick






http://www.infoq.com/presentations/problems-distributed-systems
A distributed system is one in which the failure
of a computer you didn't even know existed
can render your own computer unusable.

Leslie Lamport
Failures in todays complex, distributed and
interconnected systems are not the exception.

•  They are the normal case

•  They are not predictable
… and it’s getting “worse”


•  Cloud-based systems
•  Microservices
•  Zero Downtime
•  Mobile & IoT
•  Social Web

à Ever-increasing complexity and connectivity
Do not try to avoid failures. Embrace them.
resilience (IT)

the ability of a system to handle unexpected situations
-  without the user noticing it (best case)
-  with a graceful degradation of service (worst case)
Beware of the “100% available” trap!
Designing for resilience
The pitfall
First, you learn about resilience …
Complement
Core
Detect
Prevent
Recover
Mitigate
Treat
Core
Detect
 Treat
Prevent
Recover
Mitigate
 Complement
Supporting
patterns
Redundancy
Stateless
Idempotency
Escalation
Zero downtime
deployment
Location
transparency
Relaxed
temporal
constraints
Fallback
Shed load
Share load
Marked data
Queue for
resources
Bounded queue
Finish work
in progress
Fresh work
before stale
Deferrable work
Communication
paradigm
Isolation
Bulkhead
System level
Monitor
Watchdog
Heartbeat
Acknowledgement
Either level
Voting
Synthetic
transaction
Leaky
bucket
 Routine

checks
Health
check
Fail fast
Let sleeping

dogs lie
Small

releases
Hot

deployments
Routine
maintenance
Backup

request
Anti-fragility
Diversity
 Jitter
Error
injection
Spread
the news
Anti-entropy
Backpressure
Retry
Limit retries
Rollback
 Roll-forward
Checkpoint
 Safe point
Failover
Read repair
Error
handler
Reset
Restart
Reconnect
Fail silently
Default value
Node level
Timeout
Circuit
breaker
Complete
parameter
checking
Checksum
Statically
Dynamically
Confinement
... then, you digest the stuff just learned
Confinement
Core
Detect
 Treat
Prevent
Recover
Mitigate
 Complement
Supporting
patterns
Redundancy
Stateless
Idempotency
Escalation
Zero downtime
deployment
Location
transparency
Relaxed
temporal
constraints
Fallback
Shed load
Share load
Marked data
Queue for
resources
Bounded queue
Finish work
in progress
Fresh work
before stale
Deferrable work
Communication
paradigm
Isolation
Bulkhead
System level
Monitor
Watchdog
Heartbeat
Acknowledgement
Either level
Voting
Synthetic
transaction
Leaky
bucket
 Routine

checks
Health
check
Fail fast
Let sleeping

dogs lie
Small

releases
Hot

deployments
Routine
maintenance
Backup

request
Anti-fragility
Diversity
 Jitter
Error
injection
Spread
the news
Anti-entropy
Backpressure
Retry
Limit retries
Rollback
 Roll-forward
Checkpoint
 Safe point
Failover
Read repair
Error
handler
Reset
Restart
Reconnect
Fail silently
Default value
Node level
Timeout
Circuit
breaker
Complete
parameter
checking
Checksum
Statically
Dynamically
Oh, my!

Theoretical blah!
Uncool!


Know that anyway
for eons. So, let’s
move on to the cool
parts …
Ah, now we’re talkin’! Here’s the cool stuff!

That‘s practical, applicable. Don‘t you have more code examples?
Or even better: Can‘t we turn that all into a live hacking session?
Offline
activities?


Hmm, let‘s
focus on the
other stuff.
Uh, sounds like
one-off, tough
stuff …


Better start with the
easier stuff, best
with library support
Yeah, more cool stuff!

Aren‘t there more libs like Hystrix
that we can drag into our projects
with a line of configuration?
Well, neat …

I’ll come back to
that stuff whenever
I really need it
Core
Detect
Recover
Mitigate
Treat
Prevent
Complement
Developer
priority
Relevance for
application robustness
Ye be warned!

If you don’t get this part
right, nothing else matters
Here be dragons!

This is extremely hard
and poorly understood
Let’s recap …
The core parts are



•  extremely important
•  poorly understood
•  massively underestimated
Houston, we have a problem!
Let’s have a closer look at the core parts
Complement
Core
Detect
Prevent
Recover
Mitigate
Treat
Core
Detect
 Treat
Prevent
Recover
Mitigate
 Complement
Isolation
Isolation
•  System must not fail as a whole
•  Split system in parts and isolate parts against each other
•  Avoid cascading failures
•  Foundation of resilient software design
Core
Detect
 Treat
Prevent
Recover
Mitigate
 Complement
Isolation
Bulkhead
Bulkheads are not about thread pools!
Bulkheads

•  Core isolation pattern (a.k.a. “failure units” or “units of mitigation”)
•  Diverse implementation choices available, e.g., µservice, actor, scs, ...
•  Implementation choice impacts system and resilience design a lot
•  Shaping good bulkheads is extremely hard (pure design issue)
Sounds easy. Where is the problem?
Service A
 Service B
Request
Due to functional design, Service A
always needs backing from Service B
to be able to answer a client request,

i.e. the isolation is broken by design
How do we avoid this …
Service
Request
Due to functional design we need
to call a lot of services to be able
to answer a client request,

i.e. availability is broken by design
... and this ...
Service
Service
Service
 Service
Service
Service
Service
Service
Service
Service
Service
Service
Mothership Service

(a.k.a. Monolith)
Request
By trying to avoid the aforementioned
issues we ended up with cramming all
required functionality in one big service

i.e. the isolation is broken by design
... without ending up with this?
Let’s use the well-known best practices

•  Divide & conquer a.k.a. functional decomposition
•  DRY (Don’t Repeat Yourself)
•  Design for reusability
•  Layered architecture
•  …
Unfortunately, …
Service A
 Service B
Request
Due to functional design, Service A
always needs backing from Service B
to be able to answer a client request,

i.e. the isolation is broken by design
... this usually leads to this ...
Service
Request
Due to functional design we need
to call a lot of services to be able
to answer a client request,

i.e. availability is broken by design
... and this ...
Service
Service
Service
 Service
Service
Service
Service
Service
Service
Service
Service
Service
Mothership Service

(a.k.a. Monolith)
Request
By trying to avoid the aforementioned
issues we ended up with cramming all
required functionality in one big service

i.e. the isolation is broken by design
... and in the end also often to this.
Welcome to distributed hell!
Caches to the rescue!
Service A
 Service B
Request
Due to functional design, Service A
always needs backing from Service B
to be able to answer a client request,

i.e. the isolation is broken by design
CacheofB
 Break tight service coupling
by caching data/responses
of downstream service
Caches to the rescue?
Do you really think

that copying stale data all over your system
is a suitable measure

to fix an inherently broken design?
We have to re-learn design
for distributed systems!
A works-out-of-the-box-in-all-contexts,
just-add-water-and-stir,
three-bullet-point panacea
for designing perfect bulkheads
You need lots of those …
... maybe some of those
Then it is a lot of hard work …
... and there is no silver bullet
Yet, a few guiding thoughts

about bulkhead design …
Foundations of design

•  High cohesion, low coupling
•  Separation of concerns
•  Crucial across process boundaries
•  Still poorly understood issue
•  Start with
•  Understanding organizational boundaries
•  Understanding use cases and flows
•  Identifying functional domains (à DDD)
•  Finding areas that change independently
•  Do not start with a data model!
Short activation paths


•  Long activation paths affect availability
•  Increase latency and likelihood of failures
•  Minimize remote calls per request
•  Need to balance opposing forces
•  Avoid monolith à clear separation of concerns
•  Minimize requests à cluster functionality & data
•  Caches sometimes help, but stale data as trade-off
Dismiss reusability


•  Reusability increases coupling
•  Reusability leads to bad service design
•  Reusability compromises availability
•  Reusability rarely pays
•  Do not strive for reuse
•  Strive for replaceability instead
Broadening the options ...
Core
Detect
 Treat
Prevent
Recover
Mitigate
 Complement
Isolation
Communication
paradigm
Bulkhead
Communication paradigm

•  Request-response <-> messaging <-> events <-> …
•  Heavily influences resilience patterns to be used
•  Also heavily influences functional bulkhead design
•  Very fundamental decision which is often underestimated
µS
Request/Response : Horizontal slicing
Flow / Process
µS
 µS
µS
 µS
 µS
µS
Event-driven : Vertical slicing
µS
 µS
µS
µS
 µS
Flow / Process
Synchronous R/R vs. asynchronous events

•  Decomposition
•  Vertically divide-and-conquer vs. horizontally go-with-the-flow
•  Coordination
•  Coordination logic/services and orchestration vs. event chains and choreography
•  Transactions
•  Built-in transaction handling vs. external supervision
•  Error handling
•  Built into service vs. escalation/supervision strategy
•  Separation of concerns
•  Multiple responsibilities service vs. single responsibility services
•  Encapsulation
•  Domain logic distributed across services vs. domain logic in one place
•  Reusability vs. Replaceability
•  Complexity
•  A draw …
The communication paradigm influences

the functional service design a lot

and also the resilience patterns to be used
Example: order fulfillment

•  Simple order, credit card, non-digital items
•  Add coupons incl. validation
•  Add promotions incl. usage notification
•  Add bonus card incl. purchase notification
•  Customer accounts as payment type
•  PayPal as payment type
•  Integrate digital music library
•  Integrate digital video library
•  Integrate e-book library
Design exercise – Part 1


Create a bulkhead design for the case study

•  Use one communication paradigm
•  Synchronous request/response (e.g., REST)
•  Asynchronous messaging (e.g., Akka)
•  Asynchronous events (e.g., Pub/Sub)
•  Assume incremental requirements
•  How many services do you need to touch
•  What about the functional isolation of the services
•  How big/maintainable are the resulting services
•  Take a few notes
Online Shop Checkout
Credit Card
Provider
Warehouse
System
Coupon
Management
Campaign
Management
PayPal
Loyalty
Management
Accounts
Receivables
Music Library
E-Book
Library
Video Library
E-Mail Server
Customer
pressed
“Buy now”
?
Order Fulfillment

Service
Online Shop
Payment

Service
Credit Card
Provider
Shipment

Service
Warehouse
System
<Foreign Service>
<Own Service>
Coupon
Management
Promotion
Campaign
Management
Loyalty
Account
Service
Payment
Provider
PayPal
Loyalty
Management
Accounts
Receivables
Music Library
E-Book
Library
Video Library
E-Mail Server
Coupon
Credit Card
Coordinate
Warehouse
Coordinate
Assets
Notify Cust.
PayPal
Coordinate
Order confirmed
Online Shop
Credit Card
Provider
Warehouse
System
<Foreign Service>
<Own Service>
Coupon
Management
Campaign
Management
Account
service
Credit Card
Service
Loyalty
Management
Accounts
Receivables
Music Library
E-Book
Library
Video Library
 E-Mail Server
PayPal
PayPal
Service
Warehouse
Service
Promotion
Service
Bonus Card
Service
Coupon
Service
Music Library
Service
Video Library
Service
E-Book Library
Service
Notification
Service
Payment authorized
 Digital asset provisioned
Payment failed
<Event>
Order fulfillment
supervisor
Track flow of events
Reschedule events
in case of failure
Services are responsible to eventually succeed
or fail for good, usually incorporating a
supervision/escalation hierarchy for that
Do not limit your design options upfront
without an important reason
Wrap-up

•  Today’s systems are distributed
•  Failures are not avoidable, nor predictable
•  Resilient software design needed
•  Bulkhead design is
•  crucial for application robustness
•  poorly understood
•  massively underrated
•  different from traditional design best practices
•  Communication paradigms broaden

your bulkhead design options
We have to re-learn design
for distributed systems
@ufried
Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
Resilient Functional Service Design

Más contenido relacionado

La actualidad más candente

Introduction to microservices
Introduction to microservicesIntroduction to microservices
Introduction to microservicesAnil Allewar
 
Micro services Architecture
Micro services ArchitectureMicro services Architecture
Micro services ArchitectureAraf Karsh Hamid
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingAraf Karsh Hamid
 
Microservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceMicroservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceAdrian Cockcroft
 
Modeling microservices using DDD
Modeling microservices using DDDModeling microservices using DDD
Modeling microservices using DDDMasashi Narumoto
 
SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)Hussain Mansoor
 
Introduction To Microservices
Introduction To MicroservicesIntroduction To Microservices
Introduction To MicroservicesLalit Kale
 
Design patterns for microservice architecture
Design patterns for microservice architectureDesign patterns for microservice architecture
Design patterns for microservice architectureThe Software House
 
CI/CD Overview
CI/CD OverviewCI/CD Overview
CI/CD OverviewAn Nguyen
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice ArchitectureNguyen Tung
 
CI/CD 101
CI/CD 101CI/CD 101
CI/CD 101djdule
 
The Paved Road at Netflix
The Paved Road at NetflixThe Paved Road at Netflix
The Paved Road at NetflixDianne Marsh
 
Timeless design in a cloud-native world
Timeless design in a cloud-native worldTimeless design in a cloud-native world
Timeless design in a cloud-native worldUwe Friedrichsen
 
Google Cloud Platform Solutions for DevOps Engineers
Google Cloud Platform Solutions  for DevOps EngineersGoogle Cloud Platform Solutions  for DevOps Engineers
Google Cloud Platform Solutions for DevOps EngineersMárton Kodok
 

La actualidad más candente (20)

CI CD Basics
CI CD BasicsCI CD Basics
CI CD Basics
 
Introduction to microservices
Introduction to microservicesIntroduction to microservices
Introduction to microservices
 
Architecture: Microservices
Architecture: MicroservicesArchitecture: Microservices
Architecture: Microservices
 
Micro services Architecture
Micro services ArchitectureMicro services Architecture
Micro services Architecture
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb Sharding
 
Microservices Workshop - Craft Conference
Microservices Workshop - Craft ConferenceMicroservices Workshop - Craft Conference
Microservices Workshop - Craft Conference
 
Modeling microservices using DDD
Modeling microservices using DDDModeling microservices using DDD
Modeling microservices using DDD
 
SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)SRE 101 (Site Reliability Engineering)
SRE 101 (Site Reliability Engineering)
 
Introduction To Microservices
Introduction To MicroservicesIntroduction To Microservices
Introduction To Microservices
 
DevSecOps 101
DevSecOps 101DevSecOps 101
DevSecOps 101
 
Gitlab, GitOps & ArgoCD
Gitlab, GitOps & ArgoCDGitlab, GitOps & ArgoCD
Gitlab, GitOps & ArgoCD
 
Soluciones Dynatrace
Soluciones DynatraceSoluciones Dynatrace
Soluciones Dynatrace
 
Design patterns for microservice architecture
Design patterns for microservice architectureDesign patterns for microservice architecture
Design patterns for microservice architecture
 
CI/CD Overview
CI/CD OverviewCI/CD Overview
CI/CD Overview
 
Introduction to Microservices
Introduction to MicroservicesIntroduction to Microservices
Introduction to Microservices
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice Architecture
 
CI/CD 101
CI/CD 101CI/CD 101
CI/CD 101
 
The Paved Road at Netflix
The Paved Road at NetflixThe Paved Road at Netflix
The Paved Road at Netflix
 
Timeless design in a cloud-native world
Timeless design in a cloud-native worldTimeless design in a cloud-native world
Timeless design in a cloud-native world
 
Google Cloud Platform Solutions for DevOps Engineers
Google Cloud Platform Solutions  for DevOps EngineersGoogle Cloud Platform Solutions  for DevOps Engineers
Google Cloud Platform Solutions for DevOps Engineers
 

Destacado

The promises and perils of microservices
The promises and perils of microservicesThe promises and perils of microservices
The promises and perils of microservicesUwe Friedrichsen
 
DevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader contextDevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader contextUwe Friedrichsen
 
Towards complex adaptive architectures
Towards complex adaptive architecturesTowards complex adaptive architectures
Towards complex adaptive architecturesUwe Friedrichsen
 
Microservices - stress-free and without increased heart attack risk
Microservices - stress-free and without increased heart attack riskMicroservices - stress-free and without increased heart attack risk
Microservices - stress-free and without increased heart attack riskUwe Friedrichsen
 
Conway's law revisited - Architectures for an effective IT
Conway's law revisited - Architectures for an effective ITConway's law revisited - Architectures for an effective IT
Conway's law revisited - Architectures for an effective ITUwe Friedrichsen
 
Modern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of ITModern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of ITUwe Friedrichsen
 
The Next Generation (of) IT
The Next Generation (of) ITThe Next Generation (of) IT
The Next Generation (of) ITUwe Friedrichsen
 
Why resilience - A primer at varying flight altitudes
Why resilience - A primer at varying flight altitudesWhy resilience - A primer at varying flight altitudes
Why resilience - A primer at varying flight altitudesUwe Friedrichsen
 
Microservices vs. The First Law of Distributed Objects - GOTO Nights Chicago ...
Microservices vs. The First Law of Distributed Objects - GOTO Nights Chicago ...Microservices vs. The First Law of Distributed Objects - GOTO Nights Chicago ...
Microservices vs. The First Law of Distributed Objects - GOTO Nights Chicago ...Phil Calçado
 
XTM; from AEM\Drupal\Sitecore direct connectors to translation vendor subcont...
XTM; from AEM\Drupal\Sitecore direct connectors to translation vendor subcont...XTM; from AEM\Drupal\Sitecore direct connectors to translation vendor subcont...
XTM; from AEM\Drupal\Sitecore direct connectors to translation vendor subcont...TAUS - The Language Data Network
 
Der Business Case für Architektur
Der Business Case für ArchitekturDer Business Case für Architektur
Der Business Case für ArchitekturUwe Friedrichsen
 

Destacado (20)

The promises and perils of microservices
The promises and perils of microservicesThe promises and perils of microservices
The promises and perils of microservices
 
Watch your communication
Watch your communicationWatch your communication
Watch your communication
 
Life, IT and everything
Life, IT and everythingLife, IT and everything
Life, IT and everything
 
DevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader contextDevOps is not enough - Embedding DevOps in a broader context
DevOps is not enough - Embedding DevOps in a broader context
 
Towards complex adaptive architectures
Towards complex adaptive architecturesTowards complex adaptive architectures
Towards complex adaptive architectures
 
Production-ready Software
Production-ready SoftwareProduction-ready Software
Production-ready Software
 
Resilience with Hystrix
Resilience with HystrixResilience with Hystrix
Resilience with Hystrix
 
Microservices - stress-free and without increased heart attack risk
Microservices - stress-free and without increased heart attack riskMicroservices - stress-free and without increased heart attack risk
Microservices - stress-free and without increased heart attack risk
 
Conway's law revisited - Architectures for an effective IT
Conway's law revisited - Architectures for an effective ITConway's law revisited - Architectures for an effective IT
Conway's law revisited - Architectures for an effective IT
 
No stress with state
No stress with stateNo stress with state
No stress with state
 
Fantastic Elastic
Fantastic ElasticFantastic Elastic
Fantastic Elastic
 
Self healing data
Self healing dataSelf healing data
Self healing data
 
Modern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of ITModern times - architectures for a Next Generation of IT
Modern times - architectures for a Next Generation of IT
 
The Next Generation (of) IT
The Next Generation (of) ITThe Next Generation (of) IT
The Next Generation (of) IT
 
Devops for Developers
Devops for DevelopersDevops for Developers
Devops for Developers
 
Why resilience - A primer at varying flight altitudes
Why resilience - A primer at varying flight altitudesWhy resilience - A primer at varying flight altitudes
Why resilience - A primer at varying flight altitudes
 
Microservices vs. The First Law of Distributed Objects - GOTO Nights Chicago ...
Microservices vs. The First Law of Distributed Objects - GOTO Nights Chicago ...Microservices vs. The First Law of Distributed Objects - GOTO Nights Chicago ...
Microservices vs. The First Law of Distributed Objects - GOTO Nights Chicago ...
 
XTM; from AEM\Drupal\Sitecore direct connectors to translation vendor subcont...
XTM; from AEM\Drupal\Sitecore direct connectors to translation vendor subcont...XTM; from AEM\Drupal\Sitecore direct connectors to translation vendor subcont...
XTM; from AEM\Drupal\Sitecore direct connectors to translation vendor subcont...
 
Der Business Case für Architektur
Der Business Case für ArchitekturDer Business Case für Architektur
Der Business Case für Architektur
 
Down with the_sandals
Down with the_sandalsDown with the_sandals
Down with the_sandals
 

Similar a Resilient Functional Service Design

Onion Architecture / Clean Architecture
Onion Architecture / Clean ArchitectureOnion Architecture / Clean Architecture
Onion Architecture / Clean ArchitectureAttila Bertók
 
Microservices - stress-free and without increased heart-attack risk - Uwe Fri...
Microservices - stress-free and without increased heart-attack risk - Uwe Fri...Microservices - stress-free and without increased heart-attack risk - Uwe Fri...
Microservices - stress-free and without increased heart-attack risk - Uwe Fri...distributed matters
 
Microservices for Mortals by Bert Ertman at Codemotion Dubai
 Microservices for Mortals by Bert Ertman at Codemotion Dubai Microservices for Mortals by Bert Ertman at Codemotion Dubai
Microservices for Mortals by Bert Ertman at Codemotion DubaiCodemotion Dubai
 
Excavating the knowledge of our ancestors
Excavating the knowledge of our ancestorsExcavating the knowledge of our ancestors
Excavating the knowledge of our ancestorsUwe Friedrichsen
 
Microservices Gone Wrong!
Microservices Gone Wrong!Microservices Gone Wrong!
Microservices Gone Wrong!Bert Ertman
 
Microservices for Mortals
Microservices for MortalsMicroservices for Mortals
Microservices for MortalsBert Ertman
 
Deep Dive into the Idea of Software Architecture
Deep Dive into the Idea of Software ArchitectureDeep Dive into the Idea of Software Architecture
Deep Dive into the Idea of Software ArchitectureMatthew Clarke
 
The Economics of Scale: Promises and Perils of Going Distributed
The Economics of Scale: Promises and Perils of Going DistributedThe Economics of Scale: Promises and Perils of Going Distributed
The Economics of Scale: Promises and Perils of Going DistributedTyler Treat
 
Software Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableSoftware Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableComsysto Reply GmbH
 
The "Why", "What" and "How" of Microservices
The "Why", "What" and "How" of Microservices The "Why", "What" and "How" of Microservices
The "Why", "What" and "How" of Microservices INPAY
 
Software Architecture Anti-Patterns
Software Architecture Anti-PatternsSoftware Architecture Anti-Patterns
Software Architecture Anti-PatternsEduards Sizovs
 
Digital Transformation with Kubernetes, Containers, and Microservices
Digital Transformation with Kubernetes, Containers, and MicroservicesDigital Transformation with Kubernetes, Containers, and Microservices
Digital Transformation with Kubernetes, Containers, and MicroservicesLightbend
 
Architectural Decisions: Smoothly and Consistently
Architectural Decisions: Smoothly and ConsistentlyArchitectural Decisions: Smoothly and Consistently
Architectural Decisions: Smoothly and ConsistentlyComsysto Reply GmbH
 
Architectural Decisions: Smoothly and Consistently
Architectural Decisions: Smoothly and ConsistentlyArchitectural Decisions: Smoothly and Consistently
Architectural Decisions: Smoothly and ConsistentlyComsysto Reply GmbH
 
Arch factory - Agile Design: Best Practices
Arch factory - Agile Design: Best PracticesArch factory - Agile Design: Best Practices
Arch factory - Agile Design: Best PracticesIgor Moochnick
 
[WSO2Con EU 2017] Resilience Patterns with Ballerina
[WSO2Con EU 2017] Resilience Patterns with Ballerina[WSO2Con EU 2017] Resilience Patterns with Ballerina
[WSO2Con EU 2017] Resilience Patterns with BallerinaWSO2
 
Smaller is Better - Exploiting Microservice Architectures on AWS - Technical 201
Smaller is Better - Exploiting Microservice Architectures on AWS - Technical 201Smaller is Better - Exploiting Microservice Architectures on AWS - Technical 201
Smaller is Better - Exploiting Microservice Architectures on AWS - Technical 201Amazon Web Services
 
Cloud Native Future
Cloud Native FutureCloud Native Future
Cloud Native FutureJulie Coonce
 
Agile Development Overview (with a bit about builds)
Agile Development Overview (with a bit about builds)Agile Development Overview (with a bit about builds)
Agile Development Overview (with a bit about builds)David Benjamin
 

Similar a Resilient Functional Service Design (20)

Onion Architecture / Clean Architecture
Onion Architecture / Clean ArchitectureOnion Architecture / Clean Architecture
Onion Architecture / Clean Architecture
 
Microservices - stress-free and without increased heart-attack risk - Uwe Fri...
Microservices - stress-free and without increased heart-attack risk - Uwe Fri...Microservices - stress-free and without increased heart-attack risk - Uwe Fri...
Microservices - stress-free and without increased heart-attack risk - Uwe Fri...
 
Microservices for Mortals by Bert Ertman at Codemotion Dubai
 Microservices for Mortals by Bert Ertman at Codemotion Dubai Microservices for Mortals by Bert Ertman at Codemotion Dubai
Microservices for Mortals by Bert Ertman at Codemotion Dubai
 
Excavating the knowledge of our ancestors
Excavating the knowledge of our ancestorsExcavating the knowledge of our ancestors
Excavating the knowledge of our ancestors
 
Microservices Gone Wrong!
Microservices Gone Wrong!Microservices Gone Wrong!
Microservices Gone Wrong!
 
Microservices for Mortals
Microservices for MortalsMicroservices for Mortals
Microservices for Mortals
 
No silver bullet
No silver bulletNo silver bullet
No silver bullet
 
Deep Dive into the Idea of Software Architecture
Deep Dive into the Idea of Software ArchitectureDeep Dive into the Idea of Software Architecture
Deep Dive into the Idea of Software Architecture
 
The Economics of Scale: Promises and Perils of Going Distributed
The Economics of Scale: Promises and Perils of Going DistributedThe Economics of Scale: Promises and Perils of Going Distributed
The Economics of Scale: Promises and Perils of Going Distributed
 
Software Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableSoftware Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuable
 
The "Why", "What" and "How" of Microservices
The "Why", "What" and "How" of Microservices The "Why", "What" and "How" of Microservices
The "Why", "What" and "How" of Microservices
 
Software Architecture Anti-Patterns
Software Architecture Anti-PatternsSoftware Architecture Anti-Patterns
Software Architecture Anti-Patterns
 
Digital Transformation with Kubernetes, Containers, and Microservices
Digital Transformation with Kubernetes, Containers, and MicroservicesDigital Transformation with Kubernetes, Containers, and Microservices
Digital Transformation with Kubernetes, Containers, and Microservices
 
Architectural Decisions: Smoothly and Consistently
Architectural Decisions: Smoothly and ConsistentlyArchitectural Decisions: Smoothly and Consistently
Architectural Decisions: Smoothly and Consistently
 
Architectural Decisions: Smoothly and Consistently
Architectural Decisions: Smoothly and ConsistentlyArchitectural Decisions: Smoothly and Consistently
Architectural Decisions: Smoothly and Consistently
 
Arch factory - Agile Design: Best Practices
Arch factory - Agile Design: Best PracticesArch factory - Agile Design: Best Practices
Arch factory - Agile Design: Best Practices
 
[WSO2Con EU 2017] Resilience Patterns with Ballerina
[WSO2Con EU 2017] Resilience Patterns with Ballerina[WSO2Con EU 2017] Resilience Patterns with Ballerina
[WSO2Con EU 2017] Resilience Patterns with Ballerina
 
Smaller is Better - Exploiting Microservice Architectures on AWS - Technical 201
Smaller is Better - Exploiting Microservice Architectures on AWS - Technical 201Smaller is Better - Exploiting Microservice Architectures on AWS - Technical 201
Smaller is Better - Exploiting Microservice Architectures on AWS - Technical 201
 
Cloud Native Future
Cloud Native FutureCloud Native Future
Cloud Native Future
 
Agile Development Overview (with a bit about builds)
Agile Development Overview (with a bit about builds)Agile Development Overview (with a bit about builds)
Agile Development Overview (with a bit about builds)
 

Más de Uwe Friedrichsen

The hitchhiker's guide for the confused developer
The hitchhiker's guide for the confused developerThe hitchhiker's guide for the confused developer
The hitchhiker's guide for the confused developerUwe Friedrichsen
 
Digitization solutions - A new breed of software
Digitization solutions - A new breed of softwareDigitization solutions - A new breed of software
Digitization solutions - A new breed of softwareUwe Friedrichsen
 
Real-world consistency explained
Real-world consistency explainedReal-world consistency explained
Real-world consistency explainedUwe Friedrichsen
 
The truth about "You build it, you run it!"
The truth about "You build it, you run it!"The truth about "You build it, you run it!"
The truth about "You build it, you run it!"Uwe Friedrichsen
 
Dr. Hectic and Mr. Hype - surviving the economic darwinism
Dr. Hectic and Mr. Hype - surviving the economic darwinismDr. Hectic and Mr. Hype - surviving the economic darwinism
Dr. Hectic and Mr. Hype - surviving the economic darwinismUwe Friedrichsen
 
How to survive in a BASE world
How to survive in a BASE worldHow to survive in a BASE world
How to survive in a BASE worldUwe Friedrichsen
 

Más de Uwe Friedrichsen (9)

Deep learning - a primer
Deep learning - a primerDeep learning - a primer
Deep learning - a primer
 
Life after microservices
Life after microservicesLife after microservices
Life after microservices
 
The hitchhiker's guide for the confused developer
The hitchhiker's guide for the confused developerThe hitchhiker's guide for the confused developer
The hitchhiker's guide for the confused developer
 
Digitization solutions - A new breed of software
Digitization solutions - A new breed of softwareDigitization solutions - A new breed of software
Digitization solutions - A new breed of software
 
Real-world consistency explained
Real-world consistency explainedReal-world consistency explained
Real-world consistency explained
 
The truth about "You build it, you run it!"
The truth about "You build it, you run it!"The truth about "You build it, you run it!"
The truth about "You build it, you run it!"
 
Dr. Hectic and Mr. Hype - surviving the economic darwinism
Dr. Hectic and Mr. Hype - surviving the economic darwinismDr. Hectic and Mr. Hype - surviving the economic darwinism
Dr. Hectic and Mr. Hype - surviving the economic darwinism
 
How to survive in a BASE world
How to survive in a BASE worldHow to survive in a BASE world
How to survive in a BASE world
 
Fault tolerance made easy
Fault tolerance made easyFault tolerance made easy
Fault tolerance made easy
 

Último

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Último (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

Resilient Functional Service Design

  • 1. Resilient Functional Service Design The usually forgotten parts of resilient software design Uwe Friedrichsen – codecentric AG – 2015-2017
  • 2. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
  • 5. (Almost) every system is a distributed system Chas Emerick http://www.infoq.com/presentations/problems-distributed-systems
  • 6. A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable. Leslie Lamport
  • 7. Failures in todays complex, distributed and interconnected systems are not the exception. •  They are the normal case •  They are not predictable
  • 8. … and it’s getting “worse” •  Cloud-based systems •  Microservices •  Zero Downtime •  Mobile & IoT •  Social Web à Ever-increasing complexity and connectivity
  • 9. Do not try to avoid failures. Embrace them.
  • 10. resilience (IT) the ability of a system to handle unexpected situations -  without the user noticing it (best case) -  with a graceful degradation of service (worst case)
  • 11. Beware of the “100% available” trap!
  • 13. First, you learn about resilience …
  • 15. Core Detect Treat Prevent Recover Mitigate Complement Supporting patterns Redundancy Stateless Idempotency Escalation Zero downtime deployment Location transparency Relaxed temporal constraints Fallback Shed load Share load Marked data Queue for resources Bounded queue Finish work in progress Fresh work before stale Deferrable work Communication paradigm Isolation Bulkhead System level Monitor Watchdog Heartbeat Acknowledgement Either level Voting Synthetic transaction Leaky bucket Routine
 checks Health check Fail fast Let sleeping
 dogs lie Small
 releases Hot
 deployments Routine maintenance Backup
 request Anti-fragility Diversity Jitter Error injection Spread the news Anti-entropy Backpressure Retry Limit retries Rollback Roll-forward Checkpoint Safe point Failover Read repair Error handler Reset Restart Reconnect Fail silently Default value Node level Timeout Circuit breaker Complete parameter checking Checksum Statically Dynamically Confinement
  • 16. ... then, you digest the stuff just learned
  • 17. Confinement Core Detect Treat Prevent Recover Mitigate Complement Supporting patterns Redundancy Stateless Idempotency Escalation Zero downtime deployment Location transparency Relaxed temporal constraints Fallback Shed load Share load Marked data Queue for resources Bounded queue Finish work in progress Fresh work before stale Deferrable work Communication paradigm Isolation Bulkhead System level Monitor Watchdog Heartbeat Acknowledgement Either level Voting Synthetic transaction Leaky bucket Routine
 checks Health check Fail fast Let sleeping
 dogs lie Small
 releases Hot
 deployments Routine maintenance Backup
 request Anti-fragility Diversity Jitter Error injection Spread the news Anti-entropy Backpressure Retry Limit retries Rollback Roll-forward Checkpoint Safe point Failover Read repair Error handler Reset Restart Reconnect Fail silently Default value Node level Timeout Circuit breaker Complete parameter checking Checksum Statically Dynamically Oh, my!
 Theoretical blah! Uncool!
 Know that anyway for eons. So, let’s move on to the cool parts … Ah, now we’re talkin’! Here’s the cool stuff! That‘s practical, applicable. Don‘t you have more code examples? Or even better: Can‘t we turn that all into a live hacking session? Offline activities?
 Hmm, let‘s focus on the other stuff. Uh, sounds like one-off, tough stuff …
 Better start with the easier stuff, best with library support Yeah, more cool stuff! Aren‘t there more libs like Hystrix that we can drag into our projects with a line of configuration? Well, neat … I’ll come back to that stuff whenever I really need it
  • 18. Core Detect Recover Mitigate Treat Prevent Complement Developer priority Relevance for application robustness Ye be warned! If you don’t get this part right, nothing else matters Here be dragons! This is extremely hard and poorly understood
  • 20. The core parts are •  extremely important •  poorly understood •  massively underestimated
  • 21. Houston, we have a problem!
  • 22. Let’s have a closer look at the core parts
  • 25. Isolation •  System must not fail as a whole •  Split system in parts and isolate parts against each other •  Avoid cascading failures •  Foundation of resilient software design
  • 27. Bulkheads are not about thread pools!
  • 28. Bulkheads •  Core isolation pattern (a.k.a. “failure units” or “units of mitigation”) •  Diverse implementation choices available, e.g., µservice, actor, scs, ... •  Implementation choice impacts system and resilience design a lot •  Shaping good bulkheads is extremely hard (pure design issue)
  • 29. Sounds easy. Where is the problem?
  • 30. Service A Service B Request Due to functional design, Service A always needs backing from Service B to be able to answer a client request, i.e. the isolation is broken by design How do we avoid this …
  • 31. Service Request Due to functional design we need to call a lot of services to be able to answer a client request, i.e. availability is broken by design ... and this ... Service Service Service Service Service Service Service Service Service Service Service Service
  • 32. Mothership Service (a.k.a. Monolith) Request By trying to avoid the aforementioned issues we ended up with cramming all required functionality in one big service i.e. the isolation is broken by design ... without ending up with this?
  • 33. Let’s use the well-known best practices •  Divide & conquer a.k.a. functional decomposition •  DRY (Don’t Repeat Yourself) •  Design for reusability •  Layered architecture •  …
  • 35. Service A Service B Request Due to functional design, Service A always needs backing from Service B to be able to answer a client request, i.e. the isolation is broken by design ... this usually leads to this ...
  • 36. Service Request Due to functional design we need to call a lot of services to be able to answer a client request, i.e. availability is broken by design ... and this ... Service Service Service Service Service Service Service Service Service Service Service Service
  • 37. Mothership Service (a.k.a. Monolith) Request By trying to avoid the aforementioned issues we ended up with cramming all required functionality in one big service i.e. the isolation is broken by design ... and in the end also often to this.
  • 39. Caches to the rescue!
  • 40. Service A Service B Request Due to functional design, Service A always needs backing from Service B to be able to answer a client request, i.e. the isolation is broken by design CacheofB Break tight service coupling by caching data/responses of downstream service
  • 41. Caches to the rescue?
  • 42. Do you really think
 that copying stale data all over your system is a suitable measure
 to fix an inherently broken design?
  • 43. We have to re-learn design for distributed systems!
  • 45. You need lots of those …
  • 46. ... maybe some of those
  • 47. Then it is a lot of hard work …
  • 48. ... and there is no silver bullet
  • 49. Yet, a few guiding thoughts
 about bulkhead design …
  • 50. Foundations of design •  High cohesion, low coupling •  Separation of concerns •  Crucial across process boundaries •  Still poorly understood issue •  Start with •  Understanding organizational boundaries •  Understanding use cases and flows •  Identifying functional domains (à DDD) •  Finding areas that change independently •  Do not start with a data model!
  • 51. Short activation paths •  Long activation paths affect availability •  Increase latency and likelihood of failures •  Minimize remote calls per request •  Need to balance opposing forces •  Avoid monolith à clear separation of concerns •  Minimize requests à cluster functionality & data •  Caches sometimes help, but stale data as trade-off
  • 52. Dismiss reusability •  Reusability increases coupling •  Reusability leads to bad service design •  Reusability compromises availability •  Reusability rarely pays •  Do not strive for reuse •  Strive for replaceability instead
  • 55. Communication paradigm •  Request-response <-> messaging <-> events <-> … •  Heavily influences resilience patterns to be used •  Also heavily influences functional bulkhead design •  Very fundamental decision which is often underestimated
  • 56. µS Request/Response : Horizontal slicing Flow / Process µS µS µS µS µS µS Event-driven : Vertical slicing µS µS µS µS µS Flow / Process
  • 57. Synchronous R/R vs. asynchronous events •  Decomposition •  Vertically divide-and-conquer vs. horizontally go-with-the-flow •  Coordination •  Coordination logic/services and orchestration vs. event chains and choreography •  Transactions •  Built-in transaction handling vs. external supervision •  Error handling •  Built into service vs. escalation/supervision strategy •  Separation of concerns •  Multiple responsibilities service vs. single responsibility services •  Encapsulation •  Domain logic distributed across services vs. domain logic in one place •  Reusability vs. Replaceability •  Complexity •  A draw …
  • 58. The communication paradigm influences
 the functional service design a lot and also the resilience patterns to be used
  • 59. Example: order fulfillment •  Simple order, credit card, non-digital items •  Add coupons incl. validation •  Add promotions incl. usage notification •  Add bonus card incl. purchase notification •  Customer accounts as payment type •  PayPal as payment type •  Integrate digital music library •  Integrate digital video library •  Integrate e-book library
  • 60. Design exercise – Part 1 Create a bulkhead design for the case study •  Use one communication paradigm •  Synchronous request/response (e.g., REST) •  Asynchronous messaging (e.g., Akka) •  Asynchronous events (e.g., Pub/Sub) •  Assume incremental requirements •  How many services do you need to touch •  What about the functional isolation of the services •  How big/maintainable are the resulting services •  Take a few notes
  • 61. Online Shop Checkout Credit Card Provider Warehouse System Coupon Management Campaign Management PayPal Loyalty Management Accounts Receivables Music Library E-Book Library Video Library E-Mail Server Customer pressed “Buy now” ?
  • 62. Order Fulfillment
 Service Online Shop Payment
 Service Credit Card Provider Shipment
 Service Warehouse System <Foreign Service> <Own Service> Coupon Management Promotion Campaign Management Loyalty Account Service Payment Provider PayPal Loyalty Management Accounts Receivables Music Library E-Book Library Video Library E-Mail Server Coupon Credit Card Coordinate Warehouse Coordinate Assets Notify Cust. PayPal Coordinate
  • 63. Order confirmed Online Shop Credit Card Provider Warehouse System <Foreign Service> <Own Service> Coupon Management Campaign Management Account service Credit Card Service Loyalty Management Accounts Receivables Music Library E-Book Library Video Library E-Mail Server PayPal PayPal Service Warehouse Service Promotion Service Bonus Card Service Coupon Service Music Library Service Video Library Service E-Book Library Service Notification Service Payment authorized Digital asset provisioned Payment failed <Event> Order fulfillment supervisor Track flow of events Reschedule events in case of failure Services are responsible to eventually succeed or fail for good, usually incorporating a supervision/escalation hierarchy for that
  • 64. Do not limit your design options upfront without an important reason
  • 65. Wrap-up •  Today’s systems are distributed •  Failures are not avoidable, nor predictable •  Resilient software design needed •  Bulkhead design is •  crucial for application robustness •  poorly understood •  massively underrated •  different from traditional design best practices •  Communication paradigms broaden
 your bulkhead design options
  • 66. We have to re-learn design for distributed systems
  • 67. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com