Service Stampede: Surviving a Thousand Services

Agenda
Monoliths to Microservices
Problems with microservices
Solutions & Practices
The need for standardization
Introducing squbs

Monolith to Microservices
Congrats! Your monolith became a thousand microservices – now you’re in serious trouble!!!

Cost/Benefits of Moving to Microservices
Gains
• Independence – faster PDLC
• Freedom of choice for service implementation
• Easy evolution of service & technology
• Coexisting services across generations
• Complexity & Latency
Losses
• Homogeneity
• Consistency of implementation across
• Timing & Determinism
Hmm. To be, or not to be… a service, that is...

Microservices Issues
Latency & Determinism
Service Boundaries
To be, or not to be a service
Scaling and rightsizing
Many failure points – need resiliency
Inconsistency – need standardization

Latency by Deployment Topology
• Avoid too many layers of services
• Keep state close to the edge
• The more hops, the higher and less deterministic the latency is
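
One rough way to see why hop count hurts determinism: if every hop is independently slow with some small probability, the chance that a request hits at least one slow hop grows with the number of hops, and the latencies of a synchronous chain add up. The sketch below assumes independent hops; the object name, function names, and the 1% figure in the note are illustrative and not taken from the slides.

```scala
object HopLatency {
  /** Probability that at least one of `hops` sequential calls is slow,
    * assuming each hop is independently slow with probability `pSlow`. */
  def probAnySlow(pSlow: Double, hops: Int): Double =
    1.0 - math.pow(1.0 - pSlow, hops)

  /** End-to-end latency of a synchronous call chain: the per-hop latencies add up. */
  def chainLatencyMs(perHopMs: Seq[Double]): Double = perHopMs.sum
}
```

With an assumed 1% slow rate per hop, two hops give roughly a 2% chance of a slow request and five hops roughly 4.9%.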

Services Need to Scale
• Scale horizontally with increasing workload
• More nodes, or…
• More pods with increasing workload
• Scale vertically – why?
• Keep the number of instances under control
• 125 nodes @ 16 CPUs are easier to manage than 1000 nodes @ 2 CPUs
• Less load on network and switching infrastructure
• Potentially better utilization & cache hits
• Stateful systems: More limited horizontal scale
• Need critical mass for redundancy
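
The node-count comparison above can be checked with simple arithmetic. The sketch below uses the two fleet sizes from the slide; the `Fleet` type and its methods are hypothetical names introduced only for illustration.

```scala
object Rightsizing {
  /** A fleet of identical nodes (illustrative model, not from the slides). */
  final case class Fleet(nodes: Int, cpusPerNode: Int) {
    def totalCpus: Int = nodes * cpusPerNode
    /** Fraction of total capacity lost when a single node fails. */
    def capacityLostPerNodeFailure: Double = 1.0 / nodes
  }

  val large: Fleet = Fleet(nodes = 125, cpusPerNode = 16)
  val small: Fleet = Fleet(nodes = 1000, cpusPerNode = 2)
}
```

Both fleets provide the same 2000 CPUs, but the larger nodes mean an eighth as many instances to manage. The trade-off is that each node failure removes a larger share of capacity, which is one reading of the "critical mass for redundancy" point.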

Practices for Successful Microservices
Deployment Topologies
Reactive Systems
Resilience with Circuit Breakers
Asynchronous Communication
Standardization

Individual Service Deployments
[Diagram: Service A and Service B deployed separately, each receiving its own requests]

Joint Deployments
[Diagram: Service A, Service B, and Service C deployed together, receiving requests]
• Deployment orchestration using Chef, etc.
• Kubernetes Pods

The Reactive Manifesto
Responsive
Message Driven
Elastic
Resilient

Why Does it Matter?
Responsive: Respond in a deterministic, timely manner. Controls determinism
Resilient: Stays responsive in the face of failure – even cascading failures
Elastic: Stays responsive under workload spikes
Message Driven: Basic building block for responsive, resilient, and elastic systems

Circuit Breaker
Keeps systems responsive under failure
Avoids cascading failures
Especially with multi-generational downstream services
Critical part of keeping your 1000 services alive
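
To make the circuit breaker idea concrete, here is a minimal sketch in plain Scala. It is deliberately simplified: single-threaded, synchronous, and not the implementation used by Akka (which ships its own `akka.pattern.CircuitBreaker`) or by squbs. The class name, parameters, and the injectable `clock` are assumptions made for the example.

```scala
import scala.util.{Failure, Success, Try}

/** Minimal circuit breaker sketch. Not thread-safe; for illustration only. */
final class SimpleCircuitBreaker(
    maxFailures: Int,
    resetTimeoutMs: Long,
    clock: () => Long = () => System.currentTimeMillis()) {

  private var failures = 0
  // Some(t) once the breaker has tripped at time t; cleared on the next success.
  private var openedAt: Option[Long] = None

  /** True while calls are rejected without touching the downstream service. */
  def isOpen: Boolean = openedAt.exists(t => clock() - t < resetTimeoutMs)

  def call[T](body: => T)(fallback: => T): T =
    if (isOpen) fallback // fail fast and stay responsive
    else Try(body) match {
      case Success(value) =>
        failures = 0
        openedAt = None // close the breaker again
        value
      case Failure(_) =>
        failures += 1
        // A failed trial call after the reset timeout re-opens the breaker at once.
        if (openedAt.isDefined || failures >= maxFailures) openedAt = Some(clock())
        fallback
    }
}
```

While the breaker is open, callers get the fallback immediately instead of waiting on a failing downstream service, which is what stops a failure from cascading upstream.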

Standardization
• Monitoring
• Need to collect metrics, consistently
• Logging
• Correlation across services
• Uniformity in logs
• Security
• Need to apply standard security configuration
• Environment Resolution
• Staging, production, etc.
Consistency in the face of Heterogeneity
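
As one small illustration of log correlation and uniformity, every service could reuse an incoming correlation ID (or create one at the edge) and emit log lines in a single shared format. The sketch below is an assumption-laden example: the object name, the field names, and the line format are invented here and are not a squbs or Akka API.

```scala
import java.util.UUID

object CorrelatedLogging {
  /** Reuse the caller's correlation ID if present, otherwise mint a new one at the edge. */
  def correlationId(incoming: Option[String]): String =
    incoming.filter(_.nonEmpty).getOrElse(UUID.randomUUID().toString)

  /** One uniform line format shared by every service (the format itself is illustrative). */
  def logLine(service: String, corrId: String, level: String, msg: String): String =
    s"level=$level service=$service corrId=$corrId msg=$msg"
}
```

Because each hop passes the same ID downstream, the log lines for one request can be joined across services regardless of how each service is implemented.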

Standardized Reactive Platform

Akka, Spray, Akka Http & Streams
Asynchronous
High Performance
Resilience & Supervision
Great Libraries for building Reactive Systems

Bootstrap and Lifecycle Management
Unicomplex: Lightweight bootstrap module
Emits lifecycle events: starting, active, stopping
Startup and shutdown hooks
Allows obtaining the current state
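
The lifecycle idea can be sketched as a tiny state machine with listeners. This is not the Unicomplex API: the names `LifecycleSketch`, `Lifecycle`, `onTransition`, `markActive`, and `stop` are hypothetical, and only the three states named on the slide are modeled.

```scala
object LifecycleSketch {
  sealed trait State
  case object Starting extends State
  case object Active extends State
  case object Stopping extends State

  /** Tracks the current state and notifies listeners on every transition. */
  final class Lifecycle {
    private var current: State = Starting
    private var listeners: List[State => Unit] = Nil

    /** Lets callers obtain the current state at any time. */
    def state: State = current

    /** Registers a hook that is invoked with each new state. */
    def onTransition(listener: State => Unit): Unit = listeners ::= listener

    private def moveTo(next: State): Unit = {
      current = next
      listeners.foreach(_(next))
    }

    def markActive(): Unit = if (current == Starting) moveTo(Active)
    def stop(): Unit = if (current != Stopping) moveTo(Stopping)
  }
}
```

A startup hook would typically wait for `Active` before accepting traffic, and a shutdown hook would react to `Stopping` to drain in-flight work.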

Listener
• Declares configuration for port binding, interfaces, security, etc.

Service
• Akka Http/Spray Routes and Http Request Handler Actors
• Configured in squbs-meta.conf
• A service can be defined in a dependency artifact

Extension
• To start low level (non-actor) facilities needed for the environment

Cubes
Another deployment topology
squbs: rhymes with cubes
Drop-in modules
Cubes can run in isolation as well as on a flat classpath
Easy to compose/decompose/refactor
Cubes share the actor system
Provide better predictability

Orchestration
[Diagram: dependency graph of task1 through task5, from Input to Output]

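// Tasks 1 and 2 depend only on the input, so they start immediately.
val task1F = doTask1(input)
val task2F = doTask2(input)
// >> runs the next task once the listed results are available.
val task3F = (task1F, task2F) >> doTask3
val task4F = task2F >> doTask4
val task5F = (task3F, task4F) >> doTask5
// Reply to the requester with the final result, then stop this actor.
for { result <- task5F } {
  requester ! result
  context.stop(self)
}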
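
For comparison, the same dependency graph can be written with standard-library Futures only. This is a sketch, not the squbs DSL: the `>>` operator above comes from the slide, while the `doTaskN` bodies below are made-up arithmetic stand-ins so the example can run on its own.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureOrchestration {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-ins for the five tasks; real tasks would call other services.
  def doTask1(in: Int): Future[Int] = Future(in + 1)
  def doTask2(in: Int): Future[Int] = Future(in * 2)
  def doTask3(a: Int, b: Int): Future[Int] = Future(a + b)
  def doTask4(b: Int): Future[Int] = Future(b - 1)
  def doTask5(c: Int, d: Int): Future[Int] = Future(c * d)

  def run(input: Int): Future[Int] = {
    val task1F = doTask1(input) // starts immediately
    val task2F = doTask2(input) // runs concurrently with task1
    // task3 needs both task1 and task2; task4 needs only task2.
    val task3F = for { a <- task1F; b <- task2F; c <- doTask3(a, b) } yield c
    val task4F = task2F.flatMap(doTask4)
    // task5 combines the results of task3 and task4.
    for { c <- task3F; d <- task4F; r <- doTask5(c, d) } yield r
  }
}
```

The futures are created before they are combined, so independent tasks overlap in time, and a failure in any task propagates to the final future without extra error-handling code at each step.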

Orchestration DSL
High-performance asynchronous orchestration
Responsive: Respond within SLA, with or without results
Streamlined error handling
Reduced code complexity

More Utilities
• Http Client
• Admin Console
• Actor Registry
• Perpetual Stream
• Persistence Buffer
• …

Summary
• A large number of services has benefits, but is more difficult to manage
• Control your service topology for more determinism and lower latency
• Rule of thumb: No more than two hops of synchronous calls from the edge
• Reactive systems – ideal for services
• Responsive & resilient
• Standardization
• Walk like a duck, quack like a duck, and manage it like a duck
• squbs: Have the cake, and eat it too
