This talk covers microservice performance. It recommends measuring performance correctly by recording timestamped requests together with their latency and success or failure. Latency distributions have heavy tails, so tail percentiles matter more than the mean. Throughput and latency are related by Little's Law. Latency stacks across services, so simulation tools are useful. Amdahl's Law and the Universal Scalability Law help identify optimisation targets and forecast scalability. The key is to measure correctly, so that you can spot potential issues and optimise the right parts of the system.
4. Who am I?
▸ CTO @ Skipjaq
▸ ML-driven performance optimisation
▸ Co-founder of SpringSource
▸ Once upon a time I…
▸ Contributed to Spring Framework
▸ Wrote a book about Spring
▸ Talked a lot about Spring
5. Who am I?
▸ I’m on Twitter:
▸ @robertharrop
▸ I’m on Github:
▸ github.com/robharrop
▸ I write about maths and performance
▸ https://robharrop.github.io
If you have questions after the session, {grab, tweet} me.
7. After this talk you will know how to:
▸ Measure performance correctly
▸ Find potential performance disasters
▸ Identify the best candidates for optimisation
▸ Model complex microservice systems
▸ Forecast system scalability
12. Throughput
▸ The rate of processing: x per y
▸ Requests per second
▸ Records per minute
▸ Messages per second
▸ Tasks per day
13. Latency
▸ Time taken… for something
▸ Service time?
▸ First byte?
▸ First response complete?
▸ Last byte?
▸ Render?
▸ Moral of the story: define what you mean by latency
16. Crib Sheet
▸ Record timestamped requests with observed latency and success/error
▸ Throughput
▸ Min, max, mean
▸ Varying time windows (10s, 30s, 1m, 5m, …)
▸ Latency
▸ Min, max, 95th, 99th, 99.9th and other tail percentiles
▸ Mean just means meaningless
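The crib sheet above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's tooling: the record layout `(timestamp, latency_ms, ok)`, the helper names `throughput_stats` and `percentile`, and the sample data are all assumptions made for the example. Percentiles use the nearest-rank method.

```python
import math
from collections import Counter

def throughput_stats(timestamps, window_s):
    """Min, max and mean requests/second over fixed windows of window_s seconds."""
    start = min(timestamps)
    # Count requests per window; windows with no requests count as zero.
    counts = Counter(int((t - start) // window_s) for t in timestamps)
    n_windows = max(counts) + 1
    rates = [counts.get(i, 0) / window_s for i in range(n_windows)]
    return min(rates), max(rates), sum(rates) / len(rates)

def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical records: (timestamp in seconds, latency in ms, success flag).
records = [
    (0.0, 12.0, True), (0.4, 15.0, True), (1.2, 11.0, True),
    (1.9, 250.0, False), (2.5, 14.0, True), (3.1, 13.0, True),
    (3.2, 16.0, True), (3.8, 900.0, True), (9.5, 12.0, True), (9.9, 14.0, True),
]
timestamps = [t for t, _, _ in records]
latencies = [ms for _, ms, _ in records]
error_rate = sum(1 for _, _, ok in records if not ok) / len(records)

# Throughput over varying time windows.
for window_s in (1, 5):
    lo, hi, mean = throughput_stats(timestamps, window_s)
    print(f"{window_s}s windows: min={lo:.1f} max={hi:.1f} mean={mean:.1f} req/s")

# Latency: extremes and tail percentiles, with the mean shown for contrast.
print("latency min/max:", min(latencies), max(latencies))
for pct in (50, 95, 99):
    print(f"p{pct}: {percentile(latencies, pct)} ms")
print(f"mean: {sum(latencies) / len(latencies):.1f} ms, error rate: {error_rate:.0%}")
```

In this made-up sample two slow requests pull the mean latency to 125.7 ms while the median stays at 14 ms, which is the point of the last bullet.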
17. We need to talk about latency
▸ Latency isn’t exponentially-distributed
▸ And it certainly isn’t normally-distributed
▸ Latency distributions have heavy tails
▸ Latency distributions are multi-modal
▸ Customers see tail latencies way more than you think
▸ Don’t let percentiles trick you
▸ Understand what latency means to your business
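One way to see why customers meet the tail so often: a single page view or session usually triggers many requests. The sketch below assumes request latencies are independent, which real systems only approximate, and the function name `chance_of_tail_hit` is made up for this example.

```python
def chance_of_tail_hit(percentile, requests):
    """Probability that at least one of `requests` independent requests is
    slower than the given latency percentile (e.g. 99 for p99)."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must be between 0 and 100")
    return 1.0 - (percentile / 100.0) ** requests

# Hypothetical sessions of different sizes, measured against p99 and p99.9.
for requests in (1, 10, 100):
    p99 = chance_of_tail_hit(99, requests)
    p999 = chance_of_tail_hit(99.9, requests)
    print(f"{requests:>3} requests: worse than p99 {p99:.1%}, worse than p99.9 {p999:.1%}")
```

Under that independence assumption, a session of 100 requests has roughly a 63% chance of including at least one request slower than the p99, so a "1% problem" per request is a majority experience per user.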
33. Service latencies stack
▸ For simple cases (feed-forward networks), latencies are additive
▸ Analytical models are available
▸ http://robharrop.github.io/maths/performance/2016/03/15/queue-networks.html
▸ For most interesting cases this cannot be assumed
▸ Simulation is the best option
▸ Pretty Damn Quick (PDQ) is a great tool, but requires a chunk of effort
▸ Guesstimate is great for quick and dirty models
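As a rough illustration of the simulation approach, here is a small Monte Carlo sketch using only the Python standard library. It is not PDQ or Guesstimate: the three services, their lognormal latency distributions, and the medians are assumptions picked for the example, and it ignores queueing entirely.

```python
import math
import random

def percentile(values, pct):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def simulate(trials=20_000, seed=42):
    rng = random.Random(seed)
    # Hypothetical services: lognormal latency with medians of 10, 20 and 5 ms.
    medians_ms = [10.0, 20.0, 5.0]
    sigma = 0.5
    per_service = [[] for _ in medians_ms]
    chain_totals = []
    fanout_totals = []
    for _ in range(trials):
        samples = [rng.lognormvariate(math.log(m), sigma) for m in medians_ms]
        for bucket, sample in zip(per_service, samples):
            bucket.append(sample)
        chain_totals.append(sum(samples))   # sequential calls: latencies add
        fanout_totals.append(max(samples))  # parallel calls: the slowest one wins
    return per_service, chain_totals, fanout_totals

per_service, chain, fanout = simulate()
sum_of_p99s = sum(percentile(s, 99) for s in per_service)
print(f"sum of per-service p99s: {sum_of_p99s:.1f} ms")
print(f"p99 of sequential chain: {percentile(chain, 99):.1f} ms")
print(f"p99 of parallel fan-out: {percentile(fanout, 99):.1f} ms")
```

Mean latencies add along the sequential chain, but percentiles do not: the end-to-end p99 has to be read off the simulated totals rather than summed from per-service figures, and the fan-out case follows a different rule again.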
39. Amdahl’s Law
▸ Theoretical improvement in latency given a fixed workload
$$S = \frac{1}{(1 - p) + \frac{p}{s}}$$
▸ $S$: theoretical max system speedup
▸ $s$: speedup of part under optimisation
▸ $p$: percentage of execution time in part under optimisation, expressed as a fraction of 1
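A small function makes it easy to compare optimisation candidates. This sketch assumes the standard form of Amdahl's Law, speedup = 1 / ((1 - p) + p / s); the function name and the two candidate services are hypothetical.

```python
def amdahl_speedup(fraction, part_speedup):
    """Overall speedup when `fraction` of execution time (0..1) is made
    `part_speedup` times faster and the rest is left unchanged."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if part_speedup <= 0:
        raise ValueError("part_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / part_speedup)

# Hypothetical candidates: (share of end-to-end time, achievable speedup of that part).
candidates = {
    "service A": (0.60, 1.5),   # large share, modest improvement
    "service B": (0.10, 10.0),  # small share, dramatic improvement
}
for name, (fraction, part_speedup) in candidates.items():
    print(f"{name}: overall speedup {amdahl_speedup(fraction, part_speedup):.2f}x")
```

With these made-up numbers, a modest 1.5x gain on the part that takes 60% of the time beats a 10x gain on the part that takes 10%.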
40. Amdahl’s Law in the Limit
▸ Theoretical max is limited by parts of the system not under improvement
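$$\lim_{s \to \infty} S = \lim_{s \to \infty} \frac{1}{(1 - p) + \frac{p}{s}} = \frac{1}{1 - p}$$
▸ $S$: theoretical max system speedup
▸ $p$: percentage of execution time in part under optimisation, expressed as a fraction of 1
▸ $s$: speedup of part under optimisation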
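A worked example shows how quickly the ceiling is reached. It assumes the standard form of Amdahl's Law with $p$ as the fraction of execution time in the part under optimisation and $s$ as that part's speedup; the value $p = 0.6$ is picked purely for illustration.

```latex
\begin{align*}
s = 2:\quad        & S = \frac{1}{0.4 + 0.6 / 2}  = \frac{1}{0.7}  \approx 1.43 \\
s = 10:\quad       & S = \frac{1}{0.4 + 0.6 / 10} = \frac{1}{0.46} \approx 2.17 \\
s \to \infty:\quad & S \to \frac{1}{1 - 0.6}      = \frac{1}{0.4}  = 2.5
\end{align*}
```

Even an infinitely fast replacement for that part cannot push the system past 2.5x, because the untouched 40% of execution time remains.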
53. Summary
▸ Measurements are critical
▸ Garbage in, garbage out
▸ Monitor utilisation for early-warning of disaster
▸ Little’s Law
▸ Monitor latency per-user, not just per-request
▸ Select optimisation targets carefully
▸ Amdahl’s Law
▸ Monitor crosstalk to forecast scalability
▸ Universal Scalability Law
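The two laws named in the summary that are not spelled out in this excerpt can be sketched as follows. The formulas are the standard textbook forms (Little's Law, the utilisation law that follows from it, and Gunther's Universal Scalability Law); the function names and all parameter values are assumptions chosen for illustration, not figures from the talk.

```python
import math

def littles_law_concurrency(throughput_per_s, mean_latency_s):
    """Little's Law: mean requests in the system = throughput x mean time in system."""
    return throughput_per_s * mean_latency_s

def utilisation(throughput_per_s, mean_service_time_s, servers=1):
    """Utilisation law: fraction of server capacity that is busy."""
    return throughput_per_s * mean_service_time_s / servers

def usl_relative_capacity(n, contention, crosstalk):
    """Universal Scalability Law: C(N) = N / (1 + a(N - 1) + b N (N - 1)),
    where a is the contention coefficient and b the crosstalk (coherency) coefficient."""
    return n / (1.0 + contention * (n - 1) + crosstalk * n * (n - 1))

def usl_peak_load(contention, crosstalk):
    """Load N at which USL capacity peaks: sqrt((1 - a) / b)."""
    return math.sqrt((1.0 - contention) / crosstalk)

# Hypothetical service: 200 req/s, 50 ms mean latency, 4 workers, 15 ms service time.
print("mean in-flight requests:", littles_law_concurrency(200, 0.05))
print("utilisation:", utilisation(200, 0.015, servers=4))

# Hypothetical USL coefficients, as if fitted from load-test measurements.
contention, crosstalk = 0.05, 0.001
for n in (1, 8, 16, 32, 64):
    print(f"N={n:>2}: relative capacity {usl_relative_capacity(n, contention, crosstalk):.2f}")
print(f"capacity peaks near N = {usl_peak_load(contention, crosstalk):.1f}")
```

With a non-zero crosstalk coefficient, capacity rises, peaks, and then falls as load or node count grows, which is why the summary ties crosstalk monitoring to scalability forecasting.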
54. Reading List and Q&A
▸ Release It! - Michael Nygard
▸ Systems Performance - Brendan Gregg
▸ Guerrilla Capacity Planning - Dr. Neil Gunther
▸ Practical Scalability Analysis - Baron Schwartz