The document discusses challenges with processor benchmarking and provides recommendations. It summarizes a case study where a popular CPU benchmark claimed a new processor was 2.6x faster than Intel, but detailed analysis found the benchmark was testing division speed, which accounted for only 0.1% of cycles on Netflix servers. The document advocates for low-level, active benchmarking and profiling over statistical analysis. It also provides a checklist for evaluating benchmarks and cautions that increased processor complexity and cloud environments make accurate benchmarking more difficult.
10. Netflix Cloud
● <1% div cycles
● Therefore, perf win should be <1% (not 2.6x!)
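The reasoning on this slide can be checked with Amdahl's law: if only a fraction of cycles gets faster, the overall gain is bounded by that fraction. The sketch below is illustrative only. It assumes 1% as the division share (the slide's upper bound) and assumes the benchmark's 2.6x applies to division alone; the function name `overall_speedup` is hypothetical.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-workload speedup when only `fraction` of the
    time is accelerated by a factor of `local_speedup`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Assumed inputs: 1% of cycles spent in division (upper bound from the slide)
# and the benchmark's 2.6x figure applied to division only.
div_fraction = 0.01
claimed_speedup = 2.6

gain = overall_speedup(div_fraction, claimed_speedup)
print(f"overall speedup: {gain:.4f}x ({(gain - 1) * 100:.2f}% faster)")
# prints: overall speedup: 1.0062x (0.62% faster)
```

Even if division became effectively free, the same formula caps the gain at about 1% for this workload, which is why a 2.6x headline number does not carry over.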
11. Challenges
● This benchmark is widely used
● Cycle analysis is nearly impossible in the cloud
○ Under hypervisors: Limited PMCs; no PEBS
● Accurate benchmarking needs senior engineers
13. My Benchmarking Checklist
1. Why not double?
2. Was it tuned?
3. Did it break limits?
4. Did it error?
5. Does it reproduce?
6. Does it matter?
7. Did it even happen?
https://www.brendangregg.com/blog/2018-06-30/benchmarking-checklist.html
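As a rough illustration of item 5 ("Does it reproduce?"), one option is to repeat a run several times and look at how much the timings vary before trusting any single number. This is a minimal sketch, not the method from the linked post: `time_runs`, `relative_spread`, and `example_workload` are hypothetical names, and the workload is a stand-in for a real benchmark body.

```python
import statistics
import time


def time_runs(workload, runs=5):
    """Time `workload` several times; return elapsed seconds for each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def relative_spread(samples):
    """Coefficient of variation: sample standard deviation divided by mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


def example_workload():
    # Hypothetical stand-in for a real benchmark body (integer division loop).
    sum(i // 3 for i in range(1, 200_000))


samples = time_runs(example_workload)
print("runs (s):", [round(s, 4) for s in samples])
print(f"relative spread: {relative_spread(samples):.1%}")
```

A large spread between runs suggests the result is dominated by noise (other tenants, frequency scaling, caching) and should be investigated before it is compared against another system.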
14. An Exciting New Era of Processor Innovation
Vertical stacking, new capabilities
More processors & competition
15. But also a Challenging New Era of Processor Benchmarking
Increased demand
Hard to debug in the cloud
Popular benchmarks can be wrong