SlideShare una empresa de Scribd logo
1 de 60
Descargar para leer sin conexión
Benchmarking: 
You’re Doing It Wrong 
Aysylu 
Greenberg 
@aysylu22
To 
Write 
Good 
Benchmarks… 
Need 
to 
be 
Full 
Stack
Benchmark 
= 
How 
Fast? 
your 
process 
vs 
Goal 
your 
process 
vs 
Best 
PracCces
Today 
• How 
Not 
to 
Write 
Benchmarks 
• Benchmark 
Setup 
& 
Results: 
- 
You’re 
wrong 
about 
machines 
- 
You’re 
wrong 
about 
stats 
- 
You’re 
wrong 
about 
what 
maLers 
• Becoming 
Less 
Wrong 
• Having 
Fun 
with 
Riak
HOW 
NOT 
TO 
WRITE 
BENCHMARKS
Website 
Serving 
Images 
• Access 
1 
image 
1000 
Cmes 
• Latency 
measured 
for 
each 
access 
• Start 
measuring 
immediately 
• 3 
runs 
• Find 
mean 
• Dev 
environment 
Web 
Request 
Server 
Cache 
S3
WHAT’S 
WRONG 
WITH 
THIS 
BENCHMARK?
YOU’RE 
WRONG 
ABOUT 
THE 
MACHINE
Wrong 
About 
the 
Machine 
• Cache, 
cache, 
cache, 
cache!
It’s 
Caches 
All 
The 
Way 
Down 
Web 
Request 
Server 
Cache 
S3
It’s 
Caches 
All 
The 
Way 
Down
Caches 
in 
Benchmarks 
Prof. 
Saman 
Amarasinghe, 
MIT 
2009
Caches 
in 
Benchmarks 
Prof. 
Saman 
Amarasinghe, 
MIT 
2009
Caches 
in 
Benchmarks 
Prof. 
Saman 
Amarasinghe, 
MIT 
2009
Caches 
in 
Benchmarks 
Prof. 
Saman 
Amarasinghe, 
MIT 
2009
Caches 
in 
Benchmarks 
Prof. 
Saman 
Amarasinghe, 
MIT 
2009
Website 
Serving 
Images 
• Access 
1 
image 
1000 
Cmes 
• Latency 
measured 
for 
each 
access 
• Start 
measuring 
immediately 
• 3 
runs 
• Find 
mean 
• Dev 
environment 
Web 
Request 
Server 
Cache 
S3
Wrong 
About 
the 
Machine 
• Cache, 
cache, 
cache, 
cache! 
• Warmup 
& 
Cming
Website 
Serving 
Images 
• Access 
1 
image 
1000 
Cmes 
• Latency 
measured 
for 
each 
access 
• Start 
measuring 
immediately 
• 3 
runs 
• Find 
mean 
• Dev 
environment 
Web 
Request 
Server 
Cache 
S3
Wrong 
About 
the 
Machine 
• Cache, 
cache, 
cache, 
cache! 
• Warmup 
& 
Cming 
• Periodic 
interference
Website 
Serving 
Images 
• Access 
1 
image 
1000 
Cmes 
• Latency 
measured 
for 
each 
access 
• Start 
measuring 
immediately 
• 3 
runs 
• Find 
mean 
• Dev 
environment 
Web 
Request 
Server 
Cache 
S3
Wrong 
About 
the 
Machine 
• Cache, 
cache, 
cache, 
cache! 
• Warmup 
& 
Cming 
• Periodic 
interference 
• Test 
!= 
Prod
Website 
Serving 
Images 
• Access 
1 
image 
1000 
Cmes 
• Latency 
measured 
for 
each 
access 
• Start 
measuring 
immediately 
• 3 
runs 
• Find 
mean 
• Dev 
environment 
Web 
Request 
Server 
Cache 
S3
Wrong 
About 
the 
Machine 
• Cache, 
cache, 
cache, 
cache! 
• Warmup 
& 
Cming 
• Periodic 
interference 
• Test 
!= 
Prod 
• Power 
mode 
changes
YOU’RE 
WRONG 
ABOUT 
THE 
STATS
Wrong 
About 
Stats 
• Too 
few 
samples
Wrong 
About 
Stats 
120 
100 
80 
60 
40 
20 
0 
Convergence 
of 
Median 
on 
Samples 
0 
10 
20 
30 
40 
50 
60 
Latency 
Time 
Stable 
Samples 
Stable 
Median 
Decaying 
Samples 
Decaying 
Median
Website 
Serving 
Images 
• Access 
1 
image 
1000 
Cmes 
• Latency 
measured 
for 
each 
access 
• Start 
measuring 
immediately 
• 3 
runs 
• Find 
mean 
• Dev 
machine 
Web 
Request 
Server 
Cache 
S3
Wrong 
About 
Stats 
• Too 
few 
samples 
• Gaussian 
(not)
Website 
Serving 
Images 
• Access 
1 
image 
1000 
Cmes 
• Latency 
measured 
for 
each 
access 
• Start 
measuring 
immediately 
• 3 
runs 
• Find 
mean 
• Dev 
machine 
Web 
Request 
Server 
Cache 
S3
Wrong 
About 
Stats 
• Too 
few 
samples 
• Gaussian 
(not) 
• MulCmodal 
distribuCon
MulCmodal 
DistribuCon 
50% 
99% 
# 
occurrences 
Latency 
5 
ms 
10 
ms
Wrong 
About 
Stats 
• Too 
few 
samples 
• Gaussian 
(not) 
• MulCmodal 
distribuCon 
• Outliers
YOU’RE 
WRONG 
ABOUT 
WHAT 
MATTERS
Wrong 
About 
What 
MaLers 
• Premature 
opCmizaCon
“Programmers 
waste 
enormous 
amounts 
of 
Cme 
thinking 
about 
… 
the 
speed 
of 
noncriCcal 
parts 
of 
their 
programs 
... 
Forget 
about 
small 
efficiencies 
…97% 
of 
the 
Cme: 
premature 
opHmizaHon 
is 
the 
root 
of 
all 
evil. 
Yet 
we 
should 
not 
pass 
up 
our 
opportuniCes 
in 
that 
criCcal 
3%.” 
-­‐-­‐ 
Donald 
Knuth
Wrong 
About 
What 
MaLers 
• Premature 
opCmizaCon 
• UnrepresentaCve 
workloads
Wrong 
About 
What 
MaLers 
• Premature 
opCmizaCon 
• UnrepresentaCve 
workloads 
• Memory 
pressure
Wrong 
About 
What 
MaLers 
• Premature 
opCmizaCon 
• UnrepresentaCve 
workloads 
• Memory 
pressure 
• Load 
balancing
Wrong 
About 
What 
MaLers 
• Premature 
opCmizaCon 
• UnrepresentaCve 
workloads 
• Memory 
pressure 
• Load 
balancing 
• Reproducibility 
of 
measurements
BECOMING 
LESS 
WRONG
User 
AcCons 
MaLer 
X 
> 
Y 
for 
workload 
Z 
with 
trade 
offs 
A, 
B, 
and 
C 
-­‐ 
hLp://www.toomuchcode.org/
Profiling 
Code 
instrumentaCon 
Aggregate 
over 
logs 
Traces
Microbenchmarking: 
Blessing 
& 
Curse 
+ Quick 
& 
cheap 
+ Answers 
narrow 
?s 
well 
- Osen 
misleading 
results 
- Not 
representaCve 
of 
the 
program
Microbenchmarking: 
Blessing 
& 
Curse 
• Choose 
your 
N 
wisely
Choose 
Your 
N 
Wisely 
Prof. 
Saman 
Amarasinghe, 
MIT 
2009
Microbenchmarking: 
Blessing 
& 
Curse 
• Choose 
your 
N 
wisely 
• Measure 
side 
effects
Microbenchmarking: 
Blessing 
& 
Curse 
• Choose 
your 
N 
wisely 
• Measure 
side 
effects 
• Beware 
of 
clock 
resoluCon
Microbenchmarking: 
Blessing 
& 
Curse 
• Choose 
your 
N 
wisely 
• Measure 
side 
effects 
• Beware 
of 
clock 
resoluCon 
• Dead 
Code 
EliminaCon
Microbenchmarking: 
Blessing 
& 
Curse 
• Choose 
your 
N 
wisely 
• Measure 
side 
effects 
• Beware 
of 
clock 
resoluCon 
• Dead 
Code 
EliminaCon 
• Constant 
work 
per 
iteraCon
Non-­‐Constant 
Work 
Per 
IteraCon
Follow-­‐up 
Material 
• How 
NOT 
to 
Measure 
Latency 
by 
Gil 
Tene 
– hLp://www.infoq.com/presentaCons/latency-­‐piualls 
• Taming 
the 
Long 
Latency 
Tail 
on 
highscalability.com 
– hLp://highscalability.com/blog/2012/3/12/google-­‐taming-­‐ 
the-­‐long-­‐latency-­‐tail-­‐when-­‐more-­‐machines-­‐equal.html 
• Performance 
Analysis 
Methodology 
by 
Brendan 
Gregg 
– hLp://www.brendangregg.com/methodology.html 
• Silverman’s 
Mode 
Detec@on 
Method 
by 
MaL 
Adereth 
– hLp://adereth.github.io/blog/2014/10/12/silvermans-­‐ 
mode-­‐detecCon-­‐method-­‐explained/
HAVING 
FUN 
WITH
Setup 
• SSD 
30 
GB 
• M3 
large 
• Riak 
version 
1.4.2-­‐0-­‐g61ac9d8 
• Ubuntu 
12.04.5 
LTS 
• 4 
byte 
keys, 
10 
KB 
values
2350 
2300 
2250 
2200 
2150 
2100 
2050 
2000 
1950 
1900 
1850 
Latency 
(usec) 
Get 
Latency 
L3 
Number 
of 
Keys
Takeaway 
#1: 
Cache
Takeaway 
#2: 
Outliers
Takeaway 
#3: 
Workload
Benchmarking: 
You’re Doing It Wrong 
Aysylu 
Greenberg 
@aysylu22

Más contenido relacionado

Similar a Benchmarking (RICON 2014)

Anomaly Detection Using the CLA
Anomaly Detection Using the CLAAnomaly Detection Using the CLA
Anomaly Detection Using the CLANumenta
 
The challenges of live events scalability
The challenges of live events scalabilityThe challenges of live events scalability
The challenges of live events scalabilityGuy Tomer
 
Adventures in Azure Machine Learning from NE Bytes
Adventures in Azure Machine Learning from NE BytesAdventures in Azure Machine Learning from NE Bytes
Adventures in Azure Machine Learning from NE BytesDerek Graham
 
Machine learning systems for engineers
Machine learning systems for engineersMachine learning systems for engineers
Machine learning systems for engineersCameron Joannidis
 
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...Matt Ray
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applicationsAmit Kejriwal
 
Coates bosc2010 clouds-fluff-and-no-substance
Coates bosc2010 clouds-fluff-and-no-substanceCoates bosc2010 clouds-fluff-and-no-substance
Coates bosc2010 clouds-fluff-and-no-substanceBOSC 2010
 
Embrace Chaos - Introducing Chaos Engineering to your Organization
Embrace Chaos - Introducing Chaos Engineering to your OrganizationEmbrace Chaos - Introducing Chaos Engineering to your Organization
Embrace Chaos - Introducing Chaos Engineering to your OrganizationPaul Osman
 
The deep bootstrap framework review
The deep bootstrap framework reviewThe deep bootstrap framework review
The deep bootstrap framework reviewtaeseon ryu
 
Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser colleenfry
 
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...huguk
 
Badneedles
BadneedlesBadneedles
Badneedlesdimisec
 
Cvcc performance tuning
Cvcc performance tuningCvcc performance tuning
Cvcc performance tuningJohn McCaffrey
 
The RED Method: How to monitoring your microservices.
The RED Method: How to monitoring your microservices.The RED Method: How to monitoring your microservices.
The RED Method: How to monitoring your microservices.Grafana Labs
 
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by InstrumentAmazon Web Services
 
Rubyslava beyond the_monolith
Rubyslava beyond the_monolithRubyslava beyond the_monolith
Rubyslava beyond the_monolitholahmichal
 
Capacity Planning for fun & profit
Capacity Planning for fun & profitCapacity Planning for fun & profit
Capacity Planning for fun & profitRodrigo Campos
 
Continuous Integration, the minimum viable product
Continuous Integration, the minimum viable productContinuous Integration, the minimum viable product
Continuous Integration, the minimum viable productJulian Simpson
 
Performance Oriented Design
Performance Oriented DesignPerformance Oriented Design
Performance Oriented DesignRodrigo Campos
 

Similar a Benchmarking (RICON 2014) (20)

Anomaly Detection Using the CLA
Anomaly Detection Using the CLAAnomaly Detection Using the CLA
Anomaly Detection Using the CLA
 
The challenges of live events scalability
The challenges of live events scalabilityThe challenges of live events scalability
The challenges of live events scalability
 
Adventures in Azure Machine Learning from NE Bytes
Adventures in Azure Machine Learning from NE BytesAdventures in Azure Machine Learning from NE Bytes
Adventures in Azure Machine Learning from NE Bytes
 
Machine learning systems for engineers
Machine learning systems for engineersMachine learning systems for engineers
Machine learning systems for engineers
 
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
 
Coates bosc2010 clouds-fluff-and-no-substance
Coates bosc2010 clouds-fluff-and-no-substanceCoates bosc2010 clouds-fluff-and-no-substance
Coates bosc2010 clouds-fluff-and-no-substance
 
Embrace Chaos - Introducing Chaos Engineering to your Organization
Embrace Chaos - Introducing Chaos Engineering to your OrganizationEmbrace Chaos - Introducing Chaos Engineering to your Organization
Embrace Chaos - Introducing Chaos Engineering to your Organization
 
The deep bootstrap framework review
The deep bootstrap framework reviewThe deep bootstrap framework review
The deep bootstrap framework review
 
Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser
 
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
Sean Kandel - Data profiling: Assessing the overall content and quality of a ...
 
Badneedles
BadneedlesBadneedles
Badneedles
 
Cvcc performance tuning
Cvcc performance tuningCvcc performance tuning
Cvcc performance tuning
 
The RED Method: How to monitoring your microservices.
The RED Method: How to monitoring your microservices.The RED Method: How to monitoring your microservices.
The RED Method: How to monitoring your microservices.
 
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
 
Rubyslava beyond the_monolith
Rubyslava beyond the_monolithRubyslava beyond the_monolith
Rubyslava beyond the_monolith
 
Capacity Planning for fun & profit
Capacity Planning for fun & profitCapacity Planning for fun & profit
Capacity Planning for fun & profit
 
Continuous Integration, the minimum viable product
Continuous Integration, the minimum viable productContinuous Integration, the minimum viable product
Continuous Integration, the minimum viable product
 
Cloud War Stories
Cloud War StoriesCloud War Stories
Cloud War Stories
 
Performance Oriented Design
Performance Oriented DesignPerformance Oriented Design
Performance Oriented Design
 

Más de Aysylu Greenberg

Software Supply Chains for DevOps @ InfoQ Live 2021
Software Supply Chains for DevOps @ InfoQ Live 2021Software Supply Chains for DevOps @ InfoQ Live 2021
Software Supply Chains for DevOps @ InfoQ Live 2021Aysylu Greenberg
 
Binary Authorization in Kubernetes
Binary Authorization in KubernetesBinary Authorization in Kubernetes
Binary Authorization in KubernetesAysylu Greenberg
 
Software Supply Chain Management with Grafeas and Kritis
Software Supply Chain Management with Grafeas and KritisSoftware Supply Chain Management with Grafeas and Kritis
Software Supply Chain Management with Grafeas and KritisAysylu Greenberg
 
Software Supply Chain Observability with Grafeas and Kritis
Software Supply Chain Observability with Grafeas and KritisSoftware Supply Chain Observability with Grafeas and Kritis
Software Supply Chain Observability with Grafeas and KritisAysylu Greenberg
 
Software Supply Chain Management with Grafeas and Kritis
Software Supply Chain Management with Grafeas and KritisSoftware Supply Chain Management with Grafeas and Kritis
Software Supply Chain Management with Grafeas and KritisAysylu Greenberg
 
Zero Downtime Migrations at Scale
Zero Downtime Migrations at ScaleZero Downtime Migrations at Scale
Zero Downtime Migrations at ScaleAysylu Greenberg
 
Distributed systems in practice, in theory (ScaleConf Colombia)
Distributed systems in practice, in theory (ScaleConf Colombia)Distributed systems in practice, in theory (ScaleConf Colombia)
Distributed systems in practice, in theory (ScaleConf Colombia)Aysylu Greenberg
 
MesosCon Asia Keynote: Replacing a Jet Engine Mid-flight
MesosCon Asia Keynote: Replacing a Jet Engine Mid-flightMesosCon Asia Keynote: Replacing a Jet Engine Mid-flight
MesosCon Asia Keynote: Replacing a Jet Engine Mid-flightAysylu Greenberg
 
Distributed systems in practice, in theory (JAX London)
Distributed systems in practice, in theory (JAX London)Distributed systems in practice, in theory (JAX London)
Distributed systems in practice, in theory (JAX London)Aysylu Greenberg
 
Building A Distributed Build System at Google Scale (StrangeLoop 2016)
Building A Distributed Build System at Google Scale (StrangeLoop 2016)Building A Distributed Build System at Google Scale (StrangeLoop 2016)
Building A Distributed Build System at Google Scale (StrangeLoop 2016)Aysylu Greenberg
 
Building a Distributed Build System at Google Scale
Building a Distributed Build System at Google ScaleBuilding a Distributed Build System at Google Scale
Building a Distributed Build System at Google ScaleAysylu Greenberg
 
Distributed systems in practice, in theory
Distributed systems in practice, in theoryDistributed systems in practice, in theory
Distributed systems in practice, in theoryAysylu Greenberg
 
Probabilistic Accuracy Bounds @ Papers We Love SF
Probabilistic Accuracy Bounds @ Papers We Love SFProbabilistic Accuracy Bounds @ Papers We Love SF
Probabilistic Accuracy Bounds @ Papers We Love SFAysylu Greenberg
 
Benchmarking (JAXLondon 2015)
Benchmarking (JAXLondon 2015)Benchmarking (JAXLondon 2015)
Benchmarking (JAXLondon 2015)Aysylu Greenberg
 
Loom & Functional Graphs in Clojure @ LambdaConf 2015
Loom & Functional Graphs in Clojure @ LambdaConf 2015Loom & Functional Graphs in Clojure @ LambdaConf 2015
Loom & Functional Graphs in Clojure @ LambdaConf 2015Aysylu Greenberg
 
Benchmarking (DevNexus 2015)
Benchmarking (DevNexus 2015)Benchmarking (DevNexus 2015)
Benchmarking (DevNexus 2015)Aysylu Greenberg
 
PWL: One VM to Rule Them All
PWL: One VM to Rule Them AllPWL: One VM to Rule Them All
PWL: One VM to Rule Them AllAysylu Greenberg
 

Más de Aysylu Greenberg (20)

Software Supply Chains for DevOps @ InfoQ Live 2021
Software Supply Chains for DevOps @ InfoQ Live 2021Software Supply Chains for DevOps @ InfoQ Live 2021
Software Supply Chains for DevOps @ InfoQ Live 2021
 
Binary Authorization in Kubernetes
Binary Authorization in KubernetesBinary Authorization in Kubernetes
Binary Authorization in Kubernetes
 
Software Supply Chain Management with Grafeas and Kritis
Software Supply Chain Management with Grafeas and KritisSoftware Supply Chain Management with Grafeas and Kritis
Software Supply Chain Management with Grafeas and Kritis
 
Software Supply Chain Observability with Grafeas and Kritis
Software Supply Chain Observability with Grafeas and KritisSoftware Supply Chain Observability with Grafeas and Kritis
Software Supply Chain Observability with Grafeas and Kritis
 
Software Supply Chain Management with Grafeas and Kritis
Software Supply Chain Management with Grafeas and KritisSoftware Supply Chain Management with Grafeas and Kritis
Software Supply Chain Management with Grafeas and Kritis
 
Zero Downtime Migrations at Scale
Zero Downtime Migrations at ScaleZero Downtime Migrations at Scale
Zero Downtime Migrations at Scale
 
Zero Downtime Migration
Zero Downtime MigrationZero Downtime Migration
Zero Downtime Migration
 
PWL Denver: Copysets
PWL Denver: CopysetsPWL Denver: Copysets
PWL Denver: Copysets
 
Distributed systems in practice, in theory (ScaleConf Colombia)
Distributed systems in practice, in theory (ScaleConf Colombia)Distributed systems in practice, in theory (ScaleConf Colombia)
Distributed systems in practice, in theory (ScaleConf Colombia)
 
MesosCon Asia Keynote: Replacing a Jet Engine Mid-flight
MesosCon Asia Keynote: Replacing a Jet Engine Mid-flightMesosCon Asia Keynote: Replacing a Jet Engine Mid-flight
MesosCon Asia Keynote: Replacing a Jet Engine Mid-flight
 
Distributed systems in practice, in theory (JAX London)
Distributed systems in practice, in theory (JAX London)Distributed systems in practice, in theory (JAX London)
Distributed systems in practice, in theory (JAX London)
 
Building A Distributed Build System at Google Scale (StrangeLoop 2016)
Building A Distributed Build System at Google Scale (StrangeLoop 2016)Building A Distributed Build System at Google Scale (StrangeLoop 2016)
Building A Distributed Build System at Google Scale (StrangeLoop 2016)
 
Building a Distributed Build System at Google Scale
Building a Distributed Build System at Google ScaleBuilding a Distributed Build System at Google Scale
Building a Distributed Build System at Google Scale
 
(+ Loom (years 2))
(+ Loom (years 2))(+ Loom (years 2))
(+ Loom (years 2))
 
Distributed systems in practice, in theory
Distributed systems in practice, in theoryDistributed systems in practice, in theory
Distributed systems in practice, in theory
 
Probabilistic Accuracy Bounds @ Papers We Love SF
Probabilistic Accuracy Bounds @ Papers We Love SFProbabilistic Accuracy Bounds @ Papers We Love SF
Probabilistic Accuracy Bounds @ Papers We Love SF
 
Benchmarking (JAXLondon 2015)
Benchmarking (JAXLondon 2015)Benchmarking (JAXLondon 2015)
Benchmarking (JAXLondon 2015)
 
Loom & Functional Graphs in Clojure @ LambdaConf 2015
Loom & Functional Graphs in Clojure @ LambdaConf 2015Loom & Functional Graphs in Clojure @ LambdaConf 2015
Loom & Functional Graphs in Clojure @ LambdaConf 2015
 
Benchmarking (DevNexus 2015)
Benchmarking (DevNexus 2015)Benchmarking (DevNexus 2015)
Benchmarking (DevNexus 2015)
 
PWL: One VM to Rule Them All
PWL: One VM to Rule Them AllPWL: One VM to Rule Them All
PWL: One VM to Rule Them All
 

Último

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 

Último (20)

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 

Benchmarking (RICON 2014)

  • 1. Benchmarking: You’re Doing It Wrong Aysylu Greenberg @aysylu22
  • 2.
  • 3. To Write Good Benchmarks… Need to be Full Stack
  • 4. Benchmark = How Fast? your process vs Goal your process vs Best PracCces
  • 5. Today • How Not to Write Benchmarks • Benchmark Setup & Results: - You’re wrong about machines - You’re wrong about stats - You’re wrong about what maLers • Becoming Less Wrong • Having Fun with Riak
  • 6. HOW NOT TO WRITE BENCHMARKS
  • 7. Website Serving Images • Access 1 image 1000 Cmes • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 8. WHAT’S WRONG WITH THIS BENCHMARK?
  • 9. YOU’RE WRONG ABOUT THE MACHINE
  • 10. Wrong About the Machine • Cache, cache, cache, cache!
  • 11. It’s Caches All The Way Down Web Request Server Cache S3
  • 12. It’s Caches All The Way Down
  • 13. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 14. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 15. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 16. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 17. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 18. Website Serving Images • Access 1 image 1000 Cmes • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 19. Wrong About the Machine • Cache, cache, cache, cache! • Warmup & Cming
  • 20. Website Serving Images • Access 1 image 1000 Cmes • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 21. Wrong About the Machine • Cache, cache, cache, cache! • Warmup & Cming • Periodic interference
  • 22. Website Serving Images • Access 1 image 1000 Cmes • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 23. Wrong About the Machine • Cache, cache, cache, cache! • Warmup & Cming • Periodic interference • Test != Prod
  • 24. Website Serving Images • Access 1 image 1000 Cmes • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 25. Wrong About the Machine • Cache, cache, cache, cache! • Warmup & Cming • Periodic interference • Test != Prod • Power mode changes
  • 26. YOU’RE WRONG ABOUT THE STATS
  • 27. Wrong About Stats • Too few samples
  • 28. Wrong About Stats 120 100 80 60 40 20 0 Convergence of Median on Samples 0 10 20 30 40 50 60 Latency Time Stable Samples Stable Median Decaying Samples Decaying Median
  • 29. Website Serving Images • Access 1 image 1000 Cmes • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev machine Web Request Server Cache S3
  • 30. Wrong About Stats • Too few samples • Gaussian (not)
  • 31. Website Serving Images • Access 1 image 1000 Cmes • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev machine Web Request Server Cache S3
  • 32. Wrong About Stats • Too few samples • Gaussian (not) • MulCmodal distribuCon
  • 33. MulCmodal DistribuCon 50% 99% # occurrences Latency 5 ms 10 ms
  • 34. Wrong About Stats • Too few samples • Gaussian (not) • MulCmodal distribuCon • Outliers
  • 35. YOU’RE WRONG ABOUT WHAT MATTERS
  • 36. Wrong About What MaLers • Premature opCmizaCon
  • 37. “Programmers waste enormous amounts of Cme thinking about … the speed of noncriCcal parts of their programs ... Forget about small efficiencies …97% of the Cme: premature opHmizaHon is the root of all evil. Yet we should not pass up our opportuniCes in that criCcal 3%.” -­‐-­‐ Donald Knuth
  • 38. Wrong About What MaLers • Premature opCmizaCon • UnrepresentaCve workloads
  • 39. Wrong About What MaLers • Premature opCmizaCon • UnrepresentaCve workloads • Memory pressure
  • 40. Wrong About What MaLers • Premature opCmizaCon • UnrepresentaCve workloads • Memory pressure • Load balancing
  • 41. Wrong About What MaLers • Premature opCmizaCon • UnrepresentaCve workloads • Memory pressure • Load balancing • Reproducibility of measurements
  • 43. User AcCons MaLer X > Y for workload Z with trade offs A, B, and C -­‐ hLp://www.toomuchcode.org/
  • 44. Profiling Code instrumentaCon Aggregate over logs Traces
  • 45. Microbenchmarking: Blessing & Curse + Quick & cheap + Answers narrow ?s well - Osen misleading results - Not representaCve of the program
  • 46. Microbenchmarking: Blessing & Curse • Choose your N wisely
  • 47. Choose Your N Wisely Prof. Saman Amarasinghe, MIT 2009
  • 48. Microbenchmarking: Blessing & Curse • Choose your N wisely • Measure side effects
  • 49. Microbenchmarking: Blessing & Curse • Choose your N wisely • Measure side effects • Beware of clock resoluCon
  • 50. Microbenchmarking: Blessing & Curse • Choose your N wisely • Measure side effects • Beware of clock resoluCon • Dead Code EliminaCon
  • 51. Microbenchmarking: Blessing & Curse • Choose your N wisely • Measure side effects • Beware of clock resoluCon • Dead Code EliminaCon • Constant work per iteraCon
  • 53. Follow-­‐up Material • How NOT to Measure Latency by Gil Tene – hLp://www.infoq.com/presentaCons/latency-­‐piualls • Taming the Long Latency Tail on highscalability.com – hLp://highscalability.com/blog/2012/3/12/google-­‐taming-­‐ the-­‐long-­‐latency-­‐tail-­‐when-­‐more-­‐machines-­‐equal.html • Performance Analysis Methodology by Brendan Gregg – hLp://www.brendangregg.com/methodology.html • Silverman’s Mode Detec@on Method by MaL Adereth – hLp://adereth.github.io/blog/2014/10/12/silvermans-­‐ mode-­‐detecCon-­‐method-­‐explained/
  • 55. Setup • SSD 30 GB • M3 large • Riak version 1.4.2-­‐0-­‐g61ac9d8 • Ubuntu 12.04.5 LTS • 4 byte keys, 10 KB values
  • 56. 2350 2300 2250 2200 2150 2100 2050 2000 1950 1900 1850 Latency (usec) Get Latency L3 Number of Keys
  • 60. Benchmarking: You’re Doing It Wrong Aysylu Greenberg @aysylu22