SlideShare una empresa de Scribd logo
1 de 47
Descargar para leer sin conexión
Opsian
Production Profiling: What, Why
and How
Richard Warburton (@richardwarburto)
Sadiq Jaffer (@sadiqj)
Opsian
Why Performance Matters
Development isn’t Production
Profiling vs Monitoring
Production Profiling
Conclusion
Opsian
Customer Experience
Opsian
Stop Costly Downtime
Opsian
Amazon: 100ms of latency costs 1% of sales
Google: 500ms seconds in search page generation time drops traffic by 20%
Responsive Applications make more Money
Opsian
Improve Developer Productivity
Source: ZT RebelLabs Developer Productivity Report 2017
Opsian
Reduce Costs
Opsian
Why Performance Matters
Development isn’t Production
Profiling vs Monitoring
Production Profiling
Conclusion
Opsian
Development isn’t Production
Testing in development can be easier
Tooling often desktop-based
Not representative of production
Opsian
Unrepresentative Hardware
vs
Opsian
Unrepresentative Software
Opsian
Virtualised environments can perform very differently
“Two frequently used system calls are
~77% slower on AWS EC2”
https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-sl
ower-on-ec2/
Opsian
Unrepresentative Workloads
vs
Opsian
The JVM may have very different behaviour in production
Hotspot does adaptive optimisation
Production may optimise differently
Opsian
Why Performance Matters
Development isn’t Production
Profiling vs Monitoring
Production Profiling
Conclusion
Opsian
Opsian
Ambient/Passive/System Metrics
Static numerical measure about the system
CPU Time Usage / Page-load Times
Cheap and sometimes effective
Opsian
Coarse Grained Instrumentation
Measures time within some instrumented section of the code
Time spent inside the controller layer of your web-app or performing SQL queries
More detailed and actionable though expensive
Opsian
Logging
Records arbitrary events emitted by the system being monitored
log4j/slf4j/logback
Logs of GC events
Often manual, aids system understanding, expensive
Opsian
Production Profiling
Measures what part of your code is eating up some resource
What methods take up CPU or wallclock time?
What lines of code allocate the most objects?
Where are your CPU Cache misses coming from?
Automatic, can be cheap but often isn’t
Opsian
Understand don’t Overwhelm
Too Little
You can’t understand production
problems
Too Much
Needle in a Haystack
You are the problem (overhead)
Opsian
Unactionable Metrics
Many metrics provide pretty graphs but don’t progress treatment
Opsian
Normalisation of Deviance
“Some of the tests always fail, so we just ignore them.”
“Some of the alerts get triggered regularly, so we just ignore them.”
Alert false positives have a cost
Opsian
Instrumentation Cycle
Too much overhead
Reduce instrumentation
granularity
Lack of
instrumentation
Increase instrumentation
granularity
Opsian
Surely a better way?
Diagnostics not Diagnosis
Are there alternatives to the instrumentation dance?
Can’t just rely on metrics - you need actionable insights for your app
What about profiling?
Opsian
Why Performance Matters
Development isn’t Production
Profiling vs Monitoring
Production Profiling
Conclusion
Opsian
Profiling tells you what your application was doing
Opsian
Instrumenting Profilers
Add instructions to collect timings (Eg: JVisualVM Profiler)
Inaccurate - modifies the behaviour of the program
High Overhead - > 2x slower
Opsian
Sampling/Statistical Profilers
Interrupt application periodically to sample the stack
Overhead dependent upon sampling mechanism
Build up a statistical picture over time
Opsian
Statistical Profiling in Java
Periodic GetAllStackTraces
Safepoint bias
Latency and throughput impact
Opsian
Advanced Statistical Profiling in Java
OS Signals to interrupt threads on resource consumption threshold
JVM’s signal handler-safe AsyncGetCallTrace to walk the stack
Opsian
Open Source Java Profilers
High Overhead
VisualVM
hprof
Twitter’s CPUProfile
Anything GetAllStackTraces based
Low Overhead
Async Profiler
Honest Profiler
Java Mission Control
Opsian
Profiling Support in the Linux Kernel
perf and eBPF
perf-map-agent for the JVM
Hardware events (L1/L2/L3 cache misses, branch mispredictions, etc.)
Opsian
People are put off by practical as
well as technical issues
Opsian
Barriers to Ad-Hoc Production Profiling
Low overhead unbiased tools are
relatively new
Generally requires access to
production
Process involves manual work - hard
to automate
Opsian
What if we profiled all the time?
Opsian
Historical Data
Allows for post-hoc incident analysis
Enables correlation with other data/metrics
Performance regression analysis
Opsian
Aggregation
Aggregation and
Indexing
Samples
Reports
Opsian
Putting Samples in Context
Application version
Environment parameters (machine type, CPU, location, etc.)
Desktop profilers don’t do this
Opsian
Summary
Production Profiling is possible - all the time with low overhead
Production Profiling is desirable - can give deep insights
Problems can be overcome
Opsian
Why Performance Matters
Development isn’t Production
Profiling vs Monitoring
Prod Profiling Problems
Conclusion
Opsian
Performance Matters
Development isn’t Production
Metrics can be unactionable
Instrumentation has high overhead
Production Profiling provides insight
Opsian
We need an attitude shift on profiling
+ monitoring
Opsian
Always
On
Proactive
not Reactive
Systematic
not Ad Hoc
Opsian
Please do Production Profiling.
All the time.
Opsian
Any Questions?
https://www.opsian.com/
Opsian
The End

Más contenido relacionado

Similar a Production Profiling: What, Why and How

Best Practices In Load And Stress Testing Cmg Seminar[1]
Best Practices In Load And Stress Testing Cmg Seminar[1]Best Practices In Load And Stress Testing Cmg Seminar[1]
Best Practices In Load And Stress Testing Cmg Seminar[1]
Munirathnam Naidu
 
Technical Testing Introduction
Technical Testing IntroductionTechnical Testing Introduction
Technical Testing Introduction
extentconf Tsoy
 

Similar a Production Profiling: What, Why and How (20)

Best Practices In Load And Stress Testing Cmg Seminar[1]
Best Practices In Load And Stress Testing Cmg Seminar[1]Best Practices In Load And Stress Testing Cmg Seminar[1]
Best Practices In Load And Stress Testing Cmg Seminar[1]
 
Brave New World - A wider perspective of our opportunities
Brave New World - A wider perspective of our opportunitiesBrave New World - A wider perspective of our opportunities
Brave New World - A wider perspective of our opportunities
 
Operations: Production Readiness
Operations: Production ReadinessOperations: Production Readiness
Operations: Production Readiness
 
DevOps and Splunk
DevOps and SplunkDevOps and Splunk
DevOps and Splunk
 
Innovate Better Through Machine data Analytics
Innovate Better Through Machine data AnalyticsInnovate Better Through Machine data Analytics
Innovate Better Through Machine data Analytics
 
AfterTest Madrid March 2016 - DevOps and Testing Introduction
AfterTest Madrid March 2016 - DevOps and Testing IntroductionAfterTest Madrid March 2016 - DevOps and Testing Introduction
AfterTest Madrid March 2016 - DevOps and Testing Introduction
 
Designing Self-maintaining UI Tests for Web Applications
Designing Self-maintaining UI Tests for Web ApplicationsDesigning Self-maintaining UI Tests for Web Applications
Designing Self-maintaining UI Tests for Web Applications
 
Agile Data Science on Greenplum Using Airflow - Greenplum Summit 2019
Agile Data Science on Greenplum Using Airflow - Greenplum Summit 2019Agile Data Science on Greenplum Using Airflow - Greenplum Summit 2019
Agile Data Science on Greenplum Using Airflow - Greenplum Summit 2019
 
Neotys PAC - Stijn Schepers
Neotys PAC - Stijn SchepersNeotys PAC - Stijn Schepers
Neotys PAC - Stijn Schepers
 
Performance Engineering Basics
Performance Engineering BasicsPerformance Engineering Basics
Performance Engineering Basics
 
Discover the power of QA automation testing
Discover the power of QA automation testingDiscover the power of QA automation testing
Discover the power of QA automation testing
 
Start Up Austin 2017: Production Preview - How to Stop Bad Things From Happening
Start Up Austin 2017: Production Preview - How to Stop Bad Things From HappeningStart Up Austin 2017: Production Preview - How to Stop Bad Things From Happening
Start Up Austin 2017: Production Preview - How to Stop Bad Things From Happening
 
NashTech - Azure Application Insights
NashTech - Azure Application InsightsNashTech - Azure Application Insights
NashTech - Azure Application Insights
 
Virtualising Tier 1 Apps
Virtualising Tier 1 AppsVirtualising Tier 1 Apps
Virtualising Tier 1 Apps
 
The DevOps Dance - Shift Left, Shift Right - Get It Right
The DevOps Dance - Shift Left, Shift Right - Get It RightThe DevOps Dance - Shift Left, Shift Right - Get It Right
The DevOps Dance - Shift Left, Shift Right - Get It Right
 
Beyond the Buzzwords
Beyond the BuzzwordsBeyond the Buzzwords
Beyond the Buzzwords
 
Technical Testing Introduction
Technical Testing IntroductionTechnical Testing Introduction
Technical Testing Introduction
 
Technical Testing Introduction
Technical Testing IntroductionTechnical Testing Introduction
Technical Testing Introduction
 
Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016Netflix Edge Engineering Open House Presentations - June 9, 2016
Netflix Edge Engineering Open House Presentations - June 9, 2016
 
Encontrando la Aguja en el Rendimiento de Aplicaciones
Encontrando la Aguja en el Rendimiento de AplicacionesEncontrando la Aguja en el Rendimiento de Aplicaciones
Encontrando la Aguja en el Rendimiento de Aplicaciones
 

Más de RichardWarburton

Más de RichardWarburton (20)

Java collections the force awakens
Java collections  the force awakensJava collections  the force awakens
Java collections the force awakens
 
Generics Past, Present and Future (Latest)
Generics Past, Present and Future (Latest)Generics Past, Present and Future (Latest)
Generics Past, Present and Future (Latest)
 
Collections forceawakens
Collections forceawakensCollections forceawakens
Collections forceawakens
 
Generics past, present and future
Generics  past, present and futureGenerics  past, present and future
Generics past, present and future
 
Jvm profiling under the hood
Jvm profiling under the hoodJvm profiling under the hood
Jvm profiling under the hood
 
How to run a hackday
How to run a hackdayHow to run a hackday
How to run a hackday
 
Generics Past, Present and Future
Generics Past, Present and FutureGenerics Past, Present and Future
Generics Past, Present and Future
 
Pragmatic functional refactoring with java 8 (1)
Pragmatic functional refactoring with java 8 (1)Pragmatic functional refactoring with java 8 (1)
Pragmatic functional refactoring with java 8 (1)
 
Performance and predictability (1)
Performance and predictability (1)Performance and predictability (1)
Performance and predictability (1)
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictability
 
Twins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional ProgrammingTwins: Object Oriented Programming and Functional Programming
Twins: Object Oriented Programming and Functional Programming
 
Pragmatic functional refactoring with java 8
Pragmatic functional refactoring with java 8Pragmatic functional refactoring with java 8
Pragmatic functional refactoring with java 8
 
Introduction to lambda behave
Introduction to lambda behaveIntroduction to lambda behave
Introduction to lambda behave
 
Introduction to lambda behave
Introduction to lambda behaveIntroduction to lambda behave
Introduction to lambda behave
 
Performance and predictability
Performance and predictabilityPerformance and predictability
Performance and predictability
 
Simplifying java with lambdas (short)
Simplifying java with lambdas (short)Simplifying java with lambdas (short)
Simplifying java with lambdas (short)
 
Twins: OOP and FP
Twins: OOP and FPTwins: OOP and FP
Twins: OOP and FP
 
Twins: OOP and FP
Twins: OOP and FPTwins: OOP and FP
Twins: OOP and FP
 
The Bleeding Edge
The Bleeding EdgeThe Bleeding Edge
The Bleeding Edge
 
Lambdas myths-and-mistakes
Lambdas myths-and-mistakesLambdas myths-and-mistakes
Lambdas myths-and-mistakes
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Production Profiling: What, Why and How