Slow is the new down.
Understanding
Slowness
When shit goes wrong,
the gloves come off.
Goals
❖ Approach an understanding of your architecture
❖ Convert this understanding into a strategic plan
❖ Develop logistics for diagnosis
❖ Discuss discipline around remediation
The first step
Understand
Build a map
Build two
“If you don’t have a good map of your
architecture, Dora will whoop you.”
-Theo
How you’d like to think of
Your architecture
Elegant

Beautiful in its simplicity

Robust

Resilient
When in actuality
Your architecture is
Organically grown
Cancerous tumors
Disaster waiting to happen
Hella complicated
of which you are inexplicably proud
Photograph courtesy of Herman Rhoids
Map #1
High-level map
Architectural components

Connectedness

Data flow
Map #2
Low-level map
Component versions
Component languages
OS/NICs/HBAs
Location
Switches/Routers/FW
Connected Service details
Develop
Strategic Plan
There are 2 types of useful SREs:
Spanning several boundaries
Spanning all boundaries
Photograph courtesy of Tambako The Jaguar
https://www.flickr.com/photos/tambako/4598642399
You can’t play ball without bases.
Who’s on first?
Establish who is responsible for each component in each context.
Establish who is responsible when that person fails (upward).
Establish who is responsible when that person needs help (upward and downward).
Nothing will ever be “broken” if it isn’t expected to “work.”
Expectations
Set expectations for breakages and slowdowns.
What you build will break; understanding under what stress is your job as an engineer.
Parts are parts.
Ø tech loyalty
Constructing a solution from parts.
Parts are replaceable.
Have a list of replacement vendors or part alternates.
If you design a solution relying on a part available only from a single vendor, you have accomplished lock-in.
Photograph courtesy of Jason Ilagan
https://www.flickr.com/photos/thepen/428014152
When things are broken (or slow)
Logistics matter
Observability
Tool parity
Safety harnesses
You cannot improve what you cannot measure
Measure
Cut once
Rear Admiral Grace Murray Hopper 1906-1992
The one beast you cannot slay:
Latency
You must subdue it
First you must understand it
Averages are for chumps
Histograms over
Aggregations
Reducing many observations S to N values (∀ |N| << |S|) is the definition of lossy.
or… “you don’t know shit”
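A minimal sketch of why a single average is lossy. The latency values, bucket edges, and the `histogram` helper are all made up for illustration. Two workloads share the same mean, but only the histogram shows that one of them has a slow tail.

```python
import statistics

def histogram(samples, edges):
    """Count samples per bucket; edges are inclusive upper bounds, plus one overflow bucket."""
    counts = [0] * (len(edges) + 1)
    for s in samples:
        for i, edge in enumerate(edges):
            if s <= edge:
                counts[i] += 1
                break
        else:
            counts[-1] += 1  # larger than every edge
    return counts

# Hypothetical latencies in ms (assumed data, not from the talk).
steady = [100] * 100
bimodal = [10] * 90 + [910] * 10

edges = [50, 200, 500]  # assumed bucket boundaries in ms

print(statistics.mean(steady), statistics.mean(bimodal))
print(histogram(steady, edges))
print(histogram(bimodal, edges))
```

The means are identical; the bucket counts are not. One in ten requests in the second workload lands in the overflow bucket, and the average gives no hint of it.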
Exploring quantiles is simple and can provide increased understanding.
Quantiles
Time-series histograms are a lot of information to digest.
Moving quantiles can often provide much more insight.
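One way to sketch a moving quantile, assuming a fixed-size sliding window over raw samples and a nearest-rank quantile. The function names and the sample data are hypothetical; a production system would more likely compute this from stored histograms.

```python
from collections import deque
import math

def quantile(values, q):
    """Nearest-rank quantile: smallest value with at least a fraction q of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def moving_quantile(samples, q, window):
    """Yield quantile q over the most recent `window` samples, once the window is full."""
    recent = deque(maxlen=window)
    for s in samples:
        recent.append(s)
        if len(recent) == window:
            yield quantile(recent, q)

# Hypothetical latencies in ms with a single spike.
samples = [1, 2, 3, 4, 100, 5, 6, 7]
print(list(moving_quantile(samples, 0.5, window=4)))
print(list(moving_quantile(samples, 1.0, window=4)))
```

The median barely moves when the spike passes through the window, while the maximum (q = 1.0) stays pinned at the spike until it ages out. Watching several quantiles side by side shows which part of the distribution changed.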
Remember that you’re consolidating time.
Granular data
Time consolidation is needed.
It can be misleading.
Ask good statistical questions.
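A sketch of one way time consolidation misleads, using assumed data: averaging per-window q(0.99) values is not the same as taking q(0.99) over the combined raw samples, because the windows carry very different request counts.

```python
import math
import statistics

def quantile(values, q):
    """Nearest-rank quantile of raw values."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

# Hypothetical latencies in ms for two equal-length time windows.
busy_window = [10] * 1000   # heavy traffic, fast
quiet_window = [500] * 10   # light traffic, slow

per_window_p99 = [quantile(w, 0.99) for w in (busy_window, quiet_window)]
averaged_p99 = statistics.mean(per_window_p99)          # consolidating the summaries
true_p99 = quantile(busy_window + quiet_window, 0.99)   # consolidating the raw data

print(per_window_p99, averaged_p99, true_p99)
```

The averaged figure describes no request that actually happened. Keeping histograms per window and merging the bucket counts is one way to consolidate time without this distortion.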
Knowing your q(0.99) is “too high” is one thing…
Work backwards
Work backwards.
At what quantile are you?
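Working backwards can be sketched as an inverse quantile: start from the latency you care about and ask what fraction of requests met it. The threshold and sample data below are assumptions for illustration.

```python
def inverse_quantile(samples, threshold):
    """Return the fraction of samples at or below `threshold`."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(1 for s in samples if s <= threshold) / len(samples)

# Hypothetical latencies in ms.
latencies = [120] * 90 + [480] * 8 + [2000] * 2

# Assumed target: requests should finish within 500 ms.
print(inverse_quantile(latencies, 500))
```

Instead of learning that q(0.99) is "too high", this answers which quantile currently sits at the target, which is a number you can track as it improves or degrades.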
mvalue: http://www.brendangregg.com/FrequencyTrails/modes.html
Understand
Workloads
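A hedged sketch of an mvalue-style modality score. It assumes the definition is the sum of absolute differences between adjacent histogram bin counts, with a zero bin added at each end, divided by the tallest bin; check the linked page for the exact formulation. The bin counts are made up.

```python
def mvalue(counts):
    """Modality score for histogram bin counts (assumed definition, see lead-in)."""
    if not counts or max(counts) <= 0:
        raise ValueError("need at least one non-empty bin")
    padded = [0] + list(counts) + [0]  # terminate both ends with an empty bin
    total = sum(abs(b - a) for a, b in zip(padded, padded[1:]))
    return total / max(counts)

# Hypothetical latency histograms (counts per bin).
one_mode = [1, 3, 5, 3, 1]
two_modes = [0, 5, 0, 5, 0]

print(mvalue(one_mode), mvalue(two_modes))
```

Under this definition a single clean peak scores 2 and each additional separated peak of the same height adds 2, so a jump in the score suggests the workload has split into more than one mode. The result depends on the bin widths chosen.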
man(1) is a tool’s tool.
Tools
Tools do not a master craftsman make.
Regardless, know your damn tools.
There are three types of tools.
Photograph courtesy of James Bowe
https://www.flickr.com/photos/jamesrbowe/7164489201
Tool type #1
Observation
Taking measurements.
Inspecting state.
Inspecting conversation.
Photograph courtesy of Gordon Wrigley
https://www.flickr.com/photos/tolomea/4196160169
Tool type #2
Synthesis Synthesizes something to
enable the use of tool type #1
Photograph courtesy of Simon Yao
https://www.flickr.com/photos/smjb/8107539280
Tool type #3
Manipulation
Changing state.
Used for testing hypotheses.
Photograph courtesy of DragonFlyCC
https://www.flickr.com/photos/ladydragonflyherworld/4299545598
Favorite tools
Martial Arts
Tool type #1
• DTrace
• truss/ktrace/strace
• tcpdump/snoop
• mdb/gdb/dbx/lldb
• sar/mpstat/iostat/vmstat
Tool type #2
• curl
Tool type #3
• vi/echo
• sysctl/mdb(-w)
• DTrace(-w)
Photograph courtesy of Republic of Korea
https://www.flickr.com/photos/koreanet/6099430458
Lorem Ipsum Dolor Indeed
Anecdotes This one time at band camp
Photograph courtesy of umjanedoan
https://www.flickr.com/photos/umjanedoan/497411169
Latency
I’m huge in Japan
Latency for a hot landing page jumps from around 300ms to around 450ms.
No changes in latency to other regions.
Latency
Scrub in or go home
Latency for disk writes radically changes behavior.
It’s as if we have a new workload.
We do not have a new workload.
… we do have a new workload.
Photograph courtesy of Phalinn Ooi
https://www.flickr.com/photos/umjanedoan/497411169
Latent effect
Hitting the wall
Disk I/O latency goes to hell at 3pm.
Turns out disk throughput has plateaued.
No change in configuration near 3pm.
Oops, I tripped at 10am.
Illustration courtesy of Jeff Warren
https://www.flickr.com/photos/jeffreywarren/354553098
Thank You
