Operational Insight
June 15, 2015
Roy Rapoport
@royrapoport / linkedin.com/in/royrapoport / rrapoport@netflix.com
Oh, The Places We’ll Go!
John Boyd
Observe
Orient
Decide
Act
OODA
“This approach favors agility over raw power in dealing with human opponents in any endeavor” - Wikipedia
This Is What We Do
OODA KPI
Speed Effort Reliability
Winning
Speed
Effort
Reliability
Implications …
for Observation (aka measurement, telemetry, metrics)
• Make It Easy
• Make It Scalable
• Make it pluggable
• (Eventually) Ruthlessly Cull
“What decision will this help me make?”
A Joke
[Chart: 52 / 48, % of servers in major region with an even IP address]
Implications …
for Orientation (aka graphing, visualization)
• First-class product
• Different decisions require different viz
• Low cognitive load better than
  • High refresh rates
  • Deep data density
Better Like This …
Or Better Like That …
Implications …
for Decisions (aka alerting, real-time analytics, etc)
• You already have (some of) this
• Incremental improvement
• Sky’s the limit
  • For benefits
  • For cost
Implications …
for Action
1. Humans beat bureaucracy
2. Machines beat humans
3. Repeatability beats one-offs
4. Start with humans
5. If IFTTT, deprecate humans
Repeatable machine processes TROUNCE one-off human bureaucracy
Decision:
Do I Have Enough Instances?
Decision:
Is My Canary Good?
Been there. Done that. Manually. Artisanally.
• Started in the Data Center
• Manual, dashboard-driven
[Dashboard panels: CPU, Requests, Errors]
Been there.
Done that.
Manually.
• Context vs Precision
• No …
• Repeatability
• Trending
• Manual effort is manual
So Now What?
• Automate Analysis
• Took Some Effort
• Approach and analytics
• Presentation matters
Pretty Pictures
[Diagram, built up over several slides: Version Control System, Build & Deployment System, Automated Canary Analysis, Customers; a fleet of 1000 servers @ 1.0.1 alongside a new version rolled out as 1 server @ 1.0.2, then 10 servers @ 1.0.2, then 1000 servers @ 1.0.2]
Just The Stats
4-Week View
6309 canary analysis cycles
16% canaries failed
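The slides do not say how the automated canary analysis scores a canary. As a rough illustration only, the "Is My Canary Good?" decision could be sketched as a comparison of canary metrics against the baseline fleet. The metric names, the 10% tolerance, and the promote/rollback labels below are assumptions made for this sketch, not details from the talk.

```python
from statistics import mean

# Assumption: every metric here is "lower is better", and a canary may be
# at most 10% worse than the baseline on each one.
TOLERANCE = 0.10


def canary_score(baseline, canary, tolerance=TOLERANCE):
    """Return the fraction of metrics on which the canary is within tolerance.

    baseline and canary map a metric name to a list of samples.
    """
    passed = 0
    for name, base_samples in baseline.items():
        base = mean(base_samples)
        cand = mean(canary[name])
        if cand <= base * (1 + tolerance):
            passed += 1
    return passed / len(baseline)


def decide(baseline, canary, pass_mark=1.0):
    """Promote the new version only if enough metrics pass; otherwise roll back."""
    return "promote" if canary_score(baseline, canary) >= pass_mark else "rollback"


if __name__ == "__main__":
    # Hypothetical samples, not data from the talk.
    baseline = {"cpu_percent": [40, 42, 41], "error_rate": [0.010, 0.012, 0.011]}
    canary = {"cpu_percent": [41, 43, 42], "error_rate": [0.050, 0.060, 0.055]}
    print(decide(baseline, canary))
```

A check like this would run at each rollout stage (1 server, 10 servers, 1000 servers), so a failing canary stops the deployment without a human reading dashboards.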
Decision:
Do I Have an Outlier?
Outlier Detection
Would You Like to Play a Game?
Spot the Outlier
The Outlier Is “A”
Just The Stats
4-Week View
739 Server Terminations
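The slides do not name the outlier-detection technique in use. As a stand-in, the "Do I Have an Outlier?" decision can be sketched with a modified z-score based on the median absolute deviation (MAD). The server names, the latency values, and the 3.5 threshold are assumptions made for this sketch.

```python
from statistics import median


def find_outliers(value_by_server, threshold=3.5):
    """Return servers whose metric sits far from the cluster median.

    Uses the modified z-score 0.6745 * |x - median| / MAD.
    """
    values = list(value_by_server.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the servers report the same value, so the score
        # cannot be scaled; report nothing rather than guess.
        return []
    return [
        server
        for server, value in value_by_server.items()
        if 0.6745 * abs(value - med) / mad > threshold
    ]


if __name__ == "__main__":
    # Hypothetical per-server latencies in milliseconds.
    latency_ms = {"A": 250.0, "B": 101.0, "C": 99.0, "D": 100.0, "E": 102.0, "F": 98.0}
    print(find_outliers(latency_ms))  # prints ['A']
```

The sketch stops at the Decide step. Acting on the result, for example by terminating the flagged server, would be a separate automated step.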
In a Nutshell
Observe
Orient
Decide
Act
Need This First
http://bit.ly/nflx-atlas-2013
http://metrics20.org
Understand the decision
http://bit.ly/nflx-qcon-aca-2014
Make it easier for humans
Make machines do it
Higher speed
Lower effort
Higher reliability
Questions, Attributions, Feedback
@royrapoport
rsr@netflix.com
linkedin.com/in/royrapoport

Operational Insight: Concepts and Examples (w/o Presenter Notes)