Josh Evans - Director of Operations Engineering
November 16, 2015
Beyond DevOps:
How Netflix Bridges the Gap
Technical Debt
• Java 6
• Perforce
• Single Master Jenkins
• Ant
• CentOS
• Asgard/Mimir
Fall 2013
How do we drive broad-based change?
The Paved Road
• Java 7
• Stash
• Jenkins Shards
• Gradle
• Ubuntu
Some said
• You’re overloading us
• Too many projects
• Poor targeting
Others said
• What took you so long?
• We’ve moved on
• Now we need to migrate
That’s great but…
We’re paying a high tax
• Expectations gap
– Division of labor
– Timing of solutions
– Leadership
• Affects
– Reputation
– Relationships
– Lost opportunities
Organizational Debt
How do we bridge the gap?
“Remember that TIME is money…”
Time is a form of currency
• Product Engineering
• Operations Engineering
• Challenges & Strategies
Our time today…
Product Innovation
winning moments of truth
• Every facet of the product
• 1,400 A/B tests in the last year & accelerating
Continuous Innovation
But wait, there’s more…
Build It
• design
• code
• build
• bake
• test
• deploy
Run It
• configure
• monitor
• triage
• fix
…at scale, globally
You build it, you run it
Internet
• 1000s of starts per second
• 100,000s of requests per second
• 100,000,000 hours of content / day
• 3 AWS Regions, 3 AZs per region
Relentless product innovation
Building & running micro-services at scale, globally
• Product Engineering
• Operations Engineering
• Challenges & Strategies
Our time today…
DevOps is a software development method that
emphasizes the roles of both software developers and
other information-technology (IT) professionals with an
emphasis on IT Operations.
- Wikipedia
The Gap
Why? How?
Quality Velocity
Operational Excellence
Operational Excellence is the continuous improvement of
the management, design, and function of operational
environments to achieve greater quality, velocity, and
competitive advantage.
• Engineering Tools
• Insight & Real-time Analytics
• Performance & Reliability
Operations Engineering is the application of software
engineering practices to achieve and sustain operational
excellence.
Operations Engineering
• Service provider
• Operational excellence driver
• Cross-cutting solutions
• Undifferentiated heavy lifting
• Product Engineering
• Operations Engineering
• Challenges & Strategies
Our time today…
• You’re overloading us
• What took you so long?
Remember that feedback?
• We made assumptions
– Requirements – what & when
– Time for non-product work
• Move from assumptions to knowledge
• Effect change without imposing a tax?
• Achieve and sustain operational excellence?
How do we…
Time is a form of currency
5 strategies for success
in time-based economies
software & organizational engineering
1. Reach out
• What are your biggest operational pain points?
• How can we help?
• How well are we meeting your needs today?
• What would you like to see from us in the future?
Listen
Shower, rinse, repeat
Talk to your engineering customers
Grease the Squeaky Wheels
• low tolerance for tax
• more vocal than most
• High impact solutions
• Clarity on deliverables
• Lower operational tax
• Leadership, innovation, and partnership
What they wanted
• Deliver on solutions
• Better road map definition & communication
• A more aggressive stance on automation
• Deeper investment into leadership, innovation, planning
Our commitments
2. Make an impact
• Apply what you’ve learned
• Deliver what matters
• global cloud console
• end to end delivery
• automation platform
• velocity with confidence
Pipelines - Automated Global Delivery
3. Make it easy to do the right thing
• Engineering time is scarce
• We must do more heavy lifting
Supply & Demand
• Spinnaker manual step
• Automated migrations – Mimir
Provide on-ramps
Automate proven practices
• Alerting and Monitoring
• Apache & Tomcat Hardening
• Automated Canary Analysis
• Autoscaling
• Chaos Participation
• Consistent Naming
• ELB Configuration
• Healthcheck Configured
• Red-Black Pipeline
• Squeeze Testing
• Timeout & Fallback Tuning
• Workload Reliability
Production Ready?
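The checklist above can be pictured as data that tooling evaluates per service. The following is a minimal sketch under stated assumptions, not a description of Netflix's actual tooling: the `ServiceStatus` type, its methods, and the sample service name are hypothetical, and only the check names come from the slide.

```python
from dataclasses import dataclass, field

# Checklist items exactly as listed on the slide.
PRODUCTION_READY_CHECKS = (
    "Alerting and Monitoring",
    "Apache & Tomcat Hardening",
    "Automated Canary Analysis",
    "Autoscaling",
    "Chaos Participation",
    "Consistent Naming",
    "ELB Configuration",
    "Healthcheck Configured",
    "Red-Black Pipeline",
    "Squeeze Testing",
    "Timeout & Fallback Tuning",
    "Workload Reliability",
)


@dataclass
class ServiceStatus:
    """Hypothetical record of which checks a service currently passes."""
    name: str
    passed: set = field(default_factory=set)

    def missing(self):
        # Keep the slide's ordering so reports read consistently.
        return [c for c in PRODUCTION_READY_CHECKS if c not in self.passed]

    def score(self):
        # Fraction of checklist items satisfied, between 0.0 and 1.0.
        return 1 - len(self.missing()) / len(PRODUCTION_READY_CHECKS)

    def is_production_ready(self):
        return not self.missing()


if __name__ == "__main__":
    # "example-service" is an invented name used only for illustration.
    svc = ServiceStatus("example-service", {"Autoscaling", "Consistent Naming"})
    print(svc.is_production_ready(), svc.missing()[:3])
```

Reporting the missing items, rather than a bare yes/no, fits the "make it easy to do the right thing" theme: a team sees what is left to do.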
[Diagram: Customers → Load Balancer → Old Version (v1.0), 100 servers, 95% of traffic | New Version (v1.1), 5 servers, 5% of traffic; Metrics]
Canaries
[Diagram: Customers → Load Balancer → Old Version (v1.0), 0 servers | New Version (v1.1), 100 servers, 100% of traffic; Metrics]
Canaries
Define
• Metrics
• A threshold
Every n minutes
• Classify metrics
• Compute score
• Make a decision
Automated Canary Analysis
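The classify, score, decide loop above can be sketched as follows. This is an illustration only: the slides do not say how metrics are classified, so the relative-tolerance rule, the 80-point threshold, the function names, and the sample numbers are all assumptions. Every metric is treated as "lower is better".

```python
from statistics import mean


def classify_metric(baseline, canary, tolerance=0.10):
    """Label a metric "pass" if the canary mean is within `tolerance`
    (relative) of the baseline mean, else "fail".

    Assumption: lower values are better (error rate, latency, CPU).
    """
    return "pass" if mean(canary) <= mean(baseline) * (1 + tolerance) else "fail"


def canary_score(samples, tolerance=0.10):
    """Return (percent of metrics passing, per-metric labels).

    `samples` maps metric name -> (baseline_values, canary_values).
    """
    if not samples:
        raise ValueError("at least one metric is required")
    labels = {
        name: classify_metric(base, can, tolerance)
        for name, (base, can) in samples.items()
    }
    passed = sum(1 for label in labels.values() if label == "pass")
    return 100.0 * passed / len(labels), labels


def decide(score, threshold=80.0):
    """Continue the rollout only if the score meets the threshold."""
    return "continue" if score >= threshold else "rollback"


if __name__ == "__main__":
    # Invented sample window; a scheduler would call this every n minutes.
    window = {
        "error_rate": ([0.010, 0.012, 0.011], [0.011, 0.010, 0.012]),
        "latency_ms": ([120, 125, 118], [180, 175, 190]),
        "cpu_pct": ([40, 42, 41], [43, 44, 42]),
    }
    score, labels = canary_score(window)
    print(round(score, 1), labels, decide(score))
```

In this sample window the latency regression fails one of three metrics, so the score falls below the threshold and the decision is to roll back.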
Canary Analysis
Performance
Integration Tests
Chaos
Conformity
Static
Unit Tests
Make it easy to do the right thing
Static & Functional Testing
4. Reduce the cost of change
• Ongoing migrations
• Library propagation
• 100s of micro-services
• Complex dependencies
Continuous, Broad-based Change
Change Engineering
• Locate
• Communicate
• Facilitate
• Automated forensics
– Who last touched x?
– What team?
– Who was their manager?
Who owns this artifact, repository, service?
Whitepages
• Workday wrapper
• App & REST API
• Organization hierarchy
• Metadata
• Change log
(###) ###-####
Krieger
• REST-based service
• Sources
– Whitepages
– Stash
– Edda
– Jenkins
– Spinnaker
– Etc…
{
  "content": {},
  "_links": {
    "employees": {
      "href": "/api/employees/"
    },
    "projects": {
      "href": "/api/projects/"
    },
    "teams": {
      "href": "/api/teams/"
    },
    "applications": {
      "href": "/api/applications/"
    },
    "jobs": {
      "href": "/api/build/jobs"
    },
    "masters": {
      "href": "/api/build/masters"
    },
    "projectDistribution": {
      "href": "/api/teams/projectDistribution"
    }
  }
}
/api/employees?q=jevans
{
  "employees": [
    {
      "id": "241",
      "firstName": "Josh",
      "lastName": "Evans",
      "username": "jevans",
      "email": "jevans@netflix.com",
      "jobTitle": "Director of Operations Engineering",
      "isManager": true,
      "isCurrent": true,
      "title": "Josh Evans (jevans) - Operations Engineering",
      "_links": {
        "self": {
          "href": "/api/employees/241"
        },
        "manager": {
          "href": "/api/employees/117890"
        },
        "team": {
          "href": "/api/teams/f9134a81"
        },
        "projects": {
          "href": "/api/teams/f9134a81/projects"
        }
      }
    }
  ]
}
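The `_links` entries in these responses let a client move from an employee to their team and manager without hard-coding URLs. The sketch below shows that idea against trimmed copies of the payloads from the slides, held in memory. The helper names `link` and `ownership_trail` are hypothetical, and no real Krieger client or HTTP call is shown or implied.

```python
import json

# Trimmed copies of the sample payloads from the slides. A real client
# would fetch these over HTTP; that part is deliberately left out.
ROOT = json.loads("""
{"content": {},
 "_links": {"employees": {"href": "/api/employees/"},
            "teams": {"href": "/api/teams/"},
            "jobs": {"href": "/api/build/jobs"}}}
""")

SEARCH = json.loads("""
{"employees": [
  {"id": "241", "username": "jevans", "isManager": true,
   "_links": {"self": {"href": "/api/employees/241"},
              "manager": {"href": "/api/employees/117890"},
              "team": {"href": "/api/teams/f9134a81"},
              "projects": {"href": "/api/teams/f9134a81/projects"}}}
]}
""")


def link(resource, rel):
    """Return the href for a named relation, or None if it is absent."""
    return resource.get("_links", {}).get(rel, {}).get("href")


def ownership_trail(search_response, username):
    """Collect the hrefs that answer "who is this, what team, and who is
    their manager?" for one username; None if the user is not found."""
    for employee in search_response.get("employees", []):
        if employee.get("username") == username:
            return {rel: link(employee, rel) for rel in ("self", "team", "manager")}
    return None


if __name__ == "__main__":
    print(link(ROOT, "jobs"))
    print(ownership_trail(SEARCH, "jevans"))
```

Following relations by name keeps the client working if paths change, as long as the relation names stay stable.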
• Security vulnerabilities
– Who owns this service?
• Platform updates
– Who is using this version of this library?
Today – Targeted Coordination
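A question such as "who is using this version of this library?" amounts to a reverse lookup over dependency data. The following is a minimal sketch, assuming each service's dependencies are already available as a mapping. The service names, versions, and function names are invented for illustration and do not describe how Netflix stores or queries this data.

```python
from collections import defaultdict


def build_usage_index(service_deps):
    """Invert {service: {library: version}} into
    {(library, version): [services...]}, with services sorted by name."""
    index = defaultdict(list)
    for service, deps in service_deps.items():
        for library, version in deps.items():
            index[(library, version)].append(service)
    return {key: sorted(services) for key, services in index.items()}


def who_uses(index, library, version):
    """Return the services pinned to exactly this library version."""
    return index.get((library, version), [])


if __name__ == "__main__":
    # Hypothetical inventory; a real one would come from build metadata.
    inventory = {
        "service-a": {"guava": "14.0", "jackson": "2.4"},
        "service-b": {"guava": "18.0"},
        "service-c": {"guava": "14.0"},
    }
    index = build_usage_index(inventory)
    print(who_uses(index, "guava", "14.0"))
```

Combined with an ownership lookup, the resulting list of services is what a targeted notification to the affected teams would start from.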
Automated, efficient technical project management
• Communication
• Guidance
• Tracking
Low tax for TPMs & engineers
Security Fix Guava
Future – Change Campaigns
5. Develop Partnerships
Beyond supply & demand
• Nearing completion
• Aggressive schedule
• Unexpected delays
• Commitment to June delivery
Spinnaker 1.0 – 1H 2015
• Built their own continuous delivery solution
• Not positioned for engineering-wide support
• Believes in common solutions
Edge Engineering
Partnership in Action
• Strong relationship
• Open discussions about concerns
• Decision - leaned forward
• +2 engineers on Spinnaker
• Successful 1.0 launch
Moving Forward Together
• Containers?
• Achieving alignment
• Collaborative exploration
– Edge, Platform, Operations
– A new paved road?
• Paved Road adopted
– Adding new ones
• Production Ready ongoing
• Migrations easier
• Reputation improving
• Improved
– Service uptime
– Rate of change
Payoffs
Putting it to the test in 2016
• Streaming production & test - EC2 Classic to VPC
• Highly cross-functional
• Complex dependencies
• Zero downtime
Stay tuned…
Five Strategies
1. Reach out
2. Make an impact
3. Make it easy to do the right thing
4. Reduce the cost of change
5. Develop partnerships
Open Sourced!
https://netflix.github.io/
Josh Evans
jevans@netflix.com
@ops_engineering
Questions?

Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 

Beyond DevOps - How Netflix Bridges the Gap

  • 1. Josh Evans - Director of Operations Engineering November 16, 2015 Beyond DevOps: How Netflix Bridges the Gap
  • 2. Technical Debt • Java 6 • Perforce • Single Master Jenkins • Ant • CentOS • Asgard/Mimir Fall 2013
  • 3. How do we drive broad-based change?
  • 4. The Paved Road • Java 7 • Stash • Jenkins Shards • Gradle • Ubuntu
  • 5. Some said • You’re overloading us • Too many projects • Poor targeting Others said • What took you so long? • We’ve moved on • Now we need to migrate That’s great but… We’re paying a high tax
  • 6. • Expectations gap – Division of labor – Timing of solutions – Leadership • Affects – Reputation – Relationships – Lost opportunities Organizational Debt
  • 7. How do we bridge the gap?
  • 8. “Remember that TIME is money…”
  • 9. Time is a form of currency
  • 10. • Product Engineering • Operations Engineering • Challenges & Strategies Our time today…
  • 11. • Product Engineering • Operations Engineering • Challenges & Strategies Our time today…
  • 15. ● Every facet of the product ● 1400 AB tests in the last year & accelerating Continuous Innovation
  • 17. Build It • design • code • build • bake • test • deploy Run It • configure • monitor • triage • fix …at scale, globally You build it, you run it
  • 18. Internet • 1000s of starts per second • 100,000s of requests per second • 100,000,000 hours of content / day • 3 AWS Regions, 3 AZs per region
  • 19. Relentless product innovation Building & running micro- services at scale, globally
  • 20. • Product Engineering • Operations Engineering • Challenges & Strategies Our time today…
  • 21. DevOps is a software development method that emphasizes the roles of both software developers and other information-technology (IT) professionals with an emphasis on IT Operations. - Wikipedia The Gap
  • 24. Operational Excellence is the continuous improvement of the management, design, and function of operational environments to achieve greater quality, velocity, and competitive advantage.
  • 25. • Engineering Tools • Insight & Real-time Analytics • Performance & Reliability Operations Engineering is the application of software engineering practices to achieve and sustain operational excellence.
  • 26. Operations Engineering • Service provider • Operational excellence driver • Cross-cutting solutions • Undifferentiated heavy lifting
  • 27. • Product Engineering • Operations Engineering • Challenges & Strategies Our time today…
  • 28. • You’re overloading us • What took you so long? Remember that feedback? • We made assumptions – Requirements – what & when – Time for non-product work
  • 29. • Move from assumptions to knowledge • Affect change without imposing a tax? • Achieve and sustain operational excellence? How do we…
  • 30. Time is a form of currency
  • 31. 5 strategies for success in time-based economies software & organizational engineering
  • 33. • What are your biggest operational pain points? • How can we help? • How well are we meeting your needs today? • What would you like to see from us in the future? Listen Shower, rinse, repeat Talk to your engineering customers
  • 34. Grease the Squeaky Wheels • low tolerance for tax • more vocal than most
  • 35. • High impact solutions • Clarity on deliverables • Lower operational tax • Leadership, innovation, and partnership What they wanted
  • 36. • Deliver on solutions • Better road map definition & communication • A more aggressive stance on automation • Deeper investment into leadership, innovation, planning Our commitments
  • 37. 2. Make an impact • Apply what you’ve learned • Deliver what matters
  • 38. • global cloud console • end to end delivery • automation platform • velocity with confidence
  • 40. Pipelines - Automated Global Delivery
  • 42. 3. Make it easy to do the right thing
  • 43. • Engineering time is scarce • We must do more heavy lifting Supply & Demand
  • 44. • Spinnaker manual step • Automated migrations – Mimir Provide on-ramps
  • 46. • Alerting and Monitoring • Apache & Tomcat Hardening • Automated Canary Analysis • Autoscaling • Chaos Participation • Consistent Naming • ELB Configuration • Healthcheck Configured • Red-Black Pipeline • Squeeze Testing • Timeout & Fallback Tuning • Workload Reliability Production Ready?
  • 47. • Alerting and Monitoring • Apache & Tomcat Hardening • Automated Canary Analysis • Autoscaling • Chaos Participation • Consistent Naming • ELB Configuration • Healthcheck Configured • Red-Black Pipeline • Squeeze Testing • Timeout & Fallback Tuning • Workload Reliability Production Ready?
  • 48. Old Version (v1.0) New Version (v1.1) Load Balancer Customers 100 Servers 5 Servers 95% 5% Metrics Canaries
  • 49. Old Version (v1.0) New Version (v1.1) Load Balancer Customers 0 Servers 100 Servers 100% Metrics Canaries
  • 50. Define • Metrics • A threshold Every n minutes ● Classify metrics ● Compute score ● Make a decision Automated Canary Analysis
  • 51. Canary Analysis Performance Integration Tests Chaos Conformity Static Unit Tests Make it easy to do the right thing Static & Functional Testing
  • 52. 4. Reduce the cost of change
  • 53. • Ongoing migrations • Library propagation • 100s of micro-services • Complex dependencies Continuous, Broad-based Change
  • 54. Change Engineering • Locate • Communicate • Facilitate
  • 55. • Automated forensics – Who last touched x? – What team? – Who was their manager? Who owns this artifact, repository, service?
  • 56. Whitepages • Workday wrapper • App & REST API • Organization hierarchy • Metadata • Change log (###) ###-####
  • 57. Krieger • REST-based service • Sources – Whitepages – Stash – Edda – Jenkins – Spinnaker – Etc… { "content": {}, "_links": { "employees": { "href": "/api/employees/" }, "projects": { "href": "/api/projects/" }, "teams": { "href": "/api/teams/" }, "applications": { "href": "/api/applications/" }, "jobs": { "href": "/api/build/jobs" }, "masters": { "href": "/api/build/masters" }, "projectDistribution": { "href": "/api/teams/projectDistribution" } } }
  • 58. /api/employees?q=jevans { "employees": [ { "id": "241", "firstName": "Josh", "lastName": "Evans", "username": "jevans", "email": "jevans@netflix.com", "jobTitle": "Director of Operations Engineering", "isManager": true, "isCurrent": true, "title": "Josh Evans (jevans) - Operations Engineering", "_links": { "self": { "href": "/api/employees/241" }, "manager": { "href": "/api/employees/117890" }, "team": { "href": "/api/teams/f9134a81" }, "projects": { "href": "/api/teams/f9134a81/projects" } } } ] }
  • 59. • Security vulnerabilities – Who owns this service? • Platform updates – Who is using this version of this library? Today – Targeted Coordination
  • 60. Automated, efficient technical project management • Communication • Guidance • Tracking Low tax for TPMs & engineers Security Fix Guava Future – Change Campaigns
  • 62. • Nearing completion • Aggressive schedule • Unexpected delays • Commitment to June delivery Spinnaker 1.0 – 1H 2015
  • 63. • Built their own continuous delivery solution • Not positioned for engineering-wide support • Believes in common solutions Edge Engineering
  • 64. Partnership in Action • Strong relationship • Open discussions about concerns • Decision - leaned forward • +2 engineers on Spinnaker • Successful 1.0 launch
  • 65. Moving Forward Together • Containers? • Achieving alignment • Collaborative exploration – Edge, Platform, Operations – A new paved road?
  • 66. • Paved Road adopted – Adding new ones • Production Ready ongoing • Migrations easier • Reputation improving • Improved – Service uptime – Rate of change Payoffs
  • 67. Putting it to the test in 2016 • Streaming production & test - EC2 Classic to VPC • Highly cross-functional • Complex dependencies • Zero downtime Stay tuned…
  • 68. Five Strategies 1. Reach out 2. Make an impact 3. Make it easy to do the right thing 4. Reduce the cost of change 5. Develop partnerships
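
The Automated Canary Analysis slide (define metrics and a threshold, then every n minutes classify metrics, compute a score, and make a decision) can be illustrated with a minimal sketch. The slides do not say how metrics are classified or scored, so everything here is an assumption for illustration: the relative-difference rule, the 10% tolerance, the percentage score, and the function names `classify`, `canary_score`, and `decide` are all hypothetical and are not Netflix's implementation.

```python
from statistics import mean


def classify(baseline, canary, tolerance=0.10):
    """Label one metric 'pass' if the canary mean stays within a relative
    tolerance of the baseline mean (assumed rule, not from the slides)."""
    base_avg = mean(baseline)
    canary_avg = mean(canary)
    if base_avg == 0:
        return "pass" if canary_avg == 0 else "fail"
    drift = abs(canary_avg - base_avg) / abs(base_avg)
    return "pass" if drift <= tolerance else "fail"


def canary_score(baseline_metrics, canary_metrics, tolerance=0.10):
    """Score = percentage of metrics classified as 'pass'."""
    labels = [
        classify(baseline_metrics[name], canary_metrics[name], tolerance)
        for name in baseline_metrics
    ]
    return 100.0 * labels.count("pass") / len(labels)


def decide(score, threshold=95.0):
    """Compare the score with the configured threshold."""
    return "continue" if score >= threshold else "rollback"


# Hypothetical samples collected during one analysis interval.
baseline = {"latency_ms": [100, 102, 98], "error_rate": [0.01, 0.01, 0.01]}
canary = {"latency_ms": [101, 99, 103], "error_rate": [0.05, 0.04, 0.06]}

score = canary_score(baseline, canary)
decision = decide(score)
```

In a pipeline, a loop like this would run once per interval while the canary servers take their small share of traffic, and the decision would gate the rest of the rollout.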

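The Krieger responses on slides 57 and 58 expose related resources through `_links` entries, so the question "who owns this?" becomes a matter of following links. The sketch below shows that traversal against canned responses instead of real HTTP calls. It rests on several assumptions: only the employee search path and its `_links` values come from the slides, while the team record, the `fetch`, `follow`, and `team_of` helpers, and the idea that a team resource is returned as a plain object are hypothetical.

```python
# Canned responses standing in for HTTP GETs. The employee record is trimmed
# from slide 58; the team record is a placeholder invented for this sketch.
FAKE_API = {
    "/api/employees?q=jevans": {
        "employees": [
            {
                "id": "241",
                "username": "jevans",
                "_links": {
                    "self": {"href": "/api/employees/241"},
                    "manager": {"href": "/api/employees/117890"},
                    "team": {"href": "/api/teams/f9134a81"},
                },
            }
        ]
    },
    "/api/teams/f9134a81": {"name": "placeholder-team"},
}


def fetch(href):
    """Look up a canned response; a real client would issue an HTTP GET."""
    return FAKE_API[href]


def follow(resource, rel, fetch=fetch):
    """Return the resource behind the named link relation."""
    return fetch(resource["_links"][rel]["href"])


def team_of(username, fetch=fetch):
    """Search for an employee, then follow the 'team' link of the first match."""
    matches = fetch(f"/api/employees?q={username}")["employees"]
    if not matches:
        return None
    return follow(matches[0], "team", fetch)


team = team_of("jevans")
```

Because callers navigate by link relation rather than by building URLs, the same `follow` helper would work for `manager` or `projects` without knowing the path layout.
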
Editor's notes

  1. Java 6 – we needed to move forward on Java but struggled to drive adoption. Perforce – many teams were moving to Git, and we had no story for supporting Perforce in the cloud. Jenkins – long queues & build times. Ant – long build times, inefficient dependency management. CentOS – slow delivery of new kernel and userland binaries. Asgard served us well as a deployment & cloud management tool. Mimir gave us a great prototype and we learned a lot. Tech debt kept us from doing our jobs well.
  2. Does this sound familiar? Have any of you been on one side or the other of this situation?
  3. To move forward we defined the concept of the paved road. The paved road promises a well-supported, integrated developer experience. Java 7 – just to move forward, with Java 8 already on the horizon. Git – organically adopted by many teams. Gradle – build time reduced thanks to efficient dependency management. Ubuntu – more frequent, well-vetted userland binaries & kernels. Jenkins shards – to fix long build times. We started building Spinnaker, our next-generation cloud console & continuous delivery platform. We staffed up and went for it – big bang.
  4. Read to the audience: He that can earn ten shillings a day by his labour, and goes abroad, or sits idle one half of that day, tho' he spends but sixpence during his diversion or idleness, ought not to reckon that the only expense; he has really spent or rather thrown away five shillings besides. - Advice to a Young Tradesman
  5. Please raise your hand if you know which puritanical workaholic wrote this. In addition to the obvious intent behind this, there is a more profound message. Time spent working is related to the money you make, but time is also in and of itself a form of currency. It’s the exchange or giving of time that drives the economics of an engineering organization.
  6. Netflix has a freedom & responsibility culture. “You build it, you run it” aligns perfectly with our values around autonomy & ownership.
  7. This leads to a high-pressure situation that creates a shortage of time.
  8. Read the definition out loud. Out of curiosity – who agrees with this definition? Who disagrees? Not only is there disagreement, but the general construct isn’t really that helpful.
  9. It doesn’t address how to bridge the gap or why it matters to do so. What are the strategies for success? It’s the practices, tools, and culture. Motivation: the reason for doing DevOps is to achieve operational excellence.
  10. We do the undifferentiated heavy lifting for our customers. This means we take on the operationally oriented common engineering work across teams so that each team can focus on their core charter.
  11. We do the undifferentiated heavy lifting for our customers. This means we take on the operationally oriented common engineering work across teams so that each team can focus on their core charter.
  12. Going back to our Ben Franklin quote – time is a form of currency. In our engineering world time really is currency. We don’t pay each other to do work. We commit time to projects. In other words we have a time-based economy.
  13. Audience – can anyone name one of the strategies?
  14. Stop spamming us!
  15. Audience – can anyone name one of the strategies? A free chaos monkey for good ones
  17. There are several approaches that you might take to solve for this problem. I’ll explore each one.
  18. And once you’ve proven that you can deliver you have some money in the bank. You have earned a seat at the table. Now you’re ready to build strong partnerships.