JUST EAT: embracing DevOps
Or: How we make a Windows-based ecommerce platform work
(with AWS)
@petemounce & @justeat_tech
JUST EAT: Who are we?
● In business since 2001 in DK, 2005 in UK
● Tech team is ~50 people in UK, ~20 people in Ukraine
● Cloud native in AWS
○ Except for the bits that aren’t (yet)
● Very predictable load
● ~900 orders/minute at peak in UK
● We’re recruiting!
○ http://tech.just-eat.com/jobs/
○ http://tech.just-eat.com/jobs/senior-software-engineer-platform-services/
○ Lots of other roles
JUST EAT: Who are we?
Oh, yeah - we do online takeaway.
We’re an extra sales channel for our restaurant partners.
We do the online part.
Challenging!
We make this work.
On Windows.
What are we?
We do high-volume ecommerce.
Windows platform.
Most production code is C#, .NET 4 or 4.5.
Most automation is Ruby 1.9.x, with some PowerShell.
Ongoing legacy transformation; no big rewrites.
Splitting up a monolithic system into SOA/APIs, incrementally.
Architecture, before AWS
Data centre life, pre 2013
Physical hardware
Snowflake servers - no configuration management tooling
Manual deployments, done by operations team
No real time monitoring - SQL queries only
Monolithic applications, not much fast-running test coverage
… But at least we had source control and decent continuous
integration! (since 2010)
Architecture, post AWS migration
Estate & High Availability by default
At peak, we run ~500-600 EC2 instances
We migrated from the single data centre in DK, to eu-west-1.
We run everything multi-AZ, auto-scaling by default.
(Almost).
Delivery pipeline
Very standard. Nothing to see here.
Multi-tenant.
Tenants are isolated against bad-neighbour issues; individually
scalable.
This basically means our tools take a tenant parameter as well
as an environment parameter.
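
A minimal sketch of what "tools take a tenant parameter as well as an environment parameter" could look like. The tenant names, environment names and the stack naming scheme are assumptions for illustration; the talk does not specify them.

```python
import argparse

# Hypothetical values; the real tenant and environment names are not given in the talk.
TENANTS = ("uk", "dk")
ENVIRONMENTS = ("qa", "staging", "production")


def build_parser():
    """Every tool accepts the same two isolation parameters."""
    parser = argparse.ArgumentParser(description="Deploy a platform feature")
    parser.add_argument("feature")
    parser.add_argument("--tenant", required=True, choices=TENANTS)
    parser.add_argument("--environment", required=True, choices=ENVIRONMENTS)
    return parser


def stack_name(feature, tenant, environment):
    """Derive one isolated stack name per (environment, tenant, feature)."""
    return "-".join((environment, tenant, feature))


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["publicweb", "--tenant", "uk", "--environment", "qa"]
)
print(stack_name(args.feature, args.tenant, args.environment))
```

Because the tenant is part of the name, each tenant gets its own stack and can be scaled without touching its neighbours.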
Tech organisation structure
We stole from AWS - “two-pizza teams”
(we understand metrics couched in terms of food)
We have a team each for
● consumer web app
● consumer native apps (one iOS, one Android)
● restaurant apps
● business-support apps
● APIs (actually, four teams in one unit)
● PaaS
○ responsible for internal services; monitoring/alerting/logs
○ systems automation
Tech culture
“You ship it, you operate it”
Each team owns their own features, infrastructure-up.
Minimise dependencies between teams.
Each team has autonomy to work on what they want within
some constraints.
Rules:
● don’t break backwards compatibility
● use what you want - but operate it yourself
● other teams must be able to launch & verify your stuff in
their environments
But how?
Table-stakes for this to work (well):
1. Persistent group chat
2. Real-time monitoring
3. Real-time alerting
4. Centralised logging
Make it easier to debug in production without a debugger.
Persistent group chat
We use HipChat.
You could use IRC / Campfire / Hangouts.
● Persistent - jump in, read up
● Searchable history
● Integrate other tools to it
● hubot for fun and profit
○ @jebot trg pd emergency with msg “we’re out of champagne in the
office fridge”
Real-time monitoring
Microsoft’s SCOM requires an AD
Publish OS-level performance counters with perftap, a Windows
analogue of collectd that we found and customised
Receive metrics into statsd
Visualise time-series data with graphite
○ 10s granularity retained for 13 months
○ AWS’ CloudWatch gives you 1min / 2 weeks
Addictive!
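
To illustrate the "receive metrics into statsd" step, here is a small sketch of the statsd line format (`<name>:<value>|<type>`) sent over UDP. The metric names are invented for the example, and this is not how perftap itself is implemented.

```python
import socket


def format_metric(name, value, metric_type):
    """Render one metric in the statsd line format: <name>:<value>|<type>."""
    return "{}:{}|{}".format(name, value, metric_type)


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget one metric line over UDP (8125 is the usual statsd port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


# Hypothetical metric names: a counter ("c") and a timer in milliseconds ("ms").
order_placed = format_metric("publicweb.orders.placed", 1, "c")
checkout_time = format_metric("publicweb.checkout.response", 320, "ms")
print(order_placed)
print(checkout_time)
```

Calling `send_metric(order_placed)` would emit the counter to a local statsd; UDP keeps the application from blocking if the metrics pipeline is down.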
Real-time alerting
This is the 21st century; emailing someone their server is down
doesn’t cut it.
seyren runs our checks.
Publishes to
● HipChat
● PagerDuty
● SMS
● statsd event metrics (coming soon, hopefully)
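
The idea behind a threshold check can be sketched as below. This is an illustration of warn/error thresholds and channel routing, not seyren's actual code; the routing policy and the numbers are assumptions.

```python
def evaluate(value, warn, error):
    """Classify a metric value against warn and error thresholds.

    If error >= warn, higher values are worse (for example, response time).
    Otherwise lower values are worse (for example, orders per minute dropping).
    """
    if error >= warn:
        if value >= error:
            return "ERROR"
        if value >= warn:
            return "WARN"
    else:
        if value <= error:
            return "ERROR"
        if value <= warn:
            return "WARN"
    return "OK"


def route(state):
    """Pick notification channels for a state (hypothetical routing policy)."""
    if state == "ERROR":
        return ["hipchat", "pagerduty", "sms"]
    if state == "WARN":
        return ["hipchat"]
    return []


# Orders per minute falling to 40 when 100 is the warn level and 50 the error level.
state = evaluate(40, warn=100, error=50)
print(state, route(state))
```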
Centralised logging
Windows doesn’t have syslog.
Out of the box EventLog isn’t quite it.
Publish logs via nxlog agent.
Receive logs into logstash cluster.
Filter, transform and enrich into elasticsearch cluster.
Query, visualise and dashboard via kibana.
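
A rough sketch of the kind of structured, enriched event that makes logs searchable once they reach elasticsearch. The field names other than `@timestamp` (the Logstash convention) are assumptions, and the enrichment here happens in the application rather than in logstash filters.

```python
import json
import socket
from datetime import datetime, timezone


def build_event(message, level, feature, tenant, environment):
    """Build one log event carrying the context needed to filter it later."""
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "message": message,
        "level": level,
        "feature": feature,          # hypothetical field names from here down
        "tenant": tenant,
        "environment": environment,
        "host": socket.gethostname(),
    }


def serialise(event):
    """One JSON document per line is easy for a log shipper to forward."""
    return json.dumps(event, sort_keys=True)


print(serialise(build_event("order accepted", "INFO", "publicweb", "uk", "qa")))
```

With tenant, environment and feature on every event, a kibana dashboard can slice by any of them without parsing message text.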
Without these things, operating a distributed system on
Windows is hard.
Windows at scale assumes that you have an Active Directory.
We don’t.
● No Windows network load-balancing.
● No centrally trusted authentication.
● No central monitoring (SCOM) to harvest performance
counters.
● No easy remote command execution (WinRM wants an AD,
too)
● Other stuff; these are the highlights.
Open source & build vs buy
We treat Microsoft as just another third party vendor
dependency.
We lean on open-source libraries and tools a lot.
Anatomy of a feature
We decompose the platform into its component parts
Imaginatively, we call these “platform features”
For example
● consumer web app == publicweb
● back office tools == handle, guard
● etc
Platform features
Features are defined by AWS CloudFormation.
● Everything is pull-deployment, from S3.
● No state is kept (for long) on the instance itself.
● No external actor can tell an instance to do something,
beyond what the feature itself allows.
Instances boot, and then bootstrap themselves from content in
S3 based on CloudFormation::Init metadata
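
As a sketch of pull-deployment, the `sources` section of `AWS::CloudFormation::Init` metadata tells cfn-init to download an archive and unpack it into a directory at boot. The bucket name, key layout and target directory are assumptions; a private bucket would also need an authentication section, which is omitted here.

```python
import json


def package_url(bucket, feature, version):
    """Build the S3 URL of a feature package (hypothetical key layout)."""
    return "https://{}.s3.amazonaws.com/{}/{}.zip".format(bucket, feature, version)


def init_metadata(bucket, feature, version, target_dir):
    """Metadata that makes cfn-init pull and unpack the package on boot."""
    return {
        "AWS::CloudFormation::Init": {
            "config": {
                # `sources` maps a target directory to an archive URL.
                "sources": {target_dir: package_url(bucket, feature, version)}
            }
        }
    }


metadata = init_metadata("example-bucket", "publicweb", "1.0.0", "C:\\deploy")
print(json.dumps(metadata, indent=2))
```

Nothing pushes to the instance: it reads its own metadata and fetches what it needs.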
Platform feature: Servers
We have several “baseline” AMIs.
These have required system dependencies like .NET
framework, ruby, 7-zip, etc.
Periodically we update them for OS-level patches, and roll out
new baseline AMIs. We deprecate the older AMIs.
Platform feature: Infrastructure
Defined by CloudFormation. Each one stands up everything
that feature needs to run, excluding cross-cutting
dependencies (like DNS, firewall rules).
Mostly standard:
● ELB
● AutoScaling Group + Launch Configuration
● IAM as necessary
● … anything else required by the feature
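
A skeleton of the "mostly standard" resources, written as a Python dictionary that serialises to a CloudFormation JSON template. The logical names, ports, AMI ID and sizes are placeholders, and the real templates will contain far more (IAM, health checks, scaling policies).

```python
import json


def feature_template(ami_id, instance_type, min_size, max_size):
    """Return a minimal ELB + Launch Configuration + Auto Scaling Group template."""
    all_zones = {"Fn::GetAZs": ""}  # every AZ in the stack's region (multi-AZ)
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "LoadBalancer": {
                "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
                "Properties": {
                    "AvailabilityZones": all_zones,
                    "Listeners": [
                        {"LoadBalancerPort": "80", "InstancePort": "80", "Protocol": "HTTP"}
                    ],
                },
            },
            "LaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {"ImageId": ami_id, "InstanceType": instance_type},
            },
            "ScalingGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "AvailabilityZones": all_zones,
                    "LaunchConfigurationName": {"Ref": "LaunchConfig"},
                    "LoadBalancerNames": [{"Ref": "LoadBalancer"}],
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                },
            },
        },
    }


# Placeholder AMI ID and sizes, for illustration only.
print(json.dumps(feature_template("ami-12345678", "m3.medium", 2, 6), indent=2))
```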
Platform feature: code package
● A standardised package containing
○ built code (website, service, combinations)
○ configuration + deltas to run any tenant/environment
○ automation to deploy the feature
● CloudFormation::Init has a configSet to
○ unzip
○ install automation dependencies
○ execute the deployment automation
○ warm up the feature, post-install
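
The four configSet steps above could be sketched as follows. A configSet runs its named configs in the listed order. The command lines (`bundle install`, `rake deploy`, `rake warmup`) are guesses based on the automation being Ruby; the talk does not show the real ones.

```python
import json


def deployment_init(package_url, install_dir):
    """Build Init metadata whose configSet runs the four steps in order."""
    return {
        "AWS::CloudFormation::Init": {
            "configSets": {
                "deploy": ["unzip", "dependencies", "install", "warmup"]
            },
            # cfn-init downloads the archive and unpacks it into install_dir.
            "unzip": {"sources": {install_dir: package_url}},
            # Hypothetical commands from here down.
            "dependencies": {
                "commands": {"01-bundle": {"command": "bundle install", "cwd": install_dir}}
            },
            "install": {
                "commands": {"01-deploy": {"command": "rake deploy", "cwd": install_dir}}
            },
            "warmup": {
                "commands": {"01-warmup": {"command": "rake warmup", "cwd": install_dir}}
            },
        }
    }


metadata = deployment_init(
    "https://example-bucket.s3.amazonaws.com/publicweb/1.0.0.zip", "C:\\deploy"
)
print(json.dumps(metadata, indent=2))
```

Keeping the deployment automation inside the package means any team can launch the feature in its own environment without knowing how it installs.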
What have we gained?
Instances are disposable and short lived.
● Enables “shoot it in the head” debugging
● Disks no longer ever fill up
● Minimal environmental differences
● New environment == mostly automated
● Infrastructure as code == testable, repeatable - and we do!
Culture again: On-call
Teams are on-call for their features.
Decide own rota; coverage minimums for peak-time
But: teams (must!) have autonomy to improve their features so
they don’t get called as often.
Otherwise, constant fire-fighting
Things still break!
Page me once, shame on you.
Page me twice, shame on me.
Teams do root-cause analysis of the problems that triggered
incidents.
… An operations team / NOC does not.
Warn call-centre proactively
Take action proactively
Automate mitigation steps!
Feature toggles: not just for launching new stuff.
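
A small sketch of using a feature toggle as a mitigation step rather than a launch switch. The toggle name, the in-memory store and the search function are all hypothetical.

```python
class Toggles:
    """In-memory toggle store (hypothetical; a real one would read shared config)."""

    def __init__(self, defaults):
        self._state = dict(defaults)

    def is_enabled(self, name):
        # Unknown toggles default to off.
        return self._state.get(name, False)

    def set(self, name, enabled):
        self._state[name] = enabled


def search_restaurants(toggles, postcode):
    """Serve a cheaper degraded response when the expensive path is switched off."""
    if toggles.is_enabled("personalised_ranking"):
        return "ranked results for " + postcode
    return "default-order results for " + postcode


toggles = Toggles({"personalised_ranking": True})
print(search_restaurants(toggles, "EC1"))
toggles.set("personalised_ranking", False)  # mitigation step during an incident
print(search_restaurants(toggles, "EC1"))
```

Flipping the toggle sheds load from a struggling dependency without a deployment, which is the kind of mitigation that can then be automated.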
The role of our PaaS team
Enablement.
● Run monitoring & alerting
● Run centralised logging
● Run deployment service
● Apply security updates
Why not Azure / OpenStack et al?
Decision to migrate to AWS made in late 2011.
AWS was more mature than alternatives at the time. It offered
many hosted services on top of the IaaS offering.
Still is, even accounting for Azure’s recent advances.
The future
Immutable/golden instances; faster provisioning.
Failover to secondary region (we operate in CA).
Always: more test coverage, more confidence.
Publish some of our tools as OSS
https://github.com/justeat
The most important things
● Culture
● Principles that everyone lives by
● Devolve autonomy down to people on the ground
● (Tools)
Did we mention we’re hiring?
We’re pragmatic.
We’re successful.
We support each other.
We use sharp tools that we pick ourselves based on merit.
Join us!
○ http://tech.just-eat.com/jobs/
○ http://tech.just-eat.com/jobs/senior-software-engineer-platform-services/
○ Lots of other roles
Any questions?

%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 

JUST EAT: Embracing DevOps

  • 1. JUST EAT: embracing DevOps Or: How we make a Windows-based ecommerce platform work (with AWS) @petemounce & @justeat_tech
  • 2. JUST EAT: Who are we? ● In business since 2001 in DK, 2005 in UK ● Tech team is ~50 people in UK, ~20 people in Ukraine ● Cloud native in AWS ○ Except for the bits that aren’t (yet) ● Very predictable load ● ~900 orders/minute at peak in UK ● We’re recruiting! ○ http://tech.just-eat.com/jobs/ ○ http://tech.just-eat.com/jobs/senior-software-engineer-platform-services/ ○ Lots of other roles
  • 3. JUST EAT: Who are we? Oh, yeah - we do online takeaway. We’re an extra sales channel for our restaurant partners. We do the online part. Challenging! We make this work. On Windows.
  • 4. What are we? We do high-volume ecommerce. Windows platform. Most production code is C#, .NET 4 or 4.5. Most automation is ruby 1.9.x. Some powershell. Ongoing legacy transformation; no big rewrites. Splitting up a monolithic system into SOA/APIs, incrementally.
  • 6. Data centre life, pre 2013 Physical hardware. Snowflake servers - no configuration management tooling. Manual deployments, done by operations team. No real time monitoring - SQL queries only. Monolithic applications, not much fast-running test coverage. … But at least we had source control and decent continuous integration! (since 2010)
  • 8. Estate & High Availability by default At peak, we run ~500-600 EC2 instances. We migrated from the single data centre in DK to eu-west-1. We run everything multi-AZ, auto-scaling by default. (Almost).
  • 9. Delivery pipeline Very standard. Nothing to see here. Multi-tenant. Tenants are isolated against bad-neighbour issues; individually scalable. This basically means our tools take a tenant parameter as well as an environment parameter.
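A minimal sketch of what "our tools take a tenant parameter as well as an environment parameter" could look like for a command-line tool. The flag names, the example values (`uk`, `qa1`) and the `stack_name` naming scheme are assumptions for illustration, not JUST EAT's actual tooling.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical deployment tool."""
    parser = argparse.ArgumentParser(description="Deploy a platform feature (illustrative)")
    parser.add_argument("--tenant", required=True, help="tenant to target, e.g. uk")
    parser.add_argument("--environment", required=True, help="environment to target, e.g. qa1")
    parser.add_argument("feature", help="platform feature name, e.g. publicweb")
    return parser


def stack_name(tenant, environment, feature):
    """Compose a per-tenant, per-environment name (assumed naming scheme)."""
    return f"{tenant}-{environment}-{feature}"


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args(["--tenant", "uk", "--environment", "qa1", "publicweb"])
print(stack_name(args.tenant, args.environment, args.feature))
```

Making both parameters required means every invocation says which tenant and which environment it acts on, which is one way to keep tenants isolated at the tooling level.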
  • 10. Tech organisation structure We stole from AWS - “two-pizza teams” (we understand metrics couched in terms of food) We have a team each for ● consumer web app ● consumer native apps (one iOS, one Android) ● restaurant apps ● business-support apps ● APIs (actually, four teams in one unit) ● PaaS ○ responsible for internal services; monitoring/alerting/logs ○ systems automation
  • 11. Tech culture “You ship it, you operate it” Each team owns their own features, infrastructure-up. Minimise dependencies between teams. Each team has autonomy to work on what they want within some constraints. Rules: ● don’t break backwards compatibility ● use what you want - but operate it yourself ● other teams must be able to launch & verify your stuff in their environments
  • 12. But how? Table-stakes for this to work (well): 1. Persistent group chat 2. Real-time monitoring 3. Real-time alerting 4. Centralised logging Make it easier to debug in production without a debugger.
  • 13. Persistent group chat We use HipChat. You could use IRC / Campfire / Hangouts. ● Persistent - jump in, read up ● Searchable history ● Integrate other tools to it ● hubot for fun and profit ○ @jebot trg pd emergency with msg “we’re out of champagne in the office fridge”
  • 14. Real-time monitoring Microsoft’s SCOM requires an AD. Publish OS-level performance counters with perftap - a Windows analogue of collectd that we found and customised. Receive metrics into statsd. Visualise time-series data with graphite: ○ 10s granularity retained for 13 months ○ AWS’ CloudWatch gives you 1min / 2 weeks. Addictive!
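As a rough illustration of "receive metrics into statsd": statsd accepts plain-text lines of the form `name:value|type` over UDP, with port 8125 as its conventional default. The sketch below formats and sends such a line; the metric name and the host are placeholders, and this is not how perftap itself is implemented.

```python
import socket


def format_metric(name, value, metric_type):
    """Format one statsd line; metric_type is "c" (counter), "g" (gauge) or "ms" (timer)."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line to a statsd daemon over UDP (fire-and-forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric name; a real one would follow whatever naming scheme the team uses.
line = format_metric("uk.publicweb.cpu_percent", 42, "g")
print(line)
```

Calling `send_metric(line)` would emit the datagram. UDP is used so that a slow or absent metrics server never blocks the application being measured.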
  • 15. Real-time alerting This is the 21st century; emailing someone their server is down doesn’t cut it. seyren runs our checks. Publishes to ● HipChat ● PagerDuty ● SMS ● statsd event metrics (coming soon, hopefully)
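The idea of a check that publishes to several channels can be sketched as a threshold classification followed by routing. This is a generic illustration, not seyren's implementation: the thresholds, the "higher is worse" assumption and the state-to-channel mapping are all hypothetical.

```python
def classify(value, warn, error):
    """Classify a metric where higher values are worse (an assumption of this sketch)."""
    if value >= error:
        return "ERROR"
    if value >= warn:
        return "WARN"
    return "OK"


# Hypothetical routing: warnings go to chat only, errors also page someone.
ROUTES = {
    "OK": [],
    "WARN": ["hipchat"],
    "ERROR": ["hipchat", "pagerduty", "sms"],
}


def channels_for(value, warn, error):
    """Return the channels that should be notified for this reading."""
    return ROUTES[classify(value, warn, error)]


print(channels_for(95, warn=80, error=90))
```

Keeping the routing in data rather than in code makes it easy for each team to decide which states are worth waking someone up for.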
  • 16. Centralised logging Windows doesn’t have syslog. Out of the box EventLog isn’t quite it. Publish logs via nxlog agent. Receive logs into logstash cluster. Filter, transform and enrich into elasticsearch cluster. Query, visualise and dashboard via kibana.
  • 17. Without these things, operating a distributed system on Windows is hard. Windows at scale assumes that you have an Active Directory. We don’t. ● No Windows network load-balancing. ● No centrally trusted authentication. ● No central monitoring (SCOM) to harvest performance counters. ● No easy remote command execution (WinRM wants an AD, too) ● Other stuff; these are the highlights.
  • 18. Open source & build vs buy We treat Microsoft as just another third party vendor dependency. We lean on open-source libraries and tools a lot.
  • 19. Anatomy of a feature We decompose the platform into its component parts. Imaginatively, we call these “platform features”. For example: ● consumer web app == publicweb ● back office tools == handle, guard ● etc
  • 20. Platform features Features are defined by AWS CloudFormation. ● Everything is pull-deployment, from S3. ● No state is kept (for long) on the instance itself. ● No external actor can tell an instance to do something, beyond what the feature itself allows. Instances boot, and then bootstrap themselves from content in S3 based on CloudFormation::Init metadata
  • 21. Platform feature: Servers We have several “baseline” AMIs. These have required system dependencies like .NET framework, ruby, 7-zip, etc. Periodically we update them for OS-level patches, and roll out new baseline AMIs. We deprecate the older AMIs.
  • 22. Platform feature: Infrastructure Defined by CloudFormation. Each one stands up everything that feature needs to run, excluding cross-cutting dependencies (like DNS, firewall rules). Mostly standard: ● ELB ● AutoScaling Group + Launch Configuration ● IAM as necessary ● … anything else required by the feature
  • 24. Platform feature: code package ● A standardised package containing ○ built code (website, service, combinations) ○ configuration + deltas to run any tenant/environment ○ automation to deploy the feature ● CloudFormation::Init has a configSet to ○ unzip ○ install automation dependencies ○ execute the deployment automation ○ warm up the feature, post-install
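The four configSet steps listed above (unzip, install automation dependencies, execute the deployment automation, warm up) could be expressed in `AWS::CloudFormation::Init` metadata roughly as follows. The sketch builds the metadata as a Python dictionary and prints it as JSON. The config names, file paths and commands are placeholders invented for illustration; only the `configSets` / `commands` / `cwd` structure comes from CloudFormation itself.

```python
import json

# Order in which cfn-init would run the configs of the "default" configSet.
STEPS = ["unzip", "install_dependencies", "deploy", "warm_up"]

metadata = {
    "AWS::CloudFormation::Init": {
        "configSets": {"default": STEPS},
        # Every command, path and script name below is a hypothetical placeholder.
        "unzip": {
            "commands": {
                "01_unzip": {"command": "7z x C:\\package\\feature.zip -oC:\\feature -y"}
            }
        },
        "install_dependencies": {
            "commands": {
                "01_bundle": {"command": "bundle install", "cwd": "C:\\feature\\automation"}
            }
        },
        "deploy": {
            "commands": {
                "01_deploy": {"command": "ruby deploy.rb", "cwd": "C:\\feature\\automation"}
            }
        },
        "warm_up": {
            "commands": {
                "01_warm_up": {"command": "ruby warm_up.rb", "cwd": "C:\\feature\\automation"}
            }
        },
    }
}

print(json.dumps(metadata, indent=2))
```

Because the instance reads this metadata and runs the steps itself, the deployment stays pull-based: nothing outside the instance has to push code onto it.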
  • 25. What have we gained? Instances are disposable and short lived. ● Enables “shoot it in the head” debugging ● Disks no longer ever fill up ● Minimal environmental differences ● New environment == mostly automated ● Infrastructure as code == testable, repeatable - and we do!
  • 26. Culture again: On-call Teams are on-call for their features. Decide own rota; coverage minimums for peak-time. But: teams (must!) have autonomy to improve their features so they don’t get called as often. Otherwise, constant fire-fighting.
  • 27. Things still break! Page me once, shame on you. Page me twice, shame on me. Teams do root-cause analysis of incidents that triggered incidents. … An operations team / NOC does not. Warn call-centre proactively. Take action proactively. Automate mitigation steps! Feature toggles: not just for launching new stuff.
  • 28. The role of our PaaS team Enablement. ● Run monitoring & alerting ● Run centralised logging ● Run deployment service ● Apply security updates
  • 29. Why not Azure / OpenStack et al? Decision to migrate to AWS made in late 2011. AWS was more mature than alternatives at the time. It offered many hosted services on top of the IaaS offering. Still is, even accounting for Azure’s recent advances.
  • 30. The future Immutable/golden instances; faster provisioning. Failover to secondary region (we operate in CA). Always: more test coverage, more confidence. Publish some of our tools as OSS https://github.com/justeat
  • 31. The most important things ● Culture ● Principles that everyone lives by ● Devolve autonomy down to people on the ground ● (Tools)
  • 32. Did we mention we’re hiring? We’re pragmatic. We’re successful. We support each other. We use sharp tools that we pick ourselves based on merit. Join us! ○ http://tech.just-eat.com/jobs/ ○ http://tech.just-eat.com/jobs/senior-software-engineer-platform-services/ ○ Lots of other roles