Using Kubernetes to increase developer velocity (without sacrificing quality)
Adam Schepis
Architect @ CloudHealth Technologies
About Me
● At CloudHealth I build things in the cloud that help our customers to confidently build things in the cloud.
● I love working on distributed systems with high scalability requirements.
● I have met Spiderman.
@aschepis
Our Challenges
Growth
Our Challenges
Maturing Market
Our Challenges
Innovation in the Cloud
● AWS - 100+ feature announcements in 2018
● Azure - 13 announcements (chunkier)
● GCP - Next '18 in July (more than 100 announcements last year)
● Started our Kubernetes journey in early 2017
● Running a number of production workloads
● Kubernetes is a key component of platform overhaul in 2018
CloudHealth and Kubernetes
7 © 2018 CLOUDHEALTH® TECHNOLOGIES INC.
Our Stack
Helm
● Service Lifecycle
● Values templates
● Trivial rollback
● Canary deployments were a bit tricky (would love suggestions)
Linkerd
● Service discovery
● Circuit breakers
● Metrics
● Distributed tracing (via Zipkin)
Romana
● CNI
● Cloud Native
● Works well with AWS
● Not a full mesh network
Primary Clusters
Development
● ~50 Nodes
● Namespace per developer
● Devs given free rein within their namespace.
● Collaboration via linkerd
Test/Staging
● ~20 Nodes
● Stable version of each service
● Namespace per pull request
● Integration tests w/ new code + stable services
Production
● ~50 Nodes
● Namespace per service group
● Tight restrictions on access
● Distributed tracing
Development Environments
● ch CLI tool
● Light wrapper around setup, dev, and service lifecycle
● simplifies and accelerates dev workflow
● commands
○ ch init
○ ch new service <foo>
○ ch build
○ ch deploy
○ ch run (build + deploy)
○ ch supervisor
Consistency drives both velocity and quality
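The slides do not show how `ch` is implemented (the notes say only that it is a Go binary), so the following is a rough Python sketch of the one relationship the slide does state: `ch run` is `build` followed by `deploy`. The `STEPS` table and the `plan` function are hypothetical names used purely for illustration.

```python
import argparse

# Hypothetical mapping from subcommand to the ordered steps it performs.
# Only "run = build + deploy" comes from the slide; the rest is illustrative.
STEPS = {
    "build": ["build"],
    "deploy": ["deploy"],
    "run": ["build", "deploy"],
}


def plan(argv):
    """Return the ordered list of steps a given subcommand would execute."""
    parser = argparse.ArgumentParser(prog="ch")
    parser.add_argument("command", choices=sorted(STEPS))
    args = parser.parse_args(argv)
    return STEPS[args.command]


print(plan(["run"]))  # ['build', 'deploy']
```

Keeping the composite command as a plain list of the primitive steps is one way a thin wrapper could stay transparent: a developer can always run the same steps one at a time.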
Development Environments
● Service endpoints are always in helm chart values
○ Injected as environment variables
○ Can target any namespace: https://auth/graphql for my own namespace, or https://auth.david/graphql for David's namespace
● Test locally by using the linkerd endpoint as http_proxy to reach remote services in the desired namespace
● Namespace for each developer
● Collaborate without deploying the world.
Collaborating in a shared dev cluster
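The addressing scheme on this slide can be sketched roughly as follows. The helper names (`service_url`, `endpoint_from_env`, `opener_via_proxy`), the `AUTH_URL` variable and the proxy address are assumptions made for illustration; the slides only state that endpoints live in helm values, arrive as environment variables, and that a linkerd endpoint can act as `http_proxy`. The sketch registers the proxy for plain `http` only.

```python
import os
import urllib.request
from urllib.parse import urlunsplit


def service_url(service, path, namespace=None, scheme="https"):
    """Build a service URL. A bare service name targets the caller's own
    namespace; "<service>.<namespace>" targets someone else's."""
    host = service if namespace is None else f"{service}.{namespace}"
    return urlunsplit((scheme, host, path, "", ""))


def endpoint_from_env(var_name, default):
    """Read an endpoint injected as an environment variable.
    The variable name is hypothetical; a chart would define the real one."""
    return os.environ.get(var_name, default)


def opener_via_proxy(proxy_url):
    """Return a urllib opener that sends http requests through a proxy,
    e.g. a linkerd endpoint reachable from a developer laptop (assumed)."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


print(service_url("auth", "/graphql"))           # https://auth/graphql
print(service_url("auth", "/graphql", "david"))  # https://auth.david/graphql
```

A local process would then call something like `opener_via_proxy("http://127.0.0.1:4140").open(url)`, where the address is a placeholder for wherever the cluster's linkerd is exposed.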
Our Build Pipeline
Pull Request
Human Gate
Staging: PR namespace
Prod: Canary
Prod: Svc Group
- Unit
- Integration
- Pact (Contracts)
- Javadoc
- RDoc
- etc.
Published to S3.
When things go wrong
● Each PR has its own namespace to deploy service into
● Integration tests run against stable versions of the services the PR depends on
● When a failure happens, the dev can:
○ Access resources in the namespace through linkerd with a custom header
○ Shell into a pod to check it out
○ Look at logs
○ Exercise failing service manually
○ Access UI of failing service (if one exists)
Failed Builds
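A rough sketch of what debugging a failed build in its own namespace could look like. The `pr-<number>` naming scheme and the helper names are hypothetical. The slide does not name the header; this sketch assumes linkerd 1.x's `l5d-dtab` per-request routing override and the Kubernetes namer path layout, so treat the exact string as an assumption to verify against your linkerd configuration.

```python
def pr_namespace(pr_number):
    """Hypothetical naming scheme for a per-pull-request namespace."""
    return f"pr-{pr_number}"


def dtab_override_header(service, namespace, port_name="http"):
    """Build a header that asks linkerd to route one request for `service`
    to the copy running in `namespace` (assumes linkerd 1.x l5d-dtab)."""
    dtab = f"/svc/{service} => /#/io.l5d.k8s/{namespace}/{port_name}/{service}"
    return {"l5d-dtab": dtab}


def debug_commands(namespace, pod):
    """kubectl invocations matching the slide's bullets, as argv lists.
    Nothing is executed here; pass one to subprocess.run to use it."""
    return {
        "logs": ["kubectl", "logs", pod, "-n", namespace],
        "shell": ["kubectl", "exec", "-it", pod, "-n", namespace, "--", "sh"],
    }


ns = pr_namespace(1234)  # 1234 is a placeholder PR number
print(dtab_override_header("auth", ns))
print(debug_commands(ns, "auth-0")["logs"])
```

Because the failing build keeps running in its namespace, the same header can be attached to a manual request to exercise the failing service directly.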
When things go wrong
● Human gates between canary and full prod deploy
● Canaries can be validated by
○ verifying that the canary is serving production requests
○ looking at the error reporting service
○ using linkerd headers to route a request to the canary
○ tailing logs
○ comparing performance and application metrics against the current production code
● Can temporarily scale down to 0.
● Rollback with helm is fast and trivial.
Bad Canaries
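The metric comparison behind a canary decision might be sketched like this. The `Metrics` fields, the thresholds and the verdict strings are all assumptions for illustration; the slides say only that canary metrics are compared with current production and that a human gate sits before the full deploy, so the verdict here is advisory input to that gate.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    requests: int
    errors: int
    p99_latency_ms: float

    @property
    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0


def canary_verdict(canary, baseline, max_error_delta=0.01, max_latency_ratio=1.2):
    """Compare canary metrics with current production (thresholds assumed)."""
    if canary.requests == 0:
        return "no-traffic"  # canary is not serving production requests yet
    if canary.error_rate - baseline.error_rate > max_error_delta:
        return "rollback"
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return "rollback"
    return "promote"


def remediation_commands(release, revision, deployment, namespace):
    """The two escape hatches from the slide, as argv lists (not executed)."""
    return {
        "scale_to_zero": ["kubectl", "scale", f"deployment/{deployment}",
                          "--replicas=0", "-n", namespace],
        "rollback": ["helm", "rollback", release, str(revision)],
    }


production = Metrics(requests=10000, errors=20, p99_latency_ms=180.0)
print(canary_verdict(Metrics(500, 30, 190.0), production))  # rollback
print(canary_verdict(Metrics(500, 1, 185.0), production))   # promote
```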
● Consistency in tooling == velocity and quality
● Shared dev cluster
○ collaboration, shared understanding
● Namespace/deploy per PR
○ Velocity - faster to diagnose and fix test failures
○ Quality - easier to reach root cause via live debugging
● Canary Builds
○ Quickly detect bad deploys without heavy impact to customers
○ Confidence in deploys post-canary
In Summary
Thank you! Questions?
Adam Schepis
@aschepis


Editor's notes

  1. I'm Adam, architect at CloudHealth. What gets me excited in the morning is building systems (often distributed) with high scalability requirements.
  2. We have grown (a lot!): 30 -> 260 in 3 years, eng 10 -> 70. Code "grew organically" with us. More devs + big, complex platform + tribal knowledge = a drag on velocity.
  3. Market has matured. QUALITY! Our customers aren't early adopters any more. No tolerance for product or data quality issues.
  4. Innovation in the Cloud. VELOCITY! 100+ announcements in 6 weeks of 2018. Azure - 13 very chunky announcements. Hybrid cloud/datacenters: enterprise customers ask for this; cloud + datacenter will exist for the foreseeable future in large enterprises. International growth: Alibaba, supporting many currencies.
  5. Decided on k8s in early 2017. Evaluated ECS, Mesos, Docker Swarm. We run production workloads for background analytics and batch jobs, and for serving data in mainline customer requests in the application. Kubernetes is one of the backbones of our platform overhaul strategy in 2018.
  6. We use helm (wrapped in some light custom tooling) for managing service lifecycles. It has worked very well. Canary deployments were a bit painful; I would love to talk to people who have done canaries via helm, or who use helm and do canary deploys. Linkerd for our service mesh: it runs as a daemonset in k8s, so teams don't have to worry about deploying sidecars (and the platform team doesn't have to run around explaining why they should). We get distributed tracing via the zipkin telemeter. Romana CNI: originally used weave but had some issues as the cluster approached 50 nodes; this may have been our inexperience. Romana has been beneficial for us since we are on AWS and it intelligently manages route tables for us, avoiding a limitation imposed by AWS. Like pretty much everyone else, we also use a whole bunch of other technologies for building, delivering, and monitoring services.
  7. Dev cluster: shared by the engineering team. Each engineer has a namespace and they can deploy
  8. Golang, built for macOS and Linux. Light wrapper: enough to make things faster, not so heavy that you can't see under the covers. ch init: sets up the dev env and dev tools (kubectl, helm, ...), minikube (not by default anymore). Self-provision access to the development/staging cluster through Google auth.
  9. Adding service http_proxy. No native support on Node. 😢
  10. Reasons for failures: unit test failure, integration test failure, performance regressions, contract validation failures. What can a dev do? Because the failing build still lives in a namespace, a dev can inspect the running service, perform tests, etc.