Custom High Availability of Kubernetes
Mike Splain | @mikesplain
Our requirements for running K8s
• Fast recovery without human intervention
• Nodes are ephemeral
• Autoscaling
• Testable on developer machines
• K8s as an artifact
Technology choices for running K8s
Hasn’t someone else already built this?
K8s official scripts
• Simple bash scripts
• “Just Works!”
• Sets up Autoscaling group for
• Uses Salt
• No etcd HA
• No Master HA
• Salt Master is coupled with K8s
• Ubuntu / Fedora
CoreOS’s official scripts
• Cool go app to start it!
• Or CloudFormation?
• But what now?
• No etcd HA
• No Master HA
• Lots of magic
“It can’t be that hard, right?”
• Easy.
• “Just works”
• Cluster discovery:
• Discovery Service
• ?
advertise-client-urls: "http://$public_ipv4:2379"
initial-advertise-peer-urls: "http://$private_ipv4:2380"
listen-client-urls: ","
listen-peer-urls: "http://$private_ipv4:2380,http://$private_ipv4:7001"
discovery-token: "<token here>"
- name: etcd2.service
command: start
reboot-strategy: none
• etcd-aws-cluster
• Uses Autoscaling groups for
• Requires IAM Instance Roles
advertise-client-urls: "http://$public_ipv4:2379"
initial-advertise-peer-urls: "http://$private_ipv4:2380"
listen-client-urls: ","
listen-peer-urls: "http://$private_ipv4:2380,http://$private_ipv4:7001"
- name: etcd2.service
command: stop
- name: etcd-peers.service
command: start
content: |
Description=Write a file with the etcd peers that we should bootstrap to
ExecStartPre=/usr/bin/docker pull
ExecStartPre=/usr/bin/docker run --rm=true -v /etc/sysconfig/:/etc/
ExecStart=/usr/bin/systemctl start etcd2
- path: /etc/systemd/system/etcd2.service.d/30-etcd_peers.conf
permissions: 0644
content: |
# Load the other hosts in the etcd leader autoscaling group from file
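As a rough illustration of the peer-discovery step, the sketch below builds the kind of ETCD_INITIAL_CLUSTER value that etcd2 needs from a list of group members. The function name build_initial_cluster, the instance IDs, and the IP addresses are hypothetical; only the peer port 2380 comes from the config above, and this is not the actual etcd-aws-cluster code.

```bash
#!/usr/bin/env bash
# Sketch only: join "name=ip" pairs into an etcd initial-cluster string.
build_initial_cluster() {
  local cluster="" pair name ip
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first "="
    ip="${pair#*=}"      # text after the first "="
    # Prefix a comma only when the string already has a member in it.
    cluster="${cluster:+${cluster},}${name}=http://${ip}:2380"
  done
  printf '%s\n' "$cluster"
}

# Hypothetical members of the etcd autoscaling group.
echo "ETCD_INITIAL_CLUSTER=$(build_initial_cluster i-aaa=10.0.0.5 i-bbb=10.0.0.6)"
```

A real implementation would look the members up through the autoscaling API, which is why the instances need an IAM role.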
Terraform to launch etcd
• References static cloud-init files
resource "aws_launch_configuration" "terraform_etcd" {
  name_prefix = "${var.environment}_etcd_conf-"
  image_id = "${var.coreos_ami}"
  instance_type = "t2.small"
  key_name = "${var.key_name}"
  security_groups = ["$
  user_data = "${file("../cloud-config/output/etcd.yml")}"
  enable_monitoring = true
  ebs_optimized = false
  iam_instance_profile = "$
  root_block_device {
    volume_size = 20
  }
  lifecycle {
    create_before_destroy = true
  }
}
resource "aws_autoscaling_group" "terraform_etcd" {
  name = "${var.environment}_etcd"
  launch_configuration = "$
  availability_zones = ["us-east-1c"]
  max_size = "${var.capacities_etcd_max}"
  min_size = "${var.capacities_etcd_min}"
  health_check_grace_period = 300
  desired_capacity = "${var.capacities_etcd_desired}"
  vpc_zone_identifier = ["${}"]
  force_delete = true
  tag {
    key = "Name"
    value = "${var.environment}_etcd"
    propagate_at_launch = true
  }
  lifecycle {
    create_before_destroy = true
  }
}
Master Node
• Kubelet service
• API Server
• Replication Controller Manager
• Scheduler
• Podmaster
• Proxy
• Flannel
Consider Master pods as an additional artifact
• Docker container that takes env
• Outputs templated pods to disk for Kubelet to load
• We use j2cli to template these files with little overhead
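To make the templating step concrete, here is a minimal stand-in for what j2cli does with placeholders such as {{ ETCD_ENDPOINTS }}: it reads a template on stdin and fills each placeholder from the environment. This is a sketch using awk, not j2cli itself; the function name render_template and the sample values are hypothetical, and it handles only plain "{{ NAME }}" placeholders, none of the other Jinja2 syntax.

```bash
#!/usr/bin/env bash
# Sketch only: replace every "{{ NAME }}" on stdin with the value of the
# exported environment variable NAME (unset variables render as empty).
render_template() {
  awk '{
    while (match($0, /[{][{] [A-Za-z_][A-Za-z0-9_]* [}][}]/)) {
      # Strip the leading "{{ " and trailing " }}" (3 characters each).
      name = substr($0, RSTART + 3, RLENGTH - 6)
      $0 = substr($0, 1, RSTART - 1) ENVIRON[name] substr($0, RSTART + RLENGTH)
    }
    print
  }'
}

# Hypothetical values; in the talk these come from the container environment.
export ETCD_ENDPOINTS="http://10.0.0.5:2379"
export ADVERTISE_IP="10.0.0.9"
printf '%s\n' '- --etcd-servers={{ ETCD_ENDPOINTS }}' '- --whoami={{ ADVERTISE_IP }}' | render_template
```

Keeping the inputs as environment variables is what lets the same image render manifests for any cluster.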
apiVersion: v1
kind: Pod
name: kube-podmaster
namespace: kube-system
hostNetwork: true
- name: scheduler-elector
imagePullPolicy: Always
- /podmaster
- --etcd-servers={{ ETCD_ENDPOINTS }}
- --key=scheduler
- --whoami={{ ADVERTISE_IP }}
- --source-file=/src/manifests/kube-scheduler.yaml
- --dest-file=/dst/manifests/kube-scheduler.yaml
- mountPath: /src/manifests
name: manifest-src
readOnly: true
- mountPath: /dst/manifests
name: manifest-dst
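The podmaster's job, as we understand it, can be pictured like this: whichever master currently holds the scheduler key in etcd gets the manifest copied into the directory the Kubelet watches, and every other master has it removed. The sketch below models only that copy-or-remove decision. The function name sync_manifest is hypothetical, and the etcd lock is replaced by a plain argument naming the current holder.

```bash
#!/usr/bin/env bash
# Sketch only: mirror the podmaster's copy-or-remove behaviour for one manifest.
#   $1 = current holder of the key, $2 = this node's identity (--whoami)
#   $3 = source manifest (--source-file), $4 = destination (--dest-file)
sync_manifest() {
  local leader="$1" whoami="$2" src="$3" dst="$4"
  if [ "$leader" = "$whoami" ]; then
    cp "$src" "$dst"   # we hold the lock: let the Kubelet run the pod
  else
    rm -f "$dst"       # another master holds it: make sure we do not run it
  fi
}
```

Because the Kubelet starts and stops pods based on the files in its manifest directory, moving one file is enough to move the scheduler between masters.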
• Set up box for services needed for real Docker to run
• Get etcd server IPs and write to file
• Start flannel with those IPs
- name: etcd.service
command: stop
- name: etcd2.service
command: stop
- name: early-docker.service
command: start
- name: kub_get_etcd.service
command: start
content: |
Description=Write K8s etcd urls to disk.
ExecStart=/usr/bin/sh -c "/usr/bin/docker pull"
ExecStart=/usr/bin/sh -c "/usr/bin/docker run --net=host -v /etc/barkly/:/etc/barkly/ {{ KUB_ETCD_ASG }} > /etc/etcd_servers.env"
- name: flanneld.service
command: start
- name: 10-environment_vars.conf
content: |
ExecStartPre=/usr/bin/sh -c "/usr/bin/echo -n FLANNELD_ETCD_ENDPOINTS= > /etc/flannel_etcd_servers.env"
ExecStartPre=/usr/bin/sh -c "/usr/bin/cat /etc/etcd_servers.env >> /etc/flannel_etcd_servers.env"
ExecStartPre=/usr/bin/sh -c "/usr/bin/echo FLANNELD_IFACE=$private_ipv4 >> /etc/flannel_etcd_servers.env"
ExecStartPre=/usr/bin/ln -sf /etc/flannel_etcd_servers.env /run/
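The three ExecStartPre steps above assemble a small environment file for flannel. The sketch below reproduces them as one function so the resulting file is easier to see. It assumes the etcd servers file holds a single comma-separated line ending in a newline, which the slides do not show. The function name write_flannel_env and the sample addresses are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: build the flannel environment file from the etcd servers file.
#   $1 = file with the etcd URLs, $2 = this node's private IP, $3 = output file
write_flannel_env() {
  local etcd_servers_file="$1" iface="$2" out_file="$3"
  printf 'FLANNELD_ETCD_ENDPOINTS=' > "$out_file"        # key, no newline yet
  cat "$etcd_servers_file" >> "$out_file"                # value completes the line
  printf 'FLANNELD_IFACE=%s\n' "$iface" >> "$out_file"   # interface to bind to
}
```

With an input of http://10.0.0.5:2379,http://10.0.0.6:2379 and an IP of 10.0.0.9, the output file would hold two lines: FLANNELD_ETCD_ENDPOINTS followed by the URL list, then FLANNELD_IFACE=10.0.0.9.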
Other config
• Grab certs from S3
• Terraform only allows permissions to specific files
• Format Master pod files to disk
- name: kub_certs.service
command: start
content: |
Description=Writes kubernetes cluster certs to disk.
ExecStart=/usr/bin/sh -c "/usr/bin/mkdir -p /etc/kubernetes/ssl"
ExecStart=/usr/bin/docker run --net=host -v /etc/kubernetes/ssl:/ssl s3 cp s3://our_k8s_cluster_bucket/ca.pem /ssl
ExecStart=/usr/bin/docker run --net=host -v /etc/kubernetes/ssl:/ssl s3 cp s3://our_k8s_cluster_bucket/apiserver.pem /ssl
ExecStart=/usr/bin/docker run --net=host -v /etc/kubernetes/ssl:/ssl s3 cp s3://our_k8s_cluster_bucket/apiserver-key.pem /ssl
- name: kub_pods.service
command: start
content: |
Description=Writes kubernetes pod files to disk.
ExecStart=/usr/bin/sh -c "/usr/bin/mkdir -p /etc/kubernetes/ssl"
ExecStart=/usr/bin/docker run --net=host -v /etc/barkly/:/etc/barkly/ -e K8S_VERSION='1.1.7' -e CLOUD_PROVIDER='--cloud-provider=aws' -e SERVICE_IP_RANGE="" -e ADVERTISE_IP="$private_ipv4" -e ETCD_AUTOSCALE_GROUP_NAME="our_etcd_autoscaling_group_name" -v /srv/kubernetes/manifests:/output_src -v /etc/kubernetes/manifests:/output_dst
Start Docker & Kubelet
• Kubelet will wait for docker and flannel to be ready
• Kubelet will load manifests for Master services output by the previous container
- name: docker.service
command: start
- name: 40-flannel.conf
content: |
- name: kubelet.service
command: start
content: |
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
Get Kubelet
• /usr/bin & /usr/local/bin are read-only in CoreOS
- name: kubelet.service
command: start
- name: 10-download-binary.conf
content: |
ExecStartPre=/bin/bash -c "/etc/bin/download-k8s-binary kubelet"
# Since systemd needs these files before it will start
- path: /etc/bin/download-k8s-binary
permissions: '0755'
content: |
#!/usr/bin/env bash
export K8S_VERSION="v1.1.8"
# Assumed: the binary name is the first argument (the unit calls "download-k8s-binary kubelet")
FILE=$1
mkdir -p /etc/bin
# Check /etc/bin, where the binary is actually written
if [ ! -f /etc/bin/$FILE ]; then
  curl -sSL -o /etc/bin/$FILE
  chmod +x /etc/bin/$FILE
else
  # we check the version of the binary
  INSTALLED_VERSION=$(/etc/bin/$FILE --version)
  MATCH=$(echo "${INSTALLED_VERSION}" | grep -c "${K8S_VERSION}")
  if [ $MATCH -eq 0 ]; then
    # the version is different
    curl -sSL -o /etc/bin/$FILE
    chmod +x /etc/bin/$FILE
  fi
fi
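The version check in the script can be isolated into a small helper, sketched below: it succeeds when the installed binary's version output does not mention the wanted version, meaning a fresh download is needed. The function name needs_download and the sample version strings are hypothetical; the substring match mirrors the grep used in the script.

```bash
#!/usr/bin/env bash
# Sketch only: decide whether a binary must be (re)downloaded.
#   $1 = output of "<binary> --version", $2 = wanted version string
# Returns 0 (true) when the wanted version is NOT found in the output.
needs_download() {
  local installed_output="$1" wanted="$2"
  if printf '%s\n' "$installed_output" | grep -qF -- "$wanted"; then
    return 1   # versions match: keep the existing binary
  fi
  return 0     # different or unknown version: download again
}

# Illustrative strings, not captured from a real kubelet.
if needs_download "Kubernetes v1.1.7" "v1.1.8"; then
  echo "download required"
fi
```

Matching on a fixed string (grep -F) keeps the dots in the version from being read as regular-expression wildcards.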
Terraform to build
• Similar to before: we reference our cloud-init script
resource "aws_launch_configuration" "terraform_master" {
  name_prefix = "${var.environment}_master_conf-"
  image_id = "${var.coreos_ami}"
  instance_type = "t2.medium"
  key_name = "${var.key_name}"
  security_groups = ["$
  user_data = "${file("../cloud-config/output/master.yml")}"
  enable_monitoring = true
  ebs_optimized = false
  iam_instance_profile = "$
  root_block_device {
    volume_size = 20
  }
  lifecycle {
    create_before_destroy = true
  }
}
resource "aws_autoscaling_group" "terraform_master" {
  name = "${var.environment}_master"
  launch_configuration = "$
  availability_zones = ["us-east-1c"]
  max_size = "${var.capacities_master_max}"
  min_size = "${var.capacities_master_min}"
  health_check_grace_period = 300
  desired_capacity = "${var.capacities_master_desired}"
  vpc_zone_identifier = ["${}"]
  force_delete = true
  load_balancers = ["${}"]
  tag {
    key = "Name"
    value = "${var.environment}_master"
    propagate_at_launch = true
  }
  lifecycle {
    create_before_destroy = true
  }
}
Minion Node (now just Nodes)
• Kubelet Service manages all other services.
• Proxy
• Pods
Early Docker / Flannel
• Same as master!
- name: etcd.service
command: stop
- name: etcd2.service
command: stop
- name: early-docker.service
command: start
- name: kub_get_etcd.service
command: start
content: |
Description=Write K8s etcd urls to disk.
ExecStart=/usr/bin/sh -c "/usr/bin/docker pull"
ExecStart=/usr/bin/sh -c "/usr/bin/docker run --net=host -v /etc/barkly/:/etc/barkly/ {{ KUB_ETCD_ASG }} > /etc/etcd_servers.env"
- name: flanneld.service
command: start
- name: 10-environment_vars.conf
content: |
ExecStartPre=/usr/bin/sh -c "/usr/bin/echo -n FLANNELD_ETCD_ENDPOINTS= > /etc/flannel_etcd_servers.env"
ExecStartPre=/usr/bin/sh -c "/usr/bin/cat /etc/etcd_servers.env >> /etc/flannel_etcd_servers.env"
ExecStartPre=/usr/bin/sh -c "/usr/bin/echo FLANNELD_IFACE=$private_ipv4 >> /etc/flannel_etcd_servers.env"
ExecStartPre=/usr/bin/ln -sf /etc/flannel_etcd_servers.env /run/
• Similar to master
- name: docker.service
command: start
- name: 40-flannel.conf
content: |
- name: kubelet.service
command: start
content: |
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
- name: 10-download-binary.conf
content: |
ExecStartPre=/bin/bash -c "/etc/bin/download-k8s-binary kubelet"
Manifests can be hard coded
• Minion manifests are far less dynamic and can be hard coded as cloud-init files
- path: "/etc/kubernetes/manifests/kube-proxy.yaml"
content: |
apiVersion: v1
kind: Pod
name: kube-proxy
namespace: kube-system
hostNetwork: true
- name: kube-proxy
- /hyperkube
- proxy
- --master=
- --kubeconfig=/etc/kubernetes/worker-kubeconfig-proxy.yaml
- --proxy-mode=iptables
- --v=4
privileged: true
- mountPath: /etc/ssl/certs
name: "ssl-certs"
- mountPath: /etc/kubernetes/worker-kubeconfig-proxy.yaml
name: "kubeconfig"
readOnly: true
- mountPath: /etc/kubernetes/ssl
name: "etc-kube-ssl"
readOnly: true
- name: "ssl-certs"
path: "/usr/share/ca-certificates"
- name: "kubeconfig"
path: "/etc/kubernetes/worker-kubeconfig-proxy.yaml"
- name: "etc-kube-ssl"
path: "/etc/kubernetes/ssl"
Manifests can be hard coded
• Minion manifests are far less dynamic and can be hard coded as cloud-init files
- path: "/etc/kubernetes/worker-kubeconfig-proxy.yaml"
content: |
apiVersion: v1
kind: Config
- name: local
certificate-authority: /etc/kubernetes/ssl/ca.pem
- name: kubelet
client-certificate: /etc/kubernetes/ssl/worker.pem
client-key: /etc/kubernetes/ssl/worker-key.pem
- context:
cluster: local
user: kubelet
name: kubelet-context
current-context: kubelet-context
- path: "/etc/kubernetes/worker-kubeconfig-kubelet.yaml"
content: |
apiVersion: v1
kind: Config
- name: local
certificate-authority: /etc/kubernetes/ssl/ca.pem
- name: kubelet
client-certificate: /etc/kubernetes/ssl/worker.pem
client-key: /etc/kubernetes/ssl/worker-key.pem
- context:
cluster: local
user: kubelet
name: kubelet-context
current-context: kubelet-context

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech

Recently uploaded (20)

DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons

Kubernetes Boston — Custom High Availability of Kubernetes

  • 1. Custom High Availability of Kubernetes Mike Splain | @mikesplain
  • 2. Our requirements for running K8s
      • Fast recovery without human intervention
      • Nodes are ephemeral
      • Autoscaling
      • Testable on developer machines
      • K8s as an artifact
  • 4. Hasn’t someone else already built this?
  • 5. K8s official scripts
      • Simple bash scripts
      • “Just Works!”
      • Sets up Autoscaling group for minions
      • Uses Salt
    Problems?
      • No etcd HA
      • No Master HA
      • Salt Master is coupled with K8s Master
      • Ubuntu / Fedora
  • 6. CoreOS’s official scripts
      • Cool go app to start it!
      • Or Cloud formation?
      • But what now?
    Problems?
      • No etcd HA
      • No Master HA
      • Lots of magic
  • 7. “It can’t be that hard right?”
  • 10. etcd
  • 11. etcd
      • Easy.
      • “Just works”
      • Cluster discovery:
          • Discovery Service
          • DNS
          • ?

        #cloud-config
        coreos:
          etcd2:
            advertise-client-urls: "http://$public_ipv4:2379"
            initial-advertise-peer-urls: "http://$private_ipv4:2380"
            listen-client-urls: ","
            listen-peer-urls: "http://$private_ipv4:2380,http://$private_ipv4:7001"
            discovery-token: "<token here>"
          units:
            - name: etcd2.service
              command: start
          update:
            reboot-strategy: none
  • 12. etcd
      • etcd-aws-cluster
      • Uses Autoscaling groups for discovery
      • Requires IAM Instance Roles

        #cloud-config
        coreos:
          etcd2:
            advertise-client-urls: "http://$public_ipv4:2379"
            initial-advertise-peer-urls: "http://$private_ipv4:2380"
            listen-client-urls: ","
            listen-peer-urls: "http://$private_ipv4:2380,http://$private_ipv4:7001"
          units:
            - name: etcd2.service
              command: stop
            - name: etcd-peers.service
              command: start
              content: |
                [Unit]
                Description=Write a file with the etcd peers that we should bootstrap to
                Requires=docker.service
                After=docker.service

                [Service]
                Restart=on-failure
                RestartSec=10
                TimeoutStartSec=300
                ExecStartPre=/usr/bin/docker pull kubernetes/etcd-aws-cluster:latest
                ExecStartPre=/usr/bin/docker run --rm=true -v /etc/sysconfig/:/etc/sysconfig/
                ExecStart=/usr/bin/systemctl start etcd2
        write_files:
          - path: /etc/systemd/system/etcd2.service.d/30-etcd_peers.conf
            permissions: 0644
            content: |
              [Service]
              # Load the other hosts in the etcd leader autoscaling group from file
              EnvironmentFile=/etc/sysconfig/etcd-peers
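The peers file that slide 12 loads through EnvironmentFile has to tell etcd2 which members to bootstrap with. As a rough illustration only, the shell sketch below assembles a value in the format of etcd's standard ETCD_INITIAL_CLUSTER variable from a list of group members. The function name build_initial_cluster, the instance IDs, and the IP addresses are invented for this example, and it is not the etcd-aws-cluster container's own code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "name=ip" pairs into a comma-separated
# "name=http://ip:2380" list, the shape etcd expects in ETCD_INITIAL_CLUSTER.
build_initial_cluster() {
  local peer_port=2380   # peer port used in the slide's cloud-config
  local out="" pair name ip
  for pair in "$@"; do
    name=${pair%%=*}     # text before the first "="
    ip=${pair#*=}        # text after the first "="
    out+="${out:+,}${name}=http://${ip}:${peer_port}"
  done
  printf '%s\n' "$out"
}

# Example with made-up instance IDs and private IPs.
build_initial_cluster i-0aaa=10.0.1.10 i-0bbb=10.0.1.11 i-0ccc=10.0.1.12
```

In a real setup the member list would come from the Auto Scaling group rather than from arguments, which is why the slide notes that an IAM instance role is required.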
  • 13. Terraform to launch etcd
      • References static cloud-init files

        resource "aws_launch_configuration" "terraform_etcd" {
          name_prefix = "${var.environment}_etcd_conf-"
          image_id = "${var.coreos_ami}"
          instance_type = "t2.small"
          key_name = "${var.key_name}"
          security_groups = ["$ {}"]
          user_data = "${file("../cloud-config/output/etcd.yml")}"
          enable_monitoring = true
          ebs_optimized = false
          iam_instance_profile = "$ {}"

          root_block_device {
            volume_size = 20
          }

          lifecycle {
            create_before_destroy = true
          }
        }

        resource "aws_autoscaling_group" "terraform_etcd" {
          name = "${var.environment}_etcd"
          launch_configuration = "$ {}"
          availability_zones = ["us-east-1c"]
          max_size = "${var.capacities_etcd_max}"
          min_size = "${var.capacities_etcd_min}"
          health_check_grace_period = 300
          desired_capacity = "${var.capacities_etcd_desired}"
          vpc_zone_identifier = ["${}"]
          force_delete = true

          tag {
            key = "Name"
            value = "${var.environment}_etcd"
            propagate_at_launch = true
          }

          lifecycle {
            create_before_destroy = true
          }
        }
  • 15. Master Node
      • Kubelet service
      • API Server
      • Replication Controller Manager
      • Scheduler
      • Podmaster
      • Proxy
      • Flannel
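Of the components on the master list, the podmaster is the one that lets several masters coexist: it competes for a key in etcd, and only the current holder gets the scheduler (or controller manager) manifest placed where the Kubelet will load it. The shell sketch below imitates just the file-handling half of that behaviour. The function name sync_manifest and its arguments are hypothetical, and the etcd lock is replaced by a plain string comparison, so this is an illustration rather than the podmaster's implementation.

```shell
#!/usr/bin/env bash
# sync_manifest HOLDER WHOAMI SRC DST
# HOLDER stands in for whoever currently owns the election key
# (the real tool reads this from etcd; here it is just an argument).
sync_manifest() {
  local holder=$1 whoami=$2 src=$3 dst=$4
  if [ "$holder" = "$whoami" ]; then
    cp "$src" "$dst"   # we are the leader: publish the manifest
  else
    rm -f "$dst"       # someone else leads: make sure ours is gone
  fi
}

# Example run in a throwaway directory with made-up addresses.
demo_dir=$(mktemp -d)
echo "kind: Pod" > "$demo_dir/kube-scheduler.src.yaml"
sync_manifest 10.0.2.20 10.0.2.20 \
  "$demo_dir/kube-scheduler.src.yaml" "$demo_dir/kube-scheduler.yaml"
ls "$demo_dir"
rm -rf "$demo_dir"
```

Because the Kubelet watches its manifest directory, adding or removing the file is enough to start or stop the component on that master.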
  • 16. Consider Master pods as additional artifact
      • Docker container that takes env variables
      • Outputs templated pods to disk for Kubelet to load
      • We use j2cli to template these files with little overhead

        apiVersion: v1
        kind: Pod
        metadata:
          name: kube-podmaster
          namespace: kube-system
        spec:
          hostNetwork: true
          containers:
            - name: scheduler-elector
              image:
              imagePullPolicy: Always
              command:
                - /podmaster
                - --etcd-servers={{ ETCD_ENDPOINTS }}
                - --key=scheduler
                - --whoami={{ ADVERTISE_IP }}
                - --source-file=/src/manifests/kube-scheduler.yaml
                - --dest-file=/dst/manifests/kube-scheduler.yaml
              volumeMounts:
                - mountPath: /src/manifests
                  name: manifest-src
                  readOnly: true
                - mountPath: /dst/manifests
                  name: manifest-dst
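To make the templating step on slide 16 concrete, here is a small shell stand-in that fills `{{ NAME }}` placeholders from NAME=value pairs. It is not j2cli and supports none of Jinja2's features beyond simple substitution; the function name render_template and the sample addresses are assumptions made for this sketch.

```shell
#!/usr/bin/env bash
# render_template TEMPLATE NAME=value [NAME=value ...]
# Replaces every literal "{{ NAME }}" in TEMPLATE with the given value.
render_template() {
  local text=$1; shift
  local pair name value placeholder
  for pair in "$@"; do
    name=${pair%%=*}
    value=${pair#*=}
    placeholder="{{ ${name} }}"
    # Quoting both sides keeps the pattern and the replacement literal.
    text=${text//"$placeholder"/"$value"}
  done
  printf '%s\n' "$text"
}

# Example using two of the placeholders from the podmaster manifest,
# with made-up addresses.
template='- --etcd-servers={{ ETCD_ENDPOINTS }}
- --whoami={{ ADVERTISE_IP }}'
render_template "$template" \
  "ETCD_ENDPOINTS=http://10.0.1.10:2379" \
  "ADVERTISE_IP=10.0.2.20"
```

Keeping the templates and the renderer inside one container image is what lets the master pods be versioned and shipped as an artifact, as the slide title suggests.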
  • 17. Early Docker / Flannel
      • Set up the box with the services needed before the real Docker can run
      • Get etcd server IPs and write to file
      • Start flannel with those IPs

        #cloud-config
        coreos:
          units:
            - name: etcd.service
              command: stop
            - name: etcd2.service
              command: stop
            - name: early-docker.service
              command: start
            - name: kub_get_etcd.service
              command: start
              content: |
                [Unit]
                Description=Write K8s etcd urls to disk.
                Requires=early-docker.service
                After=early-docker.service

                [Service]
                Type=oneshot
                Environment="DOCKER_HOST=unix:///var/run/early-docker.sock"
                ExecStart=/usr/bin/sh -c "/usr/bin/docker pull"
                ExecStart=/usr/bin/sh -c "/usr/bin/docker run --net=host -v /etc/barkly/:/etc/barkly/ {{ KUB_ETCD_ASG }} > /etc/etcd_servers.env"
            - name: flanneld.service
              command: start
              drop-ins:
                - name: 10-environment_vars.conf
                  content: |
                    [Unit]
                    After=kub_get_etcd.service

                    [Service]
                    ExecStartPre=/usr/bin/sh -c "/usr/bin/echo -n FLANNELD_ETCD_ENDPOINTS= > /etc/flannel_etcd_servers.env"
                    ExecStartPre=/usr/bin/sh -c "/usr/bin/cat /etc/etcd_servers.env >> /etc/flannel_etcd_servers.env"
                    ExecStartPre=/usr/bin/sh -c "/usr/bin/echo FLANNELD_IFACE=$private_ipv4 >> /etc/flannel_etcd_servers.env"
                    ExecStartPre=/usr/bin/ln -sf /etc/flannel_etcd_servers.env /run/flannel/options.env
                    Restart=always
                    RestartSec=10
  • 18. Other config
      • Grab certs from S3
      • Terraform only allows permissions to specific files
      • Format Master pod files to disk

            - name: kub_certs.service
              command: start
              content: |
                [Unit]
                Description=Writes kubernetes cluster certs to disk.
                Requires=early-docker.service
                After=early-docker.service
                Before=kubelet.service

                [Service]
                Type=oneshot
                Environment="DOCKER_HOST=unix:///var/run/early-docker.sock"
                ExecStart=/usr/bin/sh -c "/usr/bin/mkdir -p /etc/kubernetes/ssl"
                ExecStart=/usr/bin/docker run --net=host -v /etc/kubernetes/ssl:/ssl s3 cp s3://our_k8s_cluster_bucket/ca.pem /ssl
                ExecStart=/usr/bin/docker run --net=host -v /etc/kubernetes/ssl:/ssl s3 cp s3://our_k8s_cluster_bucket/apiserver.pem /ssl
                ExecStart=/usr/bin/docker run --net=host -v /etc/kubernetes/ssl:/ssl s3 cp s3://our_k8s_cluster_bucket/apiserver-key.pem /ssl
            - name: kub_pods.service
              command: start
              content: |
                [Unit]
                Description=Writes kubernetes pod files to disk.
                Requires=early-docker.service
                After=early-docker.service
                Before=kubelet.service

                [Service]
                Type=oneshot
                Environment="DOCKER_HOST=unix:///var/run/early-docker.sock"
                ExecStart=/usr/bin/sh -c "/usr/bin/mkdir -p /etc/kubernetes/ssl"
                ExecStart=/usr/bin/docker run --net=host -v /etc/barkly/:/etc/barkly/ -e K8S_VERSION='1.1.7' -e CLOUD_PROVIDER='--cloud-provider=aws' -e SERVICE_IP_RANGE="" -e ADVERTISE_IP="$private_ipv4" -e ETCD_AUTOSCALE_GROUP_NAME="our_etcd_autoscaling_group_name" -v /srv/kubernetes/manifests:/output_src -v /etc/kubernetes/manifests:/output_dst
  • 19. Start Docker & Kubelet
      • Kubelet will wait for docker and flannel to be ready
      • Kubelet will load the manifests for Master services output by the previous container

            - name: docker.service
              command: start
              drop-ins:
                - name: 40-flannel.conf
                  content: |
                    [Unit]
                    Requires=flanneld.service
                    After=flanneld.service
            - name: kubelet.service
              command: start
              content: |
                [Unit]
                Requires=docker.service
                After=docker.service
                After=fluentd-elasticsearch.service

                [Service]
                ExecStartPre=/usr/bin/mkdir -p /var/log/containers
                ExecStart=/etc/bin/kubelet \
                  --hostname-override="$private_ipv4" \
                  --api_servers= \
                  --register-node=false \
                  --allow-privileged=true \
                  --config=/etc/kubernetes/manifests \
                  --cluster-dns= \
                  --cluster-domain=cluster.local \
                  --cloud-provider=aws \
                  --v=4
                Restart=always
                RestartSec=10

                [Install]
  • 20. Get Kubelet
      • /usr/bin & /usr/local/bin are read only in CoreOS

            - name: kubelet.service
              command: start
              drop-ins:
                - name: 10-download-binary.conf
                  content: |
                    [Service]
                    ExecStartPre=/bin/bash -c "/etc/bin/download-k8s-binary kubelet"
        write_files:
          # Since systemd needs these files before it will start
          - path: /etc/bin/download-k8s-binary
            permissions: '0755'
            content: |
              #!/usr/bin/env bash
              export K8S_VERSION="v1.1.8"
              mkdir -p /etc/bin
              FILE=$1
              # Check the install location (/etc/bin), since /usr/bin is read only
              if [ ! -f /etc/bin/$FILE ]; then
                curl -sSL -o /etc/bin/$FILE kubernetes-builds/${K8S_VERSION}/bin/$FILE
                chmod +x /etc/bin/$FILE
              else
                # we check the version of the binary
                INSTALLED_VERSION=$(/etc/bin/$FILE --version)
                MATCH=$(echo "${INSTALLED_VERSION}" | grep -c "${K8S_VERSION}")
                if [ $MATCH -eq 0 ]; then
                  # the version is different
                  curl -sSL -o /etc/bin/$FILE kubernetes-builds/${K8S_VERSION}/bin/$FILE
                  chmod +x /etc/bin/$FILE
                fi
              fi
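The decision at the heart of the download script on slide 20 (re-download only when the installed binary does not report the wanted version) can be isolated into a small function and tried without touching the network. The sketch below uses a shell `case` pattern match instead of the slide's `grep -c`; the function name needs_download and the sample version strings are assumptions for illustration.

```shell
#!/usr/bin/env bash
# needs_download VERSION_OUTPUT WANTED_VERSION
# Succeeds (exit status 0) when VERSION_OUTPUT does not contain WANTED_VERSION,
# i.e. when the binary is missing, unreadable, or a different release.
needs_download() {
  local version_output=$1 wanted=$2
  case "$version_output" in
    *"$wanted"*) return 1 ;;   # wanted version already installed
    *)           return 0 ;;   # anything else: fetch it again
  esac
}

# Example with sample version strings.
if needs_download "Kubernetes v1.1.7" "v1.1.8"; then
  echo "download needed"
fi
```

Like the original `grep -c` check, this is a substring match, so a wanted version of v1.1.8 would also be satisfied by output mentioning v1.1.80.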
  • 21. Terraform to build
      • As before, we reference our cloud-init script

        resource "aws_launch_configuration" "terraform_master" {
          name_prefix = "${var.environment}_master_conf-"
          image_id = "${var.coreos_ami}"
          instance_type = "t2.medium"
          key_name = "${var.key_name}"
          security_groups = ["$ {}"]
          user_data = "${file("../cloud-config/output/master.yml")}"
          enable_monitoring = true
          ebs_optimized = false
          iam_instance_profile = "$ {}"

          root_block_device {
            volume_size = 20
          }

          lifecycle {
            create_before_destroy = true
          }
        }

        resource "aws_autoscaling_group" "terraform_master" {
          name = "${var.environment}_master"
          launch_configuration = "$ {}"
          availability_zones = ["us-east-1c"]
          max_size = "${var.capacities_master_max}"
          min_size = "${var.capacities_master_min}"
          health_check_grace_period = 300
          desired_capacity = "${var.capacities_master_desired}"
          vpc_zone_identifier = ["${}"]
          force_delete = true
          load_balancers = ["${}"]

          tag {
            key = "Name"
            value = "${var.environment}_master"
            propagate_at_launch = true
          }

          lifecycle {
            create_before_destroy = true
          }
        }
  • 23. Minion Node (now just Nodes)
      • Kubelet Service manages all other services.
      • Proxy
      • Pods
  • 24. Early Docker / Flannel
      • Same as master!

        #cloud-config
        coreos:
          units:
            - name: etcd.service
              command: stop
            - name: etcd2.service
              command: stop
            - name: early-docker.service
              command: start
            - name: kub_get_etcd.service
              command: start
              content: |
                [Unit]
                Description=Write K8s etcd urls to disk.
                Requires=early-docker.service
                After=early-docker.service

                [Service]
                Type=oneshot
                Environment="DOCKER_HOST=unix:///var/run/early-docker.sock"
                ExecStart=/usr/bin/sh -c "/usr/bin/docker pull"
                ExecStart=/usr/bin/sh -c "/usr/bin/docker run --net=host -v /etc/barkly/:/etc/barkly/ {{ KUB_ETCD_ASG }} > /etc/etcd_servers.env"
            - name: flanneld.service
              command: start
              drop-ins:
                - name: 10-environment_vars.conf
                  content: |
                    [Unit]
                    After=kub_get_etcd.service

                    [Service]
                    ExecStartPre=/usr/bin/sh -c "/usr/bin/echo -n FLANNELD_ETCD_ENDPOINTS= > /etc/flannel_etcd_servers.env"
                    ExecStartPre=/usr/bin/sh -c "/usr/bin/cat /etc/etcd_servers.env >> /etc/flannel_etcd_servers.env"
                    ExecStartPre=/usr/bin/sh -c "/usr/bin/echo FLANNELD_IFACE=$private_ipv4 >> /etc/flannel_etcd_servers.env"
                    ExecStartPre=/usr/bin/ln -sf /etc/flannel_etcd_servers.env /run/flannel/options.env
                    Restart=always
                    RestartSec=10
  • 25. Kubelet
      • Similar to master

            - name: docker.service
              command: start
              drop-ins:
                - name: 40-flannel.conf
                  content: |
                    [Unit]
                    Requires=flanneld.service
                    After=flanneld.service
            - name: kubelet.service
              command: start
              content: |
                [Unit]
                Requires=docker.service
                After=docker.service
                After=fluentd-elasticsearch.service

                [Service]
                ExecStartPre=/usr/bin/mkdir -p /var/log/containers
                ExecStart=/etc/bin/kubelet \
                  --api_servers= \
                  --hostname-override="$private_ipv4" \
                  --register-node=true \
                  --allow-privileged=true \
                  --config=/etc/kubernetes/manifests \
                  --cluster-dns= \
                  --cluster-domain=cluster.local \
                  --kubeconfig=/etc/kubernetes/worker-kubeconfig-kubelet.yaml \
                  --tls-cert-file=/etc/kubernetes/ssl/worker.pem \
                  --tls-private-key-file=/etc/kubernetes/ssl/worker-key.pem \
                  --cloud-provider=aws \
                  --v=4
                Restart=always
                RestartSec=10

                [Install]
              drop-ins:
                - name: 10-download-binary.conf
                  content: |
                    [Service]
                    ExecStartPre=/bin/bash -c "/etc/bin/download-k8s-binary kubelet"
  • 26. Manifests can be hard coded
      • Minion manifests are far less dynamic and can be hard coded as CloudInit files

        write_files:
          - path: "/etc/kubernetes/manifests/kube-proxy.yaml"
            content: |
              apiVersion: v1
              kind: Pod
              metadata:
                name: kube-proxy
                namespace: kube-system
              spec:
                hostNetwork: true
                containers:
                  - name: kube-proxy
                    image:
                    command:
                      - /hyperkube
                      - proxy
                      - --master=
                      - --kubeconfig=/etc/kubernetes/worker-kubeconfig-proxy.yaml
                      - --proxy-mode=iptables
                      - --v=4
                    securityContext:
                      privileged: true
                    volumeMounts:
                      - mountPath: /etc/ssl/certs
                        name: "ssl-certs"
                      - mountPath: /etc/kubernetes/worker-kubeconfig-proxy.yaml
                        name: "kubeconfig"
                        readOnly: true
                      - mountPath: /etc/kubernetes/ssl
                        name: "etc-kube-ssl"
                        readOnly: true
                volumes:
                  - name: "ssl-certs"
                    hostPath:
                      path: "/usr/share/ca-certificates"
                  - name: "kubeconfig"
                    hostPath:
                      path: "/etc/kubernetes/worker-kubeconfig-proxy.yaml"
                  - name: "etc-kube-ssl"
                    hostPath:
                      path: "/etc/kubernetes/ssl"
  • 27. Manifests can be hard coded
      • Minion manifests are far less dynamic and can be hard coded as CloudInit files

          - path: "/etc/kubernetes/worker-kubeconfig-proxy.yaml"
            content: |
              apiVersion: v1
              kind: Config
              clusters:
                - name: local
                  cluster:
                    certificate-authority: /etc/kubernetes/ssl/ca.pem
              users:
                - name: kubelet
                  user:
                    client-certificate: /etc/kubernetes/ssl/worker.pem
                    client-key: /etc/kubernetes/ssl/worker-key.pem
              contexts:
                - context:
                    cluster: local
                    user: kubelet
                  name: kubelet-context
              current-context: kubelet-context
          - path: "/etc/kubernetes/worker-kubeconfig-kubelet.yaml"
            content: |
              apiVersion: v1
              kind: Config
              clusters:
                - name: local
                  cluster:
                    certificate-authority: /etc/kubernetes/ssl/ca.pem
              users:
                - name: kubelet
                  user:
                    client-certificate: /etc/kubernetes/ssl/worker.pem
                    client-key: /etc/kubernetes/ssl/worker-key.pem
              contexts:
                - context:
                    cluster: local
                    user: kubelet
                  name: kubelet-context
              current-context: kubelet-context
  • 28. Demo!