Dockerizing Cassandra on Modern Linux
Myself & Instaclustr
• Adam Zegelin — Founding Software Engineer & Co-founder of Instaclustr

adam@instaclustr.com · @zegelin
• Managed DataStax Enterprise and Apache Cassandra in the cloud (AWS, Azure, SoftLayer)
• Self-service dashboard — create, manage & monitor clusters
• 24/7/365 support, on-call engineers, uptime guarantee
• Focus on developing your awesome apps — we handle the Cassandra
• Grew from a need for Cassandra in a project
2© 2015. All Rights Reserved.
Nodes — Software Stack
• CoreOS — lightweight OS
• Docker — containerisation of everything
• systemd: service management
• journald — logging
• D-Bus — controlling systemd from Java from inside containers
Initial Implementation
• Amazon Web Services only
• Custom Ubuntu AMI (Amazon Machine Image)
• Based on stock Ubuntu AMI
• 2 AMIs (PV/HVM) × 9 regions = 18 images per version (became unmaintainable very quickly)
• Custom cloud-init scripts — RAID disks, fetch config, etc.
• Cassandra installed with apt-get install cassandra / dse
Initial Implementation — AWS
• We selected AWS instances backed by instance storage
• Instance storage is fast (SSDs) and low latency (local disk) but volatile: terminate the instance and all your data is gone
• The alternative, EBS (Elastic Block Store), is essentially a SAN: slower, higher latency, and originally sharing the instance's network bandwidth
• The newer c4 and m4 instances are “EBS optimised” and don’t share these limitations
• The only way to change the AMI is to start a new machine
• So immutable images are not possible when data on ephemeral instance storage has to persist
• The only feasible way to update is apt-get install
CoreOS
• One of the first “Docker Operating Systems”
• Available on every provider we support — AWS, Azure, SoftLayer
• CoreOS has pre-built images
• Small and minimalist — not much userland (not even man!)
• Other useful software such as etcd and fleet (we currently don’t use them, but maybe in the future)
• In use by some big players (Rackspace, PlayStation, Instaclustr 😀)
• Recent funding from Google Ventures
Docker
• Container runtime + standardised image distribution & hosting + ecosystem
• Private image hosting options available, such as quay.io
• Immutable images — Yay! 🎉
• Images running in dev, test and production environments are equal
• Software installs, upgrades and uninstalls are clean
• Components are isolated: potentially conflicting components (different library versions, JVM versions, etc.) can co-exist
• Even different userland layouts (Ubuntu, Debian, CentOS, etc.)
• We containerise everything: C*, internal services, node management and monitoring apps
• Single, well-understood image build and deploy process: docker build & docker push
• Executed via Makefiles with one Make target per image; make push-all builds and pushes everything
• Helps that all our internal apps are Java-based too
• Docker gives us immutable images for our components without instance replacement
• CoreOS handles the rest (OS-level) via in-place updates
• Docker is provider agnostic
• CoreOS runs on all major cloud providers and bare-metal
• The result ☞ Instaclustr-managed C* can run anywhere
systemd
• CoreOS uses systemd for service management
• systemd supports inter-service dependencies
• e.g. cassandra-backups.service “wants” cassandra.service
• i.e. starting cassandra-backups also pulls in cassandra (a strict “only while cassandra is running” relationship needs Requires or BindsTo)
• systemd can automatically restart services
• Instaclustr services are fail-fast
• Cassandra not so much in some cases; perhaps a watchdog?
systemd cont’d
• Manages units of different types — service, timer, target, etc.
• service units manage processes
• timers start services on a schedule (à la cron)
• targets are for grouping/sync points
• cassandra.target “wants” cassandra.service, monitoring.service, datastax-agent.service, backups.timer, etc.
• All units can define dependencies and conflicts
• Dependencies of different “strengths” — Wants vs. Requires
• In both directions — Requires and RequiredBy
Basic Integration
• Cassandra runs as PID 1 in the container
• 1 primary process per container model
• Runs in foreground mode (-f)
• Responds to SIGTERM via docker stop, systemctl stop, etc
• Cassandra data and configuration are persistent on the host
• They survive container restarts
• Cassandra data and configuration directories are mounted from the host:
  docker run -v /var/lib/instaclustr/etc/cassandra:/etc/cassandra …
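As a rough illustration of the SIGTERM point above: a JVM process reacts to SIGTERM (which is what docker stop and systemctl stop send by default) by running its registered shutdown hooks before exiting. The sketch below is not Cassandra's code; ForegroundService and its fields are hypothetical names used only to show the mechanism.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a foreground daemon; not Cassandra's actual code.
final class ForegroundService {
    static final AtomicBoolean stopped = new AtomicBoolean(false);

    // Builds the thread that the JVM runs when it begins shutting down,
    // for example after receiving SIGTERM.
    static Thread newShutdownHook() {
        return new Thread(() -> {
            stopped.set(true);
            System.out.println("JVM shutting down, releasing resources");
        }, "shutdown-hook");
    }

    public static void main(String[] args) throws InterruptedException {
        Runtime.getRuntime().addShutdownHook(newShutdownHook());
        System.out.println("running in foreground; send SIGTERM to stop");
        // Joining the current thread blocks forever, keeping the process in the foreground.
        Thread.currentThread().join();
    }
}
```

Running the class and sending it SIGTERM (for example with kill) should print the shutdown message before the process exits.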
Basic Integration cont’d
• Docker containers managed via systemd
• cassandra.service execs docker run cassandra …
• systemctl [start|stop|restart|status|…] cassandra
• Cassandra logging configured to write only to stdout
• systemd logging best practice
• Cassandra ⇢ Docker ⇢ systemd ⇢ journald
• journalctl -u cassandra
Basic Integration — Issues
• systemd starts dependent units when state is active
• process running = service active — unless configured otherwise
• ∴ dependent units start immediately
• process can hang but service stays active
Cassandra Startup
• JVM starts quickly
• JMX (nodetool) connectivity is available early
• Objects are exposed where they are constructed
• CQL/Thrift available late
• Can be toggled via cassandra.yaml or JMX/nodetool
• When is Cassandra “running”?
• When does cassandra.service transition from activating to active?
• When do dependent services start?
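One way to make the "process running" versus "accepting CQL connections" distinction concrete is a plain TCP probe. This is only a sketch of an external check, not the approach the talk goes on to describe; PortProbe and isAccepting are hypothetical names, and 9042 is assumed to be the native transport port (Cassandra's default).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether something is accepting TCP connections.
final class PortProbe {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isAccepting(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: treat the service as not ready.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumes the default CQL native transport port.
        System.out.println("CQL accepting connections: " + isAccepting("127.0.0.1", 9042, 500));
    }
}
```

Shortly after the JVM starts, a probe like this would typically report false even though the process is already running, which is the gap the questions above point at.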
D-Bus
• RPC between processes
• Notifications
• Socket-based (typically UNIX sockets, but can be TCP)
• Accessible inside a container by mounting the socket:
  docker run -v /run/dbus:/run/dbus -v /run/systemd:/run/systemd …
• Multiple language bindings, including Java
D-Bus cont’d
• systemd is controllable via D-Bus
• Control the host's systemd from inside a Docker container
• No need to fork/exec to run systemctl and co. (in fact, systemctl is a wrapper around D-Bus calls)
D-Bus cont’d
Java bindings — dbus-java
systemctl restart cassandra
≝
systemdManager.RestartUnit("cassandra.service", "replace");
Enhanced Integration
• Service status = “active” — process running, or something more?
• Cassandra java process running vs. C* accepting CQL connections
• CQL clients are dependencies, but shouldn’t start until CQL is available
• Clients could fail-fast on no connectivity
• Will be automatically restarted
• Service will oscillate between active and failed, making actual failures hard to detect
• systemd will eventually time out or give up (configurable)
• JVM startup can be expensive — CPU usage spikes
Enhanced Integration cont’d
• systemd targets for CQL & Thrift — cassandra-cql.target
• Life-cycle tracks internal C* service
• i.e., Starts when CQL is available — not immediate
• nodetool disablebinary implies systemctl stop cassandra-cql.target
• Services that require CQL connectivity use WantedBy=cassandra-cql.target
• Starting cassandra-cql.target starts these services too
• Inverse of Wants
Enhanced Integration cont’d
• Java Agent side-loaded into Cassandra JVM
• Hooks into CQL/Thrift service life-cycle
• Implemented using runtime byte-code modification
• Controls systemd via D-Bus to start/stop associated target units
• But Cassandra is open-source — why not modify‽
• Agents work with DSE & Apache Cassandra
Java Agent
• Java Agents (java.lang.instrument)
• java -javaagent:instaclustr-agent.jar …
• premain(…) method called at JVM startup
• can hook into JVM class-loading, transform byte-code, etc.
• Javassist, ASM — byte-code modification libraries
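To make the agent mechanics above a little more tangible, here is a minimal sketch of an agent that only observes class loading and changes nothing. ObservingAgent and CountingTransformer are hypothetical names; the target class name is taken from the Hooks slides. It assumes the agent jar's manifest names the class in its Premain-Class attribute, and in a real agent jar the class would normally be declared public in its own file.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical observing agent: watches for one class and leaves all byte-code untouched.
final class ObservingAgent {
    // Class names are passed to transformers in internal form (slashes, not dots).
    static final String TARGET = "org/apache/cassandra/transport/Server";

    static final class CountingTransformer implements ClassFileTransformer {
        final AtomicInteger matches = new AtomicInteger();

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (TARGET.equals(className)) {
                matches.incrementAndGet();
                System.out.println("agent saw class being loaded: " + className);
            }
            return null; // null tells the JVM to keep the original byte-code
        }
    }

    static final CountingTransformer TRANSFORMER = new CountingTransformer();

    // Called by the JVM before the application's main method when started with -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(TRANSFORMER);
    }
}
```

A transformer that actually rewrites a class would return the modified bytes instead of null, typically produced with a library such as Javassist or ASM.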
Hooks
public interface Server {
    public void start();
    public void stop();
    ⋮
}

// in CassandraDaemon:
// Thrift
thriftServer = new ThriftServer(rpcAddr, rpcPort, listenBacklog);
⋮
thriftServer.start();
⋮
thriftServer.stop();

// CQL
nativeServer = new org.apache.cassandra.transport.Server(nativeAddr, nativePort);
⋮
nativeServer.start();
⋮
nativeServer.stop();
Hooks
import java.lang.instrument.Instrumentation;
import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;

public static void premain(String agentArgs, Instrumentation inst) {
    // the lambda form compiles on Java 8, where ClassFileTransformer has a single abstract method;
    // on Java 9+ an anonymous ClassFileTransformer class is needed instead
    inst.addTransformer((loader, className, classBeingRedefined, protectionDomain, classfileBuffer) -> {
        // only the CQL Server class is patched; returning null leaves a class unchanged
        if (!"org/apache/cassandra/transport/Server".equals(className))
            return null;

        final ClassPool pool = ClassPool.getDefault();
        try {
            final CtClass ctClass = pool.get("org.apache.cassandra.transport.Server");

            // patch start() and stop() methods of the Server class
            {
                final CtMethod method = ctClass.getDeclaredMethod("start");
                method.insertAfter("com.instaclustr.Agent.serverStarted($0);");
            }
            {
                final CtMethod method = ctClass.getDeclaredMethod("stop");
                method.insertAfter("com.instaclustr.Agent.serverStopped($0);");
            }

            byte[] byteCode = ctClass.toBytecode();
            ctClass.detach();

            return byteCode; // return the modified byte-code
        } catch (final Exception e) {…}

        return null;
    });
}

// called when Server started: call systemd via dbus-java to start cassandra-cql.target
public static void serverStarted(final CassandraDaemon.Server server) {…}

// called when Server stopped: call systemd via dbus-java to stop cassandra-cql.target
public static void serverStopped(final CassandraDaemon.Server server) {…}
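The bodies of serverStarted and serverStopped are elided on the slide. The following is a sketch of the general shape such callbacks could take, not the actual Instaclustr implementation. SystemdManager here is a hypothetical, simplified interface that mirrors the StartUnit and StopUnit methods of systemd's D-Bus Manager (the real methods return a job object path, and a real agent would bind the interface through dbus-java); RecordingSystemd is a stand-in so the example runs without D-Bus.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified mirror of two methods on systemd's D-Bus Manager interface.
// Method names are capitalised to match the D-Bus method names.
interface SystemdManager {
    void StartUnit(String name, String mode);
    void StopUnit(String name, String mode);
}

// Translates CQL server life-cycle events into start/stop requests for the target unit.
final class CqlTargetBridge {
    static final String CQL_TARGET = "cassandra-cql.target";
    private final SystemdManager systemd;

    CqlTargetBridge(SystemdManager systemd) {
        this.systemd = systemd;
    }

    void serverStarted() {
        systemd.StartUnit(CQL_TARGET, "replace");
    }

    void serverStopped() {
        systemd.StopUnit(CQL_TARGET, "replace");
    }
}

// In-memory stand-in that records calls instead of talking to D-Bus.
final class RecordingSystemd implements SystemdManager {
    final List<String> calls = new ArrayList<>();

    @Override
    public void StartUnit(String name, String mode) {
        calls.add("StartUnit " + name + " " + mode);
    }

    @Override
    public void StopUnit(String name, String mode) {
        calls.add("StopUnit " + name + " " + mode);
    }
}

final class BridgeDemo {
    public static void main(String[] args) {
        RecordingSystemd fake = new RecordingSystemd();
        CqlTargetBridge bridge = new CqlTargetBridge(fake);
        bridge.serverStarted();
        bridge.serverStopped();
        fake.calls.forEach(System.out::println);
    }
}
```

Keeping the systemd calls behind a small interface like this also lets the life-cycle mapping be exercised without a running systemd.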
Docker Limitations and Sore Spots
• docker run is just a TTY proxy: the actual container process lives under the Docker daemon's process/cgroup
• systemd requires startup & watchdog notifications to originate from the started process, a child, or a process in the same cgroup
• If the Docker daemon crashes, all containers go down with it
• Everything Docker does, including image downloads & builds, runs as root in the daemon!
• Processes inside containers are run un-elevated
Future
• Development versions of systemd can now launch Docker containers natively via machinectl
• Tighter integration with systemd
• Process hierarchy is correct: right cgroup and parents
• Java Agent can notify systemd for startup, status & watchdog via JNA + libsystemd
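For context on the notification idea above: libsystemd's sd_notify takes a state string made of newline-separated assignments such as READY=1, STATUS=… and WATCHDOG=1. The sketch below only builds those strings. NotifyState is a hypothetical helper, and the actual delivery (through JNA and libsystemd, as the slide suggests) is deliberately left out.

```java
import java.util.StringJoiner;

// Hypothetical helper that formats sd_notify state strings; it does not send them.
final class NotifyState {
    // State announcing that startup has finished, with a human-readable status line.
    static String ready(String status) {
        return new StringJoiner("\n")
                .add("READY=1")
                .add("STATUS=" + status)
                .toString();
    }

    // State for a watchdog keep-alive ping.
    static String watchdog() {
        return "WATCHDOG=1";
    }

    public static void main(String[] args) {
        System.out.println(ready("CQL available"));
        System.out.println(watchdog());
    }
}
```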
Thanks!

Más contenido relacionado

La actualidad más candente

Docker - The Linux Container
Docker - The Linux ContainerDocker - The Linux Container
Docker - The Linux Container
Balaji Rajan
 

La actualidad más candente (20)

Docker - The Linux Container
Docker - The Linux ContainerDocker - The Linux Container
Docker - The Linux Container
 
Introduction openstack-meetup-nov-28
Introduction openstack-meetup-nov-28Introduction openstack-meetup-nov-28
Introduction openstack-meetup-nov-28
 
Optimizing Docker Images
Optimizing Docker ImagesOptimizing Docker Images
Optimizing Docker Images
 
Basic docker for developer
Basic docker for developerBasic docker for developer
Basic docker for developer
 
Consuming Cinder from Docker
Consuming Cinder from DockerConsuming Cinder from Docker
Consuming Cinder from Docker
 
Bare Metal to OpenStack with Razor and Chef
Bare Metal to OpenStack with Razor and ChefBare Metal to OpenStack with Razor and Chef
Bare Metal to OpenStack with Razor and Chef
 
How we dockerized a startup? #meetup #docker
How we dockerized a startup? #meetup #docker How we dockerized a startup? #meetup #docker
How we dockerized a startup? #meetup #docker
 
Docker from A to Z, including Swarm and OCCS
Docker from A to Z, including Swarm and OCCSDocker from A to Z, including Swarm and OCCS
Docker from A to Z, including Swarm and OCCS
 
Docker and containers : Disrupting the virtual machine(VM)
Docker and containers : Disrupting the virtual machine(VM)Docker and containers : Disrupting the virtual machine(VM)
Docker and containers : Disrupting the virtual machine(VM)
 
Wanting distributed volumes - Experiences with ceph-docker
Wanting distributed volumes - Experiences with ceph-dockerWanting distributed volumes - Experiences with ceph-docker
Wanting distributed volumes - Experiences with ceph-docker
 
Introduction to Docker and all things containers, Docker Meetup at RelateIQ
Introduction to Docker and all things containers, Docker Meetup at RelateIQIntroduction to Docker and all things containers, Docker Meetup at RelateIQ
Introduction to Docker and all things containers, Docker Meetup at RelateIQ
 
Orchestrating Docker containers at scale
Orchestrating Docker containers at scaleOrchestrating Docker containers at scale
Orchestrating Docker containers at scale
 
Deploying containers and managing them on multiple Docker hosts, Docker Meetu...
Deploying containers and managing them on multiple Docker hosts, Docker Meetu...Deploying containers and managing them on multiple Docker hosts, Docker Meetu...
Deploying containers and managing them on multiple Docker hosts, Docker Meetu...
 
Docker Ecosystem on Azure
Docker Ecosystem on AzureDocker Ecosystem on Azure
Docker Ecosystem on Azure
 
Nebulaworks Docker Overview 09-22-2015
Nebulaworks Docker Overview 09-22-2015Nebulaworks Docker Overview 09-22-2015
Nebulaworks Docker Overview 09-22-2015
 
Docker Intro at the Google Developer Group and Google Cloud Platform Meet Up
Docker Intro at the Google Developer Group and Google Cloud Platform Meet UpDocker Intro at the Google Developer Group and Google Cloud Platform Meet Up
Docker Intro at the Google Developer Group and Google Cloud Platform Meet Up
 
Hypervisor "versus" Linux Containers with Docker !
Hypervisor "versus" Linux Containers with Docker !Hypervisor "versus" Linux Containers with Docker !
Hypervisor "versus" Linux Containers with Docker !
 
virtualization-vs-containerization-paas
virtualization-vs-containerization-paasvirtualization-vs-containerization-paas
virtualization-vs-containerization-paas
 
Introduction to Docker - Docker workshop @Twitter
Introduction to Docker - Docker workshop @TwitterIntroduction to Docker - Docker workshop @Twitter
Introduction to Docker - Docker workshop @Twitter
 
Introduction to docker
Introduction to dockerIntroduction to docker
Introduction to docker
 

Similar a DataStax: Dockerizing Cassandra on Modern Linux

Containerization - The DevOps Revolution
Containerization - The DevOps RevolutionContainerization - The DevOps Revolution
Containerization - The DevOps Revolution
Yulian Slobodyan
 
Openstack devops challenges
Openstack devops challenges Openstack devops challenges
Openstack devops challenges
openstackindia
 

Similar a DataStax: Dockerizing Cassandra on Modern Linux (20)

Leveraging Docker and CoreOS to provide always available Cassandra at Instacl...
Leveraging Docker and CoreOS to provide always available Cassandra at Instacl...Leveraging Docker and CoreOS to provide always available Cassandra at Instacl...
Leveraging Docker and CoreOS to provide always available Cassandra at Instacl...
 
Getting Started with Apache CloudStack
Getting Started with Apache CloudStackGetting Started with Apache CloudStack
Getting Started with Apache CloudStack
 
Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018Kubernetes Manchester - 6th December 2018
Kubernetes Manchester - 6th December 2018
 
Bulk Loading into Cassandra
Bulk Loading into CassandraBulk Loading into Cassandra
Bulk Loading into Cassandra
 
Best Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBest Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker Containers
 
How we scale DroneCi on demand
How we scale DroneCi on demandHow we scale DroneCi on demand
How we scale DroneCi on demand
 
Rami Sayar - Node microservices with Docker
Rami Sayar - Node microservices with DockerRami Sayar - Node microservices with Docker
Rami Sayar - Node microservices with Docker
 
JDD 2016 - Jacek Bukowski - "Flying To Clouds" - Can It Be Easy?
JDD 2016 - Jacek Bukowski - "Flying To Clouds" - Can It Be Easy?JDD 2016 - Jacek Bukowski - "Flying To Clouds" - Can It Be Easy?
JDD 2016 - Jacek Bukowski - "Flying To Clouds" - Can It Be Easy?
 
Flying to clouds - can it be easy? Cloud Native Applications
Flying to clouds - can it be easy? Cloud Native ApplicationsFlying to clouds - can it be easy? Cloud Native Applications
Flying to clouds - can it be easy? Cloud Native Applications
 
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018
 
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OSPutting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
Putting Kafka In Jail – Best Practices To Run Kafka On Kubernetes & DC/OS
 
Containerization - The DevOps Revolution
Containerization - The DevOps RevolutionContainerization - The DevOps Revolution
Containerization - The DevOps Revolution
 
Clocker - The Docker Cloud Maker
Clocker - The Docker Cloud MakerClocker - The Docker Cloud Maker
Clocker - The Docker Cloud Maker
 
Openstack devops challenges
Openstack devops challenges Openstack devops challenges
Openstack devops challenges
 
Docker and Puppet for Continuous Integration
Docker and Puppet for Continuous IntegrationDocker and Puppet for Continuous Integration
Docker and Puppet for Continuous Integration
 
Clocker: Managing Container Networking and Placement
Clocker: Managing Container Networking and PlacementClocker: Managing Container Networking and Placement
Clocker: Managing Container Networking and Placement
 
Intro to cluster scheduler for Linux containers
Intro to cluster scheduler for Linux containersIntro to cluster scheduler for Linux containers
Intro to cluster scheduler for Linux containers
 
stackconf 2020 | Replace your Docker based Containers with Cri-o Kata Contain...
stackconf 2020 | Replace your Docker based Containers with Cri-o Kata Contain...stackconf 2020 | Replace your Docker based Containers with Cri-o Kata Contain...
stackconf 2020 | Replace your Docker based Containers with Cri-o Kata Contain...
 
Introduction to Kubernetes
Introduction to KubernetesIntroduction to Kubernetes
Introduction to Kubernetes
 
Climb Technical Overview
Climb Technical OverviewClimb Technical Overview
Climb Technical Overview
 

Más de DataStax Academy

Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 

Más de DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

DataStax: Dockerizing Cassandra on Modern Linux

  • 2. Myself & Instaclustr • Adam Zegelin — Founding Software Engineer & Co-founder of Instaclustr
 adam@instaclustr.com · @zegelin • Managed DataStax Enterprise and Apache Cassandra in the ☁ 
 (AWS, Azure, SoftLayer) • Self-service dashboard — create, manage & monitor clusters • 24/7/365 support, on-call engineers, uptime guarantee • Focus on developing your awesome apps — we handle the Cassandra • Grew from a need for Cassandra in a project 2© 2015. All Rights Reserved.
  • 3. Nodes — Software Stack • CoreOS — lightweight OS • Docker — containerisation of everything • systemd — service managemen • journald — logging • D-Bus — controlling systemd from Java from inside containers 3© 2015. All Rights Reserved.
  • 4. Initial Implementation • Amazon Web Services only • Custom Ubuntu AMI (Amazon Machine Image) • Based on stock Ubuntu AMI • 2 AMIs (PV/HVM) × 9 regions = 18 images per version!
 (became unmaintainable very quickly) • Custom cloud-init scripts — RAID disks, fetch config, etc. • Cassandra installed with apt-get install cassandra / dse 4© 2015. All Rights Reserved.
  • 5. Initial Implementation — AWS • We selected instance storage backed AWS instances • Instance storage is fast (SSDs) and low latency (local disk) but is volatile — terminate the instance and all your data is gone • The alternative, EBS (Elastic Block Storage), is basically SAN — slower, higher latency and originally shared instance network bandwidth • The newer c4.x and m4.x instances are “EBS optimised” and don’t share these limitations • Only way to change AMI is to start a new machine • Not possible to use immutable images with persistent ephemeral data • Only feasible solution for updates is apt-get install 5© 2015. All Rights Reserved.
  • 6. • One of the first “Docker Operating Systems” • Available on every provider we support — AWS, Azure, SoftLayer • CoreOS has pre-built images • Small and minimalist — not much userland (not even man!) • Other useful software — etcd, fleet, etc.
 (we currently don’t use them — but maybe in the future) • In-use by some big players (Rackspace, PlayStation, Instaclustr 😀 ) • Recent funding from Google Ventures 6© 2015. All Rights Reserved.
  • 7. • Container runtime + standardised image distribution & hosting + ecosystem • Private image hosting options available, such as quay.io • Immutable images — Yay! 🎉 • Images running in dev, test and production environments are equal • Software installs, upgrades and uninstalls are clean • Components are isolated — potentially conflicting components (different library versions, JVM versions, etc.) can co-exist • Even different userland layouts (Ubuntu, Debian, CentOS, etc) 7© 2015. All Rights Reserved.
  • 8.
    • We containerise everything: C*, internal services, node management and monitoring apps
    • Single, well-understood image build and deploy process: docker build & docker push
    • Executed via Makefiles, one Make target per image; make push-all builds and pushes everything
    • Helps that all our internal apps are Java-based too
  • 9.
    • Docker gives us immutable images for our components without instance replacement
    • CoreOS handles the rest (OS-level) via in-place updates
    • Docker is provider agnostic
    • CoreOS runs on all major cloud providers and bare-metal
    • The result ☞ Instaclustr-managed C* can run anywhere
  • 10. systemd
    • CoreOS uses systemd for service management
    • systemd supports inter-service dependencies
    • e.g. cassandra-backups.service “wants” cassandra.service
    • i.e. starting cassandra-backups pulls in cassandra too (Wants is the weak form: the backup unit still starts if cassandra fails)
    • systemd can automatically restart services
    • Instaclustr services are fail-fast
    • Cassandra is not always fail-fast; would a watchdog help?
  • 11. systemd cont’d
    • Manages units of different types: service, timer, target, etc.
    • service units manage processes
    • timers start services on a schedule (à la cron)
    • targets are for grouping/sync points
    • cassandra.target “wants” cassandra.service, monitoring.service, datastax-agent.service, backups.timer, etc.
    • All units can define dependencies and conflicts
    • Dependencies of different “strengths”: Wants vs. Requires
    • In both directions: Requires and RequiredBy
  • 12. Basic Integration
    • Cassandra runs as PID 1 in the container
    • 1 primary process per container model
    • Runs in foreground mode (-f)
    • Responds to SIGTERM via docker stop, systemctl stop, etc.
    • Cassandra data and configuration persist on the host
    • Survives container restart
    • Cassandra data and configuration directories are mounted from the host:
      docker run -v /var/lib/instaclustr/etc/cassandra:/etc/cassandra …
  • 13. Basic Integration cont’d
    • Docker containers managed via systemd
    • cassandra.service execs docker run cassandra …
    • systemctl [start|stop|restart|status|…] cassandra
    • Cassandra logging configured to write only to stdout
    • systemd logging best practice
    • Cassandra ⇢ Docker ⇢ systemd ⇢ journald
    • journalctl -u cassandra
  • 14. Basic Integration: Issues
    • systemd starts dependent units when state is active
    • process running = service active, unless configured otherwise
    • ∴ dependent units start immediately
    • process can hang but service stays active
  • 15. Cassandra Startup
    • JVM starts quickly
    • JMX (nodetool) connectivity is available early
    • Objects are exposed where they are constructed
    • CQL/Thrift available late
    • Can be toggled via cassandra.yaml or JMX/nodetool
    • When is Cassandra “running”?
    • When does cassandra.service transition from activating to active?
    • When do dependent services start?
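One blunt way to answer “when is Cassandra running?” from outside the JVM is to poll the CQL port until it accepts TCP connections. The sketch below assumes Cassandra's default native transport port, 9042; the class and method names are hypothetical. It is a polling alternative shown for illustration only, not the approach the deck goes on to describe.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class CqlReadiness {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean isAccepting(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable or timed out
        }
    }

    /** Polls until the port accepts connections or the attempts run out. */
    public static boolean waitFor(String host, int port, int attempts, long delayMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (isAccepting(host, port, 1000)) {
                return true;
            }
            Thread.sleep(delayMillis);
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // 9042 is Cassandra's default native transport (CQL) port
        boolean up = waitFor("127.0.0.1", 9042, 3, 500);
        System.out.println(up ? "CQL port is accepting connections" : "CQL port not available");
    }
}
```

An open port only shows that the listener is bound. It says nothing about whether queries succeed, and polling adds delay and load, which is part of why tracking the server life-cycle from inside the JVM is attractive.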
  • 16. D-Bus
    • RPC between processes
    • Notifications
    • Socket-based (typically UNIX sockets, but can be TCP)
    • Accessible inside a container by mounting the socket:
      docker run -v /run/dbus:/run/dbus -v /run/systemd:/run/systemd …
    • Multiple language bindings, including Java
  • 17. D-Bus cont’d
    • systemd is controllable via D-Bus
    • Control the host's systemd from inside a Docker container
    • No need to fork/exec to run systemctl and co. (in fact, systemctl is a wrapper around D-Bus calls)
  • 18. D-Bus cont’d
    • Java bindings: dbus-java
    • systemctl restart cassandra ≝
      systemdManager.RestartUnit("cassandra.service", "replace");
  • 19. Enhanced Integration
    • Service status = “active”: process running, or something more?
    • Cassandra java process running vs. C* accepting CQL connections
    • CQL clients are dependent services, but shouldn’t start until CQL is available
    • Clients could fail-fast on no connectivity
    • They will be automatically restarted
    • The service will oscillate between active and failed, which makes actual failures hard to detect
    • systemd will eventually time out or give up (configurable)
    • JVM startup can be expensive: CPU usage spikes
  • 20. Enhanced Integration cont’d
    • systemd targets for CQL & Thrift: cassandra-cql.target
    • Life-cycle tracks the internal C* service
    • i.e., starts when CQL is available, not immediately
    • nodetool disablebinary implies systemctl stop cassandra-cql.target
    • Services that require CQL connectivity use WantedBy=cassandra-cql.target
    • Starting cassandra-cql.target starts these services too
    • Inverse of Wants
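To make “inverse of Wants” concrete: once a unit declaring WantedBy=cassandra-cql.target is enabled, systemd treats it as if the target itself had declared Wants= on that unit. The toy model below illustrates only that relationship. It is not systemd's actual algorithm (no ordering, Requires or failure handling), and the class name and my-cql-client.service are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class WantsModel {
    // unit name -> units it Wants
    private final Map<String, Set<String>> wants = new HashMap<>();

    /** Records "unit Wants wanted". */
    public void addWants(String unit, String wanted) {
        wants.computeIfAbsent(unit, k -> new LinkedHashSet<>()).add(wanted);
    }

    /** WantedBy=target declared on unit is stored as "target Wants unit". */
    public void addWantedBy(String unit, String target) {
        addWants(target, unit);
    }

    /** Every unit pulled in by starting the given unit (transitive closure over Wants). */
    public Set<String> startSet(String unit) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(unit);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (seen.add(current)) {
                queue.addAll(wants.getOrDefault(current, Collections.emptySet()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        WantsModel model = new WantsModel();
        model.addWantedBy("my-cql-client.service", "cassandra-cql.target");
        // Starting the target pulls in the client; starting the client does not pull in the target.
        System.out.println(model.startSet("cassandra-cql.target"));
        System.out.println(model.startSet("my-cql-client.service"));
    }
}
```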
  • 21. Enhanced Integration cont’d
    • Java Agent side-loaded into the Cassandra JVM
    • Hooks into the CQL/Thrift service life-cycle
    • Implemented using runtime byte-code modification
    • Controls systemd via D-Bus to start/stop the associated target units
    • But Cassandra is open source, so why not modify the source‽
    • Because an agent works with both DSE & Apache Cassandra
  • 22. Java Agent
    • Java Agents (java.lang.instrument)
    • java -javaagent:instaclustr-agent.jar …
    • premain(…) method called at JVM startup
    • can hook into JVM class-loading, transform byte-code, etc.
    • Javassist, ASM: byte-code modification libraries
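A minimal agent skeleton, using only java.lang.instrument, shows the mechanics before any byte-code library is involved. This sketch only logs when the class of interest is loaded and leaves it unchanged. The class name AgentSketch is hypothetical; the target class name is the one the deck hooks later.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class AgentSketch {
    // Internal (slash-separated) name of the class to watch for
    static final String TARGET = "org/apache/cassandra/transport/Server";

    /** True when the class being loaded is the one we care about. */
    public static boolean isTarget(String className) {
        return TARGET.equals(className); // className may be null for some generated classes
    }

    /** Called by the JVM before main() when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                if (isTarget(className)) {
                    System.err.println("agent: loading " + className
                            + " (" + classfileBuffer.length + " bytes)");
                }
                return null; // null tells the JVM to keep the original byte-code
            }
        });
    }
}
```

For the JVM to find premain, the agent JAR's manifest needs a Premain-Class attribute naming this class (here, Premain-Class: AgentSketch).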
  • 23. Hooks
    public interface Server {
        public void start();
        public void stop();
        ⋮
    }

    // in CassandraDaemon:

    // Thrift
    thriftServer = new ThriftServer(rpcAddr, rpcPort, listenBacklog);
    ⋮
    thriftServer.start();
    ⋮
    thriftServer.stop();

    // CQL
    nativeServer = new org.apache.cassandra.transport.Server(nativeAddr, nativePort);
    ⋮
    nativeServer.start();
    ⋮
    nativeServer.stop();
  • 24. Hooks
    package com.instaclustr;

    import java.lang.instrument.Instrumentation;
    import javassist.ClassPool;
    import javassist.CtClass;
    import javassist.CtMethod;
    import org.apache.cassandra.service.CassandraDaemon;

    public class Agent {
        // Called by the JVM before main() when started with -javaagent.
        // The lambda form compiles on Java 8, where ClassFileTransformer has a single abstract method.
        public static void premain(String agentArgs, Instrumentation inst) {
            inst.addTransformer((loader, className, classBeingRedefined, protectionDomain, classfileBuffer) -> {
                // className uses the internal, slash-separated form
                if (!"org/apache/cassandra/transport/Server".equals(className))
                    return null; // null leaves the class unchanged

                final ClassPool pool = ClassPool.getDefault();
                try {
                    final CtClass ctClass = pool.get("org.apache.cassandra.transport.Server");

                    // patch start() and stop() methods of the Server class
                    {
                        final CtMethod method = ctClass.getDeclaredMethod("start");
                        // $0 is Javassist's name for `this`
                        method.insertAfter("com.instaclustr.Agent.serverStarted($0);");
                    }
                    {
                        final CtMethod method = ctClass.getDeclaredMethod("stop");
                        method.insertAfter("com.instaclustr.Agent.serverStopped($0);");
                    }

                    byte[] byteCode = ctClass.toBytecode();
                    ctClass.detach();

                    return byteCode; // return the modified byte-code

                } catch (final Exception e) {…}

                return null;
            });
        }

        // called when Server started: call systemd via dbus-java to start cassandra-cql.target
        public static void serverStarted(final CassandraDaemon.Server server) {…}

        // called when Server stopped: call systemd via dbus-java to stop cassandra-cql.target
        public static void serverStopped(final CassandraDaemon.Server server) {…}
    }
  • 25. Docker Limitations and Sore Spots
    • docker run is just a TTY proxy: the actual container process lives under the docker dæmon process/cgroup
    • systemd requires startup & watchdog notifications to originate from the started process, a child, or a process in the same cgroup
    • docker crash = all containers go bye-bye
    • everything docker does, including image downloads & builds, runs as root in the dæmon!
    • processes inside containers are run un-elevated
  • 26. Future
    • The development version of systemd can now launch Docker containers natively via machinectl
    • Tighter integration with systemd
    • Process hierarchy is correct: right cgroup and parents
    • Java Agent can notify systemd for startup, status & watchdog via JNA + libsystemd