In this webinar experts from DataStax - the lead developer of Cassandra - and from MayaData - the lead developer of OpenEBS and LitmusChaos - will discuss and demonstrate ways to ensure the ease of use and resilience of Cassandra on Kubernetes.
Topics to be discussed and demonstrated include:
Provisioning underlying storage - how to make it consistent irrespective of the underlying hardware or cloud? Are there ever reasons to have the storage replicate across nodes, or is dynamic LocalPV the best choice in all cases?
Cass Operator - DataStax Kubernetes Operator for Apache Cassandra
Resilience - how to proactively assess the overall environment including the underlying Kubernetes with the help of Litmus
MayaData DataStax webinar - Operating Cassandra on Kubernetes with the help of OpenEBS & LitmusChaos
1. MayaData
The Data Agility company
• Operating Cassandra on K8s
with the help of OpenEBS & LitmusChaos
September 24, 2020
2. Your Presenters
Follow #OpenEBS channel on Kubernetes Slack
https://kubernetes.slack.com/messages/openebs/
@muratkarslioglu @usatamilan @ThatMightBePaul
Saravanan Chinnachamy, Solutions Architect, DataStax
Paul Burt, Director of Community and Marketing, MayaData Inc
Murat Karslioglu, VP of Products, MayaData Inc
murat paulb
3. Who is MayaData? Popularity of OpenEBS?
● Code is marketing
● Contributing in CNCF ecosystem
● 19 CKAs & growing
● 4X yr/yr growth in container pulls
● #1 CNS in trial per CNCF survey
● Rapidly becoming the de facto standard for stateful workloads on Kubernetes
3 | Container Attached Storage
4. Data On Kubernetes Community (DOKC)
DOKC is an openly governed and self-organizing group of
curious and experienced operators and engineers concerned
with running data-intensive workloads on Kubernetes.
This week on DOKC:
Data on Kubernetes and container
attached storage - an update
PLEASE REGISTER at https://go.dok.community/register
Demetrios Brinkmann
https://dok.community/
5. Agenda
● Kubernetes for stateful workloads
● Container storage best practices
● Cass Operator - Introduction
● Build trust with LitmusChaos
10. K8s for Stateful : The primitives
● Native interfaces for connecting workloads
(Pods) to Persistent Volumes (PVs).
● Dynamic provisioning of PV via Persistent
Volume Claim (PVC) and Storage Class (SC).
● More abstraction through community
efforts around Persistent Volumes (PV) and
Persistent Volume Claims (PVC) and
Container Storage Interface (CSI)
● CSI to handle vendor specific needs and
avoid wildfire of “volume plugins” or
“drivers” in K8s main repo
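The primitives above can be sketched as a minimal set of manifests: a StorageClass names a provisioner, a PVC requests storage from that class, and a Pod mounts the claim. The class name `example-fast`, the provisioner `csi.example.com`, the claim name, and the Pod are placeholders invented for illustration, not names from the presentation.

```yaml
# Hypothetical StorageClass: name and provisioner are placeholders;
# substitute the CSI driver actually installed in your cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-fast
provisioner: csi.example.com
# Delay binding until a Pod is scheduled, so the volume is created
# where the workload actually lands.
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
---
# The claim triggers dynamic provisioning of a PV from the class above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  storageClassName: example-fast
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# The Pod consumes the PV only through the claim name.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-claim
```

The workload never references the storage backend directly, which is what lets the same manifests run on different hardware or clouds by swapping only the StorageClass.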
[Diagram: In-tree Volumes, Flex Volumes, CSI, PV, Volume, PVC, SC, External Provisioner, Topology Aware, Pools, Data Source]
11. K8s for Stateful: Can’t I just?
[Diagram: Cassandra Cluster, Redis, Micro service 1, Micro service 2, UI, REST API, CACHE service, db n, CSI]
Of course you can. And you do.
However you lose so many benefits of moving to Kubernetes.
Most workloads just use Direct Attached Storage instead.
12. K8s for Stateful: What is inside that SAN?
[Diagram: UI, Middleware, DB; Cassandra Cluster, Redis, Micro service 1, Micro service 2, UI, REST API, CACHE service, db n]
● A shared storage system is a complex monolithic distributed system built before Kubernetes
● These systems have DBs for metadata
● They have provisioning systems
● They have retry & other logic
● They take all the IO, mix it together, and do their best
● Designed when storage media was slow and apps were NOT resilient
13. K8s for Stateful : CNCF Survey
Container Attached Storage most evaluated
14. Conway’s Law for Data Management
● loosely coupled teams
● loosely coupled applications
● loosely coupled data
16. Summary - challenges around K8s storage
1. Conway’s Law: per workload, per team vs shared everything
2. Costly lock-in
3. Process mismatch & 100x more dynamism: traditional processes vs automated Kube-Ops instructions
4. External know-how required for traditional storage
5. Under-utilized K8s investment!
[Diagram labels: SRE, Container native, Data gravity, Storage sys. instructions, External storage]
18. CAS : Apps have changed ...
● Meta languages, Go, Rust,...
● Apps are often distributed systems themselves
○ Is a distributed storage system still needed?
● Designed to fail and expected to fail
○ Across racks, DCs, regions and providers, physical or virtual
● Scalability batteries included
○ HAProxy, Envoy, Nginx, Auto scaling
● Loosely Coupled. Agility:
○ releasing frequently - always changing
19. CAS : Built for Cloud Native IO demands ...
● Datasets of individual containers are relatively small in terms of IO and size
○ Prefer having a collection of small stars over a big sun?
○ Multiple smaller databases rather than large databases (Conway’s Law)
● Hardware Trends are changing
○ Built using low latency technologies
○ NVMe, DPDK/SPDK
20. CAS: NVMe => Less is More
● NVMe is a protocol that dictates how bits are moved between the CPU and a device, but also between devices
○ Its origin can be found in InfiniBand, used in HPC for many years (1999)
● NVMe over Fabrics extends the protocol over TCP, RDMA, FC, virtio
● A complete replacement of the SCSI protocol, which goes back all the way to 1978
[Diagram: App-to-device I/O stacks compared: block layer, SCSI and SAS versus NVMe with kernel bypass]
21. CAS: Impacts of HW changes on the stack
● Packets come in at a very high rate; with a single CPU at 100%, how to scale?
○ CPU has ~67ns per packet @3GHz
● Solution: spread across multiple cores, which requires locking
○ Locks are expensive, and locks are in memory, which is 40-70ns away?
● Amdahl's law starts to dominate the performance envelope
● Context switches and system calls have gotten far more expensive post Spectre/Meltdown
● What we seem to need are lockless queues that scale per core
○ Poll mode drivers
● Partial rewrites are inevitable, the rewards are high
○ VPP, Open vSwitch
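The ~67ns figure matches the per-packet budget of a saturated 10 Gbit/s Ethernet link carrying minimum-size frames. The slide does not state the link speed or frame size, so 10GbE and 64-byte frames (plus 8 bytes of preamble and 12 bytes of inter-frame gap on the wire) are assumptions in this worked calculation:

```latex
\begin{align*}
\text{bits on the wire per frame} &= (64 + 8 + 12)\ \text{bytes} \times 8 = 672\ \text{bits} \\
\text{packet rate} &= \frac{10 \times 10^{9}\ \text{bit/s}}{672\ \text{bit}} \approx 14.88 \times 10^{6}\ \text{packets/s} \\
\text{time per packet} &= \frac{1}{14.88 \times 10^{6}\ \text{s}^{-1}} \approx 67.2\ \text{ns} \\
\text{cycles per packet at 3 GHz} &= 67.2\ \text{ns} \times 3\ \text{cycles/ns} \approx 201\ \text{cycles}
\end{align*}
```

Under these assumptions, a single trip to main memory for a contended lock can consume most of the roughly 200-cycle budget, which is the motivation for per-core lockless queues and poll mode drivers.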
22. CAS : Pattern - Distributed Apps
[Diagram: App replicas (R1), (R2) and (R3) each use their own claim: PVC 1 on Local PV 1, PVC 2 on Local PV 2, PVC 3 on Mayastor. Backends shown: Hostpath, Block Device, Local ZFS / LVM, Mayastor]
23. Connect a Stateful App to OpenEBS LocalPV
[Diagram: Application Namespace, Internet, Physical Hard disks, Node 1, Node 2, Node 3, ZFS Pool]
● Create LocalPV StorageClass (WaitForFirstConsumer)
● LocalPV HostPath
● LocalPV Device - XFS or EXT: NDM knows if disk is in use
● ZFS LocalPV - creates volume in user defined ZFS pool
● Persistent Volume for Application
● STS with Node Selectors
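As a minimal sketch of the "Create LocalPV StorageClass" step, the following manifest follows the annotation style of the OpenEBS Dynamic LocalPV hostpath provisioner. The class name and the `BasePath` value are assumptions chosen for illustration; check them against the OpenEBS version you run.

```yaml
# Sketch of an OpenEBS LocalPV hostpath StorageClass.
# The name and BasePath are illustrative assumptions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-hostpath
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: "hostpath"
      - name: BasePath
        value: "/var/openebs/local"
provisioner: openebs.io/local
# WaitForFirstConsumer: the volume is created on the node where the
# Pod is scheduled, which is what keeps a local PV local.
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```

A StatefulSet then references the class through `storageClassName` in its `volumeClaimTemplates`, so each replica gets its own node-local volume.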
25. Connect a Stateful App to OpenEBS cStor Storage
[Diagram: Application Namespace, OpenEBS Namespace, Internet, Physical Hard Disks]
● Create a cStor StorageClass
● Stateful Application Running Inside Pod in Kubernetes
● Persistent Volume for Application
● iSCSI Target (PV)
● cStor Replica Pods
● cStor Replica Deployments (nodeSelectors)
● cStor Replica Pod with 8 Disks
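For comparison with LocalPV, here is a hedged sketch of the "Create a cStor StorageClass" step, written in the annotation style of the cStor external (iSCSI) provisioner. The class name and the pool claim name `cstor-disk-pool` are hypothetical, and the cStor CSI driver uses a different provisioner and parameters, so treat this as illustrative only.

```yaml
# Sketch of a cStor StorageClass (external provisioner style).
# "cstor-disk-pool" is a hypothetical StoragePoolClaim name; it must
# match a pool that already exists in the cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-cstor-sc
  annotations:
    openebs.io/cas-type: cstor
    cas.openebs.io/config: |
      - name: StoragePoolClaim
        value: "cstor-disk-pool"
      - name: ReplicaCount
        value: "3"
provisioner: openebs.io/provisioner-iscsi
```

With a replica count of 3 the volume is replicated across three cStor pool pods. For Cassandra that stacks on top of the database's own replication, which is the trade-off against LocalPV raised in the webinar description.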
26. CAS - using K8S as a data layer
[Diagram: SQL, NoSQL DB, KV DB, Micro service 1, Micro service 2, UI, REST API, CACHE service, db n]
Every workload & team its own system
Different engines for different workloads
Built on Kubernetes for Kubernetes
Delivers the benefits of Kubernetes for data
- No lock-in
- Open source
- Runs consistently everywhere
- Any underlying cloud or disk or SAN
AND the right architecture for NVMe
27. Summary - Unleash the power of K8s with CAS
Built on Kubernetes for Kubernetes
● Avoid lock-in: Compute, Network, Data
● Data portability and no gravity issues
● Based on OpenEBS, Open Source, 5th largest CNCF contributor
● 50-70% reduced storage TCO
● Native Hyperconvergence: Truly Cloud Native Storage for Kubernetes
● High Availability - limited blast radius
● SRE focus on K8s, and K8s only
● Cross Cloud visibility
● Multi-platform abstraction
● Container attached storage using disk within the K8s cluster.
● Workload-down storage - ideal for small micro ‘2 pizza’ teams
[Diagram label: Storage.instructions]
29. Cassandra - Today
● Identify and Prepare Hardware
● Firewall and Ports
● Install DSE/Cassandra
● Install DataStax Agent
● Edit Config Files
● Start Seed Nodes
● Command Line or JMX
● OpsCenter
30. Cass Operator – Codify best practices
● Embed best practices into automated orchestration
● Proper token ring initialization, with only one node bootstrapping at
a time
● Seed node management -
○ one per rack, or three per datacenter, whichever is more
● Server configuration integrated into the CassandraDatacenter CRD
○ Rolling reboot nodes by changing the CRD
○ Store data in a rack-safe way - one replica per cloud AZ
○ Scale up racks evenly with new nodes
○ Replace dead/unrecoverable nodes
● Multi DC clusters (limited to one Kubernetes namespace)
31. Cass Operator - Synergistic & Turnkey
Now
● DSE 6.8.X and OSS 3.11.6/7 (TP)
● OSS Cassandra (TP) & DSE (GA)
● GKE, EKS, AKS, PKS, and k8s 1.13+ supported as certified platforms
● Powers Astra - SaaS Offering
● Monitoring with Prometheus and Grafana
Future
● More certified platforms (OpenShift coming soon)
● Better multi-region
● OSS Cassandra GA release
○ Backups
○ Repair
○ Monitoring
32. CassandraDatacenter Custom Resource
● Kubernetes Custom Resource
● Describes a logical C* Datacenter
○ Size
○ Racks (and Availability Zones)
○ Type: OSS or DSE
○ C* version
○ Storage information
○ Configuration overrides
● Submitted to the k8s cluster with kubectl
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: cluster1
  serverType: cassandra
  serverVersion: "3.11.6"
  size: 3
  config:
    cassandra-yaml:
      num_tokens: 12
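The slide's example omits the "Racks (and Availability Zones)" and "Storage information" items from the list above. A fuller sketch, assuming the `cassandra.datastax.com/v1beta1` API group of cass-operator 1.x, might look like this. The StorageClass name, volume size, and zone names are assumptions for illustration; verify the API group and field names against the CRD installed in your cluster.

```yaml
# Sketch of a CassandraDatacenter with racks and storage filled in.
# Zone names, the StorageClass name and the volume size are assumptions.
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: cluster1
  serverType: cassandra
  serverVersion: "3.11.6"
  size: 3
  # One rack per availability zone, so replicas are stored rack-safe.
  racks:
    - name: rack1
      zone: us-central1-a
    - name: rack2
      zone: us-central1-b
    - name: rack3
      zone: us-central1-c
  # Each Cassandra pod gets its own PVC from this claim spec.
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: openebs-hostpath
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  config:
    cassandra-yaml:
      num_tokens: 12
```

Scaling the datacenter is then a matter of changing `size` and re-applying the resource; the operator is expected to add nodes evenly across the listed racks.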
33. Data Center – CRD Resource
cass-operator
WebHook Security
Operator
dc1 cluster1-dc1-default-sts
cluster1-dc1-default-sts-0
cluster1-dc1-default-sts-1
cluster1-dc1-default-sts-2
server-storage
cluster1-superuser
cluster1-dc1-all-pods-service
cluster1-dc1-service
cluster1-dc1-seed-service
34. Cassandra Container
Kubernetes Worker Node
StatefulSet (Rack)
Pod (C* Node)
C* with Management API sidecar
Config Builder
Busybox (tail system log)
36. K8s Pod == C* Node
StatefulSet: a workload API object used to manage stateful applications. It manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.
41. Pod-Delete Experiment on Cassandra
Hypothesis
● Upon killing a replica of the Cassandra StatefulSet, the load will be redistributed over the Cassandra ring.
● After the deletion of the replica, a new replica will be created by the StatefulSet controller to maintain the desired count of available replicas.
# kubectl describe chaosresult cassandra-chaos-cassandra-pod-delete -n litmus
Name:         cassandra-chaos-cassandra-pod-delete
Namespace:    litmus
Labels:       chaosUID=22d5ba06-1fe8-4b01-bdbb-6246e1cdb2c9
              type=ChaosResult
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"litmuschaos.io/v1alpha1","kind":"ChaosResult","metadata":{"annotations":{},"labels":{"chaosUID":"22d5ba06-1fe8-4b01-bdbb-62...
API Version:  litmuschaos.io/v1alpha1
Kind:         ChaosResult
Metadata:
  Creation Timestamp:  2020-07-16T12:28:00Z
  Generation:          10
  Resource Version:    20451
  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/litmus/chaosresults/cassandra-chaos-cassandra-pod-delete
  UID:                 32ea5093-47ee-41d8-bc34-e513edb46660
Spec:
  Engine:      cassandra-chaos
  Experiment:  cassandra-pod-delete
Status:
  Experimentstatus:
    Fail Step:  N/A
    Phase:      Completed
    Verdict:    Pass
Events:         <none>
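A ChaosResult like the one above is produced by a ChaosEngine resource that binds an experiment to a target application. The slides do not show the engine, so the following is only a sketch. The engine name and namespace are taken from the output above. The application namespace, label selector, service account name, service name, and environment values are all assumptions to verify against the LitmusChaos experiment documentation.

```yaml
# Sketch of a ChaosEngine for the cassandra-pod-delete experiment.
# appns, applabel, chaosServiceAccount and all env values are
# assumptions; only the engine name, namespace and experiment name
# come from the ChaosResult shown in the slides.
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: cassandra-chaos
  namespace: litmus
spec:
  appinfo:
    appns: "default"
    applabel: "cassandra.datastax.com/cluster=cluster1"
    appkind: "statefulset"
  annotationCheck: "false"
  engineState: "active"
  chaosServiceAccount: cassandra-pod-delete-sa
  jobCleanUpPolicy: "delete"
  experiments:
    - name: cassandra-pod-delete
      spec:
        components:
          env:
            # Total chaos duration, in seconds
            - name: TOTAL_CHAOS_DURATION
              value: "15"
            # Interval between successive pod deletions, in seconds
            - name: CHAOS_INTERVAL
              value: "15"
            # Service used for the experiment's liveness checks
            - name: CASSANDRA_SVC_NAME
              value: "cluster1-dc1-service"
            - name: KEYSPACE_REPLICATION_FACTOR
              value: "3"
            - name: CASSANDRA_PORT
              value: "9042"
```

The result resource is named after the engine and the experiment (`cassandra-chaos` plus `cassandra-pod-delete`), which is why the `kubectl describe` command above targets `cassandra-chaos-cassandra-pod-delete`.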
42. Kubera = Multicloud Data Agility toolset
https://account.mayadata.io/login
● Off-cluster Log Storage
● Turn K8s into data layer
● Visualize and Monitor Kubernetes Topology
● Data Protection & DR
43. MayaData Kubera
Tiers: Open Source, Kubera BASIC, Kubera STANDARD, Kubera ENTERPRISE
Features shown across the tiers:
● Unlimited Team Management
● OpenEBS
● Litmus
● Many more CNCF projects ...
● Automated Deployment
● Provisioning, configuration templates
● Bug fix support
● Topology Views
● Long life support
● Application discovery
● CPU and I/O performance budgeting
● Unlimited Alerting
● Managed resilience SLAs
● 1yr log retention
● Performance monitoring
● Adaptive Backup and Restore
● Advanced Anomaly Detection
● Granular visualization
● Advanced compliance
● Adaptive Cost Management
● Capacity reporting
● Workload Backup
● AI Ops
● Pre-flight check
● Active Health Check
● ChatOps
● K8s Data Management
● Multi-vendor Data Management
● CSI Lifecycle Management
● Guided Benchmark
● Advanced Workload Back-up
● GitOps
● Managed Backup
● OnPrem Air Gapped
● AD Authentication
● Proactive DR
● Chaos Engineering
● Workload Lifecycle Management