In Cassandra Lunch #58, Rahul Singh will be leading a presentation covering a Cassandra topic we're sure you won't want to miss.
Accompanying Blog: Coming Soon!
Accompanying YouTube: https://youtu.be/l4mAt3MDVZA
Join Cassandra Lunch Weekly at 12 PM EST Every Wednesday: https://www.meetup.com/Cassandra-DataStax-DC/events/
Cassandra.Link:
https://cassandra.link/
Follow Us and Reach Us At:
Anant:
https://www.anant.us/
Awesome Cassandra:
https://github.com/Anant/awesome-cassandra
Cassandra.Lunch:
https://github.com/Anant/Cassandra.Lunch
Email:
solutions@anant.us
LinkedIn:
https://www.linkedin.com/company/anant/
Twitter:
https://twitter.com/anantcorp
Eventbrite:
https://www.eventbrite.com/o/anant-1072927283
Facebook:
https://www.facebook.com/AnantCorp/
Join The Anant Team:
https://www.careers.anant.us
2. Create and manage global data platforms.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037
3. ARCHITECT
noun: architect; chief builder
verb: architect; design or make (COMPUTING)
“We create and manage global platforms that run on Cassandra and related technologies.”
4. Things We Love: Scalable Fast Data
Without DataStax
With DataStax
7. MobaXterm / DBeaver / Hackolade - Day to Day
1. MobaXterm - Really good multi-exec for Windows (or SuperPuTTY).
2. DBeaver - You can rig the free one to work with Cassandra, but the paid one is awesome.
3. Hackolade - Really great tool for general NoSQL data modeling.
Takeaways
1. Do you really want to keep connection info for 100 nodes in PuTTY?
2. Sometimes you just want to help someone out with a simple query without having to jump into cqlsh.
3. Good design tools like Hackolade also produce good docs.
12. DSBulk / Spark - Data Operations
1. DSBulk - A not-too-shabby way to get data in / out in CSV or JSON format.
2. Spark Migrator - Built by Scylla, but it works just as well for Cassandra-to-Cassandra migrations.
3. Spark Shell - The way to run updates and deletes on thousands of partitions interactively.
Takeaways
1. DSBulk is free and works with Cassandra.
2. Spark Migrator is free. Just try it out.
3. You don’t need DataStax to have Spark with Cassandra, but it makes it easier.
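As a rough illustration of the DSBulk item above, the following Python sketch assembles a `dsbulk unload` command line for exporting a table as CSV or JSON. The keyspace, table, host, and output path are placeholder assumptions, and the sketch only builds and prints the argument list; it does not run DSBulk.

```python
import shlex


def build_dsbulk_command(operation, keyspace, table, url,
                         host="127.0.0.1", connector="csv"):
    """Build an argument list for a DSBulk load or unload run.

    Uses DSBulk's short options: -h (contact point), -k (keyspace),
    -t (table), -c (connector name, csv or json), -url (file or directory).
    """
    if operation not in ("load", "unload"):
        raise ValueError("operation must be 'load' or 'unload'")
    if connector not in ("csv", "json"):
        raise ValueError("connector must be 'csv' or 'json'")
    return ["dsbulk", operation, "-h", host, "-k", keyspace,
            "-t", table, "-c", connector, "-url", url]


# Hypothetical keyspace, table, and output directory, for illustration only.
command = build_dsbulk_command("unload", "demo_ks", "users",
                               "/tmp/users_export", connector="json")
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run` would execute it on a machine where DSBulk is installed; building the list separately keeps the command easy to review first.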
14. Airflow / Jenkins - Data Operations / Scheduling
1. Airflow - Airflow has become the de facto tool for managing data pipelines / operations.
2. Jenkins - You can use this for data operations if you really want to and you already have it.
Takeaways
1. Airflow helps you run all the jobs in your data pipeline processes and manages their dependencies in a DAG.
2. Jenkins can also do this, but it can’t handle complex pipelines.
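To make "manages dependencies in a DAG" concrete, here is a minimal sketch using only Python's standard library. It is not the Airflow API; the task names are hypothetical, and it only shows how a dependency graph determines a valid run order.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "export_from_cassandra": set(),
    "transform_rows": {"export_from_cassandra"},
    "load_to_warehouse": {"transform_rows"},
    "refresh_dashboard": {"load_to_warehouse"},
    "archive_export": {"export_from_cassandra"},
}


def run_order(dag):
    """Return one valid execution order in which dependencies run first."""
    return list(TopologicalSorter(dag).static_order())


order = run_order(pipeline)
print(order)
```

A scheduler such as Airflow adds retries, scheduling, and parallel execution of independent tasks on top of this ordering idea.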
17. Reaper / Medusa - Automatic Repair / Backup
1. Cassandra Reaper - GUI for managing repairs for Cassandra.
2. Cassandra Medusa - Backup / restore for nodes and clusters (to/from S3, GCS, etc.).
Takeaways
1. Running your own repairs from cron is not sustainable.
2. Running your own backups from cron is not sustainable.
3. Don’t reinvent the wheel.
21. Cassandra.Vision - Offline Diagnostics
1. Diagnostic-collection - Grab all your logs, configs, etc. for analysis.
2. Cassandra.Vision/CassandraAnalyzer - Visualize diagnostics / logs offline.
3. Cassandra.Vision/TableAnalyzer - Visualize data / traffic / tombstone skew across all tables / keyspaces.
Takeaways
1. If you don’t have ELK online, you can still use it on your desktop with offline tools.
2. Analyzing the tablestats visually in one place can help you avert disaster later.
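The idea of looking at tablestats in one place can be sketched in a few lines of Python. This is not how Cassandra.Vision or TableAnalyzer is implemented; it is a minimal illustration that parses a shortened, made-up excerpt in the shape of `nodetool tablestats` output and reports each table's share of live data, which is one simple way to spot skew.

```python
# Shortened, hypothetical excerpt in the shape of `nodetool tablestats` output.
SAMPLE_TABLESTATS = """\
Keyspace : demo_ks
\tRead Count: 120
\t\tTable: users
\t\tSpace used (live): 1048576
\t\tAverage tombstones per slice (last five minutes): 1.0
\t\tTable: events
\t\tSpace used (live): 9437184
\t\tAverage tombstones per slice (last five minutes): 42.5
"""

# Metrics to keep, mapped to shorter names.
WANTED = {
    "Space used (live)": "live_bytes",
    "Average tombstones per slice (last five minutes)": "avg_tombstones",
}


def parse_tablestats(text):
    """Return {(keyspace, table): {metric: value}} for the wanted metrics."""
    stats = {}
    keyspace = table = None
    for raw in text.splitlines():
        if ":" not in raw:
            continue
        key, value = (part.strip() for part in raw.split(":", 1))
        if key == "Keyspace":
            keyspace, table = value, None
        elif key == "Table":
            table = value
            stats[(keyspace, table)] = {}
        elif table is not None and key in WANTED:
            stats[(keyspace, table)][WANTED[key]] = float(value)
    return stats


def share_of_total(stats, metric):
    """Return each table's fraction of the cluster-wide total for a metric."""
    total = sum(s.get(metric, 0.0) for s in stats.values())
    return {name: (s.get(metric, 0.0) / total if total else 0.0)
            for name, s in stats.items()}


stats = parse_tablestats(SAMPLE_TABLESTATS)
for name, share in share_of_total(stats, "live_bytes").items():
    print(name, f"{share:.0%}")
```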
25. MCAC / Prometheus + Grafana - Metrics
1. Metrics Collector for Apache Cassandra (MCAC) - All in one package.
2. Cortex - Prometheus at scale on Cassandra.
3. Prometheus / Grafana - The O.G. of time series system data visualization.
Takeaways
1. Don’t have OpsCenter? Get MCAC.
2. Need to keep data for thousands of nodes? Look into Cortex.
3. Prometheus / Grafana work with everything else, so you can’t go wrong.
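Because Prometheus exposes an HTTP query API, metrics collected this way can be pulled into other tools with very little code. The sketch below only builds the URL for an instant query against the `/api/v1/query` endpoint; the server address and the `job="cassandra"` label are assumptions for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build a Prometheus instant-query URL for the given PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# `up` is a standard Prometheus metric; the job label value is hypothetical.
url = instant_query_url("http://localhost:9090", 'up{job="cassandra"}')
print(url)
```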
28. Filebeat / Elastic / Kibana - “Free” Log Analytics
1. Filebeat (or Logstash) - Parse and dissect logs before sending them to Elasticsearch.
2. Elasticsearch / Open Distro - Self-explanatory.
3. Kibana - Search your logs in one place, visually.
Takeaways
1. Tailing logs in MobaXterm works for up to 6, maybe 8 computers.
2. Log aggregation with intelligent parsing helps find patterns faster.
3. Having dashboards set up beforehand makes it even easier.
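To show what "parse and dissect" means in practice, here is a small Python sketch that splits one line of a Cassandra `system.log` into fields. It is not a Filebeat or Logstash configuration; it assumes the default Cassandra log layout (level, thread, timestamp, source file and line, message), and the sample line is made up.

```python
import re
from datetime import datetime

# Assumes the default layout: LEVEL [thread] timestamp File.java:line - message
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+"
    r"\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+):(?P<line>\d+)\s+-\s+"
    r"(?P<message>.*)$"
)


def dissect(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y-%m-%d %H:%M:%S,%f")
    fields["line"] = int(fields["line"])
    return fields


# Hypothetical log line, for illustration only.
sample = ("WARN  [ReadStage-2] 2021-07-28 12:00:05,123 ReadCommand.java:569 - "
          "Read 5000 live rows and 12000 tombstone cells")
fields = dissect(sample)
print(fields["level"], fields["thread"], fields["source"], fields["message"])
```

Once every line is split into fields like these, filtering by level or counting warnings per source file becomes a simple query rather than a manual search.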
31. Terraform - Infrastructure as Code
1. Terraform - Manage different clouds with one language.
2. Atlantis - Manage Terraform with GitHub.
3. Terragrunt - Makes your Terraform code simpler for different environments.
Takeaways
1. Terraform is the best way to manage infrastructure as code.
2. It helps operators create and destroy VM clusters / configurations.
3. Scaling clusters up and down is easy with Terraform.
34. Ansible - Configuration Management
1. Ansible - Better organizes the commands that need to be run: setup, configuration, and ad-hoc commands.
2. Ansible Semaphore - Open source GUI for Ansible.
3. AWX - Open source version of Ansible Tower.
Takeaways
1. Manage configuration consistently across nodes / datacenters.
2. Manage environments more easily with variables / templates.
3. Run rolling commands on a cluster, data center, or multiple clusters.
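The "rolling commands" takeaway boils down to touching only a few nodes at a time so the cluster stays available; in Ansible this is what the `serial` play keyword controls. The Python sketch below illustrates just the batching logic with a hypothetical inventory; it does not call Ansible or run anything on the nodes.

```python
def rolling_batches(hosts, batch_size=1):
    """Split hosts into consecutive batches, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]


# Hypothetical inventory grouped by datacenter.
inventory = {
    "dc1": ["dc1-node1", "dc1-node2", "dc1-node3"],
    "dc2": ["dc2-node1", "dc2-node2"],
}

# Finish one datacenter before starting the next, one node at a time.
plan = []
for datacenter, hosts in inventory.items():
    for batch in rolling_batches(hosts, batch_size=1):
        plan.append((datacenter, batch))

for datacenter, batch in plan:
    print(f"{datacenter}: restart {', '.join(batch)} and wait for it to rejoin")
```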
39. Docker / Kubernetes / K8ssandra / Stargate
1. Docker - Customize your image.
2. Kubernetes - Run your cluster.
3. K8ssandra - Run your cluster, easier.
4. Stargate - C* API layer on day one.
Takeaways
1. Containers are the future; play with them now.
2. Containers make it easier for people to test things out without as heavy a hardware investment.
3. Developers want APIs when possible.
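As a sketch of what an API layer gives developers, the snippet below builds a request for reading one row through Stargate's REST API instead of writing CQL. It assumes the v2 REST path layout and the `X-Cassandra-Token` header; the host, port, keyspace, table, key, and token are placeholders, and no request is actually sent.

```python
from urllib.parse import quote


def stargate_row_request(base_url, keyspace, table, primary_key, token):
    """Return (url, headers) for fetching one row by primary key over REST."""
    path = "/".join(quote(str(part), safe="") for part in (keyspace, table, primary_key))
    url = f"{base_url.rstrip('/')}/v2/keyspaces/{path}"
    headers = {"X-Cassandra-Token": token, "Accept": "application/json"}
    return url, headers


# All values below are hypothetical placeholders.
url, headers = stargate_row_request(
    "http://localhost:8082", "demo_ks", "users", 42, "example-token"
)
print(url)
```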