Cassandra Operations at Netflix
Gregg Ulrich


                                  1
Agenda
• Who we are
• How much we use Cassandra
• How we do it
• What we learned




                              2
Who we are
• Cloud Database Engineering
    • Development – Cassandra and related tools
    • Architecture – data modeling and sizing
    • Operations – availability, performance and maintenance
• Operations
    • 24x7 on-call support for all Cassandra clusters
    • Cassandra operations tools
    • Proactive problem hunting
    • Routine and non-routine maintenance

                                                             3
How much we use Cassandra

30         Number of production clusters
12         Number of multi-region clusters
3          Max regions, one cluster
65         Total TB of data across all clusters
472        Number of Cassandra nodes
72/28      Largest Cassandra cluster (nodes/data in TB)
50k/250k   Max reads/writes per second on a single cluster
3*         Size of Operations team

                   * Open position for an additional engineer
                                                                4
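As a rough back-of-the-envelope check on the figures above, the sketch below derives the average data held per node. It assumes 1 TB = 1000 GB and that all 472 nodes belong to the 30 production clusters, neither of which the slide states; the variable names are illustrative only.

```python
# Figures copied from the slide; everything below is plain division.
total_tb = 65
total_nodes = 472
production_clusters = 30
largest_cluster_nodes = 72
largest_cluster_tb = 28

GB_PER_TB = 1000  # assumption: decimal terabytes

avg_gb_per_node = total_tb * GB_PER_TB / total_nodes
largest_gb_per_node = largest_cluster_tb * GB_PER_TB / largest_cluster_nodes
# assumption: every node is part of a production cluster
avg_nodes_per_cluster = total_nodes / production_clusters

print(f"Average data per node (all clusters): {avg_gb_per_node:.0f} GB")
print(f"Average data per node (largest cluster): {largest_gb_per_node:.0f} GB")
print(f"Average nodes per cluster: {avg_nodes_per_cluster:.1f}")
```

Under these assumptions the largest cluster carries noticeably more data per node than the fleet average, which is the kind of number the later "smaller per-node data size is better" lesson is about.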
I read that Netflix doesn’t have operations
• Extension of Amazon’s PaaS
• Decentralized Cassandra ops is expensive at scale
• Immature product that changes rapidly (and drastically)
• Easily apply best practices across all clusters




                                                            5
How we configure Cassandra in AWS
• Most services get their own Cassandra cluster
• Mostly m2.4xlarge instances, but considering others
• Cassandra and supporting tools baked into the AMI
• Data stored on ephemeral drives
• Data durability – all writes to all availability zones
    • Alternate AZs in a replication set
    • RF = 3


                                                          6
Minimum cluster configuration
• Minimum production cluster configuration – 6 nodes
    • 3 auto-scaling groups
    • 2 instances per auto-scaling group
    • 1 availability zone per auto-scaling group




                                                       7
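Minimum cluster configuration, illustrated

[Figure: three auto-scaling groups, one per availability zone (ASG1 in AZ1, ASG2 in AZ2, ASG3 in AZ3), arranged as a single cluster; labels: RF=3, PRIAM]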




                                             8
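The minimum configuration can be sketched as a six-node ring in which consecutive positions alternate between the three availability zones. The sketch below is an illustration only: it assumes a replica set is the owning node plus the next RF-1 nodes clockwise, which is a simplification of how Cassandra places replicas, and the node names and helper functions are hypothetical.

```python
AZS = ["AZ1", "AZ2", "AZ3"]   # one availability zone per auto-scaling group
INSTANCES_PER_ASG = 2         # minimum from the slide
RF = 3                        # replication factor from the slide


def build_ring(azs, instances_per_asg):
    """Return node names in ring order, alternating AZs around the ring."""
    ring = []
    for slot in range(instances_per_asg):
        for az in azs:
            ring.append(f"{az}-node{slot + 1}")
    return ring


def replicas(ring, start, rf):
    """Assumed placement: the owner at ring[start] plus the next rf-1 nodes."""
    return [ring[(start + i) % len(ring)] for i in range(rf)]


def surviving_replicas(ring, start, rf, lost_az):
    """Replicas still available if every node in lost_az disappears."""
    return [n for n in replicas(ring, start, rf) if not n.startswith(lost_az + "-")]


ring = build_ring(AZS, INSTANCES_PER_ASG)
for i, owner in enumerate(ring):
    replica_set = replicas(ring, i, RF)
    zones = {name.split("-")[0] for name in replica_set}
    print(owner, "->", replica_set, "zones:", sorted(zones))
```

Because the zones alternate, every replica set in this model spans all three zones, so losing one zone still leaves two of the three copies of each row.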
Tools we use
• Administration
    • Priam
    • Jenkins
• Monitoring and alerting
    • Cassandra Explorer
    • Dashboards
    • Epic




                            9
Tools we use – Priam
• Open-sourced Tomcat webapp running on each instance
• Multi-region token management via SimpleDB
• Node replacement and ring expansion
• Backup and restore
    • Full nightly snapshot backup to S3
    • Incremental backup of flushed SSTables to S3 every 30 seconds
• Metrics collected via JMX
• REST API to most nodetool functions
                                                                    10
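The editor's notes mention that Priam only supports doubling the ring. One way to see why doubling is convenient is the sketch below, which assumes evenly spaced tokens over the RandomPartitioner token space (0 to 2^127). It is not Priam's actual token code, which the slides do not show, and the function names are hypothetical.

```python
RING_SIZE = 2 ** 127  # token space of Cassandra's RandomPartitioner


def initial_tokens(node_count):
    """Evenly spaced tokens for a fresh ring (illustrative, not Priam's code)."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def double_ring(tokens):
    """Keep every existing token and add one new token midway to its neighbour."""
    tokens = sorted(tokens)
    doubled = []
    for i, tok in enumerate(tokens):
        nxt = tokens[(i + 1) % len(tokens)]
        # Distance to the next token, wrapping around the ring; a single-node
        # ring owns the whole token space.
        gap = (nxt - tok) % RING_SIZE or RING_SIZE
        doubled.append(tok)
        doubled.append((tok + gap // 2) % RING_SIZE)
    return sorted(doubled)


old = initial_tokens(4)
new = double_ring(old)
print(len(old), "->", len(new), "tokens")
print("existing nodes keep their tokens:", set(old) <= set(new))
```

In this model no existing node has to move: each new node takes over half of one existing range, so the ring stays balanced after the expansion.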
Tools we use – Cassandra Explorer
• Kiosk mode – no alerting
• High level cluster status (thrift, gossip)
• Warns on a small set of metrics




                                    11
Tools we use – Epic
• Netflix-wide monitoring and alerting tool based on RRD
• Priam proxies all JMX data to Epic
• Very useful for finding specific issues




                            12
Tools we use – Dashboards
• Next level cluster metrics
    • Throughput
    • Latency
    • Gossip status
    • Maintenance operations
    • Trouble indicators
• Useful for finding anomalies
• Most investigations start here

                            13
Tools we use – Jenkins
• Scheduling tool for additional monitors and maintenance tasks
• Push button automation for recurring tasks
• Repairs, upgrades, and other tasks are only performed through Jenkins to preserve history of actions
• On-call dashboard displays current issues and maintenance required




                                     14
Things we monitor
Cassandra                 System
• Throughput              • Disk space
• Latency                 • Load average
• Compactions             • I/O errors
• Repairs                 • Network errors
• Pending threads
• Dropped operations
• Java heap
• SSTable counts
• Cassandra log files
                                               15
Other things we monitor
• Compaction predictions
• Backup failures
• Recent restarts
• Schema changes
• Monitors




                           16
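The slides do not say how compaction predictions are made. As a minimal sketch of the idea, the code below assumes size-tiered compaction, where SSTables of similar size are grouped into buckets and a bucket is compacted once it holds a minimum number of tables. The thresholds, the "large" cutoff, and the function names are all assumptions chosen for illustration.

```python
MIN_THRESHOLD = 4                    # assumed: similar-sized SSTables needed to compact
BUCKET_LOW, BUCKET_HIGH = 0.5, 1.5   # assumed: similarity window around a bucket's average


def bucket_sstables(sizes_mb):
    """Group SSTable sizes into buckets of roughly similar size."""
    buckets = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if BUCKET_LOW * avg <= size <= BUCKET_HIGH * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size; start a new one.
            buckets.append([size])
    return buckets


def compaction_warnings(sizes_mb, large_mb=10_000):
    """Return (table count, total MB) for large buckets one SSTable short of compacting."""
    warnings = []
    for bucket in bucket_sstables(sizes_mb):
        if len(bucket) == MIN_THRESHOLD - 1 and sum(bucket) >= large_mb:
            warnings.append((len(bucket), sum(bucket)))
    return warnings


sizes = [50, 60, 55, 9000, 10000, 11000, 40000]  # hypothetical SSTable sizes in MB
print(compaction_warnings(sizes))
```

A monitor built along these lines would flag the node before the fourth similar-sized table is flushed or compacted into place, leaving time to run the compaction at a convenient moment.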
What we learned
• Having Cassandra developers in house is crucial
• Repairs are incredibly expensive
• Multi-tenanted clusters are challenging
• A down node is better than a slow node
• Better to compact on our terms and not Cassandra’s
• Sizing and tuning are difficult and often done live
• Smaller per-node data size is better

                                                       17
Q&A (and Recommended viewing)
     The Best of Times
     Taft and Bakersfield are real places


     South Park
     Later season episodes like F-Word and Elementary School Musical


     Caillou
     My kids love this show; I don’t know why


     Until the Light Takes Us
     Scary documentary on Norwegian Black Metal

                                                                       18


Editor's notes

  1. Keywords – Agenda
  2. Centralized Cassandra team used as a resource for other teams
  3. Minimum cluster size = 6
  4. Don’t developers do everything? True for most of the services; Cassandra is an exception. Needed a team focused on Cassandra so that services could quickly adopt it.
  5. m2.4xlarge:
     • 68.4 GB of memory
     • 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each)
     • 1690 GB of instance storage
     • 64-bit platform
     • I/O Performance: High
     • API name: m2.4xlarge
     Ephemeral drives mean that we have to bootstrap new nodes.
  6. Brief overview on this slide, go into detail on the next one
  7. Things to cover on this slide:
     • How AWS balances between AZs
     • What happens when an AZ goes away
     • How Priam alternates nodes around the ring, even in multi-region (MR) clusters
  8. (Vijay should have covered a lot of this.) Refer back to previous slide. REST is useful for automation: no need to connect to nodes directly or use JMX. Priam only supports doubling the ring.
  9. Node, AZ and cluster level metrics. Time series metrics with extensive history. Can compare multiple metrics on one graph. Can also be configured to send alerts.
  10. Extension of Epic, using preconfigured dashboards for each cluster. Add additional metrics as we learn which to monitor.
  11. Cluster level monitoring, or things that we cannot easily derive from JMX or Epic
  12. Try to anticipate when a large minor compaction is going to happen. Freedom and responsibility has forced us to monitor schema changes. Want to understand every time Cassandra restarts. AWS very infrequently swaps out bad nodes; nodes usually become non-responsive.
  13. … Developer in house …
     • Quickly find problems by looking into code
     • Documentation/tools for troubleshooting are scarce
     … repairs …
     • Affect entire replication set, cause very high latency in I/O constrained environment
     … multi-tenant …
     • Hard to track changes being made
     • Shared resources mean that one service can affect another one
     • Individual usage only grows
     • Moving services to a new cluster with the service live is non-trivial
     … smaller per-node data …
     • Instance level operations (bootstrap, compact, etc.) are faster