Apache Cassandra 
Fundamentals 
or: 
How I stopped worrying and learned to love the CAP theorem 
Russell Spitzer 
@RussSpitzer 
Software Engineer in Test at DataStax
Who am I? 
• Former Bioinformatics student at UCSF
• Work on the integration of Cassandra (C*) with Hadoop, Solr, and Redacted!
• I spend a lot of time spinning up clusters on EC2, GCE, Azure, …
  http://www.datastax.com/dev/blog/testing-cassandra-1000-nodes-at-a-time
• Developing new ways to make sure that C* scales
Apache Cassandra is a Linearly Scaling
and Fault Tolerant NoSQL Database
Linearly Scaling: 
The power of the database 
increases linearly with the 
number of machines 
2x machines = 2x throughput 
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html 
Fault Tolerant: 
Nodes down != Database Down 
Datacenter down != Database Down
CAP Theorem Limits What
Distributed Systems can do
Consistency
When I ask the same question of any part of the system, I should get the same answer
How many planes do we have?
Consistent: 1 1 1 1 1 1 1
Not Consistent: 1 4 1 2 1 8 1
CAP Theorem Limits What
Distributed Systems can do
Availability
When I ask a question, I will get an answer
How many planes do we have?
Available: 1 (even though another node is asleep: zzzzz *snort* zzz)
Not Available: "I have to wait for major snooze to wake up" (zzzzz *snort* zzz)
CAP Theorem Limits What
Distributed Systems can do
Partition Tolerance
I can ask questions even when the system is having intra-system communication problems
How many planes do we have?
(The nodes are split into Team Edward and Team Jacob.)
Tolerant: 1
Not Tolerant: "I'm not sure without asking those vampire lovers and we aren't speaking"
Cassandra is an AP System
which is Eventually Consistent
Eventually consistent:
New information will make it to everyone eventually
How many planes do we have?
"I don't know without asking those vampire lovers and we aren't speaking"
1 1 1 1 1 1
"I just heard! We actually have 2"
2 2 2 2 2 2 2
Two knobs control fault tolerance in
C*: Replication and Consistency Level
Server Side - Replication:
How many copies of each piece of data should exist in the cluster?
[Diagram: a Client talks to the Coordinator for this operation; with RF=3, each of the four nodes holds three of the four token ranges (ABD, ABC, ACD, BCD)]
SimpleStrategy: Replicas
NetworkTopologyStrategy: Replicas per Datacenter
Two knobs control fault tolerance in
C*: Replication and Consistency Level
Client Side - Consistency Level:
How many replicas should we check before acknowledgment?
CL = One: the coordinator waits for one replica
CL = Quorum: the coordinator waits for a majority of the replicas
[Diagram: a Client talks to the Coordinator for this operation in a ring of four nodes (ABD, ABC, ACD, BCD)]
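The arithmetic behind the two knobs can be sketched in a few lines. This is a minimal illustration, not driver code: the class `QuorumMath` and its method names are hypothetical, and it assumes a quorum is a strict majority of the replication factor.

```java
/** Hypothetical helper for illustration; not part of any Cassandra driver. */
final class QuorumMath {
    private QuorumMath() {}

    /** Replicas needed for a quorum (a strict majority) at a given replication factor. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * True when every set of replicas that answers a read must overlap
     * with every set that acknowledged a write (R + W > RF).
     */
    static boolean readsOverlapWrites(int replicationFactor, int writeReplicas, int readReplicas) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    /** Replicas that may be down while a request needing 'required' replicas can still succeed. */
    static int toleratedFailures(int replicationFactor, int required) {
        return replicationFactor - required;
    }
}
```

With RF=3, a quorum is 2 replicas, so quorum reads and quorum writes always overlap and one replica can be down; CL = One tolerates two replicas down but gives no such overlap.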
Nodes own data whose partition key
hashes to their token ranges
Every piece of data belongs on the node that owns the Murmur3 (2.0) hash of its partition key, plus (RF-1) other nodes
Partition Key: ID: ICBM_432
Clustering Key: Time: 30
Rest of Data: Loc: SF, Status: Idle
Example: Murmur3Hash(ID: ICBM_432) falls in token range A
[Diagram: ring of four nodes holding ranges ABD, ABC, ACD, BCD]
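The placement rule above (owner of the token plus RF-1 other nodes) can be sketched with a sorted map. This is a simplified model under stated assumptions: one token per node (no virtual nodes), SimpleStrategy-style clockwise placement, and `String.hashCode()` as a stand-in for Murmur3. The class `TokenRingSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical model of a token ring; assumes one token per node. */
final class TokenRingSketch {
    // Each entry maps the upper bound of a token range to the node that owns it.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long token, String node) {
        ring.put(token, node);
    }

    /** Owner of the token, followed by the next (rf - 1) nodes clockwise. */
    List<String> replicasForToken(long token, int rf) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        // The owner is the first node whose token is >= the key's token, wrapping around.
        Long current = ring.ceilingKey(token);
        if (current == null) {
            current = ring.firstKey();
        }
        int wanted = Math.min(rf, ring.size());
        while (replicas.size() < wanted) {
            replicas.add(ring.get(current));
            current = ring.higherKey(current);
            if (current == null) {
                current = ring.firstKey();
            }
        }
        return replicas;
    }

    /** Stand-in hash only; Cassandra's default partitioner uses Murmur3, not String.hashCode(). */
    static long standInToken(String partitionKey) {
        return partitionKey.hashCode();
    }
}
```

A caller would compute `standInToken("ICBM_432")` and pass it to `replicasForToken` with the keyspace's replication factor.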
Cassandra writes are FAST
due to log-append storage
[Diagram: each write (Partition key, Clustering key, Rest of data) is appended to the Commit Log and written to a Memtable in memory; Memtables are flushed to disk as SSTables]
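The write path in the diagram can be modeled roughly as follows. This is a toy in-memory sketch, not Cassandra's storage engine: `WritePathSketch`, the entry-count flush threshold, and clearing the log on flush are simplifying assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical toy model of commit log, memtable, and SSTable flush. */
final class WritePathSketch {
    private final List<String> commitLog = new ArrayList<>();                    // append-only
    private TreeMap<String, String> memtable = new TreeMap<>();                  // sorted, in memory
    private final List<SortedMap<String, String>> sstables = new ArrayList<>();  // immutable once written
    private final int flushThreshold;

    WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // sequential append, no read-before-write
        memtable.put(key, value);
        if (memtable.size() >= flushThreshold) {
            flush();
        }
    }

    /** Freeze the memtable into an SSTable and start a fresh memtable. */
    void flush() {
        if (memtable.isEmpty()) {
            return;
        }
        sstables.add(Collections.unmodifiableSortedMap(memtable));
        memtable = new TreeMap<>();
        commitLog.clear(); // flushed data no longer needs to be replayed
    }

    int sstableCount() { return sstables.size(); }
    int memtableSize() { return memtable.size(); }
    int commitLogSize() { return commitLog.size(); }
}
```

Every write is an append plus an in-memory insert, which is why the write itself never waits on a random disk seek.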
Deletes in a distributed
System are Challenging
We need to keep records of deletions in case of network partitions
[Diagram: timeline for Node1 and Node2 with a Power Outage on one node; the deletion is kept as a Tombstone]
Compactions merge and
unify data in our SSTables
SSTable 1 + SSTable 2 → SSTable 3
Since SSTables are immutable, this is our chance to consolidate rows and remove tombstones (after GC grace)
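A compaction of two SSTables can be sketched as a merge that keeps the newest cell per key and drops tombstones older than the grace period. This is a simplified model: `CompactionSketch`, its `Cell` type, and representing a tombstone as a null value are assumptions made for illustration, and real compaction works on files rather than maps.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of merging two SSTables during compaction. */
final class CompactionSketch {
    private CompactionSketch() {}

    /** A value with its write timestamp; a null value marks a tombstone. */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    static TreeMap<String, Cell> compact(Map<String, Cell> first, Map<String, Cell> second,
                                         long now, long gcGrace) {
        TreeMap<String, Cell> merged = new TreeMap<>(first);
        for (Map.Entry<String, Cell> e : second.entrySet()) {
            // Keep whichever cell has the newer timestamp.
            merged.merge(e.getKey(), e.getValue(), (a, b) -> b.timestamp >= a.timestamp ? b : a);
        }
        // Tombstones are only dropped once they are older than the grace period.
        merged.values().removeIf(c -> c.isTombstone() && now - c.timestamp > gcGrace);
        return merged;
    }
}
```

Keeping the tombstone until the grace period has passed gives a replica that missed the delete time to learn about it before the record disappears.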
Layout of Data Allows for Rapid
Queries Along Clustering Columns
ID: ICBM_432  | Time: 30, Loc: SF, Status: Idle     | Time: 45, Loc: SF, Status: Idle     | Time: 60, Loc: SF, Status: Idle
ID: ICBM_900  | Time: 30, Loc: Boston, Status: Idle | Time: 45, Loc: Boston, Status: Idle | Time: 60, Loc: Boston, Status: Idle
ID: ICBM_9210 | Time: 30, Loc: Tulsa, Status: Idle  | Time: 45, Loc: Tulsa, Status: Idle  | Time: 60, Loc: Tulsa, Status: Idle
Disclaimer: Not exactly like this (use sstable2json to see the real layout)
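Why this layout makes clustering-column queries cheap can be shown with a map of sorted maps: once the partition is found, a range over the clustering key is a contiguous slice. This is only an in-memory analogy; `ClusteringSliceSketch` and its methods are hypothetical names, and the on-disk format differs, as the disclaimer says.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical analogy: partitions looked up by key, rows sorted by clustering key. */
final class ClusteringSliceSketch {
    // partition key -> (clustering key -> rest of data)
    private final Map<String, TreeMap<Integer, String>> partitions = new HashMap<>();

    void insert(String id, int time, String rest) {
        partitions.computeIfAbsent(id, k -> new TreeMap<>()).put(time, rest);
    }

    /** All rows of one partition whose clustering key lies in [from, to]. */
    SortedMap<Integer, String> slice(String id, int fromInclusive, int toInclusive) {
        TreeMap<Integer, String> rows = partitions.get(id);
        if (rows == null) {
            return Collections.emptySortedMap();
        }
        // Rows are already sorted by time, so the range is one contiguous view.
        return rows.subMap(fromInclusive, true, toInclusive, true);
    }
}
```

A query such as "ICBM_432 between time 40 and 60" touches one partition and reads neighboring rows, with no scan over other IDs.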
CQL allows easy definition
of Table Structures
ID: ICBM_432 | Time: 30, Loc: SF, Status: Idle | Time: 45, Loc: SF, Status: Idle | Time: 60, Loc: SF, Status: Idle
CREATE TABLE icbmlog (
    name text,
    time timestamp,
    location text,
    status text,
    PRIMARY KEY (name, time)
);
Reading data is FAST but
limited by disk IO
[Diagram: a Replica answers a Client read by merging the matching rows (Partition key, Clustering key, Rest of data) from its Memtables in memory and its SSTables on disk; conflicting versions are resolved by last write wins (LWW), followed by Read Repair]
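Last-write-wins resolution and the decision about which replicas need repair can be sketched as below. This is a simplified illustration under assumptions: `ReadRepairSketch` and `Versioned` are hypothetical names, each replica returns a single value with a write timestamp, and ties are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of last-write-wins resolution across replica responses. */
final class ReadRepairSketch {
    private ReadRepairSketch() {}

    static final class Versioned {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** The response with the highest write timestamp wins; null if there are no responses. */
    static Versioned resolve(Map<String, Versioned> responsesByReplica) {
        Versioned newest = null;
        for (Versioned v : responsesByReplica.values()) {
            if (newest == null || v.timestamp > newest.timestamp) {
                newest = v;
            }
        }
        return newest;
    }

    /** Replicas that returned an older version and would be sent the winning value. */
    static List<String> staleReplicas(Map<String, Versioned> responsesByReplica) {
        Versioned newest = resolve(responsesByReplica);
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Versioned> e : responsesByReplica.entrySet()) {
            if (e.getValue().timestamp < newest.timestamp) {
                stale.add(e.getKey());
            }
        }
        return stale;
    }
}
```

In the planes example, a replica still answering 1 with an older timestamp would be listed as stale and updated to 2.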
New Clients provide a 
holistic view of the C* cluster 
[Diagram: the Client makes Initial Contact with one node of the ring (ABD, ABC, ACD, BCD)]
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Session Objects Are Used
for Executing Requests
Session session = cluster.connect();
session.execute("DROP KEYSPACE IF EXISTS icbmkey");
session.execute("CREATE KEYSPACE icbmkey WITH replication = " +
    "{'class':'SimpleStrategy','replication_factor':'1'}");
For highest throughput use asynchronous methods
ResultSetFuture executeAsync(Query query)
Then add a callback or queue the ResultSetFutures
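The "queue the futures" pattern can be sketched with the JDK alone. This example does not use the driver: `fakeExecuteAsync` is a hypothetical stand-in for `session.executeAsync(...)`, and `CompletableFuture<String>` stands in for `ResultSetFuture`, so only the shape of the pattern carries over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: submit every request first, then collect the results. */
final class AsyncQueueSketch {
    private AsyncQueueSketch() {}

    /** Stand-in for an asynchronous driver call; returns immediately with a future. */
    static CompletableFuture<String> fakeExecuteAsync(ExecutorService pool, String query) {
        return CompletableFuture.supplyAsync(() -> "ok: " + query, pool);
    }

    static List<String> runAll(List<String> queries, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Fire all requests without waiting on any of them.
            List<CompletableFuture<String>> inFlight = new ArrayList<>();
            for (String q : queries) {
                inFlight.add(fakeExecuteAsync(pool, q));
            }
            // Only now block, in submission order, to gather the results.
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> f : inFlight) {
                results.add(f.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

The contrast is with a loop that waits for each result before sending the next request, which is the "running statements in serial" pattern the deck warns against.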
Token Aware Policies allow the reduction 
in the number of intra-network requests 
made 
[Diagram: Client and a ring of four nodes (ABD, ABC, ACD, BCD); the request for range A goes to a node that holds A]
Prepared statements allow for 
sending less data over the wire 
Query is prepared on all nodes by driver 
Prepared batch statements 
can further improve throughput 
PreparedStatement ps = session.prepare("INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?)"); 
BatchStatement batch = new BatchStatement(); 
batch.add(ps.bind(uid, mid1, title1, body1)); 
batch.add(ps.bind(uid, mid2, title2, body2)); 
batch.add(ps.bind(uid, mid3, title3, body3)); 
session.execute(batch);
Avoid 
• Preparing statements more than once 
• Creating batches which are too large 
• Running statements in serial 
• Using consistency-levels above your need 
• Secondary Indexes in your main queries 
• or really at all unless you are doing analytics
Have fun with C* 
Questions?

Cassandra Fundamentals - C* 2.0
