COMPARING ARCHITECTURES: CASSANDRA VS. THE FIELD
blueplastic.com/c.pdf
BY SAMEER FAROOQUI
SAMEER@BLUEPLASTIC.COM
linkedin.com/in/blueplastic/
@blueplastic
http://youtu.be/ziqx2hJY8Hg
#Cassandra13
NoSQL Options
Key -> Value: Riak, Redis, Memcached DB, Berkeley DB, Hamster DB, Amazon Dynamo, Voldemort, FoundationDB, LevelDB, Tokyo Cabinet
Key -> Doc: MongoDB, CouchDB, Terrastore, OrientDB, RavenDB, Elasticsearch
Column Family: Cassandra, HBase, Hypertable, Amazon SimpleDB, Accumulo, HPCC, Cloudata
Graph: Neo4J, Infinite Graph, OrientDB, FlockDB, Gremlin, Titan
~Real Time: Storm, Impala, Stinger/Tez, Drill, Solr/Lucene
Key -> Value
Key (ID) Value (Name)
0001 Winston Smith
0002 Julia
0003 O'Brien
0004 Emmanuel Goldstein
- Simple API: get, put, delete
- K/V pairs are stored in containers called buckets
- The value can also be an object, blob, JSON, XML, etc.
- Consistency only for a single key
- Very fast lookups and good scalability (sharding)
- All access via primary key
Use cases: Content caching, Web Session info, User profiles, Preferences, Shopping Carts
Don’t use for: Querying by data, multi-operation transactions, relationships between data
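To make the contract concrete, here is a toy in-memory sketch of that API — get, put and delete against a named bucket, with the value treated as an opaque byte blob. The Bucket class is purely illustrative (not any real store's API); production stores put persistence, replication and sharding behind this same surface:

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of the whole K/V contract: all access by primary key.
public class Bucket {
    private final Map<String, byte[]> entries = new HashMap<String, byte[]>();

    public void put(String key, byte[] value) { entries.put(key, value); }
    public byte[] get(String key)             { return entries.get(key); }
    public void delete(String key)            { entries.remove(key); }

    public static void main(String[] args) {
        Bucket users = new Bucket();                    // "bucket" = container of K/V pairs
        users.put("0001", "Winston Smith".getBytes());  // value is an opaque blob
        System.out.println(new String(users.get("0001")));
        users.delete("0001");
    }
}
```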
Key -> Document
- Like K/V, but the value is examinable
- Documents: XML, JSON, BSON, etc.
- The structure of stored docs should be similar, but doesn’t need to be identical
Use cases: Event logging, content management systems, blogging platforms, web analytics
Don’t use for: Complex transactions spanning different operations, strict-schema applications
Key: 0001
Value: {firstname: "Nuru",
        lastname: "Abdalla",
        location: "Uganda",
        languages: ["English", "Swahili"],
        mother: "Aziza",
        father: "Mufa",
        refugee_camp: "camp-10",
        picture: "01010110"
       }
Key: 0039
Value: {firstname: "Dee",
        location: "Uganda",
        languages: "Swahili",
        refugee_camp: "camp-54",
        picture: "01010110"
       }
- Tolerant of incomplete data
Graph Databases
Use cases: Connected Data (social networks), shortest path, Recommendation Engines
Routing-Dispatch-Location services (node = location/address)
Drawbacks: not easy to cluster/scale beyond one node; some queries have to traverse the entire graph
[Diagram: a call graph linking phone numbers (407-666-4012, 407-384-4924, 415-242-9492, 407-336-1193, two +44 numbers) as nodes, with GPS coordinates and an IMSI # attached as node properties]
~ Real Time
Storm
Impala
Stinger/Tez
Drill
Spark/Shark
- Distributed, real-time computation / stream-processing systems
- For running a continuous query on data streams and streaming the results to clients (continuous computation)
- Still emerging; most are in alpha or beta stages
- Example: count hash tags
[Diagram: a Storm topology — a spout emitting tweets into a bolt that counts # tags]
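As a concrete version of the hash-tag example, here is a minimal Storm bolt sketch (0.8/0.9-era backtype.storm API). The spout, stream layout and field names are assumptions; the counts map is per-task, so the topology would need a fields grouping on the hashtag for the counts to be meaningful:

```java
import java.util.HashMap;
import java.util.Map;

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// Hypothetical bolt: keeps a running count per #tag seen in tweet text.
public class HashtagCountBolt extends BaseBasicBolt {
    private final Map<String, Integer> counts = new HashMap<String, Integer>();

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // Assumes the upstream spout emits one tweet's text per tuple.
        for (String word : tuple.getString(0).split("\\s+")) {
            if (word.startsWith("#")) {
                Integer n = counts.get(word);
                n = (n == null) ? 1 : n + 1;
                counts.put(word, n);
                collector.emit(new Values(word, n));  // stream the running count downstream
            }
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("hashtag", "count"));
    }
}
```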
Column Family
[Diagram: one table with three column families — Col Fam 1 (columns C1–C4), Col Fam 2 (columns A–D), Col Fam 3 (columns 1–4) — spanning rows ROW-1 through ROW-6; X and Y mark sample cells, and one cell holds two timestamped versions, v1=Z and v2=K]
(Table, Row Key, Col. Family, Column, Timestamp) → Value (Z)
[Diagram: Table-Name-Alpha split across Region-1 and Region-2]
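A quick sketch of what that five-part coordinate looks like in client code, using the 0.94-era HBase API. The table, family and column names are the hypothetical ones from the diagram above:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

// Addressing a cell by (Table, Row Key, Col. Family, Column, Timestamp) -> Value.
public class CellCoordinates {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "Table-Name-Alpha");

        Put put = new Put(Bytes.toBytes("ROW-1"));
        put.add(Bytes.toBytes("ColFam1"), Bytes.toBytes("C2"), Bytes.toBytes("Z"));
        table.put(put);                 // the server fills in the timestamp dimension

        Get get = new Get(Bytes.toBytes("ROW-1"));
        get.addColumn(Bytes.toBytes("ColFam1"), Bytes.toBytes("C2"));
        get.setMaxVersions(2);          // fetch v1 and v2 of the cell, if present
        Result result = table.get(get);
        System.out.println(Bytes.toString(
                result.getValue(Bytes.toBytes("ColFam1"), Bytes.toBytes("C2"))));
        table.close();
    }
}
```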
Column Family
- Know your R + W queries up front
- Design the data model and system architecture
to optimally fulfil those queries
- Important to understand the architecture fundamentals
How to pick a CF database
Google Trends
[Screenshots: Google Trends search-interest charts for the two projects, with regional-interest lists — one showing South Korea, India, USA, China, Russia, Netherlands; the other South Korea, Belgium, China, Taiwan]
- Check activity on the Apache user mailing lists (messages per month):

Date         Apache Cassandra   Apache HBase
Jan 2013     739                783
Feb 2013     714                797
March 2013   837                692
April 2013   730                741
May 2013     567                636
[Diagram: Cassandra's lineage — the BigTable paper (Nov, 2006) contributed the data model and storage engine; the Dynamo paper (Oct, 2007) contributed the distributed design]
[Diagram: Google-stack to Hadoop-stack lineage — GFS (Oct, 2003) → HDFS, MapReduce (Dec, 2004) → MapReduce, BigTable (Nov, 2006) → HBase, Chubby (Nov, 2006) → ZooKeeper]
Both:
• Written in Java
• Column Family oriented databases
• Have reached 1,000+ nodes in production
• Very low latency reads + writes
• Use Log Structured Merge Trees
• Atomic at row level
• No support for joins, transactions, foreign keys
Cassandra:
• Peer-to-peer architecture
• Tunable consistency
• Secondary indexes available
• Writes to ext3/4
• Conflict resolution handled during reads
• N-way writes
• Random and ordered sorting of row keys supported

vs.

HBase:
• Master/slave architecture
• Strict consistency
• No native secondary index support
• Writes to HDFS
• Conflict resolution handled during writes
• Pipelined write
• Ordered/lexicographical sorting of row keys
Amazon.com’s Dynamo Use Cases
Services that only need primary-key access to the data store:
- Best seller lists
- Customer preferences
- Sales rank
- Product catalog
- Session management
No need for:
- Complex SQL queries
- Operations spanning multiple data items
• Shopping cart service must always allow
customers to add and remove items
• If there are 2 conflicting versions of a write,
the application should be able to pull both
writes and merge them
• Designed for apps that “need tight control
over the tradeoffs between availability,
consistency, cost-effectiveness and
performance”
Google’s BigTable Use Cases
60+ products at Google have used BigTable, including:
- Gmail
- YouTube
- Google Earth
- Google Finance
- Google Analytics
- Personalized Search
• Must be able to store the entire web crawl data
• Relies on GFS for replication and data availability
• Strong integration with MapReduce
[Diagram: an 8-node Cassandra ring with a client connected]
- Gossip runs every second on each node, to up to 3 other nodes
- Used to discover location and state information about the other nodes
- The Phi Accrual Failure Detector detects failures via a suspicion level on a continuous scale
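A minimal sketch of the phi accrual idea, assuming heartbeat inter-arrival times are exponentially distributed with the observed mean (an approximation, not Cassandra's exact implementation):

```java
// phi = -log10( P(a heartbeat arrives later than now) ). For an exponential
// distribution, P(X > t) = exp(-t / mean), so phi = t / (mean * ln 10).
// The suspicion level grows continuously the longer a node stays silent.
public class PhiAccrual {
    static double phi(double millisSinceLastHeartbeat, double meanIntervalMillis) {
        return millisSinceLastHeartbeat / (meanIntervalMillis * Math.log(10));
    }

    public static void main(String[] args) {
        double mean = 1000;                    // gossip roughly every second
        System.out.println(phi(1000, mean));   // ~0.43 — nothing unusual
        System.out.println(phi(10000, mean));  // ~4.3  — quite suspicious
    }
}
```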
[Diagram: a typical HBase deployment — master machines running the NameNode, JobTracker, HBase Master, ZooKeeper, a Standby NameNode and a standby HBase Master; slave machines each running a DataNode, TaskTracker and RegionServer with map (M) and reduce (R) slots on 2 TB SATA disks; a client connects to the cluster]
Effort to deploy
Cassandra:
- One monolithic database install (1 JVM per node) + 1 log file and 1 config file (YAML)
- No single points of failure, so no standby master nodes
- Good default settings
HBase:
- More complex to deploy (multiple JVMs per node) + many log files and many config files (XMLs)
- More moving parts: HDFS, HBase, MapReduce, Passive NameNode, Standby HBase Master, ZooKeeper
- Default settings usually need tweaking
Where to write?
[Diagram: HBase write path — the client asks ZooKeeper for the -ROOT- region's location; -ROOT- says which .META. region to go to (.META.1, .META.2 or .META.3); .META. says which RegionServer (RS a, b, c) holds the target region; -ROOT- and .META. locations are cached by the client; replication is synchronous via HDFS]
No control over replication or consistency for each write!
Where to write?
[Diagram: an 8-node Cassandra ring — the client sends the write to a coordinator node, which forwards it to replicas R1, R2, R3; Replication Factor = 3, Consistency = 1]
[Diagram: the same ring with Replication Factor = 3, Consistency = 2]
[Diagram: the same ring with Replication Factor = 4, Consistency = 2 — replicas R1 through R4]
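What "tunable per write" looks like from a client: a sketch using the DataStax Java driver (2.x-era API). The contact point, keyspace and table are hypothetical, and the table itself is assumed to already exist:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

// Replication factor is fixed per keyspace; consistency is chosen per request.
public class TunableWrite {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        session.execute("CREATE KEYSPACE IF NOT EXISTS demo WITH replication = "
                + "{'class': 'SimpleStrategy', 'replication_factor': 3}");

        // RF = 3, Consistency = 1: the coordinator acks after one replica confirms.
        SimpleStatement write = new SimpleStatement(
                "INSERT INTO demo.users (id, name) VALUES (1, 'Winston Smith')");
        write.setConsistencyLevel(ConsistencyLevel.ONE);
        session.execute(write);

        cluster.close();
    }
}
```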
Strong Consistency Costs
Cassandra:
- Write to 3 nodes (RF = 3, C = 2)
- Read from at least 2 nodes to guarantee strong consistency
HBase:
- Writes go to 3 nodes (RF = 3, C = 3)
- Read from only 1 node to guarantee strong consistency
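The general rule behind both cases: a read is strongly consistent whenever R + W > RF, because the read set must then overlap the write set in at least one replica. A trivial, hypothetical helper makes the arithmetic explicit:

```java
// Not a real API — just the overlap rule as code.
public class QuorumMath {
    static boolean stronglyConsistent(int rf, int w, int r) {
        return r + w > rf;  // read and write replica sets must intersect
    }

    public static void main(String[] args) {
        System.out.println(stronglyConsistent(3, 2, 2)); // true  — the Cassandra case above
        System.out.println(stronglyConsistent(3, 3, 1)); // true  — the HBase-style case
        System.out.println(stronglyConsistent(3, 1, 1)); // false — eventual consistency only
    }
}
```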
Log Structured Merge Trees
(Table, Row Key, Col. Family, Column, Timestamp) → Value (Z)
[Diagram: the write path inside a node’s JVM — HBase appends the write (Z) to the WAL and buffers it in the Memstore; Cassandra appends it to the Commit Log and buffers it in the Memtable; when the buffer fills (Z A B C D), it is flushed to an immutable HFile (HBase) / SSTable (C*), and repeated flushes leave multiple HFiles/SSTables on disk]
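The mechanics are the same on both sides, just with different names. A greatly simplified, hypothetical sketch of the LSM write path — append to a log for durability, buffer in a sorted in-memory map, flush to an immutable sorted file when the buffer fills:

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Map;
import java.util.TreeMap;

// Toy LSM node: commit.log plays the WAL/Commit Log, the TreeMap plays the
// Memstore/Memtable, and each flushed file plays an HFile/SSTable.
public class TinyLsm {
    private static final int FLUSH_THRESHOLD = 4;
    private final TreeMap<String, String> memtable = new TreeMap<String, String>();
    private final PrintWriter commitLog;
    private int flushCount = 0;

    public TinyLsm() throws IOException {
        commitLog = new PrintWriter(new FileWriter("commit.log", true));
    }

    public void put(String key, String value) throws IOException {
        commitLog.println(key + "\t" + value);  // durability first: append to the log
        commitLog.flush();
        memtable.put(key, value);               // then the sorted in-memory buffer
        if (memtable.size() >= FLUSH_THRESHOLD) flush();
    }

    private void flush() throws IOException {
        // Keys come out of the TreeMap already sorted, like an SSTable flush.
        PrintWriter out = new PrintWriter(new FileWriter("sstable-" + (flushCount++)));
        for (Map.Entry<String, String> e : memtable.entrySet())
            out.println(e.getKey() + "\t" + e.getValue());
        out.close();
        memtable.clear();
    }

    public static void main(String[] args) throws IOException {
        TinyLsm lsm = new TinyLsm();
        for (String k : new String[]{"Z", "A", "B", "C", "D"}) lsm.put(k, "v-" + k);
    }
}
```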
Flush Details
[Diagram: a flushed file holding the data (Z, A, B, C, D) along with a Bloom Filter and a Block Index; the Bloom Filter can be keyed on the row only (R) or on row + column (R + C)]
- In HBase, the Bloom Filter and Block Index are stored inside the HFile
- In C*, there are separate data, Bloom Filter and Index files
Flush per Column Family
Cassandra:
- Supported
HBase:
- Flushes all Column Families together
- Unnecessary flushing puts more network pressure on HBase, since HFiles have to be replicated to 2 other HDFS nodes
- Flush per CF is under development via JIRA 3149
Secondary Indexes
Cassandra:
- Native support for secondary indexes
HBase:
- No native secondary indexes
- But a trigger can be launched after a put to keep a secondary index (another CF) up to date, without putting the burden on the client
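The index-maintenance idea on the HBase side, shown client-side for brevity (a coprocessor would do the same work server-side after each put). Uses the 0.94-era HBase API; table and column names are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Maintain a manual secondary index: every write to the main table is
// mirrored by a write to an index table keyed on the indexed value.
public class ManualSecondaryIndex {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable users = new HTable(conf, "users");
        HTable byLocation = new HTable(conf, "users_by_location"); // the index

        Put row = new Put(Bytes.toBytes("0001"));
        row.add(Bytes.toBytes("cf"), Bytes.toBytes("location"), Bytes.toBytes("Uganda"));
        users.put(row);

        // Index row key = indexed value + primary key, so a prefix scan on
        // "Uganda|" finds all matching users.
        Put index = new Put(Bytes.toBytes("Uganda|0001"));
        index.add(Bytes.toBytes("cf"), Bytes.toBytes("user"), Bytes.toBytes("0001"));
        byLocation.put(index);

        users.close();
        byLocation.close();
    }
}
```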
SSD Support
Cassandra:
- It is possible to place just the SSTables on SSD: in the YAML file, set commitlog_directory to spinning disks and set data_file_directories to SSD (see the snippet below)
- See Rick Branson’s talk: youtube.com/watch?v=zQdDi9pdf3I
HBase:
- Not possible to tell HDFS to store only the WAL or HFiles on SSD
- There is some support for this in the MapR and Intel distributions
- Apache HDFS JIRAs 2832 & 4672 have preliminary discussions
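A cassandra.yaml sketch of that split — the setting names are the ones cited above; the mount points are hypothetical:

```yaml
commitlog_directory: /mnt/hdd/cassandra/commitlog  # sequential appends: spinning disk is fine
data_file_directories:
    - /mnt/ssd/cassandra/data                      # SSTable random reads benefit from SSD
```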
Compactions
Cassandra:
- Tiered and Leveled
- For leveled, see J. Ellis’s blog post: datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
HBase:
- Only Tiered
- Note: many new algorithms and improvements are coming in HBase 0.95, like Stripe Compactions (JIRA 7667): https://issues.apache.org/jira/secure/attachment/12575449/Stripe%20compactions.pdf
Reading after disk failure
Cassandra:
- Reads can simply be fulfilled from another node natively
HBase:
- After a disk failure, the slave machine will read the missing data from a remote disk until compaction happens, so region reads can be slow
Data Partitioning
Cassandra:
- Supports both an ordered partitioner and a random partitioner
HBase:
- Only supports an ordered partitioner
- Row key range scans are possible
- It is possible to externally MD5-hash the row key and prepend the hash to the row key: md5-rowkey (sketched below)
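A sketch of that md5-rowkey trick: prefix each row key with its MD5 hash so lexicographic partitioning spreads sequential keys evenly. The cost is that meaningful range scans over the original keys are lost; the class and key format here are hypothetical:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Salt a row key with its own MD5 hash so an ordered partitioner
// distributes writes instead of hotspotting on sequential keys.
public class SaltedKey {
    static String salt(String rowKey) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(rowKey.getBytes());
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex + "-" + rowKey;  // e.g. "9a0364b9...-ROW-1"
    }

    public static void main(String[] args) throws Exception {
        System.out.println(salt("ROW-1"));
    }
}
```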
Triggers / Coprocessors
Cassandra:
- Under development for C* 2.0 (JIRA 1311)
HBase:
- Supported via Coprocessors (after a get/put/delete on a column family, a trigger can be executed)
- Triggers are coded as Java classes
Compare & Set
Cassandra:
- Under development for C* 2.0
HBase:
- Supported (sketched below)
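HBase exposes compare-and-set as checkAndPut (shown here with the 0.94-era API): the Put is applied atomically only if the cell's current value still matches the expected one. Table, column names and values are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class CasExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "accounts");

        Put debit = new Put(Bytes.toBytes("acct-42"));
        debit.add(Bytes.toBytes("cf"), Bytes.toBytes("balance"), Bytes.toBytes("90"));

        boolean applied = table.checkAndPut(
                Bytes.toBytes("acct-42"),  // row
                Bytes.toBytes("cf"),       // column family
                Bytes.toBytes("balance"),  // qualifier
                Bytes.toBytes("100"),      // expected current value
                debit);                    // mutation applied only on a match
        System.out.println("applied: " + applied);
        table.close();
    }
}
```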
Multi-Datacenter/DR Support
Cassandra:
- Very mature and well tested
- Synchronous or asynchronous replication to the DR site
- Recovery Point Objective (RPO) can be 0
HBase:
- Not as robust
- Only asynchronous replication to DR
- Recovery Point Objective (RPO) cannot be 0
blueplastic.com/c.pdf
Sameer Farooqui
sameer@blueplastic.com
- Freelance Big Data consultant and trainer
- Taught 50+ courses on Hadoop, HBase, Cassandra and OpenStack
- DataStax authorized training partner
- Previously: Hortonworks, Accenture R&D, Symantec
linkedin.com/in/blueplastic/
@blueplastic
http://youtu.be/ziqx2hJY8Hg