SlideShare una empresa de Scribd logo
1 de 56
Descargar para leer sin conexión
©2013 DataStax Confidential. Do not distribute without consent.
@spyced
Jonathan Ellis
CTO, DataStax / Project Chair, Apache Cassandra
Cassandra 2.0
Five Years of Cassandra
Jul-09 May-10 Feb-11 Dec-11 Oct-12 Jul-13
0.1 0.3 0.6 0.7 1.0 1.2
...
2.0
DSE
Jul-08
Core values
0
20000
40000
60000
80000
0 2 4 6 8 10 12
Cassandra HBase Redis MySQL
• Massive scalablility
• High performance
• Reliability/Availability
CREATE TABLE users (
id uuid PRIMARY KEY,
name text,
state text,
birth_date int
);
CREATE INDEX ON users(state);
SELECT * FROM users
WHERE state=‘Texas’
AND birth_date > 1950;
New Core Value
• Massive scalablility
• High performance
• Reliability/Availability
• Ease of use
*Key concepts?
*Data Modeling section of documentation: http://www.datastax.com/
documentation/cassandra/1.2/index.html#cassandra/ddl/
ddl_anatomy_table_c.html
CQL delivers
"Coming from a relational database background we found
the transition to Cassandra to be very straightforward. There are a
few simple key concepts one must grasp at first but ever since it's
been smooth sailing for us."
Boris Wolf, Comcast
1.2 for Developers
• CQL3
• Thrift compatibility
• Collections
• Data dictionary
• Auth support
• Hadoop support
• Native drivers
• Tracing
• Atomic batches
[cassandra.yaml]
authenticator: PasswordAuthenticator
# DSE offers KerberosAuthenticator as well
CREATE USER robin WITH PASSWORD 'manager' SUPERUSER;
ALTER USER cassandra WITH PASSWORD 'newpassword';
LIST USERS;
DROP USER cassandra;
Authentication
[cassandra.yaml]
authorizer: CassandraAuthorizer
GRANT select ON audit TO jonathan;
GRANT modify ON users TO robin;
GRANT all ON ALL KEYSPACES TO lara;
Authorization
Native drivers
• CQL native protocol: efficient, lightweight, asynchronous
• Java (GA): https://github.com/datastax/java-driver
• .NET (Beta): https://github.com/datastax/csharp-driver
• Python (Beta): https://github.com/datastax/python-driver
• Coming soon: PHP, Ruby, others
1.2 for Operators
• Virtual nodes
• “Dense node” support (5-10TB/machine)
• JBOD improvements
• Off-heap bloom filters, compression metadata
• Parallel leveled compaction
1.2.5+
1.2.5+
• ~1/2 memory usage in partition summary
1.2.5+
• ~1/2 memory usage in partition summary
• Improved compaction throttle
1.2.5+
• ~1/2 memory usage in partition summary
• Improved compaction throttle
• Parallel leveled compaction
1.2.5+
• ~1/2 memory usage in partition summary
• Improved compaction throttle
• Parallel leveled compaction
• Removed cell-name bloom filters
1.2.5+
• ~1/2 memory usage in partition summary
• Improved compaction throttle
• Parallel leveled compaction
• Removed cell-name bloom filters
• Thread-local allocation
1.2.5+
• ~1/2 memory usage in partition summary
• Improved compaction throttle
• Parallel leveled compaction
• Removed cell-name bloom filters
• Thread-local allocation
• LZ4 compression (default in 2.0)
1.2.5+
• ~1/2 memory usage in partition summary
• Improved compaction throttle
• Parallel leveled compaction
• Removed cell-name bloom filters
• Thread-local allocation
• LZ4 compression (default in 2.0)
• (1.2.7) CQL Input/Output for Hadoop
1.2.5+
• ~1/2 memory usage in partition summary
• Improved compaction throttle
• Parallel leveled compaction
• Removed cell-name bloom filters
• Thread-local allocation
• LZ4 compression (default in 2.0)
• (1.2.7) CQL Input/Output for Hadoop
• (1.2.7) Range tombstone performance
1.2.5+
• ~1/2 memory usage in partition summary
• Improved compaction throttle
• Parallel leveled compaction
• Removed cell-name bloom filters
• Thread-local allocation
• LZ4 compression (default in 2.0)
• (1.2.7) CQL Input/Output for Hadoop
• (1.2.7) Range tombstone performance
• (1.2.9) Larger default LCS filesize (160MB > 5MB)
Cassandra 2.0
2.0
• Lightweight transactions
• Triggers (experimental)
• Improved compaction
• CQL cursors
SELECT * FROM users
WHERE username = ’jbellis’
[empty resultset]
INSERT INTO users (...)
VALUES (’jbellis’, ...)
Session 1
SELECT * FROM users
WHERE username = ’jbellis’
[empty resultset]
INSERT INTO users (...)
VALUES (’jbellis’, ...)
Session 2
Lightweight transactions: the problem
Paxos
• All operations are quorum-based
• Each replica sends information about unfinished operations to the leader
during prepare
• Paxos made Simple
LWT: details
• 4 round trips vs 1 for normal updates
• Paxos state is durable
• Immediate consistency with no leader election or failover
• ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-
cassandra-2-0
LWT: Use with caution
• Great for 1% of your application
• Eventual consistency is your friend
• http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-
hopeful-consistency-by-christos-kalantzis
UPDATE USERS
SET email = ’jonathan@datastax.com’, ...
WHERE username = ’jbellis’
IF email = ’jbellis@datastax.com’;
INSERT INTO USERS (username, email, ...)
VALUES (‘jbellis’, ‘jbellis@datastax.com’, ... )
IF NOT EXISTS;
Using LWT
Triggers
CREATE TRIGGER <name> ON <table> USING <classname>;
Trigger implementation
class MyTrigger implements ITrigger
{
public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
{
...
}
}
Experimental!
• Relies on internal RowMutation, ColumnFamily classes
• [partition] key is a ByteBuffer
• Expect changes in 2.1
Compaction
• Single-pass, always
• LCS performs STCS in L0
Healthy leveled compaction
Sad leveled compaction
STCS in L0
Cursors (before)
SELECT *
FROM timeline
WHERE (user_id = :last_key
AND tweet_id > :last_tweet)
OR token(user_id) > token(:last_key)
LIMIT 100
CREATE TABLE timeline (
  user_id uuid,
  tweet_id timeuuid,
  tweet_author uuid,
tweet_body text,
  PRIMARY KEY (user_id, tweet_id)
);
Cursors (after)
SELECT *
FROM timeline
Misc. performance improvements
Misc. performance improvements
• Tracking statistics on clustered columns allows eliminating unnecessary
sstables from the read path
Misc. performance improvements
• Tracking statistics on clustered columns allows eliminating unnecessary
sstables from the read path
• New half-synchronous, half-asynchronous Thrift server based on LMAX
Disruptor
Misc. performance improvements
• Tracking statistics on clustered columns allows eliminating unnecessary
sstables from the read path
• New half-synchronous, half-asynchronous Thrift server based on LMAX
Disruptor
• Faster partition index lookups and cache reads by improving performance
of off-heap memory
Misc. performance improvements
• Tracking statistics on clustered columns allows eliminating unnecessary
sstables from the read path
• New half-synchronous, half-asynchronous Thrift server based on LMAX
Disruptor
• Faster partition index lookups and cache reads by improving performance
of off-heap memory
• Faster reads of compressed data by switching from CRC32 to Adler
checksums
Misc. performance improvements
• Tracking statistics on clustered columns allows eliminating unnecessary
sstables from the read path
• New half-synchronous, half-asynchronous Thrift server based on LMAX
Disruptor
• Faster partition index lookups and cache reads by improving performance
of off-heap memory
• Faster reads of compressed data by switching from CRC32 to Adler
checksums
• JEMalloc support for off-heap allocation
Spring cleaning
Spring cleaning
• Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
Spring cleaning
• Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
• The potentially dangerous countPendingHints JMX call has been replaced
by a Hints Created metric
Spring cleaning
• Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
• The potentially dangerous countPendingHints JMX call has been replaced
by a Hints Created metric
• The on-heap partition cache (“row cache”) has been removed
Spring cleaning
• Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
• The potentially dangerous countPendingHints JMX call has been replaced
by a Hints Created metric
• The on-heap partition cache (“row cache”) has been removed
• Vnodes are on by default
Spring cleaning
• Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
• The potentially dangerous countPendingHints JMX call has been replaced
by a Hints Created metric
• The on-heap partition cache (“row cache”) has been removed
• Vnodes are on by default
• the old token range bisection code for non-vnode clusters is gone
Spring cleaning
• Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
• The potentially dangerous countPendingHints JMX call has been replaced
by a Hints Created metric
• The on-heap partition cache (“row cache”) has been removed
• Vnodes are on by default
• the old token range bisection code for non-vnode clusters is gone
• Removed emergency memory pressure valve logic
Operational concerns
Operational concerns
• Java7 is now required!
Operational concerns
• Java7 is now required!
• Leveled compaction level information has been moved into sstable
metadata
Operational concerns
• Java7 is now required!
• Leveled compaction level information has been moved into sstable
metadata
• Kernel page cache skipping has been removed in favor of optional row
preheating (preheat_kernel_page_cache)
Operational concerns
• Java7 is now required!
• Leveled compaction level information has been moved into sstable
metadata
• Kernel page cache skipping has been removed in favor of optional row
preheating (preheat_kernel_page_cache)
• Streaming has been rewritten to be more transparent and robust.
Operational concerns
• Java7 is now required!
• Leveled compaction level information has been moved into sstable
metadata
• Kernel page cache skipping has been removed in favor of optional row
preheating (preheat_kernel_page_cache)
• Streaming has been rewritten to be more transparent and robust.
• Streaming support for old-version sstables
©2013 DataStax Confidential. Do not distribute without consent. 31

Más contenido relacionado

La actualidad más candente

Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012jbellis
 
NewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDNewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDTony Rogerson
 
Breakthrough performance with MySQL Cluster (2012)
Breakthrough performance with MySQL Cluster (2012)Breakthrough performance with MySQL Cluster (2012)
Breakthrough performance with MySQL Cluster (2012)Frazer Clement
 
Oracle sharding : Installation & Configuration
Oracle sharding : Installation & ConfigurationOracle sharding : Installation & Configuration
Oracle sharding : Installation & Configurationsuresh gandhi
 
MySQL 5.5&5.6 new features summary
MySQL 5.5&5.6 new features summaryMySQL 5.5&5.6 new features summary
MySQL 5.5&5.6 new features summaryLouis liu
 
Architecture of exadata database machine – Part II
Architecture of exadata database machine – Part IIArchitecture of exadata database machine – Part II
Architecture of exadata database machine – Part IIParesh Nayak,OCP®,Prince2®
 
OSSCube MySQL Cluster Tutorial By Sonali At Osspac 09
OSSCube MySQL Cluster Tutorial By Sonali At Osspac 09OSSCube MySQL Cluster Tutorial By Sonali At Osspac 09
OSSCube MySQL Cluster Tutorial By Sonali At Osspac 09OSSCube
 
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)datastaxjp
 
MySQL User Camp: MySQL Cluster
MySQL User Camp: MySQL ClusterMySQL User Camp: MySQL Cluster
MySQL User Camp: MySQL ClusterShivji Kumar Jha
 
MySQL Cluster 8.0 tutorial
MySQL Cluster 8.0 tutorialMySQL Cluster 8.0 tutorial
MySQL Cluster 8.0 tutorialFrazer Clement
 
Oracle 12.2 sharding learning more
Oracle 12.2 sharding learning moreOracle 12.2 sharding learning more
Oracle 12.2 sharding learning moreLeyi (Kamus) Zhang
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012Big Data Spain
 
Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle...
Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle...Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle...
Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle...Ludovico Caldara
 
Oracle 12.2 sharded database management
Oracle 12.2 sharded database managementOracle 12.2 sharded database management
Oracle 12.2 sharded database managementLeyi (Kamus) Zhang
 
MySQL Utilities -- Cool Tools For You: PHP World Nov 16 2016
MySQL Utilities -- Cool Tools For You: PHP World Nov 16 2016MySQL Utilities -- Cool Tools For You: PHP World Nov 16 2016
MySQL Utilities -- Cool Tools For You: PHP World Nov 16 2016Dave Stokes
 
SQL Server Reporting Services Disaster Recovery Webinar
SQL Server Reporting Services Disaster Recovery WebinarSQL Server Reporting Services Disaster Recovery Webinar
SQL Server Reporting Services Disaster Recovery WebinarDenny Lee
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynotejbellis
 
Enterprise Virtualization with Xen
Enterprise Virtualization with XenEnterprise Virtualization with Xen
Enterprise Virtualization with XenFrank Martin
 

La actualidad más candente (20)

Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
 
NewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDNewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACID
 
Breakthrough performance with MySQL Cluster (2012)
Breakthrough performance with MySQL Cluster (2012)Breakthrough performance with MySQL Cluster (2012)
Breakthrough performance with MySQL Cluster (2012)
 
MySQL 开发
MySQL 开发MySQL 开发
MySQL 开发
 
DataStax 6 and Beyond
DataStax 6 and BeyondDataStax 6 and Beyond
DataStax 6 and Beyond
 
Oracle sharding : Installation & Configuration
Oracle sharding : Installation & ConfigurationOracle sharding : Installation & Configuration
Oracle sharding : Installation & Configuration
 
MySQL 5.5&5.6 new features summary
MySQL 5.5&5.6 new features summaryMySQL 5.5&5.6 new features summary
MySQL 5.5&5.6 new features summary
 
Architecture of exadata database machine – Part II
Architecture of exadata database machine – Part IIArchitecture of exadata database machine – Part II
Architecture of exadata database machine – Part II
 
OSSCube MySQL Cluster Tutorial By Sonali At Osspac 09
OSSCube MySQL Cluster Tutorial By Sonali At Osspac 09OSSCube MySQL Cluster Tutorial By Sonali At Osspac 09
OSSCube MySQL Cluster Tutorial By Sonali At Osspac 09
 
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
[Cassandra summit Tokyo, 2015] Cassandra 2015 最新情報 by ジョナサン・エリス(Jonathan Ellis)
 
MySQL User Camp: MySQL Cluster
MySQL User Camp: MySQL ClusterMySQL User Camp: MySQL Cluster
MySQL User Camp: MySQL Cluster
 
MySQL Cluster 8.0 tutorial
MySQL Cluster 8.0 tutorialMySQL Cluster 8.0 tutorial
MySQL Cluster 8.0 tutorial
 
Oracle 12.2 sharding learning more
Oracle 12.2 sharding learning moreOracle 12.2 sharding learning more
Oracle 12.2 sharding learning more
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
 
Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle...
Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle...Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle...
Oracle RAC, Oracle Data Guard, and Pluggable Databases: When MAA Meets Oracle...
 
Oracle 12.2 sharded database management
Oracle 12.2 sharded database managementOracle 12.2 sharded database management
Oracle 12.2 sharded database management
 
MySQL Utilities -- Cool Tools For You: PHP World Nov 16 2016
MySQL Utilities -- Cool Tools For You: PHP World Nov 16 2016MySQL Utilities -- Cool Tools For You: PHP World Nov 16 2016
MySQL Utilities -- Cool Tools For You: PHP World Nov 16 2016
 
SQL Server Reporting Services Disaster Recovery Webinar
SQL Server Reporting Services Disaster Recovery WebinarSQL Server Reporting Services Disaster Recovery Webinar
SQL Server Reporting Services Disaster Recovery Webinar
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
 
Enterprise Virtualization with Xen
Enterprise Virtualization with XenEnterprise Virtualization with Xen
Enterprise Virtualization with Xen
 

Similar a London + Dublin Cassandra 2.0

Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACKristofferson A
 
Cassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityCassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityHiromitsu Komatsu
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataacelyc1112009
 
Cassandra
CassandraCassandra
Cassandraexsuns
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage systemArunit Gupta
 
Cassandra Tutorial
Cassandra Tutorial Cassandra Tutorial
Cassandra Tutorial Na Zhu
 
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J..."Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...Dataconomy Media
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinChristian Johannsen
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentationEdward Capriolo
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideMohammed Fazuluddin
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchJoe Alex
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka MeetupCliff Gilmore
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsJulien Anguenot
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionSplunk
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache KuduAndriy Zabavskyy
 
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016DataStax
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... CassandraInstaclustr
 

Similar a London + Dublin Cassandra 2.0 (20)

BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
 
Devops kc
Devops kcDevops kc
Devops kc
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
 
Cassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra CommunityCassandra CLuster Management by Japan Cassandra Community
Cassandra CLuster Management by Japan Cassandra Community
 
How does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsDataHow does Apache Pegasus (incubating) community develop at SensorsData
How does Apache Pegasus (incubating) community develop at SensorsData
 
Cassandra
CassandraCassandra
Cassandra
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
Cassandra Tutorial
Cassandra Tutorial Cassandra Tutorial
Cassandra Tutorial
 
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J..."Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction Guide
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentials
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... Cassandra
 

Más de jbellis

Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloudjbellis
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015jbellis
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014jbellis
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1jbellis
 
Tokyo cassandra conference 2014
Tokyo cassandra conference 2014Tokyo cassandra conference 2014
Tokyo cassandra conference 2014jbellis
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013jbellis
 
Top five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionTop five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionjbellis
 
Pycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from JavaPycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from Javajbellis
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprisejbellis
 
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)jbellis
 
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011jbellis
 
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)jbellis
 
What python can learn from java
What python can learn from javaWhat python can learn from java
What python can learn from javajbellis
 
State of Cassandra, 2011
State of Cassandra, 2011State of Cassandra, 2011
State of Cassandra, 2011jbellis
 
Brisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by CassandraBrisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by Cassandrajbellis
 
PyCon 2010 SQLAlchemy tutorial
PyCon 2010 SQLAlchemy tutorialPyCon 2010 SQLAlchemy tutorial
PyCon 2010 SQLAlchemy tutorialjbellis
 
Cassandra 0.7, Los Angeles High Scalability Group
Cassandra 0.7, Los Angeles High Scalability GroupCassandra 0.7, Los Angeles High Scalability Group
Cassandra 0.7, Los Angeles High Scalability Groupjbellis
 
Cassandra devoxx 2010
Cassandra devoxx 2010Cassandra devoxx 2010
Cassandra devoxx 2010jbellis
 
Cassandra FrOSCon 10
Cassandra FrOSCon 10Cassandra FrOSCon 10
Cassandra FrOSCon 10jbellis
 
State of Cassandra, August 2010
State of Cassandra, August 2010State of Cassandra, August 2010
State of Cassandra, August 2010jbellis
 

Más de jbellis (20)

Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloud
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1
 
Tokyo cassandra conference 2014
Tokyo cassandra conference 2014Tokyo cassandra conference 2014
Tokyo cassandra conference 2014
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013
 
Top five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionTop five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solution
 
Pycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from JavaPycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from Java
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprise
 
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
 
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
 
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
 
What python can learn from java
What python can learn from javaWhat python can learn from java
What python can learn from java
 
State of Cassandra, 2011
State of Cassandra, 2011State of Cassandra, 2011
State of Cassandra, 2011
 
Brisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by CassandraBrisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by Cassandra
 
PyCon 2010 SQLAlchemy tutorial
PyCon 2010 SQLAlchemy tutorialPyCon 2010 SQLAlchemy tutorial
PyCon 2010 SQLAlchemy tutorial
 
Cassandra 0.7, Los Angeles High Scalability Group
Cassandra 0.7, Los Angeles High Scalability GroupCassandra 0.7, Los Angeles High Scalability Group
Cassandra 0.7, Los Angeles High Scalability Group
 
Cassandra devoxx 2010
Cassandra devoxx 2010Cassandra devoxx 2010
Cassandra devoxx 2010
 
Cassandra FrOSCon 10
Cassandra FrOSCon 10Cassandra FrOSCon 10
Cassandra FrOSCon 10
 
State of Cassandra, August 2010
State of Cassandra, August 2010State of Cassandra, August 2010
State of Cassandra, August 2010
 

Último

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

London + Dublin Cassandra 2.0

  • 1. ©2013 DataStax Confidential. Do not distribute without consent. @spyced Jonathan Ellis CTO, DataStax / Project Chair, Apache Cassandra Cassandra 2.0
  • 2. Five Years of Cassandra Jul-09 May-10 Feb-11 Dec-11 Oct-12 Jul-13 0.1 0.3 0.6 0.7 1.0 1.2 ... 2.0 DSE Jul-08
  • 3. Core values 0 20000 40000 60000 80000 0 2 4 6 8 10 12 Cassandra HBase Redis MySQL • Massive scalablility • High performance • Reliability/Availability
  • 4. CREATE TABLE users ( id uuid PRIMARY KEY, name text, state text, birth_date int ); CREATE INDEX ON users(state); SELECT * FROM users WHERE state=‘Texas’ AND birth_date > 1950; New Core Value • Massive scalablility • High performance • Reliability/Availability • Ease of use
  • 5. *Key concepts? *Data Modeling section of documentation: http://www.datastax.com/ documentation/cassandra/1.2/index.html#cassandra/ddl/ ddl_anatomy_table_c.html CQL delivers "Coming from a relational database background we found the transition to Cassandra to be very straightforward. There are a few simple key concepts one must grasp at first but ever since it's been smooth sailing for us." Boris Wolf, Comcast
  • 6. 1.2 for Developers • CQL3 • Thrift compatibility • Collections • Data dictionary • Auth support • Hadoop support • Native drivers • Tracing • Atomic batches
  • 7. [cassandra.yaml] authenticator: PasswordAuthenticator # DSE offers KerberosAuthenticator as well CREATE USER robin WITH PASSWORD 'manager' SUPERUSER; ALTER USER cassandra WITH PASSWORD 'newpassword'; LIST USERS; DROP USER cassandra; Authentication
  • 8. [cassandra.yaml] authorizer: CassandraAuthorizer GRANT select ON audit TO jonathan; GRANT modify ON users TO robin; GRANT all ON ALL KEYSPACES TO lara; Authorization
  • 9. Native drivers • CQL native protocol: efficient, lightweight, asynchronous • Java (GA): https://github.com/datastax/java-driver • .NET (Beta): https://github.com/datastax/csharp-driver • Python (Beta): https://github.com/datastax/python-driver • Coming soon: PHP, Ruby, others
  • 10. 1.2 for Operators • Virtual nodes • “Dense node” support (5-10TB/machine) • JBOD improvements • Off-heap bloom filters, compression metadata • Parallel leveled compaction
  • 12. 1.2.5+ • ~1/2 memory usage in partition summary
  • 13. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle
  • 14. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction
  • 15. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters
  • 16. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation
  • 17. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation • LZ4 compression (default in 2.0)
  • 18. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation • LZ4 compression (default in 2.0) • (1.2.7) CQL Input/Output for Hadoop
  • 19. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation • LZ4 compression (default in 2.0) • (1.2.7) CQL Input/Output for Hadoop • (1.2.7) Range tombstone performance
  • 20. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation • LZ4 compression (default in 2.0) • (1.2.7) CQL Input/Output for Hadoop • (1.2.7) Range tombstone performance • (1.2.9) Larger default LCS filesize (160MB > 5MB)
  • 22. 2.0 • Lightweight transactions • Triggers (experimental) • Improved compaction • CQL cursors
  • 23. SELECT * FROM users WHERE username = ’jbellis’ [empty resultset] INSERT INTO users (...) VALUES (’jbellis’, ...) Session 1 SELECT * FROM users WHERE username = ’jbellis’ [empty resultset] INSERT INTO users (...) VALUES (’jbellis’, ...) Session 2 Lightweight transactions: the problem
  • 24. Paxos • All operations are quorum-based • Each replica sends information about unfinished operations to the leader during prepare • Paxos made Simple
  • 25. LWT: details • 4 round trips vs 1 for normal updates • Paxos state is durable • Immediate consistency with no leader election or failover • ConsistencyLevel.SERIAL • http://www.datastax.com/dev/blog/lightweight-transactions-in- cassandra-2-0
  • 26. LWT: Use with caution • Great for 1% of your application • Eventual consistency is your friend • http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency- hopeful-consistency-by-christos-kalantzis
  • 27. UPDATE USERS SET email = ’jonathan@datastax.com’, ... WHERE username = ’jbellis’ IF email = ’jbellis@datastax.com’; INSERT INTO USERS (username, email, ...) VALUES (‘jbellis’, ‘jbellis@datastax.com’, ... ) IF NOT EXISTS; Using LWT
  • 28. Triggers CREATE TRIGGER <name> ON <table> USING <classname>;
  • 29. Trigger implementation class MyTrigger implements ITrigger { public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update) { ... } }
  • 30. Experimental! • Relies on internal RowMutation, ColumnFamily classes • [partition] key is a ByteBuffer • Expect changes in 2.1
  • 31. Compaction • Single-pass, always • LCS performs STCS in L0
  • 35. Cursors (before) SELECT * FROM timeline WHERE (user_id = :last_key AND tweet_id > :last_tweet) OR token(user_id) > token(:last_key) LIMIT 100 CREATE TABLE timeline (   user_id uuid,   tweet_id timeuuid,   tweet_author uuid, tweet_body text,   PRIMARY KEY (user_id, tweet_id) );
  • 38. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path
  • 39. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path • New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor
  • 40. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path • New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor • Faster partition index lookups and cache reads by improving performance of off-heap memory
  • 41. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path • New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor • Faster partition index lookups and cache reads by improving performance of off-heap memory • Faster reads of compressed data by switching from CRC32 to Adler checksums
  • 42. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path • New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor • Faster partition index lookups and cache reads by improving performance of off-heap memory • Faster reads of compressed data by switching from CRC32 to Adler checksums • JEMalloc support for off-heap allocation
  • 44. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
  • 45. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric
  • 46. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric • The on-heap partition cache (“row cache”) has been removed
  • 47. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric • The on-heap partition cache (“row cache”) has been removed • Vnodes are on by default
  • 48. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric • The on-heap partition cache (“row cache”) has been removed • Vnodes are on by default • the old token range bisection code for non-vnode clusters is gone
  • 49. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric • The on-heap partition cache (“row cache”) has been removed • Vnodes are on by default • the old token range bisection code for non-vnode clusters is gone • Removed emergency memory pressure valve logic
  • 51. Operational concerns • Java7 is now required!
  • 52. Operational concerns • Java7 is now required! • Leveled compaction level information has been moved into sstable metadata
  • 53. Operational concerns • Java7 is now required! • Leveled compaction level information has been moved into sstable metadata • Kernel page cache skipping has been removed in favor of optional row preheating (preheat_kernel_page_cache)
  • 54. Operational concerns • Java7 is now required! • Leveled compaction level information has been moved into sstable metadata • Kernel page cache skipping has been removed in favor of optional row preheating (preheat_kernel_page_cache) • Streaming has been rewritten to be more transparent and robust.
  • 55. Operational concerns • Java7 is now required! • Leveled compaction level information has been moved into sstable metadata • Kernel page cache skipping has been removed in favor of optional row preheating (preheat_kernel_page_cache) • Streaming has been rewritten to be more transparent and robust. • Streaming support for old-version sstables
  • 56. ©2013 DataStax Confidential. Do not distribute without consent. 31