Cassandra - Say Goodbye to the Relational Database (5-6-2010)
1. Cassandra
Say Goodbye to the Relational Database
Twin Cities PHP User Group
May 6, 2010
Chris Barber
CB1, INC.
http://www.cb1inc.com/
2. About Me
● Chris Barber
● Open source hacker
● Software consultant
● JavaScript, C++, PHP
● http://www.cb1inc.com/
● http://twitter.com/cb1inc
● http://twitter.com/cb1kenobi
● http://slideshare.net/cb1kenobi
Minnesota PHP User Group | 5.6.2010 | Chris Barber | CB1, INC. | http://www.cb1inc.com/
4. A highly scalable, eventually
consistent, distributed,
structured key-value store.
5. About Cassandra
● Started by Facebook
● Open Source
● Apache Project
● Apache License 2.0
● Written in Java
● Multi-platform
● Current Version 0.6.1
● http://cassandra.apache.org/
9. Cassandra Overview
● Like a big hash table of hash tables
● Column Database (schemaless)
● Highly scalable
● Add nodes in minutes
● Fault tolerant
● Distributed
● Tunable
10. Dynamo + BigTable = Cassandra
● Amazon Dynamo
● Cluster management
● Replication
● Fault tolerance
● Google BigTable
● Sparse
● Columnar data model
● Storage architecture
11. Pros & Cons
● Pros
● Easy to scale
● No single point of failure
● High write throughput
● Handles lots of data
● Durable
● No more SQL injection
● Built on Thrift
● Cons
● No joins
● Index & sort keys only
● Not good for large blobs
● Rows must fit in memory
12. CAP Theorem
● CAP Theorem
● Consistency
● Availability
● Partition tolerance
● You can only have 2
● Cassandra favors Availability and Partition tolerance
● Eventually consistent
– Can be defined on a per request basis
13. Consistency
● Specified for each operation
● Zero
● One
● Quorum (N/2 + 1, where N is the replication factor)
● All
● Any
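The arithmetic behind the Quorum level can be sketched in a few lines of PHP. This is an illustration only: quorumSize() and readsSeeLatestWrite() are hypothetical helper names, not part of any Cassandra client API. It assumes a quorum is a majority of the N replicas, and that a read is guaranteed to see the latest write when the replicas read (R) and the replicas written (W) must overlap, i.e. R + W > N.

```php
<?php
// Hypothetical helpers for illustration; not part of any Cassandra client.

// Number of replicas that must respond at QUORUM, given replication factor N:
// a majority, floor(N / 2) + 1.
function quorumSize(int $replicationFactor): int
{
    return intdiv($replicationFactor, 2) + 1;
}

// True when the set of replicas read and the set of replicas written must
// share at least one node, i.e. R + W > N.
function readsSeeLatestWrite(int $readReplicas, int $writeReplicas, int $replicationFactor): bool
{
    return $readReplicas + $writeReplicas > $replicationFactor;
}

$n = 3;
$q = quorumSize($n);
printf("N=%d quorum=%d\n", $n, $q);        // N=3 quorum=2
var_dump(readsSeeLatestWrite($q, $q, $n)); // QUORUM reads + QUORUM writes: bool(true)
var_dump(readsSeeLatestWrite(1, 1, $n));   // ONE reads + ONE writes: bool(false)
```

With three replicas, reading and writing at Quorum touches two nodes each way, so the two sets always overlap. Reading and writing at One is faster but only eventually consistent.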
14. Replication Ring
● Ring of servers
● Talk to each other using "gossip"
● Data distributed between nodes
● Uses "tokens" to partition data
● Must be unique per node
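How tokens partition data around the ring can be sketched as follows. This is a simplified model, not Cassandra's implementation: the node names and token values are made up, tokens are written as 32-character hex strings instead of big integers, and the function names are hypothetical. The idea shown is that each key hashes to a token, and the key belongs to the first node whose token is greater than or equal to it, wrapping around at the end of the ring.

```php
<?php
// Sketch only: node names and tokens are made up, and real Cassandra tokens
// are big integers rather than the hex strings used here.

// Token for a row key. md5() returns 32 lowercase hex characters, and
// equal-length lowercase hex strings compare in numeric order under strcmp().
function keyToken(string $key): string
{
    return md5($key);
}

// Returns the node owning $token: the first node, in token order, whose
// token is >= $token, wrapping to the lowest token past the end of the ring.
function nodeForToken(array $ring, string $token): string
{
    usort($ring, function (array $a, array $b): int {
        return strcmp($a['token'], $b['token']);
    });
    foreach ($ring as $entry) {
        if (strcmp($token, $entry['token']) <= 0) {
            return $entry['node'];
        }
    }
    return $ring[0]['node'];
}

function nodeForKey(array $ring, string $key): string
{
    return nodeForToken($ring, keyToken($key));
}

// Four nodes, each with a unique token spaced evenly around the ring.
$ring = [
    ['token' => '2' . str_repeat('0', 31), 'node' => 'node-a'],
    ['token' => '6' . str_repeat('0', 31), 'node' => 'node-b'],
    ['token' => 'a' . str_repeat('0', 31), 'node' => 'node-c'],
    ['token' => 'e' . str_repeat('0', 31), 'node' => 'node-d'],
];

foreach (['cb1kenobi', 'mykey'] as $key) {
    printf("%s -> %s\n", $key, nodeForKey($ring, $key));
}
```

Two nodes sharing a token would claim the same range, which is why tokens must be unique per node.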
15. Partitioning
● RandomPartitioner
● Inefficient range queries
● Doesn't sort properly
● OrderPreservingPartitioner
● Can cause unevenly distributed data
● Stores data sorted
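The trade-off between the two partitioners can be illustrated by sorting the same keys two ways. This is a sketch under assumptions: the keys are sample data, md5() stands in for the hash applied by a random partitioner, and orderByKey() and orderByToken() are hypothetical names.

```php
<?php
// Sketch only: the keys are sample data and md5() stands in for the hash.

// Layout under an order-preserving partitioner: rows sorted by key, so
// neighbouring keys sit next to each other and range scans are cheap.
function orderByKey(array $keys): array
{
    sort($keys, SORT_STRING);
    return $keys;
}

// Layout under a hash-based partitioner: rows sorted by the MD5 token of
// each key, which scatters neighbouring keys across the ring.
function orderByToken(array $keys): array
{
    usort($keys, function (string $a, string $b): int {
        return strcmp(md5($a), md5($b));
    });
    return $keys;
}

$keys = ['user:003', 'user:001', 'user:002'];
echo implode(', ', orderByKey($keys)), "\n";
echo implode(', ', orderByToken($keys)), "\n";
```

Hashing spreads load evenly but loses key order. Keeping key order makes range queries efficient, but popular key prefixes can pile up on a few nodes.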
16. Replica Placement Strategy
● Rack-unaware
● Default
● Rack-aware
● Places one replica in a different datacenter and the others on different racks within the local datacenter
17. Data Model
● Keyspace
● Column Family (standard or super)
● Columns & Super Columns
● Keys and column names
Keyspace1: {
users: {
"cb1kenobi": {
"FirstName": "chris",
"LastName": "barber"
}
}
}
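For PHP developers, the same structure maps naturally onto nested arrays. The sketch below uses plain PHP arrays as a stand-in for Cassandra's storage. getColumn() is a hypothetical helper, and the second row with its "Email" column is made-up sample data added to show that rows in one column family need not share columns.

```php
<?php
// Sketch only: plain PHP arrays standing in for Cassandra's storage.
// The "jdoe" row and its "Email" column are made-up sample data.
$keyspace1 = [
    'users' => [                       // column family
        'cb1kenobi' => [               // row key
            'FirstName' => 'chris',    // column name => value
            'LastName'  => 'barber',
        ],
        'jdoe' => [                    // rows need not share columns (sparse)
            'Email' => 'jdoe@example.com',
        ],
    ],
];

// Looks up one column value, or null if any level of the path is missing.
function getColumn(array $keyspace, string $columnFamily, string $rowKey, string $column): ?string
{
    return $keyspace[$columnFamily][$rowKey][$column] ?? null;
}

var_dump(getColumn($keyspace1, 'users', 'cb1kenobi', 'FirstName')); // string(5) "chris"
var_dump(getColumn($keyspace1, 'users', 'jdoe', 'FirstName'));      // NULL
```

A super column family would add one more level of nesting between the row key and the columns.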
18. Installing & Deploying Cassandra
19. Getting Cassandra
● http://cassandra.apache.org/download/
● http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-bin.tar.gz
● http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-src.tar.gz
● svn checkout https://svn.apache.org/repos/asf/cassandra/trunk cassandra
● git clone git://git.apache.org/cassandra.git
21. Installing Cassandra
su
cd /usr/local
wget http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-src.tar.gz
tar xzf apache-cassandra-0.6.1-src.tar.gz
mkdir -p /var/log/cassandra
chown -R `whoami` /var/log/cassandra
mkdir -p /var/lib/cassandra
chown -R `whoami` /var/lib/cassandra
cd apache-cassandra-0.6.1-src
ant
22. Configuration
● Main config file
● conf/storage-conf.xml
● Keyspaces
● Partitioner
● AutoBootstrap
● Authentication method
● Buffer sizes
● Timeouts
23. Automatically Start Cassandra
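# create the cassandra group first; useradd -G fails if the group does not exist
groupadd cassandra
useradd -g cassandra cassandra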
<editor of choice> /etc/init.d/cassandra
# paste contents of next slide
chmod +x /etc/init.d/cassandra
# Ubuntu/Debian method:
update-rc.d -f cassandra defaults
# Red Hat/Fedora method: use chkconfig
29. PHP Thrift Client Example
<?php
// Load the Thrift runtime and the generated Cassandra bindings
$GLOBALS['THRIFT_ROOT'] = './thrift';
require $GLOBALS['THRIFT_ROOT'] . '/Thrift.php';
require $GLOBALS['THRIFT_ROOT'] . '/transport/TSocket.php';
require $GLOBALS['THRIFT_ROOT'] . '/transport/TBufferedTransport.php';
require $GLOBALS['THRIFT_ROOT'] . '/protocol/TBinaryProtocol.php';
require $GLOBALS['THRIFT_ROOT'] . '/packages/cassandra/Cassandra.php';

// Connect to a local node on Cassandra's default Thrift port
$socket = new TSocket('127.0.0.1', 9160);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient($protocol);
$transport->open();

// Address the "firstname" column in the Standard1 column family
$columnPath = new cassandra_ColumnPath();
$columnPath->column_family = 'Standard1';
$columnPath->super_column = null;
$columnPath->column = 'firstname';

// Write the column, then read it back, both at consistency level ONE
$client->insert('Keyspace1', 'mykey', $columnPath, 'Chris', time(),
    cassandra_ConsistencyLevel::ONE);
$name = $client->get('Keyspace1', 'mykey', $columnPath, cassandra_ConsistencyLevel::ONE);
var_dump($name);

$transport->close();
30. Prophet PHP Extension
● C++ PHP Extension
● Built on top of Thrift C library
● Very, very, very far from usable/working/complete
● Goals
● Speed!
● Full API support
● CRUD/ORM magic
● Serialization helper
● Developed for PHP 5.3, Linux, non-threaded (i.e. FastCGI)
● http://github.com/cb1kenobi/prophet
32. Roadmap 0.7 & Beyond
● SSTable compression
● Live keyspace & column family changes
● Vector clock support
● Truncate support
● Range delete
● byte[] keys
● Memory efficient compactions
● Apache Avro
● Multi-tenant support
* Taken from other presentations
33. Resources
● Cassandra Wiki
● http://wiki.apache.org/cassandra/
● IRC
● #cassandra on irc.freenode.net
● Cassandra Users Mailing List
● user-subscribe@cassandra.apache.org
● Follow people on Twitter
● @cassandra
● @spyced
● @b6n
● @jericevans
● @riptano
34. Getting Help
CB1, INC
http://www.cb1inc.com/
Web Applications
Open Source Solutions
35. Thanks!
Questions?
http://www.cb1inc.com/
http://twitter.com/cb1inc
http://slideshare.net/cb1kenobi
http://twitter.com/cb1kenobi