This presentation introduces Apache Cassandra 1.2. It gives an overview of key concepts (the cluster, nodes, and the data model), covers data modeling best practices, and discusses Cassandra's origins and popularity. The cluster section covers consistent hashing and token ranges, replication strategies, and consistency levels. The data model section covers tables, columns, SSTables, caching, and compaction, and closes by building a Twitter-like data model in CQL.
Cassandra Community Webinar - Introduction To Apache Cassandra 1.2
1. CASSANDRA COMMUNITY WEBINARS APRIL 2013
INTRODUCTION TO
APACHE CASSANDRA 1.2
Aaron Morton
Apache Cassandra Committer, DataStax MVP for Apache Cassandra
@aaronmorton
www.thelastpickle.com
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
69. Some users
cqlsh:cass_community> INSERT INTO User
... (user_name, password, real_name)
... VALUES
... ('fred', 'sekr8t', 'Mr Foo');
cqlsh:cass_community> select * from User;
 user_name | password | real_name
-----------+----------+-----------
 fred      | sekr8t   | Mr Foo
70. Some users
cqlsh:cass_community> INSERT INTO User
... (user_name, password)
... VALUES
... ('bob', 'pwd');
cqlsh:cass_community> select * from User where user_name = 'bob';
 user_name | password | real_name
-----------+----------+-----------
 bob       | pwd      | null
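The definition of the User table is not part of this excerpt. The following is a minimal sketch that is consistent with the columns and the primary key shown on these slides. The column types, the SimpleStrategy replication settings, and the replication factor of 1 (suitable only for a single-node test) are assumptions, not taken from the original deck.

```cql
-- Assumed keyspace definition; the replication settings are illustrative only.
CREATE KEYSPACE cass_community
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

USE cass_community;

-- Assumed schema: every column is text, and user_name is the primary key.
CREATE TABLE User
(
    user_name text,
    password text,
    real_name text,
    PRIMARY KEY (user_name)
);
```

With a schema like this, a column that is left out of an INSERT (real_name for 'bob') is simply not stored, and cqlsh displays it as null.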
71. Data Model (so far)
 Table / Value | User
---------------+-------------
 user_name     | Primary Key
76. UserTweets Table...
cqlsh:cass_community> INSERT INTO UserTweets
... (tweet_id, body, user_name, timestamp)
... VALUES
... (1, 'The Tweet','fred',1352150816917);
cqlsh:cass_community> select * from UserTweets where user_name='fred';
 user_name | tweet_id | body      | timestamp
-----------+----------+-----------+--------------------------
 fred      | 1        | The Tweet | 2012-11-06 10:26:56+1300
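The CREATE TABLE statement for UserTweets is not part of this excerpt. The sketch below is one schema that would accept the INSERT above and return the columns in the order shown. It assumes the same column types as the UserTimeline table and a compound primary key of (user_name, tweet_id), which matches the "Primary Key Component" label the data model slides give to tweet_id.

```cql
-- Assumed schema. user_name is the partition key, so all of a user's
-- tweets live in one partition; tweet_id is the clustering column, so
-- rows inside that partition are stored sorted by tweet_id.
CREATE TABLE UserTweets
(
    user_name text,
    tweet_id bigint,
    body text,
    timestamp timestamp,
    PRIMARY KEY (user_name, tweet_id)
);
```

The timestamp value in the INSERT (1352150816917) is milliseconds since the Unix epoch, which cqlsh renders as 2012-11-06 10:26:56+1300.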
77. UserTweets Table...
cqlsh:cass_community> select * from UserTweets where user_name='fred' and tweet_id=1;
 user_name | tweet_id | body      | timestamp
-----------+----------+-----------+--------------------------
 fred      | 1        | The Tweet | 2012-11-06 10:26:56+1300
78. UserTweets Table...
cqlsh:cass_community> INSERT INTO UserTweets
... (tweet_id, body, user_name, timestamp)
... VALUES
... (2, 'Second Tweet', 'fred', 1352150816918);
cqlsh:cass_community> select * from UserTweets where user_name = 'fred';
 user_name | tweet_id | body         | timestamp
-----------+----------+--------------+--------------------------
 fred      | 1        | The Tweet    | 2012-11-06 10:26:56+1300
 fred      | 2        | Second Tweet | 2012-11-06 10:26:56+1300
79. UserTweets Table...
cqlsh:cass_community> select * from UserTweets where user_name = 'fred' order by tweet_id desc;
 user_name | tweet_id | body         | timestamp
-----------+----------+--------------+--------------------------
 fred      | 2        | Second Tweet | 2012-11-06 10:26:56+1300
 fred      | 1        | The Tweet    | 2012-11-06 10:26:56+1300
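If tweet_id is a clustering column of UserTweets, as the queries on these slides suggest, the same partition can also be sliced by a range of tweet_id values or capped with LIMIT. The queries below are illustrative and do not appear in the original deck.

```cql
-- Slice of one user's partition: tweets with a tweet_id greater than 1.
SELECT * FROM UserTweets WHERE user_name = 'fred' AND tweet_id > 1;

-- Only the row with the highest tweet_id: read the partition in
-- reverse clustering order and stop after the first row.
SELECT * FROM UserTweets WHERE user_name = 'fred' ORDER BY tweet_id DESC LIMIT 1;
```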
80. UserTimeline
CREATE TABLE UserTimeline
(
    user_name text,
    tweet_id bigint,
    tweet_user text,
    body text,
    timestamp timestamp,
    PRIMARY KEY (user_name, tweet_id)
)
WITH CLUSTERING ORDER BY (tweet_id DESC);
82. UserTimeline
cqlsh:cass_community> select * from UserTimeline where user_name = 'fred';
 user_name | tweet_id | body      | timestamp                | tweet_user
-----------+----------+-----------+--------------------------+------------
 fred      | 100      | My Tweet  | 2012-11-06 10:27:26+1300 | bob
 fred      | 1        | The Tweet | 2012-11-06 10:26:56+1300 | fred
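The statements that populated UserTimeline are not part of this excerpt. A sketch of writes that would produce the two rows above is shown here. The millisecond part of the second timestamp is an assumption, since the output only shows whole seconds.

```cql
-- fred's own tweet, copied into fred's timeline.
INSERT INTO UserTimeline
    (user_name, tweet_id, tweet_user, body, timestamp)
VALUES
    ('fred', 1, 'fred', 'The Tweet', 1352150816917);

-- A tweet written by bob, copied into fred's timeline.
-- 1352150846000 is 2012-11-06 10:27:26+1300; the trailing 000 ms is assumed.
INSERT INTO UserTimeline
    (user_name, tweet_id, tweet_user, body, timestamp)
VALUES
    ('fred', 100, 'bob', 'My Tweet', 1352150846000);
```

Each timeline row carries its own copy of the tweet body, so reading a timeline touches a single partition. Because the table is declared WITH CLUSTERING ORDER BY (tweet_id DESC), the plain SELECT returns tweet 100 before tweet 1 without an ORDER BY clause.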
83. Data Model (so far)
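 Table / Value | User        | Tweet       | User Tweets           | User Timeline
---------------+-------------+-------------+-----------------------+-----------------------
 user_name     | Primary Key | Field       | Primary Key           | Primary Key
 tweet_id      |             | Primary Key | Primary Key Component | Primary Key Component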
85. UserMetrics Table...
cqlsh:cass_community> UPDATE
... UserMetrics
... SET
... tweets = tweets + 1
... WHERE
... user_name = 'fred';
cqlsh:cass_community> select * from UserMetrics where user_name = 'fred';
 user_name | followers | following | tweets
-----------+-----------+-----------+--------
 fred      | null      | null      | 1
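The UPDATE above increments a column in place (tweets = tweets + 1), which is how CQL counter columns are written. The table definition is not part of this excerpt; the sketch below assumes that all three metrics are counters, which is consistent with the output shown.

```cql
-- Assumed schema. In a counter table, every column outside the
-- primary key must be of type counter.
CREATE TABLE UserMetrics
(
    user_name text,
    tweets counter,
    followers counter,
    following counter,
    PRIMARY KEY (user_name)
);

-- Counters are changed with UPDATE, not INSERT. The first increment
-- creates the row; untouched counters are displayed as null.
UPDATE UserMetrics SET followers = followers + 1 WHERE user_name = 'fred';
```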
86. Data Model (so far)
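 Table / Value | User        | Tweet       | User Tweets           | User Timeline         | User Metrics
---------------+-------------+-------------+-----------------------+-----------------------+--------------
 user_name     | Primary Key | Field       | Primary Key           | Primary Key           | Primary Key
 tweet_id      |             | Primary Key | Primary Key Component | Primary Key Component |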
89. Relationships
cqlsh:cass_community> select * from Following;
 user_name | following | timestamp
-----------+-----------+--------------------------
 bob       | fred      | 2012-11-07 13:22:29+1300
cqlsh:cass_community> select * from Followers;
 user_name | follower | timestamp
-----------+----------+--------------------------
 fred      | bob      | 2012-11-07 13:22:29+1300
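The two tables store the same relationship from both directions, so that "who does bob follow?" and "who follows fred?" are each answered from a single partition. Their definitions are not part of this excerpt. The sketch below assumes compound primary keys and text columns, and the millisecond part of the timestamp is also assumed.

```cql
-- Assumed schemas: one row per (user, followed user) pair in each direction.
CREATE TABLE Following
(
    user_name text,
    following text,
    timestamp timestamp,
    PRIMARY KEY (user_name, following)
);

CREATE TABLE Followers
(
    user_name text,
    follower text,
    timestamp timestamp,
    PRIMARY KEY (user_name, follower)
);

-- bob starts following fred: write both directions in one batch.
-- 1352247749000 is 2012-11-07 13:22:29+1300; the trailing 000 ms is assumed.
BEGIN BATCH
    INSERT INTO Following (user_name, following, timestamp)
        VALUES ('bob', 'fred', 1352247749000);
    INSERT INTO Followers (user_name, follower, timestamp)
        VALUES ('fred', 'bob', 1352247749000);
APPLY BATCH;
```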
90. Data Model
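 Table / Value | User        | Tweet       | User Tweets           | User Timeline         | User Metrics | Follows Followers
---------------+-------------+-------------+-----------------------+-----------------------+--------------+-------------------
 user_name     | Primary Key | Field       | Primary Key           | Primary Key           | Primary Key  | Primary Key
 tweet_id      |             | Primary Key | Primary Key Component | Primary Key Component |              |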