11.
ID  FIRST  LAST
1   John   Smith
2   Mike   Kowalski

:name_1 -> “John Smith”
:name_2 -> “Mike Kowalski”

Company
  Employee
    Name: John Smith
    ID: 1
  Employee
    Name: Mike Kowalski
    ID: 2

ACME
  Employee:1:Name  Employee:2:Name
  John Smith       Mike Kowalski

John Smith -- works with -- Mike Kowalski
Tuesday, October 22, 13
16.
Relational (MySQL, Oracle, ...)
  ID  FIRST  LAST
  1   John   Smith
  2   Mike   Kowalski

Key-Value (Redis, Riak, Dynamo, ...)
  :name_1 -> “John Smith”
  :name_2 -> “Mike Kowalski”

Document (MongoDB, Couchbase, ...)
  Company
    Employee
      Name: John Smith
      ID: 1
    Employee
      Name: Mike Kowalski
      ID: 2

Wide Column (BigTable, Cassandra, HBase, ...)
  ACME
    Employee:1:Name  Employee:2:Name
    John Smith       Mike Kowalski

Graph (Neo4j, ...)
  John Smith -- works with -- Mike Kowalski
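
The five models above can be compared side by side in a minimal sketch. Plain Python containers stand in for the real stores, and the variable and helper names (`relational`, `names_document`, and so on) are illustrative assumptions, not any database's API. The point is only that each shape holds the same two employees.

```python
# Relational: fixed columns, one tuple per row.
relational = {
    "columns": ("ID", "FIRST", "LAST"),
    "rows": [(1, "John", "Smith"), (2, "Mike", "Kowalski")],
}

# Key-value: flat keys, opaque values.
key_value = {":name_1": "John Smith", ":name_2": "Mike Kowalski"}

# Document: one nested record per company.
document = {
    "Company": {
        "Employee": [
            {"Name": "John Smith", "ID": 1},
            {"Name": "Mike Kowalski", "ID": 2},
        ]
    }
}

# Wide column: row key -> arbitrary named columns.
wide_column = {
    "ACME": {
        "Employee:1:Name": "John Smith",
        "Employee:2:Name": "Mike Kowalski",
    }
}

# Graph: nodes plus labelled edges between them.
graph = {
    "nodes": {"John Smith", "Mike Kowalski"},
    "edges": [("John Smith", "works with", "Mike Kowalski")],
}


def names_relational(table):
    # Join the FIRST and LAST columns of every row.
    return [f"{first} {last}" for _id, first, last in table["rows"]]


def names_key_value(store):
    # Values are opaque, so all we can do is read them back by key.
    return [store[key] for key in sorted(store)]


def names_document(doc):
    # Walk the nested structure down to each employee record.
    return [employee["Name"] for employee in doc["Company"]["Employee"]]


def names_wide_column(rows):
    # Each row carries its own set of columns, read here in column-name order.
    return [value for cols in rows.values() for _name, value in sorted(cols.items())]


def names_graph(g):
    # Nodes are unordered; sort them for a stable result.
    return sorted(g["nodes"])
```

Each helper returns the same two names, which is one way to see that the models differ in how data is addressed rather than in what can be stored.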
24. • Dynamo partitioning + BigTable model
• simple architecture, minimal administration
• no single point of failure
• closer to the metal (e.g. no HDFS)
• low latency
33. TWO PARTITIONERS OUT OF THE BOX
• Byte Ordered Partitioner
• Random Partitioner
http://www.datastax.com/docs/1.0/cluster_architecture/partitioning
34. TWO PARTITIONERS OUT OF THE BOX
• Byte Ordered Partitioner
  Forget it:
  • hot spots
  • uneven distribution
  • load balancing
• Random Partitioner
http://www.datastax.com/docs/1.0/cluster_architecture/partitioning
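
The hot-spot problem with byte-ordered placement can be illustrated with a small simulation. This is a sketch under stated assumptions, not Cassandra's implementation: the cluster has four hypothetical nodes that each own an equal slice of the token range, the random variant derives the token from an MD5 hash of the key, and the byte-ordered variant is simplified to look only at the first byte of the key.

```python
import hashlib
from collections import Counter

NODES = 4  # hypothetical cluster size


def random_partitioner_node(key: str) -> int:
    """Hash the key with MD5 and map the 128-bit token onto NODES equal ranges."""
    token = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    return (token * NODES) >> 128  # floor(token * NODES / 2**128)


def byte_ordered_node(key: str) -> int:
    """Use the raw key bytes as the token; simplified to the first byte only.

    Assumes a non-empty key.
    """
    first_byte = key.encode("utf-8")[0]
    return (first_byte * NODES) // 256


# Keys that share a common prefix, as application keys often do.
keys = [f"user{i:04d}" for i in range(1000)]

random_load = Counter(random_partitioner_node(k) for k in keys)
ordered_load = Counter(byte_ordered_node(k) for k in keys)

print("random      :", dict(sorted(random_load.items())))
print("byte ordered:", dict(sorted(ordered_load.items())))
```

Because every key starts with `u`, the byte-ordered variant sends all 1000 keys to a single node, while the hashed variant spreads them over all four.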
64.
Author         Book         Year  Number of words
George Orwell  Animal Farm  1945  32451
George Orwell  1984         1949  110581
James Joyce    Ulysses      1922  265192
65.
Author         Book         Year  Number of words
George Orwell  Animal Farm  1945  32451
George Orwell  1984         1949  110581
James Joyce    Ulysses      1922  265192

CREATE TABLE books (
    author varchar,
    title varchar,
    year int,
    number_of_words int,
    PRIMARY KEY (author, title)
);
66.
Author         Book         Year  Number of words
George Orwell  Animal Farm  1945  32451
George Orwell  1984         1949  110581
James Joyce    Ulysses      1922  265192

CREATE TABLE books (
    author varchar,
    title varchar,
    year int,
    number_of_words int,
    PRIMARY KEY (author, title)
);

George Orwell
  [1984, Year]: 1949
  [1984, Number of words]: 110581
  [Animal Farm, Year]: 1945
  [Animal Farm, Number of words]: 32451
James Joyce
  [Ulysses, Year]: 1922
  [Ulysses, Number of words]: 265192
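
The mapping from table rows to the wide-row layout above can be sketched in a few lines. This is a simplified model, not the actual storage engine: the function name `to_wide_rows` and its parameters are assumptions made for illustration, and real storage details such as timestamps are left out. The partition key picks the wide row, and every other column becomes a cell named by the pair (clustering value, column name).

```python
from collections import defaultdict

rows = [
    {"author": "George Orwell", "title": "Animal Farm", "year": 1945, "number_of_words": 32451},
    {"author": "George Orwell", "title": "1984", "year": 1949, "number_of_words": 110581},
    {"author": "James Joyce", "title": "Ulysses", "year": 1922, "number_of_words": 265192},
]


def to_wide_rows(rows, partition_key, clustering_key):
    """Group rows by partition key and flatten the rest into composite-named cells."""
    storage = defaultdict(dict)
    for row in rows:
        for column, value in row.items():
            if column in (partition_key, clustering_key):
                continue  # key columns become part of the address, not a cell value
            cell_name = (row[clustering_key], column)
            storage[row[partition_key]][cell_name] = value
    # Cells inside a partition are kept sorted by their composite name.
    return {pk: dict(sorted(cells.items())) for pk, cells in storage.items()}


layout = to_wide_rows(rows, partition_key="author", clustering_key="title")

for author, cells in layout.items():
    print(author)
    for (title, column), value in cells.items():
        print(f"  [{title}, {column}]: {value}")
```

Both Orwell books end up in the same wide row, which is why a query by author alone touches a single partition.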
71. • no DESCRIBE when calling from a client
• cache settings
• insertion performance with 100 000’s of columns
• PRIMARY KEY((a,b,c),d)
• compaction settings
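
For the `PRIMARY KEY((a,b,c),d)` item, a small sketch may help show how the nested parentheses are read: the first element of the primary key is the partition key, and when it is itself parenthesized, all of its columns together form a composite partition key. The remaining columns are clustering columns. The helper `split_primary_key` is hypothetical and handles only simple column lists like the ones on these slides.

```python
def split_primary_key(spec: str) -> tuple[list[str], list[str]]:
    """Split a PRIMARY KEY column list into (partition key columns, clustering columns)."""
    text = spec.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError(f"expected a parenthesized column list, got {spec!r}")
    inner = text[1:-1].strip()

    if inner.startswith("("):
        # Composite partition key: everything inside the inner parentheses.
        close = inner.index(")")
        partition = [col.strip() for col in inner[1:close].split(",")]
        rest = inner[close + 1:]
    else:
        # Single-column partition key: the first column only.
        first, _, rest = inner.partition(",")
        partition = [first.strip()]

    clustering = [col.strip() for col in rest.split(",") if col.strip()]
    return partition, clustering


print(split_primary_key("((a,b,c),d)"))
print(split_primary_key("(author, title)"))
```

With `((a,b,c),d)`, the values of `a`, `b`, and `c` together decide which partition a row lands in, and `d` orders rows inside that partition.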