CQL3 in depth
- 1. CQL3 in depth
Cassandra Conference in Tokyo, 11/29/2012
Yuki Morishita
Software Engineer@DataStax / Apache Cassandra Committer
©2012 DataStax
- 2. Agenda!
• Why CQL3?
• CQL3 walkthrough
• Defining Schema
• Querying / Mutating Data
• New features
• Related topics
• Native transport
- 4. Cassandra Storage
create column family profiles
with key_validation_class = UTF8Type
and comparator = UTF8Type
and column_metadata = [
{column_name: first_name, validation_class: UTF8Type},
{column_name: last_name, validation_class: UTF8Type},
{column_name: year, validation_class: IntegerType}
];
Stored as:
row key: nobu
    first_name  Nobunaga
    last_name   Oda
    year        1582
Column values are validated by validation_class; columns are sorted in comparator order.
- 5. Thrift API
• Low level: get, get_slice, mutate...
• Directly exposes internal storage
structure
• Hard to change the signature of API
- 6. Inserting data with Thrift
// Assumes `client` is a connected Thrift Cassandra.Client and
// `consistencyLevel` is a ConsistencyLevel chosen by the caller.
// 1. Build one column: name, value, write timestamp.
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());
// 2. Wrap the column in a mutation.
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(cosc);
List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);
// 3. Group mutations by column family name.
Map<String, List<Mutation>> cf = new HashMap<String, List<Mutation>>();
cf.put("Standard1", mutations);
// 4. Group column families by row key, then send everything in one call.
Map<ByteBuffer, Map<String, List<Mutation>>> records
    = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
records.put(ByteBuffer.wrap("key".getBytes()), cf);
client.batch_mutate(records, consistencyLevel);
- 7. ... with Cassandra Query Language
INSERT INTO "Standard1" (key, name)
VALUES ('key', 'value');
• Introduced in 0.8 (CQL), updated in
1.0 (CQL2)
• Syntax similar to SQL
• More extensible than Thrift API
- 8. CQL2 Problems
• Almost a one-to-one mapping to the Thrift API, so
it does not compose well with the row-oriented parts
of SQL
• No support for CompositeType
- 9. CQL3
• Maps storage to a more natural rows-
and-columns representation using
CompositeType
• Wide rows are “transposed” and unpacked
into named columns
• beta in 1.1, default in 1.2
• New features
• Collection support
- 11. Defining Keyspace
• Syntax has changed from CQL2
CREATE KEYSPACE my_keyspace WITH replication = {
'class': 'SimpleStrategy',
'replication_factor': 2
};
- 12. Defining Static Column Family
• “Strict” schema definition (and that’s a good
thing)
• You cannot add columns arbitrarily
• You need to run ALTER TABLE ... ADD column
first
• Columns are defined and sorted using
CompositeType comparator
- 13. Defining Static Column Family
CREATE TABLE profiles (
    user_id text PRIMARY KEY,
    first_name text,
    last_name text,
    year int
)
 user_id | first_name | last_name | year
---------+------------+-----------+------
    nobu |   Nobunaga |       Oda | 1582
Stored as (comparator: CompositeType(UTF8Type)):
row key: nobu (user_id values are validated by type definition)
    :
    first_name:  Nobunaga
    last_name:   Oda
    year:        1582
Columns are sorted in comparator order.
- 14. Defining Dynamic Column Family
• Then, how can we add columns
dynamically to our time series data like
we did before?
• Use a compound key
- 15. Compound key
CREATE TABLE comments (
article_id uuid,
posted_at timestamp,
author text,
content text,
PRIMARY KEY (article_id, posted_at)
)
Stored as (comparator: CompositeType(DateType, UTF8Type)):
row key: 550e8400-.. (article_id values are validated by type definition)
    1350499616:
    1350499616:author   yukim
    1350499616:content  blah, blah, blah
    1368499616:
    1368499616:author   yukim
    1368499616:content  well, well, well
    ...
Columns are sorted in comparator order, first by date, and then by column name.
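The mapping between the table and its stored columns can be sketched in a few lines of Python. This is only an illustrative model of the layout shown on the slide, not Cassandra code: the function names `to_cells` and `to_rows` and the tuple representation of composite column names are assumptions made for the example.

```python
from typing import Any, Dict, List, Tuple

# (row key, composite column name, value); a hypothetical in-memory model.
Cell = Tuple[Any, Tuple[Any, ...], Any]


def to_cells(row: Dict[str, Any], partition_key: str, clustering: List[str]) -> List[Cell]:
    """Flatten one table row into cells named <clustering values>:<column name>."""
    row_key = row[partition_key]
    prefix = tuple(row[name] for name in clustering)
    # The bare "1350499616:" entry on the slide: a cell with an empty last component.
    cells: List[Cell] = [(row_key, prefix + ("",), None)]
    for name, value in row.items():
        if name != partition_key and name not in clustering:
            cells.append((row_key, prefix + (name,), value))
    return cells


def to_rows(cells: List[Cell], partition_key: str, clustering: List[str]) -> List[Dict[str, Any]]:
    """Transpose cells, sorted in comparator order, back into table rows."""
    rows: Dict[Tuple[Any, Tuple[Any, ...]], Dict[str, Any]] = {}
    for row_key, column, value in sorted(cells, key=lambda cell: (cell[0], cell[1])):
        *prefix, name = column
        row = rows.setdefault(
            (row_key, tuple(prefix)),
            {partition_key: row_key, **dict(zip(clustering, prefix))},
        )
        if name:  # skip the empty-named cell
            row[name] = value
    return list(rows.values())


comments = [
    {"article_id": "550e8400-..", "posted_at": 1368499616,
     "author": "yukim", "content": "well, well, well"},
    {"article_id": "550e8400-..", "posted_at": 1350499616,
     "author": "yukim", "content": "blah, blah, blah"},
]
cells = [cell for row in comments for cell in to_cells(row, "article_id", ["posted_at"])]
rows = to_rows(cells, "article_id", ["posted_at"])
```

Because the cells sort first by `posted_at` and then by column name, `rows` comes back ordered by date even though the input list was not.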
- 16. Compound key
cqlsh:ks> SELECT * FROM comments;
 article_id   | posted_at                | author | content
--------------+--------------------------+--------+------------------
 550e8400-... | 1970-01-17 00:08:19+0900 | yukim  | blah, blah, blah
 550e8400-... | 1970-01-17 05:08:19+0900 | yukim  | well, well, well
cqlsh:ks> SELECT * FROM comments WHERE posted_at >= '1970-01-17 05:08:19+0900';
 article_id   | posted_at                | author | content
--------------+--------------------------+--------+------------------
 550e8400-... | 1970-01-17 05:08:19+0900 | yukim  | well, well, well
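The 1970 dates in this output follow from how the CQL timestamp type counts time: in milliseconds since the Unix epoch. Read that way, the stored values 1350499616 and 1368499616 fall in January 1970, exactly as cqlsh prints them. A minimal Python sketch of the arithmetic, where the helper name `format_cql_timestamp` is an assumption made for illustration and the +0900 offset is taken from the output above:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
JST = timezone(timedelta(hours=9))  # the +0900 offset shown by cqlsh


def format_cql_timestamp(millis: int) -> str:
    """Render milliseconds since the epoch the way the cqlsh output above does."""
    moment = EPOCH + timedelta(milliseconds=millis)
    return moment.astimezone(JST).strftime("%Y-%m-%d %H:%M:%S%z")


first = format_cql_timestamp(1350499616)
second = format_cql_timestamp(1368499616)
# If a value is really in Unix seconds, scale it before storing it.
scaled = format_cql_timestamp(1350499616 * 1000)
```

If the values were meant as Unix seconds, which is only a guess from their magnitude, multiplying by 1000 as in `scaled` gives a date in October 2012 instead.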
- 17. Changes worth noting
• Identifiers (keyspace/table/column
names) are case insensitive by
default
• Use double quotes (") to preserve case
• Compaction settings are now a map
CREATE TABLE test (
...
) WITH COMPACTION = {
'class': 'SizeTieredCompactionStrategy',
'min_threshold': 2,
'max_threshold': 4
};
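The identifier rule on this slide can be modelled in Python as follows. This is a simplified sketch, not the CQL parser: the function name `normalize_identifier` is hypothetical, and escaping of quote characters inside quoted names is deliberately left out.

```python
def normalize_identifier(identifier: str) -> str:
    """Return the name a statement refers to, given the identifier as typed."""
    quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if quoted:
        # Double quotes preserve case exactly as written.
        return identifier[1:-1]
    # Unquoted identifiers are case insensitive: fold them to lower case.
    return identifier.lower()


unquoted = normalize_identifier("Standard1")
quoted = normalize_identifier('"Standard1"')
```

Under this model `Standard1`, `STANDARD1` and `standard1` all name the same table, while `"Standard1"` names a different one.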
- 18. Changes worth noting
• system.schema_*
• All schema information is stored in the system
keyspace
• schema_keyspaces, schema_columnfamilies,
schema_columns
• System tables themselves use CQL3 schemas
• CQL3 schemas are not visible through
cassandra-cli’s ‘describe’ command
• Use cqlsh’s ‘describe columnfamily’ instead
- 19. More on CQL3 schema
• Thrift to CQL3 migration
• http://www.datastax.com/dev/blog/thrift-to-cql3
• For better understanding
• http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
• http://www.datastax.com/dev/blog/cql3-evolutions
• http://www.datastax.com/dev/blog/cql3-for-cassandra-experts
- 20. Mutating Data
INSERT INTO example (id, name) VALUES (...)
UPDATE example SET f = 'foo' WHERE ...
DELETE FROM example WHERE ...
• No more USING CONSISTENCY
• Consistency level setting is moved to protocol
level
- 21. Batch Mutate
BEGIN BATCH
INSERT INTO aaa (id, col) VALUES (...)
UPDATE bbb SET col1 = 'val1' WHERE ...
...
APPLY BATCH;
• Batches are atomic by default as of 1.2
• Atomic does not mean isolated
(only mutations within a single row are isolated, as of 1.1)
• Atomic batches carry some performance penalty because of the
batch log
- 22. Batch Mutate
• Use a non-atomic batch if you need
performance rather than atomicity
BEGIN UNLOGGED BATCH
...
APPLY BATCH;
• More on dev blog
• http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2
- 23. Querying Data
SELECT article_id, posted_at, author
FROM comments
WHERE
article_id = '...'
ORDER BY posted_at DESC
LIMIT 100;
- 24. Querying Data
• TTL/WRITETIME
• You can query the TTL or the write time of a column.
cqlsh:ks> SELECT WRITETIME(author) FROM comments;
writetime(author)
-------------------
1354146105288000
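The value returned by WRITETIME is the write timestamp of the column, which by convention is expressed in microseconds since the Unix epoch. A short Python sketch for turning it into a readable time; the helper name `writetime_to_datetime` is an assumption made for illustration:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def writetime_to_datetime(writetime_us: int) -> datetime:
    """Convert a WRITETIME value (assumed to be microseconds) to a UTC datetime."""
    return EPOCH + timedelta(microseconds=writetime_us)


written_at = writetime_to_datetime(1354146105288000)
print(written_at.isoformat())
```

For the value on the slide this gives a time late on 2012-11-28 UTC, which is the morning of 11/29 in Tokyo.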
- 25. Collection support
• Collection
• Set
• Unordered, no duplicates
• List
• Ordered, allow duplicates
• Map
• Keys and associated values
- 26. Collection support
CREATE TABLE example (
id uuid PRIMARY KEY,
tags set<text>,
points list<int>,
attributes map<text, text>
);
• Collections are typed, but cannot be
nested (no list<list<text>>)
• No secondary index on collections
- 27. Collection support
INSERT INTO example (id, tags, points, attributes)
VALUES (
'62c36092-82a1-3a00-93d1-46196ee77204',
{'foo', 'bar', 'baz'}, // set
[100, 20, 93], // list
{'abc': 'def'} // map
);
- 28. Collection support
• Set
UPDATE example SET tags = tags + {'qux'} WHERE ...
UPDATE example SET tags = tags - {'foo'} WHERE ...
• List
UPDATE example SET points = points + [20, 30] WHERE ...
UPDATE example SET points = points - [100] WHERE ...
• Map
UPDATE example SET attributes['ghi'] = 'jkl' WHERE ...
DELETE attributes['abc'] FROM example WHERE ...
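The effect of these statements can be mirrored with plain Python collections. This is a sketch of the semantics only, starting from the row inserted on the previous slide; it does not talk to Cassandra, and the variable names simply reuse the column names.

```python
# State after the INSERT on the previous slide.
tags = {"foo", "bar", "baz"}
points = [100, 20, 93]
attributes = {"abc": "def"}

# Set: tags + {'qux'} is a union, tags - {'foo'} a difference.
tags = tags | {"qux"}
tags = tags - {"foo"}

# List: + appends the new elements; - removes every occurrence of the
# listed values.
points = points + [20, 30]
removed = [100]
points = [p for p in points if p not in removed]

# Map: assigning to a key adds or overwrites one entry; DELETE drops one.
attributes["ghi"] = "jkl"
del attributes["abc"]
```

Each statement touches only the elements it names, so none of them has to read the existing collection first.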
- 29. Collection support
SELECT tags, points, attributes FROM example;
 tags            | points        | attributes
-----------------+---------------+--------------
 {baz, foo, bar} | [100, 20, 93] | {abc: def}
• You cannot retrieve a single item from a
collection on its own
- 30. Collection support
• Each element in collection is internally
stored as one Cassandra column
• More on dev blog
• http://www.datastax.com/dev/blog/cql3_collections
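The one-column-per-element idea can be pictured with a small Python model. The exact cell layout here is an assumption made for this sketch (set elements and map keys are placed in the column name, and a plain list position stands in for whatever ordering component the server generates); the blog post linked above is the place to check the real encoding. The function name `collection_cells` is hypothetical.

```python
from typing import Any, List, Tuple

# (composite column name, value); a hypothetical in-memory model.
Cell = Tuple[Tuple[Any, ...], Any]


def collection_cells(column: str, value: Any) -> List[Cell]:
    """Expand one collection value into one cell per element."""
    if isinstance(value, (set, frozenset)):
        # Assumed layout: the element lives in the column name, the value is empty.
        return [((column, element), "") for element in sorted(value)]
    if isinstance(value, dict):
        # Assumed layout: the map key lives in the column name.
        return [((column, key), item) for key, item in sorted(value.items())]
    if isinstance(value, list):
        # Assumption: a list position replaces the server-generated ordering key.
        return [((column, position), item) for position, item in enumerate(value)]
    raise TypeError("not a collection: %r" % (value,))


tag_cells = collection_cells("tags", {"foo", "bar", "baz"})
point_cells = collection_cells("points", [100, 20, 93])
attribute_cells = collection_cells("attributes", {"abc": "def"})
```

Seen this way, adding or removing one element is a write to a single cell, which fits the update statements shown earlier.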
- 32. Native Transport
• CQL3 still goes through Thrift’s
execute_cql3_query API
• Native transport support introduces
Cassandra’s own binary protocol
• Async IO, server event push, ...
• http://www.datastax.com/dev/blog/binary-protocol
• Try DataStax Java native driver with C*
1.2 beta today!
• https://github.com/datastax/java-driver
- 33. Questions?
Or contact me later if you have one
yuki@datastax.com
yukim (IRC, Twitter)
Now hiring talented engineers from all over the world!