Slides from a brief talk I gave at the local JUG, javaBin. It's about our experiences using Cassandra in a production environment, with some philosophizing here and there.
6. Cassandra Essentials
● Inspired by BigTable (Google) and Dynamo (Amazon)
● Eventually consistent
● Multi-level map-like
● Column store
● Released by Facebook, adopted by Apache
● Supported by DataStax
● EC2 AMI
● Commercial product on top: Brisk
7. Data Model in Brief
● Atomic unit of storage: The Column
– Possibly stored in a Super Column
● Collections of columns: The Row
– Or Super Columns
● Collections of rows: The Column Family
– Or the Super Column Family
● Collections of column families: The Keyspace
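The nesting above can be pictured as maps inside maps. What follows is a minimal in-memory sketch of that picture, not Cassandra's actual storage or API: the class `DataModelSketch`, its `Column` type and the `put`/`get` helpers are hypothetical names made up for illustration, and super columns are left out.

```java
import java.util.Collections;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class DataModelSketch {
    /** Atomic unit of storage: key, value and write timestamp (illustrative type). */
    static final class Column {
        final String key;
        final String value;
        final long timestamp;

        Column(String key, String value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // One keyspace: column family name -> row key -> columns sorted by column key.
    static final Map<String, Map<String, SortedMap<String, Column>>> KEYSPACE = new TreeMap<>();

    /** Stores a column, creating the column family and the row on first use. */
    static void put(String columnFamily, String rowKey, Column column) {
        KEYSPACE.computeIfAbsent(columnFamily, cf -> new TreeMap<>())
                .computeIfAbsent(rowKey, row -> new TreeMap<>())
                .put(column.key, column);
    }

    /** Returns the column, or null if the column family, row or column is unknown. */
    static Column get(String columnFamily, String rowKey, String columnKey) {
        return KEYSPACE.getOrDefault(columnFamily, Collections.emptyMap())
                .getOrDefault(rowKey, Collections.emptySortedMap())
                .get(columnKey);
    }
}
```

For example, `DataModelSketch.put("YOUNG_AND_PROMISING", "Kjetil", new DataModelSketch.Column("Age", "29", 1330945017654L))` stores the column used in the slides that follow.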
8. The Column
● Key, value and timestamp:
[Figure: a column with key Age, value 29 and timestamp 1330945017654]
9. The Row
● Many (many, many) columns:
● Columns are sorted on key, good for range queries
● Scales wildly – just keep on adding columns
● In practice, a persistent hash map
● Rows can be stored sorted, or hashed
[Figure: row Kjetil holding the column Age, value 29, timestamp 1330945017654]
10. The Column Family
● Consists of many (many) rows:
[Figure: column family YOUNG_AND_PROMISING with row Kjetil holding the column Age, value 29, timestamp 1330945017654]
11. The Keyspace
● Consists of (many) column families
● Usually a statically known set
[Figure: a keyspace holding the column families JUST_YOUNG and YOUNG_AND_PROMISING]
12. WTF a Super Column is
● Columns holding (a few) other columns:
● Serialized as a single value. Does NOT scale wildly.
[Figure: super column example in row Kjetil, timestamp 1330945017654]
13. Can You Relate?
● Concepts mapped to RDB data model levels
● Keyspace => Schema
● Column family => Table
● Row => Row, but without known columns
● Column => Column name and value found in a row
● RDB: Rows, column values are dynamic/data,
column names are static/structure
● NoSQL: Column keys are dynamic/data, too.
14. The Column Revisited
● Columns are dynamic
● Columns are data, not structure
● Column keys don't have to be strings
● Column keys can be of any supported, sortable primitive
type, e.g. timestamps (Long)
● Don't say column name, say column key
● Columns are sorted
● Some RDB unlearning required
15. What's in a Keyspace Schema?
● Keyspace settings
● Partitioning: Decides which node(s) will store rows
● Replication factor
● Custom strategies for partitioning, placement etc.
● The set of Column Families
● For each Column Family, the type of its keys
● Optional meta-data:
● Pre-defined columns
16. Data Model Notes/(Anti-)Patterns
● Super columns are losing favor
● Prefer “synthetic” columns (e.g. columns grouped by prefix)
● Columns in super columns are schema, NOT data!
● Cassandra devs hate them
● Partitioning inside of rows is common
● E.g. for x partitions, compute a hash value from the column
name and take it modulo x, obtaining i. E.g. if “Age” hashes to
2 modulo x, write to the row named Kjetil[2]
● Helps to distribute r/w traffic among nodes, for column
families with busy/crowded rows
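The in-row partitioning trick can be sketched in a few lines. This is a sketch under assumptions: the slides do not say which hash function is used, so `String.hashCode()` stands in for it, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class RowPartitioner {
    /** Physical row key for a column, e.g. "Kjetil[2]" for logical row "Kjetil". */
    static String physicalRowKey(String logicalRowKey, String columnKey, int partitions) {
        // floorMod keeps the index in [0, partitions) even for negative hash codes.
        int i = Math.floorMod(columnKey.hashCode(), partitions);
        return logicalRowKey + "[" + i + "]";
    }

    /** Every physical row a reader has to visit to see the whole logical row. */
    static List<String> allPhysicalRowKeys(String logicalRowKey, int partitions) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            keys.add(logicalRowKey + "[" + i + "]");
        }
        return keys;
    }
}
```

Writes touch exactly one physical row, chosen from the column key. Reads that need the whole logical row pay for the trick by visiting all x physical rows.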
18. What We Do
● Count displays of, and clicks on, ads
● Use Cassandra to track # of hits, in time intervals:
● Ads
● Groups of ads
● Advertiser campaigns
● Display boxes
● Publisher channels
● Publisher sites
● Other
● ... and combinations thereof
20. Example List of Updates
● Count +1 for:
● 6 ads, 6 ad groups, 6 campaigns. (No overlap.)
● 2 display boxes, 1 channel (in this case, same
channel), 1 site
● 2 channel/ad combinations
● Various secret sauce, e.g. another 4
● 28 updates
● If click: 11 updates, count +1 for:
● 1 ad, ad group, campaign, box, channel, etc.
21. But wait, there's more!
● Spec says “in time intervals” => +1 for each of:
● The current hour
● Today
● This week
● This month
● This year
● Total
● Total: 6x28 = 168 updates
● For an average of 500 requests/sec, ~100 updates/req:
● ~50,000 writes/second
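The fan-out over time intervals can be sketched as follows. Only the daily format (yyyyMMdd, as in 20120121) appears in these slides; the other key formats, the use of ISO weeks, and the names `IntervalKeys`, `intervalKeys` and `writesPerDisplay` are assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.IsoFields;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class IntervalKeys {
    /** Counters touched by one ad display, before the time dimension is applied. */
    static final int UPDATES_PER_DISPLAY = 28;

    /** One key per interval: hour, day, week, month, year and the grand total. */
    static List<String> intervalKeys(LocalDateTime t) {
        String hour = t.format(DateTimeFormatter.ofPattern("yyyyMMddHH"));
        String day = t.format(DateTimeFormatter.ofPattern("yyyyMMdd"));
        // ISO week numbering is an assumption; the slides do not define "this week".
        String week = String.format(Locale.ROOT, "%dW%02d",
                t.get(IsoFields.WEEK_BASED_YEAR), t.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR));
        String month = t.format(DateTimeFormatter.ofPattern("yyyyMM"));
        String year = t.format(DateTimeFormatter.ofPattern("yyyy"));
        return Arrays.asList(hour, day, week, month, year, "total");
    }

    /** Counter increments caused by a single ad display: intervals x updates. */
    static int writesPerDisplay(LocalDateTime t) {
        return intervalKeys(t).size() * UPDATES_PER_DISPLAY;
    }
}
```

With six intervals and 28 counters per display, `writesPerDisplay` gives the 168 updates from the slide.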
22. Cassandra 1.0 Applied
● New feature/godsend: Counter columns!
● Like Long values, but
● Accept updates that are increments to current value
● Combined with batched updates
● Phew!
● Scale out for write traffic and workable read
speed
● Done!
23. Real data: Row and columns
● D[0]
● D: Daily interval, partition 0 (hashed from key)
● 20120121
● The day: January 21, 2012
● channel_ad/Channel:b29-Ad:e13083
● 1 click, 7 hits for ad 13083 in channel 29 on that day
24. Stupid Pet Tricks for Sorting
● Funny-looking values in the column key?
● a1
● b29
● c432
● d2345
● e34345
● Sortable, more compact and scalable than:
● 00000000029
● 00000000432
● ...
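The letter in front of each number encodes its digit count (a = 1 digit, b = 2 digits, and so on), so plain string comparison orders the keys numerically. A minimal sketch of that encoding, with hypothetical class and method names:

```java
final class SortableId {
    /** Prefixes the decimal digits with a letter for their count: 29 -> "b29". */
    static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        String digits = Long.toString(id);
        // 'a' for one digit, 'b' for two, ... a long has at most 19 digits ('s').
        char prefix = (char) ('a' + digits.length() - 1);
        return prefix + digits;
    }

    /** Drops the length prefix and parses the remaining digits. */
    static long decode(String encoded) {
        return Long.parseLong(encoded.substring(1));
    }
}
```

Shorter numbers always sort first because their prefix letter is smaller; numbers of equal length share a prefix and then compare digit by digit. Without the prefix, "432" would sort after "2345".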
25. Given hit in channel 29 ...
● Read from an application-configured set of rows
● Example config: last 4 hours, 3 days, 2 weeks.
● 9 logical rows to read from
● Assume 3 partitions for each logical row.
● Read from 27 physical rows, all (or a minimum count of)
columns beginning with:
– channel_ad/Channel:b29-Ad:
● Obtain synthetic clicks/hits ratio for each ad
● And channel_ad is just one of the ratios to use
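The read side can be sketched against in-memory sorted maps standing in for the 27 physical rows. How clicks and hits are stored per column is not shown in these slides, so the `{clicks, hits}` pair used as a column value here is an assumption, as are the class and method names.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class RatioReader {
    /**
     * Sums clicks and hits over all rows for columns whose key starts with the
     * prefix, and returns clicks/hits per ad (the part of the key after the prefix).
     */
    static Map<String, Double> clickRatios(List<SortedMap<String, long[]>> physicalRows,
                                           String prefix) {
        Map<String, long[]> totals = new TreeMap<>();
        for (SortedMap<String, long[]> row : physicalRows) {
            // Range scan on the sorted column keys: prefix <= key < prefix + '\uffff'.
            SortedMap<String, long[]> slice = row.subMap(prefix, prefix + Character.MAX_VALUE);
            for (Map.Entry<String, long[]> e : slice.entrySet()) {
                String ad = e.getKey().substring(prefix.length());
                long[] sum = totals.computeIfAbsent(ad, k -> new long[2]);
                sum[0] += e.getValue()[0]; // clicks
                sum[1] += e.getValue()[1]; // hits
            }
        }
        Map<String, Double> ratios = new TreeMap<>();
        totals.forEach((ad, sum) -> {
            if (sum[1] > 0) { // skip ads without hits to avoid dividing by zero
                ratios.put(ad, (double) sum[0] / sum[1]);
            }
        });
        return ratios;
    }
}
```

Because columns are sorted by key, every ad for channel 29 sits in one contiguous slice of each row, which is what makes the prefix scan cheap.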
26. Caching of Synthetic Ratios
● Use ehcache
● In-memory, fast
● In-memory, clutters heap, provokes stop-the-world GC
● Cache in Cassandra
● Store synthetic reads back in Cassandra (on-demand “denormalization”)
● Still sensitive to high Cassandra loads
● A local Redis instance on each box
● Stand-alone: Isolated from high Cassandra loads
● Off-heap: Reduce stop-the-world GC
● Fast: Configured for in-memory caching behavior
● Typical time to retrieve a Java object: 200 µs to 2 ms
● Good trade-off
27. Client Libraries
● Out-of-the-box: Thrift
● Usable, but should not be mixed up with business
logic
● Java recommendation: Hector
● https://github.com/rantav/hector
● Connection pooling
● Just-above-Thrift-level
● Type-safe(r) r/w
29. Operations: Quickstart on EC2
● DataStax AMI:
● http://datastax.com/docs/1.0/install/install_ami
● Readymade cluster of N nodes
● Free OpsCenter
30. Operations: Scaling
● Scaling Strategy:
● Doubling/halving capacity is very convenient
● => New nodes redistribute the load automatically
31. Operations: Backup
● System-wide backups
● Nodes can be asked to dump Snapshots
● Recovery: New nodes started from Snapshots
● Selective backups
● Selected data can be dumped to/read from JSON
● sstable2json/json2sstable
● Incremental backups
33. Introducing Cassandra
● Look for data that
● Grows fast
● Holds useful information, given time to analyze it
● Can be reproduced from source data (e.g. log files)
● Avoid business-critical data
● Let RDBMS handle all that
34. Living with Cassandra
● Columns are data that live in a context:
● Sorted in pre-defined ways, determining query
efficiency
● Queried for by application in other ways
● Columns are data coupled to your logic
● Typical: Encoding and parsing column names
● Queries will change in development/maintenance
– Persisted formats should change
– Code must change
35. Cost of Change
● Your NoSQL data are, relative to your RDB data:
● Bigger
● More loosely-defined
● More closely-coupled to application code
● Harder to query (and easier queries => bigger data)
● Less supported by mature tools
● Affects cost of change
● Rebuild-from-source-data is a better option than
migrate-existing-data - if it's practical