Cassandra 2012 scandit

Cassandra for Barcodes, Products and Scans:

The Backend Infrastructure at Scandit

Christof Roduner
Co-founder and COO
christof@scandit.com Link: NoSQL concept and data model

@scandit
www.scandit.com February 1, 2012

AGENDA
2

 About Scandit

 Requirements

 Apache Cassandra

 Scandit backend

WHAT IS SCANDIT?
3

Scandit provides developers best-in-class tools to
build, analyze and monetize product-centric apps.

IDENTIFY ANALYZE MONETIZE
Products User Interest Apps

ANALYZE:
THE SCANALYTICS PLATFORM
4

 Tool for app publishers
 App-specific usage statistics

 Insights into consumer behavior:
 What do users scan?
 Product categories? Groceries, electronics, books, cosmetics, …?
 Where do users scan?
 At home? Or while in a retail store?
 Top products and brands

 Identify new opportunities:
 Customer engagement
 Product interest
 Cross-selling and up-selling

BACKEND REQUIREMENTS
5

 Product database
 Many millions of products
 Many different data sources
 Curation of product data (filtering, etc.)

 Analysis of scans
 Accept and store high volumes of scans
 Generate statistics over extended time periods
 Correlate with product data

 Provide reports to developers

BACKEND DESIGN GOALS
6

 Scalability
 High-volume storage
 High-volume throughput
 Support large number of concurrent client requests (app)

 Availability

 Low maintenance

WHICH DATABASE?
7

Apache Cassandra
 Large, distributed key-value store (DHT)

 «NoSQL» Polyglot Persistence

 Inspired by:

 Amazon’s Dynamo distributed storage system
 Google’s BigTable data model
 Originally developed at Facebook
 Inbox search

WHY DID WE CHOOSE IT?
8

 Looked very fast
 Even when data is much larger than RAM

 Performs well in write-heavy environment

 Proven scalability
 Without downtime

 Tunable replication

 Easy to run and maintain
 No sharding
 All nodes are the same - no coordinators, masters, slaves, …

 Data model
 YMMV…

WHAT YOU HAVE TO GIVE UP
9

 Joins
 Referential integrity
 Transactions
 Expressive query language
 Consistency (tunable, but…)

 Limited support for:
 Schema
 Secondary indices

CASSANDRA DATA MODEL
10

Disclaimer: I tend to say «hash»
when I mean «dictionary, map,
 Column families associative array» (Can you tell
my favorite language?)
 Rows
 Columns

 (Supercolumns)
 We’ll skip them - Cassandra developers don’t like
them

COLUMNS AND ROWS
11

 Column:
 Is a name-value pair
 row_key, CF, column, timestamp and value

 Row:
 Has exactly one key
 Contains any number of columns
 Columns are always automatically sorted by their name

 Column family:
 A collection of any number of rows (!)
 Has a name
 «Like a table»

EXAMPLE COLUMN FAMILY
12

"users": {
Row with key «christof»
"christof": {
"email": "christof@scandit.com",
"phone": "123-456-7890"
}

"moritz": {
"email": "moritz@scandit.com",
Two columns, automatically
"web": "www.example.com" sorted by their names
} («email», «web»)
}

 A column family «users» containing two rows
 Columns can be different in every row
 First row has a column named «phone», second row does not
 Rows can have many columns
 You can add millions of them

DATA IN COLUMN NAMES
13

 Column names can be used to store data
 Frequent pattern in Cassandra
 Takes advantage of column sorting
"logins": {

"christof": {
"2012-01-29 16:22:30 +0100": "208.115.113.86",
"2012-01-30 07:48:03 +0100": "66.249.66.183",
"2012-01-30 18:06:55 +0100": "208.115.111.70",
"2012-01-31 12:37:26 +0100": "66.249.66.183"
}

"moritz": {
"2012-01-23 01:12:49 +0100": "205.209.190.116"
}

}

SCHEMA AND DATA TYPES
14

 Schema is optional
 Data type can be defined for:
 Keys
 The values of all columns with a given name
 The column names in a CF
 By default, data type BLOB is used

 Data Types
 BLOB (default)  UUID
 ASCII text  Integer (arbitrary length)
 UTF8 text  Float
 Timestamp  Double
 Boolean  Decimal

CLUSTER ORGANIZATION
15

Range 1-64,
Node 1 stored on node 2
Token 0

Node 4 Node 2
Token 192 Token 64

Node 3 Range 65-128,
Token 128 stored on node 3

STORING A ROW
16
Range 1-64,
1. Calculate md5 hash for row key stored on node 2
Example: md5(“foobar") = 48
Node 1
Token 0
2. Determine data range for hash
Example: 48 lies within range 1-64

3. Store row on node responsible
Node 4 Node 2
for range
Token 192 Token 64
Example: store on node 2

Node 3
Token 128
Range 65-128,
stored on node 3

IMPLICATIONS
17

 Cluster automatically balanced
 Load is shared equally between nodes
 No hotspots

 Scaling out?
 Easy
 Divide data ranges by adding more nodes
 Cluster rebalances itself automatically

 Range queries not possible
 You can’t retrieve «all rows from A-C»
 Rows are not stored in their «natural» order
 Rows are stored in order of their md5 hashes

IF YOU NEED RANGE QUERIES…
18

Option 1: «Order Preserving Partitioner» (OPP)
 OPP determines node based on a row’s key instead of its hash
 Don’t use it…
 Manually balancing a cluster is hard
 Hotspots
 Balancing cluster for one column family creates hotspot for another

Option 2: Use columns instead of rows
 Columns are always sorted
 Rows can store millions of columns

REPLICATION
19

Replica 1
 Tunable replication factor of row
(RF) Node 1 «foobar»
Token 0
 RF > 1: rows are automatically
replicated to next RF-1 nodes

 Tunable replication strategy Node 2
Node 4
 «Ensure two replicas in Token 64
different data centers, racks, Token 192
etc.»

Node 3
Token 128 Replica 2
of row
«foobar»

CLIENT ACCESS
20

 Clients can send read and write
requests to any node
 This node will act as
coordinator
Node 1 Replica 1
 Coordinator forwards request Token 0 of row
to nodes where data resides «foobar»

Client Node 4 Node 2
Token 192 Token 64

Request:
Replica 2
insert( Node 3 of row
"foobar": { "email": "fb@example.com" } Token 128 «foobar»
)

CONSISTENCY LEVELS
21

 For all requests, clients can set a consistency level (CL)

 For writes:
 CL defines how many replicas must be written before
«success» is returned to client

 For reads:
 CL defines how many replicas must respond before result is
returned to client

 Consistency levels:
 ONE
 QUORUM
 ALL
 … (data center-aware levels)

INCONSISTENT DATA
22

 Example scenario:
 Replication factor 2
 Two existing replica for row «foobar»
 Client overwrites existing columns in «foobar»
 Replica 2 is down

 What happens:
 Column is updated in replica 1, but not replica 2 (even with CL=ALL !)

 Timestamps to the rescue
 Every column has a timestamp
 Timestamps are supplied by clients
 Upon read, column with latest timestamp wins

 →Use NTP

PREVENTING INCONSISTENCIES
23

 Read repair

 Hinted handoff

 Anti entropy

RETRIEVING DATA (API)
24
 At a row level, you can…
 Get all rows
 Get a single row by specifying its key
 Get a number of rows by specifying their keys
 Get a range of rows
 Only with OPP, strongly discouraged

 At a column level, you can…
 Get all columns
 Get a single column by specifying its name
 Get a number of columns by specifying their names
 Get a range of columns by specifying the name of the first and
last column

 Again: no ranges of rows

CASSANDRA QUERY LANGUAGE
(CQL)
25

UPDATE users SET
"email" = "christof@scandit.com",
"phone" = "123-456-7890"
WHERE KEY = "christof";

"users": {

"christof": {
"phone": "123-456-7890"
}

"moritz": {
"web": "www.example.com"
}

}

(CQL)
26

SELECT * FROM users WHERE KEY = "christof";

"users": {

"christof": {
"phone": "123-456-7890"
}

"moritz": {
"web": "www.example.com"
}

}

(CQL)
27

SELECT FIRST 3 "2012-01-30 00:00:00 +0100" ..
"2012-01-31 23:59:59 +0100"
FROM logins
WHERE KEY = "christof";

"logins": {

"christof": {
"2012-01-29 16:22:30 +0100": "208.115.113.86",
"2012-01-30 07:48:03 +0100": "66.249.66.183",
"2012-01-30 18:06:55 +0100": "208.115.111.70",
"2012-01-31 12:37:26 +0100": "66.249.66.183"
"2012-02-09 10:27:28 +0100": "218.315.37.21",
}

"moritz": {
"2012-01-23 01:12:49 +0100": "205.209.190.116"
}

}

SECONDARY INDICES
28

 Secondary indices can be defined for (single) columns

 Secondary indices only support equality predicate (=)
in queries

 Each node maintains index for data it owns
 When indexed column is queried, request must be forwarded
to all nodes
 Sometimes better to manually maintain your own index

PRODUCTION EXPERIENCE
29

 No stability issues

 Very fast

 Language bindings don’t have the same quality
 Out of sync, bugs

 Data model is a mental twist

 Design-time decisions sometimes hard to change

 Rudimentary access control

TRYING OUT CASSANDRA
30

 DataStax website
 Company founded by Cassandra developers
 Provides
 Documentation
 Amazon Machine Image

 Apache website

 Mailing lists

CLUSTER AT SCANDIT
31

 Several nodes in two data centers

 Linux machines

 Identical setup on every node
 Allows for easy failover

NODE ARCHITECTURE
32
from mobile apps and web browsers

Phusion Passenger
mod_passenger

Website & REST API
Ruby on Rails, Rack

to other nodes

Cassandra 2012 scandit

Recommended

Recommended

More Related Content

Similar to Cassandra 2012 scandit

Similar to Cassandra 2012 scandit (20)

Recently uploaded

Recently uploaded (20)

Cassandra 2012 scandit