2. What is SHIFT.com?
Shift is a platform that enables marketers to
communicate across organizations and
departments in one single place.
It’s also an open application platform with a
set of applications built on top of it that can
communicate with one another.
5. Why did we move to Cassandra?
● Operational Benefits
○ Adding and removing nodes is much easier
than with Mongo’s shards
● Control over our Data on Disk (LSMT)
● Love CQL3
● Long term scalability
○ Scales Linearly
○ Multi DC Support Baked in
6. Migration Goals
● Zero downtime
○ We wanted to roll out Cassandra without any
service interruptions
● No loss of performance
○ By carefully structuring our schema we were able
to match MongoDB’s performance.
8. Benefits of CQL3
● Easy to understand if you’re coming from an
RDBMS
● Collections
○ sets, lists, maps
● Batch Queries
● Clustering Keys
○ Handles ordering of logical rows
○ Saved us from a column-name management scheme
and allowed us to focus on our data
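As a rough illustration of what clustering keys take care of, the sketch below models partitions in plain Python: each logical row becomes cells whose names are (clustering value, column name) pairs, kept in sorted order inside one physical row. This is a simplified mental model, not Cassandra's storage code; `to_physical_rows` and the sample data are made up for the example.

```python
from collections import defaultdict


def to_physical_rows(logical_rows, partition_key, clustering_key):
    """Group logical rows by partition key and lay out their cells in
    clustering order, roughly the way CQL3 composes column names."""
    partitions = defaultdict(dict)
    for row in logical_rows:
        pk = row[partition_key]
        ck = row[clustering_key]
        for name, value in row.items():
            if name in (partition_key, clustering_key):
                continue
            # This composite cell name is what you would otherwise
            # have to design and maintain by hand.
            partitions[pk][(ck, name)] = value
    # Sort the cells inside each partition by their composite name.
    return {pk: sorted(cells.items()) for pk, cells in partitions.items()}


rows = [
    {"user": "alice", "seq": 2, "body": "second"},
    {"user": "alice", "seq": 1, "body": "first"},
    {"user": "bob", "seq": 1, "body": "hello"},
]
physical = to_physical_rows(rows, "user", "seq")
```

Here `physical["alice"]` holds both of alice's logical rows as one ordered run of cells, regardless of insertion order.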
12. Data Modelling Patterns
● considerations: working with Mongo’s dbrefs
and optimizing layout on disk
● structured tables as materialized views of
the queries we planned on using
● moving multiple documents into a single
physical row
● creating supporting index tables for looking
up logical rows
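The last two patterns can be sketched in plain Python, assuming a hypothetical thread/message layout (the slides do not name the actual tables). Several documents live together in one partition keyed by `thread_id`, and a small supporting index maps each `message_id` back to its partition so a single logical row can be found without scanning.

```python
class MessageStore:
    """In-memory stand-in for a main table plus a supporting index table."""

    def __init__(self):
        # Main table: thread_id -> {message_id: body}, one "physical row" per thread.
        self.messages_by_thread = {}
        # Index table: message_id -> thread_id.
        self.thread_by_message = {}

    def add(self, thread_id, message_id, body):
        # Every write updates both tables; the application keeps them in sync.
        self.messages_by_thread.setdefault(thread_id, {})[message_id] = body
        self.thread_by_message[message_id] = thread_id

    def thread(self, thread_id):
        # Query the main table was designed for: all messages in a thread, in order.
        return sorted(self.messages_by_thread.get(thread_id, {}).items())

    def get(self, message_id):
        # Two lookups: index table first, then the owning partition.
        thread_id = self.thread_by_message[message_id]
        return self.messages_by_thread[thread_id][message_id]


store = MessageStore()
store.add("t1", 2, "second")
store.add("t1", 1, "first")
store.add("t2", 3, "other")
```

The trade-off is an extra write per message in exchange for a cheap lookup by `message_id`.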
13. Time Series: Message Stream
● Users have tens of thousands of messages
● Each user’s message stream is specific to
them, like a Twitter feed
● This is Cassandra’s strength: time series
● Considered Redis, but it is poor for multi-DC
create table news_feed (
    user_id uuid,
    message_id timeuuid,
    message text,  -- column type missing on the slide; text assumed
    primary key (user_id, message_id)
);
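To make the access pattern concrete, here is a minimal in-memory sketch of the same idea in Python: one partition per user, with messages ordered by the timestamp embedded in a version 1 (time-based) UUID. The `NewsFeed` class is hypothetical and only imitates what the table gives you; reading the newest messages corresponds roughly to selecting from `news_feed` for one `user_id` in descending `message_id` order with a limit.

```python
import uuid
from collections import defaultdict


class NewsFeed:
    """In-memory stand-in for the news_feed table."""

    def __init__(self):
        # user_id -> list of (message_id, message) pairs
        self._partitions = defaultdict(list)

    def post(self, user_id, message):
        # uuid1() is a time-based UUID, analogous to a CQL timeuuid.
        message_id = uuid.uuid1()
        self._partitions[user_id].append((message_id, message))
        return message_id

    def latest(self, user_id, limit=10):
        # Order by the timestamp embedded in the UUID, newest first.
        rows = sorted(
            self._partitions[user_id],
            key=lambda row: row[0].time,
            reverse=True,
        )
        return rows[:limit]


feed = NewsFeed()
user = uuid.uuid4()
for i in range(5):
    feed.post(user, "message %d" % i)
newest = feed.latest(user, limit=3)
```

Because the ordering comes from the key itself, no separate timestamp column or sort step is needed at read time in the real table.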
14. cqlengine
● cqlengine.org
● the Python CQL3 object-row mapper
● exposes CQL3 tables as Python classes
● maps columns to properties
● builds CQL queries
# model definition
from cqlengine import columns
from cqlengine.models import Model

class ExampleModel(Model):
    example_id = columns.UUID(primary_key=True)
    example_type = columns.Integer(index=True)
    created_at = columns.DateTime()
    description = columns.Text(required=False)

# example query
ExampleModel.objects(example_type=1)
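For readers new to the idea, the toy mapper below shows in outline what "exposes tables as classes, maps columns to properties, builds CQL queries" can mean. It is not cqlengine's implementation or API; `Column`, `TinyModel`, and their methods are invented for this sketch, and it only builds query strings without talking to a cluster.

```python
class Column:
    """Describes one table column: its CQL type and whether it is a key."""

    def __init__(self, cql_type, primary_key=False):
        self.cql_type = cql_type
        self.primary_key = primary_key


class TinyModel:
    """Base class: subclasses declare columns as class attributes."""

    @classmethod
    def columns(cls):
        # Class attributes keep their definition order.
        return {n: c for n, c in vars(cls).items() if isinstance(c, Column)}

    @classmethod
    def create_table_cql(cls):
        cols = cls.columns()
        defs = ", ".join(f"{n} {c.cql_type}" for n, c in cols.items())
        keys = ", ".join(n for n, c in cols.items() if c.primary_key)
        return f"CREATE TABLE {cls.__name__.lower()} ({defs}, PRIMARY KEY ({keys}))"

    @classmethod
    def select_cql(cls, **filters):
        unknown = set(filters) - set(cls.columns())
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")
        cql = f"SELECT * FROM {cls.__name__.lower()}"
        if filters:
            # Values are returned separately so they can be bound, not inlined.
            cql += " WHERE " + " AND ".join(f"{n} = ?" for n in filters)
        return cql, list(filters.values())


class Example(TinyModel):
    example_id = Column("uuid", primary_key=True)
    example_type = Column("int")


create_stmt = Example.create_table_cql()
select_stmt, params = Example.select_cql(example_type=1)
```

A real mapper adds type validation, result hydration, and connection handling on top of this kind of string building.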
15. Improvements from moving to C*
● Operationally we’ve had zero problems
● Outstanding performance
● Easy to build new features
● Community has been amazing (mailing list
and #cassandra)
16. Misc tips
● leveled compaction - good for read heavy
workloads
● use secondary indexes sparingly,
understand how they work and when to use
them
● to reiterate, think about how you’re going
to query your data
● use Elasticsearch / Solr for ad hoc queries