SlideShare una empresa de Scribd logo
1 de 17
Descargar para leer sin conexión
SHIFT.com
Migrating from MongoDB to Cassandra
by: Blake Eggleston & Jon Haddad
What is SHIFT.com?
Shift is a platform that enables marketers to
communicate across organizations and
departments in one single place.
It’s also an open application platform with a
set of applications built on top of it that can
communicate with one another.
Initial Stack
● Python
○ Flask
○ Celery

● MongoDB
○ mongoengine

● Neo4j / Titan
○ Bulbs
○ thunderdome

● Redis
● AWS
○ m1.xlarge for mongo
Current Stack
● Python
○ still flask
○ still celery
○ gevent (it rocks)

● Cassandra
○ 1.2.6
○ cqlengine

● ElasticSearch
● Redis
○ jondis

● AWS
○ m1.xlarge
Why did we move to Cassandra?
● Operational Benefits
○ Adding and removing nodes is much easier,
compared to Mongo’s shards

● Control over our Data on Disk (LSMT)
● Love CQL3
● Long term scalability
○ Scales Linearly
○ Multi DC Support Baked in
Migration Goals
● Zero downtime
○ We wanted to roll out Cassandra without any
service interruptions

● No loss of performance
○ By carefully structuring our schema we were able
to match MongoDB’s performance.
Migration Strategy
Benefits of CQL3
● Easy to understand if you’re coming from
RDBMS
● Collections
○ sets, lists, maps

● Batch Queries
● Clustering Keys
○ Handles ordering of logical rows
○ Saved us from column name management scheme
and allowed us to focus on our data
Physical vs Logical Row
Single Row
Clustered Row
Data Modelling Patterns
● considerations: working with Mongo’s dbrefs
and optimizing layout on disk
● structured tables as materialized views of
the queries we planned on using
● moving multiple documents into a single
physical row
● creating supporting index tables for looking
up logical rows
Time Series: Message Stream
● Users have tens of thousands of messages
● Each users message stream is specific to
them, like a twitter feed
● This is Cassandra’s strength - Time Series
● Considered Redis - but poor for multi-dc
create table news_feed (
user_id uuid,
message_id timeuuid,
message,
primary key (user_id, message_id));
cqlengine
●
●
●
●
●

cqlengine.org
the Python CQL3 object-row mapper
exposes CQL3 tables as Python classes
maps columns to properties
builds CQL queries
#model definition
class ExampleModel(Model):
example_id
= columns.UUID
(primary_key=True)
example_type
= columns.Integer(index=True)
created_at
= columns.DateTime()
description
= columns.Text(required=False)
# example query
ExampleModel.objects(example_type=1)
Improvements from moving to C*
●
●
●
●

Operationally we’ve had zero problems
Outstanding Performance
Easy to build new features
Community has been amazing (mailing list
and #cassandra)
misc tips
● leveled compaction - good for read heavy
workloads
● use secondary indexes sparingly,
understand how they work and when to use
them
● to reiterate, think about how you’re going
to query your data
● use elastic search / solr for ad hoc queries
Contact Info
Jon Haddad
@rustyrazorblade
jon@shift.com
Blake Eggleston
@blakeeggleston
blake@shift.com
….we’re hiring!

Más contenido relacionado

La actualidad más candente

Big data @ Hootsuite analtyics
Big data @ Hootsuite analtyicsBig data @ Hootsuite analtyics
Big data @ Hootsuite analtyicsClaudiu Coman
 
Entity Framework Core
Entity Framework CoreEntity Framework Core
Entity Framework CoreKiran Shahi
 
SqliteToRealm
SqliteToRealmSqliteToRealm
SqliteToRealmPluu love
 
Comparing Orchestration
Comparing OrchestrationComparing Orchestration
Comparing OrchestrationKnoldus Inc.
 
MinIO January 2020 Briefing
MinIO January 2020 BriefingMinIO January 2020 Briefing
MinIO January 2020 BriefingJonathan Symonds
 
Groovy reactive streams
Groovy reactive streamsGroovy reactive streams
Groovy reactive streamsadam1davis
 
BigData in IoT #iotconfua
BigData in IoT #iotconfuaBigData in IoT #iotconfua
BigData in IoT #iotconfuaAndy Shutka
 
All you need to know about Kotlin's documentation engine Dokka
All you need to know about Kotlin's documentation engine Dokka All you need to know about Kotlin's documentation engine Dokka
All you need to know about Kotlin's documentation engine Dokka Florian Benz
 
Intro To Graph Databases - Oxana Goriuc
Intro To Graph Databases - Oxana GoriucIntro To Graph Databases - Oxana Goriuc
Intro To Graph Databases - Oxana GoriucFraugster
 
Cloud storage: the right way OSS EU 2018
Cloud storage: the right way OSS EU 2018Cloud storage: the right way OSS EU 2018
Cloud storage: the right way OSS EU 2018Orit Wasserman
 
Docker @haufe lexware tech lunch
Docker @haufe lexware tech lunchDocker @haufe lexware tech lunch
Docker @haufe lexware tech lunchHaufeLexwareRomania
 
Open stack @ iiit hyderabad
Open stack @ iiit hyderabad Open stack @ iiit hyderabad
Open stack @ iiit hyderabad openstackindia
 
Data Services with Bindaas: RESTful Interfaces for Diverse Data Sources
Data Services with Bindaas: RESTful Interfaces for Diverse Data SourcesData Services with Bindaas: RESTful Interfaces for Diverse Data Sources
Data Services with Bindaas: RESTful Interfaces for Diverse Data SourcesPradeeban Kathiravelu, Ph.D.
 
Введение в Apache Cassandra
Введение в Apache CassandraВведение в Apache Cassandra
Введение в Apache CassandraOpen-IT
 
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...MongoDB
 
Intro to NoSQL and MongoDB
 Intro to NoSQL and MongoDB Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBMongoDB
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseEric Evans
 

La actualidad más candente (20)

Big data @ Hootsuite analtyics
Big data @ Hootsuite analtyicsBig data @ Hootsuite analtyics
Big data @ Hootsuite analtyics
 
Entity Framework Core
Entity Framework CoreEntity Framework Core
Entity Framework Core
 
SqliteToRealm
SqliteToRealmSqliteToRealm
SqliteToRealm
 
Comparing Orchestration
Comparing OrchestrationComparing Orchestration
Comparing Orchestration
 
MinIO January 2020 Briefing
MinIO January 2020 BriefingMinIO January 2020 Briefing
MinIO January 2020 Briefing
 
Introduction to Bizur
Introduction to BizurIntroduction to Bizur
Introduction to Bizur
 
Groovy reactive streams
Groovy reactive streamsGroovy reactive streams
Groovy reactive streams
 
Dataspace presentatie
Dataspace presentatieDataspace presentatie
Dataspace presentatie
 
BigData in IoT #iotconfua
BigData in IoT #iotconfuaBigData in IoT #iotconfua
BigData in IoT #iotconfua
 
All you need to know about Kotlin's documentation engine Dokka
All you need to know about Kotlin's documentation engine Dokka All you need to know about Kotlin's documentation engine Dokka
All you need to know about Kotlin's documentation engine Dokka
 
Intro To Graph Databases - Oxana Goriuc
Intro To Graph Databases - Oxana GoriucIntro To Graph Databases - Oxana Goriuc
Intro To Graph Databases - Oxana Goriuc
 
Cloud storage: the right way OSS EU 2018
Cloud storage: the right way OSS EU 2018Cloud storage: the right way OSS EU 2018
Cloud storage: the right way OSS EU 2018
 
Docker @haufe lexware tech lunch
Docker @haufe lexware tech lunchDocker @haufe lexware tech lunch
Docker @haufe lexware tech lunch
 
Migration
MigrationMigration
Migration
 
Open stack @ iiit hyderabad
Open stack @ iiit hyderabad Open stack @ iiit hyderabad
Open stack @ iiit hyderabad
 
Data Services with Bindaas: RESTful Interfaces for Diverse Data Sources
Data Services with Bindaas: RESTful Interfaces for Diverse Data SourcesData Services with Bindaas: RESTful Interfaces for Diverse Data Sources
Data Services with Bindaas: RESTful Interfaces for Diverse Data Sources
 
Введение в Apache Cassandra
Введение в Apache CassandraВведение в Apache Cassandra
Введение в Apache Cassandra
 
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
 
Intro to NoSQL and MongoDB
 Intro to NoSQL and MongoDB Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDB
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 

Destacado

Crash course intro to cassandra
Crash course   intro to cassandraCrash course   intro to cassandra
Crash course intro to cassandraJon Haddad
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core ConceptsJon Haddad
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessJon Haddad
 
Enter the Snake Pit for Fast and Easy Spark
Enter the Snake Pit for Fast and Easy SparkEnter the Snake Pit for Fast and Easy Spark
Enter the Snake Pit for Fast and Easy SparkJon Haddad
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraJon Haddad
 
Diagnosing Problems in Production (Nov 2015)
Diagnosing Problems in Production (Nov 2015)Diagnosing Problems in Production (Nov 2015)
Diagnosing Problems in Production (Nov 2015)Jon Haddad
 
Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Jon Haddad
 
Cassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoCassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoJon Haddad
 
Python and cassandra
Python and cassandraPython and cassandra
Python and cassandraJon Haddad
 
Introduction to Cassandra - Denver
Introduction to Cassandra - DenverIntroduction to Cassandra - Denver
Introduction to Cassandra - DenverJon Haddad
 
Python & Cassandra - Best Friends
Python & Cassandra - Best FriendsPython & Cassandra - Best Friends
Python & Cassandra - Best FriendsJon Haddad
 
Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014Jon Haddad
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to CassandraJon Haddad
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profilingJon Haddad
 
A educação de mônica
A educação de mônicaA educação de mônica
A educação de mônicaDalila Melo
 
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax Academy
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax Academy
 
Battery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
Battery Ventures: Simulating and Visualizing Large Scale Cassandra DeploymentsBattery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
Battery Ventures: Simulating and Visualizing Large Scale Cassandra DeploymentsDataStax Academy
 
DataStax: 7 Deadly Sins for Cassandra Ops
DataStax: 7 Deadly Sins for Cassandra OpsDataStax: 7 Deadly Sins for Cassandra Ops
DataStax: 7 Deadly Sins for Cassandra OpsDataStax Academy
 
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...DataStax Academy
 

Destacado (20)

Crash course intro to cassandra
Crash course   intro to cassandraCrash course   intro to cassandra
Crash course intro to cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 Awesomeness
 
Enter the Snake Pit for Fast and Easy Spark
Enter the Snake Pit for Fast and Easy SparkEnter the Snake Pit for Fast and Easy Spark
Enter the Snake Pit for Fast and Easy Spark
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - Cassandra
 
Diagnosing Problems in Production (Nov 2015)
Diagnosing Problems in Production (Nov 2015)Diagnosing Problems in Production (Nov 2015)
Diagnosing Problems in Production (Nov 2015)
 
Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)
 
Cassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoCassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day Toronto
 
Python and cassandra
Python and cassandraPython and cassandra
Python and cassandra
 
Introduction to Cassandra - Denver
Introduction to Cassandra - DenverIntroduction to Cassandra - Denver
Introduction to Cassandra - Denver
 
Python & Cassandra - Best Friends
Python & Cassandra - Best FriendsPython & Cassandra - Best Friends
Python & Cassandra - Best Friends
 
Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profiling
 
A educação de mônica
A educação de mônicaA educação de mônica
A educação de mônica
 
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
DataStax: How to Roll Cassandra into Production Without Losing your Health, M...
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
 
Battery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
Battery Ventures: Simulating and Visualizing Large Scale Cassandra DeploymentsBattery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
Battery Ventures: Simulating and Visualizing Large Scale Cassandra Deployments
 
DataStax: 7 Deadly Sins for Cassandra Ops
DataStax: 7 Deadly Sins for Cassandra OpsDataStax: 7 Deadly Sins for Cassandra Ops
DataStax: 7 Deadly Sins for Cassandra Ops
 
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
DataStax & O'Reilly Media: Large Scale Data Analytics with Spark and Cassandr...
 

Similar a Cassandra meetup slides - Oct 15 Santa Monica Coloft

Shift: Real World Migration from MongoDB to Cassandra
Shift: Real World Migration from MongoDB to CassandraShift: Real World Migration from MongoDB to Cassandra
Shift: Real World Migration from MongoDB to CassandraDataStax
 
Cassandra data access
Cassandra data accessCassandra data access
Cassandra data accesstechblog
 
eBuddy | Cassandra Data Access in Java
eBuddy | Cassandra Data Access in JavaeBuddy | Cassandra Data Access in Java
eBuddy | Cassandra Data Access in JavaDataStax Academy
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
Building RESTtful services in MEAN
Building RESTtful services in MEANBuilding RESTtful services in MEAN
Building RESTtful services in MEANMadhukara Phatak
 
Introduction to mongo db
Introduction to mongo dbIntroduction to mongo db
Introduction to mongo dbLawrence Mwai
 
What are the major components of MongoDB and the major tools used in it.docx
What are the major components of MongoDB and the major tools used in it.docxWhat are the major components of MongoDB and the major tools used in it.docx
What are the major components of MongoDB and the major tools used in it.docxTechnogeeks
 
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...Anna Ossowski
 
Introduction To MongoDB
Introduction To MongoDBIntroduction To MongoDB
Introduction To MongoDBElieHannouch
 
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycleKyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycleLviv Startup Club
 
MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개Ha-Yang(White) Moon
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user groupAdam Doyle
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB
 
Public Cloud Workshop
Public Cloud WorkshopPublic Cloud Workshop
Public Cloud WorkshopAmer Ather
 
MongoDB Versatility: Scaling the MapMyFitness Platform
MongoDB Versatility: Scaling the MapMyFitness PlatformMongoDB Versatility: Scaling the MapMyFitness Platform
MongoDB Versatility: Scaling the MapMyFitness PlatformMongoDB
 
Real-time analytics with Druid at Appsflyer
Real-time analytics with Druid at AppsflyerReal-time analytics with Druid at Appsflyer
Real-time analytics with Druid at AppsflyerMichael Spector
 
Kubernetes for Beginners
Kubernetes for BeginnersKubernetes for Beginners
Kubernetes for BeginnersDigitalOcean
 

Similar a Cassandra meetup slides - Oct 15 Santa Monica Coloft (20)

Shift: Real World Migration from MongoDB to Cassandra
Shift: Real World Migration from MongoDB to CassandraShift: Real World Migration from MongoDB to Cassandra
Shift: Real World Migration from MongoDB to Cassandra
 
Cassandra data access
Cassandra data accessCassandra data access
Cassandra data access
 
eBuddy | Cassandra Data Access in Java
eBuddy | Cassandra Data Access in JavaeBuddy | Cassandra Data Access in Java
eBuddy | Cassandra Data Access in Java
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Building RESTtful services in MEAN
Building RESTtful services in MEANBuilding RESTtful services in MEAN
Building RESTtful services in MEAN
 
MongoDB
MongoDBMongoDB
MongoDB
 
Mongodb
MongodbMongodb
Mongodb
 
Introduction to mongo db
Introduction to mongo dbIntroduction to mongo db
Introduction to mongo db
 
Running Cassandra in AWS
Running Cassandra in AWSRunning Cassandra in AWS
Running Cassandra in AWS
 
What are the major components of MongoDB and the major tools used in it.docx
What are the major components of MongoDB and the major tools used in it.docxWhat are the major components of MongoDB and the major tools used in it.docx
What are the major components of MongoDB and the major tools used in it.docx
 
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
 
Introduction To MongoDB
Introduction To MongoDBIntroduction To MongoDB
Introduction To MongoDB
 
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycleKyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
Kyryl Truskovskyi: Kubeflow for end2end machine learning lifecycle
 
MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user group
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
Public Cloud Workshop
Public Cloud WorkshopPublic Cloud Workshop
Public Cloud Workshop
 
MongoDB Versatility: Scaling the MapMyFitness Platform
MongoDB Versatility: Scaling the MapMyFitness PlatformMongoDB Versatility: Scaling the MapMyFitness Platform
MongoDB Versatility: Scaling the MapMyFitness Platform
 
Real-time analytics with Druid at Appsflyer
Real-time analytics with Druid at AppsflyerReal-time analytics with Druid at Appsflyer
Real-time analytics with Druid at Appsflyer
 
Kubernetes for Beginners
Kubernetes for BeginnersKubernetes for Beginners
Kubernetes for Beginners
 

Último

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Último (20)

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

Cassandra meetup slides - Oct 15 Santa Monica Coloft

  • 1. SHIFT.com Migrating from MongoDB to Cassandra by: Blake Eggleston & Jon Haddad
  • 2. What is SHIFT.com? Shift is a platform that enables marketers to communicate across organizations and departments in one single place. It’s also an open application platform with a set of applications built on top of it that can communicate with one another.
  • 3. Initial Stack ● Python ○ Flask ○ Celery ● MongoDB ○ mongoengine ● Neo4j / Titan ○ Bulbs ○ thunderdome ● Redis ● AWS ○ m1.xlarge for mongo
  • 4. Current Stack ● Python ○ still flask ○ still celery ○ gevent (it rocks) ● Cassandra ○ 1.2.6 ○ cqlengine ● ElasticSearch ● Redis ○ jondis ● AWS ○ m1.xlarge
  • 5. Why did we move to Cassandra? ● Operational Benefits ○ Adding and removing nodes is much easier, compared to Mongo’s shards ● Control over our Data on Disk (LSMT) ● Love CQL3 ● Long term scalability ○ Scales Linearly ○ Multi DC Support Baked in
  • 6. Migration Goals ● Zero downtime ○ We wanted to roll out Cassandra without any service interruptions ● No loss of performance ○ By carefully structuring our schema we were able to match MongoDB’s performance.
  • 8. Benefits of CQL3 ● Easy to understand if you’re coming from RDBMS ● Collections ○ sets, lists, maps ● Batch Queries ● Clustering Keys ○ Handles ordering of logical rows ○ Saved us from column name management scheme and allowed us to focus on our data
  • 12. Data Modelling Patterns ● considerations: working with Mongo’s dbrefs and optimizing layout on disk ● structured tables as materialized views of the queries we planned on using ● moving multiple documents into a single physical row ● creating supporting index tables for looking up logical rows
  • 13. Time Series: Message Stream ● Users have tens of thousands of messages ● Each users message stream is specific to them, like a twitter feed ● This is Cassandra’s strength - Time Series ● Considered Redis - but poor for multi-dc create table news_feed ( user_id uuid, message_id timeuuid, message, primary key (user_id, message_id));
  • 14. cqlengine ● ● ● ● ● cqlengine.org the Python CQL3 object-row mapper exposes CQL3 tables as Python classes maps columns to properties builds CQL queries #model definition class ExampleModel(Model): example_id = columns.UUID (primary_key=True) example_type = columns.Integer(index=True) created_at = columns.DateTime() description = columns.Text(required=False) # example query ExampleModel.objects(example_type=1)
  • 15. Improvements from moving to C* ● ● ● ● Operationally we’ve had zero problems Outstanding Performance Easy to build new features Community has been amazing (mailing list and #cassandra)
  • 16. misc tips ● leveled compaction - good for read heavy workloads ● use secondary indexes sparingly, understand how they work and when to use them ● to reiterate, think about how you’re going to query your data ● use elastic search / solr for ad hoc queries
  • 17. Contact Info Jon Haddad @rustyrazorblade jon@shift.com Blake Eggleston @blakeeggleston blake@shift.com ….we’re hiring!