Cabs, Cassandra, and Hailo

David Gardner, Architect at Hailo
#CASSANDRAEU

CASSANDRASUMMITEU

0.6 to 1.2
• 1,352 changed files with 235,413 additions and 47,487 deletions
• 7,429 commits
• 1,653 tickets completed

https://github.com/apache/cassandra/compare/cassandra-0.6.0...cassandra-1.2
https://github.com/apache/cassandra/blob/trunk/CHANGES.txt

What this talk is about
Cassandra adoption at Hailo from three perspectives:
1. Development
2. Operational
3. Management

What is Hailo?
Hailo is The Taxi Magnet. Use Hailo to get a cab wherever you are, whenever you want.

What is Hailo?
• The world’s highest-rated taxi app – over 11,000 five-star reviews
• Over 500,000 registered passengers
• A Hailo hail is accepted around the world every 4 seconds
• Hailo operates in 15 cities on 3 continents from Tokyo to Toronto in
nearly 2 years of operation

Hailo is growing
• Hailo is a marketplace that facilitates over $100M in run-rate
transactions and is making the world a better place for passengers
and drivers
• Hailo has raised over $50M in financing from the world's best
investors including Union Square Ventures, Accel, the founder of
Skype (via Atomico), Wellington Partners (Spotify), Sir Richard
Branson, and our CEO's mother, Janice

The history
The story behind Cassandra adoption at Hailo

Hailo launched in London in November 2011
• Launched on AWS
• Two PHP/MySQL web apps plus a Java backend
• Mostly built by a team of 3 or 4 backend engineers
• MySQL multi-master for single AZ resilience

Why Cassandra?
• A desire for greater resilience – “become a utility”
  Cassandra is designed for high availability
• Plans for international expansion around a single consumer app
  Cassandra is good at global replication
• Expected growth
  Cassandra scales linearly for both reads and writes
• Prior experience
  I had experience with Cassandra and could recommend it

The path to adoption
• Largely unilateral decision by developers – a result of a startup
culture

• Replacement of key consumer app functionality, splitting up the
PHP/MySQL web app into a mixture of global PHP/Java services
backed by a Cassandra data store
• Launched into production in September 2012 – originally just
powering North American expansion, before gradually switching
over Dublin and London

One year on...
• Further breakdown of functionality into Go/Java SOA
• Migrating all online databases to Cassandra

Development perspective

“Cassandra just works”
Dom W, Senior Engineer

Use cases
1. Entity storage
2. Time series data

CF = customers
126007613634425612:
  createdTimestamp: 1370465412
  email: dave@cruft.co
  givenName: Dave
  familyName: Gardner
  locale: en_GB
  phone: +447911111111

Considerations for entity storage
• Do not read the entire entity, update one property and then write
back a mutation containing every column

• Only mutate columns that have been set
• This avoids read-before-write race conditions
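
The advice above can be illustrated with a minimal in-memory sketch. It assumes a hypothetical `store` dict standing in for the `customers` column family and hypothetical `mutate` and `changed_only` helpers; none of these belong to phpcassa, Astyanax or Gossie. Two writers each change a different property, and because each sends only the column it changed, neither overwrites the other's update.

```python
# Minimal sketch: per-column mutations against an in-memory stand-in
# for a column family. `store`, `mutate` and `changed_only` are
# hypothetical names used only for illustration; they are not a real
# Cassandra client API.

store = {
    "126007613634425612": {
        "email": "dave@cruft.co",
        "givenName": "Dave",
        "phone": "+447911111111",
    }
}


def mutate(row_key, changed_columns):
    """Write only the columns supplied, leaving all others untouched."""
    store.setdefault(row_key, {}).update(changed_columns)


def changed_only(original, edited):
    """Return just the columns whose values differ from the original read."""
    return {k: v for k, v in edited.items() if original.get(k) != v}


row_key = "126007613634425612"

# Both writers read the same starting state.
read_a = dict(store[row_key])
read_b = dict(store[row_key])

# Writer A changes the phone number; writer B changes the email.
edit_a = {**read_a, "phone": "+447922222222"}
edit_b = {**read_b, "email": "dave@example.com"}

# Each writer sends only its own change, so the writes do not clobber
# each other regardless of the order in which they arrive.
mutate(row_key, changed_only(read_a, edit_a))
mutate(row_key, changed_only(read_b, edit_b))

result = store[row_key]
```

Had writer B written back every column it originally read, the phone number would have reverted to its old value.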

CF = stats_db
2013-06-01:
  55374fa0-ce2b-11e2-8b8b-0800200c9a66: {"action":"…
  a48bd800-ce2b-11e2-8b8b-0800200c9a66: {"action":"…
  b0e15850-ce2b-11e2-8b8b-0800200c9a66: {"action":"…
  bfac6c80-ce2b-11e2-8b8b-0800200c9a66: {"action":"…

CF = stats_db
LON123456:
  13b247f0-ce2c-11e2-8b8b-0800200c9a66: {"action":"…
  20f70a40-ce2c-11e2-8b8b-0800200c9a66: {"action":"…
  2b44d3b0-ce2c-11e2-8b8b-0800200c9a66: {"action":"…
  338a22f0-ce2c-11e2-8b8b-0800200c9a66: {"action":"…

Considerations for time series storage
• Choose row key carefully, since this partitions the records
• Think about how many records you want in a single row

• Denormalise on write into many indexes
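
A minimal sketch of these considerations, assuming an in-memory dict in place of the `stats_db` column family. The `record_event` helper and the choice of two indexes (per day and per driver) are illustrative assumptions modelled on the two example rows, not Hailo's actual code.

```python
# Minimal sketch: choosing row keys for time series data and
# denormalising on write. The dict `stats_db` stands in for a column
# family; `record_event` and the index layout are illustrative
# assumptions, not Hailo's actual schema.
import uuid
from collections import defaultdict
from datetime import datetime, timezone

stats_db = defaultdict(dict)  # row key -> {column name -> value}


def day_bucket(ts):
    """Row key for the per-day index, e.g. '2013-06-01'."""
    return ts.strftime("%Y-%m-%d")


def record_event(ts, driver_id, payload):
    """Write one event into every index that a later query will need."""
    # uuid.uuid1() is a type 1 (time-based) UUID built from the current
    # clock, so this sketch assumes events are written as they happen.
    column = uuid.uuid1()
    stats_db[day_bucket(ts)][column] = payload  # index by day
    stats_db[driver_id][column] = payload       # index by driver
    return column


ts = datetime(2013, 6, 1, 12, 0, tzinfo=timezone.utc)
col = record_event(ts, "LON123456", '{"action": "accepted"}')
```

The bucket size decides how many columns end up in one row, so a busier stream may need a finer bucket than a day.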

Client libraries
• Gossie (Go)
• Astyanax (Java)

• phpcassa (PHP)

Analytics
• With Cassandra we lost the ability to carry out analytics,
e.g. COUNT, SUM, AVG, GROUP BY
• We use Acunu Analytics to give us this ability in real time, for pre-planned query templates
• It is backed by Cassandra and is therefore highly available, resilient
and globally distributed
• Integration is straightforward

[Diagram: events → NSQ → Acunu → C*]

AQL
SELECT
SUM(accepted),
SUM(ignored),
SUM(declined),
SUM(withdrawn)
FROM Allocations
WHERE timestamp BETWEEN '1 week ago' AND 'now'
AND driver='LON123456789'
GROUP BY timestamp(day)
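
One way to picture what a query like this returns is a per-day roll-up that is updated as each event arrives. The sketch below is plain Python and makes no claim about how Acunu Analytics works internally; the `record` and `weekly_summary` helpers and the event shape are assumptions for illustration.

```python
# Minimal sketch of a pre-aggregated roll-up: counters are incremented
# on write, so a "SUM ... GROUP BY day" style read becomes a lookup.
# This does not describe Acunu Analytics internals; all names here are
# hypothetical.
from collections import defaultdict
from datetime import date, timedelta

OUTCOMES = ("accepted", "ignored", "declined", "withdrawn")

# (driver, day) -> {outcome -> count}
rollup = defaultdict(lambda: dict.fromkeys(OUTCOMES, 0))


def record(driver, day, outcome):
    """Increment the counter for one allocation outcome."""
    rollup[(driver, day)][outcome] += 1


def weekly_summary(driver, today):
    """Return per-day sums for the seven days ending at `today`."""
    days = [today - timedelta(days=n) for n in range(6, -1, -1)]
    return {d: dict(rollup[(driver, d)]) for d in days if (driver, d) in rollup}


today = date(2013, 6, 7)
record("LON123456789", date(2013, 6, 6), "accepted")
record("LON123456789", date(2013, 6, 6), "accepted")
record("LON123456789", date(2013, 6, 7), "declined")

summary = weekly_summary("LON123456789", today)
```

Because the sums are maintained on write, only queries that match a template planned in advance can be answered this way.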

Operational perspective

“Allows a team of 2 to achieve things they
wouldn’t have considered before Cassandra
existed”
Chris H, Operations Engineer

[Diagram: 6 machines per region; 3 regions; 3 clusters (stats cluster is a long story). Clusters shown: Operational Cluster and Stats Cluster. Regions shown: us-east-1 and eu-west-1 (each labelled twice) and ap-southeast-1.]

[Diagram: nodes in eu-west-1, us-east-1 and ap-southeast-1, labelled by availability zone (AZ1, AZ2 and AZ3, six labels each)]

• AWS VPCs with OpenVPN links
• 3 AZs per region
• m1.large machines
• Provisioned IOPS EBS

Stats Cluster: ~1TB/node
Operational Cluster: ~200GB/node

Backups
• SSTable snapshot
• Used to upload to S3, but this was taking >6 hours and consuming
all our network bandwidth
• Now take EBS snapshot of the data volumes

Encryption
• Requirement for NYC launch
• We use dm-crypt to encrypt the entire EBS volume
• Chose dm-crypt because it is uncomplicated
• Our tests show a 1% performance hit in disk performance, which
concurs with what Amazon suggest

DataStax OpsCenter is a quick win

Multi DC
• Something that Cassandra makes trivial
• Would have been very difficult to accomplish active-active inter-DC
replication with a team of 2 without Cassandra
• Rolling repair needed to make it safe (we use LOCAL_QUORUM)
• We schedule “narrow repairs” on different nodes in our cluster
each night
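
The arithmetic behind LOCAL_QUORUM is worth spelling out. A quorum is a majority of replicas, and LOCAL_QUORUM counts only the replicas in the coordinator's own data centre. The replication factor of 3 per data centre used below is an assumption for illustration; the slides do not state Hailo's setting.

```python
# Worked arithmetic for LOCAL_QUORUM. The replication factor of 3 per
# data centre is an assumed value for illustration only.


def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


local_rf = 3
needed = quorum(local_rf)            # replicas that must acknowledge
spare = tolerated_failures(local_rf)  # replicas that may be unavailable
```

A LOCAL_QUORUM write is acknowledged without waiting for replicas in the other data centres, which is why regular repair matters for bringing any replica that missed a write back in line.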

Compression
• Our stats cluster was running at ~1.5TB per node
• We didn’t want to add more nodes

• With compression, we are now back to ~600GB
• Easy to accomplish

• `nodetool upgradesstables` on a rolling schedule
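
The saving implied by these figures can be worked out directly. The snippet assumes decimal units (1TB = 1000GB) and uses only the approximate per-node sizes quoted on this slide.

```python
# Worked arithmetic for the per-node figures on this slide
# (decimal units assumed: 1TB = 1000GB).
before_gb = 1500  # ~1.5TB per node before compression
after_gb = 600    # ~600GB per node after compression

ratio = after_gb / before_gb  # compressed size as a fraction of the original
saving = 1 - ratio            # fraction of disk space reclaimed
```

On these numbers the data shrinks to roughly 40% of its original size, a saving of about 60% per node.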

Management perspective

“The days of the quick and dirty are over”
Simon V, EVP Operations

Technically, everything is fine…
• Our COO feels that C* is “technically good and beautiful”, a
“perfectly good option”

• Our EVPO says that C* reminds him of a time series database in
use at Goldman Sachs that had “very good performance”

…but there are concerns

[Diagram: “People who can attempt to query MySQL” compared with “People who can attempt to query Cassandra”]

Lessons learned

There might be a gulf in experience

[Chart: average years' experience per team member, MySQL compared with Cassandra; axis value shown: 10]

Lessons learned
• Have an advocate - get someone who will sell the vision internally
• Learn the theory - teach each team member the fundamentals

• Make an effort to get everyone on board

Things can drift into failure

Lessons learned
• Be pro-active with Cassandra, even if it seems to be running
smoothly

• Peer-review data models, take time to think about them
• Big rows are bad - use cfstats to look for them
• Mixed workloads can cause problems - use cfhistograms and look
out for signs of data modeling problems
• Think about the compaction strategy for each CF
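
A back-of-envelope calculation shows how quickly a time series row can grow. The 5,000 inserts per second figure is the stats service rate quoted in the speaker notes; the assumption that every insert adds one column to a single time-bucketed row is a simplification for illustration, not a description of Hailo's real layout.

```python
# Back-of-envelope sketch: columns per row for different bucket sizes.
# Assumes every insert adds one column to a single bucketed row, which
# is a simplification for illustration rather than Hailo's real layout.

INSERTS_PER_SECOND = 5_000  # stats service insert rate from the notes

BUCKET_SECONDS = {
    "minute": 60,
    "hour": 60 * 60,
    "day": 24 * 60 * 60,
}


def columns_per_row(bucket):
    """Columns written to one row during a single bucket interval."""
    return INSERTS_PER_SECOND * BUCKET_SECONDS[bucket]


per_bucket = {name: columns_per_row(name) for name in BUCKET_SECONDS}
```

Under that assumption a daily bucket would collect 432 million columns in one row, which is the kind of result a peer review of the data model should catch.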

EBS is terrible

Lessons learned
• EBS is nearly always the cause of Amazon outages
• EBS is a single point of failure (it will fail everywhere in your
cluster)
• EBS is slow
• EBS is expensive
• EBS is unnecessary!

Management need to know the trade offs

Lessons learned
• Keep the business informed – explain the tradeoffs in simple terms
• Sing from the same hymn sheet
• Make sure there are solutions in place for every use case from the
beginning

[Diagram: “People who can attempt to query MySQL” compared with “People who can attempt to query Cassandra”]

Conclusions

We like Cassandra
• Solid design
• HA characteristics
• Easy multi-DC setup
• Simplicity of operation

Lessons for successful adoption
• Have an advocate, sell the dream
• Learn the fundamentals, get the best out of Cassandra
• Invest in tools to make life easier
• Keep management in the loop, explain the trade offs

The future
• We will continue to invest in Cassandra as we expand globally
• We will hire people with experience running Cassandra
• We will focus on expanding our reporting facilities
• We aspire to extend our network (1M consumer installs, wallet)
beyond cabs
• We will continue to hire the best engineers in London, NYC and
Asia

Questions?

Editor's notes

  1. I started using Cassandra in 2010, back in version 0.6. Back then it was quite hard work.
  2. I founded the London meetup group in 2010 and have been flying the C* flag over London ever since. My motivation was to connect with others who were using Cassandra. Back then “swapping war stories” was a common theme. Cassandra was not easy to use.
  3. Fast forward to 2013. 7,429 commits later. Cassandra “just works”. Kudos to the team of committers and contributors who have made this happen.
  4. 4:30 Whilst “it just works” is quite compelling, there are still challenges to successful adoption of C* in an organisation. I am going to talk about our experiences at Hailo, from three perspectives: dev, ops and management.
  5. On iOS and Android, live in London, New York, Chicago, Toronto, Boston, Dublin, Madrid
  6. Founded by 3 taxi drivers and 3 seasoned entrepreneurs.
  7. Built by a small team, in one room, on a boat on the Thames, but with global ambitions. Cloud native from day 1 – run solely on AWS.
  8. My recommendation was based on the solid design principles behind C*, something I’ve talked about in the past.
  9. 13:00
  10. Row key = entity ID, in this instance a 64-bit integer à la Snowflake. Column name = property name. Value = property value. A key point when using this pattern is to only mutate columns that you change.
  11. Row key = entity ID, in this instance a 64-bit integer à la Snowflake. Column name = property name. Value = property value. A key point when using this pattern is to only mutate columns that you change.
  12. Read heavy, demand-driven. Writes consistent.
  13. Time series for storing records of all actions in Hailo. In this instance bucketed by a daily row key, for all messages. The column name is a type 1 UUID.
  14. We also denormalise for other indexes, eg: here we store every message sent to a given address under a single row.
  15. Stats service – insert rate at 5k/sec. Responsible for storing business events from all areas of our system.
  16. Row key = entity ID, in this instance a 64-bit integer à la Snowflake. Column name = property name. Value = property value. A key point when using this pattern is to only mutate columns that you change.
  17. We are not using CQL.
  18. We can execute AQL
  19. Some screenshot
  20. 27:00
  21. London, NYC, Tokyo, Osaka, Dublin, Toronto, Boston, Chicago, Madrid, Barcelona, Washington, Montreal
  22. Our rings, plus key stats (m1.large, 18 nodes in cluster A, 12 nodes in cluster B, 100GB per node in cluster A, ~ 600GB in cluster B)
  23. EC2 snitch
  24. Our rings, plus key stats (m1.large, 18 nodes in cluster A, 12 nodes in cluster B, 100GB per node in cluster A, ~ 600GB in cluster B)
  25. I interviewed key people from our management team to gauge their reaction to our C* deployment.
  26. There is a perception that we have made it much harder to get at our data. In the early days at Hailo, when we all worked in one room, developers could execute ad-hoc queries on the fly for management. Nowadays we can’t. The reasons behind this are two-fold. Firstly, it is true that ad-hoc queries are harder to execute against C*. But that’s not the whole picture. Much of our data is still in MySQL, and the queries we used to do against this data do not run smoothly either. The perception, however, is that it is the “new database” that is the cause of problems.
  27. It’s easy to cause yourself a “Big Data” problem. Developers collect and store data because they can, without being clear about the business implications.
  28. 1. Most people have N years of SQL experience where N >= 5
  29. Sometimes C* works too well. Clearly this cluster needs some attention, but our application is still working fine. We are probably at the point where we need a dedicated C* expert.
  30. 2. It’s possible to shoot yourself in the foot – but this is true of SQL (eg: joins that work with low data volumes)
  31. Big rows are bad – they expose a data modeling problem
  32. Big rows are bad – they expose a data modeling problem
  33. Big rows are bad – they expose a data modeling problem
  34. With the right tools, we could change the picture completely.
  35. 43:00