4. 0.6 to 1.2
• 1,352 changed files with 235,413 additions and 47,487 deletions
• 7,429 commits
• 1,653 tickets completed
https://github.com/apache/cassandra/compare/cassandra-0.6.0...cassandra-1.2
https://github.com/apache/cassandra/blob/trunk/CHANGES.txt
#CASSANDRAEU
CASSANDRASUMMITEU
5. What this talk is about
Cassandra adoption at Hailo from three perspectives:
1. Development
2. Operational
3. Management
6. What is Hailo?
Hailo is The Taxi Magnet. Use Hailo to get a cab wherever you are, whenever you want.
10. What is Hailo?
• The world’s highest-rated taxi app – over 11,000 five-star reviews
• Over 500,000 registered passengers
• A Hailo hail is accepted around the world every 4 seconds
• In nearly 2 years of operation, Hailo has expanded to 15 cities on 3 continents, from Tokyo to Toronto
11. Hailo is growing
• Hailo is a marketplace that facilitates over $100M in run-rate
transactions and is making the world a better place for passengers
and drivers
• Hailo has raised over $50M in financing from the world's best
investors including Union Square Ventures, Accel, the founder of
Skype (via Atomico), Wellington Partners (Spotify), Sir Richard
Branson, and our CEO's mother, Janice
12. The history
The story behind Cassandra adoption at Hailo
13. Hailo launched in London in November 2011
• Launched on AWS
• Two PHP/MySQL web apps plus a Java backend
• Mostly built by a team of 3 or 4 backend engineers
• MySQL multi-master for single AZ resilience
14. Why Cassandra?
• A desire for greater resilience – “become a utility”
Cassandra is designed for high availability
• Plans for international expansion around a single consumer app
Cassandra is good at global replication
• Expected growth
Cassandra scales linearly for both reads and writes
• Prior experience
I had experience with Cassandra and could recommend it
15. The path to adoption
• Largely unilateral decision by developers – a result of a startup
culture
• Replacement of key consumer app functionality, splitting up the
PHP/MySQL web app into a mixture of global PHP/Java services
backed by a Cassandra data store
• Launched into production in September 2012 – originally just
powering North American expansion, before gradually switching
over Dublin and London
16. One year on...
• Further breakdown of functionality into Go/Java SOA
• Migrating all online databases to Cassandra
21. Considerations for entity storage
• Do not read the entire entity, update one property and then write
back a mutation containing every column
• Only mutate columns that have been set
• This avoids read-before-write race conditions
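The update pattern above can be sketched in a few lines. This is a minimal illustration, not Hailo's actual code: a dict stands in for the Cassandra row, all function and column names are hypothetical, and a real implementation would send the result as column-level mutations through a client library.

```python
# Sketch of the "only mutate what was set" entity-update pattern.
# A dict stands in for the row; a real implementation would issue
# these as column mutations via a Cassandra client. All names are
# illustrative.

def build_mutation(changes):
    """Turn only the explicitly-set properties into a mutation.

    Properties the caller never touched are simply absent, so a
    concurrent writer updating a different property is not clobbered
    by a stale read-modify-write of the whole entity.
    """
    return {col: val for col, val in changes.items() if val is not None}

def apply_mutation(row, mutation):
    # Equivalent of a column-level write: untouched columns survive.
    row.update(mutation)
    return row

# Two writers race on the same entity; neither reads the row first.
row = {"name": "Alice", "phone": "123", "city": "London"}
apply_mutation(row, build_mutation({"phone": "456"}))
apply_mutation(row, build_mutation({"city": "Dublin"}))
assert row == {"name": "Alice", "phone": "456", "city": "Dublin"}
```

Because neither writer sends columns it did not change, the interleaving order of the two updates no longer matters.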
26. Considerations for time series storage
• Choose row key carefully, since this partitions the records
• Think about how many records you want in a single row
• Denormalise on write into many indexes
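These considerations can be sketched as follows, assuming a daily bucket and time-ordered column names. A real implementation would use a type 1 (time-based) UUID as the column name; the tuple here is an illustrative stand-in, and all index names are hypothetical.

```python
# Sketch of daily-bucketed time-series keys with write-time
# denormalisation into multiple indexes. Names are illustrative.
from datetime import datetime, timezone

def row_key(event_time, index="messages"):
    # The row key partitions the records: one row per index per day
    # keeps the number of records in a single row bounded.
    return f"{index}:{event_time.strftime('%Y%m%d')}"

def column_name(event_time, seq):
    # Sorts chronologically within the row, like a TimeUUID would.
    return (event_time.isoformat(), seq)

def index_rows(event):
    # Denormalise on write: emit a key for every index a read path
    # will need (per-day here, plus a per-address index).
    t = event["time"]
    return [row_key(t), f"address:{event['address']}:{t.strftime('%Y%m%d')}"]

t = datetime(2013, 10, 17, 9, 30, tzinfo=timezone.utc)
assert row_key(t) == "messages:20131017"
assert column_name(t, 1) < column_name(t.replace(hour=10), 0)
```

The trade-off is that each logical write fans out into several physical writes, which Cassandra handles cheaply; the reads stay single-row lookups.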
28. Analytics
• With Cassandra we lost the ability to carry out ad-hoc analytics queries, e.g. COUNT, SUM, AVG, GROUP BY
• We use Acunu Analytics to give us this ability in real time, for pre-planned query templates
• It is backed by Cassandra and therefore highly available, resilient
and globally distributed
• Integration is straightforward
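Acunu's internals aside, the general technique behind real-time analytics on Cassandra is to pre-aggregate on write for each pre-planned query template. A minimal sketch, with plain in-memory counters standing in for Cassandra counter-column increments and all names illustrative:

```python
# Sketch of write-time pre-aggregation for pre-planned queries:
# COUNT/SUM/AVG per group, maintained incrementally on ingest
# instead of scanned at query time.
from collections import defaultdict

class RollingAggregate:
    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def ingest(self, group, value):
        # In Cassandra this would be a counter-column increment.
        self.count[group] += 1
        self.total[group] += value

    def avg(self, group):
        return self.total[group] / self.count[group]

agg = RollingAggregate()
for city, fare in [("London", 10.0), ("London", 14.0), ("Dublin", 8.0)]:
    agg.ingest(city, fare)
assert agg.count["London"] == 2
assert agg.avg("London") == 12.0
```

The cost is that the query templates (the GROUP BY dimensions) must be known up front, which is exactly the constraint noted above.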
33. “Allows a team of 2 to achieve things they
wouldn’t have considered before Cassandra
existed”
Chris H, Operations Engineer
37. Stats
• Cluster: AWS VPCs with OpenVPN links, 3 AZs per region, m1.large machines, ~1TB/node, Provisioned IOPS EBS
• Operational cluster: ~200GB/node
38. Backups
• SSTable snapshot
• We used to upload snapshots to S3, but this was taking >6 hours and consuming all our network bandwidth
• Now take EBS snapshot of the data volumes
39. Encryption
• Requirement for NYC launch
• We use dmcrypt to encrypt the entire EBS volume
• Chose dmcrypt because it is uncomplicated
• Our tests show a ~1% hit in disk performance, which concurs with what Amazon suggests
41. Multi DC
• Something that Cassandra makes trivial
• Accomplishing active-active inter-DC replication with a team of 2 would have been very difficult without Cassandra
• Rolling repair needed to make it safe (we use LOCAL_QUORUM)
• We schedule “narrow repairs” on different nodes in our cluster
each night
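A nightly narrow-repair rota like the one described can be sketched as below. The hostnames are hypothetical, the token range assumes the Murmur3 partitioner (adjust for RandomPartitioner), and a real job would shell out to `nodetool repair` with `-st`/`-et` for each subrange.

```python
# Sketch of a nightly "narrow repair" rota: each night one node
# repairs a slice of the token range, cycling through the cluster.
from datetime import date

NODES = ["cass1", "cass2", "cass3"]  # hypothetical hostnames
MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1  # Murmur3 token range

def tonights_node(today, nodes=NODES):
    # Rotate nightly so every node is repaired within len(nodes) nights.
    return nodes[today.toordinal() % len(nodes)]

def subranges(n):
    # Split the full token range into n "narrow" repair ranges, so
    # each repair session touches only a fraction of the data.
    span = (MAX_TOKEN - MIN_TOKEN) // n
    ranges = []
    for i in range(n):
        start = MIN_TOKEN + i * span
        end = MAX_TOKEN if i == n - 1 else start + span
        ranges.append((start, end))
    return ranges

assert tonights_node(date(2013, 10, 17)) in NODES
assert subranges(8)[0][0] == MIN_TOKEN
assert subranges(8)[-1][1] == MAX_TOKEN
```

Spreading repair out this way keeps each session short and avoids the cluster-wide load spike of a full repair, which matters when reads rely on LOCAL_QUORUM plus repair for cross-DC safety.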
42. Compression
• Our stats cluster was running at ~1.5TB per node
• We didn’t want to add more nodes
• With compression, we are now back to ~600GB
• Easy to accomplish
• `nodetool upgradesstables` on a rolling schedule
44. “The days of the quick and dirty are over”
Simon V, EVP Operations
45. Technically, everything is fine…
• Our COO feels that C* is “technically good and beautiful”, a
“perfectly good option”
• Our EVPO says that C* reminds him of a time series database in
use at Goldman Sachs that had “very good performance”
…but there are concerns
46. [Figure: relative sizes of "people who can attempt to query MySQL" vs. "people who can attempt to query Cassandra"]
51. Lesson learned
• Have an advocate - get someone who will sell the vision internally
• Learn the theory - teach each team member the fundamentals
• Make an effort to get everyone on board
58. Lesson learned
• Be proactive with Cassandra, even if it seems to be running smoothly
• Peer-review data models, take time to think about them
• Big rows are bad - use cfstats to look for them
• Mixed workloads can cause problems - use cfhistograms and look
out for signs of data modeling problems
• Think about the compaction strategy for each CF
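The cfstats check above can be automated with a small helper. The "Compacted row maximum size" line matches 1.x-era `nodetool cfstats` output; treat the exact wording as an assumption and adjust the pattern for your Cassandra version.

```python
# Sketch: scan `nodetool cfstats` output for oversized rows, which
# usually expose a data modeling problem. The line format is assumed
# from 1.x-era output.
import re

ROW_MAX = re.compile(r"Compacted row maximum size:\s*(\d+)")

def big_row_warnings(cfstats_text, threshold=100 * 1024 * 1024):
    """Yield max-row sizes (bytes) at or above the threshold (default 100MB)."""
    for m in ROW_MAX.finditer(cfstats_text):
        size = int(m.group(1))
        if size >= threshold:
            yield size

sample = """
    Column Family: events
    Compacted row maximum size: 186563160
    Column Family: entities
    Compacted row maximum size: 4768
"""
assert list(big_row_warnings(sample)) == [186563160]
```

Run against the saved output of `nodetool cfstats` on a cron, this turns "look for big rows" into an alert rather than a manual chore.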
60. Lessons learned
• EBS is nearly always the cause of Amazon outages
• EBS is a single point of failure (it will fail everywhere in your
cluster)
• EBS is slow
• EBS is expensive
• EBS is unnecessary!
62. Lessons learned
• Keep the business informed – explain the tradeoffs in simple terms
• Sing from the same hymn sheet
• Make sure there are solutions in place for every use case from the beginning
63. [Figure repeated: relative sizes of "people who can attempt to query MySQL" vs. "people who can attempt to query Cassandra"]
65. We like Cassandra
• Solid design
• HA characteristics
• Easy multi-DC setup
• Simplicity of operation
66. Lessons for successful adoption
• Have an advocate, sell the dream
• Learn the fundamentals, get the best out of Cassandra
• Invest in tools to make life easier
• Keep management in the loop, explain the trade-offs
67. The future
• We will continue to invest in Cassandra as we expand globally
• We will hire people with experience running Cassandra
• We will focus on expanding our reporting facilities
• We aspire to extend our network (1M consumer installs, wallet)
beyond cabs
• We will continue to hire the best engineers in London, NYC and
Asia
I started using Cassandra in 2010, back in version 0.6. Back then it was quite hard work.
I founded the London meetup group in 2010 and have been flying the C* flag over London ever since. My motivation was to connect with others who were using Cassandra. Back then “swapping war stories” was a common theme. Cassandra was not easy to use.
Fast forward to 2013. 7,429 commits later. Cassandra “just works”. Kudos to the team of committers and contributors who have made this happen.
Whilst "it just works" is quite compelling, there are still challenges to successful adoption of C* in an organisation. I am going to talk about our experiences at Hailo, from three perspectives: dev, ops and management.
On iOS and Android, live in London, New York, Chicago, Toronto, Boston, Dublin, Madrid
Founded by 3 taxi drivers and 3 seasoned entrepreneurs.
Built by a small team, in one room, on a boat on the Thames, but with global ambitions. Cloud native from day 1 – run solely on AWS.
My recommendation was based on the solid design principles behind C*, something I’ve talked about in the past.
Row key = entity ID, in this instance a 64-bit integer à la Snowflake. Column name = property name. Value = property value. A key point when using this pattern is to only mutate columns that you change.
Read heavy, demand-driven. Writes consistent.
Time series for storing records of all actions in Hailo. In this instance bucketed by a daily row key, for all messages. The column name is a type 1 UUID.
We also denormalise for other indexes, eg: here we store every message sent to a given address under a single row.
Stats service – insert rate at 5k/sec. Responsible for storing business events from all areas of our system.
We are not using CQL.
We can execute AQL
London, NYC, Tokyo, Osaka, Dublin, Toronto, Boston, Chicago, Madrid, Barcelona, Washington, Montreal
Our rings, plus key stats (m1.large, 18 nodes in cluster A, 12 nodes in cluster B, 100GB per node in cluster A, ~ 600GB in cluster B)
EC2 snitch
I interviewed key people from our management team to gauge their reaction to our C* deployment.
There is a perception that we have made it much harder to get at our data. In the early days at Hailo, when we all worked in one room, developers could execute ad-hoc queries on the fly for management. Nowadays we can't. The reasons behind this are two-fold. Firstly, it is true that ad-hoc queries are harder to execute against C*. But that's not the whole picture: much of our data is still in MySQL, and the queries we used to run against that data do not run smoothly either. The perception, however, is that the "new database" is the cause of the problems.
It’s easy to cause yourself a “Big Data” problem. Developers collect and store data because they can, without being clear about the business implications.
1. Most people have N years of SQL experience where N >= 5
Sometimes C* works too well. Clearly this cluster needs some attention, but our application is still working fine. We are probably at the point where we need a dedicated C* expert.
2. It’s possible to shoot yourself in the foot – but this is true of SQL (eg: joins that work with low data volumes)
Big rows are bad – they expose a data modeling problem
With the right tools, we could change the picture completely.