Hailo has leveraged Cassandra to build one of the most successful startups in European history. This presentation looks at how Hailo grew from a simple MySQL-backed infrastructure to a resilient Cassandra-backed system running in three data centers globally. Topics covered include: the process of migration, experience running multi-DC on AWS, common data modeling patterns, and the security implications of achieving PCI compliance.
#CASSANDRA13 CASSANDRASUMMIT2013
What is Hailo?
• The world’s highest-rated taxi app - over 7,000 five-star reviews
• Over 300,000 registered passengers
• A Hailo hail is accepted around the world every 5 seconds
• Hailo is growing (30%+) every month
• Became the largest taxi network in all of Ireland within two months of launch
Hailo launched in London in November 2011
• Launched on AWS
• Two PHP/MySQL web apps plus a Java backend
• Mostly built by a team of 3 or 4 backend engineers
• MySQL multi-master for single AZ resilience
Why Cassandra?
• A desire for greater resilience – “become a utility”
Cassandra is designed for high availability
• Plans for international expansion around a single consumer app
Cassandra is good at global replication
• Expected growth
Cassandra scales linearly for both reads and writes
• Prior experience
I had experience with Cassandra and could recommend it
The path to adoption
• Largely unilateral decision by developers – a result of a startup culture
• Replacement of key consumer app functionality, splitting up the PHP/MySQL web app into a mixture of global PHP/Java services backed by a Cassandra data store
• Launched into production in XYZ – originally just powering North American expansion, before gradually switching over Dublin and London
Considerations for entity storage
• Do not read the entire entity, update one property and then write back a mutation containing every column
• Only mutate columns that have been set
• This avoids read-before-write race conditions
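As a sketch of the pattern above (hypothetical code, not Hailo's actual implementation): track which properties the caller explicitly set, and build the mutation from only those columns, so a concurrent update to a different column is never clobbered.

```python
# Hypothetical sketch of the "only mutate set columns" pattern.
# Instead of read-modify-write on the whole entity, we accumulate the
# columns that were explicitly set and write back only those.

class PartialEntityUpdate:
    """Accumulates only the columns the caller actually set."""

    def __init__(self, row_key):
        self.row_key = row_key
        self._dirty = {}  # column name -> new value

    def set(self, column, value):
        self._dirty[column] = value

    def to_mutation(self):
        # The mutation contains only the dirty columns; untouched
        # columns are never rewritten, so no read is needed first.
        return {"key": self.row_key, "columns": dict(self._dirty)}

update = PartialEntityUpdate("customer:1234")
update.set("phone", "+44 20 7946 0000")
mutation = update.to_mutation()
# mutation -> {"key": "customer:1234", "columns": {"phone": "+44 20 7946 0000"}}
```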
Considerations for time series storage
• Choose row key carefully, since this partitions the records
• Think about how many records you want in a single row
• Denormalise on write into many indexes
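One common way to "choose the row key carefully" is to bucket it by time. A minimal sketch (the entity id, bucket size and key layout are illustrative assumptions, not Hailo's schema):

```python
# Hypothetical sketch of time-series row-key bucketing. The row key
# partitions the data, so bucketing by day bounds row width: one row
# holds at most one day of records for one entity.

from datetime import datetime, timezone

def bucketed_row_key(entity_id, ts):
    """Row key = entity id + day bucket, e.g. 'driver42:2013-06-11'."""
    day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"{entity_id}:{day}"

# Records within the row would then be keyed by a time-based column
# name (e.g. a TimeUUID) so they sort chronologically inside the bucket.
key = bucketed_row_key("driver42", 1370908800)  # 2013-06-11 00:00:00 UTC
```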
Analytics
• With Cassandra we lost the ability to carry out analytics, e.g. COUNT, SUM, AVG, GROUP BY
• We use Acunu Analytics to give us this ability in real time, for pre-planned query templates
• It is backed by Cassandra and therefore highly available, resilient and globally distributed
• Integration is straightforward
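The general pattern behind real-time analytics on Cassandra can be sketched as follows (an illustration of write-time aggregation for pre-planned queries, not Acunu's actual implementation):

```python
# Hypothetical sketch: aggregates (COUNT/SUM, hence AVG) are maintained
# at write time for pre-planned queries, rather than scanned at read
# time as SQL would do with GROUP BY.

from collections import defaultdict

class WriteTimeAggregates:
    def __init__(self):
        self._count = defaultdict(int)
        self._sum = defaultdict(float)

    def record(self, bucket, value):
        # In Cassandra these would be counter-column increments.
        self._count[bucket] += 1
        self._sum[bucket] += value

    def avg(self, bucket):
        return self._sum[bucket] / self._count[bucket]

agg = WriteTimeAggregates()
for fare in (12.0, 18.0, 24.0):
    agg.record("london:2013-06-11", fare)
# agg.avg("london:2013-06-11") -> 18.0
```

The trade-off matches the slide: reads are cheap and real-time, but only for the query templates you planned for at write time.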
Learn the theory
• Teach each team member the fundamentals
• CQL can encourage an SQL mindset, but it’s important to understand the underlying data model
• Make a real effort to share knowledge – keep in mind the gulf in experience for most team members between their old world and the new world (SQL vs NoSQL)
• Peer review data models
2 clusters, 3 regions, 6 machines per region
• Operational cluster: ap-southeast-1, us-east-1, eu-west-1
• Stats cluster: us-east-1, eu-west-1 (pending addition of a third DC)
• AWS VPCs with OpenVPN links
• 3 AZs per region
• m1.large machines
• Provisioned IOPS EBS
• Stats cluster: ~600GB/node
• Operational cluster: ~100GB/node
Backups
• SSTable snapshot
• Used to upload to S3, but this was taking >6 hours and consuming all our network bandwidth
• Now take EBS snapshot of the SSTable snapshots
Encryption
• Requirement for NYC launch
• We use dm-crypt to encrypt the entire EBS volume
• Chose dm-crypt because it is uncomplicated
• Our tests show a ~1% hit in disk performance, which concurs with what Amazon suggests
DataStax OpsCenter
• We run the free version
• Offers easily accessible “one screen” overviews of the activity of the entire cluster
• Big fans – an easy win
Multi DC
• Something that Cassandra makes trivial
• Active-active inter-DC replication would have been very difficult to accomplish with a team of 2 without Cassandra
• Rolling repair is needed to make it safe (we use LOCAL_QUORUM)
• We schedule “narrow repairs” on different nodes in our cluster each night
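The nightly rotation described above can be sketched as a simple round-robin over the cluster (illustrative node names; "narrow repair" here means repairing only a node's primary ranges, i.e. `nodetool repair -pr`):

```python
# Hypothetical sketch of scheduling rolling "narrow repairs": each
# night a different node repairs only its primary ranges, cycling
# through the cluster so every node is repaired regularly (and well
# within gc_grace_seconds).

def node_for_night(nodes, day_number):
    """Pick tonight's repair target by round-robin over the node list."""
    return nodes[day_number % len(nodes)]

nodes = ["cass1", "cass2", "cass3", "cass4", "cass5", "cass6"]
schedule = [node_for_night(nodes, d) for d in range(7)]
# -> ["cass1", "cass2", "cass3", "cass4", "cass5", "cass6", "cass1"]
```

Repairing only primary ranges on one node per night spreads the repair load instead of hammering the whole cluster at once.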
Compression
• Our stats cluster was running at ~1.5TB per node
• We didn’t want to add more nodes
• With compression, we are now back to ~600GB
• Easy to accomplish
• `nodetool upgradesstables` on a rolling schedule
Technically, everything is fine…
• Our COO feels that C* is “technically good and beautiful”, a “perfectly good option”
• Our EVPO says that C* reminds him of a time series database in use at Goldman Sachs that had “very good performance”
…but there are concerns
Keep the business informed
• Pre-launch, we were tasked with increasing resiliency
• Cassandra addressed immediate business needs, but the trade-offs involved should have been communicated more clearly
Sing from the same hymn sheet
• A senior founding engineer had doubts about the adoption of Cassandra until very recently
• In the presence of business doubt, this lack of consistency amongst developers exacerbated the concerns
• We should have made more effort to make bilateral decisions on adoption – I don’t think this would have been hard to achieve
Cassandra at Hailo
• We will continue to invest in Cassandra as we expand globally
• We will hire people with experience running Cassandra
• We will focus on expanding our reporting facilities