Speaker: Jean Armel Luce — Senior Software Engineer/Cassandra Admin at Orange
Video: http://www.youtube.com/watch?v=mefOE9K7sLI&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=28
At Orange, Jean Armel has helped develop an open source tool for migrating data to Cassandra. Jean and his team needed the NoSQL solution Apache Cassandra to sustain the growth in requests and volume of data required by their application PnS. In this session, Jean Armel starts with an overview of the Orange application PnS, then dives into why they chose Apache Cassandra and how they migrated their data without any interruption of service. Jean Armel also shows how the application behaves after the migration.
C* Summit EU 2013: The Cassandra Experience at Orange
1. The Cassandra Experience at Orange
Project PnS 3.0
Jean Armel Luce
Orange France/DSIF/DF/SDF
V1.0
2. Summary
§ Short description of PnS. Why did we choose C* ?
§ Our migration strategy
§ After the migration …
§ Analytics with Hadoop/Pig/Hive over Cassandra
§ Contributions and open sourced modules from Orange & conclusions
Jean Armel Luce - Orange-DOP-PnS 3.0
Cassandra Summit Europe – October 17 2013
4. PnS – Short description
§ PnS means Profiles and Syndication : PnS is a highly available service for collecting and serving live data about Orange customers
§ End users of PnS are :
– Orange customers (logged in to the Portal www.orange.fr)
– Sellers in Orange shops
– Some services in Orange (advertisements, …)
5. PnS – The Big Picture
[Diagram] End users send millions of HTTP requests (REST or SOAP) to a fast and highly available web service that gets or sets data stored by PnS (postProcessing(data1), postProcessing(data2), postProcessing(data3), postProcessing(datax), …); the web service performs R/W operations against the database. Data providers deliver thousands of files (CSV or XML) through scheduled data injection.
6. PnS2 – Architecture
§ 2-DC architecture for high availability : Bagnolet and Sophia Antipolis
§ Until 2012, data were stored in 2 different backends :
ü MySQL cluster (for volatile data)
ü PostgreSQL « cluster » (sharding and replication)
§ [Diagram] Web services (reads and writes) and batch updates feed both datacenters
7. Timeline – Key dates of PnS 3.0
• Study phase (2010 to 2012)
We did a large study of a few NoSQL databases (Cassandra, MongoDB, Riak, HBase, Hypertable, …)
è We chose Cassandra as the single backend for PnS
• Design phase (06/2012 – 09/2012)
We started the design phase of PnS 3.0
• Proof of Concept
We started a 1st (small) Cassandra cluster in production for a non-critical application : 1 table, key-value access
• Production phase (04/2013)
Migration of the 1st subset of PnS data from the MySQL cluster to Cassandra in production
• Complete migration (05/2013 to 12/2013)
Migration of all other subsets of data from the MySQL cluster and PostgreSQL to Cassandra
Addition of new nodes in the cluster (from 8 nodes in each DC to 16 nodes in each DC)
Addition of a 3rd datacenter for analytics
8. PnS – Why did we choose Cassandra ?
§ Cassandra fits our requirements :
– Very high availability : PnS2 = 99.95% availability; we want to improve it !!!
– Low latency : 20 ms < RT of the PnS2 web service < 150 ms; we want to improve it !!!
– Scalability : higher load, higher volume in the next years ? Unpredictable; better scalability brings new businesses
§ And also :
– Ease of use : Cassandra is easy to administrate and operate
– Some features that I like (rack awareness, CL per request, …)
– Cassandra is very efficient for simple requests :
« SELECT mycol1, mycol2, …, mycolx FROM mytable WHERE myprimarykey = 'mycustomerid' »
10. The migration - Input
§ During the migration, we need to :
§ maintain a very high availability
§ maintain (or lower) the latency
§ guarantee no functional regression
§ Question : how can we migrate the data to Cassandra without any interruption of service ?
11. The migration : Step by step processing
§ Subdivision of data into many subsets according to many criteria :
§ Same source of data
§ Relationships between data
§ Definition of a generic process for all the subsets
§ And then, migrate each subset 1 by 1 :
Subdivision into subsets → goto 1st subset → Migration of the data of the subset → Check/validation of the migration → Switch queries to Cassandra for the subset → goto next subset
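The subset-by-subset process can be sketched as a simple loop. This is an illustration only, not Orange's actual tooling: `migrate`, `validate`, `switch_reads` and `rollback` are hypothetical stand-ins for the real migration steps.

```python
# Toy model of the generic per-subset migration process.
# All callables are hypothetical stand-ins for the real tooling.
def migrate_all(subsets, migrate, validate, switch_reads, rollback):
    switched = []
    for subset in subsets:           # goto 1st subset, then the next one
        migrate(subset)              # copy the subset's data to Cassandra
        if validate(subset):         # check / validation of the migration
            switch_reads(subset)     # switch queries to Cassandra
            switched.append(subset)
        else:
            rollback(subset)         # errors found: keep the legacy backend
    return switched

# Pretend one subset fails its validation checks:
done = migrate_all(
    ["profiles", "syndication"],
    migrate=lambda s: None,
    validate=lambda s: s != "syndication",
    switch_reads=lambda s: None,
    rollback=lambda s: None,
)
assert done == ["profiles"]
```

The point of the generic process is that a failed validation only rolls back one subset; everything already switched keeps running on Cassandra.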
12. The migration : Tools and tips
§ The strategy of migration is based on 2 main facilities :
§ mod_dup
An Apache module developed (and open sourced) by Orange teams.
mod_dup can duplicate web requests, filter them on some criteria, substitute characters (regexp), and send the duplicated requests to another pool of web servers.
Used to fill the legacy (relational) database and the Cassandra database simultaneously during the migration of the subset.
[Diagram : HTTP requests reach PNS 2; mod_dup duplicates them towards PNS 3]
§ the timestamp management by Cassandra
Each data item stored in C* is timestamped.
It is possible to set this timestamp when inserting/updating/deleting a data item in Cassandra.
When Cassandra retrieves a data item, it returns the value having the most recent timestamp.
We use this feature to distinguish the values stored before the migration started from the values inserted during or after the migration.
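Cassandra exposes this through CQL's `USING TIMESTAMP` clause on inserts, updates and deletes, and reconciles concurrent versions by last-write-wins. The toy model below (not Cassandra code) shows why stamping the bulk copy with the extraction start date is safe: a live double-feed write, which carries a newer timestamp, always beats the batch-loaded snapshot, whatever order the two arrive in.

```python
# Toy model of Cassandra's last-write-wins reconciliation:
# the value with the highest timestamp wins.
def reconcile(versions):
    """versions: list of (timestamp, value); return the winning value."""
    return max(versions, key=lambda tv: tv[0])[1]

# A live double-feed write lands at t=1000 during the migration...
live_write = (1000, "updated-by-user")
# ...then the batch injector copies the old snapshot, deliberately
# stamped with the extraction start date t=900 (older on purpose):
batch_copy = (900, "old-snapshot-value")

# Whatever the arrival order, the live update wins:
assert reconcile([batch_copy, live_write]) == "updated-by-user"
assert reconcile([live_write, batch_copy]) == "updated-by-user"
```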
13. The migration : initial state (Step 0)
[Diagram] End users send HTTP Rest/Soap reads and writes to the WebServer, which performs SQL reads/writes against the PnS2 DB. Data providers transfer files via FTP or CFT to the BatchInjector.
14. The migration : double feed (Step 1)
[Diagram] mod_dup duplicates the HTTP update streams from end users towards the PnS3 WebServer, which issues CQL updates to the Cassandra DB. The streams (files) from data providers are duplicated towards a second BatchInjector feeding PnS3.
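The real mod_dup is an Apache C module; the sketch below is only a toy model of the double-feed idea it implements: every update is served by the legacy backend as before, and updates matching a filter are additionally forwarded to the new pool. `send_legacy` and `send_new` are hypothetical stand-ins for the two backends.

```python
# Toy model of mod_dup's double feed (illustration only).
def duplicate(request, send_legacy, send_new, should_duplicate):
    resp = send_legacy(request)      # end users still get PnS2's answer
    if should_duplicate(request):    # filter on some criteria
        send_new(request)            # duplicated stream feeds PnS3
    return resp

seen = []
resp = duplicate(
    {"path": "/set", "body": "x"},
    send_legacy=lambda r: "ok-pns2",
    send_new=lambda r: seen.append(r["path"]),
    should_duplicate=lambda r: r["path"].startswith("/set"),
)
assert resp == "ok-pns2"     # the response still comes from the legacy side
assert seen == ["/set"]      # and the write also reached the new backend
```

Because the client only ever sees the legacy response, duplication is invisible to end users during the migration.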
15. The migration : copy data from PnS2 to PnS3 (Step 2)
[Diagram] While mod_dup keeps duplicating the HTTP writes, batch injection bulk-loads the extracted PnS2 data into the PnS3 Cassandra DB, with Timestamp = start date of the extraction.
16. The migration : control (Step 3)
[Diagram] The double feed continues (HTTP writes through mod_dup, files through the BatchInjectors). A synchro control compares the data in the legacy DB (via SQL) with the data in the PnS3 Cassandra DB (via CQL).
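The synchro control step can be sketched as a key-by-key comparison of the two backends. This is a hypothetical illustration, not the actual control tool: `fetch_sql` and `fetch_cql` stand in for real client calls against the legacy database and Cassandra.

```python
# Toy model of the synchro control: for a sample of keys, read the
# same row from both backends and report mismatches.
def control(keys, fetch_sql, fetch_cql):
    return [k for k in keys if fetch_sql(k) != fetch_cql(k)]

legacy = {"cust1": "a", "cust2": "b", "cust3": "c"}
cassandra = {"cust1": "a", "cust2": "B", "cust3": "c"}

# One divergent row is flagged; it would block the switch for this subset.
assert control(sorted(legacy), legacy.get, cassandra.get) == ["cust2"]
```

Any mismatch means the subset stays on the legacy backend (and can be re-migrated), which is what makes the control phase a safe gate before switching reads.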
17. The migration : switch reads (Step 4)
[Diagram] 100 % of reads are now served by Cassandra : end users' HTTP Rest/Soap read requests go to the PnS3 WebServer, while HTTP writes are still duplicated by mod_dup to both the PnS2 DB and the PnS3 Cassandra DB. Data providers' files (FTP or CFT) are still injected by both BatchInjectors.
18. The migration : stop double feed (Step 5)
[Diagram] 100 % of reads and 100 % of writes now go to Cassandra, for both HTTP requests and data injection : end users' HTTP Rest/Soap read/write requests are served by the PnS3 WebServer, and data providers' files are injected only by the PnS3 BatchInjector.
19. The migration
§ Using this procedure :
§ We can migrate to Cassandra without any interruption of service
§ During the control phase, we can take time (a few days, a few weeks) to check that everything is OK before switching to Cassandra
§ It is possible to easily roll back the migration of a subset if errors are found during the control phase, without losing any update
§ It is possible to switch progressively to Cassandra rather than doing a one-shot switch
§ Caveat : this doesn't work if the queries are not idempotent
§ After the migration, we can easily duplicate production requests (entirely or partially) and send them to a bench platform thanks to mod_dup
21. The latency
§ Comparison before/after the migration to Cassandra
§ Some graphs about the latency of the web services are very explicit :
[Graphs : « Service push mail » and « Service push webxms », with the dates of the migration to C* marked]
22. The latency
§ Read and write latencies are now in microseconds on the data nodes
Thanks to … and … [vendor logos in the original slide]
§ This latency will be improved by (tests in progress) :
ALTER TABLE syndic WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : ?? };
23. The availability
• We got a few hardware failures and network outages
• No impact on QoS :
• no error returned by the application
• no real impact on latency
24. The scalability
• PnS activity is always increasing (volume of data and requests/sec)
• How to measure the capacity of a cluster ?
Capacity of a C* cluster = capacity of a node * number of nodes
(true if all nodes are identical)
• There are 2 ways to deal with the expansion of activity :
Ø scale up (add more resources such as CPU, disks, RAM to each node)
Ø scale out (add new nodes to the cluster)
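The capacity rule of thumb is easy to turn into arithmetic. The per-node throughput below is a hypothetical figure, not one from the talk; the deck only gives the node counts (8, then 16 per DC).

```python
# The slide's rule of thumb:
# cluster capacity = per-node capacity * number of nodes
# (only valid when all nodes are identical).
def cluster_capacity(node_capacity_rps, n_nodes):
    return node_capacity_rps * n_nodes

# Hypothetical per-node throughput of 1_250 requests/sec:
assert cluster_capacity(1_250, 16) == 20_000
# Scaling out from 8 to 16 nodes per DC doubles the DC's capacity:
assert cluster_capacity(1_250, 16) == 2 * cluster_capacity(1_250, 8)
```

This linearity is exactly what makes scale-out attractive: doubling the node count doubles capacity, with no per-node hardware change.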
25. The scalability
• Thanks to vnodes (available since Cassandra 1.2), it is easy to scale out
• With NetworkTopologyStrategy, make sure to distribute the nodes evenly across the racks
27. Basic architecture of the Cassandra cluster
§ Cluster without Hadoop : 2 datacenters, 16 nodes in each DC
§ RF (DC1, DC2) = (3, 3)
§ Requests from web servers in DC1 are sent to C* nodes in DC1
§ Requests from web servers in DC2 are sent to C* nodes in DC2
[Diagram : pool of web servers in DC1 → C* nodes in DC1; pool of web servers in DC2 → C* nodes in DC2]
28. Architecture of the Cassandra cluster with the datacenter for analytics
§ Cluster with Hadoop : 3 datacenters, 16 nodes in DC1, 16 nodes in DC2, 4 nodes in DC3
§ RF (DC1, DC2, DC3) = (3, 3, 1)
§ Because RF = 1 in DC3, we need less storage space in this datacenter
§ We favor cheaper disks (SATA) in DC3 rather than SSDs or FusionIO cards
§ Works better with the HSHA Thrift server (tests in progress)
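The storage argument follows directly from the replication factors: the bytes a DC stores scale with that DC's RF. A back-of-the-envelope model (ignoring compression and overhead), using the 6 TB volume quoted later in the deck for the end of 2013:

```python
# Stored bytes per DC scale with the DC's replication factor.
def dc_storage_tb(logical_tb, rf):
    return logical_tb * rf

rf = {"DC1": 3, "DC2": 3, "DC3": 1}
logical_tb = 6  # volume at the end of 2013, per the deck's appendix

usage = {dc: dc_storage_tb(logical_tb, f) for dc, f in rf.items()}
assert usage == {"DC1": 18, "DC2": 18, "DC3": 6}
assert usage["DC3"] == usage["DC1"] / 3   # a third of the space
```

Needing only a third of the storage (and no latency-critical reads) is what makes cheap SATA disks a reasonable fit for the analytics DC.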
29. Architecture of the Cassandra cluster with the datacenter for analytics
[Diagram : pool of web servers in DC1 → C* DC1; pool of web servers in DC2 → C* DC2; DC3 holds the analytics nodes]
31. Contributions and open sourced modules
§ Open sourced by Orange
§ PHP driver for Cassandra :
https://github.com/Orange-OpenSource/YACassandraPDO
Thanks to Sandro Lex & Mathieu Lornac
§ mod_dup (migration to Cassandra) :
§ https://github.com/Orange-OpenSource/mod_dup
Thanks to Jonas Wustrack & Emmanuel Courreges
§ Other contributions
§ C driver (libdbi driver) :
http://libdbi-drivers.cvs.sourceforge.net/viewvc/libdbi-drivers/libdbidrivers/?pathrev=Branch-2012-07-02-cassandra
Thanks to Emmanuel Courreges
32. Conclusions
§ With Cassandra, we have improved our QoS
§ We are able to open our service to new opportunities
§ There is an ecosystem around C* (Hadoop, Hive, Pig, Storm, Shark, …), which offers more capabilities. However, we would love to have some of the components (Hive) integrated into the C* core (as Pig is)
§ PnS3 works better and, hopefully, cheaper than PnS2
33. Thank you
35. A few answers about hardware / OS version / Java version / Cassandra version
§ Hardware : 16 nodes in each DC at the end of 2013 :
§ 6 CPU Intel® Xeon® 2.00 GHz
§ 24 GB RAM
§ FusionIO 320 GB MLC
§ OS :
§ Ubuntu Precise (12.04 LTS)
§ Cassandra version :
§ 1.2.2 (with a few patches backported from 1.2.3)
§ Java version :
§ Java7u7 : not recommended, upgrade scheduled soon
36. A few answers about data and requests
§ Data types :
§ elementary types : boolean, integer, string, date
§ complex types : json, xml (between 1 and 20 KB)
§ collection types
§ Volume : 6 TB at the end of 2013
§ Requests :
§ 10,000 requests/sec at the end of 2013
§ 80% get
§ 20% set
§ Consistency level used by PnS :
§ ONE (95% of the queries)
§ LOCAL_QUORUM (5% of the queries)
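The split between ONE and LOCAL_QUORUM comes down to simple quorum arithmetic, sketched here for the deck's setup (RF = 3 in each serving DC):

```python
# LOCAL_QUORUM waits for floor(rf/2) + 1 replicas in the local DC;
# CL ONE waits for a single replica.
def local_quorum(rf):
    return rf // 2 + 1

assert local_quorum(3) == 2   # 2 of the 3 local replicas must answer
assert local_quorum(5) == 3

# With CL ONE (95% of PnS queries) a single replica answers, which
# favours latency; LOCAL_QUORUM (5%) waits for 2 of 3 local replicas,
# trading some latency for stronger read/write guarantees without
# ever crossing datacenters.
```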