Apache Cassandra at Wayin

Cassandra in the Cloud
Jamey Wood
August 28, 2013

Wayin: History
8/30/2013 2
Founded in 2011
Located in beautiful Denver, Colorado
Global clients include large corporations, sports teams, agencies, and
publishers
$20M raised
Co-founded by Scott McNealy
Twitter Certified May 2013

Wayin: Mission
Transforming Social Media into Brand Experiences

Why it Works
Marketing is becoming more reactive, and the ability to own, brand, curate, and customize relevant experiences in the moment is more valuable now than it has ever been.

How it Works
Architecture diagram (AWS components):
• Clients
• Route 53, CloudFront, S3, SQS
• ELB Load Balancer
• API Server Scaling Group: auto-scaled based on machine load
• Tracking Server Scaling Group: auto-scaled based on queue length
• DB Server Scaling Groups (Cassandra): scaled based on data volume

Challenge 1: Provisioning and Deployment
CloudFormation, Auto Scaling Groups, and the Cassandra Ring
[Diagram: CloudFormation launches three DB Auto Scaling Groups (us-east-1a, us-east-1b, us-east-1c); over time, nodes from zones 1a, 1b, and 1c populate the Cassandra ring.]

Challenge 1: Provisioning and Deployment
Pitfalls and Opportunities
• Auto Scaling Groups are helpful for automatically replacing terminated instances, but certain actions can be problematic.
• Be familiar with the as-suspend-processes options.
• Token management is important for keeping the Cassandra ring balanced and properly distributed across availability zones. It is also important to be able to bring up rings (and launch replacement servers) in a fully automated fashion.
• Netflix’s “Priam” open source tool can provide this kind of token management (and more).
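
The token management described above can be illustrated with a small planner. This is a minimal sketch, assuming RandomPartitioner's token space of 2^127 and a simple scheme that spaces tokens evenly and alternates zones around the ring. It is not Priam's actual algorithm, and the names TokenPlanner, Slot, and plan are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class TokenPlanner {
    // Size of the RandomPartitioner token space (2^127).
    static final BigInteger RING_SIZE = BigInteger.valueOf(2).pow(127);

    /** One planned node: its availability zone and its initial token. */
    static final class Slot {
        public final String zone;
        public final BigInteger token;

        Slot(String zone, BigInteger token) {
            this.zone = zone;
            this.token = token;
        }

        @Override
        public String toString() {
            return zone + " -> " + token;
        }
    }

    /**
     * Spaces tokens evenly around the ring and alternates zones, so that
     * (with more than one zone) neighboring tokens sit in different zones.
     */
    static List<Slot> plan(List<String> zones, int nodesPerZone) {
        int total = zones.size() * nodesPerZone;
        List<Slot> slots = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            // token_i = i * 2^127 / total
            BigInteger token = RING_SIZE
                    .multiply(BigInteger.valueOf(i))
                    .divide(BigInteger.valueOf(total));
            slots.add(new Slot(zones.get(i % zones.size()), token));
        }
        return slots;
    }
}
```

Calling TokenPlanner.plan with the three zones us-east-1a, us-east-1b, and us-east-1c and two nodes per zone yields six slots whose zones cycle 1a, 1b, 1c, 1a, 1b, 1c. A replacement server launched into a zone would reuse the token of the slot it replaces rather than pick a new one.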

Challenge 2: Migration
MongoDB document:
{
  "_id": "abc",
  "author": "John Doe",
  "body": "some text",
  …
}
Jackson (JSON mapping)
Cassandra rows:
id: "abc"  author: "John Doe"  data: "{ … }"
id: "def"  author: "Jane Doe"  data: "{ … }"
id: "ghi"  author: "Jim Doe"   data: "{ … }"
id: "jkl"  author: "Jill Doe"  data: "{ … }"
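
The document-to-row mapping shown on this slide can be sketched as follows. This is a minimal sketch under stated assumptions: documents are flat maps of string fields, the data column holds the fields that were not promoted to their own columns, and the small toJson helper stands in for Jackson so that the example needs only the standard library. The names DocumentRow, fromDocument, and toJson are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DocumentRow {
    final String id;      // row key, taken from the document's "_id"
    final String author;  // promoted to its own column
    final String data;    // remaining fields, kept as one JSON string

    DocumentRow(String id, String author, String data) {
        this.id = id;
        this.author = author;
        this.data = data;
    }

    /** Splits a document into dedicated columns plus an opaque JSON blob. */
    static DocumentRow fromDocument(Map<String, String> doc) {
        Map<String, String> rest = new LinkedHashMap<>(doc);
        String id = rest.remove("_id");
        String author = rest.remove("author");
        if (id == null) {
            throw new IllegalArgumentException("document has no _id");
        }
        return new DocumentRow(id, author, toJson(rest));
    }

    // Stand-in for a real JSON library such as Jackson.
    // Handles flat string maps only and escapes only quotes and backslashes.
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 1) {
                sb.append(',');
            }
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
        }
        return sb.append('}').toString();
    }

    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Keeping most fields in a single serialized column keeps the Cassandra schema small, at the cost of not being able to query on anything inside the blob.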

Challenge 3: Volatile Performance
Managing EC2 I/O
[Graph: I/O performance for 45 EC2 instances over time]
Source for EC2 I/O performance graph: http://blog.scalyr.com/2012/10/16/a-systematic-look-at-ec2-io/
Mitigation: md(4) RAID0 across ephemeral disks

Challenge 3: Volatile Performance
Client Resiliency
Astyanax Example: Configuring Latency Awareness
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;

ConnectionPoolConfigurationImpl poolConfig =
    new ConnectionPoolConfigurationImpl("MyConnectionPool")
        // Re-sort hosts per token partition every 10 seconds
        .setLatencyAwareUpdateInterval(10000)
        // Clear the latency samples every 10 seconds
        .setLatencyAwareResetInterval(10000)
        // Sort hosts if a host is more than 100% slower than the best and always
        // assign connections to the fastest host; otherwise use round robin
        .setLatencyAwareBadnessThreshold(2)
        // Use the last 100 latency samples, kept in a FIFO queue that cycles
        .setLatencyAwareWindowSize(100);
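
To make the four settings concrete, the idea behind them can be sketched in plain Java. This is a minimal illustration of windowed latency scoring with a badness threshold, written from the comments on this slide. It is not Astyanax's implementation, and the names LatencyTracker, record, score, and hostOrder are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LatencyTracker {
    private final int windowSize;
    private final double badnessThreshold;
    private final Map<String, Deque<Long>> samples = new LinkedHashMap<>();

    LatencyTracker(int windowSize, double badnessThreshold) {
        this.windowSize = windowSize;
        this.badnessThreshold = badnessThreshold;
    }

    /** Records one sample, evicting the oldest once the window is full. */
    void record(String host, long latencyMicros) {
        Deque<Long> window = samples.computeIfAbsent(host, h -> new ArrayDeque<>());
        if (window.size() == windowSize) {
            window.removeFirst();
        }
        window.addLast(latencyMicros);
    }

    /** Mean latency over the host's current window (host must have samples). */
    double score(String host) {
        return samples.get(host).stream()
                .mapToLong(Long::longValue).average().orElse(0.0);
    }

    /**
     * Returns hosts fastest-first if any host's mean exceeds
     * badnessThreshold times the best mean. Otherwise returns them in
     * insertion order, leaving plain round robin to the caller.
     */
    List<String> hostOrder() {
        List<String> hosts = new ArrayList<>(samples.keySet());
        double best = hosts.stream().mapToDouble(this::score).min().orElse(0.0);
        boolean skewed = hosts.stream()
                .anyMatch(h -> score(h) > best * badnessThreshold);
        if (skewed) {
            hosts.sort(Comparator.comparingDouble(this::score));
        }
        return hosts;
    }

    /** Clears all samples, as a periodic reset interval would. */
    void reset() {
        samples.clear();
    }
}
```

In this sketch, a caller would invoke hostOrder on the update interval and reset on the reset interval, so that a host that was slow for a while gets a fresh chance once its old samples are discarded.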

Challenge 4: Sorting
[Diagram: Cassandra ring with nodes in zones 1a, 1b, and 1c]
• Single wide rows make it easy to code sorting/slicing logic, but can lead to performance hotspots.
• A good rule of thumb is to keep individual rows below 10 MB in size [1].
• Our current solution uses “bucketed” wide rows: the data for a given sorting range is spread across multiple keys/servers and then collated during reads.
• More info:
1. http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
2. http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
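
The bucketed approach can be sketched in memory. This is a minimal sketch, not Wayin's implementation: a TreeMap per row key stands in for a Cassandra wide row with sorted columns, the bucket is chosen by hashing the item id (the slide does not say how buckets are chosen), and reads collate the buckets with a k-way merge. The names BucketedRows, insert, and readFirst are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

final class BucketedRows {
    private final int bucketCount;
    // Stand-in for Cassandra: row key -> columns sorted by sort key.
    private final Map<String, TreeMap<Long, String>> rows = new HashMap<>();

    BucketedRows(int bucketCount) {
        this.bucketCount = bucketCount;
    }

    /** Row key for one bucket of a logical sorted range, e.g. "feed:3". */
    String rowKey(String range, int bucket) {
        return range + ":" + bucket;
    }

    /** Writes one column into the bucket chosen by hashing the item id. */
    void insert(String range, long sortKey, String itemId) {
        int bucket = Math.floorMod(itemId.hashCode(), bucketCount);
        rows.computeIfAbsent(rowKey(range, bucket), k -> new TreeMap<>())
            .put(sortKey, itemId);
    }

    /** Returns the first `limit` items of the range by merging all buckets. */
    List<String> readFirst(String range, int limit) {
        // Each heap entry is the current head of one bucket's sorted iterator.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
                (a, b) -> Long.compare(a.head.getKey(), b.head.getKey()));
        for (int b = 0; b < bucketCount; b++) {
            TreeMap<Long, String> row = rows.get(rowKey(range, b));
            if (row != null) {
                Cursor c = new Cursor(row.entrySet().iterator());
                if (c.head != null) {
                    heap.add(c);
                }
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty() && out.size() < limit) {
            Cursor c = heap.poll();
            out.add(c.head.getValue());
            if (c.advance()) {
                heap.add(c);
            }
        }
        return out;
    }

    /** Iterator over one bucket that remembers its current head entry. */
    private static final class Cursor {
        final Iterator<Map.Entry<Long, String>> it;
        Map.Entry<Long, String> head;

        Cursor(Iterator<Map.Entry<Long, String>> it) {
            this.it = it;
            advance();
        }

        boolean advance() {
            head = it.hasNext() ? it.next() : null;
            return head != null;
        }
    }
}
```

The trade-off is visible in readFirst: a read touches every bucket of the range, so the bucket count bounds both the size of each row and the fan-out of each read.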

Challenge 5: Monitoring
Nagios Reports
Nagios Report: RecentReadLatency

Challenge 5: Monitoring
Nagios Setup
Monitor Cassandra using the JMX Nagios plugin / NRPE (Nagios Remote Plugin Executor)
http://wiki.apache.org/cassandra/JmxInterface
Example: ColumnFamilies/RecentReadLatencyMicros for the some_table table
check_jmx -U service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi \
  -O org.apache.cassandra.db:columnfamily=some_table,keyspace=some_keyspace,type=ColumnFamilies \
  -A RecentReadLatencyMicros
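
The same attribute can also be read directly over JMX, which is handy for checking what the Nagios plugin should see. This is a minimal sketch using only the standard javax.management API. It assumes an unauthenticated JMX endpoint and the MBean name used on this slide, and the names ReadLatencyProbe, columnFamilyBean, and recentReadLatencyMicros are hypothetical.

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

final class ReadLatencyProbe {

    /** Builds the per-table MBean name in the form shown on the slide. */
    static ObjectName columnFamilyBean(String keyspace, String table) {
        try {
            return new ObjectName("org.apache.cassandra.db:type=ColumnFamilies"
                    + ",keyspace=" + keyspace
                    + ",columnfamily=" + table);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("invalid keyspace or table name", e);
        }
    }

    /** Connects to a node's JMX port and reads RecentReadLatencyMicros. */
    static double recentReadLatencyMicros(String host, int port,
                                          String keyspace, String table)
            throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        // JMXConnector is Closeable, so the connection is released on exit.
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            Object value = mbeans.getAttribute(
                    columnFamilyBean(keyspace, table), "RecentReadLatencyMicros");
            return ((Number) value).doubleValue();
        }
    }
}
```

With a node listening on the slide's address, ReadLatencyProbe.recentReadLatencyMicros("127.0.0.1", 7199, "some_keyspace", "some_table") would return the value that the Nagios check compares against its warning and critical thresholds.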

Challenge 6: We’re Hiring!
Looking for great developers to work with Cassandra (amongst other things)
http://www.wayin.com/about-us/careers
Senior Software Engineer
Work with great people and great technologies:
• Cassandra
• JVM
• Jetty
• Jersey
• Jackson
• AWS
Vice President of Sales
Work with great brands and agencies:
• Denver Broncos
• Atlanta Falcons
• St. Louis Rams
• San Jose Sharks
• Chevrolet
• Bank of America
• Turtle Wax
