Scaling Databases On The Cloud
1. Scaling databases on the cloud
Deepak Anupalli
Server Architect
CLOUD COMPUTING - COMING OF AGE
A TREATISE ON REAL-LIFE USE CASES
Copyright (c) 2009, Pramati Technologies Private Limited. Imaginea is a Pramati business. All
trade names and trade marks are owned by their respective owners
11/4/2009
2. We are
• An emerging leader in product development services offering specialized services in Product Engineering, Interaction design and Test engineering.
• US Headquarters in Sunnyvale, CA; India development centers in Hyderabad and Chennai
• A 250+ strong and growing team
• A business unit of Pramati Technologies
• Rich experience in SaaS Engineering, Performance engineering, Cloud Computing, Web 2.0, sf.com integrations and managing Amazon EC2 Deployment
• Track record of delivering significant customer satisfaction
4. Application requirements
• High reliability
• Low Latency
• Dynamic Scalability
– Millions of Users
– Volumes of data
• Across the tiers
– Web
– Application
– Data
5. Our biggest challenge
• DB performance is bound by disk I/O
• Vertical scaling is an option
– Ex: PlentyOfFish.com: 512GB RAM, 32CPUs
– Expensive
– Only possible to an extent on cloud servers
6. Vertical Scaling: Limitations
• Not everything will fit in memory
• Lot of reads ~ Lot of page faults + disk seeks
• RAID 6 or RAID 10 disks
• 200MBps-1GBps is the max speed
Think Horizontal!
7. Replication
• Master-slave replication (MySQL or Oracle RAC)
• Writes on one Master
• Reads on many Slaves
• Application aware
• Works in read-mostly scenarios
• Adds Slave lag
[Diagram: Writes go to the Master; Reads go to three Slaves]
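The read/write split described on this slide can be sketched in a few lines. This is a minimal illustration, not the deck's implementation: the `ReplicatedRouter` class and the node names are hypothetical, and a real application would hold database connections rather than strings.

```python
import itertools


class ReplicatedRouter:
    """Route writes to the master and spread reads across slaves (hypothetical helper)."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master for reads when no slave is configured.
        self._read_cycle = itertools.cycle(slaves or [master])

    def node_for(self, sql):
        """Return the name of the node that should execute this statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._read_cycle)  # round-robin over the slaves
        return self.master  # INSERT/UPDATE/DELETE/DDL go to the master


router = ReplicatedRouter("master", ["slave1", "slave2", "slave3"])
routed = [router.node_for(q) for q in (
    "INSERT INTO profile VALUES (1, 'a')",
    "SELECT * FROM profile WHERE id = 1",
    "SELECT * FROM profile WHERE id = 2",
    "UPDATE profile SET name = 'b' WHERE id = 1",
)]
print(routed)
```

Because of slave lag, a read routed to a slave right after a write may return stale data; an application that needs read-your-writes behavior would have to send those reads to the master.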
8. Sharding
• Partition data across masters
• Writes and Reads are distributed
• Application is modified accordingly
• Also use replication with fewer slaves to minimize slave lag
• Choose a partitioning strategy that uniformly distributes data
[Diagram: Shard Logic above three Masters and three Slaves]
9. Sharding Schemes
• Vertical
– Profile DB, friend DB
– Not uniform
• Range based
– ID range, Location or Date based
– Not uniform
• Key or Hash based
– ID hash
– Fixed masters
• Directory
– Mapping of ID to Shard
– Single point of failure
• Shard lookup
– By table: shard_id = getShard("profile")
– By ID: shard_id = getShard(profileID)
– Query on the shard: Select * from Profile where id = ?
[Diagram labels: Corporate, Corporate, Tweets, Posts]
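The four schemes can be compared side by side with a small sketch. Everything named here is an assumption for illustration: the shard names, the range boundaries, the directory contents and the function names are hypothetical, and only the idea of a `getShard`-style lookup comes from the slide.

```python
import bisect
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # fixed set of masters (assumed names)

# Vertical: each table (or feature) lives in its own database.
VERTICAL = {"profile": "profile_db", "friend": "friend_db"}


def vertical_shard(table_name):
    return VERTICAL[table_name]


# Range based: each shard owns the IDs below its upper bound (bounds are assumed).
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000]  # shard0: < 1M, shard1: < 2M, shard2: the rest


def range_shard(profile_id):
    return SHARDS[bisect.bisect_right(RANGE_UPPER_BOUNDS, profile_id)]


def hash_shard(profile_id):
    """Key or hash based: a stable hash of the ID modulo the number of shards."""
    digest = hashlib.md5(str(profile_id).encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


# Directory: an explicit ID-to-shard mapping, here a plain in-memory dict.
DIRECTORY = {42: "shard2", 43: "shard0"}


def directory_shard(profile_id):
    return DIRECTORY[profile_id]


print(vertical_shard("profile"), range_shard(1_500_000), hash_shard(42), directory_shard(42))
```

In a real deployment the directory would itself be a service or table, which is why the slide lists it as a single point of failure.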
10. Sharding Complexities
• No Joins
– De-normalize the data
• Data Integrity
– Application should enforce integrity
• Re-shard
– Changing the sharding scheme requires re-partitioning the entire data
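To get a feel for the re-sharding cost, the sketch below counts how many keys change shard when a simple modulo scheme grows from 3 to 4 shards. The modulo scheme and the key range are assumptions chosen for illustration; the deck does not say which scheme was used.

```python
def shard_for(key, shard_count):
    """Simple modulo placement (an assumed scheme, for illustration only)."""
    return key % shard_count


keys = range(12_000)
moved = sum(1 for k in keys if shard_for(k, 3) != shard_for(k, 4))
print(f"{moved} of {len(keys)} keys change shard")
```

With this scheme three quarters of the keys have to move to a different shard, which is why a change of scheme amounts to re-partitioning nearly all of the data.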
11. De-normalization
• Recent 10 messages to a recipient
• Schema
– Messages table stores message info, including the timestamp
– Recipients table stores the recipients of each message
• Requires a Join on the Messages & Recipients tables
• De-normalize
– Store the timestamp in the Recipients table as well
[Diagram: Messages and Recipients tables; after de-normalization both carry a timestamp]
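The effect of this de-normalization can be tried out with an in-memory SQLite database. The table and column names below are assumptions made for the sketch; the slide only names the two tables and the timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT, timestamp INTEGER);
    -- De-normalized: recipients carries its own copy of the message timestamp.
    CREATE TABLE recipients (message_id INTEGER, recipient_id INTEGER, timestamp INTEGER);
""")
for msg_id in range(1, 16):
    ts = 1000 + msg_id
    conn.execute("INSERT INTO messages VALUES (?, ?, ?)", (msg_id, f"msg {msg_id}", ts))
    conn.execute("INSERT INTO recipients VALUES (?, ?, ?)", (msg_id, 7, ts))

# Normalized access path: a join is needed to order by the message timestamp.
joined = [row[0] for row in conn.execute(
    "SELECT m.id FROM messages m JOIN recipients r ON r.message_id = m.id "
    "WHERE r.recipient_id = ? ORDER BY m.timestamp DESC LIMIT 10", (7,))]

# De-normalized access path: the recipients table alone answers the query.
denormalized = [row[0] for row in conn.execute(
    "SELECT message_id FROM recipients WHERE recipient_id = ? "
    "ORDER BY timestamp DESC LIMIT 10", (7,))]

print(joined == denormalized)
```

The second query touches a single table, so it keeps working when Messages and Recipients end up on different shards. The price is that the application must write the timestamp in two places.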
12. Relationships
• When data is partitioned into shards, foreign keys become obsolete
• De-normalization avoids having relationships
• If data can’t be de-normalized further, use memcached
• But, this requires change in SQL queries
[Diagram: Application, MemCached, Shard 1, Shard 2, Shard 3]
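One way to read the memcached bullet is as a cache-aside lookup that resolves cross-shard relationships in application code instead of a SQL join. The sketch below uses an in-process `DictCache` as a stand-in for a memcached client, and the shard contents and function names are hypothetical.

```python
class DictCache:
    """In-process stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


# Hypothetical shard contents: profiles and friend links live on different shards.
PROFILE_SHARD = {1: {"name": "alice"}, 2: {"name": "bob"}}
FRIEND_SHARD = {1: [2]}
stats = {"shard_reads": 0}


def load_profile(profile_id):
    """Simulate a query against the shard that owns the profile."""
    stats["shard_reads"] += 1
    return PROFILE_SHARD[profile_id]


def get_profile(cache, profile_id):
    """Cache-aside: try the cache first, fall back to the shard, then populate the cache."""
    key = f"profile:{profile_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_profile(profile_id)
        cache.set(key, profile)
    return profile


def friends_of(cache, profile_id):
    """Resolve the relationship in application code instead of a SQL join."""
    return [get_profile(cache, fid)["name"] for fid in FRIEND_SHARD.get(profile_id, [])]


cache = DictCache()
first = friends_of(cache, 1)
second = friends_of(cache, 1)  # served from the cache, no second shard read
print(first, second, stats["shard_reads"])
```

This is the change in SQL queries the slide refers to: the single joined query is replaced by separate per-shard lookups whose results are cached.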
14. Amazon SimpleDB
• Schema-less distributed key-value store
• Highly reliable and scalable
• Automatic indexing of columns
• Querying with SQL-like syntax
• Supports multiple values for key/attribute
• Value for Money
15. Problems Addressed
• High Availability
– multiple nodes forming a ring
• Partitioning
– Consistent hashing
• Replication
– Replicated to multiple nodes
• Eventual Consistency
– Asynchronous replication of data using vector clocks
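The ring, consistent hashing and replication bullets can be illustrated together with a small hash ring. This is a generic sketch of the technique, not SimpleDB's actual implementation; the `HashRing` class, the node names and the number of virtual nodes are assumptions.

```python
import bisect
import hashlib


def _hash(value):
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        """The first node clockwise from the key's position owns the key."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

    def preference_list(self, key, replicas=2):
        """Walk clockwise and collect the first `replicas` distinct nodes."""
        idx = bisect.bisect(self._points, _hash(key))
        found = []
        for step in range(len(self._ring)):
            node = self._ring[(idx + step) % len(self._ring)][1]
            if node not in found:
                found.append(node)
            if len(found) == replicas:
                break
        return found


keys = [f"user:{i}" for i in range(2000)]
before = HashRing(["a", "b", "c"])
after = HashRing(["a", "b", "c", "d"])
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(len(moved), "keys moved, all to the new node:",
      all(after.node_for(k) == "d" for k in moved))
```

Adding node `d` only takes keys away from existing nodes; no key moves between `a`, `b` and `c`. That is the property that makes growing the ring cheaper than re-partitioning under a modulo scheme.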
16. SimpleDB adoption
• No Joins
• No transactional support
• String is the only data type
• No aggregate functions
• No full-text searches
• Limits enforced on size of results, predicates, data etc.
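Because every value is a string, comparisons are lexicographic, so numbers have to be encoded before they are stored. A common workaround is zero-padding, sketched below. The `encode_int` helper and the width of 10 digits are assumptions, and negative numbers would need an additional offset that is not shown.

```python
def encode_int(value, width=10):
    """Zero-pad a non-negative integer so string order matches numeric order."""
    if value < 0 or len(str(value)) > width:
        raise ValueError("value out of range for the chosen width")
    return str(value).zfill(width)


raw = ["9", "10", "200"]
padded = [encode_int(int(v)) for v in raw]

print(sorted(raw))     # lexicographic order puts "10" and "200" before "9"
print(sorted(padded))  # padded strings sort in numeric order
```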
17. Google BigTable
• Distributed Key-value store
• Runs on top of Google File System (GFS)
• Timestamp versioned data
• Automatic indexing of columns
18. BigTable adoption
• Google Search, Maps, Earth, Orkut, YouTube, Reader, etc.
• Google App Engine (GAE) uses BigTable as its datastore
• DataNucleus supports JPA for BigTable
• Limited transaction support
• Eventual consistency
19. Hive
• Hive is a data warehouse
• Runs on top of Hadoop Distributed File System (HDFS)
• Supports SQL-like syntax
• User defined types and functions
• Extensibility with Map-Reduce
20. Hive adoption
• Facebook uses Hive to analyze historical data of users and content
• Doesn’t support indexing of columns
• Brute force mechanism to compute analytics
21. CouchDB
• CouchDB is a document-oriented datastore
• Schema-free
• Accessible through RESTful JSON API
• Distributed with incremental replication
• Querying through JavaScript
22. Is there a solution for all?
• Different data-stores address different problem spaces
• Identify what best suits your app