Scaling databases on the cloud

Deepak Anupalli
Server Architect

CLOUD COMPUTING - COMING OF AGE
A TREATISE ON REAL-LIFE USE CASES

Copyright (c) 2009, Pramati Technologies Private Limited. Imaginea is a Pramati business. All
trade names and trade marks are owned by their respective owners.
11/4/2009
We are
 •   An emerging leader in product development services, offering specialized
     services in product engineering, interaction design and test engineering
 •   US headquarters in Sunnyvale, CA; India development centers in
     Hyderabad and Chennai
 •   A 250+ strong and growing team
 •   A business unit of Pramati Technologies
 •   Rich experience in SaaS engineering, performance engineering, cloud
     computing, Web 2.0, sf.com integrations and managing Amazon EC2 deployments
 •   A track record of delivering significant customer satisfaction
Initiatives in Cloud
• Dekoh:
  http://www.dekoh.com
• SocialTwist:
  http://www.socialtwist.com
• MyPicks Beijing 2008:
  http://apps.new.facebook.com/mypicksbeijing/Home
• Qontext:
  http://www.qontext.com
Application requirements

• High reliability
• Low Latency
• Dynamic Scalability
   – Millions of Users
   – Volumes of data
• Across the tiers
   – Web
   – Application
   – Data
Our biggest challenge

• Database performance is bound by disk I/O
• Vertical scaling is an option
   – Example: PlentyOfFish.com, 512 GB RAM, 32 CPUs
   – Expensive
   – Only possible to an extent on cloud servers
Vertical Scaling: Limitations
  • Not everything will fit in memory
  • Many reads mean many page faults and disk seeks
  • RAID 6 or RAID 10 disks
  • Disk throughput tops out at about 200 MBps to 1 GBps

         Think horizontal!
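To make the throughput ceiling concrete, here is a rough back-of-the-envelope calculation. Only the 200 MBps and 1 GBps figures come from the slide; the 1 TB data set size and the function name are assumptions chosen for illustration.

```python
# Rough estimate: how long one full sequential read of a data set takes
# at the disk throughput range quoted on the slide.

MB = 10**6
GB = 10**9
TB = 10**12

DATASET_BYTES = 1 * TB  # assumed size, not taken from the slides


def full_scan_seconds(dataset_bytes: int, throughput_bytes_per_s: int) -> float:
    """Time to read the whole data set once at a sustained throughput."""
    return dataset_bytes / throughput_bytes_per_s


slow = full_scan_seconds(DATASET_BYTES, 200 * MB)  # 5000.0 seconds
fast = full_scan_seconds(DATASET_BYTES, 1 * GB)    # 1000.0 seconds

print(f"200 MBps: {slow:.0f} s, 1 GBps: {fast:.0f} s")
```

Even at the top of the range, a single machine needs many minutes for one pass over a data set of this assumed size, which is the motivation for spreading the data over several machines.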
Replication
 • Master-slave replication (MySQL or Oracle RAC)
 • Writes on one master
 • Reads on many slaves
 • Application aware
 • Works in read-mostly scenarios
 • Adds slave lag

 [Diagram: writes go to a single master, which replicates to three slaves; reads are served by the slaves]
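"Application aware" means the application decides where each statement goes. A minimal sketch of that routing, assuming a hypothetical `ReplicatedRouter` class and plain strings standing in for real database connections:

```python
import itertools


class ReplicatedRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin

    def connection_for(self, sql: str):
        # Assumption: anything that is not a SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master


router = ReplicatedRouter("master", ["slave-1", "slave-2", "slave-3"])
targets = [
    router.connection_for("SELECT * FROM Profile WHERE id = 1"),
    router.connection_for("UPDATE Profile SET name = 'x' WHERE id = 1"),
    router.connection_for("SELECT * FROM Profile WHERE id = 2"),
]
print(targets)
```

Because of slave lag, a read issued right after a write may not see that write yet. One common workaround, not covered by the slides, is to send such reads to the master for a short time.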
Sharding
 • Partition data across masters
 • Writes and reads are distributed
 • Application is modified accordingly
 • Also use replication, with fewer slaves, to minimize slave lag
 • Choose a partitioning strategy that distributes data uniformly

 [Diagram: shard logic routes requests to three masters, each with its own slave]
Sharding Schemes
 •   Vertical
      – Profile DB, friend DB
      – Not uniform
 •   Range based
      – ID range, location or date based
      – Not uniform
 •   Key or hash based
      – ID hash
      – Fixed masters
 •   Directory
      – Mapping of ID to shard
      – Single point of failure

 Lookup examples from the slide:
      shard_id = getShard("profile")
      shard_id = getShard(profileID)
      Select * from Profile where id = ?

 [Diagram labels: Corporate, Corporate; Tweets, Posts]
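The four schemes differ only in how a shard is chosen. The sketch below shows one possible lookup for each; the shard names, range bounds, table names and directory entries are all made up for illustration, and the functions are hypothetical stand-ins for the `getShard` call on the slide.

```python
import bisect
import zlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # fixed set of masters (assumed)

# Vertical: each table (or feature) lives on its own shard.
VERTICAL = {"profile": "shard-0", "friend": "shard-1"}


def vertical_shard(table: str) -> str:
    return VERTICAL[table]


# Range based: IDs below the first bound go to shard-0, and so on.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000]


def range_shard(profile_id: int) -> str:
    return SHARDS[bisect.bisect_right(RANGE_UPPER_BOUNDS, profile_id)]


# Key or hash based: a stable hash of the ID modulo the shard count.
def hash_shard(profile_id: int) -> str:
    digest = zlib.crc32(str(profile_id).encode("utf-8"))
    return SHARDS[digest % len(SHARDS)]


# Directory: an explicit ID-to-shard mapping. The directory itself must be
# highly available, otherwise it is the single point of failure.
DIRECTORY = {42: "shard-2", 43: "shard-0"}


def directory_shard(profile_id: int) -> str:
    return DIRECTORY[profile_id]


print(vertical_shard("profile"), range_shard(1_500_000), directory_shard(42))
```

`zlib.crc32` is used instead of Python's built-in `hash()` because the built-in is randomized per process for strings, and shard placement has to be stable across restarts.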
Sharding Complexities
 •   No joins
      – De-normalize the data
 •   Data integrity
      – Application should enforce integrity
 •   Re-shard
      – Changing the sharding scheme requires re-partitioning the entire data set
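The cost of re-sharding can be estimated with a small experiment. This sketch assumes the simplest hash scheme, `id % shard_count`, and counts how many keys land on a different shard when a fourth shard is added; the function name and key range are illustrative only.

```python
def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes under modulo placement."""
    moved = sum(1 for k in keys if k % old_shards != k % new_shards)
    return moved / len(keys)


keys = range(12_000)
fraction = moved_fraction(keys, old_shards=3, new_shards=4)
print(fraction)  # 0.75
```

Going from three to four shards relocates three quarters of the rows, even though the new shard should ideally receive only one quarter of them.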
De-normalization
 • Use case: the 10 most recent messages to a recipient
 • Schema
    – Messages table stores message info, including the timestamp
    – Recipients table stores
    – Requires a join on the Messages and Recipients tables
 • De-normalize
    – Store the timestamp in the Recipients table as well

 [Diagram: before, only Messages carries timestamp; after, both Messages and Recipients carry timestamp]
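The effect of copying the timestamp can be tried out with an in-memory SQLite database. The column names (`id`, `body`, `message_id`, `recipient_id`) are assumptions; the slides only name the two tables and the `timestamp` field.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Messages   (id INTEGER PRIMARY KEY, body TEXT, timestamp INTEGER);
    -- De-normalized: timestamp is copied into Recipients as well.
    CREATE TABLE Recipients (message_id INTEGER, recipient_id INTEGER, timestamp INTEGER);
""")
for i in range(1, 16):
    db.execute("INSERT INTO Messages VALUES (?, ?, ?)", (i, f"msg {i}", 1000 + i))
    db.execute("INSERT INTO Recipients VALUES (?, ?, ?)", (i, 7, 1000 + i))

# Normalized form: ordering by the message timestamp needs a join.
joined = db.execute("""
    SELECT m.id FROM Messages m JOIN Recipients r ON r.message_id = m.id
    WHERE r.recipient_id = ? ORDER BY m.timestamp DESC LIMIT 10
""", (7,)).fetchall()

# De-normalized form: the Recipients table alone answers the question.
flat = db.execute("""
    SELECT message_id FROM Recipients
    WHERE recipient_id = ? ORDER BY timestamp DESC LIMIT 10
""", (7,)).fetchall()

print(joined == flat)  # True
```

Once the two tables live on different shards, only the second query can still run against a single database. The price is that the application must keep both timestamp copies in sync.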
Relationships

• When data is partitioned into shards, foreign keys become obsolete
• De-normalization avoids having relationships
• If data can’t be de-normalized further, use memcached
• But this requires changes to the SQL queries

[Diagram: Application sits above memcached, which sits above Shard 1, Shard 2 and Shard 3]
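One way to read the memcached bullet is as a cache-aside lookup that replaces a cross-shard join. The sketch below uses an in-process `FakeMemcached` class as a stand-in for a real memcached client, and toy dictionaries in place of shards; every name in it is hypothetical.

```python
class FakeMemcached:
    """In-process stand-in with get/set, in place of a real memcached client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


SHARDS = [{1: "alice"}, {2: "bob"}, {3: "carol"}]  # toy shards: id -> name
cache = FakeMemcached()
stats = {"shard_reads": 0}


def load_profile(profile_id: int):
    """Cache-aside lookup: try the cache, fall back to the owning shard."""
    key = f"profile:{profile_id}"
    value = cache.get(key)
    if value is None:
        stats["shard_reads"] += 1
        shard = SHARDS[(profile_id - 1) % len(SHARDS)]
        value = shard.get(profile_id)
        cache.set(key, value)
    return value


def names_for(friend_ids):
    # Application-side "join": resolve each related ID individually.
    return [load_profile(fid) for fid in friend_ids]


first = names_for([1, 2, 3])
second = names_for([3, 1])  # served entirely from the cache
print(first, second, stats["shard_reads"])
```

This is the sense in which the SQL has to change: a single joined query becomes several single-key lookups that the application stitches together.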
Cloud Databases/Data stores

•   Amazon SimpleDB
•   Google BigTable
•   Apache HBase
•   Facebook/Apache Hive
•   CouchDB
•   Cassandra
•   Many more…
Amazon SimpleDB
•   Schema-less distributed key-value store
•   Highly reliable and scalable
•   Automatic indexing of columns
•   Querying with SQL-like syntax
•   Supports multiple values for key/attribute
•   Value for Money
Problems Addressed
• High Availability
   – multiple nodes forming a ring
• Partitioning
   – Consistent hashing
• Replication
   – Replicated to multiple nodes
• Eventual Consistency
   – Asynchronous replication of data using vector clocks
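Consistent hashing is what lets a ring of nodes grow without re-partitioning everything. The following is a minimal sketch of the general technique, not of SimpleDB's internals; the `HashRing` class, the node names and the choice of 100 virtual points per node are assumptions.

```python
import bisect
import hashlib


def _point(value: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes: int = 100):
        # Each node gets several virtual points to even out the load.
        self._ring = sorted(
            (_point(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First point clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _point(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(1000)]
before = HashRing(["a", "b", "c"])
after = HashRing(["a", "b", "c", "d"])

moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
only_to_new_node = all(after.node_for(k) == "d" for k in moved)
print(len(moved), only_to_new_node)
```

Every key that moves ends up on the new node `d`, and the remaining keys stay where they were. With modulo placement, by contrast, most keys change shards when the shard count changes.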
SimpleDB adoption

•   No Joins
•   No transactional support
•   String is the only data type
•   No aggregate functions
•   No full-text searches
•   Limits enforced on size of results, predicates, data etc.
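Because every value is a string, comparisons are lexicographic, so numbers need an encoding that sorts correctly. A common workaround is zero-padding to a fixed width; the width of 10 digits and the helper name below are assumptions.

```python
WIDTH = 10  # assumed maximum number of digits


def encode_int(n: int) -> str:
    """Zero-pad a non-negative integer so string order matches numeric order."""
    if n < 0 or n >= 10**WIDTH:
        raise ValueError("value out of range for fixed-width encoding")
    return str(n).zfill(WIDTH)


values = [9, 100, 25]
plain = sorted(str(v) for v in values)        # ['100', '25', '9']
padded = sorted(encode_int(v) for v in values)
print(plain, padded)
```

The padded strings sort as 9, 25, 100, so range predicates on the stored attribute behave like numeric comparisons.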
Google BigTable
•   Distributed Key-value store
•   Runs on top of Google File System (GFS)
•   Timestamp versioned data
•   Automatic indexing of columns
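"Timestamp versioned" means a cell can hold several values, each tagged with the time it was written. The toy class below illustrates that idea in memory only; it is not BigTable's API, and the row and column names are invented.

```python
from collections import defaultdict


class VersionedTable:
    """Toy map from (row, column) to a list of timestamped values."""

    def __init__(self):
        self._cells = defaultdict(list)  # (row, column) -> [(timestamp, value)]

    def put(self, row, column, value, timestamp):
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0])  # oldest first

    def get(self, row, column, at=None):
        """Newest value, or the newest one written at or before `at`."""
        versions = self._cells.get((row, column), [])
        candidates = [v for v in versions if at is None or v[0] <= at]
        return candidates[-1][1] if candidates else None


table = VersionedTable()
table.put("com.example", "contents:html", "<v1>", timestamp=1)
table.put("com.example", "contents:html", "<v2>", timestamp=5)

latest = table.get("com.example", "contents:html")
as_of_3 = table.get("com.example", "contents:html", at=3)
print(latest, as_of_3)
```

Reading without a timestamp returns the newest version, while reading "as of" time 3 returns the older one.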
BigTable adoption
• Google Search, Maps, Earth, Orkut, YouTube,
  Reader, etc.
• Google App Engine (GAE) uses BigTable as its
  datastore
• DataNucleus supports JPA for BigTable
• Limited transaction support
• Eventual consistency
Hive
 • Hive is a data warehouse
 • Runs on top of the Hadoop Distributed
   File System (HDFS)
 • Supports SQL-like syntax
 • User defined types and functions
 • Extensibility with Map-Reduce
Hive adoption
 • Facebook uses Hive to analyze historical
   data of users and content
 • Doesn’t support indexing of columns
 • Brute force mechanism to compute
   analytics
CouchDB
•   CouchDB is a document-oriented datastore
•   Schema-free
•   Accessible through RESTful JSON API
•   Distributed with incremental replication
•   Querying through JavaScript
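With a RESTful JSON API, storing a document is an HTTP `PUT` of a JSON body to `/{database}/{document id}`. The sketch below only builds such a request and does not send it; the base URL (CouchDB's default port on localhost), the database name and the document are assumptions.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumed local CouchDB address


def put_document_request(db: str, doc_id: str, doc: dict) -> urllib.request.Request:
    """Build (but do not send) a request that stores a JSON document."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{db}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = put_document_request("messages", "msg-1", {"to": "alice", "body": "hi"})
print(req.get_method(), req.full_url)
# Sending it against a running server would be: urllib.request.urlopen(req)
```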
Is there a solution for all?


• Different data-stores address different problem spaces
• Identify what best suits your app
Thank You
   deepak@pramati.com



http://hysea.in
Copyright © 2009, Imaginea Inc. Not to be distributed or communicated without permission.

Más contenido relacionado

La actualidad más candente

Geek Sync | Designing Data Intensive Cloud Native Applications
Geek Sync | Designing Data Intensive Cloud Native ApplicationsGeek Sync | Designing Data Intensive Cloud Native Applications
Geek Sync | Designing Data Intensive Cloud Native ApplicationsIDERA Software
 
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...Cloudera, Inc.
 
2015 GHC Presentation - High Availability and High Frequency Big Data Analytics
2015 GHC Presentation - High Availability and High Frequency Big Data Analytics2015 GHC Presentation - High Availability and High Frequency Big Data Analytics
2015 GHC Presentation - High Availability and High Frequency Big Data AnalyticsEsther Kundin
 
HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily ...
HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily ...HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily ...
HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily ...Cloudera, Inc.
 
That ORM is Lying to You
That ORM is Lying to YouThat ORM is Lying to You
That ORM is Lying to YouRonen Botzer
 
MySql to HBase in 5 Steps
MySql to HBase in 5 StepsMySql to HBase in 5 Steps
MySql to HBase in 5 StepsScott Cinnamond
 
NoSQL and The Big Data Hullabaloo
NoSQL and The Big Data HullabalooNoSQL and The Big Data Hullabaloo
NoSQL and The Big Data HullabalooAndrew Brust
 
12 SQL On-Hadoop Tools
12 SQL On-Hadoop Tools12 SQL On-Hadoop Tools
12 SQL On-Hadoop ToolsXplenty
 
Big data Intro by Kaushik Dutta
Big data Intro by Kaushik DuttaBig data Intro by Kaushik Dutta
Big data Intro by Kaushik DuttaKaushik Dutta
 
Real-Time Queries in Hadoop w/ Cloudera Impala
Real-Time Queries in Hadoop w/ Cloudera ImpalaReal-Time Queries in Hadoop w/ Cloudera Impala
Real-Time Queries in Hadoop w/ Cloudera ImpalaData Science London
 
Microsoft's Big Play for Big Data
Microsoft's Big Play for Big DataMicrosoft's Big Play for Big Data
Microsoft's Big Play for Big DataAndrew Brust
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impalamarkgrover
 
Introducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing MeetupIntroducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing MeetupCaserta
 
HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014larsgeorge
 
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems Cloudera, Inc.
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Stefan Lipp
 
Impala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopImpala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopCloudera, Inc.
 
Big Data Strategy for the Relational World
Big Data Strategy for the Relational World Big Data Strategy for the Relational World
Big Data Strategy for the Relational World Andrew Brust
 
4. hadoop גיא לבנברג
4. hadoop  גיא לבנברג4. hadoop  גיא לבנברג
4. hadoop גיא לבנברגTaldor Group
 

La actualidad más candente (20)

Geek Sync | Designing Data Intensive Cloud Native Applications
Geek Sync | Designing Data Intensive Cloud Native ApplicationsGeek Sync | Designing Data Intensive Cloud Native Applications
Geek Sync | Designing Data Intensive Cloud Native Applications
 
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...
 
2015 GHC Presentation - High Availability and High Frequency Big Data Analytics
2015 GHC Presentation - High Availability and High Frequency Big Data Analytics2015 GHC Presentation - High Availability and High Frequency Big Data Analytics
2015 GHC Presentation - High Availability and High Frequency Big Data Analytics
 
HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily ...
HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily ...HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily ...
HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily ...
 
That ORM is Lying to You
That ORM is Lying to YouThat ORM is Lying to You
That ORM is Lying to You
 
MySql to HBase in 5 Steps
MySql to HBase in 5 StepsMySql to HBase in 5 Steps
MySql to HBase in 5 Steps
 
NoSQL and The Big Data Hullabaloo
NoSQL and The Big Data HullabalooNoSQL and The Big Data Hullabaloo
NoSQL and The Big Data Hullabaloo
 
12 SQL On-Hadoop Tools
12 SQL On-Hadoop Tools12 SQL On-Hadoop Tools
12 SQL On-Hadoop Tools
 
Big data Intro by Kaushik Dutta
Big data Intro by Kaushik DuttaBig data Intro by Kaushik Dutta
Big data Intro by Kaushik Dutta
 
Real-Time Queries in Hadoop w/ Cloudera Impala
Real-Time Queries in Hadoop w/ Cloudera ImpalaReal-Time Queries in Hadoop w/ Cloudera Impala
Real-Time Queries in Hadoop w/ Cloudera Impala
 
Microsoft's Big Play for Big Data
Microsoft's Big Play for Big DataMicrosoft's Big Play for Big Data
Microsoft's Big Play for Big Data
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Introducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing MeetupIntroducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing Meetup
 
HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014
 
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
 
Impala: Real-time Queries in Hadoop
Impala: Real-time Queries in HadoopImpala: Real-time Queries in Hadoop
Impala: Real-time Queries in Hadoop
 
Big Data Strategy for the Relational World
Big Data Strategy for the Relational World Big Data Strategy for the Relational World
Big Data Strategy for the Relational World
 
4. hadoop גיא לבנברג
4. hadoop  גיא לבנברג4. hadoop  גיא לבנברג
4. hadoop גיא לבנברג
 
NoSQL
NoSQLNoSQL
NoSQL
 

Destacado

N I G H T F A L L N O C T U R N A L E M I S S S O I N D R S H R I N I W ...
N I G H T  F A L L  N O C T U R N A L  E M I S S S O I N  D R  S H R I N I W ...N I G H T  F A L L  N O C T U R N A L  E M I S S S O I N  D R  S H R I N I W ...
N I G H T F A L L N O C T U R N A L E M I S S S O I N D R S H R I N I W ...Abhishek Yelgalwar
 
P R A Y E R F O R 21 S T C E N T U R Y D R S H R I N I W A S K A S H A L...
P R A Y E R  F O R 21 S T  C E N T U R Y  D R  S H R I N I W A S  K A S H A L...P R A Y E R  F O R 21 S T  C E N T U R Y  D R  S H R I N I W A S  K A S H A L...
P R A Y E R F O R 21 S T C E N T U R Y D R S H R I N I W A S K A S H A L...ghanyog
 
S U P E R H A P P I N E S S D R
S U P E R  H A P P I N E S S  D RS U P E R  H A P P I N E S S  D R
S U P E R H A P P I N E S S D RAbhishek Yelgalwar
 
Head Massage Dr
Head  Massage  DrHead  Massage  Dr
Head Massage Drghanyog
 
Sahastranetra A Bestseller On Vishnusahasranam Dr. Shriniwas Kashalikar
Sahastranetra A Bestseller On Vishnusahasranam Dr. Shriniwas KashalikarSahastranetra A Bestseller On Vishnusahasranam Dr. Shriniwas Kashalikar
Sahastranetra A Bestseller On Vishnusahasranam Dr. Shriniwas KashalikarAbhishek Yelgalwar
 
D A N C E S A N D D A N C E S D R S H R I N I W A S K A S H A L I K A R
D A N C E S  A N D  D A N C E S  D R  S H R I N I W A S  K A S H A L I K A RD A N C E S  A N D  D A N C E S  D R  S H R I N I W A S  K A S H A L I K A R
D A N C E S A N D D A N C E S D R S H R I N I W A S K A S H A L I K A RAbhishek Yelgalwar
 
T H E C O R E O F S E X D R S H R I N I W A S K A S H A L I K A R
T H E  C O R E  O F  S E X  D R  S H R I N I W A S  K A S H A L I K A RT H E  C O R E  O F  S E X  D R  S H R I N I W A S  K A S H A L I K A R
T H E C O R E O F S E X D R S H R I N I W A S K A S H A L I K A RAbhishek Yelgalwar
 
W H Y N A M A S M A R A N D R
W H Y  N A M A S M A R A N  D RW H Y  N A M A S M A R A N  D R
W H Y N A M A S M A R A N D Rghanyog
 
Imaginea Service Sheet - Performance Engineering
Imaginea Service Sheet - Performance EngineeringImaginea Service Sheet - Performance Engineering
Imaginea Service Sheet - Performance EngineeringImaginea
 
Sampurna Tanavmukti Aani Samasyapurti (Total Stress Management In Marathi) Dr...
Sampurna Tanavmukti Aani Samasyapurti (Total Stress Management In Marathi) Dr...Sampurna Tanavmukti Aani Samasyapurti (Total Stress Management In Marathi) Dr...
Sampurna Tanavmukti Aani Samasyapurti (Total Stress Management In Marathi) Dr...Abhishek Yelgalwar
 
D I A B E T E S D I S C U S S I O N D R
D I A B E T E S  D I S C U S S I O N  D RD I A B E T E S  D I S C U S S I O N  D R
D I A B E T E S D I S C U S S I O N D RAbhishek Yelgalwar
 
M A R R I A G E & M U T U A L B L O S S O M I N G D R
M A R R I A G E &  M U T U A L  B L O S S O M I N G   D RM A R R I A G E &  M U T U A L  B L O S S O M I N G   D R
M A R R I A G E & M U T U A L B L O S S O M I N G D Rghanyog
 
Health In 1st Chapter Of Geeta; Dr. Shriniwas Kashalikar.Txt
Health In 1st Chapter Of Geeta; Dr. Shriniwas Kashalikar.TxtHealth In 1st Chapter Of Geeta; Dr. Shriniwas Kashalikar.Txt
Health In 1st Chapter Of Geeta; Dr. Shriniwas Kashalikar.TxtAbhishek Yelgalwar
 
블로그 간담회 1차 발표자료 20091211 공개
블로그 간담회 1차 발표자료 20091211 공개블로그 간담회 1차 발표자료 20091211 공개
블로그 간담회 1차 발표자료 20091211 공개kang Anthony
 
I F E E L D O Y O U D R
I  F E E L  D O  Y O U  D RI  F E E L  D O  Y O U  D R
I F E E L D O Y O U D Rghanyog
 
Whitepaper Cloud Egovernance Imaginea
Whitepaper Cloud Egovernance ImagineaWhitepaper Cloud Egovernance Imaginea
Whitepaper Cloud Egovernance ImagineaImaginea
 

Destacado (17)

N I G H T F A L L N O C T U R N A L E M I S S S O I N D R S H R I N I W ...
N I G H T  F A L L  N O C T U R N A L  E M I S S S O I N  D R  S H R I N I W ...N I G H T  F A L L  N O C T U R N A L  E M I S S S O I N  D R  S H R I N I W ...
N I G H T F A L L N O C T U R N A L E M I S S S O I N D R S H R I N I W ...
 
P R A Y E R F O R 21 S T C E N T U R Y D R S H R I N I W A S K A S H A L...
P R A Y E R  F O R 21 S T  C E N T U R Y  D R  S H R I N I W A S  K A S H A L...P R A Y E R  F O R 21 S T  C E N T U R Y  D R  S H R I N I W A S  K A S H A L...
P R A Y E R F O R 21 S T C E N T U R Y D R S H R I N I W A S K A S H A L...
 
S U P E R H A P P I N E S S D R
S U P E R  H A P P I N E S S  D RS U P E R  H A P P I N E S S  D R
S U P E R H A P P I N E S S D R
 
P R A L H A D S A I D D R
P R A L H A D  S A I D  D RP R A L H A D  S A I D  D R
P R A L H A D S A I D D R
 
Head Massage Dr
Head  Massage  DrHead  Massage  Dr
Head Massage Dr
 
Sahastranetra A Bestseller On Vishnusahasranam Dr. Shriniwas Kashalikar
Sahastranetra A Bestseller On Vishnusahasranam Dr. Shriniwas KashalikarSahastranetra A Bestseller On Vishnusahasranam Dr. Shriniwas Kashalikar
Sahastranetra A Bestseller On Vishnusahasranam Dr. Shriniwas Kashalikar
 
D A N C E S A N D D A N C E S D R S H R I N I W A S K A S H A L I K A R
D A N C E S  A N D  D A N C E S  D R  S H R I N I W A S  K A S H A L I K A RD A N C E S  A N D  D A N C E S  D R  S H R I N I W A S  K A S H A L I K A R
D A N C E S A N D D A N C E S D R S H R I N I W A S K A S H A L I K A R
 
T H E C O R E O F S E X D R S H R I N I W A S K A S H A L I K A R
T H E  C O R E  O F  S E X  D R  S H R I N I W A S  K A S H A L I K A RT H E  C O R E  O F  S E X  D R  S H R I N I W A S  K A S H A L I K A R
T H E C O R E O F S E X D R S H R I N I W A S K A S H A L I K A R
 
W H Y N A M A S M A R A N D R
W H Y  N A M A S M A R A N  D RW H Y  N A M A S M A R A N  D R
W H Y N A M A S M A R A N D R
 
Imaginea Service Sheet - Performance Engineering
Imaginea Service Sheet - Performance EngineeringImaginea Service Sheet - Performance Engineering
Imaginea Service Sheet - Performance Engineering
 
Sampurna Tanavmukti Aani Samasyapurti (Total Stress Management In Marathi) Dr...
Sampurna Tanavmukti Aani Samasyapurti (Total Stress Management In Marathi) Dr...Sampurna Tanavmukti Aani Samasyapurti (Total Stress Management In Marathi) Dr...
Sampurna Tanavmukti Aani Samasyapurti (Total Stress Management In Marathi) Dr...
 
D I A B E T E S D I S C U S S I O N D R
D I A B E T E S  D I S C U S S I O N  D RD I A B E T E S  D I S C U S S I O N  D R
D I A B E T E S D I S C U S S I O N D R
 
M A R R I A G E & M U T U A L B L O S S O M I N G D R
M A R R I A G E &  M U T U A L  B L O S S O M I N G   D RM A R R I A G E &  M U T U A L  B L O S S O M I N G   D R
M A R R I A G E & M U T U A L B L O S S O M I N G D R
 
Health In 1st Chapter Of Geeta; Dr. Shriniwas Kashalikar.Txt
Health In 1st Chapter Of Geeta; Dr. Shriniwas Kashalikar.TxtHealth In 1st Chapter Of Geeta; Dr. Shriniwas Kashalikar.Txt
Health In 1st Chapter Of Geeta; Dr. Shriniwas Kashalikar.Txt
 
블로그 간담회 1차 발표자료 20091211 공개
블로그 간담회 1차 발표자료 20091211 공개블로그 간담회 1차 발표자료 20091211 공개
블로그 간담회 1차 발표자료 20091211 공개
 
I F E E L D O Y O U D R
I  F E E L  D O  Y O U  D RI  F E E L  D O  Y O U  D R
I F E E L D O Y O U D R
 
Whitepaper Cloud Egovernance Imaginea
Whitepaper Cloud Egovernance ImagineaWhitepaper Cloud Egovernance Imaginea
Whitepaper Cloud Egovernance Imaginea
 

Similar a Scaing databases on the cloud

Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Don Demcsak
 
Scaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLScaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLRichard Schneeman
 
Which Database is Right for My Workload?
Which Database is Right for My Workload?Which Database is Right for My Workload?
Which Database is Right for My Workload?Amazon Web Services
 
Which Database is Right for My Workload: Database Week SF
Which Database is Right for My Workload: Database Week SFWhich Database is Right for My Workload: Database Week SF
Which Database is Right for My Workload: Database Week SFAmazon Web Services
 
Which Database is Right for My Workload?: Database Week San Francisco
Which Database is Right for My Workload?: Database Week San FranciscoWhich Database is Right for My Workload?: Database Week San Francisco
Which Database is Right for My Workload?: Database Week San FranciscoAmazon Web Services
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesshnkr_rmchndrn
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...Institute of Contemporary Sciences
 
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347Manik Surtani
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineeringThang Bui (Bob)
 
Hadoop Data Modeling
Hadoop Data ModelingHadoop Data Modeling
Hadoop Data ModelingAdam Doyle
 
AWS Well Architected-Info Session WeCloudData
AWS Well Architected-Info Session WeCloudDataAWS Well Architected-Info Session WeCloudData
AWS Well Architected-Info Session WeCloudDataWeCloudData
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
 
DataFrames: The Extended Cut
DataFrames: The Extended CutDataFrames: The Extended Cut
DataFrames: The Extended CutWes McKinney
 
Sa introduction to big data pipelining with cassandra & spark west mins...
Sa introduction to big data pipelining with cassandra & spark   west mins...Sa introduction to big data pipelining with cassandra & spark   west mins...
Sa introduction to big data pipelining with cassandra & spark west mins...Simon Ambridge
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFAmazon Web Services
 
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...Spark Summit
 
The Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceThe Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceAbdelmonaim Remani
 
Building a highly scalable and available cloud application
Building a highly scalable and available cloud applicationBuilding a highly scalable and available cloud application
Building a highly scalable and available cloud applicationNoam Sheffer
 

Similar a Scaing databases on the cloud (20)

Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)
 
Scaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQLScaling the Web: Databases & NoSQL
Scaling the Web: Databases & NoSQL
 
Which Database is Right for My Workload?
Which Database is Right for My Workload?Which Database is Right for My Workload?
Which Database is Right for My Workload?
 
Which Database is Right for My Workload: Database Week SF
Which Database is Right for My Workload: Database Week SFWhich Database is Right for My Workload: Database Week SF
Which Database is Right for My Workload: Database Week SF
 
Which Database is Right for My Workload?: Database Week San Francisco
Which Database is Right for My Workload?: Database Week San FranciscoWhich Database is Right for My Workload?: Database Week San Francisco
Which Database is Right for My Workload?: Database Week San Francisco
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 How to use Big Data and Data Lake concept in business using Hadoop and Spark... How to use Big Data and Data Lake concept in business using Hadoop and Spark...
How to use Big Data and Data Lake concept in business using Hadoop and Spark...
 
NoSQL-Overview
NoSQL-OverviewNoSQL-Overview
NoSQL-Overview
 
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineering
 
Hadoop Data Modeling
Hadoop Data ModelingHadoop Data Modeling
Hadoop Data Modeling
 
AWS Well Architected-Info Session WeCloudData
AWS Well Architected-Info Session WeCloudDataAWS Well Architected-Info Session WeCloudData
AWS Well Architected-Info Session WeCloudData
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
DataFrames: The Extended Cut
DataFrames: The Extended CutDataFrames: The Extended Cut
DataFrames: The Extended Cut
 
Sa introduction to big data pipelining with cassandra & spark west mins...
Sa introduction to big data pipelining with cassandra & spark   west mins...Sa introduction to big data pipelining with cassandra & spark   west mins...
Sa introduction to big data pipelining with cassandra & spark west mins...
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
Using SparkML to Power a DSaaS (Data Science as a Service): Spark Summit East...
 
The Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot PersistenceThe Rise of NoSQL and Polyglot Persistence
The Rise of NoSQL and Polyglot Persistence
 
Building a highly scalable and available cloud application
Building a highly scalable and available cloud applicationBuilding a highly scalable and available cloud application
Building a highly scalable and available cloud application
 

Más de Imaginea

Web application penetration testing
Web application penetration testingWeb application penetration testing
Web application penetration testingImaginea
 
Network penetration testing
Network penetration testingNetwork penetration testing
Network penetration testingImaginea
 
Require JS
Require JSRequire JS
Require JSImaginea
 
Scala and lift
Scala and liftScala and lift
Scala and liftImaginea
 
Imaginea Service Sheet - Interaction Design
Imaginea Service Sheet - Interaction DesignImaginea Service Sheet - Interaction Design
Imaginea Service Sheet - Interaction DesignImaginea
 
Imaginea - SugarCRM iPhone App - User Guide
Imaginea - SugarCRM iPhone App - User GuideImaginea - SugarCRM iPhone App - User Guide
Scaling databases on the cloud

  • 1. Scaling databases on the cloud. Deepak Anupalli, Server Architect. Cloud Computing - Coming of Age: A Treatise on Real-Life Use Cases. Copyright (c) 2009, Pramati Technologies Private Limited. Imaginea is a Pramati business. All trade names and trade marks are owned by their respective owners. 11/4/2009
  • 2. We are • An emerging leader in product development services offering specialized services in Product Engineering, Interaction Design and Test Engineering • US headquarters in Sunnyvale, CA; India development centers in Hyderabad and Chennai • A 250+ strong and growing team • A business unit of Pramati Technologies • Rich experience in SaaS Engineering, Performance Engineering, Cloud Computing, Web 2.0, sf.com integrations and managing Amazon EC2 deployments • Track record of delivering significant customer satisfaction
  • 3. Initiatives in Cloud • Dekoh: http://www.dekoh.com • SocialTwist: http://www.socialtwist.com • MyPicks Beijing 2008: http://apps.new.facebook.com/mypicksbeijing/Home • Qontext: http://www.qontext.com
  • 4. Application requirements • High reliability • Low Latency • Dynamic Scalability – Millions of Users – Volumes of data • Across the tiers – Web – Application – Data
  • 5. Our biggest challenge • DB Perf bound by Disk I/O • Vertical scaling is an option – Ex: PlentyOfFish.com: 512GB RAM, 32CPUs – Expensive – Only possible to an extent on cloud servers
  • 6. Vertical Scaling: Limitations • Not everything will fit in memory • Lots of reads ~ lots of page faults + disk seeks • RAID 6 or RAID 10 disks • 200 MBps-1 GBps is the max speed • Think Horizontal!
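The throughput ceiling on this slide can be put in perspective with a little arithmetic. The sketch below assumes a hypothetical 1 TB dataset (not a figure from the slides) and decimal units, and computes how long one full sequential read would take at each end of the quoted range.

```python
MB = 10**6
GB = 10**9
TB = 10**12


def full_scan_seconds(dataset_bytes, throughput_bytes_per_s):
    """Seconds needed to read the whole dataset once at a sustained rate."""
    return dataset_bytes / throughput_bytes_per_s


dataset = 1 * TB  # hypothetical dataset size, chosen only for illustration
slow = full_scan_seconds(dataset, 200 * MB)  # lower end of the quoted range
fast = full_scan_seconds(dataset, 1 * GB)    # upper end of the quoted range
print(f"200 MBps: {slow / 60:.0f} min, 1 GBps: {fast / 60:.0f} min")
```

Under these assumptions even the fastest array needs minutes for a single pass, which is one way to read the slide's advice to spread the data over several machines instead.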
  • 7. Replication • Master-slave replication (MySQL or Oracle RAC) • Writes on one Master • Reads on many Slaves • Application aware • Works in read-mostly scenarios • Adds Slave lag • [Diagram: writes go to a single Master; reads go to three Slaves]
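"Application aware" means the application itself decides which server a statement goes to. A minimal sketch of that routing, assuming connections are represented by plain strings and that anything other than a SELECT counts as a write (the class and names are hypothetical, not from the slides):

```python
import itertools


class ReplicatedRouter:
    """Sends writes to the master and spreads reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin over slaves

    def connection_for(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._slaves)
        # INSERT/UPDATE/DELETE and anything unrecognized go to the master.
        return self.master


router = ReplicatedRouter("master-db", ["slave-1", "slave-2", "slave-3"])
```

Because of slave lag, a read issued right after a write may not see that write on a slave; an application that needs to read its own writes would have to send those reads to the master as well.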
  • 8. Sharding • Partition data across masters • Writes and Reads are distributed • Application is modified accordingly • Also use replication with fewer slaves to minimize slave lag • Choose a partitioning strategy that uniformly distributes data • [Diagram: Shard Logic, three Masters, three Slaves]
  • 9. Sharding Schemes • Vertical – Profile DB, friend DB – Not uniform • Range based – ID range, Location or Date based – Not uniform • Key or Hash based – ID hash – Fixed masters • Directory – Mapping of ID to Shard – Single point of failure • Example lookups shown on the slide: shard_id = getShard("profile"); shard_id = getShard(profileID); Select * from Profile where id = ? • [Diagram labels: Corporate, Tweets, Posts]
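The four schemes differ only in how a shard is chosen for a given key. The sketch below shows one possible lookup function per scheme; the table layout, bounds and class names are hypothetical, and MD5 is used only as a stable hash, not for security.

```python
import bisect
import hashlib

# Vertical: each functional area lives on its own shard (hypothetical layout).
VERTICAL_MAP = {"profile": 0, "friend": 1}


def vertical_shard(area):
    return VERTICAL_MAP[area]


def range_shard(profile_id, upper_bounds):
    """Shard i holds IDs below upper_bounds[i] and at or above the previous bound."""
    index = bisect.bisect_right(upper_bounds, profile_id)
    if index == len(upper_bounds):
        raise ValueError(f"no shard covers id {profile_id}")
    return index


def hash_shard(profile_id, num_shards):
    """Stable hash of the ID modulo a fixed number of masters."""
    digest = hashlib.md5(str(profile_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardDirectory:
    """Explicit ID-to-shard mapping; flexible, but every lookup depends on it."""

    def __init__(self):
        self._where = {}

    def assign(self, profile_id, shard_id):
        self._where[profile_id] = shard_id

    def lookup(self, profile_id):
        return self._where[profile_id]
```

Once the shard is known, the query from the slide (Select * from Profile where id = ?) runs unchanged against that shard's connection.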
  • 10. Sharding Complexities • No Joins – De-normalize the data • Data Integrity – Application should enforce integrity • Re-shard – Changing the sharding scheme requires re-partitioning the entire data set
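The cost of re-sharding is easy to underestimate. As an illustration, assuming rows are placed by a plain `id % shard_count` rule (an assumption for this sketch, not a scheme named on the slide), the snippet counts how many rows change shard when a fifth master is added to four.

```python
def moved_fraction(ids, old_shards, new_shards):
    """Fraction of IDs whose shard changes under modulo placement."""
    ids = list(ids)
    moved = sum(1 for i in ids if i % old_shards != i % new_shards)
    return moved / len(ids)


fraction = moved_fraction(range(20000), old_shards=4, new_shards=5)
print(f"{fraction:.0%} of rows change shard")  # prints "80% of rows change shard"
```

Only IDs whose remainder is the same under both moduli stay put, so adding a single master relocates most of the data.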
  • 11. De-normalization • Recent 10 messages to a recipient • Schema – Messages Table stores message info (including timestamp) – Recipients Table stores message recipients • Requires Join on Messages & Recipients tables • De-normalize – Store timestamp in Recipients table as well
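A small runnable sketch of this de-normalization, using an in-memory SQLite database as a stand-in for a shard. The table and column names are assumptions made for the example; the point is that copying the timestamp into the recipients table lets the "recent 10" query run without a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages   (id INTEGER PRIMARY KEY, body TEXT, ts INTEGER);
-- ts is duplicated here on purpose so the lookup needs no join
CREATE TABLE recipients (message_id INTEGER, recipient_id INTEGER, ts INTEGER);
CREATE INDEX recipients_by_time ON recipients (recipient_id, ts);
""")


def send(message_id, body, ts, recipient_ids):
    """Store the message once and one timestamped row per recipient."""
    conn.execute("INSERT INTO messages VALUES (?, ?, ?)", (message_id, body, ts))
    conn.executemany(
        "INSERT INTO recipients VALUES (?, ?, ?)",
        [(message_id, r, ts) for r in recipient_ids],
    )


def recent_message_ids(recipient_id, limit=10):
    """Most recent message IDs for a recipient, read from one table only."""
    rows = conn.execute(
        "SELECT message_id FROM recipients "
        "WHERE recipient_id = ? ORDER BY ts DESC LIMIT ?",
        (recipient_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


for i in range(1, 16):
    send(i, f"message {i}", ts=1000 + i, recipient_ids=[7])
```

The message bodies can then be fetched by primary key, which is a single-shard lookup even when messages and recipients are partitioned differently. The price is that the application must keep the two timestamp copies in sync.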
  • 12. Relationships • When data is partitioned into shards, foreign keys become obsolete • De-normalization avoids having relationships • If data can't be de-normalized further, use memcached • But, this requires change in SQL queries • [Diagram: Application, MemCached, Shard 1, Shard 2, Shard 3]
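One common way to put a cache such as memcached in front of the shards is the cache-aside pattern: look in the cache first and query the shard only on a miss. The sketch below uses a dictionary-backed stand-in with `get`/`set` methods instead of a real memcached client, and the loader function is hypothetical.

```python
class DictCache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def get_profile(cache, profile_id, load_from_shard):
    """Return a profile, querying the shard only when the cache misses."""
    key = f"profile:{profile_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_from_shard(profile_id)  # hypothetical shard query
        cache.set(key, profile)
    return profile
```

This is where the "change in SQL queries" shows up: a join across related tables becomes several keyed lookups that the application stitches together, each of which can be served from the cache.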
  • 13. Cloud Databases/Data stores • Amazon SimpleDB • Google BigTable • Apache HBase • Facebook/Apache Hive • CouchDB • Cassandra • Many more…
  • 14. Amazon SimpleDB • Schema-less distributed key-value store • Highly reliable and scalable • Automatic indexing of columns • Querying with SQL-like syntax • Supports multiple values for key/attribute • Value for Money
  • 15. Problems Addressed • High Availability – multiple nodes forming a ring • Partitioning – Consistent hashing • Replication – Replicated to multiple nodes • Eventual Consistency – Asynchronous replication of data using vector clocks
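Consistent hashing, named above as the partitioning technique, can be sketched as a ring of hashed points with each key owned by the next point clockwise. This is a minimal illustration with hypothetical node names and an assumed 100 virtual points per node; it is not the implementation of any particular data store.

```python
import bisect
import hashlib


def _hash(value):
    """Stable integer hash; MD5 is used for distribution, not security."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place several virtual points for the node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        """Owner is the first point at or after the key's hash, wrapping around."""
        if not self._points:
            raise LookupError("ring is empty")
        index = bisect.bisect(self._points, (_hash(key), ""))
        return self._points[index % len(self._points)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user-42")
```

When a node is added, only the keys that fall just before its new points change owner, and they all move to the new node. Compare this with modulo placement, where changing the node count relocates most keys.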
  • 16. SimpleDB adoption • No Joins • No transactional support • String is the only data type • No aggregate functions • No full-text searches • Limits enforced on size of results, predicates, data etc.
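With strings as the only data type, comparisons are lexicographic, so numbers have to be encoded so that string order matches numeric order. A minimal sketch of zero-padding for non-negative integers follows; the helper name and the width of 10 digits are assumptions made for the example.

```python
def encode_int(value, width=10):
    """Zero-pad a non-negative integer so string order equals numeric order."""
    if value < 0 or value >= 10 ** width:
        raise ValueError(f"{value} does not fit in {width} digits")
    return str(value).zfill(width)


# As plain strings "9" sorts after "10"; the padded forms sort correctly.
raw = sorted(["9", "10", "100"])
padded = sorted(encode_int(n) for n in (9, 10, 100))
```

Negative numbers and dates need a similar treatment (for example an offset or a fixed-width date format) before they can be used in range predicates.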
  • 17. Google BigTable • Distributed Key-value store • Runs on top of Google File System (GFS) • Timestamp versioned data • Automatic indexing of columns
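"Timestamp versioned data" means a cell can hold several values, each tagged with the time it was written. The toy in-memory model below illustrates the idea only; the class and method names are hypothetical and do not correspond to the BigTable API.

```python
import bisect
from collections import defaultdict


class VersionedTable:
    """Toy model: each (row, column) cell keeps every timestamped version."""

    def __init__(self):
        self._cells = defaultdict(list)  # (row, column) -> sorted [(ts, value)]

    def put(self, row, column, ts, value):
        bisect.insort(self._cells[(row, column)], (ts, value))

    def get(self, row, column, as_of=None):
        """Latest value, or the latest value written at or before as_of."""
        versions = self._cells.get((row, column), [])
        if not versions:
            return None
        if as_of is None:
            return versions[-1][1]
        timestamps = [ts for ts, _ in versions]
        index = bisect.bisect_right(timestamps, as_of)
        return versions[index - 1][1] if index else None
```

For example, after writing "a" at time 1 and "b" at time 5 to the same cell, a plain read returns "b" while a read as of time 3 returns "a".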
  • 18. BigTable adoption • Google Search, Maps, Earth, Orkut, YouTube, Reader, etc. • Google App Engine (GAE) uses BigTable as its datastore • DataNucleus supports JPA for BigTable • Limited transaction support • Eventual consistency
  • 19. Hive • Hive is a data warehouse • Runs on top of Hadoop Distributed File System (HDFS) • Supports SQL-like syntax • User defined types and functions • Extensibility with Map-Reduce
  • 20. Hive adoption • Facebook uses Hive to analyze historical data of users and content • Doesn’t support indexing of columns • Brute force mechanism to compute analytics
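The "brute force" remark can be illustrated with a tiny map-reduce in plain Python: without column indexes, an aggregate is computed by scanning every record, emitting key/value pairs, and reducing per key. This is a single-process sketch with made-up log rows, not Hive or Hadoop code.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Scan every record, group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:  # full scan: no index is consulted
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


page_views = [  # hypothetical log rows
    {"user": "a", "page": "/home"},
    {"user": "b", "page": "/home"},
    {"user": "a", "page": "/photos"},
]


def views_mapper(row):
    yield row["user"], 1


def sum_reducer(key, values):
    return sum(values)


views_per_user = map_reduce(page_views, views_mapper, sum_reducer)
```

On a cluster the scan and the per-key reduction are spread over many machines, which is what makes the brute-force approach workable for historical analytics.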
  • 21. CouchDB • CouchDB is a document-oriented datastore • Schema-free • Accessible through RESTful JSON API • Distributed with incremental replication • Querying through JavaScript
  • 22. Is there a solution for all? • Different data stores address different problem spaces • Identify what best suits your app
  • 23. Thank You deepak@pramati.com http://hysea.in
  • 24. Cloud Computing - Coming of Age: A Treatise on Real-Life Use Cases. Scaling databases on the cloud. Copyright © 2009, Imaginea Inc. Not to be distributed or communicated without permission. 11/4/2009