RightScale Webinar: The number one cause of poor performance in scalable web applications is the database. This problem is magnified in cloud environments, where I/O and bandwidth are generally slower and less predictable than in dedicated data centers. Database sharding is a highly effective method of removing the database scalability barrier by operating on top of proven RDBMS products such as MySQL and PostgreSQL.
In this webinar, you'll learn what it really takes to implement sharding, the role it plays in the effective end-to-end lifecycle management of your entire database environment, and why it is crucial for ensuring reliability.
In this webinar, we will:
- Guide you on how to choose the best technology for your specific application
- Show you how to shard your existing database
- Review a case study on a Top 20 Facebook application built on dbShards
2. 2#
Your Panel Today
Presenting:
• Uri Budnik: Director, ISV Partner Program, RightScale @uribudnik
• Cory Isaacson: CEO & Founder, CodeFutures @dbShards
• David Blinder: CTO, Family Builder
Q&A:
• Jason Altobelli, Inside Sales Representative, RightScale
Please use the chat box window to ask questions anytime!
Webinar Recordings: www.rightscale.com/webinars
3. 3#
Agenda
• Introduction to RightScale
• Introduction to CodeFutures
• Live Demo
• Live Q&A
4. 4#
RightScale
Real Customers, Real Deployments, Real Benefits
• Managed Cloud Deployments for 4 Years
• More than 30,000 users; launched over 2.7MM servers
• Behind the largest production deployments on that cloud to
date
8. 8#
RightScripts in Multi-Cloud Marketplace
• Two RightScripts you can use to analyze your application to
determine if it's “shard-safe”
1. Logging Driver for Native MySQL®
2. dbShards/Analyze Driver for JDBC
• Installed in your app server to gather SQL statistics.
• It's an in-depth analysis of what is
needed to shard your database
• Report lists each unique SQL statement
and how it will function once sharded
• Run once and generate a report that
CodeFutures will review with you at
no charge
9. 9#
Introduction
• Who I am
• Cory Isaacson, CEO of CodeFutures
• Providers of dbShards
• Author of Software Pipelines
• Partnerships:
• RightScale
• The leading Cloud Management Platform
• Leaders in database scalability, performance, and high-availability for
the cloud
• based on real-world experience with dozens of cloud-based applications
• social networking, gaming, data collection, mobile, analytics
• Objective is to provide useful experience you can apply to scaling (and
managing) your database tier…
• especially for high volume applications
• and an overview of dbShards technology
10. 10#
Challenges of cloud computing
• Cloud provides highly attractive service environment
• Flexible, scales with need (up or down)
• No need for dedicated IT staff, fixed facility costs
• Pay-as-you-go model
• Cloud services occasionally fail
• Partial network outages
• Server failures
• by their nature cloud servers are “transient”
• Disk volume issues
• Cloud-based resources are constrained
• CPU
• I/O Rates
• the “Cloud I/O Barrier”
12. 12#
Scaling in the Cloud
• Scaling Load Balancers is easy
• Stateless routing to app server
• Can add redundant Load Balancers if needed
• If one goes down
• failover to another
• Scaling Application Servers is easy
• Stateless
• Sessions can easily transition to another server
• Add or remove servers as need dictates
• If one goes down
• failover to another
13. 13#
Scaling in the Cloud
• Scaling the Database tier is hard
• “Stateful” by definition (and necessity)
• Large, integrated data sets
• 10s of GBs to TBs (or more)
• Difficult to move, reload
• I/O dependent
• adversely affected by cloud service failures
• and slow cloud I/O
• If one goes down
• ouch!
14. 14#
Scaling in the Cloud
• Databases form the “last mile” of true application scalability
• Start with simple optimizations
• implement a follow-on scalability strategy for long-term performance goals
• and a high-availability strategy is a must
• Ensure your databases can failover
• unplanned outages
• and planned maintenance
• The best time to plan your database scalability strategy is now
• don't wait until it's a “3-alarm fire”
15. 15#
Familybuilder
Innovator in Facebook applications
Among first 500 apps worldwide
David Blinder, CTO
17. 17#
Database slowdown is not linear…
[Chart: Database Load Curve. Load time plotted against data file size (GB), with an exponential trend line.]
Data File (GB)    Load Time (Min)
0.9               1
1.3               2.5
3.5               11.7
39.0              10 days…
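To make the non-linearity concrete, the figures above can be converted to minutes of load time per GB. This is only a rough sketch: it assumes the "10 days…" entry means about 14,400 minutes (10 × 24 × 60), which the slide gives only approximately, and the names `MEASUREMENTS` and `minutes_per_gb` are introduced here for illustration.

```python
# Load-time figures from the slide: (data file size in GB, load time in minutes).
# Assumption: "10 days…" is treated as roughly 10 * 24 * 60 = 14,400 minutes.
MEASUREMENTS = [
    (0.9, 1.0),
    (1.3, 2.5),
    (3.5, 11.7),
    (39.0, 10 * 24 * 60),
]


def minutes_per_gb(measurements):
    """Return the load time per GB for each (gb, minutes) measurement."""
    return [minutes / gb for gb, minutes in measurements]


if __name__ == "__main__":
    for (gb, minutes), rate in zip(MEASUREMENTS, minutes_per_gb(MEASUREMENTS)):
        print(f"{gb:5.1f} GB  {minutes:8.1f} min  {rate:7.1f} min/GB")
```

If load time grew linearly with data size, the minutes-per-GB figure would stay roughly constant. With these numbers it rises at every step.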
18. 18#
Challenges apply to all types of databases
• Traditional RDBMS (MySQL, PostgreSQL, Oracle…)
• I/O bound
• Multi-user, lock contention
• High-availability
• Lifecycle management…
• backup/restore
• schema changes
• index maintenance
• NoSQL Databases (In-memory, Caching, Document)
• Reliability, High-availability
• Limits of a single server
• and a single thread
• Data dumps to disk
• Replication
• Lifecycle Management
19. 19#
Challenges apply to all types of databases
• No matter what the technology, big databases are hard to
manage
• elastic scaling is a real challenge
• degradation from growth in size and volume is a certainty
• application-specific database requirements add to the challenge
• Sound database design is key…
• balance performance vs. convenience vs. data size
20. 20#
The Laws of Databases
• Law #1: Small Databases are fast
• Law #2: Big Databases are slow
• Law #3: Keep databases small
21. 21#
What is the answer?
• Database sharding is the only effective method for
achieving scale, elasticity, reliability and easy management
• regardless of your database technology
22. 22#
What is Database Sharding?
• “Horizontal partitioning is a database design principle
whereby rows of a database table are held separately...
Each partition forms part of a shard, which may in turn be
located on a separate database server or physical
location.” Wikipedia
23. 23#
What is Database Sharding?
• Start with a big monolithic database
• break it into smaller databases
• across many servers
• using a key value
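A minimal sketch of the idea, assuming a simple modulus strategy on an integer key. The slides do not say which strategy dbShards uses, and the shard names and the `shard_for` and `partition` helpers are hypothetical.

```python
# Hypothetical shard list: in practice each entry would be a separate database server.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]


def shard_for(customer_id: int, shards=SHARDS) -> str:
    """Pick a shard by taking the key value modulo the number of shards."""
    return shards[customer_id % len(shards)]


def partition(rows, key="customer_id", shards=SHARDS):
    """Split the rows of one monolithic table into per-shard lists."""
    buckets = {name: [] for name in shards}
    for row in rows:
        buckets[shard_for(row[key], shards)].append(row)
    return buckets


if __name__ == "__main__":
    rows = [{"customer_id": cid} for cid in (1234, 2345, 4567)]
    for name, bucket in partition(rows).items():
        print(name, bucket)
```

Modulus is only one option. Range-based or lookup-table strategies follow the same pattern: a key value goes in, a shard comes out.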
27. 27#
Why does Database Sharding work?
• Maximize CPU/Memory per database instance
• as compared to database size
• Reduce the size of index trees
• speeds writes dramatically
• reads are faster too
• aggregate, list queries are generally much faster
• No contention between servers
• locking, disk, memory, CPU
• Allows for intelligent parallel processing
• Go Fish queries across shards
• Keep CPUs busy and productive
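The parallel-processing point can be sketched as a fan-out query. This assumes a "Go Fish" query is one that cannot be routed by shard key and therefore has to ask every shard. The in-memory dictionaries stand in for real database servers, and all names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for shards; each maps customer_id -> last_name.
SHARDS = [
    {4: "Smith", 8: "Jones"},
    {1: "Jones", 5: "Lee"},
    {2: "Patel"},
]


def query_shard(shard, last_name):
    """Run the same filter against one shard and return the matching keys."""
    return [cid for cid, name in shard.items() if name == last_name]


def go_fish(last_name, shards=SHARDS):
    """Send the query to every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(query_shard, shards, [last_name] * len(shards))
    return sorted(cid for part in partials for cid in part)


if __name__ == "__main__":
    print(go_fish("Jones"))
```

Each shard scans only its own, smaller data set, so the slowest shard rather than the total data size bounds the response time.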
32. 32#
How Relational Sharding works
• Shard key recognition in SQL
• SELECT * FROM customer
WHERE customer_id = 1234
• INSERT INTO customer
(customer_id, first_name, last_name, addr_line1,…)
VALUES
(2345, 'John', 'Jones', '123 B Street',…)
• UPDATE customer
SET addr_line1 = '456 C Avenue'
WHERE customer_id = 4567
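One way to picture shard key recognition is a small function that pulls the `customer_id` value out of each statement. This is a regex-based sketch for illustration only: a real driver would need a proper SQL parser, and nothing here reflects how dbShards actually implements it. The naive comma split also assumes no commas appear inside string literals ahead of the key column.

```python
import re

SHARD_KEY = "customer_id"  # shard key column used in the slide's examples

# Matches "customer_id = 1234" in SELECT, UPDATE, or DELETE statements.
WHERE_RE = re.compile(rf"\b{SHARD_KEY}\s*=\s*(\d+)", re.IGNORECASE)
# Matches "INSERT INTO t (col, ...) VALUES (val, ...)".
INSERT_RE = re.compile(
    r"INSERT\s+INTO\s+\w+\s*\((?P<cols>[^)]*)\)\s*VALUES\s*\((?P<vals>[^)]*)\)",
    re.IGNORECASE,
)


def extract_shard_key(sql: str):
    """Return the integer shard key found in a statement, or None."""
    insert = INSERT_RE.search(sql)
    if insert:
        cols = [c.strip().lower() for c in insert.group("cols").split(",")]
        vals = [v.strip() for v in insert.group("vals").split(",")]
        if SHARD_KEY in cols:
            idx = cols.index(SHARD_KEY)
            if idx < len(vals) and vals[idx].isdigit():
                return int(vals[idx])
        return None
    where = WHERE_RE.search(sql)
    return int(where.group(1)) if where else None


if __name__ == "__main__":
    print(extract_shard_key("SELECT * FROM customer WHERE customer_id = 1234"))
```

A statement for which this returns `None` carries no shard key, so it could not be routed to a single shard.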
36. 36#
dbShards/Analyze
• Review Database Schema
• Define your initial shard strategy
• Run dbShards/Analyze Driver
• on your app in a test environment
• generate logs of all application SQL
• Generate dbShards/Analyze reports
• with your data model
• your shard strategy
• your SQL logs as input
• Ensure your application is shard-safe
• before you shard your database
• and identify optimization opportunities
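As a rough illustration of what a per-statement report could look like, the sketch below collapses a SQL log into unique statements and flags whether each one mentions the shard key. The slides do not describe the actual dbShards/Analyze checks, so "mentions the shard key" is only a crude stand-in for "shard-safe", and `normalize` and `analyze` are hypothetical helpers.

```python
import re
from collections import Counter

SHARD_KEY = "customer_id"  # hypothetical shard key for this sketch


def normalize(sql: str) -> str:
    """Collapse literals so statements that differ only in values match."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip()


def analyze(sql_log):
    """Return (statement, count, mentions_shard_key) per unique statement."""
    counts = Counter(normalize(s) for s in sql_log)
    return [
        (stmt, n, SHARD_KEY in stmt.lower())
        for stmt, n in counts.most_common()
    ]


if __name__ == "__main__":
    log = [
        "SELECT * FROM customer WHERE customer_id = 1234",
        "SELECT * FROM customer WHERE customer_id = 4567",
        "SELECT * FROM customer WHERE last_name = 'Jones'",
    ]
    for stmt, count, has_key in analyze(log):
        print(f"{count:3d}  key={has_key}  {stmt}")
```

Statements flagged `False` are the ones worth reviewing first, since they would have to run against every shard.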
38. 38#
No-charge Shard Analysis
• Drop-in dbShards/Analyze Drivers
• Native MySQL
• JDBC
• ODBC
• Available as RightScale templates
• search Multi-Cloud Marketplace for CodeFutures
• Logging Driver for Native MySQL®
• dbShards/Analyze Driver for JDBC
• Run driver in your environment, with your app
• ship us the logs, schema
• a dbShards consultant takes you through the analysis
• Find out exactly what it takes to shard your database
• regardless of the technology you select
39. 39#
Wrap-up
• Database Sharding is the tool for scaling
your database
• dbShards is a complete, drop-in sharding
solution
• Plug-compatible database drivers
• nothing between you and your database
• Intelligent agents for shard management,
processing
• Database agnostic, pick the DBMS you prefer
• Use dbShards for existing applications
• new ones too
• dbShards supports the entire Database
Sharding infrastructure
• Analyze, Shard, Manage
• 24X7 Monitoring and Support for all customers
41. 41#
We Appreciate Your Time
Contacts
Cory Isaacson:
CodeFutures Corporation
sales@codefutures.com
http://www.dbshards.com
RightScale:
(866) 720-0208
sales@rightscale.com
http://www.rightscale.com
More Info:
Webinar archive: RightScale.com/webinars
Whitepapers: RightScale.com/whitepapers
Free Edition: RightScale.com/free
Editor's notes
Writes are linear, reads can be faster – depending on your database architecture.
How did your database perform 6 months or a year ago?
Difference between black-box sharding/NoSQL and app-aware sharding with an RDBMS: with app-aware sharding you can get sets of related data from the same shard; otherwise you need to retrieve a row at a time and consolidate.