Scaling MySQL: Benefits of Automatic Data Distribution

Webinar: Scaling MySQL
Benefits of Automatic Data Distribution
December 13, 2012

Agenda

1. Who We Are

2. The Scalability Problem

3. Benefits of Automatic Data Distribution

4. Customer ROI/Case Studies

5. Q & A
(please type questions directly into the GoToWebinar side panel)


Who We Are

Presenters:

Paul Campaniello, VP of Global Marketing
25-year technology veteran with marketing experience at Mendix, Lumigent, Savantis and Precise.

Doron Levari, Founder
A technologist and long-time veteran of the database industry. Prior to founding ScaleBase, Doron was CEO of Aluna.

Pain Points – The Scalability Problem

• Thousands of new online and mobile
apps launching every day
• Demand climbs for these apps and
databases can’t keep up
• App must provide uninterrupted
access and availability
• Database performance and
scalability are critical

Big Data = Big Scaling Needs

Big Data = Transactions + Interactions + Observations
[Figure: nested tiers of data by volume: ERP (megabytes), CRM (gigabytes), Web (terabytes), Big Data (petabytes). Axis: Increasing Data Variety and Complexity.]

Source: The 451 Group & Teradata

Scalability Pain

[Figure: infrastructure cost ($) over time. Predicted demand and actual demand are plotted against the capacity provided by traditional hardware and by dynamic scaling. Callouts: “Large Capital Expenditure”, “Opportunity Cost”, “You just lost customers”.]

Ongoing “Scaling MySQL” Series

• August 16 & September 20, 2012
– Scaling MySQL: Scale Up versus Scale Out

• October 23, 2012
– Methods and Challenges to Scale Out MySQL

• Today
– Benefits of Automatic Data Distribution

• January 17, 2013
– Catch-22 of Read-Write Splitting

The Database Engine is the Bottleneck...

• Every write operation becomes at least 4 write operations inside the DB:
– Data segment
– Index segment
– Undo segment
– Transaction log
• Plus multiple activities in the DB engine memory:
– Buffer management
– Locking
– Thread locks/semaphores
– Recovery tasks

The Database Engine is the Bottleneck

• The same write operations and in-memory activities as on the previous slide
• Now multiply by 10 TB, accessed by 10,000 concurrent sessions

COI – Customer, Order, Item
CUSTOMER
C_ID  NAME     LOCATION  RANK
1     John     MA        10
2     James    AL        9
3     Peter    CA        10
4     Chris    FL        8
5     Oliver   MA        9
6     Allan    MA        9
7     Janette  CA        8
8     David    MD        10

ORDER
O_ID  C_ID  DATE
1     1     2012-02-01
2     1     2012-02-01
3     2     2012-02-01
4     6     2012-02-01
5     6     2012-02-01
6     8     2012-02-01

ORDER_ITEM
OI_ID  O_ID  QUANT  I_ID
1      1     3      1
2      1     6      2
3      2     4      1
4      2     2      2
5      2     1      5
6      3     1      1
7      3     6      5
8      4     8      3
9      4     9      4
10     5     2      6
11     6     1      5

ITEM
I_ID  NAME
1     iPhone
2     iPad
3     iPad Mini
4     Kindle
5     Kindle Fire
6     Galaxy S3
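To make the relationships between the four COI tables concrete, here is a minimal sketch that loads the sample rows from this slide and runs the two query shapes discussed in the deck. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the column types, the foreign-key declarations, and the exact query text are assumptions, not part of the original slides.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL; column types are assumptions.
# "ORDER", "RANK" and "DATE" are quoted because they clash with SQL
# keywords in some dialects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (
    C_ID INTEGER PRIMARY KEY, NAME TEXT, LOCATION TEXT, "RANK" INTEGER
);
CREATE TABLE "ORDER" (
    O_ID INTEGER PRIMARY KEY,
    C_ID INTEGER REFERENCES CUSTOMER(C_ID),
    "DATE" TEXT
);
CREATE TABLE ITEM (I_ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE ORDER_ITEM (
    OI_ID INTEGER PRIMARY KEY,
    O_ID INTEGER REFERENCES "ORDER"(O_ID),
    QUANT INTEGER,
    I_ID INTEGER REFERENCES ITEM(I_ID)
);
""")

# Sample rows copied from the slide.
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?, ?)", [
    (1, "John", "MA", 10), (2, "James", "AL", 9), (3, "Peter", "CA", 10),
    (4, "Chris", "FL", 8), (5, "Oliver", "MA", 9), (6, "Allan", "MA", 9),
    (7, "Janette", "CA", 8), (8, "David", "MD", 10),
])
conn.executemany('INSERT INTO "ORDER" VALUES (?, ?, ?)', [
    (1, 1, "2012-02-01"), (2, 1, "2012-02-01"), (3, 2, "2012-02-01"),
    (4, 6, "2012-02-01"), (5, 6, "2012-02-01"), (6, 8, "2012-02-01"),
])
conn.executemany("INSERT INTO ITEM VALUES (?, ?)", [
    (1, "iPhone"), (2, "iPad"), (3, "iPad Mini"),
    (4, "Kindle"), (5, "Kindle Fire"), (6, "Galaxy S3"),
])
conn.executemany("INSERT INTO ORDER_ITEM VALUES (?, ?, ?, ?)", [
    (1, 1, 3, 1), (2, 1, 6, 2), (3, 2, 4, 1), (4, 2, 2, 2), (5, 2, 1, 5),
    (6, 3, 1, 1), (7, 3, 6, 5), (8, 4, 8, 3), (9, 4, 9, 4), (10, 5, 2, 6),
    (11, 6, 1, 5),
])

# "Top customers": rank 9 and up.
top_customers = [name for (name,) in conn.execute(
    'SELECT NAME FROM CUSTOMER WHERE "RANK" >= 9 ORDER BY "RANK" DESC, C_ID'
)]

# "Joins across the board": every order line with customer and item names.
order_lines = conn.execute("""
    SELECT c.NAME, o.O_ID, i.NAME, oi.QUANT
    FROM ORDER_ITEM oi
    JOIN "ORDER" o ON o.O_ID = oi.O_ID
    JOIN CUSTOMER c ON c.C_ID = o.C_ID
    JOIN ITEM i ON i.I_ID = oi.I_ID
    ORDER BY oi.OI_ID
""").fetchall()

print(top_customers)
print(order_lines[0])
```

The join touches all four tables, which is why the way these tables are split across databases matters so much for query latency.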


Requirements

Every day:

• Updates (challenge: throughput)
– 30,000 new customers
– 1,000,000 new orders, average of 5 items per order
– Item catalog is updated once a day, nightly, at 11pm

• Queries (challenge: latency)
– Top customers (rank 9 and up)
– New orders, joins across the board…
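A quick back-of-the-envelope calculation shows what these daily volumes mean for the database engine. The sketch below is only an estimate: it assumes one inserted row per customer, order, and order item, applies the "at least 4 write operations" figure from the bottleneck slide to each row, and assumes an even spread over 3 databases (the number used in the deck's sliced-database example). The constant names are made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures from the Requirements slide.
NEW_CUSTOMERS_PER_DAY = 30_000
NEW_ORDERS_PER_DAY = 1_000_000
AVG_ITEMS_PER_ORDER = 5

# From the bottleneck slide: data, index, undo, transaction log.
MIN_WRITES_PER_ROW = 4

# Assumption: an even spread over 3 databases.
NUM_DATABASES = 3

# One row per customer, per order, and per order item (assumption).
rows_per_day = (
    NEW_CUSTOMERS_PER_DAY
    + NEW_ORDERS_PER_DAY
    + NEW_ORDERS_PER_DAY * AVG_ITEMS_PER_ORDER
)
min_internal_writes_per_day = rows_per_day * MIN_WRITES_PER_ROW
avg_rows_per_second = rows_per_day / SECONDS_PER_DAY
rows_per_database_per_day = rows_per_day // NUM_DATABASES

print(f"rows inserted per day:        {rows_per_day:,}")
print(f"internal writes per day (>=): {min_internal_writes_per_day:,}")
print(f"average rows per second:      {avg_rows_per_second:.1f}")
print(f"rows per database per day:    {rows_per_database_per_day:,}")
```

This is a daily average; real traffic peaks well above it, which is where a single database engine runs out of headroom first.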


Splitting the data

• CUSTOMER – random (hash)
• ORDER – derivative (C_ID)
• ORDER_ITEM – transitive (O_ID -> C_ID)
• ITEM – global table
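The four rules above can be sketched as a small routing function. This is an illustration only, not ScaleBase's implementation: the slide says CUSTOMER is split by a hash, and the modulo placement used here is an assumption chosen because it reproduces the placement shown on the Sliced Database slide. The lookup dictionaries hold the sample ORDER and ORDER_ITEM rows from the COI slide and stand in for the parent keys a real router would read from the statement itself.

```python
NUM_DATABASES = 3

# Sample relationships from the COI slide: O_ID -> C_ID and OI_ID -> O_ID.
ORDER_TO_CUSTOMER = {1: 1, 2: 1, 3: 2, 4: 6, 5: 6, 6: 8}
ORDER_ITEM_TO_ORDER = {
    1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3, 7: 3, 8: 4, 9: 4, 10: 5, 11: 6,
}


def db_for_customer(c_id: int) -> int:
    """CUSTOMER: hash on C_ID (hypothetical modulo hash), returns 1..3."""
    return (c_id - 1) % NUM_DATABASES + 1


def db_for_order(o_id: int) -> int:
    """ORDER: derivative, follows the owning customer via C_ID."""
    return db_for_customer(ORDER_TO_CUSTOMER[o_id])


def db_for_order_item(oi_id: int) -> int:
    """ORDER_ITEM: transitive, O_ID -> C_ID -> database."""
    return db_for_order(ORDER_ITEM_TO_ORDER[oi_id])


def dbs_for_item(i_id: int) -> list[int]:
    """ITEM: global table, a full copy lives on every database."""
    return list(range(1, NUM_DATABASES + 1))


for c_id in range(1, 9):
    print(f"customer {c_id} -> DB-{db_for_customer(c_id)}")
for o_id in ORDER_TO_CUSTOMER:
    print(f"order {o_id} -> DB-{db_for_order(o_id)}")
```

Because an order and its order items always land on the same database as their customer, a join from CUSTOMER through ORDER to ORDER_ITEM never has to cross databases, and the global ITEM copy keeps the final join local as well.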


Sliced Database
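DB-1
CUSTOMER:    (1, John, MA, 10), (4, Chris, FL, 8), (7, Janette, CA, 8)
ORDER:       (1, 1, 2012-02-01), (2, 1, 2012-02-01)
ORDER_ITEM:  (1, 1, 3, 1), (2, 1, 6, 2), (3, 2, 4, 1), (4, 2, 2, 2), (5, 2, 1, 5)
ITEM:        (1, iPhone), …, (6, Galaxy S3)

DB-2
CUSTOMER:    (2, James, AL, 9), (5, Oliver, MA, 9), (8, David, MD, 10)
ORDER:       (3, 2, 2012-02-01), (6, 8, 2012-02-01)
ORDER_ITEM:  (6, 3, 1, 1), (7, 3, 6, 5), (11, 6, 1, 5)
ITEM:        (1, iPhone), …, (6, Galaxy S3)

DB-3
CUSTOMER:    (3, Peter, CA, 10), (6, Allan, MA, 9)
ORDER:       (4, 6, 2012-02-01), (5, 6, 2012-02-01)
ORDER_ITEM:  (8, 4, 8, 3), (9, 4, 9, 4), (10, 5, 2, 6)
ITEM:        (1, iPhone), …, (6, Galaxy S3)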

Requirements
Every day:

• Updates (challenge: throughput; answer: distribution)
– 30,000 new customers
– 1,000,000 new orders, average of 5 items per order
– Item catalog is updated once a day, nightly, at 11pm

• Queries (challenge: latency; answer: parallelism)
– Top customers (rank 9 and up)
– New orders, joins across the board…

Automatic Data Distribution

• The ultimate way to scale
• Provides significant performance improvements
• The only way to really improve both reads and writes
• Good for scaling high session-volume reads and writes
• Good for scaling high data-volume reads and writes
• Home-grown implementations have drawbacks


Scale Out Features and Benefits

Feature                                Benefit
Parallel query execution               Great performance of cross-db queries & maintenance commands
Query result aggregation               Support of sophisticated cross-db queries, even with ORDER BY, GROUP BY, LIMIT, aggregate functions…
Online data redistribution             Flexibility: no need to over-provision; no downtime
100% compatible MySQL proxy            Applications unmodified; standard MySQL tools and interfaces
MySQL databases untouched              Data is safe within MySQL InnoDB/MyISAM/any
Data distribution review and analysis  Optimization of data distribution policy
Data consistency verifier              Validate system-wide data consistency
Real-time monitoring and alerts        Simplify management, reduce TCO
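The first two features, parallel query execution and query result aggregation, follow a scatter-gather pattern. The sketch below illustrates that pattern and is not the ScaleBase implementation: plain Python lists stand in for three MySQL databases holding the CUSTOMER rows from the deck's sample data, and the function names are hypothetical. Each database answers its part of the query in parallel, and the partial results are merged so that ORDER BY, LIMIT, and GROUP BY still give a global answer.

```python
import heapq
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# CUSTOMER rows (C_ID, NAME, LOCATION, RANK) as split in the deck's example.
# Lists stand in for the three MySQL databases.
SHARDS = {
    "DB-1": [(1, "John", "MA", 10), (4, "Chris", "FL", 8), (7, "Janette", "CA", 8)],
    "DB-2": [(2, "James", "AL", 9), (5, "Oliver", "MA", 9), (8, "David", "MD", 10)],
    "DB-3": [(3, "Peter", "CA", 10), (6, "Allan", "MA", 9)],
}


def sort_key(row):
    """ORDER BY RANK DESC, C_ID."""
    c_id, _name, _location, rank = row
    return (-rank, c_id)


def query_shard(rows, min_rank, limit):
    """Per-database part: WHERE RANK >= min_rank ORDER BY ... LIMIT limit."""
    matching = [row for row in rows if row[3] >= min_rank]
    return sorted(matching, key=sort_key)[:limit]


def top_customers(min_rank, limit):
    """Scatter the query to every database, then merge the sorted partials."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(
            lambda rows: query_shard(rows, min_rank, limit), SHARDS.values()
        ))
    # Each partial is already sorted, so a k-way merge keeps global order.
    merged = heapq.merge(*partials, key=sort_key)
    return list(merged)[:limit]


def customers_per_location():
    """GROUP BY LOCATION: count on each database, then sum the counts."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(
            lambda rows: Counter(row[2] for row in rows), SHARDS.values()
        ))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


print([name for _c_id, name, _loc, _rank in top_customers(min_rank=9, limit=4)])
print(customers_per_location())
```

Applying the LIMIT on each database before merging keeps the amount of data sent back small; the final LIMIT after the merge is still required because each database only knows its own top rows.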


Scale Out Provides Immediate & Tangible Value

[Diagram: two application servers, BI, and management tools connected to Databases A, B, C, and D, each with its own standby (Standby A to Standby D).]

Typical Scale Out (ScaleBase) Deployment

[Diagram: the same topology (two application servers, BI, and management tools; Databases A to D with Standby A to Standby D), with the ScaleBase Data Traffic Manager and ScaleBase Central Management added.]

Choose Your Scale-out Path

[Chart: axes are Database Size and # of concurrent sessions. Regions, from smallest to largest: “1 DB? Good for me!”, Read/Write Splitting, Data Distribution.]

Scaling Out Achieves Unlimited Scalability

[Chart: Throughput (TPM), Total DB Size (MB), and # Connections plotted against Number of Databases (1, 2, 4, 6, 8, 10, 14). Throughput data labels, ascending: 6,000; 12,000; 24,000; 36,000; 48,000; 60,000; 84,000. Connection data labels range from 500 to 2,500.]

Detailed Scale Out Case Studies

• Nokia
– Device Apps App
– Availability
– Scalability
– Geo-clustering
• AppDynamics
– Next gen APM company
– Scalability for the Netflix implementation
• Mozilla
– New Product/Next Gen App/AppStore
– Scalability
– Geo-sharding
• Solar Edge
– Next Gen Monitoring App
– Massive Scale
– Monitors real-time data from thousands of distributed systems
• 100 Apps
• 300 MySQL DB

Summary

• Database scalability is a significant problem
– App explosion, Big Data, Mobile
• Scale Up helps somewhat, but Scale Out provides
a long-term, cost-effective solution

• ScaleBase has an effective Scale Out
solution with a proven ROI
– Improves performance &
requires NO changes to
your existing infrastructure
• Choose your scale-out path....
– The ScaleBase platform enables
you to start with R/W splitting and
grow into automatic data distribution


Questions (please enter directly into the GTW side panel)

617.630.2800

www.ScaleBase.com

doron.levari@scalebase.com

paul.campaniello@scalebase.com
