Building Scalable Databases on AWS - AWS Summit 2012 - NYC
1. Building Scalable Database Applications on AWS
Sundar Raghavan, Amazon
Greg Scallan, Architect, Flipboard
Eric Weller, Director, Earth Networks
Edward Dingels, Technical Lead, Earth Networks
2. Database Services: One Size Does Not Fit All
Amazon RDS (MySQL, Oracle) + ElastiCache (Memcached): apps that need scalable relational databases (“YesSQL”)
DynamoDB: apps that need massive scalability (“NoSQL”)
3. Building Database Applications – The Old Way
• Stuck with peak capacity
• Human driven
• Time consuming
[Chart: demand over time; labels: Demand, Human Layer, Q1, Time]
4. What We Hear From Customers
“Help us focus on applications – Shift database maintenance time to more productive application development time”
• Frequent server upgrades
• Constant storage upgrades
• Backup and recovery
• Software upgrade and patching
• Hardware failure replacements
[Pie chart: Distribution of time. Categories: security planning; license/doc training; backup and recovery; load/unload; scripting and coding; performance and tuning; install, upgrade, patch, migration. Slice values shown: 5%, 5%, 5%, 20%, 25%, 40%.]
Source: http://www.forrester.com/Events/Content/0,5180,-1110,00.ppt
6. Amazon Relational Database Service
RDS is a fully managed Relational database service that is simple to
deploy, easy to scale, reliable and cost-effective
Choice of Database Engines
Fully Managed Service
Push Button Scalability
Fault Tolerance with Multi-AZ
Works with EC2 & ElastiCache
7. High Performance Relational Databases
Amazon RDS configurations, compared on three goals: improve availability, increase throughput, reduce latency
• Push-Button Scaling
• Multi-AZ
• Read Replicas
• ElastiCache
8. Push Button Scaling
Scale up your instance and storage
Scale out as your app grows
Automatic backup
Easy restore
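A minimal sketch of what "scale up your instance and storage" can look like from code, assuming the boto3 RDS client and its `modify_db_instance` call. The helper names, the instance identifier, and the instance class below are illustrative and do not come from the slides. The client is passed in, so the request can be inspected without touching AWS.

```python
def build_scale_up_request(instance_id, instance_class=None, storage_gb=None,
                           apply_immediately=True):
    """Build keyword arguments for an RDS ModifyDBInstance request.

    The parameter names mirror boto3's rds.modify_db_instance; the values
    are whatever the caller chooses (hypothetical in this sketch).
    """
    if instance_class is None and storage_gb is None:
        raise ValueError("specify a new instance class, new storage, or both")
    request = {
        "DBInstanceIdentifier": instance_id,
        "ApplyImmediately": apply_immediately,
    }
    if instance_class is not None:
        request["DBInstanceClass"] = instance_class    # scale compute
    if storage_gb is not None:
        request["AllocatedStorage"] = storage_gb       # scale storage (GB)
    return request


def scale_up(rds_client, instance_id, **changes):
    """Send the request through a client such as boto3.client("rds")."""
    return rds_client.modify_db_instance(
        **build_scale_up_request(instance_id, **changes)
    )
```

With real credentials, `scale_up(boto3.client("rds"), "mydb", instance_class="db.m1.xlarge")` would submit the change. Changing the instance class involves a short outage, so it is worth scheduling.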
9. Read Scaling with Amazon ElastiCache and Read Replicas
Improve performance with ElastiCache
Read replica for read scaling
Offload BI and reporting from production
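The read path described on this slide can be sketched as a cache-aside lookup in front of round-robin read replicas. This is an illustration only: plain dictionaries stand in for Memcached, the primary, and the replicas, and the class name is hypothetical.

```python
import itertools


class ReadScalingStore:
    """Cache-aside reads over read replicas; writes go to the primary."""

    def __init__(self, cache, primary, replicas):
        if not replicas:
            raise ValueError("at least one read replica is required")
        self.cache = cache        # dict standing in for Memcached
        self.primary = primary    # dict standing in for the primary database
        self._replicas = itertools.cycle(replicas)  # round-robin over replicas

    def read(self, key):
        # 1. Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss, read from the next replica, never from the primary.
        value = next(self._replicas).get(key)
        # 3. Populate the cache so the next read is served from memory.
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes always go to the primary; the cached copy is invalidated.
        self.primary[key] = value
        self.cache.pop(key, None)
```

Replicas lag the primary, so a read issued right after a write may return the old value. Reads that need the latest data can be sent to the primary instead.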
14. Introduction
Your Social Magazine for Apple Mobile Devices
Delivers relevant articles and photos based on usage and
interactions within your social networks
Launched 6 months after the initial team was put together
Over 5 million users and 2 billion page “flips” per month
15. Operating in the Cloud: Managing Complex, Real-Time Data
Challenge
• 6 months to deploy a real-time, socially relevant magazine
• Constantly changing user interests
Architecture
• Ability to change all hardware and software elastically
• Frequently changing system requirements
App needs
• Complex queries on user and relevancy data
• Milliseconds count. So does uptime
Solution
• Highly performant, reliable, proven database technology
• Amazon RDS MySQL
16. The Data View of the World
Flipboard Application
• ElastiCache: Memcache for performance
• SimpleDB: Operational Configuration and State Data
• RDS MySQL: Reliable and Complex Data, Queries
17. Friends, Magazines … anyone relevant to me who published something very recently that I care about seeing
Show Me More, Please!
A friend who recently shared a photo
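The kind of relational query this slide hints at ("a friend who recently shared a photo") can be sketched with a join over a follow graph and a table of shared items. The schema, table names, and column names below are hypothetical, and SQLite stands in for RDS MySQL so the example runs locally.

```python
import sqlite3


def recent_items_from_friends(conn, user_id, since_ts, limit=10):
    """Items published since `since_ts` by accounts the user follows, newest first."""
    query = """
        SELECT i.author_id, i.kind, i.title
        FROM follows AS f
        JOIN items AS i ON i.author_id = f.followee_id
        WHERE f.follower_id = ?
          AND i.published_at >= ?
        ORDER BY i.published_at DESC
        LIMIT ?
    """
    return conn.execute(query, (user_id, since_ts, limit)).fetchall()


def build_demo_db():
    """Create an in-memory database with a tiny, made-up data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
        CREATE TABLE items (author_id INTEGER, kind TEXT, title TEXT,
                            published_at INTEGER);
    """)
    conn.executemany("INSERT INTO follows VALUES (?, ?)", [(1, 2), (1, 3)])
    conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", [
        (2, "photo", "Sunset", 1000),        # followed account, recent
        (3, "article", "Old news", 100),     # followed account, too old
        (4, "photo", "Not followed", 1000),  # recent, but not followed
    ])
    return conn
```

Calling `recent_items_from_friends(build_demo_db(), user_id=1, since_ts=500)` keeps only the recent photo from a followed account. Joins plus filtering and ordering of this kind are what the slides mean by complex queries.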
18. Amazon RDS Tips For Success
1. Leverage the Cloud for what it does best. Don’t bring old DC habits.
2. Scale up for better performance.
3. Use RDS for complex, real-time data.
4. Use Read Replicas to augment write heavy databases. They are awesome.
5. Leverage existing SQL knowledge and experience.
6. Use copies of your database for testing new code. It’s trivial and saves time.
7. Scale horizontally with sharding. Plan for it before you need it.
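Tip 7 can be illustrated with a small routing function that maps a user to one of several database shards. This is a sketch under assumptions: the shard host names are invented, and hash-modulo routing is one common choice, not necessarily what Flipboard used.

```python
import hashlib

# Hypothetical shard endpoints; real ones would be separate RDS instances.
SHARD_HOSTS = [
    "shard0.example.internal",
    "shard1.example.internal",
    "shard2.example.internal",
    "shard3.example.internal",
]


def shard_index(user_id, shard_count):
    """Map a user id to a shard number with a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a digest is used to keep routing stable across restarts.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def host_for_user(user_id, hosts=SHARD_HOSTS):
    """Return the database host that owns this user's rows."""
    return hosts[shard_index(user_id, len(hosts))]
```

With plain modulo routing, changing the number of shards moves most users to a different shard. That is one reason to settle the shard key and shard count before sharding is actually needed.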
20. Relational Or Non-Relational?
Application type
• Relational (RDS): Existing database apps; business process-centric apps
• NoSQL (DynamoDB): New Web scale applications; large # of small writes and reads
Application characteristics
• Relational (RDS): Relational data models, transactions; complex queries, joins and updates
• NoSQL (DynamoDB): Simple data models, transactions; range queries, simple updates
Scaling
• Relational (RDS): Application or DBA architected
• NoSQL (DynamoDB): Seamless, on-demand scaling
QoS
• Relational (RDS): Performance – developer architected; reliability and availability
• NoSQL (DynamoDB): Performance – automatic; reliability and availability
Skillset
• Relational (RDS): SQL + programming languages
• NoSQL (DynamoDB): Web style programming
Possible to use both SQL and NoSQL models in one application
21. The “Big Data” Scalability Challenge
Requirement: predictable, consistent performance
Reality: performance degrades with scale
What scaling costs ($!):
• Hardware purchase and provisioning
• Data sharding
• Data caching
• Cluster management
• Fault management
[Chart: Performance vs. Scalability]
22. Transformation 3: From Scaling by Architecture … to Scaling By Command
“Kit, go faster”
“Yes Michael”
23. Amazon DynamoDB
DynamoDB is a fully managed NoSQL database service that provides
extremely fast and predictable performance with seamless scalability
Easy Administration
Low-Latency SSDs
Reserved Capacity
Unlimited Potential Storage and Throughput
24. Amazon DynamoDB
NoSQL Database
Fast & predictable performance
Seamless Scalability
Easy administration
“Even though we have years of experience with large, complex
NoSQL architectures, we are happy to be finally out of the
business of managing it ourselves.” - Don MacAskill, CEO
25. DynamoDB Highlights
Low Latency
• SSD-based storage nodes
• Typical latency = single-digit milliseconds
[Chart: Get latency and CPU utilization]
26. Provisioned Throughput
Reserve the IOPS needed for each table
Set at table creation
Increase / decrease any time via API call
Pay for throughput and storage (not instances)
• $0.01 per hour for every 10 units of Write Capacity
• $0.01 per hour for every 50 units of Read Capacity
• $1.00 per GB-month of Storage
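The pricing on this slide can be turned into a quick estimate. The sketch below uses only the three rates listed above. The 720-hour month and the pro-rata treatment of partial blocks of units are simplifying assumptions, not statements about how AWS bills. The second helper shows the shape of a throughput change, with field names following the DynamoDB UpdateTable request; the table name is hypothetical.

```python
# Rates taken from the slide (2012 pricing).
WRITE_PRICE_PER_10_UNITS_HOUR = 0.01
READ_PRICE_PER_50_UNITS_HOUR = 0.01
STORAGE_PRICE_PER_GB_MONTH = 1.00


def estimate_monthly_cost(write_units, read_units, storage_gb, hours=720):
    """Estimate a month's bill for one table.

    Assumes a 720-hour month and charges partial blocks pro rata.
    """
    hourly_throughput = (
        (write_units / 10) * WRITE_PRICE_PER_10_UNITS_HOUR
        + (read_units / 50) * READ_PRICE_PER_50_UNITS_HOUR
    )
    return hourly_throughput * hours + storage_gb * STORAGE_PRICE_PER_GB_MONTH


def throughput_update_request(table_name, read_units, write_units):
    """Build the body of a request that changes a table's provisioned throughput."""
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }
```

For example, 100 write units, 500 read units, and 50 GB come to $0.10 + $0.10 per hour for throughput, which is $144 over 720 hours, plus $50 for storage.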
28. Earth Networks Case Study
Eric Weller – Director of Development
Edward Dingels – Technical Lead
29. Introduction
Gathers and analyzes atmospheric observations from a
global sensor network to promote a better understanding
of the planet
Proprietary lightning network output used to pinpoint
lightning activity - best indicator of dangerous weather
Owner of the WeatherBug brand (mobile, desktop, Web)
30. Problem
Need
Generate lightning alert notifications, in proximity to the user’s location, on a mobile device.
Constraints
Geospatial queries
Scalable
• 6 million existing mobile users
• 100% YOY mobile growth
• Severe Weather Outbreaks
Fast
• Speed + Accuracy = Safety
Reduce Time to Market
Cost of Ownership
31. Analysis
Provider | Product | Throughput per Instance | Engineering Cost | Cost of Ownership
Microsoft | SQL Server 2008 | Medium | Low | High
MySQL | MySQL | Medium | High* | High*
Earth Networks | In Memory Quadtree | High | High | Medium
Amazon | Mem-Cache | High | Medium | Medium
Amazon | DynamoDB | High | Low | Low
* Not currently supported by Earth Networks
32. Solution
Geo-Hash lightning data
Backend Windows service for loading/management of data inside DynamoDB
Front-end web service for encoding/decoding
Use geometric instead of trigonometric calculations
Client computes actual range and produces alerts
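The geohashing step can be sketched with the standard public geohash encoding, which interleaves longitude and latitude bits and writes them in base 32. The slides do not say which variant or precision Earth Networks used, so the function names and the precision values below are assumptions.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def same_cell(point_a, point_b, precision):
    """True when two (lat, lon) points fall in the same geohash cell."""
    return geohash_encode(*point_a, precision) == geohash_encode(*point_b, precision)
```

A shared prefix means two points lie in the same grid cell, so the geohash can serve as a lookup key with no trigonometry on the server. Two points on opposite sides of a cell edge can be close yet share no prefix, which is consistent with the last bullet above: the client still computes the actual range.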
33. Reflection
DynamoDB Advantages
• Easy to Provision
• Built-in Consistency
• Scalable
• High Availability
DynamoDB Wish List
• Automated regional DR failover
• Durability across regions in addition to inside a region
• Auto-scale down without a throttle on change
• Item size limit of 64k
36. Try Amazon DB Services!
Amazon RDS – Free usage
• For Details Visit Aws.amazon.com/rds
DynamoDB – Free usage
• For Details Visit Aws.amazon.com/dynamodb
Editor's notes
RDS today is used by thousands of customers, from small startups to the world’s largest corporations. We have a predictable history of delivering reliable services and adding even greater functionality over time (Multi-AZ, Read Replicas, caching).
With RDS you must define the structure before adding new attributes; with NoSQL you do not have to, because it is a sparse schema. RDS creates the DB on EC2 and EBS for you and gives you a host name and port, and you take it from there with your favorite SQL client. SimpleDB is in the data path: you dump the data and it creates the schema. (Does Dynamo?)
RDS benefits:
1. Scale with the modify API. There is an outage when you swap out instance types (e.g. from medium to XL), but it only takes a few minutes. The events reported are starting and done. The IP address stays the same. The same goes for EBS: add storage with no downtime or degradation in performance.
2. DB backup and restore: DB snapshot or automated backup.
3. Monitor DB: recover from a DB crash or an EC2 crash.
Gumi has been our number 1 customer for a few weeks, with over $250k in the last 4 weeks. Funzio is in the top 5, and so is Activision. EA, Sony, and bightgames are also customers. We have been talking to Playfish about DynamoDB, but I don’t see them in the RDS user base. Could not find Zynga.
Standard N-tier layered architecture: the logic tier talks to the data tier.
Data tier: we have many different DBs depending on the problem space:
• SimpleDB: Operational configuration and state go here.
• NoSQL: Content processing (e.g. article categorization), usage processing (e.g. what topics do most readers in England care about at 8am versus 10pm), etc.
• Memcached (a.k.a. ElastiCache): If you will query it twice based on a simple key/value, it goes here.
• RDS: Reliable AND complex queries needed? Go here.
In general, we found that per-user data needed by the presentation tier is best stored using RDS, since most requests are user/session focused and require complex logic based on new features or user interface designs. This includes things such as what social networks you belong to, your social interactions such as liking photos or reading articles, synchronization of state across multiple devices, etc.
So, who is using DynamoDB? Many companies within Amazon and outside Amazon. Amazon CloudDrive, SmugMug, and Amazon retail are examples. SmugMug is using us for storing their metadata on DynamoDB. Formspring and Tapjoy are among our early adopters. Finally, Shazam, the popular audio recognition application for ads and music, is using us to store its data for various applications.