6. Managed services transform operations
Operating databases in the Old World
you: App optimization, Scaling, High Availability, Database backups, DB software patches, DB software installs, OS patches, OS installation, Server maintenance, Rack & stack, Power, HVAC, net
Operating databases in AWS
you: App optimization
managed service: Scaling, High Availability, Database backups, DB software patches, DB software installs, OS patches, OS installation, Server maintenance, Rack & stack, Power, HVAC, net
7. Running many databases with Amazon RDS
Managed Relational Database Service with choice
Managed & Automated: Deploy and maintain hardware, OS, and DB software; built-in monitoring
Performant & scalable: Scale compute and storage with a few clicks; minimal downtime for your application
Available & durable: Automatic Multi-AZ data replication; automated backup, snapshots, and failover
Secure & compliant: Data encryption at rest and in transit; industry compliance and assurance programs
8. Key Amazon RDS Features
Managed Relational Database Service with choice
Amazon RDS configuration options: Push-Button Scaling, Multi-AZ, Read Replicas, Provisioned IOPS
Goals: improve availability, increase throughput, reduce latency
[Figure: a Region with a Multi-AZ deployment spanning two Availability Zones]
9. Moving to open source database engines
+
Commercial-grade performance and reliability?
10. Amazon Aurora
MySQL and PostgreSQL compatible relational database built for the cloud
Performance and availability of commercial-grade databases at 1/10th the cost
Performance & scalability: 5x throughput of standard MySQL and 3x of standard PostgreSQL; scale out with up to 15 read replicas
Availability & durability: Fault-tolerant, self-healing storage; six copies of data across three AZs; continuous backup to S3
Highly secure: Network isolation, encryption at rest and in transit
Fully managed: Managed by RDS; no server provisioning, software patching, setup, configuration, or backups
11. Large relational databases with Amazon Aurora
Scale-out, distributed, multi-tenant architecture
• Your data is replicated 6 ways across 3 AZs
• Continuous backup to Amazon S3 (built for 99.999999999% durability)
• Up to 15 Aurora Replicas with instant crash recovery
[Figure: virtualized, cross-AZ storage layer spanning AZ 1, AZ 2, and AZ 3]
Size for the peak load, or continuously monitor and manually scale up/down
12. Aurora Serverless . . .
Responds to your application load automatically
• Scale capacity with no downtime
• Multi-tenant proxy is highly available
• Scale target has warm buffer pool
• Shuts down when not in use
14. Aurora is used by ¾ of the top 100 AWS customers
Aurora customer adoption
Fastest growing service in AWS history
15. AWS Database Migration Service
Migrating databases to AWS
90,000+ databases migrated
Migrate between on-premises and AWS
Migrate between databases
Data replication for zero-downtime migration
Automated schema conversion
16. Two fundamental areas of focus
“Lift and shift” existing apps to the cloud
Quickly build new apps in the cloud
17. A one-size-fits-all database doesn’t fit anyone
Modern Applications Need Purpose-Built Databases
Users: 1M+
Data volume: TB–PB–EB
Locality: Global
Performance: Milliseconds–microseconds
Request Rate: Millions
Access: Mobile, IoT, devices
Scale: Up-out-in
Economics: Pay as you go
Developer Access: Instant API access
Relational Key-value Document
In-memory Graph Search
18. AWS purpose-built strategy
The right tool for the right job
Relational: Aurora, RDS
Non-Relational: DynamoDB (key-value, document), ElastiCache, Neptune (graph)
Microsoft SQL Server
19. Let’s take a closer look at…
Key-value Graph In-memory
21. Key-value data
• Simple key-value pairs
• Partitioned by keys
• Resilient to failure
• High throughput, low-latency reads and writes
• Consistent performance at scale
Table: Gamers
Primary key: GamerTag; attributes: Level, Points, High Score, Plays
GamerTag      Level   Points   High Score   Plays
Hammer57      21      4050     483610       1722
FluffyDuffy   5       1123     10863        43
Lol777313     14      3075     380500       1307
Jam22Jam      20      3986     478658       1694
ButterZZ_55   7       1530     12547        66
…             …       …        …            …
PUT {
  TableName: "Gamers",
  Item: {
    "GamerTag": "Hammer57",
    "Level": 21,
    "Points": 4050,
    "Score": 483610,
    "Plays": 1722
  }
}
GET {
  TableName: "Gamers",
  Key: {
    "GamerTag": "Hammer57"
  },
  ProjectionExpression: "Points"
}
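The PUT and GET requests on this slide can be approximated with a small in-memory sketch. This is only an illustration of the key-value idea (items stored and fetched by primary key, spread over partitions by a hash of that key), not the DynamoDB API. The class `KeyValueTable`, its methods, and the `partitions` parameter are hypothetical names chosen for this example.

```python
import zlib


class KeyValueTable:
    """Toy in-memory key-value table (hypothetical, not the DynamoDB API)."""

    def __init__(self, key_name, partitions=4):
        self.key_name = key_name
        # Each partition is a plain dict mapping primary key -> item.
        self.partitions = [{} for _ in range(partitions)]

    def _partition_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(str(key).encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def put_item(self, item):
        # Store a copy of the item under its primary key.
        key = item[self.key_name]
        self._partition_for(key)[key] = dict(item)

    def get_item(self, key, projection=None):
        # Look up a single item by primary key; optionally keep only some attributes.
        item = self._partition_for(key).get(key)
        if item is None:
            return None
        if projection is None:
            return dict(item)
        return {name: item[name] for name in projection if name in item}


gamers = KeyValueTable(key_name="GamerTag")
gamers.put_item(
    {"GamerTag": "Hammer57", "Level": 21, "Points": 4050, "Score": 483610, "Plays": 1722}
)
points = gamers.get_item("Hammer57", projection=["Points"])
print(points)
```

Because every read and write touches exactly one partition, the cost of a lookup does not depend on how many items the table holds, which is the intuition behind "consistent performance at scale".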
22. Amazon.com case
“A deep dive on how we were using our existing databases revealed that they were frequently not used for their relational capabilities. About 70 percent of operations were of the key-value kind, where only a primary key was used and a single row would be returned. About 20 percent would return a set of rows, but still operate on only a single table.”
Werner Vogels, “A Decade of Dynamo” blog post
23. Amazon DynamoDB
Fully managed nonrelational database for any scale
Secure
• Encryption at rest and in transit
• Fine-grained access control
• PCI, HIPAA, FIPS 140-2 eligible
High performance
• Fast, consistent performance
• Virtually unlimited throughput
• Virtually unlimited storage
Fully managed
• Maintenance-free
• Serverless
• Auto scaling
• Backup and restore
Global tables
• High-performance, globally distributed applications
• Multi-region redundancy and resiliency
• Easy to set up and no application rewrites required
24. Let’s take a closer look at…
Key-value Graph In-memory
25. Graph data
• Relationships are first-class objects
• Vertices connected by Edges
[Figure: example graph whose vertices (including PRODUCT and SPORT) are connected by edges labeled PURCHASED, FOLLOWS, and KNOWS]
26. Graph use case
Do you know…
Customers who also follow sports purchased…
gremlin> g.V().has('name','sara').as('customer').out('follows').in('follows').out('purchased').
  where(neq('customer')).dedup().by('name').properties('name')
[Figure: customers Amit, Kevin, Bill, Mary, and Sara connected to PRODUCT and SPORT vertices by PURCHASED, FOLLOWS, and KNOWS edges]
// Identify a friend in common and make a recommendation
gremlin> g.V().has('name','mary').as('start').
  both('knows').both('knows').
  where(neq('start')).
  dedup().by('name').properties('name')
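The friend-of-a-friend traversal can also be sketched in plain Python over an in-memory edge list, as a rough way to see what the `both('knows').both('knows')` steps compute. The `Graph` class and `friends_of_friends` function are hypothetical helpers, and the KNOWS edges below are made up for illustration; the slide does not say who knows whom.

```python
from collections import defaultdict


class Graph:
    """Minimal labeled, directed graph (a hypothetical helper, not a Gremlin client)."""

    def __init__(self):
        # _out[label][vertex] and _in[label][vertex] hold sets of neighbours.
        self._out = defaultdict(lambda: defaultdict(set))
        self._in = defaultdict(lambda: defaultdict(set))

    def add_edge(self, source, label, target):
        self._out[label][source].add(target)
        self._in[label][target].add(source)

    def both(self, vertex, label):
        # Neighbours reachable over the label in either direction, like Gremlin's both().
        return self._out[label][vertex] | self._in[label][vertex]


def friends_of_friends(graph, start):
    # Walk two KNOWS hops, drop the starting person, and de-duplicate.
    found = set()
    for friend in graph.both(start, "knows"):
        for candidate in graph.both(friend, "knows"):
            if candidate != start:
                found.add(candidate)
    return sorted(found)


# Made-up KNOWS edges between the people named on the slide.
social = Graph()
social.add_edge("Mary", "knows", "Bill")
social.add_edge("Bill", "knows", "Kevin")
social.add_edge("Sara", "knows", "Mary")
social.add_edge("Amit", "knows", "Sara")

suggestions = friends_of_friends(social, "Mary")
print(suggestions)
```

With these edges the traversal reaches Kevin through Bill and Amit through Sara. A graph database runs the same kind of hop-by-hop navigation without the joins a relational schema would need.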
27. Highly connected data best represented in a graph
Relational model
Foreign keys used to represent relationships
Queries can involve nesting & complex joins
Performance can degrade as datasets grow
Graph model
Relationships are first-class citizens
Write queries that navigate the graph
Results returned quickly, even on large datasets
28. Amazon Neptune
Fully managed graph database
Fast & Scalable: Store billions of relationships; query with millisecond latency
Reliable: Six replicas of your data across three AZs with full backup and restore
Flexible: Build powerful queries with Gremlin and SPARQL
Open Standards: Supports Apache TinkerPop & W3C RDF graph models
29. Let’s take a closer look at…
Key-value Graph In-memory
30. Amazon ElastiCache
Fully managed, Redis- or Memcached-compatible, low-latency, in-memory data store
Fully Managed: AWS manages all hardware and software setup, configuration, monitoring
Extreme Performance: In-memory data store and cache for sub-millisecond response times
Easily Scalable: Read scaling with replicas. Write and memory scaling with sharding. Non-disruptive scaling
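The two scaling ideas on this slide (replicas for reads, shards for writes and memory) can be sketched with a toy in-memory cache. This is an assumption-laden illustration, not how ElastiCache is implemented: `Shard` and `ShardedCache` are hypothetical classes, replication is done with an eager in-process copy, and keys are routed with simple modulo hashing, whereas real clusters typically use more elaborate schemes.

```python
import itertools
import zlib


class Shard:
    """One shard: a primary that takes writes plus replicas that serve reads."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        self.primary[key] = value
        # Simplification: copy to every replica immediately, in process.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Round-robin across replicas so read load is spread out.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


class ShardedCache:
    """Routes each key to one shard, so adding shards adds write and memory capacity."""

    def __init__(self, shard_count=3, replicas_per_shard=2):
        self.shards = [Shard(replicas_per_shard) for _ in range(shard_count)]

    def _shard_for(self, key):
        # crc32 gives a stable hash across runs, unlike built-in hash() for str.
        return self.shards[zlib.crc32(key.encode("utf-8")) % len(self.shards)]

    def set(self, key, value):
        self._shard_for(key).write(key, value)

    def get(self, key):
        return self._shard_for(key).read(key)


cache = ShardedCache()
cache.set("session:42", {"user": "Hammer57"})
session = cache.get("session:42")
missing = cache.get("session:unknown")
print(session, missing)
```

Adding replicas to a shard raises read capacity without moving data, while adding shards splits the key space so that each node holds less data and takes fewer writes.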
31. Data models and common use cases
Relational
• Use cases: ERP, medical records, CRM, finance
• Services: Amazon Aurora, Amazon RDS, Amazon Redshift
Key-value
• Use cases: Real-time bidding, shopping cart, IoT device tracking
• Services: Amazon DynamoDB
Document
• Use cases: Content management, personalization, mobile
• Services: Amazon DynamoDB, Amazon DocumentDB
In-memory
• Use cases: Leaderboards, real-time analytics, caching
• Services: Amazon ElastiCache
Graph
• Use cases: Fraud detection, social networking, recommendation engine
• Services: Amazon Neptune
Search
• Use cases: Product catalog, help/FAQs, full-text
• Services: Amazon Elasticsearch
32. Airbnb uses different databases based on the purpose
User search history: Amazon DynamoDB
• Massive data volume
• Need quick lookups for personalized search
Session state: Amazon ElastiCache
• In-memory store for submillisecond site rendering
Relational data: Amazon RDS
• Referential integrity
• Primary transactional database
33. CHALLENGE
Wanted to enable anyone to learn a language for free.
SOLUTION
Purpose-built databases from AWS:
• DynamoDB: 31B items tracking which language exercises were completed
• Aurora: primary transactional database for user data
• ElastiCache: instant access to common words and phrases
Result:
More people learning a language on Duolingo than in the entire US school system
300M total users
7B exercises per month
35. Amazon Redshift
Highly scalable cloud data warehouse at 10x the performance and 1/10th the cost of traditional data warehouses
Fast: Delivers fast results for all types of workloads
Cost-effective: No upfront costs, start small, and pay as you go
Integrated: Integrated with S3 data lakes, AWS services and third-party tools
Secure: Audit everything; encrypt data end-to-end; extensive certification and compliance
Simple: Create and start using a data warehouse in minutes
Scalable: Gigabytes to petabytes to exabytes
36. Data models and common use cases
Relational
• Referential integrity, ACID transactions, schema-on-write
• Use cases: ERP, medical records, CRM, finance
• Services: Amazon Aurora, Amazon RDS, Amazon Redshift
Key-value
• Low-latency key lookups with high throughput and fast ingestion of data
• Use cases: Real-time bidding, shopping cart, IoT device tracking
• Services: Amazon DynamoDB
Document
• Indexing and storing documents with support for query on any attribute
• Use cases: Content management, personalization, mobile
• Services: Amazon DynamoDB, Amazon DocumentDB
In-memory
• Microsecond latency, key-based queries, and specialized data structures
• Use cases: Leaderboards, real-time analytics, caching
• Services: Amazon ElastiCache for Redis & Memcached
Graph
• Creating and navigating relations between data easily and quickly
• Use cases: Fraud detection, social networking, recommendation engine
• Services: Amazon Neptune
Search
• Indexing and searching semistructured logs and data
• Use cases: Product catalog, help/FAQs, full-text
• Services: Amazon Elasticsearch