This session will begin with an introduction to non-relational (NoSQL) databases and compare them with relational (SQL) databases. Learn the fundamentals of Amazon DynamoDB, a fully managed NoSQL database service, and see the DynamoDB console first-hand. See a walk-through demo of building a serverless web application using this high-performance key-value and JSON document store.
1. Introduction to Amazon DynamoDB
Sean Shriver
NoSQL Solutions Architect
AWS Solution Architecture
15 March 2017
2. Agenda
• Brief history of data processing
• Relational (SQL) vs. nonrelational (NoSQL)
• NoSQL solutions on AWS
• Amazon DynamoDB’s fully managed features
• Demo – serverless applications
3. Data volume since 2010
• 90% of stored data generated in last 2 years
• 1 terabyte of data in 2010 equals 6.5 petabytes today
• Linear correlation between data pressure and technical innovation
• No reason these trends will not continue over time
9. SQL (Relational)
Figure labels: Columns, Rows, Primary Key Index
Products
  Product ID | Type  | Desc.
  1          | Book  | One of 2 major …
  2          | Album | The Partitas
  3          | Movie | Chaplin’s first …
  Price column (values as extracted): $11.50, $8.99, $14.95
Books
  Book ID | Title   | Author | Date
  1       | Odyssey | Homer  | 1871
10. SQL (Relational)
Build of slide 9 (Products and Books tables unchanged), adding:
Movies
  Movie ID | Title   | Genre         | Director
  3        | The Kid | Drama, Comedy | Chaplin
11. SQL (Relational)
Build of slide 10 (Products, Books, and Movies tables unchanged), adding:
Albums
  Album ID | Title      | Artist
  2        | 6 Partitas | Bach
12. SQL (Relational)
Figure labels: Columns, Rows, Primary Key Index
Products
  Product ID | Type  | Desc.
  1          | Book  | One of 2 major …
  2          | Album | The Partitas
  3          | Movie | Chaplin’s first …
  Price column (values as extracted): $11.50, $8.99, $14.95
Books
  Book ID | Title   | Author | Date
  1       | Odyssey | Homer  | 1871
Albums
  Album ID | Title      | Artist
  2        | 6 Partitas | Bach
Movies
  Movie ID | Title   | Genre         | Director
  3        | The Kid | Drama, Comedy | Chaplin
Tracks
  Album ID | Track ID | Track
  2        | 1        | Partita No. 1
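The normalized layout on this slide can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names are lowercase guesses at the slide's labels, the Price and Desc. columns are omitted, and SQLite stands in for whatever relational engine the speaker had in mind. The point is that answering "what is on album 2?" takes a join across three tables.

```python
import sqlite3

# In-memory database; schema names are assumptions based on the slide labels.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE albums (
    album_id INTEGER PRIMARY KEY REFERENCES products(product_id),
    title TEXT,
    artist TEXT
);
CREATE TABLE tracks (
    album_id INTEGER REFERENCES albums(album_id),
    track_id INTEGER,
    track TEXT,
    PRIMARY KEY (album_id, track_id)
);
INSERT INTO products VALUES (1, 'Book'), (2, 'Album'), (3, 'Movie');
INSERT INTO albums VALUES (2, '6 Partitas', 'Bach');
INSERT INTO tracks VALUES (2, 1, 'Partita No. 1');
""")

# Reassembling one product requires joining the normalized tables.
rows = conn.execute(
    """
    SELECT p.type, a.title, a.artist, t.track
    FROM products p
    JOIN albums a ON a.album_id = p.product_id
    JOIN tracks t ON t.album_id = a.album_id
    WHERE p.product_id = ?
    """,
    (2,),
).fetchall()
print(rows)
```

Each fact is stored once, which saves storage, but every read of a full product pays for the joins.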
13. SQL (Relational) vs. NoSQL (Non-relational)
NoSQL: Products table
Figure labels: Primary Key (Partition Key + Sort Key), Items, Attributes
Schema is defined per item
  Product ID (partition key) | Type (sort key)   | Attributes
  1                          | Book ID           | Odyssey; Homer; 1871
  2                          | Album ID          | 6 Partitas; Bach
  2                          | Album ID:Track ID | Partita No. 1
  3                          | Movie ID          | The Kid; Drama, Comedy; Chaplin
SQL: the Products, Books, Albums, Movies, and Tracks tables from slide 12
NoSQL design optimizes for compute instead of storage
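The single-table layout on the NoSQL side can be sketched in plain Python. This is a toy model, not the DynamoDB API: the attribute names (ProductID, Type, Title, and so on) and the query helper are hypothetical, chosen to mirror the slide. It shows items that share a partition key living together, each with its own set of attributes.

```python
# One "table" holding every product type; each item defines its own attributes.
products = [
    {"ProductID": 1, "Type": "Book ID", "Title": "Odyssey", "Author": "Homer", "Date": 1871},
    {"ProductID": 2, "Type": "Album ID", "Title": "6 Partitas", "Artist": "Bach"},
    {"ProductID": 2, "Type": "Album ID:Track ID", "Track": "Partita No. 1"},
    {"ProductID": 3, "Type": "Movie ID", "Title": "The Kid",
     "Genre": "Drama, Comedy", "Director": "Chaplin"},
]


def query(items, product_id, type_prefix=""):
    """Return items for one partition key, ordered by sort key.

    Hypothetical helper: filters on the partition key (ProductID) and an
    optional sort-key prefix (Type), then sorts by the sort key.
    """
    matches = (
        item for item in items
        if item["ProductID"] == product_id and item["Type"].startswith(type_prefix)
    )
    return sorted(matches, key=lambda item: item["Type"])


# The album and its track come back from a single partition, with no join.
for item in query(products, 2):
    print(item)
```

Compared with the relational layout, the read is a single lookup by partition key, at the cost of storing data in the shape the application reads it.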
14. Why NoSQL?
SQL                   | NoSQL
Optimized for storage | Optimized for compute
Normalized/relational | Denormalized/hierarchical
Ad hoc queries        | Instantiated views
Scale vertically      | Scale horizontally
Good for OLAP         | Built for OLTP at scale
15. NoSQL solutions using Amazon EC2 and EBS
• DB hosted on-premises
• DB hosted on Amazon EC2
16. The Forrester Wave™: Big Data NoSQL, Q3 2016
The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
22. High availability and durability
WRITES
• Replicated continuously to 3 AZs
• Persisted to disk (custom SSD)
READS
• Strongly or eventually consistent
• No latency trade-off
Designed to support 99.99% availability
Built for high durability
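The choice between strongly and eventually consistent reads is made per request. The sketch below only builds the parameters for a GetItem call, using the real TableName, Key, and ConsistentRead fields; the helper name, the "Products" table, and the key attribute names are assumptions for illustration. The resulting dictionary could be passed to a boto3 DynamoDB client as client.get_item(**request).

```python
def build_get_item_request(table_name, key, strongly_consistent=False):
    """Build GetItem parameters (hypothetical helper, no AWS call is made).

    ConsistentRead=False (the default) requests an eventually consistent read;
    ConsistentRead=True requests a strongly consistent read.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


# Key in DynamoDB's typed JSON format; attribute names are assumed.
book_key = {"ProductID": {"N": "1"}, "Type": {"S": "Book ID"}}

eventual = build_get_item_request("Products", book_key)
strong = build_get_item_request("Products", book_key, strongly_consistent=True)
print(eventual["ConsistentRead"], strong["ConsistentRead"])
```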
25. Major League Baseball Fields Big Data, Excitement with Amazon DynamoDB
MLBAM (MLB Advanced Media) is a full service solutions provider, operating a powerful content delivery platform.
“For the first time, we can measure things we’ve never been able to measure before.”
Joe Inzerillo, Executive Vice President and CTO, MLBAM
• MLBAM can scale to support many games on a single day.
• Amazon DynamoDB powers queries and supports the fast data retrieval required.
• MLBAM distributes 25,000 live events annually and 10 million streams daily.
26. Redfin Is Revolutionizing Home Buying and Selling with Amazon DynamoDB
Redfin is a full-service real estate company with local agents and online tools to help people buy & sell homes.
“We have billions of records on DynamoDB being refreshed daily or hourly or even by seconds.”
Yong Huang, Director, Big Data Analytics, Redfin
• Redfin provides property and agent details and ratings through its websites and apps.
• With DynamoDB, latency for “similar” properties improved from 2 seconds to just 12 milliseconds.
• Redfin stores and processes five billion items in DynamoDB.
27. Duolingo Scales to Store Over 31 Billion Items Using DynamoDB
Duolingo is a free language learning service where users help translate the web and rate translations.
“Using AWS, we can handle traffic spikes that expand up to seven times the amount of normal traffic.”
Severin Hacker, CTO, Duolingo
• Duolingo stores data about each user to be able to generate personalized lessons.
• The MySQL database couldn’t keep up with Duolingo’s rate of growth.
• By using the scalable database service, data store capacity increased from 100 million to more than four billion items.
• Duolingo has the capacity to scale to support over 8 million active users.
28. Nexon Scales Mobile Gaming with Amazon DynamoDB
Nexon is a leading South Korean video game developer and a pioneer in the world of interactive entertainment.
“By using AWS, we decreased our initial investment costs, and only pay for what we use.”
Chunghoon Ryu, Department Manager, Nexon
• Nexon used Amazon DynamoDB as its primary game database for a new blockbuster mobile game, HIT.
• HIT became the #1 mobile game in Korea within the first day of launch and has more than 2M registered users.
• Nexon’s HIT leverages DynamoDB to deliver steady latency of less than 10 ms and a fantastic mobile gaming experience for 170,000 concurrent players.
29. Scaling high-velocity use cases with DynamoDB
Ad Tech, Gaming, Mobile, IoT, Web
34. Global secondary index (GSI)
• Alternate partition (+sort) key
• Index is across all table partition keys
• RCU/WCU provisioned separately for GSIs
• Online indexing
Table
  A1 (partition), A2, A3, A4, A5
GSIs
  KEYS_ONLY:  A2 (part.), A1 (table key)
  INCLUDE A3: A5 (part.), A4 (sort), A1 (table key), A3 (projected)
  ALL:        A4 (part.), A5 (sort), A1 (table key), A2 (projected), A3 (projected)
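The three projection types on this slide (KEYS_ONLY, INCLUDE, ALL) decide which attributes are copied into each index entry. The following is a minimal sketch in plain Python rather than DynamoDB itself: the project_item helper and the sample values are hypothetical, while the attribute names A1 to A5 follow the slide.

```python
def project_item(item, table_key, index_key, projection, include=()):
    """Return the attributes an index entry would carry for one item.

    Hypothetical helper. Returns None when the item lacks an index key
    attribute, since such an item would not appear in the index.
    """
    if any(attr not in item for attr in index_key):
        return None
    if projection == "ALL":
        names = list(item)
    elif projection == "INCLUDE":
        names = [*index_key, *table_key, *include]
    elif projection == "KEYS_ONLY":
        names = [*index_key, *table_key]
    else:
        raise ValueError(f"unknown projection type: {projection}")
    return {name: item[name] for name in names if name in item}


# Sample item; A1 is the table's partition key, values are made up.
item = {"A1": "id-1", "A2": "x", "A3": 3, "A4": "s", "A5": "p"}

keys_only = project_item(item, ["A1"], ["A2"], "KEYS_ONLY")
include_a3 = project_item(item, ["A1"], ["A5", "A4"], "INCLUDE", include=["A3"])
everything = project_item(item, ["A1"], ["A4", "A5"], "ALL")
print(keys_only, include_a3, everything, sep="\n")
```

A narrower projection keeps the index smaller, but a query that needs attributes outside it has to go back to the table.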
35. Local secondary index (LSI)
• Alternate sort key attribute
• Index is local to a partition key
• 10 GB max per partition key, i.e., LSIs limit the number of sort keys!
Table
  A1 (partition), A2 (sort), A3, A4, A5
LSIs
  KEYS_ONLY:  A1 (partition), A3 (sort), A2 (table key)
  INCLUDE A3: A1 (partition), A4 (sort), A2 (table key), A3 (projected)
  ALL:        A1 (partition), A5 (sort), A2 (table key), A3 (projected), A4 (projected)
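An LSI keeps the table's partition key and swaps in a different sort key, so it reorders items within each partition rather than across the table. Here is a toy sketch of that idea in plain Python; the build_local_index helper and the sample values are hypothetical, and the attribute names follow the slide (A1 partition key, A2 table sort key, A4 alternate sort key).

```python
from collections import defaultdict


def build_local_index(items, partition_key, alt_sort_key):
    """Group items by partition key and order each group by the alternate sort key.

    Hypothetical helper; items without the alternate sort key are left out.
    """
    index = defaultdict(list)
    for item in items:
        if alt_sort_key in item:
            index[item[partition_key]].append(item)
    for collection in index.values():
        collection.sort(key=lambda entry: entry[alt_sort_key])
    return dict(index)


items = [
    {"A1": "p1", "A2": "b", "A4": 20},
    {"A1": "p1", "A2": "a", "A4": 30},
    {"A1": "p1", "A2": "c", "A4": 10},
    {"A1": "p2", "A2": "a", "A4": 5},
]

lsi = build_local_index(items, "A1", "A4")
print([entry["A2"] for entry in lsi["p1"]])
```

Because every entry for a partition key stays together, the 10 GB limit on the slide caps how many items one partition key value can hold.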
36. Integration capabilities
DynamoDB Triggers
• Implemented as AWS Lambda functions
• Your code scales automatically
• Java, Node.js, and Python
DynamoDB Streams
• Stream of table updates
• Asynchronous
• Exactly once
• Strictly ordered
• 24-hr lifetime per item
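A trigger is a Lambda function that receives batches of stream records. The sketch below assumes the standard DynamoDB Streams event shape (a Records list whose entries carry eventName and a dynamodb section with Keys and, depending on the stream view type, NewImage). The handler name, the sample event, and the attribute names are made up for illustration.

```python
def handler(event, context):
    """Hypothetical Lambda entry point: count and log stream records by type."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        event_name = record["eventName"]  # INSERT, MODIFY, or REMOVE
        keys = record["dynamodb"]["Keys"]
        # NewImage is absent for REMOVE events and for key-only stream views.
        new_image = record["dynamodb"].get("NewImage")
        counts[event_name] = counts.get(event_name, 0) + 1
        print(event_name, keys, new_image)
    return counts


# Sample event for local testing; values use DynamoDB's typed JSON format.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"ProductID": {"N": "1"}, "Type": {"S": "Book ID"}},
                "NewImage": {
                    "ProductID": {"N": "1"},
                    "Type": {"S": "Book ID"},
                    "Title": {"S": "Odyssey"},
                },
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"ProductID": {"N": "3"}, "Type": {"S": "Movie ID"}}},
        },
    ]
}

result = handler(sample_event, None)
```

Calling the handler with a hand-built event like this is a quick way to check the logic before attaching the function to a real stream.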
37. Integration capabilities
• Amazon Elasticsearch Service integration
• Full-text queries
  Add search to mobile apps
  Monitor IoT sensor status codes
  App telemetry pattern discovery using regular expressions
• Fine-grained access control by using AWS Identity and Access Management (IAM)
• Table-, item-, and attribute-level access control
38. Advanced topics in DynamoDB
• Design patterns and best practices
• Data modeling
• Understanding Partitions
• DynamoDB Scaling
47. DynamoDB Pricing & Free Tier
• Free Tier
  25 GB of storage
  25 reads per second
  25 writes per second
• Pricing for additional usage in US East (N. Virginia)
  $0.25 per GB per month
  Write throughput: $0.0065 per hour for every 10 units of write capacity
  Read throughput: $0.0065 per hour for every 50 units of read capacity
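The prices on this slide can be turned into a rough monthly estimate. The sketch below rests on several assumptions that are not on the slide: a 730-hour month, free-tier reads and writes per second treated as read and write capacity units, and throughput charges prorated linearly rather than billed in whole blocks of 10 or 50 units. The function name is hypothetical and the rates are the 2017 figures quoted above.

```python
import math

# Figures from the slide (US East, N. Virginia, as of the talk).
FREE_STORAGE_GB = 25
FREE_READ_UNITS = 25
FREE_WRITE_UNITS = 25
STORAGE_PRICE_PER_GB_MONTH = 0.25
WRITE_PRICE_PER_HOUR_PER_10_UNITS = 0.0065
READ_PRICE_PER_HOUR_PER_50_UNITS = 0.0065

HOURS_PER_MONTH = 730  # assumption: average month length in hours


def estimate_monthly_cost(storage_gb, read_units, write_units, hours=HOURS_PER_MONTH):
    """Estimate monthly cost in USD beyond the free tier (hypothetical helper)."""
    billable_gb = max(0, storage_gb - FREE_STORAGE_GB)
    billable_reads = max(0, read_units - FREE_READ_UNITS)
    billable_writes = max(0, write_units - FREE_WRITE_UNITS)

    storage = billable_gb * STORAGE_PRICE_PER_GB_MONTH
    # Assumption: throughput is prorated instead of rounded up to whole blocks.
    reads = billable_reads / 50 * READ_PRICE_PER_HOUR_PER_50_UNITS * hours
    writes = billable_writes / 10 * WRITE_PRICE_PER_HOUR_PER_10_UNITS * hours
    return {"storage": storage, "reads": reads, "writes": writes,
            "total": storage + reads + writes}


# 125 GB stored, 525 read units, 125 write units provisioned.
cost = estimate_monthly_cost(125, 525, 125)
print({name: round(value, 2) for name, value in cost.items()})
```

At these rates, 500 billable read units cost the same as 100 billable write units, which reflects writes being priced at five times reads per unit.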