Más contenido relacionado La actualidad más candente (20) Similar a AWS Database and Analytics State of the Union (20) Más de Amazon Web Services (20) AWS Database and Analytics State of the Union1. AWS Database and Analytics
State of the Union
D i c k s o n Yu e
S o l u t i o n s A r c h i t e c t
Level 200
2. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
What to Expect
the portfolio of
AWS Database
and Analytics
services
1
of the volume
and scale at which
customers are using
AWS services
2
about common
customer use cases
and architectures
3
when to use
which services
4
3. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Purpose-built
databases & analytic
engines. Right tool
for the right job.
4. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Databases and Analytics
B r o a d a n d d e e p p o r t f o l i o , p u r p o s e - b u i l t f o r b u i l d e r s
Data Lake
S3/Glacier AWS Glue
(ETL & Data Catalog)
Data Movement
Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams
Non-Relational
Databases
DynamoDB
ElastiCache
(Redis, Memcached)
Neptune
(Graph)
Analytics
DW | Big Data Processing | Interactive
Redshift EM
R
Athena
Kinesis
Analytics
Elasticsearc
h Service
Real-time
Relational Databases
RDS
Aurora
Business Intelligence & Machine Learning
QuickSight SageMaker Comprehend
5. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Accelerating the Pace of Innovation
*Projected number of releases to year-end
features
in 2017
new database and
analytics services
GA in 2017
2015 2016 2017
150
210*
100
6. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Database Services
Aurora RDS
AWS Database Migration Service
Relational
7. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon RDS
Managed relational database service with a choice of six popular database engines
Easy to administer
No need for infrastructure
provisioning, installing and
maintaining DB software
Highly scalable
Scale database compute
and storage with a few
clicks with no application
downtime
Available & Durable
Automatic Multi-AZ
data replication;
automated backup,
snapshots, failover
Fast & Secure
SSD storage and
guaranteed provisioned I/O;
data encryption at rest and
in transit
8. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Old World Commercial Databases
Lock-inProprietary Punitive
licensing
Very expensive You’ve
got mail
9. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Moving to Open Database Engines
10. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Moving to Open Database Engines
11. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Aurora
MySQL and PostgreSQL compatible relational database built for the cloud
Performance and availability of commercial-grade databases at 1/10th the cost
Performance &
scalability
Availability &
durability
Highly Secure Fully managed
5x throughput of standard
MySQL and 3x of standard
PostgreSQL; scale-out up to
15 read replicas
Fault-tolerant, self-healing
storage; six copies of data
across three AZs; continuous
backup to S3
Network isolation,
encryption at
rest/transit
Managed by RDS: no
hardware provisioning,
software patching, setup,
configuration or backups
12. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Up to 15 read replicas across three AZs
Auto-scale new read replicas
Seamless recovery from read replica failures
Availability
Zone 1
Scale out read performance
Availability
Zone 2
Availability
Zone 3
Amazon Aurora—High Performance
S c a l e o u t t o m i l l i o n s o f r e a d s p e r s e c o n d
13. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Zero application downtime from ANY instance failure
Zero application downtime from ANY AZ failure
Faster write performance and higher scale
Sign up for single-region multi-master preview today;
multi-region multi-master coming in 2018
Aurora Multi-Master (Preview)
F i r s t r e l a t i o n a l D B s e r v i c e t o s c a l e o u t r e a d s a n d wr i t e s ,
a c r o s s m u l t i p l e d a t a c e n t e r s
Availability
Zone 1
Scale out both reads and writes
Availability
Zone 2
Availability
Zone 3
14. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Starts up on demand, shuts down when not in use
Automatically scales with no instances to manage
Pay per second for the database capacity you use
Aurora Serverless (Preview)
O n - d e m a n d , a u t o - s c a l i n g d a t a b a s e f o r a p p l i c a t i o n s w i t h v a r i a b l e
w o r k l o a d s
Warm Capacity
Pool
Application
15. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Database Migration Service
16. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Database Services
Database Migration Service
Non-relational
DynamoDB ElastiCache Neptune
Key value | Document Graph
Non-relational
17. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon DynamoDB
W e n e e d e d t o a d a p t t o p o w e r A m a z o n . c o m
Needed to power Amazon.com
Required massive scalability and reliability
DynamoDB designed to meet this need
2017
Hall of Fame
18. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon DynamoDB
F a s t a n d f l e x i b l e N o S Q L d a t a b a s e s e r v i c e f o r a n y s c a l e
Fast, consistent
performance
Highly scalable Fully managed Business critical
reliability
Consistent single-digit millisecond
latency; DAX in-memory
performance reduces response
times to microseconds
Auto-scaling to hundreds
of terabytes of data that
serve millions of requests
per second
Automatic provisioning,
infrastructure
management, scaling,
and configuration with
zero downtime
Data is replicated across
fault tolerant Availability
Zones, with fine-grained
access control
19. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
VPC
Endpoints
April 2017
Auto
Scaling
June 2017
DynamoDB
Accelerator
(DAX)
April 2017
Time to
Live
(TTL)
February 2017
Global Tables
(GA)
On-demand
Backup (GA)
Amazon DynamoDB
D eliver ing on c us tomer needs
Encryption at rest
(Coming Soon)
20. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
DynamoDB Global Tables (GA)
F i r s t f u l l y m a n a g e d , m u l t i - m a s t e r, m u l t i - r e g i o n d a t a b a s e
Build high performance, globally distributed applications
Low latency reads & writes to locally available tables
Disaster proof with multi-region redundancy
Easy to set up and no application re-writes required
Globally dispersed users
Global Table
21. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon DynamoDB Powers
Backup and restore on mobile application for 300M users
300+ PBs in AWS, 850 TBs in DynamoDB, 130M daily API requests
Migrated from Cassandra to DynamoDB
Consistent performance and 70% cost savings (TCO)
DynamoDB provided consistent high
performance at a drastically lower cost
than Cassandra.
Seongkyu Kim, Server Engineer, Samsung Electronics
“
”
22. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon DynamoDB Powers
Herd is the database system powering customer purchases
Migrated from Oracle to DynamoDB
Extreme scale to handle millions of requests per second
Workflow processing dropped from 1s to 100ms
Prime Day 2017: DynamoDB handled peak of 12.9M requests per
second (an increase from 6.3M) with no increase in latency
Work Item Storage
Partition Assigner
Timer Router
Tim
er
No
des
ode
sdesNo
des
Timer
Hosts
View Router
Tim
er
No
des
No
des
ode
s
No
des
View
Hosts
DynamoDB
Herd is a mission-critical system for Amazon, and we
are extremely confident in DynamoDB as the
technology on which to run it.
Mike Thomas, Software Development Manager, Amazon Herd
“
”
23. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon ElastiCache
Extreme
performance
Secure & hardened Easily scalable
Highly available &
reliable
In-memory data store and
cache using optimized
stack to deliver sub-
millisecond response times
VPC for cluster isolation,
encryption at rest/transit,
and HIPAA compliance
Read scaling with
replicas; write and
memory scaling with
sharding; non-disruptive
scaling
Multi-AZ with automatic
failover
Managed, in-memory data store service
Redis or Memcached to power real-time apps with sub-millisecond latency
24. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Graph Use Cases
Social News Feed Recommendations Retail Fraud Detection
25. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
H ighly C onnect ed D at a B est R epresent ed in a Graph
Relational model
Foreign keys used to represent relationships
Queries can involve nesting & complex joins
Performance can degrade as datasets grow
Graph model
Relationships are first-order citizens
Easy to write queries that navigate the graph
Results returned quickly, even on large datasets
26. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Neptune (Preview)
F u l l y m a n a g e d g r a p h d a t a b a s e
Fast & Scalable ReliableOpen
Store billions of relationships;
query with millisecond latency
Six replicas of your
data across three AZs
with full backup and
restore
Build powerful
queries easily with
Gremlin and
SPARQL
Supports Apache
TinkerPop & W3C
RDF graph models
Gremlin
SPARQL
Easy
27. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Big Data and Analytic Services
A n y a n a l y t i c w o r k l o a d , a n y s c a l e , a t t h e l o w e s t p o s s i b l e c o s t
Insights
Analytics
Data Lake
Data Movement
QuickSight SageMaker
Glue
(ETL & Data Catalog)
S3/Glacier
(Storage)
Redshift
+Spectrum
EM
R
Athena
Elasticsearch
service
Kinesis Data Analytics
Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams
Real-time
Comprehend
DW Big data processing Interactive
28. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Traditionally, Analytics Used to Look Like
This
OLTP ERP CRM LOB
Data Warehouse
Business Intelligence Relational data
TBs-PBs scale
Schema defined prior to data load
Operational reporting and ad hoc
Large initial capex + $10k-$50k / TB / Year
29. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Data Lakes Extend the Traditional
Approach
Relational and non-relational data
TBs-EBs scale
Schema defined during analysis
Diverse analytical engines to gain insights
Designed for low-cost storage and analytics
OLTP ERP CRM LOB
Data Warehouse
Business
Intelligence
Data Lake
100110000100101011100
101010111001010100001
011111011010
0011110010110010110
0100011000010
Devices Web Sensor
s
Social
Catalog
Machine
Learning
DW
Queries
Big data
processing
Interactive Real-time
30. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Snowball
Snowmobile Kinesis
Data Firehose
Kinesis
Data Streams
S3
Most ways to bring data in
Unmatched durability and availability at EB scale
Best security, compliance, and audit capabilities
Run any analytics on the same data without movement
Scale storage and compute independently
Store data at $0.023 / month; Query for $0.05/GB scanned
Redshift
EMR
Athena Kinesis
Elasticsearch Service
Data Lakes on AWS
Kinesis
Video Streams
AI Services
31. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Kinesis—Real Time
Easily collect, process, and analyze video and data streams in real time
Capture, process,
and store video
streams for analytics
Load data streams
into AWS data stores
Analyze data streams
with SQL
Build custom
applications that
analyze data streams
Kinesis Video Streams Kinesis Data Streams Kinesis Data Firehose Kinesis Data Analytics
SQL
New
32. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Glue—Serverless Data catalog & ETL
service
Data Catalog
ETL Job
authoring
Discover data and
extract schema
Auto-generates
customizable ETL code
in Python and Spark
Automatically discovers data and stores schema
Data searchable, and available for ETL
Generates customizable code
Schedules and runs your ETL jobs
Serverless
33. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Athena—Interactive Analysis
Interactive query service to analyze data in Amazon S3 using standard SQL
No infrastructure to set up or manage and no data to load
Ability to run SQL queries on data archived in Glacier (Coming soon)
$ SQL
Query Instantly
Zero setup cost; just
point to S3 and start
querying
Pay per query
Pay only for queries run;
save 30–90% on per-
query costs through
compression
Open
ANSI SQL interface,
JDBC/ODBC drivers, multiple
formats, compression types,
and complex joins and data
types
Easy
Serverless: zero
infrastructure, zero
administration
Integrated with Amazon
QuickSight
34. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Redshift—Data Warehousing
Fast, powerful, simple, and fully managed data warehouse at 1/10 the cost
Massively parallel, scale from gigabytes to petabytes
Fast at scale
Columnar storage
technology to improve I/O
efficiency and scale query
performance
$
Inexpensive
As low as $1,000 per
terabyte per year, 1/10th
the cost of traditional data
warehouse solutions; start
at $0.25 per hour
Open file formats Secure
Audit everything; encrypt
data end-to-end;
extensive certification and
compliance
Analyze optimized data
formats on the latest SSD,
and all open data formats in
Amazon S3
35. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Redshift Spectrum
E x t e n d t h e d a t a w a r e h o u s e t o e x a b y t e s o f d a t a i n S 3 d a t a l a k e
S3 data lakeRedshift data
Redshift Spectrum
query engine
Exabyte Redshift SQL queries against Amazon S3
Join data across Redshift and S3
Scale compute and storage separately
Stable query performance and unlimited concurrency
CSV, ORC, Grok, Avro, & Parquet data formats
Pay only for the amount of data scanned
36. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon EMR—Big Data Processing
Analytics and ML at scale
Nineteen open-source projects: Apache Hadoop, Spark, HBase, Presto, and more
Enterprise-grade security
$
Latest versions
Updated with the latest
open source frameworks
within 30 days of release
Low cost
Flexible billing with per-
second billing, EC2 spot,
Reserved Instances and
auto-scaling to reduce
costs 50–80%
Use S3 storage
Process data directly in
the S3 data lake securely
with high performance
using the EMRFS
connector
Easy
Launch fully managed
Hadoop & Spark in minutes;
no cluster setup, node
provisioning, cluster tuning
Data Lake
100110000100101011100
1010101110010101000
00111100101100101
010001100001
37. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Elasticsearch Service
Easy to Use
Fully-managed.
Deploy production-ready
clusters in minutes
Open
Direct access to
Elasticsearch open-source
APIs; supports Logstash
and Kibana
Secure
Secure access with VPC
to keep all traffic within
AWS network
Available
Zone awareness replicates
data between two AZs;
automatically monitors &
replaces failed nodes
Easy to deploy, secure, operate, and scale Elasticsearch
Customers use Elasticsearch for log analytics, full-text search & application monitoring
38. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Expedia Migrated from SQL Server to
AWS
Needed real-time analysis of lodging market pricing
Migrated from Microsoft SQL Server
Use Aurora, Amazon Redshift, Kinesis, and ElastiCache
Process high-volume pricing and availability data
Query execution times reduced 80–95%
Database has >15B rows and continues to grow
Aurora
ElastiCach
e
(Redis)
Redshift
Kinesis
Firehose
S3
Historical
queries on
up to 2 years
of data
Operational
queries of real-
time data
Staging near
real-time data
Join / compare
events
Real-time
streams of
lodging
market data
Ingest
multiple data
streams
Reference data
on-premises
EC2
39. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon QuickSight
Fast, easy to use, serverless analytics at 1/10th the cost of traditional BI
Empower
everyone
Seamless
connectivity
Fast analysis Serverless
40. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Enabling All Types of Data-Driven Analytic
Retrospective
analysis and
reporting
Here-and-now
real-time processing
and dashboards
Predictions
to enable smart
applications
41. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
End-to-End
Machine Learning
Platform
Zero setup Flexible Model
Training
Pay by the
second
Amazon SageMaker (GA)
The quickest and easiest way to get ML models from idea to production
$
42. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Comprehend example
43. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
When to Use Which Services
Situation Solution
Existing application Use your existing engine on RDS
• MySQL Amazon Aurora, RDS for MySQL
• PostgreSQL Amazon Aurora, RDS for PostgreSQL
• Oracle Amazon Aurora, RDS for Oracle
• SQL Server Amazon Aurora, RDS for SQL Server
• MariaDB Amazon Aurora, RDS for MariaDB
New application • If you can avoid relational features DynamoDB
• If you need relational features Amazon Aurora
In-memory store/cache • Amazon ElastiCache
Data Warehouse & BI • Amazon Redshift, Amazon Spectrum, and Amazon QuickSight
Interactive, serverless analysis of data in S3 • Amazon Athena and Amazon QuickSight
Apache Spark, Apache Hadoop, Apache Hbase • Amazon EMR
Log analytics, operational monitoring and search • Amazon Elasticsearch Service and Amazon Kinesis
Natural language processing • Amazon Comprehend
Machine Learning • Amazon SageMaker
• Apache Spark Amazon EMR