© 2019, Amazon Web Services, Inc. or its Affiliates.
Chandra Kapireddy
Specialist SA, Data & Analytics
Data Lifecycle Management
from ingest to archive
Agenda
• What’s new with data today
• How AWS can help
• Data storage services
Break (15 min)
• Data transfer (& hybrid storage) services
• Data lakes, database and analytic services
• Q/A
What’s new with data today?
Data is a strategic asset
for every organization
The world’s most valuable
resource is no longer oil,
but data.
https://www.networkworld.com/article/3325397/idc-expect-175-zettabytes-of-data-worldwide-by-2025.html
Data is growing at an
exponential rate.
The IDC predicts:*
The collective sum of the world’s 2018 data
was 33 zettabytes.
By 2025, it will grow to 175 zettabytes.
Data types are also diversifying, requiring
storage and analysis of structured AND
unstructured data.
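As a rough check on what "exponential" means here, the two IDC figures quoted above imply a compound annual growth rate. The sketch below assumes smooth year-over-year compounding between 2018 and 2025, which is a simplification and not something the IDC forecast states; the function name is illustrative.

```python
def implied_annual_growth(start_zb: float, end_zb: float, years: int) -> float:
    """Return the compound annual growth rate that takes start_zb to end_zb."""
    return (end_zb / start_zb) ** (1 / years) - 1


# Figures from the slide: 33 ZB in 2018, 175 ZB forecast for 2025.
rate = implied_annual_growth(33.0, 175.0, 2025 - 2018)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption the data volume grows by roughly 27% per year, which means it doubles about every three years.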
Governance
& control
There are more
people working
with data than ever
before.
How do I provide democratized access
to data to enable informed decisions
while at the same time enforce data
governance and prevent
mismanagement of the data?
Democratization
of data
There are more ways to analyze
data than ever before.
Figure: timeline marked at 12, 9, 6, and 5 years ago, labeled "Didn’t exist"
Tools shown: Hadoop, Elasticsearch, Presto, Spark
Why streaming data?
Get actionable insights quickly
Source: Perishable insights, Mike Gualtieri, Forrester
Figure: information half-life in decision-making
Y-axis: Value of data to decision-making
X-axis: Real time, Seconds, Minutes, Hours, Days, Months
Stages: Preventive/Predictive, Actionable, Reactive, Historical
Range: time-critical decisions to traditional “batch” business intelligence
Thinking about data as an asset, not a cost
Stop
throwing
data away.
Make it
available to
more users.
Arm users
with more
data processing
technologies.
What you need to succeed
Data lifecycle stages:
• Creation / Ingest
• Storage
• Processing
• Analytics / Machine learning
• Archive / Retire
At every stage of your data’s
lifecycle, you need scalable
services that can flex with your
data growth.
How AWS can help
Typical storage workloads
Backup & Restore
• Non-disruptive
• Easy place to start
• Integrated with all major vendors
Archive & Compliance
• Media workflows
• Tape replacement
• Public Sector, FinServ, Healthcare/Life Sciences
Home Directories
• Simple to move
• Less latency sensitive
• Significant cost savings
Data Lakes
• Variety of analytics tools
• Foundation for AI/ML
• Built for streaming data
• Data visualization
Business-Critical Applications
• Integrated with major vendors
• Fully managed infrastructure
• Lift-and-shift migrations
AWS Storage Solutions – Use Cases & Customers
Backup & Restore
• Tools for in-cloud and on-prem backup
• S3 Standard-IA and S3 One Zone-IA for
cost-effective backup storage
• Storage Gateway integrations for tape,
volume, and block backup in the cloud
Archive
• S3 Glacier for frequent retrievals
• S3 Glacier Deep Archive for cold archive at
about $1 per TB per month
• VTL capabilities to replace physical tape
• Query archive data with S3 Glacier Select
Data Lakes
• Run analytics applications without ETL
• Query data in place with S3 Select and S3
Glacier Select
• Integrations with FSx for Lustre to run HPC,
ML, and data media processing
Enterprise Applications
• EFS for file systems to run applications
• EBS for compute storage
• Seamless integration between partner
application and existing AWS systems
• Backup to S3 and S3 Glacier
Data Compliance
• S3 Object Lock to configure & enforce
write-once-read-many (WORM) controls
• S3 Object Lock can be set to compliance or
governance modes
• S3 Glacier Vault Lock for WORM archive
Hybrid Storage
• Connect on-premises environments to the
AWS Cloud with AWS Storage Gateway
• Back up data, burst compute-intensive
workloads, or transfer files, virtual tapes,
and volumes in the AWS Cloud
Data: AWS cloud storage is core
Building on or migrating an application to AWS…
…gives you unique scale…
…yielding bigger insights…
…helping you innovate faster…
Most big data & data lakes
Most managed databases
Simplest enterprise applications
Easiest data warehousing
Singular query-in-place analytics
Greatest reliability
Highest security
Most manageable
Broadest compliance
Widest portfolio
Fastest innovation
Active archive
Disaster recovery
IoT
Artificial Intelligence
Advanced developer tools
Experienced consulting and support
Methodical migration services
The most data movement services
More services & storage classes for every use case
Block storage
Amazon Elastic Block Store (EBS)
• General purpose SSD
• Provisioned IOPS SSD
• Throughput-optimized HDD
• Cold HDD
• Elastic Volumes
File storage
Amazon Elastic File System (EFS)
• EFS Standard
• EFS Infrequent Access (for cost savings)
Object storage
Amazon Simple Storage Service (S3)
• 6 storage classes for various access patterns
• Includes 2 levels of archiving
• Build data lakes for structured and
unstructured data
Services shown: Amazon EBS, Amazon EC2, Amazon EFS, Amazon FSx for Lustre, Amazon FSx for Windows File Server, Amazon S3, AWS Storage Gateway family
More options for data transfer
• AWS Direct Connect
• Amazon Kinesis Data Firehose
• Amazon Kinesis Data Streams
• Amazon Kinesis Video Streams
• AWS Snowball
• AWS Snowball Edge Storage Optimized
• AWS Snowball Edge Compute Optimized
• AWS Snowmobile
• AWS Storage Gateway
• Amazon S3 Transfer Acceleration
• AWS DataSync
• AWS Transfer for SFTP
Gartner Magic Quadrant
Magic Quadrant for
Public Cloud Storage Services,
Worldwide – 2018
Positioned furthest for
completeness of vision
and highest for ability
to execute in each report
since inception in 2014
Magic Quadrant for Public Cloud Storage Services, July 2018 – Raj Bala, Julia Palmer
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the
context of the entire document. The Gartner document is available upon request from Amazon Web Services.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise
technology users to select only those vendors with the highest ratings or other designation. Gartner research
publications consist of the opinions of Gartner’s research organization and should not be construed as statements
of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any
warranties of merchantability or fitness for a particular purpose.
Data storage services
Three types of storage
Object storage
Amazon Simple Storage Service (S3)
The most features to cost-effectively store, manage, audit, secure, and query data at virtually any scale.
S3 Storage Classes
• S3 Standard, S3 Standard-IA, S3 Intelligent-Tiering, S3 One Zone-IA, S3 Glacier, S3 Glacier Deep Archive
• Use S3 Storage Class Analysis to learn access patterns and S3 Lifecycle policies to move objects between classes
Access management
• Configure access to S3 resources and define user access. Block all public access requests with S3 Block Public Access.
Cross-Region Replication
• Replicate objects to a region of your choice to reduce latency and for compliance.
Data management tools
• Append up to 10 metadata tags to an object. Use tags, buckets, and prefixes to organize data. Audit and report on access requests and activities.
S3 Batch Operations
• Execute tasks and invoke AWS Lambda across billions of objects with a single API call or a few clicks in the console.
Analytics & file systems integrations
• S3-integrated analytics applications
• AWS Lake Formation
• S3 Select to query data in place
• FSx for Lustre for HPC, ML, and media data processing
Supported by the most secure, durable, and performant storage infrastructure
• Security by design
• Compliance programs
• 11 9’s of durability
• Multi-AZ resiliency
• Limitless scalability
Choose the storage class that fits best
• Resiliency: 1 AZ, ≥ 3 AZs, or 2 Regions
• Availability: 99.5%, 99.9%, or 99.99%
• Access time: milliseconds to hours
• Access frequency: frequent to infrequent
• Storage duration: hours to years
• Object size: 0 bytes to 5 terabytes
Reduce storage cost > 80% by choosing the
storage class option that best fits your use case
Your choice of Amazon S3 storage classes
Storage classes, ordered by access frequency from frequent to infrequent:
S3 Standard
• Active, frequently accessed data
• Milliseconds access
• ≥ 3 AZ
• $0.0210/GB
S3 Intelligent-Tiering (S3 INT)
• Data with changing access patterns
• Milliseconds access
• ≥ 3 AZ
• $0.0210 to $0.0125/GB
• Monitoring fee per object
• Min storage duration
S3 Standard-IA (S3 S-IA)
• Infrequently accessed data
• Milliseconds access
• ≥ 3 AZ
• $0.0125/GB
• Retrieval fee per GB
• Min storage duration
• Min object size
S3 One Zone-IA (S3 Z-IA)
• Re-creatable, less accessed data
• Milliseconds access
• 1 AZ
• $0.0100/GB
• Retrieval fee per GB
• Min storage duration
• Min object size
Amazon Glacier
• Archive data
• Select minutes or hours
• ≥ 3 AZ
• $0.0040/GB
• Retrieval fee per GB
• Min storage duration
• Min object size
Amazon S3 Glacier Deep Archive
Lowest cost storage class for long-term
archiving and digital asset preservation
• Fully managed without tape burden
• $0.00099 per GB-month
• Designed for 99.999999999% durability
• Recover data in 12 hours
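To make the per-GB prices on these slides concrete, the sketch below compares the monthly storage charge for 1 TB across classes. It uses only the prices quoted in this deck (which may be out of date), assumes they are per GB-month, takes 1 TB as 1,000 GB, and ignores retrieval fees, monitoring fees, and minimum storage durations. The function and constant names are illustrative.

```python
# Prices per GB-month as quoted on the slides (assumed storage-only charges).
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.0210,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.0100,
    "S3 Glacier": 0.0040,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(storage_class: str, gigabytes: float) -> float:
    """Storage-only monthly cost in USD for the given class and volume."""
    return PRICE_PER_GB_MONTH[storage_class] * gigabytes


def savings_vs_standard(storage_class: str) -> float:
    """Fraction saved on storage relative to S3 Standard."""
    return 1 - PRICE_PER_GB_MONTH[storage_class] / PRICE_PER_GB_MONTH["S3 Standard"]


for name in PRICE_PER_GB_MONTH:
    cost = monthly_storage_cost(name, 1000)
    print(f"{name:<24} ${cost:>6.2f}/month ({savings_vs_standard(name):.0%} below S3 Standard)")
```

With these figures, S3 Glacier comes out a little over 80% cheaper than S3 Standard for storage alone, and Deep Archive lands at about $1 per TB per month, which matches the claims made earlier in the deck.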
S3 Storage Class Analysis and S3 Lifecycle Policy
Use S3 Storage Class Analysis to identify storage age
groups that are less frequently accessed
Set S3 Lifecycle Policy to tier storage to lower cost
storage classes and expire storage based on age of
object
Great for predictable workloads (object age
indicates access frequency)
Fine tune analysis by bucket, prefix, or object tag
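An age-based lifecycle policy of the kind described above can be sketched as a plain configuration document. The example below only builds and prints the structure; it follows the rule shape that the S3 lifecycle configuration API accepts (for instance through boto3's `put_bucket_lifecycle_configuration`), but the prefix, day thresholds, and helper name are hypothetical choices for illustration, not values from the deck.

```python
import json


def build_lifecycle_rule(prefix: str, ia_after_days: int,
                         glacier_after_days: int, expire_after_days: int) -> dict:
    """Build one rule that tiers objects by age and then expires them."""
    if not 0 < ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("day thresholds must be positive and strictly increasing")
    return {
        "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},  # scope the rule to a key prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical policy: logs move to Standard-IA at 30 days, Glacier at 90, deleted at 365.
lifecycle_configuration = {"Rules": [build_lifecycle_rule("logs/", 30, 90, 365)]}
print(json.dumps(lifecycle_configuration, indent=2))
```

In practice the thresholds would come from what Storage Class Analysis reports for the bucket, prefix, or tag in question.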
S3 Intelligent-Tiering storage class
Automatically optimizes storage costs for data with
changing access patterns
Moves objects between two storage tiers:
• Frequent access tier optimized for frequent use of
data
• Lower cost infrequent access tier optimized for
less accessed data
Monitors access patterns and auto-tiers on granular
object level
No performance impact, no operational overhead
Milliseconds access, ≥ 3 AZs, monitoring fee per
object, minimum storage duration
File storage
Fully managed cloud file systems
AWS provides file system options that help you easily address
the diverse needs of your file-based applications and workloads
Linux-based workloads: Amazon EFS
• Fully managed cloud-native file system for Linux-based applications
Windows-based workloads: Amazon FSx for Windows File Server
• Fully managed Windows file servers for business applications
Compute-intensive workloads: Amazon FSx for Lustre
• Fully managed Lustre file system for compute-intensive workloads
Amazon Elastic File System
Scalable, elastic, cloud-native Linux file system
• Shared access
• Highly durable and available
• Elastic and scalable
• Secure and compliant
• Performance
• Storage classes
Amazon FSx for Windows File Server
Lift and shift your Windows file storage with fully managed Windows file servers
• Fast and flexible performance
• Broad accessibility
• Fully managed
• Native Windows compatibility
• Enterprise-ready
Amazon FSx for Lustre
Fully managed Lustre file system for compute-intensive workloads
• Native file system interface
• Fully managed
• Seamless access to your data repositories
• Massively scalable performance
• Cost-optimized for compute-intensive workloads
• Secure and compliant
Seamless integration with Amazon S3
Link your Amazon S3 data set to your Amazon FSx for Lustre file system, then:
• Data stored in S3 is loaded to Amazon FSx for processing
• Output of processing is returned to S3 for retention
• When your workload finishes, simply delete your file system
Block storage
AWS block storage offerings
• EC2 instance store (SSD or HDD)
• EBS SSD-backed volumes: gp2, io1
• EBS HDD-backed volumes: st1, sc1
What is Amazon EC2 instance store?
• Local to instance
• Non-persistent data store
• Data not replicated (by default)
• No snapshot support
• SSD or HDD
Diagram: EC2 instances on a physical host with a local instance store
What is Amazon EBS?
• Block storage as a service
• Create, attach volumes through an API
• Service accessed over the network
Diagram: an EC2 instance connected to an EBS volume
What is Amazon EBS?
• Volumes persist independent of EC2
• Select storage and compute based on
your workload
• Detach and attach between instances
within the same Availability Zone
Diagram: an EBS volume and two EC2 instances in one Availability Zone of an AWS Region
What is Amazon EBS?
• Volumes attach to one instance
• Many volumes can attach to an instance
• Separate boot and data volumes
Diagram: one EC2 instance with one boot EBS volume and two data EBS volumes in one Availability Zone of an AWS Region
Amazon EBS volume types
Solid-state drives (SSD) and hard disk drives (HDD)
Choosing an Amazon EBS volume type
What is more important to your workload: IOPS or throughput?
Choosing an Amazon EBS volume type
IOPS is more important (small, random I/O)
• > 80,000 IOPS: i3
• ≤ 80,000 IOPS, latency < 1 ms: i3
• ≤ 80,000 IOPS, single-digit ms latency: which is more important?
  Cost: gp2
  Performance: io1
Throughput is more important (large, sequential I/O)
• Aggregate throughput > 1,750 MiB/s: d2
• Aggregate throughput ≤ 1,750 MiB/s: which is more important?
  Cost: sc1
  Performance: st1
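The decision flow on this slide can be written as a small function. This is a sketch of one reading of the flattened slide, not an official AWS selection rule: it assumes that more than 80,000 IOPS or sub-millisecond latency points to i3 instance storage, that more than 1,750 MiB/s aggregate throughput points to d2, and that the remaining choices come down to cost versus performance. The function and parameter names are hypothetical.

```python
def choose_block_storage(priority: str, *, iops: int = 0,
                         sub_ms_latency: bool = False,
                         throughput_mib_s: float = 0.0,
                         favor_cost: bool = True) -> str:
    """Return a volume type (or instance family) following the slide's flow."""
    if priority == "iops":
        # Very high IOPS or sub-millisecond latency: local NVMe on i3 instances.
        if iops > 80_000 or sub_ms_latency:
            return "i3"
        # Otherwise SSD-backed EBS: gp2 when cost matters, io1 for performance.
        return "gp2" if favor_cost else "io1"
    if priority == "throughput":
        # Very high aggregate throughput: dense HDD storage on d2 instances.
        if throughput_mib_s > 1_750:
            return "d2"
        # Otherwise HDD-backed EBS: sc1 when cost matters, st1 for performance.
        return "sc1" if favor_cost else "st1"
    raise ValueError("priority must be 'iops' or 'throughput'")


print(choose_block_storage("iops", iops=20_000, favor_cost=False))
print(choose_block_storage("throughput", throughput_mib_s=500))
```

The thresholds are the ones printed on the slide and reflect the limits of that time; current EBS limits may differ.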
Data transfer & hybrid storage
services
AWS data transfer & hybrid storage
Network-based services
• Private network connections to AWS
• Load streaming data into Amazon S3
• Edge locations for Amazon S3 enabled applications
Offline data transfer
• Ship static data into and out of Amazon S3
• Storage and compute in disconnected environments
Hybrid storage
• Access AWS storage from on-premises
Online managed data transfer
• AWS DataSync (new): online transfer of active data
• AWS Transfer for SFTP (new): SFTP transfers into Amazon S3
Network-based services
AWS Direct Connect &
S3 Transfer Acceleration
AWS Direct Connect
• Establish private connectivity between AWS and your data center
• Dedicated connection can be partitioned into multiple virtual interfaces
• Maintain network separation between public and private environments
Benefits
• Reduce bandwidth costs
• Consistent network performance
• Compatible with all AWS services
• Private connectivity to VPC
• Elastic
• Simple
Amazon S3 Transfer Acceleration
Speeds up transfers for applications that use the S3 API over long distances
• Leverages AWS global edge locations (Amazon CloudFront) and an optimized AWS network path for optimized throughput
• Optimized protocols
• Change your endpoint, not your code
• No firewall exceptions and no client software required
On average, a 171% improvement over regular Amazon S3 CLI commands when uploading over long distances
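"Change your endpoint, not your code" refers to pointing the client at the bucket's accelerate hostname instead of the regional one. The sketch below only builds the two URLs for comparison; the bucket name and region are placeholders, the helper names are illustrative, and Transfer Acceleration has to be enabled on the bucket before the accelerate endpoint will work.

```python
def standard_endpoint(bucket: str, region: str) -> str:
    """Regional virtual-hosted-style endpoint for a bucket."""
    return f"https://{bucket}.s3.{region}.amazonaws.com"


def accelerate_endpoint(bucket: str) -> str:
    """Transfer Acceleration endpoint for a bucket (routes via edge locations)."""
    if "." in bucket:
        # Bucket names containing dots cannot be used with Transfer Acceleration.
        raise ValueError("Transfer Acceleration requires a bucket name without dots")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


# "example-bucket" and "us-west-2" are placeholder values.
print(standard_endpoint("example-bucket", "us-west-2"))
print(accelerate_endpoint("example-bucket"))
```

Everything else about the request (API calls, credentials, object keys) stays the same, which is why no client software or firewall change is needed.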
AWS DataSync
What use cases need to transfer active data?
Online data transfer use cases:
• Migration of active application data
• Transferring data for time-sensitive in-cloud analysis
• Replication of data for business continuity
AWS DataSync
Online transfer service that simplifies, automates, and accelerates
moving data between on-premises storage and AWS
Combines the speed and reliability of network acceleration software
with the cost-effectiveness of open source tools
• Fast data transfer: up to 10 Gbps
• Easy to use: fully managed in-cloud with agents on-premises
• Secure and reliable: data encryption and validation
• Enterprise ready: PCI and HIPAA; works with AWS IAM and CloudTrail
• Cost-effective: usage-based, $0.04 per GB copied
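The two numbers on this slide, $0.04 per GB copied and up to 10 Gbps, are enough for a back-of-the-envelope estimate of a transfer. The sketch below assumes decimal units (1 TB = 1,000 GB), a link that sustains its full rate for the whole job, and no other charges; real transfers will usually be slower, so the time is a lower bound. Names are illustrative.

```python
from typing import Tuple

DATASYNC_PRICE_PER_GB = 0.04  # USD per GB copied, as quoted on the slide
DATASYNC_MAX_GBPS = 10        # maximum rate quoted on the slide


def datasync_estimate(terabytes: float,
                      link_gbps: float = DATASYNC_MAX_GBPS) -> Tuple[float, float]:
    """Return (cost in USD, best-case transfer time in hours)."""
    gigabytes = terabytes * 1000
    cost_usd = gigabytes * DATASYNC_PRICE_PER_GB
    effective_gbps = min(link_gbps, DATASYNC_MAX_GBPS)
    hours = (gigabytes * 8) / effective_gbps / 3600  # gigabits / (Gb/s) -> seconds
    return cost_usd, hours


cost, hours = datasync_estimate(50)
print(f"50 TB: ${cost:,.0f} in transfer fees, at least {hours:.1f} hours at 10 Gbps")
```

Under these assumptions, 50 TB costs $2,000 in per-GB fees and takes a little over 11 hours at best.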
How AWS DataSync works
• On-premises: a shared file system (NFS) is read by the AWS DataSync agent, which is deployed on-premises for fast access to local storage
• Data transfer over the WAN (TLS) via efficient purpose-built protocol
• In the AWS Region: the AWS DataSync service writes or reads data from AWS storage services (Amazon S3 bucket, Amazon EFS file system)
• Managed from the console or AWS Command Line Interface (AWS CLI)
AWS Transfer for SFTP
SFTP: It’s here; it’s everywhere
Protocol is deeply embedded in workflows across a variety of industries
Financial services, retail, healthcare … and more
AWS Transfer for SFTP
Fully managed service enabling transfer
of data over SFTP while stored in Amazon S3
• Seamless migration of existing workflows
• Native integration with AWS services
• Simple to use
• Cost effective
• Fully managed in AWS
• Enterprise ready
Snowball family
Why Snowball Edge & the Snow family?
Offline Transfer of large data volumes + Edge Computing, analytics &
machine learning in remote and harsh environments
Moving large datasets over slow links can take years
Remote locations with limited, intermittent or no WAN
Many industries need edge computing for environments
where data generation is decentralized, and data volumes
are significant
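The claim that slow links can take years is easy to check with arithmetic. The sketch below estimates network transfer time for a dataset and the number of 80-TB Snowball devices that would hold it. It assumes decimal units, a link dedicated to the transfer, and that a device's advertised capacity is fully usable, which overstates what is available in practice. Function names are illustrative.

```python
import math


def transfer_days(terabytes: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Days needed to move the data over a link at the given utilization."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400


def snowball_devices_needed(terabytes: float, device_capacity_tb: float = 80) -> int:
    """Number of devices needed, treating advertised capacity as usable."""
    return math.ceil(terabytes / device_capacity_tb)


days = transfer_days(1000, 100)  # 1 PB over a dedicated 100 Mbps link
print(f"1 PB over 100 Mbps: about {days / 365:.1f} years")
print(f"80-TB Snowball devices needed: {snowball_devices_needed(1000)}")
```

Under these assumptions, 1 PB over 100 Mbps takes roughly two and a half years, while the same data fits on 13 devices that can be loaded in parallel and shipped.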
AWS Snow Family
AWS Snowball
• 80-TB storage capacity
• 10GE networking
• Data encryption end-to-end
• Rugged 8.5-G impact case
• Rain and dust-resistant
AWS Snowball Edge (Compute or Storage Optimized)
• 42 or 100-TB storage capacity
• Data encryption end-to-end
• Rugged 8.5-G impact case
• Rain and dust-resistant
• AWS Greengrass support for local compute, messaging, and caching
• EC2/AMI support for edge compute
• Optional GPU
AWS Snowmobile
• Exabyte-scale storage in a 45-ft container
• Data encryption end-to-end
• Dedicated security personnel
• GPS tracking, alarm monitoring, 24/7 surveillance, and optional additional security
Snowball Edge options summary
Compute Optimized
• Amazon S3 compatible storage: 42 TB
• Compute: comparable to m5d.2xlarge
• AWS services available: AWS Lambda, File Gateway, Amazon EC2
• Memory: 208 GB
• vCPUs: 52
• Disk space for instances: 7.62 TB
• Clustering: available
• Typical job lifetime: can be long lasting
• Networking: max 100 Gb
• GPU: optional Nvidia Tesla V100
Storage Optimized
• Amazon S3 compatible storage: 100 TB
• Compute: comparable to m4.4xlarge
• AWS services available: AWS Lambda, File Gateway, Amazon EC2
• Memory: 32 GB
• vCPUs: 24
• Disk space for instances: 1 TB
• Clustering: available
• Typical job lifetime: about 1 month
• Networking: max 40 Gb
• GPU: no
AWS Storage Gateway family
AWS Storage Gateway
On-premises access to virtually unlimited cloud storage
• Customer premises: gateway appliance presenting files (NFS/SMB), volumes (iSCSI), and tapes (iSCSI VTL)
• Appliance configuration: VMware, Hyper-V, EC2, hardware
• Connection from the gateway appliance to the AWS Cloud over HTTPS
• AWS Cloud: Storage Gateway managed service, backed by Amazon S3, S3 Glacier, S3 Glacier Deep Archive, Amazon Elastic Block Store (EBS), and AWS Backup
• Integrated with IAM, KMS, CloudTrail, and CloudWatch services
The AWS Storage Gateway family
File gateway
Store and access objects in Amazon S3 from file-based
applications with local caching
Volume gateway
Block storage on-premises backed by cloud storage with local
caching, Amazon EBS snapshots, and clones
Tape gateway
Drop-in replacement for physical tape infrastructure backed by
cloud storage with local caching
Three gateway types provide file, block, and tape storage interfaces
AWS Streaming Services
Streaming with Amazon Kinesis
Easily collect, process, and analyze video and data streams in real time
Capture, process,
and store video
streams
Load data streams
into AWS data
stores
Analyze data
streams in real time
Capture, process,
and store data
streams
Amazon Kinesis Data Firehose—How it works
• Ingest from: AWS IoT, Amazon Kinesis Agent, Amazon Kinesis Streams, Amazon CloudWatch Logs, Amazon CloudWatch Events, Apache Kafka
• Transform
• Deliver to: Amazon S3, Amazon Redshift, Amazon Elasticsearch Service
AWS Partner Network (APN) for
data transfer, migration & storage
AWS Partner Network (APN): Migration & storage
Partner categories: Backup & restore, Archive, Primary storage, BC/DR, Data migration
Data Lakes, Database and
Analytic Services
Introducing AWS Lake Formation
• Build and deploy a fully managed data lake with a few clicks
• Centrally define security, governance, and auditing policies
• Self-service discovery and safe access to all data from a single catalog
Data lakes
• Store exabytes of structured and unstructured data
• Load, transform, and catalog once
• Make data available to many tools
• Open formats and interfaces support innovation
Services shown: Snowball, Snowmobile, Kinesis Data Firehose, Kinesis Data Streams, Kinesis Video Streams, Amazon S3, Amazon Redshift, Amazon EMR, Amazon Athena, Amazon Kinesis, Amazon Elasticsearch Service, AI Services, Amazon QuickSight
Databases and analytics services: built for builders
Business Intelligence & Machine Learning
• Amazon QuickSight
• Amazon SageMaker
• Amazon Comprehend
• Amazon Rekognition
• Amazon Lex
• Amazon Transcribe
• AWS DeepLens
Analytics
• Amazon Redshift: data warehousing
• Amazon EMR: Hadoop + Spark
• Athena: interactive analytics
• Kinesis Analytics: real-time
• Amazon Elasticsearch Service: operational analytics
Databases
• RDS: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
• Aurora: MySQL, PostgreSQL
• DynamoDB: key value, document
• ElastiCache: Redis, Memcached
• Neptune: graph
• Timestream: time series
• QLDB: ledger database
• RDS on VMware
Blockchain
• Managed Blockchain
• Blockchain Templates
Data Lake
• S3/Amazon Glacier
• AWS Glue: ETL & Data Catalog
• Lake Formation: data lakes
Data Movement
Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Data Pipeline | Direct Connect
AWS Marketplace
250+ solutions | 730+ Database solutions | 600+ Analytics solutions | 25+ Blockchain solutions | 20+ Data lake solutions | 30+ solutions
What are DMS and SCT?
AWS Database Migration Service (DMS) easily and
securely migrates and/or replicates your databases and
data warehouses to AWS
AWS Schema Conversion Tool (SCT) converts your
commercial database and data warehouse schemas to
open-source engines or AWS-native services, such as Amazon
Aurora and Redshift
When to use DMS*?
Migrate
• Migrate business-critical applications
• Migrate from Classic to VPC
• Migrate data warehouse to Redshift
• Upgrade to a minor version
• Consolidate shards into Aurora
• Archive old data
• Migrate from NoSQL to SQL, SQL to
NoSQL or NoSQL to NoSQL
Sources: Amazon S3
Targets: Amazon DynamoDB, Amazon Redshift, Amazon S3, Amazon Aurora
*DMS is a HIPAA-eligible service
When to use DMS?
Replicate
• Create cross-region read replicas
• Run your analytics in the cloud
• Keep your dev/test and production
environments in sync
DMS + Snowball
Common use cases:
• Migrate large databases (over 5TB)
• Migrate many databases at once
• Migrate over slow network
• Push vs. Pull
What about security?
Security & Compliance is a shared responsibility between AWS and the customer.
Thank you

Preparing Your Data for Cloud Analytics & AI/ML
 
Data as an Asset, Not a Cost
Data as an Asset, Not a CostData as an Asset, Not a Cost
Data as an Asset, Not a Cost
 
在 AWS 上構建無服務器分析
在 AWS 上構建無服務器分析在 AWS 上構建無服務器分析
在 AWS 上構建無服務器分析
 
Building Data Lakes and Analytics on AWS. IPExpo Manchester.
Building Data Lakes and Analytics on AWS. IPExpo Manchester.Building Data Lakes and Analytics on AWS. IPExpo Manchester.
Building Data Lakes and Analytics on AWS. IPExpo Manchester.
 
Optimize data lakes with Amazon S3 - STG302 - Santa Clara AWS Summit
Optimize data lakes with Amazon S3 - STG302 - Santa Clara AWS SummitOptimize data lakes with Amazon S3 - STG302 - Santa Clara AWS Summit
Optimize data lakes with Amazon S3 - STG302 - Santa Clara AWS Summit
 
Architecting a Serverless Data Lake on AWS
Architecting a Serverless Data Lake on AWSArchitecting a Serverless Data Lake on AWS
Architecting a Serverless Data Lake on AWS
 

More from Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Data Lifecycle Management

  • 1. © 2019, Amazon Web Services, Inc. or its Affiliates. Chandra Kapireddy Specialist SA, Data & Analytics Data Lifecycle Management from ingest to archive
  • 2. Agenda • What’s new with data today • How AWS can help • Data storage services • Break (15 min) • Data transfer (& hybrid storage) services • Data lakes, database and analytic services • Q/A
  • 3. What’s new with data today?
  • 4. Data is a strategic asset for every organization. The world’s most valuable resource is no longer oil, but data.
  • 5. Data is growing at an exponential rate. IDC predicts:* The collective sum of the world’s 2018 data was 33 zettabytes. By 2025, it will grow to 175 zettabytes. Data types are also diversifying, requiring storage and analysis of structured AND unstructured data. * https://www.networkworld.com/article/3325397/idc-expect-175-zettabytes-of-data-worldwide-by-2025.html
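As a rough worked example of the growth described on slide 5, the two IDC figures imply a compound annual growth rate of roughly 27%. The sketch below assumes smooth year-over-year compounding between 2018 and 2025, which the slide does not state; the function name is illustrative only.

```python
# Implied compound annual growth rate (CAGR) from the two IDC figures on slide 5.
START_ZB = 33.0        # zettabytes in 2018 (from the slide)
END_ZB = 175.0         # zettabytes forecast for 2025 (from the slide)
YEARS = 2025 - 2018    # 7 years of growth


def implied_cagr(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1.0 / years) - 1.0


rate = implied_cagr(START_ZB, END_ZB, YEARS)
print(f"Implied growth: {rate:.1%} per year")
```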
  • 6. There are more people working with data than ever before. Democratization of data vs. governance & control: How do I provide democratized access to data to enable informed decisions while at the same time enforcing data governance and preventing mismanagement of the data?
  • 7. There are more ways to analyze data than ever before. (Chart) Years ago: 12, 9, 6, 5. Didn’t exist: Hadoop, Elasticsearch, Presto, Spark.
  • 8. Why streaming data? Get actionable insights quickly. (Chart: value of data to decision-making over time. Time axis: real time, seconds, minutes, hours, days, months. Value categories: preventive/predictive, actionable, reactive, historical. Ranges: time-critical decisions; traditional “batch” business intelligence. Caption: information half-life in decision-making.) Source: Perishable insights, Mike Gualtieri, Forrester
  • 9. What you need to succeed: Thinking about data as an asset, not a cost. Stop throwing data away. Make it available to more users. Arm users with more data processing technologies.
  • 10. Creation / Ingest, Storage, Processing, Analytics / Machine learning, Archive / Retire: At every stage of your data’s lifecycle, you need scalable services that can flex with your data growth.
  • 11. How AWS can help
  • 12. Typical storage workloads. Backup & Restore: Non-disruptive; Easy place to start; Integrated with all major vendors. Archive & Compliance: Media workflows; Tape replacement; Public Sector, FinServ, Healthcare/Life Sciences. Home Directories: Simple to move; Less latency sensitive; Significant cost savings. Data Lakes: Variety of analytics tools; Foundation for AI/ML; Built for streaming data; Data visualization. Business-Critical Applications: Integrated with major vendors; Fully managed infrastructure; Lift-and-shift migrations.
  • 13. AWS Storage Solutions – Use Cases & Customers. Backup & Restore • Tools for in-cloud and on-prem backup • S3 Standard-IA and S3 One Zone-IA for cost-effective backup storage • Storage Gateway integrations for tape, volume, and block backup in the cloud. Archive • S3 Glacier for frequent retrievals • S3 Glacier Deep Archive for cold archive at $1 for 1TB/month • VTL capabilities to replace physical tape • Query archive data with S3 Glacier Select. Data Lakes • Run analytics applications without ETL • Query data in place with S3 Select and S3 Glacier Select • Integrations with FSx for Lustre to run HPC, ML, and media data processing. Enterprise Applications • EFS for file systems to run applications • EBS for compute storage • Seamless integration between partner application and existing AWS systems • Backup to S3 and S3 Glacier. Data Compliance • S3 Object Lock to configure & enforce write-once-read-many (WORM) controls • S3 Object Lock can be set to compliance or governance modes • S3 Glacier Vault Lock for WORM archive. Hybrid Storage • Connect on-premises environments to the AWS Cloud with AWS Storage Gateway • Back up data, burst compute-intensive workloads, or transfer files, virtual tapes, and volumes in the AWS Cloud
  • 14. AWS cloud storage is core (figure: data at the center). Building on or migrating an application to AWS… …gives you unique scale… …helping you innovate faster… …yielding bigger insights… Callouts: Most big data & data lakes; Most managed databases; Simplest enterprise applications; Easiest data warehousing; Singular query-in-place analytics; Greatest reliability; Highest security; Most manageable; Broadest compliance; Widest portfolio; Fastest innovation; Active archive; Disaster recovery; IoT; Artificial Intelligence; Advanced developer tools; Experienced consulting and support; Methodical migration services; The most data movement services
  • 15. More services & storage classes for every use case. Block storage: Amazon Elastic Block Store (EBS) • General purpose SSD • Provisioned IOPS SSD • Throughput-optimized HDD • Cold HDD • Elastic Volumes. File storage: Amazon Elastic File System (EFS) • EFS Standard • EFS Infrequent Access (for cost savings). Object storage: Amazon Simple Storage Service (S3) • 6 storage classes for various access patterns • Includes 2 levels of archiving • Build data lakes for structured and unstructured data. Services shown: Amazon EFS, AWS Storage Gateway family, Amazon S3, Amazon FSx for Lustre, Amazon FSx for Windows File Server, Amazon EBS, Amazon EC2
  • 16. More options for data transfer: AWS Direct Connect, Amazon Kinesis Firehose, AWS Snowball, AWS Snowmobile, AWS Storage Gateway, Amazon S3 Transfer Acceleration, AWS DataSync, AWS Transfer for SFTP, AWS Snowball Edge Storage Optimized, Amazon Kinesis Data Streams, Amazon Kinesis Video Streams, AWS Snowball Edge Compute Optimized
  • 17. Gartner Magic Quadrant. Magic Quadrant for Public Cloud Storage Services, Worldwide – 2018. Positioned furthest for completeness of vision and highest for ability to execute in each report since inception in 2014. Magic Quadrant for Public Cloud Storage Services, July 2018 – Raj Bala, Julia Palmer. This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Amazon Web Services. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
  • 18. Data storage services
  • 19. Three types of storage
  • 20. Object storage
  • 21. Amazon Simple Storage Service (S3): The most features to cost-effectively store, manage, audit, secure, and query data, at virtually any scale. S3 Storage Classes: S3 Standard, S3 Standard-IA, S3 Intelligent-Tiering, S3 One Zone-IA, S3 Glacier, S3 Glacier Deep Archive; use S3 Storage Class Analysis to learn access patterns and S3 Lifecycle policies to move objects between classes. Access management: Configure access to S3 resources and define user access. Block all public access requests with S3 Block Public Access. Cross-Region Replication: Replicate objects to a region of your choice to reduce latency and for compliance. Data management tools: Append up to 10 metadata tags to an object. Use tags, buckets, and prefixes to organize data. Audit and report on access requests and activities. S3 Batch Operations: Execute tasks and invoke AWS Lambda across billions of objects, with a single API call or a few clicks in the console. Analytics & file systems integrations: S3-integrated analytics applications; AWS Lake Formation; S3 Select to query data in place; FSx for Lustre for HPC, ML, and media data processing. Supported by the most secure, durable, and performant storage infrastructure: Security by design; Compliance programs; 11 9’s of durability; Multi-AZ resiliency; Limitless scalability
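To put the "11 9’s of durability" design target from slide 21 in perspective, here is a minimal sketch. It assumes the figure can be read as an independent annual loss probability of 1e-11 per object, which is a simplification, and the object count is hypothetical.

```python
# Expected object loss implied by an "11 nines" annual durability design target.
# Assumption: 99.999999999% durability is treated as an independent per-object,
# per-year loss probability of 1e-11 (written directly to avoid float rounding).
ANNUAL_LOSS_PROBABILITY = 1e-11


def expected_losses_per_year(object_count: int) -> float:
    """Average number of objects expected to be lost in one year."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> float:
    """Average number of years before a single object loss is expected."""
    return 1.0 / expected_losses_per_year(object_count)


objects = 10_000_000  # hypothetical number of stored objects
print(f"{years_per_expected_loss(objects):,.0f} years per expected object loss")
```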
  • 22. Choose the storage class that fits best. Reduce storage cost > 80% by choosing the storage class option that best fits your use case. (Figure axis labels: Frequent to Infrequent; Milliseconds to Hours; Hours to Years; 0 Bytes to 5 Terabytes; 99.99%, 99.9%, 99.5%; ≥ 3 AZs, 1 AZ, 2 Regions.) © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 23. Your choice of Amazon S3 storage classes, by access frequency (frequent to infrequent). S3 Standard: Active, frequently accessed data; Milliseconds access; ≥ 3 AZ; $0.0210/GB. S3 Intelligent-Tiering (S3 INT): Data with changing access patterns; Milliseconds access; ≥ 3 AZ; $0.0210 to $0.0125/GB; Monitoring fee per object; Min storage duration. S3 Standard-IA (S3 S-IA): Infrequently accessed data; Milliseconds access; ≥ 3 AZ; $0.0125/GB; Retrieval fee per GB; Min storage duration; Min object size. S3 One Zone-IA (S3 Z-IA): Re-creatable, less accessed data; Milliseconds access; 1 AZ; $0.0100/GB; Retrieval fee per GB; Min storage duration; Min object size. Amazon Glacier: Archive data; Select minutes or hours; ≥ 3 AZ; $0.0040/GB; Retrieval fee per GB; Min storage duration; Min object size.
  • 24. Amazon S3 Glacier Deep Archive: Lowest cost storage class for long-term archiving and digital asset preservation. Fully managed without tape burden. $0.00099 per GB-month. Designed for 99.999999999% durability. Recover data in 12 hours.
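The "reduce storage cost > 80%" and "$1 for 1TB/month" claims can be checked against the per-GB prices quoted on slides 23 and 24. The sketch below assumes those prices are per GB-month (only the Deep Archive slide says so explicitly) and ignores retrieval fees, monitoring fees, minimum storage durations, and minimum object sizes, so it is not a pricing calculator.

```python
# Storage-only monthly cost using the list prices quoted on the slides.
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.0210,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.0100,
    "S3 Glacier": 0.0040,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_cost(size_gb: float, storage_class: str) -> float:
    """Storage-only cost in USD for `size_gb` held for one month."""
    return size_gb * PRICE_PER_GB_MONTH[storage_class]


def savings_vs_standard(storage_class: str) -> float:
    """Fractional saving relative to S3 Standard (0.0 means no saving)."""
    return 1.0 - PRICE_PER_GB_MONTH[storage_class] / PRICE_PER_GB_MONTH["S3 Standard"]


for name in PRICE_PER_GB_MONTH:
    print(f"{name:<24} ${monthly_cost(1000, name):7.2f} per TB-month "
          f"({savings_vs_standard(name):.0%} below S3 Standard)")
```

With these prices, only the two archive classes clear the 80% mark, and 1 TB (taken as 1,000 GB) in Deep Archive comes to about $0.99 per month.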
  • 25. S3 Storage Class Analysis and S3 Lifecycle Policy. Use S3 Storage Class Analysis to identify storage age groups that are less frequently accessed. Set S3 Lifecycle Policy to tier storage to lower cost storage classes and expire storage based on age of object. Great for predictable workloads (object age indicates access frequency). Fine tune analysis by bucket, prefix, or object tag.
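A lifecycle policy of the kind described on slide 25 might look like the sketch below. It only builds the configuration document, in the shape accepted by the S3 lifecycle API (for example boto3's `put_bucket_lifecycle_configuration`); it makes no AWS call. The prefix, rule ID, and day thresholds are hypothetical values chosen for illustration.

```python
import json


def build_lifecycle_rule(prefix: str, ia_after_days: int,
                         glacier_after_days: int, expire_after_days: int) -> dict:
    """Build one rule that tiers objects down by age and finally expires them."""
    if not 0 < ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("thresholds must be positive and strictly increasing")
    return {
        "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},          # limit the rule to one prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical policy: logs move to Standard-IA at 30 days, Glacier at 90, expire at 365.
lifecycle_configuration = {"Rules": [build_lifecycle_rule("logs/", 30, 90, 365)]}
print(json.dumps(lifecycle_configuration, indent=2))
```

Object age is the only signal used here, which matches the slide's note that lifecycle policies suit predictable workloads.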
  • 26. S3 Intelligent-Tiering storage class: Automatically optimizes storage costs for data with changing access patterns. Moves objects between two storage tiers: • Frequent access tier optimized for frequent use of data • Lower cost infrequent access tier optimized for less accessed data. Monitors access patterns and auto-tiers on granular object level. No performance impact, no operational overhead. Milliseconds access, ≥ 3 AZs, monitoring fee per object, minimum storage duration.
  • 27. File storage
  • 28. Fully managed cloud file systems. AWS provides file system options that help you easily address the diverse needs of your file-based applications and workloads. Linux-based workloads: Amazon EFS, fully managed cloud-native file system for Linux-based applications. Windows-based workloads: Amazon FSx for Windows File Server, fully managed Windows file servers for business applications. Compute-intensive workloads: Amazon FSx for Lustre, fully managed Lustre file system for compute-intensive workloads.
  • 29. Amazon Elastic File System: Scalable, elastic, cloud-native Linux file system. Shared access; Highly durable and available; Elastic and scalable; Secure and compliant; Performance; Storage classes
  • 30. Amazon FSx for Windows File Server: Lift and shift your Windows file storage with fully managed Windows file servers. Fast and flexible performance; Broad accessibility; Fully managed; Native Windows compatibility; Enterprise-ready
  • 31. Amazon FSx for Lustre: Fully managed Lustre file system for compute-intensive workloads. Native file system interface; Fully managed; Seamless access to your data repositories; Massively scalable performance; Cost-optimized for compute-intensive workloads; Secure and compliant
  • 32. Seamless integration with Amazon S3. Link your Amazon S3 data set to your Amazon FSx for Lustre file system, then: data stored in S3 is loaded to Amazon FSx for processing; output of processing is returned to S3 for retention; when your workload finishes, simply delete your file system.
  • 33. Block storage
  • 34. AWS block storage offerings: EC2 instance store (SSD or HDD); EBS SSD-backed volumes (gp2, io1); EBS HDD-backed volumes (st1, sc1)
  • 35. What is Amazon EC2 instance store? • Local to instance • Non-persistent data store • Data not replicated (by default) • No snapshot support • SSD or HDD (Figure: EC2 instances and instance store on one physical host.)
  • 36. What is Amazon EBS? • Block storage as a service • Create, attach volumes through an API • Service accessed over the network (Figure: EC2 instance and EBS volume.)
  • 37. What is Amazon EBS? • Volumes persist independent of EC2 • Select storage and compute based on your workload • Detach and attach between instances within the same Availability Zone (Figure: two EC2 instances and one EBS volume in an Availability Zone within an AWS Region.)
  • 38. What is Amazon EBS? • Volumes attach to one instance • Many volumes can attach to an instance • Separate boot and data volumes (Figure: EC2 instance with one boot EBS volume and two data EBS volumes in an Availability Zone within an AWS Region.)
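The attachment rules stated on slides 37 and 38 can be sketched as a small in-memory model. This is an illustration of the rules as the slides state them (one instance per volume, many volumes per instance, same Availability Zone), not the EC2 API; all class names and IDs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Instance:
    instance_id: str
    availability_zone: str
    volumes: List["Volume"] = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Volume:
    volume_id: str
    availability_zone: str
    attached_to: Optional[Instance] = field(default=None, repr=False)

    def attach(self, instance: Instance) -> None:
        # A volume attaches to one instance at a time.
        if self.attached_to is not None:
            raise ValueError(f"{self.volume_id} is already attached")
        # Volume and instance must be in the same Availability Zone.
        if self.availability_zone != instance.availability_zone:
            raise ValueError("volume and instance are in different Availability Zones")
        self.attached_to = instance
        instance.volumes.append(self)

    def detach(self) -> None:
        # The volume persists independently of the instance after detaching.
        if self.attached_to is None:
            raise ValueError(f"{self.volume_id} is not attached")
        self.attached_to.volumes.remove(self)
        self.attached_to = None


web = Instance("i-web", "us-east-1a")
boot = Volume("vol-boot", "us-east-1a")
data = Volume("vol-data", "us-east-1a")
boot.attach(web)   # separate boot and data volumes on one instance
data.attach(web)
print([v.volume_id for v in web.volumes])
```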
  • 39. Amazon EBS volume types: Solid-state drives (SSD); Hard disk drives (HDD)
  • 40. Choosing an Amazon EBS volume type. What is more important to your workload: IOPS or throughput?
  • 41. Choosing an Amazon EBS volume type (decision-tree figure). IOPS is more important (small, random I/O): IOPS ≤ 80,000 or > 80,000; latency < 1 ms or single-digit ms; which is more important, cost or performance? Throughput is more important (large, sequential I/O): aggregate throughput ≤ 1,750 MiB/s or > 1,750 MiB/s; which is more important, cost or performance? Options shown: i3, gp2, io1, st1, sc1, d2
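A simplified reading of the decision tree on slides 40 and 41 can be written as a lookup. This sketch covers only the four EBS volume types and deliberately leaves out the instance-store branches (i3, d2) and the numeric IOPS, latency, and throughput thresholds, since the flattened figure does not show how they connect. The mapping relies on the volume families listed earlier in the deck (General Purpose SSD, Provisioned IOPS SSD, Throughput Optimized HDD, Cold HDD); the function name is hypothetical.

```python
def choose_ebs_volume_type(dominant_need: str, priority: str) -> str:
    """Pick an EBS volume type.

    dominant_need: "iops" (small, random I/O) or "throughput" (large, sequential I/O)
    priority:      "cost" or "performance"
    """
    table = {
        ("iops", "cost"): "gp2",                # General Purpose SSD
        ("iops", "performance"): "io1",         # Provisioned IOPS SSD
        ("throughput", "cost"): "sc1",          # Cold HDD
        ("throughput", "performance"): "st1",   # Throughput Optimized HDD
    }
    key = (dominant_need.lower(), priority.lower())
    if key not in table:
        raise ValueError(f"unsupported combination: {key}")
    return table[key]


print(choose_ebs_volume_type("iops", "cost"))
print(choose_ebs_volume_type("throughput", "performance"))
```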
  • 42. Data transfer & hybrid storage services
  • 43. AWS data transfer & hybrid storage. Categories: Network-based services; Online managed data transfer; Offline data transfer; Hybrid storage. Capabilities shown: Private network connections to AWS; Edge locations for Amazon S3 enabled applications; Load streaming data into Amazon S3; Online transfer of active data (AWS DataSync, NEW); SFTP transfers into Amazon S3 (AWS Transfer for SFTP, NEW); Ship static data into and out of Amazon S3; Storage and compute in disconnected environments; Access AWS storage from on-premises
  • 44. Network-based services: AWS Direct Connect & S3 Transfer Acceleration
  • 45. AWS Direct Connect • Establish private connectivity between AWS and your data center • Dedicated connection can be partitioned into multiple virtual interfaces • Maintain network separation between public and private environments. Benefits: Reduce bandwidth costs; Consistent network performance; Compatible with all AWS services; Private connectivity to VPC; Elastic; Simple
  • 46. Amazon S3 Transfer Acceleration: Optimized throughput via AWS edge locations. Leverages AWS global edge locations (Amazon CloudFront) and an optimized AWS network path. Optimized protocols. Change your endpoint, not your code. No firewall exceptions & no client software required. Speeds up transfers for applications that use the S3 API over long distances. On average, a 171% improvement over regular Amazon S3 CLI commands when uploading over long distances.
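"Change your endpoint, not your code" refers to addressing the bucket through the accelerate hostname instead of the regular one. A minimal sketch follows, using the global virtual-hosted-style hostnames; the bucket name is hypothetical, acceleration must already be enabled on the bucket, and SDKs usually expose this as a client setting instead of a hand-built URL.

```python
def s3_endpoint(bucket: str, accelerate: bool = False) -> str:
    """Return the virtual-hosted-style endpoint URL for a bucket."""
    if accelerate and "." in bucket:
        # Transfer Acceleration does not support bucket names containing periods.
        raise ValueError("accelerated bucket names must not contain '.'")
    host = "s3-accelerate.amazonaws.com" if accelerate else "s3.amazonaws.com"
    return f"https://{bucket}.{host}"


print(s3_endpoint("example-bucket"))
print(s3_endpoint("example-bucket", accelerate=True))
```

Only the hostname changes; the request paths and the S3 API calls made by the application stay the same.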
  • 47. AWS DataSync
  • 48. What use cases need to transfer active data? Online data transfer: Migration of active application data; Transferring data for time-sensitive in-cloud analysis; Replication of data for business continuity
  • 49. AWS DataSync: Online transfer service that simplifies, automates, and accelerates moving data between on-premises storage and AWS. Combines the speed and reliability of network acceleration software with the cost-effectiveness of open source tools. Fast data transfer: Up to 10 Gbps. Easy to use: Fully managed in-cloud w/ agents on-prem. Secure and reliable: Data encryption & validation. Enterprise ready: PCI & HIPAA; works w/ AWS IAM & CloudTrail. Cost-effective: Usage-based, $0.04 per GB copied.
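The two numbers on slide 49 (up to 10 Gbps, $0.04 per GB copied) allow a back-of-the-envelope estimate of transfer time and cost. The sketch below assumes decimal gigabytes, a link that sustains the stated rate (scaled by a hypothetical `utilization` factor), and no other charges; real throughput depends on the source storage and network.

```python
def datasync_estimate(size_gb: float, link_gbps: float = 10.0,
                      price_per_gb: float = 0.04, utilization: float = 1.0):
    """Return (hours, usd) for copying `size_gb` gigabytes.

    utilization is the fraction of the link actually achieved (an assumption).
    """
    seconds = size_gb * 8 / (link_gbps * utilization)   # GB -> gigabits -> seconds
    cost = size_gb * price_per_gb
    return seconds / 3600, cost


hours, cost = datasync_estimate(10_000)   # hypothetical 10 TB data set
print(f"10 TB: about {hours:.1f} hours at line rate, ${cost:,.2f} in per-GB fees")
```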
  • 50. How AWS DataSync works. On-premises: shared file system (NFS); AWS DataSync agent deployed on-premises for fast access to local storage. Data transfer over the WAN (TLS) via efficient purpose-built protocol. AWS Region: service in AWS writes or reads data from AWS storage services (Amazon S3 bucket, Amazon EFS file system). Managed from the console or AWS Command Line Interface (AWS CLI).
  • 51. AWS Transfer for SFTP
  • 52. SFTP: It’s here; it’s everywhere. Protocol is deeply embedded in workflows across a variety of industries: Financial services, Retail, Healthcare … and more
  • 53. AWS Transfer for SFTP: Fully managed service enabling transfer of data over SFTP while stored in Amazon S3. Seamless migration of existing workflows; Native integration with AWS services; Simple to use; Cost effective; Fully managed in AWS; Enterprise ready
  • 54. Snowball family
  • 55. Why Snowball Edge & the Snow family? Offline transfer of large data volumes + edge computing, analytics & machine learning in remote and harsh environments. Moving large datasets over slow links can take years. Remote locations with limited, intermittent or no WAN. Many industries need edge computing for environments where data generation is decentralized, and data volumes are significant.
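The claim that moving large datasets over slow links can take years is easy to verify with arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a hypothetical sustained link utilization of 80%; the data set size and link speed are example values, not figures from the slide.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Days needed to push `size_tb` terabytes over a `link_mbps` link."""
    bits = size_tb * 1e12 * 8                       # terabytes -> bits
    effective_bps = link_mbps * 1e6 * utilization   # usable bits per second
    return bits / effective_bps / SECONDS_PER_DAY


days = transfer_days(1000, 100)   # hypothetical: 1 PB over a 100 Mbps link
print(f"1 PB over 100 Mbps: about {days:,.0f} days ({days / 365:.1f} years)")
```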
  • 56. AWS Snow Family. AWS Snowball: • 80-TB storage capacity • 10GE networking • Data encryption end-to-end • Rugged 8.5-G impact case • Rain and dust-resistant. AWS Snowball Edge (Compute or Storage Optimized): • 42 or 100-TB storage capacity • Data encryption end-to-end • Rugged 8.5-G impact case • Rain and dust resistant • AWS Greengrass support for local compute, messaging, and caching • EC2/AMI support for edge compute • Optional GPU. AWS Snowmobile: • Exabyte-scale storage in a 45-ft container • Data encryption end-to-end • Dedicated security personnel • GPS tracking, alarm monitoring, 24/7 surveillance, and optional additional security
  • 57. Snowball Edge options summary (Compute Optimized / Storage Optimized). Amazon S3 compatible storage: 42 TB / 100 TB. Compute: comparable to m5d.2xlarge / comparable to m4.4xlarge. AWS services available: Lambda, File Gateway, Amazon EC2 / Lambda, File Gateway, Amazon EC2. Memory: 208 GB / 32 GB. vCPUs: 52 / 24. Disk space for instances: 7.62 TB / 1 TB. Clustering: available / available. Typical job lifetime: can be long lasting / about 1 month. Networking: max 100 Gb / max 40 Gb. GPU: optional Nvidia Tesla V100 / no.
  • 58. AWS Storage Gateway family
  • 59. AWS Storage Gateway: On-premises access to virtually unlimited cloud storage. Customer premises: gateway appliance presenting Files (NFS/SMB), Volumes (iSCSI), and Tapes (iSCSI VTL); appliance configuration: VMware, Hyper-V, EC2, hardware. Connected over HTTPS to the Storage Gateway managed service in the AWS Cloud: Amazon S3, S3 Glacier, S3 Glacier Deep Archive, Amazon Elastic Block Store (EBS), AWS Backup. Integrated with IAM, KMS, CloudTrail, CloudWatch services.
  • 60. The AWS Storage Gateway family: Three gateway types provide file, block, and tape storage interfaces. File gateway: Store and access objects in Amazon S3 from file-based applications with local caching. Volume gateway: Block storage on-premises backed by cloud storage with local caching, Amazon EBS snapshots, and clones. Tape gateway: Drop-in replacement for physical tape infrastructure backed by cloud storage with local caching.
  • 61. AWS Streaming Services
  • 62. Streaming with Amazon Kinesis: Easily collect, process, and analyze video and data streams in real time. Capture, process, and store video streams; Load data streams into AWS data stores; Analyze data streams in real time; Capture, process, and store data streams
  • 63. Amazon Kinesis Data Firehose: How it works. Ingest, Transform, Deliver. Sources: AWS IoT, Amazon Kinesis Agent, Amazon Kinesis Streams, Amazon CloudWatch Logs, Amazon CloudWatch Events, Apache Kafka. Destinations: Amazon S3, Amazon Redshift, Amazon Elasticsearch Service
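The "Transform" step on slide 63 is commonly implemented as an AWS Lambda function that Firehose invokes with a batch of base64-encoded records. The sketch below follows that record contract (`recordId`, `result`, `data`); the enrichment itself (adding a `processed` flag) and the sample records are hypothetical, and the handler is exercised locally without any AWS call.

```python
import base64
import json


def lambda_handler(event, context):
    """Decode each incoming record, enrich it, and hand it back to Firehose."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Return the original data and mark the record as failed.
            output.append({"recordId": record["recordId"],
                           "result": "ProcessingFailed",
                           "data": record["data"]})
            continue
        payload["processed"] = True   # hypothetical enrichment
        encoded = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8"))
        output.append({"recordId": record["recordId"],
                       "result": "Ok",
                       "data": encoded.decode("ascii")})
    return {"records": output}


# Local dry run with one valid and one malformed record.
sample_event = {"records": [
    {"recordId": "1", "data": base64.b64encode(b'{"sensor": "a", "value": 3}').decode("ascii")},
    {"recordId": "2", "data": base64.b64encode(b"not json").decode("ascii")},
]}
result = lambda_handler(sample_event, None)
print([r["result"] for r in result["records"]])
```

Appending a newline to each transformed record is a design choice that keeps delivered objects line-delimited, which is convenient for querying them later.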
  • 64. © 2019, Amazon Web Services, Inc. or its Affiliates.© 2019, Amazon Web Services, Inc. or its Affiliates. AWS Partner Network (APN) for data transfer, migration & storage
  • 65. © 2019, Amazon Web Services, Inc. or its Affiliates. Backup & restore Archive Primary storage BC/DRData migration AWS Partner Network (APN): Migration & storage
  • 66. © 2019, Amazon Web Services, Inc. or its Affiliates.© 2019, Amazon Web Services, Inc. or its Affiliates. Data Lakes, Database and Analytic Services
  • 67. © 2019, Amazon Web Services, Inc. or its Affiliates. Store exabytes of structured and unstructured data Load, transform, and catalog once Make data available to many tools Open formats and interfaces support innovation Introducing AWS Lake Formation Snowball Snowmobile Kinesis Data Firehose Kinesis Data Streams Amazon S3 Amazon Redshift Amazon EMR Amazon Athena Amazon Kinesis Amazon Elasticsearch Service Kinesis Video Streams AI Services Amazon QuickSight Data lakes Build and deploy a fully managed data lake with a few clicks Centrally define security, governance, and auditing policies Self-service discovery and safe access to all data from a single catalog
  • 68. Databases and analytics services: built for builders. Data Movement: Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Data Pipeline | Direct Connect. Data Lake: S3/Amazon Glacier; AWS Glue (ETL & Data Catalog); Lake Formation (Data Lakes). Analytics: Amazon Redshift (data warehousing); Amazon EMR (Hadoop + Spark); Athena (interactive analytics); Kinesis Analytics (real-time); Amazon Elasticsearch Service (operational analytics). Databases: RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server); Aurora (MySQL, PostgreSQL); DynamoDB (key value, document); ElastiCache (Redis, Memcached); Neptune (graph); Timestream (time series); QLDB (ledger database); RDS on VMware. Business Intelligence & Machine Learning: Amazon QuickSight, Amazon SageMaker, Amazon Comprehend, Amazon Rekognition, Amazon Lex, Amazon Transcribe, AWS DeepLens. Blockchain: Managed Blockchain, Blockchain Templates. AWS Marketplace: 250+ solutions, 730+ Database solutions, 600+ Analytics solutions, 25+ Blockchain solutions, 20+ Data lake solutions, 30+ solutions.
  • 69. What are DMS and SCT? AWS Database Migration Service (DMS) easily and securely migrates and/or replicates your databases and data warehouses to AWS. AWS Schema Conversion Tool (SCT) converts your commercial database and data warehouse schemas to open-source engines or AWS-native services, such as Amazon Aurora and Redshift.
  • 70. When to use DMS*? Migrate: migrate business-critical applications; migrate from Classic to VPC; migrate a data warehouse to Redshift; upgrade to a minor version; consolidate shards into Aurora; archive old data; migrate from NoSQL to SQL, SQL to NoSQL, or NoSQL to NoSQL. Targets: Amazon DynamoDB, Amazon Redshift, Amazon S3, Amazon Aurora. Sources include Amazon S3. *DMS is a HIPAA-eligible service.
  • 71. When to use DMS? Replicate: create cross-region read replicas; run your analytics in the cloud; keep your dev/test and production environments in sync.
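The migrate and replicate use cases on the two DMS slides map roughly onto the migration type of a DMS replication task: `full-load` for a one-time copy, `cdc` for ongoing change replication, and `full-load-and-cdc` for both. The sketch below only builds the request parameters (named after the boto3 `create_replication_task` arguments) and a selection rule in the DMS table-mapping format; it does not call AWS. The helper name `build_task_request` and all ARNs are hypothetical placeholders, not values from the talk.

```python
import json

# Migration types accepted by a DMS replication task.
MIGRATION_TYPES = {"full-load", "cdc", "full-load-and-cdc"}


def build_task_request(task_id, source_arn, target_arn, instance_arn,
                       migration_type, schema="%", table="%"):
    """Return keyword arguments for a DMS replication task (hypothetical helper)."""
    if migration_type not in MIGRATION_TYPES:
        raise ValueError(f"unsupported migration type: {migration_type}")
    # One selection rule; "%" is the wildcard that matches every schema or table.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-tables",
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": migration_type,
        # The API expects the table mappings as a JSON string.
        "TableMappings": json.dumps(table_mappings),
    }


# Placeholder ARNs for illustration only.
request = build_task_request(
    "keep-dev-in-sync",
    "arn:aws:dms:region:account:endpoint:SOURCE",
    "arn:aws:dms:region:account:endpoint:TARGET",
    "arn:aws:dms:region:account:rep:INSTANCE",
    "full-load-and-cdc",
    schema="sales",
)
```

Keeping dev/test in sync with production would typically use `full-load-and-cdc`, while archiving old data could be a plain `full-load`.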
  • 72. DMS + Snowball. Common use cases: migrate large databases (over 5 TB); migrate many databases at once; migrate over a slow network; push vs. pull.
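A rough calculation shows why database size and network speed drive the DMS + Snowball use cases above. This is a back-of-the-envelope sketch, not a sizing tool: it assumes decimal terabytes (10^12 bytes) and a sustained link utilization of 80%, both of which are illustrative assumptions rather than figures from the talk.

```python
def transfer_days(size_tb, link_mbps, utilization=0.8):
    """Estimate days needed to copy size_tb terabytes over a link_mbps link."""
    bits = size_tb * 1e12 * 8                      # decimal TB to bits
    effective_bps = link_mbps * 1e6 * utilization  # usable bits per second
    return bits / effective_bps / 86400            # seconds to days


# 5 TB over a 100 Mbps link at 80% utilization: about 5.8 days.
print(round(transfer_days(5, 100), 1))

# 50 TB over the same link: about 57.9 days.
print(round(transfer_days(50, 100), 1))
```

When the estimate runs to weeks, shipping the data on a device becomes the more practical option.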
  • 73. What about security? Security & Compliance is a shared responsibility between AWS and the customer.
  • 74. Thank you