Learning Objectives:
- Understand the use cases for migrating or replicating databases to the cloud
- Learn about the benefits of cloud-native databases for performance and cost reduction
- See how AWS Database Migration Service helps with your migration and how AWS Schema Conversion Tool makes conversions simple and quick
Moving or replicating your databases to the cloud should be simple and inexpensive. AWS has recently enhanced the AWS Database Migration Service and the AWS Schema Conversion Tool with new data sources to increase your migration options. You can now export from MongoDB databases, as well as from Greenplum, IBM Netezza, HPE Vertica, Teradata, Oracle DW, and Microsoft SQL Server data warehouses, to AWS. Learn how to export and migrate your data and procedural code with minimal downtime to the cloud database of your choice, including cloud-native offerings such as Amazon Aurora, Amazon DynamoDB, and Amazon Redshift.
2. How can I get to the cloud?
How will my on-premises data migrate to the cloud?
How can I make it transparent to my users?
Afterwards, how will on-premises and cloud data interact?
How can I integrate my data assets within AWS?
Can I get help moving off of commercial databases?
3. Migration used to be cost + complexity + time
Commercial data migration and replication software
Complex to set up and manage
Application downtime
Database-engine-specific application code
4. What are DMS and SCT?
AWS Database Migration Service (DMS) easily and securely migrates or replicates your databases and data warehouses to AWS
AWS Schema Conversion Tool (SCT) converts your commercial database and data warehouse schemas to open-source engines or AWS-native services, such as Amazon Aurora and Amazon Redshift
We have migrated over 26,000 unique databases. And counting…
5. When to use DMS and SCT?
Modernize Migrate Replicate
6. When to use SCT?
Modernize
• Modernize your database tier (Amazon Aurora)
• Modernize and migrate your data warehouse to Amazon Redshift
7. SCT helps with converting tables, views, and code
Sequences
User-defined types
Synonyms
Packages
Stored procedures
Functions
Triggers
Schemas
Tables
Indexes
Views
Sort and distribution keys
8. SCT Migration Assessment Report
• Assesses migration compatibility of source databases with open-source database engines – RDS MySQL, RDS PostgreSQL, and Aurora
• Recommends the best target engine
• Provides a detailed level-of-effort estimate to complete the migration
9. New SCT Data Extractors
Extract data from your data warehouse and migrate it to Amazon Redshift
• Extracts data through local migration agents
• Data is optimized for Redshift and saved in local files
• Files are loaded to an Amazon S3 bucket (over the network or via Amazon Snowball) and then into Amazon Redshift
[Diagram: AWS SCT → local agents → S3 bucket → Amazon Redshift]
11. Migration process using Data Extractors
1. Connect SCT to Source and Target databases
2. Run Assessment Report
3. Migrate schema objects to Target
4. Manually migrate any objects that could not be automatically migrated
5. Perform a full data load using the SCT Data Extractors
6. Catch up intervening data changes on the Source
12. Fault Tolerance – When an agent fails
• Install an agent on another computer
• Configure the new agent with the same settings as the old agent
• Start the new agent; the old task will continue running on the new agent
[Diagram: Task 1 migration status checkpoint – Table = Sales, Partition = 201705, Current PK = 1000 – picked up unchanged by the new agent]
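The resume-from-checkpoint behavior on this slide can be sketched as follows. The `Checkpoint` class and `extract` function are hypothetical names for illustration, not the actual SCT agent implementation; rows are represented by their primary-key values.

```python
from dataclasses import dataclass

@dataclass
class Checkpoint:
    """Hypothetical per-task checkpoint, mirroring the status on the slide."""
    table: str
    partition: str
    current_pk: int

def extract(row_pks, checkpoint):
    """Resume extraction past the last primary key the failed agent recorded."""
    migrated = []
    for pk in row_pks:
        if pk <= checkpoint.current_pk:
            continue  # already migrated by the old agent
        migrated.append(pk)
        checkpoint.current_pk = pk  # advance the checkpoint after each row
    return migrated

# The replacement agent is configured with the same settings,
# so it sees the same checkpoint and skips completed work.
cp = Checkpoint(table="Sales", partition="201705", current_pk=1000)
remaining = extract(range(1, 1006), cp)
print(remaining)  # only rows after PK 1000
```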
13. Parallel Processing
The extractor’s unit of work is a table partition or an un-partitioned table
[Diagram: monthly partitioned sales table – partitions 2016-12 through 2017-04 assigned to Tasks 1–3; un-partitioned customers table assigned to Task 4]
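The partition-per-task scheduling above can be sketched with a thread pool. This is an illustrative sketch only; the real SCT agents schedule and distribute work themselves, and the unit names below are made up to match the diagram.

```python
from concurrent.futures import ThreadPoolExecutor

# Units of work, as on the slide: one per partition of the sales table,
# plus the whole un-partitioned customers table.
units = ["sales/2016-12", "sales/2017-01", "sales/2017-02",
         "sales/2017-03", "sales/2017-04", "customers"]

def extract_unit(unit):
    # A real agent would dump rows for this unit into a local file here.
    return f"extracted {unit}"

# e.g. four agent tasks running in parallel
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(extract_unit, units))

print(results)
```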
15. Redshift Table Keys
SCT chooses sortkeys and distkeys for your tables based on metadata and/or statistics
• Distkey – distributes table data across the nodes of the cluster
• Sortkeys – physically order the table data within each node
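The keys SCT selects end up in Redshift DDL like the statement built below. This is an illustrative sketch: SCT generates the real DDL from metadata and statistics, and the table and column names here are invented.

```python
def redshift_ddl(table, columns, distkey, sortkeys):
    """Build a Redshift CREATE TABLE statement carrying a DISTKEY and SORTKEY.
    (Illustrative only - SCT emits the actual DDL for you.)"""
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE TABLE {table} (\n  {cols}\n)\n"
        f"DISTKEY ({distkey})\n"
        f"SORTKEY ({', '.join(sortkeys)});"
    )

ddl = redshift_ddl(
    "sales",
    [("sale_id", "BIGINT"), ("customer_id", "BIGINT"), ("sale_date", "DATE")],
    distkey="customer_id",   # co-locate rows that join on customer_id
    sortkeys=["sale_date"],  # speed up range-restricted scans on date
)
print(ddl)
```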
16. Best practices for using SCT Extractors
• Install agents on computers close to your data warehouse
• Configure agents with SSL for encrypted data transfer
• Disperse agents across N > 1 servers to tolerate network and server failures
• SCT will assign tasks for large tables on a partition basis
• Review automatic compression after data is loaded
• Validate selected sortkeys and distkeys against the expected workload
17. When to use SCT DW Extractors
• For full loads of your DW data to Redshift
• To catch up intervening changes after the full load
• For parallel processing of large partitioned tables
• When network or server issues could interrupt the extract process
• For parallel processing of multiple tables simultaneously
• To automatically take advantage of Redshift data compression
18. SCT Cost
• SCT is a free download available on the AWS website; you pay only for resources used by the extractors
• S3 – cost depends on the amount of storage and time used; clean up unused objects to reduce cost
• Extract files are compressed before uploading to S3
• EC2 – if extractors are deployed in AWS, then instance costs are incurred
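Compressing extract files before upload matters for the S3 cost line above, because repetitive row-oriented extracts compress well. A local illustration (the SCT extractors handle compression themselves; the sample data is invented):

```python
import gzip

# A repetitive CSV-style extract, as a data warehouse dump often is.
extract = ("1000,201705,Sales,Widget,19.99\n" * 10_000).encode()

compressed = gzip.compress(extract)  # extractors compress before S3 upload

ratio = len(compressed) / len(extract)
print(f"{len(extract)} bytes -> {len(compressed)} bytes ({ratio:.1%})")
```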
19. Resources available to customers —
AWS Schema Conversion Tool
User Guide: Review technical docs at aws.amazon.com/documentation/SchemaConversionTool/ or choose the Download button.
Download area: Get installation files for the Schema Conversion Tool.
Support forums: Ask questions and review how-to guides at https://forums.aws.amazon.com/forum.jspa?forumID=208.
20. When to use DMS*?
Migrate
• Migrate business-critical applications
• Migrate from Classic to VPC
• Migrate a data warehouse to Redshift
• Upgrade to a minor version
• Consolidate shards into Aurora
• Migrate from NoSQL to SQL, SQL to NoSQL, or NoSQL to NoSQL
[Diagram: supported sources and targets – targets shown include Amazon DynamoDB, Amazon Redshift, Amazon S3, and Amazon Aurora]
*DMS is a HIPAA certified service
21. NoSQL Database Migration
Modernize
AWS DMS now supports MongoDB as a migration source and Amazon DynamoDB as a migration target
[Diagram: MongoDB source migrating to Amazon Aurora and Amazon DynamoDB targets]
22. MongoDB
DMS supports MongoDB versions 2.6.x and 3.x as a database source
DMS supports two migration modes when using MongoDB as a source
Document Mode
• JSON data migrated with no changes
• Documents written to single column in target table named “_doc”
• Can create “_id” column as PK (required for CDC)
Table Mode
• Documents are flattened into table data rows
• DMS derives target columns from a sample of the source documents
• Fields not present in the sampled documents are not replicated
• Collections cannot be renamed
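The two modes can be contrasted on a small document. The sample document and the dotted column names produced by `flatten` are illustrative; DMS's actual flattening rules for nested fields may differ.

```python
import json

doc = {"_id": "a1", "name": "Ada", "address": {"city": "Berlin", "zip": "10115"}}

# Document mode: the JSON is written unchanged to a single "_doc" column,
# with "_id" optionally promoted to the primary key (required for CDC).
document_mode_row = {"_id": doc["_id"], "_doc": json.dumps(doc)}

def flatten(d, prefix=""):
    """Table mode sketch: flatten nested fields into flat column names."""
    row = {}
    for key, value in d.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, name + "."))
        else:
            row[name] = value
    return row

table_mode_row = flatten(doc)
print(table_mode_row)
```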
23. MongoDB Change Data Capture
[Diagram: MongoDB primary with its OpLog replicating to a MongoDB replica; DMS reads the replica's OpLog]
• DMS requires a MongoDB replica set if one doesn’t already exist
• CDC scans operations log copied from the primary by members of the replica set
• Set extractDocID = TRUE to capture document IDs as part of CDC
24. Amazon DynamoDB
A fully managed NoSQL database service that provides fast and predictable performance with seamless scalability
• Fast performance – SSD technologies, high throughput, low latencies
• Scales to throughput requirements – seamlessly scales if requirements go up or down
• Seamless hardware and platform management – automatically repartitions data when scaling
• Flexible architecture – supports both document and key-value models
25. DMS mapping options for DynamoDB
DMS supports two mapping models for DynamoDB targets
• map-record-to-record
• map-record-to-document
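The difference between the two models can be sketched on a single source row. The `_doc` attribute name and the sample row are illustrative assumptions; check the DMS task's object-mapping settings for the actual attribute names used.

```python
row = {"id": "42", "name": "Ada", "city": "Berlin"}
partition_key = "id"

# map-record-to-record: every source column becomes a top-level attribute.
record_item = dict(row)

# map-record-to-document: key columns stay top-level, and the remaining
# columns are folded into a single map attribute (shown here as "_doc").
document_item = {
    partition_key: row[partition_key],
    "_doc": {k: v for k, v in row.items() if k != partition_key},
}

print(record_item)
print(document_item)
```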
26. DynamoDB migration considerations
• DynamoDB supports three data types: Number, String, and Boolean
• DMS converts Date values to String
• CLOBs are converted to Strings; LOBs are not supported
• Max precision for Number fields is 38 digits; higher-precision numbers should be mapped to String
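The conversion rules above can be sketched as a single mapping function. This is an illustration of the stated rules, not DMS's actual conversion code; `to_dynamodb_value` is a hypothetical helper name.

```python
from datetime import date
from decimal import Decimal

def to_dynamodb_value(value):
    """Sketch of the DynamoDB target conversion rules listed above."""
    if isinstance(value, bool):
        return value                    # Boolean is supported natively
    if isinstance(value, date):
        return value.isoformat()        # Dates become Strings
    if isinstance(value, Decimal):
        # Number precision maxes out at 38 digits; higher-precision
        # values should be mapped to String instead.
        digits = len(value.as_tuple().digits)
        return value if digits <= 38 else str(value)
    return value

print(to_dynamodb_value(date(2017, 5, 1)))          # Date -> String
print(to_dynamodb_value(Decimal("1." + "3" * 40)))  # too precise -> String
```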
27. AWS Database Migration Service pricing
T2 for development and periodic data migration tasks
C4 for large databases and for minimizing migration time
T2 pricing starts at $0.018 per hour for T2.micro
C4 pricing starts at $0.154 per hour for C4.large
50 GB GP2 storage included with T2 instances
100 GB GP2 storage included with C4 instances
Data transfer inbound and within AZ is free
Data transfer across AZs starts at $0.01 per GB
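A back-of-the-envelope estimate using the rates quoted above; the runtime and transfer volume are invented for illustration, and current rates should be checked on the AWS pricing page.

```python
# Hypothetical migration: three days on a C4.large replication instance.
hours = 72
c4_large_rate = 0.154            # USD per hour, from the slide
instance_cost = hours * c4_large_rate

# Inbound and same-AZ transfer are free; cross-AZ transfer is billed.
cross_az_gb = 200                # GB replicated across AZs (assumed)
transfer_cost = cross_az_gb * 0.01

total = instance_cost + transfer_cost
print(f"${total:.2f}")           # instance + cross-AZ transfer
```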
28. Resources available to customers—DMS
Getting Started Guide: Review technical documentation.
Features and benefits: Highlights DMS features.
Pricing: Prices for replication instances, storage, and data transfer.
Support: Post your questions to our support forum.
Java SDK: Java-based API for creating and managing data migration tasks.
AWS Command Line Interface: Start and stop replication tasks with simple commands.