Open Source Databases on the Cloud - Peter Dachnowicz
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
1. Open Source Managed Databases
Peter Dachnowicz
pdach@amazon.com
Sr Technical Account Manager
2. A Brief History of MySQL
• World’s most commonly used relational database
• Key component of the LAMP open source web application software stack
• Used in open source applications such as WordPress, Joomla, Redmine, and Drupal
• History
• First released in 1995 in Sweden by MySQL AB
• Acquired by Sun in 2008
• Launched as first Amazon RDS engine in 2009
• Acquired by Oracle in 2010
• Popular Branches
• MySQL Community Edition
• MySQL Enterprise Edition
• MariaDB Server
• Percona Server
• Amazon Aurora MySQL
3. MySQL is the “M” in the LAMP stack
4. Managed MySQL Compatible Engines at AWS
• Standard – The open source standard MySQL
• Community – The popular community choice
• Performance – Amazon Aurora, the fastest MySQL compatible engine
5. RDS MySQL
• Most popular open-source database engine
• Support for MySQL Community Edition
– Current versions are 5.5.59, 5.6.39, and 5.7.21
• InnoDB and MyISAM storage engines
• Version 5.7 - New Features
– JSON support
– Query optimizer improvements
– GIS extensions
– Improved parallel replication
– Dynamic buffer pool resizing
• Version 8.0 coming soon
6. RDS MariaDB
• Support for MariaDB Server
– Current versions are 10.0.35, 10.1.34, and 10.2.15
• Same instance, regions, pricing as RDS MySQL (including free tier)
• Differences from RDS MySQL
– XtraDB and Aria storage engines only
– Thread Pooling
– GTID
• Version 10.2 - New Features
– InnoDB now default storage engine
– Multiple triggers on the same event
– Auto-partition of table cache
7. RDS Best Practices
• Leverage Multi-AZ Configurations
– Pros: High availability for planned and unplanned events
– Cons: Additional hourly charge, increased latency
• Leverage Read Replicas
– Pros: Read scalability, disaster recovery, upgrades
– Cons: Application must handle asynchronous behavior, watch log storage on master
• Leverage Enhanced Monitoring
– Pros: Additional information for tuning and troubleshooting
– Cons: CloudWatch Logs charge (small)
* Excluding micro instance classes
8. MariaDB Audit Plug-in for MariaDB and MySQL
• Provides customer configurable event logging for database activity
– Auditable events include logins, queries, and tables accessed
– Individual users can be included or excluded from the audit
• The MariaDB audit plug-in is supported on RDS versions of MariaDB and MySQL
• Available via RDS option group
• Can impact server performance
– up to 10% penalty for full logs
9. MySQL & MariaDB Best Practices
• Use RDS-provided stored procedures for managing sessions, queries, and read replicas
– mysql.rds_kill(processID) to terminate a connection
– mysql.rds_kill_query(queryID) to terminate a query
– mysql.rds_stop_replication to manually stop replication
– mysql.rds_skip_repl_error to skip the current replication error
– mysql.rds_next_master_log to change the master log position on the replica
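The stored procedures listed above are invoked with CALL. As a minimal sketch, the helper below only builds the statement text for the five procedures named on this slide; the helper name `rds_call` and the integer-only argument check are assumptions for illustration, and running the statement through a database driver is left out.

```python
# Number of arguments each procedure from the slide takes.
RDS_PROCEDURES = {
    "rds_kill": 1,
    "rds_kill_query": 1,
    "rds_stop_replication": 0,
    "rds_skip_repl_error": 0,
    "rds_next_master_log": 1,
}


def rds_call(procedure, *args):
    """Return a CALL statement for one of the RDS-provided procedures."""
    if procedure not in RDS_PROCEDURES:
        raise ValueError("unknown RDS procedure: %s" % procedure)
    if len(args) != RDS_PROCEDURES[procedure]:
        raise ValueError("%s expects %d argument(s)"
                         % (procedure, RDS_PROCEDURES[procedure]))
    for arg in args:
        # Only plain integers (process IDs, log index numbers) are accepted,
        # so nothing user-supplied is interpolated as raw SQL text.
        if isinstance(arg, bool) or not isinstance(arg, int):
            raise TypeError("arguments must be integers")
    if args:
        return "CALL mysql.%s(%s);" % (procedure, ", ".join(map(str, args)))
    return "CALL mysql.%s;" % procedure


if __name__ == "__main__":
    print(rds_call("rds_kill", 42))
    print(rds_call("rds_stop_replication"))
```

The process ID passed to rds_kill would typically come from SHOW PROCESSLIST on the instance.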
10. MySQL & MariaDB Logging Best Practices
• Enable MySQL general and slow query logs for troubleshooting
– The logs can be written to a CSV table or file
– The long_query_time database parameter sets the threshold for slow queries
– Avoid large CSV table logs by using stored procedures to rotate the logs
• mysql.rds_rotate_general_log
• mysql.rds_rotate_slow_log
– Disable the general and slow query logs if they are not actively being used for diagnosis or troubleshooting
• The general and slow query logs can impact database performance, especially if long_query_time is set to a low value
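The logging settings above live in a DB parameter group. A rough sketch of how they might be assembled follows; the list-of-dicts shape is modeled on the Parameters argument of boto3's modify_db_parameter_group, while the function name, the 2-second default threshold, and the "immediate" apply method are assumptions for illustration. No AWS call is made here.

```python
def logging_parameters(slow_log=True, general_log=False,
                       long_query_time=2.0, log_output="FILE"):
    """Build parameter-group entries for the MySQL/MariaDB query logs."""
    if log_output not in ("FILE", "TABLE"):
        raise ValueError("log_output must be FILE or TABLE")
    if long_query_time < 0:
        raise ValueError("long_query_time must not be negative")
    settings = {
        "slow_query_log": "1" if slow_log else "0",
        "general_log": "1" if general_log else "0",
        # Threshold in seconds; low values log more queries and cost more.
        "long_query_time": str(long_query_time),
        "log_output": log_output,
    }
    return [
        {"ParameterName": name, "ParameterValue": value,
         "ApplyMethod": "immediate"}  # assumed apply method
        for name, value in settings.items()
    ]


if __name__ == "__main__":
    for entry in logging_parameters():
        print(entry)
```

Calling logging_parameters(slow_log=False, general_log=False) gives the entries for switching both logs off again once troubleshooting is done.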
11. MySQL & MariaDB Replication Tips
• Replicas can be made writeable
– Useful for changes like adding an index for reporting
– Think of it as “breaking the warranty”
• RDS supports managed replication chaining (source > target > target)
– Replica in one region and then a cross-region replica in another region
• RDS limits each master to 5 replicas (can be extended upon request)
• RDS instances support non-managed replication
– Useful for on-premises to RDS, EC2 to RDS scenarios
– Uses stored procedures rather than service API for managing
• RDS MariaDB provides crash-safe replication using Global Transaction Identifiers (GTIDs)
12. MySQL & MariaDB Storage Types
• The type of storage can make a big difference for I/O intensive operations
– Magnetic – Low cost but IOPS and latency can vary
– General Purpose (GP2) – SSD-based storage with 3K IOPS burst capability, then a baseline rate of 3 IOPS per GB. Throttled via a credit-based system. Great for storage below 1 TB, especially when you do not deplete credits.
– Provisioned IOPS (PIOPS) – SSD-based storage with defined IOPS rates. Great when you need consistent performance or when you need very high performance.
– For almost all use cases we recommend an SSD-based storage type
– Magnetic storage has average IOPS and latencies that are an order of magnitude worse than SSD-based storage.
– GP2 is the default and makes sense until you start to go above 1 TB.
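The GP2 numbers above (3 IOPS per GB baseline, 3K IOPS burst) can be turned into a quick sizing check. This is a minimal sketch: the 100 IOPS minimum baseline is an assumption taken from the EBS documentation rather than from this slide, the per-volume upper IOPS cap is ignored, and the function names are made up for illustration.

```python
GP2_IOPS_PER_GB = 3           # baseline rate from the slide
GP2_BURST_IOPS = 3000         # burst rate from the slide
GP2_MIN_BASELINE_IOPS = 100   # assumption: minimum baseline per EBS docs


def gp2_baseline_iops(size_gb):
    """Baseline IOPS for a GP2 volume of the given size in GB."""
    return max(GP2_MIN_BASELINE_IOPS, GP2_IOPS_PER_GB * size_gb)


def gp2_relies_on_burst(size_gb):
    """True when the baseline is below the burst rate, so credits matter."""
    return gp2_baseline_iops(size_gb) < GP2_BURST_IOPS


if __name__ == "__main__":
    for size in (20, 200, 1000):
        print(size, gp2_baseline_iops(size), gp2_relies_on_burst(size))
```

At 1000 GB the baseline already equals the 3K burst rate, which is why burst credits stop being a concern at roughly 1 TB.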
13. MySQL & MariaDB Storage Best Practices
• Fast, consistent storage is important since many routine operations can be
heavily I/O dependent
– Crash Recovery – InnoDB/XtraDB needs to scan through certain files as well as roll back and roll forward transactions.
– Engine Upgrades – MySQL scans through all tablespaces during a minor or major engine version upgrade.
– DDL – Even with online DDL, MySQL may copy a potentially large table. Index creations and rebuilds are also I/O intensive.
• Multi-AZ failovers can be less than a minute with most of the downtime in DNS propagation
– If MySQL needs to do a crash recovery or other I/O intensive operation before it can start, the speed of storage could mean downtime is several minutes or even hours
14. MySQL & MariaDB Storage Tips
• GP2 is a great choice but be careful about burst credits on volumes < 1TB
– Hitting credit depletion results in IOPS drop. Latency and queue depth metrics will
spike until credits are replenished.
– Monitor read/write IOPS to see if average IOPS is greater than the baseline.
• Think of GP2 burst rate and PIOPS stated rate as maximum I/O rates.
– Your application needs to actually run an I/O intensive workload to hit these rates
– Multi-AZ introduces extra commit latency, so Multi-AZ (MAZ) IOPS will typically be lower than Single-AZ (SAZ)
• Use EBS-optimized instance types
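To make the burst-credit warning concrete, the sketch below estimates how long a GP2 volume can sustain an average IOPS rate above its baseline before credits run out. The initial balance of 5.4 million I/O credits and the 100 IOPS minimum baseline are assumptions taken from the EBS documentation, not from these slides, and the function name is hypothetical.

```python
GP2_IOPS_PER_GB = 3
GP2_BURST_IOPS = 3000
GP2_MIN_BASELINE_IOPS = 100      # assumption: per EBS docs
GP2_INITIAL_CREDITS = 5_400_000  # assumption: full credit bucket per EBS docs


def seconds_until_credits_depleted(size_gb, avg_iops,
                                   credits=GP2_INITIAL_CREDITS):
    """Seconds a volume can run at avg_iops before burst credits hit zero.

    Returns None when the workload stays at or below the baseline,
    because credits are then never drained.
    """
    baseline = max(GP2_MIN_BASELINE_IOPS, GP2_IOPS_PER_GB * size_gb)
    demand = min(avg_iops, GP2_BURST_IOPS)  # cannot exceed the burst rate
    if demand <= baseline:
        return None
    # Credits are spent at the demand rate and refilled at the baseline rate.
    return credits / (demand - baseline)


if __name__ == "__main__":
    print(seconds_until_credits_depleted(100, 3000))
    print(seconds_until_credits_depleted(500, 1200))
```

Under these assumptions a 100 GB volume driven at the full 3K burst rate drains a full bucket in 2000 seconds, a little over half an hour, which is the point at which the latency and queue depth spikes described above would appear.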
15. Running MySQL in Amazon EC2
• Install
– Install a MySQL distribution via yum or choose a pre-built machine image from the AWS
Marketplace
– Consider using EBS volumes vs. ephemeral disks
• Configure
– Operating System Configuration
– Networking and Security Configuration
• Backup/restore
– mysqldump, MySQL Enterprise Backup, Percona XtraBackup
• Cluster/replicate
– Semi-synchronous replication
– Galera Cluster/MySQL Cluster/Percona XtraDB Cluster
• Monitor
– Leverage Amazon CloudWatch for OS metrics
– MySQL Monitoring tools—DataDog, Percona, VividCortex, WebYog
16. Benefits of Running Managed MySQL
• Managed high availability through automated failover across multiple data centers
• Managed disaster recovery with to-the-minute point-in-time recovery
• Managed read scaling through read replicas
• Push-button provisioning, automated instance and storage scaling, patching, upgrades, security, and general care and feeding
• Lower TCO because we manage the muck
– Get more leverage from your teams
– Focus on the things that differentiate you
17. Customer Story: Flipboard
Flipboard is one of the world's first social magazines. Inspired by the beauty and ease of print media, the company’s mission is to fundamentally improve how people discover, view, and share content across their social networks.
• From the beginning, Flipboard has run its infrastructure on Amazon Web Services
• One key decision was to use MySQL, and in turn, Amazon RDS
• Flipboard uses Amazon RDS for MySQL and its Multi-AZ capabilities to store mission critical user data
• Key features are auto minor version upgrade, automatic backups, easy restores, and the ability to spin up read replicas to add capacity
“Amazon RDS allowed us to focus a little less on MySQL administration and a little more scaling out the rest of our service.”
- Joey Parsons, Head of Operations at Flipboard
18. Migrating your MySQL Database into AWS
• Data Import Options
– Use mysqldump and mysql command line
– Use mysqlimport on EC2 instance
– Use external replication into AWS for minimizing downtime
– Use AWS Database Migration Service (heterogeneous migrations/database consolidation)
• Import Backup from Amazon S3
– New for RDS MySQL
– Create full or incremental backup with Percona XtraBackup 2.3
– Use AWS IAM role to access S3 bucket
– Use replication to catch up to changes in the source database (if necessary)
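For the first import option (mysqldump piped into the mysql command line), a sketch of the two commands might look like the following. The host names, user, and database are placeholders, the function name is hypothetical, and credentials are assumed to come from an option file rather than the command line; the sketch only builds the argument lists and does not run anything.

```python
def dump_and_load_commands(source_host, target_host, user, database):
    """Return (mysqldump argv, mysql argv) for a piped dump-and-load."""
    dump = [
        "mysqldump",
        "--host=%s" % source_host,
        "--user=%s" % user,
        "--single-transaction",  # consistent InnoDB snapshot without locking
        "--databases", database,  # emits CREATE DATABASE and USE statements
    ]
    load = [
        "mysql",
        "--host=%s" % target_host,
        "--user=%s" % user,
    ]
    return dump, load


if __name__ == "__main__":
    # Placeholder endpoints for illustration only.
    dump, load = dump_and_load_commands(
        "onprem-db.example.com", "mydb.example.rds.amazonaws.com",
        "admin", "appdb")
    print(" ".join(dump) + " | " + " ".join(load))
```

The two lists could be wired together with subprocess.Popen, feeding the dump's stdout into the loader's stdin, so the dump never has to land on disk.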
19. Configuring your MySQL instance for data loading
• Turn off backup retention (disables binlog)
• Turn off autocommit mode
• Drop indexes and disable foreign keys
• For EBS-based engines
– Use EBS-optimized instance types
– Maximize storage IOPS
• Optimize parameter settings
– innodb_flush_log_at_trx_commit, innodb_io_capacity, innodb_read_io_threads, innodb_write_io_threads, sync_binlog
• Remember to re-enable settings after load completes!
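Because the load-time settings trade durability for speed, it helps to record the original values before changing anything. The sketch below plans both the temporary overrides and the values to restore afterwards; the override values (0 for innodb_flush_log_at_trx_commit and sync_binlog) are illustrative assumptions, not recommendations from the slides, and the function name is hypothetical.

```python
# Illustrative load-time overrides; both relax durability while loading.
LOAD_OVERRIDES = {
    "innodb_flush_log_at_trx_commit": "0",
    "sync_binlog": "0",
}


def plan_load_settings(current):
    """Return (overrides, restore) given the instance's current parameters.

    Raises KeyError if a parameter to be overridden has no recorded
    current value, so a setting is never changed without a way back.
    """
    restore = {name: current[name] for name in LOAD_OVERRIDES}
    return dict(LOAD_OVERRIDES), restore


if __name__ == "__main__":
    current = {
        "innodb_flush_log_at_trx_commit": "1",
        "sync_binlog": "1",
        "innodb_io_capacity": "200",
    }
    overrides, restore = plan_load_settings(current)
    print("apply before load:", overrides)
    print("apply after load: ", restore)
```

Keeping the restore dictionary next to the load job makes the final step on this slide a mechanical one rather than something to remember.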
20. Getting started with Amazon RDS for MariaDB
Information
https://aws.amazon.com/rds/mariadb
Pricing
https://aws.amazon.com/rds/mariadb/pricing/
MariaDB user guide
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_MariaDB.html
21. Getting started with Amazon RDS for MySQL
Information
https://aws.amazon.com/rds/mysql/
Pricing
https://aws.amazon.com/rds/mysql/pricing/
MySQL user guide
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_MySQL.html
22. What is PostgreSQL?
• A Relational Database Management System
• An Object Relational Database
– Can add first-class simple and complex objects with methods that can be used in a relational context
– Queries can be made with SQL
• Pronounced “POST-gress”
– The “QL” is silent
23. PostgreSQL Fast Facts
• Open source database
• In active development for over 20 years
• Owned by a foundation, not a single company
• Permissive, innovation-friendly open source license
24. PostgreSQL Fast Facts
• High performance out of the box
• Object-oriented and ANSI-SQL:2008 compatible
• Most geospatial features of any open-source database
• Supports stored procedures in 12 languages (Java, Perl, Python, Ruby, Tcl, C/C++, its own Oracle-like PL/pgSQL, and others)
25. PostgreSQL Fast Facts
• Most Oracle-compatible open-source database
• Highest AWS Schema Conversion Tool automatic conversion rates are from Oracle to PostgreSQL
26. PostgreSQL Deployment Options
• On-Premises
• Hosted – EC2
• Managed – DB Services
27. Amazon RDS for PostgreSQL
Supporting Latest Minor Releases
• 10.4
• 9.6.9
• 9.5.13
• 9.4.18
• 9.3.23
Now Available: PostgreSQL 11 Beta 1 in Database Preview
28. New PostgreSQL Extensions Supported
Extension        Description
pgrouting        Provides geospatial routing functionality for PostGIS
postgresql-hll   HyperLogLog data type support
decoder_raw      Output plugin that generates raw queries for logical replication changes
pg_repack        Removes bloat from tables and indexes in version 9.6.3
pgaudit          Provides detailed session and object audit logging in versions 9.6.3 and 9.5.7
wal2json         Output plugin for logical decoding in versions 9.6.3 and 9.5.7
auto_explain     Logs execution plans of slow statements automatically in versions 9.6.3 and 9.5.7
pg_hint_plan     Provides control of execution plans by using hint phrases
log_fdw          Extension to query your database engine logs within the database
pg_freespacemap  Examines the free space map
29. Extension: pgaudit (9.6.3+)
• CREATE ROLE rds_pgaudit
• Add pgaudit to shared_preload_libraries and set pgaudit.role = rds_pgaudit in a custom parameter group in the PostgreSQL 9.6 family
• Apply the modified parameter group to a 9.6.3+ database instance and choose to apply immediately
• CREATE EXTENSION pgaudit
• Grant SELECT on all tables to rds_pgaudit to enable auditing
• GRANT SELECT ON t1 TO rds_pgaudit;
• Database logs will show entry as follows
• ... 2017-06-12 19:09:49 UTC:…:pgadmin@postgres:[11701]:LOG: AUDIT:
OBJECT,1,1,READ,SELECT,TABLE,public.t1,select * from t1; ...
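Audit entries like the one above are comma-separated after the "AUDIT: " marker, so they are easy to pick apart when scanning logs. The parser below is a minimal sketch: the field names follow the pgaudit documentation's ordering (audit type, statement ID, substatement ID, class, command, object type, object name, statement), and the function name and dictionary keys are assumptions for illustration.

```python
import csv

AUDIT_FIELDS = [
    "audit_type", "statement_id", "substatement_id", "class",
    "command", "object_type", "object_name", "statement",
]


def parse_pgaudit_entry(line):
    """Parse the pgaudit portion of a log line into a dict, or return None."""
    marker = "AUDIT: "
    start = line.find(marker)
    if start == -1:
        return None
    payload = line[start + len(marker):].strip()
    # csv handles statements that pgaudit quotes because they contain commas.
    row = next(csv.reader([payload]), [])
    if len(row) < len(AUDIT_FIELDS):
        return None
    entry = dict(zip(AUDIT_FIELDS, row))
    entry["statement_id"] = int(entry["statement_id"])
    entry["substatement_id"] = int(entry["substatement_id"])
    return entry


if __name__ == "__main__":
    sample = ("LOG: AUDIT: OBJECT,1,1,READ,SELECT,TABLE,"
              "public.t1,select * from t1;")
    print(parse_pgaudit_entry(sample))
```

Filtering parsed entries on object_name or class is then a one-line comprehension, which can be handy when checking that the GRANT above is producing the expected audit trail.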
30. Extension: pg_stat_statements
• CREATE EXTENSION pg_stat_statements
• SELECT * from pg_stat_statements order by total_time DESC;
-[ RECORD 2 ]-------+--------
userid | 16388
dbid | 16464
queryid | 4286627671
query | UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;
calls | 165125
total_time | 5251.54200000001
min_time | 0.015
max_time | 5.558
mean_time | 0.0318034337623008
stddev_time | 0.0369181019548524
rows | 165125
• SELECT substring(query, 1, 50) AS short_query,
         round(total_time::numeric, 2) AS total_time,
         calls,
         round(mean_time::numeric, 2) AS mean,
         round((100 * total_time / sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
  FROM pg_stat_statements
  ORDER BY total_time DESC
  LIMIT 10;
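The percentage column in the query above is each statement's share of the total time across all tracked statements. As a rough illustration of the same arithmetic outside the database, the sketch below takes rows already fetched from pg_stat_statements as dictionaries; the function name and the output key names are assumptions.

```python
def top_statements(rows, limit=10):
    """Rank statements by total_time and add each one's share of the total."""
    # Like sum(...) OVER () in the query, the total covers every row,
    # not just the ones that survive the limit.
    grand_total = sum(row["total_time"] for row in rows)
    ranked = sorted(rows, key=lambda row: row["total_time"], reverse=True)
    report = []
    for row in ranked[:limit]:
        share = 100 * row["total_time"] / grand_total if grand_total else 0.0
        report.append({
            "short_query": row["query"][:50],
            "total_time": round(row["total_time"], 2),
            "calls": row["calls"],
            "mean": round(row["mean_time"], 2),
            "percentage": round(share, 2),
        })
    return report


if __name__ == "__main__":
    rows = [
        {"query": "UPDATE pgbench_accounts SET abalance = abalance + ? "
                  "WHERE aid = ?;",
         "total_time": 75.0, "calls": 150, "mean_time": 0.5},
        {"query": "SELECT abalance FROM pgbench_accounts WHERE aid = ?;",
         "total_time": 25.0, "calls": 100, "mean_time": 0.25},
    ]
    for line in top_statements(rows):
        print(line)
```

Note that the share is of total recorded execution time, so it points at where the database spends its time overall rather than at CPU use specifically.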
31. PostgreSQL Events and Logs
32. Everything and Anything Startups Need to Get Started on AWS
aws.amazon.com/activate