Se ha denunciado esta presentación.
Se está descargando tu SlideShare. ×

Amazon Aurora: Under the Hood

Cargando en…3

Eche un vistazo a continuación

1 de 57 Anuncio

Más Contenido Relacionado

Presentaciones para usted (20)

Similares a Amazon Aurora: Under the Hood (20)


Más de Amazon Web Services (20)

Amazon Aurora: Under the Hood

  1. 1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Aurora – Under the hood Steven Bryen | AWS Technical Evangelist @steven_bryen
  2. 2. Agenda § Why did we build Amazon Aurora? § PostgreSQL compatibility § Durability and Availability Architecture § Performance Architecture § Performance Results vs. PostgreSQL § Announcing Performance Insights § Getting Data In § Database Freedom +
  3. 3. Traditional relational databases are hard to scale Multiple layers of functionality all in a monolithic stack SQL Transactions Caching Logging Storage
  4. 4. Traditional approaches to scale databases Each architecture is limited by the monolithic mindset SQL Transactions Caching Logging SQL Transactions Caching Logging Application Application SQL Transactions Caching Logging SQL Transactions Caching Logging Storage Application Storage Storage SQL Transactions Caching Logging Storage SQL Transactions Caching Logging Storage
  5. 5. Reimagining the relational database What if you were inventing the database today? You would break apart the stack You would build something that: ü Lets layers scale out independently… ü Is self-healing… ü Leverages distributed services…
  6. 6. A service-oriented architecture applied to the database Move the logging and storage layer into a multitenant, scale-out, database-optimized storage service Integrate with other AWS services such as S3, EC2, VPC, DynamoDB, SWF, and Route 53 for control & monitoring Make it a managed service – using Amazon RDS. Takes care of management and administrative functions. Amazon DynamoDB Amazon SWF Amazon Route 53 Logging + Storage SQL Transactions Caching Amazon S3 1 2 3 Amazon RDS
  7. 7. Cloud-optimized relational database Performance and availability of commercial databases Simplicity and cost-effectiveness of open source databases, with MySQL compatibility What is Amazon Aurora?
  8. 8. In 2014 we launched Amazon Aurora with MySQL compatibility. In 2017 we added PostgreSQL compatibility. Customers now have more choice in using Amazon’s cloud-optimized relational database with the performance and availability of commercial databases and the simplicity and cost-effectiveness of open source databases. Making Amazon Aurora Better
  9. 9. § Open-source database § In active development for 20 years § Owned by a foundation, not a single company § Permissive, innovation-friendly open source license § High performance out of the box § Object-oriented and ANSI-SQL:2008 compatible § Most geospatial features of any open source database § Supports stored procedures in 12 languages (Java, Perl, Python, Ruby, Tcl, C/C++, its own Oracle-like PL/pgSQL, etc.) § Most Oracle-compatible open-source database § Highest AWS Schema Conversion Tool automatic conversion rates are from Oracle to PostgreSQL PostgreSQL fast facts Open Source Initiative
  10. 10. What does PostgreSQL compatibility mean? PostgreSQL 9.6 + Amazon Aurora cloud-optimized storage § Performance: Up to 3x+ better performance than PostgreSQL alone § Availability: Failover time of <30 seconds § Durability: 6 copies across 3 Availability Zones § Read Replicas: Single-digit millisecond lag times on up to 15 replicas Amazon Aurora Storage
  11. 11. What does PostgreSQL compatibility mean? Cloud-native security and encryption § AWS Key Management Service (KMS) § AWS Identity and Access Management (IAM) Easy to manage with Amazon RDS Easy to load and unload § AWS Database Migration Service § AWS Schema Conversion Tool Fully compatible with PostgreSQL, now and for the foreseeable future § Native PostgreSQL implementation – not a compatibility layer AWS DMS Amazon RDS PostgreSQL
  12. 12. Aurora Customer Adoption Aurora is used by ¾ of the top 100 AWS customers Fastest growing service in AWS history
  13. 13. Customers Like Aurora for Performance and Availability At Capital One, our cloud-first approach led us to start testing the preview of Amazon Aurora’s PostgreSQL compatibility in November 2016. In our testing during the preview, we’ve been impressed with the performance and high availability offered by Amazon Aurora. - John Andrukonis, Chief Architect, Capital One
  14. 14. Amazon Aurora Durability & Availability
  15. 15. Scale-out, distributed, log structured storage Master Replica Replica Replica Availability Zone 1 Shared Storage Volume – Transaction Aware Primary Database Node Read Replica / Secondary Node Read Replica / Secondary Node Read Replica / Secondary Node Availability Zone 2 Availability Zone 3 AWS Region Storage Monitoring Database and Instance Monitoring
  16. 16. Amazon Aurora Storage Engine Overview Data is replicated 6 times across 3 Availability Zones Continuous backup to Amazon S3 (built for 11 9s durability) Continuous monitoring of nodes and disks for repair 10 GB segments as unit of repair or hotspot rebalance Quorum system for read/write; latency tolerant Quorum membership changes do not stall writes Storage volume automatically grows up to 64 TB AZ 1 AZ 2 AZ 3 Amazon S3 Database Node Storage Node Storage Node Storage Node Storage Node Storage Node Storage Node Storage Monitoring
  17. 17. What can fail? Segment failures (disks) Node failures (machines) AZ failures (network or datacenter) Optimizations 4 out of 6 write quorum 3 out of 6 read quorum Peer-to-peer replication for repairs SQL Transaction AZ 1 AZ 2 AZ 3 Caching Amazon Aurora Storage Engine Fault-tolerance SQL Transaction AZ 1 AZ 2 AZ 3 Caching
  18. 18. Amazon Aurora Replicas Availability Failing database nodes are automatically detected and replaced Failing database processes are automatically detected and recycled Replicas are automatically promoted to primary if needed (failover) Customer specifiable fail-over order AZ 1 AZ 3AZ 2 Primary Node Primary Node Primary Database Node Primary Node Primary Node Read Replica Primary Node Primary Node Read Replica Database and Instance Monitoring Performance Customer applications can scale out read traffic across read replicas Read balancing across read replicas
  19. 19. Faster, more predictable failover with Amazon Aurora App RunningFailure Detection DNS Propagation Recovery Database Failure Amazon RDS for PostgreSQL is good: failover times of ~60 seconds Replica-Aware App Running Failure Detection DNS Propagation Recovery Database Failure Amazon Aurora is better: failover times < 30 seconds 1 5 - 2 0 s e c 3 - 1 0 s e c App Running 1 5 - 2 0 s e c 3 0 - 4 0 s e c
  20. 20. Amazon Aurora Continuous Backup Segment snapshot Log records Recovery point Segment 1 Segment 2 Segment 3 Time • Take periodic snapshot of each segment in parallel; stream the logs to Amazon S3 • Backup happens continuously without performance or availability impact • At restore, retrieve the appropriate segment snapshots and log streams from S3 to storage nodes • Apply log streams to segment snapshots in parallel and asynchronously
  21. 21. Traditional databases Have to replay logs since the last checkpoint Typically 5 minutes between checkpoints Single-threaded in MySQL and PostgreSQL; requires a large number of disk accesses Amazon Aurora No replay at startup because storage system is transaction-aware Underlying storage replays log records continuously, whether in recovery or not Coalescing is parallel, distributed, and asynchronous Checkpointed Data Log Crash at T0 requires a re-application of the SQL in the log since last checkpoint T0 T0 Crash at T0 will result in logs being applied to each segment on demand, in parallel, asynchronously Amazon Aurora Instant Crash Recovery
  22. 22. Customers Reduce Costs by Moving to Aurora Dow Jones’ first legacy on-prem database to Aurora migration was a high-profile workload that plays a critical role in engaging and retaining our customers. The migration allowed us to replace a legacy platform with performance challenges that required $400k in DBA staff to manage, with $1M in licensing costs annually, to a cloud-based, highly scalable and resilient solution. By moving much of the operational overhead to AWS, and eliminating the need to manage storage completely, Aurora frees up funds for innovation.” – Ramin Beheshti, Dow Jones Chief Product and Technology Officer
  23. 23. Amazon Aurora Performance Architecture
  24. 24. Do fewer IOs Minimize network packets Offload the database engine DO LESS WORK Process asynchronously Reduce latency path Use lock-free data structures Batch operations together BE MORE EFFICIENT How Does Amazon Aurora Achieve High Performance? DATABASES ARE ALL ABOUT IO NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND HIGH-THROUGHPUT PROCESSING NEEDS CPU AND MEMORY OPTIMIZATIONS
  25. 25. Write IO Traffic in Amazon RDS for PostgreSQL WAL DATA COMMIT LOG & FILES RDS FOR POSTGRESQL WITH MULTI-AZ EBS mirrorEBS mirror AZ 1 AZ 2 Amazon S3 EBS Amazon Elastic Block Store (EBS) Primary Database Node Standby Database Node 1 2 3 4 5 Issue write to Amazon EBS, EBS issues to mirror, acknowledge when both done Stage write to standby instance Issue write to EBS on standby instance IO FLOW Steps 1, 3, 5 are sequential and synchronous This amplifies both latency and jitter Many types of writes for each user operation OBSERVATIONS T Y P E O F W R I T E
  26. 26. Write IO Traffic in an Amazon Aurora Database Node AZ 1 AZ 3 Primary Database Node Amazon S3 AZ 2 Read Replica / Secondary Node AMAZON AURORA ASYNC 4/6 QUORUM DISTRIBUTED WRITES DATAAMAZON AURORA + WAL LOG COMMIT LOG & FILES IO FLOW Only write WAL records; all steps asynchronous No data block writes (checkpoint, cache replacement) 6X more log writes, but 9X less network traffic Tolerant of network and storage outlier latency OBSERVATIONS 2x or better PostgreSQL Community Edition performance on write-only or mixed read-write workloads PERFORMANCE Boxcar log records – fully ordered by LSN Shuffle to appropriate segments – partially ordered Boxcar to storage nodes and issue writes WAL T Y P E O F W R I T E Read Replica / Secondary Node
  27. 27. Write IO Traffic in an Amazon Aurora Storage Node LOG RECORDS Primary Database Node INCOMING QUEUE STORAGE NODE AMAZON S3 BACKUP 1 2 3 4 5 6 7 8 UPDATE QUEUE ACK HOT LOG DATA BLOCKS POINT IN TIME SNAPSHOT GC SCRUB COALESCE SORT GROUP PEER TO PEER GOSSIPPeer Storage Nodes All steps are asynchronous Only steps 1 and 2 are in foreground latency path Input queue is far smaller than standard PostgreSQL Favors latency-sensitive operations Uses disk space to buffer against spikes in activity OBSERVATIONS IO FLOW ① Receive record and add to in-memory queue ② Persist record and acknowledge ③ Organize records and identify gaps in log ④ Gossip with peers to fill in holes ⑤ Coalesce log records into new data block versions ⑥ Periodically stage log and new block versions to Amazon S3 ⑦ Periodically garbage collect old versions ⑧ Periodically validate CRC codes on blocks
  28. 28. IO traffic in Aurora Replicas PAGE CACHE UPDATE Aurora Master 30% Read 70% Write Aurora Replica 100% New Reads Shared Multi-AZ Storage PostgreSQL Master 30% Read 70% Write PostgreSQL Replica 30% New Reads 70% Write SINGLE-THREADED WAL APPLY Data Volume Data Volume Physical: Ship redo (WAL) to Replica Write workload similar on both instances Independent storage Physical: Ship redo (WAL) from Master to Replica Replica shares storage. No writes performed Cached pages have redo applied Advance read view when all commits seen POSTGRESQL READ SCALING AMAZON AURORA READ SCALING
  29. 29. Applications Restart Faster With Survivable Caches Cache normally lives inside the operating system database process– and goes away when/if that database dies Aurora moves the cache out of the database process Cache remains warm in the event of a database restart Lets the database resume fully-loaded operations much more quickly SQL Transactions Caching SQL Transactions Caching SQL Transactions Caching RUNNING CRASH AND RESTART RUNNING
  30. 30. Customers Are Migrating from Oracle to Aurora PostgreSQL We are undergoing a major transformation in our database management approach, moving away from expensive, legacy commercial database solutions to more efficient and cost-effective options. Amazon Aurora PostgreSQL showed better performance over standard PostgreSQL residing on Amazon EC2 instances. - Shashidhar Sureban, Associate Director, Database Engineering, Verizon* *Presented at re:Invent 2017: How Verizon is Adopting the Amazon Aurora PostgreSQL-compatible (DAT332)
  31. 31. Amazon Aurora Performance vs. PostgreSQL
  32. 32. System Configuration PostgreSQL – Single AZ, no backup Amazon Aurora EBS EBS EBS 60,000 total IOPS AZ 1 AZ 2 AZ 3 Amazon S3 r4.16xlarge database instance Storage Node Storage Node Storage Node Storage Node Storage Node Storage Node r4.8xlarge client driver r4.16xlarge database instance r4.8xlarge client driver r4.8xlarge client driver Amazon S3 Amazon RDS r4.16xlarge 30,000 IOPS
  33. 33. PgBench: Amazon Aurora is up to 3x Faster Running the standard pgbench benchmark, Amazon Aurora delivers 1.6x the peak throughput of PostgreSQL and 2.9x at high client counts 0 5 10 15 20 25 30 35 40 45 1 2 8 2 5 6 5 1 2 7 6 8 1 0 2 4 1 2 8 0 1 5 3 6 1 7 9 2 2 0 4 8 Throughput(tps,thousands) Number of clients pgbench tpcb-like throughput, 150 GiB PostgreSQL (Single AZ) Amazon Aurora (Three AZs) 1.6x 2.9x
  34. 34. Sysbench: Amazon Aurora is 2x–5x Faster 0 20 40 60 80 100 120 140 2 5 6 5 1 2 7 6 8 1 0 2 4 1 2 8 0 1 5 3 6 1 7 9 2 2 0 4 8 2 3 0 5 2 5 6 0 writes/second,thousands Number of clients sysbench write-only 30 GiB PostgreSQL (Single AZ, No Backup) Amazon Aurora (Three AZs, Continuous Backup) 2.2x 5.3x 2.7x Running the standard sysbench benchmark, Amazon Aurora delivers >2x the absolute peak of PostgreSQL and 5x at high client counts.
  35. 35. Amazon Aurora is 4x Faster at Large Scale Scales from 1.8x to 4.4x better as database grows from 10GiB to 100GiB 74 49 30 136 134 131 0 20 40 60 80 100 120 140 160 10 GiB 30 GiB 100 GiB writes/second,inthousands Database size sysbench write-only PostgreSQL (Single AZ, No Backup) Amazon Aurora (Three AZs, Continuous Backup) 4.4x1.8x 2.8x
  36. 36. Amazon Aurora Loads Data Faster Database initialization is >2x faster than PostgreSQL using the standard pgbench benchmark Copy In Copy In Vacuum Vacuum Index Build Index Build 0 500 1000 1500 2000 2500 3000 3500 4000 PostgreSQL Amazon Aurora Runtime (seconds) pgbench initialization, scale 10000 (150 GiB) 86% reduction in vacuum time
  37. 37. Amazon Aurora provides >2x Faster Response Times Response time under heavy write load >2x faster than PostgreSQL and variance reduced by 99% 0.00 100.00 200.00 300.00 400.00 500.00 600.00 700.00 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Responsetime,ms Minutes SYSBENCH RESPONSE TIME (p95), 30 GiB, 1024 CLIENTS PostgreSQL (Single AZ, No Backup) Amazon Aurora (Three AZs, Continuous Backup)
  38. 38. Amazon Aurora has more Consistent Throughput While running pgbench at load, throughput is 3x more consistent than PostgreSQL 0 5000 10000 15000 20000 25000 30000 35000 40000 45000 10 15 20 25 30 35 40 45 50 55 60 Throughput,tps Minutes pgbench throughput over time, 150 GiB, 1024 clients PostgreSQL (Single AZ) Amazon Aurora (Three AZs)
  39. 39. Amazon Aurora Reduced Recovery Time Up to 97% Transaction-aware storage system recovers almost instantly 3 GiB redo recovered in 19 seconds 10 GiB redo recovered in 50 seconds 30 GiB redo recovered in 123 seconds Amazon Aurora has no redo. Recovered in 3 seconds while maintaining signficantly greater throughput. 0 20 40 60 80 100 120 140 160 0 20000 40000 60000 80000 100000 120000 140000 Recoverytimeinseconds(lessisbetter) writes / second (more is better) Recovery time from crash under load Bubble size represents redo log which must be recovered As PostgreSQL throughput goes up, so does log size and crash recovery time.
  40. 40. Amazon Aurora Performance By The Numbers Measurement Result PgBench Up to 2.9x faster Sysbench 2x-5x faster Vacuum Time reduced up to 86% Response time >2x lower, 99% < variance Throughput jitter 3x more consistent Throughput at scale 4x faster Recovery speed Time reduced up to 97%
  41. 41. Enterprise Software Vendors Are Moving to Aurora We are excited to announce the availability of Remedy ITSM on AWS Cloud. Our customers can now benefit from best-in- class cloud service, 3 times faster installation time, and lower cost of ownership by supporting migration to Aurora PostgreSQL. - Nayaki Nayyar, President of Digital Service Management BU, BMC
  42. 42. Amazon Aurora Performance Monitoring and Management
  43. 43. First Step: Enhanced Monitoring Released 2016 O/S Metrics Process & thread List Up to 1-second granularity
  44. 44. Next Step: Performance Insights Why Database Tuning? RDS is all about managed databases Customers also want performance managed: q Want easy tool for optimizing cloud database workloads q May not have deep tuning expertise à Want a single pane of glass to achieve this
  45. 45. Beyond Database Load: Other Performance Insights Features • Lock detection • Execution plans • API access • Up to 2 years data retention • Free tier available • Support for all RDS database engines in 2018
  46. 46. Additional Resources • Demo Video: • Documentation: de/USER_PerfInsights.html • Blog database-workload-with-performance-insights • Marketing page:
  47. 47. Amazon Aurora Getting Your Data In
  48. 48. AWS Database Migration Service
  49. 49. What are DMS and SCT? AWS Database Migration Service (DMS) easily and securely migrates and/or replicate your databases and data warehouses to AWS AWS Schema Conversion Tool (SCT) converts your commercial database and data warehouse schemas to open-source engines or AWS-native services, such as Amazon Aurora and Redshift
  50. 50. When to use SCT? Modernize Modernize your database tier Modernize and Migrate your Data Warehouse to Amazon Redshift Amazon Aurora Amazon Redshift
  51. 51. When to use DMS*? Migrate • Migrate business-critical applications • Migrate from Classic to VPC • Migrate data warehouse to Redshift • Upgrade to a minor version • Consolidate shards into Aurora • Archive old data • Migrate from NoSQL to SQL, SQL to NoSQL or NoSQL to NoSQL Targets Amazon Dynamo DB Amazon Redshift Amazon S3 Amazon Aurora *DMS is a HIPAA certified service Amazon S3 Sources
  52. 52. Customers Are Migrating to Aurora with Amazon Database Migration Service “We are an early adopter of Amazon Aurora PostgreSQL and used the AWS Database Migration Service to transition TIBCO Cloud Live Apps to Amazon Aurora seamlessly, while it was in production, without our customers noticing any interruption in our service. Amazon Aurora's reliability, security, and fast failover will continue to help us scale Live Apps, giving customers constant access to our service so they can build and run apps quickly and with high availability.” - Matt Quinn, Chief Operating Officer, TIBCO
  53. 53. High Performance Easy to Operate & Compatible High Availability Secure by Design Amazon Aurora with PostgreSQL compatibility ü 2x-3x more throughput than PostgreSQL ü Up to 64 TB of storage per instance ü Write jitter reduction ü Near synchronous replicas ü Reader endpoint ü Enhanced OS monitoring ü Performance Insights ü Push button migration ü Auto-scaling storage ü Continuous backup and PITR ü Easy provisioning / patching ü All PostgreSQL features ü All RDS for PostgreSQL extensions ü AWS DMS supported inbound ü Failover in less than 30 seconds ü Customer specifiable failover order ü Up to 15 readable failover targets ü Instant crash recovery ü Survivable buffer cache ü X-region snapshot copy ü Encryption at rest (AWS KMS) ü Encryption in transit (SSL) ü Amazon VPC by default ü Row Level Security
  54. 54. Amazon Aurora Available Durable The Amazon Aurora Database Family AWS DMS Amazon RDS AWS IAM, KMS & VPC Amazon S3 Convenient Compatible Automatic Failover Read Replicas X 6 Copies High Performance & Scale Secure Encryption at rest and in transit Enterprise Performance 64TB Storage PostgreSQL MySQL
  55. 55. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thank You Steven Bryen | AWS Technical Evangelist @steven_bryen