Se ha denunciado esta presentación.
Utilizamos tu perfil de LinkedIn y tus datos de actividad para personalizar los anuncios y mostrarte publicidad más relevante. Puedes cambiar tus preferencias de publicidad en cualquier momento.

Amazon DynamoDB Deep Dive Advanced Design Patterns for DynamoDB (DAT401) - AWS reInvent 2018.pdf

6.610 visualizaciones

Publicado el

This session is for those who already have some familiarity with DynamoDB. The patterns and data models discussed in this session summarize a collection of implementations and best practices leveraged by Amazon.com to deliver highly scalable solutions for a wide variety of business problems. The session also covers strategies for global secondary index sharding and index overloading, scalable graph processing with materialized queries, relational modeling with composite keys, and executing transactional workflows on DynamoDB.

Amazon DynamoDB Deep Dive Advanced Design Patterns for DynamoDB (DAT401) - AWS reInvent 2018.pdf

  1. 1. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Advanced Design Patterns for DynamoDB Rick Houlihan Principal Technologist, NoSQL AWS D A T 4 0 1
  2. 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Agenda • Brief History of Data Processing (Why NoSQL?) • Overview of DynamoDB • NoSQL Data Modeling • Normalized versus De-normalized schema • Common NoSQL Design Patterns • Composite Keys, Hierarchical Data, Relational Data • Modeling Real Applications
  3. 3. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. History of Data Processing “History repeats itself because nobody was listening the first time”
  4. 4. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Timeline of Database TechnologyDataPressure
  5. 5. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Technology Adoption and the Hype Curve
  6. 6. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Why NoSQL? Optimized for storage Optimized for compute Normalized/relational Denormalized/hierarchical Ad hoc queries Instantiated views Scale vertically Scale horizontally Good for OLAP Built for OLTP at scale SQL NoSQL
  7. 7. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon DynamoDB Document or Key-Value Scales to Any WorkloadFully Managed NoSQL Access Control Event Driven ProgrammingFast and Consistent
  8. 8. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Table Table Items Attributes Partition Key Sort Key Mandatory Key-value access pattern Determines data distribution Optional Model 1:N relationships Enables rich query capabilities All items for key ==, <, >, >=, <= “begins with” “between” “contains” “in” sorted results counts top/bottom N values
  9. 9. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. 00 55 A954 FFAA00 FF Partition Keys Partition Key uniquely identifies an item Partition Key is used for building an unordered hash index Allows table to be partitioned for scale Id = 1 Name = Jim Hash (1) = 7B Id = 2 Name = Andy Dept = Eng Hash (2) = 48 Id = 3 Name = Kim Dept = Ops Hash (3) = CD Key Space
  10. 10. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Partition 3 Partition:Sort Key Partition:Sort Key uses two attributes together to uniquely identify an Item Within unordered hash index, data is arranged by the sort key No limit on the number of items (∞) per partition key Except if you have local secondary indexes 00:0 FF:∞ Hash (2) = 48 Customer# = 2 Order# = 10 Item = Pen Customer# = 2 Order# = 11 Item = Shoes Customer# = 1 Order# = 10 Item = Toy Customer# = 1 Order# = 11 Item = Boots Hash (1) = 7B Customer# = 3 Order# = 10 Item = Book Customer# = 3 Order# = 11 Item = Paper Hash (3) = CD 55 A9:∞54:∞ AAPartition 1 Partition 2
  11. 11. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Partitions are three-way replicated Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Id = 2 Name = Andy Dept = Engg Id = 3 Name = Kim Dept = Ops Id = 1 Name = Jim Replica 1 Replica 2 Replica 3
  12. 12. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Local Secondary Index (LSI) Alternate sort key attribute Index is local to a partition key A1 (partition) A3 (sort) A2 (item key) A1 (partition) A2 (sort) A3 A4 A5 LSIs A1 (partition) A4 (sort) A2 (item key) A3 (projected) Table KEYS_ONLY INCLUDE A3 A1 (partition) A5 (sort) A2 (item key) A3 (projected) A4 (projected) ALL 10 GB max per partition key, i.e. LSIs limit the # of range keys!
  13. 13. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Global Secondary Index (GSI) Alternate partition and/or sort key Index is across all partition keys Use composite sort keys for compound indexes A1 (partition) A2 A3 A4 A5 A5 (partition) A4 (sort) A1 (item key) A3 (projected) INCLUDE A3 A4 (partition) A5 (sort) A1 (item key) A2 (projected) A3 (projected) ALL A2 (partition) A1 (itemkey) KEYS_ONLY GSIs Table RCUs/WCUs provisioned separately for GSIs Online indexing
  14. 14. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. How Do GSI Updates Work? Table Primary table Primary table Primary table Primary table Global Secondary Index Client 2. Asynchronous update (in progress) If GSIs don’t have enough write capacity, table writes will be throttled!
  15. 15. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Scaling NoSQL “We are stuck with technology when what we really want is just stuff that works.”
  16. 16. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. What bad NoSQL looks like … Partition Time Heat
  17. 17. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Getting the most out of Amazon DynamoDB throughput “To get the most out of DynamoDB throughput, create tables where the partition key element has a large number of distinct values, and values are requested fairly uniformly, as randomly as possible.” —DynamoDB Developer Guide Space: access is evenly spread over the key-space Time: requests arrive evenly spaced in time
  18. 18. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Much better picture …
  19. 19. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Auto Scaling Throughput automatically adapts to your actual traffic With Auto ScalingWithout Auto Scaling
  20. 20. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. NoSQL Data Modeling “A ship in port is safe, but that’s not what ships were built for.”
  21. 21. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. It’s all about relationships… Document management Process controlSocial network Data treesIT monitoring
  22. 22. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. SQL vs. NoSQL design pattern
  23. 23. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Selecting a Partition Key • Large number of distinct values • Items are uniformly requested and randomly distributed Selecting a Sort Key • Model 1:n and n:n relationships • Efficient/selective patterns • Query multiple entities • Leverage range queries AMAZON DYNAMODB - KEY CONCEPTS Examples: • Bad: Status, Gender • Good: CustomerId, DeviceId Examples: • Orders and OrderItems • Hierarchical relationships
  24. 24. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. • Understand the use case • Identify the access patterns • Read/Write workloads • Query dimensions and aggregations • Data-modeling • Avoid relational design patterns, use one table • Review -> Repeat -> Review TENETS OF NoSQL DATA MODELING
  25. 25. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. • Understand the use case • Identify the access patterns • Read/Write workloads • Query dimensions and aggregations • Data-modeling • Avoid relational design patterns, use one table • Review -> Repeat -> Review • Nature of the application • OLTP / OLAP / DSS • Define the Entity-Relationship Model • Identify Data Life Cycle • TTL, Backup/Archival, etc. TENETS OF NoSQL DATA MODELING
  26. 26. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. • Understand the use case • Define the access patterns • Read/Write workloads • Data-modeling • Avoid relational design patterns, use one table • Review -> Repeat -> Review • Identify data sources • Define query aggregations • Document all workflows TENETS OF NoSQL DATA MODELING
  27. 27. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. • Understand the use case • Identify the access patterns • Read/Write workloads • Query dimensions and aggregations • Data-modeling • Avoid relational design patterns, use one table • Review -> Repeat -> Review TENETS OF NoSQL DATA MODELING • 1 application service = 1 table • Reduce round trips • Simplify access patterns • Identify Primary Keys • How will items be inserted and read? • Overload items into partitions • Define indexes for secondary access patterns
  28. 28. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. • Understand the use case • Identify the access patterns • Read/Write workloads • Query dimensions and aggregations • Data-modeling • Avoid relational design patterns, use one table • Review -> Repeat-> Review TENETS OF NoSQL DATA MODELING
  29. 29. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Complex Queries “Computers are useless. They can only give you answers.”
  30. 30. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. DynamoDB Streams and AWS Lambda
  31. 31. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Triggers Lambda function Notify change Item/table level metrics Amazon CloudSearch Kinesis Firehose
  32. 32. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Composite Keys “Hierarchies are celestial. In hell all are equal.”
  33. 33. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Secondary index Opponent Date GameId Status Host Alice 2014-10-02 d9bl3 DONE David Carol 2014-10-08 o2pnb IN_PROGRESS Bob Bob 2014-09-30 72f49 PENDING Alice Bob 2014-10-03 b932s PENDING Carol Bob 2014-10-03 ef9ca IN_PROGRESS David BobPartition key Sort key Multi-value Sorts and Filters
  34. 34. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Secondary Index Approach 1: Query Filter Bob Opponent Date GameId Status Host Alice 2014-10-02 d9bl3 DONE David Carol 2014-10-08 o2pnb IN_PROGRESS Bob Bob 2014-09-30 72f49 PENDING Alice Bob 2014-10-03 b932s PENDING Carol Bob 2014-10-03 ef9ca IN_PROGRESS David SELECT * FROM Game WHERE Opponent='Bob' ORDER BY Date DESC FILTER ON Status='PENDING' (filtered out)
  35. 35. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Approach 2: Composite Key StatusDate DONE_2014-10-02 IN_PROGRESS_2014-10-08 IN_PROGRESS_2014-10-03 PENDING_2014-09-30 PENDING_2014-10-03 Status DONE IN_PROGRESS IN_PROGRESS PENDING PENDING Date 2014-10-02 2014-10-08 2014-10-03 2014-10-03 2014-09-30 + =
  36. 36. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Secondary Index Approach 2: Composite Key Opponent StatusDate GameId Host Alice DONE_2014-10-02 d9bl3 David Carol IN_PROGRESS_2014-10-08 o2pnb Bob Bob IN_PROGRESS_2014-10-03 ef9ca David Bob PENDING_2014-09-30 72f49 Alice Bob PENDING_2014-10-03 b932s Carol Partition key Sort key
  37. 37. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Opponent StatusDate GameId Host Alice DONE_2014-10-02 d9bl3 David Carol IN_PROGRESS_2014-10-08 o2pnb Bob Bob IN_PROGRESS_2014-10-03 ef9ca David Bob PENDING_2014-09-30 72f49 Alice Bob PENDING_2014-10-03 b932s Carol Secondary index Approach 2: Composite Key Bob SELECT * FROM Game WHERE Opponent='Bob' AND StatusDate BEGINS_WITH 'PENDING'
  38. 38. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Advanced Data Modeling “The great myth of our times is that technology is communication.”
  39. 39. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. How OLTP Apps Use Data  Mostly hierarchical structures  Entity driven workflows  Data spread across tables  Requires complex queries  Primary driver for ACID
  40. 40. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. ItemID (PK) Version (SK) CurVer Attrs 1 v0 2 … v1 … v2 … v3 … Maintaining Version History (Many more item partitions) Item versions Overwrite v0 Item to Commit changes COPY Item.v0 -> Item1.v3 IF Item.v3 == NULL UPDATE Item1.v3 SET Attr1 += 1 UPDATE Item1.v3 SET Attr2 = … UPDATE Item1.v3 SET Attr3 = … COPY Item1.v3 -> Item1.v0 SET CurVer = 3 Transaction
  41. 41. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Managing Relational Transactions
  42. 42. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. DynamoDB Transactions API • TransactWriteItems • Synchronous update, put, delete, and check • Atomic • Automated Rollbacks • Up to 10 items within a transaction • Supports multiple tables • Complex conditional checks • Good Use Cases • Commit changes across items • Conditional batch inserts/updates • Bad Use Case • Maintaining normalized data
  43. 43. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. DynamoDB Table Schema
  44. 44. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Reverse Lookup GSI
  45. 45. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Hierarchical Data Joins? We don’t need no stinkin’ joins!
  46. 46. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Hierarchical Data Structures as Items • Use composite sort key to define a hierarchy • Highly selective queries with sort conditions • Reduce query complexity © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  47. 47. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Modeling Relational Data Dude, where’s my Lookup Table?
  48. 48. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Modeling a Delivery Service – GetMeThat! The Business Model: Customers want things, we get them things. How it works: People are busy, they need stuff. GetMeThat! lets users create lists of stuff they need and have it dropped where and when they need it. • Download GetMeThat! • Browse Stuff • Get stuff • Tell us where to put your stuff
  49. 49. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The Entity Model
  50. 50. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The Access Patterns • Get Order(s) • by date range • by Customer • by Vendor • Get Order Details • Status • Items • Get Deliveries • by date range • by Driver • Get available drivers • by Geo and Time • Get Customer Data • Get Vendor Data
  51. 51. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The Relational Approach
  52. 52. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The NoSQL Approach
  53. 53. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The NoSQL Approach (Orders and Drivers GSI)
  54. 54. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The NoSQL Approach (Vendor and Deliveries GSI)
  55. 55. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. A Real World Example “Reality is that which, when you stop believing in it, does not go away.”
  56. 56. Audible eBook Sync Service • Allows users to save session state for Audible eBooks • Maintains mappings per user for eBooks and audio products • Spikey load patterns require significant overprovisioning • Large number of access patterns © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  57. 57. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Access Patterns # USE CASE 1 CompanionMapping getCompanionMappingsByAsin 2 CompanionMapping getCompanionMappingsByEbookAndAudiobookContentId 3 CompanionMapping getCompanionMappingsFromCache 4 CompanionMapping getCompanionMappings 5 CompanionMapping getCompanionMappingsAvailable 6 AcrInfo getACRInfo 7 AcrInfo getACRs 8 AcrInfo getACRInfos 9 AcrInfo getACRInfosbySKU 10 AudioProduct getAudioProductsForACRs 11 AudioProduct getAudioProducts 12 AudioProduct deleteAudioProductsMatchingSkuVersions 13 AudioProduct getChildAudioProductsForSKU 14 Product getProductInfoByAsins 15 Product getParentChildDataByParentAsins 16 AudioFile getAudioFilesForACR 17 AudioFile getAudioFilesForChildACR 18 AudioFile getAudioFilesByParentAsinVersionFormat 19 AudioFile getAudioFiles 20 AudioFile getAudioFilesForChildAsin
  58. 58. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Primary Table T A B L E Primary Key Attributes PK SK (GSI 3) ABOOKACR1 v0#ABOOKACR1 GSI-1 GSI-2 ABOOK-ASIN1 ABOOK-SKU1 EBOOKACR1 GSI-1 GSI-2 SyncFileAcr ABOOK-ASIN1 ABOOKACR1#TRACK#1 GSI-1 GSI-2 ABOOK-ASIN1 ABOOK-SKU1 ABOOKACR1#TRACK#2 GSI-1 GSI-2 ABOOK-ASIN1 ABOOK-SKU1 EBOOKACR1 EBOOKACR1 GSI-1 EBookAsin EBOOK-SKU1 ASIN
  59. 59. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Indexes G S I 1 Partition Key Projected Attributes ABOOK-ASIN1 ABOOKACR1 ABOOKACR1-v1 ABOOKACR1#TRACK#1 ABOOKACR1#TRACK#2 SyncFileAcr ABOOKACR1 MAP-EBOOKACR1 EBOOK-SKU1 ABOOKACR1 EBOOKACR1 G S I 2 Partition Key Projected Attributes ABOOK-ASIN1 ABOOKACR1 MAP-EBOOKACR1 ABOOK-SKU1 ABOOKACR1 ABOOKACR1-v1 ABOOKACR1#TRACK#1 ABOOKACR1#TRACK#2 G S I 3 GSI Partition Key Projected Attributes V0#ABOOKACR1 ABOOKACR1 ABOOKACR1-v1 EBOOKACR1 ABOOKACR1 MAP-EBOOKACR1 ABOOKACR1#TRACK#1 ABOOKACR1 ABOOKACR1#TRACK#1 ABOOKACR1#TRACK#2 ABOOKACR1 ABOOKACR1#TRACK#2 EBOOKACR1 ABOOKACR1 EBOOKACR1
  60. 60. Query Conditions # USE CASE Lookup parameters INDEX Key Conditions Filter Conditions 1 CompanionMapping getCompanionMappingsByAsin audiobookAsin/ebookSku GSI2 GSI-2=ABOOK-ASIN1 None 2 CompanionMapping getCompanionMappingsByEbookAndA udiobookContentId ebookAcr/sku,version,format or audiobookAcr/asin,version,format GSI-3 on TargetACR attribute OR PrimaryKey on Table GSI-3=MAP-EBOOKACR1 version=v and format=f 3 CompanionMapping getCompanionMappingsFromCache ebookAcr/sku,version,format or audiobookAcr/asin,version,format GSI-3 on TargetACR attribute OR PrimaryKey on Table GSI-3=MAP-EBOOKACR1 version=v and format=f 4 CompanionMapping getCompanionMappings syncfileAcr, ebookAcr?, audiobookAcr? GSI1 GSI-1=SyncFileAcr None 5 CompanionMapping getCompanionMappingsAvailable ebookAcr, audiobookAcr Primary Key on Table Acr=ABOOKACR1 and TargetACR beginswith "MAP-" 6 AcrInfo getACRInfo acr Primary Key on Table Acr=ABOOKACR1 and TargetACR beginswith "ABOOKACR1- v" 7 AcrInfo getACRs acr / asin,version,format Primary Key on Table Acr=ABOOKACR1 version=v and format=f 8 AcrInfo getACRInfos acr Primary Key on table Acr=ABOOKACR1 and TargetACR beginswith "ABOOKACR1" 9 AcrInfo getACRInfosbySKU sku GSI2 GSI-2=ABOOK-SKU1 10 AudioProduct getAudioProductsForACRs acr Primary Key on table Acr=ABOOKACR1 and TargetACR beginswith "ABOOKACR1" 11 AudioProduct getAudioProducts sku, version, format GSI2 GSI-2=ABOOK-SKU1 version=v and format=f 12 AudioProduct deleteAudioProductsMatchingSkuVersi ons sku, version GSI2 GSI-2=ABOOK-SKU1 version=v 13 AudioProduct getChildAudioProductsForSKU sku GSI2 GSI-2=ABOOK-SKU1 14 Product getProductInfoByAsins asin GSI1 GSI-1=ABOOK-ASIN1 15 Product getParentChildDataByParentAsins asin GSI1 GSI-1=ABOOK-ASIN1 16 AudioFile getAudioFilesForACR acr Primary Key on table Acr=ABOOKACR1 and TargetACR beginswith "ABOOKACR1#" 17 AudioFile getAudioFilesForChildACR acr, parent_asin Primary Key on table Acr=ABOOKACR1 version=v and format=f 18 AudioFile getAudioFilesByParentAsinVersionForm at parent_asin, version, format GSI1 GSI-1=ABOOK-ASIN1 version=v and format=f 19 AudioFile getAudioFiles sku, version, format GSI2 GSI-2=ABOOK-SKU1 version=v and format=f 20 AudioFile getAudioFilesForChildAsin asin, parent_asin, version, format GSI1 GSI-1=ABOOK-ASIN1 version=v and format=f
  61. 61. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The Serverless Paradigm “Fairly cheap home computing was what changed my life.”
  62. 62. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Elastic Serverless Applications
  63. 63. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Conclusions
  64. 64. Thank you! © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  65. 65. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

×