DynamoDB: Data Example
     userId    date          value    unlockedAchievments
     hadr-fb   18-07-2012    72       ['10 days', '2 levels day']
     hadr-fb   19-07-2012    1        None
     hadr-fb   20-07-2012    56789    ['top 10 progress']

                      Table: ‘Waldo-Scores’



        Id     platform     Name       JoinDate      Score
        hadr   fb           Hadrien    31-02-2011    10 457
        hadr   G+           Hadrien    18-07-2012    357
        pior   fb           Pior       12-12-2012    18 951

                          Table: ‘Players’
Data types (Lean. . . )

   Types

       single
             string (utf-8)
             number (between 10^-128 and 10^+126)
       set
             string (utf-8)
             number


   Constraints

       no “Embedded Documents”
       no complex types (dates, . . . )
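
Since there is no date type, a date such as JoinDate has to be stored as a string or a number. A minimal sketch, assuming ISO 8601 strings are chosen (they sort chronologically, unlike the DD-MM-YYYY form shown in the example tables); the helper names are hypothetical:

```python
from datetime import date

def encode_date(d):
    """Store a date as an ISO 8601 string (YYYY-MM-DD)."""
    return d.isoformat()

def decode_date(s):
    """Parse the stored string back into a date object."""
    return date.fromisoformat(s)

stored = encode_date(date(2012, 7, 18))
print(stored)
print(decode_date(stored) == date(2012, 7, 18))
```

Because the string form sorts like the date itself, it can also serve as a range key for "before/after" conditions.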
Dimensioning 1/2: Big picture


   Units

       accesses/s ∗ roundUp(KB) ∗ items
       provisioning
       updates are. . . constraining

   Storage

       tables are “elastic”
       64KB max per item
       overhead = 100 bytes per item
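
The sizing rule on this slide can be turned into a quick estimate. This is only a sketch of the formula as stated here (accesses per second, times the item size rounded up to whole KB, times the number of items per access), plus the per-item storage overhead; the function and parameter names are made up for illustration:

```python
import math

def required_units(accesses_per_second, item_size_bytes, items_per_access=1):
    """Units = accesses/s * roundUp(item size in KB) * items per access."""
    size_kb = math.ceil(item_size_bytes / 1024)
    return accesses_per_second * size_kb * items_per_access

def stored_bytes(item_count, avg_item_bytes, overhead_bytes=100):
    """Storage estimate: each item carries a fixed overhead (100 bytes per the slide)."""
    return item_count * (avg_item_bytes + overhead_bytes)

# 50 accesses/s on 1500-byte items: each item is rounded up to 2 KB
print(required_units(50, 1500))
print(stored_bytes(1000, 400))
```

The rounding is the part that bites: an item of 1025 bytes costs as much as one of 2048 bytes.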
Dimensioning 2/2: Traps and constraints
   TRAPS:

      Units are divided among the partitions.
      Bigger tables often mean higher throughput. Split tables?

   CONSTRAINTS for throughput:

      absolute
            min 5
            max 10 000
            1 single table in UPDATING state
      increase
            min 10%
            max 100%
      decrease
            min 10%
            max once a day
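
A small check of a planned throughput change against the limits listed on this slide can catch a rejected request before it is sent. The sketch below encodes only the numbers given here (they are the presentation's figures, not necessarily current service limits) and does not model the "once a day" rule for decreases; the function name is hypothetical:

```python
def check_throughput_change(current, target, min_units=5, max_units=10000):
    """Return the list of slide constraints that a change current -> target violates."""
    problems = []
    if not (min_units <= target <= max_units):
        problems.append("target outside absolute bounds")
    delta = target - current
    # Integer arithmetic avoids floating-point surprises at exactly 10%.
    if abs(delta) * 10 < current:
        problems.append("change smaller than 10%")
    if delta > current:
        problems.append("increase larger than 100%")
    return problems

print(check_throughput_change(100, 200))
print(check_throughput_change(100, 250))
```

An empty list means the change fits the constraints; a large scale-up therefore has to be done in several steps of at most 100% each.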
Integrated Service 1/3: IAM

        API level
        table level (except for “ListTables”)

   Example: “Fair” Scores table use

   {
       "Statement":[{
          "Effect":"Allow",
          "Action":["dynamodb:DeleteItem", "dynamodb:PutItem",
            "dynamodb:UpdateItem", "dynamodb:GetItem",
            "dynamodb:Query"],
          "Resource":
            "arn:aws:dynamodb:<region>:<account>:table/Scores"
       }]
   }
Integrated Service 2/3: CloudWatch
   Metrics:

       SuccessfulRequestLatency
       UserErrors
       SystemErrors
       ThrottledRequests
       ConsumedReadCapacityUnits
       ConsumedWriteCapacityUnits
       ReturnedItemCount

   Metric’s context

       Table
       Operation ({Put, Delete, Update, Get, BatchGet}Item, Scan,
       Query)
Integrated Service 3/3: EMR



       out of the scope of this presentation
       basically, Hive integrated with DynamoDB => HiveQL

   use cases:

       custom index generation
       export to S3 (backup, data removal)
       data analysis / aggregation
Data access 1/3: GetItem


      Fastest: primary key(s)
      0-1 item
      Cost = 1 unit

  Example : ‘Hadrien’ Player of ‘fb’ platform

  import boto

  conn = boto.connect_dynamodb()  # credentials come from the boto configuration
  table = conn.get_table('Players')
  item = table.get_item(
          hash_key='hadr',
          range_key='fb'
      )
Data access 2/3: Query

       Fast
              primary key
              range key conditions =, <, >, <=, >=, startsWith
       0+ item(s)
       Cost = 1 unit per returned item

   Example : All ‘Waldo-Scores’ of ‘hadr-fb’ Player

   # conn is the boto DynamoDB connection used in the GetItem example
   table = conn.get_table('Waldo-Scores')
   items = table.query(
           hash_key='hadr-fb',
           #range_key_condition=
       )
Data access 3/3: Scan

       Slooooow
            filter on any key
            reads the WHOLE table!
       0+ item(s)
       Cost = 1/2 unit for each parsed KB! => starvation
       Use case: get a full (small) table. Ex: ‘powerups’

   Example : All days where ‘hadr-*’ did better than 100

   from boto.dynamodb.condition import BEGINS_WITH, GT

   table = conn.get_table('Waldo-Scores')
   items = table.scan(
           scan_filter={
               'userId': BEGINS_WITH('hadr-'),
               'value': GT(100)
           })
Performance considerations: non indexed data 1/2




   De-normalisation

       Ex: Waldo and Players table :)
       big picture: data duplication to fit the
           view point
           need
Performance considerations: non indexed data 2/2

   Scan

       sloooooow (sequential)
       (bad) unit consumption (sequential)


   EMR

       scales (less slow :p)
       (better) unit consumption (parallel)


   TL;DR
   Index your data !
Eventual vs strong consistency



      write => propagation ∼ 1s
      read => may not be up to date . . .


       Consistency   Applications   Cost (Units)   Performance
       strong        critical       1 per KB       good
       eventual      aware          1/2 per KB     maximal
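
The cost column translates directly into how many reads a given provision can sustain. A sketch using only the figures in the table above (1 unit per KB for strong reads, 1/2 unit per KB for eventual reads); the function names are illustrative:

```python
import math

def read_cost(item_size_bytes, consistent=False):
    """Units consumed by one read: 1 per KB if strong, 1/2 per KB if eventual."""
    size_kb = math.ceil(item_size_bytes / 1024)
    return size_kb * (1.0 if consistent else 0.5)

def reads_per_second(provisioned_units, item_size_bytes, consistent=False):
    """How many reads per second a provision can sustain at that cost."""
    return provisioned_units / read_cost(item_size_bytes, consistent)

print(reads_per_second(100, 1024, consistent=True))
print(reads_per_second(100, 1024, consistent=False))
```

For the same provision, an application that tolerates stale reads gets twice the read rate.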
Critical/specific applications


   Redundancy/backup

       managed => no need
       “∼ Snapshot” => EMR + S3

   ∼ Transactions

       conditional operations (idempotent)
       atomic counter (NOT idempotent: a retried increment is applied twice; strongly consistent)
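
The conditional-write pattern can be sketched without touching AWS. The class below is an in-memory stand-in written for this example, not the DynamoDB or boto API: a write goes through only if the stored item still holds the expected values, and the caller re-reads and retries otherwise.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the stored item no longer matches the expected values."""

class FakeTable:
    """In-memory stand-in for a table supporting conditional puts."""
    def __init__(self):
        self.items = {}

    def put(self, key, item, expected=None):
        current = self.items.get(key, {})
        for attr, value in (expected or {}).items():
            if current.get(attr) != value:
                raise ConditionalCheckFailed(attr)
        self.items[key] = dict(item)

def add_score(table, key, points):
    """Read, compute, then write only if the score has not changed meanwhile."""
    while True:
        current = table.items[key]
        updated = dict(current, Score=current["Score"] + points)
        try:
            table.put(key, updated, expected={"Score": current["Score"]})
            return updated["Score"]
        except ConditionalCheckFailed:
            continue  # another writer got there first: re-read and retry

players = FakeTable()
players.put(("hadr", "G+"), {"Name": "Hadrien", "Score": 357})
print(add_score(players, ("hadr", "G+"), 100))
```

Replaying the same conditional put after it succeeded fails the check instead of applying the change twice, which is what makes the pattern safe to retry.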
API 1/3: Read


   Method           Consistency        Description          Returns
   GetItem          eventual/strong    load by key          0-1 item
   BatchGetItem     eventual/strong    same, in parallel    0-100 items, 0-1MB
   Query            eventual/strong    rangeKey filter      0+ items, 0-1MB
   Scan             eventual           any key filter       0+ items, 0-1MB


      rule: 0-1 filter per eligible key
      unprocessed => ‘UnprocessedKeys’, ‘LastEvaluatedKey’
      consumed units => ‘ConsumedCapacityUnits’
      enforce strong consistency => ‘ConsistentRead’
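
Because Query and Scan stop at 1MB, a complete read is a loop that keeps asking until no ‘LastEvaluatedKey’ comes back (in the real API that key is passed back as the start key of the next request). The sketch below shows only the loop; `fetch_page` and `make_fake_fetch` are hypothetical stand-ins for the actual call, with a list index playing the role of the key:

```python
def fetch_all(fetch_page):
    """Collect every item by following LastEvaluatedKey from page to page."""
    items = []
    start_key = None
    while True:
        page = fetch_page(start_key)
        items.extend(page["Items"])
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            return items

def make_fake_fetch(data, page_size):
    """Build a paged reader over a list, mimicking a size-limited response."""
    def fetch_page(start_key):
        start = 0 if start_key is None else start_key
        chunk = data[start:start + page_size]
        end = start + len(chunk)
        page = {"Items": chunk}
        if end < len(data):
            page["LastEvaluatedKey"] = end  # more data remains
        return page
    return fetch_page

print(fetch_all(make_fake_fetch(list(range(7)), page_size=3)))
```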
API 2/3: Edit



    Method           Action                  Condition   Changes
    PutItem          create-replace          yes         1 item
    DeleteItem       delete                  yes         1 item, 0-1MB
    BatchWriteItem   create-replace-delete   no          1-25 items
    UpdateItem       create-update-delete    yes         1+ field, 1 item


      not processed / failure => ‘UnprocessedItems’
      condition failed => ‘ConditionalCheckFailed’
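
Two rules from this slide shape any bulk write: at most 25 items per batch, and whatever comes back in ‘UnprocessedItems’ has to be sent again. A sketch of that loop, assuming a hypothetical `send_batch` callable that returns the items it did not process (a real client would also wait between retries):

```python
def chunks(items, size=25):
    """Split a list into batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def write_all(items, send_batch, max_rounds=5):
    """Send every item, resubmitting whatever each batch left unprocessed."""
    for batch in chunks(items):
        pending = batch
        for _ in range(max_rounds):
            pending = send_batch(pending)
            if not pending:
                break
        else:
            raise RuntimeError("items still unprocessed after retries")

written = []

def send_batch(batch):
    """Toy sender that accepts everything."""
    written.extend(batch)
    return []

write_all(list(range(60)), send_batch)
print(len(written))
```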
API 3/3: Structure



    Method          Asynchronous   Description
    CreateTable     yes            Create table - provision units
    DeleteTable     yes            self explanatory
    DescribeTable   no             Read size, status, throughput
    ListTables      no             Get tables starting with “. . . ”
    UpdateTable     yes            Update provisions


      “DELETING” table might answer requests until deleted
TL;DR Let’s make it short :)




      Amazon
          scalable
          fully integrated
      Constraints
          throughput provisioning
          index matters
