SlideShare a Scribd company logo
1 of 47
Download to read offline
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
November 30, 2016
How DataXu Scaled Its Attribution
System to Handle Billions of Events
per Day with Amazon DynamoDB
Padma Malligarjunan, AWS
Yekesa Kosuru, DataXu
Rohit Dialani, DataXu
Agenda
• Benefits of NoSQL
• Fully managed features of Amazon DynamoDB
• DynamoDB integration with AWS services
• DataXu’s DynamoDB use case
Traditional SQL NoSQL
DB
Primary Secondary
Scale up
DB
SQL (Relational) vs. NoSQL (Non-relational)
Traditional SQL NoSQL
DB
Primary Secondary
Scale up
DB
DB
DBDB
DB DB
Scale out
SQL (Relational) vs. NoSQL (Non-relational)
SQL (Relational)
Price Desc.
$11.50
$8.99
Must
watch..
Columns
Rows
Primary Key Index
$14.95
One of 2
major …
The
Sounds..
Product
ID
Type
1 Book
2 Album
3 Movie
Products
SQL (Relational)
Price Desc.
$11.50
$8.99
Must
watch..
Columns
Rows
Primary Key Index
$14.95
One of 2
major …
The
Sounds..
Product
ID
Type
1
2
3
Title Date
Harry
Potter…
2010
Book ID Author
1 JK Ro..
Books
Products
Book
Album
Movie
SQL (Relational)
Price Desc.
$11.50
$8.99
Must
watch..
Columns
Rows
Primary Key Index
$14.95
One of 2
major …
The
Sounds..
Product
ID
Type
1
2
3
Title Date
Harry
Potter…
2010
Book ID Author
1 JK Ro..
Books
Albums
Title
The Fox
Album
ID
Artist
2 Ylvis
Products
Book
Album
Movie
SQL (Relational)
Price Desc.
$11.50
$8.99
Must
watch..
Columns
Rows
Primary Key Index
$14.95
One of 2
major …
The
Sounds..
Product
ID
Type
1
2
3
Title Date
Harry
Potter…
2010
Book ID Author
1 JK Ro..
Books
Albums
Title
The Fox
Album
ID
Artist
2 Ylvis
Genre Director
Action
Zack
Snyder
Movie ID Title
3
Batman
vs Super
Movies
Products
Book
Album
Movie
SQL (Relational) vs. NoSQL (Non-relational)
Product
ID
Type
Harry
Potter..
JK
Rowling
1 Book ID
2 Album ID The Fox
3 Movie ID
Batman
vs Super
Ylvis
Attributes
Schema is defined per item
Items
Partition Key Sort Key
Price Desc.
$11.50
$8.99
Must
watch..
Columns
Rows
Primary Key Index
$14.95
One of 2
major …
The
Sounds..
3
Movie ID:
Actor ID
Ben
Affleck
Action
2010
Zack
Snyder
Primary Key
Product
ID
Type
1
2
3
Title Date
Harry
Potter…
2010
Book ID Author
1 JK Ro.. Title
The Fox
Album
ID
Artist
2 Ylvis
Genre Director
Action
Zack
Snyder
Movie ID Title
3
Batman
vs Super
Products Products
Book
Album
Movie
Books
Albums
Movies
NoSQL design optimizes for
compute instead of storage
Why NoSQL?
Optimized for storage Optimized for compute
Normalized/relational Denormalized/hierarchical
Ad hoc queries Instantiated views
Scale vertically Scale horizontally
Good for OLAP Built for OLTP at scale
SQL NoSQL
Fully managed
Fast, consistent performance
Highly scalable
Flexible
Event-driven programming
Fine-grained access control
DynamoDB Benefits
Ad Tech Gaming MobileIoT Web
Scaling High-Velocity Use Cases with DynamoDB
Products
Product_Id
Table and Item API
Admin CRUD
Create Table Put/Get Item
Update Table Batch Put/Get Item
Delete Table Update Item
Describe Table Delete Item
Query
Scan
DynamoDB
Streams
ListStreams
DescribeStream
GetShardIterator
GetRecords
Stream of updates to a table
Asynchronous
Exactly once
Strictly ordered
• Per item
Highly durable
• Scale with table
24-hour lifetime
Sub-second latency
DynamoDB Streams
Stream
Table
Partition 1
Partition 2
Partition 3
Partition 4
Partition 5
Table
Shard 1
Shard 2
Shard 3
Shard 4
KCL
Worker
KCL
Worker
KCL
Worker
KCL
Worker
Amazon Kinesis Client
Library application
DynamoDB
client application
Updates
DynamoDB Streams and
Amazon Kinesis Client Library
DynamoDB Streams and AWS Lambda
Triggers
Lambda function
Notify change
Derivative tables
Amazon Elasticsearch
Service
Amazon
ElastiCache
Analytics with
DynamoDB Streams
Collect and de-dupe data in DynamoDB
Aggregate data in memory and flush periodically
Performing real-time aggregation and
analytics
Reference architecture
DataXu’s DynamoDB Use Case
• Who is DataXu
• Attribution Use Case
• Why DynamoDB
• Deployment Architecture
• Capacity & Performance
• Tips & Lessons Learned
DataXu
• Who
• Spun out of MIT Labs
• A petabyte-scale digital
marketing platform
• One of the fastest growing
companies in Inc. 5000
• What
• Help world’s most
valuable brands
understand and engage
with their consumers
• Maximize ROI
Quick Statistics
• 2M+ bid requests per second
• Billions of impressions per
month, petabytes of data
• ~10ms round-trip response time
• 180+ TB logs per day
• 2 PB data analyzed
• 3000+ servers powering the
platform
• 13 regions, 24x7
Real Time Bidding
DataXu Reads and Writes on DynamoDB
X-Axis = Day
Y-Axis = Read/Write Capacity used
X-Axis = Time (6 hour intervals)
Y-Axis = Read/Write Capacity used
Attribution Use Case
Attribution is the science of allocating credit from an activity/sale to
the marketing touchpoints that a customer was exposed to prior to
the purchase/activity.
Attribution
Online
Purchase
Impression ClickImpression
Customer Journey
EI EventImpression A Activity
Generalized Event Chains
AI E I A
Time
• Billions of events and activities are organized into sequences.
• Events are correlated based on time and user to construct paths leading to an
activity.
EI Event
E
Impression A Activity
I E
Why DynamoDB
Why DynamoDB
• Managed Service
• Easy to use
• Elastic scaling, no need to overprovision
• API driven
• Fast & Predictable Performance (millisecs)
• Fast lookup/scan of user events
• Consistent & predictable read/write performance
• TCO
• Reasonable capex and no opex
Deployment Architecture
DataXu Flows
CDN
Real-Time
Bidding
Retargeting
Platform
Streams
(Amazon
Kinesis)
Advanced Analytics
(Third-Party)
Reporting Tools
(Third-Party)Machine
Learning
(Spark)
S3All Data
(Amazon
S3)
ETL (SPARK
SQL)
Attribution (MR)
Ecosystem of tools and services
Attribution Engine
Meta
Amazon
EMR
JobAmazon
Cloud
Watch
DynamoDB
AWS
Data
Pipeline
3rd
Party
S3
Buckets
1st
Party
AWS Direct
Connect
Amazon
VPC
Amazon
EC2
Amazon
RDS
Amazon SNS
AWS
IAM
Inside DynamoDB: Events Table
User Events
Table
Users Events_<month_1>
hash=userid
range=timestamp
<payload>
Put Item Events
Users Events_<month_2>
hash=userid
range=timestamp
<payload>
Property Value
Storage 25 TB
Avg. Record
Size
4 KB
1:N Relationship
Events Table Schema
(partition key) (sort key) (attributes)
Userid-1 epoch1 ..
Userid-1 epoch2 ..
userid timestamp payload
rLsWAQZU1C00TU5 1475624579321 <Binary compressed>
rLsWAQZU1C00TU5 1477762942692 <Binary compressed>
rLsWAQZU1C00TU5 1475624579695 <Binary compressed>
rLsWAQZU1C00TU5 1475624579703 <Binary compressed>
SS2U6KnX1BWziP5 1476829764673 <Binary compressed>
I
I
E
A
Capacity & Performance
R/W Operations vs. R/W Capacity Units
What influences capacity units for your table?
• Item size: Capacity unit size
• 4 KB per Read or 1 KB per Write
• Read/write request rate: Item Gets and Puts by your
Application
• Consistency: Strongly Consistent Read is counted double of
Eventually Consistent Read
• Local Secondary Index: Synchronized with the table
Capacity Planning: Unit of Scaling
• Partition:
• Storage: 10 GB per partition
• Compute: 3000 RCU or 1000 WCU per partition
• Partitions(for throughput) = (RCU/3000) + (WCU/1000)
• Partitions(for size) = Storage used in GB/10
• Total Number of Partitions =
Ceiling(MAX (Partitions(for throughput) , Partitions(for size)))
• e.g., Ceiling(Max(100/10, 9000/3000+3000/1000)) = 10
Capacity Examples
Storage Provisioned
RCU
Provisioned
WCU
Partitions Reads
/Sec
/Partition
Writes
/Sec
/Partition
35 GB* 1000 500 4 250 125
1000 GB* 1000 500 100 10 5
100 GB* 9000 3000 10 900 300
100 GB* 90K 30K 60 1500 500
100 GB** 9000 3000 10 450 60
* Item size of 1 KB or less
** Item size 5 KB
Throttling
100 GB 9000 3000 10 900 300
Storage Provisioned
RCU
Provisioned
WCU
Partitions Reads Per
Partition
Writes Per
Partition
900 Reads and 300 Writes Per Partition
Throttling kicks in > 900 R and 300 W
Partitions
How to Detect Throttling
Auto (Predictive) Scaling
Reduce TCO by tuning throughput to match usage
0.01% of the runs
Tips & Lessons Learned
Design Tips
• Understand Scaling
• Understand Hot Keys/Throttling
• Capture Application Metrics
• Configure Table Alarms
• Application Tuning for Outliers
• Retry w/Backoff
• DynamoDB Best Practices
• http://docs.aws.amazon.com/amazondynamodb/latest/developergui
de/BestPractices.html
• AWS Service Limits
• http://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html
Lessons Learned
• Reduce RCU and WCU
• Combined Reads and Writes, Batch API
• Combined multiple rows that share the same hash key to the
same row (3X less puts)
• LZ4 compression
• How do we handle Deletes?
• Table rotation to match attribution windows
• Drop entire table when it is no longer necessary
Lessons Learned
• Dynamic scaling to large number of partitions takes time
• Debugging
• Application logging/metrics
• TCP dumps
• Turn on Request ID logging
• CloudWatch
• Local DynamoDB for testing
• http://docs.aws.amazon.com/amazondynamodb/latest/develo
perguide/Tools.DynamoDBLocal.html
Thank you!
ykosuru@dataxu.com
rdialani@dataxu.com
We are hiring!
https://www.dataxu.com/careers
Remember to complete
your evaluations!

More Related Content

What's hot

SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierAmazon Web Services
 
BDA403 The Visible Network: How Netflix Uses Kinesis Streams to Monitor Appli...
BDA403 The Visible Network: How Netflix Uses Kinesis Streams to Monitor Appli...BDA403 The Visible Network: How Netflix Uses Kinesis Streams to Monitor Appli...
BDA403 The Visible Network: How Netflix Uses Kinesis Streams to Monitor Appli...Amazon Web Services
 
Building Your First Big Data Application on AWS
Building Your First Big Data Application on AWSBuilding Your First Big Data Application on AWS
Building Your First Big Data Application on AWSAmazon Web Services
 
ENT202 Creating Your Virtual Data Center: VPC Fundamentals and Connectivity O...
ENT202 Creating Your Virtual Data Center: VPC Fundamentals and Connectivity O...ENT202 Creating Your Virtual Data Center: VPC Fundamentals and Connectivity O...
ENT202 Creating Your Virtual Data Center: VPC Fundamentals and Connectivity O...Amazon Web Services
 
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierAmazon Web Services
 
AWS re:Invent 2016: Running Lean Architectures: How to Optimize for Cost Effi...
AWS re:Invent 2016: Running Lean Architectures: How to Optimize for Cost Effi...AWS re:Invent 2016: Running Lean Architectures: How to Optimize for Cost Effi...
AWS re:Invent 2016: Running Lean Architectures: How to Optimize for Cost Effi...Amazon Web Services
 
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...Amazon Web Services
 
Stream Processing in SmartNews #jawsdays
Stream Processing in SmartNews #jawsdaysStream Processing in SmartNews #jawsdays
Stream Processing in SmartNews #jawsdaysSmartNews, Inc.
 
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon KinesisSRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon KinesisAmazon Web Services
 
A Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionA Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionAmazon Web Services
 
AWS re:Invent 2016: Achieving Agility by Following Well-Architected Framework...
AWS re:Invent 2016: Achieving Agility by Following Well-Architected Framework...AWS re:Invent 2016: Achieving Agility by Following Well-Architected Framework...
AWS re:Invent 2016: Achieving Agility by Following Well-Architected Framework...Amazon Web Services
 
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...Amazon Web Services
 
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...Amazon Web Services
 
The State of Serverless Computing | AWS Public Sector Summit 2017
The State of Serverless Computing | AWS Public Sector Summit 2017The State of Serverless Computing | AWS Public Sector Summit 2017
The State of Serverless Computing | AWS Public Sector Summit 2017Amazon Web Services
 
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRAmazon Web Services
 
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You ScaleENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You ScaleAmazon Web Services
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesBDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesAmazon Web Services
 

What's hot (20)

SRV408 Deep Dive on AWS IoT
SRV408 Deep Dive on AWS IoTSRV408 Deep Dive on AWS IoT
SRV408 Deep Dive on AWS IoT
 
SEC301 Security @ (Cloud) Scale
SEC301 Security @ (Cloud) ScaleSEC301 Security @ (Cloud) Scale
SEC301 Security @ (Cloud) Scale
 
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
 
BDA403 The Visible Network: How Netflix Uses Kinesis Streams to Monitor Appli...
BDA403 The Visible Network: How Netflix Uses Kinesis Streams to Monitor Appli...BDA403 The Visible Network: How Netflix Uses Kinesis Streams to Monitor Appli...
BDA403 The Visible Network: How Netflix Uses Kinesis Streams to Monitor Appli...
 
Building Your First Big Data Application on AWS
Building Your First Big Data Application on AWSBuilding Your First Big Data Application on AWS
Building Your First Big Data Application on AWS
 
ENT202 Creating Your Virtual Data Center: VPC Fundamentals and Connectivity O...
ENT202 Creating Your Virtual Data Center: VPC Fundamentals and Connectivity O...ENT202 Creating Your Virtual Data Center: VPC Fundamentals and Connectivity O...
ENT202 Creating Your Virtual Data Center: VPC Fundamentals and Connectivity O...
 
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
 
AWS re:Invent 2016: Running Lean Architectures: How to Optimize for Cost Effi...
AWS re:Invent 2016: Running Lean Architectures: How to Optimize for Cost Effi...AWS re:Invent 2016: Running Lean Architectures: How to Optimize for Cost Effi...
AWS re:Invent 2016: Running Lean Architectures: How to Optimize for Cost Effi...
 
Self-Service Supercomputing
Self-Service SupercomputingSelf-Service Supercomputing
Self-Service Supercomputing
 
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...
 
Stream Processing in SmartNews #jawsdays
Stream Processing in SmartNews #jawsdaysStream Processing in SmartNews #jawsdays
Stream Processing in SmartNews #jawsdays
 
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon KinesisSRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
 
A Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionA Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in Action
 
AWS re:Invent 2016: Achieving Agility by Following Well-Architected Framework...
AWS re:Invent 2016: Achieving Agility by Following Well-Architected Framework...AWS re:Invent 2016: Achieving Agility by Following Well-Architected Framework...
AWS re:Invent 2016: Achieving Agility by Following Well-Architected Framework...
 
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
AWS re:Invent 2016: Metering Big Data at AWS: From 0 to 100 Million Records i...
 
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
AWS re:Invent 2016: Innovation After Installation: Establishing a Digital Rel...
 
The State of Serverless Computing | AWS Public Sector Summit 2017
The State of Serverless Computing | AWS Public Sector Summit 2017The State of Serverless Computing | AWS Public Sector Summit 2017
The State of Serverless Computing | AWS Public Sector Summit 2017
 
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
 
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You ScaleENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesBDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
 

Viewers also liked

DataXu: Programmatic Premium Webinar - June 7, 2012
DataXu: Programmatic Premium Webinar - June 7, 2012DataXu: Programmatic Premium Webinar - June 7, 2012
DataXu: Programmatic Premium Webinar - June 7, 2012dataxu
 
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataGetting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataQubole
 
BIPD Tech Tuesday Presentation - Qubole
BIPD Tech Tuesday Presentation - QuboleBIPD Tech Tuesday Presentation - Qubole
BIPD Tech Tuesday Presentation - QuboleQubole
 
Azure stream analytics by Nico Jacobs
Azure stream analytics by Nico JacobsAzure stream analytics by Nico Jacobs
Azure stream analytics by Nico JacobsITProceed
 
Fortinet Automates Migration onto Layered Secure Workloads
Fortinet Automates Migration onto Layered Secure WorkloadsFortinet Automates Migration onto Layered Secure Workloads
Fortinet Automates Migration onto Layered Secure WorkloadsAmazon Web Services
 
Qubole presentation for the Cleveland Big Data and Hadoop Meetup
Qubole presentation for the Cleveland Big Data and Hadoop Meetup   Qubole presentation for the Cleveland Big Data and Hadoop Meetup
Qubole presentation for the Cleveland Big Data and Hadoop Meetup Qubole
 
Azure ARM’d and Ready
Azure ARM’d and ReadyAzure ARM’d and Ready
Azure ARM’d and Readymscug
 
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...NoSQLmatters
 
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)Amazon Web Services
 
Qubole hadoop-summit-2013-europe
Qubole hadoop-summit-2013-europeQubole hadoop-summit-2013-europe
Qubole hadoop-summit-2013-europeJoydeep Sen Sarma
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...MSAdvAnalytics
 
5 Crucial Considerations for Big data adoption
5 Crucial Considerations for Big data adoption5 Crucial Considerations for Big data adoption
5 Crucial Considerations for Big data adoptionQubole
 
Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Joydeep Sen Sarma
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesQubole
 
Nw qubole overview_033015
Nw qubole overview_033015Nw qubole overview_033015
Nw qubole overview_033015Michael Mersch
 
Unlocking Self-Service Big Data Analytics on AWS
Unlocking Self-Service Big Data Analytics on AWSUnlocking Self-Service Big Data Analytics on AWS
Unlocking Self-Service Big Data Analytics on AWSAmazon Web Services
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...MSAdvAnalytics
 

Viewers also liked (20)

DataXu: Programmatic Premium Webinar - June 7, 2012
DataXu: Programmatic Premium Webinar - June 7, 2012DataXu: Programmatic Premium Webinar - June 7, 2012
DataXu: Programmatic Premium Webinar - June 7, 2012
 
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataGetting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big Data
 
BIPD Tech Tuesday Presentation - Qubole
BIPD Tech Tuesday Presentation - QuboleBIPD Tech Tuesday Presentation - Qubole
BIPD Tech Tuesday Presentation - Qubole
 
Azure stream analytics by Nico Jacobs
Azure stream analytics by Nico JacobsAzure stream analytics by Nico Jacobs
Azure stream analytics by Nico Jacobs
 
Creating a fortigate vpn network & security blog
Creating a fortigate vpn   network & security blogCreating a fortigate vpn   network & security blog
Creating a fortigate vpn network & security blog
 
Fortinet Automates Migration onto Layered Secure Workloads
Fortinet Automates Migration onto Layered Secure WorkloadsFortinet Automates Migration onto Layered Secure Workloads
Fortinet Automates Migration onto Layered Secure Workloads
 
Qubole presentation for the Cleveland Big Data and Hadoop Meetup
Qubole presentation for the Cleveland Big Data and Hadoop Meetup   Qubole presentation for the Cleveland Big Data and Hadoop Meetup
Qubole presentation for the Cleveland Big Data and Hadoop Meetup
 
Azure ARM’d and Ready
Azure ARM’d and ReadyAzure ARM’d and Ready
Azure ARM’d and Ready
 
Azure Document Db
Azure Document DbAzure Document Db
Azure Document Db
 
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
 
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
 
Qubole hadoop-summit-2013-europe
Qubole hadoop-summit-2013-europeQubole hadoop-summit-2013-europe
Qubole hadoop-summit-2013-europe
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
 
RDO-Packstack Workshop
RDO-Packstack Workshop RDO-Packstack Workshop
RDO-Packstack Workshop
 
5 Crucial Considerations for Big data adoption
5 Crucial Considerations for Big data adoption5 Crucial Considerations for Big data adoption
5 Crucial Considerations for Big data adoption
 
Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slides
 
Nw qubole overview_033015
Nw qubole overview_033015Nw qubole overview_033015
Nw qubole overview_033015
 
Unlocking Self-Service Big Data Analytics on AWS
Unlocking Self-Service Big Data Analytics on AWSUnlocking Self-Service Big Data Analytics on AWS
Unlocking Self-Service Big Data Analytics on AWS
 
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
Cortana Analytics Workshop: The "Big Data" of the Cortana Analytics Suite, Pa...
 

Similar to AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billions of events per day with Amazon DynamoDB (DAT312)

Getting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBGetting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBAmazon Web Services
 
AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent...
AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent...AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent...
AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent...Amazon Web Services
 
AWS CLOUD 2018- Amazon DynamoDB기반 글로벌 서비스 개발 방법 (김준형 솔루션즈 아키텍트)
AWS CLOUD 2018- Amazon DynamoDB기반 글로벌 서비스 개발 방법 (김준형 솔루션즈 아키텍트)AWS CLOUD 2018- Amazon DynamoDB기반 글로벌 서비스 개발 방법 (김준형 솔루션즈 아키텍트)
AWS CLOUD 2018- Amazon DynamoDB기반 글로벌 서비스 개발 방법 (김준형 솔루션즈 아키텍트)Amazon Web Services Korea
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
What's New with Amazon DynamoDB - SRV311 - Atlanta AWS Summit
What's New with Amazon DynamoDB - SRV311 - Atlanta AWS SummitWhat's New with Amazon DynamoDB - SRV311 - Atlanta AWS Summit
What's New with Amazon DynamoDB - SRV311 - Atlanta AWS SummitAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
What's New with Amazon DynamoDB - SRV311 - Chicago AWS Summit
What's New with Amazon DynamoDB - SRV311 - Chicago AWS SummitWhat's New with Amazon DynamoDB - SRV311 - Chicago AWS Summit
What's New with Amazon DynamoDB - SRV311 - Chicago AWS SummitAmazon Web Services
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL ServicesAmazon Web Services
 
AWS March 2016 Webinar Series - Managed Database Services on Amazon Web Services
AWS March 2016 Webinar Series - Managed Database Services on Amazon Web ServicesAWS March 2016 Webinar Series - Managed Database Services on Amazon Web Services
AWS March 2016 Webinar Series - Managed Database Services on Amazon Web ServicesAmazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Amazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian MeyersAmazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian Meyershuguk
 
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayGetting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayAmazon Web Services Korea
 
February 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBFebruary 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBAmazon Web Services
 
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftAmazon Web Services
 
Getting Started with Managed Database Services on AWS - September 2016 Webina...
Getting Started with Managed Database Services on AWS - September 2016 Webina...Getting Started with Managed Database Services on AWS - September 2016 Webina...
Getting Started with Managed Database Services on AWS - September 2016 Webina...Amazon Web Services
 

Similar to AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billions of events per day with Amazon DynamoDB (DAT312) (20)

AWS Data Collection & Storage
AWS Data Collection & StorageAWS Data Collection & Storage
AWS Data Collection & Storage
 
Getting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDBGetting Started with Amazon DynamoDB
Getting Started with Amazon DynamoDB
 
AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent...
AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent...AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent...
AWS Storage and Database Architecture Best Practices (DAT203) | AWS re:Invent...
 
AWS CLOUD 2018- Amazon DynamoDB기반 글로벌 서비스 개발 방법 (김준형 솔루션즈 아키텍트)
AWS CLOUD 2018- Amazon DynamoDB기반 글로벌 서비스 개발 방법 (김준형 솔루션즈 아키텍트)AWS CLOUD 2018- Amazon DynamoDB기반 글로벌 서비스 개발 방법 (김준형 솔루션즈 아키텍트)
AWS CLOUD 2018- Amazon DynamoDB기반 글로벌 서비스 개발 방법 (김준형 솔루션즈 아키텍트)
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
What's New with Amazon DynamoDB - SRV311 - Atlanta AWS Summit
What's New with Amazon DynamoDB - SRV311 - Atlanta AWS SummitWhat's New with Amazon DynamoDB - SRV311 - Atlanta AWS Summit
What's New with Amazon DynamoDB - SRV311 - Atlanta AWS Summit
 
Log Analysis At Scale
Log Analysis At ScaleLog Analysis At Scale
Log Analysis At Scale
 
AWS Analytics
AWS AnalyticsAWS Analytics
AWS Analytics
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
What's New with Amazon DynamoDB - SRV311 - Chicago AWS Summit
What's New with Amazon DynamoDB - SRV311 - Chicago AWS SummitWhat's New with Amazon DynamoDB - SRV311 - Chicago AWS Summit
What's New with Amazon DynamoDB - SRV311 - Chicago AWS Summit
 
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services
 
AWS March 2016 Webinar Series - Managed Database Services on Amazon Web Services
AWS March 2016 Webinar Series - Managed Database Services on Amazon Web ServicesAWS March 2016 Webinar Series - Managed Database Services on Amazon Web Services
AWS March 2016 Webinar Series - Managed Database Services on Amazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Amazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian MeyersAmazon Elastic Map Reduce - Ian Meyers
Amazon Elastic Map Reduce - Ian Meyers
 
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB DayGetting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
Getting Strated with Amazon Dynamo DB (Jim Scharf) - AWS DB Day
 
February 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDBFebruary 2016 Webinar Series - Introduction to DynamoDB
February 2016 Webinar Series - Introduction to DynamoDB
 
Processing and Analytics
Processing and AnalyticsProcessing and Analytics
Processing and Analytics
 
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon Redshift
 
Getting Started with Managed Database Services on AWS - September 2016 Webina...
Getting Started with Managed Database Services on AWS - September 2016 Webina...Getting Started with Managed Database Services on AWS - September 2016 Webina...
Getting Started with Managed Database Services on AWS - September 2016 Webina...
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 

Recently uploaded (20)

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

AWS re:Invent 2016: How DataXu scaled its Attribution System to handle billions of events per day with Amazon DynamoDB (DAT312)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. November 30, 2016 How DataXu Scaled Its Attribution System to Handle Billions of Events per Day with Amazon DynamoDB Padma Malligarjunan, AWS Yekesa Kosuru, DataXu Rohit Dialani, DataXu
  • 2. Agenda • Benefits of NoSQL • Fully managed features of Amazon DynamoDB • DynamoDB integration with AWS services • DataXu’s DynamoDB use case
  • 3. Traditional SQL NoSQL DB Primary Secondary Scale up DB SQL (Relational) vs. NoSQL (Non-relational)
  • 4. Traditional SQL NoSQL DB Primary Secondary Scale up DB DB DBDB DB DB Scale out SQL (Relational) vs. NoSQL (Non-relational)
  • 5. SQL (Relational) Price Desc. $11.50 $8.99 Must watch.. Columns Rows Primary Key Index $14.95 One of 2 major … The Sounds.. Product ID Type 1 Book 2 Album 3 Movie Products
  • 6. SQL (Relational) Price Desc. $11.50 $8.99 Must watch.. Columns Rows Primary Key Index $14.95 One of 2 major … The Sounds.. Product ID Type 1 2 3 Title Date Harry Potter… 2010 Book ID Author 1 JK Ro.. Books Products Book Album Movie
  • 7. SQL (Relational) Price Desc. $11.50 $8.99 Must watch.. Columns Rows Primary Key Index $14.95 One of 2 major … The Sounds.. Product ID Type 1 2 3 Title Date Harry Potter… 2010 Book ID Author 1 JK Ro.. Books Albums Title The Fox Album ID Artist 2 Ylvis Products Book Album Movie
  • 8. SQL (Relational) Price Desc. $11.50 $8.99 Must watch.. Columns Rows Primary Key Index $14.95 One of 2 major … The Sounds.. Product ID Type 1 2 3 Title Date Harry Potter… 2010 Book ID Author 1 JK Ro.. Books Albums Title The Fox Album ID Artist 2 Ylvis Genre Director Action Zack Snyder Movie ID Title 3 Batman vs Super Movies Products Book Album Movie
  • 9. SQL (Relational) vs. NoSQL (Non-relational) Product ID Type Harry Potter.. JK Rowling 1 Book ID 2 Album ID The Fox 3 Movie ID Batman vs Super Ylvis Attributes Schema is defined per item Items Partition Key Sort Key Price Desc. $11.50 $8.99 Must watch.. Columns Rows Primary Key Index $14.95 One of 2 major … The Sounds.. 3 Movie ID: Actor ID Ben Affleck Action 2010 Zack Snyder Primary Key Product ID Type 1 2 3 Title Date Harry Potter… 2010 Book ID Author 1 JK Ro.. Title The Fox Album ID Artist 2 Ylvis Genre Director Action Zack Snyder Movie ID Title 3 Batman vs Super Products Products Book Album Movie Books Albums Movies NoSQL design optimizes for compute instead of storage
  • 10. Why NoSQL? Optimized for storage Optimized for compute Normalized/relational Denormalized/hierarchical Ad hoc queries Instantiated views Scale vertically Scale horizontally Good for OLAP Built for OLTP at scale SQL NoSQL
  • 11. Fully managed Fast, consistent performance Highly scalable Flexible Event-driven programming Fine-grained access control DynamoDB Benefits
  • 12. Ad Tech Gaming MobileIoT Web Scaling High-Velocity Use Cases with DynamoDB
  • 14. Table and Item API Admin CRUD Create Table Put/Get Item Update Table Batch Put/Get Item Delete Table Update Item Describe Table Delete Item Query Scan DynamoDB Streams ListStreams DescribeStream GetShardIterator GetRecords
  • 15. Stream of updates to a table Asynchronous Exactly once Strictly ordered • Per item Highly durable • Scale with table 24-hour lifetime Sub-second latency DynamoDB Streams
  • 16. Stream Table Partition 1 Partition 2 Partition 3 Partition 4 Partition 5 Table Shard 1 Shard 2 Shard 3 Shard 4 KCL Worker KCL Worker KCL Worker KCL Worker Amazon Kinesis Client Library application DynamoDB client application Updates DynamoDB Streams and Amazon Kinesis Client Library
  • 17. DynamoDB Streams and AWS Lambda
  • 18. Triggers Lambda function Notify change Derivative tables Amazon Elasticsearch Service Amazon ElastiCache
  • 19. Analytics with DynamoDB Streams Collect and de-dupe data in DynamoDB Aggregate data in memory and flush periodically Performing real-time aggregation and analytics
  • 21. DataXu’s DynamoDB Use Case • Who is DataXu • Attribution Use Case • Why DynamoDB • Deployment Architecture • Capacity & Performance • Tips & Lessons Learned
  • 22. DataXu • Who • Spun out of MIT Labs • A petabyte-scale digital marketing platform • One of the fastest growing companies in Inc. 5000 • What • Help world’s most valuable brands understand and engage with their consumers • Maximize ROI Quick Statistics • 2M+ bid requests per second • Billions of impressions per month, petabytes of data • ~10ms round-trip response time • 180+ TB logs per day • 2 PB data analyzed • 3000+ servers powering the platform • 13 regions, 24x7
  • 24. DataXu Reads and Writes on DynamoDB X-Axis = Day Y-Axis = Read/Write Capacity used X-Axis = Time (6 hour intervals) Y-Axis = Read/Write Capacity used
  • 26. Attribution is the science of allocating credit from an activity/sale to the marketing touchpoints that a customer was exposed to prior to the purchase/activity. Attribution Online Purchase Impression ClickImpression Customer Journey EI EventImpression A Activity
  • 27. Generalized Event Chains AI E I A Time • Billions of events and activities are organized into sequences. • Events are correlated based on time and user to construct paths leading to an activity. EI Event E Impression A Activity I E
  • 29. Why DynamoDB • Managed Service • Easy to use • Elastic scaling, no need to overprovision • API driven • Fast & Predictable Performance (millisecs) • Fast lookup/scan of user events • Consistent & predictable read/write performance • TCO • Reasonable capex and no opex
  • 31. DataXu Flows CDN Real-Time Bidding Retargeting Platform Streams (Amazon Kinesis) Advanced Analytics (Third-Party) Reporting Tools (Third-Party)Machine Learning (Spark) S3All Data (Amazon S3) ETL (SPARK SQL) Attribution (MR) Ecosystem of tools and services
  • 33. Inside DynamoDB: Events Table User Events Table Users Events_<month_1> hash=userid range=timestamp <payload> Put Item Events Users Events_<month_2> hash=userid range=timestamp <payload> Property Value Storage 25 TB Avg. Record Size 4 KB
  • 34. 1:N Relationship Events Table Schema (partition key) (sort key) (attributes) Userid-1 epoch1 .. Userid-1 epoch2 .. userid timestamp payload rLsWAQZU1C00TU5 1475624579321 <Binary compressed> rLsWAQZU1C00TU5 1477762942692 <Binary compressed> rLsWAQZU1C00TU5 1475624579695 <Binary compressed> rLsWAQZU1C00TU5 1475624579703 <Binary compressed> SS2U6KnX1BWziP5 1476829764673 <Binary compressed> I I E A
  • 36. R/W Operations vs. R/W Capacity Units What influences capacity units for your table? • Item size: Capacity unit size • 4 KB per Read or 1 KB per Write • Read/write request rate: Item Gets and Puts by your Application • Consistency: Strongly Consistent Read is counted double of Eventually Consistent Read • Local Secondary Index: Synchronized with the table
  • 37. Capacity Planning: Unit of Scaling • Partition: • Storage: 10 GB per partition • Compute: 3000 RCU or 1000 WCU per partition • Partitions(for throughput) = (RCU/3000) + (WCU/1000) • Partitions(for size) = Storage used in GB/10 • Total Number of Partitions = Ceiling(MAX (Partitions(for throughput) , Partitions(for size))) • e.g., Ceiling(Max(100/10, 9000/3000+3000/1000)) = 10
  • 38. Capacity Examples Storage Provisioned RCU Provisioned WCU Partitions Reads /Sec /Partition Writes /Sec /Partition 35 GB* 1000 500 4 250 125 1000 GB* 1000 500 100 10 5 100 GB* 9000 3000 10 900 300 100 GB* 90K 30K 60 1500 500 100 GB** 9000 3000 10 450 60 * Item size of 1 KB or less ** Item size 5 KB
  • 39. Throttling 100 GB 9000 3000 10 900 300 Storage Provisioned RCU Provisioned WCU Partitions Reads Per Partition Writes Per Partition 900 Reads and 300 Writes Per Partition Throttling kicks in > 900 R and 300 W Partitions
  • 40. How to Detect Throttling
  • 41. Auto (Predictive) Scaling Reduce TCO by tuning throughput to match usage 0.01% of the runs
  • 42. Tips & Lessons Learned
  • 43. Design Tips • Understand Scaling • Understand Hot Keys/Throttling • Capture Application Metrics • Configure Table Alarms • Application Tuning for Outliers • Retry w/Backoff • DynamoDB Best Practices • http://docs.aws.amazon.com/amazondynamodb/latest/developergui de/BestPractices.html • AWS Service Limits • http://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html
  • 44. Lessons Learned • Reduce RCU and WCU • Combined Reads and Writes, Batch API • Combined multiple rows that share the same hash key to the same row (3X less puts) • LZ4 compression • How do we handle Deletes? • Table rotation to match attribution windows • Drop entire table when it is no longer necessary
  • 45. Lessons Learned • Dynamic scaling to large number of partitions takes time • Debugging • Application logging/metrics • TCP dumps • Turn on Request ID logging • CloudWatch • Local DynamoDB for testing • http://docs.aws.amazon.com/amazondynamodb/latest/develo perguide/Tools.DynamoDBLocal.html
  • 46. Thank you! ykosuru@dataxu.com rdialani@dataxu.com We are hiring! https://www.dataxu.com/careers