© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Steve Abraham, Solutions Architect - AWS
Brian Filppu, Director of Business Intelligence - Zillow
October 2015
BDT307
Zero Infrastructure, Real-Time Data Collection, and Analytics
Who am I?
• Steve Abraham
• Solutions Architect – AWS
• Previous life
• T-Mobile
• U.S. State Department
• Hasbro
• Software company
What we’ll cover
• Data ingestion pipeline
• Collect 1,000,000,000 data points per month
• Varied clients
• Near real-time access to data
• High performance / high availability
• Low cost / low maintenance
• Case study – Zillow
• Brian Filppu – Director of Business Intelligence
End State: Amazon Redshift
End State: Amazon Aurora
Amazon API Gateway
• Create REST-based endpoints
• Fully-managed
• Scales automatically
• Enables rapid development
• Flexible security controls
Amazon API Gateway
• Integration types
• Lambda
• Proxy AWS service
• Proxy existing service
• Mock
Amazon API Gateway
• Deploy to stages
• Cross-origin resource sharing (CORS) support
• Automatically generates SDK
• Android
• iOS
• JavaScript
Amazon API Gateway
• $3.50 per 1,000,000 calls
• Data transfer in - Free
• Data transfer out - $0.05 to $0.09 per GB
• 1,000,000,000 calls
• $3,500.00 – Gateway
• $0.00 – Data transfer out
• Total price - $3,500.00
AWS Lambda
• Fully-managed server-less compute
• Event-driven
• Platform
• Amazon Linux
• Node.js / Java
• Configure memory / CPU
• Timeout
AWS Lambda – Direct Invocation Model
• Respond to invocation
• Services
• Amazon API Gateway
• Custom code
AWS Lambda – Pull Model
• Polls the event source
• Services
• Amazon Kinesis
• Amazon DynamoDB Streams
AWS Lambda – Push Model
• Respond to a specific event
• Services
• Amazon S3
• Amazon SNS
• Amazon Cognito
• Amazon Echo
AWS Lambda & Amazon API Gateway
• Amazon API Gateway / AWS Lambda
• Fast & easy to deploy
• Automatic scaling
• 100% utilization
• 100% managed
• Amazon EC2
• Existing infrastructure
• High utilization (> 90%)
AWS Lambda
• $0.20 per 1,000,000 requests
• First 1,000,000 requests / month – Free
• 1,000,000,000 executions -> $199.80
• $0.00001667 per GB-second
• 400,000 GB-seconds – Free
• 1,000,000,000 executions
• 0.5 seconds / 128 MB -> $1,035.21
• Total price -> $1,235.01
• Proxy price -> $0.00
Amazon Kinesis
• Fully-managed data aggregator
• Terabytes of data per hour
• Stream
• Replicated across 3 facilities
• 24-hour retention
• Shard
• 1 MB (1,000 PUT) / second – writes
• 2 MB (5 operations) / second – reads
• One thread
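The per-shard write limits above give a lower bound on the number of shards a stream needs. A minimal sketch, assuming 1 MB means 10^6 bytes; the function name and the sample rates are illustrative and not from the talk:

```python
import math

# Per-shard write limits quoted on the slide.
SHARD_WRITE_BYTES_PER_SEC = 1_000_000   # 1 MB per second (assumed to mean 10^6 bytes)
SHARD_WRITE_RECORDS_PER_SEC = 1_000     # 1,000 PUT records per second


def min_shards_for_writes(records_per_sec: float, avg_record_bytes: int) -> int:
    """Return the smallest shard count that satisfies both write limits."""
    by_count = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    by_bytes = math.ceil(
        records_per_sec * avg_record_bytes / SHARD_WRITE_BYTES_PER_SEC
    )
    return max(1, by_count, by_bytes)


if __name__ == "__main__":
    # Hypothetical workload: 2,500 small records per second.
    print(min_shards_for_writes(2_500, 100))
```

This is only a floor for average load. Traffic peaks and read-side limits (2 MB, 5 operations per second per shard) may call for more shards.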
Amazon Kinesis Shard Management
• Split shard
• Add capacity to stream
• Merge shard
• Reduce cost
• Amazon Kinesis scaling utility
• Allows for scaling automatically
• https://github.com/awslabs/amazon-kinesis-scaling-utils
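Splitting a shard means choosing a new starting hash key inside the parent shard's hash key range. The sketch below computes an even split. It assumes the 128-bit key space Kinesis uses for partition keys, and the helper name is hypothetical:

```python
from typing import Tuple

# Kinesis maps partition keys into a 128-bit hash key space.
MAX_HASH_KEY = 2**128 - 1

HashRange = Tuple[int, int]


def split_hash_range(start: int, end: int) -> Tuple[HashRange, HashRange]:
    """Split an inclusive hash key range into two equal-sized child ranges."""
    if not 0 <= start < end <= MAX_HASH_KEY:
        raise ValueError("expected 0 <= start < end <= 2**128 - 1")
    # The first key of the second child; this is the value a split request needs.
    new_starting_hash_key = (start + end) // 2 + 1
    return (start, new_starting_hash_key - 1), (new_starting_hash_key, end)


if __name__ == "__main__":
    left, right = split_hash_range(0, MAX_HASH_KEY)
    print(left, right)
```

The second child's first key is the value that would be passed as `NewStartingHashKey` in a SplitShard request.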
Amazon Kinesis
• Amazon API Gateway
• REST interface / proxy
• Most expensive
• Direct to Amazon Kinesis
• Amazon Kinesis API
• Least expensive
Amazon Kinesis
• $0.015 per shard hour / $11.16 per month
• 1,000,000,000 / 31 / 86,400 = 373 avg. requests/second
• 3 shards * $11.16 = $33.48
• $0.014 per 1,000,000 PUT payloads (25 KB)
• 1,000,000,000 / 1,000,000 * $0.014 = $14.00
• Total cost -> $47.48
Amazon S3 & Amazon SQS
Amazon Simple Storage Service
• Secure
• Encryption in flight - HTTPS
• Encryption at rest (Amazon S3 key, client key, AWS KMS)
• Durable
• Designed for 99.999999999% (eleven 9s) durability
• Scalable
• Millions of requests per second
• Trillions of objects
AWS Key Management Service
• Manage encryption keys
• Encrypt / decrypt data directly
• Directly integrates with
• Amazon S3
• Amazon RDS
• Amazon Redshift
• AWS Lambda integration
• Access via API
Amazon Simple Storage Service
• Key name distribution
• Random values
• Lifecycle policy
• Delete objects
• Move objects to Amazon Glacier
• Amazon Glacier
• Infrequently accessed data (cold storage)
• Low-cost starting at $0.007 per GB
• Secure / durable
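One way to get the "random values" key name distribution mentioned above is to prefix each object key with a few characters of a hash of the key itself. This is a sketch under that assumption; the helper name, prefix length, and sample key are illustrative only:

```python
import hashlib


def distributed_key(natural_key: str, prefix_len: int = 4) -> str:
    """Prefix a key with hex characters derived from its own MD5 hash.

    The prefix looks random, which spreads keys across the key space,
    but it is deterministic, so the full key can be recomputed later.
    """
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{natural_key}"


if __name__ == "__main__":
    # Hypothetical object name for one micro-batch file.
    print(distributed_key("logs/2015/10/08/events-0001.json"))
```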
Amazon Simple Queue Service
• Simple
• Easy to set up
• Secure
• Encryption in flight - HTTPS
• Durable
• Multiple servers / data centers
• Scalable
• Automatically scales
Amazon S3 Pricing
• $0.0275 - $0.0408 per GB
• Tiered pricing
• Varies by region
• $0.005 - $0.007 per 1,000 PUT requests
• Varies by region
• $0.004 - $0.0056 per 10,000 GET requests
• Varies by region
• Total cost -> $3.87
Amazon SQS Pricing
• $0.50 per 1,000,000 requests
• First 1,000,000 requests free
• Total cost -> $0.00
Amazon Redshift
• Fully-managed, petabyte scale data warehouse
• Fast
• Columnar storage / data compression
• Scalable
• Scale up or down
• Fault tolerant
• Data replicated across nodes / Backed up to Amazon S3
• Familiar
• Connect via ODBC / JDBC
Amazon Redshift
[Diagram: ODBC / JDBC connection to an Amazon Redshift cluster]
Amazon Redshift
• COPY command
• Amazon Redshift parallelizes the load
• Single transaction
• Encrypt credentials using AWS KMS
• Supports delimited, fixed width, JSON, Avro
• Supports GZIP & LZOP
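A COPY statement for gzip-compressed JSON files under an S3 prefix might be assembled as follows. This is a sketch, not the statement used in the talk: the table name, bucket, and role ARN are placeholders, and it authenticates with an IAM role rather than the KMS-encrypted credentials the slide mentions.

```python
def build_copy_statement(
    table: str, s3_prefix: str, iam_role_arn: str, gzip: bool = True
) -> str:
    """Build a Redshift COPY statement that loads JSON files from an S3 prefix."""
    parts = [
        f"COPY {table}",
        f"FROM '{s3_prefix}'",        # every file under the prefix loads in parallel
        f"IAM_ROLE '{iam_role_arn}'",
        "FORMAT AS JSON 'auto'",      # map JSON keys to column names
    ]
    if gzip:
        parts.append("GZIP")
    return "\n".join(parts) + ";"


if __name__ == "__main__":
    # All three arguments are placeholder values.
    print(
        build_copy_statement(
            "events",
            "s3://example-bucket/batches/2015-10-08/",
            "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
        )
    )
```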
Amazon Redshift
• Micro-batch loading
• Number of files = multiple of virtual cores
• Define compression type for each column in table definition
• Load data in sort key order
• Use SSD node type (dc1 instance types)
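To keep the number of files a multiple of the virtual core count, a micro-batch can be dealt round-robin into that many files before upload. A minimal sketch with hypothetical names; the records are plain strings for illustration:

```python
from typing import List


def split_into_files(
    records: List[str], virtual_cores: int, files_per_core: int = 1
) -> List[List[str]]:
    """Deal records round-robin into a file count that is a multiple of the cores."""
    if virtual_cores < 1 or files_per_core < 1:
        raise ValueError("virtual_cores and files_per_core must be positive")
    file_count = virtual_cores * files_per_core
    files: List[List[str]] = [[] for _ in range(file_count)]
    for index, record in enumerate(records):
        files[index % file_count].append(record)
    return files


if __name__ == "__main__":
    batch = [f"record-{i}" for i in range(10)]
    for number, contents in enumerate(split_into_files(batch, virtual_cores=4)):
        print(number, len(contents))
```

Round-robin keeps the files close to equal in size, so no core is left waiting on one oversized file.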
Amazon Redshift
• Infinite loop
• Create 1 Amazon Kinesis stream with 1 shard
• Attach Lambda function to Amazon Kinesis stream
• Execute workload
• Put record into stream
• Create multiple shards for multiple threads
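The loop described above could be sketched as a Lambda handler that does one unit of work and then puts a marker record back into the stream that triggers it. The factory function, stream name, and marker payload are assumptions for illustration. The put function is injected so the sketch runs without AWS access.

```python
import json
import time
from typing import Any, Callable, Dict


def make_loop_handler(
    run_workload: Callable[[], None],
    put_record: Callable[..., Any],
    stream_name: str,
) -> Callable[[Dict[str, Any], Any], int]:
    """Build a Lambda handler that re-triggers itself through the stream."""

    def handler(event: Dict[str, Any], context: Any) -> int:
        # 1. Execute the workload (hypothetically, one micro-batch load).
        run_workload()
        # 2. Put a marker record into the stream so the function fires again.
        marker = json.dumps({"queued_at": time.time()}).encode("utf-8")
        put_record(StreamName=stream_name, Data=marker, PartitionKey="loop")
        # Report how many stream records triggered this invocation.
        return len(event.get("Records", []))

    return handler


if __name__ == "__main__":
    sent = []
    handler = make_loop_handler(
        lambda: print("workload ran"),
        lambda **kwargs: sent.append(kwargs),
        "example-loop-stream",  # placeholder stream name
    )
    handler({"Records": [{}]}, None)
    print(sent[0]["StreamName"])
```

In a deployed function, `put_record` could be `boto3.client("kinesis").put_record`, which accepts these keyword arguments. With a single shard, one invocation at a time processes the stream, which gives the one-thread behavior the slide relies on.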
Amazon Redshift
• Spin up / spin down
• 2 TB data warehouse
• On Demand - $632.40 / month
• 1 Year No Upfront - $496.00 / month (20% savings)
• 1 Year Partial - $2,500.00, $157 / month (41% savings)
• Total cost -> $365.33
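The $365.33 total appears to be the 1-year partial upfront option with the upfront payment spread over twelve months. A small sketch of that arithmetic, with a hypothetical function name:

```python
def effective_monthly(upfront: float, monthly: float, term_months: int = 12) -> float:
    """Spread an upfront payment over the term and add the recurring monthly fee."""
    return upfront / term_months + monthly


if __name__ == "__main__":
    # 1 Year Partial: $2,500.00 upfront plus $157 per month.
    print(f"${effective_monthly(2500.00, 157.00):.2f}")
```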
Amazon Aurora
• Fully-managed relational database
• MySQL 5.6
• Wire compatible
• InnoDB storage engine
• Up to five times better performance than MySQL
• Over 500,000 SELECTs per second
• 100,000 updates per second
• Multi-AZ
• Data replicated 6 ways across 3 zones
Amazon Aurora or Amazon Redshift?
• Amazon Redshift
• Data warehouse workload
• Data > 64 TB
• 50 concurrent queries
• Amazon Aurora
• OLTP workload
• Data < 64 TB
• 500,000 SELECT / 100,000 UPDATES per second
Amazon Aurora Pricing - Compute
• db.r3.xlarge
• On Demand - $431.52 / month
• 1 Year No Upfront - $277.40 / month (34% savings)
• 1 Year Partial - $1,250.00, $131.40 / month (45% savings)
• Total compute cost -> $235.47
Amazon Aurora Pricing - Storage
• Storage
• $0.10 per GB/month
• $0.20 per 1,000,000 I/O requests
• 1,000,000,000 records
• Compute - $235.47
• 93 GB - $9.30
• 2,000,000,000 / 1,000,000 * $0.20 = $400.00
• Total cost -> $644.77
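The Aurora total combines the compute figure from the previous slide with storage and I/O charges. A sketch of the sum, taking the $235.47 compute cost as given; the function name is illustrative:

```python
STORAGE_PRICE_PER_GB_MONTH = 0.10
IO_PRICE_PER_MILLION = 0.20


def aurora_monthly_cost(compute: float, storage_gb: float, io_requests: int) -> float:
    """Add storage and I/O charges to a given monthly compute cost."""
    storage = storage_gb * STORAGE_PRICE_PER_GB_MONTH
    io = io_requests / 1_000_000 * IO_PRICE_PER_MILLION
    return compute + storage + io


if __name__ == "__main__":
    # Slide figures: $235.47 compute, 93 GB stored, 2,000,000,000 I/O requests.
    print(f"${aurora_monthly_cost(235.47, 93, 2_000_000_000):.2f}")
```

At this volume, I/O requests are the largest line item and exceed the compute cost.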
Zillow Case Study
Zillow
• What is Zillow?
• Zillow is the leading real estate and home-related information
marketplace. Zillow is dedicated to empowering consumers with
data, inspiration and knowledge around the place they call
home.
• Who am I?
• Brian Filppu
• Director, Business Intelligence, Zillow
• I have been at Zillow close to 8 years
• Previous life – Spent about 6 years consulting throughout
North America
Zillow – Use Case
• Needed to collect a subset of mobile app metrics
• Solution needed to be delivered in under 3 weeks
• Requirement to aggregate and report metrics back to
business owners several times during the day
• We already have a number of data warehouse
processes in AWS, so we reached out to Steve, our AWS
solutions architect, for assistance
Zillow – What Did We Create?
• Custom URL endpoint in Amazon API Gateway
• 16,000,000+ POSTs per day – to start
• Data sent from API Gateway to Amazon Kinesis using AWS
Lambda
• Storing data encrypted with AWS KMS in Amazon S3 using
Lambda
• Analyze our data with Spark on Amazon EMR
• Run Spark jobs throughout the day with AWS Data Pipeline
• Have the ability to consume/analyze data in real time on Spark
on Amazon EMR with Amazon Kinesis if the use case arises
Zillow – Architecture Diagram
Zillow – Data Collection Costs
• Using 3 Amazon Kinesis shards costing around $1.30 a
day which includes hourly + put costs.
• On AWS Lambda, we allocated 128 MB of memory per
function call. Lambda runs for under $6 a day.
• Lambda and Amazon Kinesis gave us a cost-effective
solution for storing data with little development time.
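The daily Kinesis figure can be checked against the shard-hour and PUT prices quoted earlier in the deck. The sketch assumes each of the 16,000,000 daily POSTs is billed as one PUT payload unit; the function name is illustrative:

```python
SHARD_HOUR_PRICE = 0.015
PUT_PRICE_PER_MILLION = 0.014


def kinesis_daily_cost(shards: int, puts_per_day: int) -> float:
    """Daily stream cost: 24 shard-hours per shard plus PUT payload charges."""
    shard_cost = shards * SHARD_HOUR_PRICE * 24
    put_cost = puts_per_day / 1_000_000 * PUT_PRICE_PER_MILLION
    return shard_cost + put_cost


if __name__ == "__main__":
    # 3 shards and 16,000,000 POSTs per day, as stated on the slides.
    print(f"${kinesis_daily_cost(3, 16_000_000):.2f}")
```

Under these assumptions the result is about $1.30, in line with the figure on the slide.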
Zillow – Data Analysis
• Use Spark to perform ETL, clean up, and analysis
throughout the day. ETL includes Parquet conversion,
data partitioning, etc.
• Use Presto on Amazon EMR for long-term
querying/analysis of data.
• Data is stored on Amazon S3. For all Amazon EMR
jobs, we use Amazon S3 as our HDFS.
• Currently running jobs 4+ times a day using AWS Data
Pipeline, which launches Spark jobs.
Zillow – What Else Does My Team Run in AWS?
• Use Amazon Redshift for fast access to data
• Big users of Spark and Presto on Amazon EMR, which
includes ETL and ad hoc querying; other use cases
involve long-term historical data not kept in
Amazon Redshift
• Amazon SQS, AWS Data Pipeline, Amazon SNS,
Amazon S3, AWS KMS, Amazon API Gateway,
Amazon EC2
Zillow – We are Hiring
• My team is hiring ETL data engineers and software
developers
• All open positions at Zillow can be found at
http://www.zillow.com/jobs/
Demo
Recap
Related Sessions
• BDT302 - Real-World Smart Applications with Amazon
Machine Learning
• BDT309 - Data Science & Best Practices for Apache
Spark on Amazon EMR
• BDT310 - Big Data Architectural Patterns and Best
Practices on AWS
Remember to complete
your evaluations!
Thank you!
Code used for the demo in this session is
available for download here:
http://abrstevepermalink.s3.amazonaws.com/Demo.zip
Amazon API Gateway Pricing
• $3.50 per 1,000,000 calls
• Data Transfer In - Free
• Data Transfer Out
• $0.09/GB for the first 10 TB
• $0.085/GB for the next 40 TB
• $0.07/GB for the next 100 TB
• $0.05/GB for the next 350 TB
• 1,000,000,000 calls / 1KB payload
• $3,500.00 – Gateway
• $85.83 – Data Transfer Out
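The tiered data transfer pricing above can be worked through in code. A sketch that assumes 1 TB = 1,024 GB and 1 KB = 1,024 bytes, which is the reading that reproduces the $85.83 figure; the function names are illustrative:

```python
GB_PER_TB = 1024
CALL_PRICE_PER_MILLION = 3.50

# (tier size in GB, price per GB), in the order listed on the slide.
TRANSFER_TIERS = [
    (10 * GB_PER_TB, 0.09),
    (40 * GB_PER_TB, 0.085),
    (100 * GB_PER_TB, 0.07),
    (350 * GB_PER_TB, 0.05),
]


def gateway_call_cost(calls: int) -> float:
    """Charge for API calls at $3.50 per million."""
    return calls / 1_000_000 * CALL_PRICE_PER_MILLION


def transfer_out_cost(gigabytes: float) -> float:
    """Walk the tiers, billing each slice of the volume at its tier price."""
    cost = 0.0
    remaining = gigabytes
    for tier_size_gb, price_per_gb in TRANSFER_TIERS:
        billed = min(remaining, tier_size_gb)
        cost += billed * price_per_gb
        remaining -= billed
        if remaining <= 0:
            return cost
    raise ValueError("volume exceeds the tiers listed on the slide")


if __name__ == "__main__":
    calls = 1_000_000_000
    payload_gb = calls * 1024 / 1024**3  # 1 KB response per call
    print(f"${gateway_call_cost(calls):,.2f}")
    print(f"${transfer_out_cost(payload_gb):,.2f}")
```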
AWS Lambda Pricing
• $0.20 per 1,000,000 requests
• First 1,000,000 requests / month – Free
• 1,000,000,000 executions
• (1,000,000,000 – 1,000,000) / 1,000,000 * $0.20 = $199.80
• $0.00001667 per GB-second
• 400,000 GB-seconds – Free
• 1,000,000,000 executions / 0.5 seconds / 128 MB
• 1,000,000,000 * 0.5 * 128 / 1024 = 62,500,000 GB-seconds
• 62,500,000 – 400,000 = 62,100,000
• 62,100,000 * $0.00001667 = $1,035.21
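The same Lambda arithmetic as a small script, so the request and compute charges can be recomputed for other volumes, durations, and memory sizes. Prices and free-tier amounts are the ones on the slide; the function names are illustrative:

```python
REQUEST_PRICE_PER_MILLION = 0.20
FREE_REQUESTS = 1_000_000
PRICE_PER_GB_SECOND = 0.00001667
FREE_GB_SECONDS = 400_000


def lambda_request_cost(executions: int) -> float:
    """Request charge after subtracting the monthly free requests."""
    billable = max(0, executions - FREE_REQUESTS)
    return billable / 1_000_000 * REQUEST_PRICE_PER_MILLION


def lambda_compute_cost(executions: int, seconds: float, memory_mb: int) -> float:
    """Compute charge in GB-seconds after subtracting the free allowance."""
    gb_seconds = executions * seconds * memory_mb / 1024
    billable = max(0.0, gb_seconds - FREE_GB_SECONDS)
    return billable * PRICE_PER_GB_SECOND


if __name__ == "__main__":
    requests = lambda_request_cost(1_000_000_000)
    compute = lambda_compute_cost(1_000_000_000, 0.5, 128)
    print(f"${requests:,.2f} + ${compute:,.2f} = ${requests + compute:,.2f}")
```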
Amazon Kinesis Pricing
• $0.015 per shard hour / $11.16 per month
• 1,000,000,000 / 31 / 86,400 = 373 avg. requests/second
• 3 shards * $11.16 = $33.48
• $0.014 per 1,000,000 PUT payloads (25 KB)
• 1,000,000,000 / 1,000,000 * $0.014 = $14.00
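The monthly Kinesis estimate as a sketch, using a 31-day month and the 3 shards chosen on the slide. It assumes each record fits in one 25 KB PUT payload unit; the function names are illustrative:

```python
SHARD_HOUR_PRICE = 0.015
PUT_PRICE_PER_MILLION = 0.014


def average_requests_per_second(records: int, days: int = 31) -> int:
    """Average request rate over the month, rounded down."""
    return int(records / days / 86_400)


def kinesis_monthly_cost(shards: int, put_payloads: int, days: int = 31) -> float:
    """Shard-hour charge for the month plus the PUT payload charge."""
    shard_cost = shards * SHARD_HOUR_PRICE * 24 * days
    put_cost = put_payloads / 1_000_000 * PUT_PRICE_PER_MILLION
    return shard_cost + put_cost


if __name__ == "__main__":
    print(average_requests_per_second(1_000_000_000))
    print(f"${kinesis_monthly_cost(3, 1_000_000_000):.2f}")
```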
Amazon S3 Pricing
• $0.03 per GB (1st TB)
• 1,000,000,000 * 100 bytes = 93.13 GB = $2.79
• $0.005 per 1,000 PUT requests
• 1,000,000,000 / 5,000 records / 1,000 * $0.005 = $1.00
• $0.004 per 10,000 GET requests
• 1,000,000,000 / 5,000 records / 10,000 * $0.004 = $0.08
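The three S3 line items add up to the $3.87 total shown earlier in the deck. A sketch of the calculation, assuming 100-byte records batched 5,000 per object and 1 GB = 1,024^3 bytes; the function name is illustrative:

```python
STORAGE_PRICE_PER_GB = 0.03
PUT_PRICE_PER_THOUSAND = 0.005
GET_PRICE_PER_TEN_THOUSAND = 0.004


def s3_monthly_cost(
    records: int, record_bytes: int = 100, records_per_object: int = 5_000
) -> float:
    """Storage plus one PUT and one GET per object."""
    storage_gb = records * record_bytes / 1024**3
    objects = records / records_per_object
    storage = storage_gb * STORAGE_PRICE_PER_GB
    puts = objects / 1_000 * PUT_PRICE_PER_THOUSAND
    gets = objects / 10_000 * GET_PRICE_PER_TEN_THOUSAND
    return storage + puts + gets


if __name__ == "__main__":
    print(f"${s3_monthly_cost(1_000_000_000):.2f}")
```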
Amazon SQS Pricing
• $0.50 per 1,000,000 requests
• First 1,000,000 requests free
• 1,000,000,000 / 5,000 records = 200,000 messages
• SendMessage -> 200,000
• ReceiveMessage -> 20,000
• DeleteMessageBatch -> 20,000
• Total -> 240,000 = $0.00
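The SQS request counts above are consistent with one SendMessage per 5,000-record object and receives and deletes issued in batches of 10. The batch size is inferred from the 200,000 to 20,000 ratio rather than stated on the slide. A sketch with an illustrative function name:

```python
import math
from typing import Tuple

PRICE_PER_MILLION = 0.50
FREE_REQUESTS = 1_000_000


def sqs_monthly_cost(
    records: int, records_per_message: int = 5_000, batch_size: int = 10
) -> Tuple[int, float]:
    """Return (total requests, cost) for send, batched receive, and batched delete."""
    sends = records // records_per_message
    receives = math.ceil(sends / batch_size)   # ReceiveMessage, up to 10 per call
    deletes = math.ceil(sends / batch_size)    # DeleteMessageBatch, up to 10 per call
    total = sends + receives + deletes
    billable = max(0, total - FREE_REQUESTS)
    return total, billable / 1_000_000 * PRICE_PER_MILLION


if __name__ == "__main__":
    total, cost = sqs_monthly_cost(1_000_000_000)
    print(total, f"${cost:.2f}")
```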

 
Why Scale Matters and How the Cloud is Really Different (at scale)
Why Scale Matters and How the Cloud is Really Different (at scale)Why Scale Matters and How the Cloud is Really Different (at scale)
Why Scale Matters and How the Cloud is Really Different (at scale)
 
Scaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million UsersScaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million Users
 
Where Is My Data - ILTAM Session
Where Is My Data - ILTAM SessionWhere Is My Data - ILTAM Session
Where Is My Data - ILTAM Session
 
Serverless Real-time Tracking & Analysis
Serverless Real-time Tracking & AnalysisServerless Real-time Tracking & Analysis
Serverless Real-time Tracking & Analysis
 
ENT309 Scaling Up to Your First 10 Million Users
ENT309 Scaling Up to Your First 10 Million UsersENT309 Scaling Up to Your First 10 Million Users
ENT309 Scaling Up to Your First 10 Million Users
 
(ARC302) Running Lean Architectures: Optimizing for Cost Efficiency
(ARC302) Running Lean Architectures: Optimizing for Cost Efficiency(ARC302) Running Lean Architectures: Optimizing for Cost Efficiency
(ARC302) Running Lean Architectures: Optimizing for Cost Efficiency
 
Real-time Analytics with Open-Source
Real-time Analytics with Open-SourceReal-time Analytics with Open-Source
Real-time Analytics with Open-Source
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 

Más de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Último

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Último (20)

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

(BDT307) Zero Infrastructure, Real-Time Data Collection, and Analytics

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Steve Abraham, Solutions Architect - AWS Brian Filppu, Director of Business Intelligence - Zillow October 2015 BDT307 Zero Infrastructure, Real-Time Data Collection, and Analytics
  • 2. Who am I? • Steve Abraham • Solutions Architect – AWS • Previous life • T-Mobile • U.S. State Department • Hasbro • Software company
  • 3. What we’ll cover • Data ingestion pipeline • Collect 1,000,000,000 data points per month • Varied clients • Near real-time access to data • High performance / high availability • Low cost / low maintenance • Case study – Zillow • Brian Filppu – Director of Business Intelligence
  • 4. End State: Amazon Redshift
  • 8. Amazon API Gateway • Create REST-based endpoints • Fully-managed • Scales automatically • Enables rapid development • Flexible security controls
  • 9. Amazon API Gateway • Integration types • Lambda • Proxy AWS service • Proxy existing service • Mock
  • 10. Amazon API Gateway • Deploy to stages • Cross-origin resource sharing (CORS) support • Automatically generates SDK • Android • iOS • JavaScript
  • 11. Amazon API Gateway • $3.50 per 1,000,000 calls • Data transfer in - Free • Data transfer out - $0.05 -> $0.09 per GB • 1,000,000,000 calls • $3,500.00 – Gateway • $0.00 – Data transfer out • Total price - $3,500.00
  • 14. AWS Lambda • Fully-managed server-less compute • Event-driven • Platform • Amazon Linux • Node.js / Java • Configure memory / CPU • Timeout
  • 15. AWS Lambda – Direct Invocation Model • Respond to invocation • Services • Amazon API Gateway • Custom code
  • 16. AWS Lambda – Pull Model • Polls the event source • Services • Amazon Kinesis • Amazon DynamoDB Streams
  • 17. AWS Lambda – Push Model • Respond to a specific event • Services • Amazon S3 • Amazon SNS • Amazon Cognito • Amazon Echo
  • 18. AWS Lambda & Amazon API Gateway • Amazon API Gateway / AWS Lambda • Fast & easy to deploy • Automatic scaling • 100% utilization • 100% managed • Amazon EC2 • Existing infrastructure • High utilization (> 90%)
  • 19. AWS Lambda • $0.20 per 1,000,000 requests • First 1,000,000 requests / month – Free • 1,000,000,000 executions -> $199.80 • $0.00001667 per GB-second • 400,000 GB-seconds – Free • 1,000,000,000 executions • 0.5 seconds / 128 MB -> $1,035.21 • Total price -> $1,235.01 • Proxy price -> $0.00
  • 22. Amazon Kinesis • Fully-managed data aggregator • Terabytes of data per hour • Stream • Replicated across 3 facilities • 24-hour retention • Shard • 1 MB (1,000 PUT) / second – writes • 2 MB (5 operations) / second – reads • One thread
  • 25. Amazon Kinesis Shard Management • Split shard • Add capacity to stream • Merge shard • Reduce cost • Amazon Kinesis scaling utility • Allows for scaling automatically • https://github.com/awslabs/amazon-kinesis-scaling-utils
  • 26. Amazon Kinesis • Amazon API Gateway • REST interface / proxy • Most expensive • Direct to Amazon Kinesis • Amazon Kinesis API • Least expensive
  • 27. Amazon Kinesis • $0.015 per shard hour / $11.16 per month • 1,000,000,000 / 31 / 86,400 = 373 avg. requests/second • 3 shards * $11.16 = $33.48 • $0.014 per 1,000,000 PUT payloads (25 KB) • 1,000,000,000 / 1,000,000 * $0.014 = $14.00 • Total cost -> $47.48
  • 28. Amazon S3 & Amazon SQS
  • 29. Amazon S3 & Amazon SQS
  • 30. Amazon Simple Storage Service • Secure • Encryption in flight - HTTPS • Encryption at rest (Amazon S3 key, client key, AWS KMS) • Durable • Designed for 11 9’s of durability • Scalable • Millions of requests per second • Trillions of objects
  • 31. AWS Key Management Service • Manage encryption keys • Encrypt / decrypt data directly • Directly Integrates with • Amazon S3 • Amazon RDS • Amazon Redshift • AWS Lambda integration • Access via API
  • 32. Amazon Simple Storage Service • Key name distribution • Random values • Lifecycle policy • Delete objects • Move objects to Amazon Glacier • Amazon Glacier • Infrequently accessed data (cold storage) • Low-cost starting at $0.007 per GB • Secure / durable
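One way to read the "Key name distribution / Random values" bullets on slide 32 is to put a short, evenly distributed prefix in front of each object key. The sketch below is only an illustration: it swaps in a hash-derived prefix (deterministic, so the key can be recomputed later) for a truly random value, and the function name, prefix length, and sample key are hypothetical rather than taken from the talk.

```python
import hashlib


def distributed_key(name: str, prefix_len: int = 4) -> str:
    """Prepend a short hash-derived prefix so key names spread evenly.

    Assumption: a SHA-256 hex prefix stands in for the slide's "random values".
    """
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{name}"


# Hypothetical object name, used only for illustration.
print(distributed_key("2015/10/08/events-000001.json.gz"))
```

Because the prefix is derived from the name itself, the same input always maps to the same key.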
  • 33. Amazon Simple Queue Service • Simple • Easy to set up • Secure • Encryption in flight - HTTPS • Durable • Multiple servers / data centers • Scalable • Automatically scales
  • 34. Amazon S3 Pricing • $0.0275 - $0.0408 per GB • Tiered pricing • Varies by region • $0.005 - $0.007 per 1,000 PUT requests • Varies by region • $0.004 - $0.0056 per 10,000 GET requests • Varies by region • Total cost -> $3.87
  • 35. Amazon SQS Pricing • $0.50 per 1,000,000 requests • First 1,000,000 requests free • Total cost -> $0.00
  • 38. Amazon Redshift • Fully-managed, petabyte scale data warehouse • Fast • Columnar storage / data compression • Scalable • Scale up or down • Fault tolerant • Data replicated across nodes / Backed up to Amazon S3 • Familiar • Connect via ODBC / JDBC
  • 39. Amazon Redshift ODBC / JDBC Amazon Redshift cluster
  • 40. Amazon Redshift • COPY command • Amazon Redshift parallelizes the load • Single transaction • Encrypt credentials using AWS KMS • Supports delimited, fixed width, JSON, Avro • Supports GZIP & LZOP
  • 41. Amazon Redshift • Micro-batch loading • Number of files = multiple of virtual cores • Define compression type for each column in table definition • Load data in sort key order • Use SSD node type (dc1 instance types)
  • 42. Amazon Redshift • Infinite loop • Create 1 Amazon Kinesis stream with 1 shard • Attach Lambda function to Amazon Kinesis stream • Execute workload • Put record into stream • Create multiple shards for multiple threads
  • 44. Amazon Redshift • Spin up / spin down • 2 TB data warehouse • On Demand - $632.40 / month • 1 Year No Upfront - $496.00 / month (20% savings) • 1 Year Partial - $2,500.00, $157 / month (41% savings) • Total cost -> $365.33
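The $365.33 total on slide 44 appears to be the 1 Year Partial option with the upfront payment spread over the 12-month term. A minimal sketch of that arithmetic, using the slide's 2015 figures as inputs (the function name is illustrative):

```python
def amortized_monthly(upfront: float, monthly: float, term_months: int = 12) -> float:
    """Spread an upfront reservation fee evenly across the term."""
    return upfront / term_months + monthly


# 1 Year Partial figures as quoted on the slide.
redshift_partial = amortized_monthly(upfront=2500.00, monthly=157.00)
print(f"Amortized monthly cost: ${redshift_partial:,.2f}")
```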
  • 47. Amazon Aurora • Fully-managed relational database • MySQL 5.6 • Wire compatible • InnoDB storage engine • Up to five times better performance than MySQL • Over 500,000 SELECTs per second • 100,000 updates per second • Multi-AZ • Data replicated 6 ways across 3 zones
  • 48. Amazon Aurora or Amazon Redshift? • Amazon Redshift • Data warehouse workload • Data > 64 TB • 50 concurrent queries • Amazon Aurora • OLTP workload • Data < 64 TB • 500,000 SELECT / 100,000 UPDATES per second
  • 49. Amazon Aurora Pricing - Compute • db.r3.xlarge • On Demand - $431.52 / month • 1 Year No Upfront - $277.40 / month (34% savings) • 1 Year Partial - $1,250.00, $131.40 / month (45% savings) • Total compute cost -> $235.47
  • 50. Amazon Aurora Pricing - Storage • Storage • $0.10 per GB/month • $0.20 per 1,000,000 I/O requests • 1,000,000,000 records • Compute - $235.47 • 93 GB - $9.30 • 2,000,000,000 / 1,000,000 * $0.20 = $400.00 • Total cost -> $644.77
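The total on slide 50 can be checked by adding the three components it lists. This sketch takes the slide's prices and its compute figure as given; the 2,000,000,000 I/O requests are the slide's own estimate for 1,000,000,000 records, and the function name is illustrative.

```python
# Prices as quoted on the slide (2015); treat them as historical inputs.
STORAGE_PRICE_PER_GB_MONTH = 0.10
IO_PRICE_PER_MILLION = 0.20


def aurora_monthly_cost(compute: float, storage_gb: float, io_requests: int) -> float:
    """Sum compute, storage, and I/O charges for one month."""
    storage = storage_gb * STORAGE_PRICE_PER_GB_MONTH
    io = io_requests / 1_000_000 * IO_PRICE_PER_MILLION
    return compute + storage + io


aurora_total = aurora_monthly_cost(compute=235.47, storage_gb=93, io_requests=2_000_000_000)
print(f"Total cost: ${aurora_total:,.2f}")
```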
  • 52. Zillow • What is Zillow? • Zillow is the leading real estate and home-related information marketplace. Zillow is dedicated to empowering consumers with data, inspiration and knowledge around the place they call home. • Who am I? • Brian Filppu • Director, Business Intelligence, Zillow • I have been at Zillow close to 8 years • Previous life – Spent about 6 years consulting throughout North America
  • 53. Zillow – Use Case • Needed to collect a subset of mobile app metrics • Solution needed to be delivered in under 3 weeks • Requirement to aggregate and report metrics back to business owners several times during the day • We already have a number of data warehouse processes in AWS, so we reached out to Steve, our AWS solutions architect, for assistance
  • 54. Zillow – What Did We Create? • Custom URL endpoint in Amazon API Gateway • 16,000,000+ POSTs per day – to start • Data sent from API Gateway to Amazon Kinesis using AWS Lambda • Storing data encrypted with AWS KMS in Amazon S3 using Lambda • Analyze our data with Spark on Amazon EMR • Run Spark jobs throughout the day with AWS Data Pipeline • Have the ability to consume/analyze data in real time with Spark on Amazon EMR and Amazon Kinesis if the use case arises
  • 56. Zillow – Data Collection Costs • Using 3 Amazon Kinesis shards costing around $1.30 a day, which includes hourly + put costs. • On AWS Lambda, we allocated 128 MB of memory per function call. Lambda runs for under $6 a day. • Lambda and Amazon Kinesis gave us a cost-effective solution for storing data with little development time.
  • 57. Zillow – Data Analysis • Use Spark to perform ETL, clean up, and analysis throughout the day. ETL includes Parquet conversion, data partitioning, etc. • Use Presto on Amazon EMR for long-term querying/analysis of data. • Data is stored on Amazon S3. For all Amazon EMR jobs, we use Amazon S3 as our HDFS. • Currently running jobs 4+ times a day using AWS Data Pipeline, which launches Spark jobs.
  • 58. Zillow – What Else Does My Team Run in AWS? • Use Amazon Redshift for fast access to data • Big users of Spark and Presto on Amazon EMR, which includes ETL and ad hoc querying, other use cases involve long term historical data not kept in Amazon Redshift • Amazon SQS, AWS Data Pipeline, Amazon SNS, Amazon S3, AWS KMS, Amazon API Gateway, Amazon EC2
  • 59. Zillow – We are Hiring • My team is hiring ETL data engineers and software developers • All open positions at Zillow can be found at http://www.zillow.com/jobs/
  • 60. Demo
  • 61. Recap
  • 62. Related Sessions • BDT302 - Real-World Smart Applications with Amazon Machine Learning • BDT309 - Data Science & Best Practices for Apache Spark on Amazon EMR • BDT310 - Big Data Architectural Patterns and Best Practices on AWS
  • 65. Code used for the demo in this session is available for download here: http://abrstevepermalink.s3.amazonaws.com/Demo.zip
  • 66. Amazon API Gateway Pricing • $3.50 per 1,000,000 calls • Data Transfer In - Free • Data Transfer Out • $0.09/GB for the first 10 TB • $0.085/GB for the next 40 TB • $0.07/GB for the next 100 TB • $0.05/GB for the next 350 TB • 1,000,000,000 calls / 1KB payload • $3,500.00 – Gateway • $85.83 – Data Transfer Out
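The two figures on slide 66 can be reproduced with a small tiered-pricing calculation. This is a sketch under stated assumptions: prices are the 2015 values from the slide, 1 TB is taken as 1,024 GB, 1 KB as 1,024 bytes, and the function names are illustrative.

```python
# Prices as quoted on the slide (October 2015); treat them as historical inputs.
PRICE_PER_MILLION_CALLS = 3.50

# (tier size in GB, price per GB). Assumption: 1 TB = 1,024 GB.
TRANSFER_OUT_TIERS = [
    (10 * 1024, 0.09),
    (40 * 1024, 0.085),
    (100 * 1024, 0.07),
    (350 * 1024, 0.05),
]


def request_cost(calls: int) -> float:
    """Charge for API calls at the per-million rate."""
    return calls / 1_000_000 * PRICE_PER_MILLION_CALLS


def transfer_out_cost(gigabytes: float) -> float:
    """Walk the tiers in order, charging each slice at its own rate."""
    cost = 0.0
    remaining = gigabytes
    for tier_size, price in TRANSFER_OUT_TIERS:
        used = min(remaining, tier_size)
        cost += used * price
        remaining -= used
        if remaining <= 0:
            return cost
    raise ValueError("volume exceeds the tiers listed on the slide")


calls = 1_000_000_000
payload_gb = calls * 1 / (1024 * 1024)  # 1 KB per call, converted to GB

print(f"Gateway:           ${request_cost(calls):,.2f}")
print(f"Data transfer out: ${transfer_out_cost(payload_gb):,.2f}")
```

With a 1 KB payload the whole volume stays inside the first 10 TB tier, which is why the slide's $85.83 is simply about 953.67 GB at $0.09.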
  • 67. AWS Lambda Pricing • $0.20 per 1,000,000 requests • First 1,000,000 requests / month – Free • 1,000,000,000 executions • (1,000,000,000 – 1,000,000) / 1,000,000 * $0.20 = $199.80 • $0.00001667 per GB-second • 400,000 GB-seconds – Free • 1,000,000,000 executions / 0.5 seconds / 128 MB • 1,000,000,000 * 0.5 * 128 / 1024 = 62,500,000 GB-Sec • 62,500,000 – 400,000 = 62,100,000 • 62,100,000 * $0.00001667 = $1,035.21
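The Lambda arithmetic on slide 67 can be written as one short function. This is a sketch that uses the slide's 2015 prices and free-tier allowances as constants and ignores any billing-duration rounding; the function name is illustrative.

```python
# Prices and free tier as quoted on the slide (2015); historical inputs.
PRICE_PER_MILLION_REQUESTS = 0.20
FREE_REQUESTS = 1_000_000
PRICE_PER_GB_SECOND = 0.00001667
FREE_GB_SECONDS = 400_000


def lambda_monthly_cost(executions: int, seconds: float, memory_mb: int):
    """Return (request charge, compute charge) for one month."""
    billable_requests = max(executions - FREE_REQUESTS, 0)
    request_charge = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    gb_seconds = executions * seconds * memory_mb / 1024
    billable_gb_seconds = max(gb_seconds - FREE_GB_SECONDS, 0)
    compute_charge = billable_gb_seconds * PRICE_PER_GB_SECOND
    return request_charge, compute_charge


requests_usd, compute_usd = lambda_monthly_cost(1_000_000_000, 0.5, 128)
print(f"Requests: ${requests_usd:,.2f}")
print(f"Compute:  ${compute_usd:,.2f}")
print(f"Total:    ${requests_usd + compute_usd:,.2f}")
```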
  • 68. Amazon Kinesis Pricing • $0.015 per shard hour / $11.16 per month • 1,000,000,000 / 31 / 86,400 = 373 avg. requests/second • 3 shards * $11.16 = $33.48 • $0.014 per 1,000,000 PUT payloads (25 KB) • 1,000,000,000 / 1,000,000 * $0.014 = $14.00
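The Kinesis numbers on slide 68 follow from a 31-day month and the three shards the speakers chose. A minimal sketch with the slide's 2015 prices as constants (variable names are illustrative):

```python
# Prices as quoted on the slide (2015); historical inputs.
PRICE_PER_SHARD_HOUR = 0.015
PRICE_PER_MILLION_PUTS = 0.014
DAYS_IN_MONTH = 31  # the slide's assumption

records = 1_000_000_000
shards = 3  # shard count used on the slide

avg_requests_per_second = records / DAYS_IN_MONTH / 86_400
shard_month_usd = PRICE_PER_SHARD_HOUR * 24 * DAYS_IN_MONTH
shard_usd = shards * shard_month_usd
put_usd = records / 1_000_000 * PRICE_PER_MILLION_PUTS
kinesis_total = shard_usd + put_usd

print(f"Average load: {avg_requests_per_second:.0f} requests/second")
print(f"Shards:       ${shard_usd:,.2f}")
print(f"PUT payloads: ${put_usd:,.2f}")
print(f"Total:        ${kinesis_total:,.2f}")
```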
  • 69. Amazon S3 Pricing • $0.03 per GB (1st TB) • 1,000,000,000 * 100 bytes = 93.13 GB = $2.79 • $0.005 per 1,000 PUT requests • 1,000,000,000 / 5,000 records / 1,000 * $0.005 = $1.00 • $0.004 per 10,000 GET requests • 1,000,000,000 / 5,000 records / 10,000 * $0.004 = $0.08
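The S3 figures on slide 69 combine storage with one PUT and one GET per 5,000-record object. The sketch below reproduces them with the slide's 2015 prices; the 100-byte record size and 5,000-record batch come from the slide, and 1 GB is taken as 1,024³ bytes.

```python
# Prices as quoted on the slide (2015); historical inputs.
STORAGE_PRICE_PER_GB = 0.03
PUT_PRICE_PER_1000 = 0.005
GET_PRICE_PER_10000 = 0.004

records = 1_000_000_000
record_bytes = 100
records_per_object = 5_000

storage_gb = records * record_bytes / 1024**3
objects = records / records_per_object

storage_usd = storage_gb * STORAGE_PRICE_PER_GB
put_usd = objects / 1_000 * PUT_PRICE_PER_1000
get_usd = objects / 10_000 * GET_PRICE_PER_10000
s3_total = storage_usd + put_usd + get_usd

print(f"Storage: {storage_gb:.2f} GB -> ${storage_usd:,.2f}")
print(f"PUT:     ${put_usd:,.2f}")
print(f"GET:     ${get_usd:,.2f}")
print(f"Total:   ${s3_total:,.2f}")
```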
  • 70. Amazon SQS Pricing • $0.50 per 1,000,000 requests • First 1,000,000 requests free • 1,000,000,000 / 5,000 records = 200,000 messages • SendMessage -> 200,000 • ReceiveMessage -> 20,000 • DeleteMessageBatch -> 20,000 • Total -> 240,000 = $0.00
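The request counts on slide 70 are consistent with one SendMessage per 5,000-record object and with receives and deletes handled ten messages at a time. That batch size of 10 is inferred from the slide's 20,000 figures rather than stated on it. A minimal sketch with the 2015 price and free tier as constants:

```python
# Price and free tier as quoted on the slide (2015); historical inputs.
PRICE_PER_MILLION_REQUESTS = 0.50
FREE_REQUESTS = 1_000_000
BATCH_SIZE = 10  # assumption inferred from the slide's 20,000 figures

records = 1_000_000_000
records_per_message = 5_000

send_requests = records // records_per_message
receive_requests = send_requests // BATCH_SIZE
delete_batch_requests = send_requests // BATCH_SIZE
total_requests = send_requests + receive_requests + delete_batch_requests

billable = max(total_requests - FREE_REQUESTS, 0)
sqs_usd = billable / 1_000_000 * PRICE_PER_MILLION_REQUESTS

print(f"Total requests: {total_requests:,}")
print(f"Cost:           ${sqs_usd:,.2f}")
```

At this volume the whole workload fits inside the free tier, which matches the slide's $0.00.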