SlideShare una empresa de Scribd logo
1 de 60
Descargar para leer sin conexión
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
David Stein, Business Development EBS
November 30, 2016
Case Study: How Zendesk and
Videology Modernized Their Big
Data Platforms on Amazon EBS
STG311
What to Expect from the Session
• How to architect big data processing platforms to scale to meet
growing demand while improving performance, availability, and cost
with Amazon EBS
• Learn how about new ST1 and SC1 Throughput Optimized EBS
volumes designed for big data workloads
• Overview of how Zendesk runs a large ELK (Elasticsearch,
Logstach, Kibana) on Amazon EC2 and EBS for their cloud-based
customer support platform
• Overview of how Videology runs a Hadoop architecture on EC2 and
EBS to ingest, process, and analyze logs for their converged
advertising solution
Amazon EFS
File
Amazon EBS
Amazon EC2
Instance Store
Block
Amazon S3 Amazon Glacier
Object
AWS storage is a platform
Data Transfer
AWS Direct
Connect
ISV
Connectors
Amazon
Kinesis
Firehose
AWS Storage
Gateway
Amazon S3
Transfer
Acceleration
AWS
Snowball
Amazon
CloudFront
Internet/VPN
EBS volume types
Hard disk drive
(HDD)
Solid state drive
(SSD)
EBS volume types
General Purpose
SSD
gp2
Provisioned IOPS
SSD
io1
Throughput Optimized
HDD
st1
Cold
HDD
sc1
SSD HDD
EBS volume types: throughput
Throughput
Optimized HDD
st1
Baseline: 40 MB/s per TB up to 500 MB/s
Capacity: 500 GB to 16 TB
Burst: 250 MB/s per TB up to 500 MB/s
Ideal for large-block, high-throughput sequential workloads
Cold HDD
sc1
EBS volume types: throughput
Baseline: 12 MB/s per TB up to 192 MB/s
Capacity: 500 GB to 16 TB
Burst: 80 MB/s per TB up to 250 MB/s
Ideal for sequential throughput workloads such as logging and backup
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Kyle House, David Bernstein, Zendesk
November 30, 2016
Case Study: How Zendesk Modernized
Their Big Data Platforms on Amazon EBS
Inside Our New ELK Deployment
Zendesk builds software for better customer relationships. It empowers
organizations to improve customer engagement and better understand
their customers. More than 87,000 paid customer accounts in over 150
countries and territories use Zendesk products. Based in San Francisco.
What to Expect from the Session
• Discuss storage redesign, utilizing
new Amazon EBS volumes
• Talk through design choices
• Explain benefits of new storage
• model
• Cost benefits of “rightsizing” storage
ELK at Zendesk
Distributed database
Log ingestion/parsing
Beautiful visualizations
The Problem
- Operational headaches
- Encryption
- Data retention
- Cost too high
The Investigation
- User access patterns
- Performance requirements
- New EBS volume types
The Proposal
- Full usage of EBS with new volume types
- Create a tiered storage model
- Optimize instance types; decouple instances from
storage
Tiered storage
Hot (0-7 days)
General Purpose
SSD (gp2)
Warm (8-30 days)
Throughput
Optimized HDD (st1)
Cold (31-60 days) Cold HDD (sc1)
Topology
VPN
gateway
3 x m4.large
esclient/esmaster
Proxy
Bastion
3 x m4.large
esclient/esmaster
gp2 roots
8 x c4.large
logindexers
8 x c4.large
logindexers
gp2 roots
gp2 roots
gp2 roots
gp2 roots +
11G (hot)
st1
35G (warm)
sc1
80g (cold)
10 x r3.2large
esdata
10 x r3.2large
esdata
gp2 roots +
11G (hot)
st1
35G (warm)
sc1
80g (cold)
Availability Zone
Availability Zone
Sparkleformation
The Result
- Reduced operating costs by 50%
- Increased data retention 3x
- Predictable scaling model
• Storage allocation detached from instance count
- Increased data transport reliability
- Reduced operational overhead
- Increased cluster stability
49% Reduction 79% Reduction
Recommendations
- Identify data usage model before you build
- Find places where performance matters, and where
cost can be optimized
- Reduce over-provisioned storage/IOPS
- Utilize AWS managed services whenever possible
Thank you!
Up next in this session:
Videology
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Videology
Paul Frederiksen – Principal DevOps Engineer
David Ortiz – Senior Software Engineer
Videology Big Data Team
November 30, 2016
On the Rocky Road to EBS
Videology’s Journey to EBS-backed Big Data
What to Expect from the Session
• Intro to Videology
• Challenges
• Road to EBS-backed cluster
• Happy engineers
Videology overview
Founded:
2007 by Scott Ferber, co-founder of Advertising.com, which sold to
AOL Time Warner in 2004 for $497 Million
Corporate
Headquarters:
New York, NY
Operations:
• Operating in 28 Global Markets
• Key Offices – New York, Baltimore, Toronto, London, Singapore
& Sydney
Employees: Approximately 380
Investors
NEA, Comcast Ventures, Harbourvest, Catalyst Investors,
Pinnacle Ventures, Valhalla Venture
Customers:
4,500 Active Users including Brand marketers, agencies, trading
desks, media companies, MVPD’s
Ecosystem
Integrations:
Open platform with 2200+ ecosystem integrations, including 1000+
media companies, 40 data providers, all major 3rd party
verification providers, and dozens of technology partners across
the media ecosystem
Recent Client
Wins:
Videology provides a
converged advertising solution
that is screen-agnostic,
ensuring unduplicated reach
with the right frequency
cadence to achieve
guaranteed results.
45
Industry accolades…
Videology was named Best Digital Video Ad Platform by Cynopsis Media at their
2015 Model D Awards.
“ ”
Videology was able to show that their platform drove brand lift that was on average 6X
higher than Nielsen's norms.
“ ”
Videology has the most sophisticated media optimizer to analyze the right
allocation of TV and online video to optimize reach and campaign cost.
“ ”
Hadoop overview
NameNode
ResourceManager
Gateway
DataNode
NodeManager
Where does big data processing fit in?
Original production
Instance
Type
Qty Role vCPU RAM
(GB)
Storage
(GB)
m3.xlarge 1 Jumpbox 4 15 80
m3.xlarge 1 Cloudera
Manager
4 15 80
m3.2xlarge 2 NN/RM 8 30 160
cc2.8xlarge 1 Service Master 32 60 3,200
cc2.8xlarge 30 Worker 32 60 3,200
I’ve got 99 problems and Hadoop is a few of them
Reliability
Scalability
Distcp
CPU to Memory Ratio
2015
Q2
2016
Q3
2016
Q4 and
beyond
Engaged Cloudera
for EBS support
Gave up on EBS
and tested D2s
New EBS to
the rescue!
Take advantage of
new hardware
CC2.8XL M4.10XLD2.8XL
Old
Not enough disk
Expensive
NirvanaLots of disk!
Not enough memory
Expensive
D2.8xl prototype
Instance
Type
Qty Role vCPU RAM
(GB)
Storage
(GB)
r3.large 1 Jumpbox 2 15.25 32
r3.large 1 Cloudera
Manager
2 15.25 32
r3.xlarge 2 NN/RM 8 30 160
r3.2xlarge 2 Service Master 8 61 160
d2.8xl 10 Worker 36 244 48,000
M4.10xlarge w/ sc1 prototype
Instance
Type
Qty Role vCPU RAM
(GB)
Storage
(GB)
r3.large 1 Jumpbox 2 15.25 32
r3.large 1 Cloudera
Manager
2 15.25 32
r3.xlarge 2 NN/RM 8 30 160
r3.2xlarge 2 Service Master 8 61 160
m4.10xlarge 18 Worker 40 160 4,000
M4.10xlarge w/ st1 prototype
Instance
Type
Qty Role vCPU RAM
(GB)
Storage
(GB)
r3.large 1 Jumpbox 2 15.25 32
r3.large 1 Cloudera Manager 2 15.25 32
r3.xlarge 2 NN/RM 8 30 160
r3.2xlarge 2 Service Master 8 61 160
m4.10xlarg
e
18 Worker 40 160 8,000
Problems no more!
• No more rebuilding Nodes
• 1 critical incident since switch vs. 5 in the year prior to release
• Get to play with kids instead of babysitting cluster
Engineering benefits - capacity
No longer restricted by
memory, we now have
resources to pursue other
tools to improve our reliability
and speed:
• Spark
• HBase
• Flafka
• Offloading processing from
Amazon Redshift to CDH
More resilient to log volume
increases
Can expand storage as
requirements changes
Financial benefits
$0.00
$5,000.00
$10,000.00
$15,000.00
$20,000.00
$25,000.00
$30,000.00
Total Cost Cost by Utilization
Cc2 M4
$0.00
$0.01
$0.02
$0.03
$0.04
$0.05
$0.06
$0.07
$0.08
$0.09
$0.10
Cost to Process 1000 Requests
Cc2 M4
Thank you!
Questions?
Remember to complete
your evaluations!

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

AWS re:Invent 2016: How Aptean uses AWS Marketplace storage solutions to back...
AWS re:Invent 2016: How Aptean uses AWS Marketplace storage solutions to back...AWS re:Invent 2016: How Aptean uses AWS Marketplace storage solutions to back...
AWS re:Invent 2016: How Aptean uses AWS Marketplace storage solutions to back...
 
ENT309 Scaling Up to Your First 10 Million Users
ENT309 Scaling Up to Your First 10 Million UsersENT309 Scaling Up to Your First 10 Million Users
ENT309 Scaling Up to Your First 10 Million Users
 
AWS re:Invent 2016: How Netflix Achieves Email Delivery at Global Scale with ...
AWS re:Invent 2016: How Netflix Achieves Email Delivery at Global Scale with ...AWS re:Invent 2016: How Netflix Achieves Email Delivery at Global Scale with ...
AWS re:Invent 2016: How Netflix Achieves Email Delivery at Global Scale with ...
 
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
 
AWS re:Invent 2016: Deep Dive on AWS Cloud Data Migration Services (ENT210)
AWS re:Invent 2016: Deep Dive on AWS Cloud Data Migration Services (ENT210)AWS re:Invent 2016: Deep Dive on AWS Cloud Data Migration Services (ENT210)
AWS re:Invent 2016: Deep Dive on AWS Cloud Data Migration Services (ENT210)
 
AWS re:Invent 2016: Three Customer Viewpoints: Private Equity, Managed Servic...
AWS re:Invent 2016: Three Customer Viewpoints: Private Equity, Managed Servic...AWS re:Invent 2016: Three Customer Viewpoints: Private Equity, Managed Servic...
AWS re:Invent 2016: Three Customer Viewpoints: Private Equity, Managed Servic...
 
Data Storage for the Long Haul: Compliance and Archive
Data Storage for the Long Haul: Compliance and ArchiveData Storage for the Long Haul: Compliance and Archive
Data Storage for the Long Haul: Compliance and Archive
 
AWS re:Invent 2016: Case Study: How Spokeo Improved Web Application Response ...
AWS re:Invent 2016: Case Study: How Spokeo Improved Web Application Response ...AWS re:Invent 2016: Case Study: How Spokeo Improved Web Application Response ...
AWS re:Invent 2016: Case Study: How Spokeo Improved Web Application Response ...
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWS
 
Deep Dive on Elastic File System - February 2017 AWS Online Tech Talks
Deep Dive on Elastic File System - February 2017 AWS Online Tech TalksDeep Dive on Elastic File System - February 2017 AWS Online Tech Talks
Deep Dive on Elastic File System - February 2017 AWS Online Tech Talks
 
Deep Dive on Amazon EFS | AWS Public Sector Summit 2017
Deep Dive on Amazon EFS | AWS Public Sector Summit 2017Deep Dive on Amazon EFS | AWS Public Sector Summit 2017
Deep Dive on Amazon EFS | AWS Public Sector Summit 2017
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Introduction to Amazon CloudFront - Pop-up Loft Tel Aviv
Introduction to Amazon CloudFront - Pop-up Loft Tel AvivIntroduction to Amazon CloudFront - Pop-up Loft Tel Aviv
Introduction to Amazon CloudFront - Pop-up Loft Tel Aviv
 
AWS Services for Content Production
AWS Services for Content ProductionAWS Services for Content Production
AWS Services for Content Production
 
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...
AWS re:Invent 2016: How to Launch a 100K-User Corporate Back Office with Micr...
 
SRV422 Deep Dive on AWS Database Migration Service
SRV422 Deep Dive on AWS Database Migration ServiceSRV422 Deep Dive on AWS Database Migration Service
SRV422 Deep Dive on AWS Database Migration Service
 
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
 
SRV401 Deep Dive on Amazon Elastic File System (Amazon EFS)
SRV401 Deep Dive on Amazon Elastic File System (Amazon EFS)SRV401 Deep Dive on Amazon Elastic File System (Amazon EFS)
SRV401 Deep Dive on Amazon Elastic File System (Amazon EFS)
 
ENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the CloudENT306 Migrating large Scale Data Sets to the Cloud
ENT306 Migrating large Scale Data Sets to the Cloud
 
SRV301 Getting the most Bang for your buck with #EC2 #Winning
SRV301 Getting the most Bang for your buck with #EC2 #WinningSRV301 Getting the most Bang for your buck with #EC2 #Winning
SRV301 Getting the most Bang for your buck with #EC2 #Winning
 

Destacado

Destacado (20)

AWS re:Invent 2016: Workshop: Converting Your Oracle or Microsoft SQL Server ...
AWS re:Invent 2016: Workshop: Converting Your Oracle or Microsoft SQL Server ...AWS re:Invent 2016: Workshop: Converting Your Oracle or Microsoft SQL Server ...
AWS re:Invent 2016: Workshop: Converting Your Oracle or Microsoft SQL Server ...
 
AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)
AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)
AWS re:Invent 2016: Migrating Your Data Warehouse to Amazon Redshift (DAT202)
 
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
AWS re:Invent 2016: Introduction to Managed Database Services on AWS (DAT307)
 
AWS re:Invent 2016: Case Study: How Atlassian Uses Amazon EFS with JIRA to Cu...
AWS re:Invent 2016: Case Study: How Atlassian Uses Amazon EFS with JIRA to Cu...AWS re:Invent 2016: Case Study: How Atlassian Uses Amazon EFS with JIRA to Cu...
AWS re:Invent 2016: Case Study: How Atlassian Uses Amazon EFS with JIRA to Cu...
 
[Team MIAS] Initial Presentation
[Team MIAS] Initial Presentation[Team MIAS] Initial Presentation
[Team MIAS] Initial Presentation
 
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
AWS re:Invent 2016: Disrupting Big Data with Cost-effective Compute (CMP302)
 
BIg Data Trends in 2016
BIg Data Trends in 2016BIg Data Trends in 2016
BIg Data Trends in 2016
 
1DMP: Marketing Data Platform - the future of data-driven marketing
1DMP: Marketing Data Platform - the future of data-driven marketing1DMP: Marketing Data Platform - the future of data-driven marketing
1DMP: Marketing Data Platform - the future of data-driven marketing
 
AWS re:Invent 2016: Workshop: Stretching Scalability: Doing more with Amazon ...
AWS re:Invent 2016: Workshop: Stretching Scalability: Doing more with Amazon ...AWS re:Invent 2016: Workshop: Stretching Scalability: Doing more with Amazon ...
AWS re:Invent 2016: Workshop: Stretching Scalability: Doing more with Amazon ...
 
Internet of Things and Big Data
Internet of Things and Big DataInternet of Things and Big Data
Internet of Things and Big Data
 
Thank Bunny - Customer Engagement Platform
Thank Bunny - Customer Engagement PlatformThank Bunny - Customer Engagement Platform
Thank Bunny - Customer Engagement Platform
 
The Big Data Ecosystem for Financial Services
The Big Data Ecosystem for Financial ServicesThe Big Data Ecosystem for Financial Services
The Big Data Ecosystem for Financial Services
 
Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics Building a New Platform for Customer Analytics
Building a New Platform for Customer Analytics
 
Cloud transition - The Trivadis approach
Cloud transition - The Trivadis approachCloud transition - The Trivadis approach
Cloud transition - The Trivadis approach
 
You're the New CDO, Now What?
You're the New CDO, Now What?You're the New CDO, Now What?
You're the New CDO, Now What?
 
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
 
The Art Of Net Promoter Score
The Art Of Net Promoter ScoreThe Art Of Net Promoter Score
The Art Of Net Promoter Score
 
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)
AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)
 
Galvanize Data Science Open House
Galvanize Data Science Open HouseGalvanize Data Science Open House
Galvanize Data Science Open House
 
Data Driven-Toyota Customer 360 Insights on Apache Spark and MLlib-(Brian Kur...
Data Driven-Toyota Customer 360 Insights on Apache Spark and MLlib-(Brian Kur...Data Driven-Toyota Customer 360 Insights on Apache Spark and MLlib-(Brian Kur...
Data Driven-Toyota Customer 360 Insights on Apache Spark and MLlib-(Brian Kur...
 

Similar a AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

AWS Presentation at JasperWorld APAC
AWS Presentation at JasperWorld APACAWS Presentation at JasperWorld APAC
AWS Presentation at JasperWorld APAC
Amazon Web Services
 

Similar a AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311) (20)

Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
 
Speed up Digital Transformation with Openstack Cloud & Software Defined Storage
Speed up Digital Transformation with Openstack Cloud & Software Defined StorageSpeed up Digital Transformation with Openstack Cloud & Software Defined Storage
Speed up Digital Transformation with Openstack Cloud & Software Defined Storage
 
CMP217_Scale In-Memory Workloads on Amazon EC2 X1 and X1e Instances with up t...
CMP217_Scale In-Memory Workloads on Amazon EC2 X1 and X1e Instances with up t...CMP217_Scale In-Memory Workloads on Amazon EC2 X1 and X1e Instances with up t...
CMP217_Scale In-Memory Workloads on Amazon EC2 X1 and X1e Instances with up t...
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
Re invent announcements_2016_hcls_use_cases_mchampion
Re invent announcements_2016_hcls_use_cases_mchampionRe invent announcements_2016_hcls_use_cases_mchampion
Re invent announcements_2016_hcls_use_cases_mchampion
 
Running Business Critical Workloads on AWS
Running Business Critical Workloads on AWS Running Business Critical Workloads on AWS
Running Business Critical Workloads on AWS
 
AWS Partner Webcast - Reporting and Analytics in the Cloud
AWS Partner Webcast - Reporting and Analytics in the CloudAWS Partner Webcast - Reporting and Analytics in the Cloud
AWS Partner Webcast - Reporting and Analytics in the Cloud
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 
Introduction on Amazon EC2
Introduction on Amazon EC2Introduction on Amazon EC2
Introduction on Amazon EC2
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
Building Complex Workloads in Cloud - AWS PS Summit Canberra
Building Complex Workloads in Cloud - AWS PS Summit CanberraBuilding Complex Workloads in Cloud - AWS PS Summit Canberra
Building Complex Workloads in Cloud - AWS PS Summit Canberra
 
Workshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECSWorkshop: Deploy a Deep Learning Framework on Amazon ECS
Workshop: Deploy a Deep Learning Framework on Amazon ECS
 
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
Red Hat Storage Day LA - Performance and Sizing Software Defined Storage
 
Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
 
How to run your Hadoop Cluster in 10 minutes
How to run your Hadoop Cluster in 10 minutesHow to run your Hadoop Cluster in 10 minutes
How to run your Hadoop Cluster in 10 minutes
 
2017 AWS DB Day | AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?
2017 AWS DB Day |  AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?2017 AWS DB Day |  AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?
2017 AWS DB Day | AWS 데이터베이스 개요 - 나의 업무에 적합한 데이터베이스는?
 
AWS Presentation at JasperWorld APAC
AWS Presentation at JasperWorld APACAWS Presentation at JasperWorld APAC
AWS Presentation at JasperWorld APAC
 
Design, Deploy, and Optimize Microsoft SQL Server on AWS (WIN324-R1) - AWS re...
Design, Deploy, and Optimize Microsoft SQL Server on AWS (WIN324-R1) - AWS re...Design, Deploy, and Optimize Microsoft SQL Server on AWS (WIN324-R1) - AWS re...
Design, Deploy, and Optimize Microsoft SQL Server on AWS (WIN324-R1) - AWS re...
 
AWS Summit Berlin 2013 - Big Data Analytics
AWS Summit Berlin 2013 - Big Data AnalyticsAWS Summit Berlin 2013 - Big Data Analytics
AWS Summit Berlin 2013 - Big Data Analytics
 

Más de Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

AWS re:Invent 2016: Case Study: How Videology and Zendesk Modernized Their Big Data Platforms on Amazon EBS (STG311)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. David Stein, Business Development EBS November 30, 2016 Case Study: How Zendesk and Videology Modernized Their Big Data Platforms on Amazon EBS STG311
  • 2. What to Expect from the Session • How to architect big data processing platforms to scale to meet growing demand while improving performance, availability, and cost with Amazon EBS • Learn how about new ST1 and SC1 Throughput Optimized EBS volumes designed for big data workloads • Overview of how Zendesk runs a large ELK (Elasticsearch, Logstach, Kibana) on Amazon EC2 and EBS for their cloud-based customer support platform • Overview of how Videology runs a Hadoop architecture on EC2 and EBS to ingest, process, and analyze logs for their converged advertising solution
  • 3. Amazon EFS File Amazon EBS Amazon EC2 Instance Store Block Amazon S3 Amazon Glacier Object AWS storage is a platform Data Transfer AWS Direct Connect ISV Connectors Amazon Kinesis Firehose AWS Storage Gateway Amazon S3 Transfer Acceleration AWS Snowball Amazon CloudFront Internet/VPN
  • 4. EBS volume types Hard disk drive (HDD) Solid state drive (SSD)
  • 5. EBS volume types General Purpose SSD gp2 Provisioned IOPS SSD io1 Throughput Optimized HDD st1 Cold HDD sc1 SSD HDD
  • 6. EBS volume types: throughput Throughput Optimized HDD st1 Baseline: 40 MB/s per TB up to 500 MB/s Capacity: 500 GB to 16 TB Burst: 250 MB/s per TB up to 500 MB/s Ideal for large-block, high-throughput sequential workloads
  • 7. Cold HDD sc1 EBS volume types: throughput Baseline: 12 MB/s per TB up to 192 MB/s Capacity: 500 GB to 16 TB Burst: 80 MB/s per TB up to 250 MB/s Ideal for sequential throughput workloads such as logging and backup
  • 8. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Kyle House, David Bernstein, Zendesk November 30, 2016 Case Study: How Zendesk Modernized Their Big Data Platforms on Amazon EBS Inside Our New ELK Deployment
  • 9. Zendesk builds software for better customer relationships. It empowers organizations to improve customer engagement and better understand their customers. More than 87,000 paid customer accounts in over 150 countries and territories use Zendesk products. Based in San Francisco.
  • 10. What to Expect from the Session • Discuss storage redesign, utilizing new Amazon EBS volumes • Talk through design choices • Explain benefits of new storage • model • Cost benefits of “rightsizing” storage
  • 11. ELK at Zendesk Distributed database Log ingestion/parsing Beautiful visualizations
  • 12.
  • 13.
  • 14.
  • 15. The Problem - Operational headaches - Encryption - Data retention - Cost too high
  • 16.
  • 17.
  • 18. The Investigation - User access patterns - Performance requirements - New EBS volume types
  • 19.
  • 20.
  • 21. The Proposal - Full usage of EBS with new volume types - Create a tiered storage model - Optimize instance types; decouple instances from storage
  • 22. Tiered storage Hot (0-7 days) General Purpose SSD (gp2) Warm (8-30 days) Throughput Optimized HDD (st1) Cold (31-60 days) Cold HDD (sc1)
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32. Topology VPN gateway 3 x m4.large esclient/esmaster Proxy Bastion 3 x m4.large esclient/esmaster gp2 roots 8 x c4.large logindexers 8 x c4.large logindexers gp2 roots gp2 roots gp2 roots gp2 roots + 11G (hot) st1 35G (warm) sc1 80g (cold) 10 x r3.2large esdata 10 x r3.2large esdata gp2 roots + 11G (hot) st1 35G (warm) sc1 80g (cold) Availability Zone Availability Zone
  • 33.
  • 35.
  • 36.
  • 37.
  • 38. The Result - Reduced operating costs by 50% - Increased data retention 3x - Predictable scaling model • Storage allocation detached from instance count - Increased data transport reliability - Reduced operational overhead - Increased cluster stability
  • 39. 49% Reduction 79% Reduction
  • 40. Recommendations - Identify data usage model before you build - Find places where performance matters, and where cost can be optimized - Reduce over-provisioned storage/IOPS - Utilize AWS managed services whenever possible
  • 41. Thank you! Up next in this session: Videology
  • 42. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Videology Paul Frederiksen – Principal DevOps Engineer David Ortiz – Senior Software Engineer Videology Big Data Team November 30, 2016
  • 43. On the Rocky Road to EBS Videology’s Journey to EBS-backed Big Data
  • 44. What to Expect from the Session • Intro to Videology • Challenges • Road to EBS-backed cluster • Happy engineers
  • 45. Videology overview Founded: 2007 by Scott Ferber, co-founder of Advertising.com, which sold to AOL Time Warner in 2004 for $497 Million Corporate Headquarters: New York, NY Operations: • Operating in 28 Global Markets • Key Offices – New York, Baltimore, Toronto, London, Singapore & Sydney Employees: Approximately 380 Investors NEA, Comcast Ventures, Harbourvest, Catalyst Investors, Pinnacle Ventures, Valhalla Venture Customers: 4,500 Active Users including Brand marketers, agencies, trading desks, media companies, MVPD’s Ecosystem Integrations: Open platform with 2200+ ecosystem integrations, including 1000+ media companies, 40 data providers, all major 3rd party verification providers, and dozens of technology partners across the media ecosystem Recent Client Wins: Videology provides a converged advertising solution that is screen-agnostic, ensuring unduplicated reach with the right frequency cadence to achieve guaranteed results. 45
  • 46. Industry accolades… Videology was named Best Digital Video Ad Platform by Cynopsis Media at their 2015 Model D Awards. “ ” Videology was able to show that their platform drove brand lift that was on average 6X higher than Nielsen's norms. “ ” Videology has the most sophisticated media optimizer to analyze the right allocation of TV and online video to optimize reach and campaign cost. “ ”
  • 48. Where does big data processing fit in?
  • 49. Original production Instance Type Qty Role vCPU RAM (GB) Storage (GB) m3.xlarge 1 Jumpbox 4 15 80 m3.xlarge 1 Cloudera Manager 4 15 80 m3.2xlarge 2 NN/RM 8 30 160 cc2.8xlarge 1 Service Master 32 60 3,200 cc2.8xlarge 30 Worker 32 60 3,200
  • 50. I’ve got 99 problems and Hadoop is a few of them Reliability Scalability Distcp CPU to Memory Ratio
  • 51. 2015 Q2 2016 Q3 2016 Q4 and beyond Engaged Cloudera for EBS support Gave up on EBS and tested D2s New EBS to the rescue! Take advantage of new hardware
  • 52. CC2.8XL M4.10XLD2.8XL Old Not enough disk Expensive NirvanaLots of disk! Not enough memory Expensive
  • 53. D2.8xl prototype Instance Type Qty Role vCPU RAM (GB) Storage (GB) r3.large 1 Jumpbox 2 15.25 32 r3.large 1 Cloudera Manager 2 15.25 32 r3.xlarge 2 NN/RM 8 30 160 r3.2xlarge 2 Service Master 8 61 160 d2.8xl 10 Worker 36 244 48,000
  • 54. M4.10xlarge w/ sc1 prototype Instance Type Qty Role vCPU RAM (GB) Storage (GB) r3.large 1 Jumpbox 2 15.25 32 r3.large 1 Cloudera Manager 2 15.25 32 r3.xlarge 2 NN/RM 8 30 160 r3.2xlarge 2 Service Master 8 61 160 m4.10xlarge 18 Worker 40 160 4,000
  • 55. M4.10xlarge w/ st1 prototype Instance Type Qty Role vCPU RAM (GB) Storage (GB) r3.large 1 Jumpbox 2 15.25 32 r3.large 1 Cloudera Manager 2 15.25 32 r3.xlarge 2 NN/RM 8 30 160 r3.2xlarge 2 Service Master 8 61 160 m4.10xlarg e 18 Worker 40 160 8,000
  • 56. Problems no more! • No more rebuilding Nodes • 1 critical incident since switch vs. 5 in the year prior to release • Get to play with kids instead of babysitting cluster
  • 57. Engineering benefits - capacity No longer restricted by memory, we now have resources to pursue other tools to improve our reliability and speed: • Spark • HBase • Flafka • Offloading processing from Amazon Redshift to CDH More resilient to log volume increases Can expand storage as requirements changes
  • 58. Financial benefits $0.00 $5,000.00 $10,000.00 $15,000.00 $20,000.00 $25,000.00 $30,000.00 Total Cost Cost by Utilization Cc2 M4 $0.00 $0.01 $0.02 $0.03 $0.04 $0.05 $0.06 $0.07 $0.08 $0.09 $0.10 Cost to Process 1000 Requests Cc2 M4