C3 – Compute Capacity Calculator
Hadoop User Group (HUG) – 20 Nov 2013

Viraj Bhat
viraj@yahoo-inc.com
Why do we need this tool?
o Capacity planning for a multi-tenant system like the Hadoop Grid is critical
o Project owners need to estimate their projects' capacity requirements for provisioning on Hadoop clusters
o BU-POCs need capacity estimates from projects to manage the demand vs. supply equation within their business units
o SEO needs product owners to provide Grid capacity requirements quarterly (CAR)
Onboarding Projects - Challenge
o Application developers typically develop and test their Hadoop jobs or Oozie workflows on the limited-capacity, shared prototyping/research Hadoop cluster with partial data sets before onboarding to the production Hadoop clusters
o Research and production Hadoop Grids have varying map-reduce slots, container sizes, and compute and communication costs
o Projects may need optimization before being onboarded
o SupportShop is the front-end portal for teams to onboard projects onto Yahoo! Grids
o The onboarding tool, known as Himiko, tracks user requests until the project is provisioned on the cluster
Project On-boarding needs Computing Capacity
C3 Tool Requirements
o Self-serve: deployed as a web-interface tool hosted within the end-user one-stop portal, SupportShop
o Rule-based: uses a post-job-execution diagnostic rule engine to calculate the computation capacities
o SLA focus: given a desired SLA, the tool calculates the optimal compute resources required on the cluster for the entire SLA range of [2x to 0.25x]
o Hide complexity: takes into account the source and target clusters' map-reduce slot configuration, internal Hadoop scheduling and execution details, as well as hardware-specific "speedup", in calculating the compute capacities
o Pig job support: analyzes the DAG (Directed Acyclic Graph) of Map-Reduce jobs spawned by Pig to accurately compute the capacities
o Oozie support: workflows running on our Grids use Oozie
C3 Architecture
[Architecture diagram] In the browser, the SupportShop frontend serves C3 PHP forms: for a Pig job the user submits Job Type [Pig], Grid Name, Pig Console Output location, and SLA in minutes; for a plain M/R job, Job Type [MR], Grid Name, Job ID (e.g. job_202030_1234), and SLA in minutes. The SupportShop backend (web server, yphp backend) parses the Pig logs or Oozie job ID, copies the Pig logs, runs pending jobs from the C3 DB, and records completed jobs back to the DB. A C3 cron job drives the C3 core logic, which fetches job history and configuration logs from the Yahoo! Grid through the HDFS Proxy, executes the Hadoop Vaidya rules, and sends the results back to the cron job. The output, a Compute Capacity Report listing Job Type, Grid Name, SLA, Map Slot Capacity, Reduce Slot Capacity, and the Job DAG, is emailed to the user.
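A minimal sketch, drawn only from the form and report fields shown above, of the data shapes that flow through this pipeline; the class and field names are illustrative assumptions, not the tool's actual code.

# Illustrative data shapes for the C3 pipeline (names are assumptions).
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class C3Request:
    """What the SupportShop form submits and the C3 cron job later picks up from the C3 DB."""
    job_type: str                              # "Pig" or "MR"
    grid_name: str
    sla_mins: int
    job_id: Optional[str] = None               # e.g. "job_202030_1234" for plain M/R jobs
    pig_console_output: Optional[str] = None   # location of the Pig console output/logs

@dataclass
class ComputeCapacityReport:
    """What the C3 core logic produces and emails back to the user."""
    job_type: str
    grid_name: str
    sla_mins: int
    map_slot_capacity: float
    reduce_slot_capacity: float
    job_dag: List[str] = field(default_factory=list)   # M/R jobs in the analyzed DAG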
C3 – Compute Capacity Calculator
o Calculates the compute capacity needed for M/R jobs to meet the required processing-time Service Level Agreement (SLA)
o Compute capacity is calculated in terms of the number of Map and Reduce slots/containers
o The machines to procure are estimated from the Map and Reduce slots/containers
o Projects normally run their jobs on the research cluster and are then onboarded to the production cluster
o The tool automatically matches the map-reduce slot ratio of the research cluster to the production cluster (Hadoop 1.x)
o Capacities of M/R jobs launched in parallel are added
o Example: forks in Oozie workflows
o The maximum of the capacities of M/R jobs launched in sequence is taken
o Example: a Pig DAG that produces sequential jobs (see the sketch below)
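A minimal sketch of these two aggregation rules, assuming each job or branch already has a (map, reduce) capacity estimate; the function names and example numbers are illustrative, not taken from the tool.

# Sketch: combine per-job (map_slots, reduce_slots) estimates across a workflow DAG.
# Parallel branches (e.g. an Oozie fork) add their capacities; jobs that run in
# sequence (e.g. a Pig DAG of chained M/R jobs) take the maximum.

def parallel(*branches):
    """Branches running at the same time: capacities are summed."""
    return (sum(m for m, _ in branches), sum(r for _, r in branches))

def sequential(*stages):
    """Stages running one after another: the largest stage dominates."""
    return (max(m for m, _ in stages), max(r for _, r in stages))

# Example: an Oozie fork of two Pig scripts, each a chain of two M/R jobs.
script_a = sequential((120, 30), (80, 10))    # -> (120, 30)
script_b = sequential((60, 20), (200, 40))    # -> (200, 40)
workflow = parallel(script_a, script_b)       # -> (320, 70)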
C3 Statistics
o C3 and Himiko have helped onboard more than 200 projects
o More than 2,300 requests have been submitted to C3
o C3 has analyzed a Pig DAG consisting of more than 200 individual M/R jobs
o C3 has helped detect performance issues with certain M/R jobs where excessive mappers were being used in a Pig script
C3 Backend – Hadoop Vaidya
o Rule-based performance diagnosis of M/R jobs
o M/R performance-analysis expertise is captured and provided as input through a set of predefined diagnostic rules
o Detects performance problems by post-mortem analysis of a job, executing the diagnostic rules against the job execution counters
o Provides targeted advice for individual performance problems
o Extensible framework
o You can add your own rules based on a rule template and the published job-counters data structures (see the sketch below)
o Complex rules can be written using existing, simpler rules
Vaidya: an expert (versed in his own profession, esp. in medical science), skilled in the art of healing; a physician
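Hadoop Vaidya's own rules are declared against its Java rule classes and the published counter data structures; purely as an illustration of the rule-template idea, here is a hedged Python sketch of a post-mortem rule that inspects job counters and returns targeted advice. The rule name, counter keys, and threshold are invented for this example and are not Vaidya's API.

# Illustrative rule-template pattern (not Vaidya's real API): each rule receives the
# job's execution counters and returns advice when its test fails.

def excessive_mappers_rule(counters, min_mb_per_mapper=128):
    """Flag jobs whose mappers each process very little input data."""
    mb_per_mapper = counters["map_input_bytes"] / counters["num_mappers"] / 2**20
    if mb_per_mapper < min_mb_per_mapper:
        return ("Too many mappers: each processes only %.1f MB; "
                "raise the split size or combine small input files." % mb_per_mapper)
    return None  # rule passes, no advice

def run_rules(counters, rules):
    """Post-mortem pass: collect the advice from every rule that fires."""
    return [advice for advice in (rule(counters) for rule in rules) if advice]

# Example: a job that read 2 GB with 400 mappers (~5 MB per mapper) trips the rule.
print(run_rules({"map_input_bytes": 2 * 2**30, "num_mappers": 400},
                [excessive_mappers_rule]))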
C3 Rule logic at the Backend
o Reduce slot/container capacity is the number of reduce slots/containers required for the number of reducers specified for the M/R job
o Shuffle time = amount of data per reducer / 4 MB/s (a conservative, configurable bandwidth estimate)
o Reduce phase time ≈ max(sort time + reduce logic time) across reducers * speedup
o Map phase time = SLA - (shuffle time + reduce phase time), scaled by the speedup factor
o Map slot capacity = MAP_SLOT_MILLIS / map phase time (in milliseconds)
o MAP_SLOT_MILLIS = median of the worst-performing 10% of mappers
o Once the initial Map and Reduce slot capacities are obtained from the above calculations, iteratively bring their ratio close to the per-node slot configuration (Hadoop 1.0)
o Add 10% more slots for speculative execution (failed/killed task attempts); see the sketch below
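A minimal Python sketch of the rule logic above, under stated assumptions: the inputs come from the job's history and counters, MAP_SLOT_MILLIS is treated as the job's total map compute time in milliseconds, where the speedup factor applies is a guess, and the iterative slot-ratio matching against the per-node configuration is omitted. Names and units are illustrative, not the tool's implementation.

# Sketch of the C3 capacity estimate for a single M/R job (assumptions noted above).
def estimate_capacity(sla_mins, data_per_reducer_mb, num_reducers,
                      worst_sort_plus_reduce_mins, map_slot_millis,
                      speedup=1.0, shuffle_bw_mb_per_s=4.0):
    """Return (map_slot_capacity, reduce_slot_capacity) for one job."""
    # Reduce capacity: one slot/container per reducer specified for the job.
    reduce_slots = num_reducers

    # Shuffle time: data pulled by each reducer over a conservative 4 MB/s link.
    shuffle_mins = data_per_reducer_mb / shuffle_bw_mb_per_s / 60.0

    # Reduce phase: slowest reducer's sort + reduce logic, scaled by speedup.
    reduce_phase_mins = worst_sort_plus_reduce_mins * speedup

    # Map phase gets whatever SLA budget remains after shuffle and reduce.
    map_phase_mins = sla_mins - (shuffle_mins + reduce_phase_mins)
    if map_phase_mins <= 0:
        raise ValueError("SLA too tight: no time left for the map phase")

    # Map capacity: total map work spread over the available map-phase window.
    map_slots = map_slot_millis / (map_phase_mins * 60.0 * 1000.0)

    # Add 10% slots for speculative execution and failed/killed task attempts
    # (applied to both map and reduce here; the deck does not say which).
    return map_slots * 1.10, reduce_slots * 1.10

# Example: 60-minute SLA, 1 GB per reducer, 100 reducers, 12-minute worst reducer,
# and 8 hours (in ms) of total map compute time.
print(estimate_capacity(60, 1024, 100, 12, 8 * 3600 * 1000))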
C3 Input
Compute Capacity Tool Output: Pig DAG and single M/R job
C3 tool integrated with Hadoop Vaidya
C3 results for M/R jobs run on Hadoop 0.23
C3 results for a Pig script run on Hadoop 0.23
Future Enhancements
o C3 should output the storage requirements for a job
o Display Map and Reduce runtimes
o Capacity planning for custom Map-Reduce jobs that can provide an XML description of their DAGs
o Introduce more granular estimation using a per-cluster speed-up factor based on the hardware node configuration (processors, memory, etc.)
o C3 should accept the percentage of data used as input, to estimate the capacities more accurately
Links
o Hadoop Vaidya
o https://hadoop.apache.org/docs/r1.2.1/vaidya.html
o Hadoop Vaidya Job History Server Integration for Hadoop 2.0
o https://issues.apache.org/jira/browse/MAPREDUCE-3202
Acknowledgements
 Yahoo
› Ryota Egashira – egashira@yahoo-inc.com
› Kendall Thrapp – kthrapp@yahoo-inc.com
› Kimsukh Kundu – kimsukhk@yahoo-inc.com
 Ebay
› Shashank Phadke – sphadke@pacbell.net
 Pivotal
› Milind Bhandarkar – mbhandarkar@gopivotal.com
› Vitthal Gogate – vitthal_gogate@yahoo.com
Questions?
