SlideShare una empresa de Scribd logo
1 de 21
Scaling MapReduce Applications
across Hybrid Clouds to Meet Soft
Deadlines
Dongseo University
Division of Computer & Information Engineering
Intelligent Smart Systems Research Lab
Presented by:
Ahmed Abdulhakim Al-Absi
2
About The Paper
 2013 IEEE International Conference on Advanced
Information Networking and Applications (AINA)
 Cloud Computing and Distributed Systems
(CLOUDS) Laboratory
 University of Melbourne, Australia
Michael Mattess, Rodrigo N. Calheiros, and Rajkumar Buyya, Proceedings of the 27th IEEE
International Conference on Advanced Information Networking and Applications (AINA 2013,
IEEE CS Press, USA), Barcelona, Spain, March 25-28, 2013.
Outline
 Motivations
 Proposed Policy
 Proposed Policy Scenario
 Policy parameters
 Policy Parameters Algorithm
 Implementation
 Performance Evaluation
 Experimental Testbed and Sample Application
 Performance Analysis
 Experimental Testbed and Sample Application
 Results – Map phase
 Results – Reduce phase
 Conclusion
 My Opinion
3
 As MapReduce is becoming the prevalent
programming model for building data processing
applications in Clouds, the need for timely
execution of such applications becomes a
necessity.
 Existing approaches for execution of such deadline-
constrained MapReduce applications focus in
meeting deadlines via admission control. However,
they cannot solve conflicts when applications with
Motivation
4
Proposed Policy
5
 To tackle this limitation of current approaches for
the problem, authors proposed a novel policy for
dynamic provisioning of public Cloud resources to
speed up execution of MapReduce applications.
 Authors target the Map phase as the target of the
soft deadline because this phase contains tasks
that generally have uniform execution time.
Policy System
1. The initial state of the system consists of
 local worker nodes registered with the master
node and ready to execute MapReduce tasks.
 Datasets are available for local workers and are
also stored in the Cloud provider’s storage
service.
2. A MapReduce application is submitted to the
master node and the scheduler begins assigning
Map tasks to worker nodes;
3. When a predefined number or fraction of the Map
tasks completes, the master uses the provisioning
policy and, based on the application deadline,
6
4. Requested resources register with the master node
as they become available and the scheduler assigns
Map tasks to them;
5. When the Map phase completes, the scheduler
begins assigning Reduce tasks to workers;
6. Each Reduce task obtains the intermediate data to
be reduced from other nodes;
7. The output of the Reduce tasks may remain on the
nodes in anticipation of another MapReduce
application, which will further process the data.
Alternatively, the output can be collected by the
master node and sent back to the user that
submitted the application.
7
Policy System
Proposed Policy Scenario
8
9
Policy Parameters
 MARGIN is a ‘safety margin’ that is removed from the
remaining time to account for errors in the prediction of
tasks execution time.
 LOCAL FACTOR is a multiplier applied to the average
run time of the first batch of Map tasks to generate an
estimation of the expected Map task execution time on
the Cluster considering variation in the worker
performance.
 REMOTE FACTOR is set to predict the expected Map
task execution time on public Cloud resources. This
allows accounting for the expected variation in the
performance of local and public Cloud resources.
 BOOT TIME is the expected amount of time between
requesting a new resource from the public Cloud and
Policy Parameters Algorithm
10
The decision of
number of public
Cloud resources to
be allocated to a
MapReduce
application
11
Implementation
 Implemented in the Aneka Cloud Platform with some
changes:
 Dynamic Provisioning: Extended the existing
MapReduce Service of Aneka to enable its
interaction with the Provisioning Service
 Data Exchange: enable Aneka MapReduce to work
across local and remote resources by HTTP and IIS
to serve the intermediary data files.
 Remote Storage: S3 as the source of local input
data files when running on EC2.
12
Performance Evaluation
Experimental Testbed and Sample
Application
 The environment is a hybrid Cloud composed of a
local Cluster and a public Cloud:
 Local Cluster: 4 IBM System X3200 M3 servers
running Citrix XenServer. 2 Windows 2008 virtual
machines= 8 worker nodes.
 Public Cloud Resources: provisioned from Amazon
EC2, USA East Coast data center. m1.small
instances, 1.0-1.2 GHz CPU, 1.7 GB of memory.
 Dataset: 4.5 GB copied in Local and S3
Performance Evaluation
Experimental Testbed and Sample
Application13
14
Performance Analysis
Results – Map Phase
Authors sequentially executed several
requests for the word count application.
On each request, they modified the
sleep time in each Map task, and kept
the deadline for completing the Map
phase constant and equal to 30 minutes
for each application.
Performance Analysis
Results – Map Phase
Performance Analysis
Results – Reduce Phase
352 MB as Small Data
3422 MB as Big Data
Similarly, a sleep was inserted in the
Reduce tasks in order to increase their
execution time and observe how different
sizes of Reduce tasks affect execution time
of applications. Two different values for
sleep,
60 seconds and 5 seconds
16
 This paper presented dynamic provisioning policy for
MapReduce applications and a prototype in the Aneka
Cloud Platform.
 Results showed that the approach, even though its
lower complexity, delivers good results.
 The policy was able to meet deadlines of applications,
which are defined in terms of completion time of the
Map phase, for increasing execution times of Map
Conclusion
17
My Opinion
18
Q & A
Thank You!
Presented by Ahmed Abdulhakim Al-Absi -  Scaling map reduce applications across hybrid clouds to meet soft deadlines

Más contenido relacionado

La actualidad más candente

Task scheduling Survey in Cloud Computing
Task scheduling Survey in Cloud ComputingTask scheduling Survey in Cloud Computing
Task scheduling Survey in Cloud ComputingRamandeep Kaur
 
Task Scheduling methodology in cloud computing
Task Scheduling methodology in cloud computing Task Scheduling methodology in cloud computing
Task Scheduling methodology in cloud computing Qutub-ud- Din
 
Genetic Algorithm for task scheduling in Cloud Computing Environment
Genetic Algorithm for task scheduling in Cloud Computing EnvironmentGenetic Algorithm for task scheduling in Cloud Computing Environment
Genetic Algorithm for task scheduling in Cloud Computing EnvironmentSwapnil Shahade
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computingijujournal
 
Improving resource utilisation in the cloud environment using multivariate pr...
Improving resource utilisation in the cloud environment using multivariate pr...Improving resource utilisation in the cloud environment using multivariate pr...
Improving resource utilisation in the cloud environment using multivariate pr...Shrabanee Swagatika
 
A Survey on Resource Allocation & Monitoring in Cloud Computing
A Survey on Resource Allocation & Monitoring in Cloud ComputingA Survey on Resource Allocation & Monitoring in Cloud Computing
A Survey on Resource Allocation & Monitoring in Cloud ComputingMohd Hairey
 
task scheduling in cloud datacentre using genetic algorithm
task scheduling in cloud datacentre using genetic algorithmtask scheduling in cloud datacentre using genetic algorithm
task scheduling in cloud datacentre using genetic algorithmSwathi Rampur
 
An optimized scientific workflow scheduling in cloud computing
An optimized scientific workflow scheduling in cloud computingAn optimized scientific workflow scheduling in cloud computing
An optimized scientific workflow scheduling in cloud computingDIGVIJAY SHINDE
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisIRJET Journal
 
Distributed in memory processing of all k nearest neighbor queries
Distributed in memory processing of all k nearest neighbor queriesDistributed in memory processing of all k nearest neighbor queries
Distributed in memory processing of all k nearest neighbor queriesieeepondy
 
Optimization of energy consumption in cloud computing datacenters
Optimization of energy consumption in cloud computing datacenters Optimization of energy consumption in cloud computing datacenters
Optimization of energy consumption in cloud computing datacenters IJECEIAES
 
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...dbpublications
 
Large-Scaled Insurance Analytics Using Tweedie Models in Apache Spark with Ya...
Large-Scaled Insurance Analytics Using Tweedie Models in Apache Spark with Ya...Large-Scaled Insurance Analytics Using Tweedie Models in Apache Spark with Ya...
Large-Scaled Insurance Analytics Using Tweedie Models in Apache Spark with Ya...Databricks
 
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORKMACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORKAbhi Jit
 
Task Scheduling in Grid Computing.
Task Scheduling in Grid Computing.Task Scheduling in Grid Computing.
Task Scheduling in Grid Computing.Ramandeep Kaur
 
Implementing Workload Postponing In Cloudsim to Maximize Renewable Energy Uti...
Implementing Workload Postponing In Cloudsim to Maximize Renewable Energy Uti...Implementing Workload Postponing In Cloudsim to Maximize Renewable Energy Uti...
Implementing Workload Postponing In Cloudsim to Maximize Renewable Energy Uti...IJERA Editor
 
Qo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentQo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentAlexander Decker
 
Principles of Computing Resources Planning in Cloud-Based Problem Solving Env...
Principles of Computing Resources Planning in Cloud-Based Problem Solving Env...Principles of Computing Resources Planning in Cloud-Based Problem Solving Env...
Principles of Computing Resources Planning in Cloud-Based Problem Solving Env...Ural-PDC
 

La actualidad más candente (20)

Task scheduling Survey in Cloud Computing
Task scheduling Survey in Cloud ComputingTask scheduling Survey in Cloud Computing
Task scheduling Survey in Cloud Computing
 
Task Scheduling methodology in cloud computing
Task Scheduling methodology in cloud computing Task Scheduling methodology in cloud computing
Task Scheduling methodology in cloud computing
 
Genetic Algorithm for task scheduling in Cloud Computing Environment
Genetic Algorithm for task scheduling in Cloud Computing EnvironmentGenetic Algorithm for task scheduling in Cloud Computing Environment
Genetic Algorithm for task scheduling in Cloud Computing Environment
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computing
 
Improving resource utilisation in the cloud environment using multivariate pr...
Improving resource utilisation in the cloud environment using multivariate pr...Improving resource utilisation in the cloud environment using multivariate pr...
Improving resource utilisation in the cloud environment using multivariate pr...
 
A Survey on Resource Allocation & Monitoring in Cloud Computing
A Survey on Resource Allocation & Monitoring in Cloud ComputingA Survey on Resource Allocation & Monitoring in Cloud Computing
A Survey on Resource Allocation & Monitoring in Cloud Computing
 
task scheduling in cloud datacentre using genetic algorithm
task scheduling in cloud datacentre using genetic algorithmtask scheduling in cloud datacentre using genetic algorithm
task scheduling in cloud datacentre using genetic algorithm
 
An optimized scientific workflow scheduling in cloud computing
An optimized scientific workflow scheduling in cloud computingAn optimized scientific workflow scheduling in cloud computing
An optimized scientific workflow scheduling in cloud computing
 
Fast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data AnalysisFast Range Aggregate Queries for Big Data Analysis
Fast Range Aggregate Queries for Big Data Analysis
 
Distributed in memory processing of all k nearest neighbor queries
Distributed in memory processing of all k nearest neighbor queriesDistributed in memory processing of all k nearest neighbor queries
Distributed in memory processing of all k nearest neighbor queries
 
Optimization of energy consumption in cloud computing datacenters
Optimization of energy consumption in cloud computing datacenters Optimization of energy consumption in cloud computing datacenters
Optimization of energy consumption in cloud computing datacenters
 
Shubhankar pawade resume
Shubhankar pawade resumeShubhankar pawade resume
Shubhankar pawade resume
 
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
On Traffic-Aware Partition and Aggregation in Map Reduce for Big Data Applica...
 
Large-Scaled Insurance Analytics Using Tweedie Models in Apache Spark with Ya...
Large-Scaled Insurance Analytics Using Tweedie Models in Apache Spark with Ya...Large-Scaled Insurance Analytics Using Tweedie Models in Apache Spark with Ya...
Large-Scaled Insurance Analytics Using Tweedie Models in Apache Spark with Ya...
 
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORKMACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
 
Scheduling in cloud
Scheduling in cloudScheduling in cloud
Scheduling in cloud
 
Task Scheduling in Grid Computing.
Task Scheduling in Grid Computing.Task Scheduling in Grid Computing.
Task Scheduling in Grid Computing.
 
Implementing Workload Postponing In Cloudsim to Maximize Renewable Energy Uti...
Implementing Workload Postponing In Cloudsim to Maximize Renewable Energy Uti...Implementing Workload Postponing In Cloudsim to Maximize Renewable Energy Uti...
Implementing Workload Postponing In Cloudsim to Maximize Renewable Energy Uti...
 
Qo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentQo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environment
 
Principles of Computing Resources Planning in Cloud-Based Problem Solving Env...
Principles of Computing Resources Planning in Cloud-Based Problem Solving Env...Principles of Computing Resources Planning in Cloud-Based Problem Solving Env...
Principles of Computing Resources Planning in Cloud-Based Problem Solving Env...
 

Similar a Presented by Ahmed Abdulhakim Al-Absi - Scaling map reduce applications across hybrid clouds to meet soft deadlines

Hadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintHadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintijccsa
 
CONTEXT-AWARE DECISION MAKING SYSTEM FOR MOBILE CLOUD OFFLOADING
CONTEXT-AWARE DECISION MAKING SYSTEM FOR MOBILE CLOUD OFFLOADINGCONTEXT-AWARE DECISION MAKING SYSTEM FOR MOBILE CLOUD OFFLOADING
CONTEXT-AWARE DECISION MAKING SYSTEM FOR MOBILE CLOUD OFFLOADINGIJCNCJournal
 
Sharing of cluster resources among multiple Workflow Applications
Sharing of cluster resources among multiple Workflow ApplicationsSharing of cluster resources among multiple Workflow Applications
Sharing of cluster resources among multiple Workflow Applicationsijcsit
 
Application of selective algorithm for effective resource provisioning in clo...
Application of selective algorithm for effective resource provisioning in clo...Application of selective algorithm for effective resource provisioning in clo...
Application of selective algorithm for effective resource provisioning in clo...ijccsa
 
Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...
Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...
Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...Yahoo Developer Network
 
Quality of Service based Task Scheduling Algorithms in Cloud Computing
Quality of Service based Task Scheduling Algorithms in  Cloud Computing  Quality of Service based Task Scheduling Algorithms in  Cloud Computing
Quality of Service based Task Scheduling Algorithms in Cloud Computing IJECEIAES
 
A simulation-based approach for straggler tasks detection in Hadoop MapReduce
A simulation-based approach for straggler tasks detection in Hadoop MapReduceA simulation-based approach for straggler tasks detection in Hadoop MapReduce
A simulation-based approach for straggler tasks detection in Hadoop MapReduceIRJET Journal
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computingijujournal
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computingijujournal
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computingijujournal
 
Resource Allocation for Task Using Fair Share Scheduling Algorithm
Resource Allocation for Task Using Fair Share Scheduling AlgorithmResource Allocation for Task Using Fair Share Scheduling Algorithm
Resource Allocation for Task Using Fair Share Scheduling AlgorithmIRJET Journal
 
Qo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentQo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentAlexander Decker
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTijwscjournal
 
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENTDYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENTIJCNCJournal
 
Dynamic Task Scheduling based on Burst Time Requirement for Cloud Environment
Dynamic Task Scheduling based on Burst Time Requirement for Cloud EnvironmentDynamic Task Scheduling based on Burst Time Requirement for Cloud Environment
Dynamic Task Scheduling based on Burst Time Requirement for Cloud EnvironmentIJCNCJournal
 

Similar a Presented by Ahmed Abdulhakim Al-Absi - Scaling map reduce applications across hybrid clouds to meet soft deadlines (20)

D04573033
D04573033D04573033
D04573033
 
DIET_BLAST
DIET_BLASTDIET_BLAST
DIET_BLAST
 
E031201032036
E031201032036E031201032036
E031201032036
 
Hadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraintHadoop scheduler with deadline constraint
Hadoop scheduler with deadline constraint
 
GRID COMPUTING
GRID COMPUTINGGRID COMPUTING
GRID COMPUTING
 
CONTEXT-AWARE DECISION MAKING SYSTEM FOR MOBILE CLOUD OFFLOADING
CONTEXT-AWARE DECISION MAKING SYSTEM FOR MOBILE CLOUD OFFLOADINGCONTEXT-AWARE DECISION MAKING SYSTEM FOR MOBILE CLOUD OFFLOADING
CONTEXT-AWARE DECISION MAKING SYSTEM FOR MOBILE CLOUD OFFLOADING
 
Sharing of cluster resources among multiple Workflow Applications
Sharing of cluster resources among multiple Workflow ApplicationsSharing of cluster resources among multiple Workflow Applications
Sharing of cluster resources among multiple Workflow Applications
 
Application of selective algorithm for effective resource provisioning in clo...
Application of selective algorithm for effective resource provisioning in clo...Application of selective algorithm for effective resource provisioning in clo...
Application of selective algorithm for effective resource provisioning in clo...
 
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
 
Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...
Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...
Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...
 
Quality of Service based Task Scheduling Algorithms in Cloud Computing
Quality of Service based Task Scheduling Algorithms in  Cloud Computing  Quality of Service based Task Scheduling Algorithms in  Cloud Computing
Quality of Service based Task Scheduling Algorithms in Cloud Computing
 
A simulation-based approach for straggler tasks detection in Hadoop MapReduce
A simulation-based approach for straggler tasks detection in Hadoop MapReduceA simulation-based approach for straggler tasks detection in Hadoop MapReduce
A simulation-based approach for straggler tasks detection in Hadoop MapReduce
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computing
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computing
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computing
 
Resource Allocation for Task Using Fair Share Scheduling Algorithm
Resource Allocation for Task Using Fair Share Scheduling AlgorithmResource Allocation for Task Using Fair Share Scheduling Algorithm
Resource Allocation for Task Using Fair Share Scheduling Algorithm
 
Qo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environmentQo s aware scientific application scheduling algorithm in cloud environment
Qo s aware scientific application scheduling algorithm in cloud environment
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
 
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENTDYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
 
Dynamic Task Scheduling based on Burst Time Requirement for Cloud Environment
Dynamic Task Scheduling based on Burst Time Requirement for Cloud EnvironmentDynamic Task Scheduling based on Burst Time Requirement for Cloud Environment
Dynamic Task Scheduling based on Burst Time Requirement for Cloud Environment
 

Último

SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSebastiano Panichella
 
The Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationThe Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationNathan Young
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Escort Service
 
Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸mathanramanathan2005
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power
 
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.KathleenAnnCordero2
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxAsifArshad8
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRachelAnnTenibroAmaz
 
miladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxmiladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxCarrieButtitta
 
Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxJohnree4
 
Early Modern Spain. All about this period
Early Modern Spain. All about this periodEarly Modern Spain. All about this period
Early Modern Spain. All about this periodSaraIsabelJimenez
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptxogubuikealex
 
The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringSebastiano Panichella
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...漢銘 謝
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSebastiano Panichella
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEMCharmi13
 
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...Henrik Hanke
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRRsarwankumar4524
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comsaastr
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxmavinoikein
 

Último (20)

SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation Track
 
The Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationThe Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism Presentation
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170
 
Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
 
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
 
miladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptxmiladyskindiseases-200705210221 2.!!pptx
miladyskindiseases-200705210221 2.!!pptx
 
Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptx
 
Early Modern Spain. All about this period
Early Modern Spain. All about this periodEarly Modern Spain. All about this period
Early Modern Spain. All about this period
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptx
 
The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software Engineering
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEM
 
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptx
 

Presented by Ahmed Abdulhakim Al-Absi - Scaling map reduce applications across hybrid clouds to meet soft deadlines

  • 1. Scaling MapReduce Applications across Hybrid Clouds to Meet Soft Deadlines Dongseo University Division of Computer & Information Engineering Intelligent Smart Systems Research Lab Presented by: Ahmed Abdulhakim Al-Absi
  • 2. 2 About The Paper  2013 IEEE International Conference on Advanced Information Networking and Applications (AINA)  Cloud Computing and Distributed Systems (CLOUDS) Laboratory  University of Melbourne, Australia Michael Mattess, Rodrigo N. Calheiros, and Rajkumar Buyya, Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA 2013, IEEE CS Press, USA), Barcelona, Spain, March 25-28, 2013.
  • 3. Outline  Motivations  Proposed Policy  Proposed Policy Scenario  Policy parameters  Policy Parameters Algorithm  Implementation  Performance Evaluation  Experimental Testbed and Sample Application  Performance Analysis  Experimental Testbed and Sample Application  Results – Map phase  Results – Reduce phase  Conclusion  My Opinion 3
  • 4.  As MapReduce is becoming the prevalent programming model for building data processing applications in Clouds, the need for timely execution of such applications becomes a necessity.  Existing approaches for execution of such deadline- constrained MapReduce applications focus in meeting deadlines via admission control. However, they cannot solve conflicts when applications with Motivation 4
  • 5. Proposed Policy 5  To tackle this limitation of current approaches for the problem, authors proposed a novel policy for dynamic provisioning of public Cloud resources to speed up execution of MapReduce applications.  Authors target the Map phase as the target of the soft deadline because this phase contains tasks that generally have uniform execution time.
  • 6. Policy System 1. The initial state of the system consists of  local worker nodes registered with the master node and ready to execute MapReduce tasks.  Datasets are available for local workers and are also stored in the Cloud provider’s storage service. 2. A MapReduce application is submitted to the master node and the scheduler begins assigning Map tasks to worker nodes; 3. When a predefined number or fraction of the Map tasks completes, the master uses the provisioning policy and, based on the application deadline, 6
  • 7. 4. Requested resources register with the master node as they become available and the scheduler assigns Map tasks to them; 5. When the Map phase completes, the scheduler begins assigning Reduce tasks to workers; 6. Each Reduce task obtains the intermediate data to be reduced from other nodes; 7. The output of the Reduce tasks may remain on the nodes in anticipation of another MapReduce application, which will further process the data. Alternatively, the output can be collected by the master node and sent back to the user that submitted the application. 7 Policy System
  • 9. 9 Policy Parameters  MARGIN is a ‘safety margin’ that is removed from the remaining time to account for errors in the prediction of tasks execution time.  LOCAL FACTOR is a multiplier applied to the average run time of the first batch of Map tasks to generate an estimation of the expected Map task execution time on the Cluster considering variation in the worker performance.  REMOTE FACTOR is set to predict the expected Map task execution time on public Cloud resources. This allows accounting for the expected variation in the performance of local and public Cloud resources.  BOOT TIME is the expected amount of time between requesting a new resource from the public Cloud and
  • 10. Policy Parameters Algorithm 10 The decision of number of public Cloud resources to be allocated to a MapReduce application
  • 11. 11 Implementation  Implemented in the Aneka Cloud Platform with some changes:  Dynamic Provisioning: Extended the existing MapReduce Service of Aneka to enable its interaction with the Provisioning Service  Data Exchange: enable Aneka MapReduce to work across local and remote resources by HTTP and IIS to serve the intermediary data files.  Remote Storage: S3 as the source of local input data files when running on EC2.
  • 12. 12 Performance Evaluation Experimental Testbed and Sample Application  The environment is a hybrid Cloud composed of a local Cluster and a public Cloud:  Local Cluster: 4 IBM System X3200 M3 servers running Citrix XenServer. 2 Windows 2008 virtual machines= 8 worker nodes.  Public Cloud Resources: provisioned from Amazon EC2, USA East Coast data center. m1.small instances, 1.0-1.2 GHz CPU, 1.7 GB of memory.  Dataset: 4.5 GB copied in Local and S3
  • 14. 14 Performance Analysis Results – Map Phase Authors sequentially executed several requests for the word count application. On each request, they modified the sleep time in each Map task, and kept the deadline for completing the Map phase constant and equal to 30 minutes for each application.
  • 16. Performance Analysis Results – Reduce Phase 352 MB as Small Data 3422 MB as Big Data Similarly, a sleep was inserted in the Reduce tasks in order to increase their execution time and observe how different sizes of Reduce tasks affect execution time of applications. Two different values for sleep, 60 seconds and 5 seconds 16
  • 17.  This paper presented dynamic provisioning policy for MapReduce applications and a prototype in the Aneka Cloud Platform.  Results showed that the approach, even though its lower complexity, delivers good results.  The policy was able to meet deadlines of applications, which are defined in terms of completion time of the Map phase, for increasing execution times of Map Conclusion 17
  • 19.
  • 20. Q & A Thank You!