  1. 1. OPTIMAL RESOURCE PROVISIONING FOR RUNNING MAPREDUCE PROGRAMS IN THE CLOUD Presented By: Group Id: 29 Priyanka Sangtani Anshul Aggarwal Pooja Jain
  2. 2. PROBLEM STATEMENT The problem at hand is defining a resource provisioning framework for MapReduce jobs running in a cloud, keeping in mind performance goals such as resource utilization, with: - an optimal number of map and reduce slots - improvements in execution time - a highly scalable solution This is a design issue related to the software frameworks available in the cloud. Traditional provisioning frameworks provide users with defaults which do not lend themselves well to MapReduce jobs. Such jobs are highly parallelizable, and our proposed algorithm aims to use this fact to provide highly optimized resource provisioning suitable for MapReduce.
  3. 3. MAPREDUCE OVERVIEW  In a typical MapReduce framework, data are divided into blocks and distributed across many nodes in a cluster and the MapReduce framework takes advantage of data locality by shipping computation to data rather than moving data to where it is processed.  Most input data blocks to MapReduce applications are located on the local node, so they can be loaded very fast and reading multiple blocks can be done on multiple nodes in parallel.  Therefore, MapReduce can achieve very high aggregate I/O bandwidth and data processing rate.
  4. 4. WHY MAPREDUCE OPTIMIZATION  The MapReduce programming paradigm lends itself well to most data-intensive analytics jobs, given its ability to scale out and leverage several machines to process data in parallel.  Research has demonstrated that existing approaches to provisioning other applications in the cloud are not immediately relevant to MapReduce-based applications.  MapReduce jobs have over 180 configuration parameters. Setting a parameter too high can cause resource contention and degrade overall performance; setting it too low can under-utilize the resources and, once again, reduce performance.  Each application has a different bottleneck resource (CPU:Disk:Network) and a different bottleneck resource utilization, and thus needs a different combination of these parameters such that the bottleneck resource is maximally utilized.
  5. 5. WORK FLOW OF PROPOSED SOLUTION (flow diagram) A user application first passes through the Signature Matching Algorithm, which consults a database of stored signatures; on a match (Yes) the stored configuration is used, otherwise (No) SLO-Based Provisioning is applied, followed by the Priority Algorithm and Bottleneck Removal. The Resource Provisioning Framework outputs the optimal number of map/reduce slots.
  6. 6. PROPOSED ALGORITHM 1. Signature Matching A sample of the input is run on the cloud to generate a resource consumption signature. This signature is matched against a database. If a match is found, we use the optimal configuration stored for the matched signature; otherwise we move to SLO-based provisioning. 2. SLO-Based Resource Provisioning Based on the number of map and reduce tasks, available slots, and time constraints, we calculate the optimal number of map and reduce tasks to run in parallel. 3. Priority Assignment To give users better control over provisioning, we assign priorities in this stage. 4. Skew Mitigation Managing parallel partitions. 5. Bottleneck Removal The most common problem in parallel computation is bottlenecks. 6. Deadlock Detection and Removal This stage deals with deadlock removal to improve execution time.
  7. 7. 1. SIGNATURE MATCHING
  8. 8. MATHEMATICAL MODEL  Entire job run is split into n (a pre-chosen number) intervals, each of the same duration.  For the ith interval, compute the average consumption of each rth resource. The resource types (us, sy, wa, id, bi, bo, ni, no, sr) are % of CPU in user time, system time, waiting time, idle time, disk blocks in, disk blocks out, network in, network out, and slow ratio, respectively.  Generate a resource consumption signature set Sr for every rth resource as Srm = {Srm1, Srm2, ..., Srmn}.  The signature distance between a generated signature and a signature in the database is computed as χ²(S_R1^m, S_R2^m) = Σ_{i=1..n} (S_R1^mi − S_R2^mi)² / (S_R1^mi + S_R2^mi)  χ² represents the vector distance between two signatures for a particular resource r in time-interval vector space. We compute the scalar sum of χ² over all resource types; a lower sum indicates more similar signatures. We choose the configuration of the application whose signature distance sum is closest to that of the new application.
  9. 9. ALGORITHM 1. Take a sample input IS of appropriate size from the actual input. 2. Take a resource set RS. 3. Take the signature database with average distance between signatures DAVG. 4. Split the entire job run into n (a pre-chosen number) intervals, each of the same duration. 5. For each resource type in (us, sy, wa, id, bi, bo, ni, no, sr): 6. For the ith interval from 1 to n: 7. Compute the average resource consumption; generate a resource consumption signature set Sr for every rth resource as Srm = {Srm1, Srm2, ..., Srmn}. 8. Set min_distance = 10000. 9. For every signature S in the database: 10. Find the distance D between the calculated signature and S. 11. If D < min_distance, set min_distance = D and Signature_matched = S. 12. Set a precision value P. 13. If min_distance > P*DAVG, return no match found. 14. Else return Signature_matched.
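The distance computation and nearest-signature lookup described on slides 8 and 9 can be sketched as follows. This is a minimal illustration under the slide's definitions; the class and method names (SignatureMatcher, chiSquare, match) are our own, not taken from the MRSim codebase.

```java
import java.util.List;
import java.util.Map;

public class SignatureMatcher {

    // Chi-square distance between two signatures of one resource:
    // sum over the n intervals of (a_i - b_i)^2 / (a_i + b_i), skipping empty bins.
    public static double chiSquare(double[] a, double[] b) {
        double d = 0.0;
        for (int i = 0; i < a.length; i++) {
            double sum = a[i] + b[i];
            if (sum > 0) {
                d += (a[i] - b[i]) * (a[i] - b[i]) / sum;
            }
        }
        return d;
    }

    // Scalar sum of chi-square over all resource types (us, sy, wa, id, ...);
    // both signatures are keyed by resource name.
    public static double totalDistance(Map<String, double[]> s1, Map<String, double[]> s2) {
        double total = 0.0;
        for (String resource : s1.keySet()) {
            total += chiSquare(s1.get(resource), s2.get(resource));
        }
        return total;
    }

    // Index of the closest database signature, or -1 when even the best
    // distance exceeds precision * avgDistance ("no match found", step 13).
    public static int match(Map<String, double[]> probe,
                            List<Map<String, double[]>> database,
                            double precision, double avgDistance) {
        double best = Double.MAX_VALUE;
        int bestIndex = -1;
        for (int i = 0; i < database.size(); i++) {
            double d = totalDistance(probe, database.get(i));
            if (d < best) {
                best = d;
                bestIndex = i;
            }
        }
        return best > precision * avgDistance ? -1 : bestIndex;
    }
}
```

On a match, the stored configuration for that signature would be reused; on -1 the framework would fall through to SLO-based provisioning.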
  10. 10. 2. SLO-BASED PROVISIONING Given a MapReduce job J with input dataset D, identify minimal combinations (S_M^J, S_R^J) of map and reduce slots that can be allocated to job J so that it finishes within time T. Step I: Create a compact job profile that reflects all phases of a given job: the map, shuffle/sort, and reduce phases. Map stage: (M_min, M_avg, M_max, AvgSize_M^input, Selectivity_M) Shuffle stage: (Sh_avg^1, Sh_max^1, Sh_avg^typ, Sh_max^typ) Reduce stage: (R_min, R_avg, Selectivity_R) Step II: There are three design choices according to the completion time: 1) T is targeted as a lower bound of the job completion time. Typically, this leads to the least amount of resources allocated to the job for finishing within deadline T. The lower bound corresponds to an ideal computation under the allocated resources and is rarely achievable in real environments. 2) T is targeted as an upper bound of the job completion time. Typically, this leads to a more aggressive resource allocation, and might lead to a job completion time that is much smaller than T, because worst-case scenarios are also rare in production settings. 3) The given time T is targeted as the average of the lower and upper bounds on job completion time. This more balanced resource allocation might provide a solution that enables the job to complete within time T.
  11. 11. Mathematical Model – Makespan Theorem: the makespan of the greedy assignment of n tasks with average duration avg and maximum duration max to k slots is at least n*avg/k and at most (n − 1)*avg/k + max. Suppose the dataset is partitioned into N_M^J map tasks and N_R^J reduce tasks, and let S_M^J and S_R^J be the numbers of map and reduce slots. By the Makespan Theorem, the lower and upper bounds on the duration of the entire map stage (denoted T_M^low and T_M^up respectively) are estimated as follows: T_M^low = N_M^J * M_avg / S_M^J T_M^up = (N_M^J − 1) * M_avg / S_M^J + M_max Similarly for the typical shuffle phase and the reduce stage: T_Sh^low = (N_R^J / S_R^J − 1) * Sh_avg^typ T_Sh^up = ((N_R^J − 1) / S_R^J) * Sh_avg^typ + Sh_max^typ T_R^low = N_R^J * R_avg / S_R^J Combining the stages: T_J^low = T_M^low + Sh_avg^1 + T_Sh^low + T_R^low T_J^up = T_M^up + Sh_avg^1 + T_Sh^up + T_R^up Expanding the lower bound: T_J^low = N_M^J * M_avg / S_M^J + N_R^J * (Sh_avg^typ + R_avg) / S_R^J + Sh_avg^1 − Sh_avg^typ = A_J^low * N_M^J / S_M^J + B_J^low * N_R^J / S_R^J + C_J^low where A_J^low = M_avg, B_J^low = Sh_avg^typ + R_avg, C_J^low = Sh_avg^1 − Sh_avg^typ. Taking T_J^low as T (the expected completion time): T = A_J^low * N_M^J / S_M^J + B_J^low * N_R^J / S_R^J + C_J^low
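The Makespan Theorem bounds that anchor these formulas translate directly into code; a small sketch (the MakespanBounds class name is ours):

```java
public class MakespanBounds {

    // Makespan Theorem: n tasks with average duration avg and maximum
    // duration max, run greedily on k slots, finish in at least n*avg/k ...
    public static double lower(int n, double avg, int k) {
        return n * avg / k;
    }

    // ... and at most (n - 1)*avg/k + max (worst case: the longest task
    // starts last, after the other n - 1 tasks have been spread over k slots).
    public static double upper(int n, double avg, double max, int k) {
        return (n - 1) * avg / k + max;
    }
}
```

Applying lower() with (N_M^J, M_avg, S_M^J) gives T_M^low, and likewise for the shuffle and reduce stages.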
  12. 12. In the algorithm, T is targeted as a lower bound of the job completion time. The algorithm sweeps through the entire range of map slot allocations and finds the corresponding values of reduce slots that are needed to complete the job within time T. Resource allocation algorithm Input: Job profile of J (N_M^J, N_R^J) ← number of map and reduce tasks of J (S_M, S_R) ← total number of map and reduce slots in the cluster T ← deadline by which the job must be completed Output: P ← set of plausible resource allocations (S_M^J, S_R^J) Algorithm: for S_M^J ← MIN(N_M^J, S_M) down to 1 do Solve the equation A_J^low * N_M^J / S_M^J + B_J^low * N_R^J / S_R^J = T − C_J^low for S_R^J if 0 < S_R^J ≤ S_R then P ← P ∪ (S_M^J, S_R^J) else // Job cannot be completed within deadline T with the allocated map slots Break out of the loop end if end for The complexity of the proposed algorithm is O(min(N_M^J, S_M)) and thus linear in the number of map slots.
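The sweep above can be sketched in Java as follows. This is an illustrative helper, not the authors' implementation: the profile coefficients A, B, C from slide 11 are passed in as plain doubles, and the solved S_R^J is rounded up to the smallest integer allocation that still meets the deadline.

```java
import java.util.ArrayList;
import java.util.List;

public class SloProvisioner {

    // Sweep map-slot allocations from MIN(nm, sm) down to 1; for each, solve
    //   a*nm/sjm + b*nr/sjr = t - c
    // for the reduce slots sjr, keeping only feasible pairs (sjm, sjr).
    public static List<int[]> allocate(int nm, int nr, int sm, int sr,
                                       double a, double b, double c, double t) {
        List<int[]> plans = new ArrayList<>();
        for (int sjm = Math.min(nm, sm); sjm >= 1; sjm--) {
            double remaining = t - c - a * nm / sjm; // time left for shuffle + reduce
            if (remaining <= 0) {
                break; // the map stage alone already exceeds the deadline
            }
            int sjr = (int) Math.ceil(b * nr / remaining); // smallest feasible allocation
            if (sjr <= sr) {
                plans.add(new int[]{sjm, sjr});
            } else {
                break; // job cannot meet T with this or fewer map slots
            }
        }
        return plans;
    }
}
```

As sjm decreases the map stage slows down, so the required sjr only grows; this is why the loop can break out early, giving the O(min(N_M^J, S_M)) complexity stated above.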
  13. 13. 3. PRIORITY ALGORITHM  Workflow Priority o Prioritizes entire workflows o Increases spending on workflows that are more important and reduces spending on less important workflows o Importance may be implied by proximity to deadline, current demand for the anticipated output, or whether the application is in a test or production phase.  Stage Priority o Prioritizes different stages of a single workflow o The system splits a budget according to user-defined weights o The budget is split within the workflow across the different stages o By spending more on stages where resources are more critical, the overall utility of the workflow may be increased
  14. 14. MATHEMATICAL MODEL  Workflow priority o Say we have n workflows with weight vector w = [w1, w2, ..., wn] o The total weight of the job is W = w1 + w2 + ... + wn o The budget for workflow i is bwi = bs * wi / W, where bs is the total budget of the job.  Stage priority o Say a workflow has m stages with weight vector sw = [sw1, sw2, ..., swm] o The total weight of the workflow is SW = sw1 + sw2 + ... + swm o The budget for stage i is bswi = bw * swi / SW, where bw is the total budget of the workflow.
  15. 15. ALGORITHM 1. Consider a job with n workflows, each consisting of m stages. 2. The user is asked to input the total budget, workflow priorities, and stage priorities. 3. Low priority has value 1 and high priority has value 0.5, so as to spend double on high priority. 4. Calculate the budget for each workflow: bwi = bs * wi / W 5. Use bwi to find the resource share for the workflow 6. Calculate the budget for each stage: bswi = bw * swi / SW 7. Use bswi to find the resource share for the stage 8. A high-priority workflow or stage is given more cost and time for execution, and thus high-priority tasks have a high spending rate, i.e., a high b/d ratio.
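Steps 4 and 6 are the same proportional split, applied first across workflows and then across the stages of one workflow. A minimal sketch (the PriorityBudget class name is ours):

```java
public class PriorityBudget {

    // Split a total budget across items in proportion to their weights:
    // b_i = total * w_i / W, where W is the sum of all weights
    // (slide 14: bwi = bs * wi / W, and likewise bswi = bw * swi / SW).
    public static double[] split(double totalBudget, double[] weights) {
        double sum = 0.0;
        for (double w : weights) {
            sum += w;
        }
        double[] budgets = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            budgets[i] = totalBudget * weights[i] / sum;
        }
        return budgets;
    }
}
```

Calling split() once with workflow weights, then again on each workflow's share with its stage weights, yields the per-stage budgets of step 6.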
  16. 16. SKEW MITIGATION  In addition, to support parallelism, partitions must be small enough that several partitions can be processed in parallel. To avoid record skew, select a partitioning function that keeps each partition roughly the same size.  On each node, we apply the map operation to a prefix of the records in each input file stored on that node.  As the map function produces records, the node records information about the intermediate data, such as how much larger or smaller it is than the input and the number of records generated. It also stores information about each intermediate key and the associated record's size.  It sends that metadata to the coordinator. The coordinator merges the metadata from each of the nodes to estimate the intermediate data size. It then uses this size, and the desired partition size, to compute the number of partitions.  Then, it performs a streaming merge-sort on the samples from each node. Once all the sampled data is sorted, partition boundaries are calculated based on the desired partition sizes. The result is a list of "boundary keys" that define the edges of each partition.
  17. 17. BOTTLENECK REMOVAL  A map-reduce system can simultaneously run multiple jobs competing for the node’s resources and traffic bandwidth.  These conflicts cause slowdown in the execution of tasks. The duration of each phase, and hence the duration of the job is determined by the slowest, or straggler task.  The slowdowns of individual tasks are highly correlated with overall job latencies.  However, significant task slowdowns tend to indicate bottlenecks in job execution as well.
  18. 18. MATHEMATICAL MODEL Bottleneck detection  Te_i is the expected execution time of task i.  Tr_i is the running time of task i.  Te_i > Tr_i means no bottleneck.  Tr_i − Te_i > t means a bottleneck is present, where t is a threshold derived from past data: if a task has been running for t more than its expected time, a bottleneck is detected. Bottleneck elimination  ni = number of idle nodes, na = number of active nodes, f = boost factor  To reduce the bottleneck, we distribute tasks such that total spending equals average spending, i.e., b/d.  Spending at an active node = b/d * (1 + (ni/na) * f)  Spending at an idle node = b/d * (1 − f)  E = na/(na+ni) * (b/d * (1 + (ni/na) * f)) + ni/(na+ni) * (b/d * (1 − f)) = b/(d*(na+ni)) * (na + ni*f + ni − ni*f) = b/(d*(na+ni)) * (na + ni) = b/d = average spending
  19. 19. ALGORITHM  Bottleneck avoidance Step 1: Compute task and node features 1. Run the task over the cloud 2. Collect the performance traces every 10 minutes and store the results in a file Step 2: Compute the slowdown factor 1. Compare the current job trace with already completed jobs 2. Calculate the slowdown factor, which is the ratio of a current job parameter to that of a similar job Step 3: Give the slowdown factor of each job to the scheduler 1. The scheduler schedules high-slowdown jobs first 2. The scheduler does not schedule high-slowdown jobs to congested hardware nodes  Bottleneck detection Step 1: Estimate the execution time of each job using historical data Step 2: Periodically compute the time for which the job has been running Step 3: Compare the expected execution time and the running time 1. If Te_i > Tr_i, no bottleneck. 2. Else if Tr_i − Te_i > t, a bottleneck has occurred
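The detection rule in Step 3 can be expressed as a small predicate (illustrative only; the class and parameter names are ours):

```java
public class BottleneckDetector {

    // Slide 18/19 rule: a task running faster than expected is never a
    // bottleneck; otherwise flag it only once the overrun exceeds the
    // threshold t learned from historical runs.
    public static boolean isBottleneck(double expected, double running, double threshold) {
        if (expected > running) {
            return false; // still within expectations
        }
        return running - expected > threshold;
    }
}
```

A task that is late but within the tolerance window is deliberately not flagged, which avoids reacting to ordinary run-to-run variance.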
  20. 20.  Bottleneck Elimination To reduce execution time, we can run an execution bottleneck elimination algorithm that schedules redundant copies of the remaining tasks across nodes which do not have other work to perform. Bottleneck elimination algorithm 1. idle ← GETIDLENODES(nodes) 2. active ← nodes − idle 3. ni ← SIZE(idle) 4. na ← SIZE(active) 5. for each node ∈ active: node.spending ← b/d * (1 + (ni/na) * f) 6. for each node ∈ idle: node.spending ← b/d * (1 − f) where f is a boost factor between 0 and 1 set by the user, b is the budget, and d is the duration
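The spending assignment in steps 5 and 6 can be sketched as below, which also lets one check the identity from slide 18 that the mean spending over all nodes stays at b/d (class and method names are ours):

```java
import java.util.Arrays;

public class BottleneckEliminator {

    // Spending rates from slide 20: active nodes are boosted by the
    // idle/active ratio times f; idle nodes (running redundant copies)
    // are discounted by f. Returns one rate per node, active nodes first.
    public static double[] spendingRates(int idle, int active,
                                         double budget, double duration, double f) {
        double base = budget / duration; // average spending b/d
        double activeRate = base * (1 + ((double) idle / active) * f);
        double idleRate = base * (1 - f);
        double[] rates = new double[idle + active];
        Arrays.fill(rates, 0, active, activeRate);
        Arrays.fill(rates, active, idle + active, idleRate);
        return rates;
    }

    public static double mean(double[] xs) {
        double s = 0.0;
        for (double x : xs) {
            s += x;
        }
        return s / xs.length;
    }
}
```

For any f, the boost on active nodes and the discount on idle nodes cancel exactly, so total spending never exceeds the budget rate b/d.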
  21. 21. DEADLOCK A deadlock may occur between mappers and reducers, with no progress in the job, when:  The initially available map/reduce slots were all allocated to mappers  Once a few mappers completed, reducers started occupying some of the slots  After a while, all slots were occupied by reducers  Since there were still mapper tasks not yet assigned any slot, the map phase never completed  The system entered a deadlock state where reducers occupy all available slots but are waiting for mappers to complete, while mappers cannot move forward because no slot is available. Deadlock prevention: Unlike existing MapReduce systems, which execute map and reduce tasks concurrently in waves, we can implement the MapReduce programming model in two phases of operation:  Phase 1: Map and shuffle The Reader stage reads records from an input disk and sends them to the Mapper stage, which applies the map function to each record. As the map function produces intermediate records, each record's key is hashed to determine the node to which it should be sent, and the record is placed in a per-destination buffer that is given to the sender when it is full.
  22. 22.  Phase 2: Sort and reduce In phase two, each partition must be sorted by key, and the reduce function must be applied to groups of records with the same key. Deadlock detection:  The deadlock detector periodically probes workers to see if they are waiting for a memory allocation request to complete.  If multiple probe cycles pass in which all workers are waiting for an allocation or are idle, the deadlock detector informs the memory allocator that a deadlock has occurred. Deadlock elimination:  Process termination: One or more processes involved in the deadlock may be aborted. We can choose to abort all processes involved in the deadlock; this ensures that the deadlock is resolved with certainty and speed.  Resource preemption: Resources allocated to various processes may be successively preempted and allocated to other processes until the deadlock is broken.
  23. 23. IMPLEMENTATION FRAMEWORK  Apache Hadoop is an open source implementation of the MapReduce programming model, supported by Yahoo and used by Google, Amazon, etc.  It also includes the underlying Hadoop Distributed File System (HDFS).  Hadoop has over 180 configuration parameters. Examples include the number of replicas of input data, the number of parallel map/reduce tasks to run, and the number of parallel connections for transferring data.  A Hadoop installation comes with a default set of values for all the parameters in its configuration.  Scheduling in Hadoop is performed by a master node.  Hadoop has a variety of schedulers. The original one schedules all jobs using a FIFO queue in the master. Another, Hadoop on Demand (HOD), creates private MapReduce clusters dynamically and manages them using the Torque batch scheduler.
  24. 24. CHALLENGES IN MAPREDUCE SIMULATIONS  The right level of abstraction.  Data layout aware.  Resource contention aware.  Heterogeneity modeling.  Resource heterogeneity is common in large clusters.  Input dependence.  Workload aware.  Verification.  Performance
  25. 25. Comparison of MapReduce Simulators
  Simulator | Based on | Language | GUI Support | Workload-aware | Resource-contention-aware
  MRPerf | Ns-2 | Java | Yes | Yes | Yes
  Cardona et al. | GridSim | C | No | Yes | No
  Mumak | Hadoop | C | No | Yes | No
  SimMR | From scratch | - | - | Yes | No
  HSim | From scratch | - | - | No | Yes
  MRSim | GridSim | Java | Yes | No | Yes
  SimMapReduce | GridSim | Java | Yes | No | Yes
  26. 26.  Prior simulators for evaluating schedulers are trace-driven and aware of other jobs in a workload, but they are not aware of resource contention, so task execution times may not be accurate. Our algorithm optimizes resource provisioning, so we require a resource-contention-aware simulator.  It is almost impractical to set up a very large cluster consisting of hundreds or thousands of nodes to measure the scalability of an algorithm, and Hadoop setup involves altering a great number of parameters that are crucial for achieving the best performance. An obvious solution to these problems is to use a simulator of the Hadoop environment: a simulator allows us to measure the scalability of MapReduce-based applications easily and quickly, and also to determine the effects of different Hadoop configurations on the behavior of MapReduce-based applications in terms of speed.
  27. 27.  MRPerf is implemented on top of ns-2, a packet-level network simulator, and its performance is much worse than that of other simulators. It could not generate accurate results for jobs with different types of algorithms or different cluster configurations.  No existing implementation of HSim is available, so using it would require starting from scratch.  Most current work in cloud computing is done on the CloudSim simulator, but since our problem entails the MapReduce model and CloudSim provides no implementation supporting MapReduce, we are not using it.  MRSim extends the SimJava discrete event engine to accurately simulate the Hadoop environment. Using SimJava, we simulate the interactions between different entities within the cluster. The GridSim package is also used for network simulation. MRSim is written in Java on top of SimJava.
  28. 28. MRSIM ARCHITECTURE
  29. 29.  MRSim simulates network topology and traffic using GridSim; it models the rest of the system entities using the SimJava discrete event engine. The system is designed using object-oriented models.  Each machine is part of the network topology model. Each machine can host a Job Tracker process and a Task Tracker process; however, there is only one Job Tracker per MapReduce cluster. Each Task Tracker model can launch several map and reduce tasks, up to the maximum allowed number in the configuration files.
  30. 30. WHAT IS SIMJAVA?  SimJava is a discrete event, process-oriented simulation package. It is an API that augments Java with building blocks for defining and running simulations.  Each system is considered to be a set of interacting processes, or entities as they are referred to in SimJava. These entities communicate with each other by passing events, and the simulation time progresses on the basis of these events.  Progress is recorded as trace messages and saved in a file.  As of version 2.0, SimJava has been augmented with considerable statistical and reporting support.
  31. 31. CONSTRUCTING A SIMULATION INVOLVES:  Coding the behavior of simulation entities, done by extending the sim_entity class and using the body() method  Adding instances of these entities to a sim_system object using sim_system.add(entity)  Linking entities' ports together using sim_system.link_ports()  Finally, setting the simulation in motion using sim_system.run()
  32. 32. GRIDSIM  Allows modelling and simulation of entities in parallel and distributed computing (PDC) systems: users, applications, resources, and resource brokers (schedulers), for the design and evaluation of scheduling algorithms.  Provides a comprehensive facility for creating different classes of heterogeneous resources that can be aggregated using resource brokers for solving compute- and data-intensive applications. A resource can be a single processor or a multi-processor with shared or distributed memory, managed by time- or space-shared schedulers. The processing nodes within a resource can be heterogeneous in terms of processing capability, configuration, and availability. The resource brokers use scheduling algorithms or policies for mapping jobs to resources to optimize system or user objectives, depending on their goals.
  33. 33. JACKSON MODEL The Jackson API contains a lot of functionality for reading and building JSON using Java. It has very powerful data binding capabilities and provides a framework to serialize custom Java objects to JSON strings and deserialize JSON strings back to Java objects.  JSON written with Jackson can contain embedded class information that helps in creating the complete object tree during deserialization.
  34. 34. JACKSON API //1. Convert Java object to JSON format ObjectMapper mapper = new ObjectMapper(); mapper.writeValue(new File("c:\\user.json"), user); //2. Convert JSON to Java object ObjectMapper mapper = new ObjectMapper(); User user = mapper.readValue(new File("c:\\user.json"), User.class);
  35. 35. JOB TRACKER LAYOUT
  36. 36.  The main component of the simulator is the Job Tracker, which controls generating map and reduce tasks, monitors when different phases complete, and produces the final results.  A map task is started by the Job Tracker; the following processes take place: • A Java VM is instantiated for the task. • Data is read from the local disk or requested remotely. • Map, sort, and spill operations are performed on the input data until all of it has been consumed. • Background file system mergers merge the output data to reduce the number of output files to one or a few files. • A message indicating the completion of the map task is returned to the Job Tracker.
  37. 37. DEMO – MRSIM
  38. 38. COMPARISON PARAMETERS  Number of map and reduce slots  CPU Usage  Hard-disk Utilization  Average Mapper Time  Average Reducer Time  Execution Time
  39. 39. JOB PROFILES Referred from "Resource Provisioning Framework for MapReduce Jobs with Performance Goals", Abhishek Verma, Ludmila Cherkasova, and Roy H. Campbell
  40. 40. TIME DURATION FOR DIFFERENT PHASES (three runs T1, T2, T3 per profile and algorithm)
  Profile | NoOfMap, NoOfReduce | Algorithm | T1 | T2 | T3
  Profile1 | 7,10 | SLO | 1398 | 1344 | 1357
  Profile1 | 7,10 | SIGN + PRIOR | 1209 | 1207 | 1217
  Profile2 | 7,10 | SLO | 1367 | 1368 | 1387
  Profile2 | 7,10 | SIGN + PRIOR | 1276 | 1256 | 1273
  Profile3 | 3,12 | SLO | 1397 | 1380 | 1363
  Profile3 | 3,12 | SIGN + PRIOR | 1245 | 1288 | 1253
  Profile4 | 12,16 | SLO | 1320 | 1402 | 1409
  Profile4 | 12,16 | SIGN + PRIOR | 1263 | 1285 | 1207
  Profile5 | 46,14 | SLO | 1316 | 1368 | 1353
  Profile5 | 46,14 | SIGN + PRIOR | 1208 | 1254 | 1256
  Profile6 | 12,2 | SLO | 1342 | 1376 | 1332
  Profile6 | 12,2 | SIGN + PRIOR | 1267 | 1265 | 1287
  Profile7 (job can't be completed) | 22,33 | SLO | 472 | 450 | 430
  Profile7 (job can't be completed) | 22,33 | SIGN + PRIOR | 0 | 0 | 0
  Profile8 | 16,12 | SLO | 1327 | 1396 | 1376
  Profile8 | 16,12 | SIGN + PRIOR | 1233 | 1265 | 1274
  41. 41. MEAN TIME OVERHEADS FOR VARIOUS PHASES
  SLO failed (job can't be completed within deadline) | 420
  SLO executed | 1334
  Signature not found | 1337
  Signature found | 937
  Priority | 331
  42. 42. COMPARISON OF BASE ALGORITHM VS PROPOSED ALGORITHM
  Profile | Mappers | Reducers | Algorithm | CPU Usage | HDD Utilization | Time | Avg Mapper Time | Avg Reducer Time
  Profile 1 | 60 | 1 | Base | 0.00001429 | 0.00105 | 1919 | 28.021 | 238.179
  Profile 1 | 60 | 1 | Proposed | 0.0000020 | 0.00403 | 2372 | 25.313 | 853.76
  Profile 2 | 7 | 10 | Base | 0.000001653 | 0.001834 | 5200 | 291.21 | 316.163
  Profile 2 | 7 | 10 | Proposed | 0.0002732 | 0.003917 | 4095 | 283.891 | 112.045
  Profile 3 | 7 | 10 | Base | 0.000003592 | 0.0031320 | 3044 | 314.459 | 84.322
  Profile 3 | 7 | 10 | Proposed | 0.00913784 | 0.01550 | 4108 | 281.432 | 114.249
  Profile 4 | 3 | 12 | Base | 0.0000023 | 0.03093 | 4259 | 1143.458 | 69.098
  Profile 4 | 3 | 12 | Proposed | 0.0008095 | 0.01197 | 4066 | 425.292 | 108.949
  43. 43. CONTINUED…
  Profile 5 | 12 | 16 | Base | 0.000015307 | 0.002802 | 5239 | 164.185 | 204.315
  Profile 5 | 12 | 16 | Proposed | 0.001846 | 0.022107 | 4240 | 286.6 | 124.45
  Profile 6 | 46 | 14 | Base | 0.000036771 | 0.0024045 | 4163 | 426.536 | 117.796
  Profile 6 | 46 | 14 | Proposed | 0.0010386 | 0.01082 | 3171 | 44.416 | 105.881
  Profile 7 | 12 | 2 | Base | 0.00021723 | 0.005321 | 3986 | 205.405 | 137.099
  Profile 7 | 12 | 2 | Proposed | 0.0003971 | 0.007538 | 2739 | 426.411 | 100.124
  Profile 8 | 16 | 12 | Base | 0.00010813 | 0.0028452 | 4136 | 426.987 | 75.338
  Profile 8 | 16 | 12 | Proposed | 0.00478452 | 0.0093604 | 2863 | 122.479 | 114.748
  44. 44. CPU UTILIZATION (bar chart comparing the base and proposed algorithms across job profiles; y-axis 0 to 0.01)
  45. 45. HARD-DISK UTILIZATION (bar chart comparing the base and proposed algorithms across job profiles; y-axis 0 to 0.035)
  46. 46. EXECUTION TIME (bar chart comparing the base and proposed algorithms across job profiles; y-axis 0 to 6000)
  47. 47. AVERAGE MAPPER TIME (bar chart comparing the base and proposed algorithms across job profiles; y-axis 0 to 1400)
  48. 48. AVERAGE REDUCER TIME (bar chart comparing the base and proposed algorithms across job profiles; y-axis 0 to 900)
49. RESULTS FOR JOB PROFILE 1
50. GRAPHS FOR PROFILE 1
51. RESULTS FOR JOB PROFILE 2
52. GRAPHICAL COMPARISON FOR PROFILE 2
53. TRACE FOR EXECUTION

INFO GUISimulator:114 - <init>- done
Initialising...
INFO HTopology:112 - initGridSim- Initializing GridSim package
Initialising...
INFO HSimulator:64 - initSimulator- creat new Result dir /home/hadoop/workspace/work/hadoop.simulator/results/26-27-Apr-2010 19:57:55
INFO HJobTracker:311 - createEntities- create topology
INFO HJobTracker:314 - createEntities- config.Heartbeat:1.0, read topology.getName:rack 0
INFO HJobTracker:318 - createEntities- init NetEnd from rack
INFO GUISimulator:389 - mnuSimStartActionPerformed- simulator has started simulator
INFO HSimulator:106 - startSimulator- Starting simulator version
INFO HSimulator:117 - startSimulator- trace level200
INFO HSimulator:120 - startSimulator- graph file: /home/hadoop/workspace/work/hadoop.simulator/results/26-27-Apr-2010 19:57:55/graph.sjg
INFO HSimulator:125 - startSimulator- going to call Sim_system.run()
Entities started.
Entity huser has no body().
INFO HJobTracker:129 - body- start entity
INFO SimoTreeCollector:94 - body- add rack {m1=m1}
INFO GUISimulator:394 - mnuSimStopActionPerformed- going to stop simulator
INFO HTopology:252 - stopSimulation- Stopping NetEnd Simulation
54. TRACE CONTINUED...

INFO HJobTracker:622 - stopSimulation- send end of simualtion 10.0
INFO InMemFSMergeThread:71 - body- m1-reduce-0-inMemFSMergeThread END_OF_SIMULATION 10.0
INFO CPU:148 - body- cpu_m1 END_OF_SIMULATION 10.0
INFO InMemFSMergeThread:71 - body- m1-reduce-0-inMemFSMergeThread END_OF_SIMULATION 10.0
INFO HDD:148 - body- hdd_m1 END_OF_SIMULATION 10.0
INFO InMemFSMergeThread:71 - body- m1-reduce-1-inMemFSMergeThread END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-reduce-0 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-map-1 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-map-2 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-map-0 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-reduce-1 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-map-3 END_OF_SIMULATION 10.0
INFO NetEnd:100 - body- m1 end simulation at time 10.0
INFO HTask:166 - body- m1-map-0 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-map-1 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-map-2 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-map-3 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-reduce-0 END_OF_SIMULATION 10.0
INFO HTask:166 - body- m1-reduce-1 END_OF_SIMULATION 10.0
INFO SimoTreeCollector:78 - body- simotree END_OF_SIMULATION 10.0
INFO InMemFSMergeThread:71 - body- m1-reduce-1-inMemFSMergeThread END_OF_SIMULATION 10.0
55. OUTPUT SNAPSHOTS FOR PROPOSED ALGORITHM
56. REFERENCES

[1] E. Bortnikov, A. Frank, E. Hillel, and S. Rao, "Predicting Execution Bottlenecks in Map-Reduce Clusters", in Proc. of the 4th USENIX Conference on Hot Topics in Cloud Computing, 2012.
[2] R. Buyya, S. K. Garg, and R. N. Calheiros, "SLA-Oriented Resource Provisioning for Cloud Computing: Challenges, Architecture, and Solutions", in International Conference on Cloud and Service Computing, 2011.
[3] S. Chaisiri, B.-S. Lee, and D. Niyato, "Optimization of Resource Provisioning Cost in Cloud Computing", in IEEE Transactions on Services Computing, Vol. 5, No. 2, April-June 2012.
[4] A. Verma, L. Cherkasova, and R. H. Campbell, "Resource Provisioning Framework for MapReduce Jobs with Performance Goals", in Middleware 2011, LNCS 7049, pp. 165-186, 2011.
[5] J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", Communications of the ACM, Jan 2008.
[6] Y. Hu, J. Wong, G. Iszlai, and M. Litoiu, "Resource Provisioning for Cloud Computing", in Proc. of the 2009 Conference of the Center for Advanced Studies on Collaborative Research, 2009.
[7] K. Kambatla, A. Pathak, and H. Pucha, "Towards Optimizing Hadoop Provisioning in the Cloud", in Proc. of the First Workshop on Hot Topics in Cloud Computing, 2009.
[8] S. O. Kuyoro, F. Ibikunle, and O. Awodele, "Cloud Computing Security Issues and Challenges", in International Journal of Computer Networks (IJCN), Vol. 3, Issue 5, 2011.

57. REFERENCES CONTINUED

[9] R. Lammel, "Google's MapReduce Programming Model - Revisited", in Science of Computer Programming, Oct 2007.
[10] R. P. Padhy, "Big Data Processing with Hadoop-MapReduce in Cloud Systems", in International Journal of Cloud Computing and Services Science, Vol. 2, Feb 2013.
[11] B. Palanisamy, A. Singh, L. Liu, and B. Langston, "Cura: A Cost-Optimized Model for MapReduce in a Cloud", in Proc. of the 27th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2013).
[12] A. Rasmussen, M. Conley, R. Kapoor, V. T. Lam, G. Porter, and A. Vahdat, "Themis: An I/O-Efficient MapReduce", Communications of the ACM, Oct 2012.
[13] V. K. Reddy, B. T. Rao, L. S. S. Reddy, and P. S. Kiran, "Research Issues in Cloud Computing", in Global Journal of Computer Science and Technology, Vol. 11, Jul 2011.
[14] T. Sandholm and K. Lai, "MapReduce Optimization Using Regulated Dynamic Prioritization", Social Computing Laboratory, Hewlett-Packard Laboratories, 2011.
[15] F. Tian and K. Chen, "Towards Optimal Resource Provisioning for Running MapReduce Programs in Public Clouds", in Proc. of the 4th IEEE International Conference on Cloud Computing, 2011.
[16] Hadoop, http://hadoop.apache.org.
[17] Amazon Elastic MapReduce, http://aws.amazon.com/elasticmapreduce/.
