SlideShare una empresa de Scribd logo
1 de 22
The Gain of Resource Delegation in Distributed Computing Environments Alexander Fölling, Christian Grimme, Joachim Lepping, andAlexander Papaspyrou 15th Workshop on Job Scheduling for Parallel Processing April 23, 2010 -Atlanta, GA, USA
Outline Motivation System Model Resource Delegation Policy Evaluation Setup Results Conclusion and Future Work 2 Alexander Fölling | April 23, 2010
Motivation Distributed computing  infrastructures (DCI) have reached production status More and more users  draw its computing resources from Grid and Cloud infrastructures Many DCIs are exhaustively used and produce significant revenue  Cloud-Infrastructures allow easy on-demand provisioning of resources (enlargement of local resource space) Infrastructure as a Service (IaaS) by virtualization technology Simple access and pricing model  The temporal extension of the local resource space allows more flexible scheduling decisions Locally, no  traditional parallel job scheduling problem with parallel machines (Pm, Rm, Qm - Model)  On-demand resource leasing may improve scheduling performance 3 Alexander Fölling | April 23, 2010
System Model 4 4 Alexander Fölling | April 23, 2010
System Model 5 5 Alexander Fölling | April 23, 2010
System Model 6 Workload- Analysis Workload- Analysis Forward Jobs Forward Jobs 6 Alexander Fölling | April 23, 2010
System Model 7 Workload- Analysis Workload- Analysis Forward Jobs Forward Jobs Remote Access Remote Access 7 Alexander Fölling | April 23, 2010
Properties of Resource Delegation Different from centralized scheduling  with multi-site execution No central scheduling component but independent sites Scheduler cedes full control to other schedulers when resource access is granted (for a certain period) Resource leasing enlarges the local resource space Scheduling decisions are exclusively made by local schedulers Resources might be used immediately or later during leasing period  Advanced scheduling strategies may support both local allocations under varying machine sizes planning of future resource requirements Each participant in a DCI is both resource consumer and resource provider  8 Alexander Fölling | April 23, 2010
Submission Triggered Resource Delegation Policy (ST-RDP) 9 Site 1 1 2 3 Queue Schedule Site 2 Schedule 9 Alexander Fölling | April 23, 2010
Submission Triggered Resource Delegation Policy(ST-RDP) 10 Site 1 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Queue Schedule Site 2 Schedule 10 Alexander Fölling | April 23, 2010
Submission Triggered Resource Delegation Policy (ST-RDP) 11 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability Queue Schedule Site 2 Schedule 11 Alexander Fölling | April 23, 2010
Submission Triggered Resource Delegation Policy (ST-RDP) 12 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability Queue Schedule Site 2 [Enough] Schedule 12 Alexander Fölling | April 23, 2010
Submission Triggered Resource Delegation Policy (ST-RDP) 13 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability [Not Enough] Queue Schedule Try toLend ResourceDeficiency Site 2 [Enough] ? Request 2 CPUs for 100 seconds Schedule 13 Alexander Fölling | April 23, 2010
Submission Triggered Resource Delegation Policy (ST-RDP) 14 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability [Not Enough] Queue Schedule Try toLend ResourceDeficiency Site 2 [Enough] ? Request 2 CPUs for 100 seconds [Request Denied] Schedule 14 Alexander Fölling | April 23, 2010
Submission Triggered Resource Delegation Policy (ST-RDP) 15 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability [Not Enough] Queue Schedule Try toLend ResourceDeficiency Site 2 [Enough] [Request   Accepted] Prioritize New Job Request 2 CPUs for 100 seconds [Request Denied] Schedule 15 Alexander Fölling | April 23, 2010
Submission Triggered Resource Delegation Policy (ST-RDP) 16 Site 1 Forward Job To LRMS 1 2 3 Check ResourceAvailability 4a [Not Enough] Queue Schedule Try toLend ResourceDeficiency Site 2 [Enough] [Request   Accepted] Prioritize New Job 4b [Request Denied] Schedule 16 Alexander Fölling | April 23, 2010
Evaluation Setup Input Data Real Workload Traces from Parallel Workloads Archive KTH, CTC, SDSC05 ~ 100 – 1600 CPUs, ~ 28000 – 74000 Jobs (first 11 months) Local Resource Management System EASY Backfilling Evaluation objectives for results Improvements in AWRT Reconfiguration behavior 17 Alexander Fölling | April 23, 2010
Results: ST-RDP Performance 18 Alexander Fölling | April 23, 2010
Results: Reconfiguration behavior KTH and CTC 11 month with ST-RDP CPUs CPUs 400 500 300 400 200 300 100 200 2 4 6 8 10 2 4 6 8 10 Time in month Time in month KTH CTC 19 Alexander Fölling | April 23, 2010
Conclusion 20 Proposed new concept for resource delegation in DCIs Parallel job scheduling problems under varying machine sizes The resource requirements can be flexibly negotiated among participants   Evaluation of a simpleresourcedelegationmethod Without need for further information exchange Robust in changing environments Results show significant benefits for the local scheduling (improvement in AWRT)  During operation, many resources are delegated among sites Alexander Fölling | April 23, 2010
Future Work Application to larger DCI environments  Considering additional location policies that decides which site to ask first for delegation Long term planning of resource leasing/delegation Not only single job decisions Decisions should be based on workload records (user behavior, submission patterns etc.) Eventually, make decision on predicted user behavior  Consider additional parameters like local queue/schedule status  21 Alexander Fölling | April 23, 2010
Thank You 22 Alexander Fölling alexander.foelling@udo.edu Christian Grimme christian.grimme@udo.edu Joachim Lepping joachim.lepping@udo.edu Robotics Research Institute Information Technology Section TU Dortmund University, Germany http://www.it.irf.de Alexander Papaspyrou alexander.papaspyrou@udo.edu

Más contenido relacionado

Destacado

O Brilll Dominant Behavior Notes
O Brilll Dominant Behavior NotesO Brilll Dominant Behavior Notes
O Brilll Dominant Behavior Notes
obpatrick
 
Book Exceprts Covers
Book Exceprts CoversBook Exceprts Covers
Book Exceprts Covers
carleenpalmer
 
Construction of mag
Construction of magConstruction of mag
Construction of mag
HanaEllis
 
Basketball dreams/Garcia
Basketball dreams/GarciaBasketball dreams/Garcia
Basketball dreams/Garcia
guesta97f9b
 
Malaysia Special Ev Set
Malaysia Special Ev SetMalaysia Special Ev Set
Malaysia Special Ev Set
Cicak
 
Stages of creating my poster
Stages of creating my posterStages of creating my poster
Stages of creating my poster
HanaEllis
 
Discussion about my poster
Discussion about my posterDiscussion about my poster
Discussion about my poster
HanaEllis
 

Destacado (20)

Künstliche Intelligenz - Chancen und Herausforderungen für die Rechtsberatung
Künstliche Intelligenz - Chancen und Herausforderungen für die RechtsberatungKünstliche Intelligenz - Chancen und Herausforderungen für die Rechtsberatung
Künstliche Intelligenz - Chancen und Herausforderungen für die Rechtsberatung
 
Künstliche Intelligenz einsetzen und Customer Experience verbessern
Künstliche Intelligenz einsetzen und Customer Experience verbessernKünstliche Intelligenz einsetzen und Customer Experience verbessern
Künstliche Intelligenz einsetzen und Customer Experience verbessern
 
O Brilll Dominant Behavior Notes
O Brilll Dominant Behavior NotesO Brilll Dominant Behavior Notes
O Brilll Dominant Behavior Notes
 
Book Exceprts Covers
Book Exceprts CoversBook Exceprts Covers
Book Exceprts Covers
 
Construction of mag
Construction of magConstruction of mag
Construction of mag
 
Basketball dreams/Garcia
Basketball dreams/GarciaBasketball dreams/Garcia
Basketball dreams/Garcia
 
Test-Driven Development
Test-Driven DevelopmentTest-Driven Development
Test-Driven Development
 
Transform: The value of your values, John Godfrey
Transform: The value of your values, John GodfreyTransform: The value of your values, John Godfrey
Transform: The value of your values, John Godfrey
 
Flex4 Component Lifecycle
Flex4 Component LifecycleFlex4 Component Lifecycle
Flex4 Component Lifecycle
 
Brand pie presentation_transform_conference
Brand pie presentation_transform_conferenceBrand pie presentation_transform_conference
Brand pie presentation_transform_conference
 
Wp7 study
Wp7 studyWp7 study
Wp7 study
 
Analyzing film magazines
Analyzing film magazinesAnalyzing film magazines
Analyzing film magazines
 
Malaysia Special Ev Set
Malaysia Special Ev SetMalaysia Special Ev Set
Malaysia Special Ev Set
 
Sourcefinance
SourcefinanceSourcefinance
Sourcefinance
 
Stages of creating my poster
Stages of creating my posterStages of creating my poster
Stages of creating my poster
 
Reputation in Oil, Gas and Mining 2014: Public perceptions
Reputation in Oil, Gas and Mining 2014: Public perceptions Reputation in Oil, Gas and Mining 2014: Public perceptions
Reputation in Oil, Gas and Mining 2014: Public perceptions
 
Bmi & the rabbit agency
Bmi & the rabbit agencyBmi & the rabbit agency
Bmi & the rabbit agency
 
Viburnix
ViburnixViburnix
Viburnix
 
Why is going paperless
Why is going paperlessWhy is going paperless
Why is going paperless
 
Discussion about my poster
Discussion about my posterDiscussion about my poster
Discussion about my poster
 

Similar a JSSPP 2010

Heuristics based multi queue job scheduling for cloud computing environment
Heuristics based multi queue job scheduling for cloud computing environmentHeuristics based multi queue job scheduling for cloud computing environment
Heuristics based multi queue job scheduling for cloud computing environment
eSAT Journals
 
A latency-aware max-min algorithm for resource allocation in cloud
A latency-aware max-min algorithm for resource  allocation in cloud A latency-aware max-min algorithm for resource  allocation in cloud
A latency-aware max-min algorithm for resource allocation in cloud
IJECEIAES
 
UC Ref Group Mar09
UC Ref Group Mar09UC Ref Group Mar09
UC Ref Group Mar09
UCUOM
 

Similar a JSSPP 2010 (20)

Mr. Scott Manson's presentation at QITCOM 2011
Mr. Scott Manson's presentation at QITCOM 2011Mr. Scott Manson's presentation at QITCOM 2011
Mr. Scott Manson's presentation at QITCOM 2011
 
A Novel Dynamic Priority Based Job Scheduling Approach for Cloud Environment
A Novel Dynamic Priority Based Job Scheduling Approach for Cloud EnvironmentA Novel Dynamic Priority Based Job Scheduling Approach for Cloud Environment
A Novel Dynamic Priority Based Job Scheduling Approach for Cloud Environment
 
Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:...
Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:...Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:...
Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:...
 
Heuristics based multi queue job scheduling for cloud computing environment
Heuristics based multi queue job scheduling for cloud computing environmentHeuristics based multi queue job scheduling for cloud computing environment
Heuristics based multi queue job scheduling for cloud computing environment
 
VIRTUAL MACHINE SCHEDULING IN CLOUD COMPUTING ENVIRONMENT
VIRTUAL MACHINE SCHEDULING IN CLOUD COMPUTING ENVIRONMENTVIRTUAL MACHINE SCHEDULING IN CLOUD COMPUTING ENVIRONMENT
VIRTUAL MACHINE SCHEDULING IN CLOUD COMPUTING ENVIRONMENT
 
REVIEW PAPER on Scheduling in Cloud Computing
REVIEW PAPER on Scheduling in Cloud ComputingREVIEW PAPER on Scheduling in Cloud Computing
REVIEW PAPER on Scheduling in Cloud Computing
 
Optimization of FCFS Based Resource Provisioning Algorithm for Cloud Computing
Optimization of FCFS Based Resource Provisioning Algorithm for Cloud ComputingOptimization of FCFS Based Resource Provisioning Algorithm for Cloud Computing
Optimization of FCFS Based Resource Provisioning Algorithm for Cloud Computing
 
ABB Roadshow - Westpoint, NE 6-27-13 Technology Planning
ABB Roadshow - Westpoint, NE 6-27-13 Technology PlanningABB Roadshow - Westpoint, NE 6-27-13 Technology Planning
ABB Roadshow - Westpoint, NE 6-27-13 Technology Planning
 
C017531925
C017531925C017531925
C017531925
 
Energy Efficient Heuristic Base Job Scheduling Algorithms in Cloud Computing
Energy Efficient Heuristic Base Job Scheduling Algorithms in Cloud ComputingEnergy Efficient Heuristic Base Job Scheduling Algorithms in Cloud Computing
Energy Efficient Heuristic Base Job Scheduling Algorithms in Cloud Computing
 
A 01
A 01A 01
A 01
 
Survey of streaming data warehouse update scheduling
Survey of streaming data warehouse update schedulingSurvey of streaming data warehouse update scheduling
Survey of streaming data warehouse update scheduling
 
A latency-aware max-min algorithm for resource allocation in cloud
A latency-aware max-min algorithm for resource  allocation in cloud A latency-aware max-min algorithm for resource  allocation in cloud
A latency-aware max-min algorithm for resource allocation in cloud
 
Closed-Loop Network Automation for Optimal Resource Allocation via Reinforcem...
Closed-Loop Network Automation for Optimal Resource Allocation via Reinforcem...Closed-Loop Network Automation for Optimal Resource Allocation via Reinforcem...
Closed-Loop Network Automation for Optimal Resource Allocation via Reinforcem...
 
Closed Loop Network Automation for Optimal Resource Allocation via Reinforcem...
Closed Loop Network Automation for Optimal Resource Allocation via Reinforcem...Closed Loop Network Automation for Optimal Resource Allocation via Reinforcem...
Closed Loop Network Automation for Optimal Resource Allocation via Reinforcem...
 
Resource Allocation for Task Using Fair Share Scheduling Algorithm
Resource Allocation for Task Using Fair Share Scheduling AlgorithmResource Allocation for Task Using Fair Share Scheduling Algorithm
Resource Allocation for Task Using Fair Share Scheduling Algorithm
 
UC Ref Group Mar09
UC Ref Group Mar09UC Ref Group Mar09
UC Ref Group Mar09
 
How Comcast Turns Big Data into Real Time Operational Insights: Winter Olympi...
How Comcast Turns Big Data into Real Time Operational Insights: Winter Olympi...How Comcast Turns Big Data into Real Time Operational Insights: Winter Olympi...
How Comcast Turns Big Data into Real Time Operational Insights: Winter Olympi...
 
B02120307013
B02120307013B02120307013
B02120307013
 
Neptune: Scheduling Suspendable Tasks for Unified Stream/Batch Applications
Neptune: Scheduling Suspendable Tasks for Unified Stream/Batch ApplicationsNeptune: Scheduling Suspendable Tasks for Unified Stream/Batch Applications
Neptune: Scheduling Suspendable Tasks for Unified Stream/Batch Applications
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

JSSPP 2010

  • 1. The Gain of Resource Delegation in Distributed Computing Environments Alexander Fölling, Christian Grimme, Joachim Lepping, andAlexander Papaspyrou 15th Workshop on Job Scheduling for Parallel Processing April 23, 2010 -Atlanta, GA, USA
  • 2. Outline Motivation System Model Resource Delegation Policy Evaluation Setup Results Conclusion and Future Work 2 Alexander Fölling | April 23, 2010
  • 3. Motivation Distributed computing infrastructures (DCI) have reached production status More and more users draw its computing resources from Grid and Cloud infrastructures Many DCIs are exhaustively used and produce significant revenue Cloud-Infrastructures allow easy on-demand provisioning of resources (enlargement of local resource space) Infrastructure as a Service (IaaS) by virtualization technology Simple access and pricing model The temporal extension of the local resource space allows more flexible scheduling decisions Locally, no traditional parallel job scheduling problem with parallel machines (Pm, Rm, Qm - Model) On-demand resource leasing may improve scheduling performance 3 Alexander Fölling | April 23, 2010
  • 4. System Model 4 4 Alexander Fölling | April 23, 2010
  • 5. System Model 5 5 Alexander Fölling | April 23, 2010
  • 6. System Model 6 Workload- Analysis Workload- Analysis Forward Jobs Forward Jobs 6 Alexander Fölling | April 23, 2010
  • 7. System Model 7 Workload- Analysis Workload- Analysis Forward Jobs Forward Jobs Remote Access Remote Access 7 Alexander Fölling | April 23, 2010
  • 8. Properties of Resource Delegation Different from centralized scheduling with multi-site execution No central scheduling component but independent sites Scheduler cedes full control to other schedulers when resource access is granted (for a certain period) Resource leasing enlarges the local resource space Scheduling decisions are exclusively made by local schedulers Resources might be used immediately or later during leasing period Advanced scheduling strategies may support both local allocations under varying machine sizes planning of future resource requirements Each participant in a DCI is both resource consumer and resource provider 8 Alexander Fölling | April 23, 2010
  • 9. Submission Triggered Resource Delegation Policy (ST-RDP) 9 Site 1 1 2 3 Queue Schedule Site 2 Schedule 9 Alexander Fölling | April 23, 2010
  • 10. Submission Triggered Resource Delegation Policy(ST-RDP) 10 Site 1 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Queue Schedule Site 2 Schedule 10 Alexander Fölling | April 23, 2010
  • 11. Submission Triggered Resource Delegation Policy (ST-RDP) 11 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability Queue Schedule Site 2 Schedule 11 Alexander Fölling | April 23, 2010
  • 12. Submission Triggered Resource Delegation Policy (ST-RDP) 12 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability Queue Schedule Site 2 [Enough] Schedule 12 Alexander Fölling | April 23, 2010
  • 13. Submission Triggered Resource Delegation Policy (ST-RDP) 13 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability [Not Enough] Queue Schedule Try toLend ResourceDeficiency Site 2 [Enough] ? Request 2 CPUs for 100 seconds Schedule 13 Alexander Fölling | April 23, 2010
  • 14. Submission Triggered Resource Delegation Policy (ST-RDP) 14 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability [Not Enough] Queue Schedule Try toLend ResourceDeficiency Site 2 [Enough] ? Request 2 CPUs for 100 seconds [Request Denied] Schedule 14 Alexander Fölling | April 23, 2010
  • 15. Submission Triggered Resource Delegation Policy (ST-RDP) 15 Site 1 2 CPUs idle 4 CPUs 100 Seconds Forward Job To LRMS 1 2 3 4 Check ResourceAvailability [Not Enough] Queue Schedule Try toLend ResourceDeficiency Site 2 [Enough] [Request Accepted] Prioritize New Job Request 2 CPUs for 100 seconds [Request Denied] Schedule 15 Alexander Fölling | April 23, 2010
  • 16. Submission Triggered Resource Delegation Policy (ST-RDP) 16 Site 1 Forward Job To LRMS 1 2 3 Check ResourceAvailability 4a [Not Enough] Queue Schedule Try toLend ResourceDeficiency Site 2 [Enough] [Request Accepted] Prioritize New Job 4b [Request Denied] Schedule 16 Alexander Fölling | April 23, 2010
  • 17. Evaluation Setup Input Data Real Workload Traces from Parallel Workloads Archive KTH, CTC, SDSC05 ~ 100 – 1600 CPUs, ~ 28000 – 74000 Jobs (first 11 months) Local Resource Management System EASY Backfilling Evaluation objectives for results Improvements in AWRT Reconfiguration behavior 17 Alexander Fölling | April 23, 2010
  • 18. Results: ST-RDP Performance 18 Alexander Fölling | April 23, 2010
  • 19. Results: Reconfiguration behavior KTH and CTC 11 month with ST-RDP CPUs CPUs 400 500 300 400 200 300 100 200 2 4 6 8 10 2 4 6 8 10 Time in month Time in month KTH CTC 19 Alexander Fölling | April 23, 2010
  • 20. Conclusion 20 Proposed new concept for resource delegation in DCIs Parallel job scheduling problems under varying machine sizes The resource requirements can be flexibly negotiated among participants Evaluation of a simpleresourcedelegationmethod Without need for further information exchange Robust in changing environments Results show significant benefits for the local scheduling (improvement in AWRT) During operation, many resources are delegated among sites Alexander Fölling | April 23, 2010
  • 21. Future Work Application to larger DCI environments Considering additional location policies that decides which site to ask first for delegation Long term planning of resource leasing/delegation Not only single job decisions Decisions should be based on workload records (user behavior, submission patterns etc.) Eventually, make decision on predicted user behavior Consider additional parameters like local queue/schedule status 21 Alexander Fölling | April 23, 2010
  • 22. Thank You 22 Alexander Fölling alexander.foelling@udo.edu Christian Grimme christian.grimme@udo.edu Joachim Lepping joachim.lepping@udo.edu Robotics Research Institute Information Technology Section TU Dortmund University, Germany http://www.it.irf.de Alexander Papaspyrou alexander.papaspyrou@udo.edu