Member of the Helmholtz Association
                                         ISSGC’09
                                        7th International
                                      Summer School on
                                        Grid Computing




                                        UNICORE day at ISSGC’09
                                      Presenters:
                                      Rebecca Breu, Bastian Demuth, Mathilde Romberg

                                      Jülich Supercomputing Centre (JSC)
                                        6 July 2009




                                                                           7 July 2009
ISSGC’09


  Agenda

   9:00 – 10:30  Principles of Job Submission and Execution Management
                 Set the scene
  11:00 – 12:30  UNICORE – Architecture and Components
                 Technical overview of how UNICORE works and how it is used
  14:00 – 15:30  UNICORE Basic Practical
                 Practical: submitting jobs with the command line client
  16:00 – 17:30  UNICORE Workflow Practical
                 Practical: submitting workflows with the graphical client
  18:00 – 19:00  UNICORE: An Application
                 Example applications using UNICORE

07/07/2009                                                               Slide 2



                                         Session 9:
                                         Principles of Job Submission and
                                         Execution Management
                                         Author/Presenter: Achim Streit, Mathilde Romberg
                                         Jülich Supercomputing Centre (JSC)
                                         6 July 2009




 Job Submission






    Jobs

    Job
             Some work to be executed
             Requires CPU and memory
             Possibly accesses additional resources,
             e.g., storage, devices, services

    Job scheduling
          Policy for assigning jobs to resources



                                                   Courtesy of Prof. Felix Wolf, RWTH Aachen



  Resources

  Compute
       Memory
       Central Processing Units
       Nodes
       Threads/Tasks
  Data
       Size
       Transfer rate
  Network
       Bandwidths
  ...




  How to Differentiate Compute Resources?

  By number of CPUs
        Single processor
        Multiprocessor

  Multiprocessor systems can be grouped into
         Shared memory
            Equal access time to memory from each processor
         Distributed memory
            Each CPU has its own memory and I/O
            Different address spaces
         Distributed shared memory
            Shared address space
            Access time depends on location of data in memory




   Multiprocessor Systems – Examples

SMP (Symmetric (shared-memory) MultiProcessors)
  IBM Power 4/5/6 node, multi-core chips (pictured: system at Jülich)

MPP (Massively Parallel Processor)
  IBM Blue Gene/P, Cray XT4 (pictured: systems at Jülich and Helsinki)

NUMA (Non-Uniform Memory Access)
  SGI Altix (pictured: system at Munich)

Cluster
  Mare Nostrum, IBM Power4/5/6 system (pictured: system at Barcelona)
  Tera-10, self-built cluster





Job Scheduling






    Job Scheduling

    Policy for assigning jobs to resources
           Inputs are
               Set of jobs with requirements
               Set of resources
    Criteria for assignment
           Fairness
           Efficiency
           Minimize response time (interactive users)
           and turnaround time (batch jobs)
           Maximize throughput
                                                Courtesy of Prof. Felix Wolf, RWTH Aachen
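
    The time-based criteria above can be made concrete with a small Python sketch. It assumes each finished job carries a submit, start and end timestamp; the FinishedJob record and the sample values are illustrative only, and response time is approximated as the wait until the job first gets a resource.

```python
from dataclasses import dataclass


@dataclass
class FinishedJob:
    submit: float  # time the job entered the system
    start: float   # time the job first received a resource
    end: float     # time the job completed


def response_time(job):
    """Waiting time until the job first runs (what interactive users feel)."""
    return job.start - job.submit


def turnaround_time(job):
    """Time from submission to completion (what batch users care about)."""
    return job.end - job.submit


def throughput(jobs):
    """Completed jobs per time unit over the observed window."""
    window = max(j.end for j in jobs) - min(j.submit for j in jobs)
    return len(jobs) / window


# Illustrative sample: three jobs run one after the other on one resource.
jobs = [FinishedJob(0, 0, 4), FinishedJob(1, 4, 6), FinishedJob(2, 6, 10)]
mean_turnaround = sum(turnaround_time(j) for j in jobs) / len(jobs)
```

    A scheduler that reorders the same jobs changes the individual response and turnaround times while the throughput over the window stays the same, which is why the criteria have to be weighed against each other.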



  Usage of Multiprocessor Systems

             Typically the user/job resource demands are greater than
             the available resources, so users/jobs compete
             Typically resource requirements differ from one user
             (or application) to the other
                 Large/small (in terms of number of processors)
                 Large/small (in terms of amount of memory)
                 Long/short (in terms of duration of resource usage)
             A form of resource management and job scheduling is
             required!
                 How to share the available resources among the
                 competing jobs?
                 When does a job start and which resources are
                 assigned?


  Resource Management & Job Scheduling – 1

  Time-sharing (or time-slicing)
             Several jobs share the same resource
             Jobs are executed quasi-simultaneously
                Resources are not exclusively assigned to jobs
             Resource usage of jobs is reduced to short time slices
             (some clock ticks of the processor)
                Jobs need more than a single time slice to complete
             Each job gets the resource assigned in a round-robin fashion
                New jobs start immediately
             Execution takes longer than on a dedicated resource
             Typically handled by the operating system
  Examples: SMP machines, your own Linux PC
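
  A minimal Python sketch of round-robin time-sharing on a single resource. It assumes all jobs are present at time 0 and that switching between jobs costs nothing; the job names and the slice length are made up for illustration.

```python
from collections import deque


def round_robin(work, time_slice):
    """Simulate time-sharing of one resource.

    work maps a job name to the time units it needs.
    Returns a dict mapping each job name to its completion time.
    """
    queue = deque(work.items())
    clock = 0
    finished = {}
    while queue:
        name, remaining = queue.popleft()
        used = min(time_slice, remaining)
        clock += used
        if remaining > used:
            # Not done yet: the job goes to the back of the line.
            queue.append((name, remaining - used))
        else:
            finished[name] = clock
    return finished


completion = round_robin({"A": 3, "B": 5, "C": 2}, time_slice=2)
```

  Job A needs only 3 time units but completes at time 7 here, which illustrates the point that execution takes longer than on a dedicated resource.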


 Resource Management & Job Scheduling – 2

 Space-sharing (or space-slicing)
      Resources are exclusively assigned to a job until it completes
      Jobs may have to wait until enough resources are free before they start
      Needs a separate resource management system (also known as
      batch system) and job scheduler

 Examples:
     MPP systems, clusters, etc.
     LoadLeveler, Torque + Maui, PBSPro, OpenCCS, SLURM, …

         Space-sharing based resource management and job scheduling is
         commonly used on clusters and other multiprocessor systems
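
 A minimal Python sketch of space-sharing with a strict first-come-first-served policy. It assumes all jobs are submitted at time 0, runtimes are known exactly, and no backfilling is done; real batch systems such as those listed above use far richer policies.

```python
import heapq


def fcfs_space_sharing(jobs, total_nodes):
    """Assign nodes exclusively to jobs in submission order.

    jobs is a list of (name, nodes, runtime) tuples.
    Returns a dict mapping each job name to its (start, end) times.
    """
    free = total_nodes
    clock = 0
    running = []  # min-heap of (end_time, nodes_held)
    schedule = {}
    for name, nodes, runtime in jobs:
        if nodes > total_nodes:
            raise ValueError(f"job {name} can never fit on this system")
        # Wait for running jobs to finish until enough nodes are free.
        while free < nodes:
            end, released = heapq.heappop(running)
            clock = max(clock, end)
            free += released
        schedule[name] = (clock, clock + runtime)
        heapq.heappush(running, (clock + runtime, nodes))
        free -= nodes
    return schedule


plan = fcfs_space_sharing(
    [("A", 4, 10), ("B", 4, 5), ("C", 6, 3), ("D", 2, 1)], total_nodes=8
)
```

 Job C has to wait until both A and B have released their nodes, and the small job D waits behind C even though nodes were idle earlier. Closing such gaps is what backfilling schedulers aim for.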

  Job Submission on Multiprocessor Systems
  Example – LoadLeveler
  IBM Tivoli Workload Scheduler LoadLeveler
        Available for AIX, Linux

  Basic LoadLeveler commands
      llsubmit               Submit a job
      llq                    Show queued and running jobs
      llcancel <job_id>      Delete a queued or running job
      llstatus               Display status information

  Job submission via job command file
      llsubmit <cmdfile>





  LoadLeveler cmd_file examples – 1

  IBM Blue Gene/P system @ Jülich – JUGENE

   # @ job_name = BGP-LoadL-Sample-1
   # @ comment = "BGP Job by Size"
   # @ error = $(job_name).$(jobid).out
   # @ output = $(job_name).$(jobid).out
   # @ environment = COPY_ALL;
   # @ wall_clock_limit = 00:20:00                        runtime/duration
   # @ notification = error
   # @ notify_user = a.streit@fz-juelich.de
   # @ job_type = bluegene
   # @ bg_size = 32                                       size of partition
   # @ queue
   /usr/local/bin/mpirun -exe `/bin/pwd`/wait_bgp.rts \
       -mode VN -np 48 -verbose 1 -args "-t 1"            executable, only mpirun!






  LoadLeveler cmd_file examples – 2

  IBM p690 eServer Cluster 1600 @ Jülich – JUMP

        #@ job_type = parallel
        #@ output = out.$(jobid).$(stepid)
        #@ error = err.$(jobid).$(stepid)
        #@ wall_clock_limit = 00:15:00                          runtime/duration
        #@ notify_user = a.streit@fz-juelich.de
        #@ node = 2                                             resource requirements
        #@ total_tasks = 64                                     resource requirements
        #@ data_limit = 1.5GB                                   resource requirements
        #@ queue
        myprogram                                               executable


             #@ node: number of nodes for the job
             #@ total_tasks: total number of tasks in the job
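
  The layout of such a job command file (keyword lines, then the queue statement, then the executable) can be sketched in Python. The helper render_cmd_file is hypothetical and not part of LoadLeveler; it only assembles text from the keywords shown on this slide and does not validate them.

```python
def render_cmd_file(keywords, executable):
    """Build the text of a job command file.

    keywords maps directive names to values, in the order they should appear.
    The queue statement and the executable line are appended at the end.
    """
    lines = [f"#@ {key} = {value}" for key, value in keywords.items()]
    lines.append("#@ queue")
    lines.append(executable)
    return "\n".join(lines) + "\n"


text = render_cmd_file(
    {
        "job_type": "parallel",
        "wall_clock_limit": "00:15:00",
        "node": 2,
        "total_tasks": 64,
    },
    "myprogram",
)
```

  Writing text to a file and passing that file to llsubmit would be the next step on a system where LoadLeveler is installed.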


  Job Submission on Multiprocessor Systems
  Example – Torque + Maui

  Torque is the resource manager
  Maui is the cluster scheduler
  Basic Maui commands
   msub                 Submit a new job
   showq                Display a detailed, prioritized list of
                        active and idle jobs
   canceljob            Cancel existing jobs
   showstart            Show the estimated start time of idle jobs
   showstats            Show detailed usage statistics for the users,
                        groups, and accounts the user has access to



  Job submission in Maui

  Via command line:
  msub -l nodes=32:ppn=2,pmem=1800mb,walltime=3600   myscript

             resource list (-l):
             32 nodes with 2 processors each
             1800 MB per task
             3600 seconds duration
             script file: myscript
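
  How the resource list breaks down can be shown with a short Python sketch. The parser below is illustrative only: it handles just the comma- and colon-separated pattern of the example above, not the full resource syntax accepted by the scheduler.

```python
def parse_resource_list(spec):
    """Split a resource list such as 'nodes=32:ppn=2,walltime=3600' into a dict."""
    resources = {}
    for item in spec.split(","):
        key, _, value = item.partition("=")
        if key == "nodes":
            # 'nodes' carries colon-separated per-node properties, e.g. ppn.
            count, _, rest = value.partition(":")
            resources["nodes"] = int(count)
            for prop in rest.split(":") if rest else []:
                pkey, _, pvalue = prop.partition("=")
                resources[pkey] = int(pvalue) if pvalue.isdigit() else pvalue
        else:
            resources[key] = value
    return resources


req = parse_resource_list("nodes=32:ppn=2,pmem=1800mb,walltime=3600")
total_processors = req["nodes"] * req["ppn"]
```

  For the example request this gives 32 nodes times 2 processors per node, i.e. 64 processors in total.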






             Lessons Learned

  Each job submission system is different
         Different commands for submission, status query,
         cancellation
         Different options, scheduling policies, …
  Even the same job submission system is configured differently
     on different multiprocessor systems
  Job requirements are specified differently
         Command-line parameters for the job submission command
         Separate job command file
  Different job requirements exist
         Nodes and tasks per node, total tasks, …


  Job submission and the Grid

             A higher, meta level with more abstraction is needed to
             describe the requirements of jobs in a Grid of
             heterogeneous systems
             A lot of proprietary solutions exist, each Grid middleware
             is using its own language, e.g.
                 AJO in UNICORE 5, ClassAds/JDL in Condor, JDL in
                 gLite, RSL in Globus Toolkit, xRSL in ARC/NorduGrid,
                 etc…
             And there is JSDL 1.0
                 Open Grid Forum (OGF) standard
                http://www.gridforum.org/documents/GFD.56.pdf




  JSDL – Introduction

  JSDL stands for Job Submission Description Language
       A language for describing the requirements of computational
       jobs for submission to Grids and other systems
       Can also be used between middleware systems or for
       submitting to any Grid middleware (for interoperability)

  JSDL does not define a submission interface or what the results
    of a submission look like or how resources are selected






  JSDL Document

  A JSDL document is an XML document, which may contain
       Generic (job) identification information
       Application description
       Resource requirements (main focus is computational jobs)
       Description of required data files

  A JSDL document is a template, which can be submitted multiple
     times and can be used to create multiple job instances
        No job instance specific attributes can be defined,
        e.g. start or end time




  JSDL – A Hello World Example

    <?xml version="1.0" encoding="UTF-8"?>
    <jsdl:JobDefinition
       xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
       xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
    <jsdl:JobDescription>
       <jsdl:Application>
          <jsdl-posix:POSIXApplication>
             <jsdl-posix:Executable>/bin/echo</jsdl-posix:Executable>
             <jsdl-posix:Argument>hello</jsdl-posix:Argument>
             <jsdl-posix:Argument>world</jsdl-posix:Argument>
             <jsdl-posix:Input>/dev/null</jsdl-posix:Input>
             <jsdl-posix:Output>std.out</jsdl-posix:Output>
          </jsdl-posix:POSIXApplication>
       </jsdl:Application>
    </jsdl:JobDescription>
    </jsdl:JobDefinition>
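
  Because a JSDL document is plain XML, it can be generated and read back with any XML library. The sketch below uses Python's standard xml.etree.ElementTree and the namespace URIs of JSDL 1.0 and its POSIX extension; the function name hello_world_jsdl is made up for illustration, and the output is not validated against the JSDL schema.

```python
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def hello_world_jsdl(executable, arguments, output):
    """Return a small JSDL job description as an XML string."""
    # Use the conventional prefixes instead of auto-generated ns0, ns1.
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)
    root = ET.Element(f"{{{JSDL}}}JobDefinition")
    description = ET.SubElement(root, f"{{{JSDL}}}JobDescription")
    application = ET.SubElement(description, f"{{{JSDL}}}Application")
    posix = ET.SubElement(application, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for argument in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = argument
    ET.SubElement(posix, f"{{{POSIX}}}Output").text = output
    return ET.tostring(root, encoding="unicode")


xml_text = hello_world_jsdl("/bin/echo", ["hello", "world"], "std.out")

# Reading the document back: collect the argument list.
doc = ET.fromstring(xml_text)
args = [a.text for a in doc.iter(f"{{{POSIX}}}Argument")]
```

  The same template string can be submitted several times; nothing in it identifies a particular job instance.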


  JSDL – Resource Requirements Description

   Supports simple descriptions of resource requirements
         NOT a comprehensive resource requirements language
         Can be extended with other elements for richer or more
         abstract descriptions
   Main target is compute jobs
         CPU, memory, file system/disk, operating system
         requirements
   Allows some flexibility for aggregate (total) requirements
         “I want 10 CPUs in total and each resource should have
         2 or more”
   Very basic support for network requirements
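
   The aggregate example ("10 CPUs in total, each resource with 2 or more") can be read as a simple predicate over a candidate allocation. The Python sketch below only illustrates that meaning; it is not JSDL syntax, and the function name and the list-based allocation format are assumptions.

```python
def satisfies(allocation, total_cpus, min_cpus_each):
    """Check a candidate allocation against an aggregate requirement.

    allocation lists the number of CPUs contributed by each selected resource.
    """
    return (
        sum(allocation) == total_cpus
        and all(cpus >= min_cpus_each for cpus in allocation)
    )


ok = satisfies([4, 4, 2], total_cpus=10, min_cpus_each=2)
too_small_share = satisfies([9, 1], total_cpus=10, min_cpus_each=2)
```

   The second allocation reaches the total but fails because one resource contributes fewer than 2 CPUs.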



  JSDL application extensions

   SPMD (single-program-multiple-data)
       http://www.ogf.org/documents/GFD.115.pdf
             Extends JSDL 1.0 for parallel applications (MPI, PVM, etc.)
             Allows the number of processors, processors per host, and
             threads per process to be specified along with the application

   HPC (high performance computing)
        http://www.ogf.org/documents/GFD.111.pdf
             Extends JSDL 1.0 for HPC applications running as
             operating system processes (e.g. username, environment,
             working directory can be specified)



             Lessons Learned

  JSDL is a standardized language to describe jobs to be submitted
    to Grid resources
       Not only the job itself (application, arguments, input, output,
       etc.), but also resource requirements (CPU, memory, etc.)
       Extensions for specific application domains (parallel
       programs, HPC applications) exist

  BUT: a JSDL document cannot be submitted directly to a Grid resource,
    i.e. to the resource management and job scheduling system of a
    cluster or multiprocessor system in a Grid





  Execution and Job Management – 1

  One of the essential functionalities and components in a Grid
    middleware
  Deals with
       Initiating/submitting, monitoring and managing jobs
       Handling and staging of all job data
       Coordinating and scheduling of multi-step jobs
  Examples:
       XNJS in UNICORE 6 (see sessions 10-12 today)
       WS-GRAM in GT4 (see sessions 19-21 on Thursday)
       WMS in gLite (see sessions 24-26 on Friday)
       ARC Client in NorduGrid/ARC (see sessions 29-30 on Sat.)


  Execution and Job Management – 2

  Initiating/submitting, monitoring and managing jobs
          Translates the Grid job into a specific job (application details,
          resources, etc.) for the target system
          Submits the job to the resource management system using
          its proprietary way of job submission
          Frequently polls the job status (waiting/queued,
          running/executing, failed, aborted, paused, finished, etc.)
          from the resource management system
          Provides “access” to the job, its status and data during its
          runtime and after its (successful or unsuccessful)
          completion
          If the resource management system is unavailable or
          unreachable at job submission time, the job is cached and
          handed over to it later
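
  The translate-and-poll steps above can be sketched in Python. The generic job dictionary and all function names are hypothetical and do not show how XNJS or any other middleware is implemented; the two translators only reuse the msub resource list and the LoadLeveler keywords from the earlier examples, and the status source is a stand-in for a real resource management system.

```python
import time


def to_msub_command(job):
    """Render the generic job as an msub argument list."""
    resources = (
        f"nodes={job['nodes']}:ppn={job['tasks_per_node']},"
        f"walltime={job['walltime_s']}"
    )
    return ["msub", "-l", resources, job["script"]]


def to_loadleveler_file(job):
    """Render the same generic job as LoadLeveler command file text."""
    hours, rest = divmod(job["walltime_s"], 3600)
    minutes, seconds = divmod(rest, 60)
    lines = [
        "#@ job_type = parallel",
        f"#@ wall_clock_limit = {hours:02d}:{minutes:02d}:{seconds:02d}",
        f"#@ node = {job['nodes']}",
        f"#@ total_tasks = {job['nodes'] * job['tasks_per_node']}",
        "#@ queue",
        job["script"],
    ]
    return "\n".join(lines) + "\n"


def poll_until_done(get_status, interval_s=0.0,
                    final=("finished", "failed", "aborted")):
    """Ask for the job status repeatedly until a final state is reported."""
    history = []
    while True:
        status = get_status()
        history.append(status)
        if status in final:
            return history
        time.sleep(interval_s)


job = {"nodes": 2, "tasks_per_node": 32, "walltime_s": 900, "script": "myprogram"}
msub_args = to_msub_command(job)
ll_text = to_loadleveler_file(job)

# Stand-in for a real batch system: a fixed sequence of status answers.
states = iter(["queued", "queued", "running", "finished"])
history = poll_until_done(lambda: next(states))
```

  One generic description yields two different system-specific submissions, which is the translation step the execution management component performs for the user.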


  Execution and Job Management – 3

  Handling and staging of all job data, incl. job directory and persistency
        Creates, manages, destroys the job directory
        All data submitted with the job as input is stored in the job
        directory
        Data is staged in from remote data resources/archives
        All data generated by the job is preserved and/or staged after the
        successful completion of the job
  Coordinating and scheduling of multi-step jobs
        If a job consists of more than one step (a workflow), the required
        resources are orchestrated
        Manages the proper initiation of the workflow execution
        The execution of the workflow is controlled and monitored
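
  The life cycle of a job directory (create, stage in, execute, stage out, destroy) can be sketched in Python with a temporary directory. This is a local illustration under simplifying assumptions: inputs are small text files passed in memory, and the "job" is a Python callable rather than a submission to a batch system. All names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_job_directory(inputs, step, keep):
    """Run one job step inside its own job directory.

    inputs maps file names to text content that is staged in.
    step is a callable that receives the job directory path.
    keep lists the output file names to stage out before cleanup.
    """
    job_dir = Path(tempfile.mkdtemp(prefix="job-"))  # create
    try:
        for name, content in inputs.items():         # stage in
            (job_dir / name).write_text(content)
        step(job_dir)                                 # execute
        return {name: (job_dir / name).read_text() for name in keep}  # stage out
    finally:
        shutil.rmtree(job_dir)                        # destroy


def shout(job_dir):
    """Example step: write an upper-case copy of in.txt to out.txt."""
    text = (job_dir / "in.txt").read_text()
    (job_dir / "out.txt").write_text(text.upper())


results = run_in_job_directory({"in.txt": "hello"}, shout, keep=["out.txt"])
```

  The directory is removed even if the step fails, so only data that was explicitly staged out survives the job.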



             Lessons Learned

  Execution and job management is needed
  A meta-layer on top of the Grid resources is needed
       to provide a uniform way of accessing the Grid
       and to provide an intuitive, secure and easy to
       use interface for the user








Introduction to UNICORE
(from 30,000 ft)




more in sessions 10-12, today


  (Short) History Lesson

             UNiform Interface to COmputing REsources
                  seamless, secure, and intuitive
             Initial development started in two German projects funded by the
             German ministry of education and research (BMBF)
                  08/1997 – 12/1999: UNICORE project
                  Results: well defined security architecture with X.509 certificates,
                   intuitive GUI, central job supervisor based on Codine
                   (predecessor of SGE) from Genias
                1/2000 – 12/2002: UNICORE Plus project
                  Results: implementation enhancements (e.g. replacement of
                    Codine by custom NJS), extended job control (workflows),
                    application specific interfaces (plugins)
             Continuous development since 2002 in several EU projects
             Open Source community development since Summer 2004

  Projects

  More than a decade of German and European research & development
  and infrastructure projects

  Timeline figure (1999 – 2011), projects shown from earliest to latest:
    UNICORE, UNICORE Plus, EUROGRID, GRIP, GRIDSTART, OpenMolGRID,
    UniGrids, VIOLA, DEISA, DEISA2, NextGRID, CoreGRID, D-Grid IP,
    D-Grid IP 2, EGEE-II, OMII-Europe, A-WARE, eDEISA, Chemomentum,
    PHOSPHORUS, D-MON, PRACE, SmartLM, ETICS2, WisNetGrid

  And many others


  UNICORE – Grid driving HPC

Used in
     DEISA (European Distributed Supercomputing Infrastructure)
     National German Supercomputing Center NIC
     Gauss Center for Supercomputing (Alliance of the three
     German HPC centers)
     PRACE (European PetaFlop HPC Infrastructure) – starting-up
     But also in non-HPC-focused infrastructures (e.g. D-Grid)
Taking up major
  requirements from, e.g.,
     HPC users
     HPC user support teams
     HPC operations teams


  UNICORE – www.unicore.eu


  Open source (BSD license)
        Open developer community on SourceForge
        Contributing your own developments is easily possible
  Design principles
        Standards: OGSA-conform, WS-RF compliant
        Open, extensible, interoperable
        End-to-End, seamless, secure and intuitive
        Security: X.509, proxy and VO support
        Workflow and application support directly integrated
        Variety of clients: graphical, commandline, portal, API, etc.
        Quick and simple installation and configuration
        Support for many operating and batch systems
        100% Java 5




UNICORE in use
some examples





                     Usage in D-Grid

Core D-Grid sites committing parts of
  their existing resources to D-Grid
      Approx. 700 CPUs
      Approx. 1 PByte of storage
      UNICORE is installed and used
Additional sites received extra money
  from the BMBF for buying compute
  clusters and data storage
      Approx. 2000 CPUs
      Approx. 2 PByte of storage
      UNICORE (as well as Globus and gLite) is installed
      as soon as systems are in place

                     DEISA: Distributed European Infrastructure
                     for Supercomputing Applications
Consortium of leading national HPC centers in Europe
Deploy and operate a persistent, production quality,
  distributed, heterogeneous HPC environment (www.deisa.eu)

UNICORE as Grid Middleware
      On top of DEISA’s core services:
         Dedicated network
         Shared file system
         Common production environment at all sites
      Used e.g. for workflow applications

Sites: IDRIS – CNRS (Paris, France), FZJ (Jülich, Germany), RZG (Garching, Germany),
       CINECA (Bologna, Italy), EPCC (Edinburgh, UK), CSC (Helsinki, Finland),
       SARA (Amsterdam, NL), HLRS (Stuttgart, Germany), BSC (Barcelona, Spain),
       LRZ (Munich, Germany), ECMWF (Reading, UK)

more in session 33, Monday 9:00 – 10:30


 Interoperability and Usability of Grid Infrastructures
Focus on providing common interfaces and integration of major Grid
  software infrastructures
      OGSA-DAI, VOMS, GridSphere, OGSA-BES, OGSA-RUS
      UNICORE, gLite, Globus Toolkit, CROWN
Apply interoperability components in application use-cases







Grid Services based Environment to enable Innovative Research

Provide an integrated Grid solution for workflow-centric, complex
  applications with a focus on data, semantics and knowledge
      Provide decision support services for risk assessment, toxicity
      prediction, and drug design
      End user focus
         ease of use
         domain specific tools
         “hidden Grid”
      Based on UNICORE 6




more in sessions 12-13, this afternoon


             Commercial usage at




             Slide courtesy of Alfred Geiger, T-Systems SfR



             Lessons Learned

             UNICORE has a strong HPC background but is not limited
             to HPC use cases; it can be used in any Grid
             UNICORE is OGSA-conform and WS-RF compliant
             UNICORE is open, extensible and interoperable
             UNICORE is open source and coded in Java
             UNICORE is used in EU and national projects,
             European e-infrastructures, National Grid Initiatives (NGI),
             commercial environments, etc.
             Documentation, tutorials, mailing lists, community links,
             software, source code, and more:
                 http://www.unicore.eu
07/07/2009                                                           Slide 44

Más contenido relacionado

Similar a Session9part1

CS403: Operating System : Lec 7 OS Properties.pptx
CS403: Operating System : Lec 7 OS Properties.pptxCS403: Operating System : Lec 7 OS Properties.pptx
CS403: Operating System : Lec 7 OS Properties.pptxAsst.prof M.Gokilavani
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 
Hardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmpHardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmpeSAT Publishing House
 
Forecasting database performance
Forecasting database performanceForecasting database performance
Forecasting database performanceShenglin Du
 
A REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGA REVIEW ON PARALLEL COMPUTING
A REVIEW ON PARALLEL COMPUTINGAmy Roman
 
OSG(a)i: because AI needs a runtime - Tim Verbelen (imec)
OSG(a)i: because AI needs a runtime - Tim Verbelen (imec)OSG(a)i: because AI needs a runtime - Tim Verbelen (imec)
OSG(a)i: because AI needs a runtime - Tim Verbelen (imec)mfrancis
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrationsinside-BigData.com
 
Cybersecurity and fraud detection at ING Bank using Presto & Alluxio on S3
Cybersecurity and fraud detection at ING Bank using Presto & Alluxio on S3Cybersecurity and fraud detection at ING Bank using Presto & Alluxio on S3
Cybersecurity and fraud detection at ING Bank using Presto & Alluxio on S3Alluxio, Inc.
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computingijujournal
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computingijujournal
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computingijujournal
 
A Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud ComputingA Review on Scheduling in Cloud Computing
A Review on Scheduling in Cloud Computingijujournal
 
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...Sangwook Kim
 
Module 3-cpu-scheduling
Module 3-cpu-schedulingModule 3-cpu-scheduling
Module 3-cpu-schedulingHesham Elmasry
 
“Programação paralela híbrida com MPI e OpenMP – uma abordagem prática”. Edua...
“Programação paralela híbrida com MPI e OpenMP – uma abordagem prática”. Edua...“Programação paralela híbrida com MPI e OpenMP – uma abordagem prática”. Edua...
“Programação paralela híbrida com MPI e OpenMP – uma abordagem prática”. Edua...lccausp
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLNordic APIs
 

Similar a Session9part1 (20)

CS403: Operating System : Lec 7 OS Properties.pptx
CS403: Operating System : Lec 7 OS Properties.pptxCS403: Operating System : Lec 7 OS Properties.pptx
CS403: Operating System : Lec 7 OS Properties.pptx
 
International Journal of Engineering and Science Invention (IJESI)



Session9part1

  • 1. Member of the Helmholtz Association ISSGC’09 7th International Summer School on Grid Computing UNICORE day at ISSGC’09 Presenters: Rebecca Breu, Bastian Demuth, Mathilde Romberg Jülich Supercomputing Centre (JSC) 6 July 2009 7 July 2009
  • 2. ISSGC’09 Agenda 9:00 – 10:30 Principles of Job Submission and Execution Management Set the scene 11:00 – 12:30 UNICORE – Architecture and Components Technical overview on how UNICORE works and how it is used 14:00 – 15:30 UNICORE Basic Practical Practical: submitting jobs with the command line client 16:00 – 17:30 UNICORE Workflow Practical Practical: submitting workflows with the graphical client 18:00 – 19:00 UNICORE: An Application Example applications using UNICORE 07/07/2009 Slide 2
  • 3. Member of the Helmholtz Association ISSGC’09 7th International Summer School on Grid Computing Session 9: Principles of Job Submission and Execution Management Author/Presenter: Achim Streit, Mathilde Romberg Jülich Supercomputing Centre (JSC) 6 July 2009
  • 5. ISSGC’09 Jobs Job Some work to be executed Requires CPU and memory Possibly accesses additional resources, e.g., storage, devices, services Job scheduling Policy for assigning jobs to resources Courtesy of Prof. Felix Wolf, RWTH Aachen 07/07/2009 Slide 5
  • 6. ISSGC’09 Resources Compute: memory, central processing units, nodes, threads/tasks Data: size, transfer rate Network: bandwidths, ... 07/07/2009 Slide 6
  • 7. ISSGC’09 How to Differentiate Compute Resources? by number of CPUs Single processor Multi processor Multiprocessor systems can be grouped into Shared memory Equal access time to memory from each processor Distributed memory Each CPU has its own memory and I/O Different address spaces Distributed shared memory Shared address space Access time depends on location of data in memory 07/07/2009 Slide 7
  • 8. ISSGC’09 Multiprocessor Systems – Examples SMP (symmetric shared-memory multiprocessors): IBM Power 4/5/6 node, multi-core chips (Jülich) MPP (massively parallel processors): IBM Blue Gene/P (Jülich), Cray XT4 (Helsinki) NUMA (non-uniform memory access): SGI Altix (Munich) Cluster: Mare Nostrum (Barcelona), IBM Power4/5/6 system, Tera-10, self-built cluster 07/07/2009 Slide 8
  • 10. ISSGC’09 Job Scheduling Policy for assigning jobs to resources Input are Set of jobs with requirements Set of resources Criteria for assignment Fairness Efficiency Minimize response time (interactive users) and turnaround time (batch jobs) Maximize throughput Courtesy of Prof. Felix Wolf, RWTH Aachen 07/07/2009 Slide 10
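The efficiency criteria on this slide can be made concrete with a small calculation. The sketch below is illustrative only: the `FinishedJob` record and the helper names are hypothetical, and waiting time is used as a rough stand-in for the response time an interactive user would experience.

```python
from dataclasses import dataclass


@dataclass
class FinishedJob:
    """Hypothetical record of a completed job; all times in seconds."""
    name: str
    submit: float  # when the job was submitted
    start: float   # when the job started running
    end: float     # when the job finished


def turnaround_time(job):
    """Time from submission to completion (the batch-job criterion)."""
    return job.end - job.submit


def wait_time(job):
    """Time spent waiting before the job started (assumed proxy for response time)."""
    return job.start - job.submit


def throughput(jobs):
    """Completed jobs per second over the observed time span."""
    span = max(j.end for j in jobs) - min(j.submit for j in jobs)
    return len(jobs) / span


jobs = [FinishedJob("a", 0, 0, 10), FinishedJob("b", 0, 10, 30)]
mean_turnaround = sum(turnaround_time(j) for j in jobs) / len(jobs)
```

A scheduling policy can then be compared against another by running both on the same job set and comparing these numbers.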
  • 11. ISSGC’09 Usage of Multiprocessor Systems Typically the user/job resource demands are greater than the available resources, so users/jobs compete Typically resource requirements differ from one user (or application) to another: Large/small (in terms of number of processors) Large/small (in terms of amount of memory) Long/short (in terms of duration of resource usage) A form of resource management and job scheduling is required! How to share the available resources among the competing jobs? When does a job start and which resources are assigned? 07/07/2009 Slide 11
  • 12. ISSGC’09 Resource Management & Job Scheduling – 1 Time-sharing (or time-slicing) Several jobs share the same resource Jobs are executed quasi-simultaneously Resources are not exclusively assigned to jobs Resource usage of jobs is reduced to short time slices (some clock ticks of the processor) Jobs need more than a single time slice to complete Each job gets the resource assigned in a round-robin fashion New jobs start immediately Execution time takes longer than on a dedicated resource Typically handled by the operating system Examples: SMP machines, your own Linux PC 07/07/2009 Slide 12
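The round-robin behaviour described on this slide can be sketched as a tiny simulation. This is a simplified model under stated assumptions: a single resource, all jobs present at time 0, no context-switch cost, and a hypothetical `round_robin` helper that is not taken from any real operating system.

```python
from collections import deque


def round_robin(jobs, time_slice):
    """Simulate time-sharing of one resource.

    jobs: dict mapping job name -> required processing time.
    Returns a dict mapping job name -> completion time.
    """
    queue = deque(jobs.items())
    clock = 0
    finished = {}
    while queue:
        name, remaining = queue.popleft()
        used = min(time_slice, remaining)  # a job never uses more than it needs
        clock += used
        remaining -= used
        if remaining > 0:
            queue.append((name, remaining))  # back to the end of the queue
        else:
            finished[name] = clock
    return finished


completion = round_robin({"a": 3, "b": 5}, time_slice=2)
```

In this run job `a` needs 3 time units but completes at time 5, which illustrates why execution takes longer than on a dedicated resource.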
  • 13. ISSGC’09 Resource Management & Job Scheduling – 2 Space-sharing (or space-slicing) Resources are exclusively assigned to a job until it completes Jobs may have to wait for enough free resources until their start Needs a separate resource management system (also known as batch system) and job scheduler Examples: MPP systems, clusters, etc. LoadLeveler, Torque + Maui, PBSPro, OpenCCS, SLURM, … Space-sharing based resource management and job scheduling is commonly used on clusters and other multiprocessor systems 07/07/2009 Slide 13
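A minimal sketch of space-sharing follows, assuming a strict first-come-first-served policy with no backfilling and all jobs submitted at time 0. Real batch systems such as those named above use far richer policies; the function name and data layout here are hypothetical.

```python
import heapq


def fcfs_space_sharing(total_nodes, jobs):
    """Assign nodes exclusively to jobs in submission order.

    jobs: list of (name, nodes, runtime) tuples, all submitted at time 0.
    Returns a dict mapping job name -> (start_time, end_time).
    """
    free = total_nodes
    clock = 0
    running = []  # min-heap of (end_time, nodes_held)
    schedule = {}
    for name, nodes, runtime in jobs:
        if nodes > total_nodes:
            raise ValueError(f"job {name} can never fit on {total_nodes} nodes")
        # Wait until enough nodes have been released by earlier jobs.
        while free < nodes:
            end_time, released = heapq.heappop(running)
            clock = max(clock, end_time)
            free += released
        schedule[name] = (clock, clock + runtime)
        heapq.heappush(running, (clock + runtime, nodes))
        free -= nodes
    return schedule


plan = fcfs_space_sharing(4, [("a", 2, 10), ("b", 2, 5), ("c", 4, 3)])
```

Job `c` needs the whole machine, so it waits until both `a` and `b` have finished, which is the waiting behaviour the slide describes.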
  • 14. ISSGC’09 Job Submission on Multiprocessor Systems Example – LoadLeveler IBM Tivoli Workload Scheduler LoadLeveler Available for AIX, Linux Basic LoadLeveler commands llsubmit Submit a job llq Show queued and running jobs llcancel <job_id> Delete a queued or running job llstatus Displays status information Job submission via job command file llsubmit <cmdfile> 07/07/2009 Slide 14
  • 15. ISSGC’09 LoadLeveler cmd_file examples – 1: IBM Blue Gene/P system @ Jülich – JUGENE
      # @ job_name = BGP-LoadL-Sample-1
      # @ comment = "BGP Job by Size"
      # @ error = $(job_name).$(jobid).out
      # @ output = $(job_name).$(jobid).out
      # @ environment = COPY_ALL;
      # @ wall_clock_limit = 00:20:00
      # @ notification = error
      # @ notify_user = a.streit@fz-juelich.de
      # @ job_type = bluegene
      # @ bg_size = 32
      # @ queue
      /usr/local/bin/mpirun -exe `/bin/pwd`/wait_bgp.rts -mode VN -np 48 -verbose 1 -args "-t 1"
    Annotations: wall_clock_limit is the runtime/duration; bg_size is the size of the partition; the executable line contains only mpirun 07/07/2009 Slide 15
  • 16. ISSGC’09 LoadLeveler cmd_file examples – 2: IBM p690 eServer Cluster 1600 @ Jülich – JUMP
      #@ job_type = parallel
      #@ output = out.$(jobid).$(stepid)
      #@ error = err.$(jobid).$(stepid)
      #@ wall_clock_limit = 00:15:00
      #@ notify_user = a.streit@fz-juelich.de
      #@ node = 2
      #@ total_tasks = 64
      #@ data_limit = 1.5GB
      #@ queue
      myprogram
    Annotations: wall_clock_limit is the runtime/duration; node, total_tasks and data_limit are the resource requirements; myprogram is the executable. node: number of nodes for the job. total_tasks: number of total tasks in the job 07/07/2009 Slide 16
  • 17. ISSGC’09 Job Submission on Multiprocessor Systems Example – Torque + Maui Torque is the resource manager Maui is the cluster scheduler Basic Maui commands: msub: submits a new job; showq: displays a detailed and prioritized list of active and idle jobs; canceljob: cancels existing jobs; showstart: shows the estimated start time of idle jobs; showstats: shows detailed usage statistics for the users, groups, and accounts the user has access to 07/07/2009 Slide 17
  • 18. ISSGC’09 Job submission in Maui Via command line: msub -l nodes=32:ppn=2,pmem=1800mb,walltime=3600 myscript Resource list (-l): 32 nodes with 2 processors each, 1800 MB per task, 3600 seconds duration; myscript is the script file 07/07/2009 Slide 18
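To make the structure of the resource list explicit, the sketch below splits the string from this slide into its parts. It is a toy parser written for this one example, not an implementation of the full resource syntax accepted by Torque or Maui; `parse_resource_list` is a hypothetical helper.

```python
def parse_resource_list(spec):
    """Split a resource list such as 'nodes=32:ppn=2,pmem=1800mb,walltime=3600'.

    Only the shape shown on the slide is handled: comma-separated key=value
    pairs, where the 'nodes' value may carry colon-separated sub-properties.
    """
    resources = {}
    for item in spec.split(","):
        key, _, value = item.partition("=")
        if key == "nodes":
            count, *properties = value.split(":")
            resources["nodes"] = int(count)
            for prop in properties:
                pkey, _, pval = prop.partition("=")
                resources[pkey] = int(pval) if pval.isdigit() else pval
        else:
            resources[key] = value  # kept as text, units are not interpreted
    return resources


request = parse_resource_list("nodes=32:ppn=2,pmem=1800mb,walltime=3600")
total_processors = request["nodes"] * request["ppn"]
```

For the example on the slide this yields 32 nodes with 2 processors each, that is 64 processors in total.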
  • 19. ISSGC’09 Lessons Learned Each job submission system is different Different commands for submission, status query, cancellation Different options, scheduling policies, … Even different configurations of the same job submission systems for different multiprocessor systems exist Job requirements are specified differently Command-line parameters for the job submission command Separate job command file Different job requirements exist Nodes and tasks per node, total tasks, … 07/07/2009 Slide 19
  • 20. ISSGC’09 Job submission and the Grid A higher, meta level with more abstraction is needed to describe the requirements of jobs in a Grid of heterogeneous systems A lot of proprietary solutions exist, each Grid middleware is using its own language, e.g. AJO in UNICORE 5, ClassAds/JDL in Condor, JDL in gLite, RSL in Globus Toolkit, xRSL in ARC/NorduGrid, etc… And there is JSDL 1.0 Open Grid Forum (OGF) standard http://www.gridforum.org/documents/GFD.56.pdf 07/07/2009 Slide 20
  • 21. ISSGC’09 JSDL – Introduction JSDL stands for Job Submission Description Language A language for describing the requirements of computational jobs for submission to Grids and other systems Can also be used between middleware systems or for submitting to any Grid middleware (interoperability) JSDL does not define a submission interface, what the results of a submission look like, or how resources are selected 07/07/2009 Slide 21
  • 22. ISSGC’09 JSDL Document A JSDL document is an XML document, which may contain Generic (job) identification information Application description Resource requirements (main focus is computational jobs) Description of required data files A JSDL document is a template, which can be submitted multiple times and can be used to create multiple job instances No job instance specific attributes can be defined, e.g. start or end time 07/07/2009 Slide 22
  • 23. ISSGC’09 JSDL – A Hello World Example
      <?xml version="1.0" encoding="UTF-8"?>
      <jsdl:JobDefinition
          xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
          xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
        <jsdl:JobDescription>
          <jsdl:Application>
            <jsdl-posix:POSIXApplication>
              <jsdl-posix:Executable>/bin/echo</jsdl-posix:Executable>
              <jsdl-posix:Input>/dev/null</jsdl-posix:Input>
              <jsdl-posix:Output>std.out</jsdl-posix:Output>
              <jsdl-posix:Argument>hello</jsdl-posix:Argument>
              <jsdl-posix:Argument>world</jsdl-posix:Argument>
            </jsdl-posix:POSIXApplication>
          </jsdl:Application>
        </jsdl:JobDescription>
      </jsdl:JobDefinition>
    07/07/2009 Slide 23
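A document like the one on this slide can also be produced programmatically. The sketch below uses Python's standard `xml.etree.ElementTree`; the namespace URIs are the ones I understand the JSDL 1.0 specification to define, `build_jsdl` is a hypothetical helper, and the result is not validated against the JSDL schema.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed from the JSDL 1.0 specification.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(executable, arguments, stdin=None, stdout=None):
    """Return a JSDL job definition for a POSIX application as an XML string."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)
    root = ET.Element(f"{{{JSDL}}}JobDefinition")
    description = ET.SubElement(root, f"{{{JSDL}}}JobDescription")
    application = ET.SubElement(description, f"{{{JSDL}}}Application")
    posix = ET.SubElement(application, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for argument in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = argument
    if stdin is not None:
        ET.SubElement(posix, f"{{{POSIX}}}Input").text = stdin
    if stdout is not None:
        ET.SubElement(posix, f"{{{POSIX}}}Output").text = stdout
    return ET.tostring(root, encoding="unicode")


document = build_jsdl("/bin/echo", ["hello", "world"],
                      stdin="/dev/null", stdout="std.out")
```

Because the document is only a template, the same string could be submitted several times to create several job instances.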
  • 24. ISSGC’09 JSDL – Resource Requirements Description Support simple descriptions of resource requirements NOT a comprehensive resource requirements language Can be extended with other elements for richer or more abstract descriptions Main target is compute jobs CPU, memory, file system/disk, operating system requirements Allows some flexibility for aggregate (total) requirements “I want 10 CPUs in total and each resource should have 2 or more” Very basic support for network requirements 07/07/2009 Slide 24
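JSDL only states such a requirement; how resources are chosen is left to the consuming system. As an illustration of what the quoted aggregate requirement means, the sketch below shows one way a hypothetical matcher could interpret "10 CPUs in total and each resource should have 2 or more". The greedy strategy and the name `select_resources` are assumptions, not part of JSDL.

```python
def select_resources(available, total_cpus, min_cpus_each):
    """Pick resources that together offer at least total_cpus CPUs.

    available: dict mapping resource name -> number of CPUs.
    Resources with fewer than min_cpus_each CPUs are ignored.
    Returns the chosen resource names, or None if the request cannot be met.
    """
    eligible = sorted(
        ((name, cpus) for name, cpus in available.items() if cpus >= min_cpus_each),
        key=lambda item: item[1],
        reverse=True,  # prefer larger resources first
    )
    chosen, total = [], 0
    for name, cpus in eligible:
        if total >= total_cpus:
            break
        chosen.append(name)
        total += cpus
    return chosen if total >= total_cpus else None


hosts = {"n1": 4, "n2": 1, "n3": 4, "n4": 2}
selection = select_resources(hosts, total_cpus=10, min_cpus_each=2)
```

Here `n2` is excluded because it has only one CPU, and the remaining three resources together provide exactly 10 CPUs.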
  • 25. ISSGC’09 JSDL application extensions SPMD (single-program-multiple-data) http://www.ogf.org/documents/GFD.115.pdf Extends JSDL 1.0 for parallel applications (MPI, PVM, etc.) Allows specifying the number of processors, processors per host, and threads per process along with the application HPC (high performance computing) http://www.ogf.org/documents/GFD.111.pdf Extends JSDL 1.0 for HPC applications running as operating system processes (e.g. username, environment, working directory can be specified) 07/07/2009 Slide 25
  • 26. ISSGC’09 Lessons Learned JSDL is a standardized language to describe jobs to be submitted to Grid resources Not only the job itself (application, arguments, input, output, etc.), but also resource requirements (CPU, memory, etc.) Extensions for specific application domains (parallel programs, HPC applications) exist BUT: a JSDL document cannot be submitted directly to Grid resources, i.e. to the resource management and job scheduling system of a cluster or multiprocessor system in a Grid 07/07/2009 Slide 26
  • 27. ISSGC’09 Execution and Job Management – 1 One of the essential functionalities and components in a Grid middleware Deals with Initiating/submitting, monitoring and managing jobs Handling and staging of all job data Coordinating and scheduling of multi-step jobs Examples: XNJS in UNICORE 6 (see sessions 10-12 today) WS-GRAM in GT4 (see sessions 19-21 on Thursday) WMS in gLite (see sessions 24-26 on Friday) ARC Client in NorduGrid/ARC (see sessions 29-30 on Sat.) 07/07/2009 Slide 27
  • 28. ISSGC’09 Execution and Job Management – 2 Initiating/submitting, monitoring and managing jobs Translates the Grid job into a job specific to the target system (application details, resources, etc.) Submits the job to the resource management system using its proprietary way of job submission Frequently polls the job status (waiting/queued, running/executing, failed, aborted, paused, finished, etc.) from the resource management system Provides “access” to the job, its status and data during its runtime and after its (successful or unsuccessful) completion If the resource management system is not available or reachable at job submission time, the job is cached and handed over to it later 07/07/2009 Slide 28
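The translation step can be pictured as rendering one abstract description into each target system's own syntax. The sketch below is a simplified illustration that reuses only the LoadLeveler keywords and `msub` options shown earlier in this deck; the `GridJob` record and both functions are hypothetical and do not reflect how any particular Grid middleware implements this step.

```python
from dataclasses import dataclass


@dataclass
class GridJob:
    """Hypothetical system-independent job description."""
    executable: str
    nodes: int
    tasks_per_node: int
    walltime_seconds: int


def to_loadleveler(job):
    """Render the job as a LoadLeveler-style job command file."""
    hours, rest = divmod(job.walltime_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return "\n".join([
        "#@ job_type = parallel",
        f"#@ wall_clock_limit = {hours:02d}:{minutes:02d}:{seconds:02d}",
        f"#@ node = {job.nodes}",
        f"#@ total_tasks = {job.nodes * job.tasks_per_node}",
        "#@ queue",
        job.executable,
    ])


def to_msub(job):
    """Render the job as an msub command line."""
    return (f"msub -l nodes={job.nodes}:ppn={job.tasks_per_node},"
            f"walltime={job.walltime_seconds} {job.executable}")


job = GridJob(executable="myprogram", nodes=2, tasks_per_node=32,
              walltime_seconds=900)
command_file = to_loadleveler(job)
command_line = to_msub(job)
```

The same abstract job thus yields a command file for one system and a command line for the other, which is the difference the earlier "Lessons Learned" slide points out.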
  • 29. ISSGC’09 Execution and Job Management – 3 Handling and staging of all job data, incl. job directory and persistency Creates, manages, destroys the job directory All data submitted with the job as input is stored in the job directory Data is staged in from remote data resources/archives All data generated by the job is preserved and/or staged after the successful completion of the job Coordinating and scheduling of multi-step jobs If a job consists of more than one step (a workflow), the required resources are orchestrated Manages the proper initiation of the workflow execution The execution of the workflow is controlled and monitored 07/07/2009 Slide 29
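The job directory life cycle described here (create, stage in, execute, stage out, destroy) can be sketched on a local file system. This is a toy model under stated assumptions: inputs are small text files, the job step is a Python callable, and only plain files are exported. `run_in_job_directory` is a hypothetical name, not an API of any Grid middleware.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_job_directory(inputs, step, export_dir):
    """Run one job step inside a temporary job directory.

    inputs: dict mapping file name -> text content to stage in.
    step: callable receiving the job directory path; it writes output files.
    export_dir: directory that receives all files the step generated.
    Returns the sorted names of the exported files.
    """
    job_dir = Path(tempfile.mkdtemp(prefix="job-"))  # create the job directory
    try:
        for name, content in inputs.items():         # stage in
            (job_dir / name).write_text(content)
        step(job_dir)                                 # execute the job step
        target = Path(export_dir)
        target.mkdir(parents=True, exist_ok=True)
        exported = []
        for path in sorted(job_dir.iterdir()):        # stage out generated files
            if path.name not in inputs:
                shutil.copy2(path, target / path.name)
                exported.append(path.name)
        return exported
    finally:
        shutil.rmtree(job_dir)                        # destroy the job directory


def shout(job_dir):
    """Example job step: upper-case the staged-in input file."""
    text = (job_dir / "in.txt").read_text()
    (job_dir / "out.txt").write_text(text.upper())
```

A multi-step job (a workflow) could call such a function once per step, passing the exported files of one step as inputs to the next.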
  • 30. ISSGC’09 Lessons Learned Execution and job management is needed A meta-layer on top of the Grid resources is needed to provide a uniform way of accessing the Grid and to provide an intuitive, secure and easy to use interface for the user 07/07/2009 Slide 30
  • 31. ISSGC’09 Introduction to UNICORE (from 30,000 ft) more in sessions 10-12, today 07/07/2009 Slide 31
  • 32. ISSGC’09 (Short) History Lesson UNiform Interface to COmputing REsources seamless, secure, and intuitive Initial development started in two German projects funded by the German ministry of education and research (BMBF) 08/1997 – 12/1999: UNICORE project Results: well defined security architecture with X.509 certificates, intuitive GUI, central job supervisor based on Codine (predecessor of SGE) from Genias 1/2000 – 12/2002: UNICORE Plus project Results: implementation enhancements (e.g. replacement of Codine by custom NJS), extended job control (workflows), application specific interfaces (plugins) Continuous development since 2002 in several EU projects Open Source community development since Summer 2004 07/07/2009 Slide 32
  • 33. ISSGC’09 Projects More than a decade of German and European research & development and infrastructure projects [Timeline figure, 1999 to 2011, showing: UNICORE, UNICORE Plus, EUROGRID, GRIP, GRIDSTART, OpenMolGRID, UniGrids, VIOLA, DEISA, DEISA2, NextGRID, CoreGRID, D-Grid IP, D-Grid IP 2, EGEE-II, OMII-Europe, A-WARE, eDEISA, Chemomentum, PHOSPHORUS, D-MON, PRACE, SmartLM, ETICS2, WisNetGrid] And many others 07/07/2009 Slide 33
  • 34. ISSGC’09 UNICORE – Grid driving HPC Used in DEISA (European Distributed Supercomputing Infrastructure) National German Supercomputing Center NIC Gauss Center for Supercomputing (Alliance of the three German HPC centers) PRACE (European PetaFlop HPC Infrastructure) – starting-up But also in non-HPC-focused infrastructures (e.g. D-Grid) Taking up major requirements from, e.g., HPC users HPC user support teams HPC operations teams 07/07/2009 Slide 34
  • 35. ISSGC’09 UNICORE – www.unicore.eu Open source (BSD license) Open developer community on SourceForge Contributing your own developments is easily possible Design principles Standards: OGSA-conform, WS-RF compliant Open, extensible, interoperable End-to-End, seamless, secure and intuitive Security: X.509, proxy and VO support Workflow and application support directly integrated Variety of clients: graphical, command line, portal, API, etc. Quick and simple installation and configuration Support for many operating and batch systems 100% Java 5 07/07/2009 Slide 35
  • 36. ISSGC’09 UNICORE in use some examples 07/07/2009 Slide 36
  • 37. ISSGC’09 Usage in D-Grid Core D-Grid sites committing parts of their existing resources to D-Grid: Approx. 700 CPUs Approx. 1 PByte of storage UNICORE is installed and used Additional sites received extra money from the BMBF for buying compute clusters and data storage: Approx. 2000 CPUs Approx. 2 PByte of storage UNICORE (as well as Globus and gLite) is installed as soon as systems are in place (site labels from the map: LRZ, DLR-DFD) 07/07/2009 Slide 37
  • 38. ISSGC’09 Distributed European Infrastructure for Supercomputing Applications Consortium of leading national HPC centers in Europe: IDRIS – CNRS (Paris, France), FZJ (Jülich, Germany), RZG (Garching, Germany), CINECA (Bologna, Italy), EPCC (Edinburgh, UK), CSC (Helsinki, Finland), SARA (Amsterdam, NL), HLRS (Stuttgart, Germany), BSC (Barcelona, Spain), LRZ (Munich, Germany), ECMWF (Reading, UK) Deploy and operate a persistent, production quality, distributed, heterogeneous HPC environment www.deisa.eu UNICORE as Grid Middleware On top of DEISA’s core services: Dedicated network Shared file system Common production environment at all sites Used e.g. for workflow applications more in session 33, Monday 9:00 – 10:30 07/07/2009 Slide 38
  • 39. ISSGC’09 Interoperability and Usability of Grid Infrastructures Focus on providing common interfaces and integration of major Grid software infrastructures OGSA-DAI, VOMS, GridSphere, OGSA-BES, OGSA-RUS UNICORE, gLite, Globus Toolkit, CROWN Apply interoperability components in application use-cases 07/07/2009 Slide 39
  • 40. ISSGC’09 Grid Services based Environment to enable Innovative Research Provide an integrated Grid solution for workflow-centric, complex applications with a focus on data, semantics and knowledge Provide decision support services for risk assessment, toxicity prediction, and drug design End user focus: ease of use, domain specific tools, “hidden Grid” Based on UNICORE 6 more in sessions 12-13, this afternoon 07/07/2009 Slide 40
  • 41. ISSGC’09 Commercial usage at [company logo not preserved in the transcript] Slide courtesy of Alfred Geiger, T-Systems SfR 07/07/2009 Slide 43
  • 42. ISSGC’09 Lessons Learned UNICORE has a strong HPC background but is not limited to HPC use cases; it can be used in any Grid UNICORE is OGSA-conform and WS-RF compliant UNICORE is open, extensible and interoperable UNICORE is open source and coded in Java UNICORE is used in EU and national projects, European e-infrastructures, National Grid Initiatives (NGI), commercial environments, etc. Documentation, tutorials, mailing lists, community links, software, source code, and more: http://www.unicore.eu 07/07/2009 Slide 44