glideinWMS training @ UCSD




           glideinWMS architecture
                     by Igor Sfiligoi (UCSD)




UCSD Jan 17th 2012       glideinWMS architecture   1
Outline


                     ●   A high level overview
                         of the glideinWMS
                     ●   Description of the
                         components




glideinWMS




                      glideinWMS
                     from 10k feet

Refresher - Condor
 ●   A Condor pool is composed of 3 pieces

     [Diagram: several submit nodes (each running a Schedd), one central
      manager (Collector and Negotiator), and several execution nodes
      (each running a Startd, which runs the Job)]



What is a glidein?
 ●   A glidein is just a properly configured
     execution node submitted as a Grid job
     [Diagram: the same Condor pool (submit nodes with a Schedd, central
      manager with Collector and Negotiator), with the execution nodes
      now labeled as glideins, each running a Startd and a Job]



What is glideinWMS?
 ●   glideinWMS is an automated tool for submitting
     glideins on demand
     [Diagram: the same pool with a glideinWMS box added; the glidein
      execution nodes sit behind Grid interfaces labeled CREAM and Globus]


glideinWMS architecture
 ●   glideinWMS has 3 logical pieces
     [Diagram: in the Frontend domain, the Frontend (on the Frontend node)
      monitors Condor on the submit nodes and the central manager, does the
      matching, and requests glideins from the Factory node (Condor plus
      Factory). The Factory submits glideins through CREAM and Globus to
      worker nodes, where glidein_startup configures Condor and starts the
      Startd, turning each worker node into a glidein execution node.]

glideinWMS architecture
 ●   glideinWMS has 3 logical pieces
      ●   glidein_startup – Configures and starts
                            Condor execution daemons
                            (runtime environment discovery and validation)

      ●   Factory – Knows about the sites and
                    does the submission
                    (Grid knowledge and troubleshooting)

      ●   Frontend – Knows about user jobs and
                     requests glideins
                     (site selection logic and job monitoring)


Cardinality
 ●   N-to-M relationship
      ●   Each Frontend can talk to many Factories
      ●   Each Factory may serve many Frontends
     [Diagram: two VO Frontends, each with its own Collector, Schedd and
      Negotiator, and two Glidein Factories; the Startds started through
      the Factories run the user jobs]

Many operators
 ●   Factory and Frontend are usually operated
     by different people
 ●   Frontends are VO specific
      ●   Operated by VO admins
      ●   Each sets policies for its users
 ●   Factories are generic
      ●   Do not need to be affiliated with any group
      ●   The main task of Factory ops is Grid monitoring and
          troubleshooting

glideinWMS



               A (sort of) detailed view of

                     glidein_startup

Refresher – glideinWMS arch.
 ●   glidein_startup configures and starts Condor
     [Diagram: the overall glideinWMS architecture (Frontend node, Factory
      node, submit nodes, central manager, CREAM and Globus, worker nodes),
      repeated; on the worker node, glidein_startup configures Condor and
      starts the Startd]

glidein_startup tasks
 ●   Validate node (environment)
 ●   Download Condor binaries
     [Slide callout next to this part of the list: performed by plugins]
 ●   Configure Condor
 ●   Start Condor daemon(s)
 ●   Collect post-mortem monitoring info
 ●   Cleanup
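
A minimal Python sketch of the task order listed above. It is not the actual glidein_startup code; the step names, the `run_startup` helper, and the rule that a failed step skips the remaining ones while cleanup still runs are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_startup(steps: List[Step]) -> List[str]:
    """Run steps in order, stop at the first failure, always end with cleanup.

    Returns the names of the steps that were attempted, in order.
    """
    attempted: List[str] = []
    try:
        for name, action in steps:
            attempted.append(name)
            if not action():
                break  # assumption: a failed step aborts the remaining ones
    finally:
        attempted.append("cleanup")
    return attempted


# Hypothetical stand-ins; the real steps do actual work on the worker node.
healthy_node: List[Step] = [
    ("validate_node", lambda: True),
    ("download_condor", lambda: True),
    ("configure_condor", lambda: True),
    ("start_condor", lambda: True),
    ("collect_post_mortem", lambda: True),
]
broken_node: List[Step] = [("validate_node", lambda: False)] + healthy_node[1:]

print(run_startup(healthy_node))
print(run_startup(broken_node))
```

On the hypothetical broken node, validation fails, so Condor is never downloaded or started, and only cleanup follows.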



glidein_startup plugins
 ●   Config files and scripts loaded via HTTP
      ●   From both the factory and the frontend Web servers
      ●   Can use local Web proxy (e.g. Squid)
      ●   Mechanism tamper proof and cache coherent
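     [Diagram: glidein_startup fetches files from the HTTPd on the Factory
      node and from the HTTPd on the Frontend node, optionally through
      Squid, and then starts the Startd]
     Steps shown inside the glidein_startup box:
      ●   Load files from factory Web
      ●   Load files from frontend Web
      ●   Run executables
      ●   Start Condor
      ●   Cleanup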


glidein_startup scripts
 ●   Standard plugins
      ●   Basic Grid node validation (certs, disk space, etc.)
      ●   Setup Condor (glexec, CCB, etc.)
 ●   VO provided plugins
      ●   Optional, but can be anything
      ●   CMS@UCSD checks for CMS SW
 ●   Factory admin can also provide them
 ●   Details about the plugins can be found at
     http://tinyurl.com/glideinWMS/doc.prd/factory/custom_scripts.html

glideinWMS



          A (sort of) detailed view of the

                     glidein factory

Refresher – glideinWMS arch.
 ●   The factory knows about the grid and
     submits glideins
     [Diagram: the overall glideinWMS architecture (Frontend node, Factory
      node, submit nodes, central manager, CREAM and Globus, worker nodes),
      repeated; the Factory node (Condor plus Factory) submits the glideins]

Glidein factory
 ●   Glidein factory knows how to contact sites
      ●   List in a local config
      ●   Only trusted and tested sites should be included
 ●   For each site (called entry)
      ●   Contact info (Node, grid type, jobmanager)
      ●   Site config (startup dir, glexec, OS type, …)
      ●   VOs supported
      ●   Other attributes (Site name, closest SE, ...)
 ●   Admin maintained, periodically compared to BDII
     http://tinyurl.com/glideinWMS/doc.prd/factory/configuration.html
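
To make the per-entry information above concrete, here is a small Python sketch of one entry as a record. It is only an illustration and not the real factory configuration format; the class, the field names, and every example value are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One site (entry) as the factory might describe it; all names are hypothetical."""
    name: str
    # Contact info
    node: str
    grid_type: str
    jobmanager: str
    # Site config
    startup_dir: str
    use_glexec: bool
    os_type: str
    # VOs supported
    supported_vos: List[str]
    # Other attributes (site name, closest SE, ...)
    attrs: Dict[str, str] = field(default_factory=dict)

    def serves(self, vo: str) -> bool:
        """True if this entry accepts glideins on behalf of the given VO."""
        return vo in self.supported_vos


# Example values are made up for illustration only.
example_entry = Entry(
    name="example_entry",
    node="gatekeeper.example.org",
    grid_type="example_grid_type",
    jobmanager="example_jobmanager",
    startup_dir="/tmp",
    use_glexec=False,
    os_type="example_os",
    supported_vos=["cms"],
    attrs={"site_name": "ExampleSite", "closest_se": "se.example.org"},
)

print(example_entry.serves("cms"))
```

Keeping the supported VOs on the entry itself lets a simple membership test decide whether a given frontend's requests can be served by that site.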


Glidein factory role
 ●   The glidein factory is just a slave
      ●   The frontend(s) tell it how many glideins
          to submit where
      ●   Once the glideins start to run, they report to
          the VO collector and the factory is not involved
 ●   The communication is based on ClassAds
      ●   The factory has a Collector for this purpose
     [Diagram: the Frontend on the Frontend node talks to the Collector
      on the Factory node, which the Factory also uses]

Factory collector
 ●    The factory collector handles all communication
     [Diagram: Frontends on several Frontend nodes find sites and request
      glideins through the Collector on the Factory node; each Entry
      advertises itself to the Collector and retrieves its orders from it;
      the Factory spawns the Entries]

  http://tinyurl.com/glideinWMS/doc.prd/factory/design_data_exchange.html
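
The exchange through the collector can be pictured with a toy, in-memory sketch in Python. It uses plain dictionaries in place of ClassAds and a hypothetical `ToyCollector` class in place of the Condor Collector; the attribute names are invented for illustration and are not the ones glideinWMS uses.

```python
from typing import Dict, List

Ad = Dict[str, object]


class ToyCollector:
    """In-memory stand-in for the factory Collector; ads are plain dicts."""

    def __init__(self) -> None:
        self.ads: List[Ad] = []

    def advertise(self, ad: Ad) -> None:
        """Store an ad so other parties can query for it."""
        self.ads.append(ad)

    def query(self, **constraints: object) -> List[Ad]:
        """Return every ad whose attributes equal all given constraints."""
        return [
            ad for ad in self.ads
            if all(ad.get(key) == value for key, value in constraints.items())
        ]


collector = ToyCollector()

# 1. An entry advertises itself.
collector.advertise({"type": "entry", "name": "entry_a", "site": "SiteA"})

# 2. A frontend finds the available entries (sites) ...
found = collector.query(type="entry")

# 3. ... and posts a request for glideins at one of them.
collector.advertise({
    "type": "request",
    "entry": found[0]["name"],
    "frontend": "vo_x",
    "idle_glideins": 5,
})

# 4. The entry retrieves its orders.
orders = collector.query(type="request", entry="entry_a")
print(orders[0]["idle_glideins"])
```

Neither side calls the other directly: both only write ads to, and read ads from, the collector.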

Frontends
 ●   The factory admin decides
     which Frontends to serve
      ●   Valid proxy
          with known DN needed
          to talk to the collector
      ●   Factory config has further
          fine grained controls
     [Diagram: Frontends on two Frontend nodes connect to the Collector
      on the Factory node, next to the Factory]




Glidein submission
 ●   The glidein factory (entry) uses
     Condor-G to submit glideins
      ●   Condor-G does the heavy lifting
      ●   The factory just monitors the progress
     [Diagram: on the Factory node, each Entry submits to and monitors a
      Schedd; the Schedds deliver the glideins to the sites through CREAM
      and Globus]


Credentials/Proxy
 ●   Proxy typically provided by the frontend
      ●   Although the factory can provide a default one (rarely used)

 ●   Proxy delivered encrypted in the ClassAd
      ●   Factory (entry) provides the encryption key (PKI)
 ●   Proxy stored on disk
      ●   Each VO mapped to a different UID
     [Diagram: the Frontend on the Frontend node gets the key from, and
      delivers the encrypted proxy to, the Collector on the Factory node,
      which also hosts the Entry and a Schedd]



glideinWMS



          A (sort of) detailed view of the

                     VO frontend

Refresher – glideinWMS arch.
 ●   The frontend monitors the user Condor pool,
     does the matchmaking and requests glideins
     [Diagram: the overall glideinWMS architecture (Frontend node, Factory
      node, submit nodes, central manager, CREAM and Globus, worker nodes),
      repeated; within the Frontend domain, the Frontend monitors Condor,
      matches, and requests glideins from the Factory node]

VO frontend
 ●   The VO frontend is the brain
     of a glideinWMS-based pool
      ●   Like a site-level “negotiator”

     [Diagram: within the VO domain, the Frontend monitors Condor on the
      submit nodes and the central manager to find idle jobs, finds
      entries, matches the two, and requests glideins from the Factory node]


Two level matchmaking
 ●   The frontend triggers glidein submission
      ●   The “regular” negotiator matches jobs to glideins
     [Diagram: the Frontend asks the Factory for glideins, which land on
      execution nodes through CREAM and Globus; the Collector and
      Negotiator on the central manager then match jobs from the Schedd on
      the submit node to the glidein Startds]

Frontend logic
 ●   The glideinWMS glidein request logic
     is based on the principle of “constant pressure”
      ●   Frontend requests a certain number of
          “idle glideins” in the factory queue at all times
      ●   It does not request a specific number of glideins
 ●   This is done due to the asynchronous nature of
     the system
      ●   Both the factory and the frontend are
          in a polling loop and talk to each other indirectly


Frontend logic
 ●   Frontend matches job attrs against entry attrs
      ●   It then counts the matched idle jobs
      ●   A fraction of this number becomes the
          “pressure requests” (up to 1/3)
 ●   The matchmaking expression is
     defined by the frontend admin
      ●   Not the user
      ●   Debatable if it is better or worse, but it does reduce
          frontend code complexity
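
The counting step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the frontend's actual algorithm: the `pressure_requests` function, the attribute names, the fixed divisor of 3 (standing in for "up to 1/3") and the round-up rule are all hypothetical.

```python
from typing import Callable, Dict, List

Attrs = Dict[str, object]


def pressure_requests(
    idle_jobs: List[Attrs],
    entries: Dict[str, Attrs],
    match: Callable[[Attrs, Attrs], bool],
    divisor: int = 3,
) -> Dict[str, int]:
    """For each entry, count the matching idle jobs and request a fraction of them.

    `match` plays the role of the admin-defined matchmaking expression.
    Assumption: the fraction is exactly 1/divisor, rounded up.
    """
    requests: Dict[str, int] = {}
    for entry_name, entry_attrs in entries.items():
        matched = sum(1 for job in idle_jobs if match(job, entry_attrs))
        requests[entry_name] = (matched + divisor - 1) // divisor  # ceiling division
    return requests


# Hypothetical data: jobs name the site they want, entries name their site.
idle_jobs = [{"site": "SiteA"}] * 6 + [{"site": "SiteB"}]
entries = {
    "entry_a": {"site": "SiteA"},
    "entry_b": {"site": "SiteB"},
    "entry_c": {"site": "SiteC"},
}


def same_site(job: Attrs, entry: Attrs) -> bool:
    """Stand-in for the matchmaking expression set by the frontend admin."""
    return job["site"] == entry["site"]


print(pressure_requests(idle_jobs, entries, same_site))
```

With six idle jobs matching entry_a, one matching entry_b and none matching entry_c, the sketch asks for 2, 1 and 0 idle glideins respectively.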


Frontend config
 ●   The frontend owns the “glidein proxy”
      ●   And delegates it to the factory(s)
          when requesting glideins
      ●   Must keep it valid at all times
          (usually at OS level)
 ●   The VO frontend can (and should) provide
     VO‑specific validation scripts
 ●   The VO frontend can (and should) set the
     glidein start expression
      ●   Used by the VO negotiator for final matchmaking
glideinWMS



                      And the

                     summary

Summary
 ●   Glideins are just properly configured Condor
     execute nodes submitted as Grid jobs
 ●   The glideinWMS is a mechanism to automate
     glidein submission
 ●   The glideinWMS is composed of three logical
     entities, two being actual services:
      ●   Glidein factories know about the Grid
      ●   VO frontends know about the users and
          drive the factories

Pointers
 ●   glideinWMS development team is reachable at
     glideinwms-support@fnal.gov
 ●   The official project Web page is
     http://tinyurl.com/glideinWMS
 ●   CMS frontend at UCSD
     http://glidein-collector.t2.ucsd.edu:8319/vofrontend/monitor/frontend_UCSD-v5_2/frontendStatus.html

 ●   OSG glidein factory at UCSD
     http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory
     http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.html




Acknowledgments
 ●   The glideinWMS is a CMS-led project
     developed mostly at FNAL, with contributions
     from UCSD and ISI
 ●   The glideinWMS factory operations at UCSD are
     sponsored by OSG
 ●   The funding comes from NSF, DOE and the
     UC system




Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 

glideinWMS Architecture - glideinWMS Training Jan 2012

  • 1. glideinWMS training @ UCSD glideinWMS architecture by Igor Sfiligoi (UCSD) UCSD Jan 17th 2012 glideinWMS architecture 1
  • 2. Outline ● A high-level overview of glideinWMS ● Description of the components
  • 3. glideinWMS from 10k feet
  • 4. Refresher - Condor ● A Condor pool is composed of 3 pieces [Diagram: a central manager (Collector, Negotiator), several submit nodes (Schedd), and several execution nodes (Startd, running the job)]
  • 5. What is a glidein? ● A glidein is just a properly configured execution node submitted as a Grid job [Diagram: the same Condor pool, with the execution nodes labeled as glideins]
  • 6. What is glideinWMS? ● glideinWMS is an automated tool for submitting glideins on demand [Diagram: the same Condor pool, with glideinWMS and the Grid interfaces CREAM and Globus added next to the glidein execution nodes]
  • 7. glideinWMS architecture ● glideinWMS has 3 logical pieces [Diagram labels: Frontend domain with submit nodes, central manager, and a Frontend node (Frontend: monitor Condor, match, request glideins); Factory node (Factory, Condor-G: submit glideins); CREAM and Globus; worker node (glidein_startup: configure Condor, Startd); glidein execution nodes]
  • 8. glideinWMS architecture ● glideinWMS has 3 logical pieces ● glidein_startup – Configures and starts Condor execution daemons (runtime environment discovery and validation) ● Factory – Knows about the sites and does the submission (Grid knowledge and troubleshooting) ● Frontend – Knows about user jobs and requests glideins (site selection logic and job monitoring)
  • 9. Cardinality ● N-to-M relationship ● Each Frontend can talk to many Factories ● Each Factory may serve many Frontends [Diagram: two VO Frontends, each with its own Collector, Negotiator, Schedd, and Startds running user jobs, alongside two Glidein Factories]
  • 10. Many operators ● Factory and Frontend are usually operated by different people ● Frontends are VO specific ● Operated by VO admins ● Each sets policies for its users ● Factories are generic ● Do not need to be affiliated with any group ● The main task of Factory operators is Grid monitoring and troubleshooting
  • 11. glideinWMS: A (sort of) detailed view of glidein_startup
  • 12. Refresher – glideinWMS arch. ● glidein_startup configures and starts Condor [Diagram: same architecture overview as slide 7]
  • 13. glidein_startup tasks ● Validate node (environment) ● Download Condor binaries ● Configure Condor ● Start Condor daemon(s) ● Collect post-mortem monitoring info ● Cleanup [Slide annotation: "Performed by plugins"]
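The task order on this slide can be sketched as a small driver loop. This is a minimal Python illustration and not the actual glidein_startup code: the function name `run_glidein` and the step callables are hypothetical, and running post-mortem collection and cleanup even after a failed step is an assumption.

```python
from typing import Callable, List, Tuple

# A step is a (name, callable) pair; the callable returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_glidein(steps: List[Step],
                post_mortem: Callable[[], None],
                cleanup: Callable[[], None]) -> List[str]:
    """Run the steps in order, stopping at the first failure.

    Returns the names of the phases that were attempted, in order.
    """
    attempted: List[str] = []
    try:
        for name, step in steps:
            attempted.append(name)
            if not step():
                break  # e.g. node validation failed: do not start Condor
    finally:
        # Assumption: monitoring info is collected and cleanup is done
        # regardless of how far the steps got.
        attempted.append("post-mortem")
        post_mortem()
        attempted.append("cleanup")
        cleanup()
    return attempted


if __name__ == "__main__":
    ok = lambda: True
    phases = run_glidein(
        [("validate", ok), ("download", ok), ("configure", ok), ("start", ok)],
        post_mortem=lambda: None,
        cleanup=lambda: None,
    )
    print(phases)
```

Stopping at the first failed step means a node that fails validation never starts a Condor daemon, so it never advertises itself to the pool.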
  • 14. glidein_startup plugins ● Config files and scripts loaded via HTTP ● From both the factory and the frontend Web servers ● Can use local Web proxy (e.g. Squid) ● Mechanism is tamper proof and cache coherent ● Sequence: load files from factory Web, load files from frontend Web, run executables, start Condor Startd, cleanup [Diagram: glidein_startup reaches the HTTPd on the Factory node and the HTTPd on the Frontend node through Squid]
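One common way to make downloads through a shared Web cache tamper-evident is to compare each file against a digest obtained from a trusted source. The sketch below illustrates only that general idea; the slides do not describe the mechanism glideinWMS uses, and the function name `verify_payload` and the choice of SHA-256 are assumptions.

```python
import hashlib
import hmac


def verify_payload(payload: bytes, expected_hex: str) -> bool:
    """Return True if the payload's SHA-256 digest matches the expected one.

    `expected_hex` is assumed to come from a trusted list of digests,
    not from the same untrusted cache as the payload itself.
    """
    actual_hex = hashlib.sha256(payload).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual_hex, expected_hex.lower())


if __name__ == "__main__":
    script = b"#!/bin/sh\necho validating node\n"
    trusted_digest = hashlib.sha256(script).hexdigest()
    print(verify_payload(script, trusted_digest))
    print(verify_payload(script + b"rm -rf /\n", trusted_digest))
```

A file that fails the check would simply not be executed, which is what lets an untrusted proxy sit between the glidein and the Web servers.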
  • 15. glidein_startup scripts ● Standard plugins ● Basic Grid node validation (certs, disk space, etc.) ● Set up Condor (glexec, CCB, etc.) ● VO provided plugins ● Optional, but can be anything ● CMS@UCSD checks for CMS SW ● Factory admin can also provide them ● Details about the plugins can be found at http://tinyurl.com/glideinWMS/doc.prd/factory/custom_scripts.html
  • 16. glideinWMS: A (sort of) detailed view of the glidein factory
  • 17. Refresher – glideinWMS arch. ● The factory knows about the grid and submits glideins [Diagram: same architecture overview as slide 7]
  • 18. Glidein factory ● Glidein factory knows how to contact sites ● List in a local config ● Only trusted and tested sites should be included ● For each site (called entry) ● Contact info (Node, grid type, jobmanager) ● Site config (startup dir, glexec, OS type, …) ● VOs supported ● Other attributes (Site name, closest SE, ...) ● Admin maintained, periodically compared to BDII http://tinyurl.com/glideinWMS/doc.prd/factory/configuration.html
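The per-entry information listed on this slide can be pictured as a simple record. The following is an illustrative Python sketch only: the class, its field names, and the example values are hypothetical and do not reproduce the real factory configuration format, which is documented at the URL above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One site ("entry") as the slide describes it; all names are illustrative."""
    name: str
    # Contact info
    node: str
    grid_type: str
    jobmanager: str
    # Site config
    startup_dir: str
    use_glexec: bool
    os_type: str
    # VOs supported and other attributes (site name, closest SE, ...)
    supported_vos: List[str] = field(default_factory=list)
    attrs: Dict[str, str] = field(default_factory=dict)


def entries_for_vo(entries: List[Entry], vo: str) -> List[Entry]:
    """Return the entries that list the given VO as supported."""
    return [entry for entry in entries if vo in entry.supported_vos]


if __name__ == "__main__":
    entries = [
        Entry("Site_A", "ce-a.example.org", "cream", "pbs", "/tmp", False,
              "linux", ["cms"], {"site_name": "A", "closest_se": "se-a.example.org"}),
        Entry("Site_B", "ce-b.example.org", "globus", "condor", "/tmp", True,
              "linux", ["cms", "atlas"]),
    ]
    print([entry.name for entry in entries_for_vo(entries, "atlas")])
```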
  • 19. Glidein factory role ● The glidein factory is just a slave ● The frontend(s) tell it how many glideins to submit and where ● Once the glideins start to run, they report to the VO collector and the factory is not involved ● The communication is based on ClassAds ● The factory has a Collector for this purpose [Diagram: the Frontend on the Frontend node and the Factory on the Factory node both talk to the Collector on the Factory node]
  • 20. Factory collector ● The factory collector handles all communication [Diagram: each Frontend finds sites and requests glideins through the Collector on the Factory node; each Entry advertises itself to the Collector and retrieves its orders from it; the Factory spawns the Entries] http://tinyurl.com/glideinWMS/doc.prd/factory/design_data_exchange.html
  • 21. Frontends ● The factory admin decides which Frontends to serve ● Valid proxy with known DN needed to talk to the collector ● Factory config has further fine-grained controls [Diagram: two Frontend nodes, each with a Frontend, connecting to the Collector on the Factory node]
  • 22. Glidein submission ● The glidein factory (entry) uses Condor-G to submit glideins ● Condor-G does the heavy lifting ● The factory just monitors the progress [Diagram: on the Factory node, each Entry submits to and monitors a Schedd; the glideins go out through CREAM and Globus]
  • 23. Credentials/Proxy ● Proxy typically provided by the frontend ● Although the factory can provide a default one (rarely used) ● Proxy delivered encrypted in the ClassAd ● Factory (entry) provides the encryption key (PKI) ● Proxy stored on disk ● Each VO mapped to a different UID [Diagram: the Frontend gets the key from, and delivers the encrypted proxy to, the Collector on the Factory node, which also hosts the Entry and the Schedd]
  • 24. glideinWMS: A (sort of) detailed view of the VO frontend
  • 25. Refresher – glideinWMS arch. ● The frontend monitors the user Condor pool, does the matchmaking and requests glideins [Diagram: same architecture overview as slide 7]
  • 26. VO frontend ● The VO frontend is the brain of a glideinWMS-based pool ● Like a site-level “negotiator” [Diagram labels: VO domain with submit nodes, central manager, and Frontend node; Frontend actions: find idle jobs, find entries, monitor Condor, match, request glideins from the Factory node(s)]
  • 27. Two level matchmaking ● The frontend triggers glidein submission ● The “regular” negotiator matches jobs to glideins [Diagram: the Condor pool of slide 6, with the Frontend and the Factory shown in place of glideinWMS]
  • 28. Frontend logic ● The glideinWMS glidein request logic is based on the principle of “constant pressure” ● Frontend requests a certain number of “idle glideins” in the factory queue at all times ● It does not request a specific number of glideins ● This is done due to the asynchronous nature of the system ● Both the factory and the frontend are in a polling loop and talk to each other indirectly
  • 29. Frontend logic ● Frontend matches job attrs against entry attrs ● It then counts the matched idle jobs ● A fraction of this number becomes the “pressure requests” (up to 1/3) ● The matchmaking expression is defined by the frontend admin ● Not the user ● Debatable if it is better or worse, but it does reduce frontend code complexity
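The counting step described on this slide can be sketched in a few lines of Python. This is a hedged illustration and not the frontend's actual code: the job and entry dictionaries, the `match` callable, both function names, and the decision to round up are assumptions; only the "matched idle jobs" count and the "up to 1/3" fraction come from the slide.

```python
import math
from fractions import Fraction
from typing import Callable, Dict, List

Job = Dict[str, object]
Entry = Dict[str, object]

MAX_FRACTION = Fraction(1, 3)  # "up to 1/3", per the slide


def count_matched_idle(jobs: List[Job], entry: Entry,
                       match: Callable[[Job, Entry], bool]) -> int:
    """Count idle jobs whose attributes match the entry's attributes."""
    return sum(1 for job in jobs
               if job["status"] == "idle" and match(job, entry))


def idle_glideins_to_request(matched_idle: int,
                             fraction: Fraction = MAX_FRACTION) -> int:
    """Turn a matched idle job count into a number of idle glideins to keep queued."""
    if not 0 <= fraction <= MAX_FRACTION:
        raise ValueError("fraction must be between 0 and 1/3")
    # Assumption: round up, so a single matched idle job still creates pressure.
    return math.ceil(matched_idle * fraction)


if __name__ == "__main__":
    jobs = [
        {"status": "idle", "sites": ["A", "B"]},
        {"status": "idle", "sites": ["B"]},
        {"status": "running", "sites": ["A"]},
        {"status": "idle", "sites": ["A"]},
    ]
    entry = {"site": "A"}
    # Hypothetical admin-defined match expression: the job lists the entry's site.
    site_match = lambda job, ent: ent["site"] in job["sites"]
    matched = count_matched_idle(jobs, entry, site_match)
    print(matched, idle_glideins_to_request(matched))
```

Because the result is a target for idle glideins in the factory queue rather than a one-off order, repeating the calculation on every polling cycle is harmless: the factory only tops the queue up to the requested level.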
  • 30. Frontend config ● The frontend owns the “glidein proxy” ● And delegates it to the factory(s) when requesting glideins ● Must keep it valid at all times (usually at OS level) ● The VO frontend can (and should) provide VO‑specific validation scripts ● The VO frontend can (and should) set the glidein start expression ● Used by the VO negotiator for final matchmaking
  • 31. glideinWMS: And the summary
  • 32. Summary ● Glideins are just properly configured Condor execute nodes submitted as Grid jobs ● The glideinWMS is a mechanism to automate glidein submission ● The glideinWMS is composed of three logical entities, two being actual services: ● Glidein factories know about the Grid ● VO frontends know about the users and drive the factories
  • 33. Pointers ● glideinWMS development team is reachable at glideinwms-support@fnal.gov ● The official project Web page is http://tinyurl.com/glideinWMS ● CMS frontend at UCSD: http://glidein-collector.t2.ucsd.edu:8319/vofrontend/monitor/frontend_UCSD-v5_2/frontendStatus.html ● OSG glidein factory at UCSD: http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory and http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.html
  • 34. Acknowledgments ● The glideinWMS is a CMS-led project developed mostly at FNAL, with contributions from UCSD and ISI ● The glideinWMS factory operations at UCSD are sponsored by OSG ● The funding comes from NSF, DOE and the UC system