A batch scheduling system with Docker containers
Web - http://www.genouest.org/godocker/
Code - https://bitbucket.org/osallou/go-docker
Twitter - #godocker
Olivier Sallou – IRISA - 2015
CC-BY-SA
GoDocker
What
 Execute batch jobs/commands in containers
 For multi-user systems (LDAP-based, for example)
 With personal and/or common shared directories
(home, central data, …)
 In a scalable architecture to handle massive job
submission.
Why?
 Need for an open source job scheduling and
submission tool (like Sun Grid Engine)
• with isolation of resources
• with tools made available through containers, avoiding
cluster-specific OS/version issues
• with remote and authenticated access
• with access to job resource monitoring
How?
 Using proven technologies and software
 Using scalable components
 With plugin support to easily modify the default
behavior and adapt it to YOUR system.
Technologies
 Docker: for containers
 Docker Swarm or Apache Mesos for job execution
and dispatch, as well as for node monitoring.
 Google cAdvisor: for job monitoring
 Language: Python
 Database backends: MongoDB, Redis, InfluxDB
(optional).
Features
 Remote execution of a job (command line)
• in a Docker container
• with requested resources (cpu, memory)
• with requested directories mounted in container
(according to ACL)
 Allowed container images can be limited to a list
(config)
 User can specify the container image (config)
 Optional root access to container (config)
Features
 Interactive sessions (ssh) in a container
 User/Group priority and quotas.
 Job scheduling according to multiple properties
(priority, waiting time, previous usage, …). Fair share
algorithm available.
 Plugins to modify or add features.
 Global and per job monitoring (past and live).
 Partial DRMAA v1 support
Architecture
[Architecture diagram; components shown:]
• CLI / Web UI
• Web proxy, web servers
• Databases: InfluxDB, Redis, MongoDB
• Scheduler, watchers
• Dispatcher (Swarm, Mesos)
• Execution nodes (with Docker)
• Shared file system
[Diagram labels: "Submit task", "Monitor tasks"]
Databases
 MongoDB:
• used to store jobs data
 Redis:
• uses lists to dispatch requests among the
executors that monitor jobs
 InfluxDB:
• optional database for time-based data
(cpu/memory usage, number of jobs, etc.)
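As a rough illustration of the Redis role described above, the sketch below models a list used as a work queue shared by several executors. It is only a sketch: an in-memory deque stands in for a Redis list (push on one end, pop from the other, similar in spirit to LPUSH/RPOP), and the function names, job ids and round-robin polling are assumptions made for the example, not GoDocker's actual keys or code.

```python
from collections import deque

# In-memory stand-in for a Redis list (assumption: the real system talks
# to a Redis server; every name here is hypothetical).
running_jobs = deque()


def push_job(job_id):
    """Producer side: add a job id at the head of the list (like LPUSH)."""
    running_jobs.appendleft(job_id)


def pop_job():
    """Consumer side: take the oldest job id from the tail (like RPOP)."""
    return running_jobs.pop() if running_jobs else None


def dispatch(executors):
    """Let executors take turns popping job ids until the list is empty."""
    assignments = {name: [] for name in executors}
    turn = 0
    while True:
        job_id = pop_job()
        if job_id is None:
            break
        assignments[executors[turn % len(executors)]].append(job_id)
        turn += 1
    return assignments


for job_id in (1, 2, 3, 4, 5):
    push_job(job_id)

result = dispatch(["executor-a", "executor-b"])
```

Because each pop removes the entry from the list, a job id is handed to exactly one executor, which is what lets several executor processes share the monitoring load.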
Components
 CLI : Command Line Interface
 Web interface / REST API
 Authentication / ACLs => plugins
 Scheduler => plugins
 Executor => plugins
 Watchers => plugins
Command Line Interface
 Execute commands using the REST API of the web
server:
• submit and control running jobs
• download output files from jobs
• etc.
 Some commands are dedicated to administrators:
• project and user quota manager
• etc.
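To make the CLI-over-REST idea concrete, here is a minimal sketch of how a client might build a job submission request. The server URL, the endpoint path, the bearer-token header and the JSON field names are all assumptions for illustration; they are not taken from GoDocker's documented API. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical server address; replace with a real deployment URL.
SERVER = "https://godocker.example.org"


def build_submit_request(command, image, cpu, ram_gb, token):
    """Build (but do not send) an HTTP request that submits a job."""
    job = {
        "command": command,                         # command line to run
        "image": image,                             # container image to use
        "requirements": {"cpu": cpu, "ram": ram_gb},  # requested resources
    }
    return urllib.request.Request(
        url=SERVER + "/api/task",                   # hypothetical endpoint
        data=json.dumps(job).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,     # hypothetical auth scheme
        },
        method="POST",
    )


request = build_submit_request("echo hello", "debian:stable", 1, 2, "my-token")
payload = json.loads(request.data)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.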
Web server
 Submit and manage tasks via web UI
 REST interface for remote control
 Partial DRMAA v1 integration
 Register new tasks for scheduler.
Authentication / ACL
 A plugin is available to authenticate users against an
LDAP directory, but it should be adapted to your needs
• manage authentication for web site
• define which volumes/directories can be
mounted in container (user home directory
etc.), and their mode (ro, rw).
 Other plugins can be developed for specific
authentication/ACL needs
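The two duties listed above (authenticating users, and deciding which directories may be mounted and in which mode) can be sketched as a small plugin interface. The class names, method names and paths below are hypothetical, and a dict replaces the LDAP directory so that the example is self-contained.

```python
class AuthPlugin:
    """Hypothetical plugin interface; method names are assumptions."""

    def authenticate(self, login, password):
        raise NotImplementedError

    def allowed_volumes(self, login):
        raise NotImplementedError


class StaticAuthPlugin(AuthPlugin):
    """Toy plugin backed by a dict instead of an LDAP directory."""

    def __init__(self, users):
        self.users = users  # login -> password

    def authenticate(self, login, password):
        return login in self.users and self.users[login] == password

    def allowed_volumes(self, login):
        # Volume name -> host path and the most permissive mode allowed.
        return {
            "home": {"path": "/home/" + login, "mode": "rw"},
            "data": {"path": "/data/shared", "mode": "ro"},
        }


def resolve_mounts(plugin, login, requested):
    """Keep only the requested volumes the ACL allows, capping their mode."""
    allowed = plugin.allowed_volumes(login)
    mounts = []
    for name, mode in requested.items():
        volume = allowed.get(name)
        if volume is None:
            continue  # volume not in the ACL: silently dropped here
        # Never grant rw when the ACL only allows ro.
        granted = "rw" if mode == "rw" and volume["mode"] == "rw" else "ro"
        mounts.append((volume["path"], granted))
    return mounts


plugin = StaticAuthPlugin({"alice": "secret"})
mounts = resolve_mounts(plugin, "alice",
                        {"home": "rw", "data": "rw", "scratch": "rw"})
```

Here the request for read-write access to the shared data directory is downgraded to read-only, and the unknown "scratch" volume is ignored.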
Scheduler
 Only one instance of the process
 The scheduler reorders job requests:
• per priority (user and/or group)
• reject if quota reached
• different algorithms are available:
• FIFO
• fair share
• others can be added with plugins
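The ordering and quota steps above can be sketched as follows. This is not GoDocker's algorithm: the scoring formula, the weights and the field names are assumptions chosen only to show how priority, waiting time and previous usage might be combined into a fair-share ranking, next to a plain FIFO ordering.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    user: str
    priority: int        # higher runs first (assumption)
    submitted_at: float  # submission time in seconds


def fifo_order(jobs):
    """FIFO: oldest submission first."""
    return sorted(jobs, key=lambda job: job.submitted_at)


def fair_share_order(jobs, past_usage, now,
                     w_priority=1.0, w_wait=0.01, w_usage=0.5):
    """Rank jobs by a weighted score (hypothetical weights).

    Priority and waiting time raise the score; the user's past usage
    lowers it, so heavy users are pushed back.
    """
    def score(job):
        waiting = now - job.submitted_at
        return (w_priority * job.priority
                + w_wait * waiting
                - w_usage * past_usage.get(job.user, 0.0))

    return sorted(jobs, key=score, reverse=True)


def split_by_quota(jobs, past_usage, quotas):
    """Separate jobs whose user has reached a quota from the others."""
    accepted, rejected = [], []
    for job in jobs:
        limit = quotas.get(job.user)
        over = limit is not None and past_usage.get(job.user, 0.0) >= limit
        (rejected if over else accepted).append(job)
    return accepted, rejected


jobs = [
    Job(1, "alice", priority=1, submitted_at=0.0),
    Job(2, "bob", priority=1, submitted_at=10.0),
    Job(3, "alice", priority=5, submitted_at=20.0),
]
usage = {"alice": 10.0, "bob": 0.0}

fifo_ids = [job.job_id for job in fifo_order(jobs)]
fair_ids = [job.job_id for job in fair_share_order(jobs, usage, now=100.0)]
```

With these sample values, bob's job moves ahead of both of alice's jobs under fair share because alice has already consumed resources, while FIFO keeps submission order.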
Scheduler
 It executes the job command using the executor
plugin:
• Docker Swarm
• Apache Mesos
• others can be developed
• manage port mapping for interactive jobs
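A minimal sketch of the two ideas on this slide, a pluggable executor and port mapping for interactive jobs, is given below. The interface, the port range and the job fields are hypothetical; a real executor plugin would call Docker Swarm or Apache Mesos instead of returning a dict.

```python
class PortAllocator:
    """Hands out host ports from a fixed range (range is an assumption)."""

    def __init__(self, first, last):
        self.free = list(range(first, last + 1))
        self.used = {}  # job id -> host port

    def acquire(self, job_id):
        if job_id in self.used:
            return self.used[job_id]
        if not self.free:
            raise RuntimeError("no free port left")
        port = self.free.pop(0)
        self.used[job_id] = port
        return port

    def release(self, job_id):
        port = self.used.pop(job_id, None)
        if port is not None:
            self.free.append(port)


class Executor:
    """Hypothetical executor plugin interface."""

    def run(self, job):
        raise NotImplementedError


class FakeExecutor(Executor):
    """Stand-in executor: records what it would start, starts nothing."""

    def __init__(self, ports):
        self.ports = ports

    def run(self, job):
        mapping = {}
        if job.get("interactive"):
            # Map the container's ssh port (22) to a free host port.
            mapping = {22: self.ports.acquire(job["id"])}
        return {"id": job["id"], "ports": mapping}


ports = PortAllocator(10000, 10002)
executor = FakeExecutor(ports)
first = executor.run({"id": 1, "interactive": True})
second = executor.run({"id": 2, "interactive": True})
batch = executor.run({"id": 3})
```

Releasing the port when a job ends (`ports.release(job_id)`) returns it to the pool for later interactive sessions.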
Executor
 Multiple instances can be run to scale with the
number of jobs to monitor.
 Manages kill or reschedule requests
 Checks the status of the job (running, over)
 Triggers watchers (see next slide)
 Updates the job status when the job is over.
Watcher
 Watchers are plugins called by executors during
job execution to act upon job life cycle:
• ex: kill job
• ex: update some meta-data
 New plugins can be added
 Available:
• Maxlifespanwatcher: kill a job after X days.
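The watcher idea can be sketched as a small plugin that inspects a running job and returns an action for the executor to apply. The interface, the action names and the job fields below are assumptions for illustration; only the behavior (kill a job after X days) comes from the slide.

```python
from datetime import datetime, timedelta


class Watcher:
    """Hypothetical watcher interface: return an action name or None."""

    def check(self, job, now):
        raise NotImplementedError


class MaxLifespanWatcher(Watcher):
    """Requests a kill once a job has run longer than max_days."""

    def __init__(self, max_days):
        self.max_lifespan = timedelta(days=max_days)

    def check(self, job, now):
        if now - job["started_at"] > self.max_lifespan:
            return "kill"
        return None


def run_watchers(watchers, job, now):
    """Collect the actions requested by all watchers for one job."""
    actions = []
    for watcher in watchers:
        action = watcher.check(job, now)
        if action is not None:
            actions.append(action)
    return actions


job = {"id": 42, "started_at": datetime(2015, 1, 1)}
watchers = [MaxLifespanWatcher(max_days=7)]
early = run_watchers(watchers, job, now=datetime(2015, 1, 3))
late = run_watchers(watchers, job, now=datetime(2015, 1, 10))
```

Returning an action instead of killing the job directly keeps the watcher free of any executor-specific code, which is one plausible way to keep such plugins small.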
Monitoring
 cAdvisor
• helps to monitor “live” job cpu/memory
usage.
• data can be saved in InfluxDB for later
analysis.
 Data from past jobs is kept in MongoDB for
statistics/analysis.
About
 Authors:
• Olivier Sallou (IRISA / Univ. Rennes 1)
• Cyril Monjeaud (IRISA / Univ. Rennes 1)
 License: Apache-2.0
Web: http://www.genouest.org/godocker/
Code: https://bitbucket.org/osallou/go-docker
