Agenda
• HPC in a Box
• Cluster Use-Case
QNIBInventory
Dashboards
• Development Workflow w/ Containers
• Microservices

This workshop was recorded:
https://youtu.be/L9SyY9TZyY4
HPC in a Box
Mock-Up a Cluster Stack
• ‘Hello World’ everything
ibsim, SLURM cluster, monitoring
• First attempt using VirtualBox
Predefined resource use
Gave up after starting a handful of VMs
• Stumbled upon Docker in 2013
Applied it to my problem
QNIBMonitoring
• CONTAINERISE all the things
Service discovery, health checks, clustering
[Diagram: container stack, boxes labelled CONTAINER / RELEVANT SERVICE. Backend: srv backend / consul]
Consul
QNIBMonitoring
• CONTAINERISE all the things
Service discovery, health checks, clustering
metrics engine
[Diagram: container stack, boxes labelled CONTAINER / RELEVANT SERVICE. Backend: srv backend / consul. Performance: carbon / carbon, graphite-api / graphite-api, grafana / grafana]
Performance (graphite)
QNIBMonitoring
• CONTAINERISE all the things
Service discovery, health checks, clustering
metrics engine
log event framework
[Diagram: container stack, boxes labelled CONTAINER / RELEVANT SERVICE. Backend: srv backend / consul. Performance: carbon / carbon, graphite-api / graphite-api, grafana / grafana. Log/Events: elasticsearch, logger / logstash, kibana3 / kibana3, kopf / es-kopf]
Logs (ELK)
$ echo "Hello World"|nc -w1 192.168.99.100 5514
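The netcat one-liner above pushes a single line of text to the log listener. A minimal Python sketch of the same idea follows, assuming the listener accepts newline-terminated plain text over TCP (netcat's default); the function names are illustrative and not part of the workshop material.

```python
import socket


def encode_log_line(message: str) -> bytes:
    """Return the message as one newline-terminated UTF-8 line."""
    return (message.rstrip("\n") + "\n").encode("utf-8")


def send_log_line(message: str, host: str = "192.168.99.100",
                  port: int = 5514, timeout: float = 1.0) -> int:
    """Send one log line over TCP and return the number of bytes sent.

    Host and port are the values from the slide; the one-second timeout
    mirrors the -w1 flag of the netcat call.
    """
    payload = encode_log_line(message)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
    return len(payload)


# Usage (needs a reachable listener):
#   send_log_line("Hello World")
```

Keeping the encoding separate from the network call makes the payload easy to check without a running Logstash instance.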
Event/Metric Correlation
Cluster Use-Case
Cluster Use-case
• Small SLURM cluster
7 compute nodes, 2 spine and 2 leaf switches
simple workload (ib_write_bw)
[Diagram: one docker host running the full container stack, boxes labelled CONTAINER / RELEVANT SERVICE. Network labels: IB, ETH.
Backend: srv backend / consul
Cluster: opensm / opensm, compute0 to compute6
Performance: carbon / carbon, graphite-api / graphite-api, grafana / grafana
Log/Events: elasticsearch, logger / logstash, kibana / kibana, kopf / es-kopf
Inventory: neo4j / neo4j, inventory / QNIBInv
Further labels: clusterinfo, ibinfo, slurminfo]
QNIBInventory
• InfiniBand topology is reflected in GraphDB
Routing information
• SLURM information
• Enrich Log/Events
[Figure labels: before, after]
• Build up history
Dashboards
Static Dashboards
IB Dashboard
Autogenerated Dashboards
• SLURM job overview
• Individual Job Dashboards
Benefits / Implications
Rapid Prototyping
• New dashboard cubism?
[Diagram: the container stack from the Cluster Use-case slide, extended by one new box: cubism / cubism.js]
Rapid Prototyping #2
• New backend for graphite: InfluxDB
Written in Go, explicit time-series database, nice API, SQL-like queries
Accepts the same “key val ts” input and integrates with Graphite-API
[Diagram: the container stack from the Cluster Use-case slide, with the label carbon.service.consul]
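The “key val ts” input mentioned on this slide is the Graphite plaintext format: metric key, value, and Unix timestamp on one line, separated by spaces. A minimal sketch of building such a line follows; the function name and the example metric key are hypothetical.

```python
import time
from typing import Optional


def graphite_line(key: str, value: float, ts: Optional[int] = None) -> str:
    """Format one metric as a "key val ts" line for the plaintext protocol."""
    if not key or any(ch.isspace() for ch in key):
        # Fields are whitespace-separated, so the key must not contain any.
        raise ValueError("metric key must be non-empty and free of whitespace")
    if ts is None:
        ts = int(time.time())  # default to the current time
    return f"{key} {value} {int(ts)}\n"


# Hypothetical metric key, for illustration only.
print(graphite_line("compute0.cpu.load", 0.42, 1436780000), end="")
```

Because any backend that accepts this line format can take the place of carbon, the senders do not need to change when the backend is swapped.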
[Diagram, next build of the same slide: new boxes influxdb / influxdb, graphite-api' / graphite-api, grafana' / grafana and carbon-relay / carbon next to the existing stack; label carbon.service.consul]
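The name carbon.service.consul follows Consul's DNS interface, where a service is looked up as [tag.]<service>.service[.datacenter].<domain>. The sketch below builds such names and resolves them; the helper names are illustrative, and resolution only works if the host's resolver forwards the consul domain to a Consul agent, which is an assumption about the setup.

```python
import socket
from typing import List, Optional


def consul_service_fqdn(service: str, tag: Optional[str] = None,
                        datacenter: Optional[str] = None,
                        domain: str = "consul") -> str:
    """Build the DNS name under which Consul publishes a service."""
    parts = []
    if tag:
        parts.append(tag)
    parts += [service, "service"]
    if datacenter:
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)


def resolve_service(service: str, port: int) -> List[str]:
    """Return the addresses currently registered for the service.

    Assumes the local resolver forwards the consul domain to Consul.
    """
    infos = socket.getaddrinfo(consul_service_fqdn(service), port,
                               proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


print(consul_service_fqdn("carbon"))
```

Clients that address the metrics endpoint by this name keep working when a different container registers itself as the carbon service.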
Development Environment
• Write local, execute within container
Reproducible, reliable development environment
[Diagram: Workstation (~/dev/inventory/, ~/prod/inventory/) and Container (/opt/inventory/)]
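One common way to get "write local, execute within container" is a bind mount of the workstation directory into the container. The sketch below only assembles the docker run command line; the image name and the command are hypothetical, and the slides do not show how the author actually started the container.

```python
from pathlib import Path
from typing import List, Sequence


def docker_run_argv(image: str, command: Sequence[str],
                    host_dir: str = "~/dev/inventory",
                    container_dir: str = "/opt/inventory") -> List[str]:
    """Build a docker run call that bind-mounts host_dir at container_dir."""
    host_path = Path(host_dir).expanduser()
    if not host_path.is_absolute():
        # A relative source would be treated as a named volume by docker.
        raise ValueError("host_dir must expand to an absolute path")
    return [
        "docker", "run", "--rm",
        "-v", f"{host_path}:{container_dir}",  # local edits are visible inside
        "-w", container_dir,                   # run from the mounted directory
        image, *command,
    ]


# Hypothetical image and command, for illustration only.
print(" ".join(docker_run_argv("example/inventory", ["python", "inventory.py"],
                               host_dir="/tmp/dev/inventory")))
```

Pointing host_dir at ~/prod/inventory instead would run a different checkout in the same container environment.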
Iterate on Log-Patterns
[Diagram: the docker host and full container stack from the Cluster Use-case slide, including compute0 to compute6, IB and ETH]
Iterate on Log-Patterns
• GROK test
NEW_PORT:
  compare: "%{NEW_PORT}"
  input: "Creating new port object with GUID 0x0002c90300ee1b81"
  result: {
    "src_port_guid": "2c90300ee1b81"
  }
• GROK pattern
NEW_PORT Creating\s+new\s+port\s+object\s+with\s+GUID\s+%{SRC_PORT_GUID}
• Logstash
if [osm_func] == "ni_rcv_process_existing_ca_or_router" {
  grok {
    patterns_dir => "/etc/grok/patterns/"
    match => [ "message", "%{NEW_PORT}" ]
  }
}
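To try the pattern outside Logstash, it can be approximated with a plain regular expression. The sketch below assumes that the "s+" sequences on the slide are "\s+" whose backslashes were lost in extraction. The slide does not define SRC_PORT_GUID, so the sub-pattern here is an assumption, chosen so that it reproduces the test result shown (prefix 0x and leading zeros dropped).

```python
import re
from typing import Dict, Optional

# Assumption: SRC_PORT_GUID is not defined on the slide. This version skips
# the "0x" prefix and leading zeros to match the result given in the test.
SRC_PORT_GUID = r"0x0*(?P<src_port_guid>[0-9a-fA-F]+)"

NEW_PORT = re.compile(
    r"Creating\s+new\s+port\s+object\s+with\s+GUID\s+" + SRC_PORT_GUID
)


def match_new_port(message: str) -> Optional[Dict[str, str]]:
    """Return the captured fields, or None if the message does not match."""
    found = NEW_PORT.search(message)
    return found.groupdict() if found else None


print(match_new_port("Creating new port object with GUID 0x0002c90300ee1b81"))
```

Checking a pattern against one known input and one expected result, as the GROK test on the slide does, keeps the iteration loop short before the pattern goes into the Logstash configuration.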
Iterate on Log-Patterns
• Start minimal stack on workstation
[Diagram: MacBook running srv backend / consul and the Log/Events boxes: elasticsearch, logger / logstash, kibana / kibana, kopf / es-kopf]
Microservices
Microservice Definition
"Loosely coupled service oriented architecture with bounded contexts"
- Adrian Cockcroft
bounded context
• Iterate within the context
• Rely on stable version of (loosely coupled) dependencies
[Diagram: MacBook running srv backend / consul and the Log/Events boxes (elasticsearch, logger / logstash, kibana / kibana, kopf / es-kopf); Thinkpad running srv backend / consul, Log/Events elasticsearch and the Inventory boxes (neo4j / neo4j, inventory / QNIBInv)]
µServices: Break down Silos
[Diagram labels: Super User, Prod Mgr, Sys Arch, Dev, QA, Sys Adm, Net Adm, HPC Snowflakes; Product Team Using Monolithic Delivery; Product Team Using Microservices; API; Srv Team]
Conclusion
• Linux containers do not add much overhead
• Lightning-fast development iteration, since boot-up is a plain ‘fork()’
• Build, distribute and run: extremely powerful
private/public registries
• Image hierarchy lets you add a new service in minutes
Q&A

HPC in a Box - Docker Workshop at ISC 2015
