SlideShare una empresa de Scribd logo
1 de 18
Descargar para leer sin conexión
bit.ly/kubemaster1
1
GPU Enablement for Data
Science on OpenShift
Pete MacKinnon
Red Hat AI Center of Excellence
@pdmackinnon
● pmackinn@redhat.com
● Principal Engineer in the Red Hat AI Center of Excellence
● Kubeflow committer since project formation
● Open Data Hub and NVIDIA GPU Operator contributor
● KubeCon, TensorFlow World, GTC, ODSC, OpenShift
Commons, and SCaLE 17x presenter
● Technical Editor for upcoming Kubeflow publication
● Co-author of “Linux Unleashed”
● Thirty years of distributed computing consulting and
engineering experience
• Data science: data and models
• AI/ML lifecycle: training to inference
• Scalars, vectors, and tensors
• CPU and GPU
• Notebooks and frameworks
• The OpenShift GPU operator “family”
• The components of GPU enablement
• Installation and demo
Agenda
Data
Models
The AI/ML lifecycle
Inference/Serving
Training
Data collection
Feature
extraction
Labeling
Monitoring
Logging
Analysis
Transformation
Validation
Splitting
Model validation
Hyperparameter tuning
Algorithm selection or
development
Model Data and Model
in Production
Data
Scalars, vectors, and tensors
Scalar - a real number having magnitude that measures
something: volume, density, speed, energy, mass, time, etc.
Vector - a one-dimensional array of scalars: force, velocity,
momentum, etc.
Tensor - a higher-order algebraic object that could be a scalar, a
vector, a multidimensional array, a multilinear map, etc.
Modern CPU have advanced instruction sets for vector algebra
but modern GPU are built specifically to perform complex
tensor operations with a high degree of parallelism
Scalars, vectors, and tensors
How many matrix multiplications can be done in one clock cycle?
Image: https://iq.opengenus.org/
10¹ 10⁴ 10⁵
So, in one clock cycle...
CPU (scalar)
CPU/GPU
(vector)
GPU (tensor)
Or, DL with real world data...
Object
(scalar)
Movement
(vector)
Classification, velocity,
bearing, and much more
(tensor)
CPU and GPU
NVIDIA Ampere A100
• 6912 FP32 CUDA Cores
• 432 Gen3 Tensor Cores
but
• FP32 -> 19.5 TFLOPS
AMD EPYC 7702 (Rome)
• 64 CPU Cores
• 128 Threads
• 2.0GHz Base Clock
• FP32 -> 1-2 TFLOPS
A GPU notebook
Profit
380x speedup over CPU in basic CNN smoke test
(Intel Xeon E5-2686 vs. NVIDIA V100-SXM2-16Gi)
Special Resource Operator
(SRO)
● Community operator
● Reference
implementation for other
specialized hardware
○ NIC, FPGA
● Provided the code basis
for the NVIDIA GPU
Operator
● Deployed from
OperatorHub
GPU operators
NVIDIA GPU Operator
● Certified and supported on
OpenShift by NVIDIA and Red Hat
● Can be deployed from embedded
OperatorHub or with Helm
Both operators require node feature
discovery (NFD)
NVIDIA also provides the GPU feature
operator for enhanced labeling
Operator components
• Container-runtime-toolkit: The NVIDIA GPU Operator
supports docker and cri-o container runtimes. This daemonset
ensures the correct runtime setup for the GPU hook.
• Driver: A container deployed as a daemonset that holds all
userspace and kernelspace software to make the GPU device
work.
• Device plugin: A daemonset that monitors the health and
availability of the GPU on the node. Vital for pod scheduling.
• DCGM: Data Center GPU Monitoring - a node exporter that
captures GPU metrics for use by Prometheus.
nodeSelector:
feature.node.kubernetes.io/pci-10de.present: "true"
Installation
Demo
Thank You

Más contenido relacionado

La actualidad más candente

Operatorhub.io and your Kubernetes cluster | DevNation Tech Talk
Operatorhub.io and your Kubernetes cluster | DevNation Tech TalkOperatorhub.io and your Kubernetes cluster | DevNation Tech Talk
Operatorhub.io and your Kubernetes cluster | DevNation Tech TalkRed Hat Developers
 
16. Cncf meetup-docker
16. Cncf meetup-docker16. Cncf meetup-docker
16. Cncf meetup-dockerJuraj Hantak
 
OpenShift Application Development | DO288 | Red Hat OpenShift
OpenShift Application Development | DO288 | Red Hat OpenShiftOpenShift Application Development | DO288 | Red Hat OpenShift
OpenShift Application Development | DO288 | Red Hat OpenShiftGlobal Knowledge Technologies
 
Journey of Kubernetes Scaling
Journey of Kubernetes ScalingJourney of Kubernetes Scaling
Journey of Kubernetes ScalingOpsta
 
What you have to know about Certified Kubernetes Administrator (CKA)
What you have to know about Certified Kubernetes Administrator (CKA)What you have to know about Certified Kubernetes Administrator (CKA)
What you have to know about Certified Kubernetes Administrator (CKA)Opsta
 
Kubernetes - A Rising Hero
Kubernetes - A Rising HeroKubernetes - A Rising Hero
Kubernetes - A Rising HeroHuynh Thai Bao
 
Containerd + buildkit breakout
Containerd + buildkit breakoutContainerd + buildkit breakout
Containerd + buildkit breakoutDocker, Inc.
 
Cicd pixelfederation
Cicd pixelfederationCicd pixelfederation
Cicd pixelfederationJuraj Hantak
 
Managing kubernetes deployment with operators
Managing kubernetes deployment with operatorsManaging kubernetes deployment with operators
Managing kubernetes deployment with operatorsCloud Technology Experts
 
GlueCon kubernetes & container engine
GlueCon kubernetes & container engineGlueCon kubernetes & container engine
GlueCon kubernetes & container enginebrendandburns
 
Introduction to Kubernetes and GKE
Introduction to Kubernetes and GKEIntroduction to Kubernetes and GKE
Introduction to Kubernetes and GKEOpsta
 
Multi-cloud Kubernetes BCDR with Velero
Multi-cloud Kubernetes BCDR with VeleroMulti-cloud Kubernetes BCDR with Velero
Multi-cloud Kubernetes BCDR with VeleroKublr
 
Docker ee an architecture and operations overview
Docker ee an architecture and operations overviewDocker ee an architecture and operations overview
Docker ee an architecture and operations overviewDocker, Inc.
 
Extended and embedding: containerd update & project use cases
Extended and embedding: containerd update & project use casesExtended and embedding: containerd update & project use cases
Extended and embedding: containerd update & project use casesPhil Estes
 
Implementing an Automated Staging Environment
Implementing an Automated Staging EnvironmentImplementing an Automated Staging Environment
Implementing an Automated Staging EnvironmentDaniel Oliveira Filho
 
Five Lessons Learned from Large-scale Implementation of Kubernetes in the Ent...
Five Lessons Learned from Large-scale Implementation of Kubernetes in the Ent...Five Lessons Learned from Large-scale Implementation of Kubernetes in the Ent...
Five Lessons Learned from Large-scale Implementation of Kubernetes in the Ent...DevOps.com
 
Kubernetes configuration and security policies with KubeLinter | DevNation Te...
Kubernetes configuration and security policies with KubeLinter | DevNation Te...Kubernetes configuration and security policies with KubeLinter | DevNation Te...
Kubernetes configuration and security policies with KubeLinter | DevNation Te...Red Hat Developers
 
How Docker EE is Finnish Railway’s Ticket to App Modernization
How Docker EE is Finnish Railway’s Ticket to App ModernizationHow Docker EE is Finnish Railway’s Ticket to App Modernization
How Docker EE is Finnish Railway’s Ticket to App ModernizationDocker, Inc.
 
FOSDEM 2019: A containerd Project Update
FOSDEM 2019: A containerd Project UpdateFOSDEM 2019: A containerd Project Update
FOSDEM 2019: A containerd Project UpdatePhil Estes
 

La actualidad más candente (20)

Operatorhub.io and your Kubernetes cluster | DevNation Tech Talk
Operatorhub.io and your Kubernetes cluster | DevNation Tech TalkOperatorhub.io and your Kubernetes cluster | DevNation Tech Talk
Operatorhub.io and your Kubernetes cluster | DevNation Tech Talk
 
16. Cncf meetup-docker
16. Cncf meetup-docker16. Cncf meetup-docker
16. Cncf meetup-docker
 
OpenShift Application Development | DO288 | Red Hat OpenShift
OpenShift Application Development | DO288 | Red Hat OpenShiftOpenShift Application Development | DO288 | Red Hat OpenShift
OpenShift Application Development | DO288 | Red Hat OpenShift
 
Journey of Kubernetes Scaling
Journey of Kubernetes ScalingJourney of Kubernetes Scaling
Journey of Kubernetes Scaling
 
What you have to know about Certified Kubernetes Administrator (CKA)
What you have to know about Certified Kubernetes Administrator (CKA)What you have to know about Certified Kubernetes Administrator (CKA)
What you have to know about Certified Kubernetes Administrator (CKA)
 
Kubernetes Logging
Kubernetes LoggingKubernetes Logging
Kubernetes Logging
 
Kubernetes - A Rising Hero
Kubernetes - A Rising HeroKubernetes - A Rising Hero
Kubernetes - A Rising Hero
 
Containerd + buildkit breakout
Containerd + buildkit breakoutContainerd + buildkit breakout
Containerd + buildkit breakout
 
Cicd pixelfederation
Cicd pixelfederationCicd pixelfederation
Cicd pixelfederation
 
Managing kubernetes deployment with operators
Managing kubernetes deployment with operatorsManaging kubernetes deployment with operators
Managing kubernetes deployment with operators
 
GlueCon kubernetes & container engine
GlueCon kubernetes & container engineGlueCon kubernetes & container engine
GlueCon kubernetes & container engine
 
Introduction to Kubernetes and GKE
Introduction to Kubernetes and GKEIntroduction to Kubernetes and GKE
Introduction to Kubernetes and GKE
 
Multi-cloud Kubernetes BCDR with Velero
Multi-cloud Kubernetes BCDR with VeleroMulti-cloud Kubernetes BCDR with Velero
Multi-cloud Kubernetes BCDR with Velero
 
Docker ee an architecture and operations overview
Docker ee an architecture and operations overviewDocker ee an architecture and operations overview
Docker ee an architecture and operations overview
 
Extended and embedding: containerd update & project use cases
Extended and embedding: containerd update & project use casesExtended and embedding: containerd update & project use cases
Extended and embedding: containerd update & project use cases
 
Implementing an Automated Staging Environment
Implementing an Automated Staging EnvironmentImplementing an Automated Staging Environment
Implementing an Automated Staging Environment
 
Five Lessons Learned from Large-scale Implementation of Kubernetes in the Ent...
Five Lessons Learned from Large-scale Implementation of Kubernetes in the Ent...Five Lessons Learned from Large-scale Implementation of Kubernetes in the Ent...
Five Lessons Learned from Large-scale Implementation of Kubernetes in the Ent...
 
Kubernetes configuration and security policies with KubeLinter | DevNation Te...
Kubernetes configuration and security policies with KubeLinter | DevNation Te...Kubernetes configuration and security policies with KubeLinter | DevNation Te...
Kubernetes configuration and security policies with KubeLinter | DevNation Te...
 
How Docker EE is Finnish Railway’s Ticket to App Modernization
How Docker EE is Finnish Railway’s Ticket to App ModernizationHow Docker EE is Finnish Railway’s Ticket to App Modernization
How Docker EE is Finnish Railway’s Ticket to App Modernization
 
FOSDEM 2019: A containerd Project Update
FOSDEM 2019: A containerd Project UpdateFOSDEM 2019: A containerd Project Update
FOSDEM 2019: A containerd Project Update
 

Similar a GPU enablement for data science on OpenShift | DevNation Tech Talk

Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platforminside-BigData.com
 
Kindratenko hpc day 2011 Kiev
Kindratenko hpc day 2011 KievKindratenko hpc day 2011 Kiev
Kindratenko hpc day 2011 KievVolodymyr Saviak
 
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Haidee McMahon
 
Using a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application PerformanceUsing a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application PerformanceOdinot Stanislas
 
Making the most out of Heterogeneous Chips with CPU, GPU and FPGA
Making the most out of Heterogeneous Chips with CPU, GPU and FPGAMaking the most out of Heterogeneous Chips with CPU, GPU and FPGA
Making the most out of Heterogeneous Chips with CPU, GPU and FPGAFacultad de Informática UCM
 
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdfNVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdfMuhammadAbdullah311866
 
Infrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningInfrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningSergey Karayev
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...E-Commerce Brasil
 
Scaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftScaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftDatabricks
 
"The Caffe2 Framework for Mobile and Embedded Deep Learning," a Presentation ...
"The Caffe2 Framework for Mobile and Embedded Deep Learning," a Presentation ..."The Caffe2 Framework for Mobile and Embedded Deep Learning," a Presentation ...
"The Caffe2 Framework for Mobile and Embedded Deep Learning," a Presentation ...Edge AI and Vision Alliance
 
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019Unity Technologies
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLinaro
 
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
 Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo... Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo...
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...Rogue Wave Software
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentationtestSri1
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networksinside-BigData.com
 

Similar a GPU enablement for data science on OpenShift | DevNation Tech Talk (20)

Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platform
 
Kindratenko hpc day 2011 Kiev
Kindratenko hpc day 2011 KievKindratenko hpc day 2011 Kiev
Kindratenko hpc day 2011 Kiev
 
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
 
Using a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application PerformanceUsing a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application Performance
 
Current Trends in HPC
Current Trends in HPCCurrent Trends in HPC
Current Trends in HPC
 
Making the most out of Heterogeneous Chips with CPU, GPU and FPGA
Making the most out of Heterogeneous Chips with CPU, GPU and FPGAMaking the most out of Heterogeneous Chips with CPU, GPU and FPGA
Making the most out of Heterogeneous Chips with CPU, GPU and FPGA
 
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdfNVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
 
Infrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep LearningInfrastructure and Tooling - Full Stack Deep Learning
Infrastructure and Tooling - Full Stack Deep Learning
 
Nvidia at SEMICon, Munich
Nvidia at SEMICon, MunichNvidia at SEMICon, Munich
Nvidia at SEMICon, Munich
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
Programming Models for Heterogeneous Chips
Programming Models for  Heterogeneous ChipsProgramming Models for  Heterogeneous Chips
Programming Models for Heterogeneous Chips
 
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
Fórum E-Commerce Brasil | Tecnologias NVIDIA aplicadas ao e-commerce. Muito a...
 
Scaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at LyftScaling Apache Spark on Kubernetes at Lyft
Scaling Apache Spark on Kubernetes at Lyft
 
"The Caffe2 Framework for Mobile and Embedded Deep Learning," a Presentation ...
"The Caffe2 Framework for Mobile and Embedded Deep Learning," a Presentation ..."The Caffe2 Framework for Mobile and Embedded Deep Learning," a Presentation ...
"The Caffe2 Framework for Mobile and Embedded Deep Learning," a Presentation ...
 
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019
Optimizing HDRP with NVIDIA Nsight Graphics – Unite Copenhagen 2019
 
LCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience ReportLCU13: GPGPU on ARM Experience Report
LCU13: GPGPU on ARM Experience Report
 
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
 Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo... Debugging Numerical Simulations on Accelerated Architectures  - TotalView fo...
Debugging Numerical Simulations on Accelerated Architectures - TotalView fo...
 
Rapids: Data Science on GPUs
Rapids: Data Science on GPUsRapids: Data Science on GPUs
Rapids: Data Science on GPUs
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentation
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 

Más de Red Hat Developers

DevNation Tech Talk: Getting GitOps
DevNation Tech Talk: Getting GitOpsDevNation Tech Talk: Getting GitOps
DevNation Tech Talk: Getting GitOpsRed Hat Developers
 
Exploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on KubernetesExploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on KubernetesRed Hat Developers
 
GitHub Makeover | DevNation Tech Talk
GitHub Makeover | DevNation Tech TalkGitHub Makeover | DevNation Tech Talk
GitHub Makeover | DevNation Tech TalkRed Hat Developers
 
Quinoa: A modern Quarkus UI with no hassles | DevNation tech Talk
Quinoa: A modern Quarkus UI with no hassles | DevNation tech TalkQuinoa: A modern Quarkus UI with no hassles | DevNation tech Talk
Quinoa: A modern Quarkus UI with no hassles | DevNation tech TalkRed Hat Developers
 
Extra micrometer practices with Quarkus | DevNation Tech Talk
Extra micrometer practices with Quarkus | DevNation Tech TalkExtra micrometer practices with Quarkus | DevNation Tech Talk
Extra micrometer practices with Quarkus | DevNation Tech TalkRed Hat Developers
 
Event-driven autoscaling through KEDA and Knative Integration | DevNation Tec...
Event-driven autoscaling through KEDA and Knative Integration | DevNation Tec...Event-driven autoscaling through KEDA and Knative Integration | DevNation Tec...
Event-driven autoscaling through KEDA and Knative Integration | DevNation Tec...Red Hat Developers
 
Integrating Loom in Quarkus | DevNation Tech Talk
Integrating Loom in Quarkus | DevNation Tech TalkIntegrating Loom in Quarkus | DevNation Tech Talk
Integrating Loom in Quarkus | DevNation Tech TalkRed Hat Developers
 
Quarkus Renarde 🦊♥: an old-school Web framework with today's touch | DevNatio...
Quarkus Renarde 🦊♥: an old-school Web framework with today's touch | DevNatio...Quarkus Renarde 🦊♥: an old-school Web framework with today's touch | DevNatio...
Quarkus Renarde 🦊♥: an old-school Web framework with today's touch | DevNatio...Red Hat Developers
 
Containers without docker | DevNation Tech Talk
Containers without docker | DevNation Tech TalkContainers without docker | DevNation Tech Talk
Containers without docker | DevNation Tech TalkRed Hat Developers
 
Distributed deployment of microservices across multiple OpenShift clusters | ...
Distributed deployment of microservices across multiple OpenShift clusters | ...Distributed deployment of microservices across multiple OpenShift clusters | ...
Distributed deployment of microservices across multiple OpenShift clusters | ...Red Hat Developers
 
DevNation Workshop: Object detection with Red Hat OpenShift Data Science [Mar...
DevNation Workshop: Object detection with Red Hat OpenShift Data Science [Mar...DevNation Workshop: Object detection with Red Hat OpenShift Data Science [Mar...
DevNation Workshop: Object detection with Red Hat OpenShift Data Science [Mar...Red Hat Developers
 
Dear security, compliance, and auditing: We’re sorry. Love, DevOps | DevNatio...
Dear security, compliance, and auditing: We’re sorry. Love, DevOps | DevNatio...Dear security, compliance, and auditing: We’re sorry. Love, DevOps | DevNatio...
Dear security, compliance, and auditing: We’re sorry. Love, DevOps | DevNatio...Red Hat Developers
 
11 CLI tools every developer should know | DevNation Tech Talk
11 CLI tools every developer should know | DevNation Tech Talk11 CLI tools every developer should know | DevNation Tech Talk
11 CLI tools every developer should know | DevNation Tech TalkRed Hat Developers
 
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkA Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkRed Hat Developers
 
GitHub Actions and OpenShift: ​​Supercharging your software development loops...
GitHub Actions and OpenShift: ​​Supercharging your software development loops...GitHub Actions and OpenShift: ​​Supercharging your software development loops...
GitHub Actions and OpenShift: ​​Supercharging your software development loops...Red Hat Developers
 
To the moon and beyond with Java 17 APIs! | DevNation Tech Talk
To the moon and beyond with Java 17 APIs! | DevNation Tech TalkTo the moon and beyond with Java 17 APIs! | DevNation Tech Talk
To the moon and beyond with Java 17 APIs! | DevNation Tech TalkRed Hat Developers
 
Profile your Java apps in production on Red Hat OpenShift with Cryostat | Dev...
Profile your Java apps in production on Red Hat OpenShift with Cryostat | Dev...Profile your Java apps in production on Red Hat OpenShift with Cryostat | Dev...
Profile your Java apps in production on Red Hat OpenShift with Cryostat | Dev...Red Hat Developers
 
Kafka at the Edge: an IoT scenario with OpenShift Streams for Apache Kafka | ...
Kafka at the Edge: an IoT scenario with OpenShift Streams for Apache Kafka | ...Kafka at the Edge: an IoT scenario with OpenShift Streams for Apache Kafka | ...
Kafka at the Edge: an IoT scenario with OpenShift Streams for Apache Kafka | ...Red Hat Developers
 
Level-up your gaming telemetry using Kafka Streams | DevNation Tech Talk
Level-up your gaming telemetry using Kafka Streams | DevNation Tech TalkLevel-up your gaming telemetry using Kafka Streams | DevNation Tech Talk
Level-up your gaming telemetry using Kafka Streams | DevNation Tech TalkRed Hat Developers
 
Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...
Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...
Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...Red Hat Developers
 

Más de Red Hat Developers (20)

DevNation Tech Talk: Getting GitOps
DevNation Tech Talk: Getting GitOpsDevNation Tech Talk: Getting GitOps
DevNation Tech Talk: Getting GitOps
 
Exploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on KubernetesExploring the power of OpenTelemetry on Kubernetes
Exploring the power of OpenTelemetry on Kubernetes
 
GitHub Makeover | DevNation Tech Talk
GitHub Makeover | DevNation Tech TalkGitHub Makeover | DevNation Tech Talk
GitHub Makeover | DevNation Tech Talk
 
Quinoa: A modern Quarkus UI with no hassles | DevNation tech Talk
Quinoa: A modern Quarkus UI with no hassles | DevNation tech TalkQuinoa: A modern Quarkus UI with no hassles | DevNation tech Talk
Quinoa: A modern Quarkus UI with no hassles | DevNation tech Talk
 
Extra micrometer practices with Quarkus | DevNation Tech Talk
Extra micrometer practices with Quarkus | DevNation Tech TalkExtra micrometer practices with Quarkus | DevNation Tech Talk
Extra micrometer practices with Quarkus | DevNation Tech Talk
 
Event-driven autoscaling through KEDA and Knative Integration | DevNation Tec...
Event-driven autoscaling through KEDA and Knative Integration | DevNation Tec...Event-driven autoscaling through KEDA and Knative Integration | DevNation Tec...
Event-driven autoscaling through KEDA and Knative Integration | DevNation Tec...
 
Integrating Loom in Quarkus | DevNation Tech Talk
Integrating Loom in Quarkus | DevNation Tech TalkIntegrating Loom in Quarkus | DevNation Tech Talk
Integrating Loom in Quarkus | DevNation Tech Talk
 
Quarkus Renarde 🦊♥: an old-school Web framework with today's touch | DevNatio...
Quarkus Renarde 🦊♥: an old-school Web framework with today's touch | DevNatio...Quarkus Renarde 🦊♥: an old-school Web framework with today's touch | DevNatio...
Quarkus Renarde 🦊♥: an old-school Web framework with today's touch | DevNatio...
 
Containers without docker | DevNation Tech Talk
Containers without docker | DevNation Tech TalkContainers without docker | DevNation Tech Talk
Containers without docker | DevNation Tech Talk
 
Distributed deployment of microservices across multiple OpenShift clusters | ...
Distributed deployment of microservices across multiple OpenShift clusters | ...Distributed deployment of microservices across multiple OpenShift clusters | ...
Distributed deployment of microservices across multiple OpenShift clusters | ...
 
DevNation Workshop: Object detection with Red Hat OpenShift Data Science [Mar...
DevNation Workshop: Object detection with Red Hat OpenShift Data Science [Mar...DevNation Workshop: Object detection with Red Hat OpenShift Data Science [Mar...
DevNation Workshop: Object detection with Red Hat OpenShift Data Science [Mar...
 
Dear security, compliance, and auditing: We’re sorry. Love, DevOps | DevNatio...
Dear security, compliance, and auditing: We’re sorry. Love, DevOps | DevNatio...Dear security, compliance, and auditing: We’re sorry. Love, DevOps | DevNatio...
Dear security, compliance, and auditing: We’re sorry. Love, DevOps | DevNatio...
 
11 CLI tools every developer should know | DevNation Tech Talk
11 CLI tools every developer should know | DevNation Tech Talk11 CLI tools every developer should know | DevNation Tech Talk
11 CLI tools every developer should know | DevNation Tech Talk
 
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkA Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
 
GitHub Actions and OpenShift: ​​Supercharging your software development loops...
GitHub Actions and OpenShift: ​​Supercharging your software development loops...GitHub Actions and OpenShift: ​​Supercharging your software development loops...
GitHub Actions and OpenShift: ​​Supercharging your software development loops...
 
To the moon and beyond with Java 17 APIs! | DevNation Tech Talk
To the moon and beyond with Java 17 APIs! | DevNation Tech TalkTo the moon and beyond with Java 17 APIs! | DevNation Tech Talk
To the moon and beyond with Java 17 APIs! | DevNation Tech Talk
 
Profile your Java apps in production on Red Hat OpenShift with Cryostat | Dev...
Profile your Java apps in production on Red Hat OpenShift with Cryostat | Dev...Profile your Java apps in production on Red Hat OpenShift with Cryostat | Dev...
Profile your Java apps in production on Red Hat OpenShift with Cryostat | Dev...
 
Kafka at the Edge: an IoT scenario with OpenShift Streams for Apache Kafka | ...
Kafka at the Edge: an IoT scenario with OpenShift Streams for Apache Kafka | ...Kafka at the Edge: an IoT scenario with OpenShift Streams for Apache Kafka | ...
Kafka at the Edge: an IoT scenario with OpenShift Streams for Apache Kafka | ...
 
Level-up your gaming telemetry using Kafka Streams | DevNation Tech Talk
Level-up your gaming telemetry using Kafka Streams | DevNation Tech TalkLevel-up your gaming telemetry using Kafka Streams | DevNation Tech Talk
Level-up your gaming telemetry using Kafka Streams | DevNation Tech Talk
 
Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...
Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...
Friends don't let friends do dual writes: Outbox pattern with OpenShift Strea...
 

Último

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Último (20)

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

GPU enablement for data science on OpenShift | DevNation Tech Talk

  • 1. bit.ly/kubemaster1 1 GPU Enablement for Data Science on OpenShift Pete MacKinnon Red Hat AI Center of Excellence
  • 2. @pdmackinnon ● pmackinn@redhat.com ● Principal Engineer in the Red Hat AI Center of Excellence ● Kubeflow committer since project formation ● Open Data Hub and NVIDIA GPU Operator contributor ● KubeCon, TensorFlow World, GTC, ODSC, OpenShift Commons, and SCaLE 17x presenter ● Technical Editor for upcoming Kubeflow publication ● Co-author of “Linux Unleashed” ● Thirty years of distributed computing consulting and engineering experience
  • 3. • Data science: data and models • AI/ML lifecycle: training to inference • Scalars, vectors, and tensors • CPU and GPU • Notebooks and frameworks • The OpenShift GPU operator “family” • The components of GPU enablement • Installation and demo Agenda
  • 6. The AI/ML lifecycle Inference/Serving Training Data collection Feature extraction Labeling Monitoring Logging Analysis Transformation Validation Splitting Model validation Hyperparameter tuning Algorithm selection or development Model Data and Model in Production Data
  • 7. Scalars, vectors, and tensors Scalar - a real number having magnitude that measures something: volume, density, speed, energy, mass, time, etc. Vector - a one-dimensional array of scalars: force, velocity, momentum, etc. Tensor - a higher-order algebraic object that could be a scalar, a vector, a multidimensional array, a multilinear map, etc. Modern CPU have advanced instruction sets for vector algebra but modern GPU are built specifically to perform complex tensor operations with a high degree of parallelism
  • 8. Scalars, vectors, and tensors How many matrix multiplications can be done in one clock cycle? Image: https://iq.opengenus.org/ 10¹ 10⁴ 10⁵
  • 9. So, in one clock cycle... CPU (scalar) CPU/GPU (vector) GPU (tensor)
  • 10. Or, DL with real world data... Object (scalar) Movement (vector) Classification, velocity, bearing, and much more (tensor)
  • 11. CPU and GPU NVIDIA Ampere A100 • 6912 FP32 CUDA Cores • 432 Gen3 Tensor Cores but • FP32 -> 19.5 TFLOPS AMD EPYC 7702 (Rome) • 64 CPU Cores • 128 Threads • 2.0GHz Base Clock • FP32 -> 1-2 TFLOPS
  • 13. Profit 380x speedup over CPU in basic CNN smoke test (Intel Xeon E5-2686 vs. NVIDIA V100-SXM2-16Gi)
  • 14. Special Resource Operator (SRO) ● Community operator ● Reference implementation for other specialized hardware ○ NIC, FPGA ● Provided the code basis for the NVIDIA GPU Operator ● Deployed from OperatorHub GPU operators NVIDIA GPU Operator ● Certified and supported on OpenShift by NVIDIA and Red Hat ● Can be deployed from embedded OperatorHub or with Helm Both operators require node feature discovery (NFD) NVIDIA also provides the GPU feature operator for enhanced labeling
  • 15. Operator components • Container-runtime-toolkit: The NVIDIA GPU Operator supports docker and cri-o container runtimes. This daemonset ensures the correct runtime setup for the GPU hook. • Driver: A container deployed as a daemonset that holds all userspace and kernelspace software to make the GPU device work. • Device plugin: A daemonset that monitors the health and availability of the GPU on the node. Vital for pod scheduling. • DCGM: Data Center GPU Monitoring - a node exporter that captures GPU metrics for use by Prometheus. nodeSelector: feature.node.kubernetes.io/pci-10de.present: "true"
  • 17. Demo