Kubernetes for Machine Learning
by Akash Agrawal
Agenda
● Machine Learning Overview
● Machine Learning in Production
● Machine Learning on Google Cloud Platform (GCP)
● Kubernetes Overview
● Google Kubernetes Engine (GKE) Overview
● Kubeflow
● Design & Best Practices
About Me
● Google Developer Expert (Google Cloud Platform category)
● 11 years of experience in IT Industry
● Worked with various clients like Sabre/Citi Bank/Goldman Sachs/L&T
Infotech etc.
● Currently working as an Independent Consultant (Technical
Adviser/Architect role) & Tech Evangelist
What this Talk is (about/not about)
● About:
○ ML System Understanding
○ ML & Kubernetes Integration / Design
● Not About:
○ ML Code Syntax/Structure
○ ML Algorithms
Machine Learning Overview
● Teaching Computers to recognize patterns in the same way as our brains
do
● Model Building ---> Model Training ---> Model Serving
Machine Learning Overview
● Machine Learning Lifecycle:
○ Build Machine Learning Model:
■ Write Machine Learning Code in any supported framework, e.g. TensorFlow,
scikit-learn, XGBoost, PyTorch
○ Input Data:
■ You divide Input data into Training & Testing Data
■ At Inference/Serving time you pass Inference Input Data
■ Data may or may not have labels
○ Train the Model with Input Data:
■ Training generates a Model (some kind of Graph, e.g. a TensorFlow Graph/DAG)
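The build, split, and train steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data set, the one-feature "threshold" model, and every function name here are hypothetical stand-ins, not something the talk prescribes. A real project would use one of the frameworks listed above.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows deterministically and split them into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def train_threshold_model(train_rows):
    """'Train' a one-feature classifier: the midpoint between the class means."""
    zeros = [x for x, label in train_rows if label == 0]
    ones = [x for x, label in train_rows if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def predict(threshold, x):
    """Classify a single value against the learned threshold."""
    return 1 if x > threshold else 0

def accuracy(threshold, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(threshold, x) == label for x, label in rows) / len(rows)

# Hypothetical labelled data: (feature, label) pairs, two well-separated classes.
data = [(float(i), 0) for i in range(10)] + [(float(i), 1) for i in range(20, 30)]

train, test = train_test_split(data)
model = train_threshold_model(train)
print(f"threshold={model:.2f} test accuracy={accuracy(model, test):.2f}")
```

The held-out test rows are never seen during training, which is the point of dividing the input data before the training step.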
Machine Learning Overview
● Machine Learning Lifecycle:
○ Serve/Inference:
■ You can take the model & serve it as a REST API endpoint
○ Predictions:
■ You use these REST API endpoints for Online/Batch Prediction (Confidence Value)
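Serving a model behind a REST endpoint can be sketched with the Python standard library alone. Everything specific here is an assumption made for illustration: the `/predict` path, the `{"instances": [...]}` request shape, port 8080, and the fixed `THRESHOLD` standing in for a trained model. Production serving would normally use a dedicated model server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

THRESHOLD = 14.5  # hypothetical "trained model": a single decision threshold

def handle_predict(body):
    """Turn a JSON request body into (status_code, JSON response bytes)."""
    try:
        instances = json.loads(body)["instances"]
        predictions = [1 if float(x) > THRESHOLD else 0 for x in instances]
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with an 'instances' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"predictions": predictions}).encode()

class PredictHandler(BaseHTTPRequestHandler):
    """Minimal HTTP handler exposing the model at POST /predict."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

def serve(port=8080):
    """Block forever, answering prediction requests on the given port."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the endpoint. Keeping `handle_predict` separate from the HTTP handler means the same function can be reused for batch prediction over a file of requests.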
Machine Learning Overview
● Extra Steps:
○ Pre Processing
○ Post Processing
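A rough sketch of where these two steps sit around the model call. The input format (a comma-separated record), the scaling rule, the stand-in `model` function, and the label names are all hypothetical and chosen only to make the flow visible.

```python
import math

def preprocess(raw):
    """Parse a comma-separated record and scale each value to the range [0, 1]."""
    values = [float(v) for v in raw.split(",")]
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def model(features):
    """Stand-in for a trained model: returns an unbounded raw score."""
    return sum(features) - len(features) / 2

def postprocess(score):
    """Map the raw score to a label plus a confidence value via a sigmoid."""
    probability = 1 / (1 + math.exp(-score))
    label = "positive" if probability >= 0.5 else "negative"
    confidence = probability if label == "positive" else 1 - probability
    return {"label": label, "confidence": round(confidence, 3)}

result = postprocess(model(preprocess("0,5,10")))
print(result)
```

The same pre and post processing code has to run at training time and at serving time, which is one reason it is worth packaging it alongside the model.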
At what Stage are you with ML today?
● Experimenting / Learning
● Building Proof of Concepts (POCs) / Prototyping
● Designing (Deployment/Workflows/Scaling/Management) for Production
Machine Learning In Production
● A few extra things to take care of:
○ Collaborative Environment with folks in different roles e.g. Data Scientists / Platform
Engineers / DevOps / Researchers
○ Production ML Applications are designed to run 24/7/365
○ Input Data (Training/Testing & Inference) flows in continuously - Streaming/Batch
○ You can use different kinds of frameworks for building ML models e.g. TensorFlow,
scikit-learn, XGBoost, PyTorch etc.
○ These models are constantly updated, improved upon & deployed
○ Repetitive ML Tasks like Feature Engineering, Hyperparameter Tuning, Data Cleansing &
Validations
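Hyperparameter tuning is a good example of why these tasks are repetitive: the same train-and-score loop runs once per parameter combination. A minimal grid-search sketch follows. The `validation_score` function is a hypothetical stand-in for "train a model and score it on validation data", and the parameter names and values are made up for illustration.

```python
from itertools import product

def validation_score(learning_rate, depth):
    """Hypothetical stand-in for training a model and scoring it on validation data."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 4) ** 2

def grid_search(grid, score_fn):
    """Try every combination in the grid and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}
best, score = grid_search(grid, validation_score)
print(best, score)
```

Each combination is independent of the others, so in a cluster setting every trial could run as its own container.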
Machine Learning In Production
● A few extra things to take care of:
○ Config Separation across different environments
○ RBAC (Role Based Access Control)
○ Different Deployment/Hosting Options : Cloud (e.g. GCP) or Private Data Centers/Cloud
(e.g. VMware Based)
○ Different Hardware/Accelerators for Compute-intensive workloads e.g. GPUs/TPUs
○ Scaling Requirements:
■ Distributed Processing (Training or Serving)
■ Distributed Processing (e.g. One Model is running on multiple GPUs/TPUs or one
GPU is used to run multiple Models)
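Config separation can be sketched as reading every environment-specific value from environment variables, so the same container image runs unchanged in dev, staging, and production. The variable names (`MODEL_URI`, `REPLICAS`, `USE_GPU`) and their defaults are assumptions for illustration. On Kubernetes such variables are commonly populated from a ConfigMap.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class ServingConfig:
    """Settings that differ per environment; field names are hypothetical."""
    model_uri: str
    replicas: int
    use_gpu: bool

def load_config(env=os.environ):
    """Build the config from a mapping of environment variables."""
    return ServingConfig(
        model_uri=env.get("MODEL_URI", "file:///tmp/model"),
        replicas=int(env.get("REPLICAS", "1")),
        use_gpu=env.get("USE_GPU", "false").lower() == "true",
    )

# The same code, configured two different ways:
dev = load_config({})
prod = load_config({"MODEL_URI": "gs://example-bucket/model", "REPLICAS": "5", "USE_GPU": "true"})
print(dev)
print(prod)
```

Passing the mapping in as a parameter keeps the loader easy to test without touching the real process environment.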
Machine Learning on GCP
● 3 ways:
○ ML as an API (Cloud Vision API, Cloud Video Intelligence API, Cloud Speech API, Cloud
Natural Language API, Cloud Translation API)
○ AutoML
○ Custom Models
■ With Cloud ML Engine
■ With Kubernetes / GKE / Kubeflow etc.
Kubernetes Overview
● Kubernetes is an Open Source system for Container Orchestration
(Deployment/Management/Scaling)
● Features:
○ Scheduling
○ Self Healing / Auto Repairing
○ Scaling (Manual / Auto Scaling / Scaling Out / Scaling In)
○ ...
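The features above map onto fields of a Kubernetes Deployment: `replicas` for scaling, a liveness probe for self healing, and the scheduler placing the resulting Pods. The sketch below builds such a manifest as a Python dictionary and prints it as JSON. The application name, image, port, and `/healthz` path are hypothetical.

```python
import json

def deployment_manifest(name, image, replicas=3, port=8080):
    """Return an apps/v1 Deployment manifest for a model-serving container."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,  # scaling: desired number of Pods
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        # self healing: the container is restarted if this check fails
                        "livenessProbe": {
                            "httpGet": {"path": "/healthz", "port": port},
                            "periodSeconds": 10,
                        },
                    }]
                },
            },
        },
    }

manifest = deployment_manifest("model-server", "example.com/model-server:1.0")
print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f -` accepts JSON on standard input, the printed output can be piped to it directly.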
Google Kubernetes Engine (GKE) Overview
● Managed Service for Kubernetes on Google Cloud (focused on
Deployment/Management/Scaling)
● Provides a Reliable, Efficient & Secure way to run Kubernetes Clusters (on
GCP)
● GKE On-Prem
Google Kubernetes Engine (GKE) Overview
● Features:
○ Fully Managed
○ Auto Scaling / Auto Upgrade / Auto Repair
○ Integration : IAM / Stackdriver / VPC
○ Security, Compliance, Runs on Container-Optimized OS (COS)
○ Accelerators Support : GPUs/TPUs
○ Various Cluster Topologies : Zonal Clusters / Regional Clusters
○ Workload Portability : On-Premises / Cloud
○ ...
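As a sketch of the accelerator support: a Pod asks for GPUs through the `nvidia.com/gpu` resource limit, and on GKE a node selector on the `cloud.google.com/gke-accelerator` label steers it to a node pool with a given GPU type. The Pod name, image, and the `nvidia-tesla-t4` accelerator type below are example values only; check which types your cluster actually offers.

```python
import json

def gpu_training_pod(name, image, gpus=1, accelerator="nvidia-tesla-t4"):
    """Return a Pod manifest that requests GPUs on a GKE GPU node pool."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",  # run the training job once
            "nodeSelector": {"cloud.google.com/gke-accelerator": accelerator},
            "containers": [{
                "name": "trainer",
                "image": image,
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }

pod = gpu_training_pod("train-job", "example.com/trainer:1.0", gpus=2)
print(json.dumps(pod, indent=2))
```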
Kubeflow:
● Focused on Deployment of ML Workflows on Kubernetes (Simple,
Portable & Scalable)
● Goal: support deployment of Best-of-breed Open Source Systems for
ML to diverse Infrastructure
● Anywhere you are running Kubernetes, you can run Kubeflow
Kubeflow:
● Features:
○ Pipelines: for deploying & managing End to End ML Workflows.
○ Integration:
■ Jupyter Notebooks
■ TensorFlow Model Training Controller
■ Seldon Core : for Model Serving
○ Multi-Framework Support: TensorFlow, PyTorch, Apache MXNet
○ Share/Reuse using AI Hub
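The idea behind Pipelines, an end-to-end workflow expressed as steps with dependencies, can be illustrated without any Kubeflow code. The sketch below is not the Kubeflow Pipelines SDK: it uses the standard-library `graphlib` module (Python 3.9+) to order hypothetical steps, and each step is a plain function where a real pipeline would run a container.

```python
from graphlib import TopologicalSorter

def preprocess(ctx):
    """Drop missing values from the raw input."""
    ctx["clean"] = [x for x in ctx["raw"] if x is not None]

def train(ctx):
    """'Train' a trivial model: the mean of the cleaned data."""
    ctx["model"] = sum(ctx["clean"]) / len(ctx["clean"])

def evaluate(ctx):
    """Record the worst-case error of the model on the cleaned data."""
    ctx["error"] = max(abs(x - ctx["model"]) for x in ctx["clean"])

def deploy(ctx):
    """Only 'deploy' when the evaluation passes the quality gate."""
    ctx["deployed"] = ctx["error"] < ctx["max_error"]

STEPS = {"preprocess": preprocess, "train": train, "evaluate": evaluate, "deploy": deploy}

# Each step maps to the set of steps that must finish before it starts.
DEPENDS_ON = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}

def run_pipeline(steps, depends_on, context):
    """Run every step in dependency order and return the order used."""
    order = list(TopologicalSorter(depends_on).static_order())
    for name in order:
        steps[name](context)
    return order

context = {"raw": [1.0, None, 2.0, 3.0], "max_error": 1.5}
order = run_pipeline(STEPS, DEPENDS_ON, context)
print(order, context["deployed"])
```

Declaring the dependencies as data rather than as call order is what lets a pipeline engine schedule independent steps in parallel and rerun only the steps that changed.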
Design & Best Practices:
● Separate out Compute & Storage
● Scaling & Self Healing Capabilities
● Cloud & GKE Topologies
● Docker Best Practices
● Kubernetes Best Practices
● ML Framework Best Practices
Look for:
● AI Hub: https://aihub.cloud.google.com/
● Qwiklabs:
○ Quests:
■ Kubernetes Solutions: https://www.qwiklabs.com/quests/45
○ Labs:
■ Kubeflow Labs
Google Cloud Platform - Resources
● Google Cloud Platform 101 (Cloud Next ’19):
https://www.youtube.com/watch?v=vmOMataJZWw
● Google Cloud Developer Cheat Sheet:
https://raw.githubusercontent.com/gregsramblings/google-cloud-4-words/master/Poster-medres.png
● 100+ announcements from Google Cloud Next ’19:
https://cloud.google.com/blog/topics/inside-google-cloud/100-plus-announcements-from-google-cloud-next19
Google Cloud Platform - Resources
● Google Cloud Next ’19 Sessions:
https://www.youtube.com/playlist?list=PLIivdWyY5sqIXvUGVrFuZibCUdKVzEoUw
● GCP Certification Resources:
https://github.com/ddneves/awesome-gcp-certifications
Akash Agrawal
LinkedIn : akash-agrawal-58a97813
Twitter : @akkiagrawal29
Thanks

Kubernetes for machine learning

  • 2. Agenda ● Machine Learning Overview ● Machine Learning in Production ● Machine Learning on Google Cloud Platform (GCP) ● Kubernetes Overview ● Google Kubernetes Engine (GKE) Overview ● Kubeflow ● Design & Best Practices
  • 3. About Me ● Google Developer Expert (on Google Cloud Platform Category) ● 11 years of experience in IT Industry ● Worked with various clients like Sabre/Citi Bank/Goldman Sachs/L&T Infotech etc. ● Currently I work as Independent Consultant (as Technical Adviser/Architect Role) & Tech Evangelist
  • 4. What this Talk is (about/not about) ● About: ○ ML System Understanding ○ ML & Kubernetes Integration / Design ● Not About: ○ ML Code Syntax/Structure ○ ML Algorithms
  • 5. Machine Learning Overview ● Teaching Computers to recognize patterns in the same way as our brains do ● Model Building ---> Model Training ---> Model Serving
  • 6. Machine Learning Overview ● Machine Learning Lifecycle: ○ Build Machine Learning Model: ■ Write Machine Learning Code in any supporting/framework e.g. TensorFlow, SciKit Learn, XGBoost, PyTorch ○ Input Data: ■ You divide Input data into Training & Testing Data ■ Inference/Serving time you pass Inference Input Data ■ Data may have labels or not ○ Train the Model with Input Data: ■ Training generates Model (some kind of Graph e.g. TensorFlow Graph/DAG)
  • 7. Machine Learning Overview ● Machine Learning Lifecycle: ○ Serve/Inference: ■ You can take the model & serve it as REST api endpoint ○ Predictions: ■ You use these REST api endpoints for Online/Batch Prediction (Confidence Value)
  • 8. Machine Learning Overview ● Extra Steps: ○ Pre Processing ○ Post Processing
  • 9. At what Stage are you with ML today? ● Experimenting / Learning ● Building Proof of Concepts (POCs) / Prototyping ● Designing (Deployment/Workflows/Scaling/Management) for Production
  • 10. Machine Learning In Production ● A few extra things to take care of: ○ Collaborative Environment with folks in different roles e.g. Data Scientists / Platform Engineers / DevOps / Researchers ○ Production ML Applications are designed to run 24/7/365 ○ Input Data (Training/Testing & Inference) is flowing continuously - Streaming/Batch ○ You can use different kinds of frameworks for building ML models e.g. TensorFlow, scikit-learn, XGBoost, PyTorch etc. ○ These models are constantly updated, improved upon & deployed ○ Repetitive ML Tasks like Feature Engineering, Hyperparameter Tuning, Data Cleansing & Validations
  • 11. Machine Learning In Production ● A few extra things to take care of: ○ Config Separation across different environments ○ RBAC (Role Based Access Control) ○ Different Deployment/Hosting Options : Cloud (e.g. GCP) or Private Data Centers/Cloud (e.g. VMware Based) ○ Different Hardware/Accelerators for Compute intensive workloads e.g. GPUs/TPUs ○ Scaling Requirements: ■ Distributed Processing (Training or Serving) ■ Distributed Processing (e.g. One Model is running on multiple GPUs/TPUs or one GPU is used to run multiple Models)
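
One common way to get the config separation mentioned on slide 11 is to keep per-environment defaults in code and let environment variables override them, since Kubernetes can inject those variables per cluster or namespace. The sketch below assumes hypothetical variable names (`APP_ENV`, `MODEL_URI`, `REPLICAS`, `USE_GPU`) and made-up default values; none of them come from the talk.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    environment: str
    model_uri: str
    replicas: int
    use_gpu: bool


# Hypothetical per-environment defaults; the bucket name is a placeholder.
DEFAULTS = {
    "dev": {"model_uri": "file:///tmp/model", "replicas": 1, "use_gpu": False},
    "prod": {"model_uri": "gs://example-bucket/model", "replicas": 3, "use_gpu": True},
}


def load_settings(env=None):
    """Pick defaults by APP_ENV, then apply any explicit overrides."""
    env = dict(os.environ if env is None else env)
    name = env.get("APP_ENV", "dev")
    if name not in DEFAULTS:
        raise ValueError(f"unknown environment: {name!r}")
    base = DEFAULTS[name]
    return Settings(
        environment=name,
        model_uri=env.get("MODEL_URI", base["model_uri"]),
        replicas=int(env.get("REPLICAS", base["replicas"])),
        use_gpu=str(env.get("USE_GPU", base["use_gpu"])).lower() in ("1", "true", "yes"),
    )
```

The same container image can then run unchanged in every environment, with only the injected variables differing.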
  • 12. Machine Learning on GCP ● 3 ways: ○ ML as an API (Cloud Vision API, Cloud Video Intelligence API, Cloud Speech API, Cloud Natural Language API, Cloud Translation API) ○ AutoML ○ Custom Models ■ With Cloud ML Engine ■ With Kubernetes / GKE / Kubeflow etc.
  • 13. Kubernetes Overview ● Kubernetes is an Open Source system for Container Orchestration (Deployment/Management/Scaling) ● Features: ○ Scheduling ○ Self Healing / Auto Repairing ○ Scaling (Manual / Auto Scaling / Scaling Out / Scaling In) ○ ...
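
To make the scheduling, self-healing, and scaling features on slide 13 concrete, here is a minimal sketch that builds two Kubernetes objects as Python dictionaries and prints them as JSON, which `kubectl` accepts alongside YAML. The Deployment keeps a fixed number of replicas alive and restarts containers that fail a liveness probe; the HorizontalPodAutoscaler scales the replica count on CPU utilization. The app name, image, port, probe path, and thresholds are hypothetical placeholders.

```python
import json

APP = "model-server"                      # hypothetical application name
IMAGE = "example.com/model-server:1.0"    # hypothetical container image

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": APP},
    "spec": {
        "replicas": 2,  # manual scaling: desired number of Pods
        "selector": {"matchLabels": {"app": APP}},
        "template": {
            "metadata": {"labels": {"app": APP}},
            "spec": {
                "containers": [{
                    "name": APP,
                    "image": IMAGE,
                    "ports": [{"containerPort": 8080}],
                    # CPU request is needed for utilization-based autoscaling.
                    "resources": {"requests": {"cpu": "500m"}},
                    # Self-healing: restart the container if this check fails.
                    "livenessProbe": {
                        "httpGet": {"path": "/healthz", "port": 8080},
                        "initialDelaySeconds": 10,
                        "periodSeconds": 15,
                    },
                }],
            },
        },
    },
}

autoscaler = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": APP},
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": APP},
        "minReplicas": 2,
        "maxReplicas": 10,
        "metrics": [{
            "type": "Resource",
            "resource": {
                "name": "cpu",
                "target": {"type": "Utilization", "averageUtilization": 70},
            },
        }],
    },
}


def render(*objects):
    """Wrap objects in a v1 List so a single JSON document can be applied."""
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": list(objects)}, indent=2)


print(render(deployment, autoscaler))
```

Assuming the script is saved as `manifests.py`, its output can be piped to the cluster with `python manifests.py | kubectl apply -f -`.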
  • 14. Google Kubernetes Engine (GKE) Overview ● Managed Service for Kubernetes on Google Cloud (focused on Deployment/Management/Scaling) ● Provides a Reliable, Efficient & Secure way to run Kubernetes Clusters (on GCP) ● GKE On-Prem
  • 15. Google Kubernetes Engine (GKE) Overview ● Features: ○ Fully Managed ○ Auto Scaling / Auto Upgrade / Auto Repair ○ Integration : IAM / Stackdriver / VPC ○ Security, Compliance, Runs on Container-Optimized OS (COS) ○ Accelerators Support : GPUs/TPUs ○ Various Cluster Topologies : Zonal Clusters / Regional Clusters ○ Workload Portability : On-Premises / Cloud ○ ...
  • 16. Kubeflow: ● Focused on Deployment of ML Workflows on Kubernetes (Simple, Portable & Scalable) ● Goal: support deployment of Best-of-breed Open Source Systems for ML on diverse Infrastructure ● Anywhere you are running Kubernetes, you can run Kubeflow
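
As an illustration of the accelerator support listed on slide 15, the sketch below builds a Kubernetes Job that requests one NVIDIA GPU and asks to be scheduled onto a GKE node pool with a given accelerator type. The job name, image, and default accelerator type are assumptions for this example; the cluster needs a GPU node pool with drivers installed, and TPUs are requested differently and are not covered here.

```python
import json


def gpu_training_job(name, image, gpus=1, accelerator="nvidia-tesla-t4"):
    """Return a batch/v1 Job manifest that requests NVIDIA GPUs on GKE."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 2,  # retry a failed training run at most twice
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    # GKE labels GPU nodes with their accelerator type.
                    "nodeSelector": {"cloud.google.com/gke-accelerator": accelerator},
                    "containers": [{
                        "name": "trainer",
                        "image": image,
                        # GPUs are requested through resource limits.
                        "resources": {"limits": {"nvidia.com/gpu": gpus}},
                    }],
                },
            },
        },
    }


# Hypothetical job name and image, printed as JSON for kubectl.
print(json.dumps(gpu_training_job("train-demo", "example.com/trainer:1.0"), indent=2))
```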
  • 17. Kubeflow: ● Features: ○ Pipelines: for deploying & managing End to End ML Workflows. ○ Integration: ■ Jupyter Notebooks ■ TensorFlow Model Training Controller ■ Seldon Core : for Model Serving ○ Multi-Framework Support: TensorFlow, PyTorch, Apache MXNet ○ Share/Reuse using AI Hub
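
The TensorFlow training controller mentioned on slide 17 is driven by a custom resource. The sketch below builds a minimal `TFJob` manifest with a configurable number of workers. It assumes the `kubeflow.org/v1` schema, which may differ from the Kubeflow version installed on a given cluster, so check it against that version before use; the job name and image are hypothetical.

```python
import json


def tf_job(name, image, workers=2):
    """Return a minimal TFJob manifest (assumes the kubeflow.org/v1 schema)."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,  # distributed training across Pods
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            "containers": [{
                                # The training operator looks for a container
                                # named "tensorflow" in each replica.
                                "name": "tensorflow",
                                "image": image,
                            }],
                        },
                    },
                },
            },
        },
    }


# Hypothetical job name and image, printed as JSON for kubectl.
print(json.dumps(tf_job("mnist-demo", "example.com/tf-trainer:1.0"), indent=2))
```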
  • 18. Design & Best Practices: ● Separate out Compute & Storage ● Scaling & Self Healing Capabilities ● Cloud & GKE Topologies ● Docker Best Practices ● Kubernetes Best Practices ● ML Framework Best Practices
  • 19. Look for: ● AI Hub: https://aihub.cloud.google.com/ ● Qwiklabs: ○ Quests: ■ Kubernetes Solutions: https://www.qwiklabs.com/quests/45 ○ Labs: ■ Kubeflow Labs
  • 20. Google Cloud Platform - Resources ● Google Cloud Platform 101 (Cloud Next ‘19): https://www.youtube.com/watch?v=vmOMataJZWw ● Google Cloud Developer Cheat Sheet: https://raw.githubusercontent.com/gregsramblings/google-cloud-4-words/master/Poster-medres.png ● 100+ announcements from Google Cloud Next ‘19: https://cloud.google.com/blog/topics/inside-google-cloud/100-plus-announcements-from-google-cloud-next19
  • 21. Google Cloud Platform - Resources ● Google Cloud Next ‘19 Sessions: https://www.youtube.com/playlist?list=PLIivdWyY5sqIXvUGVrFuZibCUdKVzEoUw ● GCP Certification Resources: https://github.com/ddneves/awesome-gcp-certifications
  • 22. Akash Agrawal LinkedIn : akash-agrawal-58a97813 Twitter : @akkiagrawal29 Thanks