SlideShare una empresa de Scribd logo
1 de 22
Descargar para leer sin conexión
Alluxio on Kubernetes
Powering training through
Container Storage Interface (CSI) plugin
Alluxio Fuse
Alluxio on Kubernetes
Alluxio Master pod
Master
Container
Job
Master
Container
Alluxio Worker pod
Worker
Container
Job
Worker
Container
Alluxio Fuse pod
Fuse
Container
Alluxio Worker pod
Worker
Container
Job
Worker
Container
Alluxio Fuse pod
Fuse
Container
DaemonSet: Worker & Fuse
Alluxio Worker pod
Worker
Container
Job
Worker
Container
Alluxio Fuse pod
Fuse
Container
Alluxio Worker pod
Worker
Container
Job
Worker
Container
Alluxio Fuse pod
Fuse
Container
Alluxio Worker pod
Worker
Container
Job
Worker
Container
Alluxio Fuse pod
Fuse
Container
Host Machine 1 Host Machine 2 Host Machine 3
Application
Alluxio Worker pod
Worker
Container
Job
Worker
Container
Alluxio Fuse pod
Fuse
Container
Host Machine
Volume
mount
Application
Alluxio Worker pod
Worker
Container
Job
Worker
Container
Alluxio Fuse pod
Fuse
Container
Host Machine
Application pod
Application
Container
Volume
mount
mount
req data
Advantages
1. Better performance because of
data-locality
2. Easy to deploy. `helm install` to start
the whole Alluxio cluster
Alluxio Worker pod
Worker
Contain
er
Job
Worker
Contain
er
Alluxio Fuse pod
Fuse
Container
Host Machine
Application pod
Application
Container
Volume
mount
mount
req data
Challenges
Alluxio Worker pod
Worker
Contain
er
Job
Worker
Contain
er
Alluxio Fuse pod
Fuse
Container
Host Machine
Application pod
Application
Container
Volume
mount
mount
req data
1. Fuse pod is not always needed but
always taking resources.
Ex. data preprocessing
2. In some workloads Fuse pod is not
needed on some machines in the
cluster. Lots of manual work.
Before CSI on Kubernetes
Pod
Volume
AWS EBS
AzureDisk
AzureFile
CephFS
…
CSI on Kubernetes
Pod
Volume CSI
AWS EBS
AzureDisk
AzureFile
CephFS
…
More than 100
existing CSI drivers
Alluxio CSI Driver v1.0.0
Host Machine
Alluxio CSI Driver
Components
Alluxio CSI Driver v1.0.0
Host Machine
Alluxio CSI Driver
Components
Persistent
volume +
claim
Alluxio CSI Driver v1.0.0
Host Machine
Alluxio CSI Driver
Components
Application pod
Application
Container
Persistent
volume +
claim
mount
mount
+
Alluxio Fuse
Advantages of Alluxio CSI Driver v1.0.0
1. Fuse process lifecycle is the same as
the application pod. Perfectly
solving our previous challenges
2. Easy to deploy. With CSI deployed
with `helm install`, Fuse process is
automated.
Host
Machine
CSI
Components
Application pod
Application
Container
Persistent
volume +
claim
mount
mount
+
Alluxio
Fuse
Challenges of Alluxio CSI Driver v1.0.0
1. Fuse processes reside in one of the
CSI components - nodeserver
What if nodeserver is down?
What if nodeserver needs to be upgraded?
Host
Machine
CSI
Components
Application pod
Application
Container
Persistent
volume +
claim
mount
mount
+
Alluxio
Fuse
Challenges of Alluxio CSI Driver v1.0.0
1. Fuse processes reside in one of the
CSI components - nodeserver
What if nodeserver is down?
What if nodeserver needs to be upgraded?
Host
Machine
Application pod
Application
Container
Persistent
volume +
claim
mount
Alluxio CSI Driver v1.1.0
Host Machine
Persistent
volume +
claim
CSI
Components
Alluxio CSI Driver v1.1.0
Alluxio Fuse pod
Fuse
Container
Host Machine
Application pod
Application
Container
Persistent
volume +
claim
mount
mount
CSI
Components
create
Alluxio CSI Driver v1.1.0
Alluxio Fuse pod
Fuse
Container
Host Machine
Application pod
Application
Container
Persistent
volume +
claim
mount
mount
What if nodeserver is down?
What if nodeserver needs to be
upgraded?
Problem solved!
1. Fuse pod resource allocation
2. In some cases, different applications always use different Fuse
processes
Next Steps
Acknowledgement
Community users
● Kevin Cai@AntFin
● Hui Fei, Haoning Sun@Shopee
● Binyang Li@Microsoft
Thanks for watching
Questions?

Más contenido relacionado

Similar a Alluxio on Kubernetes - Powering training through Container Storage Interface plugin

Masterless Puppet Using AWS S3 Buckets and IAM Roles
Masterless Puppet Using AWS S3 Buckets and IAM RolesMasterless Puppet Using AWS S3 Buckets and IAM Roles
Masterless Puppet Using AWS S3 Buckets and IAM RolesMalcolm Duncanson, CISSP
 
Run your Appium tests using Docker Android - AppiumConf 2019
Run your Appium tests using Docker Android - AppiumConf 2019Run your Appium tests using Docker Android - AppiumConf 2019
Run your Appium tests using Docker Android - AppiumConf 2019Sargis Sargsyan
 
Breaking the Monolith Using AWS Container Services
Breaking the Monolith Using AWS Container ServicesBreaking the Monolith Using AWS Container Services
Breaking the Monolith Using AWS Container ServicesAmazon Web Services
 
VMware Tanzu Introduction
VMware Tanzu IntroductionVMware Tanzu Introduction
VMware Tanzu IntroductionVMware Tanzu
 
vSphere with Tanzu Tech Overview 7.0 U1 (1).pptx
vSphere with Tanzu Tech Overview 7.0 U1 (1).pptxvSphere with Tanzu Tech Overview 7.0 U1 (1).pptx
vSphere with Tanzu Tech Overview 7.0 U1 (1).pptxhokismen
 
Virtualize Your Disaster! Introduction & Update
Virtualize Your Disaster! Introduction & UpdateVirtualize Your Disaster! Introduction & Update
Virtualize Your Disaster! Introduction & UpdateEmirates Computers
 
GitOps on Kubernetes with Carvel
GitOps on Kubernetes with CarvelGitOps on Kubernetes with Carvel
GitOps on Kubernetes with CarvelAlexandre Roman
 
TLS303 How to Deploy Python Applications on AWS Elastic Beanstalk - AWS re:In...
TLS303 How to Deploy Python Applications on AWS Elastic Beanstalk - AWS re:In...TLS303 How to Deploy Python Applications on AWS Elastic Beanstalk - AWS re:In...
TLS303 How to Deploy Python Applications on AWS Elastic Beanstalk - AWS re:In...Amazon Web Services
 
Ibm smart cloud entry+ for system x administrator guide
Ibm smart cloud entry+ for system x administrator guideIbm smart cloud entry+ for system x administrator guide
Ibm smart cloud entry+ for system x administrator guideIBM India Smarter Computing
 
Code Factory avec GitLab CI et Rancher
Code Factory avec GitLab CI et RancherCode Factory avec GitLab CI et Rancher
Code Factory avec GitLab CI et RancherSUSE
 
Using Security To Build With Confidence in AWS – Justin Foster, Director of P...
Using Security To Build With Confidence in AWS – Justin Foster, Director of P...Using Security To Build With Confidence in AWS – Justin Foster, Director of P...
Using Security To Build With Confidence in AWS – Justin Foster, Director of P...Amazon Web Services
 
AWS CodeDeploy - basic intro
AWS CodeDeploy - basic introAWS CodeDeploy - basic intro
AWS CodeDeploy - basic introAnton Babenko
 
Introduction into Cloud Foundry and Bosh | anynines
Introduction into Cloud Foundry and Bosh | anyninesIntroduction into Cloud Foundry and Bosh | anynines
Introduction into Cloud Foundry and Bosh | anyninesanynines GmbH
 
AKS backup with Velero and Workload Identities
AKS backup with Velero and Workload IdentitiesAKS backup with Velero and Workload Identities
AKS backup with Velero and Workload IdentitiesKumton Suttiraksiri
 
Dev ops & laas fundamental
Dev ops & laas fundamentalDev ops & laas fundamental
Dev ops & laas fundamentalKanin Kearpimy
 

Similar a Alluxio on Kubernetes - Powering training through Container Storage Interface plugin (20)

Masterless Puppet Using AWS S3 Buckets and IAM Roles
Masterless Puppet Using AWS S3 Buckets and IAM RolesMasterless Puppet Using AWS S3 Buckets and IAM Roles
Masterless Puppet Using AWS S3 Buckets and IAM Roles
 
Run your Appium tests using Docker Android - AppiumConf 2019
Run your Appium tests using Docker Android - AppiumConf 2019Run your Appium tests using Docker Android - AppiumConf 2019
Run your Appium tests using Docker Android - AppiumConf 2019
 
70 533 study material
70 533 study material70 533 study material
70 533 study material
 
Breaking the Monolith Using AWS Container Services
Breaking the Monolith Using AWS Container ServicesBreaking the Monolith Using AWS Container Services
Breaking the Monolith Using AWS Container Services
 
VMware Tanzu Introduction
VMware Tanzu IntroductionVMware Tanzu Introduction
VMware Tanzu Introduction
 
vSphere with Tanzu Tech Overview 7.0 U1 (1).pptx
vSphere with Tanzu Tech Overview 7.0 U1 (1).pptxvSphere with Tanzu Tech Overview 7.0 U1 (1).pptx
vSphere with Tanzu Tech Overview 7.0 U1 (1).pptx
 
Agility Requires Safety
Agility Requires SafetyAgility Requires Safety
Agility Requires Safety
 
Virtualize Your Disaster! Introduction & Update
Virtualize Your Disaster! Introduction & UpdateVirtualize Your Disaster! Introduction & Update
Virtualize Your Disaster! Introduction & Update
 
GitOps on Kubernetes with Carvel
GitOps on Kubernetes with CarvelGitOps on Kubernetes with Carvel
GitOps on Kubernetes with Carvel
 
TLS303 How to Deploy Python Applications on AWS Elastic Beanstalk - AWS re:In...
TLS303 How to Deploy Python Applications on AWS Elastic Beanstalk - AWS re:In...TLS303 How to Deploy Python Applications on AWS Elastic Beanstalk - AWS re:In...
TLS303 How to Deploy Python Applications on AWS Elastic Beanstalk - AWS re:In...
 
Ibm smart cloud entry+ for system x administrator guide
Ibm smart cloud entry+ for system x administrator guideIbm smart cloud entry+ for system x administrator guide
Ibm smart cloud entry+ for system x administrator guide
 
Code Factory avec GitLab CI et Rancher
Code Factory avec GitLab CI et RancherCode Factory avec GitLab CI et Rancher
Code Factory avec GitLab CI et Rancher
 
Django Deployment
Django DeploymentDjango Deployment
Django Deployment
 
Using Security To Build With Confidence in AWS – Justin Foster, Director of P...
Using Security To Build With Confidence in AWS – Justin Foster, Director of P...Using Security To Build With Confidence in AWS – Justin Foster, Director of P...
Using Security To Build With Confidence in AWS – Justin Foster, Director of P...
 
AWS CodeDeploy - basic intro
AWS CodeDeploy - basic introAWS CodeDeploy - basic intro
AWS CodeDeploy - basic intro
 
VBR v8 Overview-handout
VBR v8 Overview-handoutVBR v8 Overview-handout
VBR v8 Overview-handout
 
Introduction into Cloud Foundry and Bosh | anynines
Introduction into Cloud Foundry and Bosh | anyninesIntroduction into Cloud Foundry and Bosh | anynines
Introduction into Cloud Foundry and Bosh | anynines
 
Oracle autovue
Oracle autovueOracle autovue
Oracle autovue
 
AKS backup with Velero and Workload Identities
AKS backup with Velero and Workload IdentitiesAKS backup with Velero and Workload Identities
AKS backup with Velero and Workload Identities
 
Dev ops & laas fundamental
Dev ops & laas fundamentalDev ops & laas fundamental
Dev ops & laas fundamental
 

Más de Alluxio, Inc.

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Optimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioOptimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioAlluxio, Inc.
 
Speed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingSpeed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingAlluxio, Inc.
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLAlluxio, Inc.
 
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio, Inc.
 
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...Alluxio, Inc.
 
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionData Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionAlluxio, Inc.
 
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeData Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeAlluxio, Inc.
 
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudData Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudAlluxio, Inc.
 
Data Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderData Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderAlluxio, Inc.
 
Data Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionData Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionAlluxio, Inc.
 
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio, Inc.
 
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...Alluxio, Inc.
 
AI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAlluxio, Inc.
 
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...Alluxio, Inc.
 
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...Alluxio, Inc.
 
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAlluxio, Inc.
 
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAlluxio, Inc.
 
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio, Inc.
 

Más de Alluxio, Inc. (20)

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Optimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioOptimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with Alluxio
 
Speed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingSpeed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio Caching
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
 
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
 
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
 
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionData Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
 
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeData Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
 
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudData Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
 
Data Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderData Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet Reader
 
Data Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionData Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage Evolution
 
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
 
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
 
AI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI Era
 
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
 
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
 
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
 
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
 
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
 

Último

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfkalichargn70th171
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionOnePlan Solutions
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is insideshinachiaurasa2
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Pharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyPharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyAnusha Are
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfayushiqss
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrainmasabamasaba
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfonteinmasabamasaba
 

Último (20)

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Pharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyPharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodology
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide Deck
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 

Alluxio on Kubernetes - Powering training through Container Storage Interface plugin

  • 1. Alluxio on Kubernetes Powering training through Container Storage Interface (CSI) plugin
  • 3. Alluxio on Kubernetes Alluxio Master pod Master Container Job Master Container Alluxio Worker pod Worker Container Job Worker Container Alluxio Fuse pod Fuse Container Alluxio Worker pod Worker Container Job Worker Container Alluxio Fuse pod Fuse Container
  • 4. DaemonSet: Worker & Fuse Alluxio Worker pod Worker Container Job Worker Container Alluxio Fuse pod Fuse Container Alluxio Worker pod Worker Container Job Worker Container Alluxio Fuse pod Fuse Container Alluxio Worker pod Worker Container Job Worker Container Alluxio Fuse pod Fuse Container Host Machine 1 Host Machine 2 Host Machine 3
  • 5. Application Alluxio Worker pod Worker Container Job Worker Container Alluxio Fuse pod Fuse Container Host Machine Volume mount
  • 6. Application Alluxio Worker pod Worker Container Job Worker Container Alluxio Fuse pod Fuse Container Host Machine Application pod Application Container Volume mount mount req data
  • 7. Advantages 1. Better performance because of data-locality 2. Easy to deploy. `helm install` to start the whole Alluxio cluster Alluxio Worker pod Worker Contain er Job Worker Contain er Alluxio Fuse pod Fuse Container Host Machine Application pod Application Container Volume mount mount req data
  • 8. Challenges Alluxio Worker pod Worker Contain er Job Worker Contain er Alluxio Fuse pod Fuse Container Host Machine Application pod Application Container Volume mount mount req data 1. Fuse pod is not always needed but always taking resources. Ex. data preprocessing 2. In some workloads Fuse pod is not needed on some machines in the cluster. Lots of manual work.
  • 9. Before CSI on Kubernetes Pod Volume AWS EBS AzureDisk AzureFile CephFS …
  • 10. CSI on Kubernetes Pod Volume CSI AWS EBS AzureDisk AzureFile CephFS … More than 100 existing CSI drivers
  • 11. Alluxio CSI Driver v1.0.0 Host Machine Alluxio CSI Driver Components
  • 12. Alluxio CSI Driver v1.0.0 Host Machine Alluxio CSI Driver Components Persistent volume + claim
  • 13. Alluxio CSI Driver v1.0.0 Host Machine Alluxio CSI Driver Components Application pod Application Container Persistent volume + claim mount mount + Alluxio Fuse
  • 14. Advantages of Alluxio CSI Driver v1.0.0 1. Fuse process lifecycle is the same as the application pod. Perfectly solving our previous challenges 2. Easy to deploy. With CSI deployed with `helm install`, Fuse process is automated. Host Machine CSI Components Application pod Application Container Persistent volume + claim mount mount + Alluxio Fuse
  • 15. Challenges of Alluxio CSI Driver v1.0.0 1. Fuse processes reside in one of the CSI components - nodeserver What if nodeserver is down? What if nodeserver needs to be upgraded? Host Machine CSI Components Application pod Application Container Persistent volume + claim mount mount + Alluxio Fuse
  • 16. Challenges of Alluxio CSI Driver v1.0.0 1. Fuse processes reside in one of the CSI components - nodeserver What if nodeserver is down? What if nodeserver needs to be upgraded? Host Machine Application pod Application Container Persistent volume + claim mount
  • 17. Alluxio CSI Driver v1.1.0 Host Machine Persistent volume + claim CSI Components
  • 18. Alluxio CSI Driver v1.1.0 Alluxio Fuse pod Fuse Container Host Machine Application pod Application Container Persistent volume + claim mount mount CSI Components create
  • 19. Alluxio CSI Driver v1.1.0 Alluxio Fuse pod Fuse Container Host Machine Application pod Application Container Persistent volume + claim mount mount What if nodeserver is down? What if nodeserver needs to be upgraded? Problem solved!
  • 20. 1. Fuse pod resource allocation 2. In some cases, different applications always use different Fuse processes Next Steps
  • 21. Acknowledgement Community users ● Kevin Cai@AntFin ● Hui Fei, Haoning Sun@Shopee ● Binyang Li@Microsoft