Backend.AI Technical Introduction
Lablup Inc.
2019. 09
ENG
1 / 38
GPU Computing: Maximizing GPU utilization via Backend.AI
Backend.AI: The most efficient way to build and train your machine learning models
2 / 38
Synergy of Deep Learning and GPU
Deep Learning = repetition of numeric ops on matrices with millions/billions of parameters
[Chart: growth of deep-learning models (reference: NVIDIA 2017, "A NEW COMPUTING ERA"; calc. = GOPS * bandwidth)]
2015 Microsoft ResNet: 60 million parameters, 70 quadrillion calc.
2016 Baidu Deep Speech 2: 300 million parameters, 200 quadrillion calc.
2017 Google NMT: 8.7 billion parameters, 1.05 quintillion calc.
[Figure: toy matrix multiplication. A (2x3) = [[1, 2, 3], [4, 5, 6]] times B (3x2) = [[7, 8], [9, 10], [11, 12]]; the first output element is 1·7 + 2·9 + 3·11 = 58]
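To make this concrete, a minimal NumPy sketch (added here for illustration) reproducing the toy multiplication above:

```python
import numpy as np

# The slide's toy example: a 2x3 matrix multiplied by a 3x2 matrix.
A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])

C = A @ B
print(C[0, 0])  # 58, i.e. 1*7 + 2*9 + 3*11
print(C)        # [[ 58  64]
                #  [139 154]]
```

A training step repeats this kind of operation across matrices with millions to billions of entries, which is exactly the workload GPUs parallelize well.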
3 / 38
Synergy of Deep Learning and GPU
[Diagram: CPU vs. GPU chip layout. A CPU spends most die area on control logic and cache with a few ALUs; a GPU fills the same area with many ALUs, each chip backed by its own DRAM]
GPU = More computing units per chip area (ALU)
§ C/C++ code that runs on the GPU in parallel, made easy by NVIDIA's CUDA (2007) and OpenCL (2009)
§ Used in machine learning, numerical analysis, and scientific computing
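For illustration only (CuPy is not mentioned in the slide): the same parallelism is reachable from Python with a GPU array library, assuming the cupy package and a CUDA-capable GPU are available:

```python
import numpy as np
import cupy as cp  # assumption: CuPy installed against the local CUDA toolkit

a = np.random.rand(4096, 4096).astype(np.float32)
b = np.random.rand(4096, 4096).astype(np.float32)

# The multiply executes across thousands of GPU ALUs in parallel.
c = cp.asnumpy(cp.asarray(a) @ cp.asarray(b))  # copy result back to host DRAM
print(c.shape)  # (4096, 4096)
```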
[Chart: computing performance over time (log scale, 10² to 10⁷, years 1980–2020). CPU single-thread performance grew ~1.5X YoY, slowing to ~1.1X YoY; GPU computing power keeps growing ~1.5X YoY, projected to reach a 1000X gap by 2025]
4 / 38
Why GPU Computing?
HPC & AI = High utilization of large-scale resources
GPU = High-density computing chips
Note(s): CPU baselined to 5,000 servers for each workload | Capex: CPU node with 2x Skylake CPUs ~$9K; GPU node with 4x V100 GPUs ~$45K | Opex: power & cooling at $180/kW/month | Power: CPU server + n/w = 0.6 kW; GPU server + n/w = 1.6 kW; DGX-1V/HGX-1 server = 3.2 kW | HPC: GPU node with 4x V100 compared to 2x CPU server | DL Training: DGX-1V compared to a 2x CPU server | DL Inference: HGX-1-based server (8x V100) compared to 2x CPU server | numbers rounded to nearest $0.5M
| Workload | Baseline (CPU-Only) | HPC (Amber, LAMMPS) | AI Training (TensorFlow) | AI Inference (Image, Speech) |
|---|---|---|---|---|
| Speed Up | 1x | 20x | >100x | 60x |
| Servers | 5,000 | 250 | <50 | 84 |
| Capex | $45M | $11M | $7.5M | $7M |
| 3-Year Opex (Power+Cooling) | $19.5M | $2.5M | $1M | $1.5M |
| TCO Saving | N/A | 79% | 86% | 86% |
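As a sanity check, the table's baseline and HPC columns follow directly from the footnote's assumptions; a quick back-of-the-envelope calculation:

```python
RATE = 180      # power & cooling, $/kW/month (from the footnote)
MONTHS = 36     # 3-year opex window

def costs(servers, capex_per_node, kw_per_node):
    """Return (capex, opex) in $M for a fleet of identical nodes."""
    capex = servers * capex_per_node
    opex = servers * kw_per_node * RATE * MONTHS
    return capex / 1e6, opex / 1e6

print(costs(5000, 9_000, 0.6))   # (45.0, 19.44)   -> $45M capex, ~$19.5M opex
print(costs(250, 45_000, 1.6))   # (11.25, 2.592)  -> ~$11M capex, ~$2.5M opex
```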
GPU is necessary!
5 / 38
Backend.AI https://www.backend.ai
Easy and fast
streamlined platform
to train and serve
Machine Learning models
on-premises and clouds
6 / 38
Backend.AI Goal
[Diagram: scattered GPUs with ad-hoc manual assignment ("???") on the left; the same GPUs consolidated into a managed pool by Backend.AI on the right]
Without Backend.AI:
§ Manual assignment of GPUs to researchers
§ Inefficient allocation
§ Manual checks for SW compatibilities
With Backend.AI:
§ Automatic sharing and consolidation of GPUs
§ Need-based use of GPUs
§ Containerized runtime environments
8 / 38
Backend.AI Usage Scenario
[Diagram: three deployment layouts, each managed by Backend.AI]
§ Building GPU clusters
§ Sharing high-end GPU nodes
§ Dynamic scaling out from on-prem to clouds
9 / 38
Backend.AI Platform
[Diagram: the Backend.AI platform stack. Managed GPU apps sit on top; Backend.AI Client (with IDE integration), Backend.AI Manager, and Backend.AI Agent form the middle layers; IaaS / OS and hardware infrastructure sit at the bottom. It serves data scientists, data analysts, instructors & learners, and developers. Key features: container-level GPU virtualization, click-to-ready GPU environments, and a web GUI for monitoring & control]
10 / 38
Backend.AI Differentiation
• The only solution that provides machine learning container technology in a
single framework
­ Existing orchestration layers are optimized for domain-specific functions other than
machine learning (e.g. scheduling, microservice hosting)
­ Lack of products to solve the problems of real machine learning researchers and
developers
• Backend.AI
­ GPU optimization technology
ü Implementing CUDA-optimized solutions with NVIDIA partnership
ü The only container-based multi / partial GPU sharing (fractional scaling) solution
­ Dynamic sandboxing: programmable and rewritable syscall filters
ü Support for richer programmable policies compared to apparmor/seccomp, etc.
­ Docker-based legacy app resource control
ü Calibration of the number of CPU cores recognized by mathematical libraries
such as OpenBLAS
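A sketch of the manual workaround Backend.AI automates here; the env-var approach is standard OpenBLAS/OpenMP behavior, not Backend.AI-specific code:

```python
import os

# Inside a CPU-quota'd container, os.cpu_count() still reports all host
# cores, so BLAS libraries over-subscribe threads unless capped explicitly.
os.environ["OPENBLAS_NUM_THREADS"] = "4"  # match the container's CPU share
os.environ["OMP_NUM_THREADS"] = "4"

import numpy as np  # import after setting the vars so the BLAS picks them up

x = np.random.rand(2000, 2000)
y = x @ x  # now runs with 4 BLAS threads instead of, say, 64
```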
11 / 38
GPU Virtualization Technology
Backend.AI: The most efficient way to build and train your machine learning models
12 / 38
Backend.AI https://www.backend.ai
Backend.AI is an open-source
cloud resource management platform.
We provide fractional GPU resourcing so
you can scale efficiently
whether you’re a scientist, DevOps,
enterprise, or an AI hobbyist.
13 / 38
Backend.AI: GPU Features
• Container-level fractional GPU scaling
­ Assigning slices of SMP / RAM to containers
ü e.g., allocating 2.5 GPUs or 0.3 GPUs
­ Shared GPUs for inference & education workloads
­ Multiple GPUs for model training workloads
­ With proprietary CUDA virtualization layer
• NVIDIA platform integration
­ Optimized for DGX server families
­ Supports NGC (for DL / HPC) image integration
[Diagram: example of GPU sharing/allocation, one container holding 2.5 GPU slots and another holding 0.5 GPU slots]
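As a flavor of usage, a hedged sketch of requesting half a GPU through the Backend.AI Python client; the import path, method names, and image tag below are assumptions rather than a definitive API reference, so check the client docs for your release:

```python
from ai.backend.client.session import Session  # assumed import path

with Session() as api:
    # "cuda.shares" is Backend.AI's fractional-GPU resource slot;
    # whole devices use "cuda.device" instead.
    sess = api.ComputeSession.get_or_create(
        "python-tensorflow:1.14-py36",            # hypothetical image tag
        resources={"cpu": 2, "mem": "4g", "cuda.shares": 0.5},
    )
    print(sess.status)
```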
14 / 38
Fractional & Multi-GPU Scaling
[Diagram: four containers running on the Backend.AI GPU Virtualizer, which sits on nvidia-docker + the CUDA driver. The host sees six physical devices (PCIE/0 through PCIE/5, i.e. /device:GPU:0 through /device:GPU:5); each container sees only its own allocation, renumbered from /device:GPU:0 regardless of which host PCIe devices back it]
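From inside a container, this virtualized view is what frameworks see; for example, with TensorFlow (illustrative, not from the slide):

```python
import tensorflow as tf

# A container granted a single GPU slice sees exactly one device named
# /device:GPU:0, no matter which host PCIe slot actually backs it.
print(tf.config.experimental.list_physical_devices("GPU"))
# e.g. [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
```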
15 / 38
NVIDIA DGX Series
• NVIDIA DGX-1/DGX-2
­ Complete multi-GPU environment system
ü Ubuntu-based Host OS (also RedHat support)
ü NV Link / NV Switch based high-speed networking
ü Great testbed for various load tests!
• Backend.AI on DGX-family
­ Complements the NVIDIA Container Runtime
ü GPU sharing for multi-user support
ü Scheduling with CPU/GPU topology
ü Features for machine learning pipeline
­ Technology collaboration via NVIDIA Inception Program
SYSTEM SPECIFICATIONS (NVIDIA DGX-2)
| GPUs | 16X NVIDIA® Tesla V100 |
| GPU Memory | 512 GB total |
| Performance | 2 petaFLOPS |
| NVIDIA CUDA® Cores | 81,920 |
| NVIDIA Tensor Cores | 10,240 |
| NVSwitches | 12 |
| Maximum Power Usage | 10 kW |
| CPU | Dual Intel Xeon Platinum 8168, 2.7 GHz, 24 cores |
| System Memory | 1.5 TB |
| Network | 8X 100 Gb/sec InfiniBand/100GigE; Dual 10/25 Gb/sec Ethernet |
[Excerpt from the NVIDIA DGX-2 datasheet:]
NVIDIA DGX-2: THE WORLD'S MOST POWERFUL DEEP LEARNING SYSTEM FOR THE MOST COMPLEX AI CHALLENGES
The Challenge of Scaling to Meet the Demands of Modern AI and Deep Learning
Deep neural networks are rapidly growing in size and complexity, in response to the most pressing challenges in business and research. The computational capacity needed to support today's modern AI workloads has outpaced traditional data center architectures. Modern techniques that exploit increasing use of model parallelism are colliding with the limits of inter-GPU bandwidth, as developers build increasingly large accelerated computing clusters, pushing the limits of data center scale. A new approach is needed: one that delivers almost limitless AI computing scale in order to break through the barriers to achieving faster insights that can transform the world.
[Diagram: DGX software stack]
§ Deep-learning frameworks: TensorFlow, Caffe, Torch, mxnet, Theano, etc.
§ Deep-learning user program: NVIDIA DIGITS
§ Container tools: NVIDIA Container Runtime for Docker
§ GPU driver: NVIDIA Driver
§ System: Ubuntu-based Host OS
16 / 38
NVIDIA Platform Integration: NGC
• NGC (NVIDIA GPU Cloud)
­ Container image collection optimized for nvidia-docker
ü Direct optimization options and library dependency
management by NVIDIA
­ Expansion to a model store announced at GTC 2019
ü Sharing deep learning models between users and organizations
ü Supports transfer learning by adding data on top of a pre-trained model
ü Easier and faster model training through model scripts
• Backend.AI with NGC
­ Runs all NGC-based images
­ NVIDIA-recommended options applied (including the Docker shm size limit)
­ Fractional GPU sharing
­ Supports the NGC model store and model scripts (soon)
17 / 38
Backend.AI @NVIDIA GTC Silicon Valley 2019
• DGX User Group Meetup
­ Hearing DGX deployment cases and customer requirements
• NGC User Group Meetup
­ Presenting Backend.AI NGC
integration
• Main Session Talk
­ Introducing Backend.AI technology
• Inception Startup Booth
­ Demonstrating container-level GPU
virtualization
­ Having direct Q&A with NVIDIA CUDA
Developers
18 / 38
Backend.AI Competitor Analysis
[Table: feature comparison of nvidia-docker, Docker Swarm, OpenStack, Kubernetes, Apache Mesos, and Backend.AI across the criteria below; the individual per-product check marks did not survive this export]
§ GPU Support: GPU assignment & isolation; heterogeneous accelerators; fractional GPU scaling
§ Security: sandboxing via hypervisor/container; programmable sandboxing
§ Virtualization: VM (hypervisor); Docker container
§ Scheduling: availability-slot based; advanced (e.g., DRF)
§ Integration: modern AI frameworks
* Now in beta testing
** Cloud vendor / OpenStack handles VM management
*** Slot-based, but advanced customization is possible with the label feature
19 / 38
Flexible Resource Management
Backend.AI: The most efficient way to build and train your machine learning models
20 / 38
Flexible Resource Allocation: Resource Groups
• Resource Groups: Groups of managed hardware resources
­ Specify the available resource groups for each user, project, and domain
­ Allow resource requests to be allocated only within specific resource groups
­ Cloud autoscaling can be applied per resource group
• Examples
­ Resource Groups by Device Performance : V100 / P100 / K80 / etc.
­ Resource Groups by Nodes : Servers / Workstations / IDC / etc.
­ Resource Groups by Clouds : AWS / GCP / Azure / etc.
• Applications
­ Assign specific hardware or GPU only to specific users, projects, teams, or domains
­ Divide node groups by CPU / GPU / storage
­ Group and manage nodes that are physically in the same network (for multi-network cluster)
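A purely illustrative Python sketch (not Backend.AI's actual configuration schema) of how such grants can be modeled; the grant data mirrors the scenario slides that follow:

```python
# Hypothetical grant table mirroring the scenarios on the following slides.
GRANTS = {
    "user1":    {"RG-A"},
    "user2":    {"RG-A", "RG-B"},
    "project3": {"RG-B", "RG-D"},   # P100 group + AWS
    "project4": {"RG-E"},           # Azure only
}

def schedulable_groups(requester, requested):
    """Resource groups the scheduler may place this request into."""
    return requested & GRANTS.get(requester, set())

print(schedulable_groups("user1", {"RG-A", "RG-B"}))     # {'RG-A'}
print(schedulable_groups("project4", {"RG-A", "RG-E"}))  # {'RG-E'}
```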
21 / 38
Flexible Resource Allocation: Scenarios
[Diagram: the Backend.AI Manager orchestrating Resource Group A (on-premise), Resource Group B (on-premise), Resource Group D (cloud, scalable), and Storage Group C]
22 / 38
Flexible Resource Allocation: Scenarios
• Per-user resource group
permission
­ User 1: grant to RG A
­ User 2: grant to RG A, B
­ Each user has separate privileges
to Storage C
• Session / task batch
­ Manual batch to specific RG
­ Automatic discovery of optimal
resource combinations across all
available RGs before starting
[Diagram: User 1 granted Resource Group A; User 2 granted Resource Groups A and B; both reach Storage Group C through the Backend.AI Manager]
23 / 38
Flexible Resource Allocation: Scenarios
• Project-wise resource group
permission
­ Project 1: grant to RG A, B
­ Project 2: grant to RG B, D
• Storage sharing
­ Different resource groups can
share the same storage groups
­ Personal storage folder
ü Only owners can access
ü Invitation feature for sharing
­ Project storage folder
ü All project members can access
[Diagram: Project 1 granted Resource Groups A and B; Project 2 granted Resource Groups B and D; all share Storage Group C through the Backend.AI Manager]
24 / 38
Flexible Resource Allocation: Scenarios
• Resource group example
­ RG A: NVIDIA V100 GPU group
­ RG B: NVIDIA P100 GPU group
­ User 1 can only use V100, User 2
can use both V100 and P100
­ Project 3 can use P100 group and
AWS cloud
­ Project 4 can only use Microsoft
Azure cloud
[Diagram: Resource Group A (V100), Resource Group B (P100), Resource Group D (AWS), Resource Group E (Azure), and Storage Group C under the Backend.AI Manager; User 1 → A; User 2 → A, B; Project 3 → B, D; Project 4 → E]
25 / 38
User-Friendly GUI
Backend.AI: The most efficient way to build and train your machine learning models
26 / 38
Resource Monitor
27 / 38
Environment and Resource Selection
28 / 38
Interactive Development Tool
29 / 38
Data Manipulation
30 / 38
Cases
Backend.AI: The most efficient way to build and train your machine learning models
31 / 38
Backend.AI Cases: AI Bigdata MBA Dept., Kookmin Univ.
• GPU server farm for students and researchers in finance fields
­ 3 servers with 24 GPUs for the simultaneous use of more than 80 students in a class and
researchers in labs
• Spec.
­ Different resource policy for students and researchers
­ 18 TiB Ceph distributed file system (cephfs) built from the HDDs on each node over the LAN
­ Web GUI for operation and maintenance: no operator needed
[Diagram: three nodes (Manager + Agent, Agent, Agent) on a 1 Gbps LAN with 24 NVIDIA GPUs total, node-specific CPUs, and an 18 TiB distributed file system (cephfs); serving an ML class of 40+ students plus 10+ graduate students and researchers]
32 / 38
Backend.AI Cases: Lablup GPU Cloud
• Backend.AI service for cloud users (B2C)
­ https://cloud.backend.ai/ (in private beta)
­ Use Backend.AI on the web after sign-up (invitation needed)
• Spec.
­ Unified AWS + Azure + GCP
­ Google TPU support (beta)
­ Azure FileShare + AWS EFS (Elastic File System) for datastore
[Diagram: users reach the service over the Internet; Manager + Agents run on a DGX-2 and custom-built GPU nodes in an LG U+ IDC; additional Agents run in AWS ap-northeast-2 (EFS, RDS), Azure korea-south (FileShare), and GCP asia-east1 (TPUs)]
33 / 38
GPU Virtualization Performance
Backend.AI: The most efficient way to build and train your machine learning models
34 / 38
Backend.AI Performance Benchmark
• Example: Fashion-MNIST
• V100/P100 GPU cluster (single node, 8 GPUs)
­ Number of models: 50
­ Comparison
ü V100 as-is, standalone (whole GPUs)
ü V100 with Backend.AI sharing (fractional*)
ü P100** with Backend.AI sharing (fractional)
* Fraction: 4 SMPs, 1 GiB GPU memory
** The P100 is one GPU generation older than the V100
GPU virtualization and automatic resource management increase the utilization of expensive GPU resources
Fashion-MNIST, 50 tasks: relative total processing time (lower is better)
| V100 x 8, as-is (whole) | 100.00% |
| V100 x 8, Backend.AI (fractional) | 20.24% |
| P100 x 8, Backend.AI (fractional) | 33.74% |
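Reading the chart the other way around, the reciprocal of relative processing time gives the throughput gain (a derived figure, not shown on the slide):

```python
relative_time = {
    "V100 x 8, as-is (whole)":           1.0000,
    "V100 x 8, Backend.AI (fractional)": 0.2024,
    "P100 x 8, Backend.AI (fractional)": 0.3374,
}
for setup, t in relative_time.items():
    print(f"{setup}: {1 / t:.1f}x throughput")
# Fractional sharing on V100s finishes the 50 tasks ~4.9x faster;
# even previous-generation P100s with sharing beat dedicated V100s (~3.0x).
```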
35 / 38
Cost-saving with Backend.AI
Backend.AI: The most efficient way to build and train your machine learning models
36 / 38
Case of Cloud for Machine Learning Education
• Machine learning education and
development cloud service
­ 25 users per 2-month term
• Optimal utilization for each education /
development workload through GPU virtualization
­ Infrastructure costs reduced by more than 75%
• Automatic resource allocation and
environment preparation via the GUI
­ Optimal operation without a dedicated
administrator
­ Eliminates the long-term maintenance burden
Infrastructure / management cost reduction through GPU virtualization also applies to on-premise solutions.
Case study: cloud-based ML education service cost comparison (Backend.AI Cloud relative to Company A's ML cloud = 100%)
| Infra. cost | 23% |
| Operator payroll | 0% |
| Total cost | 20% |
37 / 38
Make AI Accessible!
For more information:
Lablup Inc.: https://www.lablup.com
Backend.AI: https://www.backend.ai
Backend.AI GitHub: https://github.com/lablup/backend.ai
Backend.AI Cloud (beta): https://cloud.backend.ai
38 / 38