3. My message (Deep Learning web services)
Learning systems require an HPC architecture; inference systems require a Web architecture.
NVIDIA's deep learning strategy targets large-scale data centers. If you build your services on a cloud such as AWS, other approaches are available.
(This is Nagao's personal view.)
6. Learning: HPC Architecture / Inference: Web Architecture
[Diagram: big data plus labels train a neural network structure (learning side, requiring many cores and big-data analysis); the trained network is deployed to an API microservice that classifies images as apple, orange, strawberry, or banana (inference side, requiring real-time processing).]
Deep Learning web services require two systems.
Inference side:
• Provides a microservice
• ~250 µs response time
• Always listening for requests
Learning side:
• Runs at nearly 100% CPU and/or GPU load for several days; supports multi-node and multi-GPU
• Compute nodes are used only while a job is running
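The inference side described above — a microservice that is always listening and answers each request quickly — can be sketched as a plain Go HTTP handler. This is an illustrative stub, not a real system: `classify` and its label map stand in for inference on a deployed network.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// classify is a placeholder for running a trained network on an input.
// The label map is invented for illustration.
func classify(image string) string {
	labels := map[string]string{"img1": "apple", "img2": "orange"}
	if l, ok := labels[image]; ok {
		return l
	}
	return "unknown"
}

// handler answers one classification request.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, classify(r.URL.Query().Get("image")))
}

func main() {
	// In production this would be http.ListenAndServe(":8000", nil):
	// a process that is always listening for requests. Here the
	// handler is exercised once through a test server so the sketch
	// terminates.
	srv := httptest.NewServer(http.HandlerFunc(handler))
	defer srv.Close()
	resp, err := http.Get(srv.URL + "/classify?image=img1")
	if err != nil {
		panic(err)
	}
	body, _ := io.ReadAll(resp.Body)
	resp.Body.Close()
	fmt.Println(string(body)) // prints "apple"
}
```

The learning side has the opposite shape: no request loop at all, just a batch job that holds its compute nodes only while it runs.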
7. GPUs suited for learning vs. GPUs suited for inference.
8. nvidia-docker
• There are many Deep Learning frameworks and versions.
• The trained network needs to be deployed to the inference server.
9. nvidia-docker
[Diagram: the learning side PUSHes a container image to a Docker Registry; the inference microservice PULLs it.]
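The PUSH/PULL workflow above only works if both sides agree on the image reference, and the slide's point about many frameworks and versions suggests encoding them in it. A minimal sketch, where the registry host, repository, framework name, and tag are all illustrative assumptions:

```go
package main

import "fmt"

// imageRef builds a registry reference like
// "registry.example.com/dl/caffe:0.15-gpu", baking the framework and
// version into the image name so the inference side pulls exactly what
// the learning side pushed.
func imageRef(registry, repo, framework, tag string) string {
	return fmt.Sprintf("%s/%s/%s:%s", registry, repo, framework, tag)
}

func main() {
	ref := imageRef("registry.example.com", "dl", "caffe", "0.15-gpu")
	fmt.Println("docker push " + ref) // learning side
	fmt.Println("docker pull " + ref) // inference side
}
```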
11. GPU REST Engine
GPU REST Engine is a template, written in Go, for launching the inference microservice. The template launches a web server on a port number set by the administrator. [Diagram: the GPU REST Engine image is distributed via the Docker Registry.]
https://github.com/NVIDIA/gpu-rest-engine
14. Finally, Docker comes to HPC! How do you deploy your apps?
Mesos can present a data center containing both HPC nodes (learning) and web servers (inference) as a single abstraction.
[Diagram: jobs are submitted to the learning side; a daemon on the inference side runs GPU REST Engine pulled from the Docker Registry.]
16. [Diagram: Amazon API Gateway in front; AWS Elastic Beanstalk hosts the inference service; cfncluster builds the learning cluster; S3 buckets and Amazon DynamoDB hold the data and metadata; a Submit Job Daemon drives the cluster.]
cfncluster — middleware for HPC on AWS:
• Dynamic creation, deletion, and management of HPC clusters
• Job scheduling
17. If learning is also to be launched through the API, configure the API call to kick the job-management software (the Submit Job Daemon).
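The note above can be sketched as a small helper on the API side: the endpoint does not run training itself, it only builds the scheduler submission and kicks it off. The `qsub` command name, job name, and script path are assumptions for illustration, not from the slides.

```go
package main

import (
	"fmt"
	"os/exec"
)

// submitCommand builds the job-scheduler invocation for a training
// script; an API handler would call cmd.Run() (or Start()) on the
// result instead of training in-process.
func submitCommand(script string) *exec.Cmd {
	return exec.Command("qsub", "-N", "training", script)
}

func main() {
	cmd := submitCommand("/jobs/train.sh")
	fmt.Println(cmd.Args) // the argv handed to the scheduler
}
```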