SlideShare una empresa de Scribd logo
1 de 17
Descargar para leer sin conexión
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Deep Learning on AWS
No Math, I promise!
Julien Simon
Principal Technical Evangelist
Amazon Web Services

julsimon@amazon.fr
@julsimon

22/11/2016
Agenda
•  Neural networks
•  Recommendation @ Amazon.com

•  GPU instances & Nvidia CUDA
•  Amazon DSSTNE
•  Deep Learning AMI
Neural networks
•  Nodes, weights and layers
•  Activation function
•  Forward propagation
•  Training
•  Input layer: 1 to n data samples
•  Output layer: the expected result
•  Use backpropagation to minimize error by
adjusting weights
•  Predicting
•  Input layer: 1 data sample
•  Output layer: the predicted result
http://machinelearningmastery.com/what-is-deep-learning/ 
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
Neural networks for product recommendation
•  The input and output layers correspond to products"
(“if you bought this, you’ll like this”)
•  Amazon.com has hundreds of millions of products"
à Recommendation problems require very large "
input & output layers
•  Training and recommendation are performed "
using matrix operations, which can best be scaled "
using GPUs.

•  Most of the matrix is zero-filled, because only a fraction of products are
recommended in the dataset
https://commons.wikimedia.org/
https://aws.amazon.com/blogs/big-data/generating-recommendations-at-amazon-scale-with-
apache-spark-and-amazon-dsstne/
Recommendation @ Amazon.com
Amazon DSSTNE (aka ‘Destiny’)
• Deep Scalable Sparse Tensor Network Engine
• Open source library for deep neural networks using GPUs
https://github.com/amznlabs/amazon-dsstne 
• Multi-GPU scale for training and prediction
• Larger networks than are possible with a single GPU
• Optimized for fast performance on sparse datasets
• Human-readable networks (JSON file)
• Can run locally or on AWS
AWS GPU Instances
•  g2 (2xlarge, 8xlarge)
•  32 vCPUs, 60 GB RAM
•  4 NVIDIA K520 GPUs
•  16 GB of GPU memory, 6144 CUDA cores
•  p2 (xlarge, 8xlarge, 16xlarge)
•  Launched in 09/16
•  64 vCPUs, 732 GB RAM
•  16 NVIDIA GK210 GPUs
•  192 GB of GPU memory, 39936 CUDA cores
•  20 Gbit/s networking

https://aws.amazon.com/blogs/aws/new-g2-instance-type-with-4x-more-gpu-power/ 
https://aws.amazon.com/blogs/aws/new-p2-instance-type-for-amazon-ec2-up-to-16-gpus/
Nvidia CUDA architecture
•  CUDA is a parallel computing
platform and application
programming interface model
created by Nvidia
•  CUDA toolkit to build applications
(compiler, etc.)
•  CUDA drivers to manage GPUs
http://www.nvidia.com/object/cuda_home_new.html
https://en.wikipedia.org/wiki/CUDA
Deploying CUDA apps with Docker
•  Docker containers are
hardware-agnostic and
platform-agnostic
•  Installing Nvidia drivers
inside containers breaks
this model
•  nvidia-docker solves this
issue by using host drivers
https://www.docker.com/what-docker
https://github.com/NVIDIA/nvidia-docker
Building DSSTNE
DSSTNE has quite a few dependencies "
(see https://github.com/amznlabs/amazon-dsstne)
1.  Install them manually
2.  Use the DSSTNE AMI (CUDA 7.0)
3.  Launch a container installing all dependencies in its
Docker file (CUDA 8.0, the latest version)"
https://github.com/tristanpenman/docker-dsstne 
Full details here:
https://github.com/juliensimon/aws/blob/master/dsstne/
demo-docker.txt
Demo: Amazon Destiny



http://grouplens.org/datasets/movielens/ 
27,000 movies
138,000 users
20 million movie recommendations
(Matrix is 99.5% sparse)




		 Alice	 Bob	 Charlie	 David	 Ernest	
Star	Wars	 1	 1	 1	
Lord	of	the	Rings	 1	 1	 1	
Incep>on	 1	
Bambi	 1	 1	
PreAy	Woman	 1	 1	
		 Alice	 Bob	 Charlie	 David	 Ernest	
Star	Wars	 0.23	 1	 0.12	 1	 1	
Lord	of	the	Rings	 1	 0.34	 1	 0.89	 1	
Incep>on	 0.8	 1	 0.43	 0.76	 0.45	
Bambi	 1	 0.42	 0.5	 1	 0.34	
PreAy	Woman	 1	 0.09	 0.67	 0.04	 1	
à Train a neural network
Input layer: 27,000 neurons"
(1 for each movie)
1 hidden layer: 128 neurons
Output layer: 27,000 neurons"
(1 for each movie)
à  Recommend 10 movies per user
Training & predicting
# Fetch the Movie Lens dataset and the neural network config file 
 

wget https://s3.amazonaws.com/amazon-dsstne-data/movielens/ml20m-all 

wget https://s3.amazonaws.com/amazon-dsstne-data/movielens/config.json 
 
 

# Format the input and output data for training 

generateNetCDF -d gl_input -i ml20m-all -o gl_input.nc -f features_input -s samples_input -c 

generateNetCDF -d gl_output -i ml20m-all -o gl_output.nc -f features_output -s samples_input -c 

# Train the model on 8 GPUs 
 

mpirun -np 8 train -c config.json -i gl_input.nc -o gl_output.nc -n gl.nc -b 256 -e 10

# Predict the results 

predict -b 1024 -d gl -i features_input -o features_output -k 10 -n gl.nc -f ml20m-all -s recs -r ml20m-all
http://www.unidata.ucar.edu/software/netcdf 
https://www.open-mpi.org
Amazon Destiny vs Google TensorFlow
“DSSTNE on a single virtualized K520 GPU (released in 2012) is faster
than TensorFlow on a bare metal Tesla M40 (released in 2015)”
“TensorFlow does not provide the automagic model parallelism
provided by DSSTNE”
https://medium.com/@scottlegrand/first-dsstne-benchmarks-tldr-almost-15x-faster-than-
tensorflow-393dbeb80c0f
Training Amazon Destiny on multiple GPUs
Movie Lens 20M, g2.8xlarge vs p2.16xlarge
mpirun –np <n> train -c config.json -i gl_input.nc -o gl_output.nc -n gl.nc -b 256 -e 10
AWS Deep Learning AMI
• Deep Learning Frameworks – 5 popular Deep Learning
Frameworks (MXNet, Caffe, Tensorflow, Theano, and
Torch) all prebuilt and pre-installed
• Pre-installed components – Nvidia drivers, cuDNN,
Anaconda, Python2 and Python3
• AWS Integration – Packages and configurations that
provide tight integration with Amazon Web Services like
Amazon EFS (Elastic File System)
Resources
Big Data Architectural Patterns and Best Practices on AWS
https://www.youtube.com/watch?v=K7o5OlRLtvU 
Real-World Smart Applications With Amazon Machine Learning
https://www.youtube.com/watch?v=sHJx1KJf8p0 
Deep Learning: Going Beyond Machine Learning"
https://www.youtube.com/watch?v=Ra6m70d3t0o

DSSTNE: A new Deep Learning Framework For Large Sparse Datasets
https://www.youtube.com/watch?v=LbYR6Mzq6FE

AWS Big Data blog: https://blogs.aws.amazon.com/bigdata/
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Thank you!
Julien Simon
Principal Technical Evangelist
Amazon Web Services

julsimon@amazon.fr
@julsimon

Más contenido relacionado

La actualidad más candente

AWS Big Data combo
AWS Big Data comboAWS Big Data combo
AWS Big Data comboJulien SIMON
 
AWS CloudFormation Best Practices
AWS CloudFormation Best PracticesAWS CloudFormation Best Practices
AWS CloudFormation Best PracticesAmazon Web Services
 
Picking the right AWS backend for your Java application
Picking the right AWS backend for your Java applicationPicking the right AWS backend for your Java application
Picking the right AWS backend for your Java applicationJulien SIMON
 
Running Open Source Platforms on AWS (November 2016)
Running Open Source Platforms on AWS (November 2016)Running Open Source Platforms on AWS (November 2016)
Running Open Source Platforms on AWS (November 2016)Julien SIMON
 
Advanced Task Scheduling with Amazon ECS (June 2017)
Advanced Task Scheduling with Amazon ECS (June 2017)Advanced Task Scheduling with Amazon ECS (June 2017)
Advanced Task Scheduling with Amazon ECS (June 2017)Julien SIMON
 
The AWS DevOps combo (January 2017)
The AWS DevOps combo (January 2017)The AWS DevOps combo (January 2017)
The AWS DevOps combo (January 2017)Julien SIMON
 
Big Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon AthenaBig Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon AthenaJulien SIMON
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
AWSの進化とSmartNewsの裏側
AWSの進化とSmartNewsの裏側AWSの進化とSmartNewsの裏側
AWSの進化とSmartNewsの裏側SmartNews, Inc.
 
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料Amazon Web Services
 
T1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsT1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsAmazon Web Services
 
AWS re:Invent 2016 recap (part 1)
AWS re:Invent 2016 recap (part 1)AWS re:Invent 2016 recap (part 1)
AWS re:Invent 2016 recap (part 1)Julien SIMON
 
Ceate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureCeate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureAmazon Web Services
 
AWS Webcast - High Availability SQL Server with Amazon RDS
AWS Webcast - High Availability SQL Server with Amazon RDSAWS Webcast - High Availability SQL Server with Amazon RDS
AWS Webcast - High Availability SQL Server with Amazon RDSAmazon Web Services
 
AWS Summit London 2014 | Options for Hybrid Environments (200)
AWS Summit London 2014 | Options for Hybrid Environments (200)AWS Summit London 2014 | Options for Hybrid Environments (200)
AWS Summit London 2014 | Options for Hybrid Environments (200)Amazon Web Services
 
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...Amazon Web Services
 
(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million Users(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million UsersAmazon Web Services
 
Introduction to Amazon Web Services - How to Scale your Next Idea on AWS : A ...
Introduction to Amazon Web Services - How to Scale your Next Idea on AWS : A ...Introduction to Amazon Web Services - How to Scale your Next Idea on AWS : A ...
Introduction to Amazon Web Services - How to Scale your Next Idea on AWS : A ...Amazon Web Services
 
使用 Blox 實現容器任務調度與資源編排
使用 Blox 實現容器任務調度與資源編排使用 Blox 實現容器任務調度與資源編排
使用 Blox 實現容器任務調度與資源編排Amazon Web Services
 

La actualidad más candente (20)

AWS Big Data combo
AWS Big Data comboAWS Big Data combo
AWS Big Data combo
 
AWS CloudFormation Best Practices
AWS CloudFormation Best PracticesAWS CloudFormation Best Practices
AWS CloudFormation Best Practices
 
Picking the right AWS backend for your Java application
Picking the right AWS backend for your Java applicationPicking the right AWS backend for your Java application
Picking the right AWS backend for your Java application
 
Running Open Source Platforms on AWS (November 2016)
Running Open Source Platforms on AWS (November 2016)Running Open Source Platforms on AWS (November 2016)
Running Open Source Platforms on AWS (November 2016)
 
Advanced Task Scheduling with Amazon ECS (June 2017)
Advanced Task Scheduling with Amazon ECS (June 2017)Advanced Task Scheduling with Amazon ECS (June 2017)
Advanced Task Scheduling with Amazon ECS (June 2017)
 
The AWS DevOps combo (January 2017)
The AWS DevOps combo (January 2017)The AWS DevOps combo (January 2017)
The AWS DevOps combo (January 2017)
 
Big Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon AthenaBig Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon Athena
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
AWSの進化とSmartNewsの裏側
AWSの進化とSmartNewsの裏側AWSの進化とSmartNewsの裏側
AWSの進化とSmartNewsの裏側
 
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
 
T1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsT1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on aws
 
AWS re:Invent 2016 recap (part 1)
AWS re:Invent 2016 recap (part 1)AWS re:Invent 2016 recap (part 1)
AWS re:Invent 2016 recap (part 1)
 
Ceate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureCeate a Scalable Cloud Architecture
Ceate a Scalable Cloud Architecture
 
AWS Webcast - High Availability SQL Server with Amazon RDS
AWS Webcast - High Availability SQL Server with Amazon RDSAWS Webcast - High Availability SQL Server with Amazon RDS
AWS Webcast - High Availability SQL Server with Amazon RDS
 
AWS Summit London 2014 | Options for Hybrid Environments (200)
AWS Summit London 2014 | Options for Hybrid Environments (200)AWS Summit London 2014 | Options for Hybrid Environments (200)
AWS Summit London 2014 | Options for Hybrid Environments (200)
 
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...
(ARC402) Deployment Automation: From Developers' Keyboards to End Users' Scre...
 
(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million Users(ARC301) Scaling Up to Your First 10 Million Users
(ARC301) Scaling Up to Your First 10 Million Users
 
Introduction on Amazon EC2
 Introduction on Amazon EC2 Introduction on Amazon EC2
Introduction on Amazon EC2
 
Introduction to Amazon Web Services - How to Scale your Next Idea on AWS : A ...
Introduction to Amazon Web Services - How to Scale your Next Idea on AWS : A ...Introduction to Amazon Web Services - How to Scale your Next Idea on AWS : A ...
Introduction to Amazon Web Services - How to Scale your Next Idea on AWS : A ...
 
使用 Blox 實現容器任務調度與資源編排
使用 Blox 實現容器任務調度與資源編排使用 Blox 實現容器任務調度與資源編排
使用 Blox 實現容器任務調度與資源編排
 

Similar a Deep Learning on AWS (November 2016)

The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computingbakers84
 
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...Amazon Web Services
 
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化NVIDIA Taiwan
 
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PXNVIDIA Japan
 
Deep Dive: Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
Deep Dive: Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech TalksDeep Dive: Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
Deep Dive: Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech TalksAmazon Web Services
 
Deep Dive on Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech TalksAmazon Web Services
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAlluxio, Inc.
 
Experiences with Power 9 at A*STAR CRC
Experiences with Power 9 at A*STAR CRCExperiences with Power 9 at A*STAR CRC
Experiences with Power 9 at A*STAR CRCGanesan Narayanasamy
 
A journay to do AI research in the cloud.pdf
A journay to do AI research in the cloud.pdfA journay to do AI research in the cloud.pdf
A journay to do AI research in the cloud.pdfLiang Yan
 
Gpudigital lab for english partners
Gpudigital lab for english partnersGpudigital lab for english partners
Gpudigital lab for english partnersOleg Gubanov
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.J On The Beach
 
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...SQUADEX
 
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...AMD Developer Central
 
Deep Learning with GPUs in Production - AI By the Bay
Deep Learning with GPUs in Production - AI By the BayDeep Learning with GPUs in Production - AI By the Bay
Deep Learning with GPUs in Production - AI By the BayAdam Gibson
 
Build, train, and deploy Machine Learning models at scale (May 2018)
Build, train, and deploy Machine Learning models at scale (May 2018)Build, train, and deploy Machine Learning models at scale (May 2018)
Build, train, and deploy Machine Learning models at scale (May 2018)Julien SIMON
 
AWS re:Invent 2016: Deep Learning, 3D Content Rendering, and Massively Parall...
AWS re:Invent 2016: Deep Learning, 3D Content Rendering, and Massively Parall...AWS re:Invent 2016: Deep Learning, 3D Content Rendering, and Massively Parall...
AWS re:Invent 2016: Deep Learning, 3D Content Rendering, and Massively Parall...Amazon Web Services
 

Similar a Deep Learning on AWS (November 2016) (20)

The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
 
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
 
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
GTC Taiwan 2017 在 Google Cloud 當中使用 GPU 進行效能最佳化
 
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
車載組み込み用ディープラーニング・エンジン NVIDIA DRIVE PX
 
Deep Dive: Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
Deep Dive: Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech TalksDeep Dive: Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
Deep Dive: Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
 
Deep Dive on Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Elastic GPUs - May 2017 AWS Online Tech Talks
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
 
Experiences with Power 9 at A*STAR CRC
Experiences with Power 9 at A*STAR CRCExperiences with Power 9 at A*STAR CRC
Experiences with Power 9 at A*STAR CRC
 
A journay to do AI research in the cloud.pdf
A journay to do AI research in the cloud.pdfA journay to do AI research in the cloud.pdf
A journay to do AI research in the cloud.pdf
 
Gpudigital lab for english partners
Gpudigital lab for english partnersGpudigital lab for english partners
Gpudigital lab for english partners
 
GPU Algorithms and trends 2018
GPU Algorithms and trends 2018GPU Algorithms and trends 2018
GPU Algorithms and trends 2018
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.
 
JOSA TechTalks - Downgrade your Costs
JOSA TechTalks - Downgrade your CostsJOSA TechTalks - Downgrade your Costs
JOSA TechTalks - Downgrade your Costs
 
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
 
Cuda
CudaCuda
Cuda
 
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
 
Deep Learning with GPUs in Production - AI By the Bay
Deep Learning with GPUs in Production - AI By the BayDeep Learning with GPUs in Production - AI By the Bay
Deep Learning with GPUs in Production - AI By the Bay
 
Build, train, and deploy Machine Learning models at scale (May 2018)
Build, train, and deploy Machine Learning models at scale (May 2018)Build, train, and deploy Machine Learning models at scale (May 2018)
Build, train, and deploy Machine Learning models at scale (May 2018)
 
AWS re:Invent 2016: Deep Learning, 3D Content Rendering, and Massively Parall...
AWS re:Invent 2016: Deep Learning, 3D Content Rendering, and Massively Parall...AWS re:Invent 2016: Deep Learning, 3D Content Rendering, and Massively Parall...
AWS re:Invent 2016: Deep Learning, 3D Content Rendering, and Massively Parall...
 

Más de Julien SIMON

An introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceAn introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceJulien SIMON
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersJulien SIMON
 
Building NLP applications with Transformers
Building NLP applications with TransformersBuilding NLP applications with Transformers
Building NLP applications with TransformersJulien SIMON
 
Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)Julien SIMON
 
Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Julien SIMON
 
Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)Julien SIMON
 
An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)Julien SIMON
 
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...Julien SIMON
 
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)Julien SIMON
 
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...Julien SIMON
 
A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)Julien SIMON
 
Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)Julien SIMON
 
Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Julien SIMON
 
The Future of AI (September 2019)
The Future of AI (September 2019)The Future of AI (September 2019)
The Future of AI (September 2019)Julien SIMON
 
Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)Julien SIMON
 
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...Julien SIMON
 
Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)Julien SIMON
 
Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)Julien SIMON
 
Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)Julien SIMON
 
Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)Julien SIMON
 

Más de Julien SIMON (20)

An introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceAn introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging Face
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face Transformers
 
Building NLP applications with Transformers
Building NLP applications with TransformersBuilding NLP applications with Transformers
Building NLP applications with Transformers
 
Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)
 
Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)
 
Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)
 
An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)
 
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
 
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
 
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
 
A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)
 
Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)
 
Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)
 
The Future of AI (September 2019)
The Future of AI (September 2019)The Future of AI (September 2019)
The Future of AI (September 2019)
 
Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)
 
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
 
Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)
 
Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)
 
Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)
 
Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)
 

Último

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 

Último (20)

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 

Deep Learning on AWS (November 2016)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Deep Learning on AWS No Math, I promise! Julien Simon Principal Technical Evangelist Amazon Web Services julsimon@amazon.fr @julsimon 22/11/2016
  • 2. Agenda •  Neural networks •  Recommendation @ Amazon.com •  GPU instances & Nvidia CUDA •  Amazon DSSTNE •  Deep Learning AMI
  • 3. Neural networks •  Nodes, weights and layers •  Activation function •  Forward propagation •  Training •  Input layer: 1 to n data samples •  Output layer: the expected result •  Use backpropagation to minimize error by adjusting weights •  Predicting •  Input layer: 1 data sample •  Output layer: the predicted result http://machinelearningmastery.com/what-is-deep-learning/ https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
  • 4. Neural networks for product recommendation •  The input and output layers correspond to products" (“if you bought this, you’ll like this”) •  Amazon.com has hundreds of millions of products" à Recommendation problems require very large " input & output layers •  Training and recommendation are performed " using matrix operations, which can best be scaled " using GPUs. •  Most of the matrix is zero-filled, because only a fraction of products are recommended in the dataset https://commons.wikimedia.org/
  • 6. Amazon DSSTNE (aka ‘Destiny’) • Deep Scalable Sparse Tensor Network Engine • Open source library for deep neural networks using GPUs https://github.com/amznlabs/amazon-dsstne • Multi-GPU scale for training and prediction • Larger networks than are possible with a single GPU • Optimized for fast performance on sparse datasets • Human-readable networks (JSON file) • Can run locally or on AWS
  • 7. AWS GPU Instances •  g2 (2xlarge, 8xlarge) •  32 vCPUs, 60 GB RAM •  4 NVIDIA K520 GPUs •  16 GB of GPU memory, 6144 CUDA cores •  p2 (xlarge, 8xlarge, 16xlarge) •  Launched in 09/16 •  64 vCPUs, 732 GB RAM •  16 NVIDIA GK210 GPUs •  192 GB of GPU memory, 39936 CUDA cores •  20 Gbit/s networking https://aws.amazon.com/blogs/aws/new-g2-instance-type-with-4x-more-gpu-power/ https://aws.amazon.com/blogs/aws/new-p2-instance-type-for-amazon-ec2-up-to-16-gpus/
  • 8. Nvidia CUDA architecture •  CUDA is a parallel computing platform and application programming interface model created by Nvidia •  CUDA toolkit to build applications (compiler, etc.) •  CUDA drivers to manage GPUs http://www.nvidia.com/object/cuda_home_new.html https://en.wikipedia.org/wiki/CUDA
  • 9. Deploying CUDA apps with Docker •  Docker containers are hardware-agnostic and platform-agnostic •  Installing Nvidia drivers inside containers breaks this model •  nvidia-docker solves this issue by using host drivers https://www.docker.com/what-docker https://github.com/NVIDIA/nvidia-docker
  • 10. Building DSSTNE DSSTNE has quite a few dependencies " (see https://github.com/amznlabs/amazon-dsstne) 1.  Install them manually 2.  Use the DSSTNE AMI (CUDA 7.0) 3.  Launch a container installing all dependencies in its Docker file (CUDA 8.0, the latest version)" https://github.com/tristanpenman/docker-dsstne Full details here: https://github.com/juliensimon/aws/blob/master/dsstne/ demo-docker.txt
  • 11. Demo: Amazon Destiny http://grouplens.org/datasets/movielens/ 27,000 movies 138,000 users 20 million movie recommendations (Matrix is 99.5% sparse) Alice Bob Charlie David Ernest Star Wars 1 1 1 Lord of the Rings 1 1 1 Incep>on 1 Bambi 1 1 PreAy Woman 1 1 Alice Bob Charlie David Ernest Star Wars 0.23 1 0.12 1 1 Lord of the Rings 1 0.34 1 0.89 1 Incep>on 0.8 1 0.43 0.76 0.45 Bambi 1 0.42 0.5 1 0.34 PreAy Woman 1 0.09 0.67 0.04 1 à Train a neural network Input layer: 27,000 neurons" (1 for each movie) 1 hidden layer: 128 neurons Output layer: 27,000 neurons" (1 for each movie) à  Recommend 10 movies per user
  • 12. Training & predicting # Fetch the Movie Lens dataset and the neural network config file wget https://s3.amazonaws.com/amazon-dsstne-data/movielens/ml20m-all wget https://s3.amazonaws.com/amazon-dsstne-data/movielens/config.json # Format the input and output data for training generateNetCDF -d gl_input -i ml20m-all -o gl_input.nc -f features_input -s samples_input -c generateNetCDF -d gl_output -i ml20m-all -o gl_output.nc -f features_output -s samples_input -c # Train the model on 8 GPUs mpirun -np 8 train -c config.json -i gl_input.nc -o gl_output.nc -n gl.nc -b 256 -e 10 # Predict the results predict -b 1024 -d gl -i features_input -o features_output -k 10 -n gl.nc -f ml20m-all -s recs -r ml20m-all http://www.unidata.ucar.edu/software/netcdf https://www.open-mpi.org
  • 13. Amazon Destiny vs Google TensorFlow “DSSTNE on a single virtualized K520 GPU (released in 2012) is faster than TensorFlow on a bare metal Tesla M40 (released in 2015)” “TensorFlow does not provide the automagic model parallelism provided by DSSTNE” https://medium.com/@scottlegrand/first-dsstne-benchmarks-tldr-almost-15x-faster-than- tensorflow-393dbeb80c0f
  • 14. Training Amazon Destiny on multiple GPUs Movie Lens 20M, g2.8xlarge vs p2.16xlarge mpirun –np <n> train -c config.json -i gl_input.nc -o gl_output.nc -n gl.nc -b 256 -e 10
  • 15. AWS Deep Learning AMI • Deep Learning Frameworks – 5 popular Deep Learning Frameworks (MXNet, Caffe, Tensorflow, Theano, and Torch) all prebuilt and pre-installed • Pre-installed components – Nvidia drivers, cuDNN, Anaconda, Python2 and Python3 • AWS Integration – Packages and configurations that provide tight integration with Amazon Web Services like Amazon EFS (Elastic File System)
  • 16. Resources Big Data Architectural Patterns and Best Practices on AWS https://www.youtube.com/watch?v=K7o5OlRLtvU Real-World Smart Applications With Amazon Machine Learning https://www.youtube.com/watch?v=sHJx1KJf8p0 Deep Learning: Going Beyond Machine Learning" https://www.youtube.com/watch?v=Ra6m70d3t0o DSSTNE: A new Deep Learning Framework For Large Sparse Datasets https://www.youtube.com/watch?v=LbYR6Mzq6FE AWS Big Data blog: https://blogs.aws.amazon.com/bigdata/
  • 17. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thank you! Julien Simon Principal Technical Evangelist Amazon Web Services julsimon@amazon.fr @julsimon