SlideShare una empresa de Scribd logo
1 de 24
Descargar para leer sin conexión
©2017,	Amazon	Web	Services,	Inc.	or	its	affiliates.	All	rights	reserved
Deep Learning with Apache MXNet
Julien Simon, Principal Technical Evangelist
@julsimon
Selected customers running AI on AWS
Questions
• What’s the business problem my IT has failed to solve?
– That’s probably where Deep Learning can help
• Should I design and train my own Deep Learning model?
– Do I have the expertise?
– Do I have enough time, data & compute to train it?
• Should I use a pre-trained model?
– How well does it fit my use case?
– On what data was it trained? How close is this to my own data?
• Should I use a SaaS solution?
Amazon AI for every developer
More information at aws.amazon.com/ai
Neural Networks
Frank Rosenblatt
Artificial Intelligence pioneer
1957 – The Perceptron
Appropriate weights are applied to the inputs, and
the resulting weighted sum is passed to a function
that produces the output.
The neuron
x = [x1, x2, …. xI ]
w = [w1, w2, …. wI ]
u = x.w
Activation function
The neural network
x =
x11, x12, …. x1I
x21, x22, …. x2I
x…, x.., …. x.I
xm1, xm2, …. xmI
I features
m samples
y =
y1
y2
y…
ym
m labels
Total number of predictions
Accuracy =
Number of correct predictions
The training process
• The difference between the predicted output and the actual output (aka
ground truth) is called the prediction loss.
• There are different ways to compute it (loss functions).
• The purpose of training is to iteratively minimize loss and maximize
accuracy for a given data set.
• We need a way to adjust weights (aka parameters) in order to gradually
minimize loss
à Backpropagation + optimization algorithm
Paul Werbos
Artificial Intelligence pioneer
IEEE Neural Network Pioneer Award
1974 – Backpropagation
The back-propagation algorithm acts as an error
correcting mechanism at each neuron level, thereby
helping the network to learn effectively.
Stochastic Gradient Descent (SGD)
Imagine you stand on top of a mountain with skis
strapped to your feet. You want to get down to the
valley as quickly as possible, but there is fog and
you can only see your immediate surroundings.
How can you get down the mountain as quickly as
possible? You look around and identify the steepest
path down, go down that path for a bit, again look
around and find the new steepest path, go down
that path, and repeat—this is exactly what gradient
descent does.
Tim Dettmers
University of Lugano
2015
Source: https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/
The « step size » is called
the learning rate
There is such a thing as « learning too well »
• If a network is large enough and given enough time, it will perfectly
learn a data set (universal approximation theorem).
• But what about new samples? Can it also predict them correctly?
• Does the network generalize well or not?
• To prevent overfitting, we need to know when to stop training.
• The training data set is not enough.
Data sets
Full data set
Training data set
Validation data set
Test data set
Training set: this data set is used to
adjust the weights on the neural
network.
Validation set: this data set is used to
minimize overfitting. You're not
adjusting the weights of the network
with this data set, you're just verifying
that any increase in accuracy over the
training data set actually yields an
increase in accuracy over a data set
that has not been shown to the
network before.
Testing set: this data set is used only
for testing the final weights in order to
benchmark the actual predictive power
of the network.
Training
Training data set Training
Trained
neural network
Number of epochs
Batch size
Learning rate
Hyper parameters
Validation
Validation Data set Trained
neural network
Validation
accuracy
Prediction at
the end of
each epoch
Stop training when validation accuracy stops increasing
Saving parameters at the end of each epoch is a good idea ;)
Convolutional Neural Networks
Le Cun, 1998: handwritten digit recognition, 32x32 pixels
Feature extraction and downsampling allow smaller networks
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
Apache MXNet
Apache MXNet
Programmable Portable High Performance
Near linear scaling
across hundreds of GPUs
Highly efficient
models for mobile
and IoT
Simple syntax,
multiple languages
Most Open Best On AWS
Optimized for
deep learning on AWS
Accepted into the
Apache Incubator
import numpy as np
a = np.ones(10)
b = np.ones(10) * 2
c = b * a
d = c + 1
• Straightforward and flexible.
• Take advantage of language native
features (loop, condition,
debugger).
• E.g. Numpy, Matlab, Torch, …
• Hard to optimize
PROS
CONS
Easy to tweak
in Python
Imperative Programming
• More chances for optimization
• Cross different languages
• E.g. TensorFlow, Theano, Caffe
• Less flexible
PROS
CONS
C can share memory with D
because C is deleted later
A = Variable('A')
B = Variable('B')
C = B * A
D = C + 1
f = compile(D)
d = f(A=np.ones(10),
B=np.ones(10)*2)
A B
1
+
x
Declarative Programming
0
4
8
12
16
1 2 4 8 16
Ideal
Inception v3
Resnet
Alexnet
91%
Efficiency
Multi-GPU Scaling With MXNet
Input Output
1 1 1
1 0 1
0 0 0
3
mx.sym.Convolution(data, kernel=(5,5), num_filter=20)
mx.sym.Pooling(data, pool_type="max", kernel=(2,2),
stride=(2,2)
lstm.lstm_unroll(num_lstm_layer, seq_len, len, num_hidden, num_embed)
4 2
2 0
4=Max
1
3
...
4
0.2
-0.1
...
0.7
mx.sym.FullyConnected(data, num_hidden=128)
2
mx.symbol.Embedding(data, input_dim, output_dim = k)
0.2
-0.1
...
0.7
Queen
4 2
2 0
2=Avg
Input Weights
cos(w, queen) = cos(w, king) - cos(w, man) + cos(w, woman)
mx.sym.Activation(data, act_type="xxxx")
"relu"
"tanh"
"sigmoid"
"softrelu"
Neural Art
Face Search
Image Segmentation
Image Caption
“People Riding
Bikes”
Bicycle, People,
Road, Sport
Image Labels
Image
Video
Speech
Text
“People Riding
Bikes”
Machine Translation
“Οι άνθρωποι
ιππασίας ποδήλατα”
Events
mx.model.FeedForward model.fit
mx.sym.SoftmaxOutput
Let’s code
1. Our first network
2. Using pre-trained models
3. Learning from scratch (MNIST, MLP, CNN)
4. Learning and fine-tuning (CIFAR-10, Resnext-101)
5. Using Apache MXNet as a Keras backend
6. Fine-tuning with Keras/MXNet (CIFAR-10, Resnet-50)
7. A quick word about Sockeye
http://mxnet.io
http://medium.com/@julsimon
https://github.com/juliensimon/aws/tree/master/mxnet
Thank you!
http://aws.amazon.com/evangelists/julien-simon
@julsimon

Más contenido relacionado

La actualidad más candente

Docker clusters on AWS with Amazon ECS and Kubernetes
Docker clusters on AWS with Amazon ECS and KubernetesDocker clusters on AWS with Amazon ECS and Kubernetes
Docker clusters on AWS with Amazon ECS and KubernetesJulien SIMON
 
Managing Container Images with Amazon ECR - AWS Online Tech Talks
Managing Container Images with Amazon ECR - AWS Online Tech TalksManaging Container Images with Amazon ECR - AWS Online Tech Talks
Managing Container Images with Amazon ECR - AWS Online Tech TalksAmazon Web Services
 
Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017Amazon Web Services
 
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container DayECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container DayAmazon Web Services Korea
 
Moving to Amazon ECS – the Not-So-Obvious Benefits - CON356 - re:Invent 2017
Moving to Amazon ECS – the Not-So-Obvious Benefits - CON356 - re:Invent 2017Moving to Amazon ECS – the Not-So-Obvious Benefits - CON356 - re:Invent 2017
Moving to Amazon ECS – the Not-So-Obvious Benefits - CON356 - re:Invent 2017Amazon Web Services
 
AWS Compute Evolved Week: Running Kubernetes on AWS
AWS Compute Evolved Week: Running Kubernetes on AWSAWS Compute Evolved Week: Running Kubernetes on AWS
AWS Compute Evolved Week: Running Kubernetes on AWSAmazon Web Services
 
Build A Website on AWS for Your First 10 Million Users
Build A Website on AWS for Your First 10 Million UsersBuild A Website on AWS for Your First 10 Million Users
Build A Website on AWS for Your First 10 Million UsersAmazon Web Services
 
AWS 고객사를 위한 ‘AWS 컨테이너 교육’ - 유재석, AWS 솔루션즈 아키텍트
AWS 고객사를 위한 ‘AWS 컨테이너 교육’ - 유재석, AWS 솔루션즈 아키텍트AWS 고객사를 위한 ‘AWS 컨테이너 교육’ - 유재석, AWS 솔루션즈 아키텍트
AWS 고객사를 위한 ‘AWS 컨테이너 교육’ - 유재석, AWS 솔루션즈 아키텍트Amazon Web Services Korea
 
CON307_Building Effective Container Images
CON307_Building Effective Container ImagesCON307_Building Effective Container Images
CON307_Building Effective Container ImagesAmazon Web Services
 
Amazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon Web Services
 
Workshop; Deploy a Deep Learning Framework on Amazon ECS and Spot Instances
Workshop; Deploy a Deep Learning Framework on Amazon ECS and Spot InstancesWorkshop; Deploy a Deep Learning Framework on Amazon ECS and Spot Instances
Workshop; Deploy a Deep Learning Framework on Amazon ECS and Spot InstancesAmazon Web Services
 
Running Lean and Mean: Designing Cost-efficient Architectures on AWS (ARC313)...
Running Lean and Mean: Designing Cost-efficient Architectures on AWS (ARC313)...Running Lean and Mean: Designing Cost-efficient Architectures on AWS (ARC313)...
Running Lean and Mean: Designing Cost-efficient Architectures on AWS (ARC313)...Amazon Web Services
 
DEV323_Introduction to the AWS CLI
DEV323_Introduction to the AWS CLIDEV323_Introduction to the AWS CLI
DEV323_Introduction to the AWS CLIAmazon Web Services
 
CON213_Hands-on Kubernetes on AWS
CON213_Hands-on Kubernetes on AWSCON213_Hands-on Kubernetes on AWS
CON213_Hands-on Kubernetes on AWSAmazon Web Services
 
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017Amazon Web Services
 
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017Amazon Web Services
 
Deep Dive into Amazon ECS & Fargate
Deep Dive into Amazon ECS & FargateDeep Dive into Amazon ECS & Fargate
Deep Dive into Amazon ECS & FargateAmazon Web Services
 

La actualidad más candente (20)

Kubernetes on AWS
Kubernetes on AWSKubernetes on AWS
Kubernetes on AWS
 
Development Workflows on AWS
Development Workflows on AWSDevelopment Workflows on AWS
Development Workflows on AWS
 
Docker clusters on AWS with Amazon ECS and Kubernetes
Docker clusters on AWS with Amazon ECS and KubernetesDocker clusters on AWS with Amazon ECS and Kubernetes
Docker clusters on AWS with Amazon ECS and Kubernetes
 
Managing Container Images with Amazon ECR - AWS Online Tech Talks
Managing Container Images with Amazon ECR - AWS Online Tech TalksManaging Container Images with Amazon ECR - AWS Online Tech Talks
Managing Container Images with Amazon ECR - AWS Online Tech Talks
 
Serverless - State Of the Union
Serverless - State Of the UnionServerless - State Of the Union
Serverless - State Of the Union
 
Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017Batch Processing with Containers on AWS - CON304 - re:Invent 2017
Batch Processing with Containers on AWS - CON304 - re:Invent 2017
 
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container DayECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
ECS & ECR Deep Dive - 김기완 솔루션즈 아키텍트 :: AWS Container Day
 
Moving to Amazon ECS – the Not-So-Obvious Benefits - CON356 - re:Invent 2017
Moving to Amazon ECS – the Not-So-Obvious Benefits - CON356 - re:Invent 2017Moving to Amazon ECS – the Not-So-Obvious Benefits - CON356 - re:Invent 2017
Moving to Amazon ECS – the Not-So-Obvious Benefits - CON356 - re:Invent 2017
 
AWS Compute Evolved Week: Running Kubernetes on AWS
AWS Compute Evolved Week: Running Kubernetes on AWSAWS Compute Evolved Week: Running Kubernetes on AWS
AWS Compute Evolved Week: Running Kubernetes on AWS
 
Build A Website on AWS for Your First 10 Million Users
Build A Website on AWS for Your First 10 Million UsersBuild A Website on AWS for Your First 10 Million Users
Build A Website on AWS for Your First 10 Million Users
 
AWS 고객사를 위한 ‘AWS 컨테이너 교육’ - 유재석, AWS 솔루션즈 아키텍트
AWS 고객사를 위한 ‘AWS 컨테이너 교육’ - 유재석, AWS 솔루션즈 아키텍트AWS 고객사를 위한 ‘AWS 컨테이너 교육’ - 유재석, AWS 솔루션즈 아키텍트
AWS 고객사를 위한 ‘AWS 컨테이너 교육’ - 유재석, AWS 솔루션즈 아키텍트
 
CON307_Building Effective Container Images
CON307_Building Effective Container ImagesCON307_Building Effective Container Images
CON307_Building Effective Container Images
 
Amazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for Kubernetes
 
Workshop; Deploy a Deep Learning Framework on Amazon ECS and Spot Instances
Workshop; Deploy a Deep Learning Framework on Amazon ECS and Spot InstancesWorkshop; Deploy a Deep Learning Framework on Amazon ECS and Spot Instances
Workshop; Deploy a Deep Learning Framework on Amazon ECS and Spot Instances
 
Running Lean and Mean: Designing Cost-efficient Architectures on AWS (ARC313)...
Running Lean and Mean: Designing Cost-efficient Architectures on AWS (ARC313)...Running Lean and Mean: Designing Cost-efficient Architectures on AWS (ARC313)...
Running Lean and Mean: Designing Cost-efficient Architectures on AWS (ARC313)...
 
DEV323_Introduction to the AWS CLI
DEV323_Introduction to the AWS CLIDEV323_Introduction to the AWS CLI
DEV323_Introduction to the AWS CLI
 
CON213_Hands-on Kubernetes on AWS
CON213_Hands-on Kubernetes on AWSCON213_Hands-on Kubernetes on AWS
CON213_Hands-on Kubernetes on AWS
 
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017
Build a Java Spring Application on Amazon ECS - CON332 - re:Invent 2017
 
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017
Interstella 8888: CICD for Containers on AWS - CON319 - re:Invent 2017
 
Deep Dive into Amazon ECS & Fargate
Deep Dive into Amazon ECS & FargateDeep Dive into Amazon ECS & Fargate
Deep Dive into Amazon ECS & Fargate
 

Similar a Deep Learning for Developers (Advanced Workshop)

Deep Learning with Apache MXNet (September 2017)
Deep Learning with Apache MXNet (September 2017)Deep Learning with Apache MXNet (September 2017)
Deep Learning with Apache MXNet (September 2017)Julien SIMON
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Julien SIMON
 
Deep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformDeep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformShivaji Dutta
 
An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)Julien SIMON
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018Apache MXNet
 
Distributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNetDistributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNetAmazon Web Services
 
Machine Learning with Spark
Machine Learning with SparkMachine Learning with Spark
Machine Learning with Sparkelephantscale
 
Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural NetworksDatabricks
 
Scalable Deep Learning on AWS with Apache MXNet
Scalable Deep Learning on AWS with Apache MXNetScalable Deep Learning on AWS with Apache MXNet
Scalable Deep Learning on AWS with Apache MXNetJulien SIMON
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learningdoppenhe
 
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)Amazon Web Services
 
Deep Learning for Developers (December 2017)
Deep Learning for Developers (December 2017)Deep Learning for Developers (December 2017)
Deep Learning for Developers (December 2017)Julien SIMON
 
Deep Learning with Apache MXNet
Deep Learning with Apache MXNetDeep Learning with Apache MXNet
Deep Learning with Apache MXNetJulien SIMON
 
Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Julien SIMON
 
Scalable Deep Learning on AWS using Apache MXNet (May 2017)
Scalable Deep Learning on AWS using Apache MXNet (May 2017)Scalable Deep Learning on AWS using Apache MXNet (May 2017)
Scalable Deep Learning on AWS using Apache MXNet (May 2017)Julien SIMON
 
Synthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningSynthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningS N
 

Similar a Deep Learning for Developers (Advanced Workshop) (20)

Deep Learning with Apache MXNet (September 2017)
Deep Learning with Apache MXNet (September 2017)Deep Learning with Apache MXNet (September 2017)
Deep Learning with Apache MXNet (September 2017)
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)
 
Deep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformDeep Learning on Qubole Data Platform
Deep Learning on Qubole Data Platform
 
An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
 
Distributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNetDistributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNet
 
Machine Learning with Spark
Machine Learning with SparkMachine Learning with Spark
Machine Learning with Spark
 
Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018Apache MXNet ODSC West 2018
Apache MXNet ODSC West 2018
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Scalable Deep Learning on AWS with Apache MXNet
Scalable Deep Learning on AWS with Apache MXNetScalable Deep Learning on AWS with Apache MXNet
Scalable Deep Learning on AWS with Apache MXNet
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
AWS re:Invent 2016: Bringing Deep Learning to the Cloud with Amazon EC2 (CMP314)
 
Deep Learning for Developers (December 2017)
Deep Learning for Developers (December 2017)Deep Learning for Developers (December 2017)
Deep Learning for Developers (December 2017)
 
Deep Learning with Apache MXNet
Deep Learning with Apache MXNetDeep Learning with Apache MXNet
Deep Learning with Apache MXNet
 
Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)
 
AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
MXNet Workshop
MXNet WorkshopMXNet Workshop
MXNet Workshop
 
Scalable Deep Learning on AWS using Apache MXNet (May 2017)
Scalable Deep Learning on AWS using Apache MXNet (May 2017)Scalable Deep Learning on AWS using Apache MXNet (May 2017)
Scalable Deep Learning on AWS using Apache MXNet (May 2017)
 
Deep Learning Demystified
Deep Learning DemystifiedDeep Learning Demystified
Deep Learning Demystified
 
Synthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningSynthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep Learning
 

Más de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Más de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Deep Learning for Developers (Advanced Workshop)

  • 1. ©2017, Amazon Web Services, Inc. or its affiliates. All rights reserved Deep Learning with Apache MXNet Julien Simon, Principal Technical Evangelist @julsimon
  • 3. Questions • What’s the business problem my IT has failed to solve? – That’s probably where Deep Learning can help • Should I design and train my own Deep Learning model? – Do I have the expertise? – Do I have enough time, data & compute to train it? • Should I use a pre-trained model? – How well does it fit my use case? – On what data was it trained? How close is this to my own data? • Should I use a SaaS solution?
  • 4. Amazon AI for every developer More information at aws.amazon.com/ai
  • 6. Frank Rosenblatt Artificial Intelligence pioneer 1957 – The Perceptron Appropriate weights are applied to the inputs, and the resulting weighted sum is passed to a function that produces the output.
  • 7. The neuron x = [x1, x2, …. xI ] w = [w1, w2, …. wI ] u = x.w Activation function
  • 8. The neural network x = x11, x12, …. x1I x21, x22, …. x2I x…, x.., …. x.I xm1, xm2, …. xmI I features m samples y = y1 y2 y… ym m labels Total number of predictions Accuracy = Number of correct predictions
  • 9. The training process • The difference between the predicted output and the actual output (aka ground truth) is called the prediction loss. • There are different ways to compute it (loss functions). • The purpose of training is to iteratively minimize loss and maximize accuracy for a given data set. • We need a way to adjust weights (aka parameters) in order to gradually minimize loss à Backpropagation + optimization algorithm
  • 10. Paul Werbos Artificial Intelligence pioneer IEEE Neural Network Pioneer Award 1974 – Backpropagation The back-propagation algorithm acts as an error correcting mechanism at each neuron level, thereby helping the network to learn effectively.
  • 11. Stochastic Gradient Descent (SGD) Imagine you stand on top of a mountain with skis strapped to your feet. You want to get down to the valley as quickly as possible, but there is fog and you can only see your immediate surroundings. How can you get down the mountain as quickly as possible? You look around and identify the steepest path down, go down that path for a bit, again look around and find the new steepest path, go down that path, and repeat—this is exactly what gradient descent does. Tim Dettmers University of Lugano 2015 Source: https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ The « step size » is called the learning rate
  • 12. There is such a thing as « learning too well » • If a network is large enough and given enough time, it will perfectly learn a data set (universal approximation theorem). • But what about new samples? Can it also predict them correctly? • Does the network generalize well or not? • To prevent overfitting, we need to know when to stop training. • The training data set is not enough.
  • 13. Data sets Full data set Training data set Validation data set Test data set Training set: this data set is used to adjust the weights on the neural network. Validation set: this data set is used to minimize overfitting. You're not adjusting the weights of the network with this data set, you're just verifying that any increase in accuracy over the training data set actually yields an increase in accuracy over a data set that has not been shown to the network before. Testing set: this data set is used only for testing the final weights in order to benchmark the actual predictive power of the network.
  • 14. Training Training data set Training Trained neural network Number of epochs Batch size Learning rate Hyper parameters
  • 15. Validation Validation Data set Trained neural network Validation accuracy Prediction at the end of each epoch Stop training when validation accuracy stops increasing Saving parameters at the end of each epoch is a good idea ;)
  • 16. Convolutional Neural Networks Le Cun, 1998: handwritten digit recognition, 32x32 pixels Feature extraction and downsampling allow smaller networks https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
  • 18. Apache MXNet Programmable Portable High Performance Near linear scaling across hundreds of GPUs Highly efficient models for mobile and IoT Simple syntax, multiple languages Most Open Best On AWS Optimized for deep learning on AWS Accepted into the Apache Incubator
  • 19. import numpy as np a = np.ones(10) b = np.ones(10) * 2 c = b * a d = c + 1 • Straightforward and flexible. • Take advantage of language native features (loop, condition, debugger). • E.g. Numpy, Matlab, Torch, … • Hard to optimize PROS CONS Easy to tweak in Python Imperative Programming
  • 20. • More chances for optimization • Cross different languages • E.g. TensorFlow, Theano, Caffe • Less flexible PROS CONS C can share memory with D because C is deleted later A = Variable('A') B = Variable('B') C = B * A D = C + 1 f = compile(D) d = f(A=np.ones(10), B=np.ones(10)*2) A B 1 + x Declarative Programming
  • 21. 0 4 8 12 16 1 2 4 8 16 Ideal Inception v3 Resnet Alexnet 91% Efficiency Multi-GPU Scaling With MXNet
  • 22. Input Output 1 1 1 1 0 1 0 0 0 3 mx.sym.Convolution(data, kernel=(5,5), num_filter=20) mx.sym.Pooling(data, pool_type="max", kernel=(2,2), stride=(2,2) lstm.lstm_unroll(num_lstm_layer, seq_len, len, num_hidden, num_embed) 4 2 2 0 4=Max 1 3 ... 4 0.2 -0.1 ... 0.7 mx.sym.FullyConnected(data, num_hidden=128) 2 mx.symbol.Embedding(data, input_dim, output_dim = k) 0.2 -0.1 ... 0.7 Queen 4 2 2 0 2=Avg Input Weights cos(w, queen) = cos(w, king) - cos(w, man) + cos(w, woman) mx.sym.Activation(data, act_type="xxxx") "relu" "tanh" "sigmoid" "softrelu" Neural Art Face Search Image Segmentation Image Caption “People Riding Bikes” Bicycle, People, Road, Sport Image Labels Image Video Speech Text “People Riding Bikes” Machine Translation “Οι άνθρωποι ιππασίας ποδήλατα” Events mx.model.FeedForward model.fit mx.sym.SoftmaxOutput
  • 23. Let’s code 1. Our first network 2. Using pre-trained models 3. Learning from scratch (MNIST, MLP, CNN) 4. Learning and fine-tuning (CIFAR-10, Resnext-101) 5. Using Apache MXNet as a Keras backend 6. Fine-tuning with Keras/MXNet (CIFAR-10, Resnet-50) 7. A quick word about Sockeye http://mxnet.io http://medium.com/@julsimon https://github.com/juliensimon/aws/tree/master/mxnet