Making Computer Vision Real
Dr Ramine Tinati
Sr. AI/ML Specialist
Agenda
Introduction to Computer Vision
Advancements and Performance Testing
ML @ AWS
Use Case: Car Insurance Claims
Background + State-of-the-Art
Computer Vision
Computer Vision
The Goal: Computer-Driven Image Recognition
A Quick Intro: Artificial Neural Network
A Neuron
Takes the inputs and multiplies them by their weights
Sums them up
Applies the activation function (tanh, sigmoid, ReLU) to the sum
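A minimal sketch of that computation in NumPy (the function and variable names are illustrative, not from the slides):

```python
# A single artificial neuron: weighted sum of inputs plus bias, then activation.
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def neuron(inputs, weights, bias, activation=relu):
    # Multiply inputs by their weights, sum them up, apply the activation.
    z = np.dot(inputs, weights) + bias
    return activation(z)

x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.6])
print(neuron(x, w, bias=0.2))
```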
A Neural Network
Initialise the weights of the neurons
Forward propagation: each neuron takes its inputs and calculates its output, and the network produces the final output
Back propagation: readjusts the weights by calculating the error using a chosen cost function (e.g. sum of squared errors)
The aim is to minimize the cost
Image from: https://medium.com/datathings/neural-networks-and-
backpropagation-explained-in-a-simple-way-f540a3611f5e
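A toy end-to-end sketch (assuming NumPy; the network size, data, and learning rate are made up) showing initialization, forward propagation, a sum-of-squared-errors cost, and backpropagation:

```python
# Train a tiny 2-4-1 sigmoid network on XOR with full-batch gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Initialise the weights of the neurons
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
lr = 0.5

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward propagation: each layer computes its output
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    cost = 0.5 * np.sum((out - y) ** 2)          # sum of squared errors

    # Back propagation: readjust the weights using the cost gradient
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(3))  # should approach [0, 1, 1, 0]
```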
Cost Functions + Optimizers
Similar to traditional machine learning models, cost functions tell us 'how good' our model is at making predictions for a given set of parameters.
Different cost functions measure different types of error and are used for different predictive tasks (e.g. regression vs classification).
Optimizers are algorithms used to change the parameters of the neural network (e.g. learning rate, weights) in order to help reduce the overall calculated cost.
Plain gradient descent is not favourable in neural networks due to the complexity of the networks (updating weights, convergence, memory); instead we choose another optimizer suited to that complexity (e.g. Adam).
Image from: https://medium.com/datathings/neural-networks-and-
backpropagation-explained-in-a-simple-way-f540a3611f5e
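A sketch contrasting a plain gradient-descent update with an Adam update (assuming NumPy; the hyperparameter defaults follow the Adam paper, and `grad` stands in for a gradient from backpropagation):

```python
import numpy as np

def gd_step(w, grad, lr=0.01):
    # Plain gradient descent: step against the gradient at a fixed rate.
    return w - lr * grad

class Adam:
    def __init__(self, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.m = self.v = 0.0   # running 1st and 2nd moments of the gradient
        self.t = 0

    def step(self, w, grad):
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad       # 1st moment
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2  # 2nd moment
        m_hat = self.m / (1 - self.b1 ** self.t)               # bias correction
        v_hat = self.v / (1 - self.b2 ** self.t)
        return w - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
```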
Convolutional Neural Networks
The concept of CNNs was introduced in the early 1980s (neural networks even earlier).
The premise of a CNN is to detect an object within an image independent of its position within the frame, its rotation, or its interaction with other objects.
The basic principle of a Convolutional Neural Network (CNN) is to transform an image into a matrix of values.
Using this numerical representation, the image is processed by a series of special hidden layers which expose both the depth of the image (e.g. the RGB channels) and the orientation of the pixels, and use these to detect patterns!
Convolutional Neural Networks
We want to start with an image and produce a numerical representation of it which can be used to detect repeatable patterns.
CNN Architecture - Convolutional Layers
A convolutional layer plus a kernel forms the foundation of a CNN architecture.
In a traditional neural network, fully connected layers are used, where each node is connected to every node in the immediately previous layer.
A convolutional layer, however, is locally connected: its nodes are only connected to a small subset of the previous layer, and they share the same weights (parameter sharing).
The processes of training, backpropagation, and the forward pass are very similar to traditional neural networks.
Tuning CNNs is a little trickier, as they have several more hyperparameters:
Filter size
Stride/Padding
Pooling layers
CNN Architecture - Convolutional Layers
Start with an initial image of size h × w (3 color channels).
Create a 5×5 filter (kernel) and slide it across the image using a specified stride size. We also need to take padding into consideration!
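A quick sketch of the standard output-size arithmetic for a convolutional layer (the function name is my own):

```python
# Output size of a convolution for a given filter, stride, and padding.
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    return (input_size - kernel_size + 2 * padding) // stride + 1

# A 224x224 image with a 5x5 filter, stride 1, and padding 2 keeps its size:
print(conv_output_size(224, kernel_size=5, stride=1, padding=2))  # 224
```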
CNN Architecture - Pooling Layers
The purpose of a pooling layer (or down-sampling layer) is to reduce the size of the layer, thus reducing the number of trainable parameters.
There are several parameters to set
when adding a pooling layer:
- Type (MaxPooling, AvgPooling, GlobalMax, GlobalAverage)
- Pooling size
- Stride
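A minimal max-pooling sketch in NumPy (the 2×2 window with stride 2 is an illustrative choice, and the input is assumed to divide evenly):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    # Slide a window over the input and keep the maximum of each window.
    h, w = x.shape
    out = np.empty((h // stride, w // stride))
    for i in range(0, h - size + 1, stride):
        for j in range(0, w - size + 1, stride):
            out[i // stride, j // stride] = x[i:i + size, j:j + size].max()
    return out

x = np.arange(16.0).reshape(4, 4)
print(max_pool(x))  # 4x4 -> 2x2, keeping the max of each 2x2 window
```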
CNN Architecture - Fully Connected Layer
The final layer in the CNN architecture is a fully connected layer.
This takes ALL the outputs of the previous layer (which may be a pooling or convolutional layer) and outputs an N-dimensional vector, where N represents the number of classes.
For a multi-class classification problem, a softmax activation function is used in the final layer, whereas for a binary classification problem, a sigmoid activation function is used.
When defining the network, depending on the problem and dataset, an appropriate loss function will need to be chosen, e.g. categorical cross-entropy.
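Putting the pieces together, a minimal Keras sketch (assuming TensorFlow 2.x; the layer sizes and input shape are illustrative) of a conv → pool → fully connected architecture with a softmax output trained on categorical cross-entropy:

```python
import tensorflow as tf

num_classes = 10
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (5, 5), activation="relu", padding="same",
                           input_shape=(224, 224, 3)),   # convolutional layer
    tf.keras.layers.MaxPooling2D((2, 2)),                # pooling layer
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),  # N classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

For a binary problem, the final layer would instead be `Dense(1, activation="sigmoid")` with a binary cross-entropy loss.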
CNN Architecture - Activation Function
After a convolutional layer, an activation function is applied; typically the ReLU function is used (but not for the final layer*).
The purpose of the activation function is to introduce non-linearity into the process.
The ReLU activation function is favored over others such as tanh/sigmoid as it reduces training time and mitigates the vanishing gradient problem (improving gradient descent), but it does have its own problems, so use it wisely!
CNN Architecture - SoftMax Activation Function
Given a sample input vector $\mathbf{x}$ and weight vectors $\{\mathbf{w}_j\}$, the predicted probability of $y = j$ is

$P(y = j \mid \mathbf{x}) = \dfrac{e^{\mathbf{x}^\top \mathbf{w}_j}}{\sum_{k=1}^{K} e^{\mathbf{x}^\top \mathbf{w}_k}}$

A type of activation layer, usually applied to the outputs of the final FC layer
Can be viewed as a normalizer (a.k.a. the normalized exponential function)
Produces a discrete probability distribution vector
Very convenient when combined with cross-entropy loss
In practice, when building multi-class classifiers, this is used as the last output layer.
(Other final layers do exist, e.g. SVM or regression layers.)
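A minimal sketch of the normalized exponential in NumPy (subtracting the max is a standard numerical-stability trick, not something from the slides):

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()   # stability: exp of large logits would overflow
    e = np.exp(z)
    return e / e.sum()          # normalize into a probability distribution

print(softmax(np.array([2.0, 1.0, 0.1])))  # a discrete probability vector
```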
CNN Architecture - Loss // Regularization Layers
Loss functions:
- L1, L2 loss
- Cross-entropy loss works well for classification
- Huber loss is more resilient to outliers, with a smooth gradient
- Mean squared error works well for regression tasks
Regularization layers:
- Dropout
- Batch norm
- Gradient clipping
- Max norm constraint
CNN Architecture - Dropout Layer
Whilst not always required, the dropout layer helps reduce the common problem of overfitting and improves generalization.
The dropout layer simply 'drops' a random set of activations in the preceding layer by setting their values to 0.
The aim is to force the network to produce the correct classification even though some of the network is 'deactivated', thus reducing the chance of overfitting to the original data.
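A sketch of 'inverted' dropout in NumPy (the rate and the rescaling convention are the common ones, labeled here as assumptions):

```python
import numpy as np

def dropout(activations, rate=0.5, rng=np.random.default_rng()):
    # Randomly 'deactivate' a set of units by zeroing them, then rescale the
    # survivors so the expected activation is unchanged at inference time.
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)
```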
CNN Architecture - Batch Normalization
Networks train faster - individual iterations will be slower, but convergence will be quicker
Allows higher learning rates - larger learning rates mean faster training time (careful experimentation is required!)
Makes weights easier to initialize - less effort is required on weight initialization, though it is still recommended to draw initial weights from some distribution
Makes more activation functions viable - used with ReLU, it reduces the problems the nonlinearity can cause
Provides some regularization - this reduces the amount of dropout required in the architecture
Batch Normalization: Accelerating Deep Network Training by
Reducing Internal Covariate Shift, Ioffe and Szegedy (2015)
http://proceedings.mlr.press/v37/ioffe15.pdf
CNN Architecture
Training CNNs
Training CNNs is very similar to training any other neural network:
Perform a forward pass across all nodes
Then update the weights during the backward pass
The aim is to obtain the best weights, which can be considered the point with the lowest validation loss.
Hyperparameters play an important part in obtaining decent accuracy
Tuning HPs such as learning rate, batch size, filter size, etc. needs to reflect the training data and the task being attempted
(Deeper) architectures + hyperparameters > training speed*
*lots of (GPU) compute resources are helpful :)
Training CNNs – Data Augmentation
The size and quality of the data play an important part in the performance of a CNN. However, bigger doesn't always mean better!
One technique found to boost performance is to augment the original data sources to create a larger dataset (see the sketch below).
Additionally, there are several reference datasets which can be used to help train a model (and are extremely useful for transfer learning):
MNIST
CIFAR-10 / CIFAR-100
ImageNet
Caltech 101 / Caltech 256
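A minimal augmentation sketch in NumPy (the specific transforms are illustrative, the image is assumed square with pixel values in [0, 1]; real pipelines also shift, zoom, and adjust color):

```python
import numpy as np

def augment(image, rng=np.random.default_rng()):
    # Each transformed variant becomes an extra training example.
    out = [image]
    out.append(np.fliplr(image))                        # horizontal flip
    out.append(np.rot90(image, k=rng.integers(1, 4)))   # random 90° rotation
    noisy = image + rng.normal(0, 0.05, image.shape)    # pixel noise
    out.append(np.clip(noisy, 0.0, 1.0))
    return out
```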
Transfer Learning
1. “Forward” transfer: train on one
task, transfer to a new task
2. Multi-task transfer: train on many
tasks, transfer to a new task
3. Multi-task meta-learning: learn to
learn from many tasks
A Survey on Transfer Learning:
https://www.cse.ust.hk/~qyang/Docs/2009/tkde_transfer_learning.pdf
Transfer Learning Strategies
Transfer Learning in Practice
1. Take an existing trained model (preferably in the same domain).
2. (If manually creating a model) Fine-tune the existing model: this means training some of the layers within the architecture and freezing others.
Small dataset and lots of params - freeze more layers to avoid overfitting
Large dataset and fewer params - unfreeze more layers, as overfitting will be less of an issue
3. Depending on the type of the model and its purpose (CNN, LSTM), we can remove the last layer within the network (e.g. softmax) and replace it with an appropriate layer which matches our purpose. (A code sketch of these steps follows the table of approaches below.)
Transfer learning approaches:
Instance-Transfer - re-weight some labelled data in the source domain for use in the target domain
Feature-Representation-Transfer - find a "good" feature representation that reduces the difference between the source and target domains and the error of classification and regression models
Parameter-Transfer - discover shared parameters or priors between the source-domain and target-domain models, which can benefit transfer learning
Relational-Knowledge-Transfer - build a mapping of relational knowledge between the source and target domains. Both domains are relational, and the i.i.d. assumption is relaxed in each domain
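A sketch of steps 1-3 in code (assuming TensorFlow 2.x / Keras; the choice of ResNet50 and the target class count are illustrative):

```python
import tensorflow as tf

# 1. Take an existing trained model (here, ResNet50 trained on ImageNet).
base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      pooling="avg", input_shape=(224, 224, 3))

# 2. Freeze its layers: with a small dataset and lots of parameters,
#    freezing more layers helps avoid overfitting.
base.trainable = False

# 3. Replace the final softmax with a new head matching our own classes.
num_classes = 5   # illustrative target task
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(num_classes, activation="softmax"),  # new head
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```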
Negative Transfer
Depending on the domain and task, transfer learning may not be the appropriate method for developing a predictive model.
The concept of negative transfer describes reduced learning capacity in the target domain due to a lack of transferability between the source domain and the target task.
Rosenstein et al. [1] discussed the challenges in transfer learning and the limits/bounds of task transferability.
One approach suggested [2] to overcome negative transfer is to cluster types of tasks into groups (which share a low-dimensional representation).
[1] M. T. Rosenstein, Z. Marx, and L. P. Kaelbling, "To transfer or not to transfer," in NIPS-05 Workshop on Inductive Transfer: 10 Years Later, December 2005.
[2] B. Bakker and T. Heskes, "Task clustering and gating for Bayesian multitask learning," Journal of Machine Learning Research, vol. 4, pp. 83-99, 2003.
[Figure: transfer learning from a source domain to a target domain. Task: predict car model. Inference confidence: { 'car': 0.1, 'letter_a': 0.6 }]
Deeper Dive into the techniques used to compare models
Measuring Performance
How to measure performance of models
Understanding how to measure the performance of a model (or how research papers rank CV models) is essential for comparing how different models perform on a given dataset/challenge.
Several metrics are used to measure the performance of a model, depending on the type of model (object detection, boundary detection).
Many reference datasets exist; most research papers use these as benchmarks for comparison (ImageNet, COCO, CIFAR-10, MNIST).
The PASCAL Visual Object Classes (VOC) Challenge http://homepages.inf.ed.ac.uk/ckiw/postscript/ijcv_voc09.pdf
Key metrics: precision, recall, AUC/ROC, mAP, IoU
Model Performance – Precision + Recall
Precision measures how accurate your predictions are, i.e. the percentage of your predictions that are correct.
Recall measures how well you find all the positives. For example, we may find 80% of the possible positive cases in our top K predictions.
The F1 score is the harmonic mean of precision and recall. Note: it doesn't take true negatives into account.
Image: scikit-learn classification metrics
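For example, with scikit-learn (the labels are made up):

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of the two
```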
Model Performance – IoU (Intersection over union)
IoU is a metric which represents the overlap between two boundaries, comparing the predicted boundary region against the ground truth.
If the predicted bounding box p(x1, y1, x2, y2) were equal to the ground truth g(x1, y1, x2, y2), the IoU score would be 1.
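A sketch of the computation for axis-aligned boxes in (x1, y1, x2, y2) form:

```python
def iou(box_a, box_b):
    # Intersection rectangle (empty if the boxes are disjoint).
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)   # intersection over union

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```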
Model Performance – Average Precision (AP)
AP is a more complex metric which combines precision, recall, IoU, and some simple integrals.
It's commonly used for object-detection computer vision models, as it provides a measure of how well a model predicts classes of objects based on the ranking of prediction confidences.
Often there will be metrics such as AP50 and AP75, which represent the AP when the IoU is at least 50%, 75%, etc.
The AP summarises the shape of the precision/recall curve, and is defined as the mean precision at a set of eleven equally spaced recall levels [0, 0.1, ..., 1]:

$\mathrm{AP} = \frac{1}{11} \sum_{r \in \{0, 0.1, \ldots, 1\}} p_{\mathrm{interp}}(r)$

The precision at each recall level $r$ is interpolated by taking the maximum precision measured for a method for which the corresponding recall exceeds $r$:

$p_{\mathrm{interp}}(r) = \max_{\tilde{r} : \tilde{r} \geq r} p(\tilde{r})$

where $p(\tilde{r})$ is the measured precision at recall $\tilde{r}$.
Model Performance – Average Precision (AP)
To calculate AP, first we generate all our predictions and then rank them in descending order of confidence score.
A prediction with IoU > 0.5 counts as a correct classification; IoU is the metric behind the correctness decision.
In this example there are only 5 objects to be detected. We then calculate precision and recall row by row.
Row 4:
Precision (TP / (TP + FP)) → 2/4 = 0.5
Recall (TP / (TP + FN)) → 2/5 = 0.4
Note: as the confidence score decreases, the recall increases, but the precision fluctuates up and down.
Rank  Conf  Correct?  Precision  Recall
1     0.99  True      1.0        0.2
2     0.97  True      1.0        0.4
3     0.80  False     0.67       0.4
4     0.78  False     0.5        0.4
5     0.76  False     0.4        0.4
6     0.75  True      0.5        0.6
7     0.75  True      0.57       0.8
8     0.74  False     0.5        0.8
9     0.71  False     0.44       0.8
10    0.70  True      0.5        1.0
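A sketch of the 11-point interpolated AP for exactly this table (assuming NumPy; `correct` follows the True/False column, and there are 5 objects in total):

```python
import numpy as np

correct = np.array([1, 1, 0, 0, 0, 1, 1, 0, 0, 1])   # ranked by confidence
n_objects = 5
tp = np.cumsum(correct)
precision = tp / np.arange(1, len(correct) + 1)      # matches the table
recall = tp / n_objects

ap = 0.0
for r in np.linspace(0, 1, 11):                      # levels 0, 0.1, ..., 1
    candidates = precision[recall >= r]              # max precision at recall >= r
    ap += (candidates.max() if candidates.size else 0.0) / 11
print(round(ap, 3))
```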
Model Performance – Average Precision (AP)
The zig-zag effect can be seen more clearly using a precision-recall plot. At this point we can examine the integral for calculating the AP, which results in a single numerical value.
[Figure: precision-recall plot of the ranked predictions, with precision and recall both running from 0.0 to 1.0]
For the mathematicians: we can either smooth the curve or fit a polynomial to it for use in calculating the integral of the PR curve.
Model Performance – Average Precision (AP)
One approach to calculating the integral of the PR curve is to use the maximum precision at each 'step', which makes it less susceptible to small variations in the rankings.
The definition replacing the precision value at recall $\tilde{r}$ with the maximum precision is:

$p_{\mathrm{interp}}(r) = \max_{\tilde{r} : \tilde{r} \geq r} p(\tilde{r})$
Model Performance – COCO mAP (Mean Average Precision)
As the COCO dataset has become a gold-standard reference dataset for CV, under the COCO evaluation the AP is averaged over multiple IoU thresholds.
The mAP is the average of AP. In some contexts this means computing the AP for each class and averaging them; in other contexts, AP and mAP are the same thing.
For example, under the COCO context, there is no difference between AP and mAP.
Advancements in Computer Vision
Development of Network Architectures
Computer Vision - AlexNet
In 2012, the first CNN with an acceptable level of accuracy was published.
AlexNet was trained on the now-popular ImageNet dataset and achieved a top-5 test error of 15.4% (2012 ILSVRC); the next best entry had >25% test error.
Due to the computational complexity, the team split the processing across two GPU pipelines.
ImageNet Classification with Deep Convolutional Neural Networks:
https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Computer Vision - VGG
In 2014, the Very Deep Convolutional Network (VGG-19) was produced.
A 19-layer CNN with small (3×3) filters compared to AlexNet.
Used 3 back-to-back convolutional layers before pooling.
First to demonstrate the use of very deep layers.
Achieved a top-5 test error of 7.3%.
Very Deep Convolutional Networks for Large-Scale Image Recognition
https://arxiv.org/pdf/1409.1556.pdf
Computer Vision - GoogleNet
Google announced GoogLeNet in 2015, which tackled the problem of the huge computational cost needed to train a CNN.
The architecture introduced a method of reducing the number of features (and thus trainable parameters) using 1×1 convolutional layers and running parallel convolutions (the Inception module).
Demonstrated that stacking is not the only approach to developing CNNs.
This achieved a very reasonable top-5 test error of 6.7%.
Later revisions of the Inception model introduced batch normalization as a layer to improve performance and reduce training time.
Going Deeper with Convolutions
https://arxiv.org/abs/1409.4842
Computer Vision - ResNet
In 2015, Microsoft produced ResNet, a 152-layer network.
The basis of the Residual Network is the residual block, where the output of a conv-relu-conv cycle is added to the original input.
Each block therefore learns a small change to its input, rather than forming a completely new representation of the image.
These changes feed into the next block, which also improves training during the back-propagation stage.
This architecture achieved a top-5 test error of 3.6% (humans are usually in the range of 5-10%). A residual block is sketched in code below.
Deep Residual Learning for Image Recognition
https://arxiv.org/abs/1512.03385
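A sketch of a residual block in Keras (assuming TensorFlow 2.x; the filter count and input shape are illustrative):

```python
import tensorflow as tf

def residual_block(x, filters=64):
    shortcut = x
    y = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(y)
    y = tf.keras.layers.Add()([y, shortcut])   # add the small learned change
    return tf.keras.layers.ReLU()(y)

inputs = tf.keras.Input(shape=(32, 32, 64))
outputs = residual_block(inputs)
model = tf.keras.Model(inputs, outputs)
```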
Computer Vision – Region-CNNs
Region-based CNNs can be considered one of the recent advancements in the field of computer vision. R-CNNs aim to solve object-detection tasks.
Using the fundamentals of CNNs, regions which correspond to objects within an image can be detected, and bounding boxes can be drawn.
Search for Mask R-CNN or Fast/Faster R-CNN for more info on the applications and architectures.
YOLO (You Only Look Once)
YOLO is a network for object detection.
Compared to existing R-CNN approaches, which use a pipeline approach, YOLO uses a single NN to perform object detection (as a single regression problem).
Its speed allows for real-time processing of images → video!
The network uses a ResNet-style architecture, and various flavors exist to suit different computational needs.
The most recent YOLOv3 uses 75 convolutional layers, with no fully connected layers, no pooling, and no softmax layer.
https://arxiv.org/pdf/1506.02640.pdf
Single Shot MultiBox Detector (SSD)
SSD only requires a single shot to detect multiple objects within an image. This means only one forward pass, whereas other region-based models require multiple shots.
The single pass means SSD is great for object detection in video!
For each region, k bounding boxes are identified, with different sizes and aspect ratios. For each box, c class scores are computed along with 4 offsets relative to the original default bounding-box shape (hence "MultiBox").
The architecture is built on VGG-16 and, for smaller objects, achieves a higher level of accuracy compared to YOLO.
https://arxiv.org/pdf/1512.02325.pdf
https://cv-tricks.com/object-detection/faster-r-cnn-yolo-ssd/
Deeper Dive into the Technology
Machine Learning @ AWS
The AWS ML Stack
Broadest and most complete set of Machine Learning capabilities
AI SERVICES (vision, speech, text, search, chatbots, personalization, forecasting, fraud, development, contact centers):
Amazon Rekognition, Amazon Polly, Amazon Transcribe (+Medical), Amazon Comprehend (+Medical), Amazon Translate, Amazon Lex, Amazon Personalize, Amazon Forecast, Amazon Fraud Detector, Amazon CodeGuru, Amazon Textract, Amazon Kendra, Contact Lens for Amazon Connect
ML SERVICES (Amazon SageMaker, with the SageMaker Studio IDE):
Ground Truth, Augmented AI, ML Marketplace, Neo, built-in algorithms, Notebooks, Experiments, model training & tuning, Debugger, Autopilot, model hosting, Model Monitor
ML FRAMEWORKS & INFRASTRUCTURE:
Deep Learning AMIs & Containers, GPUs & CPUs, Elastic Inference, Inferentia, FPGA
Amazon SageMaker helps you build, train, and deploy models
Prepare: collect and prepare training data - fully managed data processing jobs and data labeling workflows; add human review of predictions
Build: choose or build an ML algorithm - one-click collaborative notebooks; built-in, high-performance algorithms and models; automatically build and train models; web-based IDE for machine learning
Train & Tune: set up and manage environments for training - one-click training; debugging and optimization; visually track and compare experiments; manage training runs
Deploy & Manage: deploy models in production - one-click deployment and autoscaling; monitor models and automatically spot concept drift; fully managed with auto-scaling for 75% less
AMAZON SAGEMAKER IS FULLY MANAGED
One-click model deployment
Auto-scaling
Python SDK
Bring your own model
Low latency and high throughput
Deploy multiple models on an endpoint
Amazon SageMaker Notebooks
Fast-start sharable notebooks (in preview)
Access your notebooks in seconds
Start your notebooks without spinning up compute resources
Dial compute resources up or down (coming soon)
Share notebooks with a single click
Administrators manage access and permissions
AMAZON SAGEMAKER IS THE BEST PLACE TO RUN TENSORFLOW
• Fully managed training and hosting
• Near-linear scaling across 100s of GPUs
• 3x faster network throughput with EC2 P3
Scaling efficiency: 65% with stock TensorFlow vs. 90% with AWS-optimized TensorFlow
AMAZON SAGEMAKER HAS BUILT-IN ALGORITHMS, OR BRING YOUR OWN
Classification: Linear Learner, XGBoost, KNN
Regression: Linear Learner, XGBoost, KNN
Computer Vision: Image Classification, Object Detection, Semantic Segmentation
Working with Text: BlazingText (supervised, unsupervised), Object2Vec
Topic Modeling: LDA, NTM
Recommendation: Factorization Machines
Forecasting: DeepAR
Anomaly Detection: Random Cut Forests, IP Insights
Sequence Translation: Seq2Seq
Clustering: KMeans
Feature Reduction: PCA
GluonCV: Deep Learning Toolkit for Computer Vision
GluonCV is an open-source deep learning toolkit for quickly building computer vision models without compromising performance.
Benefits:
• Training with SOTA results from the latest papers
• Large set of pre-trained models
• Carefully designed APIs and easy-to-understand implementations
• Community support
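A usage sketch (assuming gluoncv and mxnet are installed; the model name and image path are illustrative):

```python
from gluoncv import model_zoo, data

# Pull a pre-trained detector from the model zoo and run one forward pass.
net = model_zoo.get_model("yolo3_darknet53_coco", pretrained=True)
x, img = data.transforms.presets.yolo.load_test("street.jpg")
class_ids, scores, boxes = net(x)   # object detections for the image
```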
AWS Mask R-CNN Example
https://github.com/aws-samples/mask-rcnn-tensorflow
The primary focus was on increasing training throughput without sacrificing any accuracy. We do this by training with a batch size > 1 per GPU, using FP16 and two custom TF ops.
Dataset: COCO 2017
Pre-trained model: ResNet-50
EC2 instance type: p3dn.24xlarge
Num_GPUs x Images_Per_GPU   Training time   Box mAP   Mask mAP
8x4                         9.78h           38.25%    35.08%
16x4                        5.60h           38.44%    35.18%
32x4                        3.33h           38.33%    35.12%
AWS MARKETPLACE
You can shop for algorithms, models, and data in AWS Marketplace:
Browse or search AWS Marketplace → Subscribe in a single click → Available in Amazon SageMaker
HUNDREDS OF ALGORITHMS, MODELS, AND DATA
Categories include: natural language processing, text-to-speech, object detection, speech recognition, grammar and parsing, text generation, speaker identification, regression, text OCR, text classification, text clustering, computer vision, 3D images, handwriting recognition, named entity recognition, anomaly detection, ranking, video classification
SELLERS: automatic labeling via machine learning, IP protection, automated billing and metering
BUYERS: broad selection of paid, free, and open-source algorithms and models; data protection; discoverable on your AWS bill
60+ Computer Vision Models and Algorithms
Example Use Case
Using SageMaker for Detecting
False Insurance Claims Images
Detecting False Claims
Global car insurance organizations receive tens of thousands of claims per day, which require significant human resources to review, investigate, and approve.
The use of computer vision can help reduce the overhead on the claims team by providing an automated mechanism for detecting potentially false or spam insurance claims.
In this session we're going to build a custom solution which uses computer vision models to detect cars and damage on cars.
Detecting False Claims: Solution Architecture
Detecting False Claims: Inferencing
Custom Trained Model
Detecting False Claims: Using SageMaker
Amazon SageMaker is the first step in producing a custom image classification model.
At this stage, data preparation and exploration are performed to ensure the initial data used for training a model is suitable.
Model training is an iterative process, and the first model will be supported by a cleansed dataset.
Once model performance is acceptable, the model can be deployed, and the code then wrapped up for deployment on Kubernetes. A sketch of launching a training job follows.
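A sketch of launching a training job for the built-in Image Classification algorithm with the SageMaker Python SDK v2 (the role ARN, bucket paths, and hyperparameter values are placeholders):

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
# Look up the container for the built-in image-classification algorithm.
image = image_uris.retrieve("image-classification", session.boto_region_name)

estimator = Estimator(image_uri=image,
                      role="arn:aws:iam::123456789012:role/SageMakerRole",
                      instance_count=1,
                      instance_type="ml.p3.2xlarge",
                      output_path="s3://my-bucket/output")
estimator.set_hyperparameters(num_classes=2, num_training_samples=1000,
                              epochs=10)
# Each channel maps to an S3 prefix holding the prepared training data.
estimator.fit({"train": "s3://my-bucket/train",
               "validation": "s3://my-bucket/validation"})
```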
Detecting False Claims: Using SageMaker
Car Image Detection Workflow
Demo
Wrap up…phew!
We’ve covered A LOT of content:
Neural Networks for CV
Architecture Advances
Performance Measuring
AWS Services
Demos of using SageMaker for Image Classification
…Hopefully you can take something from this and go explore!
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 

Último (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Build computer vision models to perform object detection and classification with AWS

  • 10. 11© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - Convolutional Layers A convolutional layer plus a kernel forms the foundation of a CNN architecture. In a traditional Neural Network, Fully Connected layers are used, where each node is connected to every node in the immediately previous layer. A convolutional layer, however, is locally connected: its nodes are connected only to a small subset of the previous layer, and they share the same weights. The process of training, backpropagation and the forward pass is very similar to traditional neural networks. Tuning CNNs is a little trickier as they have several more hyperparameters: Filter size, Stride/Padding, Pooling Layers
  • 11. 12© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - Convolutional Layers Start with an initial image of size h×w (3 color channels). Create a 5x5 filter (kernel) and slide it across the image using a specified stride size. We also need to take padding into consideration! The sketch below shows how these interact.
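As a quick illustration of how filter size, stride, and padding determine the output size (a minimal sketch in plain Python; the 224x224 input is an arbitrary example value):

    def conv_output_size(h, w, kernel, stride=1, padding=0):
        """Spatial output size of a convolution: floor((n - k + 2p) / s) + 1."""
        out_h = (h - kernel + 2 * padding) // stride + 1
        out_w = (w - kernel + 2 * padding) // stride + 1
        return out_h, out_w

    # A 5x5 filter over a 224x224 image, stride 1, no padding:
    print(conv_output_size(224, 224, kernel=5))             # (220, 220)
    # 'Same' padding of 2 preserves the spatial size:
    print(conv_output_size(224, 224, kernel=5, padding=2))  # (224, 224)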
  • 12. 13© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - Pooling Layers The purpose of a pooling layer (or down-sampling layer) is to reduce the size of the layer, and thus the number of trainable parameters in the layers that follow. There are several parameters to set when adding a pooling layer: - Type (MaxPooling, AvgPooling, GlobalMax, GlobalAverage) - Pooling size - Stride
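A minimal NumPy sketch of the most common configuration, 2x2 max pooling with stride 2, on a single-channel feature map:

    import numpy as np

    def max_pool_2x2(x):
        """2x2 max pooling with stride 2; x is an (H, W) feature map with even dims."""
        h, w = x.shape
        # Split into 2x2 blocks and keep the maximum of each block
        return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

    x = np.array([[1, 3, 2, 4],
                  [5, 6, 7, 8],
                  [3, 2, 1, 0],
                  [1, 2, 3, 4]], dtype=float)
    print(max_pool_2x2(x))  # [[6. 8.] [3. 4.]] -- a quarter of the original size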
  • 13. 14© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - Fully Connected Layer The final layer in the CNN architecture is a fully connected layer. This takes ALL the outputs of the previous layer (which may be a pooling or convolutional layer), and outputs an N-dimensional layer, where N represents the number of classes. For a multi-class classification problem, a soft-max activation function is used in the final layer, whereas for a binary classification problem, a sigmoid activation function is used. When defining the network, depending on the problem and dataset, the type of loss function will also need to be defined, e.g. Categorical Cross Entropy.
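To make the layer ordering concrete, here is a minimal Keras sketch (TensorFlow 2.x assumed; the 32x32 RGB input and 10 classes are placeholder values) ending in a fully connected soft-max layer trained with categorical cross-entropy:

    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Conv2D(32, kernel_size=3, activation="relu", input_shape=(32, 32, 3)),
        layers.MaxPooling2D(pool_size=2),
        layers.Conv2D(64, kernel_size=3, activation="relu"),
        layers.MaxPooling2D(pool_size=2),
        layers.Flatten(),                        # flatten the feature maps for the FC layer
        layers.Dense(10, activation="softmax"),  # N = 10 classes
    ])
    # Categorical cross-entropy pairs naturally with a soft-max output
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    model.summary()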
  • 14. 15© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - Activation Function After a convolutional layer, an activation function is applied; typically the ReLU function is used (but not for the final layer*). The purpose of the activation function is to introduce non-linearity into the process. The ReLU activation function is favored over others such as tanh/sigmoid as it reduces training time and mitigates the vanishing gradient problem (it improves gradient descent) – but it does have its problems, so use wisely!
  • 15. 16© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - SoftMax Activation Function Given a sample input vector x and weight vectors {w_i}, the predicted probability of y = j is P(y = j | x) = exp(xᵀw_j) / Σ_k exp(xᵀw_k). A type of activation layer, usually applied to the outputs of the final FC layer. Can be viewed as a normalizer (a.k.a. the normalized exponential function). Produces a discrete probability distribution vector. Very convenient when combined with cross-entropy loss. In practice, when building multi-class classifiers, this is used as the last output layer. (Other output layers do exist: SVM, regression layers.)
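A NumPy sketch of the soft-max itself; subtracting the maximum logit before exponentiating is the standard numerical-stability trick and does not change the result:

    import numpy as np

    def softmax(z):
        """Normalized exponential: maps logits to a discrete probability distribution."""
        z = z - np.max(z)   # soft-max is shift-invariant, so this avoids overflow
        e = np.exp(z)
        return e / e.sum()

    print(softmax(np.array([2.0, 1.0, 0.1])))  # sums to 1; the largest logit wins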
  • 16. 17© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - Loss // Regularization Layers Loss functions: L1, L2 loss - Cross-Entropy loss works well for classification - Huber Loss is more resilient to outliers and has a smooth gradient - Mean Squared Error works well for regression tasks. Regularization Layers: - Dropout - Batch norm - Gradient clipping - Max norm constraint
  • 17. 18© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - Dropout Layer Whilst not always required, the Dropout Layer helps reduce the common problem of overfitting and improves generalization. The dropout layer simply 'drops' a random set of activations in the preceding layer by setting their values to 0. The aim here is to force the network to produce the correct classification even though some of the network is 'deactivated', which reduces the chances of overfitting to the original data.
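A sketch of 'inverted' dropout as applied at training time (the keep probability of 0.5 is an arbitrary example); rescaling by the keep probability means nothing special is needed at inference:

    import numpy as np

    def dropout(activations, keep_prob=0.5, training=True):
        """Randomly zero activations during training; identity at inference."""
        if not training:
            return activations
        mask = np.random.rand(*activations.shape) < keep_prob
        return activations * mask / keep_prob  # rescale so the expected value is unchanged

    a = np.ones((2, 4))
    print(dropout(a))  # roughly half the entries zeroed, survivors scaled up to 2.0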
  • 18. 19© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture - Batch Normalization Networks train faster - Iterations will be slower, however convergence will be quicker. Allows higher learning rates – Larger learning rates mean faster training (careful experimentation is required!). Makes weights easier to initialize – Less effort is required on the initialization of weights, but it is still recommended to use some form of distribution to set them. Makes more activation functions viable - Used with ReLU, it reduces issues with the nonlinearities. Provides some regularization – This reduces the amount of dropout required in the architecture. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe and Szegedy (2015) http://proceedings.mlr.press/v37/ioffe15.pdf
  • 19. 20© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | CNN Architecture
  • 20. 21© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Training CNNs Training CNNs is very similar to training any other Neural Network: Perform a forward pass across all nodes. Then update the weights during the backward pass. The aim is to obtain the best weights, which can be taken as those at the point of lowest validation loss. Hyperparameters play an important part in obtaining a decent accuracy. Tuning HPs such as learning rate, batch size, filter size, etc. needs to reflect the training data and the task to be achieved. (Deeper) Architectures + Hyperparameters > Training Speed* *lots of (GPU) compute resources are helpful :)
  • 21. 22© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Training CNNs – Data Augmentation The size and quality of the data play an important part in the performance of a CNN. However, bigger doesn't always mean better! One technique found to boost performance is to augment the original data sources to create a larger dataset (see the sketch below). Additionally, there are several reference datasets which can be used to help train a model (and are extremely useful for transfer learning): MNIST, CIFAR-10 / CIFAR-100, ImageNet, Caltech 101 / Caltech 256
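A hedged sketch of augmentation with Keras' ImageDataGenerator (TensorFlow 2.x assumed; the rotation, shift, and zoom ranges are illustrative rather than tuned):

    import numpy as np
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(
        rotation_range=20,       # random rotations up to 20 degrees
        width_shift_range=0.1,   # random horizontal shifts
        height_shift_range=0.1,  # random vertical shifts
        horizontal_flip=True,    # mirror images left-to-right
        zoom_range=0.1,
    )

    x_train = np.random.rand(8, 64, 64, 3)  # stand-in for a real image batch
    for batch in datagen.flow(x_train, batch_size=8):
        break  # each iteration yields a freshly augmented batch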
  • 22. 23© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Transfer Learning 1. “Forward” transfer: train on one task, transfer to a new task 2. Multi-task transfer: train on many tasks, transfer to a new task 3. Multi-task meta-learning: learn to learn from many tasks A Survey on Transfer Learning: https://www.cse.ust.hk/~qyang/Docs/2009/tkde_transfer_learning.pdf
  • 23. 24© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Transfer Learning Strategies
  • 24. 25© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Transfer Learning in Practice 1. Take an existing trained model (preferably from the same domain). 2. (If manually creating a model) Fine-tune the existing model: train some of the layers within the architecture and freeze the others. Small dataset and lots of parameters – freeze more layers to avoid overfitting. Large dataset and fewer parameters – unfreeze more layers, as overfitting will be less of an issue. 3. Depending on the type and purpose of the model (CNN, LSTM), we can remove the last layer within the network (e.g. the SoftMax layer) and replace it with a layer that matches our task (see the sketch below). Transfer learning approaches: Instance-Transfer – re-weight some labelled data in the source domain for use in the target domain. Feature-Representation-Transfer – find a "good" feature representation that reduces the difference between the source and target domains and the error of classification and regression models. Parameter-Transfer – discover shared parameters or priors between the source-domain and target-domain models, which can benefit transfer learning. Relational-Knowledge-Transfer – build a mapping of relational knowledge between the source and target domains; both are relational domains, and the i.i.d. assumption is relaxed in each.
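A sketch of steps 1–3 in Keras, assuming an ImageNet-pretrained ResNet50 as the source model and a hypothetical 5-class target task: take the trained model, freeze the convolutional base, and replace the final soft-max with one matching the new task.

    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import ResNet50

    # 1. Existing trained model, original classification layer removed (include_top=False)
    base = ResNet50(weights="imagenet", include_top=False, pooling="avg",
                    input_shape=(224, 224, 3))

    # 2. Small dataset, many parameters: freeze the whole base to avoid overfitting
    base.trainable = False

    # 3. Attach a new final layer matching our (hypothetical) 5-class task
    model = models.Sequential([
        base,
        layers.Dense(5, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])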
  • 25. 26© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Negative Transfer Depending on the domain and task, transfer learning may not be the appropriate method for developing a predictive model. The concept of Negative Transfer describes the process of reducing the learning capacity in the target domain due to a lack of transferability between the source domain and the task. Rosenstein et al. [1] discussed the challenges in transfer learning and the limits/bounds of task transferability. One approach suggested [2] to overcome negative transfer is to cluster types of tasks into groups (which share a low-dimensional representation). [1] M. T. Rosenstein, Z. Marx, and L. P. Kaelbling, "To transfer or not to transfer," in a NIPS-05 Workshop on Inductive Transfer: 10 Years Later, December 2005. [2] B. Bakker and T. Heskes, "Task clustering and gating for bayesian multitask learning," Journal of Machine Learning Research, vol. 4, pp. 83–99, 2003. [Illustration: a model transferred from an unrelated source domain to the task "predict car model" returns confidences such as { 'car': 0.1, 'letter_a': 0.6 } at inference.]
  • 26. 27© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | 27© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Deeper Dive into the techniques used to compare models Measuring Performance
  • 27. 28© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | How to measure performance of models Understanding how to measure the performance of a model (or how research papers rank CV models) is essential for comparing how different models perform on a given dataset/challenge. Several metrics are used to measure the performance of a model, depending on its type (object detection, boundary detection). Many reference datasets exist; most research papers use these as benchmarks for comparison (ImageNet, COCO, CIFAR10, MNIST). The PASCAL Visual Object Classes (VOC) Challenge http://homepages.inf.ed.ac.uk/ckiw/postscript/ijcv_voc09.pdf PRECISION! RECALL! AUC ROC! mAP! IoU!
  • 28. 29© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Model Performance – Precision + Recall Precision measures how accurate your predictions are, i.e. the percentage of your predictions that are correct. Recall measures how well you find all the positives; for example, we can find 80% of the possible positive cases in our top K predictions. The F1 Score is the harmonic mean of precision and recall. Note: it doesn't take into account True Negatives. (See the scikit-learn sketch below.)
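These metrics are one line each in scikit-learn (the binary labels below are toy values):

    from sklearn.metrics import precision_score, recall_score, f1_score

    y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]

    print(precision_score(y_true, y_pred))  # TP / (TP + FP) = 3/4 = 0.75
    print(recall_score(y_true, y_pred))     # TP / (TP + FN) = 3/5 = 0.60
    print(f1_score(y_true, y_pred))         # harmonic mean of the two ≈ 0.67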
  • 29. 30© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Model Performance – IoU (Intersection over Union) IoU is a metric which represents the overlap between 2 boundaries, comparing the predicted boundary region against the ground truth. If the predicted bounding box p(x1,y1,x2,y2) were equal to the ground truth g(x1,y1,x2,y2), the IoU score would be 1.
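A straightforward implementation over (x1, y1, x2, y2) corner coordinates:

    def iou(box_a, box_b):
        """Intersection over Union of two (x1, y1, x2, y2) boxes."""
        x1 = max(box_a[0], box_b[0])
        y1 = max(box_a[1], box_b[1])
        x2 = min(box_a[2], box_b[2])
        y2 = min(box_a[3], box_b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter)

    print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
    print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes -> 1.0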
  • 30. 31© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Model Performance – Average Precision (AP) AP is a more complex metric which combines Precision + Recall, IoU and some simple integrals. It is commonly used for object detection computer vision models, as it provides a measure of how well a model predicts classes of objects, based on the ranking of the confidence of predictions. Often there will be metrics such as AP50 or AP75, which represent the AP when the IoU is at least 50%, 75%, etc. The AP summarises the shape of the precision/recall curve, and is defined as the mean precision at a set of eleven equally spaced recall levels {0, 0.1, ..., 1}: AP = (1/11) · Σ_{r ∈ {0, 0.1, ..., 1}} p_interp(r). The precision at each recall level r is interpolated by taking the maximum precision measured at any recall exceeding r: p_interp(r) = max_{r̃ : r̃ ≥ r} p(r̃), where p(r̃) is the measured precision at recall r̃.
  • 31. 32© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Model Performance – Average Precision (AP) To calculate AP, first we generate all our predictions and rank them in descending order based on their confidence score. A prediction counts as correct (a true positive) when its IoU with the ground truth exceeds 0.5. In this example, there are only 5 objects to be detected. We then calculate Precision and Recall row by row. Row 4: Precision (TP / (TP + FP)) = 2/4 = 0.5; Recall (TP / (TP + FN)) = 2/5 = 0.4. Note: as the confidence score decreases, the recall increases, but the precision fluctuates up and down.
Rank | Conf | Correct? | Precision | Recall
1 | 0.99 | True | 1.00 | 0.2
2 | 0.97 | True | 1.00 | 0.4
3 | 0.80 | False | 0.67 | 0.4
4 | 0.78 | False | 0.50 | 0.4
5 | 0.76 | False | 0.40 | 0.4
6 | 0.75 | True | 0.50 | 0.6
7 | 0.75 | True | 0.57 | 0.8
8 | 0.74 | False | 0.50 | 0.8
9 | 0.71 | False | 0.44 | 0.8
10 | 0.70 | True | 0.50 | 1.0
  • 32. 33© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Model Performance – Average Precision (AP) The zig-zag effect can be seen more clearly using a precision-recall plot. At this point we can examine the integral for calculating the AP, which results in a single numerical value. [Precision–recall plot of the table above: precision on the y-axis against recall on the x-axis.] For the mathematicians: we can either smooth the curve, or fit a polynomial to it for use in calculating the integral of the PR curve.
  • 33. 34© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Model Performance – Average Precision (AP) One approach to calculating the integral of the PR curve is to use the maximum precision at each 'step', which makes it less susceptible to smaller variations in the rankings. The definition replacing the precision value at recall r with the maximum precision is: p_interp(r) = max_{r̃ ≥ r} p(r̃). A sketch of the full 11-point calculation follows.
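Putting the two formulas together, a plain-Python sketch of 11-point interpolated AP, evaluated on the precision/recall pairs from the ranking table above:

    def eleven_point_ap(precisions, recalls):
        """PASCAL VOC 11-point AP: mean of the max precision at recall >= r."""
        ap = 0.0
        for r in (i / 10 for i in range(11)):  # r = 0.0, 0.1, ..., 1.0
            candidates = [p for p, rec in zip(precisions, recalls) if rec >= r]
            ap += max(candidates, default=0.0) / 11
        return ap

    precisions = [1.0, 1.0, 0.67, 0.5, 0.4, 0.5, 0.57, 0.5, 0.44, 0.5]
    recalls    = [0.2, 0.4, 0.4, 0.4, 0.4, 0.6, 0.8, 0.8, 0.8, 1.0]
    print(eleven_point_ap(precisions, recalls))  # ≈ 0.75 for this ranking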
  • 34. 35© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Model Performance – COCO mAP (Mean Average Precision) As the COCO dataset has become a gold-standard reference dataset for CV, note that under COCO, AP is averaged over multiple IoU thresholds. The mAP is the average of the AP. In some contexts this means computing the AP for each class and averaging them; in other contexts, AP and mAP are the same thing. For example, under the COCO context, there is no difference between AP and mAP.
  • 35. 36© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Advancements in Computer Vision Development of Network Architectures
  • 36. 37© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Computer Vision - AlexNet In 2012, the first CNN with an acceptable level of accuracy was published. AlexNet, trained on the now popular ImageNet dataset, achieved a top-5 test error of 15.4% (2012 ILSVRC); the next best entry had >25% test error. Due to the computational complexity, the team split the processing across two GPU pipelines. ImageNet Classification with Deep Convolutional Neural Networks: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
  • 37. 38© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Computer Vision - VGG In 2014, the Very Deep Convolutional Network (VGG-19) was produced: a 19-layer CNN with a small filter (3x3) compared to AlexNet. Used 3 back-to-back convolutional layers before pooling. First to demonstrate the use of very deep layers. Achieved a top-5 test error of 7.3%. Very deep convolutional networks for large-scale image recognition https://arxiv.org/pdf/1409.1556.pdf
  • 38. 39© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Computer Vision - GoogLeNet Google announced GoogLeNet in 2015, which tackled the problem of the huge computational cost needed to train a CNN. The architecture introduced a method of reducing the number of features (and thus trainable parameters) using 1x1 convolutional layers and running parallel convolutions (the Inception module). Demonstrated that stacking is not the only approach to developing CNNs. This achieved a very reasonable top-5 test error of 6.7%. Later revisions of the Inception model introduced batch normalization as a layer to improve performance and reduce training time. Going Deeper with Convolutions https://arxiv.org/abs/1409.4842
  • 39. 40© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Computer Vision - ResNet In 2015, Microsoft produced ResNet, a 152-layer network. The basis of the Residual Network is the Residual Block, where the output of a conv-relu-conv cycle is added back to the block's original input. Each block therefore only has to learn a small change (a residual) to its input, rather than forming a completely new representation of the image. These residuals feed the next block, and the shortcut connections also improve training during the back-propagation stage (see the sketch below). This architecture achieved a top-5 test error of 3.6% (humans are usually in the range of 5-10%). Deep Residual Learning for Image Recognition https://arxiv.org/abs/1512.03385
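A sketch of a basic identity residual block in Keras (assuming the input and output channel counts match, so the shortcut is a plain addition):

    import tensorflow as tf
    from tensorflow.keras import layers

    def residual_block(x, filters):
        """conv-relu-conv, then add the block's input back in (identity shortcut)."""
        shortcut = x
        y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        y = layers.Conv2D(filters, 3, padding="same")(y)
        y = layers.Add()([shortcut, y])  # learn a small residual, not a whole new representation
        return layers.ReLU()(y)

    inputs = tf.keras.Input(shape=(32, 32, 64))
    outputs = residual_block(inputs, filters=64)
    model = tf.keras.Model(inputs, outputs)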
  • 40. 41© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Computer Vision – Region-CNNs Region-based CNNs can be considered one of the recent advancements in the field of computer vision. R-CNNs aim to solve object detection tasks. Using the fundamentals of CNNs, regions which correspond to objects within an image can be detected, and bounding boxes drawn. Search for Mask R-CNN or Fast/Faster R-CNN for more info on the applications and architectures.
  • 41. 42© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | YOLO (You Only Look Once) YOLO is a network for object detection. Compared to existing R-CNN approaches, which use a pipelined approach, YOLO uses a single NN to perform object detection (framing it as a single regression problem). The speed allows for real-time processing of images, and therefore of video! The backbone uses ResNet-style residual connections, and various flavors exist to suit different computational needs. The most recent YOLOv3 uses 75 convolutional layers, with no fully connected layers, no pooling, and no SoftMax layer. https://arxiv.org/pdf/1506.02640.pdf
  • 42. 43© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Single Shot MultiBox Detector (SSD) SSD only requires a single shot to detect multiple objects within an image, i.e. only one forward pass, whereas region-based models require multiple shots. The single pass means SSD is great for object detection in video! For each region, k bounding boxes b are identified; these k bounding boxes have different sizes and aspect ratios. For each b, c class scores are computed along with 4 offsets relative to the original default bounding box shape (hence MultiBox). The architecture is built on VGG-16 and, for smaller objects, achieves a higher level of accuracy compared to YOLO. https://arxiv.org/pdf/1512.02325.pdf https://cv-tricks.com/object-detection/faster-r-cnn-yolo-ssd/
  • 43. 44© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | 44© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Deeper Dive into the Technology Machine Learning @ AWS
  • 44. 45© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | The AWS ML Stack Broadest and most complete set of Machine Learning capabilities. AI SERVICES (VISION, SPEECH, TEXT, SEARCH, CHATBOTS, PERSONALIZATION, FORECASTING, FRAUD, DEVELOPMENT, CONTACT CENTERS): Amazon Rekognition, Amazon Polly, Amazon Transcribe +Medical, Amazon Comprehend +Medical, Amazon Translate, Amazon Lex, Amazon Textract, Amazon Kendra, Amazon Personalize, Amazon Forecast, Amazon Fraud Detector, Amazon CodeGuru, Contact Lens For Amazon Connect. ML SERVICES (Amazon SageMaker): Ground Truth, Augmented AI, ML Marketplace, Neo, Built-in algorithms, Notebooks, Experiments, Model training & tuning, Debugger, Autopilot, Model hosting, Model Monitor, SageMaker Studio IDE. ML FRAMEWORKS & INFRASTRUCTURE: Deep Learning AMIs & Containers, GPUs & CPUs, Elastic Inference, Inferentia, FPGA.
  • 45. 46© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | Amazon SageMaker helps you build, train, and deploy models. Prepare: fully managed data processing jobs and data labeling workflows; collect and prepare training data. Build: one-click collaborative notebooks and built-in, high-performance algorithms and models; choose or build an ML algorithm; web-based IDE for machine learning; automatically build and train models. Train & Tune: one-click training; debugging and optimization; set up and manage environments for training; train, debug, and tune models; manage training runs; visually track and compare experiments. Deploy & Manage: one-click deployment and auto-scaling; deploy model in production; monitor models; automatically spot concept drift; add human review of predictions; fully managed with auto-scaling for 75% less.
  • 46. 47© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | AMAZON SAGEMAKER IS FULLY MANAGED One click model deployment Auto-scaling Python SDK Bring your own model Low latency and high throughput Deploy multiple models on an endpoint
  • 47. 48© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | Amazon SageMaker Notebooks Access your notebooks in seconds Administrators manage access and permissions Share notebooks with a single click Dial up or down compute resources (Coming soon) Start your notebooks without spinning up compute resources Fast-start sharable notebooks (in preview)
  • 48. 49© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | AMAZON SAGEMAKER IS THE BEST PLACE TO RUN TENSORFLOW • Fully-managed training and hosting • Near-linear scaling across 100s of GPUs • 3x faster network throughput with EC2 P3 [Chart: scaling efficiency of 65% with stock TensorFlow vs 90% with AWS-optimized TensorFlow]
  • 49. 50© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | AMAZON SAGEMAKER HAS BUILT-IN ALGORITHMS OR BRING YOUR OWN Classification: Linear Learner, XGBoost, KNN. Regression: Linear Learner, XGBoost, KNN. Computer Vision: Image Classification, Object Detection, Semantic Segmentation. Working with Text: BlazingText (supervised and unsupervised), Object2Vec. Topic Modeling: LDA, NTM. Recommendation: Factorization Machines. Forecasting: DeepAR. Anomaly Detection: Random Cut Forests, IP Insights. Sequence Translation: Seq2Seq. Clustering: KMeans. Feature Reduction: PCA.
  • 50. 51© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | GluonCV: Deep Learning Toolkit for Computer Vision GluonCV – an open-source deep learning interface for quickly building computer vision models without compromising performance. Benefits: • Training scripts that reproduce SOTA results from the latest papers • A large set of pre-trained models • Carefully designed APIs, easy-to-understand implementations • Community support. A minimal usage sketch follows.
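A minimal GluonCV sketch (assuming GluonCV and MXNet are installed; the model-zoo name and the local image path "car.jpg" are illustrative) that loads a pre-trained COCO detector and runs inference on one image:

    from gluoncv import model_zoo, data, utils

    # Pre-trained YOLOv3 detector (Darknet-53 backbone) from the GluonCV model zoo
    net = model_zoo.get_model("yolo3_darknet53_coco", pretrained=True)

    # Load and transform a local test image (the path is a placeholder)
    x, img = data.transforms.presets.yolo.load_test("car.jpg", short=512)

    class_ids, scores, bboxes = net(x)
    utils.viz.plot_bbox(img, bboxes[0], scores[0], class_ids[0], class_names=net.classes)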
  • 51. 52© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | AWS Mask R-CNN Example https://github.com/aws-samples/mask-rcnn-tensorflow The primary focus was on increasing training throughput without sacrificing any accuracy. We do this by training with a batch size > 1 per GPU, using FP16 and two custom TF ops. Dataset: COCO 2017. Pre-trained model: ResNet-50. EC2 instance type: p3dn.24xlarge.
Num_GPUs x Images_Per_GPU | Training time | Box mAP | Mask mAP
8x4 | 9.78h | 38.25% | 35.08%
16x4 | 5.60h | 38.44% | 35.18%
32x4 | 3.33h | 38.33% | 35.12%
  • 52. 53© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | You can shop for algorithms, models, and data in AWS Marketplace AWS MARKETPLACE Browse or search AWS Marketplace Subscribe in a single click Available in Amazon SageMaker
  • 53. 54© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | HUNDREDS OF ALGORITHMS, MODELS, AND DATA Categories: Natural language processing, Text-to-speech, Object detection, Speech recognition, Grammar and parsing, Text generation, Speaker identification, Regression, Text OCR, Text classification, Text clustering, Computer vision, 3D images, Handwriting recognition, Named entity recognition, Anomaly detection, Ranking, Video classification. SELLERS: automatic labeling via machine learning, IP protection, automated billing and metering. BUYERS: broad selection of paid, free, and open-source algorithms and models; data protection; discoverable on your AWS bill.
  • 54. 55© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | 60+ Computer Vision Models and Algorithms
  • 55. 56© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | 56© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Example Use Case Using SageMaker for Detecting False Insurance Claim Images
  • 56. 57© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved |
  • 57. 58© 2020 Amazon Web Services, Inc. or its affiliates. All rights reserved | Detecting False Claims Global car insurance organizations receive tens of thousands of claims per day, which require significant human resources to review, investigate, and approve. The use of computer vision can help reduce the overhead on the claims team by providing an automated mechanism for detecting potentially false or spam insurance claims. In this session we're going to build a custom solution which uses computer vision models to detect cars and damage on cars.
  • 58. 59© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Detecting False Claims: Solution Architecture
  • 59. 60© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Detecting False Claims: Inferencing Custom Trained Model
  • 60. 61© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Detecting False Claims: Using SageMaker Amazon SageMaker is the first step in producing a custom Image Classification Model. At this stage, data preparation and exploration are performed to ensure the initial data used for training a model is suitable. Model training is an iterative process, and the first model will be supported by a cleansed dataset. Once model performance is acceptable, the model can be deployed, and the code can then be wrapped up for deployment on Kubernetes. A sketch of the training and deployment steps follows.
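A hedged sketch of that flow with the SageMaker Python SDK (v2) and the built-in image classification algorithm; the bucket paths, instance types, and hyperparameter values below are placeholders, and exact API details may vary by SDK version:

    import sagemaker
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator

    session = sagemaker.Session()
    role = sagemaker.get_execution_role()

    # Container image for the built-in image classification algorithm in this region
    container = image_uris.retrieve("image-classification", session.boto_region_name)

    estimator = Estimator(
        container, role,
        instance_count=1, instance_type="ml.p3.2xlarge",
        output_path="s3://my-bucket/claims-model/output",  # placeholder bucket
        sagemaker_session=session,
    )
    estimator.set_hyperparameters(num_classes=2, num_training_samples=10000, epochs=10)
    estimator.fit({"train": "s3://my-bucket/claims/train",            # placeholder channels
                   "validation": "s3://my-bucket/claims/validation"})

    # Deploy the trained model behind a managed HTTPS endpoint
    predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")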
  • 61. 62© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Detecting False Claims: Using SageMaker
  • 62. 63© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | 63© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Car Image Detection Workflow Demo
  • 63. 65© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved | Wrap up…phew! We’ve covered A LOT of content: Neural Networks for CV Architecture Advances Performance Measuring AWS Services Demos of using SageMaker for Image Classification …Hopefully you can take something from this and go explore!