Scaling up deep learning by scaling down
—
Nick Pentreath
Principal Engineer
@MLnick
About
IBM Developer / © 2020 IBM Corporation
– @MLnick on Twitter, GitHub, LinkedIn
– Principal Engineer, IBM CODAIT (Center for Open-Source Data & AI Technologies)
– Machine Learning & AI
– Apache Spark committer & PMC
– Author of Machine Learning with Spark
– Various conferences & meetups
Improving the Enterprise AI Lifecycle in Open Source
– CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise.
– We contribute to and advocate for the open-source technologies that are foundational to IBM’s AI offerings.
– 30+ open-source developers!
Center for Open Source Data & AI Technologies
codait.org
CODAIT
Open Source @ IBM
Agenda
– Deep Learning overview & computational considerations
– Evolving efficiency of model architectures
– Model compression
– Model distillation
– Conclusion
DEG / June 4, 2020 / © 2020 IBM Corporation
Machine Learning Workflow
Data, Analyze, Process, Train, Deploy, Predict & Maintain
Compute-heavy
Deep Learning
– Original theory from 1940s; computer models originated around 1960s; fell out of favor in 1980s/90s
– Recent resurgence due to
• Bigger (and better) data; standard datasets (e.g. ImageNet)
• Better hardware (GPUs, TPUs)
• Improvements to algorithms, architectures and optimization
– Leading to new state-of-the-art results in computer vision (images and video); speech/text; language translation and more
Source: Wikipedia
Modern Neural Networks
– Deep (multi-layer) networks
– Computer vision
• Convolutional neural networks (CNNs)
• Image classification, object detection, segmentation
– Sequences and time-series
• Machine translation, text generation
• Recurrent neural networks - LSTM, GRU
– Natural language processing
• Word embeddings
• Transformers, attention mechanisms
– Deep learning frameworks
• Flexibility, computation graphs, auto-differentiation, GPUs
Source: Stanford CS231n
Evolution of Training Computation Requirements
Source
Computational resources required for training AI models double every 3 to 4 months
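As a rough illustration of what a 3 to 4 month doubling period implies, the sketch below converts it into a yearly growth factor. The function name `growth_factor` is hypothetical, and a constant doubling period is assumed purely for simplicity.

```python
def growth_factor(months_elapsed: float, doubling_months: float) -> float:
    """Multiplicative growth after `months_elapsed`, assuming a fixed doubling period."""
    return 2.0 ** (months_elapsed / doubling_months)


# The two ends of the "3 to 4 months" range quoted on the slide.
for doubling in (3.0, 4.0):
    factor = growth_factor(12, doubling)
    print(f"doubling every {doubling:.0f} months -> x{factor:.0f} per year")
```

Under these assumptions, training compute grows somewhere between 8x and 16x per year.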
Example: Image Classification
Source
Input image, Inference, Prediction
beagle: 0.82
basset: 0.09
bluetick: 0.07
...
Example: Inception V3
Source
Effectively matrix multiplication
~24 million parameters, 78.8% accuracy (ImageNet)
Accuracy vs Computational Complexity (ImageNet)
Source: Paper, blog
Computational efficiency (ImageNet)
Source: Paper, blog
Deep Learning Deployment
– Model training typically uses substantial hardware
– GPU / multi-GPU
– Cloud-based deployment scenarios
Deep Learning Deployment
– Edge devices have more limited resources
• Memory
• Compute (CPU, mobile GPU, edge TPU)
• Network bandwidth
– Also applies to low-latency applications
How do we improve performance efficiency?
– Architecture improvements
– Model pruning
– Quantization
– Model distillation
Architecture Improvements
Specialized architectures for low-resource targets
Source
Inception V3: standard convolution building block; ~24 million parameters; 78.8% accuracy (ImageNet)
MobileNet V1: depthwise convolution building block (~8x less computation); ~4 million parameters; 70.9% accuracy (ImageNet)
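The "~8x less computation" figure can be sanity-checked by counting multiply-accumulate operations (MACs) for a single layer, using the cost formulas from the MobileNets paper. This is a minimal sketch: the function names and the layer dimensions are hypothetical, chosen only for illustration.

```python
def standard_conv_macs(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """MACs for a standard k x k convolution producing an h x w output map."""
    return k * k * c_in * c_out * h * w


def depthwise_separable_macs(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """MACs for a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 256 -> 256 channels, 14x14 feature map.
k, c_in, c_out, h, w = 3, 256, 256, 14, 14
std = standard_conv_macs(k, c_in, c_out, h, w)
sep = depthwise_separable_macs(k, c_in, c_out, h, w)
ratio = std / sep
print(f"standard: {std:,} MACs, separable: {sep:,} MACs, reduction: {ratio:.1f}x")
```

The cost ratio works out to 1/c_out + 1/k², so for 3x3 kernels and wide layers the saving approaches 9x; for this hypothetical layer it is roughly 8.7x.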
Trade off accuracy vs model size
Source
– Scale layer width & resolution multiplier to target available computation budget
– Width multiplier = “thinner” models
– Resolution multiplier scales input image representation
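To make the two multipliers concrete, the sketch below applies a width multiplier `alpha` and a resolution multiplier `rho` to the MAC count of one depthwise separable layer, following the cost model in the MobileNets paper. The function name `scaled_macs` and all layer dimensions are hypothetical.

```python
def scaled_macs(k: int, c_in: int, c_out: int, size: int,
                alpha: float = 1.0, rho: float = 1.0) -> int:
    """MACs of a depthwise separable layer after width/resolution scaling.

    alpha thins the layer (fewer channels); rho shrinks the feature map.
    """
    m = round(alpha * c_in)    # scaled input channels
    n = round(alpha * c_out)   # scaled output channels
    s = round(rho * size)      # scaled feature-map side length
    return k * k * m * s * s + m * n * s * s


# Hypothetical layer: 3x3 kernel, 256 -> 256 channels, 14x14 feature map.
base = scaled_macs(3, 256, 256, 14)
small = scaled_macs(3, 256, 256, 14, alpha=0.5, rho=0.5)
print(f"base: {base:,} MACs, scaled: {small:,} MACs, fraction: {small / base:.3f}")
```

Cost falls roughly with alpha² · rho², which is why modest reductions in width and input resolution give a large drop in computation.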
MobileNet V2
Source
– Same depthwise convolution backbone
– Add linear bottlenecks & shortcut connections
~3.4 million parameters, 72% accuracy
Accuracy vs Computation - Updated (ImageNet)
Source: Paper, blog
Computational efficiency - Updated (ImageNet)
Source: Paper, blog
EfficientNet
Source: blog post, paper
– Neural Architecture Search to find backbone
– Optimize for accuracy & efficiency (FLOPS)
~5.3 million parameters, 77.3% accuracy
~60 million parameters, 84.5% accuracy
MobileNet V3
Source: GitHub, paper
– Hardware-aware Neural Architecture Search
~5.4 million parameters, 75.2% accuracy
One network to rule them all?
Source: GitHub, paper
– Once for All: Train One Network and Specialize it for Efficient Deployment
– Manual design or NAS is hugely costly in terms of computation
– Train one network, “cherry-pick” the sub-net without additional training
Model Pruning
– Reduce # of model parameters
– Effectively like L1 regularization – remove weights with small impact on prediction
– Sparse weights -> model compression & lower latency
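One common way to realize this idea is magnitude pruning: zero out the weights with the smallest absolute values until a target sparsity is reached. The sketch below is a dependency-free illustration on a flat list of weights. The function name `magnitude_prune` and the sample weights are hypothetical, and framework pruning tools typically prune gradually during training rather than in one shot.

```python
def magnitude_prune(weights: list[float], sparsity: float) -> list[float]:
    """Return a copy of `weights` with the smallest-magnitude fraction set to zero."""
    k = int(sparsity * len(weights))  # number of weights to remove
    # Indices ordered by absolute magnitude, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:
        pruned[i] = 0.0
    return pruned


# Hypothetical weights for illustration only.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.9, -0.1, 0.04, -0.4]
pruned = magnitude_prune(weights, sparsity=0.5)
achieved = sum(1 for w in pruned if w == 0.0) / len(pruned)
print(pruned, f"sparsity={achieved:.0%}")
```

The surviving weights are untouched; the compression and latency benefits only appear when the zeros are stored and executed in a sparse-aware format.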
Model Pruning
Source
Figure: ImageNet Classification, Top-1 Accuracy (%) vs Model Sparsity (InceptionV3, MobileNet V1 224)
Model Pruning
Source
Figure: Language Translation, BLEU Score vs Model Sparsity (English-German, German-English)
Quantization
Source
– Most DL computation uses 32-bit (or even 64-bit) floating point
– Quantization reduces numerical precision of weights by binning values
– Popular targets are 16-bit FP and 8-bit integer coding
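The "binning" can be sketched as symmetric 8-bit quantization: each float weight is mapped to the nearest of a fixed set of integer levels and later recovered approximately by multiplying with a scale. This is a simplified illustration, not how any particular framework implements it. The function names and sample weights are hypothetical, and real toolchains also handle zero points, per-channel scales and activations.

```python
def quantize_int8(values: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using a single symmetric scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float values from the integer codes."""
    return [x * scale for x in q]


# Hypothetical weights for illustration only.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, f"scale={scale:.5f}", f"max error={max_error:.5f}")
```

Each weight then needs 1 byte instead of the 4 bytes of a 32-bit float, at the price of a rounding error of at most half a quantization step.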
Quantization
Source
– Post-training quantization
• Useful if you can’t (or don’t wish to) retrain a model
• Give up accuracy
• Various options
– Float16
– Dynamic
– Int8
– Training-aware quantization
• Much more complex
• Can provide large efficiency gains with minimal accuracy loss
ImageNet Classification, Top-1 Accuracy (%)
Model              Original  Post-training  Training-aware
InceptionV3        78        77.2           77.5
MobileNet V2 224   71.9      63.7           70.9
Quantization
Source
ImageNet Classification, % of original model size
Model              Original  Quantized
InceptionV3        100%      25%
MobileNet V2 224   100%      26%
ImageNet Classification, % of original latency
Model              Post-training  Training-aware
InceptionV3        75%            48%
MobileNet V2 224   110%           61%
Quantization
– TensorFlow Model Optimization
– PyTorch
– Distiller for PyTorch
Model Distillation
– Large models may be over-parameterized
– Use a large, complex model to teach a smaller, simpler model
– Effectively distil the core knowledge of the large model
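A common formulation of this teaching signal, described in "Distilling the Knowledge in a Neural Network", trains the student to match the teacher's temperature-softened output distribution. The sketch below shows only that soft-target term. The function names, logits and temperature value are hypothetical, and practical setups usually combine this term with a standard loss on the true labels.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Softmax with temperature; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: list[float], teacher_logits: list[float],
                      temperature: float = 4.0) -> float:
    """Cross-entropy between the softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Hypothetical logits for a 3-class problem.
teacher = [5.0, 2.0, 0.5]
loss_match = distillation_loss(teacher, teacher)
loss_mismatch = distillation_loss([0.5, 2.0, 5.0], teacher)
print(f"matching student: {loss_match:.3f}, mismatched student: {loss_mismatch:.3f}")
```

The loss is lowest when the student reproduces the teacher's distribution, including the relative probabilities of the wrong classes, which is the extra information a hard label does not carry.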
Model Distillation
Source: Distiller docs, paper
Model Distillation
– BERT model distillations have been very successful
– DistilBERT
– TinyBERT
– Others (see this blog post)
Conclusion
– Model distillation is less popular but potentially compelling in NLP tasks
– Area of rapid research evolution
– New efficient model architectures are rapidly evolving
• If one fits your needs, use it!
– Compression techniques can yield large efficiency gains
• Now good support in DL frameworks / supporting libraries
– Perhaps combining pruning & quantization (though trickier)
Thank you
codait.org
twitter.com/MLnick
github.com/MLnick
developer.ibm.com
Check out the Model Asset Exchange
https://ibm.biz/model-exchange
Sign up for IBM Cloud
https://ibm.biz/BdqdSi
References
Efficient Inference in Deep Learning – Where is the Problem?
Analysis of deep neural networks
MobileNets
EfficientNet
Making Neural Nets Work With Low Precision
Speeding up BERT
Distilling the Knowledge in a Neural Network
Once for All: Train One Network and Specialize it for Efficient Deployment
Distiller – PyTorch
TensorFlow Model Optimization
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 


Scaling up deep learning by scaling down

  • 1. Scaling up deep learning by scaling down — Nick Pentreath Principal Engineer @MLnick
  • 2. About – @MLnick on Twitter, GitHub, LinkedIn – Principal Engineer, IBM CODAIT (Center for Open-Source Data & AI Technologies) – Machine Learning & AI – Apache Spark committer & PMC – Author of Machine Learning with Spark – Various conferences & meetups (IBM Developer / © 2020 IBM Corporation)
  • 3. Improving the Enterprise AI Lifecycle in Open Source – CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise. – We contribute to and advocate for the open-source technologies that are foundational to IBM’s AI offerings. – 30+ open-source developers! – CODAIT: Center for Open Source Data & AI Technologies, codait.org (Open Source @ IBM)
  • 4. Agenda – Deep Learning overview & computational considerations – Evolving efficiency of model architectures – Model compression – Model distillation – Conclusion (DEG / June 4, 2020 / © 2020 IBM Corporation)
  • 5. Machine Learning Workflow – Data → Analyze → Process → Train → Deploy → Predict & Maintain (slide label: Compute-heavy)
  • 6. Deep Learning – Original theory from 1940s; computer models originated around 1960s; fell out of favor in 1980s/90s – Recent resurgence due to • Bigger (and better) data; standard datasets (e.g. ImageNet) • Better hardware (GPUs, TPUs) • Improvements to algorithms, architectures and optimization – Leading to new state-of-the-art results in computer vision (images and video); speech/text; language translation and more (Source: Wikipedia)
  • 7. Modern Neural Networks – Deep (multi-layer) networks – Computer vision • Convolutional neural networks (CNNs) • Image classification, object detection, segmentation – Sequences and time-series • Machine translation, text generation • Recurrent neural networks: LSTM, GRU – Natural language processing • Word embeddings • Transformers, attention mechanisms – Deep learning frameworks • Flexibility, computation graphs, auto-differentiation, GPUs (Source: Stanford CS231n)
  • 8. Evolution of Training Computation Requirements – Computational resources required for training AI models double every 3 to 4 months
  • 9. Example: Image Classification – Input image → Inference → Prediction – beagle: 0.82, basset: 0.09, bluetick: 0.07, ...
  • 10. Example: Inception V3 – Effectively matrix multiplication – ~24 million parameters – 78.8% accuracy (ImageNet)
  • 11. Accuracy vs Computational Complexity (ImageNet) (Source: paper, blog)
  • 12. Computational efficiency (ImageNet) (Source: paper, blog)
  • 13. Deep Learning Deployment – Model training typically uses substantial hardware – GPU / multi-GPU – Cloud-based deployment scenarios
  • 14. Deep Learning Deployment – Edge devices have more limited resources • Memory • Compute (CPU, mobile GPU, edge TPU) • Network bandwidth – Also applies to low-latency applications
  • 15. How do we improve performance efficiency? – Architecture improvements – Model pruning – Quantization – Model distillation
  • 16. Architecture Improvements
  • 17. Specialized architectures for low-resource targets – Inception V3: standard convolution building block, ~24 million parameters, 78.8% accuracy (ImageNet) – MobileNet V1: depthwise convolution building block (~8x less computation), ~4 million parameters, 70.9% accuracy
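
The "~8x less computation" figure on slide 17 can be checked with a rough multiplication count. The sketch below assumes the cost model from the MobileNets paper listed in the references (a depthwise convolution followed by a 1x1 pointwise convolution); the function names and the layer dimensions (3x3 kernel, 256 input and output channels, 14x14 feature map) are hypothetical and chosen only for illustration.

```python
def standard_conv_mults(kernel, in_ch, out_ch, fmap):
    """Multiplications for a standard convolution layer (stride 1, same padding)."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def depthwise_separable_mults(kernel, in_ch, out_ch, fmap):
    """Multiplications for a depthwise convolution plus a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * in_ch * fmap * fmap  # one filter per input channel
    pointwise = in_ch * out_ch * fmap * fmap           # 1x1 conv that mixes channels
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 256 -> 256 channels, 14x14 feature map (assumed values).
std = standard_conv_mults(3, 256, 256, 14)
sep = depthwise_separable_mults(3, 256, 256, 14)
reduction = std / sep
print(f"standard: {std:,}  separable: {sep:,}  reduction: {reduction:.1f}x")
```

The ratio of the two costs simplifies to 1/out_ch + 1/kernel², so for 3x3 kernels the saving approaches 9x as the number of output channels grows.
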
  • 18. Trade off accuracy vs model size – Scale layer width & resolution multiplier to target available computation budget – Width multiplier = “thinner” models – Resolution multiplier scales input image representation
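
To make the two multipliers on slide 18 concrete, here is a minimal sketch of how they change the per-layer cost, again assuming the depthwise-separable cost model from the MobileNets paper. The names `alpha` (width multiplier) and `rho` (resolution multiplier), the helper function, and the layer dimensions are illustrative assumptions, not taken from the slides.

```python
def mobilenet_layer_mults(kernel, in_ch, out_ch, fmap, alpha=1.0, rho=1.0):
    """Multiplications for one depthwise-separable layer after applying both multipliers.

    alpha thins the layer (fewer channels); rho shrinks the feature map.
    """
    m = round(alpha * in_ch)
    n = round(alpha * out_ch)
    f = round(rho * fmap)
    return kernel * kernel * m * f * f + m * n * f * f


baseline = mobilenet_layer_mults(3, 256, 256, 14)
half_width = mobilenet_layer_mults(3, 256, 256, 14, alpha=0.5)
# rho = 160/224 corresponds to feeding 160x160 images instead of 224x224.
low_res = mobilenet_layer_mults(3, 256, 256, 14, rho=160 / 224)

print(f"alpha=0.5   -> {half_width / baseline:.2f} of baseline cost")
print(f"rho=160/224 -> {low_res / baseline:.2f} of baseline cost")
```

Cost falls roughly with alpha² and exactly with rho² in this model, which is why a modest reduction in width or input resolution buys a large reduction in computation.
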
  • 19. MobileNet V2 – Same depthwise convolution backbone – Add linear bottlenecks & shortcut connections – ~3.4 million parameters – 72% accuracy
  • 20. Accuracy vs Computation - Updated (ImageNet) (Source: paper, blog)
  • 21. Computational efficiency - Updated (ImageNet) (Source: paper, blog)
  • 22. EfficientNet – Neural Architecture Search to find backbone – Optimize for accuracy & efficiency (FLOPS) – ~5.3 million parameters, 77.3% accuracy – ~60 million parameters, 84.5% accuracy (Source: blog post, paper)
  • 23. MobileNet V3 – Hardware-aware Neural Architecture Search – ~5.4 million parameters – 75.2% accuracy (Source: GitHub, paper)
  • 24. One network to rule them all? – Once for All: Train One Network and Specialize it for Efficient Deployment – Manual design or NAS is hugely costly in terms of computation – Train one network, “cherry-pick” the sub-net without additional training (Source: GitHub, paper)
  • 25. Model Pruning – Reduce # of model parameters – Effectively like L1 regularization: remove weights with small impact on prediction – Sparse weights -> model compression & lower latency
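
The idea on slide 25 can be sketched with magnitude-based pruning, one common criterion in which a weight's absolute value stands in for its impact on the prediction. The slides do not say which criterion was used, so this choice is an assumption; the function name and the toy weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    Assumption: a weight's absolute value is used as a proxy for its impact.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.1, -0.9]  # toy layer weights
pruned = magnitude_prune(weights, sparsity=0.5)
achieved = pruned.count(0.0) / len(pruned)
print(pruned, f"sparsity={achieved:.0%}")
```

In practice pruning is usually applied gradually during training or fine-tuning so the remaining weights can compensate, and the zeros only pay off when the storage format or runtime exploits sparsity.
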
  • 26. Model Pruning – [Chart: ImageNet classification, top-1 accuracy (%) vs. model sparsity (0% to 100%), for InceptionV3 and MobileNet V1 224]
  • 27. Model Pruning – [Chart: language translation, BLEU score vs. model sparsity (0% to 100%), for English-German and German-English]
  • 28. Quantization
  • 29. Quantization – Most DL computation uses 32-bit (or even 64-bit) floating point – Quantization reduces numerical precision of weights by binning values – Popular targets are 16-bit FP and 8-bit integer coding
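
The "binning" on slide 29 can be illustrated with a simple affine (scale and zero-point) mapping from floats to 8-bit integers. This is a minimal sketch of one common scheme, not the exact method of any framework named in the talk; the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to int8 bins using an affine scale / zero-point scheme."""
    lo = min(min(values), 0.0)  # the representable range must include zero
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # width of one bin; fall back to 1.0 if all zeros
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 bins."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]  # toy float32 weights
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(q, f"scale={scale:.5f}", f"max error={max_error:.5f}")
```

Each weight now needs 8 bits instead of 32, a 4x reduction in storage, and the reconstruction error is bounded by the bin width `scale`.
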
  • 30. Quantization – Post-training quantization • Useful if you can’t (or don’t wish to) retrain a model • Give up accuracy • Various options: Float16, Dynamic, Int8 – Training-aware quantization • Much more complex • Can provide large efficiency gains with minimal accuracy loss – [Chart: ImageNet classification, top-1 accuracy (%). InceptionV3: original 78, post-training 77.2, training-aware 77.5. MobileNet V2 224: original 71.9, post-training 63.7, training-aware 70.9]
  • 31. Quantization – [Chart: ImageNet classification, % of original model size. InceptionV3: original 100%, quantized 25%. MobileNet V2 224: original 100%, quantized 26%] – [Chart: ImageNet classification, % of original latency. InceptionV3: post-training 75%, training-aware 48%. MobileNet V2 224: post-training 110%, training-aware 61%]
  • 32. Quantization – TensorFlow Model Optimization – PyTorch – Distiller for PyTorch
  • 33. Model Distillation – Large models may be over-parameterized – Use a large, complex model to teach a smaller, simpler model – Effectively distil the core knowledge of the large model
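
The teacher/student idea on slide 33 is commonly implemented with a loss that mixes "soft" teacher targets and the true labels, as described in "Distilling the Knowledge in a Neural Network" (listed in the references). The sketch below assumes that formulation; the `temperature` and `alpha` values, the function names, and the toy logits are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0)


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target loss (teacher) and a hard-label loss."""
    soft_targets = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-loss gradients on a scale comparable to the hard loss.
    soft_loss = cross_entropy(soft_targets, soft_student) * temperature ** 2
    hard = [1.0 if i == true_label else 0.0 for i in range(len(student_logits))]
    hard_loss = cross_entropy(hard, softmax(student_logits))
    return alpha * soft_loss + (1 - alpha) * hard_loss


teacher = [4.0, 1.0, 0.5]       # toy logits from the large model
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]
loss_close = distillation_loss(close_student, teacher, true_label=0)
loss_far = distillation_loss(far_student, teacher, true_label=0)
print(f"close student: {loss_close:.3f}  far student: {loss_far:.3f}")
```

The student is penalized both for missing the correct class and for disagreeing with how the teacher spreads probability over the other classes, which is where the extra "knowledge" comes from.
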
  • 34. Model Distillation (Source: Distiller docs, paper)
  • 35. Model Distillation – BERT model distillations have been very successful – DistilBERT – TinyBERT – Others (see this blog post)
  • 36. Conclusion – Model distillation is less popular but potentially compelling in NLP tasks – Area of rapid research evolution – New efficient model architectures are rapidly evolving • If one fits your needs, use it! – Compression techniques can yield large efficiency gains • Now good support in DL frameworks / supporting libraries – Perhaps combining pruning & quantization (though trickier)
  • 37. Thank you – codait.org – twitter.com/MLnick – github.com/MLnick – developer.ibm.com – Check out the Model Asset Exchange: https://ibm.biz/model-exchange – Sign up for IBM Cloud: https://ibm.biz/BdqdSi
  • 38. References – Efficient Inference in Deep Learning – Where is the Problem?; Analysis of deep neural networks; MobileNets; EfficientNet; Making Neural Nets Work With Low Precision; Speeding up BERT; Distilling the Knowledge in a Neural Network; Once for All: Train One Network and Specialize it for Efficient Deployment; Distiller – PyTorch; TensorFlow Model Optimization; Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding