Spark Technology Center
Deep Neural Network Regression at Scale in MLlib
Jeremy Nixon
Acknowledgements - Built off of work by
Alexander Ulanov and Xiangrui Meng
Structure
1. Introduction / About
2. Motivation
a. Regression
b. Comparison with prominent MLlib algorithms
3. Framing Deep Learning
4. The Model
5. Properties
a. Automated Feature Generation
b. Capable of Learning Non-Linear Structure
c. Non-Local Generalization
6. Applications
7. Features / Usage
8. Optimization
9. Future Work
Jeremy Nixon
● Machine Learning Engineer at the Spark Technology Center
● Contributor to MLlib, dedicated to scalable deep learning
● Previously, studied Applied Mathematics to Computer Science /
Economics at Harvard
Regression Models are Valuable For:
● Location Tracking in Images
● Housing Price Prediction
● Predicting Lifetime value of a customer
● Stock Valuation
● Forecasting Demand for a product
● Pricing Optimization
● Price Sensitivity
● Dynamic Pricing
● Many, many other applications.
Ever trained a Linear Regression Model?
Linear Regression Models
Major Downsides:
Cannot discover non-linear structure in data.
Manual feature engineering by the Data Scientist. This is time consuming and
can be infeasible for high dimensional data.
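The first downside can be illustrated with a minimal sketch. The toy data (y = x^2 on a symmetric range) and the helper names `fit_line` and `mean_squared_error` are assumptions made for illustration; they are not part of MLlib. A straight line fitted to the raw input misses the curve entirely, while the same model fitted to a hand-engineered feature (x^2) fits it exactly, which is the manual work the slide refers to.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Toy data with purely non-linear structure (illustrative assumption).
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x * x for x in xs]

# Linear regression on the raw input: the best line is flat.
slope, intercept = fit_line(xs, ys)
raw_error = mean_squared_error([slope * x + intercept for x in xs], ys)

# Linear regression on a hand-engineered feature x^2: a perfect fit.
squared = [x * x for x in xs]
slope_sq, intercept_sq = fit_line(squared, ys)
engineered_error = mean_squared_error(
    [slope_sq * s + intercept_sq for s in squared], ys
)

print("raw input error:", raw_error)
print("engineered feature error:", engineered_error)
```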
Decision Tree Based Model? (RF, GB)
Decision Tree Models
Upside:
Capable of automatically picking up on non-linear structure.
Downsides:
Incapable of generalizing outside of the range of the input data.
Restricted to cut points for relationships.
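Both downsides can be seen in a minimal sketch of a depth-1 regression tree (a "stump"). This is a hand-written toy, not the MLlib implementation; the names `fit_stump` and `predict_stump` and the data (y = 2x) are assumptions. The tree can only return the mean of a training leaf, so every input beyond the training range receives the same prediction as the largest training input.

```python
def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one cut point and one mean per side.

    Assumes the x values are distinct so that both sides are non-empty.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for x, y in pairs if x <= cut]
        right = [y for x, y in pairs if x > cut]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, cut, left_mean, right_mean)
    _, cut, left_mean, right_mean = best
    return cut, left_mean, right_mean


def predict_stump(model, x):
    """Return the mean of the leaf that x falls into."""
    cut, left_mean, right_mean = model
    return left_mean if x <= cut else right_mean


# Training data follows y = 2x on the range [1, 6] (illustrative assumption).
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [2.0 * x for x in train_x]
stump = fit_stump(train_x, train_y)

inside = predict_stump(stump, 6.0)
outside = predict_stump(stump, 100.0)  # the true value would be 200.0
print("prediction at x=6:", inside)
print("prediction at x=100:", outside)
```

Deeper trees and ensembles (RF, GB) add more cut points, but each prediction is still built from leaf values learned inside the training range.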
Thankfully, there’s an algorithmic solution.
Multilayer Perceptron Regression
- New Algorithm on Spark MLlib -
Deep Feedforward Neural Network for Regression.
Many Successes of Deep Learning
1. CNNs - State of the art
a. Object Recognition
b. Object Localization
c. Image Segmentation
d. Image Restoration
2. RNNs (LSTM) - State of the Art
a. Speech Recognition
b. Question Answering
c. Machine Translation
d. Text Summarization
e. Named Entity Recognition
f. Natural Language Generation
g. Word Sense Disambiguation
h. Image / Video Captioning
i. Sentiment Analysis
Many Ways to Frame Deep Learning
1. Automated Feature Engineering
2. Non-local generalization
3. Manifold Learning
4. Exponentially structured flexibility countering curse of dimensionality
5. Hierarchical Abstraction
6. Learning Representation / Input Space Contortion / Transformation for
Linear Separability
7. Extreme model flexibility leading to the ability to absorb much larger data
without penalty
Properties
Overview
1. Automated Feature Generation
2. Capable of Learning Non-linear Structure
3. Generalization outside input data range
The Model
X = Normalized Data; W1, W2 = Weights; b1, b2 = Biases
Forward:
1. Multiply data by first layer weights and add the bias | X*W1 + b1
2. Put output through non-linear activation | max(0, X*W1 + b1)
3. Multiply output by second layer weights and add the bias | max(0, X*W1 + b1) * W2 + b2
4. Return predicted output
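The forward pass above can be sketched in plain Python. This is an illustration of the four steps under stated assumptions, not the MLlib code: the helper names (`matmul`, `add_bias`, `relu`, `forward`) and the small weight values are made up for the example, and matrices are represented as lists of rows.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add_bias(matrix, bias):
    """Add a bias vector to every row of the matrix."""
    return [[value + b for value, b in zip(row, bias)] for row in matrix]


def relu(matrix):
    """Element-wise max(0, value)."""
    return [[max(0.0, value) for value in row] for row in matrix]


def forward(x, w1, b1, w2, b2):
    hidden_input = add_bias(matmul(x, w1), b1)  # step 1: X*W1 + b1
    hidden = relu(hidden_input)                 # step 2: max(0, X*W1 + b1)
    output = add_bias(matmul(hidden, w2), b2)   # step 3: hidden * W2 + b2
    return output                               # step 4: predicted output


# Two examples with two features each; values are arbitrary (assumption).
x = [[1.0, 2.0], [-1.0, 0.5]]
w1 = [[1.0, -1.0], [0.5, 1.0]]  # 2 inputs -> 2 hidden units
b1 = [0.0, 1.0]
w2 = [[2.0], [-1.0]]            # 2 hidden units -> 1 output
b2 = [0.5]

predictions = forward(x, w1, b1, w2, b2)
print(predictions)
```

For a regression network the last layer has no activation, so the output can take any real value.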
Automated Feature Generation
● Pixel - Edges - Shapes - Parts - Objects : Prediction
● Learns features that are optimized for the data
Capable of Learning Non-Linear Structure
Generalization Outside Data Range
DNN Regression Applications
Great results in:
● Computer Vision
○ Object Localization / Detection as DNN Regression
○ Self-driving Steering Command Prediction
○ Human Pose Regression
● Finance
○ Currency Exchange Rate
○ Stock Price Prediction
○ Forecasting Financial Time Series
○ Crude Oil Price Prediction
DNN Regression Applications
Great results in:
● Atmospheric Sciences
○ Air Quality Prediction
○ Carbon Dioxide Pollution Prediction
○ Ozone Concentration Modeling
○ Sulphur Dioxide Concentration Prediction
● Infrastructure
○ Road Tunnel Cost Estimation
○ Highway Engineering Cost Estimation
● Geology / Physics
○ Meteorology and Oceanography Application
○ Pacific Sea Surface Temperature Prediction
○ Hydrological Modeling
Features of DNNR
1. Automatically Scaling Output Labels
2. Pipeline API Integration
3. Save / Load Models Automatically
4. Gradient Descent and L-BFGS
5. Tanh and Relu Activation Functions
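Items 1 and 5 can be sketched as follows. The slides do not say which scaling method is used for the output labels, so min-max scaling to [0, 1] is an assumption here, and the function names (`fit_label_scaler`, `scale_label`, `unscale_prediction`) are hypothetical rather than MLlib APIs. The point is only that labels are mapped to a small range for training and predictions are mapped back afterwards.

```python
import math


def fit_label_scaler(labels):
    """Record the minimum and range of the training labels (min-max assumed)."""
    low, high = min(labels), max(labels)
    span = high - low if high > low else 1.0  # avoid division by zero
    return low, span


def scale_label(label, scaler):
    """Map a raw label into [0, 1] for training."""
    low, span = scaler
    return (label - low) / span


def unscale_prediction(prediction, scaler):
    """Map a network output back to the original label units."""
    low, span = scaler
    return prediction * span + low


def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def tanh(z):
    """Hyperbolic tangent, bounded in (-1, 1)."""
    return math.tanh(z)


# Housing prices as example labels (values are made up).
prices = [100000.0, 250000.0, 400000.0]
scaler = fit_label_scaler(prices)
scaled = [scale_label(p, scaler) for p in prices]
restored = [unscale_prediction(s, scaler) for s in scaled]

print("scaled labels:", scaled)
print("restored labels:", restored)
print("relu(-2.0) =", relu(-2.0), " tanh(0.0) =", tanh(0.0))
```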
Optimization
Loss Function
We compute our errors (difference between our predictions and the real
outcome) using the mean squared error function:
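The formula on the original slide did not survive extraction. The standard definition of mean squared error is given below; the symbols (n for the number of examples, y_i for the real outcome, \hat{y}_i for the prediction) are notational assumptions, and some implementations include an extra constant factor such as 1/2.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2
```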
Optimization
Parallel implementation of backpropagation:
1. Each worker gets weights from master node.
2. Each worker computes a gradient on its data.
3. Each worker sends gradient to master.
4. Master averages the gradients and updates the weights.
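The four steps above can be simulated in a single process. This is a sketch under stated assumptions, not the MLlib implementation: a one-feature linear model with MSE loss stands in for the network's backpropagated gradient, each inner list plays the role of one worker's partition, and the names `local_gradient` and `master_step` are hypothetical.

```python
def local_gradient(weights, partition):
    """Step 2: one worker computes the MSE gradient on its own data.

    The model is y = w * x + b, used here as a stand-in for the network.
    """
    w, b = weights
    n = len(partition)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in partition) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in partition) / n
    return grad_w, grad_b


def master_step(weights, partitions, learning_rate):
    # Steps 1-3: every worker receives the current weights and
    # sends back the gradient computed on its partition.
    gradients = [local_gradient(weights, part) for part in partitions]
    # Step 4: the master averages the gradients and updates the weights.
    avg_w = sum(g[0] for g in gradients) / len(gradients)
    avg_b = sum(g[1] for g in gradients) / len(gradients)
    return (weights[0] - learning_rate * avg_w,
            weights[1] - learning_rate * avg_b)


# Two equally sized "worker" partitions of data generated by y = 2x + 1.
partitions = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
]

weights = (0.0, 0.0)
for _ in range(2000):
    weights = master_step(weights, partitions, learning_rate=0.05)

print("learned weights:", weights)
```

With equally sized partitions, the average of the per-worker gradients equals the gradient over the full data set, so the update matches single-machine gradient descent. Only the gradients and weights cross the network, which is where the communication cost comes from.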
Performance
● Parallel MLP on Spark with 7 nodes ~= Caffe w/GPU (single node).
● Advantages to parallelism diminish with additional nodes due to
communication costs.
● Additional workers are valuable up to ~20 workers.
● See https://github.com/avulanov/ann-benchmark for more details
Future Work
1. Convolutional Neural Networks
a. Convolutional Layer Type
b. Max Pooling Layer Type
2. Flexible Deep Learning API
3. More Modern Optimizers
a. Adam
b. Adadelta + Nesterov Momentum
4. More Modern activations
5. Dropout / L2 Regularization
6. Batch Normalization
7. Tensor Support
8. Recurrent Neural Networks (LSTM)
References
● Detection as DNN Regression: http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf
● Object Localization: http://arxiv.org/pdf/1312.6229v4.pdf
● Pose Regression: https://www.robots.ox.ac.uk/~vgg/publications/2014/Pfister14a/pfister14a.pdf
● Currency Exchange Rate: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.52.2442
● Stock Price Prediction: https://arxiv.org/pdf/1003.1457.pdf
● Forecasting Financial Time Series: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.15.8688&rep=rep1&type=pdf
● Crude Oil Price Prediction: http://www.sciencedirect.com/science/article/pii/S0140988308000765
● Air Quality Prediction: https://www.researchgate.net/profile/VR_Prybutok/publication/8612909_Prybutok_R._A_neural_network_model_forecasting_for_prediction_of_daily_maximum_ozone_concentration_in_an_industrialized_urban_area._Environ._Pollut._92(3)_349-357/links/0deec53babcab9c32f000000.pdf
● Air Pollution Prediction - Carbon Dioxide: http://202.116.197.15/cadalcanton/Fulltext/21276_2014319_102457_186.pdf
● Atmospheric Sulphur Dioxide Concentrations: http://cdn.intechweb.org/pdfs/17396.pdf
● Ozone Concentration Comparison: https://www.researchgate.net/publication/263416130_Statistical_Surface_Ozone_Models_An_Improved_Methodology_to_Account_for_Non-Linear_Behaviour
● Road Tunnel Cost Estimation: http://ascelibrary.org/doi/abs/10.1061/(ASCE)CO.1943-7862.0000479
● Highway Engineering Cost Estimation: http://www.jcomputers.us/vol5/jcp0511-19.pdf
● Pacific Sea Surface Temperature: http://www.ncbi.nlm.nih.gov/pubmed/16527455
● Meteorology and Oceanography: https://open.library.ubc.ca/cIRcle/collections/facultyresearchandpublications/32536/items/1.0041821
● Hydrological Modeling: http://hydrol-earth-syst-sci.net/13/1607/2009/hess-13-1607-2009.pdf
Thank You!
Questions?
Acknowledgements:
Built off of work by
Alexander Ulanov, Bert Grenovich and Xiangrui Meng
