Are you sure about that!?
Uncertainty Quantification in AI
Florian Wilhelm
Berlin, October 10th 2019
Dr. Florian Wilhelm
Principal Data Scientist @ inovex
@FlorianWilhelm
FlorianWilhelm
florianwilhelm.info
Mathematical Modelling
Data Science to Production
Recommender Systems
Uncertainty Quantification & Causality
Python Data Stack
Maintainer PyScaffold
Simon Bachstein
Data Scientist @ inovex
2018/07 – 2019/01 Master Thesis at inovex:
Uncertainty Quantification in Deep Learning
• Blogpost:
http://inovex.de/blog/uncertainty-quantification-deep-learning
• Master Thesis:
https://sbachstein.de/master_thesis.pdf
@simonbachstein
sbachstein
sbachstein.de
Agenda
1. Motivation
2. Methods
   a. Gaussian Processes
   b. Monte-Carlo Dropout
   c. Deep Ensembles
   d. Dropout Ensembles
   e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Motivation
Deep Networks cannot look beyond their horizon
90% cat, 10% dog
40% cat, 60% dog
?
Learning and the Unknown
Boult, T. E., Cruz, S., Dhamija, A., Gunther, M., Henrydoss, J., & Scheirer, W. (2019). Learning and the Unknown: Surveying Steps Toward Open World Recognition. AAAI, 1–8. Retrieved from www.aaai.org
Simple Regression Problem
Interpolation
Deep Networks don’t extrapolate
Neural Arithmetic Logic Units, NIPS'18, Andrew Trask et al.
Uncertainty about interpolation and extrapolation
Types of Uncertainty
Methods for Uncertainty Quantification
Relaxation of mathematical assumptions about data
Gaussian Processes
Deep Ensembles / Dropout Ensembles
Quantile Regression
Monte Carlo Dropout
Gaussian Processes
Definition
A Gaussian Process can be thought of as a random function
which is defined by its mean and covariance functions
Example
Inference
Inference with perfect interpolation
Inference with noisy observations
Inference
Inference using given data points can be done analytically. For
example, when assuming the (prior) mean function to be zero
everywhere, we get:
Good introduction:
Bayesian Non-parametric Models for Data Science using PyMC by Christopher Fonnesbeck
• https://www.youtube.com/watch?v=-sIOMs4MSuA
• https://de.slideshare.net/mlreview/bayesian-nonparametric-models-for-data-science-using-pymc
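Computationally intense: exact inference solves a linear system in the n × n covariance matrix of the n training points, which costs O(n³)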
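A minimal NumPy sketch of the analytic inference step, assuming a zero prior mean, a squared exponential covariance, and Gaussian observation noise. With K the covariance of the training inputs, k* the covariance between training and test inputs, and σ² the noise variance, the standard posterior is mean = k*ᵀ (K + σ²I)⁻¹ y and variance = k(x*, x*) − k*ᵀ (K + σ²I)⁻¹ k*. The function names, hyperparameter values, and toy data are illustrative assumptions, not taken from the talk.

```python
import numpy as np


def squared_exponential(a, b, length_scale=1.0, signal_var=1.0):
    """Squared exponential covariance k(a, b) for 1-D inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    k_train = squared_exponential(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_cross = squared_exponential(x_train, x_test)
    k_test = squared_exponential(x_test, x_test)
    # Cholesky factorisation instead of an explicit inverse; this is the
    # O(n^3) step that makes exact GPs expensive for large n.
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_cross)
    mean = k_cross.T @ alpha
    var = np.diag(k_test) - np.sum(v**2, axis=0)
    return mean, var


# Toy data (assumed): three observations of sin(x).
x_train = np.array([-1.0, 0.0, 1.0])
y_train = np.sin(x_train)
x_test = np.array([0.0, 5.0])  # one point inside the data, one far outside
mean, var = gp_posterior(x_train, y_train, x_test)
```

Far away from the training data the posterior falls back to the prior: the mean returns to zero and the variance to the prior variance, which is how a GP signals that it is extrapolating.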
MC Dropout
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016, Yarin Gal et al.
...
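The idea behind MC Dropout can be sketched as follows: dropout stays active at prediction time, the same input is passed through the network many times, and the spread of the outputs serves as an uncertainty estimate. The sketch below uses plain NumPy with random, untrained weights for a 1 → 20 → 20 → 1 ReLU network; the weights, the dropout rate, and all function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, untrained weights for a 1 -> 20 -> 20 -> 1 ReLU network.
w1, b1 = rng.normal(size=(1, 20)), np.zeros(20)
w2, b2 = rng.normal(size=(20, 20)) / np.sqrt(20), np.zeros(20)
w3, b3 = rng.normal(size=(20, 1)) / np.sqrt(20), np.zeros(1)


def dropout(h, p_drop):
    """Inverted dropout: zero units with probability p_drop, rescale the rest."""
    mask = rng.random(h.shape) >= p_drop
    return h * mask / (1.0 - p_drop)


def stochastic_forward(x, p_drop):
    """One forward pass with dropout still active after each hidden layer."""
    h = dropout(np.maximum(x @ w1 + b1, 0.0), p_drop)
    h = dropout(np.maximum(h @ w2 + b2, 0.0), p_drop)
    return (h @ w3 + b3).ravel()


def mc_dropout_predict(x, n_passes=100, p_drop=0.1):
    """Mean and standard deviation over repeated stochastic passes."""
    samples = np.stack([stochastic_forward(x, p_drop) for _ in range(n_passes)])
    return samples.mean(axis=0), samples.std(axis=0)


x = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
pred_mean, pred_std = mc_dropout_predict(x)
```

With `p_drop=0.0` every pass is identical and the standard deviation collapses to zero, so all of the reported spread comes from the dropout masks.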
Deep Ensembles
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et al.
Capture uncertainty directly at training time
Custom loss function:
Combine an ensemble of networks
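A minimal sketch of the two ingredients named above, assuming each network outputs a mean and a variance per input. The loss is the Gaussian negative log-likelihood, and the ensemble is combined as a uniform mixture of Gaussians, as in the cited paper. Reading the average predicted variance as aleatoric and the spread of the means as epistemic uncertainty is a common interpretation rather than something stated on the slide; the numbers are made up for illustration.

```python
import numpy as np


def gaussian_nll(y, mu, var):
    """Per-sample negative log-likelihood of y under N(mu, var)."""
    return 0.5 * np.log(2.0 * np.pi * var) + 0.5 * (y - mu) ** 2 / var


def combine_ensemble(mus, variances):
    """Treat M Gaussian predictions as a uniform mixture.

    mus, variances: arrays of shape (M, n_points).
    Returns mixture mean, total variance, and the two variance parts.
    """
    mean = mus.mean(axis=0)
    aleatoric = variances.mean(axis=0)  # average predicted noise
    epistemic = mus.var(axis=0)         # disagreement between members
    return mean, aleatoric + epistemic, aleatoric, epistemic


# Hypothetical outputs of 5 networks at 2 inputs: the members agree on the
# first input and disagree strongly on the second.
mus = np.array([[1.0, 0.0], [1.1, 2.0], [0.9, -2.0], [1.0, 1.0], [1.0, -1.0]])
variances = np.full((5, 2), 0.04)
mean, total_var, aleatoric, epistemic = combine_ensemble(mus, variances)
```

The total variance equals the mixture variance, mean(σ² + μ²) − mean(μ)², just rearranged so that the two sources of uncertainty can be reported separately.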
Dropout Ensembles
The best of both worlds?
...
Using the cumulative distribution function (cdf) of a random variable Y, we
define the quantile:
Loss function to estimate quantile:
Quantile Regression
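For reference, the quantile and its loss function are commonly written as below, with F_Y the cdf of Y, τ the quantile level, and q̂ the candidate quantile. The notation is an assumption chosen to match the worked errors in the intuition slides, where errors on the side y > q̂ are weighted by τ and errors on the side y ≤ q̂ by 1 − τ.

```latex
q_\tau = F_Y^{-1}(\tau) = \inf\{\, y : F_Y(y) \ge \tau \,\}, \qquad \tau \in (0, 1)

\mathcal{L}_\tau(y, \hat{q}) =
\begin{cases}
  \tau \,(y - \hat{q})        & \text{if } y > \hat{q} \\
  (1 - \tau)\,(\hat{q} - y)   & \text{if } y \le \hat{q}
\end{cases}
```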
Intuition behind Quantile Regression
[Figure: data points on a number line from 0.0 to 1.0; a candidate q̂_τ(x) splits them into y ≤ q̂_τ(x) (left) and y > q̂_τ(x) (right); the distances of the points to the candidate are listed per step]
Assume median is here (τ = 0.5)
› Step 1: left distances none, right distances 0.1, 0.2, 0.5, 0.8. Error: 0.0 + 1.6 = 1.6
› Step 2: left distances 0.0, right distances 0.1, 0.4, 0.7. Error: 0.0 + 1.2 = 1.2
› Step 3: left distances 0.1, 0.0, right distances 0.3, 0.6. Error: 0.1 + 0.9 = 1.0
› Step 4: left distances 0.2, 0.1 (+0.1 each), right distances 0.2, 0.5 (-0.1 each). Error: 0.3 + 0.7 = 1.0
No change due to the linearity of the error!
Now the 0.75 Quantile
› Assume τ = 0.75 is here: left distances 0.2, 0.1, right distances 0.2, 0.5. Error: (1 − 0.75) ⋅ 0.3 + 0.75 ⋅ 0.7 = 0.6
Right-side error weighs 3 times as much as the left-side error
› τ = 0.75, candidate moved further right: left distances 0.5, 0.4, 0.1, right distance 0.2. Error: (1 − 0.75) ⋅ 1.0 + 0.75 ⋅ 0.2 = 0.4
Change in the right-side error also weighs 3 times as much as the left-side error
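The errors in the steps above can be reproduced with a few lines of NumPy. The data positions are relative coordinates inferred from the listed distances (an assumption, since the absolute positions on the number line are not recoverable), and `pinball_parts` is an illustrative helper name. For τ = 0.5 the slides add the raw distances, which is twice the weighted quantile loss.

```python
import numpy as np


def pinball_parts(y, q_hat, tau):
    """Weighted left (y <= q_hat) and right (y > q_hat) error sums."""
    left = np.sum(np.clip(q_hat - y, 0.0, None))
    right = np.sum(np.clip(y - q_hat, 0.0, None))
    return (1.0 - tau) * left, tau * right


# Relative positions inferred from the distances in the steps (assumption).
y = np.array([0.0, 0.1, 0.4, 0.7])

# tau = 0.5: both sides carry the same weight of 0.5, so multiplying by 2
# gives the plain sum of distances used in the median steps.
median_errors = [2.0 * sum(pinball_parts(y, q, 0.5)) for q in (-0.1, 0.0, 0.1, 0.2)]

# tau = 0.75: the right side is weighted 0.75, the left side 0.25.
q75_errors = [sum(pinball_parts(y, q, 0.75)) for q in (0.2, 0.5)]
```

Moving the candidate between two neighbouring data points leaves the median error unchanged because two distances grow while two shrink by the same amount. For τ = 0.75 the error keeps falling until three of the four points lie to the left of the candidate.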
According to the function
Samples are generated as follows:
Experiments
Uncertainty in Deep Learning (PhD thesis), Yarin Gal, http://mlg.eng.cam.ac.uk/yarin/blog_2248.html
Dataset
Network setup and hyperparameters
Neural networks
› 2 hidden layers with 20 ReLU neurons each
› 5 networks for Deep Ensembles
› 100 iterations for Dropout predictions
› Adam optimizer with batch size of 128
› Learning rate, weight decay, and dropout probability are optimized
Gaussian Processes
› squared exponential covariance function and zero-mean prior
› covariance function parameters and aleatoric noise are optimized
Measures for generalization quality
› Mean squared error (MSE)
› Mean negative log likelihood (MNLL)
› Mean Kullback-Leibler (KL) divergence
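A sketch of how the three measures could be computed, assuming every model outputs a Gaussian predictive distribution (mean and standard deviation per test point) and that the true conditional distribution is Gaussian as well. The direction of the KL divergence, from the true to the predicted distribution, is an assumption, as are the function names.

```python
import numpy as np


def mse(y, mu):
    """Mean squared error of the predicted means."""
    return np.mean((y - mu) ** 2)


def mean_gaussian_nll(y, mu, sigma):
    """Mean negative log-likelihood of y under N(mu, sigma^2)."""
    return np.mean(0.5 * np.log(2.0 * np.pi * sigma**2) + 0.5 * ((y - mu) / sigma) ** 2)


def mean_gaussian_kl(mu_true, sigma_true, mu_pred, sigma_pred):
    """Mean KL( N(mu_true, sigma_true^2) || N(mu_pred, sigma_pred^2) )."""
    kl = (
        np.log(sigma_pred / sigma_true)
        + (sigma_true**2 + (mu_true - mu_pred) ** 2) / (2.0 * sigma_pred**2)
        - 0.5
    )
    return np.mean(kl)


# Toy check with made-up values.
y = np.array([0.0, 1.0])
ones = np.ones(2)
perfect_kl = mean_gaussian_kl(y, ones, y, ones)
shifted_kl = mean_gaussian_kl(y, ones, y + 1.0, ones)
```

MSE only scores the predicted mean, while MNLL and the KL divergence also penalise a predicted variance that is too small or too large.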
Interpolation
They still don’t extrapolate, and they don’t quite realize it
Gaussian Process
Convergence
Heteroscedastic noise
Non-Gaussian noise
Uncertainty split: aleatoric, epistemic
Summary
                        GP      MCD     DeepE     DropoutE  QR
Homoscedastic noise     ++      o       +         o         o
Heteroscedastic noise   --      -       ++        +         +
Non-Gaussian noise      +       o       +         +         -
Convergence             ++      -       +         -         +
Speed                   (--)    +       - / (+)   +         ++
Uncertainty split       yes     no      yes       yes       no
Conclusion
There is work to be done
› The neural network approaches discussed here are very aware of aleatoric uncertainty but are not capable of correctly estimating epistemic uncertainty
› Gaussian Processes give clear signals about ignorance but do not scale
A combined solution needs to be developed because uncertainty estimation is needed in critical applications
Outlook
Other approaches
› Bayesian Neural Networks (e.g. with PyMC)
› Sparse Gaussian Process approximations
› Gaussian Processes on top of neural networks
Thank You!
Florian Wilhelm
Principal Data Scientist
inovex GmbH
Schanzenstraße 6-20
Kupferhütte 1.13
51063 Köln
florian.wilhelm@inovex.de

CebaBaby dropshipping via API with DroFX.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 

Uncertainty Quantification in AI

  • 1. Are you sure about that!? Uncertainty Quantification in AI. Florian Wilhelm, Berlin, October 10th 2019
  • 2. Dr. Florian Wilhelm, Principal Data Scientist @ inovex. @FlorianWilhelm, FlorianWilhelm, florianwilhelm.info. Mathematical Modelling; Data Science to Production; Recommender Systems; Uncertainty Quantification & Causality; Python Data Stack; Maintainer of PyScaffold
  • 3. Simon Bachstein, Data Scientist @ inovex. 2018/07 – 2019/01 Master Thesis at inovex: Uncertainty Quantification in Deep Learning. Blogpost: http://inovex.de/blog/uncertainty-quantification-deep-learning. Master Thesis: https://sbachstein.de/master_thesis.pdf. @simonbachstein, sbachstein, sbachstein.de
  • 4. Agenda: 1. Motivation; 2. Methods (a. Gaussian Processes, b. Monte-Carlo Dropout, c. Deep Ensembles, d. Dropout Ensembles, e. Quantile Regression); 3. Experiments; 4. Conclusion & Outlook
  • 5. Motivation: Deep Networks cannot look beyond their horizon. 90% cat, 10% dog
  • 6. Motivation: Deep Networks cannot look beyond their horizon. 40% cat, 60% dog
  • 7. Motivation: Deep Networks cannot look beyond their horizon. ?
  • 8. Learning and the Unknown. Boult, T. E., Cruz, S., Dhamija, A., Gunther, M., Henrydoss, J., & Scheirer, W. (2019). Learning and the Unknown: Surveying Steps Toward Open World Recognition. AAAI, 1–8. Retrieved from www.aaai.org
  • 10. Simple Regression Problem: Deep Networks don’t extrapolate (Neural Arithmetic Logic Units, NIPS'18, Andrew Trask et al.)
  • 11. Simple Regression Problem: Deep Networks don’t extrapolate
  • 12. Simple Regression Problem: Uncertainty about interpolation and extrapolation
  • 15. Methods for Uncertainty Quantification (diagram axis: relaxation of mathematical assumptions about data): Gaussian Processes; Deep Ensembles / Dropout Ensembles; Quantile Regression; Monte Carlo Dropout
  • 17. Gaussian Processes, Definition: A Gaussian Process can be thought of as a random function that is defined by its mean and covariance functions.
  • 21. Gaussian Processes: Inference with perfect interpolation
  • 23. Gaussian Processes, Inference: Inference using given data points can be done analytically, but it is computationally intense. For example, when assuming the (prior) mean function to be zero everywhere, we get a closed-form posterior (the formula itself is not preserved in this transcript). Good introduction: Bayesian Non-parametric Models for Data Science using PyMC by Christopher Fonnesbeck, https://www.youtube.com/watch?v=-sIOMs4MSuA and https://de.slideshare.net/mlreview/bayesian-nonparametric-models-for-data-science-using-pymc
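The analytic inference step on slide 23 can be written out with the textbook zero-mean Gaussian Process regression result. The notation here is an assumption, not taken from the slides: X are the training inputs, y the training targets, X_* the test inputs, k the covariance function and σ_n² the noise variance.

```latex
% Zero-mean GP prior, kernel k, Gaussian observation noise \sigma_n^2.
% K = k(X, X), \quad K_* = k(X, X_*), \quad K_{**} = k(X_*, X_*)
\begin{aligned}
  \mu_*    &= K_*^\top \left(K + \sigma_n^2 I\right)^{-1} y, \\
  \Sigma_* &= K_{**} - K_*^\top \left(K + \sigma_n^2 I\right)^{-1} K_* .
\end{aligned}
% Squared exponential covariance (the kernel named in the experiment setup):
% k(x, x') = \sigma_f^2 \exp\!\left(-\frac{(x - x')^2}{2\ell^2}\right)
```

Setting σ_n² = 0 makes the posterior mean pass exactly through the training points, which matches the "perfect interpolation" case on slide 21. Solving the linear system with the n × n matrix K + σ_n² I costs on the order of n³ operations, which is the usual reason GP inference is called computationally intense.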
  • 25. MC Dropout (Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016, Yarin Gal et al.) ...
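The idea behind MC Dropout can be sketched in a few lines: dropout stays active at prediction time, the same input is passed through the network many times, and the spread of the outputs serves as an uncertainty estimate. The following is a minimal standard-library sketch, not the authors' implementation. The one-hidden-layer network, the weights in `TOY_PARAMS` and all function names are hypothetical; the default of 100 passes mirrors the "100 iterations for Dropout predictions" in the experiment setup.

```python
import random
import statistics


def forward(x, w1, b1, w2, b2, p_drop, rng):
    """One stochastic forward pass of a tiny ReLU network with dropout kept on."""
    hidden = [max(0.0, w * x + b) for w, b in zip(w1, b1)]
    keep = 1.0 - p_drop
    # Inverted dropout: drop each hidden unit with probability p_drop and
    # rescale the survivors so the expected activation is unchanged.
    hidden = [h / keep if rng.random() < keep else 0.0 for h in hidden]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def mc_dropout_predict(x, params, p_drop=0.1, n_passes=100, seed=0):
    """Return mean and variance of n_passes stochastic predictions for x."""
    rng = random.Random(seed)
    samples = [forward(x, *params, p_drop, rng) for _ in range(n_passes)]
    return statistics.fmean(samples), statistics.pvariance(samples)


# Hypothetical, hand-picked weights (w1, b1, w2, b2); a real model would be trained.
TOY_PARAMS = ([1.0, -0.5, 0.3], [0.0, 0.5, -0.1], [0.7, -1.2, 0.4], 0.1)

if __name__ == "__main__":
    mean, var = mc_dropout_predict(1.0, TOY_PARAMS, p_drop=0.5)
    print(f"predictive mean={mean:.3f}, predictive variance={var:.3f}")
```

With `p_drop=0.0` every pass is identical and the variance collapses to zero, so the uncertainty estimate comes entirely from the dropout noise.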
  • 27. Deep Ensembles (Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et al.): Custom loss function that captures uncertainty directly at training time
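The custom loss on slide 27 is not preserved in the transcript. In the cited Deep Ensembles paper the network predicts a mean and a variance for each input and is trained with the Gaussian negative log-likelihood; the sketch below assumes that setup. The function names and the softplus mapping to a positive variance are illustrative choices, not a reproduction of the slide.

```python
import math


def softplus(z):
    """Map an unconstrained network output to a positive value."""
    return math.log1p(math.exp(z))


def gaussian_nll(y, mu, var):
    """Negative log-likelihood of y under N(mu, var); var must be positive."""
    return 0.5 * math.log(2.0 * math.pi * var) + (y - mu) ** 2 / (2.0 * var)


if __name__ == "__main__":
    y, mu = 1.0, 0.0  # the prediction misses the target by 1.0
    for raw in (-4.0, 0.5, 3.0):
        var = softplus(raw) + 1e-6  # small floor for numerical stability
        print(f"var={var:.3f}  nll={gaussian_nll(y, mu, var):.3f}")
```

For a fixed prediction error, the loss is smallest when the predicted variance equals the squared error. Claiming a tiny variance while being wrong is penalized heavily, which is how the loss pushes the network to report its own noise level.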
  • 28. Deep Ensembles (Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et al.): Combine an ensemble of networks
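One way to combine the ensemble members, following the cited paper, is to treat their Gaussian predictions as a uniformly weighted mixture and match its mean and variance. The sketch below assumes each member returns a mean and a variance for the same input. The function name and the labels "aleatoric" and "epistemic" for the two variance terms are a common interpretation, not something stated on the slide.

```python
def combine_ensemble(mus, variances):
    """Moment-match a uniform mixture of Gaussians N(mu_i, var_i).

    Returns (mean, total_variance, aleatoric_part, epistemic_part), where
    total_variance = mean(var_i + mu_i**2) - mean**2.
    """
    m = len(mus)
    mean = sum(mus) / m
    aleatoric = sum(variances) / m                        # average predicted noise
    epistemic = sum((mu - mean) ** 2 for mu in mus) / m   # disagreement between members
    return mean, aleatoric + epistemic, aleatoric, epistemic


if __name__ == "__main__":
    # Hypothetical predictions of five networks for one input.
    mus = [1.9, 2.1, 2.0, 2.4, 1.6]
    variances = [0.30, 0.25, 0.35, 0.30, 0.30]
    mean, total, aleatoric, epistemic = combine_ensemble(mus, variances)
    print(f"mean={mean:.2f} total={total:.3f} "
          f"aleatoric={aleatoric:.3f} epistemic={epistemic:.3f}")
```

The split into two terms is what the "Uncertainty split" row of the summary refers to: when all members agree, the epistemic part vanishes and only the predicted noise remains.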
  • 30. Dropout Ensembles: The best of both worlds? ...
  • 32. Quantile Regression: Using the cumulative distribution function (cdf) of a random variable Y, we define the quantile. A loss function is then used to estimate the quantile. (Both formulas are not preserved in this transcript.)
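The two definitions slide 32 refers to are standard and can be stated as follows. The symbols F_Y, q_τ and \hat{q}_\tau(x) are assumed notation; the weighting agrees with the worked numbers on the later slides, where the side y > \hat{q}_\tau(x) is weighted by τ and the side y ≤ \hat{q}_\tau(x) by 1 − τ.

```latex
% Quantile of Y at level \tau \in (0, 1), defined through the cdf F_Y:
q_\tau = F_Y^{-1}(\tau) = \inf\{\, y : F_Y(y) \ge \tau \,\}

% Quantile (pinball) loss of an estimate \hat{q}_\tau(x) for an observation y:
\mathcal{L}_\tau\bigl(y, \hat{q}_\tau(x)\bigr) =
\begin{cases}
  \tau \,\bigl(y - \hat{q}_\tau(x)\bigr)       & \text{if } y > \hat{q}_\tau(x), \\
  (1 - \tau)\,\bigl(\hat{q}_\tau(x) - y\bigr)  & \text{if } y \le \hat{q}_\tau(x).
\end{cases}
```

For τ = 0.5 the loss is half the absolute error, so minimizing it yields the median.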
  • 33. Intuition behind Quantile Regression: Assume the median (τ = 0.5) is at the marked position. Distances to the data points: 0.1, 0.2, 0.5, 0.8, all on the side y > q̂_τ(x). Error: 0.0 + 1.6 = 1.6
  • 34. Intuition behind Quantile Regression: Assume the median (τ = 0.5) is at the marked position. Distances to the data points: 0.0, 0.1, 0.4, 0.7. Error: 0.0 + 1.2 = 1.2
  • 35. Intuition behind Quantile Regression: Assume the median (τ = 0.5) is at the marked position. Distances to the data points: 0.1 on the side y ≤ q̂_τ(x); 0.0, 0.3, 0.6 otherwise. Error: 0.1 + 0.9 = 1.0
  • 36. Intuition behind Quantile Regression: Assume the median (τ = 0.5) is at the marked position. Distances to the data points: 0.2, 0.1 on the side y ≤ q̂_τ(x); 0.2, 0.5 on the side y > q̂_τ(x). Error: 0.3 + 0.7 = 1.0. Two distances change by +0.1 and two by −0.1, so there is no change due to the linearity of the error!
  • 37. Now the 0.75th Quantile: Assume τ = 0.75 is at the marked position. Distances to the data points: 0.2, 0.1 on the side y ≤ q̂_τ(x); 0.2, 0.5 on the side y > q̂_τ(x). Error: (1 − 0.75) ⋅ 0.3 + 0.75 ⋅ 0.7 = 0.6. The right-side error weighs 3 times as much as the left-side error.
  • 38. Now the 0.75th Quantile: τ = 0.75. Distances to the data points: 0.5, 0.4, 0.1 on the side y ≤ q̂_τ(x); 0.2 on the side y > q̂_τ(x). Error: (1 − 0.75) ⋅ 1.0 + 0.75 ⋅ 0.2 = 0.4. A change in the right-side error also weighs 3 times as much as a change in the left-side error.
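The arithmetic on slides 33 to 38 can be reproduced with a small pinball-loss function. The absolute positions in `POINTS` are an assumption chosen so that the pairwise distances match the slides; only the distances and error totals come from the deck. The function names are illustrative.

```python
import math


def pinball_loss(y, q_hat, tau):
    """Quantile (pinball) loss of one observation y for the estimate q_hat."""
    if y > q_hat:
        return tau * (y - q_hat)        # estimate too low: weight tau
    return (1.0 - tau) * (q_hat - y)    # estimate too high or exact: weight 1 - tau


def total_loss(points, q_hat, tau):
    """Sum of pinball losses over all observations."""
    return sum(pinball_loss(y, q_hat, tau) for y in points)


# Hypothetical positions; their spacing reproduces the distances on the slides.
POINTS = [0.2, 0.3, 0.6, 0.9]

if __name__ == "__main__":
    # Slides 37 and 38: tau = 0.75 with the estimate at two candidate positions.
    for q_hat in (0.4, 0.7):
        print(q_hat, round(total_loss(POINTS, q_hat, 0.75), 3))
    # Slides 33 to 36 report plain absolute errors for the median,
    # which equal twice the pinball loss at tau = 0.5.
    for q_hat in (0.1, 0.2, 0.3, 0.4):
        print(q_hat, round(2.0 * total_loss(POINTS, q_hat, 0.5), 3))
```

With three points on the left and one on the right, a small shift of the estimate changes the left error by 3 ⋅ (1 − 0.75) and the right error by 1 ⋅ 0.75 per unit, which cancel. This is the same linearity argument as on slide 36, now with unequal weights.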
  • 40. Experiments, Dataset: Samples are generated according to a given function (the function and the sampling scheme are not preserved in this transcript). Source: Uncertainty in Deep Learning (PhD thesis), Yarin Gal, http://mlg.eng.cam.ac.uk/yarin/blog_2248.html
  • 42. Experiments, Network setup and hyperparameters
      Neural networks
      › 2 hidden layers with 20 ReLU neurons each
      › 5 networks for Deep Ensembles
      › 100 iterations for Dropout predictions
      › Adam optimizer with batch size of 128
      › Learning rate (LR), weight decay and dropout probability are optimized
      Gaussian Processes
      › squared exponential covariance and zero mean function prior
      › covariance function parameters and aleatory noise are optimized
  • 43. Experiments, Measures for generalization quality: Mean squared error (MSE); Mean negative log likelihood (MNLL); Mean Kullback-Leibler (KL) divergence
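The three measures on slide 43 can be sketched for the case where both the predictive distribution and the true conditional distribution at each test point are Gaussian. That Gaussian assumption, the direction of the KL divergence (true distribution first) and all function names are assumptions made for illustration; the slides do not spell them out.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between targets and point predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_nll(y_true, mus, variances):
    """Mean negative log-likelihood under Gaussian predictions N(mu, var)."""
    terms = [
        0.5 * math.log(2.0 * math.pi * v) + (t - m) ** 2 / (2.0 * v)
        for t, m, v in zip(y_true, mus, variances)
    ]
    return sum(terms) / len(terms)


def kl_gauss(mu_p, var_p, mu_q, var_q):
    """KL( N(mu_p, var_p) || N(mu_q, var_q) ) for univariate Gaussians."""
    return 0.5 * (math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def mean_kl(true_params, pred_params):
    """Mean KL divergence over test points; each entry is a (mean, variance) pair."""
    terms = [kl_gauss(mp, vp, mq, vq)
             for (mp, vp), (mq, vq) in zip(true_params, pred_params)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    y = [0.9, 2.2]                      # hypothetical observations
    true = [(1.0, 0.04), (2.0, 0.09)]   # hypothetical true mean and variance
    pred = [(1.1, 0.05), (1.8, 0.20)]   # hypothetical model output
    mus = [m for m, _ in pred]
    variances = [v for _, v in pred]
    print(mse(y, mus), mean_nll(y, mus, variances), mean_kl(true, pred))
```

MSE only scores the predicted mean, whereas MNLL and the KL divergence also penalize a badly calibrated variance, which is why all three are reported together.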
  • 45. Experiments: They still don’t extrapolate and they don’t quite realize it
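  • 51. Summary (GP = Gaussian Processes, MCD = Monte-Carlo Dropout, DeepE = Deep Ensembles, DropoutE = Dropout Ensembles, QR = Quantile Regression)

                                GP      MCD    DeepE      DropoutE   QR
      Homoscedastic noise       ++      o      +          o          o
      Heteroscedastic noise     --      -      ++         +          +
      Non-Gaussian noise        +       o      +          +          -
      Convergence               ++      -      +          -          +
      Speed                     (--)    +      - / (+)    +          ++
      Uncertainty split         yes     no     yes        yes        no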
  • 53. Conclusion: There is work to be done
      › The neural network approaches discussed here are very aware of aleatory uncertainty but are not capable of correctly estimating epistemic uncertainty
      › Gaussian Processes give clear signals about ignorance but do not scale
      A combined solution needs to be developed because uncertainty estimation is needed in critical applications.
  • 54. Outlook: Other approaches
      › Bayesian Neural Networks (e.g. with PyMC)
      › Sparse Gaussian Process approximations
      › Gaussian Processes on top of neural networks
  • 55. Thank You! Florian Wilhelm, Principal Data Scientist, inovex GmbH, Schanzenstraße 6-20, Kupferhütte 1.13, 51063 Köln, florian.wilhelm@inovex.de