Learning Loss
for Active Learning
Donggeun Yoo, In So Kweon
CVPR 2019 (Oral presentation)
Lunit / KAIST
Introduction
•Data is very important for deep learning
•It is unquestionable that
more data still improves network performance
[Mahajan et al., ECCV’18] (10 million to 1 billion images)
Introduction
•Problem: Limited budget for annotation
[Figure: annotation cost grows from a simple class label (“Horse=1”: $) to richer annotations ($$, $$$)]
•Disease-level annotations for medical images: super-expensive ($$$$$)
Active Learning
•The cycle: train on the labeled set → run inference on the unlabeled pool → if uncertain, send the data to labeling → add it to the labeled set → train again
[Figure: a loop between the unlabeled pool and the labeled set via Inference, Labeling, and Training]
•The key to active learning is how to measure the uncertainty (a minimal loop is sketched below)
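Below is a minimal, hypothetical sketch of this pool-based cycle in Python; train(), uncertainty(), and oracle() are placeholders for the target-task trainer, the uncertainty measure, and the human annotator, none of which are specified by the slides.

```python
# A minimal sketch of the pool-based active learning cycle above.
# train(), uncertainty(), and oracle() are hypothetical placeholders.

def active_learning_cycle(model, labeled, unlabeled, oracle, cycles=10, K=1000):
    for _ in range(cycles):
        train(model, labeled)                        # train on the labeled set
        ranked = sorted(unlabeled, key=lambda x: uncertainty(model, x), reverse=True)
        picked, unlabeled = ranked[:K], ranked[K:]   # the K most uncertain points
        labeled += [(x, oracle(x)) for x in picked]  # humans label them
    return model
```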
Active Learning: Limitations
• Heuristic approach
• Highest entropy [Joshi et al., CVPR’09]
• Distance to decision boundaries [Tong & Koller, JMLR’01]
(−) Task-specific design
• Ensemble approach [Freund et al., ML’97], [Beluch et al., CVPR’18]
(−) Does not scale to large CNNs and data
• Bayesian approach
• Expected error [Roy & McCallum, ICML’01] / model [Kapoor et al., ICCV’07]
• Bayesian inference by dropout [Gal & Ghahramani, ICML’17]
(−) Does not scale to large data and CNNs [Sener & Savarese, ICLR’18]
• Distribution approach
• Density-based [Liu & Ferrari, ICCV’17], diversity-based [Sener & Savarese, ICLR’18]
(−) Task-specific design; does not consider hard examples
*Entropy
• An information-theoretic measure of the amount of information needed to “encode” a distribution.
• The use of entropy in active learning (see the sketch below)
• Dense prediction (0.33, 0.33, 0.33) → maximum entropy
• Sparse prediction (1.00, 0.00, 0.00) → minimum entropy
(+) Very simple but works well (also in deep networks)
(−) Specific to the classification problem
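As a concrete illustration, a minimal PyTorch sketch of entropy-based uncertainty, assuming classification logits of shape (batch, classes); this is not from the slides:

```python
import torch
import torch.nn.functional as F

def entropy_uncertainty(logits: torch.Tensor) -> torch.Tensor:
    # Shannon entropy of the softmax distribution, per sample.
    # Dense predictions like (0.33, 0.33, 0.33) score near the maximum;
    # sparse ones like (1.00, 0.00, 0.00) score near zero.
    p = F.softmax(logits, dim=1)
    return -(p * torch.log(p + 1e-12)).sum(dim=1)  # shape: (batch,)
```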
*Bayesian Inference
• Training
• A dropout layer is inserted at every convolution layer
(−) Super slow convergence → impractical for current deep nets
• Inference
• N feed-forwards → N predictions
• Uncertainty = variance between the predictions (see the sketch below)
(−) Computationally expensive
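A minimal sketch of this MC-dropout style inference in PyTorch, assuming the model contains dropout layers; a simplification, not the slides' code:

```python
import torch

@torch.no_grad()
def mc_dropout_uncertainty(model, x, n: int = 25) -> torch.Tensor:
    # Keep dropout active at inference and run N stochastic feed-forwards;
    # the variance between the N predictions serves as the uncertainty.
    model.train()   # enables dropout (caution: also switches BatchNorm mode)
    preds = torch.stack([model(x).softmax(dim=1) for _ in range(n)])  # (n, B, C)
    model.eval()
    return preds.var(dim=0).sum(dim=1)  # per-sample uncertainty, shape (B,)
```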
*Diversity: Core-set
• Select a subset of the unlabeled pool that is a δ-cover of the whole distribution: every unlabeled point lies within distance δ of some selected point
• Optimization problem: choose the subset {𝑥} that minimizes the cover radius δ (a greedy approximation is sketched below)
(+) Can be task-agnostic, as it only depends on the feature space
(−) Does not consider “hard” examples near the decision boundaries
(−) Expensive optimization for a large pool
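The greedy approximation mentioned above, sketched in PyTorch; this is the standard k-center greedy heuristic, shown for illustration rather than the paper's exact solver:

```python
import torch

def greedy_k_center(pool_feats: torch.Tensor, labeled_feats: torch.Tensor, k: int):
    # Greedily approximate the core-set (delta-cover) objective: repeatedly
    # pick the pool point farthest from the current cover, shrinking the
    # cover radius delta step by step.
    dist = torch.cdist(pool_feats, labeled_feats).min(dim=1).values
    picked = []
    for _ in range(k):
        i = int(dist.argmax())                       # farthest uncovered point
        picked.append(i)
        d_new = torch.cdist(pool_feats, pool_feats[i : i + 1]).squeeze(1)
        dist = torch.minimum(dist, d_new)            # distances to the new cover
    return picked
```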
Active Learning: Our approach
• Active learning by learning loss
• Attach a “loss prediction module” to a target network
• Learn the module to predict the target loss
[Figure: the module outputs predicted losses over the unlabeled pool; human oracles annotate the top-K data points, which join the labeled training set (a selection sketch follows)]
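The selection step sketched in PyTorch; feature_fn is a hypothetical hook returning the target model's intermediate features for a batch, since the slides do not fix this interface:

```python
import torch

@torch.no_grad()
def select_top_k(loss_module, feature_fn, pool_loader, K: int):
    # Score every unlabeled point with the loss prediction module and
    # return the indices of the top-K points to send to the human oracles.
    scores = [loss_module(feature_fn(x)).squeeze(1) for x in pool_loader]
    return torch.cat(scores).topk(K).indices
```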
Active Learning: Our approach
• Requirements
• Task-agnostic method
• Learning-based, not heuristic
• Scalable to state-of-the-art networks and large data
Active Learning by Learning Loss
[Figure: the input x feeds the model (target prediction ŷ) and the attached loss prediction module (loss prediction l̂); the target GT y yields the target loss l, which supervises l̂ through the loss-prediction loss L_loss(l̂, l)]
• The target loss and the loss-prediction loss are trained jointly: multi-task learning
(+) Applicable to
• any network and data
• any task
(+) Nearly zero extra cost
Active Learning by Learning Loss
•The loss for loss prediction, L_loss(l̂, l)
•Mean squared error? L_loss(l̂, l) = (l̂ − l)²
→ The target task loss l is reduced as training progresses, so its scale keeps changing and the regression target keeps moving
Active Learning by Learning Loss
•The loss for loss prediction, L_loss(l̂, l)
•To ignore the scale changes of l, we use a ranking loss:
L_loss(l̂_i, l̂_j, l_i, l_j) = max(0, −𝟙(l_i, l_j) · (l̂_i − l̂_j) + ξ)
where 𝟙(l_i, l_j) = +1 if l_i > l_j, −1 otherwise
• (l̂_i, l̂_j): a pair of predicted losses; (l_i, l_j): a pair of real losses; ξ: margin (= 1)
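A PyTorch sketch of this ranking loss, pairing consecutive samples of the mini-batch (one plausible pairing scheme; an illustration of the formula above):

```python
import torch

def loss_prediction_loss(pred: torch.Tensor, target: torch.Tensor,
                         margin: float = 1.0) -> torch.Tensor:
    # Split the mini-batch into pairs (i, j). If the real losses say
    # l_i > l_j, the predicted losses must satisfy l̂_i − l̂_j > margin
    # (and symmetrically otherwise); violations are penalized hinge-style.
    p_i, p_j = pred[0::2], pred[1::2]        # pair of predicted losses
    t_i, t_j = target[0::2], target[1::2]    # pair of real losses
    sign = torch.where(t_i > t_j, torch.ones_like(t_i), -torch.ones_like(t_i))
    return torch.clamp(-sign * (p_i - p_j) + margin, min=0).mean()
```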
Active Learning by Learning Loss
•Given a mini-batch B, the total loss is defined as
(1/|B|) Σ_{(x,y)∈B} L_task(ŷ, y) + λ · (1/|B|) Σ_{(x_i,y_i,x_j,y_j)∈B} L_loss(l̂_i, l̂_j, l_i, l_j),
where l_i = L_task(ŷ_i, y_i)
• First term: target task; second term: loss prediction over pairs (i, j) within the mini-batch B
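One training step under this total loss, as a sketch: it assumes a backbone exposing a hypothetical return_features interface, a task criterion created with reduction='none', and the loss_prediction_loss sketched above.

```python
def training_step(backbone, loss_module, task_criterion, x, y, lam: float = 1.0):
    # One mini-batch step of the multi-task objective above.
    y_hat, feats = backbone(x, return_features=True)  # hypothetical interface
    per_sample = task_criterion(y_hat, y)             # l_i = L_task(ŷ_i, y_i)
    pred = loss_module(feats).squeeze(1)              # predicted losses l̂_i
    # detach(): the ranking term trains the module without altering the
    # target loss values it regresses toward.
    return per_sample.mean() + lam * loss_prediction_loss(pred, per_sample.detach())
```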
Active Learning by Learning Loss
•MSE loss vs. ranking loss
[Plot: active learning performance with MSE vs. ranking loss prediction, ResNet-18 on CIFAR-10]
Active Learning by Learning Loss
•Loss prediction module
[Figure: the target model's mid-blocks and out-block produce the target prediction; features tapped from the blocks are concatenated, and an FC layer outputs the loss prediction]
Active Learning by Learning Loss
•Loss prediction module: enough convolutions
• The tapped features are already convolved, and backpropagation from the loss-prediction loss reaches those convolutions
• So the convolutions are learned by the loss-prediction loss as well as the target loss
• The features already have a sufficiently large receptive field size
→ We don't need more convolutions; we just focus on merging the multiple features
Active Learning by Learning Loss
•Loss prediction module
[Figure: each tapped feature map goes through GAP → FC → ReLU; the resulting vectors are concatenated, and a final FC outputs the loss prediction]
(+) Very efficient, as GAP reduces the feature dimension (a module sketch follows)
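A self-contained PyTorch sketch of this module; the channel sizes default to the ResNet-18 taps shown later in the deck, and both they and the 128-d branch width are assumptions made configurable:

```python
import torch
import torch.nn as nn

class LossPredictionModule(nn.Module):
    # GAP → FC → ReLU per tapped feature map, concatenate, then one FC
    # that outputs a single predicted loss per sample.
    def __init__(self, channels=(64, 128, 256, 512), width: int = 128):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(c, width), nn.ReLU())
            for c in channels)
        self.fc = nn.Linear(width * len(channels), 1)

    def forward(self, feats):   # feats: list of mid-/out-block feature maps
        h = torch.cat([b(f) for b, f in zip(self.branches, feats)], dim=1)
        return self.fc(h)
```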
Active Learning by Learning Loss
•Loss prediction module: a heavier alternative
[Figure: each tap adds Conv → BN → ReLU (added layers) before GAP → FC → ReLU; the branches are concatenated, and a final FC outputs the loss prediction]
Active Learning by Learning Loss
•Loss prediction module: more convolutions vs. just FC
[Plot: comparison of the two module designs, ResNet-18 on CIFAR-10; the simpler GAP + FC module is kept]
Experiments (1)
•To validate “task-agnostic” + “state-of-the-art architectures”

     | Classification                 | Classification + regression | Regression
Task | Image classification           | Object detection            | Human pose estimation
Data | CIFAR-10                       | PASCAL VOC 2007+2012        | MPII
Net  | ResNet-18 [He et al., CVPR’16] | SSD [Liu et al., ECCV’16]   | Stacked Hourglass Networks [Newell et al., ECCV’16]
Results
•Image classification over CIFAR-10
[Figure: ResNet-18 [He et al., CVPR’16] with the loss prediction module; feature maps of 64×32×32, 128×16×16, 256×8×8, and 512×4×4 each pass through GAP → FC → ReLU to 128-d, are concatenated to 512-d, and a final FC outputs the loss prediction]
Results
•Image classification over CIFAR-10
[Plot: loss prediction performance]
Results
•Image classification over CIFAR-10 (mean of 5 trials)
[Plot: accuracy vs. labeled set size, compared with entropy [Joshi, CVPR’09] and core-set [Sener et al., ICLR’18]; our gain: +3.37%]
•Data selection vs. architecture
• Data selection by active learning → +3.37%
• DenseNet-121 [Huang et al.] − ResNet-18 → +2.02%
Results
•Object detection: SSD (ImageNet pre-trained) [Liu et al., ECCV’16]
[Figure: SSD with the loss prediction module; feature maps of 512×38×38, 1024×19×19, 512×10×10, 256×5×5, 256×3×3, and 256×1×1 each pass through GAP → FC → ReLU to 128-d, are concatenated to 768-d, and a final FC outputs the loss prediction]
Results
•Object detection over PASCAL VOC 07+12
[Plot: loss prediction performance]
Results
•Object detection on PASCAL VOC 07+12 (mean of 3 trials)
[Plot: performance vs. labeled set size, compared with entropy [Joshi, CVPR’09] and core-set [Sener et al., ICLR’18]; our gain: +2.21%]
•Data selection vs. architecture
• Data selection by active learning → +2.21%
• YOLOv2 [Redmon et al.] − SSD → +1.80%
Results
•Human pose estimation over the MPII dataset: Stacked Hourglass Network [Newell et al., ECCV’16]
[Figure: hourglass feature maps of 256×64×64 pass through GAP → FC → ReLU branches of 128-d, are concatenated (1024-d per the slide), and a final FC outputs the loss prediction]
Results
•Human pose estimation over the MPII dataset
[Plot: loss prediction performance]
Results
•Human pose estimation over the MPII dataset (mean of 3 trials)
[Plot: performance vs. labeled set size, compared with entropy [Joshi, CVPR’09] and core-set [Sener et al., ICLR’18]; our gain: +1.84%]
•Data selection vs. number of stacks
• Data selection by active learning → +1.84%
• 8-stacked − 2-stacked → +0.25%
Results
•Entropy vs. predicted loss over the MPII dataset
[Plot: entropy and predicted loss, each against the true MSE loss]
Experiments (2)
•To validate “active domain adaptation”

              | Dataset            | Data stats               | Active learning
Source domain | MNIST              | #train: 60k, #test: 10k  | Use the 60k as the initial labeled pool
Target domain | MNIST + background | #train: 12k, #test: 50k  | Add 1k for each cycle
Results
•Image classification over MNIST
[Figure: the PyTorch MNIST model* (Conv → ReLU → Conv → ReLU → FC → ReLU → FC) with the loss prediction module; features of 10×12×12, 20×4×4, and 50-d pass through (GAP →) FC → ReLU branches of 64-d, are concatenated to 192-d, and a final FC outputs the loss prediction]
*https://github.com/pytorch/examples/tree/master/mnist
Results
•Domain adaptation from MNIST to MNIST+background
•Loss prediction performance [Plot]
Results
•Domain adaptation from MNIST to MNIST+background
•Target domain performance
[Plot: compared with entropy [Joshi, CVPR’09] and core-set [Sener et al., ICLR’18]; annotation near the core-set curve: feature space overfitted to the source domain; our gain: +1.20%]
•Data selection vs. architecture
• Data selection by active learning → +1.20%
• WideResNet-14 − PyTorch MNIST model (4 layers) → +2.85%
Conclusion
•Introduced a novel active learning method that
• works well with current deep networks
• is task-agnostic
•Verified with
• three major visual recognition tasks
• three popular network architectures
“Pick more important data, and get better performance!”