Relational Knowledge Distillation
Wonpyo Park
CVLab @ POSTECH
• What is Knowledge Distillation (Transfer)?
• Recent Approaches
• Relational Knowledge Distillation (RKD)
• Discussion
• Conclusion
Contents
2
What is Knowledge Distillation?
Knowledge Distillation (Transfer)
• A teacher model (big & deep) is trained on Domain A and then educates (transfers its knowledge to) a student model (small & shallow) trained on the same Domain A.
• For model compression.
• To improve the performance of the student over the teacher.
Transfer Learning
• Model A is trained on Domain A and then transferred to Model B, which is trained on Domain B.
• When data is not sufficient.
• When labels for a problem are not available.
• E.g., a model pretrained on ImageNet.
3
Model Compression using Knowledge Distillation
4
[Figure: an example is fed to Model 1, Model 2, Model 3, and Model 4; the output of each model is combined into the output of the ensemble.]
• Ensemble is an easy way to improve performance of a Neural Network.
• However, it requires large computing resources.
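As a minimal sketch of the ensemble idea above, assume the ensemble output is a plain average of the class probabilities predicted by each model. The slide does not say how the outputs are combined, so the averaging rule and the example numbers below are illustrative assumptions.

```python
def ensemble_average(model_outputs):
    """Average the per-class probabilities predicted by several models."""
    n_models = len(model_outputs)
    n_classes = len(model_outputs[0])
    return [
        sum(output[c] for output in model_outputs) / n_models
        for c in range(n_classes)
    ]


# Hypothetical outputs of four models for one example with three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
    [0.7, 0.2, 0.1],
]
ensemble = ensemble_average(outputs)
# Index of the class with the highest averaged probability.
predicted_class = max(range(len(ensemble)), key=ensemble.__getitem__)
```

Every model has to be evaluated for each prediction, which is where the large computing cost comes from.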
Model Compression using Knowledge Distillation
5
[Figure: Model 1, Model 2, Model 3, and Model 4 together act as the teacher; their output on an example is transferred to a single student model.]
• By educating the student model to mimic the output of the teacher model, the student model can achieve comparable performance.
Model Compression using Knowledge Distillation
6
[Figure: distillation from the teacher (Model 1, Model 2, Model 3, Model 4) into a single student.]
• Distilling the Knowledge in a Neural Network
Hinton et al. In NIPS, 2014.
Recent Approaches: Transfer Class Probability
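[Figure: an image $x_i$ is fed to the teacher classifier $f_T$ and the student classifier $f_S$; each logit is divided by the temperature $\tau$ and passed through a softmax, and the teacher's class probability is transferred to the student.]
Objective: $\mathcal{L}_{KD} = \sum_{x_i \in \mathcal{X}} \mathrm{KL}\big(\mathrm{softmax}(f_T(x_i)/\tau),\ \mathrm{softmax}(f_S(x_i)/\tau)\big)$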
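The class-probability transfer on this slide can be sketched in a few lines. This is a hedged illustration, not the authors' code: the function names, the default temperature, and the use of a plain KL divergence (without the extra $\tau^2$ scaling that some implementations add) are assumptions.

```python
import math


def softmax(logits, tau=1.0):
    """Softmax with temperature tau; a larger tau gives a softer distribution."""
    scaled = [z / tau for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, tau=4.0):
    """KL divergence between softened teacher and student class probabilities."""
    p = softmax(teacher_logits, tau)
    q = softmax(student_logits, tau)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.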
7
• FitNets: Hints for Thin Deep Nets
Romero et al. In ICLR, 2015.
Recent Approaches: Transfer Hidden Activation
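[Figure: for an image $x_i$, the hidden activation of the teacher $f_T$ is transferred to the student $f_S$ through a random linear transformation $\beta$ between channel dimensions $C'$ and $C$, where $C' > C$.]
Objective: [equation image not preserved; the surviving fragment is $\beta(f_T(x_i))$]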
8
• Paying More Attention to Attention: Improving the Performance of
Convolutional Neural Networks via Attention Transfer
Zagoruyko & Komodakis. In ICLR, 2017.
Recent Approaches: Transfer Attention
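[Figure: for an image $x_i$, the teacher $f_T$ and the student $f_S$ produce $H \times W$ feature maps with $C$ and $C'$ channels; each is averaged over the channel dimension into an $H \times W$ attention map ($Q_T$, $Q_S$), and the teacher's map is transferred to the student.]
Objective: [equation image not preserved]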
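One way to read the attention transfer described on this slide is sketched below. The slide only states that activations are averaged over the channel dimension; squaring the activations, L2-normalizing the resulting map, and using a squared L2 distance are assumptions borrowed from common formulations, and all names are hypothetical.

```python
import math


def attention_map(features):
    """Collapse a C x H x W activation into a flat, L2-normalized attention map."""
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    flat = [
        sum(features[c][h][w] ** 2 for c in range(channels)) / channels
        for h in range(height)
        for w in range(width)
    ]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0  # avoid dividing by zero
    return [v / norm for v in flat]


def attention_transfer_loss(teacher_features, student_features):
    """Squared L2 distance between the teacher and student attention maps."""
    q_t = attention_map(teacher_features)
    q_s = attention_map(student_features)
    return sum((a - b) ** 2 for a, b in zip(q_t, q_s))
```

Because the channel dimension is averaged away, the teacher and the student may have different numbers of channels as long as the spatial size H x W matches.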
9
• Born-Again Neural Networks (Furlanello et al. In ICML, 2018.)
• Label Refinery: Improving ImageNet Classification through Label
Progression (Bagherinezhad et al. In arXiv, 2018.)
Recent Approaches: Student Over Teacher
[Figure: the class probability predicted by the teacher classifier $f_T$ for an image $x_i$ is used as the ground truth to train the student classifier $f_S$.]
The student architecture is identical to the teacher's.
Surprisingly, the student is significantly better than the teacher.
10
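• Previous works can be expressed in the form:
$\mathcal{L}_{IKD} = \sum_{x_i \in \mathcal{X}} l\big(f_T(x_i), f_S(x_i)\big)$
• $f_T$: teacher, $f_S$: student, $l$: loss, $t_i = f_T(x_i)$, $s_i = f_S(x_i)$.
• IKD transfers the output of individual examples from the teacher to the student.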
Individual Knowledge Distillation: Generalization
11
Q. What constitutes the knowledge in a learned model?
A. (IKD) Output of individual examples represented by the teacher.
A. (RKD) Relations among examples represented by the teacher.
What is the Knowledge of a Model?
14
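• Relational knowledge distillation can be expressed in the form:
$\mathcal{L}_{RKD} = \sum_{(x_1, \dots, x_n) \in \mathcal{X}^n} l\big(\psi(t_1, \dots, t_n), \psi(s_1, \dots, s_n)\big)$
• $\psi$: function extracting the relation among a tuple of examples.
• RKD transfers the relations among examples represented by the teacher to the student.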
Relational Knowledge Distillation: Generalization
15
• IKD transfers the output of individual examples represented by the teacher to the student.
• RKD transfers the relations among examples represented by the teacher to the student.
IKD versus RKD
16
• Among many relations, we transfer the “structure” of embedding space.
• Distance-wise loss (pair)
• Angle-wise loss (triplet)
Relational Knowledge Distillation: Structure to Structure
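[Figure: Individual KD (point to point), where each student output $s_1$, $s_2$, $s_3$ is matched to the corresponding teacher output $t_1$, $t_2$, $t_3$, vs. Relational KD (structure to structure), where the structure formed by $s_1$, $s_2$, $s_3$ is matched to the structure formed by $t_1$, $t_2$, $t_3$.]
17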
• Distance-wise loss (RKD-D)
• RKD-D transfers the relative distances between points in the embedding space.
Relational Knowledge Distillation: Distance-wise Loss
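Here $l_\delta$ is the Huber loss.
[Figure: embedding space with teacher points $t_1$, $t_2$, $t_3$ and student points $s_1$, $s_2$, $s_3$; the edges between points are labeled with distances 1.2, 1.0, 0.8, 0.9, 1.7, and 0.4.]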
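The distance-wise loss can be sketched as follows. This is a minimal illustration that assumes the formulation from the RKD paper rather than anything recoverable from this slide text: each pairwise Euclidean distance is divided by the mean pairwise distance of its own batch (giving "relative" distances), and teacher and student values are compared with a Huber loss using delta = 1. Averaging over unordered pairs and all function names are choices made for this sketch.

```python
import math
from itertools import combinations


def huber(x, y):
    """Huber loss with delta = 1 (the delta value is an assumption)."""
    d = abs(x - y)
    return 0.5 * d * d if d <= 1.0 else d - 0.5


def relative_distances(points):
    """Pairwise Euclidean distances divided by their mean over the batch.

    Assumes the batch contains at least two distinct points.
    """
    dists = [math.dist(a, b) for a, b in combinations(points, 2)]
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]


def rkd_distance_loss(teacher, student):
    """Mean Huber loss between teacher and student relative distances."""
    t = relative_distances(teacher)
    s = relative_distances(student)
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)
```

Because only relative distances are compared, a student embedding that is a scaled copy of the teacher's, even in a different number of dimensions, incurs zero loss.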
18
• Angle-wise loss (RKD-A)
• RKD-A transfers the angles formed by three points in the embedding space.
Relational Knowledge Distillation: Angle-wise Loss
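[Figure: embedding space with teacher points $t_1$, $t_2$, $t_3$ and student points $s_1$, $s_2$, $s_3$; the angles $\theta_1$, $\theta_2$, $\theta_3$ of the teacher triangle are matched to the corresponding angles of the student triangle.]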
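The angle-wise loss can be sketched in the same spirit. The sketch assumes the formulation from the RKD paper, which is not recoverable from this slide text: for every ordered triplet (i, j, k), the cosine of the angle at vertex j is computed from unit difference vectors, and teacher and student cosines are compared with a Huber loss (delta = 1). Function names are hypothetical.

```python
import math
from itertools import permutations


def huber(x, y):
    """Huber loss with delta = 1 (the delta value is an assumption)."""
    d = abs(x - y)
    return 0.5 * d * d if d <= 1.0 else d - 0.5


def unit_vector(a, b):
    """Unit vector pointing from b to a (assumes a and b are distinct points)."""
    diff = [x - y for x, y in zip(a, b)]
    norm = math.sqrt(sum(v * v for v in diff))
    return [v / norm for v in diff]


def angle_cosines(points):
    """Cosine of the angle at vertex j for every ordered triplet (i, j, k)."""
    cosines = []
    for i, j, k in permutations(range(len(points)), 3):
        e_ij = unit_vector(points[i], points[j])
        e_kj = unit_vector(points[k], points[j])
        cosines.append(sum(u * v for u, v in zip(e_ij, e_kj)))
    return cosines


def rkd_angle_loss(teacher, student):
    """Mean Huber loss between teacher and student angle cosines."""
    t = angle_cosines(teacher)
    s = angle_cosines(student)
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)
```

Angles are unchanged by translation and uniform scaling, so a shifted and scaled copy of the teacher embedding gives zero loss, while a student whose points are collinear does not.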
19
• Where to apply?
• On any hidden layer or embedding layer.
• Not on a layer where individual output values are crucial,
→ because RKD does not transfer the output values of individual examples.
→ E.g., the softmax layer for classification.
• How to use RKD during training?
• RKD loss can be combined with a task-specific loss: ℒ_task + λ · ℒ_RKD.
• RKD loss can be used alone for training an embedding network: ℒ_RKD.
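A small numeric illustration of why RKD is not suited to layers where individual output values matter: a student whose embedding is the teacher's embedding shifted by a constant offset preserves every pairwise distance, so a relational loss sees no difference, even though no individual output matches. The points and the offset below are made-up values for illustration.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


teacher = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
# Hypothetical student: the teacher's embedding shifted by a constant offset.
student = [(x + 10.0, y - 5.0) for x, y in teacher]

# Largest disagreement between teacher and student pairwise distances.
relation_gap = max(
    abs(t - s)
    for t, s in zip(pairwise_distances(teacher), pairwise_distances(student))
)
# Largest distance between a teacher output and the matching student output.
point_gap = max(math.dist(t, s) for t, s in zip(teacher, student))
```

Here `relation_gap` is zero while `point_gap` is large, which is acceptable for an embedding layer but not for a softmax layer whose individual values carry the class prediction.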
Relational Knowledge Distillation: How to use RKD?
20
• Metric Learning (Image retrieval)
• Image Classification
• Few-Shot Learning
Experiment
21
Metric learning
• It aims to train an embedding model.
• In embedding space, distances between projected examples
correspond to their semantic similarity.
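One common objective for learning such an embedding, and the one used to train the teacher in the experiments of this deck, is the triplet loss. The sketch below assumes the standard hinge form with Euclidean distance; the margin value and the function name are illustrative assumptions.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that pushes the positive closer to the anchor than the negative."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is how distances come to reflect semantic similarity.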
Experiment: What is Metric Learning?
[Figure: images $x_1$, $x_2$, $x_3$ are mapped by a DNN $f(x; W)$ to points $f(x_1)$, $f(x_2)$, $f(x_3)$ in a d-dimensional embedding space, with positive and negative pairs marked. t-SNE of the embedding space on the Cars 196 dataset (Wang et al., 2017).]
22
• Evaluation
• Image retrieval, recall@k
• Dataset
• Cars 196 (Krause et al. In 3dRR, 2013.)
• CUB-200-2011 (Wah et al. In CNS-TR, 2011.)
• Stanford Online Products (Song et al. In CVPR, 2016.)
• Architecture
• Teacher: ResNet50 (backbone) + 512-d fc layer (embedding layer) + L2 normalization
• Student: ResNet18 + fc layer of various dimensions + L2 normalization (optional)
• Target layer of RKD
• Final embedding outputs of teacher and student
• Training Objective
• Teacher: Triplet loss & Distance-weighted sampling (Wu et al. In ICCV, 2017.)
• Student: Triplet loss, RKD-D, RKD-A, RKD-DA, DarkRank (Chen et al. In AAAI, 2018.)
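The recall@k metric used for evaluation can be sketched as follows, assuming the usual image-retrieval definition: a query counts as a hit if at least one of its k nearest neighbors (excluding itself) shares its label. The function name and the brute-force neighbor search are choices made for this sketch.

```python
import math


def recall_at_k(embeddings, labels, k):
    """Fraction of queries with a same-label item among their k nearest neighbors."""
    hits = 0
    for q, (query, label) in enumerate(zip(embeddings, labels)):
        # Rank every other item by Euclidean distance to the query.
        others = [i for i in range(len(embeddings)) if i != q]
        others.sort(key=lambda i: math.dist(query, embeddings[i]))
        if any(labels[i] == label for i in others[:k]):
            hits += 1
    return hits / len(embeddings)
```

For example, with five toy embeddings where one item sits next to a cluster of a different class, recall@1 is 0.8 and recall@3 reaches 1.0:

```python
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.2, 0.1)]
labels = [0, 0, 1, 1, 1]
r1 = recall_at_k(embeddings, labels, 1)
r3 = recall_at_k(embeddings, labels, 3)
```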
Experiment: Metric Learning
23
Experiment: Metric Learning
[Figure: (a) Recall@1 on CUB-200-2011. (b) Recall@1 on Cars 196.]
Distillation to a small network
• Model-d refers to a model with a d-dimensional embedding.
24
Self-Distillation
• Teacher: ResNet50 + 512-d fc + L2 normalization
• Trained using triplet loss
• Student: ResNet50 + 512-d fc
• Trained using RKD-DA
Experiment: Metric Learning
(a) Recall@1 of Self-Distillation
25
Comparison with state-of-the-art methods
Experiment: Metric Learning
• On CUB-200-2011, we achieve state-of-the-art performance regardless of the backbone network.
• On Cars 196 and Stanford Online Products, we achieve the second-best performance.
Note that ABE8 (Kim et al. In ECCV, 2018) requires additional attention modules for 8 branches.
26
(a) Recall@K comparison with state-of-the-art methods.
Experiment: Metric Learning
Qualitative Results
• Where the teacher (Triplet) fails, the student (RKD-DA) succeeds at top-1.
[Figure: (a) Retrieval results on CUB-200-2011. (b) Retrieval results on Cars 196.]
27
Experiment: Image Classification
Image Classification
• Datasets: CIFAR-10, CIFAR-100
• Architecture
• Teacher: ResNet50
• Student: VGG-11 with BatchNorm
• Targeting layer of RKD
• Teacher: output of avgpool layer
• Student: output of pool5 layer
• Training Objective
• Teacher: cross-entropy
• Student: cross-entropy + (Hinton et al., RKD-D and RKD-DA)
(a) Accuracy (%) on CIFAR-10 and CIFAR-100.
28
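[Figure: the teacher (ResNet50 CNN followed by an fc classifier) and the student (VGG11 with BN CNN followed by an fc classifier); knowledge is transferred from the teacher to the student.]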
Experiment: What is Few-Shot Learning?
Few-shot learning
• A classifier learns to generalize to new, unseen classes from only a few examples of each new class.
• Shot: the number of examples given for each new class
• Way: the number of new classes
• E.g., Prototypical Network (Snell et al., In NIPS, 2017)
• An embedding network in which classification is performed based on the distance from the given examples of the new classes.
Prototypical Networks for few-shot learning.
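The distance-based classification described above can be sketched as nearest-prototype classification. The sketch assumes the prototype of a class is the mean of its support embeddings and that Euclidean distance is used; the embeddings are taken as given (no network is trained here) and all names are hypothetical.

```python
import math


def class_prototypes(support):
    """Mean embedding per class; `support` maps a class name to its embeddings."""
    prototypes = {}
    for cls, examples in support.items():
        dim = len(examples[0])
        prototypes[cls] = [
            sum(e[d] for e in examples) / len(examples) for d in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest."""
    return min(prototypes, key=lambda cls: math.dist(query, prototypes[cls]))


# A 2-way, 2-shot episode with made-up 2-d embeddings.
support = {
    "cat": [(0.0, 0.0), (0.0, 2.0)],
    "dog": [(4.0, 4.0), (6.0, 4.0)],
}
prototypes = class_prototypes(support)
```

With these values the prototypes are (0, 1) for "cat" and (5, 4) for "dog", so a query at (1, 1) is assigned to "cat" and a query at (4, 3) to "dog".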
29
Experiment: Few-Shot Learning
Few-shot learning
• Datasets
• Omniglot (Lake et al. In Science, 2015.)
• miniImageNet (Vinyals et al. In NIPS, 2016.)
• Architecture
• Teacher: 4 convolutional layers
• Student: same as the teacher
• Target layer of RKD
• Final embedding output of teacher and student
• Training Objective
• Teacher: Snell et al. (prototypical networks)
• Student: Snell et al. + (RKD-D or RKD-DA)
30
[Tables: (a) Accuracy (%) on Omniglot. (b) Accuracy (%) on miniImageNet.]
Discussion: Effective Adaptation on Source Domain
• Both Cars 196 and CUB-200-2011 are fine-grained classification datasets.
• They require adaptation to the specific characteristics of the domain,
e.g., finding local patches that distinguish an object from others.
• ‘Triplet’ is the teacher network used for educating the ‘RKD-DA’ model.
[Figure: (a) Recall@1 curves on the train/evaluation sets while training the teacher (Triplet) and the student (RKD-DA) on Cars 196. (b) Recall@1 on various domains; both ‘Triplet’ and ‘RKD-DA’ are models trained on Cars 196.]
31
• We have introduced Relational KD, which effectively transfers knowledge
using the relations among data examples represented by the teacher.
• Experiments on different tasks and benchmarks show that Relational KD
improves the performance of the educated student networks by a significant margin.
Conclusion
32
• Thank You
The End
33
[1] G. Hinton, O. Vinyals, and J. Dean. Distilling the knowledge in a neural network. In NIPS Deep Learning Workshop, 2014.
[2] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio. FitNets: Hints for thin deep nets. In ICLR, 2015.
[3] S. Zagoruyko and N. Komodakis. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. In ICLR, 2017.
[4] T. Furlanello, Z. C. Lipton, M. Tschannen, L. Itti, and A. Anandkumar. Born-again neural networks. In ICML, 2018.
[5] H. Bagherinezhad, M. Horton, M. Rastegari, and A. Farhadi. Label refinery: Improving ImageNet classification through label progression. In arXiv, 2018.
[6] J. Krause, M. Stark, J. Deng, and L. Fei-Fei. 3D object representations for fine-grained categorization. In 3dRR, 2013.
[7] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie. The Caltech-UCSD Birds-200-2011 dataset. In CNS-TR, 2011.
[8] H. Oh Song, Y. Xiang, S. Jegelka, and S. Savarese. Deep metric learning via lifted structured feature embedding. In CVPR, 2016.
[9] C.-Y. Wu, R. Manmatha, A. J. Smola, and P. Krahenbuhl. Sampling matters in deep embedding learning. In ICCV, 2017.
[10] Y. Chen, N. Wang, and Z. Zhang. DarkRank: Accelerating deep metric learning via cross sample similarities transfer. In AAAI, 2018.
[11] W. Kim, B. Goyal, K. Chawla, J. Lee, and K. Kwon. Attention-based ensemble for deep metric learning. In ECCV, 2018.
[12] J. Snell, K. Swersky, and R. Zemel. Prototypical networks for few-shot learning. In NIPS, 2017.
[13] B. M. Lake, R. Salakhutdinov, and J. B. Tenenbaum. Human-level concept learning through probabilistic program induction. In Science, 2015.
[14] O. Vinyals, C. Blundell, T. Lillicrap, D. Wierstra, et al. Matching networks for one shot learning. In NIPS, 2016.
References
34

Más contenido relacionado

La actualidad más candente

Transformers in Vision: From Zero to Hero
Transformers in Vision: From Zero to HeroTransformers in Vision: From Zero to Hero
Transformers in Vision: From Zero to HeroBill Liu
 
Transformers In Vision From Zero to Hero (DLI).pptx
Transformers In Vision From Zero to Hero (DLI).pptxTransformers In Vision From Zero to Hero (DLI).pptx
Transformers In Vision From Zero to Hero (DLI).pptxDeep Learning Italia
 
Introduction to Visual transformers
Introduction to Visual transformers Introduction to Visual transformers
Introduction to Visual transformers leopauly
 
Model compression
Model compressionModel compression
Model compressionNanhee Kim
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks남주 김
 
Few shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningFew shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningﺁﺻﻒ ﻋﻠﯽ ﻣﯿﺮ
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You NeedDaiki Tanaka
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012Jinwon Lee
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Sujit Pal
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsJinwon Lee
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architecturesananth
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer VisionDongmin Choi
 
Image-to-Image Translation pix2pix
Image-to-Image Translation pix2pixImage-to-Image Translation pix2pix
Image-to-Image Translation pix2pixYasar Hayat
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningMohamed Loey
 
PR-317: MLP-Mixer: An all-MLP Architecture for Vision
PR-317: MLP-Mixer: An all-MLP Architecture for VisionPR-317: MLP-Mixer: An all-MLP Architecture for Vision
PR-317: MLP-Mixer: An all-MLP Architecture for VisionJinwon Lee
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer VisionSungjoon Choi
 

La actualidad más candente (20)

Transformers in Vision: From Zero to Hero
Transformers in Vision: From Zero to HeroTransformers in Vision: From Zero to Hero
Transformers in Vision: From Zero to Hero
 
Transformers In Vision From Zero to Hero (DLI).pptx
Transformers In Vision From Zero to Hero (DLI).pptxTransformers In Vision From Zero to Hero (DLI).pptx
Transformers In Vision From Zero to Hero (DLI).pptx
 
Introduction to Visual transformers
Introduction to Visual transformers Introduction to Visual transformers
Introduction to Visual transformers
 
Model compression
Model compressionModel compression
Model compression
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
U-Net (1).pptx
U-Net (1).pptxU-Net (1).pptx
U-Net (1).pptx
 
Few shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningFew shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learning
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
 
ViT.pptx
ViT.pptxViT.pptx
ViT.pptx
 
MobileViTv1
MobileViTv1MobileViTv1
MobileViTv1
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer Vision
 
Image-to-Image Translation pix2pix
Image-to-Image Translation pix2pixImage-to-Image Translation pix2pix
Image-to-Image Translation pix2pix
 
Transformers in 2021
Transformers in 2021Transformers in 2021
Transformers in 2021
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
 
PR-317: MLP-Mixer: An all-MLP Architecture for Vision
PR-317: MLP-Mixer: An all-MLP Architecture for VisionPR-317: MLP-Mixer: An all-MLP Architecture for Vision
PR-317: MLP-Mixer: An all-MLP Architecture for Vision
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 

Similar a Relational Knowledge Transfer Boosts Model Performance

Search to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSearch to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSungchul Kim
 
NTU DBME5028 Week8 Transfer Learning
NTU DBME5028 Week8 Transfer LearningNTU DBME5028 Week8 Transfer Learning
NTU DBME5028 Week8 Transfer LearningSean Yu
 
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Deep Implicit Layers: Learning Structured Problems with Neural Networks
Deep Implicit Layers: Learning Structured Problems with Neural NetworksDeep Implicit Layers: Learning Structured Problems with Neural Networks
Deep Implicit Layers: Learning Structured Problems with Neural NetworksSangwoo Mo
 
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...Universitat Politècnica de Catalunya
 
Compressing Graphs and Indexes with Recursive Graph Bisection
Compressing Graphs and Indexes with Recursive Graph Bisection Compressing Graphs and Indexes with Recursive Graph Bisection
Compressing Graphs and Indexes with Recursive Graph Bisection aftab alam
 
Java parser a fine grained indexing tool and its application
Java parser a fine grained indexing tool and its applicationJava parser a fine grained indexing tool and its application
Java parser a fine grained indexing tool and its applicationRoya Hosseini
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImageryRAHUL BHOJWANI
 
NS-CUK Seminar: J.H.Lee, Review on "GCC: Graph Contrastive Coding for Graph ...
NS-CUK Seminar: J.H.Lee,  Review on "GCC: Graph Contrastive Coding for Graph ...NS-CUK Seminar: J.H.Lee,  Review on "GCC: Graph Contrastive Coding for Graph ...
NS-CUK Seminar: J.H.Lee, Review on "GCC: Graph Contrastive Coding for Graph ...ssuser4b1f48
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Dongmin Choi
 
[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detec...
[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detec...[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detec...
[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detec...Ziyuan Zhao
 
Webinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database ReplacementWebinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database ReplacementMongoDB
 
LAK21 Data Driven Redesign of Tutoring Systems (Yun Huang)
LAK21 Data Driven Redesign of Tutoring Systems (Yun Huang)LAK21 Data Driven Redesign of Tutoring Systems (Yun Huang)
LAK21 Data Driven Redesign of Tutoring Systems (Yun Huang)Yun Huang
 
Community of Practice - Project Specific - Steering Committee 3
Community of Practice - Project Specific - Steering Committee 3Community of Practice - Project Specific - Steering Committee 3
Community of Practice - Project Specific - Steering Committee 3Embedding Employability
 
Decision Module: Adapting Learning Paths in Serious Games based on CbKST and ...
Decision Module: Adapting Learning Paths in Serious Games based on CbKST and ...Decision Module: Adapting Learning Paths in Serious Games based on CbKST and ...
Decision Module: Adapting Learning Paths in Serious Games based on CbKST and ...Javier Melero
 
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...Unai Lopez-Novoa
 
(CVPR2021 Oral) RobustNet: Improving Domain Generalization in Urban-Scene Seg...
(CVPR2021 Oral) RobustNet: Improving Domain Generalization in Urban-Scene Seg...(CVPR2021 Oral) RobustNet: Improving Domain Generalization in Urban-Scene Seg...
(CVPR2021 Oral) RobustNet: Improving Domain Generalization in Urban-Scene Seg...Sungha Choi
 
Minor Project Report on Denoising Diffusion Probabilistic Model
Minor Project Report on Denoising Diffusion Probabilistic ModelMinor Project Report on Denoising Diffusion Probabilistic Model
Minor Project Report on Denoising Diffusion Probabilistic Modelsoxigoh238
 
NTCIR-15 www-3 kasys poster
NTCIR-15 www-3 kasys posterNTCIR-15 www-3 kasys poster
NTCIR-15 www-3 kasys posterAtsukiMaruta
 

Similar a Relational Knowledge Transfer Boosts Model Performance (20)

Search to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the EyesSearch to Distill: Pearls are Everywhere but not the Eyes
Search to Distill: Pearls are Everywhere but not the Eyes
 
NTU DBME5028 Week8 Transfer Learning
NTU DBME5028 Week8 Transfer LearningNTU DBME5028 Week8 Transfer Learning
NTU DBME5028 Week8 Transfer Learning
 
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
 
Deep Implicit Layers: Learning Structured Problems with Neural Networks
Deep Implicit Layers: Learning Structured Problems with Neural NetworksDeep Implicit Layers: Learning Structured Problems with Neural Networks
Deep Implicit Layers: Learning Structured Problems with Neural Networks
 
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
Transfer Learning and Domain Adaptation (DLAI D5L2 2017 UPC Deep Learning for...
 
Compressing Graphs and Indexes with Recursive Graph Bisection
Compressing Graphs and Indexes with Recursive Graph Bisection Compressing Graphs and Indexes with Recursive Graph Bisection
Compressing Graphs and Indexes with Recursive Graph Bisection
 
Java parser a fine grained indexing tool and its application
Java parser a fine grained indexing tool and its applicationJava parser a fine grained indexing tool and its application
Java parser a fine grained indexing tool and its application
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
 
NS-CUK Seminar: J.H.Lee, Review on "GCC: Graph Contrastive Coding for Graph ...
NS-CUK Seminar: J.H.Lee,  Review on "GCC: Graph Contrastive Coding for Graph ...NS-CUK Seminar: J.H.Lee,  Review on "GCC: Graph Contrastive Coding for Graph ...
NS-CUK Seminar: J.H.Lee, Review on "GCC: Graph Contrastive Coding for Graph ...
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
 
[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detec...
[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detec...[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detec...
[BMVC 2022] DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detec...
 
Webinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database ReplacementWebinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database Replacement
 
LAK21 Data Driven Redesign of Tutoring Systems (Yun Huang)
LAK21 Data Driven Redesign of Tutoring Systems (Yun Huang)LAK21 Data Driven Redesign of Tutoring Systems (Yun Huang)
LAK21 Data Driven Redesign of Tutoring Systems (Yun Huang)
 
Community of Practice - Project Specific - Steering Committee 3
Community of Practice - Project Specific - Steering Committee 3Community of Practice - Project Specific - Steering Committee 3
Community of Practice - Project Specific - Steering Committee 3
 
Decision Module: Adapting Learning Paths in Serious Games based on CbKST and ...
Decision Module: Adapting Learning Paths in Serious Games based on CbKST and ...Decision Module: Adapting Learning Paths in Serious Games based on CbKST and ...
Decision Module: Adapting Learning Paths in Serious Games based on CbKST and ...
 
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
 
(CVPR2021 Oral) RobustNet: Improving Domain Generalization in Urban-Scene Seg...
(CVPR2021 Oral) RobustNet: Improving Domain Generalization in Urban-Scene Seg...(CVPR2021 Oral) RobustNet: Improving Domain Generalization in Urban-Scene Seg...
(CVPR2021 Oral) RobustNet: Improving Domain Generalization in Urban-Scene Seg...
 
Use CNN for Sequence Modeling
Use CNN for Sequence ModelingUse CNN for Sequence Modeling
Use CNN for Sequence Modeling
 
Minor Project Report on Denoising Diffusion Probabilistic Model
Minor Project Report on Denoising Diffusion Probabilistic ModelMinor Project Report on Denoising Diffusion Probabilistic Model
Minor Project Report on Denoising Diffusion Probabilistic Model
 
NTCIR-15 www-3 kasys poster
NTCIR-15 www-3 kasys posterNTCIR-15 www-3 kasys poster
NTCIR-15 www-3 kasys poster
 

Más de NAVER Engineering

디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIXNAVER Engineering
 
진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)NAVER Engineering
 
서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트NAVER Engineering
 
BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호NAVER Engineering
 
이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라NAVER Engineering
 
날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기NAVER Engineering
 
쏘카프레임 구축 배경과 과정
 쏘카프레임 구축 배경과 과정 쏘카프레임 구축 배경과 과정
쏘카프레임 구축 배경과 과정NAVER Engineering
 
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기NAVER Engineering
 
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)NAVER Engineering
 
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드NAVER Engineering
 
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기NAVER Engineering
 
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활NAVER Engineering
 
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출NAVER Engineering
 
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우NAVER Engineering
 
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...NAVER Engineering
 
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법NAVER Engineering
 
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며NAVER Engineering
 
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기NAVER Engineering
 
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기NAVER Engineering
 

Relational Knowledge Transfer Boosts Model Performance

  • 2. Contents • What is Knowledge Distillation (Transfer)? • Recent Approaches • Relational Knowledge Distillation (RKD) • Discussion • Conclusion
  • 3. What is Knowledge Distillation? • Knowledge Distillation (Transfer): a teacher model (big & deep) trained on Domain A educates (transfers to) a student model (small & shallow) on the same domain. • Used for model compression. • Used to improve performance of student over teacher. • Transfer Learning: Model A trained on Domain A is transferred to Model B trained on Domain B. • Used when data is not sufficient. • Used when labels for a problem are not available. • E.g., a model pretrained on ImageNet.
  • 4. Model Compression using Knowledge Distillation • An ensemble is an easy way to improve the performance of a neural network. • However, it requires large computing resources. [Figure: an example is fed to Models 1 to 4; the output of each model is combined into the output of the ensemble.]
  • 5. Model Compression using Knowledge Distillation • By educating the student model to mimic the output of the teacher model, the student model can achieve comparable performance. [Figure: the ensemble of Models 1 to 4 acts as the teacher and transfers its output to the student.]
  • 6. Model Compression using Knowledge Distillation [Figure: the teacher (Models 1 to 4) is distilled into the student.]
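  • 7. Recent Approaches: Transfer Class Probability • Distilling the Knowledge in a Neural Network, Hinton et al. In NIPS, 2014. [Figure: an image x_i is fed to the teacher classifier f_T and the student classifier f_S; each logit is divided by a temperature τ and passed through a softmax, and the teacher's class probability is transferred to the student.] Objective: [equation image not included in the transcript]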
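The objective on this slide was an image and is missing from the transcript. As a minimal sketch of the idea the slide describes (logits divided by a temperature τ, then softmax, then the student matched to the teacher), the snippet below assumes the mismatch is measured with a KL divergence between the two softened distributions. The function names and the default `tau=4.0` are illustrative choices, not taken from the slides.

```python
import math


def softmax_with_temperature(logits, tau):
    """Softmax of logits / tau; a larger tau gives a softer distribution."""
    scaled = [z / tau for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_logits, student_logits, tau=4.0):
    """Assumed objective: KL between softened teacher and student outputs."""
    p_teacher = softmax_with_temperature(teacher_logits, tau)
    p_student = softmax_with_temperature(student_logits, tau)
    return kl_divergence(p_teacher, p_student)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.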
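  • 8. Recent Approaches: Transfer Hidden Activation • FitNets: Hints for Thin Deep Nets, Romero et al. In ICLR, 2015. [Figure: an image x_i is fed to the teacher f_T and the student f_S; a random linear transformation β bridges hidden activations with C′ and C channels, where C′ > C, and the hidden activation is transferred to the student.] Objective: [equation image not included in the transcript; the surviving fragment is β(f_T(x_i))]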
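  • 9. Recent Approaches: Transfer Attention • Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Zagoruyko & Komodakis. In ICLR, 2017. [Figure: an image x_i is fed to the teacher f_T and the student f_S; activations of size H×W×C′ and H×W×C are averaged over channels into H×W attention maps Q_T and Q_S, and the teacher's attention is transferred to the student.] Objective: [equation image not included in the transcript]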
  • 10. Recent Approaches: Student Over Teacher • Born-Again Neural Networks (Furlanello et al. In ICML, 2018.) • Label Refinery: Improving ImageNet Classification through Label Progression (Bagherinezhad et al. In arXiv, 2018.) • The student architecture is identical to the teacher's, and the teacher's class probability serves as the ground truth for training the student. • Surprisingly, the student is significantly better than the teacher.
  • 11. Individual Knowledge Distillation: Generalization • Previous works can be expressed in a common form [equation image not included in the transcript], where f_T: teacher, f_S: student, l: loss, t_i = f_T(x_i), s_i = f_S(x_i). • IKD transfers the output of individual examples from teacher to student.
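The formula on this slide did not survive extraction. From the definitions the slide does give (a loss l comparing the teacher output t_i with the student output s_i for each example), the general form can plausibly be written as below. The symbol 𝒳 for the set of training examples and the name L_IKD are notation assumed here.

```latex
\mathcal{L}_{\mathrm{IKD}}
  = \sum_{x_i \in \mathcal{X}} l\bigl(f_T(x_i),\, f_S(x_i)\bigr)
  = \sum_{x_i \in \mathcal{X}} l(t_i,\, s_i)
```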
  • 12. What is the Knowledge of a Model? Q. What constitutes the knowledge in a learned model?
  • 13. What is the Knowledge of a Model? Q. What constitutes the knowledge in a learned model? A. (IKD) Output of individual examples represented by the teacher.
  • 14. What is the Knowledge of a Model? Q. What constitutes the knowledge in a learned model? A. (IKD) Output of individual examples represented by the teacher. A. (RKD) Relations among examples represented by the teacher.
  • 15. Relational Knowledge Distillation: Generalization • Relational knowledge distillation can be expressed in the form [equation image not included in the transcript], where ψ is a function extracting a relation. • RKD transfers relations among examples represented by the teacher to the student.
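The formula on this slide is also missing from the transcript. Using the slide's own ingredients (the relation function ψ applied to teacher outputs t and student outputs s, compared by a loss l), one consistent way to write the general form is shown below. The tuple size n, the set 𝒳ⁿ of n-tuples of examples, and the name L_RKD are notation assumed here.

```latex
\mathcal{L}_{\mathrm{RKD}}
  = \sum_{(x_1, \dots, x_n) \in \mathcal{X}^n}
    l\bigl(\psi(t_1, \dots, t_n),\, \psi(s_1, \dots, s_n)\bigr)
```

With n = 2 and ψ a distance this gives a pairwise (distance-wise) loss, and with n = 3 and ψ an angle it gives a triplet (angle-wise) loss.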
  • 16. IKD versus RKD • IKD transfers the output of individual examples represented by the teacher to the student. • RKD transfers relations among examples represented by the teacher to the student.
  • 17. Relational Knowledge Distillation: Structure to Structure • Among many relations, we transfer the “structure” of the embedding space. • Distance-wise loss (pair) • Angle-wise loss (triplet) [Figure: point-to-point individual KD, matching each t_i to s_i, vs. structure-to-structure relational KD, matching the structure of t_1, t_2, t_3 to that of s_1, s_2, s_3.]
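  • 18. Relational Knowledge Distillation: Distance-wise Loss • Distance-wise loss (RKD-D) [equation image not included in the transcript], where l_δ is the Huber loss. • RKD-D transfers relative distances between points in the embedding space. [Figure: embedding space with teacher points t_1, t_2, t_3 and student points s_1, s_2, s_3; pairwise distances are labeled 1.2, 1.0, 0.8, 0.9, 1.7, 0.4.]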
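The RKD-D equation itself is not in the transcript, so the following is only a sketch of the idea the slide states: compare relative pairwise distances of teacher and student embeddings with a Huber loss. It assumes that "relative" means each distance is divided by the mean pairwise distance of its own embedding set, that δ = 1, and that the loss is averaged over pairs. All function names are illustrative.

```python
import math
from itertools import combinations


def huber(x, y, delta=1.0):
    """Huber loss: quadratic for small differences, linear for large ones."""
    d = abs(x - y)
    return 0.5 * d * d if d <= delta else delta * (d - 0.5 * delta)


def normalized_pairwise_distances(points):
    """Pairwise Euclidean distances divided by their mean (assumed normalization).

    Requires at least two points that do not all coincide.
    """
    dists = [math.dist(a, b) for a, b in combinations(points, 2)]
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]


def rkd_distance_loss(teacher, student):
    """Average Huber loss between teacher and student relative distances."""
    t = normalized_pairwise_distances(teacher)
    s = normalized_pairwise_distances(student)
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)
```

Because only relative distances are compared, a student embedding that is a uniformly scaled copy of the teacher's incurs (numerically) zero loss, and the two embeddings may have different dimensions.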
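  • 19. Relational Knowledge Distillation: Angle-wise Loss • Angle-wise loss (RKD-A) [equation image not included in the transcript] • RKD-A transfers the angle formed by three points in the embedding space. [Figure: embedding space with teacher points t_1, t_2, t_3 and student points s_1, s_2, s_3, each forming angles θ_1, θ_2, θ_3.]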
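The RKD-A equation is likewise missing from the transcript. The sketch below illustrates the stated idea, matching the angle formed by three points, under the assumptions that the angle is represented by its cosine, that the cosines are compared with a Huber loss (δ = 1), and that the loss is averaged over all ordered triplets of distinct points. Function names are illustrative.

```python
import math
from itertools import permutations


def huber(x, y, delta=1.0):
    """Huber loss: quadratic for small differences, linear for large ones."""
    d = abs(x - y)
    return 0.5 * d * d if d <= delta else delta * (d - 0.5 * delta)


def unit_vector(a, b):
    """Unit vector pointing from b to a; a and b must be distinct points."""
    diff = [x - y for x, y in zip(a, b)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


def cos_angle(p_i, p_j, p_k):
    """Cosine of the angle at vertex p_j between the rays to p_i and p_k."""
    e_ij = unit_vector(p_i, p_j)
    e_kj = unit_vector(p_k, p_j)
    return sum(a * b for a, b in zip(e_ij, e_kj))


def rkd_angle_loss(teacher, student):
    """Average Huber loss between teacher and student triplet cosines."""
    triplets = list(permutations(range(len(teacher)), 3))
    total = 0.0
    for i, j, k in triplets:
        total += huber(cos_angle(teacher[i], teacher[j], teacher[k]),
                       cos_angle(student[i], student[j], student[k]))
    return total / len(triplets)
```

Angles are unchanged by translation and uniform scaling, so a shifted and rescaled copy of the teacher embedding gives (numerically) zero loss.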
  • 20. Relational Knowledge Distillation: How to use RKD? • Where to apply? • On any hidden layers or embedding layers. • Not on layers where individual output values are crucial, because RKD does not transfer the output values of individual examples; e.g., the softmax layer for classification. • How to use RKD during training? • RKD loss can be combined with a task-specific loss, L_task + λ · L_RKD. • RKD loss can be used alone for training an embedding network, L_RKD.
  • 21. Experiment • Metric Learning (Image retrieval) • Image Classification • Few-Shot Learning
  • 22. Experiment: What is Metric Learning? • Metric learning aims to train an embedding model. • In the embedding space, distances between projected examples correspond to their semantic similarity. [Figure: images x_1, x_2, x_3 are mapped by a DNN f(x; W) to f(x_1), f(x_2), f(x_3) in a d-dimensional embedding space, with positive and negative pairs marked; t-SNE of the embedding space on the Cars 196 dataset (Wang et al., 2017).]
  • 23. Experiment: Metric Learning • Evaluation • Image retrieval, recall@k • Dataset • Cars 196 (Krause et al. In 3dRR, 2013.) • CUB-200-2011 (Wah et al. In CNS-TR, 2011.) • Stanford Online Products (Song et al. In CVPR, 2016.) • Architecture • Teacher: ResNet50 (backbone) + 512-d fc layer (embedding layer) + L2 normalization • Student: ResNet18 + fc layer of various dimensions + L2 normalization (optional) • Targeting layer of RKD • Final embedding outputs of teacher and student • Training Objective • Teacher: Triplet loss & Distance-weighted sampling (Wu et al. In ICCV, 2017.) • Student: Triplet loss, RKD-D, RKD-A, RKD-DA, DarkRank (Chen et al. In AAAI, 2018.)
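The slides name recall@k as the retrieval metric without defining it. A minimal sketch of one common definition follows: each item is used as a query, all other items are ranked by Euclidean distance in the embedding space, and the query counts as a hit if any of its k nearest neighbours shares its label. This definition and the function name are assumptions for illustration; the exact evaluation protocol behind the reported numbers is not given in the slides.

```python
import math


def recall_at_k(embeddings, labels, k):
    """Fraction of queries with a same-label item among their k nearest neighbours."""
    hits = 0
    for q, (q_emb, q_label) in enumerate(zip(embeddings, labels)):
        # Rank every other item by Euclidean distance to the query.
        others = [i for i in range(len(embeddings)) if i != q]
        others.sort(key=lambda i: math.dist(q_emb, embeddings[i]))
        if any(labels[i] == q_label for i in others[:k]):
            hits += 1
    return hits / len(embeddings)


# Toy data: the last point carries label 1 but lies close to the label-0 cluster.
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.2, 0.1)]
toy_labels = [0, 0, 1, 1, 1]
print(recall_at_k(toy_embeddings, toy_labels, 1))
print(recall_at_k(toy_embeddings, toy_labels, 3))
```

On the toy data, the misplaced point misses at k = 1 but is recovered at k = 3, which is why recall@k is non-decreasing in k.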
  • 24. Experiment: Metric Learning: Distillation to small network • Model-d refers to a model with d-dimensional embedding. [Figure: (a) Recall@1 on CUB-200-2011; (b) Recall@1 on Cars 196.]
  • 25. Experiment: Metric Learning: Self-Distillation • Teacher: ResNet50 + 512-d fc + L2 normalization, trained using triplet loss • Student: ResNet50 + 512-d fc, trained using RKD-DA [Results (a): Recall@1 of Self-Distillation.]
  • 26. Experiment: Metric Learning: Comparison with state-of-the-art methods • On CUB-200-2011, we achieve state-of-the-art performance regardless of backbone network. • On Cars 196 & Stanford Online Products, we achieve second-best performance. Note that ABE8 (Kim et al. In ECCV, 2018) requires additional attention modules for 8 branches. [Results (a): Recall@K comparison with state-of-the-art methods.]
  • 27. Experiment: Metric Learning: Qualitative Results • Where the teacher (Triplet) fails, the student (RKD-DA) succeeds at top-1. [Figure: (a) Retrieval results on CUB-200-2011; (b) Retrieval results on Cars 196.]
  • 28. Experiment: Image Classification • Datasets: CIFAR-10, CIFAR-100 • Architecture • Teacher: ResNet50 • Student: VGG-11 with BatchNorm • Targeting layer of RKD • Teacher: output of avgpool layer • Student: output of pool5 layer • Training Objective • Teacher: cross-entropy • Student: cross-entropy + (Hinton et al., RKD-D and RKD-DA) [Figure: the teacher (ResNet50 CNN + fc classifier) transfers to the student (VGG11 with BN CNN + fc classifier).] [Results (a): Accuracy (%) on CIFAR-10 and CIFAR-100.]
  • 29. Experiment: What is Few-Shot Learning? • In few-shot learning, a classifier learns to generalize to new unseen classes with only a few examples for each new class. • Shot: the number of examples given for each new class • Way: the number of new classes • E.g., Prototypical Network (Snell et al., In NIPS, 2017) • An embedding network in which classification is performed based on distance from the given examples of new classes. [Figure: Prototypical Networks for few-shot learning.]
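The slide describes classification "based on distance from given examples of new classes". A minimal sketch of that idea is given below, following the prototypical-network recipe of averaging each class's support embeddings into a prototype and assigning a query to the nearest prototype. The embeddings are taken as given (no network is trained here), and the function names and toy data are illustrative.

```python
import math


def class_prototypes(support_embeddings, support_labels):
    """Mean embedding per class, computed from the few labeled support examples."""
    sums, counts = {}, {}
    for emb, label in zip(support_embeddings, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(emb)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], emb)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(query_embedding, prototypes):
    """Assign the query to the class whose prototype is nearest (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query_embedding, prototypes[label]))


# 2-way 2-shot toy episode: two new classes, two support examples each.
support = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (12.0, 10.0)]
support_labels = ["a", "a", "b", "b"]
prototypes = class_prototypes(support, support_labels)
print(classify((1.0, 1.0), prototypes))
print(classify((9.0, 9.0), prototypes))
```

In the slide's terminology, this toy episode has way = 2 (two new classes) and shot = 2 (two examples per class).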
  • 30. Experiment: Few-Shot Learning • Datasets • Omniglot (Lake et al. In Science, 2015.) • miniImageNet (Vinyals et al. In NIPS, 2016.) • Architecture • Teacher: 4 convolutional layers • Student: Same as teacher • Targeting layer of RKD • Final embedding output of teacher and student • Training Objective • Teacher: Snell et al. (prototypical networks) • Student: Snell et al. + (RKD-D or RKD-DA) [Results: Accuracy (%) on Omniglot; Accuracy (%) on miniImageNet.]
  • 31. Discussion: Effective Adaptation on Source Domain • Both Cars 196 & CUB-200-2011 are fine-grained classification datasets. • They require adaptation to specific characteristics of the domain, e.g., finding a local patch that distinguishes an object from others. • ‘Triplet’ is the teacher network used for educating the ‘RKD-DA’ model. [Figure: (a) Recall@1 curve of train/evaluation set during training of the teacher (Triplet) and the student (RKD-DA) on Cars 196; (b) Recall@1 on various domains. Both ‘Triplet’ and ‘RKD-DA’ are models trained on Cars 196.]
  • 32. Conclusion • We have introduced Relational KD, which effectively transfers knowledge using relations among data examples represented by the teacher. • Experiments conducted on different tasks and benchmarks show that Relational KD improves the performance of the educated student networks by a significant margin.
  • 34. References [1] G. Hinton, O. Vinyals, and J. Dean. Distilling the knowledge in a neural network. In NIPS workshop, 2015. [2] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio. FitNets: Hints for thin deep nets. In ICLR, 2015. [3] S. Zagoruyko and N. Komodakis. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. In ICLR, 2017. [4] T. Furlanello, Z. C. Lipton, M. Tschannen, L. Itti, and A. Anandkumar. Born-again neural networks. In ICML, 2018. [5] H. Bagherinezhad, M. Horton, M. Rastegari, and A. Farhadi. Label refinery: Improving ImageNet classification through label progression. In arXiv, 2018. [6] J. Krause, M. Stark, J. Deng, and L. Fei-Fei. 3D object representations for fine-grained categorization. In 3dRR, 2013. [7] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie. In CNS-TR, 2011. [8] H. Oh Song, Y. Xiang, S. Jegelka, and S. Savarese. Deep metric learning via lifted structured feature embedding. In CVPR, 2016. [9] C.-Y. Wu, R. Manmatha, A. J. Smola, and P. Krahenbuhl. Sampling matters in deep embedding learning. In ICCV, 2017. [10] Y. Chen, N. Wang, and Z. Zhang. DarkRank: Accelerating deep metric learning via cross sample similarities transfer. In AAAI, 2018. [11] W. Kim, B. Goyal, K. Chawla, J. Lee, and K. Kwon. Attention-based ensemble for deep metric learning. In ECCV, 2018. [12] J. Snell, K. Swersky, and R. Zemel. Prototypical networks for few-shot learning. In NIPS, 2017. [13] B. M. Lake, R. Salakhutdinov, and J. B. Tenenbaum. Human-level concept learning through probabilistic program induction. In Science, 2015. [14] O. Vinyals, C. Blundell, T. Lillicrap, D. Wierstra, et al. Matching networks for one shot learning. In NIPS, 2016.