A Survey on Transfer Learning
Sinno Jialin Pan
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
Joint work with Prof. Qiang Yang
Transfer Learning? (DARPA 05)
Transfer Learning (TL):
The ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks (in new domains).

It is motivated by human learning: people can often transfer knowledge learnt previously to novel situations, e.g.,
  Chess → Checkers
  Mathematics → Computer Science
  Table Tennis → Tennis
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
• Negative Transfer
• Conclusion
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
• Negative Transfer
• Conclusion
Traditional ML vs. TL (P. Langley 06)

[Figure: left, "Traditional ML in multiple domains" — each domain has its own training items and test items; right, "Transfer of learning across domains" — training items in one domain help with test items in another.]

Humans can learn in many domains. Humans can also transfer from one domain to other domains.
Traditional ML vs. TL

[Figure: "Learning Process of Traditional ML" — each domain's training items feed a separate learning system; "Learning Process of Transfer Learning" — knowledge extracted from the source learning systems is passed to the target learning system.]
Notation
Domain: D = {X, P(X)}
It consists of two components: a feature space X and a marginal probability distribution P(X).

In general, if two domains are different, then they may have different feature spaces or different marginal distributions.

Task: T = {Y, f(·)}
Given a specific domain D and label space Y, the task is, for each x in the domain, to predict its corresponding label f(x) = P(y | x).

In general, if two tasks are different, then they may have different label spaces or different conditional distributions.
Notation
For simplicity, we only consider at most two domains and two tasks.

Source domain: D_S

Task in the source domain: T_S

Target domain: D_T

Task in the target domain: T_T
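The two definitions can be restated compactly; the following LaTeX uses the notation of the published version of this survey:

```latex
% Domain and task
\mathcal{D} = \{\mathcal{X},\, P(X)\}, \qquad
\mathcal{T} = \{\mathcal{Y},\, f(\cdot)\}, \quad f(x) = P(y \mid x)

% Two domains differ if their feature spaces or marginals differ
\mathcal{D}_S \neq \mathcal{D}_T \;\Longleftrightarrow\;
\mathcal{X}_S \neq \mathcal{X}_T \ \text{or}\ P_S(X) \neq P_T(X)

% Two tasks differ if their label spaces or conditionals differ
\mathcal{T}_S \neq \mathcal{T}_T \;\Longleftrightarrow\;
\mathcal{Y}_S \neq \mathcal{Y}_T \ \text{or}\ P(Y_S \mid X_S) \neq P(Y_T \mid X_T)
```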
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
• Negative Transfer
• Conclusion
Why Transfer Learning?
• In some domains, labeled data are in short supply.
• In some domains, the calibration effort is very expensive.
• In some domains, the learning process is time consuming.

• How can we extract knowledge learnt from related domains to help learning in a target domain with only a few labeled data?
• How can we extract knowledge learnt from related domains to speed up learning in a target domain?

Transfer learning techniques may help!
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
• Negative Transfer
• Conclusion
Settings of Transfer Learning

Transfer learning settings     | Labeled data in a source domain | Labeled data in a target domain | Tasks
-------------------------------|---------------------------------|---------------------------------|-------------------------------
Inductive Transfer Learning    | × or √                          | √                               | Classification, Regression, …
Transductive Transfer Learning | √                               | ×                               | Classification, Regression, …
Unsupervised Transfer Learning | ×                               | ×                               | Clustering, …
An overview of various settings of transfer learning:

Transfer Learning
• Labeled data are available in a target domain → Inductive Transfer Learning
  • Case 1: no labeled data in a source domain → Self-taught Learning
  • Case 2: labeled data are available in a source domain; source and target tasks are learnt simultaneously → Multi-task Learning
• Labeled data are available only in a source domain → Transductive Transfer Learning
  • Assumption: different domains but a single task → Domain Adaptation
  • Assumption: a single domain and a single task → Sample Selection Bias / Covariate Shift
• No labeled data in either the source or the target domain → Unsupervised Transfer Learning
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
• Negative Transfer
• Conclusion
Approaches to Transfer Learning

Transfer learning approaches    | Description
--------------------------------|------------------------------------------------------------
Instance-transfer               | Re-weight some labeled data in a source domain for use in the target domain.
Feature-representation-transfer | Find a “good” feature representation that reduces the difference between a source and a target domain or minimizes the error of models.
Model-transfer                  | Discover shared parameters or priors of models between a source domain and a target domain.
Relational-knowledge-transfer   | Build a mapping of relational knowledge between a source domain and a target domain.
Approaches to Transfer Learning

                                | Inductive TL | Transductive TL | Unsupervised TL
--------------------------------|--------------|-----------------|----------------
Instance-transfer               | √            | √               |
Feature-representation-transfer | √            | √               | √
Model-transfer                  | √            |                 |
Relational-knowledge-transfer   | √            |                 |
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
  • Inductive Transfer Learning
  • Transductive Transfer Learning
  • Unsupervised Transfer Learning
Inductive Transfer Learning
Instance-transfer Approaches

• Assumption: the source domain and target domain data use exactly the same features and labels.

• Motivation: although the source domain data cannot be reused directly, some parts of the data can still be reused by re-weighting.

• Main Idea: discriminatively adjust the weights of source domain data for use in the target domain.
Inductive Transfer Learning
--- Instance-transfer Approaches
Non-standard SVMs [Wu and Dietterich ICML-04]

Start from uniform weights, then correct the decision boundary by re-weighting. The objective combines a loss on the (few) target domain data, a loss on the source domain data, and a regularization term:

  min_f  Σ_{i ∈ target} L(f(x_i), y_i)  +  λ Σ_{j ∈ source} L(f(x_j), y_j)  +  γ ‖f‖²

→ Differentiate the cost for misclassification of the target and source data (the source loss is weighted less, 0 < λ < 1).
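A minimal sketch of the idea with scikit-learn on synthetic data; the per-example `sample_weight` plays the role of the differentiated misclassification costs (the cost values here are illustrative, not those of Wu and Dietterich):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
Xs, ys = rng.normal(0.0, 1.0, (200, 5)), rng.integers(0, 2, 200)  # source (plentiful)
Xt, yt = rng.normal(0.5, 1.0, (20, 5)), rng.integers(0, 2, 20)    # target (scarce)

X = np.vstack([Xs, Xt])
y = np.concatenate([ys, yt])
# Misclassifying a target example costs more than misclassifying a source one.
w = np.concatenate([np.full(len(Xs), 0.3), np.full(len(Xt), 1.0)])

clf = SVC(kernel="rbf").fit(X, y, sample_weight=w)
```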
Inductive Transfer Learning
--- Instance-transfer Approaches
TrAdaBoost [Dai et al. ICML-07]

The whole training data set consists of source domain labeled data and target domain labeled data. Two weighting schemes are combined:
• Hedge(β) [Freund et al. 1997]: decrease the weights of the misclassified source domain data.
• AdaBoost [Freund et al. 1997]: increase the weights of the misclassified target domain data.

Classifiers are trained on the re-weighted labeled data and applied to the target domain unlabeled data.
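A minimal sketch of the TrAdaBoost weight-update loop, assuming binary labels in {0, 1} and decision stumps as the base learner (the paper's final hypothesis additionally votes over only the second half of the learners, which is omitted here):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost(Xs, ys, Xt, yt, n_iter=10):
    """Minimal TrAdaBoost sketch (binary labels in {0, 1}).

    Source examples misclassified by the current learner are down-weighted
    (Hedge-style); misclassified target examples are up-weighted (AdaBoost-style).
    """
    n_s, n_t = len(Xs), len(Xt)
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    w = np.ones(n_s + n_t)                                 # initial weights
    beta_s = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_s) / n_iter))
    learners, betas = [], []
    for _ in range(n_iter):
        p = w / w.sum()
        clf = DecisionTreeClassifier(max_depth=1)          # weak learner
        clf.fit(X, y, sample_weight=p)
        err = np.abs(clf.predict(X) - y)                   # 0/1 loss per example
        # weighted error on the target portion only
        eps = np.sum(p[n_s:] * err[n_s:]) / p[n_s:].sum()
        eps = np.clip(eps, 1e-10, 0.499)                   # keep beta_t in (0, 1)
        beta_t = eps / (1.0 - eps)
        w[:n_s] *= beta_s ** err[:n_s]                     # shrink bad source weights
        w[n_s:] *= beta_t ** (-err[n_s:])                  # grow bad target weights
        learners.append(clf)
        betas.append(beta_t)
    return learners, betas
```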
Inductive Transfer Learning
Feature-representation-transfer Approaches
Supervised Feature Construction [Argyriou et al. NIPS-06, NIPS-07]

Assumption: if t tasks are related to each other, then they may share some common features which can benefit all of the tasks.

Input: t tasks, each with its own training data.

Output: common features learnt across the t tasks, and t models, one per task.
Supervised Feature Construction [Argyriou et al. NIPS-06, NIPS-07]

  min_{A, U}  Σ_t Σ_i L( y_{ti}, ⟨a_t, Uᵀ x_{ti}⟩ )  +  γ ‖A‖²_{2,1}
  s.t.  U ∈ O^d   (orthogonal constraints)

where A = [a_1, …, a_t]. The first term is the average of the empirical error across the t tasks; the (2,1)-norm regularization makes the representation sparse, forcing the tasks to share a small set of rows of A (the common features).
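Argyriou et al.'s alternating algorithm is not in scikit-learn, but the same ℓ2,1 row-sparsity idea is available off the shelf as `MultiTaskLasso`; a small illustrative sketch on synthetic data:

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))                     # shared inputs
W = np.zeros((20, 3))                              # 3 related tasks ...
W[:4] = rng.normal(size=(4, 3))                    # ... sharing 4 true features
Y = X @ W + 0.1 * rng.normal(size=(100, 3))

# The l2,1 penalty zeroes entire rows of the coefficient matrix, so all
# tasks end up selecting the same small set of common features.
mtl = MultiTaskLasso(alpha=0.1).fit(X, Y)
shared_features = np.flatnonzero(np.abs(mtl.coef_).sum(axis=0))
print(shared_features)                             # ideally recovers 0, 1, 2, 3
```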
Inductive Transfer Learning
Feature-representation-transfer Approaches
Unsupervised Feature Construction [Raina et al. ICML-07]

Three steps:
• Apply a sparse coding algorithm [Lee et al. NIPS-07] to learn a higher-level representation from unlabeled data in the source domain.

• Transform the target data to the new representation using the bases learnt in the first step.

• Apply traditional discriminative models to the new representations of the target data with the corresponding labels.
Unsupervised Feature Construction [Raina et al. ICML-07]

Step 1 (sparse coding on the source domain):

  min_{a_S, b}  Σ_i ‖ x_S^(i) − Σ_j a_S^(i,j) b_j ‖²  +  β ‖ a_S^(i) ‖₁

Input: source domain data X_S = {x_S^(i)} and coefficient β.
Output: new representations of the source domain data A_S = {a_S^(i)} and new bases B = {b_j}.

Step 2 (encoding the target domain with the learnt bases):

  a_T^(i) = argmin_a  ‖ x_T^(i) − Σ_j a^(j) b_j ‖²  +  β ‖ a ‖₁

Input: target domain data X_T = {x_T^(i)}, coefficient β, and bases B = {b_j}.
Output: new representations of the target domain data A_T = {a_T^(i)}.
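The two steps map directly onto scikit-learn's dictionary-learning tools; a sketch on synthetic data (array names and sizes are illustrative):

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, SparseCoder
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Xs = rng.normal(size=(300, 64))            # unlabeled source-domain data
Xt = rng.normal(size=(60, 64))             # target-domain data
yt = rng.integers(0, 2, 60)                # target-domain labels

# Step 1: learn higher-level bases b_j from the source domain via sparse coding.
dico = DictionaryLearning(n_components=32, alpha=1.0, max_iter=50).fit(Xs)

# Step 2: re-express the target data as sparse codes over the learnt bases.
coder = SparseCoder(dictionary=dico.components_,
                    transform_algorithm="lasso_lars", transform_alpha=1.0)
At = coder.transform(Xt)

# Step 3: train an ordinary discriminative model on the new representation.
clf = LogisticRegression(max_iter=1000).fit(At, yt)
```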
Inductive Transfer Learning
Model-transfer Approaches
Regularization-based Method [Evgeniou and Pontil, KDD-04]

Assumption: if t tasks are related to each other, then they may share some parameters among the individual models.

Assume f_t(x) = w_t ⋅ x is a hyper-plane for task t, where t ∈ {S, T} and

  w_t = w_0 + v_t

w_0 is the common part shared by all tasks; v_t is the specific part for the individual task.

Encode them into SVMs, with regularization terms for the multiple tasks:

  min_{w_0, v_t, ξ_ti}  Σ_t Σ_i ξ_ti + λ_1 Σ_t ‖v_t‖² + λ_2 ‖w_0‖²
  s.t.  y_ti (w_0 + v_t) ⋅ x_ti ≥ 1 − ξ_ti,  ξ_ti ≥ 0
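The w_t = w_0 + v_t decomposition can also be obtained implicitly with a feature-augmentation trick (in the spirit of Daume III's method, cited later in this deck): a linear model on the augmented features learns a shared block w_0 plus a per-domain block v_t. A sketch on synthetic data:

```python
import numpy as np
from sklearn.svm import LinearSVC

def augment(X, domain_id, n_domains=2):
    """EasyAdapt-style augmentation: [shared copy | one per-domain copy].
    A linear model on these features implicitly learns w_t = w_0 + v_t:
    the shared block carries w_0, each domain block carries its v_t."""
    blocks = [X] + [X if d == domain_id else np.zeros_like(X)
                    for d in range(n_domains)]
    return np.hstack(blocks)

rng = np.random.default_rng(0)
Xs, ys = rng.normal(size=(200, 10)), rng.integers(0, 2, 200)  # source, domain 0
Xt, yt = rng.normal(size=(20, 10)), rng.integers(0, 2, 20)    # target, domain 1

X = np.vstack([augment(Xs, 0), augment(Xt, 1)])
y = np.concatenate([ys, yt])
clf = LinearSVC(max_iter=5000).fit(X, y)
```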
Inductive Transfer Learning
Relational-knowledge-transfer Approaches
TAMAR [Mihalkova et al. AAAI-07]

Assumption: if the target domain and source domain are related, then some relationships between objects may be similar across the domains, and these can be used for transfer learning.

Input:
1. Relational data in the source domain and a statistical relational model, a Markov Logic Network (MLN), which has been learnt in the source domain.
2. Relational data in the target domain.

Output: a new statistical relational model, an MLN, for the target domain.

Goal: to learn an MLN in the target domain more efficiently and effectively.
TAMAR [Mihalkova et al. AAAI-07]
Two Stages:
1. Predicate Mapping
   – Establish a mapping between predicates in the source and target domains. Once a mapping is established, clauses from the source domain can be translated into the target domain.
2. Revising the Mapped Structure
   – The clauses mapped directly from the source domain may not be completely accurate; they may need to be revised, augmented, and re-weighted in order to properly model the target data.
TAMAR [Mihalkova et al. AAAI-07]

Example: mapping from the source domain (academic) to the target domain (movie):

  Student (B) —AdvisedBy→ Professor (A)     maps to   Actor (A) —WorkedFor→ Director (B)
  Student (B) —Publication→ Paper (T)       maps to   Actor (A) —MovieMember→ Movie (M)
  Professor (A) —Publication→ Paper (T)     maps to   Director (B) —MovieMember→ Movie (M)

After the predicate mapping, the mapped clauses are revised to fit the target data.
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
  • Inductive Transfer Learning
  • Transductive Transfer Learning
  • Unsupervised Transfer Learning
Transductive Transfer Learning
Instance-transfer Approaches
Sample Selection Bias / Covariate Shift [Zadrozny ICML-04, Shimodaira JSPI-00]

Input: a lot of labeled data in the source domain and no labeled data in the target domain.

Output: models for use in the target domain.

Assumption: the source task and target task are the same. In addition, P(Y_S | X_S) and P(Y_T | X_T) are the same, while P(X_S) and P(X_T) may be different, caused by the different sampling processes (training data vs. test data).

Main Idea: re-weight (importance sampling) the source domain data.
Sample Selection Bias / Covariate Shift

To correct sample selection bias, re-weight each source domain example by β(x) = P_T(x) / P_S(x), the weights for the source domain data:

  θ* = argmin_θ  Σ_{(x, y) ∈ D_S} ( P_T(x) / P_S(x) ) L(x, y, θ)

How to estimate β(x)? One straightforward solution is to estimate P(X_S) and P(X_T) separately. However, density estimation is itself a hard problem.
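The straightforward route the slide mentions can be sketched with kernel density estimates (bandwidths and data are illustrative); it works in low dimensions but degrades quickly, which motivates KMM on the next slide:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, (500, 3))     # source sample
Xt = rng.normal(0.5, 1.0, (200, 3))     # target sample

kde_s = KernelDensity(bandwidth=0.5).fit(Xs)
kde_t = KernelDensity(bandwidth=0.5).fit(Xt)

# beta(x) = P_T(x) / P_S(x), estimated pointwise on the source data
# (score_samples returns log-densities).
beta = np.exp(kde_t.score_samples(Xs) - kde_s.score_samples(Xs))
```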
Sample Selection Bias / Covariate Shift
Kernel Mean Matching (KMM) [Huang et al. NIPS 2006]

Main Idea: KMM estimates the weights β(x) directly instead of estimating the density functions.

It can be proved that β can be estimated by solving the following quadratic programming (QP) optimization problem, which matches the means of the re-weighted training data and the test data in an RKHS:

  min_β  (1/2) βᵀ K β − κᵀ β
  s.t.  β_i ∈ [0, B]  and  | Σ_i β_i − n_S | ≤ n_S ε

where K_ij = k(x_S^(i), x_S^(j)) and κ_i = (n_S / n_T) Σ_j k(x_S^(i), x_T^(j)).

Theoretical Support: Maximum Mean Discrepancy (MMD) [Borgwardt et al. BIOINFORMATICS-06]. The distance between two distributions can be measured by the Euclidean distance between their mean vectors in an RKHS.
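A sketch of the KMM QP with cvxopt; the kernel bandwidth gamma, the bound B, and eps = B/√n_S are conventional choices, not prescribed by the slide:

```python
import numpy as np
from cvxopt import matrix, solvers
from sklearn.metrics.pairwise import rbf_kernel

def kmm_weights(Xs, Xt, gamma=1.0, B=10.0):
    """Estimate beta_i ~ P_T(x_i)/P_S(x_i) by matching source and target
    means in an RKHS (Kernel Mean Matching)."""
    ns, nt = len(Xs), len(Xt)
    eps = B / np.sqrt(ns)                       # slack on sum(beta) = ns
    K = rbf_kernel(Xs, Xs, gamma=gamma)         # K_ij = k(x_S_i, x_S_j)
    kappa = (ns / nt) * rbf_kernel(Xs, Xt, gamma=gamma).sum(axis=1)
    # QP: minimize (1/2) b'Kb - kappa'b
    # s.t. 0 <= b_i <= B and |sum(b) - ns| <= ns * eps
    G = np.vstack([np.ones((1, ns)), -np.ones((1, ns)),
                   np.eye(ns), -np.eye(ns)])
    h = np.hstack([ns * (1 + eps), ns * (eps - 1),
                   B * np.ones(ns), np.zeros(ns)])
    sol = solvers.qp(matrix(K), matrix(-kappa), matrix(G), matrix(h))
    return np.array(sol["x"]).ravel()
```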
Transductive Transfer Learning
Feature-representation-transfer Approaches
Domain Adaptation [Blitzer et al. EMNLP-06, Ben-David et al. NIPS-07, Daume III ACL-07]

Assumption: a single task across domains, which means P(Y_S | X_S) and P(Y_T | X_T) are the same, while P(X_S) and P(X_T) may be different, caused by the different feature representations across domains.

Main Idea: find a “good” feature representation that reduces the “distance” between domains.

Input: a lot of labeled data in the source domain and only unlabeled data in the target domain.

Output: a common representation for source and target domain data, and a model on the new representation for use in the target domain.
Domain Adaptation
Structural Correspondence Learning (SCL) [Blitzer et al. EMNLP-06, Blitzer et al. ACL-07, Ando and Zhang JMLR-05]

Motivation: if two domains are related to each other, then there may exist some “pivot” features that occur in both domains. Pivot features are features that behave in the same way for discriminative learning in both domains.

Main Idea: identify correspondences among features from different domains by modeling their correlations with the pivot features. Non-pivot features from different domains that are correlated with many of the same pivot features are assumed to correspond, and they are treated similarly in a discriminative learner.
SCL [Blitzer et al. EMNLP-06, Blitzer et al. ACL-07, Ando and Zhang JMLR-05]

a) Heuristically choose m pivot features (task specific).
b) Transform each vector of pivot features into a vector of binary values, and create the corresponding prediction problems: predict each pivot feature from the non-pivot features.
c) Learn the parameters of each prediction problem.
d) Do an eigen-decomposition on the matrix of parameters and learn the linear mapping function.
e) Use the learnt mapping function to construct new features and train classifiers on the new representations.
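A compact sketch of this pipeline on a synthetic binary bag-of-words matrix; the pivot choice, the number of dimensions k = 10, and the use of `SGDClassifier` with modified Huber loss are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = (rng.random((300, 100)) < 0.2).astype(float)   # binary bag-of-words matrix
pivot_idx = np.arange(20)                          # pretend the first 20 are pivots
nonpivot = np.setdiff1d(np.arange(X.shape[1]), pivot_idx)

# One linear predictor per pivot feature: predict its presence from non-pivots.
W = np.zeros((len(nonpivot), len(pivot_idx)))
for j, p in enumerate(pivot_idx):
    clf = SGDClassifier(loss="modified_huber", alpha=1e-4)
    clf.fit(X[:, nonpivot], (X[:, p] > 0).astype(int))
    W[:, j] = clf.coef_.ravel()

# Eigen/singular decomposition of the predictor matrix gives the mapping theta.
U, _, _ = np.linalg.svd(W, full_matrices=False)
theta = U[:, :10].T                                # (k, n_nonpivot) linear mapping

# New representation: original features augmented with the projected features.
X_new = np.hstack([X, X[:, nonpivot] @ theta.T])
```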
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
  • Inductive Transfer Learning
  • Transductive Transfer Learning
  • Unsupervised Transfer Learning
Unsupervised Transfer Learning
Feature-representation-transfer Approaches
Self-taught Clustering (STC) [Dai et al. ICML-08]

Input: a lot of unlabeled data in a source domain and a small amount of unlabeled data in a target domain.

Goal: cluster the target domain data.

Assumption: the source domain and target domain data share some common features, which can help clustering in the target domain.

Main Idea: extend the information-theoretic co-clustering algorithm [Dhillon et al. KDD-03] to transfer learning.
Self-taught Clustering (STC) [Dai et al. ICML-08]

The source domain data and the target domain data are co-clustered jointly through their common features: one co-clustering in the source domain and one in the target domain, sharing the same clustering of the common features. The objective function that needs to be minimized is the sum of the two co-clusterings' losses in mutual information:

  J = I(X_T, Z) − I(X̃_T, Z̃) + λ [ I(X_S, Z) − I(X̃_S, Z̃) ]

where Z is the set of common features, X̃_S and X̃_T are the instance clusterings, and Z̃ is the shared clustering of the common features (the cluster functions). Output: the clustering X̃_T of the target domain data.
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
• Negative Transfer
• Conclusion
Negative Transfer
• Most approaches to transfer learning assume that transferring knowledge across domains is always beneficial.

• However, in some cases, when two tasks are too dissimilar, brute-force transfer may even hurt the performance on the target task; this is called negative transfer [Rosenstein et al. NIPS-05 Workshop].

• Some researchers have studied how to measure relatedness among tasks [Ben-David and Schuller NIPS-03, Bakker and Heskes JMLR-03].

• How to design a mechanism to avoid negative transfer still needs to be studied theoretically.
Outline
• Traditional Machine Learning vs. Transfer Learning
• Why Transfer Learning?
• Settings of Transfer Learning
• Approaches to Transfer Learning
• Negative Transfer
• Conclusion
Conclusion

                                | Inductive TL | Transductive TL | Unsupervised TL
--------------------------------|--------------|-----------------|----------------
Instance-transfer               | √            | √               |
Feature-representation-transfer | √            | √               | √
Model-transfer                  | √            |                 |
Relational-knowledge-transfer   | √            |                 |

How to avoid negative transfer needs to attract more attention!

Más contenido relacionado

La actualidad más candente

Introduction to continual learning
Introduction to continual learningIntroduction to continual learning
Introduction to continual learningNguyen Giang
 
Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural NetworksPyData
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsJinwon Lee
 
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAI
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAIYurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAI
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAILviv Startup Club
 
Introduction to MAML (Model Agnostic Meta Learning) with Discussions
Introduction to MAML (Model Agnostic Meta Learning) with DiscussionsIntroduction to MAML (Model Agnostic Meta Learning) with Discussions
Introduction to MAML (Model Agnostic Meta Learning) with DiscussionsJoonyoung Yi
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine LearningSri Ambati
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSangwoo Mo
 
Few shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningFew shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningﺁﺻﻒ ﻋﻠﯽ ﻣﯿﺮ
 
딥러닝 논문읽기 모임 - 송헌 Deep sets 슬라이드
딥러닝 논문읽기 모임 - 송헌 Deep sets 슬라이드딥러닝 논문읽기 모임 - 송헌 Deep sets 슬라이드
딥러닝 논문읽기 모임 - 송헌 Deep sets 슬라이드taeseon ryu
 
Computational learning theory
Computational learning theoryComputational learning theory
Computational learning theoryswapnac12
 
Continual Learning Introduction
Continual Learning IntroductionContinual Learning Introduction
Continual Learning IntroductionRidge-i, Inc.
 
Meta learning with memory augmented neural network
Meta learning with memory augmented neural networkMeta learning with memory augmented neural network
Meta learning with memory augmented neural networkKaty Lee
 
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...Edureka!
 
MIT Deep Learning Basics: Introduction and Overview by Lex Fridman
MIT Deep Learning Basics: Introduction and Overview by Lex FridmanMIT Deep Learning Basics: Introduction and Overview by Lex Fridman
MIT Deep Learning Basics: Introduction and Overview by Lex FridmanPeerasak C.
 
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...Dataconomy Media
 
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...Universitat Politècnica de Catalunya
 

La actualidad más candente (20)

Introduction to continual learning
Introduction to continual learningIntroduction to continual learning
Introduction to continual learning
 
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
 
Transfer Learning and Fine-tuning Deep Neural Networks
 Transfer Learning and Fine-tuning Deep Neural Networks Transfer Learning and Fine-tuning Deep Neural Networks
Transfer Learning and Fine-tuning Deep Neural Networks
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
 
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAI
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAIYurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAI
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAI
 
CNN Quantization
CNN QuantizationCNN Quantization
CNN Quantization
 
Introduction to MAML (Model Agnostic Meta Learning) with Discussions
Introduction to MAML (Model Agnostic Meta Learning) with DiscussionsIntroduction to MAML (Model Agnostic Meta Learning) with Discussions
Introduction to MAML (Model Agnostic Meta Learning) with Discussions
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine Learning
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture Note
 
Few shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningFew shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learning
 
딥러닝 논문읽기 모임 - 송헌 Deep sets 슬라이드
딥러닝 논문읽기 모임 - 송헌 Deep sets 슬라이드딥러닝 논문읽기 모임 - 송헌 Deep sets 슬라이드
딥러닝 논문읽기 모임 - 송헌 Deep sets 슬라이드
 
Computational learning theory
Computational learning theoryComputational learning theory
Computational learning theory
 
Continual Learning Introduction
Continual Learning IntroductionContinual Learning Introduction
Continual Learning Introduction
 
Explainable AI
Explainable AIExplainable AI
Explainable AI
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
Meta learning with memory augmented neural network
Meta learning with memory augmented neural networkMeta learning with memory augmented neural network
Meta learning with memory augmented neural network
 
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
 
MIT Deep Learning Basics: Introduction and Overview by Lex Fridman
MIT Deep Learning Basics: Introduction and Overview by Lex FridmanMIT Deep Learning Basics: Introduction and Overview by Lex Fridman
MIT Deep Learning Basics: Introduction and Overview by Lex Fridman
 
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Se...
 
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
 

Destacado

Flavours of Physics Challenge: Transfer Learning approach
Flavours of Physics Challenge: Transfer Learning approachFlavours of Physics Challenge: Transfer Learning approach
Flavours of Physics Challenge: Transfer Learning approachAlexander Rakhlin
 
Self taught clustering
Self taught clusteringSelf taught clustering
Self taught clusteringSOYEON KIM
 
Video concept detection by learning from web images
Video concept detection by learning from web imagesVideo concept detection by learning from web images
Video concept detection by learning from web imagesMediaMixerCommunity
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learningSung Kim
 
Best Blue Brain ppt ever.
Best Blue Brain ppt ever.Best Blue Brain ppt ever.
Best Blue Brain ppt ever.Suhail Shaikh
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect PredictionSung Kim
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Sujit Pal
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakPyData
 
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Universitat Politècnica de Catalunya
 
Artificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google BrainArtificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google BrainRawan Al-Omari
 
TRANSFER OF LEARNING by Lorraine Anoran
TRANSFER OF LEARNING by Lorraine AnoranTRANSFER OF LEARNING by Lorraine Anoran
TRANSFER OF LEARNING by Lorraine AnoranLorraine Mae Anoran
 
Amith blue brain
Amith blue brainAmith blue brain
Amith blue brainAmith Kp
 
Brain Fingerprinting PPT
Brain Fingerprinting PPTBrain Fingerprinting PPT
Brain Fingerprinting PPTVishnu Mysterio
 

Destacado (17)

Flavours of Physics Challenge: Transfer Learning approach
Flavours of Physics Challenge: Transfer Learning approachFlavours of Physics Challenge: Transfer Learning approach
Flavours of Physics Challenge: Transfer Learning approach
 
Self taught clustering
Self taught clusteringSelf taught clustering
Self taught clustering
 
Video concept detection by learning from web images
Video concept detection by learning from web imagesVideo concept detection by learning from web images
Video concept detection by learning from web images
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learning
 
Best Blue Brain ppt ever.
Best Blue Brain ppt ever.Best Blue Brain ppt ever.
Best Blue Brain ppt ever.
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect Prediction
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
 
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
 
Artificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google BrainArtificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google Brain
 
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
 
TRANSFER OF LEARNING by Lorraine Anoran
TRANSFER OF LEARNING by Lorraine AnoranTRANSFER OF LEARNING by Lorraine Anoran
TRANSFER OF LEARNING by Lorraine Anoran
 
Amith blue brain
Amith blue brainAmith blue brain
Amith blue brain
 
Blue brain
Blue brainBlue brain
Blue brain
 
Brain fingerprinting
Brain fingerprintingBrain fingerprinting
Brain fingerprinting
 
BLUE BRAIN
BLUE BRAINBLUE BRAIN
BLUE BRAIN
 
Brain Fingerprinting PPT
Brain Fingerprinting PPTBrain Fingerprinting PPT
Brain Fingerprinting PPT
 

Similar a A survey on transfer learning

Transfer Learning in NLP: A Survey
Transfer Learning in NLP: A SurveyTransfer Learning in NLP: A Survey
Transfer Learning in NLP: A SurveyNUPUR YADAV
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Presentation Online Educa 2011 Jacob Molenaar
Presentation Online Educa 2011 Jacob MolenaarPresentation Online Educa 2011 Jacob Molenaar
Presentation Online Educa 2011 Jacob MolenaarJacob Molenaar
 
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...広樹 本間
 
Fcv rep darrell
Fcv rep darrellFcv rep darrell
Fcv rep darrellzukun
 
Kma week 6_knowledge_transfer_type
Kma week 6_knowledge_transfer_typeKma week 6_knowledge_transfer_type
Kma week 6_knowledge_transfer_typegharawi
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.pptyang947066
 
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.GeeksLab Odessa
 
Analysis on Domain Adaptation based on different papers
Analysis on Domain Adaptation based on different papersAnalysis on Domain Adaptation based on different papers
Analysis on Domain Adaptation based on different papersharshavardhan814108
 
Main single agent machine learning algorithms
Main single agent machine learning algorithmsMain single agent machine learning algorithms
Main single agent machine learning algorithmsbutest
 
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...MLAI2
 
On Semi-Supervised Learning and Beyond
On Semi-Supervised Learning and BeyondOn Semi-Supervised Learning and Beyond
On Semi-Supervised Learning and BeyondEunjeong (Lucy) Park
 

Similar a A survey on transfer learning (14)

[ppt]
[ppt][ppt]
[ppt]
 
Transfer Learning in NLP: A Survey
Transfer Learning in NLP: A SurveyTransfer Learning in NLP: A Survey
Transfer Learning in NLP: A Survey
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Presentation Online Educa 2011 Jacob Molenaar
Presentation Online Educa 2011 Jacob MolenaarPresentation Online Educa 2011 Jacob Molenaar
Presentation Online Educa 2011 Jacob Molenaar
 
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
 
Fcv rep darrell
Fcv rep darrellFcv rep darrell
Fcv rep darrell
 
Seminar dm
Seminar dmSeminar dm
Seminar dm
 
Kma week 6_knowledge_transfer_type
Kma week 6_knowledge_transfer_typeKma week 6_knowledge_transfer_type
Kma week 6_knowledge_transfer_type
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
 
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
 
Analysis on Domain Adaptation based on different papers
Analysis on Domain Adaptation based on different papersAnalysis on Domain Adaptation based on different papers
Analysis on Domain Adaptation based on different papers
 
Main single agent machine learning algorithms
Main single agent machine learning algorithmsMain single agent machine learning algorithms
Main single agent machine learning algorithms
 
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
 
On Semi-Supervised Learning and Beyond
On Semi-Supervised Learning and BeyondOn Semi-Supervised Learning and Beyond
On Semi-Supervised Learning and Beyond
 

Último

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 

Último (20)

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 

A survey on transfer learning

  • 1. A Survey on Transfer Learning Sinno Jialin Pan Department of Computer Science and Engineering The Hong Kong University of Science and Technology Joint work with Prof. Qiang Yang
  • 2. Transfer Learning? (DARPA 05) Transfer Learning (TL): The ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks (in new domains) It is motivated by human learning. People can often transfer knowledge learnt previously to novel situations  Chess  Checkers  Mathematics  Computer Science  Table Tennis  Tennis
  • 3. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 4. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 5. Traditional ML vs. TL (P. Langley 06) Traditional ML in Transfer of learning multiple domains across domains training items training items test items test items Humans can learn in many domains. Humans can also transfer from one domain to other domains.
  • 6. Traditional ML vs. TL Learning Process of Learning Process of Traditional ML Transfer Learning training items training items Learning System Learning System Learning System Knowledge Learning System
  • 7. Notation Domain: It consists of two components: A feature space , a marginal distribution In general, if two domains are different, then they may have different feature spaces or different marginal distributions. Task: Given a specific domain and label space , for each in the domain, to predict its corresponding label In general, if two tasks are different, then they may have different label spaces or different conditional distributions
  • 8. Notation For simplicity, we only consider at most two domains and two tasks. Source domain: Task in the source domain: Target domain: Task in the target domain
  • 9. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 10. Why Transfer Learning?  In some domains, labeled data are in short supply.  In some domains, the calibration effort is very expensive.  In some domains, the learning process is time consuming.  How to extract knowledge learnt from related domains to help learning in a target domain with a few labeled data?  How to extract knowledge learnt from related domains to speed up learning in a target domain?  Transfer learning techniques may help!
  • 11. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 12. Settings of Transfer Learning Transfer learning settings Labeled data in Labeled data in Tasks a source domain a target domain Inductive Transfer Learning Classification × √ Regression √ √ … Transductive Transfer Learning Classification √ × Regression … Unsupervised Transfer Learning Clustering × × …
  • 13. An overview of various settings of Self-taught Case 1 transfer learning Learning No labeled data in a source domain Inductive Transfer Learning Labeled data are available in a source domain Labeled data are available in Source and a target domain target tasks are Multi-task Case 2 learnt Learning simultaneously Transfer Learning Labeled data are available only in a Assumption: source domain Transductive different Domain domains but Transfer Learning single task Adaptation No labeled data in both source and target domain Assumption: single domain and single task Unsupervised Sample Selection Bias / Transfer Learning Covariance Shift
  • 14. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 15. Approaches to Transfer Learning Transfer learning approaches Description Instance-transfer To re-weight some labeled data in a source domain for use in the target domain Feature-representation-transfer Find a “good” feature representation that reduces difference between a source and a target domain or minimizes error of models Model-transfer Discover shared parameters or priors of models between a source domain and a target domain Relational-knowledge-transfer Build mapping of relational knowledge between a source domain and a target domain.
  • 16. Approaches to Transfer Learning Inductive Transductive Unsupervised Transfer Learning Transfer Learning Transfer Learning Instance-transfer √ √ Feature-representation- √ √ √ transfer Model-transfer √ Relational-knowledge- √ transfer
  • 17. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Inductive Transfer Learning  Transductive Transfer Learning  Unsupervised Transfer Learning
  • 18. Inductive Transfer Learning Instance-transfer Approaches • Assumption: the source domain and target domain data use exactly the same features and labels. • Motivation: Although the source domain data can not be reused directly, there are some parts of the data that can still be reused by re-weighting. • Main Idea: Discriminatively adjust weighs of data in the source domain for use in the target domain.
  • 19. Inductive Transfer Learning --- Instance-transfer Approaches Non-standard SVMs [Wu and Dietterich ICML-04] Uniform weights Correct the decision boundary by re-weighting Loss function on the Loss function on the target domain data source domain data Regularization term  Differentiate the cost for misclassification of the target and source data
  • 20. Inductive Transfer Learning --- Instance-transfer Approaches TrAdaBoost [Dai et al. ICML-07] Hedge ( β ) AdaBoost [Freund et al. 1997] [Freund et al. 1997] To decrease the weights To increase the weights of of the misclassified data the misclassified data The whole Source domain training data set target domain labeled data labeled data Classifiers trained on re-weighted labeled data Target domain unlabeled data
  • 21. Inductive Transfer Learning Feature-representation-transfer Approaches Supervised Feature Construction [Argyriou et al. NIPS-06, NIPS-07] Assumption: If t tasks are related to each other, then they may share some common features which can benefit for all tasks. Input: t tasks, each of them has its own training data. Output: Common features learnt across t tasks and t models for t tasks, respectively.
• 22. Supervised Feature Construction [Argyriou et al. NIPS-06, NIPS-07]
    \min_{A, U} \; \frac{1}{t} \sum_{s=1}^{t} \sum_{i} L\big(y_{si}, \langle a_s, U^{\top} x_{si} \rangle\big) + \gamma \lVert A \rVert_{2,1}^{2} \quad \text{s.t. } U \in \mathbf{O}^{d}
  - the first term is the average empirical error across the t tasks;
  - the (2,1)-norm regularization makes the shared representation sparse;
  - the constraint requires U to be orthogonal, where \lVert A \rVert_{2,1} = \sum_i \lVert a^i \rVert_2 (the sum of the l2-norms of the rows of A).
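  A minimal sketch of the row-sparsity idea behind this objective, assuming a squared loss and dropping the orthogonal transform U: multi-task regression with an l2,1 penalty, optimized by proximal gradient descent. All names and data shapes are illustrative.

```python
import numpy as np

def prox_l21(W, thresh):
    """Row-wise soft-thresholding: shrinks whole feature rows toward zero."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - thresh / np.maximum(norms, 1e-12))
    return W * scale

def multitask_l21(Xs, ys, gamma=0.1, lr=0.01, n_iter=500):
    """Xs, ys: lists of per-task data; column t of W = weights of task t."""
    d, T = Xs[0].shape[1], len(Xs)
    W = np.zeros((d, T))
    for _ in range(n_iter):
        G = np.zeros_like(W)
        for t, (X, y) in enumerate(zip(Xs, ys)):
            G[:, t] = X.T @ (X @ W[:, t] - y) / len(y)   # squared-loss gradient
        W = prox_l21(W - lr * G, lr * gamma)   # proximal step on the l2,1 term
    return W   # nonzero rows = features shared across the tasks
```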
• 23. Inductive Transfer Learning: Feature-representation-transfer Approaches
  Unsupervised Feature Construction [Raina et al. ICML-07]
  Three steps:
  1. Apply a sparse coding algorithm [Lee et al. NIPS-07] to learn a higher-level representation from unlabeled data in the source domain.
  2. Transform the target data into the new representation using the bases learnt in the first step.
  3. Apply traditional discriminative models to the new representations of the target data with their corresponding labels.
• 24. Unsupervised Feature Construction [Raina et al. ICML-07]
  Step 1: Input: source domain data X_S = \{x_S^i\} and coefficient \beta.
          Output: new representations A_S = \{a_S^i\} of the source domain data and new bases B = \{b_j\}.
  Step 2: Input: target domain data X_T = \{x_T^i\}, coefficient \beta, and the bases B = \{b_j\}.
          Output: new representations A_T = \{a_T^i\} of the target domain data.
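  A hedged sketch of this two-step pipeline, using scikit-learn's dictionary learning in place of the original Lee et al. solver; the random arrays stand in for real unlabeled source data and labeled target data.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X_source = rng.rand(500, 64)            # plentiful unlabeled source data
X_target = rng.rand(50, 64)             # scarce labeled target data
y_target = rng.randint(0, 2, 50)

# Step 1: learn bases B and sparse codes A_S from the unlabeled source domain.
dico = DictionaryLearning(n_components=32, alpha=1.0, random_state=0)
dico.fit(X_source)

# Step 2: re-express the target data over the learnt bases.
A_target = dico.transform(X_target)

# Step 3: train an ordinary discriminative model on the new representation.
clf = LogisticRegression(max_iter=1000).fit(A_target, y_target)
```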
• 25. Inductive Transfer Learning: Model-transfer Approaches
  Regularization-based Method [Evgeniou and Pontil, KDD-04]
  Assumption: if t tasks are related to each other, they may share some parameters among their individual models.
  Assume f_t(x) = w_t \cdot x is a hyperplane for task t, where t \in \{T, S\} and
    w_t = w_0 + v_t
  with w_0 the common part shared across tasks and v_t the specific part of each individual task.
  Regularization terms on w_0 and the v_t are then encoded into the SVM objective for the multiple tasks.
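  A sketch of the shared-plus-specific decomposition w_t = w_0 + v_t via the equivalent feature-augmentation trick (in the spirit of Daume III's method, not the authors' exact optimization): each example is mapped to [x, x, 0] (source) or [x, 0, x] (target), so a single linear SVM recovers w_0 plus a per-task correction.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X_s, y_s = rng.randn(200, 10), rng.randint(0, 2, 200)   # source task
X_t, y_t = rng.randn(30, 10), rng.randint(0, 2, 30)     # target task

zeros_s, zeros_t = np.zeros_like(X_s), np.zeros_like(X_t)
X_aug = np.vstack([np.hstack([X_s, X_s, zeros_s]),    # source: shared + v_S slot
                   np.hstack([X_t, zeros_t, X_t])])   # target: shared + v_T slot
y_aug = np.concatenate([y_s, y_t])

svm = LinearSVC(C=1.0, max_iter=5000).fit(X_aug, y_aug)
d = X_s.shape[1]
w0, v_s, v_t = svm.coef_[0, :d], svm.coef_[0, d:2*d], svm.coef_[0, 2*d:]
w_target = w0 + v_t   # the target-domain predictor uses w_0 + v_T
```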
• 26. Inductive Transfer Learning: Relational-knowledge-transfer Approaches
  TAMAR [Mihalkova et al. AAAI-07]
  Assumption: if the target domain and source domain are related, there may exist similar relationships across the domains that can be used for transfer learning.
  Input:
  1. Relational data in the source domain and a statistical relational model, a Markov Logic Network (MLN), learnt in the source domain.
  2. Relational data in the target domain.
  Output: a new statistical relational model (MLN) in the target domain.
  Goal: to learn an MLN in the target domain more efficiently and effectively.
• 27. TAMAR [Mihalkova et al. AAAI-07]
  Two stages:
  1. Predicate Mapping: establish the mapping between predicates in the source and target domains. Once a mapping is established, clauses from the source domain can be translated into the target domain.
  2. Revising the Mapped Structure: the clauses mapped directly from the source domain may not be completely accurate, and may need to be revised, augmented, and re-weighted in order to properly model the target data.
• 28. TAMAR [Mihalkova et al. AAAI-07]
  Example mapping from the source domain (academic) to the target domain (movie), followed by revision:
    AdvisedBy     → WorkedFor
    Student (B)   → Actor (A)
    Professor (A) → Director (B)
    Publication   → MovieMember
    Paper (T)     → Movie (M)
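  A toy illustration of the predicate-mapping stage only: a source-domain clause is translated through a hypothetical mapping dictionary. Real TAMAR searches for this mapping automatically and then revises the translated clauses against target data.

```python
# Hypothetical predicate mapping between the two example domains above.
predicate_map = {
    "AdvisedBy":   "WorkedFor",
    "Student":     "Actor",
    "Professor":   "Director",
    "Publication": "MovieMember",
    "Paper":       "Movie",
}

source_clause = "AdvisedBy(B, A) :- Publication(T, A), Publication(T, B)"

target_clause = source_clause
for src, tgt in predicate_map.items():
    target_clause = target_clause.replace(src, tgt)

print(target_clause)  # WorkedFor(B, A) :- MovieMember(T, A), MovieMember(T, B)
```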
• 29. Outline
  - Traditional Machine Learning vs. Transfer Learning
  - Why Transfer Learning?
  - Settings of Transfer Learning
  - Approaches to Transfer Learning
    - Inductive Transfer Learning
    - Transductive Transfer Learning
    - Unsupervised Transfer Learning
• 30. Transductive Transfer Learning: Instance-transfer Approaches
  Sample Selection Bias / Covariate Shift [Zadrozny ICML-04, Schwaighofer JSPI-00]
  Input: a lot of labeled data in the source domain and no labeled data in the target domain.
  Output: models for use on the target domain data.
  Assumption: the source and target domains are the same, and so is the task. In addition, P(Y_S | X_S) and P(Y_T | X_T) are the same, while P(X_S) and P(X_T) may differ because of the different sampling processes (training data vs. test data).
  Main Idea: re-weight the source domain data (importance sampling).
• 31. Sample Selection Bias / Covariate Shift
  To correct sample selection bias, each source example is given a weight proportional to the density ratio \beta(x) = P(X_T) / P(X_S).
  How to estimate \beta? One straightforward solution is to estimate P(X_S) and P(X_T) separately. However, density estimation is itself a hard problem, as the sketch below illustrates.
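  A naive density-ratio sketch of the "straightforward solution": estimate p_S and p_T separately with kernel density estimators and weight each source example by p_T(x)/p_S(x). The Gaussian data are placeholders; this route degrades quickly in high dimensions, which is what motivates direct estimators such as KMM on the next slide.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.RandomState(0)
X_s = rng.normal(0.0, 1.0, size=(500, 2))   # source marginal P(X_S)
X_t = rng.normal(0.5, 1.0, size=(300, 2))   # shifted target marginal P(X_T)

kde_s = gaussian_kde(X_s.T)                 # gaussian_kde expects shape (d, n)
kde_t = gaussian_kde(X_t.T)

# Importance weight for each source example: p_T(x) / p_S(x)
weights = kde_t(X_s.T) / np.maximum(kde_s(X_s.T), 1e-12)
weights *= len(weights) / weights.sum()     # normalize to mean 1
```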
• 32. Sample Selection Bias / Covariate Shift
  Kernel Mean Matching (KMM) [Huang et al. NIPS 2006]
  Main Idea: KMM estimates the weights \beta directly instead of estimating the density functions. It can be proved that \beta can be estimated by solving a quadratic programming (QP) optimization problem that matches the means of the training and test data in a reproducing kernel Hilbert space (RKHS).
  Theoretical Support: Maximum Mean Discrepancy (MMD) [Borgwardt et al. Bioinformatics-06]. The distance between two distributions can be measured by the Euclidean distance between their mean vectors in an RKHS.
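  A small-scale sketch of the KMM QP: minimize 0.5 β'Kβ - κ'β subject to 0 ≤ β_i ≤ B and |Σβ_i / n_s - 1| ≤ ε, with an RBF kernel. SciPy's SLSQP stands in for a dedicated QP solver; kernel width, B, and ε are illustrative defaults.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics.pairwise import rbf_kernel

def kmm_weights(X_s, X_t, gamma=1.0, B=10.0, eps=0.01):
    n_s = len(X_s)
    K = rbf_kernel(X_s, X_s, gamma=gamma)
    # kappa_i = (n_s / n_t) * sum_j k(x_i^S, x_j^T): pulls source means
    # toward the target mean in the RKHS.
    kappa = (n_s / len(X_t)) * rbf_kernel(X_s, X_t, gamma=gamma).sum(axis=1)

    objective = lambda b: 0.5 * b @ K @ b - kappa @ b
    grad = lambda b: K @ b - kappa
    cons = [{"type": "ineq", "fun": lambda b: n_s * (1 + eps) - b.sum()},
            {"type": "ineq", "fun": lambda b: b.sum() - n_s * (1 - eps)}]
    res = minimize(objective, np.ones(n_s), jac=grad, method="SLSQP",
                   bounds=[(0.0, B)] * n_s, constraints=cons)
    return res.x   # importance weights beta for the source examples
```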
• 33. Transductive Transfer Learning: Feature-representation-transfer Approaches
  Domain Adaptation [Blitzer et al. EMNLP-06, Ben-David et al. NIPS-07, Daume III ACL-07]
  Assumption: a single task across domains, meaning P(Y_S | X_S) and P(Y_T | X_T) are the same, while P(X_S) and P(X_T) may differ because of the different feature representations across domains.
  Main Idea: find a "good" feature representation that reduces the "distance" between the domains.
  Input: a lot of labeled data in the source domain and only unlabeled data in the target domain.
  Output: a common representation for source and target domain data, and a model on the new representation for use in the target domain.
• 34. Domain Adaptation: Structural Correspondence Learning (SCL) [Blitzer et al. EMNLP-06, Blitzer et al. ACL-07, Ando and Zhang JMLR-05]
  Motivation: if two domains are related to each other, there may exist some "pivot" features that appear across both domains. Pivot features are features that behave in the same way for discriminative learning in both domains.
  Main Idea: identify correspondences among features from different domains by modeling their correlations with the pivot features. Non-pivot features from different domains that are correlated with many of the same pivot features are assumed to correspond, and they are treated similarly in a discriminative learner.
• 35. SCL [Blitzer et al. EMNLP-06, Blitzer et al. ACL-07, Ando and Zhang JMLR-05]
  a) Heuristically choose m pivot features (this step is task specific).
  b) Turn each pivot feature into a vector of binary values, create the corresponding prediction problem, and learn the parameters of each pivot predictor.
  c) Perform an eigen-decomposition on the matrix of learnt parameters to obtain a low-dimensional linear mapping function.
  d) Use the learnt mapping to construct new features and train classifiers on the new representations.
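  A condensed sketch of steps b) to d): one linear predictor per pivot feature is trained on unlabeled data from both domains, and an SVD of the stacked weight matrix gives the mapping theta. Pivot selection (step a) is assumed done, and the pivot indices are placeholders.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def scl_mapping(X_unlabeled, pivot_idx, k=25):
    """X_unlabeled: pooled source + target unlabeled data (n, d)."""
    nonpivot_idx = np.setdiff1d(np.arange(X_unlabeled.shape[1]), pivot_idx)
    X_np = X_unlabeled[:, nonpivot_idx]
    W = []
    for p in pivot_idx:
        target = (X_unlabeled[:, p] > 0).astype(int)   # binary pivot labels
        if len(np.unique(target)) < 2:
            continue                                   # degenerate pivot
        clf = SGDClassifier(loss="modified_huber", max_iter=50, tol=None)
        clf.fit(X_np, target)
        W.append(clf.coef_[0])
    W = np.array(W).T                      # (n_nonpivot, n_pivots)
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    theta = U[:, :k]                       # linear mapping to the shared space
    return nonpivot_idx, theta

# New features for an example x: np.hstack([x, x[nonpivot_idx] @ theta])
```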
• 36. Outline
  - Traditional Machine Learning vs. Transfer Learning
  - Why Transfer Learning?
  - Settings of Transfer Learning
  - Approaches to Transfer Learning
    - Inductive Transfer Learning
    - Transductive Transfer Learning
    - Unsupervised Transfer Learning
• 37. Unsupervised Transfer Learning: Feature-representation-transfer Approaches
  Self-taught Clustering (STC) [Dai et al. ICML-08]
  Input: a lot of unlabeled data in a source domain and a few unlabeled data in a target domain.
  Goal: cluster the target domain data.
  Assumption: the source domain and target domain data share some common features, which can help clustering in the target domain.
  Main Idea: extend the information-theoretic co-clustering algorithm [Dhillon et al. KDD-03] to transfer learning.
• 38. Self-taught Clustering (STC) [Dai et al. ICML-08]
  The source domain data X_S and the target domain data X_T share a common feature set Z; co-clustering is performed in both domains, tied together by a shared clustering of the common features.
  Objective function to be minimized:
    J = [I(X_T, Z) - I(\tilde{X}_T, \tilde{Z})] + \lambda [I(X_S, Z) - I(\tilde{X}_S, \tilde{Z})]
  where \tilde{X}_T, \tilde{X}_S, \tilde{Z} are the cluster functions; the output is the clustering \tilde{X}_T of the target domain data.
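  The full information-theoretic objective is beyond a short sketch; as a loose stand-in for the idea (not Dai et al.'s algorithm), this clusters source and target documents jointly over their shared feature columns with spectral co-clustering, then reads off cluster labels for the target rows only. All data are synthetic placeholders.

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.RandomState(0)
X_source = rng.poisson(1.0, size=(300, 100))   # many unlabeled source docs
X_target = rng.poisson(1.0, size=(30, 100))    # few unlabeled target docs

# Stack both domains over the shared feature space and co-cluster jointly,
# so the abundant source rows shape the feature clustering that the
# target rows are assigned against.
X = np.vstack([X_source, X_target]) + 1e-6
model = SpectralCoclustering(n_clusters=3, random_state=0).fit(X)

target_clusters = model.row_labels_[len(X_source):]   # clustering of X_T
```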
• 39. Outline
  - Traditional Machine Learning vs. Transfer Learning
  - Why Transfer Learning?
  - Settings of Transfer Learning
  - Approaches to Transfer Learning
  - Negative Transfer
  - Conclusion
• 40. Negative Transfer
  - Most approaches to transfer learning assume that transferring knowledge across domains is always beneficial.
  - However, in some cases, when two tasks are too dissimilar, brute-force transfer may even hurt performance on the target task; this is called negative transfer [Rosenstein et al. NIPS-05 Workshop].
  - Some researchers have studied how to measure relatedness among tasks [Ben-David and Schuller NIPS-03, Bakker and Heskes JMLR-03].
  - How to design a mechanism that avoids negative transfer still needs to be studied theoretically.
• 41. Outline
  - Traditional Machine Learning vs. Transfer Learning
  - Why Transfer Learning?
  - Settings of Transfer Learning
  - Approaches to Transfer Learning
  - Negative Transfer
  - Conclusion
• 42. Conclusion

                                  | Inductive TL | Transductive TL | Unsupervised TL
  Instance-transfer               | √            | √               |
  Feature-representation-transfer | √            | √               | √
  Model-transfer                  | √            |                 |
  Relational-knowledge-transfer   | √            |                 |

  How to avoid negative transfer deserves much more attention!

Editor's notes

1. It seems the change involved in moving to new tasks is more radical than the change involved in moving to new domains.
2. Generality means humans can learn to perform a variety of tasks.