Presentation by Ola Spjuth at the 8th Symposium on Conformal and Probabilistic Prediction with Applications, Varna, Bulgaria, on 9 September 2019.
ABSTRACT
Learning Under Privileged Information (LUPI) enables the inclusion of additional (privileged) information when training machine learning models, data that is not available when making predictions. The methodology has been successfully applied to a diverse set of problems from various fields. SVM+ was the first realization of the LUPI paradigm which showed fast convergence but did not scale well. To address the scalability issue, knowledge transfer approaches were proposed to estimate privileged information from standard features in order to construct improved decision rules. Most available knowledge transfer methods use regression techniques and the same data for approximating the privileged features as for learning the transfer function. Inspired by the cross-validation approach, we propose to partition the training data into K folds and use each fold for learning a transfer function and the remaining folds for approximations of privileged features—we refer to this as split knowledge transfer. We evaluate the method using four different experimental setups comprising one synthetic and three real datasets. The results indicate that our approach leads to improved accuracy as compared to LUPI with standard knowledge transfer.
REFERENCE:
Gauraha, N., Söderdahl, F. and Spjuth, O.
Split Knowledge Transfer in Learning Under Privileged Information Framework.
Proceedings of Machine Learning Research (PMLR), 105, 43-52 (2019).
URL: https://proceedings.mlr.press/v105/gauraha19a.html
1. Split Knowledge Transfer in Learning Under Privileged Information Framework
Niharika Gauraha (a), Fabian Söderdahl (b) and Ola Spjuth (a)
Varna, Bulgaria, Sep 9, 2019
(a) Department of Pharmaceutical Biosciences, Uppsala University, Sweden
(b) Statisticon AB, Uppsala, Sweden
2. Our domain
We study life science problems, and in particular effects of drugs, from various aspects.
Some data can be very expensive to generate, other data cheaper.
3. Our domain
Another example: Imaging in multiple resolutions
Can we improve predictions using data from multiple resolutions at training time?
4. Learning Under Privileged Information
• The classical machine learning paradigm assumes training examples in the form of iid pairs:
  (x_1, y_1), ..., (x_l, y_l), x_i ∈ X, y_i ∈ {−1, +1}
• The LUPI paradigm assumes training examples in the form of iid triplets:
  (x_1, x*_1, y_1), ..., (x_l, x*_l, y_l), x_i ∈ X, x*_i ∈ X*, y_i ∈ {−1, +1}
• x*_i ∈ X* is Privileged Information (PI): additional data available for each training example, but not available for test examples.
5. LUPI contd...
[Diagram: training data in feature space X, together with training data in the PI space X*, is used to train a model; the resulting predictive rules are applied to a new object in feature space X to produce a prediction.]
6. LUPI and CP
We introduced CP in LUPI at COPA 2018 (1).
We used the SVM+ implementation of LUPI. It gave only minor improvements in efficiency, and it does not scale well to larger problems.
(1) Niharika Gauraha, Lars Carlsson, Ola Spjuth. "Conformal prediction in learning under privileged information paradigm with applications in drug discovery". Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications, PMLR 91:147-156, 2018.
7. Knowledge Transfer LUPI
For knowledge transfer from space X* to X, a regression function is learned for each privileged feature, using the p decision features in X as explanatory variables and the privileged feature vectors X*(i), for i = 1, ..., m, as response variables. In total, m regression functions φ_j(x), j = 1, ..., m, are learned.
8. Knowledge Transfer LUPI
We can augment the training data set as follows:
y_1  x_1  φ_1(x_1)  ...  φ_m(x_1)
...  ...  ...       ...  ...
y_l  x_l  φ_1(x_l)  ...  φ_m(x_l)
Then we apply some standard algorithm to this modified training data and construct a (p + m)-dimensional decision rule f(x, Φ(x)), where
Φ(x) = (φ_1(x), φ_2(x), ..., φ_m(x))
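As a concrete illustration of this augmentation step, here is a minimal NumPy sketch (the function names and the ordinary-least-squares choice are ours, not from the paper's code): one linear regression per privileged feature, whose predictions Φ(x) are appended to the standard features.

```python
import numpy as np

def learn_transfer(X, X_star):
    """Fit m regression functions phi_j mapping X to the j-th privileged
    feature, by ordinary least squares with an intercept term."""
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])    # add intercept column
    W, *_ = np.linalg.lstsq(X1, X_star, rcond=None)  # (p + 1, m) coefficients
    return W

def augment(X, W):
    """Append the approximated privileged features Phi(x) to X."""
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.hstack([X, X1 @ W])                    # (l, p + m) design matrix

# Toy data: m = 2 privileged features that are noisy linear maps of X
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))                         # l = 50 examples, p = 4
X_star = X @ rng.normal(size=(4, 2)) + 0.1 * rng.normal(size=(50, 2))

W = learn_transfer(X, X_star)
X_aug = augment(X, W)   # train any standard classifier on (X_aug, y)
```

Any standard learner (an SVM in the paper) is then trained on the augmented matrix together with the labels.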
9. Knowledge Transfer LUPI: Contd...
Figure adapted from: Izmailov et al., "Feature Selection in Learning Using Privileged Information". ICDM 2017.
10. Knowledge Transfer LUPI: Contd...
Various regression techniques can be used for approximating privileged features, for example multiple linear regression or kernel ridge regression (KRR) with an RBF kernel.
In the LUPI with knowledge transfer approach, the same data is used to learn the m regression functions φ_j(x), j = 1, ..., m, as for creating the final decision rule.
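The KRR option can be sketched from scratch in a few lines; the gamma and alpha values below are illustrative, not ones tuned in the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gram matrix K[i, j] = exp(-gamma * ||A_i - B_j||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def krr_fit(X, y, gamma=0.5, alpha=1e-3):
    """Solve (K + alpha I) dual = y for the dual coefficients."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)

def krr_predict(X_train, dual, X_new, gamma=0.5):
    return rbf_kernel(X_new, X_train, gamma) @ dual

# Approximate one nonlinear privileged feature, y = sin(x1 + x2)
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.sin(X.sum(axis=1))
dual = krr_fit(X, y)
y_hat = krr_predict(X, dual, X)
```

With m privileged features, one such regression is fitted per feature, exactly as in the linear case.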
11. Split Knowledge Transfer LUPI: Training
• Inspired by the cross-validation approach, we propose LUPI with Split Knowledge Transfer (SKT-LUPI).
• The training set is split into K folds of equal (or almost equal) size, and then the following two steps are repeated for k = 1, ..., K:
step 1: The kth fold (X_k, X*_k) is used for learning the m transfer functions Φ_k = (φ_k1, ..., φ_km)^T, and the rest of the training set is augmented with its predicted values from the m regressions: X̄_k = (X_{−k}, Φ_k(X_{−k})).
step 2: Using this augmented design matrix X̄_k and the corresponding labels, a decision rule F_k is constructed.
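The two training steps can be sketched as follows. The transfer model (least squares) and the decision rule (a ridge scorer standing in for the paper's SVM-based rule) are stand-in assumptions to keep the sketch self-contained.

```python
import numpy as np

def fit_skt(X, X_star, y, K=5, alpha=1.0, seed=0):
    """Split knowledge transfer: fold k learns the transfer functions,
    the remaining folds (augmented) learn the decision rule F_k."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), K)
    models = []
    for k in range(K):
        rest = np.concatenate([folds[j] for j in range(K) if j != k])
        # step 1: learn transfer functions Phi_k on fold k (least squares)
        Xk1 = np.hstack([np.ones((len(folds[k]), 1)), X[folds[k]]])
        W, *_ = np.linalg.lstsq(Xk1, X_star[folds[k]], rcond=None)
        # augment the remaining folds with predicted privileged features
        Xr1 = np.hstack([np.ones((len(rest), 1)), X[rest]])
        A = np.hstack([Xr1, Xr1 @ W])   # intercept, X_{-k}, Phi_k(X_{-k})
        # step 2: fit decision rule F_k on the augmented design matrix
        beta = np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]),
                               A.T @ y[rest])
        models.append((W, beta))
    return models
```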
12. Split Knowledge Transfer LUPI: Prediction
• For a new test object x ∈ X, we compute regression estimates x̂*_1, ..., x̂*_K using the regression functions learned previously.
• Predictions are made for each combination (x, x̂*_k), for k = 1, ..., K, in the (m + p)-dimensional space using the corresponding decision rule F_k.
• We use the synergy method (2) to combine the results from the K decision rules.
(2) Vapnik, Vladimir, and Rauf Izmailov. "Synergy of monotonic rules." The Journal of Machine Learning Research 17.1 (2016): 4722-4754.
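Putting training and prediction together, here is a self-contained end-to-end sketch. Everything not fixed by the slides is a stand-in: least-squares transfer functions, ridge scorers in place of the SVM-based rules F_k, and plain score averaging substituted for the synergy method.

```python
import numpy as np

rng = np.random.default_rng(1)
l, p, m, K = 200, 4, 2, 5
B, w, v = rng.normal(size=(p, m)), rng.normal(size=p), rng.normal(size=m)

def make_data(n):
    X = rng.normal(size=(n, p))
    X_star = X @ B + 0.1 * rng.normal(size=(n, m))  # privileged features
    y = np.sign(X @ w + 0.5 * (X_star @ v))         # labels use PI too
    return X, X_star, y

X, X_star, y = make_data(l)

# training: one (transfer, rule) pair per fold
folds = np.array_split(rng.permutation(l), K)
models = []
for k in range(K):
    tr = np.concatenate([folds[j] for j in range(K) if j != k])
    Xk1 = np.hstack([np.ones((len(folds[k]), 1)), X[folds[k]]])
    W, *_ = np.linalg.lstsq(Xk1, X_star[folds[k]], rcond=None)
    Xr1 = np.hstack([np.ones((len(tr), 1)), X[tr]])
    A = np.hstack([Xr1, Xr1 @ W])                   # augmented design matrix
    beta = np.linalg.solve(A.T @ A + np.eye(A.shape[1]), A.T @ y[tr])
    models.append((W, beta))

# prediction: estimate x*_k, score with F_k, average the K scores
X_t, _, y_t = make_data(100)
Xt1 = np.hstack([np.ones((len(X_t), 1)), X_t])
scores = np.mean([np.hstack([Xt1, Xt1 @ W]) @ beta for W, beta in models],
                 axis=0)
accuracy = np.mean(np.sign(scores) == y_t)
```

Averaging is a deliberate simplification; the paper combines the K monotonic rules with the synergy method of Vapnik and Izmailov instead.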
13. Experiment 1: Synthetic Dataset
Table 1: Comparison of error rates (in %) on the synthetic dataset (3) between the four classification scenarios considered: SVM on standard features, LUPI with knowledge transfer (KT-LUPI), LUPI with split knowledge transfer (SKT-LUPI) with 5 folds, and SVM on privileged features. Multiple linear regression is used for knowledge transfer.
Training Size SVM KT-LUPI SKT-LUPI SVM on PI
25 9.86 7.68 6.10 3.39
35 7.26 5.42 4.80 3.19
45 7.21 4.69 4.22 2.98
55 6.88 5.04 4.84 2.51
(3) Vapnik and Izmailov, "Learning with Intelligent Teacher", COPA 2016, Proceedings of the 5th International Symposium on Conformal and Probabilistic Prediction with Applications, vol 9653, pp 3-19.
14. Experiment 2: Knowledge Transfer using Linear Transformation
Table 2: Comparison of error rates (in %) on modified real datasets (4) between the four classification scenarios considered: SVM on standard features, LUPI with knowledge transfer (KT-LUPI), LUPI with split knowledge transfer (SKT-LUPI) with 10 folds, and SVM on all features (standard and privileged). Multiple linear regression is used for knowledge transfer.
Dataset SVM KT-LUPI SKT-LUPI SVM on all features
Parkinsons 10.25 8.58 7.05 6.53
Ionosphere 5.49 4.92 4.29 4.85
kc2 17.28 17.09 16.99 16.80
(4) Izmailov et al. "Feature selection in learning using privileged information". In 2017 IEEE International Conference on Data Mining Workshops, pp 957-963.
15. Experiment 3: Knowledge Transfer using non-Linear Transformation
Table 3: Comparison of error rates (in %) on modified real datasets (5) between the four classification scenarios considered: SVM on standard features, LUPI with knowledge transfer (KT-LUPI), LUPI with split knowledge transfer (SKT-LUPI) with 10 folds, and SVM on all features (standard and privileged). Kernel ridge regression is used for knowledge transfer.
Dataset SVM KT-LUPI SKT-LUPI SVM on all features
Parkinsons 10.25 8.97 7.82 6.53
Ionosphere 5.49 5.07 4.57 4.85
kc2 17.28 17.23 16.95 16.80
(5) Izmailov et al. "Feature selection in learning using privileged information". In 2017 IEEE International Conference on Data Mining Workshops, pp 957-963.
17. Conclusions
SKT-LUPI shows improved performance over KT-LUPI for the studied datasets.
Limitations: only a few (generated) LUPI datasets.
Future work:
• SKT-LUPI with conformal prediction
• Explore feature selection / dimension reduction
• Application in image classification; compare with deep learning
• Generating LUPI datasets
18. Acknowledgements
This project received financial support from the Swedish Foundation for Strategic Research (SSF) as part of the HASTE project under the call 'Big Data and Computational Science'.
The computations were performed on resources provided by SNIC through the Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) under project SNIC 2019/8-15.