1. Towards Transfer Learning of Link Specifications
Axel-Cyrille Ngonga Ngomo
Jens Lehmann
Mofeed Hassan
2013-09-16
Ngonga et al. (Univ. Leipzig)
Transfer Learning of Link Specs
2013-09-16
1 / 29
4. Why Link Discovery?
(1) Fourth Linked Data principle
(2) Links are central for
    Cross-ontology QA
    Data Integration
    Reasoning
    Federated Queries
    ...
(3) 2011 topology of the LOD Cloud:
    31+ billion triples
    ≈ 0.5 billion links
    owl:sameAs in most cases
5. Why is it difficult?
Definition (Link Discovery)
Given sets S and T of resources and relation R
Task: Find M = {(s, t) ∈ S × T : R(s, t)}
Common approaches:
Find M = {(s, t) ∈ S × T : σ(s, t) ≥ θ}
Find M = {(s, t) ∈ S × T : δ(s, t) ≤ θ}
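The similarity-based approach can be sketched in a few lines of Python. This is only an illustration: the token-overlap function `jaccard` stands in for σ, and the label lists `S` and `T` are toy data; none of these names or values come from the slides.

```python
from itertools import product


def jaccard(a: str, b: str) -> float:
    """Hypothetical stand-in for sigma: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def discover_links(source, target, sigma, theta):
    """Return M = {(s, t) in S x T : sigma(s, t) >= theta} by brute force."""
    return {(s, t) for s, t in product(source, target) if sigma(s, t) >= theta}


# Toy resources, represented only by their labels (assumption).
S = ["Leipzig", "New York City"]
T = ["leipzig", "York", "New York"]
links = discover_links(S, T, jaccard, theta=0.6)
```

Every pair in S × T is compared once, which is exactly the quadratic cost that makes the naive approach impractical on large datasets.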
6. Why is it difficult?
(1) Time complexity
    Large number of triples
    Quadratic a-priori runtime
    69 days for mapping cities from DBpedia to Geonames (1 ms per comparison)
    Decades for linking DBpedia and LGD
    ...
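A back-of-the-envelope sketch of the quadratic cost, assuming one comparison per pair in S × T. The slides give only the 69 days and the 1 ms per comparison; the dataset sizes in the second call are hypothetical.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def brute_force_days(n_source: int, n_target: int, ms_per_comparison: float = 1.0) -> float:
    """Days needed to compare every source resource with every target resource."""
    return n_source * n_target * ms_per_comparison / MS_PER_DAY


def comparisons_in(days: float, ms_per_comparison: float = 1.0) -> float:
    """Number of comparisons that fit into the given number of days."""
    return days * MS_PER_DAY / ms_per_comparison


# 69 days at 1 ms per comparison correspond to roughly 6 billion comparisons.
implied_comparisons = comparisons_in(69)

# Hypothetical sizes: two datasets with 100,000 resources each.
days_for_toy_task = brute_force_days(100_000, 100_000)
```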
7. Why is it difficult?
(2) Complexity of specifications
    Combination of several attributes required for high precision
    Tedious discovery of most adequate mapping
    Dataset-dependent similarity functions
9. Link Specification
Detection of an accurate link specification is key
A link specification has three components:
Two sets of restrictions R^S_1, ..., R^S_m resp. R^T_1, ..., R^T_k that specify the sets S resp. T,
A specification of a complex similarity metric σ via the combination of several atomic similarity measures σ1, ..., σn and
A set of thresholds τ1, ..., τn such that τi is the threshold for σi.
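One way to picture the three components is a small data structure. The sketch below is an assumption-laden illustration, not the format used by any linking framework: the class `LinkSpec`, its field names, and the conjunctive combination (every σi must reach its τi) are all hypothetical, since the slides do not fix how the atomic measures are combined.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# An atomic similarity measure maps a (source, target) resource pair to [0, 1].
Measure = Callable[[dict, dict], float]


@dataclass
class LinkSpec:
    source_restrictions: List[str]          # restrictions that specify S
    target_restrictions: List[str]          # restrictions that specify T
    measures: List[Tuple[Measure, float]]   # pairs (sigma_i, tau_i)

    def accepts(self, s: dict, t: dict) -> bool:
        """Assumed combination: every atomic measure must reach its threshold."""
        return all(sigma(s, t) >= tau for sigma, tau in self.measures)


def label_equal(s: dict, t: dict) -> float:
    """Hypothetical atomic measure: case-insensitive label equality."""
    return 1.0 if s.get("label", "").lower() == t.get("label", "").lower() else 0.0


spec = LinkSpec(
    source_restrictions=["?x rdf:type dbpedia-owl:Drug"],
    target_restrictions=["?y rdf:type drug:drugs"],
    measures=[(label_equal, 1.0)],
)
match = spec.accepts({"label": "Aspirin"}, {"label": "aspirin"})
```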
10. Transfer Learning
[Figure: Classical Learning of Link Specs (Current Linking Task, Learning System) vs. Transfer Learning of Link Specs (Different Linking Tasks, Learning Systems, Task Repository, Transfer Learning System); annotations: spec accuracy α, class similarity ζ, property similarity π]
In our approach we use Transductive Transfer Learning
Class and property matching is assumed to be known already (numerous approaches from ontology matching can be employed); the goal is to find the complex similarity metric
12. Transfer Learning Framework I
Transfer learning of link specifications is reduced to three subproblems:
Restrictions/class similarity: ζ : 2^C × 2^C → [0, 1]
e.g. ζ({City, Village}, {Town}) = 0.6
Property similarity: ξ : 2^P × 2^P → [0, 1]
e.g. ξ({rdfs:label}, {rdfs:label}) = 1.0
Accuracy of link specifications: α : Q → [0, 1]
13. Transfer Learning Framework II
Overall similarity measure for transfer learning:
ω(t, t′) = α(q) · ζ(ψ(q), C) · ζ(ψ′(q), C′) · ξ(sp(q), P_L) · ξ(tp(q), P_L′)
(details in paper)
Each similarity measure can be implemented in several ways
Implementations of class similarity function ζ in framework:
label-based similarity
name-based similarity (URI similarity)
data-centric similarity
Property similarities ξ are defined analogously
Similarities between single classes/properties can be extended to sets
(e.g. using arithmetic / geometric mean of max. similarity)
Spec can be transferred by replacing properties with most similar properties in P_L and P_L′
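Extending a similarity between single classes or properties to sets can be sketched as follows. The function `set_similarity` is a hypothetical name, and the pairwise values in `toy_sim` are made up for illustration; the slides do not say which mean or which direction of the maximum the framework uses, so the asymmetric "best match per element of the first set" below is an assumption.

```python
from math import prod


def set_similarity(xs, ys, sim, mean="arithmetic"):
    """Mean, over the elements x of xs, of the best similarity max_y sim(x, y)."""
    xs, ys = list(xs), list(ys)
    if not xs or not ys:
        return 0.0
    best = [max(sim(x, y) for y in ys) for x in xs]
    if mean == "arithmetic":
        return sum(best) / len(best)
    if mean == "geometric":
        return prod(best) ** (1.0 / len(best))
    raise ValueError(f"unknown mean: {mean}")


# Made-up pairwise class similarities, for illustration only.
toy_sim = {("City", "Town"): 0.8, ("Village", "Town"): 0.4}


def lookup(x, y):
    return toy_sim.get((x, y), 0.0)


arithmetic = set_similarity({"City", "Village"}, {"Town"}, lookup)
geometric = set_similarity({"City", "Village"}, {"Town"}, lookup, mean="geometric")
```

The geometric mean punishes a single poorly matched element more strongly than the arithmetic mean, which may be preferable when every class in a restriction has to find a counterpart.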
17. Example (New Link Task)
Example link specification for mapping drugs in the two datasets DBpedia and Drugbank (DBpedia-Drugbank.xml):
18. Example (Restriction part)
Three parts of link specs:
Restrictions part
19. Example (Properties Part)
Properties part
20. Example (Similarity Measures Part)
Similarity Measures part: similarity metric and thresholds
21. Example (Link Repository)
Transfer learning is applied using a repository → restrictions and relevant properties are assumed to be known → find the similarity measure by comparing with all specs in the repository, e.g. DBpedia-SiderDrugs.xml
22. Example (Restriction Similarities)
Restrictions in both specification files
Type     DBpedia-Drugbank.xml         DBpedia-SiderDrugs.xml
Source   rdf:type dbpedia-owl:Drug    rdf:type dbpedia-owl:Drug
Target   rdf:type drug:drugs          rdf:type sider:drugs
Straightforward label/URI similarity
For instance, trigram metric in URI similarity without prefixes:
ζ({dbpedia-owl:Drug}, {dbpedia-owl:Drug}) = 1.0
ζ({sider:drugs}, {drug:drugs}) = 1.0
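A minimal sketch of a trigram similarity on prefixed names with the prefix stripped. The exact trigram definition used in the framework is not given on the slides; the padding, the lowercasing, and the Dice coefficient over trigram sets are assumptions, and the function names are hypothetical.

```python
def local_name(prefixed: str) -> str:
    """Drop the namespace prefix of a prefixed name such as 'sider:drugs'."""
    return prefixed.split(":", 1)[-1]


def trigrams(text: str) -> set:
    """Set of character trigrams of the lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Dice coefficient of the trigram sets of the two local names."""
    ta, tb = trigrams(local_name(a)), trigrams(local_name(b))
    if not ta and not tb:
        return 1.0
    return 2 * len(ta & tb) / (len(ta) + len(tb))


same_class = trigram_similarity("dbpedia-owl:Drug", "dbpedia-owl:Drug")
same_local_name = trigram_similarity("sider:drugs", "drug:drugs")
different = trigram_similarity("dbpedia-owl:Drug", "dbpedia-owl:Disease")
```

Because only the local names are compared, `sider:drugs` and `drug:drugs` are treated as identical, which matches the value 1.0 in the example.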
Data-centric: ζ_d(s, s′) = 1 / (|P(s)| · |P(s′)|) · Σ_{x ∈ P(s)} Σ_{y ∈ P(s′)} sim(x, y), where
P(s) = {x : s p x ∧ p rdf:type owl:DatatypeProperty}
(extends similarity to instances)
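The data-centric similarity averages a value-level similarity over all pairs of datatype property values of two resources. The sketch below assumes the value sets P(s) and P(s′) have already been retrieved; the literal values and the exact-match `sim` are made up for illustration.

```python
def data_centric_similarity(values_s, values_t, sim) -> float:
    """Average of sim(x, y) over all pairs (x, y) in P(s) x P(s')."""
    values_s, values_t = list(values_s), list(values_t)
    if not values_s or not values_t:
        return 0.0
    total = sum(sim(x, y) for x in values_s for y in values_t)
    return total / (len(values_s) * len(values_t))


def exact_match(x: str, y: str) -> float:
    """Hypothetical value similarity: 1.0 for equal strings, else 0.0."""
    return 1.0 if x == y else 0.0


# Made-up datatype property values of two resources.
p_s = ["aspirin", "50-78-2"]
p_t = ["aspirin", "acetylsalicylic acid"]
zeta_d = data_centric_similarity(p_s, p_t, exact_match)
```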
25. Example (Overall Similarity)
Based on, e.g., the F-score, assign a quality value to q = DBpedia-SiderDrugs.xml; in our case α(q) = 0.89
The final step is calculating the overall similarity measure
ω(DBpedia-Drugbank.xml, DBpedia-SiderDrugs.xml) = 0.89 · 1.0 · 1.0 · 0.9 · 0.8 ≈ 0.64
The steps are repeated for all link specifications in the repository
The most similar link spec can be transferred by replacing its properties with the most similar ones in the computed property matching
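The arithmetic of the overall similarity and the selection of the most similar spec can be sketched as below. The five factors are the values from the example; the function names and the second repository entry with its score are hypothetical and only there to show the selection step.

```python
from math import prod


def overall_similarity(alpha, zeta_source, zeta_target, xi_source, xi_target):
    """Product of spec accuracy, class similarities and property similarities."""
    return prod((alpha, zeta_source, zeta_target, xi_source, xi_target))


def most_similar(scores: dict) -> str:
    """Name of the repository spec with the highest overall similarity."""
    return max(scores, key=scores.get)


omega = overall_similarity(0.89, 1.0, 1.0, 0.9, 0.8)

repository_scores = {
    "DBpedia-SiderDrugs.xml": omega,
    "Hypothetical-Other.xml": 0.41,  # made-up score for illustration
}
best = most_similar(repository_scores)
```

Since the factors are multiplied, a single low similarity (or a low accuracy α) pulls the whole score down, so a spec only ranks highly if it is good on every dimension.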
27. Experimental Setup I
The goal of the evaluation is two-fold:
Evaluate whether transfer learning can be used to build templates for link specs
Discover whether the transferred templates can be used directly
113 specifications were retrieved from LATC; each comes with a manual evaluation of its links
[Figure: pie chart of the 113 specifications by domain (Persons, Events, Locations, Diseases, Drugs, Organizations, Misc); slice shares: 66%, 15%, 10%, 3%, 3%, 2%, 1%]
28. Experimental Setup II
Leave-one-out evaluation
1.) Take the top-scored (most similar) specification and check whether it uses the same combination of similarity functions; assign 1 for a match and 0 for no match
2.) Compute the F-measure of the learned link specs directly; works only on specs with both endpoints alive (only 12 out of 113)
Used URI similarity
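Both evaluation steps can be sketched as follows. This is an illustration under assumptions: tasks are plain dictionaries with hypothetical `domain` and `measures` keys, the chooser `same_domain_first` is a toy replacement for the ω-based ranking, and the F-measure is taken to be the harmonic mean of precision and recall.

```python
def f_measure(found: set, reference: set) -> float:
    """Harmonic mean of precision and recall of a set of links."""
    true_positives = len(found & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(found)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def leave_one_out_accuracy(tasks, pick_most_similar) -> float:
    """Share of tasks whose top-ranked other spec uses the same measures."""
    hits = 0
    for i, task in enumerate(tasks):
        others = tasks[:i] + tasks[i + 1:]
        best = pick_most_similar(task, others)
        hits += int(best["measures"] == task["measures"])
    return hits / len(tasks)


def same_domain_first(task, others):
    """Toy chooser: first spec from the same domain, else the first spec."""
    for other in others:
        if other["domain"] == task["domain"]:
            return other
    return others[0]


# Made-up tasks for illustration.
tasks = [
    {"domain": "drugs", "measures": frozenset({"trigrams"})},
    {"domain": "drugs", "measures": frozenset({"trigrams"})},
    {"domain": "persons", "measures": frozenset({"levenshtein"})},
]
accuracy = leave_one_out_accuracy(tasks, same_domain_first)
score = f_measure({("a", "x"), ("b", "y"), ("c", "z")},
                  {("a", "x"), ("b", "y"), ("d", "w")})
```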
30. First Experiments Set Results
Right specification detected in 81% of all cases
In the geo-spatial domain: 91%
In the persons domain: 58%
[Figure: bar chart of detection rate (0% to 100%) per domain: Average, Persons, Events, Locations, Diseases, Drugs, Organizations, Misc]
31. Second Experiments Set Results
In the second series of experiments, source and target endpoints need to be alive so that the transferred link spec can be executed (12 out of 113)
In general, low F-measures
[Figure: bar chart of Precision, Recall and F-Measure (0% to 100%) per executed link specification]
33. Summary
Conclusions:
Detecting right template in 81% of all cases
Transfer learning cannot replace the learning of thresholds in specifications
Future Work:
Combination with machine-learning approaches for link specifications (e.g., EAGLE, COALA), in particular for learning thresholds
More sophisticated class and property similarity approaches