Including Co-Referent URIs in a SPARQL Query

Christian Y A Brenninkmeijer,
Carole Goble, Alasdair J G Gray, Paul Groth,
Antonis Loizou, and Steve Pettifer

www.openphacts.org
@open_phacts

A.J.G.Gray@hw.ac.uk
@gray_alasdair

Multiple Identities
Andy Law's Third Law
“The number of unique identifiers assigned to an individual is
never less than the number of Institutions involved in the study”
http://bioinformatics.roslin.ac.uk/lawslaws.html

GB:29384

P12047

Are these the
same thing?

X31045

22/10/2013

COLD 2013


Gleevec® = Imatinib Mesylate
Imatinib

Imatinib Mesylate Mesylate
YLMAHDNUQAMNNX-UHFFFAOYSA-N

ChemSpider

DrugBank

PubChem

Multiple Links: Different Reasons

Link: skos:closeMatch
Reason: non-salt form

Link: skos:exactMatch
Reason: drug name

Dynamic Equality
Strict

Relaxed

Analysing

Browsing

skos:exactMatch
(InChI)


Dynamic Equality
Strict

Relaxed

Analysing

Browsing
skos:closeMatch
(Drug Name)

skos:exactMatch
(InChI)

skos:closeMatch
(Drug Name)
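
One way to picture the strict and relaxed settings is as a filter over the reasons attached to each link. The following is a minimal sketch under stated assumptions, not the Open PHACTS implementation: the `Link` type, the `ex:` identifiers, and the idea of modelling a lens as a set of accepted justifications are all hypothetical.

```python
from typing import NamedTuple


class Link(NamedTuple):
    source: str
    predicate: str      # e.g. "skos:exactMatch"
    target: str
    justification: str  # why the link was asserted, e.g. "InChI"


# Hypothetical links between made-up identifiers for one compound.
LINKS = [
    Link("ex:a", "skos:exactMatch", "ex:b", "InChI"),
    Link("ex:a", "skos:closeMatch", "ex:c", "Drug Name"),
]

# Assumption: a lens is modelled as the set of justifications it accepts.
STRICT = frozenset({"InChI"})
RELAXED = frozenset({"InChI", "Drug Name"})


def equivalents(uri, lens, links=LINKS):
    """Return uri followed by every target linked from it under the lens."""
    result = [uri]
    for link in links:
        accepted = link.source == uri and link.justification in lens
        if accepted and link.target not in result:
            result.append(link.target)
    return result
```

With these sample links, the strict lens keeps only the InChI-justified match, while the relaxed lens also follows the drug-name link.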

Open PHACTS Discovery Platform
Apps
Interactive responses
Method calls

Domain API

Drug Discovery Platform
Production quality integration platform

Integration Approach
• Data kept in original model
• Data cached in central triple store
• API call translated to SPARQL query
• Query expressed in terms of original data

OPS Discovery Platform

Apps
Linked Data API (RDF/XML, TTL, JSON)

Core Platform
• Identity Resolution Service
• Identifier Management Service
• Domain Specific Services
• Semantic Workflow Engine
• Chemistry Registration, Normalisation & Q/C
• Data Cache (Virtuoso Triple Store)
• Indexing

Content
• Public Content
• Public Ontologies
• Commercial
• User Annotations

Example inputs shown on the diagram: “Adenosine receptor 2a”; P12374, EC2.43.4, CS4532
Source labels repeated on the diagram: Db, VoID, Nanopub

Platform Interaction
1. Resolve user input:
– User enters search text
– Resolve to a URI for concept

2. Request data for URI
– Expand URI to equivalent for each dataset
– Run resulting SPARQL query
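
The two steps above can be sketched as a small pipeline. This is an illustration only: the lookup tables stand in for the platform's services, the search text "example compound" is a made-up key, and the function names are hypothetical. The URIs are the ones used in the talk's query expansion example.

```python
# Hypothetical lookup tables standing in for the two platform services.
TEXT_TO_URI = {"example compound": "cw:979b545d-f9a9"}
CO_REFERENTS = {"cw:979b545d-f9a9": ["cs:2157", "chembl:1280", "db:db00945"]}


def resolve(text):
    """Step 1: map the user's search text to a single concept URI."""
    return TEXT_TO_URI[text.strip().lower()]


def expand(uri):
    """Step 2a: list the URI together with its equivalents in other datasets."""
    return [uri] + CO_REFERENTS.get(uri, [])


def request_data(text, run_query):
    """Step 2b: pass the expanded URI list to a caller-supplied query runner."""
    return run_query(expand(resolve(text)))
```

Passing the query runner in as an argument keeps the sketch independent of any particular triple store client.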


Query Expansion
Input query Q (with lens L1):
GRAPH <http://rdf.chemspider.com> {
  cw:979b545d-f9a9 cheminf:logd ?logd .
}

Expanded query Q’:
GRAPH <http://rdf.chemspider.com> {
  ?iri cheminf:logd ?logd .
  FILTER (?iri = cw:979b545d-f9a9 ||
          ?iri = cs:2157 ||
          ?iri = chembl:1280 ||
          ?iri = db:db00945)
}

1. The Query Expander Service receives Q and L1.
2. It sends cw:979b545d-f9a9 and L1 to the Identity Mapping Service (BridgeDB), which holds the mappings and profiles.
3. The Identity Mapping Service returns [cw:979b545d-f9a9, cs:2157, chembl:1280, db:db00945].
4. The Query Expander Service returns Q’.

Can also be achieved through UNION
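
The FILTER-based rewrite on this slide could be generated along the following lines. This is a minimal sketch, assuming the list of co-referent URIs has already been obtained from the mapping service; the function name `expand_with_filter` and its parameters are hypothetical, and the prefixes are assumed to be declared elsewhere in the query.

```python
def expand_with_filter(graph, predicate, obj_var, uris, iri_var="?iri"):
    """Build a GRAPH block in which any of the given URIs may be the subject.

    The original subject is replaced by a variable, and a FILTER restricts
    that variable to the co-referent URIs.
    """
    if not uris:
        raise ValueError("at least one URI is required")
    alternatives = " ||\n          ".join(f"{iri_var} = {uri}" for uri in uris)
    return (
        f"GRAPH <{graph}> {{\n"
        f"  {iri_var} {predicate} {obj_var} .\n"
        f"  FILTER ({alternatives})\n"
        f"}}"
    )


query = expand_with_filter(
    "http://rdf.chemspider.com",
    "cheminf:logd",
    "?logd",
    ["cw:979b545d-f9a9", "cs:2157", "chembl:1280", "db:db00945"],
)
print(query)
```

The sketch handles a single triple pattern; a full expander would also need to parse the incoming query and decide which URIs to rewrite in which graph.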

Experiment
Is it feasible to use a stand-off
mapping service?
• Baselines (no external call):
– “Perfect” URIs
– Linked data querying

• Expansion approaches (external service call):
– FILTER by Graph
– UNION by Graph
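
For comparison with the FILTER approach, a UNION-based expansion repeats the triple pattern once per co-referent URI. The sketch below is an assumption about how such a block might be assembled, not the code used in the experiment; `expand_with_union` and its parameters are hypothetical names.

```python
def expand_with_union(graph, predicate, obj_var, uris):
    """Build a GRAPH block with one UNION branch per co-referent URI."""
    if not uris:
        raise ValueError("at least one URI is required")
    branches = "\n  UNION\n".join(
        f"  {{ {uri} {predicate} {obj_var} . }}" for uri in uris
    )
    return f"GRAPH <{graph}> {{\n{branches}\n}}"


union_query = expand_with_union(
    "http://rdf.chemspider.com",
    "cheminf:logd",
    "?logd",
    ["cw:979b545d-f9a9", "cs:2157", "chembl:1280", "db:db00945"],
)
print(union_query)
```

Each branch names its subject directly, so the store can look up every URI as a constant rather than evaluating a filter over a variable.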

“Perfect” URI Baseline
WHERE {
  GRAPH <chemspider> {
    cs:2157 cheminf:logp ?logp .
  }
  GRAPH <chembl> {
    chembl_mol:m1280 cheminf:mw ?mw .
  }
}

Linked Data Baseline
WHERE {
  GRAPH <chemspider> {
    cs:2157 cheminf:logp ?logp .
  }
  GRAPH <chembl> {
    ?chemblid cheminf:mw ?mw .
  }
  cs:2157 skos:exactMatch ?chemblid .
}

Queries
Drawn from Open PHACTS API:
1. Simple compound information (1)
2. Compound information (1)
3. Compound pharmacology (M)
4. Simple target information (1)
5. Target information (1)
6. Target pharmacology (M)

Datasets and Links
Data: 167,783,592 triples
Mappings: 2,114,584 triples
Lenses: 1

Average execution times
[Two chart slides; the plotted values are not recoverable from this transcript. The only surviving label is 0.018.]

Conclusions
• Query expansion slower in general
– Due to separate service call
– Difference below human perception
– UNION faster than FILTER on Virtuoso

• Stand-off mappings feasible
• Infrastructure can support lenses
Strict

Relaxed

Analysing

Browsing


Questions
A.J.G.Gray@hw.ac.uk
www.macs.hw.ac.uk/~ajg33
@gray_alasdair

Open PHACTS Project

pmu@openphacts.org
www.openphacts.org
@open_phacts
