myequivalents is a system to manage cross-references between entities that can be identified by pairs composed of a service name (e.g., EBI's ArrayExpress, Wikipedia) and an accession (e.g., E-MEXP-2514, Barack_Obama). For those familiar with the Semantic Web, we plan to support identification of entities via URIs and the owl:sameAs property. For those who already know MIRIAM and identifiers.org, myequivalents is more general than both, and we plan to support these services in the future.
7. 7
Why a Centralised Service
[Figure: two example bundles of equivalent entities]
Bundle 1:
BioSD/Samples SAMEA597705
AE/Experiments E-AFMX-11 (http://www.ebi.ac.uk/arrayexpress/experiments/E-AFMX-11)
AE/Data E-AFMX-11 (http://www.ebi.ac.uk/arrayexpress/files/E-AFMX-11)
ENA/Sequences SRR034107
Bundle 2:
http://dbpedia.org/resource/Barak_h_obama
http://en.wikipedia.org/wiki/Barack_Obama
http://www.freebase.com/view/en/barack_obama
Managing equivalence classes is more compact and more efficient
8. 8
Why a Centralised Service
Simplifies management
URI auto-creation
Links are updated independently of their consumers, and only once
Avoids redundancy
implicit symmetry and transitivity in the bundles
single-point storage and rendering vs one per repository
More efficient
A specialised service for this is potentially faster, e.g. sameas.org
More features can be added to the basic service
Multiple access formats and paradigms (e.g., XML, RDF, SPARQL)
MIRIAM integration
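The "implicit symmetry and transitivity in the bundles" point can be illustrated with a minimal in-memory sketch. The class `BundleSketch`, its methods and the "service:accession" ID format are assumptions made for illustration only, not the project's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: every entity points to the bundle (equivalence class) it belongs to.
class BundleSketch {
    private final Map<String, Set<String>> bundleOf = new HashMap<>();

    // Declares a and b equivalent by merging the bundles they currently belong to.
    void link(String a, String b) {
        Set<String> merged = new TreeSet<>(bundle(a));
        merged.addAll(bundle(b));
        for (String member : merged) {
            bundleOf.put(member, merged);
        }
    }

    // Returns the bundle containing the entity (a singleton if it has no mappings yet).
    Set<String> bundle(String entity) {
        Set<String> known = bundleOf.get(entity);
        if (known != null) {
            return known;
        }
        Set<String> single = new TreeSet<>();
        single.add(entity);
        return single;
    }

    public static void main(String[] args) {
        BundleSketch store = new BundleSketch();
        // Entity IDs are written as "service:accession" here (an assumed format).
        store.link("BioSD/Samples:SAMEA597705", "AE/Experiments:E-AFMX-11");
        store.link("AE/Experiments:E-AFMX-11", "ENA/Sequences:SRR034107");
        // The ENA entity was never linked to the BioSD one directly,
        // yet both end up in the same bundle.
        System.out.println(store.bundle("ENA/Sequences:SRR034107"));
    }
}
```

Only two links are stored, but any member of the bundle can be used to retrieve all the others, which is what a pairwise store would need explicit extra links (or a closure computation) to achieve.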
9. 9
The Model
[Diagram: the myequivalents model]
Entity = Service + Accession, e.g.:
BioSD/Samples SAMEA597705
AE/Experiments E-AFMX-11
AE/Data E-AFMX-11
ENA/Sequences SRR034107
Entity Mapping: entities grouped into a Bundle (i.e., partition class)
Repositories (BioSD, ENA, AE): each repository provides services
Service collection: same accessions, implicit mapping
Service Properties:
Title, Description
URI Pattern
Repository Properties:
Title, Description
URL
Managing Organization
Logo URL
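A rough sketch of how the model could look in Java follows. The class names, the reduced set of fields and, in particular, the `$id` placeholder inside the URI pattern are assumptions for illustration; the slides only say that a service has a "URI Pattern".

```java
// Hypothetical classes mirroring some of the properties listed above.
class ModelSketch {

    static class Repository {
        final String title;
        final String url;

        Repository(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    static class Service {
        final String title;
        final String uriPattern; // assumed to contain the placeholder "$id"
        final Repository repository;

        Service(String title, String uriPattern, Repository repository) {
            this.title = title;
            this.uriPattern = uriPattern;
            this.repository = repository;
        }
    }

    static class Entity {
        final Service service;
        final String accession;

        Entity(Service service, String accession) {
            this.service = service;
            this.accession = accession;
        }

        // Builds the entity URI by filling the accession into the service's pattern.
        String uri() {
            return service.uriPattern.replace("$id", accession);
        }
    }

    public static void main(String[] args) {
        Repository ae = new Repository("AE", "http://www.ebi.ac.uk/arrayexpress");
        Service experiments = new Service(
            "AE/Experiments", "http://www.ebi.ac.uk/arrayexpress/experiments/$id", ae);
        System.out.println(new Entity(experiments, "E-AFMX-11").uri());
    }
}
```

With this arrangement the URI of an entity never needs to be stored: it is derived from the service description, so changing a pattern in one place updates every link.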
10. 10
API Examples (Java, Mapping)
public interface EntityMappingManager {
    public void storeMappings ( String ... entityIds );
    public void storeMappingBundle ( String ... entityIds );
    public int deleteMappings ( String ... entityIds );
    public int deleteEntities ( String ... entityIds );
    public EntityMappingSearchResult getMappings (
        Boolean wantRawResult, String ... entityIds );
    public EntityMappingSearchResult getMappingsForTarget (
        Boolean wantRawResult, String targetServiceName, String entityId );
    public String getMappingsAs (
        String outputFormat, Boolean wantRawResult, String ... entityIds );
    public String getMappingsForTargetAs (
        String outputFormat, Boolean wantRawResult, String targetServiceName, String entityId );
    public void close ();
}
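To give a feel for two of the calls above, here is a toy in-memory stand-in. It is not an implementation of EntityMappingManager: it covers only `storeMappingBundle` and a simplified `getMappingsForTarget`, returns plain strings instead of an EntityMappingSearchResult, drops the `wantRawResult` flag, and assumes entity IDs are written as "service:accession".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy stand-in for illustration; names and behaviour are assumptions.
class InMemoryMappingSketch {
    private final Map<String, Set<String>> bundleOf = new HashMap<>();

    // Stores all the given entities as one bundle, absorbing any bundle they were already in.
    public void storeMappingBundle(String... entityIds) {
        Set<String> bundle = new LinkedHashSet<>();
        for (String id : entityIds) {
            Set<String> old = bundleOf.get(id);
            if (old != null) {
                bundle.addAll(old);
            }
            bundle.add(id);
        }
        for (String id : bundle) {
            bundleOf.put(id, bundle);
        }
    }

    // Returns the members of entityId's bundle that belong to the target service.
    public List<String> getMappingsForTarget(String targetServiceName, String entityId) {
        List<String> result = new ArrayList<>();
        Set<String> bundle = bundleOf.get(entityId);
        if (bundle == null) {
            return result;
        }
        for (String id : bundle) {
            int sep = id.indexOf(':');
            if (sep > 0 && id.substring(0, sep).equals(targetServiceName)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        InMemoryMappingSketch mgr = new InMemoryMappingSketch();
        mgr.storeMappingBundle(
            "BioSD/Samples:SAMEA597705", "AE/Experiments:E-AFMX-11",
            "AE/Data:E-AFMX-11", "ENA/Sequences:SRR034107");
        // Which AE/Data entities are equivalent to this ENA sequence?
        System.out.println(mgr.getMappingsForTarget("AE/Data", "ENA/Sequences:SRR034107"));
    }
}
```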
Multiple means of access
Programmatic API
Command-line tools
REST Web Service
Multiple data exchange formats
Java and Java REST (Jersey used, client available)
XML (the same format returned by REST, mapped via JAXB)
JSON (future, maybe)
RDF (future, more later)
Queries via service+accession or URI (in future)
13. 13
Component-based Architecture
Components and their topology configured/instantiated via Spring
Easy to build features like:
Caching
Logging
Layered computations (e.g., add services in the same collection)
Integration of third-party systems (e.g., MIRIAM, more later)
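As an illustration of why a component-based design makes features such as caching easy to add, the sketch below layers a cache on top of another component. The `MappingLookup` interface and `CachingLookup` class are hypothetical, and the wiring is done by hand here, whereas the slides state that the real components are configured via Spring.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup component; any back-end can implement it.
interface MappingLookup {
    String findMappings(String entityId);
}

// Layered component: adds caching on top of any other MappingLookup.
class CachingLookup implements MappingLookup {
    private final MappingLookup delegate;
    private final Map<String, String> cache = new HashMap<>();
    int misses = 0; // counts how often the delegate had to be called

    CachingLookup(MappingLookup delegate) {
        this.delegate = delegate;
    }

    @Override
    public String findMappings(String entityId) {
        String cached = cache.get(entityId);
        if (cached == null) {
            misses++;
            cached = delegate.findMappings(entityId);
            cache.put(entityId, cached);
        }
        return cached;
    }
}

class ComponentDemo {
    public static void main(String[] args) {
        // The "back-end" is a stub that just echoes the ID.
        CachingLookup lookup = new CachingLookup(id -> "mappings-for-" + id);
        lookup.findMappings("AE/Experiments:E-AFMX-11");
        lookup.findMappings("AE/Experiments:E-AFMX-11");
        System.out.println("delegate calls: " + lookup.misses);
    }
}
```

Because the cache and the back-end share one interface, logging or a MIRIAM-backed layer could be stacked in the same way without changing callers.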
14. 14
Related Work
myEquivalents was inspired by this
Does pretty much what we do
With a very similar internal model
But for URIs only
Code not available
Only available as SaaS, no binary to deploy
15. 15
Related Work
Pair model for URIs is a standard
Equivalence-based model missing
Dual identification mechanism missing
16. 16
Future: RDF, SPARQL, Semantic Web
Dereferenceable URIs, with RDF output
Keeping support for the accession-based model too
SPARQL, with support for both:
?b a mye:Bundle; mye:has-entity ?e1, ?e2, ?e3 (equivalence class model)
?entity1 owl:sameAs ?entity2 (mapping pair model)
and for entity containers:
_:e1 mye:provided-by [ a mye:Service; dc:title 'BioSD' ]
adding reasoning over service types could come easily, e.g., sample-service is-a biomaterial-service
To be implemented with direct translation from Java objects to SPARQL (not just export), e.g., using ARQ in Jena
Support for inference directly in the object model
faster than a generic reasoner
Support for SPARQL/UPDATE?
Would allow for using an endpoint directly as a back-end
Support for keyword-based search, as in sameas.org
Requires the addition of attributes (e.g., title, description); nothing available at the moment
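The difference between the two RDF views mentioned above (equivalence class vs. mapping pairs) can be sketched in a few lines. This is plain string building for illustration only, not the Jena/ARQ-based translation the slides envisage; the class and method names are hypothetical, and only the `mye:Bundle`, `mye:has-entity` and `owl:sameAs` terms come from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RdfExportSketch {

    // Equivalence-class model: one bundle node linked to every member URI.
    static String asBundle(String bundleNode, List<String> uris) {
        StringBuilder sb = new StringBuilder(bundleNode).append(" a mye:Bundle");
        for (String uri : uris) {
            sb.append(" ;\n  mye:has-entity <").append(uri).append(">");
        }
        return sb.append(" .").toString();
    }

    // Mapping-pair model: one owl:sameAs triple per unordered pair of members.
    static List<String> asSameAsPairs(List<String> uris) {
        List<String> triples = new ArrayList<>();
        for (int i = 0; i < uris.size(); i++) {
            for (int j = i + 1; j < uris.size(); j++) {
                triples.add("<" + uris.get(i) + "> owl:sameAs <" + uris.get(j) + "> .");
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        List<String> bundle = Arrays.asList(
            "http://dbpedia.org/resource/Barak_h_obama",
            "http://en.wikipedia.org/wiki/Barack_Obama",
            "http://www.freebase.com/view/en/barack_obama");
        System.out.println(asBundle("_:b2", bundle));
        for (String triple : asSameAsPairs(bundle)) {
            System.out.println(triple);
        }
    }
}
```

The bundle view grows linearly with the number of members, while the pair view grows quadratically, which is one reason to keep the equivalence-class model as the primary one and derive pairs on demand.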
17. 17
Related Work
It is meant to manage entities that share accessions
e.g., PubMed and CiteXplore
So, not enough for us
But it would be great to integrate!
18. 18
Future: MIRIAM and identifiers.org support
[Figure labels: Services & Entities; Service Collection]
19. 19
Future: MIRIAM and identifiers.org support
[Figure labels: Service Collection; Services; Entity]
20. 20
Combining MIRIAM and myEquivalents
[Figure: entities linked through mappings stored in myEquivalents and mappings computed by MIRIAM]
Entities shown:
Uniprot P62158
MIR:00100023 4599080
HubMed 4599080
URLs shown:
http://www.ebi.ac.uk/citexplore/citationDetails.do?dataSource=MED&externalId=4599080
http://www.ncbi.nlm.nih.gov/protein/P62158
Link labels: Mappings stored in myEquivalents; Computed by MIRIAM
Resources imported from MIRIAM
21. 21
Issues: Access Control (on-going)
We assume:
updates are managed by just a few people, within the same organisation and collaborating team
most of the data is publicly readable
except private entities (maybe)
Implies a very simple model; users can have the roles of:
reader, can only read public data
the only role granted to anonymous (i.e., unauthenticated) users
editor, can change everything (mappings, service descriptions, etc.)
admin, can administer users and permissions
Though simple, it's a good base for managing provenance too
Authentication details
all requests contain user + hash(password)
travel via SSL/HTTPS and via POST
makes it unnecessary to have complex mechanisms based on shared secrets (e.g., OAuth)
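The role model above is small enough to sketch directly. Everything below is an assumption for illustration: the class, enum and method names are invented, privileges are taken to be cumulative (admin can do what an editor can), and SHA-256 is used for hash(password) only because the slides do not name a hash function.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class AccessControlSketch {

    // Roles from the slide; anonymous users are treated as READER.
    enum Role { READER, EDITOR, ADMIN }

    // Hypothetical action categories.
    enum Action { READ_PUBLIC, EDIT_MAPPINGS, MANAGE_USERS }

    // Decides whether a role may perform an action (cumulative privileges assumed).
    static boolean isAllowed(Role role, Action action) {
        switch (action) {
            case READ_PUBLIC:
                return true;
            case EDIT_MAPPINGS:
                return role == Role.EDITOR || role == Role.ADMIN;
            case MANAGE_USERS:
                return role == Role.ADMIN;
            default:
                return false;
        }
    }

    // Computes hash(password) as a hex string; SHA-256 is an assumed choice.
    static String hashPassword(String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(Role.READER, Action.EDIT_MAPPINGS));
        System.out.println(isAllowed(Role.EDITOR, Action.EDIT_MAPPINGS));
        System.out.println(hashPassword("example-password").length());
    }
}
```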
22. 22
Issues: Versioning (future?)
That's been ignored so far
because we're assuming one version ↔ one accession ↔ one URI
and leaving the versioning fun to the repositories
Must be addressed later
Possible scenario:
Entities are identified by means of service + acc + version
New version relations are added (has-version, is-prior-version, has-next-version)
It is still one URI ↔ one entity at the level of a given version
URI pattern contains an additional placeholder for the version
It's up to the myEquivalents clients to either:
omit the version (i.e., the latest version is always assumed, even when the version increases)
specify a given version (requires manual version update)
Possibly: keep history of all versions
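The possible scenario above could be sketched as follows. This is speculative, like the slide itself: the class, the `$id` and `$ver` placeholders and the integer version numbers are all assumptions, chosen only to show how "omit the version" and "specify a given version" might resolve to different URIs.

```java
import java.util.HashMap;
import java.util.Map;

class VersionedUriSketch {
    // Latest known version per "service:accession" key (hypothetical bookkeeping).
    private final Map<String, Integer> latest = new HashMap<>();

    // Records a version and keeps track of the highest one seen so far.
    void registerVersion(String service, String acc, int version) {
        latest.merge(service + ":" + acc, version, Integer::max);
    }

    // Fills "$id" and "$ver" in the pattern; a null version means "use the latest one".
    String resolve(String uriPattern, String service, String acc, Integer version) {
        Integer v = (version != null) ? version : latest.get(service + ":" + acc);
        if (v == null) {
            throw new IllegalArgumentException("No version known for " + service + ":" + acc);
        }
        return uriPattern.replace("$id", acc).replace("$ver", String.valueOf(v));
    }

    public static void main(String[] args) {
        VersionedUriSketch versions = new VersionedUriSketch();
        versions.registerVersion("example-service", "ACC1", 1);
        versions.registerVersion("example-service", "ACC1", 2);
        String pattern = "http://example.org/entries/$id/v$ver"; // made-up pattern
        System.out.println(versions.resolve(pattern, "example-service", "ACC1", null));
        System.out.println(versions.resolve(pattern, "example-service", "ACC1", 1));
    }
}
```

A client that omits the version follows the entity automatically as new versions are registered, while a client that pins a version keeps a stable URI and has to update it by hand.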
23. 23
That's all! Thank you!
Have a look at the code and the wiki (on-going work!):
http://github.com/EBIBioSamples/myequivalents