A talk about the merger and refactoring of the eagle-i and VIVO ontologies presented by myself, Brian Lowe, Janos Hajagos, and Erich Bremer at the VIVO2013 conference in St. Louis
Isf vivo2013
1. Integrated Semantic Framework:
launching the next generation
VIVO ontology
Erich Bremer, Jon Corson-Rikert,
Melissa Haendel,
Janos Hajagos and Brian Lowe
@ontowonka
network
2. www.ctsaconnect.org CTSAconnect
Reveal Connections. Realize Potential.
People and Resources
techniques
training
protocols
affiliation
roles
grants
credentials
genes
anatomy
manufacturer
publications
disease
3. CTSAConnect Project
Connecting people and resources
Needs:
Identify potential collaborators, relevant resources, and
expertise across scientific disciplines
Assemble translational teams of scientists to address specific
research questions
Goal is to create a semantic representation of clinician and
basic science researcher expertise to enable:
More effective linking of information about clinicians and basic
science researchers
Computation and publication of clinical expertise data as
Linked Data (LD) for use in other applications
4. Integrated Semantic Framework
Ontology (ISF) suite
Merge the eagle-i and VIVO ontologies into one single ontology
suite (the ISF)
Extend their coverage to include representation of clinical
encounter
Modularize the ISF such that it can be made available in a set of
files that can be reused independently
[Diagram labels: eagle-i (Resources), VIVO (People), Semantic Coordination, Clinical activities]
5. ISF Content and modularization
eagle-i: Research resources
VIVO: Person profiling
CTSA ShareCenter: Discussions, requests, share documents
ISF: Contact, Organizations, Affiliations, Services, Events, Clinical Expertise, Reagents, Organisms, Credentials
6. Original Ontologies
eagle-i resource ontology:
BFO as upper ontology
Has OBO Foundry principles as guiding design principles
Aimed at driving an application as well as developing an interoperable core domain ontology
Active application and ontology development and live data
VIVO ontology:
No upper ontology
Adopts ontologies already in wide use across the Linked Data community such as FOAF and BIBO
Aimed mostly at supporting data validation and data entry through the VIVO application and at producing Linked Data
Active application and ontology development and live data
Somewhat unconventional scenario: ontologies are usually created from
scratch, or existing ontologies are reused, without the above constraints
7. A first approach
Goals:
Identify overlapping and duplicated entities in the eagle-i and VIVO ontologies
Avoid severe disruptions in application compatibility
Make minor incremental additions to the ISF and push significant changes back to the
source ontologies
Good for:
Referencing existing entities while developing new ISF-specific modules
Performing initial alignments on classes in some portion of the overlapping
hierarchies
Limits:
Lengthy process of identifying necessary alignments and implementing changes
in the source ontologies
With no disruption to the applications, development was slow and low impact
8. Preferred approach
Implement the refactoring and merging disconnected from application and
data constraints
Impact on the application and data migration assessed after refactoring
Better balance of impact on apps and data migration versus total redesign of approach
Refactoring of source files based on content coverage
9. What the ISF means for VIVO
Fewer object properties
hasRole
hasAttendeeRole
hasClinicalRole
hasPartnerRole
hasOrganizerRole
hasOutreachProviderRole
hasTeacherRole
…
13. New Approach
Consistently query for type of related resource
Configure application behavior for property and class
combinations, not properties alone
Person → hasRole → Teacher Role: “teaching activities”
Person → hasRole → Presenter Role: “presentations”
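The pattern on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples rather than a real RDF store, and the identifiers (`ex:alice`, `ex:role1`, `TeacherRole`, `PresenterRole`, `SECTION_LABELS`) are hypothetical names, not ISF or VIVO terms.

```python
# Toy triple store: (subject, predicate, object). All identifiers are hypothetical.
TRIPLES = {
    ("ex:alice", "hasRole", "ex:role1"),
    ("ex:alice", "hasRole", "ex:role2"),
    ("ex:role1", "rdf:type", "TeacherRole"),
    ("ex:role2", "rdf:type", "PresenterRole"),
}

# Application behavior is keyed on (property, class of the related resource),
# not on the property alone.
SECTION_LABELS = {
    ("hasRole", "TeacherRole"): "teaching activities",
    ("hasRole", "PresenterRole"): "presentations",
}


def types_of(resource):
    """Return the set of classes asserted for a resource."""
    return {o for s, p, o in TRIPLES if s == resource and p == "rdf:type"}


def sections_for(person):
    """Group a person's related resources by their configured display section."""
    sections = {}
    for s, p, o in TRIPLES:
        if s != person or p == "rdf:type":
            continue
        # Always look up the type of the related resource before deciding
        # how the application should present it.
        for cls in types_of(o):
            label = SECTION_LABELS.get((p, cls))
            if label is not None:
                sections.setdefault(label, []).append(o)
    return {label: sorted(items) for label, items in sections.items()}
```

With a single generic `hasRole` property, adding a new kind of role would only need a new entry in the configuration table rather than a new property.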
21. Fewer Person Subclasses
Most Person subclasses removed from ISF
Retained in VIVO application until 1.7+
Person
FacultyMember
NonFacultyAcademic
Postdoc
Librarian
Archivist
…
22. Fewer Person Subclasses
Better to query for people by positions and roles
Person
FacultyMember
NonFacultyAcademic
Postdoc
Librarian
Archivist
…
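One possible reading of querying by position instead of by Person subclass is sketched below. The data layout and every identifier in it (`POSITIONS`, `FacultyPosition`, `LibrarianPosition`, `ex:alice`) are assumptions made for illustration, not terms from the ISF.

```python
# Hypothetical data: each position links a person to an organization
# and carries its own type. Nothing here is an actual ISF identifier.
POSITIONS = [
    {"person": "ex:alice", "type": "FacultyPosition", "org": "ex:biology"},
    {"person": "ex:bob", "type": "LibrarianPosition", "org": "ex:library"},
    {"person": "ex:alice", "type": "LibrarianPosition", "org": "ex:library"},
]


def people_with_position(position_type):
    """Find people through the positions they hold, not through a Person subclass."""
    return sorted({p["person"] for p in POSITIONS if p["type"] == position_type})
```

In this sketch a person who holds two kinds of position appears in both result sets without being typed as two different kinds of person.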
23. Three examples of merging and refactoring
Tackle an open design/representation issue by proposing
a new design pattern (position of a person over time)
Reference/incorporation of external vocabularies or
taxonomies
Merging two different design approaches (Person and
Contacts) using an existing standard (vCard)
29. Protégé refactoring plugin
Annotation view shows annotations as approved or pending approval.
Module view shows pending axiom changes per module, can save the
changes with a log comment, and can generate the spreadsheet summary
35. So what?
Now that eagle-i and VIVO are “on the same page,” future
development can leverage better consensus and ontologically
rigorous solutions
CTSAs have a new research profiling data standard for exchange
Applications such as VIVO, eagle-i, LOKI, Profiles, SciVal, and
SciENcv are working on generating ISF-compliant data
We can profile people based on a much larger diversity of their
activities and products of research
There is still a lot of work to do – this was a short term project
and ISF could be better generalized for other use cases
36. ISF/VIVO Ontology Working Group
Extending and building on the ISF going forward
• Collaborating with other VIVO working groups to ensure
the ISF evolves as it needs to
• Synergies with ShareCenter, eagle-i, Plumage from UCSF
Engaging as a community with SciENcv, CASRAI,
euroCRIS LOD group, CTSA Ontology Affinity Group,
and others
Biweekly calls, mailing list, documentation – all on
wiki.duraspace.org/display/vivo
37. ISF/ShareCenter Drupal Integration
1) Mapping Drupal Node Fields with corresponding ISF
Predicates
2) Integration Issue #1 – Drupal creates its own URIs for
mappings.
3) Importing of custom RDF augmentation to Drupal
RDF Store (ARC2)
4) Integration Issue #2 – ARC2 store wiped clean on
each re-indexing
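Step 1 above, mapping node fields to predicates, might look roughly like the following sketch. The slides do not list the actual mappings, so the field names and predicates here (`field_affiliation`, `ex:hasAffiliation`, and so on) are purely hypothetical, and the code stands in for the Drupal side in plain Python.

```python
# Hypothetical mapping from Drupal node field names to predicates.
# Neither the field names nor the predicates come from ShareCenter or the ISF.
FIELD_TO_PREDICATE = {
    "title": "rdfs:label",
    "field_affiliation": "ex:hasAffiliation",
}


def node_to_triples(node_uri, fields):
    """Emit one triple per mapped field; unmapped fields are skipped."""
    return [
        (node_uri, FIELD_TO_PREDICATE[name], value)
        for name, value in sorted(fields.items())
        if name in FIELD_TO_PREDICATE
    ]
```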
38. Team
CTSA 10-001: 100928SB23
PROJECT #: 00921-0001
OHSU: Melissa Haendel, Carlo Torniai, Nicole Vasilevsky, Shahim Essaid, Eric Orwoll
Cornell University: Jon Corson-Rikert, Dean Krafft, Brian Lowe
University of Florida: Mike Conlon, Chris Barnes, Nicholas Rejack
Stony Brook University: Moises Eisenberg, Erich Bremer, Janos Hajagos
Harvard University: Daniela Bourges-Waldegg, Sophia Cheng
ShareCenter: Chris Kelleher, Will Corbett, Ranjit Das, Ben Sharma
University at Buffalo: Barry Smith, Dagobert Soergel
Resources
CTSAconnect project: ctsaconnect.org
ISF ontology source: http://code.google.com/p/connect-isf/
ISF 1.0 Documentation: http://connect-isf.googlecode.com/svn/release/2013-07-31/isf-1.0-documentation.pdf
New Duraspace Ontology Working Group: https://wiki.duraspace.org/pages/viewpage.action?pageId=34656953
Editor's notes
The process of integrating the eagle-i and VIVO ontologies, refactoring them, and modularizing the ISF posed a set of interesting challenges and constraints
Trade-off between aggregating by content coverage and a pattern-driven organization (for example, certain types of axioms in one place, imports, etc.)
For instance, a profile module that needs to be generic.
Vocabulary as information model: Person and social security number. The axiom that a person has a social security number is not an axiom that exists in a “dictionary”. Informational vs. definitional axioms: informational axioms are about a subset of the entity, e.g. people are not defined by their social security number.
Using vCard to bring together two different representations.
Both eagle-i and VIVO had representations of contact information, but most of it was done with data properties and string values, some of which were not structured. The move to the new vCard/FOAF representation imposes more structure and requires much more use of classes and object properties. The data migration is not yet done for the applications.
ISF now includes the general idea of a contact, which can take multiple forms. An agent can have a contact that is a FOAF profile (more web based), while the vCard is more standard.
The vCard standard is a well-established IETF networking standard for exchanging contact-related information, and the FOAF vocabulary is a commonly used RDF vocabulary for contact-like information that focuses on web presence rather than on physical addresses and communication, as vCard does. The ISF adopts both vCard and FOAF. vCard had an existing RDF mapping at the beginning of the project, but the W3C recently published a new RDF mapping for version 4 of vCard. The new mapping is still in draft status, but we are moving to it for the final release of the ISF.
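The shift from flat string values to a structured contact can be illustrated with a small sketch. It assumes triples are plain tuples, and the names `hasContact`, `ContactCard`, `hasEmail`, `Email`, and `value` are hypothetical placeholders rather than the actual vCard, FOAF, or ISF terms.

```python
# Before: a single data property with a string value, e.g.
#   ("ex:alice", "email", "alice@example.org")
# After: a contact node with its own typed parts (placeholder names only).


def migrate_email(person, email, contact_id):
    """Turn one flat, string-valued email into a structured contact node."""
    email_node = contact_id + "-email"
    return [
        (person, "hasContact", contact_id),
        (contact_id, "rdf:type", "ContactCard"),
        (contact_id, "hasEmail", email_node),
        (email_node, "rdf:type", "Email"),
        (email_node, "value", email),
    ]
```

One string-valued triple becomes five triples, which gives a sense of why the structured form needs more classes and object properties and why the data migration is a separate piece of work.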
This shows the use-cases for URIs that don’t fall under the typical OWL class/individual modeling of data. There is a need for an agreed on set of codes, concepts, types, etc. of things in addition to classes and individuals. It is also just another perspective on the domain where there is frequently a need to talk about a whole set (an OWL class) as if it is a single primitive thing (an instance) and SKOS is a formalization of this idea.
Here we have added the punning (if needed)
This diagram shows:
That we make a distinction between the “ontology” on the left side and the “vocabulary” on the right.
This distinction doesn’t mean that the sets of URIs on the two sides are disjoint. Certain URIs might exist as classes in the ontology and as individuals in the vocabulary. This is the punning: the same URI has two different type assertions (class type vs. individual type).
The “PhD degree” is an individual that can be referenced in a “position” instance to indicate that the position is related to PhD degrees in some way, but it doesn’t imply that there is a specific instance of a PhD degree that belongs to some agent related by the position. If an agent later obtains an actual instance of a PhD degree, a new URI will be created and asserted to be an instance of the “PhD degree” class from the ontology (the punning of the “PhD degree” URI).
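A toy illustration of the punning described above, assuming triples are plain tuples. `ex:PhDDegree`, `ex:position1`, `ex:alicePhD`, and the `relatedDegree` predicate are hypothetical names, not ISF identifiers.

```python
PHD = "ex:PhDDegree"  # hypothetical URI used on both sides of the diagram

TRIPLES = {
    (PHD, "rdf:type", "owl:Class"),            # ontology side: a class
    (PHD, "rdf:type", "owl:NamedIndividual"),  # vocabulary side: an individual
    ("ex:position1", "relatedDegree", PHD),    # position refers to the degree type
    ("ex:alicePhD", "rdf:type", PHD),          # an actual degree, with its own URI
}


def is_punned(uri):
    """True when the same URI is asserted as both a class and an individual."""
    types = {o for s, p, o in TRIPLES if s == uri and p == "rdf:type"}
    return "owl:Class" in types and "owl:NamedIndividual" in types


def instances_of(cls):
    """Resources asserted to be instances of the given class."""
    return sorted(s for s, p, o in TRIPLES if p == "rdf:type" and o == cls)
```

Here the position points at the punned URI as an individual, while the degree an agent actually holds is a separate resource typed by the same URI as a class.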
Here is the ICD example:
The concept scheme class represents the vocabulary (MeSH or ICD9), alongside the SKOS concept.
The ICD concept 327.3 exists in the ICD9 scheme.
Next is the notation (which is an actual datatype, such as umls-aui) and the value of that datatype. The concept ICD0 is coded with the code
SKOS gives you some object properties to relate concepts: closeMatch and exactMatch.
When the same AUI or CUI exists, we have an exact match.
LUI, SUI, CUI, AUI (*UI)
The idea is that, using skos:exactMatch or skos:closeMatch, we can walk between ontologies and still relate back to the ISF.
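Walking match links between vocabularies could be sketched as a simple graph traversal. Only `icd9:327.3` comes from the notes; the other identifiers (`umls:C_EXAMPLE`, `mesh:D_EXAMPLE`, `isf:ExampleExpertise`) are made-up placeholders, and the links between them are invented for the example.

```python
from collections import deque

# Placeholder mapping triples; the links are illustrative, not real mappings.
MATCHES = [
    ("icd9:327.3", "skos:exactMatch", "umls:C_EXAMPLE"),
    ("mesh:D_EXAMPLE", "skos:closeMatch", "umls:C_EXAMPLE"),
    ("umls:C_EXAMPLE", "skos:exactMatch", "isf:ExampleExpertise"),
]


def reachable(start, allowed=("skos:exactMatch", "skos:closeMatch")):
    """Breadth-first walk over match links, treated as symmetric."""
    neighbours = {}
    for s, p, o in MATCHES:
        if p in allowed:
            neighbours.setdefault(s, set()).add(o)
            neighbours.setdefault(o, set()).add(s)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return sorted(seen)
```

Chaining `skos:closeMatch` links in this way is looser than SKOS itself, which does not define `closeMatch` as transitive, so a stricter walk would pass `allowed=("skos:exactMatch",)`.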
Increasing the complexity of the ontology merging process created more impetus to keep track of changes and document and validate them. To this end, we developed a Protégé plugin that better supports this new process.
During the most detailed stage, we wanted to mark the axioms of each class as migrated or not.
Yellow meant reviewed; green meant axiom migration was complete.