Industry Programme Workshop: Data Integration
18-19 September 2013
Ontology-based
data integration
Janna Hastings
Data integration is hard
Technology
Syntax
Semantics
Content
Different data resources, different needs
“why can't they all just use the same
- schema
- measurement accuracy
- units
- labels
- content?”
Standards are the solution… (?)
Source: http://xkcd.com/927/
Ontology-based data integration
Ontologies can help with the semantic and the content
aspects of data integration
• Semantic: definition for schemas
• OWL is a good language for defining schemas
• See RDF and Semantic Web presentations, today
• Content: definition of the entities referred to by data
• Ontologies embedded into a data integration workflow help
facilitate content-aware data integration
Core challenge: labelling
Multiple labels can mean the same thing
One label can mean multiple things
Semantics-free identifiers, multiple
synonyms
CHEBI:27732
A trimethylxanthine in which the three methyl
groups are located at positions 1, 3, and 7.
guaranine
methyltheobromine
1,3,7-trimethylxanthine
Koffein
caféine
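The "hub" idea on this slide can be sketched in a few lines of Python. This is only an illustration, not how ChEBI itself is implemented: the CHEBI:27732 synonyms are taken from the slide, while the two `EX:` terms and their labels are made up to show one label pointing at more than one term.

```python
from collections import defaultdict

# Synonyms keyed by semantics-free identifier.
# CHEBI:27732 and its synonyms come from the slide; the EX: entries are
# hypothetical and only illustrate an ambiguous label.
SYNONYMS = {
    "CHEBI:27732": [
        "guaranine",
        "methyltheobromine",
        "1,3,7-trimethylxanthine",
        "Koffein",
        "caféine",
    ],
    "EX:0000001": ["cold", "low temperature"],
    "EX:0000002": ["cold", "common cold"],
}


def build_label_index(synonyms):
    """Invert the id -> labels map into a case-insensitive label -> ids map."""
    index = defaultdict(set)
    for term_id, labels in synonyms.items():
        for label in labels:
            index[label.casefold()].add(term_id)
    return index


def lookup(index, label):
    """Return every identifier a label may refer to, sorted for stable output."""
    return sorted(index.get(label.casefold(), set()))


INDEX = build_label_index(SYNONYMS)
```

Many labels resolve to one identifier (`lookup(INDEX, "Koffein")`), and one label may resolve to several (`lookup(INDEX, "cold")`), in which case the caller has to disambiguate.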
Core challenge: biological knowledge
The answer to the question: “Is
Entity A from Data Source 1
the same thing as
Entity B from Data Source 2?”
often depends on who is asking and who is answering!
Left lung vs. lung
Hippocampus vs. brain
Dopamine vs. L-dopamine
In vitro vs. In vivo cells of type X
Gene Y and post-translationally modified form Y'
Gene Z in mouse, Gene Z in human
Hierarchy
left lung → is a → lung → is a → organ
Generalise to the
nearest common ancestor
i.e. if you are integrating data about tissue
samples annotated to 'lung' in one dataset
and 'left lung' in the other,
the ontology can compute 'lung' as the
nearest common ancestor
The same works for 'left lung' and 'right lung'
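A minimal sketch of the nearest-common-ancestor computation, assuming every term has at most one 'is a' parent. Real ontologies are usually graphs with multiple parents, so a reasoner or graph library would be needed in practice. The `IS_A` table holds only the terms named on this slide, and the function names are illustrative.

```python
# child -> parent along the 'is a' hierarchy (single parent assumed)
IS_A = {
    "left lung": "lung",
    "right lung": "lung",
    "lung": "organ",
}


def ancestors_inclusive(term, is_a):
    """Return the term followed by its ancestors, nearest first."""
    chain = [term]
    while chain[-1] in is_a:
        chain.append(is_a[chain[-1]])
    return chain


def nearest_common_ancestor(a, b, is_a):
    """Return the first term on a's chain that also lies on b's chain, or None."""
    on_b_chain = set(ancestors_inclusive(b, is_a))
    for term in ancestors_inclusive(a, is_a):
        if term in on_b_chain:
            return term
    return None
```

Data annotated to 'lung' and to 'left lung' would then be pooled under `nearest_common_ancestor("lung", "left lung", IS_A)`.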
Other relationships
Relationships encode biological knowledge
Rules specify which relationships
can be traversed for data integration purposes
e.g. for tissue samples, part_of:
sample_from o part_of => sample_from
A sample from a part of the brain (e.g. the
hippocampus) is a sample from the brain
(Quite aside from the 'is a' hierarchy!)
hippocampus → part of → brain
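The tissue-sample rule on this slide can be sketched as a simple traversal: a sample from X is also a sample from everything X is part of. This assumes each structure has at most one part_of parent; the sample identifier `sample-1` is hypothetical, and an OWL reasoner would normally do this work rather than hand-written code.

```python
# structure -> the structure it is part of (from the slide)
PART_OF = {"hippocampus": "brain"}

# sample -> structure it was taken from ("sample-1" is a hypothetical id)
SAMPLE_FROM = {"sample-1": "hippocampus"}


def sample_from_closure(sample, sample_from, part_of):
    """Apply 'sample_from followed by part_of implies sample_from' repeatedly.

    Returns the annotated structure first, then each enclosing structure.
    """
    results = []
    current = sample_from.get(sample)
    # The membership check guards against accidental cycles in part_of.
    while current is not None and current not in results:
        results.append(current)
        current = part_of.get(current)
    return results
```

A query for brain samples can then match `sample-1` even though it was annotated only to the hippocampus.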
Core challenge: flexibility
… (>150 members)
Fixed-depth hierarchies
force some classes to be
too big, with the lowest level
collapsing the biological hierarchy,
and others too small
… (<1 member)
Ontologies in content integration
[Figure: data sources A and B integrated into A&B]
1. Schema mappings
2. Ontology-provided synonyms
3. Hierarchy and relationship rules for integration
OWL language and tools: web-embedded
(but whole-ontology rule reasoning may be slow)
Is ontology integration
just another type of data integration?
Which ontology(-ies) to use?
How to use them together?
How to plug the gaps?
Why should I (as a user) have to
do this integration over and over?
Desiderata for ontologies for data
integration
• Ontologies should be neutral and shared community-wide
• Users should be able to directly and rapidly extend the ontology where there are gaps (responsiveness)
• The ontology should use semantics-free identifiers and at the same time energetically annotate synonyms
• When necessary, ontologies should take care of ontology integration to provide the community with a one-stop service and appropriate cross-references
• The ontologies should be used in data annotation
See http://www.obofoundry.org/
Questions?


Editor's notes

  1. New applications, new formats for exchange
  2. I visited a website listing 10 reasons data integration was hard. It focused mainly on business data integration scenarios, but is still relevant for bioinformatics. It included sterling true points such as: technology changes very rapidly, but legacy never 100% goes away; different applications have fundamentally different needs; we keep inventing new products; etc. The first comment was almost brilliantly dumb. It offered this pearl of wisdom as an 11th reason: the developers can’t just use the same one golden way to design <schemas/content/etc>! If only they could, data integration would be SO much easier.
  3. Ontologies gather synonyms together around a semantics-free identifier which acts as a “hub” for all the possible labels that could refer to things of that type. This works for ambiguous labels too, since the same label can be associated with multiple ontology terms.