Model-Driven Software Development with Semantic Web Technologies
1. Query Planning for Semantic Information Integration
José Mora, Óscar Corcho
{jmora, ocorcho}@fi.upm.es
Facultad de Informática
Universidad Politécnica de Madrid
Campus de Montegancedo s/n
28660 Boadilla del Monte, Madrid, Spain
2. General Scenario – Semantic Information Integration
• An ontology is an explicit, formal specification of a shared conceptualization. It provides a shared vocabulary which can be used to model a domain.
• Ontologies can be defined according to different languages, which differ in expressiveness and thus in their properties with respect to the complexity of tasks, even decidability.
• The (OWL) DL-Lite family was born as a group of DLs with reduced expressiveness for efficient query answering. This evolved to the OWL2 profiles EL and QL.
• [Animated slide with overlapping text boxes: three integration variants cited as [Wache01 a], [Wache01 b] and [Wache01 c], one of them labelled “semantic global upgrade”. Example given: PayGo from Google.]
• [Wache01] H. Wache et al., “Ontology-based integration of information - a survey of existing approaches,” in: Ontologies and Information Sharing, vol. 2001, 108-117.
3. Scenario - Subproblems
Subproblems: disparities, schema definition, query distribution, yes/no options

Disparities
• Lexic
• Syntax
• Paradigm
• Semantic
• Terms description
• Concepts description
• Pragmatics

Schema definition
• Ad-hoc approaches
• GAV: straightforward reformulation; source changes affect the system
• LAV: easy to add & remove sources; global schema has to be stable
• GLAV: pros of both, cons of none; harder to manage
• Simple mappings: “simple” to generate automatically; non-constructive for integration

Approaches
• PayGo: Large-Scale, mapping based
• OBSERVER: Semantic, mapping based
• Battré, Quilitz: Semantic, SPARQL based
• Siberski: Semantic, SPARQL, preferences
• Networked Graphs: Semantic, ad-hoc

Techniques
• Rewriting: Bucket, Inverse rules, PICSEL
• Planning: SIMS, Planning-by-rewriting, HTN
• Path search: Bleiholder, Wang
• Reasoning: Calvanese, Pérez-Urbina, SoftFacts

Options
• Materialization
• Update information
• Quality
• Many others
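The "straightforward reformulation" noted for GAV can be illustrated with a minimal sketch. In GAV each global predicate is defined directly in terms of source relations, so a query is reformulated by substituting each global atom with its definitions. The mapping table, the relation names and the helper `unfold_gav` below are hypothetical and do not come from the slides.

```python
# Hypothetical GAV mappings: each global predicate is defined as a union of
# source predicates (all names are illustrative assumptions).
GAV_MAPPINGS = {
    "River": ["src1.rivers", "src2.watercourses"],
    "Lake": ["src1.lakes"],
}


def unfold_gav(query_atoms, mappings):
    """Reformulate a conjunctive query over the global schema.

    query_atoms is a list of global predicate names (unary atoms over the
    same variable, to keep the sketch small). The result is a list of
    conjunctive queries over source predicates, one per combination of
    definitions. An atom without a definition yields no rewriting at all.
    """
    rewritings = [[]]
    for atom in query_atoms:
        definitions = mappings.get(atom, [])
        rewritings = [partial + [d] for partial in rewritings for d in definitions]
    return rewritings


result = unfold_gav(["River"], GAV_MAPPINGS)
```

Removing a source here means editing every definition that mentions it, which is the drawback the slide lists for GAV. LAV instead describes each source as a view over the global schema, which is why it needs rewriting algorithms such as Bucket or Inverse rules.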
4. State of the Art - Solutions
[Diagram: map of existing solutions]
Approaches shown: SIMS, Planning-by-rewriting, HTN, DARQ, Battré, Siberski (preferences), Bucket, Inverse Rules, Calvanese, PICSEL, Pérez-Urbina, OBSERVER, SoftFacts (fuzzy), Bleiholder, Wang
Region labels: ISI; Web services (planning); Physical vs Logical search; Distribute queries; Search for sources; Rewriting; Semantic; Ontology based; Databases; Reasoning; Search for concepts and sources; Path oriented; Search for concepts
5. Work – Base: REQUIEM
• Base: REQUIEM by Pérez-Urbina
• Ontology as the global schema, (DL ELHIO¬)
• Rewrites to datalog queries by saturation
• Logical search but not physical search (∃! local schema)
  • EL: description logic similar to DL-Lite (retains someValuesFrom)
  • H: role inclusions
  • I: inverse roles
  • O: basic concepts like {a}
  • ¬: allows negative inclusions
[Diagram labels: Query, clausification, Clauses, prune, Clause tree, saturation, Datalog program, unfolding, Set of queries, Mediator]
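As a rough illustration of the clausification step, the sketch below turns a few common axiom shapes into datalog-style clauses. It is only a sketch under assumptions: the tuple encoding of axioms and the function `clausify` are hypothetical, it covers just a handful of shapes, and it is not REQUIEM's actual API or its complete translation.

```python
from itertools import count

_fresh = count(1)  # counter used to name fresh Skolem functions f1, f2, ...


def clausify(axiom):
    """Translate one axiom (hypothetical tuple encoding) into clause strings."""
    kind = axiom[0]
    if kind == "sub":        # A is a subclass of B
        _, a, b = axiom
        return [f"{b}(x) :- {a}(x)"]
    if kind == "sub_role":   # role inclusion R in S (the H in ELHIO¬)
        _, r, s = axiom
        return [f"{s}(x, y) :- {r}(x, y)"]
    if kind == "inverse":    # inverse of R included in S (the I in ELHIO¬)
        _, r, s = axiom
        return [f"{s}(x, y) :- {r}(y, x)"]
    if kind == "domain":     # everything with an R-successor is an A
        _, r, a = axiom
        return [f"{a}(x) :- {r}(x, y)"]
    if kind == "some":       # every A has some R-successor in B (someValuesFrom)
        _, a, r, b = axiom
        f = f"f{next(_fresh)}"  # Skolem function standing for the unnamed successor
        return [f"{r}(x, {f}(x)) :- {a}(x)", f"{b}({f}(x)) :- {a}(x)"]
    raise ValueError(f"unsupported axiom shape: {kind}")


example = clausify(("sub", "Groundwater", "Continental_Water"))
```

The existential case is the one that makes saturation necessary: the Skolem terms it introduces have to be resolved away before the result can be handed to a datalog engine.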
7. Work – previous work
• My previous work: Modification of REQUIEM
• The ontology is only partially covered by the information source, so it can be pruned
• This prune increases the efficiency of the process
• Futile queries are not generated, so there are fewer queries in the result
[Diagram labels: Query, clausification, Clauses, prune, Clause tree, saturation, Datalog program, unfolding, Set of queries, Mediator]
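The idea of pruning by source coverage can be sketched as follows. This is a minimal illustration under assumptions, not the modification of REQUIEM described on the slide: clauses are restricted to the unary form head(x) :- body(x), and the clause set, the mapped predicate and the function `prune` are hypothetical.

```python
def prune(clauses, query_pred, mapped):
    """Keep only clauses that can contribute to answering the query.

    clauses: list of (head, body) pairs meaning head(x) :- body(x).
    A clause is kept when its head leads to the query predicate and its
    body can be populated from at least one mapped predicate.
    """
    # Backward closure: predicates from which the query predicate is derivable.
    useful = {query_pred}
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head in useful and body not in useful:
                useful.add(body)
                changed = True
    # Forward closure: predicates that can receive data from the mappings.
    populated = set(mapped)
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if body in populated and head not in populated:
                populated.add(head)
                changed = True
    return [(h, b) for h, b in clauses if h in useful and b in populated]


# Illustrative clause set (assumed, not taken from a real ontology file).
clauses = [
    ("Water", "Continental_Water"),
    ("Continental_Water", "Groundwater"),
    ("Groundwater", "Ground_Stream"),
    ("Water", "Seawater"),   # dropped: Seawater has no mapped data
    ("Hydronym", "Mouth"),   # dropped: irrelevant for the query
]
kept = prune(clauses, "Water", mapped={"Ground_Stream"})
```

Clauses removed at this point never reach saturation or unfolding, which is where the reduction in generated queries would come from.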
8. Results - Efficiency
• Checked time for naïve and greedy modes
• Global and first modes for ontology pruning
• Only one ontology, several mapping files
[Bar chart: time in ms (axis from 0 to 3000) for R2OO-BCN-GF, R2OO-BCN-NG, R2OO-EGM-GF, R2OO-EGM-NG, R2OO-Atlas-GF, R2OO-Atlas-NG, PU-G and PU-N]
9. Results – Effectiveness – # of Clauses (~queries) (1/2)
• Checked the number of clauses at several stages of
the algorithm
• After parsing the initial ontology
• Pruning the clauses with the information relevant for the query
• Saturating the clauses
• Unfolding the clauses
• Pruning again (only performed in greedy mode)
• Checked naïve and greedy modes for inference
• Checked global and first modes for ontology pruning
• Only one ontology, several mapping files providing
different coverages
10. Results – Effectiveness – # of Clauses (~queries) (2/2)
[Chart: number of clauses (axis from 0 to 2500) at each stage: after parsing, after pruning (i), after saturation, after unfolding, after pruning (ii)]
11. Example
Query:
Q(x) :- Water(x)
[Ontology diagram, mapped predicates shown in bold. Concepts: Hydrographic phenomenon, Water, Continental Water, Groundwater, Ground Stream, Freshwater, Seawater, Aquifer, Running Water, Still Water, Surfacewater, Transition Water, Water Collector, Punctual, Junction, Upwelling, Hydronym, Mouth]
Continental_Water(x) :- Groundwater(x)
Groundwater(x) :- Ground_Stream(x)
Continental_Water(x) :- Ground_Stream(x)
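The third clause above follows from the first two by resolution. A minimal sketch of that saturation step, and of the unfolding that selects the mapped predicates, is given below. It assumes unary clauses of the form head(x) :- body(x) only. The clause Water(x) :- Continental_Water(x) and the choice of Ground_Stream as the mapped predicate are assumptions made for illustration, since the bold marking of the slide is lost. The functions `saturate` and `unfold` are hypothetical names.

```python
def saturate(clauses):
    """Close a set of clauses head(x) :- body(x) under resolution.

    Resolving A(x) :- B(x) with B(x) :- C(x) yields A(x) :- C(x).
    """
    closed = set(clauses)
    while True:
        derived = {(h1, b2) for h1, b1 in closed for h2, b2 in closed if b1 == h2}
        if derived <= closed:
            return closed
        closed |= derived


def unfold(query_pred, clauses, mapped):
    """Return the mapped predicates whose instances answer query_pred."""
    candidates = {query_pred} | {b for h, b in saturate(clauses) if h == query_pred}
    return sorted(candidates & set(mapped))


# The two clauses given on the slide.
slide_clauses = {
    ("Continental_Water", "Groundwater"),
    ("Groundwater", "Ground_Stream"),
}
saturated = saturate(slide_clauses)

# Assumption: Water subsumes Continental_Water and only Ground_Stream is mapped.
with_water = slide_clauses | {("Water", "Continental_Water")}
queries = unfold("Water", with_water, mapped={"Ground_Stream"})
```

Under these assumptions the query Q(x) :- Water(x) is answered by a single query over Ground_Stream, since none of the intermediate concepts has data of its own.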
14. Work – current work
• @ISI: Integration w/ GAV mediator, DQP, OGSA-DAI
• Other mediators should be straightforward
• Real tests (w/ schemas and data): not done (yet)
• Always open to suggestions for future (remote) collaboration
[Diagram labels: Query, clausification, Clauses, prune, Clause tree, saturation, Datalog program, unfolding, Set of queries, Mediator]
16. Data Integration Working Group in the Ontology Engineering Group
OEG
Facultad de Informática
Universidad Politécnica de Madrid
Campus de Montegancedo sn
28660 Boadilla del Monte, Madrid
http://www.oeg-upm.net
Phone: 34.91.3367439, 34.91.3366605
Fax: 34.91.3524819
17. Semantic e-Science
•Data Integration
•Ontology-based DB access: R2O and ODEMapster
•Semantic Grid
•S-OGSA Architecture
•WS-DAIOnt-RDF(S) OGF standard
•RDF(S) Grid Access Bridge
[Diagram: RDF(S) Grid Access Bridge architecture. Web Service Tier: upper service layer (RepositorySelectorService); intermediate service layer (RepositoryService); lower service layer (Resource, Class, Property, Statement, Container, List and Alt Services). RDFSConnector. RDF(S) Storage Layer: Sesame, Jena and Atlas connectors, each over its own RDF storage]
18. General scenario
Several PhD students working in a shared general scenario at UPM
• Jose Mora – Query plans
• Freddy Priyatna – Multi-RDB2RDF
• Victor Saquicela – Automatic WS semantic annotation
• Carlos Buil – Distributed SPARQL queries
• Jean-Paul Calbimonte – Multi-SensorNetwork2RDF
19. R2O++ - Freddy Priyatna
[Diagram: R2O++ processing pipeline. Labels: R2O Mapping Document, R2O Parser, Mapping objects, R2O Unfolder, R2O Properties, SQL Query, Query evaluator, DB, Result Set, R2O Postprocessor, Triples, Jena Model, Model Writer, RDF Document]
Asunción Gómez Pérez 19
20. Semantic Streaming Data Access – Jean Paul Calbimonte
[Diagram: Semantic Integrator between a Client and Distributed Query Processing.
Query path: q, SPARQLSTR (Og); Query reconciliation using O-O mapping; qr, SPARQLSTR (O1 O2 On); Query canonisation using R2O mappings; Qc, SNEEql (S1 S2 Sn).
Data path: Dc, [tuple l1 l2 l3]; Data decanonisation; dr, [triple O1 O2 On]; Data reconciliation; d, [triple Og]]
21. Semantic Annotation of RESTful Services – Victor Saquicela
[Diagram labels: Internet; Web applications & API; SpellingSuggestions; Syntactic description (input, output); Semantic annotation; User; Repository]
23. Ontology Engineering Group
Prof. Dr. Asunción Gómez-Pérez, Dr. Oscar Corcho
Facultad de Informática
Universidad Politécnica de Madrid
Campus de Montegancedo sn
28660 Boadilla del Monte, Madrid
http://www.oeg-upm.net
{asun,ocorcho}@fi.upm.es
Phone: 34.91.3367439, 34.91.3366605
Fax: 34.91.3524819
Presenter: Jose Mora (jmora@fi.upm.es)
24. People
•Director: A. Gómez-Pérez
•Research Group (37 people)
• 2 Full Professors
• 4 Associate Professors
• 1 Assistant Professor
• 3 Postdocs
• 17 PhD Students
• 8 MSc Students
• 2 Software Engineers
• Management (4)
• 2 Project Managers
• 1 System Administrator
• 1 Secretary
• 50+ Past Collaborators
• 10+ visitors
25. Research Areas
• Ontological Engineering (1995)
• Natural Language Processing (1997)
• (Social) Semantic Web (2000)
• Semantic e-Science (Data Integration, Semantic Grid) (2004)
• Internet of Things (2008)
26. Research projects
[Timeline chart, 1999 to 2013. Legend: Company; Spanish Projects; EU Project Coordinators; EU Project Participation]
• Katalyx
• IGN/RAE/AMPER/XMEDIA
• WHO/IGN
• Group
• PLATA
• España Virtual/mIO!/Buscamedia
• REIMDOC (FIT)
• Red/Gis4Gov/11811/UPnP/UpGrid/Autores3.0/WEBn+1
• ContentWeb
• Servicios Semánticos
• GeoBuddies
• 12 Special/Complementary Actions (Ac. Especiales/Complementarias)
• HA98-0002
• HF02-0013
• MKBEEM
• OntoWeb
• Esperonto
• PIKON
• Knowledge Web
• OntoGrid
• SEEMP
• NeOn
• Marie Curie
• ADMIRE
• SemSorGrid4Env
• DynaLearn
• SEALS
• MONNET
27. Ontological Engineering
•METHONTOLOGY & WebODE
•NeOn Methodology for building Networks of Ontologies
  • Ontology Scheduling
  • Ontology Requirement Specification
  • Ontology Reuse
  • Non Ontological Resource Reuse and Reengineering
  • Ontology Localization
  • Ontology Mapping
  • Ontology Design Patterns
  • Ontology Change Propagation
[Diagram: NeOn Methodology scenarios, numbered 1 to 9. Knowledge Resources: Ontological Resources (O. Design Patterns, O. Repositories and Registries; Flogic, RDF(S), OWL) and Non Ontological Resources (Glossaries, Dictionaries, Lexicons, Classification Schemas, Taxonomies, Thesauri). Activities: Ontological Resource Reuse, Ontological Resource Reengineering, Non Ontological Resource Reuse, Non Ontological Resource Reengineering, Ontology Design Pattern Reuse, O. Aligning, O. Merging, Alignments, Ontology Restructuring (Pruning, Extension, Specialization, Modularization), O. Localization. Core sequence: O. Specification, O. Conceptualization, O. Formalization, O. Implementation. Ontology Support Activities: Knowledge Acquisition (Elicitation); Documentation; Configuration Management; Evaluation (V&V); Assessment]
28. Ontologies and Natural Language Processing (NLP)
•LIR – Linguistic Information Repository
•Multilingual ontologies & Label Translator
•Lexico-Syntactic Patterns for automatic ontology building (Sp, En, Ge)
[Screenshot: LIR Entity Properties View for the lexical entries "fleuve", "rivière" and "river". Panels: Lexical Entry Information (Part Of Speech: noun; Synonyms: rivière; Translations: river); Lexicalization Information (Main Entry, Scientific Name, Grammatical Number: singular, Term Type: acronym); Lexicalization Sense (Sense 01, Language in Context: en); Lexicalization Source (IATE, http://iate.europa.eu/iatediff/Search...); Definitions ("stream of water of considerable volume and length that flows into the sea", en); Definition Source (http://www.cnrtl.fr/; Britannica Online, http://www.britannica.com/...); Lexicalization Notes (en): "Fleuve and rivière are usually considered synonyms. However, the use of fleuve should be avoided when the stream does not flow into the sea."]
29. (Social) Semantic Web
•Semantic Web Framework
•Semantic Portals
•Semantic Wikis
•Annotation and Browsing Tools
  • Web content
  • Multimedia content in home environments
•NeOn Methodology for building Large Scale Semantic Web Applications
•Benchmarking Semantic Web Technologies
•Evolution of folksonomies and ontologies
30. Internet of Things
• Topics
  • Mobile devices
  • Sensor networks
  • Ubiquitous computing
• Large-scale data integration
  • Legacy DB
  • Sensor networks
  • User generated content
• Large-scale data integration for mobile applications exploiting user-generated content
31. Semantic e-Science
•Data Integration
•Ontology-based DB access: R2O and ODEMapster
•Semantic Grid
•S-OGSA Architecture
•WS-DAIOnt-RDF(S) OGF standard
•RDF(S) Grid Access Bridge
[Diagram: RDF(S) Grid Access Bridge architecture. Web Service Tier: upper service layer (RepositorySelectorService); intermediate service layer (RepositoryService); lower service layer (Resource, Class, Property, Statement, Container, List and Alt Services). RDFSConnector. RDF(S) Storage Layer: Sesame, Jena and Atlas connectors, each over its own RDF storage]
32. Collaboration with other research groups
• Univ. of Wien
• DFKI
• Univ. of NR & ALS
• Univ. of Augsburg
• KSL. Stanford Univ.
• Univ. of Amsterdam
• Univ. of Innsbruck
• Univ. of Karlsruhe
• Free Univ. of Amsterdam
• Univ. of Koblenz
• Univ. of Hannover
• Univ. of Brasilia
• Univ. of Mannheim
• Univ. of Bielefeld
• Free Univ. of Brussels
• Forschungszentrum Informatik
• Univ. of Galway (DERI)
• Univ. of Zurich
• Ústav Informatiky
• Academy of Sciences
• Open University
• Oxford University
• Univ. of Manchester
• Univ. of Liverpool
• Univ. of Sheffield
• Univ. of Aberdeen
• Univ. of Tel Aviv
• Univ. of Edinburgh
• CNR
• Univ. of Southampton
• Univ. of Trento
• INRIA
• Univ. of Hull
• Univ. of Athens
• Univ. of Bolzano
• TUC
Editor's notes
References here. ToDo: Halevy, Wache, Kossmann, Corcho (Haas and Arens are already there), Calvanese98, and the last two boxes, I cannot think about them now. And all those from the boxes on the left in query distribution.