2. 10/8/2009
2
Syntactic Web
10/9/2009 Creative Commons - BY-NC 3
Problems
A typical web page is written in a markup language, HTML, which is designed for rendering presentation and for hyperlinking to related information. Semantic content is accessible to humans but not to computers.
Linguistic
[Figure: meaning triangle. The form "Tank" activates a concept, the concept relates to a referent, and the form stands for the referent, which is left open ("?").]
Problems
• Keyword‐based Search
• Synonyms and Homonyms
• No Parameter Search
• No Cross Silos Data Extraction or Comparison
• No Unified View and/or Interpretation of Data
• Limited Ability to Re‐use Data
• Difficult to Share Data with Business Partners
What is Ontology?
http://en.wikipedia.org/wiki/Ontology_%28information_science%29
• In computer science and information science, an
ontology is a formal representation of a set of
concepts within a domain and the relationships
between those concepts. It is used to reason about
the properties of that domain, and may be used to
define the domain.
• An ontology is a formal, explicit specification of a
conceptualization.
XML (Extensible Markup Language)
It is a textual data format, with strong support via Unicode for the languages of the world. Although XML’s design focuses on documents, it is widely used for the representation of arbitrary data structures.
Well‐formed and error‐handling
• It contains only properly‐encoded legal Unicode characters. None of the special syntax characters such as "<" and "&" appear except when performing their markup‐delineation roles.
• The begin, end, and empty‐element tags which delimit the elements are correctly nested, with none missing and none overlapping.
• The element tags are case‐sensitive; the beginning and end tags must match exactly.
• There is a single "root" element which contains all the other elements.
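The well‐formedness rules above can be checked with any conforming XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample documents are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule on the slide.
samples = {
    "correctly nested": "<book><title>Weaving the Web</title></book>",
    "overlapping tags": "<book><title>Weaving</book></title>",
    "case mismatch": "<Book></book>",
    "two root elements": "<a/><b/>",
    "bare ampersand": "<t>R&D</t>",
    "escaped ampersand": "<t>R&amp;D</t>",
}

for label, doc in samples.items():
    print(f"{label}: {is_well_formed(doc)}")
```

Only the first and last samples should parse; each of the others breaks exactly one of the rules listed above.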
XSD
(XML Schema)
XSD can be used to express a set of rules to which an XML document must conform in order to be considered 'valid' according to that schema. However, unlike most other schema languages, XSD was also designed with the intent that determination of a document's validity would produce a collection of information adhering to specific data types.
XSD datatypes ‐1/2
• xsd:string
• xsd:boolean
• xsd:decimal
• xsd:float
• xsd:double
• xsd:dateTime
• xsd:time
• xsd:date
• xsd:gYearMonth
• xsd:gYear
• xsd:gMonthDay
• xsd:gDay
• xsd:gMonth
• xsd:hexBinary
• xsd:base64Binary
• xsd:anyURI
• xsd:normalizedString
• xsd:token
XSD datatypes ‐2/2
• xsd:language
• xsd:NMTOKEN
• xsd:Name
• xsd:NCName
• xsd:integer
• xsd:nonPositiveInteger
• xsd:negativeInteger
• xsd:long
• xsd:int
• xsd:short
• xsd:byte
• xsd:nonNegativeInteger
• xsd:unsignedLong
• xsd:unsignedInt
• xsd:unsignedShort
• xsd:unsignedByte
• xsd:positiveInteger
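To illustrate the idea that validation yields typed information rather than plain strings, here is a rough sketch that maps a few of the datatypes above to Python values. The `PARSERS` table and `typed_value` helper are hypothetical names, and Python's built‐in conversions only approximate the XSD lexical rules (for example, `int()` accepts some forms that xsd:integer does not).

```python
from datetime import datetime
from decimal import Decimal


def parse_boolean(lexical: str) -> bool:
    # xsd:boolean has exactly four lexical forms: true, false, 1, 0.
    if lexical in ("true", "1"):
        return True
    if lexical in ("false", "0"):
        return False
    raise ValueError(f"not an xsd:boolean: {lexical!r}")


def parse_non_negative_integer(lexical: str) -> int:
    value = int(lexical)
    if value < 0:
        raise ValueError(f"not an xsd:nonNegativeInteger: {lexical!r}")
    return value


# Assumed mapping from datatype name to a parser for its lexical form.
PARSERS = {
    "xsd:string": str,
    "xsd:boolean": parse_boolean,
    "xsd:integer": int,
    "xsd:decimal": Decimal,
    "xsd:double": float,
    "xsd:dateTime": datetime.fromisoformat,
    "xsd:nonNegativeInteger": parse_non_negative_integer,
}


def typed_value(datatype: str, lexical: str):
    """Convert a lexical form into a Python value of the matching type."""
    return PARSERS[datatype](lexical)


print(typed_value("xsd:integer", "42"))
print(typed_value("xsd:dateTime", "2009-10-08T09:30:00"))
```

A value that does not fit its declared type raises `ValueError`, which is the sketch's stand‐in for a validation failure.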
RDF (Resource Description Framework)
RDF makes statements about resources, in particular Web resources, in the form of subject‐predicate‐object expressions. These expressions are known as triples in RDF terminology.
RDF vocabulary
• rdf:type
• rdf:Property
• rdf:XMLLiteral
• rdf:nil
• rdf:List
• rdf:Statement
• rdf:subject
• rdf:predicate
• rdf:object
• rdf:first
• rdf:rest
• rdf:Seq
• rdf:Bag
• rdf:Alt
• rdf:_1
• rdf:_2 ...
• rdf:value
Triples and Graph
The base element of the
RDF model is the triple:
• a resource (the subject)
• links to (the predicate)
• another resource (the object)
A resource <subject> has a
property <predicate>
valued by <object>.
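A triple and a graph of triples can be sketched in a few lines of Python. This is an illustrative in‐memory model only, not an RDF library; the `http://example.org/` namespace, the property names, and the `match` helper are assumptions made for the example.

```python
from typing import List, NamedTuple, Optional, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


EX = "http://example.org/"  # hypothetical namespace for the example

# A graph is simply a set of triples.
graph: Set[Triple] = {
    Triple(EX + "TimBL", EX + "wrote", EX + "WeavingTheWeb"),
    Triple(EX + "WeavingTheWeb", EX + "title", "Weaving the Web"),
    Triple(EX + "WeavingTheWeb", EX + "publishedIn", "1999"),
}


def match(
    graph: Set[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    )


# Everything the graph states about the book resource.
for t in match(graph, subject=EX + "WeavingTheWeb"):
    print(t.predicate, "->", t.object)
```

Because every statement has the same three‐part shape, one generic pattern match is enough to query any data expressed this way.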
[Figure: a <subject> node connected by a <predicate> arc to an <object> node.]
Pros and Cons of RDF
• Pros
– Universal data model (maps to XML, object, and relational models)
– Additive, easy to merge multiple RDF graphs
– Predicate logic (like Prolog)
– Uses URIs to identify resources
• Cons
– Lacks the concept of enumeration
– Lacks data types
– No Object‐Oriented Features
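The "additive" property listed under Pros can be shown with plain sets: when graphs are modeled as sets of triples, merging is set union, and statements shared by both sources collapse into one. The data below is hypothetical, and the sketch ignores blank nodes, which need renaming in a real RDF merge.

```python
EX = "http://example.org/"  # hypothetical namespace

# Two independently produced descriptions of the same resource.
catalog = {
    (EX + "WeavingTheWeb", EX + "title", "Weaving the Web"),
    (EX + "WeavingTheWeb", EX + "year", "1999"),
}
authors = {
    (EX + "WeavingTheWeb", EX + "author", EX + "TimBL"),
    (EX + "WeavingTheWeb", EX + "year", "1999"),
}

# Merging is set union; the shared "year" statement appears only once.
merged = catalog | authors
print(len(merged))  # 3
```

Both sources use the same URI for the book, which is what lets their statements line up without any mapping step.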
RDF Schema (RDFS)
RDF Schema (RDFS) is an extensible knowledge representation language, providing basic elements for the description of ontologies, otherwise called Resource Description Framework (RDF) vocabularies, intended to structure RDF resources.
Classes
• rdfs:Resource
• rdfs:Literal
• rdfs:Class
• rdfs:Datatype
• rdfs:Container
• rdfs:ContainerMembershipProperty
• rdf:List
• rdf:Statement
• rdf:Bag
• rdf:Seq
• rdf:Alt
• rdf:XMLLiteral
• rdf:Property
Properties
• rdfs:subClassOf
• rdfs:subPropertyOf
• rdfs:domain
• rdfs:range
• rdfs:label
• rdfs:comment
• rdfs:member
• rdfs:seeAlso
• rdfs:isDefinedBy
• rdf:first
• rdf:rest
• rdf:type
• rdf:value
• rdf:subject
• rdf:predicate
• rdf:object
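As an example of what these terms make possible, the sketch below applies two standard RDFS inferences to a small set of triples: rdfs:subClassOf is transitive, and an instance of a class is also an instance of its superclasses. The `ex:` class and instance names are hypothetical, and the loop is a naive fixpoint rather than an optimized reasoner.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def infer(triples):
    """Apply subClassOf transitivity and type inheritance until nothing new appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                # (s subClassOf o) and (o subClassOf o2) => (s subClassOf o2)
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # (s2 type s) and (s subClassOf o) => (s2 type o)
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))
        if new <= inferred:
            return inferred
        inferred |= new


# Hypothetical vocabulary: a tank is a vehicle, a vehicle is a machine.
facts = {
    ("ex:Tank", SUBCLASS, "ex:Vehicle"),
    ("ex:Vehicle", SUBCLASS, "ex:Machine"),
    ("ex:tank1", TYPE, "ex:Tank"),
}

for triple in sorted(infer(facts) - facts):
    print(triple)
```

The three printed triples were never stated explicitly; they follow from the schema alone.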
Web Ontology Language
Protégé Overview
• Stanford Center for Biomedical Informatics Research
– Stanford University
– University of Manchester
• OWL Editor
• Plugins: Natural Language, Visualization, Rules Engine,
Database, …
• Very well documented
• Long history with broad academic support
Protégé – Class View
Protégé ‐ Visualization
Ontology Development
• Define purpose and scope
• Elicit knowledge
• Collect and organize concepts
• Classify and add axioms
• Reasoning
OWL vs. UML class modeling
• OWL properties vs. UML associations & attributes
– OWL properties have a direction
– OWL properties are binary relations
– OWL properties are “first‐class” citizens (global scope)
• OWL classes vs. UML classes
– OWL classes have no operations
– OWL classes can have “sufficient” conditions
• Primitive vs. defined classes
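The difference between primitive and defined classes can be hinted at with a toy classifier. A primitive class is only ever asserted, while a defined class carries a sufficient condition that lets membership be inferred. The class `Parent`, its definition, and the individuals are hypothetical, and this is a plain Python check under simplifying assumptions, not an OWL reasoner.

```python
# Asserted (primitive) class memberships and property values.
asserted_types = {"alice": {"Person"}, "bob": {"Person"}, "rex": {"Dog"}}
has_child = {"alice": {"bob"}}


def satisfies_parent_definition(individual: str) -> bool:
    """Assumed sufficient condition for Parent: a Person with some hasChild that is a Person."""
    if "Person" not in asserted_types.get(individual, set()):
        return False
    return any(
        "Person" in asserted_types.get(child, set())
        for child in has_child.get(individual, set())
    )


def classify(individual: str) -> set:
    """Return asserted types plus any defined class whose condition is met."""
    types = set(asserted_types.get(individual, set()))
    if satisfies_parent_definition(individual):
        types.add("Parent")
    return types


print(sorted(classify("alice")))  # ['Parent', 'Person']
print(sorted(classify("bob")))    # ['Person']
```

Nobody asserted that alice is a Parent; the membership follows from the definition, which is what a UML class has no counterpart for.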
Ontologies and Data Models
• Ontologies live in an open, distributed world; data
models live in a closed world
• Writing a model in OWL does not make it an
ontology
– The ontology should be shared
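The open‐world versus closed‐world distinction in the first bullet above comes down to how a missing statement is read. The sketch below contrasts the two; the function names and facts are hypothetical, and `None` stands for "unknown". It contains no negation, so the open‐world query can never answer `False`, whereas a real OWL reasoner could when a negation is entailed.

```python
from typing import Optional, Set, Tuple

TripleT = Tuple[str, str, str]

facts: Set[TripleT] = {("ex:TimBL", "ex:wrote", "ex:WeavingTheWeb")}


def closed_world_ask(facts: Set[TripleT], triple: TripleT) -> bool:
    """Closed world (typical database): what is not stated is false."""
    return triple in facts


def open_world_ask(facts: Set[TripleT], triple: TripleT) -> Optional[bool]:
    """Open world (ontology): what is not stated is merely unknown."""
    return True if triple in facts else None


question = ("ex:TimBL", "ex:wrote", "ex:AnotherBook")
print(closed_world_ask(facts, question))  # False
print(open_world_ask(facts, question))    # None
```

Under the open‐world reading, another source may later add the missing statement without contradicting anything already known.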
Benefits of Semantic Web Applications
• Less coding, more meaningful data structures
• Fewer business rules
• More cross‐boundary information
• Embedded logic
Global Database
from: Tim Berners‐Lee, Weaving the Web, 1999
• "If HTML and the Web made all the online
documents look like one huge book, RDF, schema,
and inference languages will make all the data in the
world look like one huge database"