Jarrar © 2014
Mustafa Jarrar
Sina Institute, University of Birzeit
mjarrar@birzeit.edu
www.jarrar.info
Lecture Notes on WordNet
University of Birzeit, Palestine
Fall Semester, 2014
WordNet
EuroWordNet, and Global WordNet
Watch this lecture and download the slides from
http://jarrar-courses.blogspot.com/2011/11/artificial-intelligence-fall-2011.html
Reading
Everything in these slides + everything I say
[MBC93] George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross,
and Katherine Miller: Introduction to WordNet: An On-line Lexical
Database. International Journal of Lexicography, Vol. 3, Nr. 4. Pages
235-244. (1990) http://wordnetcode.princeton.edu/5papers.pdf
[GGO02] Aldo Gangemi, Nicola Guarino, Alessandro Oltramari,
Stefano Borgo: Cleaning-up WordNet's Top-Level. In Proc. of the 1st
International WordNet Conference (2002)
http://citeseer.ist.psu.edu/viewdoc/download;jsessionid=C9962DFEDD793F3F839426B774BC9BAF?doi=10.1.1.11.4064&rep=rep1&type=pdf
WordNet and Global WordNet
• Part 1: The English WordNet
• Part 2: Euro WordNet
• Part 3: Global WordNet
Lecture Keywords:
Arabic keywords (translated): thesaurus, word network, linguistic ontology, meaning, semantics,
concept, multilinguality, linguistic synonymy, polysemy, antonymy, classification of meanings,
part-whole relations
WordNet, Global WordNet, Thesaurus, Linguistic Ontology, Lexical Semantics, Semantics,
Meaning, Synset, Concept, Synonymy, Polysemy, Hyponymy, Meronymy, Antonymy
What is WordNet?
• In 1985 a group of psychologists and linguists at Princeton University
started to develop a “mental lexicon”.
• You may also call it an “electronic dictionary”, a “mental dictionary”, an English
“semantic network”, a “hyperdimensional thesaurus”, etc.
• Includes most frequent words (nouns, adjectives, adverbs, verbs).
• Organized by meaning: words in close proximity are semantically similar.
• Can be used by humans and machines.
• Human users and computers can browse WordNet and find words that
are meaningfully related to their queries.
• Available online, for downloading! http://wordnet.princeton.edu
WordNet: Synonymy
WordNet gives information about two fundamental, universal
properties of human language: polysemy and synonymy.
• English words are grouped (roughly) into sets of synonyms.
• Each set of synonyms is called a Synset; and given a unique
SynsetID to identify it.
• Each synset expresses a distinct meaning/concept.
Synsets shown in the figure, each with its gloss:
{Bureau, Dresser, Chest of Drawers}: Furniture with drawers for keeping clothes
{Table, Tabular Array}: A set of data arranged in rows and columns
{Categorization, Classification}: A group of people or things arranged…
{Contents, TableOfContents}: A list of divisions…
{Furniture, Piece of furniture, Article of furniture}: Furnishings that make a room…
{work table}: A table designed…
SynsetIDs shown in the figure: 08283156, 06501650, 07955878, 03410635, 03018908, 04615793
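The grouping of words into synsets can be sketched in a few lines of Python. This is only an illustration built from the synsets in the figure: the `Synset` class and the `synonyms` helper are hypothetical names, not part of WordNet, and the SynsetIDs are left out because the flattened figure does not show which ID belongs to which synset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    words: tuple  # word forms that express the same meaning
    gloss: str    # short definition, truncated as on the slide


# Synsets copied from the figure on this slide.
SYNSETS = [
    Synset(("bureau", "dresser", "chest of drawers"),
           "Furniture with drawers for keeping clothes"),
    Synset(("table", "tabular array"),
           "A set of data arranged in rows and columns"),
    Synset(("categorization", "classification"),
           "A group of people or things arranged..."),
    Synset(("contents", "table of contents"), "A list of divisions..."),
    Synset(("furniture", "piece of furniture", "article of furniture"),
           "Furnishings that make a room..."),
    Synset(("work table",), "A table designed..."),
]


def synonyms(word):
    """Return the words that share at least one synset with `word`."""
    word = word.lower()
    found = set()
    for synset in SYNSETS:
        if word in synset.words:
            found.update(w for w in synset.words if w != word)
    return sorted(found)


print(synonyms("Dresser"))  # ['bureau', 'chest of drawers']
```

A word that is not in any of the listed synsets simply yields an empty list.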
Exercise
List the different meanings of the words:
Table, Array, Matrix, Bureau
WordNet: Polysemy
• Each word form-meaning pair is unique.
• A word that appears in n synsets is n-fold polysemous.
• For example: “Table” here is two-fold polysemous
Synsets shown in the figure, each with its gloss:
{Periodic Table}: a tabular arrangement of the chemical elem…
{Matrix}: A rectangular array of quantities …
{Arrangement}: An orderly grouping (of things or…
{Bureau, Dresser, Chest of Drawers}: Furniture with drawers for keeping clothes
{Table, Tabular Array}: A set of data arranged in rows and columns
{Categorization, Classification}: A group of people or things arranged…
{Array}: An orderly arrangement
{Calendar}: A tabular array of the days..
{Contents, TableOfContents}: A list of divisions…
{Furniture, Piece of furniture, Article of furniture}: Furnishings that make a room…
{Table}: A piece of furniture having a smooth …
{Desk}: A piece of furniture with a writing surface…
{Booth}: A table (in a restaurant or bar) surrounded by two…
{River}: A large natural stream of ...
{Stream}: A natural body of running water…
{Nile}: The world's longest..
{work table}: A table designed…
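Counting how polysemous a word is amounts to counting the synsets it appears in. The sketch below assumes a handful of the synsets from the figure, reduced to their word forms; `polysemy_counts` is a hypothetical helper name.

```python
from collections import Counter

# A few synsets from the figure (word forms only).
# "table" occurs in two of them, so it is two-fold polysemous here.
SYNSETS = [
    {"table", "tabular array"},
    {"table"},
    {"array"},
    {"matrix"},
    {"desk"},
    {"bureau", "dresser", "chest of drawers"},
]


def polysemy_counts(synsets):
    """Map each word form to the number of synsets that contain it."""
    return Counter(word for synset in synsets for word in synset)


counts = polysemy_counts(SYNSETS)
print(counts["table"])  # 2
print(counts["desk"])   # 1
```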
WordNet: Glosses
A short gloss is provided for each synset.
Glosses are examples of contexts for many word-sense pairs, telling us
how words with specific senses are being used in context.
WordNet: Statistics
155,287 word forms, grouped into
117,659 synsets
             Word forms    Synsets
noun            117,798     82,115
verb             11,529     13,767
adjective        21,479     18,156
adverb            4,481      3,621
Total           155,287    117,659
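As a quick arithmetic check, the per-category counts in the table can be summed and compared with the totals. The snippet below only restates the numbers from the table; the ratio it prints (word forms per synset) is a derived illustration, not a figure from the slides.

```python
# (word forms, synsets) per part of speech, as listed in the table.
STATS = {
    "noun": (117_798, 82_115),
    "verb": (11_529, 13_767),
    "adjective": (21_479, 18_156),
    "adverb": (4_481, 3_621),
}

total_forms = sum(forms for forms, _ in STATS.values())
total_synsets = sum(synsets for _, synsets in STATS.values())
print(total_forms, total_synsets)  # 155287 117659

# Derived ratio: how many word forms there are per synset in each category.
for pos, (forms, synsets) in STATS.items():
    print(f"{pos:<10} {forms / synsets:.2f} word forms per synset")
```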
WordNet Semantic Relations
Synsets are interconnected with semantic relations, forming a large
semantic network (graph).
Such Relations are:
• Hyponymy, also called “Is a” relation, or sub/superordinate.
• Meronymy, also called “part of” relation
Synsets added to the figure on this slide:
{Container}: Any object that can be used ..
{Drawer}: A boxlike container in a..
{shelf}: A support that consists…
{Support}: Any device that bears..
WordNet Relations: Hyponymy
• A synset {x, x′, . . .} is a hyponym of the synset {y, y′, . . .} if native English
speakers accept sentences like "x is a (kind of) y". E.g., Table/Tabular Array
is a kind of Array, Array is a kind of Arrangement, …
• Hyponymy is transitive and asymmetrical. Because hyponymy generates a
hierarchical semantic structure, a hyponym inherits all the features of the more
generic concept and adds at least one feature that distinguishes it from its
superordinate. [2]
The WordNet hierarchy is about 16 levels deep.
Top Level Nouns (25 unique beginners):
{act, action, activity}      {natural object}
{animal, fauna}              {natural phenomenon}
{artifact}                   {person, human being}
{attribute, property}        {plant, flora}
{body, corpus}               {possession}
{cognition, knowledge}       {process}
{communication}              {quantity, amount}
{event, happening}           {relation}
{feeling, emotion}           {shape}
{food}                       {state, condition}
{group, collection}          {substance}
{location, place}            {time}
{motive}
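The transitivity and asymmetry of hyponymy can be illustrated by walking hypernym links upward. This is a minimal sketch that uses only the two links named in the text (Table/Tabular Array is a kind of Array, Array is a kind of Arrangement). It assumes each synset has a single hypernym, which real WordNet does not guarantee; `hypernym_chain` and `is_a` are hypothetical helper names.

```python
# Each synset (named here by one of its words) maps to its direct hypernym.
# Assumption: one hypernym per synset, to keep the sketch short.
HYPERNYM = {
    "table": "array",
    "array": "arrangement",
}


def hypernym_chain(synset):
    """Follow hypernym links upward until a synset without a hypernym is reached."""
    chain = []
    while synset in HYPERNYM:
        synset = HYPERNYM[synset]
        chain.append(synset)
    return chain


def is_a(specific, general):
    """True if `general` can be reached from `specific` through hypernym links."""
    return general in hypernym_chain(specific)


print(hypernym_chain("table"))       # ['array', 'arrangement']
print(is_a("table", "arrangement"))  # True: the relation is transitive
print(is_a("arrangement", "table"))  # False: the relation is asymmetrical
```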
WordNet Relations: Meronymy
• A synset {x, x′, . . .} is a meronym of the synset {y, y′, . . .} if native English
speakers accept sentences like "y has an x (as a part)" or "an x is a part of y".
E.g., Finger is part of Hand, Hand is part of Arm, Arm is part of Body.
• Meronymy is a transitive (with qualification) and asymmetrical relation, and
forms a part hierarchy.
• Synsets may have multiple hypernyms.
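The part hierarchy from the Finger/Hand/Arm/Body example can be traversed downward to collect all direct and indirect parts of a whole. The sketch treats meronymy as fully transitive, which is a simplification of the "with qualification" caveat above; `HAS_PART` and `all_parts` are hypothetical names.

```python
# Direct "has part" links from the example: whole -> list of parts.
HAS_PART = {
    "body": ["arm"],
    "arm": ["hand"],
    "hand": ["finger"],
}


def all_parts(whole):
    """Collect direct and indirect parts by walking the part hierarchy downward."""
    parts = []
    stack = list(HAS_PART.get(whole, []))
    while stack:
        part = stack.pop()
        if part not in parts:
            parts.append(part)
            stack.extend(HAS_PART.get(part, []))
    return parts


print(all_parts("body"))    # ['arm', 'hand', 'finger']
print(all_parts("finger"))  # []
```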
Exercise
Find the hyponyms and meronyms of this synset
{car, auto, automobile, machine, motorcar}
WordNet Relations: Another Example
Hyper(o)nym chain, from general to specific:
{conveyance, transport}
{vehicle}
{motor vehicle, automotive vehicle}
{car, auto, automobile, machine, motorcar}
Hyponyms shown in the figure:
{cruiser, squad car, patrol car, police car, prowl car}
{cab, taxi, hack, taxicab}
Meronyms shown in the figure:
{bumper}, {car door}, {car window}, {car mirror}, {armrest}, {doorlock}, {hinge, flexible joint}
Hyponymy and meronymy relations are:
• transitive
• directed
[1]
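Because the relations are directed, a small labelled edge list is enough to read the car example in both directions. The sketch below is one reading of the flattened figure and should be taken as an assumption: it links car to its hypernym chain, to two hyponyms, and to four parts. The synsets {armrest}, {doorlock}, and {hinge, flexible joint} are left out because the extracted text does not show which synset they attach to. The relation labels and helper names are hypothetical.

```python
# (source, relation, target) triples; each synset is named by one of its words.
EDGES = [
    ("car", "hypernym", "motor vehicle"),
    ("motor vehicle", "hypernym", "vehicle"),
    ("vehicle", "hypernym", "conveyance"),
    ("cruiser", "hypernym", "car"),
    ("cab", "hypernym", "car"),
    ("car", "has_part", "bumper"),
    ("car", "has_part", "car door"),
    ("car", "has_part", "car window"),
    ("car", "has_part", "car mirror"),
]


def targets(source, relation):
    """Follow edges in their stored direction."""
    return [t for s, r, t in EDGES if s == source and r == relation]


def sources(target, relation):
    """Follow edges against their stored direction (the inverse relation)."""
    return [s for s, r, t in EDGES if t == target and r == relation]


print(targets("car", "hypernym"))  # ['motor vehicle']
print(sources("car", "hypernym"))  # hyponyms of car: ['cruiser', 'cab']
print(targets("car", "has_part"))  # meronyms of car
```

Storing only the hypernym direction and deriving hyponyms by inversion avoids keeping two copies of the same link.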
WordNet Relations: Antonymy
• The antonym of a word x is sometimes not-x, but not always. For example, rich and poor
are antonyms, but to say that someone is not rich does not imply that they must be poor; many people
consider themselves neither rich nor poor.
• Antonymy, which seems to be a simple symmetric relation, is actually quite
complex, yet speakers of English have little difficulty recognizing antonyms when
they see them. For example, the meanings {rise, ascend} and {fall, descend} may be conceptual
opposites, but they are not antonyms; [rise/fall] are antonyms and so are [ascend/descend], but most
people hesitate and look thoughtful when asked if rise and descend, or ascend and fall, are antonyms.
• Antonymy is a lexical relation between word forms, not a semantic relation between
word meanings. Some instead call it a semantic relation between words [MBC93].
Synsets shown in the figure, each with its gloss:
{Fall, Come Down, Go Down, Descend}: Move downward and lower, but not necessarily all the way
{Set, Go down, Go Under}: (astronomy) disappear beyond the horizon
{Ascend, Come up, Rise, Uprise}: (astronomy) come up, of celestial bodies
{Ascend, Go up}: Travel up
{Rise, Uprise, Come up, Go up, Move up, Lift}: Move upward
{Ascend, Move up, Rise}: Move to a better position in life …
{Hot}: Used of physical heat; having..
{Cold}: Having a low or inadequate..
{New}: Unaffected by use or exposure
{New}: Not of long duration; having..
{Worn}: Affected by wear; damaged by …
{Young, Immature}: in an early period of life…
{Old}: having lived for a relatively
{Old}: Of long duration
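The point that antonymy holds between word forms rather than between synsets can be made concrete with a small lookup. The pairs below are the ones named in the text; the data layout and the `are_antonyms` helper are assumptions made for illustration.

```python
# Antonymy is recorded between individual word forms, not between synsets.
ANTONYM_PAIRS = {
    frozenset(("rise", "fall")),
    frozenset(("ascend", "descend")),
    frozenset(("rich", "poor")),
}

# The two meanings from the text; each word form points to its synset.
SYNSET_OF = {
    "rise": "{rise, ascend}",
    "ascend": "{rise, ascend}",
    "fall": "{fall, descend}",
    "descend": "{fall, descend}",
}


def are_antonyms(a, b):
    """Symmetric lookup: the order of the two word forms does not matter."""
    return frozenset((a, b)) in ANTONYM_PAIRS


print(are_antonyms("rise", "fall"))     # True
print(are_antonyms("fall", "rise"))     # True (symmetric)
print(are_antonyms("rise", "descend"))  # False, although the synsets are opposites
print(SYNSET_OF["rise"] == SYNSET_OF["ascend"])  # True: same meaning, different antonym
```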
WordWeb
http://wordweb.info/free/
A nice and intuitive
interface for WordNet
Other WordNet Relations
• Although the main interest of WordNet was in specifying semantic
relations, other lexical/morphological relations between word forms
were added.
• For example: stems, singular-plural, verb tenses, etc.
Why do we need WordNet?
• Word sense disambiguation,
• Information retrieval,
• Automatic text classification,
• Automatic text summarization,
• Machine translation
• ….etc.
Is WordNet a Thesaurus?
Yes:
• it groups together meaningfully related words
No:
• WN labels the relations
• The relations are limited
• Related words are linked to specific concepts (disambiguated);
a thesaurus is a “bag of words”
• Many words linked in WordNet do not co-occur in the same
thesaurus entry
• WordNet allows one to measure and quantify the semantic
similarity or distance among words and concepts
[Fellbaum]
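One simple way to quantify semantic distance, sketched here as an assumption rather than as the measure WordNet itself prescribes, is to count the hypernym links on the shortest path between two synsets through a common ancestor. The toy hierarchy uses the Table/Array/Arrangement links from the hyponymy slide; the link from Matrix to Array is an added assumption based on the gloss "a rectangular array of quantities".

```python
# Toy hierarchy: synset -> direct hypernym (one hypernym each, for brevity).
HYPERNYM = {
    "table": "array",
    "matrix": "array",        # assumption, based on the gloss of {Matrix}
    "array": "arrangement",
}


def ancestors(synset):
    """Return {synset or ancestor: number of hypernym steps from `synset`}."""
    steps = {synset: 0}
    depth = 0
    while synset in HYPERNYM:
        synset = HYPERNYM[synset]
        depth += 1
        steps[synset] = depth
    return steps


def path_distance(a, b):
    """Shortest number of links between a and b via a shared ancestor, or None."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    return min(up_a[c] + up_b[c] for c in common)


print(path_distance("table", "matrix"))       # 2
print(path_distance("table", "arrangement"))  # 2
print(path_distance("table", "table"))        # 0
```

A smaller distance is read as greater similarity; synsets without any common ancestor in the toy hierarchy yield `None`.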
Is WordNet an Ontology?
Meaning (called Ontological Precision):
   WordNet: based on what native speakers roughly agree on.
   Ontology: based on scientific and philosophical findings.
Classification:
   WordNet: based on what native speakers roughly agree on (Student IsA Person).
   Ontology: based on strict formal methodologies (Student IsA Role).
Formal Specification:
   WordNet: logically vague.
   Ontology: strictly formal.
→ I like to use WordNet as a linguistic ontology, though it needs lots of cleaning!
→ Linguistic ontologies are difficult to build, but they are immune to changes.
Part 2: Euro WordNet
EURO WordNet
• The development of a multilingual database with WordNets for several
European languages.
• Funded by the European Commission, DG XIII, LE2-4003 and LE4-8328
• March 1996 - September 1999 (2.5 Million EURO)
http://www.hum.uva.nl/~ewn
http://www.illc.uva.nl/EuroWordNet/finalresults-ewn.html
• Languages covered:
   EuroWordNet-1 (LE2-4003): English, Dutch, Spanish, Italian.
   EuroWordNet-2 (LE4-8328): German, French, Czech, Estonian.
• Size of vocabulary:
   EuroWordNet-1: 30,000 concepts, 50,000 word meanings.
   EuroWordNet-2: 15,000 concepts, 25,000 word meanings.
• Type of vocabulary:
   the most frequent words of the languages;
   all concepts needed to relate more specific concepts.
[1]
EURO WordNet Model
[Figure: the EuroWordNet architecture. Link types: I = language-independent link;
II = link from language-specific wordnet to the Inter-Lingual-Index; III = language-dependent link.]
• Inter-Lingual-Index: ILI-record {drive}
• Ontology: 2OrderEntity, Location, Dynamic
• Domains: Traffic (Air, Road)
• Lexical Items Tables, one per language:
   Italian: cavalcare, andare, muoversi, guidare
   Dutch: bewegen, gaan, rijden, berijden
   English: drive, ride, move, go
   Spanish: cabalgar, jinetear, conducir, mover, transitar
[1]
The Multilingual Design
• Inter-Lingual-Index: unstructured fund of concepts to provide an
efficient mapping across the languages;
• Index-records are mainly based on WordNet synsets and consist of
synonyms, glosses and source references;
• Various types of complex equivalence relations are distinguished;
• Equivalence relations from synsets to index records: not on a word-to-
word basis;
• Indirect matching of synsets linked to the same index items;
[1]
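The indirect matching described above can be sketched as a lookup through shared index records: two synsets from different languages are treated as equivalent when both are linked to the same Inter-Lingual-Index record. Everything concrete in the snippet is an assumption for illustration: the record name `ili:drive`, the language codes, and the choice of which words from the EuroWordNet model figure are linked to the {drive} record.

```python
# (language, synset) -> Inter-Lingual-Index record.
# Assumed links; the flattened figure does not show them explicitly.
ILI_LINKS = {
    ("en", "drive"): "ili:drive",
    ("it", "guidare"): "ili:drive",
    ("nl", "rijden"): "ili:drive",
    ("es", "conducir"): "ili:drive",
}


def equivalents(language, synset):
    """Find synsets in other entries that point to the same index record."""
    record = ILI_LINKS.get((language, synset))
    if record is None:
        return []
    return sorted(
        key for key, value in ILI_LINKS.items()
        if value == record and key != (language, synset)
    )


print(equivalents("it", "guidare"))
# [('en', 'drive'), ('es', 'conducir'), ('nl', 'rijden')]
```

No language pair is linked directly; every match goes through the index record, which is why the mapping is not word-to-word.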
EURO WordNet Model
• WordNets are unique language-specific structures:
   • same organizational principles: synset structure and same set of
     semantic relations;
   • different lexicalizations;
   • differences in synonymy and homonymy:
     "decoration" in English versus "versiersel/versiering" in Dutch
     "bank" in English (money/river) versus "bank" in Dutch
     (money/furniture)
• BUT also different relations for similar synsets
[1]
Some Downsides of the EuroWordNet Model
• Construction is not done uniformly
• Coverage differs
• Not all wordnets can communicate with one another, i.e., they are linked
to different versions of the English WordNet
• Proprietary rights restrict free access and usage
• A lot of semantics is duplicated
• Complex and obscure equivalence relations due to linguistic
differences between English and other languages
[1]
Part 3: Global WordNet
From EuroWordNet to Global WordNet
• EuroWordNet ended in 1999
• Global Wordnet Association was founded in 2000 to maintain the
framework: http://www.globalwordnet.org
• Currently, wordnets exist for more than 50 languages, including:
Arabic, Bantu, Basque, Chinese, Bulgarian, Estonian, Hebrew, Icelandic,
Japanese, Kannada, Korean, Latvian, Nepali, Persian, Romanian, Sanskrit,
Tamil, Thai, Turkish, Zulu...
• Many languages are genetically and typologically unrelated
→ The Arabic WordNet extension was not successful; this will be explained
later.
[1]
Global WordNet Model
• Construct separate wordnets for each language
• Contributors from each language encode the same core set of concepts
plus culture/language-specific ones
• Synsets (concepts) are mapped cross-linguistically via an ontology
instead of just the English WordNet
[1]
Discussion
What would be a good database schema to store WordNet? Global
WordNet?
What is the difference between a Synset and a Concept?
How precise is Hyponymy? And what is the difference between
Hyponymy and subclass/subset?
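As one possible starting point for the schema question, and not a definitive answer, the sketch below stores synsets, word senses, and labelled relations in three tables using Python's built-in `sqlite3` module. The table and column names and the IDs `s1` and `s2` are hypothetical; the sample rows come from the Table/Array example used earlier in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE synset (
    synset_id TEXT PRIMARY KEY,
    gloss     TEXT NOT NULL
);
CREATE TABLE word_sense (
    word      TEXT NOT NULL,
    synset_id TEXT NOT NULL REFERENCES synset(synset_id),
    PRIMARY KEY (word, synset_id)   -- each word form-meaning pair is unique
);
CREATE TABLE relation (
    source_id TEXT NOT NULL REFERENCES synset(synset_id),
    kind      TEXT NOT NULL,        -- e.g. 'hypernym' or 'part_of'
    target_id TEXT NOT NULL REFERENCES synset(synset_id),
    PRIMARY KEY (source_id, kind, target_id)
);
""")

# Hypothetical IDs s1 and s2 stand in for real SynsetIDs.
conn.executemany("INSERT INTO synset VALUES (?, ?)", [
    ("s1", "A set of data arranged in rows and columns"),
    ("s2", "An orderly arrangement"),
])
conn.executemany("INSERT INTO word_sense VALUES (?, ?)", [
    ("table", "s1"), ("tabular array", "s1"), ("array", "s2"),
])
conn.execute("INSERT INTO relation VALUES ('s1', 'hypernym', 's2')")

# Words in the hypernym synsets of every sense of "table".
rows = conn.execute("""
    SELECT w2.word
    FROM word_sense AS w1
    JOIN relation   AS r  ON r.source_id = w1.synset_id AND r.kind = 'hypernym'
    JOIN word_sense AS w2 ON w2.synset_id = r.target_id
    WHERE w1.word = ?
""", ("table",)).fetchall()
print(rows)  # [('array',)]
```

For a multilingual wordnet, one option would be to add a language column to `word_sense`; whether synsets themselves should be shared across languages is exactly what the second discussion question asks.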
References
[1] Piek Vossen: Lecture Notes on The Global Wordnet Grid: anchoring languages to universal
meaning
http://www.authorstream.com/Presentation/Stentore-40555-WN-EWN-GWA-Koszalin-Global-Wordnet-Grid-anchoring-languages-universal-meaning-kosz-Entertainment-ppt-powerpoint/
[2] Lyons, John. Semantics. Vol. 1. Cambridge: Cambridge UP, 1977. Print.

08448380779 Call Girls In Friends Colony Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Jarrar: WordNet And Global WordNets

  • 1. Lecture Notes on WordNet: WordNet, EuroWordNet, and Global WordNet
      Mustafa Jarrar, Sina Institute, University of Birzeit
      mjarrar@birzeit.edu, www.jarrar.info
      University of Birzeit, Palestine, Fall Semester, 2014
      Jarrar © 2014
  • 2. Watch this lecture and download the slides from http://jarrar-courses.blogspot.com/2011/11/artificial-intelligence-fall-2011.html
  • 3. Reading
      Everything in these slides + everything I say
      [MBC93] George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller: Introduction to WordNet: An On-line Lexical Database. International Journal of Lexicography, Vol. 3, Nr. 4, pages 235-244 (1990). http://wordnetcode.princeton.edu/5papers.pdf
      [GGO02] Aldo Gangemi, Nicola Guarino, Alessandro Oltramari, Ro Oltramari, Stefano Borgo: Cleaning-up WordNet's Top-Level. In Proc. of the 1st International WordNet Conference (2002). http://citeseer.ist.psu.edu/viewdoc/download;jsessionid=C9962DFEDD793F3F839426B774BC9BAF?doi=10.1.1.11.4064&rep=rep1&type=pdf
  • 4. WordNet and Global WordNet
      • Part 1: The English WordNet
      • Part 2: Euro WordNet
      • Part 3: Global WordNet
      Lecture Keywords (translated from Arabic): thesaurus, lexical network, linguistic ontology, meaning, semantics, concept, multilinguality, synonymy, polysemy, antonymy, classification of meanings, part-whole relations
      Lecture Keywords: WordNet, Global WordNet, Thesaurus, Linguistic Ontology, Lexical Semantics, Semantics, Meaning, Synset, Concept, Synonymy, Polysemy, Hyponymy, Meronymy, Antonymy
  • 5. What is WordNet?
      • In 1985 a group of psychologists and linguists at Princeton University started to develop a “mental lexicon”.
      • You may also call it an “electronic dictionary”, a “mental dictionary”, an English “semantic network”, a hyperdimensional thesaurus, etc.
      • Includes the most frequent words (nouns, adjectives, adverbs, verbs).
      • Organized by meaning: words in close proximity are semantically similar.
      • Can be used by humans and machines: both can browse WordNet and find words that are meaningfully related to their queries.
      • Available online, for downloading! http://wordnet.princeton.edu
  • 6. WordNet: Synonymy
      WordNet gives information about two fundamental, universal properties of human language: polysemy and synonymy.
      • English words are grouped (roughly) into sets of synonyms.
      • Each set of synonyms is called a Synset, and is given a unique SynsetID to identify it.
      • Each synset expresses a distinct meaning/concept.
      Figure (synsets and their glosses):
        {Bureau, Dresser, Chest of Drawers} Furniture with drawers for keeping clothes
        {Table, Tabular Array} A set of data arranged in rows and columns
        {Categorization, Classification} A group of people or things arranged…
        {Contents, TableOfContents} A list of divisions…
        {Furniture, Piece of furniture, Article of furniture} Furnishings that make a room….
        {work table} A table designed…
      SynsetIDs shown in the figure: 08283156, 06501650, 07955878, 03410635, 03018908, 04615793
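The grouping of words into synsets can be sketched as a small data structure. This is only an illustration under stated assumptions: the keys `s1` to `s4` are placeholder identifiers (not real SynsetIDs, whose pairing with the synsets is not recoverable from the slide), and `Synset` and `synonyms` are hypothetical names, not part of any WordNet API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    synset_id: str  # placeholder key, NOT a real WordNet SynsetID
    words: tuple    # word forms that share this meaning
    gloss: str      # gloss as shown on the slide (truncated there)


# A few synsets copied from the slide.
SYNSETS = [
    Synset("s1", ("bureau", "dresser", "chest of drawers"),
           "Furniture with drawers for keeping clothes"),
    Synset("s2", ("table", "tabular array"),
           "A set of data arranged in rows and columns"),
    Synset("s3", ("categorization", "classification"),
           "A group of people or things arranged..."),
    Synset("s4", ("work table",), "A table designed..."),
]


def synonyms(word):
    """Return every other word form that shares a synset with `word`."""
    word = word.lower()
    found = set()
    for synset in SYNSETS:
        if word in synset.words:
            found.update(w for w in synset.words if w != word)
    return found
```

For instance, `synonyms("Dresser")` collects the other members of the first synset, while a word that occurs in no synset yields an empty set.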
  • 7. Exercise
      List the different meanings of the words: Table, Array, Matrix, Bureau
  • 8. WordNet: Polysemy
      • Each word form-meaning pair is unique.
      • A word that appears in n synsets is n-fold polysemous.
      • For example, “Table” here is two-fold polysemous.
      Figure (synsets and their glosses):
        {Periodic Table} a tabular arrangement of the chemical elem…
        {Matrix} A rectangular array of quantities …
        {Arrangement} An orderly grouping (of things or…
        {Bureau, Dresser, Chest of Drawers} Furniture with drawers for keeping clothes
        {Table, Tabular Array} A set of data arranged in rows and columns
        {Categorization, Classification} A group of people or things arranged…
        {Array} An orderly arrangement
        {Calendar} A tabular array of the days..
        {Contents, TableOfContents} A list of divisions…
        {Furniture, Piece of furniture, Article of furniture} Furnishings that make a room….
        {Table} A piece of furniture having a smooth …
        {Desk} A piece of furniture with a writing surface…
        {Booth} A table (in a restaurant or bar) surrounded by two…
        {River} A large natural stream of ...
        {Stream} A natural body of running water…
        {Nile} The world's longest..
        {work table} A table designed…
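Counting how many synsets a word appears in gives its degree of polysemy. The sketch below assumes a plain list of (words, gloss) pairs copied from the figure; `build_index` and `polysemy` are hypothetical helper names chosen for illustration.

```python
from collections import defaultdict

# (words, gloss) pairs copied from the slide; glosses are truncated there.
SYNSETS = [
    (("table", "tabular array"), "A set of data arranged in rows and columns"),
    (("table",), "A piece of furniture having a smooth ..."),
    (("array",), "An orderly arrangement"),
    (("matrix",), "A rectangular array of quantities ..."),
    (("bureau", "dresser", "chest of drawers"),
     "Furniture with drawers for keeping clothes"),
]


def build_index(synsets):
    """Map each word form to the positions of the synsets that contain it."""
    index = defaultdict(list)
    for number, (words, _gloss) in enumerate(synsets):
        for word in words:
            index[word].append(number)
    return index


INDEX = build_index(SYNSETS)


def polysemy(word):
    """A word that appears in n synsets is n-fold polysemous."""
    return len(INDEX.get(word.lower(), []))
```

With this data, `polysemy("Table")` is 2, matching the slide's statement that “Table” is two-fold polysemous here.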
  • 9. WordNet: Glosses
      A short gloss is provided for each synset. Glosses are examples of contexts for many word-sense pairs, telling us how words with specific senses are being used in context.
      Figure: the same synset network as on the Polysemy slide, each synset shown with its gloss.
  • 10. WordNet: Statistics
      155,287 word forms, grouped into 117,659 synsets
      Figure: the same synset network as on the Polysemy slide.

                   WordForms    Synsets
      noun           117,798     82,115
      verb            11,529     13,767
      adjective       21,479     18,156
      adverb           4,481      3,621
      Total          155,287    117,659
  • 11. WordNet Semantic Relations
      Synsets are interconnected with semantic relations, forming a large semantic network (graph). Such relations are:
      • Hyponymy, also called the “is a” relation, or sub/superordinate.
      • Meronymy, also called the “part of” relation.
      Figure: the synset network from the Polysemy slide, extended with:
        {Container} Any object that can be used ..
        {Drawer} A boxlike container in a..
        {shelf} A support that consists…
        {Support} Any device that bears..
  • 12. WordNet Relations: Hyponymy
      • A synset {x, x′, . . .} is a hyponym of the synset {y, y′, . . .} if native English speakers accept sentences like “x is a (kind of) y”. E.g., Table/Tabular Array is a kind of Array, Array is a kind of Arrangement, …
      • Hyponymy is transitive and asymmetrical, so it generates a hierarchical semantic structure: a hyponym inherits all the features of the more generic concept and adds at least one feature that distinguishes it from its superordinate.
      Figure: the same synset network as on the Polysemy slide.
  • 13. WordNet Relations: Hyponymy
      • A synset {x, x′, . . .} is a hyponym of the synset {y, y′, . . .} if native English speakers accept sentences like “x is a (kind of) y”. E.g., Table/Tabular Array is a kind of Array, Array is a kind of Arrangement, …
      • Hyponymy is transitive and asymmetrical, so it generates a hierarchical semantic structure: a hyponym inherits all the features of the more generic concept and adds at least one feature that distinguishes it from its superordinate. [2]
      • The WordNet hierarchy is about 16 levels deep.
      Top Level Nouns (25 unique beginners):
        {act, action, activity}       {natural object}
        {animal, fauna}               {natural phenomenon}
        {artifact}                    {person, human being}
        {attribute, property}         {plant, flora}
        {body, corpus}                {possession}
        {cognition, knowledge}        {process}
        {communication}               {quantity, amount}
        {event, happening}            {relation}
        {feeling, emotion}            {shape}
        {food}                        {state, condition}
        {group, collection}           {substance}
        {location, place}             {time}
        {motive}
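Transitivity and asymmetry of hyponymy can be illustrated by walking up the "is a kind of" links from the slide's example. This is a minimal sketch that assumes each synset has a single direct hypernym (a simplification; real synsets may have several), with hypothetical names `HYPERNYM`, `hypernym_chain` and `is_a`.

```python
# Direct "is a kind of" links taken from the slide's example.
# Simplifying assumption: one direct hypernym per synset.
HYPERNYM = {
    "table/tabular array": "array",
    "array": "arrangement",
}


def hypernym_chain(synset):
    """Follow hypernym links upward and return every ancestor in order."""
    chain = []
    while synset in HYPERNYM:
        synset = HYPERNYM[synset]
        chain.append(synset)
    return chain


def is_a(x, y):
    """True if y is reachable from x through hypernym links (transitivity)."""
    return y in hypernym_chain(x)
```

Because the links only point upward, `is_a("table/tabular array", "arrangement")` holds while the reverse does not, which is the asymmetry the slide describes.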
  • 14. WordNet Relations: Meronymy
      • A synset {x, x′, . . .} is a meronym of the synset {y, y′, . . .} if native English speakers accept sentences like “y has an x (as a part)” or “an x is a part of y”. E.g., Finger is part of Hand, Hand is part of Arm, Arm is part of Body.
      • Meronymy is a transitive (with qualification) and asymmetrical relation, and forms a part hierarchy.
      • Synsets may have multiple hypernyms.
      Figure: the synset network from the Polysemy slide, extended with {Container}, {Drawer}, {shelf} and {Support}.
  • 15. Exercise
      Find the hyponyms and meronyms of this synset: {car, auto, automobile, machine, motorcar}
  • 16. WordNet Relations: Another Example
      Figure: the synset {car, auto, automobile, machine, motorcar} with edges labelled hyper(o)nym, hyponym and meronyms, connecting it to:
        {conveyance, transport}
        {vehicle}
        {motor vehicle, automotive vehicle}
        {cruiser, squad car, patrol car, police car, prowl car}
        {cab, taxi, hack, taxicab}
        {bumper}
        {car door}
        {car window}
        {car mirror}
        {armrest}
        {doorlock}
        {hinge, flexible joint}
      Hyponymy and meronymy relations are:
      • transitive
      • directed
      [1]
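The car example can be sketched as two directed relations over short synset labels. The exact arrangement of the edges is an assumption: the flattened figure does not preserve its arrows, so the layout below (hypernym chain above car, two hyponyms below it, parts of car and of car door) is one plausible reading. The names `HYPERNYM_OF`, `PARTS_OF`, `hyponyms` and `all_parts` are hypothetical.

```python
# child -> parent ("is a kind of"); edge layout is an assumed reading
# of the flattened figure.
HYPERNYM_OF = {
    "vehicle": "conveyance",
    "motor vehicle": "vehicle",
    "car": "motor vehicle",
    "police car": "car",
    "taxi": "car",
}

# whole -> direct parts ("has as a part"); also an assumed reading.
PARTS_OF = {
    "car": ["bumper", "car door", "car window", "car mirror"],
    "car door": ["armrest", "doorlock", "hinge"],
}


def hyponyms(synset):
    """Direct hyponyms: every synset whose hypernym is `synset`."""
    return sorted(child for child, parent in HYPERNYM_OF.items()
                  if parent == synset)


def all_parts(synset):
    """Direct and indirect parts, treating meronymy as fully transitive."""
    parts = []
    for part in PARTS_OF.get(synset, []):
        parts.append(part)
        parts.extend(all_parts(part))
    return parts
```

Treating meronymy as unconditionally transitive is a simplification; the lecture notes that it is transitive only "with qualification".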
  • 17. WordNet Relations: Antonymy
      • The antonym of a word x is sometimes not-x, but not always. For example, rich and poor are antonyms, but to say that someone is not rich does not imply that they must be poor; many people consider themselves neither rich nor poor.
      • Antonymy, which seems to be a simple symmetric relation, is actually quite complex, yet speakers of English have little difficulty recognizing antonyms when they see them. For example, the meanings {rise, ascend} and {fall, descend} may be conceptual opposites, but they are not antonyms; [rise/fall] are antonyms and so are [ascend/descend], but most people hesitate and look thoughtful when asked if rise and descend, or ascend and fall, are antonyms.
      • Antonymy is a lexical relation between word forms, not a semantic relation between word meanings. Some instead call it a semantic relation between words [MBC93].
      Figure (synsets and their glosses):
        {Fall, Come Down, Go Down, Descend} Move downward and lower, but not necessarily all the way
        {Set, Go down, Go Under} (astronomy) disappear beyond the horizon
        {Ascend, Come up, Rise, Uprise} (astronomy) come up, of celestial bodies
        {Ascend, Go up} Travel up
        {Rise, Uprise, Come up, Go up, Move up, Lift} Move upward
        {Ascend, Move up, Rise} Move to a better position in life …
        {Hot} Used of physical heat; having..
        {Cold} Having a low or inadequate..
        {New} Unaffected by use or exposure
        {New} Not of long duration; having..
        {Worn} Affected by wear; damaged by …
        {Young, Immature} in an early period of life…
        {Old} having lived for a relatively
        {Old} Of long duration
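The distinction between antonymy (a relation between word forms) and conceptual opposition (a relation between meanings) can be made concrete with a small sketch. The labels `"up"` and `"down"` are hypothetical stand-ins for the meanings {rise, ascend} and {fall, descend}, and the function names are illustrative only.

```python
# Antonymy is stored between word forms, as unordered pairs (it is symmetric).
ANTONYM_PAIRS = {
    frozenset(("rise", "fall")),
    frozenset(("ascend", "descend")),
}

# Hypothetical meaning labels standing in for {rise, ascend} and {fall, descend}.
MEANING_OF = {"rise": "up", "ascend": "up", "fall": "down", "descend": "down"}
OPPOSITE_MEANING = {"up": "down", "down": "up"}


def are_antonyms(a, b):
    """Lexical relation: holds only for the listed word-form pairs."""
    return frozenset((a, b)) in ANTONYM_PAIRS


def are_conceptual_opposites(a, b):
    """Semantic relation: the meanings of the two words are opposed."""
    meaning_a = MEANING_OF.get(a)
    meaning_b = MEANING_OF.get(b)
    return meaning_a is not None and OPPOSITE_MEANING.get(meaning_a) == meaning_b
```

Here rise and descend come out as conceptual opposites but not as antonyms, mirroring the example on the slide.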
  • 18. WordWeb
      http://wordweb.info/free/
      A nice and intuitive interface for WordNet
  • 19. Other WordNet Relations
      • Although the main interest of WordNet was in specifying semantic relations, other lexical/morphological relations between word forms were added.
      • For example: stems, singular-plural, verb tenses, etc.
  • 20. Why do we need WordNet?
      • Word sense disambiguation
      • Information retrieval
      • Automatic text classification
      • Automatic text summarization
      • Machine translation
      • etc.
  • 21. Is WordNet a Thesaurus?
      Yes:
      • It groups together meaningfully related words.
      No:
      • WN labels the relations.
      • The relations are limited.
      • Related words are linked to specific concepts (disambiguated); a thesaurus is a “bag of words”.
      • Many words linked in WordNet do not co-occur in the same thesaurus entry.
      • WordNet allows one to measure and quantify the semantic similarity or distance among words and concepts. [Fellbaum]
  • 22. Is WordNet an Ontology?
      Meaning (called Ontological Precision):
        WordNet: based on what native speakers roughly agree on
        Ontology: based on scientific and philosophical findings
      Classification:
        WordNet: based on what native speakers roughly agree on (Student IsA Person)
        Ontology: based on strict formal methodologies (Student IsA Role)
      Formal Specification:
        WordNet: logically vague
        Ontology: strictly formal
      • I like to use WordNet as a linguistic ontology, though it needs lots of cleaning!
      • Linguistic ontologies are difficult to build but they are immune to changes.
  • 23. WordNet and Global WordNet
      • Part 1: The English WordNet
      • Part 2: Euro WordNet
      • Part 3: Global WordNet
  • 24. EURO WordNet
      • The development of a multilingual database with WordNets for several European languages.
      • Funded by the European Commission, DG XIII, LE2-4003 and LE4-8328.
      • March 1996 - September 1999 (2.5 million EURO).
        http://www.hum.uva.nl/~ewn
        http://www.illc.uva.nl/EuroWordNet/finalresults-ewn.html
      • Languages covered:
        EuroWordNet-1 (LE2-4003): English, Dutch, Spanish, Italian
        EuroWordNet-2 (LE4-8328): German, French, Czech, Estonian
      • Size of vocabulary:
        EuroWordNet-1: 30,000 concepts - 50,000 word meanings
        EuroWordNet-2: 15,000 concepts - 25,000 word meanings
      • Type of vocabulary: the most frequent words of the languages; all concepts needed to relate more specific concepts.
      [1]
  • 25. EURO WordNet Model
      Figure: four language-specific wordnets linked through a central Inter-Lingual-Index. [1]
        Link types: I = Language Independent link; II = Link from Language Specific to Inter-Lingual Index; III = Language Dependent Link
        Inter-Lingual-Index: ILI-record {drive}
        Ontology: 2OrderEntity, Location, Dynamic
        Domains: Traffic, Air, Road
        Lexical Items Table (Italian): cavalcare, andare, muoversi, guidare
        Lexical Items Table (Dutch): bewegen, gaan, rijden, berijden
        Lexical Items Table (English): drive, ride, move, go
        Lexical Items Table (Spanish): cabalgar, jinetear, conducir, mover, transitar
  • 26. The Multilingual Design
      • Inter-Lingual-Index: an unstructured fund of concepts that provides an efficient mapping across the languages.
      • Index records are mainly based on WordNet synsets and consist of synonyms, glosses and source references.
      • Various types of complex equivalence relations are distinguished.
      • Equivalence relations run from synsets to index records, not on a word-to-word basis.
      • Indirect matching of synsets linked to the same index items.
      [1]
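Indirect matching through a shared index record can be sketched as a lookup table from language-specific synsets to index records. Everything here is illustrative: the record names `ili:drive` and `ili:ride`, the language codes, and the particular links are assumptions for the example (only the words themselves and the {drive} record appear in the lecture's figure), and `equivalents` is a hypothetical helper.

```python
# (language, synset) -> index record. The links are illustrative assumptions,
# not data taken from EuroWordNet.
EQUIVALENT_TO = {
    ("en", "drive"): "ili:drive",
    ("nl", "rijden"): "ili:drive",
    ("it", "guidare"): "ili:drive",
    ("es", "conducir"): "ili:drive",
    ("en", "ride"): "ili:ride",       # hypothetical second record
    ("it", "cavalcare"): "ili:ride",
}


def equivalents(lang, synset, target_lang):
    """Synsets in target_lang linked to the same index record as (lang, synset)."""
    record = EQUIVALENT_TO.get((lang, synset))
    if record is None:
        return []
    return sorted(
        other_synset
        for (other_lang, other_synset), other_record in EQUIVALENT_TO.items()
        if other_record == record and other_lang == target_lang
    )
```

No language pair is linked directly: `equivalents("nl", "rijden", "es")` finds the Spanish synset only because both point at the same record, which is why adding a language requires links to the index rather than to every other wordnet.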
  • 27. EURO WordNet Model
      • WordNets are unique language-specific structures:
        • same organizational principles: synset structure and the same set of semantic relations
        • different lexicalizations
        • differences in synonymy and homonymy:
          "decoration" in English versus "versiersel/versiering" in Dutch
          "bank" in English (money/river) versus "bank" in Dutch (money/furniture)
      • BUT also different relations for similar synsets
      [1]
  • 28. Some Downsides of the EuroWordNet Model
      • Construction is not done uniformly.
      • Coverage differs.
      • Not all wordnets can communicate with one another, i.e. they are linked to different versions of the English WordNet.
      • Proprietary rights restrict free access and usage.
      • A lot of semantics is duplicated.
      • Complex and obscure equivalence relations due to linguistic differences between English and other languages.
      [1]
  • 29. WordNet and Global WordNet
      • Part 1: The English WordNet
      • Part 2: Euro WordNet
      • Part 3: Global WordNet
  • 31. From EuroWordNet to Global WordNet
      • EuroWordNet ended in 1999.
      • The Global Wordnet Association was founded in 2000 to maintain the framework: http://www.globalwordnet.org
      • Currently, wordnets exist for more than 50 languages, including: Arabic, Bantu, Basque, Chinese, Bulgarian, Estonian, Hebrew, Icelandic, Japanese, Kannada, Korean, Latvian, Nepali, Persian, Romanian, Sanskrit, Tamil, Thai, Turkish, Zulu...
      • Many of these languages are genetically and typologically unrelated.
      • The Arabic WordNet extension was not successful; this will be explained later.
      [1]
  • 32. Global WordNet Model
      • Construct separate wordnets for each language.
      • Contributors from each language encode the same core set of concepts plus culture/language-specific ones.
      • Synsets (concepts) are mapped cross-linguistically via an ontology instead of just the English WordNet.
      [1]
  • 33. Discussion
      • What would be a good database schema to store WordNet? Global WordNet?
      • What is the difference between a Synset and a Concept?
      • How precise is Hyponymy? And what is the difference between Hyponymy and subclass/subset?
  • 34. References
      [1] Piek Vossen: Lecture Notes on The Global Wordnet Grid: anchoring languages to universal meaning. http://www.authorstream.com/Presentation/Stentore-40555-WN-EWN-GWA-Koszalin-Global-Wordnet-Grid-anchoring-languages-universal-meaning-kosz-Entertainment-ppt-powerpoint/
      [2] Lyons, John. Semantics. Vol. 1. Cambridge: Cambridge UP, 1977.