Semantic annotation is fundamental for dealing with large-scale
lexical information: it maps the information to an enumerable set of
categories over which rules and algorithms can be applied. Foundational
ontology classes can be used as a formal set of categories for
such tasks. A previous alignment between WordNet noun synsets and
DOLCE provided a starting point for ontology-based annotation, but in
NLP tasks verbs are also of substantial importance. This work presents
an extension to the WordNet-DOLCE noun mapping, aligning verbs according
to their links to nouns denoting perdurants, transferring to the
verb the DOLCE class assigned to the noun that best represents that
verb’s occurrence. To evaluate the usefulness of this resource, we implemented
a foundational ontology-based semantic annotation framework
that assigns a high-level foundational category to each word or phrase
in a text, and compared it to a similar annotation tool, obtaining an
increase of 9.05% in accuracy.
Word Tagging with Foundational Ontology Classes
1. NLP & Semantic Computing Group
N L P
Word Tagging with Foundational
Ontology Classes: Extending the
WordNet-DOLCE Mapping to Verbs
Vivian S. Silva
André Freitas
Siegfried Handschuh
2. NLP & Semantic Computing Group
Introduction
• NLP applications such as Question Answering
and Text Entailment require complex inferences
involving large commonsense knowledge bases
Need to map the words to an enumerable set of
categories, reducing the reasoning search space
• Rules and algorithms can be applied to these categories
3. NLP & Semantic Computing Group
Introduction
Linguistic resources, such as WordNet, can
serve as a “bridge” between natural language
text and higher level semantic representations
Foundational ontologies, which are sets of high
level formal categories, can provide a suitable
semantic representation
Map WordNet to a
foundational ontology
to enable FO based
word tagging
4. NLP & Semantic Computing Group
Why Foundational Ontologies?
Data: “John is Mary’s son”; “Mary gave birth”
Representation: son (John, Mary); performs (Mary, give birth)
Reasoning: son (x,y) -> mother (y,x), hence mother (Mary, John)
Foundational ontologies are intended to represent the
world in the way people perceive it, classifying entities
into categories that are familiar to people’s common sense
• can represent data in a formal way
• can reason over data through rules and restrictions
5. NLP & Semantic Computing Group
Practical Application
Text Entailment Task
Assumption: Mary is a mother
Hypothesis: Mary gave birth
Support definition (e.g. from WN): “a mother is a woman who has given birth”
Foundational Ontology Mapping
Commonsense concepts: Mary, mother, give birth
Foundational classes: agent, role, action
Rule
(agent plays role) and (role performs action) -> (agent performs action)
Applying the rule
(Mary plays mother) and (mother performs give birth) ->
(Mary performs give birth)
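The rule application on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the triple encoding and the names `facts` and `apply_role_rule` are hypothetical and are not taken from the authors' framework.

```python
# Toy fact base; the two triples mirror the slide's example.
facts = {
    ("Mary", "plays", "mother"),
    ("mother", "performs", "give birth"),
}


def apply_role_rule(facts):
    """(agent plays role) and (role performs action) -> (agent performs action)."""
    inferred = set()
    for agent, pred1, role in facts:
        if pred1 != "plays":
            continue
        for subject, pred2, action in facts:
            # The role played by the agent must be the subject of "performs".
            if pred2 == "performs" and subject == role:
                inferred.add((agent, "performs", action))
    return inferred


# The hypothesis "Mary gave birth" holds if the rule derives this triple.
entailed = ("Mary", "performs", "give birth") in apply_role_rule(facts)
```

Because the rule is stated over the foundational classes (agent, role, action) rather than over the words themselves, the same function would apply to any other agent/role/action triples.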
6. NLP & Semantic Computing Group
DOLCE-WordNet Alignment
• Sweetening WordNet with DOLCE (Gangemi et
al., 2003)
DOLCE: oriented towards language and cognition
813 noun synsets mapped to 50 DOLCE classes
No verb synsets mapped
• Proposal:
Expand the noun alignment to the whole
taxonomy
Also map the verb synsets to DOLCE, using their
links to already mapped noun synsets
7. NLP & Semantic Computing Group
Verb Alignment Methodology
Update and Expansion of Noun Alignment
Mappings updated from WordNet version 1.6 to 3.0: 809 synsets;
alignment expanded through the taxonomy using the hypernym
links: 80,897 mapped synsets (98.5% of the noun database)
Top-Level Verb Selection
Verb classification performed over the top-level synsets: 560
synsets
Direct Links
Derivationally related form lexical link retrieved for each of the 560
synsets; results manually filtered to identify the noun that best
represents the verb’s occurrence.
Examples: run - running; appear - apparition; leak - leakage
Indirect Links
When no direct links were found, indirect paths were searched,
using the antonym and verb group links.
Example: ignore [antonym of] know – knowingness
Manual Assignment
For verbs with no explicit direct or indirect link to a noun, implicit
relationships given by the words in their gloss were then identified.
Example: overarch (“form an arch over”) - arch (“form an arch or
curve”)
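The three-step cascade (direct link, indirect link, manual assignment) can be sketched as below. This is a minimal illustration, not the authors' implementation: the dictionaries are a hypothetical toy lexicon filled only with the deck's own alignment examples (breathe_1, degrade_1, take orders_2), the manual filtering step is taken as already done, and climbing hypernyms during the indirect step is an assumption based on the degrade_1 example, which passes through an inherited hypernym.

```python
# Toy lexicon built from the deck's alignment examples; a real run would
# read these links from WordNet. All dictionary names are hypothetical.
NOUN_TO_DOLCE = {"breathing_1": "process", "change_1": "event", "ordinance_3": "event"}
DERIVATIONALLY_RELATED = {"breathe_1": "breathing_1", "change_2": "change_1"}
ANTONYM_OR_VERB_GROUP = {"degrade_1": ["aggrade_1"]}
HYPERNYM = {"aggrade_1": "change_2"}
MANUAL_FROM_GLOSS = {"take orders_2": "ordinance_3"}


def find_noun(verb):
    """Return (noun, link_type) for a verb, trying the steps in order."""
    # Step 1: direct link via a derivationally related form.
    if verb in DERIVATIONALLY_RELATED:
        return DERIVATIONALLY_RELATED[verb], "direct"
    # Step 2: indirect link via antonym / verb group, climbing the
    # neighbour's hypernyms until one has a derivationally related noun.
    for neighbour in ANTONYM_OR_VERB_GROUP.get(verb, []):
        node = neighbour
        while node is not None:
            if node in DERIVATIONALLY_RELATED:
                return DERIVATIONALLY_RELATED[node], "indirect"
            node = HYPERNYM.get(node)
    # Step 3: noun identified manually from the words in the gloss.
    if verb in MANUAL_FROM_GLOSS:
        return MANUAL_FROM_GLOSS[verb], "manual"
    return None, "unmapped"


def classify_verb(verb):
    """Transfer to the verb the DOLCE class of its linked noun."""
    noun, link_type = find_noun(verb)
    return NOUN_TO_DOLCE.get(noun), link_type
```

For instance, `classify_verb("degrade_1")` follows the antonym to aggrade_1, climbs to change_2, and reaches the noun change_1, whose class is event.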
8. NLP & Semantic Computing Group
Verb Alignment Methodology
• Alignment examples (verbs -> nouns -> DOLCE classes):
(a) Direct link: breathe_1 [derivationally related form] breathing_1 -> process
(b) Indirect link: degrade_1 [antonym] aggrade_1 [inherited hypernym] change_2 [derivationally related form] change_1 -> event
(c) Manual assignment: take orders_2 - ordinance_3 -> event
Using as much information as is available in the synset’s gloss, in
order to make the classification as objective as possible!
9. NLP & Semantic Computing Group
Alignment Results
DOLCE class        Top Synsets   Full Taxonomy
event                      412          12,037
cognitive-event             63             854
state                       62             597
process                     15             259
cognitive-state              8              20
Total                      560          13,767
Links used for the top synsets: direct links 36.25%; indirect links 16.25%
(explicit links: 52.50%); implicit relationships 47.50%
Alignments expanded through the taxonomy using the hypernym
links: verb database 100% mapped
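Expanding the alignment through the taxonomy can be read as every synset inheriting the class of its nearest mapped ancestor along the hypernym links. The sketch below illustrates that reading under assumptions: the synset names are hypothetical placeholders, and each synset is given a single hypernym, whereas WordNet allows several.

```python
def inherit_class(synset, hypernym_of, mapped):
    """Return the DOLCE class of the nearest mapped ancestor, or None."""
    seen = set()
    node = synset
    while node is not None and node not in seen:
        if node in mapped:
            return mapped[node]
        seen.add(node)  # guard against cycles in malformed input
        node = hypernym_of.get(node)
    return None


# Hypothetical three-level taxonomy: only the top-level synset is mapped.
hypernym_of = {"child_verb": "top_verb", "grandchild_verb": "child_verb"}
mapped = {"top_verb": "event"}

full_mapping = {
    synset: inherit_class(synset, hypernym_of, mapped)
    for synset in ["top_verb", "child_verb", "grandchild_verb"]
}
```

Under this scheme, classifying only the 560 top-level synsets is enough to label every verb synset that descends from them.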
10. NLP & Semantic Computing Group
Evaluation
• Objective: evaluate the usefulness of the resulting
alignments in a semantic annotation task (not the
alignment quality!)
• Datasets: SemCor 3.0 and eXtended WordNet (XWN)
Sense number for each word/phrase used to retrieve the
synset ID, and then the DOLCE class associated to the
synset
Labeled datasets used as gold standard
• FO tagging approach:
Lookup in the WN-DOLCE mappings table
First sense WSD
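The tagging approach (first-sense WSD followed by a lookup in the mappings table) can be sketched as follows. The sense inventory and table here are hypothetical toy data; only the two synset-to-class pairs come from the deck's alignment examples, and the function name `fo_tag` is an assumption.

```python
# Hypothetical sense inventory: senses listed in WordNet order, first = most frequent.
SENSES = {
    "breathe": ["breathe_1", "breathe_2"],
    "degrade": ["degrade_1"],
}

# Fragment of a WN-DOLCE mappings table (pairs taken from the alignment examples).
WN_DOLCE = {"breathe_1": "process", "degrade_1": "event"}


def fo_tag(tokens):
    """Tag each token with the DOLCE class of its first sense, if any."""
    tagged = []
    for token in tokens:
        senses = SENSES.get(token)
        if not senses:
            # Word not covered by the lexicon: no label can be assigned.
            tagged.append((token, None))
            continue
        first_sense = senses[0]  # first-sense WSD heuristic
        tagged.append((token, WN_DOLCE.get(first_sense)))
    return tagged


result = fo_tag(["breathe", "degrade", "blorp"])
```

The `None` label for the out-of-vocabulary token reflects that a lookup-based tagger can only label words present in the lexicon.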
11. NLP & Semantic Computing Group
Evaluation Results
• Baseline:
Random: chooses a random label among the ones available for a
word/phrase
SuperSense Tagger (Ciaramita & Altun, 2006): most similar tool,
assigns a super sense (high level WN synsets) to each word/phrase
                     XWN                            SemCor
                     Precision  Recall  F1-Score    Precision  Recall  F1-Score
Random               71.82      72.04   71.93       61.52      62.52   62.02
FO Tagging           89.68      89.74   89.71       86.10      86.36   86.23
SuperSense Tagging   -          -       -           76.65      77.71   77.18
9.05%: gain of FO Tagging over SuperSense Tagging on SemCor (F1-Score, 86.23 vs. 77.18)
Accuracy of the chosen approach for FO tagging at selecting the
most suitable label from the standard mappings set
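As a quick check on the table, the F1-Scores can be recomputed from the reported precision and recall, assuming the standard definition of F1 as their harmonic mean. The 9.05 figure then appears to correspond to the difference between the two SemCor F1-Scores.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) on SemCor, copied from the results table.
semcor = {
    "FO Tagging": (86.10, 86.36),
    "SuperSense Tagging": (76.65, 77.71),
}

scores = {name: round(f1_score(p, r), 2) for name, (p, r) in semcor.items()}
gain = round(scores["FO Tagging"] - scores["SuperSense Tagging"], 2)
```

With these inputs, `scores` reproduces the reported 86.23 and 77.18, and `gain` is 9.05.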
12. NLP & Semantic Computing Group
Known Issues
• WordNet hypernym links do not always represent
subsumption relationships effectively
FO tagging deals with very high level categories
Related concepts tend to converge to the same
category even when not following a strict
subsumption relationship
• Tagging restricted to the words present in
WordNet
Future work: use the labeled datasets to train a
machine learning tagger
13. NLP & Semantic Computing Group
Conclusions
• Using a previous WN-DOLCE alignment for noun
synsets, we extended the mapping to the verb synsets
Using lexical links and gloss words to trace back the
noun that best represents a verb’s occurrence
Assigning to the verb the same DOLCE class associated
to its noun counterpart
• Resulting alignment used in the implementation of
the FO Tagging semantic annotation framework
Compared to SST, it showed an increase of 9.05% in
accuracy, besides introducing a more homogeneous and
conceptually well-grounded set of categories
Even using a simple WSD technique, it is possible to
annotate text with high accuracy
14. NLP & Semantic Computing Group
Thanks!