1. Word Sense Disambiguation and Intelligent Information Access
Pierpaolo Basile
basilepp@di.uniba.it
Department of Computer Science
University of Bari “A. Moro” (ITALY)
29 May 2009
Pierpaolo Basile (basilepp@di.uniba.it) WSD and IIA 29/05/09 1 / 55
2. Outline
1 Introduction
Word Sense Disambiguation
Intelligent Information Access
2 WSD Strategies
JIGSAW
JIGSAWz
HYDE: a hybrid strategy for WSD
COMBY: a combined strategy for WSD
3 WSD at Work
Information Filtering: ITR - ITem Recommender
Information Retrieval: Semantic Search
4 Conclusions and Future Work
3. Introduction Word Sense Disambiguation
Word Sense Disambiguation
Word Sense Disambiguation (WSD) is the problem of selecting a
sense for a word from a set of predefined possibilities
sense inventory usually comes from a dictionary or thesaurus
polysemous word: having more than one possible meaning, e.g. bank¹:
1 sloping land (especially the slope beside a body of water);
2 a financial institution that accepts deposits and channels the money
into lending activities;
3 a long ridge or pile;
4 an arrangement of similar objects in a row or in tiers;
knowledge intensive methods, supervised learning, and (sometimes)
bootstrapping approaches
¹ First four meanings in WordNet 3.0
4. Introduction Word Sense Disambiguation
Brief History
1949: noted as a problem for Machine Translation
1950s - 1960s: semantic networks, AI approaches
1970s - 1980s: rule-based systems relying on hand-crafted knowledge
sources
1990s: WordNet, corpus based approaches, sense tagged text
2000s: Hybrid Systems, minimizing or eliminating use of sense tagged
text, taking advantage of the Web, domain WSD
7. Introduction Intelligent Information Access
Intelligent Information Access
Problems
Explosion of irrelevant, unclear, inaccurate information
Users overloaded with a large amount of information impossible to
absorb
Consequences
Searching is time consuming
Need for intelligent solutions able to support users in finding
documents
Solution
Intelligent Information Access: user-centric and semantically rich
approach to access information
8. Introduction Intelligent Information Access
WSD in Information Access
Machine Translation
Translate “plant” from English to Italian
Is it a “pianta” or an “impianto/stabilimento”?
Information Retrieval
Find all Web Pages about “bat”
The sports equipment or the nocturnal mammal?
Question Answering
What is George Miller's position on gun control?
The psychologist or US congressman?
Knowledge Acquisition
Add to KB: Herb Bergson is the mayor of Duluth, Minnesota or
Georgia?
9. Introduction Intelligent Information Access
WSD and Intelligent Information Access
Natural Language Processing can enhance Intelligent Information
Access
keywords not appropriate for representing content, due to polysemy,
synonymy, multi-word concepts
WSD provides semantics: concepts identification in documents
Humans are able to comprehend the meaning of a text
Natural Language Processing and WSD convert human linguistic
abilities into more formal representations that are easier for computer
programs to understand
11. WSD Strategies JIGSAW
JIGSAW
Knowledge-based WSD algorithm
Exploits WordNet senses
Three different strategies for: nouns, verbs and adjectives/adverbs
Main motivation: the effectiveness of a WSD algorithm is strongly
influenced by the PoS-tag
WordNet [Mil95]
Lexical reference database designed by Princeton University
English nouns, verbs, adverbs and adjectives organized into SYNonym
SETs (SYNSETs)
Semantic relations among synsets
14. WSD Strategies JIGSAW
JIGSAW algorithm
The algorithm
Input
d = (w1, w2, ..., wh), a document
Output
X = (s1, s2, ..., sk), k ≤ h
each si obtained by disambiguating wi based on the context of each word
k ≤ h because:
some words are not recognized by WordNet
groups of words are recognized as a single concept
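The per-PoS dispatch above can be sketched as follows. This is a minimal sketch, not the authors' code: the three strategy callables stand in for JIGSAWnouns, JIGSAWverbs and JIGSAWothers, and all names are illustrative.

```python
# Sketch of the top-level JIGSAW dispatch: one strategy per PoS tag,
# since the slides note that WSD effectiveness depends strongly on PoS.
# nouns_wsd / verbs_wsd / others_wsd are stand-ins for the real strategies.

def jigsaw(document, nouns_wsd, verbs_wsd, others_wsd):
    """document: list of (word, pos) pairs; returns word -> synset.
    Words unknown to a strategy (e.g. not in WordNet) return None and
    are skipped, which is why the output may hold fewer senses (k <= h)."""
    senses = {}
    for word, pos in document:
        strategy = {"n": nouns_wsd, "v": verbs_wsd}.get(pos, others_wsd)
        sense = strategy(word)
        if sense is not None:
            senses[word] = sense
    return senses
```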
15. WSD Strategies JIGSAW
JIGSAWnouns
The idea
Based on the Resnik [Res95] algorithm for disambiguating noun groups
Given a set of nouns N = {n1, n2, ..., nn} from document d
each ni has an associated sense inventory Si = {si1, si2, ..., sik} of
possible senses
Goal: assign to each ni the most appropriate sense sih ∈ Si,
maximizing the similarity of ni with the other nouns in N
The strategy
Computing Semantic Similarity exploiting “noun hierarchy”
Give more credit to senses that are hyponym of the Most Specific
Subsumer (MSS)
Combine MSS information with Semantic Similarity
21. WSD Strategies JIGSAW
JIGSAWnouns
Final synset score
Linear combination between semantic similarity (with MSS
information) and synset rank in WordNet:
ϕ(sik) = α · sim(sik, N) + β · R(k), with α + β = 1   (1)
R(k) takes into account the synset rank in WordNet:
R(k) = 1 − 0.8 · k / (n − 1)   (2)
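The final synset score can be sketched directly. This is a sketch under two assumptions not fixed by the slides: the rank k is 0-based (so the first WordNet sense gets the full rank credit) and α = 0.7 is an illustrative value.

```python
# Sketch of the JIGSAW_nouns final score:
# phi(s_ik) = alpha * sim(s_ik, N) + beta * R(k), alpha + beta = 1
# R(k) = 1 - 0.8 * k / (n - 1), k = synset rank (assumed 0-based),
# n = number of candidate synsets for the noun.

def rank_factor(k: int, n: int) -> float:
    """R(k): gives more credit to frequent (low-rank) WordNet senses."""
    if n == 1:
        return 1.0
    return 1.0 - 0.8 * k / (n - 1)

def synset_score(similarity: float, k: int, n: int, alpha: float = 0.7) -> float:
    """Linear combination of semantic similarity and rank (beta = 1 - alpha).
    alpha = 0.7 is illustrative; the slides only require alpha + beta = 1."""
    beta = 1.0 - alpha
    return alpha * similarity + beta * rank_factor(k, n)
```

The first-ranked sense (k = 0) gets R(0) = 1.0 and the last-ranked one gets R(n − 1) = 0.2, so rank acts as a smooth prior over WordNet sense order.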
22. WSD Strategies JIGSAW
JIGSAWnouns
Differences between JIGSAWnouns and Resnik
Leacock-Chodorow measure to compute similarity (instead of
Information Content)
Gaussian factor G, which takes into account the distance between
words in the text
Factor R, which takes into account the synset frequency score in
WordNet
24. WSD Strategies JIGSAW
JIGSAWverbs
The idea
Try to establish a relation between verbs and nouns (distinct IS-A
hierarchies in WordNet)
Verb wi disambiguated using:
nouns in the context C of wi
nouns in the description (gloss + WordNet usage examples) of each
candidate synset for wi
The strategy
For each candidate synset sik of wi
computes nouns(i, k): the set of nouns in the description for sik
for each wj in C and each synset sik, compute maxjk: the highest
similarity value between wj and the nouns related to the k-th sense
of wi (using the Leacock-Chodorow measure)
using G and R factors (JIGSAWnouns ) to weight semantic similarity
25. WSD Strategies JIGSAW
JIGSAWverbs : The algorithm
“I play basketball and soccer.” wi = play, C = {basketball, soccer}
1 (70) play - (participate in games or sport; “We played hockey all
afternoon”; “play cards”; “Pele played for the Brazilian teams in
many important matches”)
2 (29) play - (play on an instrument; “The band played all night long”)
3 ...
Build nouns set for each sik :
1 nouns(play,1): game, sport, hockey, afternoon, card, team, match
2 nouns(play,2): instrument, band, night
3 ...
27. WSD Strategies JIGSAW
JIGSAWverbs : The algorithm
Finally, an overall similarity score ϕ(i, k) between sik and the whole
context C is computed:
ϕ(i, k) = R(k) · [ Σ_{wj ∈ C} Gauss(position(wi), position(wj)) · maxjk ] / [ Σ_h Gauss(position(wi), position(wh)) ]   (3)
The synset assigned to wi is the one with the highest ϕ value
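The verb score is a Gaussian-weighted average of the per-noun maxjk values, scaled by the rank factor R(k). A minimal sketch, assuming a Gaussian width (sigma) that the slides do not specify:

```python
import math

# Sketch of the JIGSAW_verbs score (Eq. 3). sigma = 2.0 is an
# illustrative assumption; the slides do not fix the Gaussian width.

def gauss(pos_i: int, pos_j: int, sigma: float = 2.0) -> float:
    """Distance-based weight: context words nearer the verb count more."""
    return math.exp(-((pos_i - pos_j) ** 2) / (2 * sigma ** 2))

def verb_sense_score(rank_k: float, pos_verb: int, context) -> float:
    """context: list of (position, max_jk) pairs, one per context noun,
    where max_jk is the highest similarity between that noun and the
    nouns in the description of the k-th candidate sense."""
    num = sum(gauss(pos_verb, p) * m for p, m in context)
    den = sum(gauss(pos_verb, p) for p, _ in context)
    return rank_k * num / den if den else 0.0
```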
28. WSD Strategies JIGSAW
JIGSAWothers
Based on the WSD algorithm proposed by Banerjee and Pedersen
[BP02, BP03] (inspired by Lesk [Les86])
Idea: compute the overlap between the glosses of each candidate
sense (including related synsets) for the target word and the glosses
of all words in its context
assign the synset with the highest overlap score
if ties occur, the most common synset in WordNet is chosen
Given the sentence: “I bought a bottle of aged wine”
the context is C = {bottle, wine}
the first two synsets for aged are:
1 (advanced in years; ”aged members of the society”; ”elderly residents
could remember the construction of the first skyscraper”; ”senior
citizen”);
2 (of wines, fruit, cheeses; having reached a desired or final condition;
”mature well-aged cheeses”)
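The gloss-overlap idea behind the “aged” example can be sketched with toy glosses (real JIGSAWothers uses WordNet glosses plus related synsets; the strings below are illustrative):

```python
# Minimal sketch of the adapted-Lesk gloss overlap in JIGSAW_others:
# score each candidate sense by how many words its gloss shares with
# the context. Glosses here are toy strings, not real WordNet data.

def overlap(gloss: str, context_words: set) -> int:
    return sum(1 for w in gloss.lower().split() if w in context_words)

def best_sense(candidate_glosses: dict, context_words: set) -> str:
    # dict preserves insertion order, so on a tie max() keeps the
    # earlier (more common, WordNet-first) sense, as the slides require.
    return max(candidate_glosses,
               key=lambda s: overlap(candidate_glosses[s], context_words))

glosses = {
    "aged#1": "advanced in years elderly senior",
    "aged#2": "of wines fruit cheeses having reached a desired condition",
}
context = {"bottle", "wines"}
assert best_sense(glosses, context) == "aged#2"  # "wines" overlaps
```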
29. WSD Strategies JIGSAWz
JIGSAWz : ZIPF distribution
Zipf’s law: the frequency of an event is inversely proportional to its
rank in the frequency table
similar to the distribution of words: the most frequent word occurs
approximately twice as often as the second most frequent word, which
in turn occurs twice as often as the fourth most frequent word, ...
30. WSD Strategies JIGSAWz
JIGSAWz
Modify the R factor using the Zipf distribution:
f(k; N, s) = (1 / k^s) / Σ_{n=1}^{N} (1 / n^s)   (4)
where:
N is the number of word meanings
k is the word meaning rank. We adopt the WordNet synset rank
s is the value of the exponent characterizing the distribution
Compute the frequency of the word meaning in SemCor
Approximate s using Pearson's chi-square (χ²) test
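Equation (4) can be sketched as follows; s = 1.0 below is illustrative, since the actual exponent is fitted on SemCor frequencies as described above.

```python
# Sketch of the Zipfian rank factor used by JIGSAW_z (Eq. 4):
# f(k; N, s) = (1/k^s) / sum_{n=1..N} 1/n^s
# N = number of word meanings, k = WordNet synset rank (1-based),
# s = distribution exponent (value here is illustrative, not fitted).

def zipf(k: int, n_senses: int, s: float = 1.0) -> float:
    norm = sum(1.0 / n ** s for n in range(1, n_senses + 1))
    return (1.0 / k ** s) / norm
```

By construction the factors over all ranks sum to 1 and decrease with rank, so unlike the linear R(k) this prior is a proper probability distribution over senses.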
31. WSD Strategies JIGSAWz
NLP tools for the evaluation
WSD requires pre-processing steps: tokenization, stemming,
PoS-tagging and lemmatization
META (MultilanguagE Text Analyzer) [BdG+08] implements several
NLP tasks and provides tools for semantic indexing of documents:
Text normalization and tokenization
Stemming (SNOWBALL library)
Lemmatization
English: WordNet Morphological Analyzer
Italian: Morph-it! and Lemmagen tool (Ripple Down Rule learning)
POS-tagging based on ACOPOST T3 (HMM - Hidden Markov Model)
Entity recognition based on SVM classifier (YAMCHA)
WSD: English/Italian
32. WSD Strategies JIGSAWz
JIGSAW Evaluation
SensEval-3 All-Words Task
disambiguation of all words in English texts
sense inventory: WordNet 1.7.1
2,041 words
inter-annotator agreement rate was approximately 72.5%
EVALITA WSD All-Words Task
disambiguation of all words in Italian texts
sense inventory: ItalWordNet
about 5,000 words
no information about the inter-annotator agreement rate
33. WSD Strategies JIGSAWz
JIGSAW Evaluation: Results
JIGSAW at SensEval-3 All-Words Task
system P R A(%) F
1st sense 0.624 0.651 100 0.651
BestUnsupervised 0.583 0.582 100 0.582
JIGSAW 0.525 0.525 100 0.525
JIGSAWz 0.606 0.606 100 0.606
JIGSAW at EVALITA WSD All-Words Task [BS07]
system P R A(%) F
1st sense 0.648 0.614 94.7 0.631
Random 0.483 0.458 94.7 0.470
JIGSAW 0.598 0.567 94.7 0.582
JIGSAWz 0.639 0.606 94.7 0.622
34. WSD Strategies HYDE : a hybrid strategy for WSD
Supervised Learning for WSD
Supervised Learning for WSD
Exploits machine learning techniques to induce models of word usage from
large text collections
annotated corpora are tagged manually using semantic classes chosen
from a sense inventory
each sense-tagged occurrence of a particular word is transformed into
a feature vector, which is then used in an automatic learning process
37. WSD Strategies HYDE : a hybrid strategy for WSD
Problems and Motivation
Knowledge-based methods
outperformed by supervised methods
high coverage: applicable to all words in unrestricted text
Supervised methods
high precision
low coverage: applicable only to those words for which annotated
corpora are available
Solution
HYDE : combination of Knowledge-based (JIGSAW ) methods and
Supervised Learning can improve WSD effectiveness [BdLS08]
Knowledge-based methods improve coverage
Supervised Learning strategies improve precision
38. WSD Strategies HYDE : a hybrid strategy for WSD
Supervised Learning
Exploited features
nouns: the first noun, verb or adjective before the target noun,
within a window of at most three words to the left, and its PoS-tag
verbs: the first word before and the first word after the target verb,
and their PoS-tags
adjectives: six nouns (before and after the target adjective)
adverbs: as for adjectives, but using adjectives rather than nouns
Training corpus: MultiSemCor
1 Italian translations of the SemCor texts
2 automatically aligning Italian and English texts
3 automatically transferring the word sense annotations from English
(WordNet) to the aligned Italian (MultiWordNet) words
39. WSD Strategies HYDE : a hybrid strategy for WSD
Supervised Learning
K-NN algorithm for WSD
Learning: build a vector for each annotated word
Classification:
build a vector vf for each word in the text
compute similarity between vf and the training vectors
rank the training vectors in decreasing order according to the similarity
value
choose the most frequent sense in the first K vectors
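The classification step can be sketched with toy feature vectors. Cosine similarity is an assumption here; the slides only say "compute similarity" without naming the measure.

```python
import math
from collections import Counter

# Sketch of the K-NN WSD classification step: rank training vectors
# by similarity to the test vector and pick the most frequent sense
# among the top K. Feature vectors are toy dicts; the real features
# are the PoS/word windows described in the slides.

def cosine(u: dict, v: dict) -> float:
    dot = sum(u[f] * v.get(f, 0.0) for f in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_sense(vf: dict, training, k: int = 3) -> str:
    """training: list of (feature_vector, sense) pairs."""
    ranked = sorted(training, key=lambda t: cosine(vf, t[0]), reverse=True)
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]
```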
40. WSD Strategies HYDE : a hybrid strategy for WSD
HYDE Evaluation
Dataset: EVALITA WSD All-Words Task Dataset
Two strategies:
Integrating JIGSAW into a supervised learning method
supervised method is applied to words for which training examples are
provided
JIGSAW is applied to words not covered by the first step
Integrating supervised learning into JIGSAW
JIGSAW is applied to assign a sense to the words which can be
disambiguated with a high level of confidence
remaining words are disambiguated by the supervised method
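The second strategy (JIGSAW first, supervised fallback) can be sketched as a confidence-thresholded pipeline. Both components are stubbed with plain callables, and the 0.6 threshold is illustrative; the evaluation below varies ϕ from 0.40 to 0.80.

```python
# Sketch of the second HYDE strategy: JIGSAW keeps only senses
# assigned with confidence phi above a threshold; the supervised
# K-NN then handles the remaining words.

def hyde(words, jigsaw, knn, threshold=0.6):
    """jigsaw(word) -> (sense, phi); knn(word) -> sense.
    Returns word -> sense, mixing the two components."""
    out = {}
    for w in words:
        sense, phi = jigsaw(w)
        out[w] = sense if phi >= threshold else knn(w)
    return out
```

The first strategy is the mirror image: run the supervised method where training examples exist, and fall back to JIGSAW elsewhere.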
41. WSD Strategies HYDE : a hybrid strategy for WSD
HYDE Evaluation: Baselines
Baselines for EVALITA WSD All-Words Task Dataset
Setting P R F A (%)
1st sense 0.648 0.614 0.631 94.7
Random 0.484 0.484 0.484 100.0
JIGSAW 0.639 0.606 0.622 94.7
K-NN 0.797 0.336 0.473 42.2
42. WSD Strategies HYDE : a hybrid strategy for WSD
HYDE : Evaluation results
1st sense (0.631), Random (0.470), JIGSAW (0.622), K-NN (0.484)
Integrating JIGSAW into a supervised learning method
Setting P R F A (%)
K-NN + JIGSAW 0.624 0.591 0.607 94.7
K-NN + JIGSAW (ϕ ≥ 0.80) 0.693 0.337 0.453 48.6
K-NN + JIGSAW (ϕ ≥ 0.60) 0.680 0.410 0.512 60.3
K-NN + JIGSAW (ϕ ≥ 0.40) 0.652 0.452 0.534 69.3
K-NN + JIGSAW (ϕ ≥ 0.20) 0.652 0.452 0.534 69.3
Integrating supervised learning into JIGSAW
Setting P R F A (%)
JIGSAW (ϕ ≥ 0.80) + K-NN 0.715 0.392 0.556 55.6
JIGSAW (ϕ ≥ 0.60) + K-NN 0.688 0.440 0.537 64.0
JIGSAW (ϕ ≥ 0.40) + K-NN 0.651 0.484 0.555 74.4
43. WSD Strategies COMBY : a combined strategy for WSD
COMBY : a combined strategy for WSD
COMBY WSD framework: combines the output of several WSD
algorithms
run a set of WSD algorithms on a sense-annotated corpus (TRC)
obtain a set of outputs O = {o1, o2, ..., oN}, where oi is the
output provided by the i-th algorithm
each output oi contains, for each word instance wj in TRC, a list of pairs
(<synset1, score1>, ..., <synsetk, scorek>, ..., <synsetl, scorel>)
combination step: run the WSD algorithms on a non-sense-annotated
corpus (TSC):
run the WSD algorithms on a different dataset (TSC)
obtain a set of outputs
combine the outputs: voting strategies and supervised methods
45. WSD Strategies COMBY : a combined strategy for WSD
Combination strategies
Voting strategies
1 simple voting: the sense that has the majority of votes is chosen
2 simple voting using the information about the synset score: the vote
for each synset is the sum of all scores in each WSD system
3 simple voting using different weights for each system according to the
WSD performance in TRC
Supervised methods
1 several classification algorithms using the WEKA package
2 Support Vector Machine adopting the open-source software LIBSVM
3 using unsupervised predictions as features in a supervised system
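The first two voting strategies can be sketched over per-system output lists (the (synset, score) pairs described above). A minimal sketch, with names chosen for illustration:

```python
from collections import Counter

# Sketch of two COMBY voting strategies:
# - simple voting: each system votes for its top-ranked sense;
# - score-weighted voting: each system contributes its score for
#   every candidate sense, and the sums decide.

def simple_vote(outputs):
    """outputs: one ranked [(sense, score), ...] list per WSD system."""
    votes = Counter(ranked[0][0] for ranked in outputs)
    return votes.most_common(1)[0][0]

def score_vote(outputs):
    totals = Counter()
    for ranked in outputs:
        for sense, score in ranked:
            totals[sense] += score
    return totals.most_common(1)[0][0]
```

The third (weighted) strategy just multiplies each system's votes or scores by a weight derived from its performance on TRC.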
46. WSD Strategies COMBY : a combined strategy for WSD
COMBY evaluation
Dataset
TRC (training): SemCor 1.7.1
TSC (testing): SensEval-3 All-Words Task
1st sense baseline: F = 0.651
Involved WSD systems
JIGSAW : a knowledge-based WSD algorithm that exploits WordNet
as knowledge-base.
AitorKB: graph-based method for performing knowledge-based WSD
[AS08]
TS: exploits Topic Signatures to disambiguate nouns [AdL04]
RIC : automatically builds examples from the Web using a new
approach based on the “monosemous relative” method [MAW06]
47. WSD Strategies COMBY : a combined strategy for WSD
COMBY evaluation: voting strategies
Performance of each system
System P R F A(%)
JIGSAW 0.554 0.554 0.554 100.0
TS 0.458 0.215 0.292 46.9
RIC 0.397 0.396 0.396 99.8
AitorKB 0.600 0.600 0.600 100.0
Voting strategies
Strategy P R F A(%)
Simple 0.587 0.587 0.587 100.0
Z-Score 0.575 0.575 0.575 100.0
Rank 0.615 0.615 0.615 100.0
48. WSD Strategies COMBY : a combined strategy for WSD
COMBY evaluation: supervised combination
Combination using WEKA
Classifier P R F A(%)
Naive Bayes 0.653 0.653 0.653 100.0
Decision Trees 0.649 0.649 0.649 100.0
Ada Boost 0.647 0.647 0.647 100.0
K-NN 0.643 0.643 0.643 100.0
SMO (SVM) 0.653 0.653 0.653 100.0
Combination using LIBSVM and a supervised system (Knn.ehu [AdL07])
System P R F A(%)
LIBSVM 0.654 0.654 0.654 100.0
Knn.ehu 0.667 0.667 0.667 100.0
Knn.ehu+predictions 0.671 0.671 0.671 100.0
49. WSD at Work
WSD at Work
Exploit WSD techniques in real application scenarios
Information Filtering
Content-based recommending system
User profiles compared against item descriptions to provide
recommendations
Problems: keywords not appropriate for representing content, due to
polysemy, synonymy, multi-word concepts
Information Retrieval
Selection of documents, from a fixed collection, which satisfy a user’s
one-off information need (query)
Problems: polysemy and synonymy
50. WSD at Work Information Filtering: ITR - ITem Recommender
Information Filtering: ITR - ITem Recommender
ITR - ITem Recommender [SDLB07]: framework for Intelligent User
Profiling based on:
Word Sense Disambiguation for detecting relevant concepts
representing user interests
Naive Bayes text categorization algorithm for learning user profiles
from disambiguated documents
Concept-based user profiles:
Bag-of-Synsets: a document is represented by a synset vector instead
of a word vector
synsets provided by JIGSAW
recognition of n-grams
synonyms represented by the same synsets
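The Bag-of-Synsets idea can be shown with a toy word-to-synset map standing in for JIGSAW (the map and synset names below are illustrative, not real WordNet output):

```python
from collections import Counter

# Toy sketch of the Bag-of-Synsets representation: documents become
# vectors of synset identifiers instead of word forms, so synonyms
# collapse onto the same feature. SYNSET stands in for JIGSAW output.

SYNSET = {"movie": "film.n.01", "film": "film.n.01", "actor": "actor.n.01"}

def bag_of_synsets(tokens):
    return Counter(SYNSET[t] for t in tokens if t in SYNSET)

# "movie" and "film" yield identical vectors: the synonymy problem
# that defeats keyword profiles disappears at the synset level.
assert bag_of_synsets(["movie", "actor"]) == bag_of_synsets(["film", "actor"])
```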
51. WSD at Work Information Filtering: ITR - ITem Recommender
ITR evaluation
EachMovie Dataset: Project conducted by Compaq Research Centre
(1996-1997)
Dataset of user-movie ratings
About 2.8 million ratings
72,916 users
1,628 items (movies) divided in 10 categories (Genre)
Discrete rating on a 6-point scale
Movie content crawled from the Internet Movie Database (IMDb)
10 movie categories/genres
933 randomly selected users
100 users for each category; only 33 users were selected for
category 2 (Animation)
Each user rated between 30 and 100 movies
Goal: compare performance of keyword-based profiles vs.
synset-based profiles
53. WSD at Work Information Retrieval: Semantic Search
Information Retrieval Evaluation
Two kinds of evaluation
SemEval-2007 Task 1: indexing of a document collection for
Cross-Language IR [BDG+07]
application-driven task
fixed cross-language information retrieval system
participants disambiguate text by assigning WordNet synsets (29,681
documents)
CLEF 2008: Ad-Hoc Robust WSD task: classical IR benchmark using
Cross Language dataset [BCS08]
166,726 documents
160 topics in English and Spanish
54. WSD at Work Information Retrieval: Semantic Search
SemEval-2007 Task 1 results
SemEval-2007 task 1 results
system IR documents CLIR
no expansion 0.3599 0.1446
full expansion 0.1610 0.2676
1st sense 0.2862 0.2637
ORGANIZERS 0.2886 0.2664
JIGSAW 0.3030 0.1373
PART-B 0.3036 0.1734
Performance of each system
system precision recall attempted
ORGANIZERS 0.591 0.566 95.76%
JIGSAW 0.484 0.338 69.98%
PART-B 0.334 0.186 55.68%
55. WSD at Work Information Retrieval: Semantic Search
CLEF 2008: system setup
N-Levels model [BCG+08]: each document has N levels of
representation
Each level has:
local feature weighting
local similarity function
Global ranking function: merges the results of different levels
N-Levels for CLEF 2008:
2 levels: stemming (TF/IDF) and synset (SF/IDF)
Global ranking function: Z-Score normalization and CombSUM
aggregation strategy
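The global ranking step can be sketched as Z-score normalization per level followed by CombSUM. A minimal sketch; level contents and document names are illustrative:

```python
import statistics

# Sketch of the N-Levels global ranking used at CLEF 2008:
# per-level retrieval scores are Z-score normalised, then merged
# by CombSUM (summing the normalised scores across levels).

def zscore(scores: dict) -> dict:
    mu = statistics.mean(scores.values())
    sd = statistics.pstdev(scores.values()) or 1.0  # guard: flat level
    return {doc: (s - mu) / sd for doc, s in scores.items()}

def combsum(levels) -> list:
    """levels: one {doc: score} dict per level (e.g. stem TF/IDF and
    synset SF/IDF); returns documents ranked by merged score."""
    merged = {}
    for level in levels:
        for doc, z in zscore(level).items():
            merged[doc] = merged.get(doc, 0.0) + z
    return sorted(merged, key=merged.get, reverse=True)
```

Z-scoring puts the two levels' otherwise incomparable score scales onto a common one before summing, which is why it precedes CombSUM.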
56. WSD at Work Information Retrieval: Semantic Search
CLEF 2008: results
N-levels results on CLEF 2008
Run MONO CROSS N-Levels WSD MAP
MONO1TDnus2f X - - - 0.168
MONO11nus2f X - - - 0.192
MONO12nus2f X - - - 0.145
MONO13nus2f X - - - 0.154
MONO14nus2f X - - - 0.068
MONOwsd1nus2f X - - X 0.180
MONOwsd11nus2f X - - X 0.186
MONOwsd12nus2f X - X X 0.220
MONOwsd13nus2f X - X X 0.227
CROSS1TDnus2f X X - - 0.025
CROSS1nus2f X X - - 0.015
CROSSwsd1nus2f X X - X 0.071
CROSSwsd11nus2f X X X X 0.060
CROSSwsd12nus2f X X X X 0.072
57. Conclusions and Future Work
Conclusions
The problem of word ambiguity in the context of intelligent
information access has been investigated
Several WSD methods are proposed and evaluated:
JIGSAW knowledge-based algorithm
HYDE combination of knowledge-based and supervised approaches
COMBY combination of unsupervised methods
Evaluation: Senseval-3 All Words Task and EVALITA All Words Task
Languages other than English: a knowledge-based and a hybrid
strategy for Italian WSD are proposed
Evaluation in real application scenarios: Information Filtering and
Information Retrieval
WSD can enhance real applications in the domain of Intelligent
Information Access
58. Conclusions and Future Work
Future Work
Include information about a specific domain into the WSD process
Further investigation of the interaction between IR and WSD is needed
document expansion
query disambiguation/expansion
word polysemy
Other semantic features could be exploited: Named Entity and Entity
Relation
59. Conclusions and Future Work
That’s all folks!
60. Conclusions and Future Work
For Further Reading I
E. Agirre and O.L. de Lacalle.
Publicly available topic signatures for all WordNet nominal senses.
In Proceedings of the 4th International Conference on Languages
Resources and Evaluations (LREC 2004), 2004.
E. Agirre and O.L. de Lacalle.
UBC-ALM: Combining k-NN with SVD for WSD.
pages 342–345, 2007.
Eneko Agirre and Aitor Soroa.
Using the Multilingual Central Repository for Graph-Based Word
Sense Disambiguation.
In European Language Resources Association (ELRA), editor,
Proceedings of the Sixth International Language Resources and
Evaluation (LREC’08), Marrakech, Morocco, may 2008.
61. Conclusions and Future Work
For Further Reading II
Pierpaolo Basile, Annalina Caputo, Anna Lisa Gentile, Marco
Degemmis, Pasquale Lops, and Giovanni Semeraro.
Enhancing Semantic Search using N-Levels Document Representation.
In Stephan Bloehdorn, Marko Grobelnik, Peter Mika, and Duc Thanh
Tran, editors, SemSearch, volume 334 of CEUR Workshop
Proceedings, pages 29–43. CEUR-WS.org, 2008.
P. Basile, A. Caputo, and G. Semeraro.
Uniba-Sense at Clef 2008: SEmantic N-levels Search Engine.
In F. Borri, A. Nardi, and C. Peters, editors, Results of
Cross-Language Evaluation Forum 2008 (CLEF 2008), page 9, 2008.
ISSN: 1818-8044.
62. Conclusions and Future Work
For Further Reading III
P. Basile, M. Degemmis, A.L. Gentile, P. Lops, and G. Semeraro.
UNIBA: JIGSAW Algorithm for Word Sense Disambiguation.
In Proceedings of the 4th ACL 2007 International Workshop on
Semantic Evaluations (SemEval-2007), pages 398–401. Association for
Computational Linguistics (ACL), 2007.
P. Basile, M. de Gemmis, A.L. Gentile, L. Iaquinta, P. Lops, and
G. Semeraro.
META - MultilanguagE Text Analyzer.
In Proceedings of the Language and Speech Technology Conference -
LangTech 2008, Rome, Italy, February 28-29, pages 137–140, 2008.
63. Conclusions and Future Work
For Further Reading IV
Pierpaolo Basile, Marco de Gemmis, Pasquale Lops, and Giovanni
Semeraro.
Combining Knowledge-based Methods and Supervised Learning for
Effective Italian Word Sense Disambiguation.
In Rodolfo Delmonte and Johan Bos, editors, Symposium on
Semantics in Systems for Text Processing, STEP 2008, Venice, Italy,
September 22-24, 2008, Proceedings, volume 1 of Research in
Computational Semantics, pages 5–16. College Publications, 2008.
S. Banerjee and T. Pedersen.
An Adapted Lesk Algorithm for Word Sense Disambiguation Using
WordNet.
In CICLing ’02: Proceedings of the Third International Conference on
Computational Linguistics and Intelligent Text Processing, pages
136–145, London, UK, 2002. Springer-Verlag.
64. Conclusions and Future Work
For Further Reading V
S. Banerjee and T. Pedersen.
Extended gloss overlaps as measure of semantic relatedness.
In Proceedings of 18th International Joint Conference on Artificial
Intelligence (IJCAI), pages 805–810, Acapulco Mexico, 2003.
Pierpaolo Basile and Giovanni Semeraro.
JIGSAW: An algorithm for word sense disambiguation.
Intelligenza Artificiale, 4(2):53–54, 2007.
M. Lesk.
Automatic sense disambiguation using machine readable dictionaries:
how to tell a pine cone from an ice cream cone.
In Proceedings of ACM SIGDOC Conference, pages 24–26, 1986.
65. Conclusions and Future Work
For Further Reading VI
D. Martinez, E. Agirre, and X. Wang.
Word relatives in context for word sense disambiguation.
In Proc. of the 2006 Australasian Language Technology Workshop,
pages 42–50, 2006.
G. A. Miller.
WordNet: a lexical database for English.
Commun. ACM, 38(11):39–41, 1995.
P. Resnik.
Disambiguating noun groupings with respect to WordNet senses.
In Proceedings of the Third Workshop on Very Large Corpora, pages
54–68. Association for Computational Linguistics, 1995.
66. Conclusions and Future Work
For Further Reading VII
G. Semeraro, M. Degemmis, P. Lops, and P. Basile.
Combining Learning and Word Sense Disambiguation for Intelligent
User Profiling.
In Proceedings of the Twentieth International Joint Conference on
Artificial Intelligence IJCAI-07, pages 2856–2861, 2007.
M. Kaufmann, San Francisco, California. ISBN: 978-1-57735-298-3.