Introduction to Distributional
Semantics
André Freitas
Insight Centre for Data Analytics
Insight Workshop on Distributional Semantics
Galway, 2014
Based on the Great ESSLLI Tutorial from Evert & Lenci
Outline
• Contemporary Semantics
• Distributional Semantics
• Compositional-Distributional Semantics
• Take-away message
Contemporary
Semantics
Shift in the Semantics Landscape
Corroboration
Praxis | Scientific / Formal | Philosophical
Semantics as a
complex phenomenon
Semantics for a Complex World
• Most semantic models have dealt with particular types of
constructions, and have been carried out under very simplifying
assumptions, in true lab conditions.
• If these idealizations are removed it is not clear at all that modern
semantics can give a full account of all but the simplest
models/statements.
Sahlgren, 2013
Formal World Real World
Baroni et al., 2012
What is Distributional
Semantics?
Meaning
• Word meaning is usually represented in terms of some formal,
symbolic structure, either external or internal to the word
• External structure
- Associations between different concepts
• Internal structure
- Feature (property, attribute) lists
• The semantic properties of a word are derived from the formal
structure of its representation
- e.g. inference algorithm, etc.
Semantics = Meaning representation model (data) +
inference model
Formal Representation of Meaning
• Modelling fine-grained lexical inferences
Formal Representation of Meaning
(Problems)
• Different meanings
- bat (animal), bat (artefact)
• Meaning variation in context
- clever politician, clever tycoon
• Meaning evolution
• Ambiguity, vagueness, inconsistency
Word meaning acquisition
Lack of flexibility
Scalability
Distributional Hypothesis
“Words occurring in similar (linguistic) contexts tend
to be semantically similar”
• He filled the wampimuk with the substance, passed it
around and we all drunk some
• We found a little, hairy wampimuk sleeping behind the
tree
Weak and Strong DH (Lenci, 2008)
• Weak DH:
- Word meaning is reflected in linguistic distributions
- By inspecting a sufficiently large number of distributional
contexts we may have a useful surrogate representation of
meaning.
• Strong DH:
- A cognitive hypothesis about the form and origin of semantic
representations
Contextual Representation
• Abstract structure that accumulates encounters with the words
in various (linguistic) contexts.
• For our purposes …
- Context is equated with linguistic context
Distributional Semantic Models (DSMs)
“The dog barked in the park. The owner of the dog put him on the
leash since he barked.”
contexts = nouns and verbs in the same
sentence
Contexts counted for the target "dog":
bark : 2
park : 1
leash : 1
owner : 1
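The counting step in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LEMMAS` mini-lexicon and the `context_counts` helper are hypothetical stand-ins for a real POS tagger and lemmatizer, and the target word is taken to be "dog". Because "put" is also a verb in the second sentence, the sketch counts it as well, although the slide does not list it.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon (surface form -> lemma) covering only the nouns
# and verbs of the example. A real pipeline would use a POS tagger and a
# lemmatizer instead of a hand-written table.
LEMMAS = {
    "dog": "dog", "barked": "bark", "park": "park",
    "owner": "owner", "put": "put", "leash": "leash",
}

def context_counts(text, target):
    """Count noun/verb lemmas that occur in the same sentence as `target`."""
    counts = Counter()
    for sentence in re.split(r"[.!?]", text):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        lemmas = [LEMMAS[t] for t in tokens if t in LEMMAS]
        if target in lemmas:
            counts.update(lemma for lemma in lemmas if lemma != target)
    return counts

text = ("The dog barked in the park. The owner of the dog put him on the "
        "leash since he barked.")
counts = context_counts(text, "dog")
print(counts)
```

The resulting counter is one row of the distributional matrix: the target "dog" described by how often each context co-occurs with it.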
Distributional Semantic Models (DSMs)
distributional matrix = targets × contexts
(rows = targets, columns = contexts)
Vector Space Model (VSM)
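As a rough sketch of what "targets × contexts" means in practice, the snippet below turns per-target context counts into a dense matrix. The "dog" row reuses the counts from the example sentence; the "cat" and "car" rows are made-up values added purely for illustration.

```python
from collections import Counter

# Per-target context counts. Only the "dog" row comes from the example
# sentence; "cat" and "car" are invented to give the matrix several rows.
counts = {
    "dog": Counter({"bark": 2, "park": 1, "leash": 1, "owner": 1}),
    "cat": Counter({"run": 1, "owner": 1}),
    "car": Counter({"run": 2, "park": 1}),
}

targets = sorted(counts)                                          # row labels
contexts = sorted({c for row in counts.values() for c in row})    # column labels

# One row per target, one column per context; missing pairs count as 0.
matrix = [[counts[t][c] for c in contexts] for t in targets]

for target, row in zip(targets, matrix):
    print(f"{target:>4}", row)
```

Each row is the vector of one target word in the vector space model; real matrices are far larger and usually sparse.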
Semantic Similarity & Relatedness
[Figure: vector space plot; labels: θ, car, dog, cat, bark, run, leash]
Semantic Similarity & Relatedness
• Semantic similarity - two words sharing a high number of
salient features (attributes)
- synonymy (car/automobile)
- hyperonymy (car/vehicle)
- co-hyponymy (car/van/truck)
• Semantic relatedness (Budanitsky & Hirst 2006) - two words
semantically associated without being necessarily similar
- function (car/drive)
- meronymy (car/tyre)
- location (car/road)
- attribute (car/fast)
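The angle θ between two word vectors is commonly turned into a similarity score through its cosine. The sketch below assumes three made-up vectors over the contexts (bark, run, leash); the numbers are illustrative and not taken from any corpus.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def angle_degrees(u, v):
    """The angle theta itself, clamped against floating-point drift."""
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine(u, v)))))

# Invented co-occurrence vectors over the contexts (bark, run, leash).
dog = [5, 3, 4]
cat = [1, 2, 3]
car = [0, 3, 0]

print("dog/cat:", round(cosine(dog, cat), 3), round(angle_degrees(dog, cat), 1))
print("dog/car:", round(cosine(dog, car), 3), round(angle_degrees(dog, car), 1))
```

A smaller angle (larger cosine) means more shared contexts. Note that the score alone does not distinguish similarity from relatedness; that depends on which contexts were used to build the vectors.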
Distributional Semantic Models (DSMs)
• Computational models that build contextual semantic representations
from corpus data
• Semantic context is represented by a vector
• Vectors are obtained through the statistical analysis of the linguistic
contexts of a word
• Salience of contexts (cf. context weighting scheme)
• Semantic similarity/relatedness as the core operation over the model
DSMs as Commonsense Reasoning
[Figure: vector space plot (labels: θ, car, dog, cat, bark, run, leash), annotated "Commonsense is here"]
[Figure: the same vector space plot set against ("vs.") a second image that is not recoverable from the text]
Semantic best-effort
Demonstration (EasyESA)
http://treo.deri.ie/easyesa/
Applications
- Semantic search
- Question answering
- Approximate semantic inference
- Word sense disambiguation
- Paraphrase detection
- Textual entailment
- Semantic anomaly detection
- ...
Alternative Names for DSMs
• Corpus-based semantics
• Statistical semantics
• Geometrical models of meaning
• Vector semantics
• Word (semantic) space models
Definition of DSMs
Building a DSM
• Pre-process a corpus (target, context)
• Count the target-context co-occurrences
• Weight the contexts (optional)
• Build the distributional matrix
• Reduce the matrix dimensions (optional)
• Parameters
- Corpus
- Context type
- Weighting scheme
- Similarity measure
- Number of dimensions
• A parameter configuration determines the DSM (e.g. LSA, ESA, …)
Parameters
• Corpus pre-processing
- Stemming/lemmatization
- POS tagging
- Syntactic dependencies
• Context
- Document
- Paragraph
- Passage
- Word windows
- Words
- Linguistic features
- Linguistic patterns
- Verbs: contexts = nouns
- Verbs: contexts = adverbs
- etc.
- Size
- Shape
Context Engineering
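To make the "word windows" and "size" parameters concrete, here is a minimal sketch of symmetric window extraction. The function name `window_contexts` and its `size` argument are illustrative assumptions, and the window "shape" (for example, weighting nearer words more heavily) is deliberately left out.

```python
def window_contexts(tokens, position, size=2):
    """Return up to `size` tokens on each side of the token at `position`."""
    left = tokens[max(0, position - size):position]
    right = tokens[position + 1:position + 1 + size]
    return left + right

tokens = "the owner of the dog put him on the leash".split()
target_position = tokens.index("dog")

print(window_contexts(tokens, target_position, size=2))
print(window_contexts(tokens, target_position, size=4))
```

Narrow windows tend to capture words that behave alike syntactically, while wider windows or whole documents pick up more topical association, which is one reason the choice of context is treated as a parameter.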
Effect of Parameters
Context Weighting
• Smoothing frequency differences: from raw counts to log-frequency.
• Association measures (Evert 2005) give more weight to contexts that
are more significantly associated with a target word.
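Both weighting ideas can be sketched on a small count matrix. The example below uses log(1 + f) for log-frequency and positive pointwise mutual information (PPMI) as one common association measure; the choice of PPMI and the toy matrix (rows car, cat, dog; columns bark, leash, owner, park, run) are assumptions made for illustration, not necessarily the measures the slides refer to.

```python
import math

def log_weight(matrix):
    """Dampen raw counts: f -> log(1 + f)."""
    return [[math.log(1 + f) for f in row] for row in matrix]

def ppmi(matrix):
    """Positive PMI: max(0, log2(P(t, c) / (P(t) * P(c)))) for each cell."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    weighted = []
    for i, row in enumerate(matrix):
        new_row = []
        for j, f in enumerate(row):
            if f == 0:
                new_row.append(0.0)  # unseen pairs get zero weight
            else:
                pmi = math.log2(f * total / (row_sums[i] * col_sums[j]))
                new_row.append(max(0.0, pmi))
        weighted.append(new_row)
    return weighted

# Toy counts. Rows: car, cat, dog. Columns: bark, leash, owner, park, run.
counts = [
    [0, 0, 0, 1, 2],
    [0, 0, 1, 0, 1],
    [2, 1, 1, 1, 0],
]

weights = ppmi(counts)
print(weights[2])            # dog row after PPMI weighting
print(log_weight(counts)[2]) # dog row after log-frequency weighting
```

In this toy matrix "bark" occurs only with "dog", so it receives a higher weight in the dog row than "owner", which is shared with "cat".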
Context Weighting Measures
Kiela & Clark, 2014
Similarity Measures
Kiela & Clark, 2014
What is the best parameter configuration?
• The best parameter configuration depends on the task.
• Systematic exploration of the parameters
DSM Instances
• Latent Semantic Analysis (Landauer & Dumais 1996)
• Hyperspace Analogue to Language (Lund & Burgess 1996)
• Infomap NLP (Widdows 2004)
• Random Indexing (Karlgren & Sahlgren 2001)
• Dependency Vectors (Padó & Lapata 2007)
• Explicit Semantic Analysis (Gabrilovich & Markovitch, 2008)
• Distributional Memory (Baroni & Lenci 2009)
Compositional
Semantics
Paraphrase Detection
I find it rather odd that people are already trying to tie the
Commission's hands in relation to the proposal for a
directive, while at the same time calling on it to present a Green
Paper on the current situation with regard to optional and
supplementary health insurance schemes.
I find it a little strange to now obliging the Commission to
a motion for a resolution and to ask him at the same time
to draw up a Green Paper on the current state of voluntary
insurance and supplementary sickness insurance.
=?
Compositional Semantics
• Can we extend DS to account for the meaning of phrases
and sentences?
• Compositionality: The meaning of a complex expression
is a function of the meaning of its constituent parts.
Compositional Semantics
Words whose meaning is
directly determined by their
distributional behaviour (e.g.,
nouns).
Words that act as functions
transforming the distributional
profile of other words (e.g., verbs,
adjectives, …).
Compositional Semantics
Mixture Function
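The slide gives only the title, so the following is an assumption about what is meant: in the compositional-distributional literature, mixture models typically combine two word vectors into a phrase vector of the same dimensionality, most simply by a (weighted) sum or a component-wise product. The function names and the toy vectors are illustrative.

```python
def additive(u, v, alpha=1.0, beta=1.0):
    """Weighted additive mixture: alpha * u + beta * v."""
    return [alpha * a + beta * b for a, b in zip(u, v)]

def multiplicative(u, v):
    """Component-wise multiplicative mixture: u[i] * v[i]."""
    return [a * b for a, b in zip(u, v)]

# Invented vectors over three shared context dimensions.
red = [1, 2, 0]
car = [3, 0, 1]

print(additive(red, car))        # keeps every dimension active in either word
print(multiplicative(red, car))  # keeps only dimensions active in both words
```

Both functions are symmetric, so they ignore word order; this is one motivation for approaches that use syntactic structure to guide composition.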
Compositional Semantics
• Take the syntactic structure to constitute the backbone
guiding the assembly of the semantic representations of
phrases.
(CHASE × cats) × dogs
CHASE: 3rd-order tensor; cats: vector; dogs: vector
Baroni et al., 2012
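The expression (CHASE × cats) × dogs can be read as two successive contractions: the 3rd-order tensor applied to the object vector yields a matrix, and that matrix applied to the subject vector yields a sentence vector. The sketch below uses tiny invented 2-dimensional values, and which tensor index is paired with the object or the subject is an assumption made for illustration.

```python
def tensor_times_vector(tensor, vec):
    """Contract the last index of a 3rd-order tensor with a vector, giving a matrix."""
    return [[sum(t * x for t, x in zip(fibre, vec)) for fibre in slab]
            for slab in tensor]

def matrix_times_vector(matrix, vec):
    """Ordinary matrix-vector product, giving a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

# Invented 2 x 2 x 2 tensor for the transitive verb and 2-d noun vectors.
CHASE = [[[1, 2], [3, 4]],
         [[5, 6], [7, 8]]]
cats = [1, 0]
dogs = [0, 1]

chase_cats = tensor_times_vector(CHASE, cats)      # verb phrase: a matrix
sentence = matrix_times_vector(chase_cats, dogs)   # full sentence: a vector

print(chase_cats)
print(sentence)
```

The intermediate result has the type of an intransitive verb (a matrix waiting for a subject), which mirrors how the syntactic structure guides the assembly.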
Formal Model
• Distributional Semantics & Category Theory
Take-away message
• Low acquisition effort
• Simple way to build a commonsense KB
• Semantic approximation as a built-in construct
• Semantic best-effort
• Simple to use
• DSMs are evolving fast (compositional models and formal grounding)
• Distributional semantics offers a promising approach for
building semantic models that work in the real world
Great Introductory References
• Evert & Lenci, ESSLLI Tutorial on Distributional Semantics, 2009
(many slides were taken or adapted from this great tutorial).
• Turney & Pantel, From Frequency to Meaning: Vector Space Models
of Semantics, 2010.
• Baroni et al., Frege in Space: A Program for Compositional
Distributional Semantics, 2012.
• Kiela & Clark, A Systematic Study of Semantic Vector Space Model
Parameters, 2014.

Introduction to Distributional Semantics

  • 1. Introduction to Distributional Semantics André Freitas Insight Centre for Data Analytics Insight Workshop on Distributional Semantics Galway, 2014 Based on the Great ESSLLI Tutorial from Evert & Lenci
  • 2. Outline  Contemporary Semantics  Distributional Semantics  Compositional-Distributional Semantics  Take-away message
  • 4. Shift in the Semantics Landscape Corroboration PraxisScientific / FormalPhilosophical Semantics as a complex phenomena
  • 5. Semantics for a Complex World • Most semantic models have dealt with particular types of constructions, and have been carried out under very simplifying assumptions, in true lab conditions. • If these idealizations are removed it is not clear at all that modern semantics can give a full account of all but the simplest models/statements. Sahlgren, 2013 Formal World Real World Baroni et al., 2012
  • 7. Meaning  Word meaning is usually represented in terms of some formal, symbolic structure, either external or internal to the word  External structure - Associations between different concepts  Internal structure - Feature (property, attribute) lists  The semantic properties of a word are derived from the formal structure of its representation - e.g. Inference algorithm, etc. Semantics = Meaning representation model (data) + inference model
  • 8. Formal Representation of Meaning  Modelling fine-grained lexical inferences
  • 9. Formal Representation of Meaning (Problems)  Different meanings - bat (animal), bat (artefact)  Meaning variation in context - clever politician, clever tycoon  Meaning evolution  Ambiguity, vagueness, inconsistency Word meaning acquisition Lack of flexibility Scalability
  • 10. Distributional Hypothesis “Words occurring in similar (linguistic) contexts tend to be semantically similar”  He filled the wampimuk with the substance, passed it around and we all drunk some  We found a little, hairy wampimuk sleeping behind the tree
  • 11. Weak and Strong DH (Lenci, 2008)  Weak DH: - Word meaning is reflected in linguistic distributions - By inspecting a sufficiently large number of distributional contexts we may have a useful surrogate representation of meaning.  Strong DH: - A cognitive hypothesis about the form and origin of semantic representations
  • 12. Contextual Representation  Abstract structure that accumulates encounters with the words in various (linguistic) contexts.  For our purposes … - Context is equated with linguistic context
  • 13. Distributional Semantic Models (DSMs) “The dog barked in the park. The owner of the dog put him on the leash since he barked.”
  • 14. Distributional Semantic Models (DSMs) “The dog barked in the park. The owner of the dog put him on the leash since he barked.” contexts = nouns and verbs in the same sentence
  • 15. Distributional Semantic Models (DSMs) “The dog barked in the park. The owner of the dog put him on the leash since he barked.” contexts = nouns and verbs in the same sentence. Context counts for the target dog: bark: 2, park: 1, leash: 1, owner: 1
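The counting on this slide can be sketched in a few lines of Python. This is a minimal sketch, assuming the two sentences have already been reduced by hand to their lemmatized nouns and verbs; the `sentences` list and the `cooccurrence_counts` helper are illustrative names, not part of the original slides. A literal reading of "nouns and verbs in the same sentence" also picks up the verb "put", which the slide does not list.

```python
from collections import Counter

# Hand-lemmatized nouns and verbs of the two example sentences (an assumption:
# a real pipeline would obtain these from a POS tagger and a lemmatizer).
sentences = [
    ["dog", "bark", "park"],
    ["owner", "dog", "put", "leash", "bark"],
]

def cooccurrence_counts(target, sentences):
    """Count the words that share a sentence with `target`."""
    counts = Counter()
    for sentence in sentences:
        if target in sentence:
            counts.update(w for w in sentence if w != target)
    return counts

dog_vector = cooccurrence_counts("dog", sentences)
print(dog_vector)
```

The resulting counter is the sparse distributional vector of "dog": each key is a context and each value is a co-occurrence count.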
  • 16. Distributional Semantic Models (DSMs) distributional matrix = targets x contexts contexts targets Vector Space Model (VSM)
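As a rough illustration of the targets x contexts matrix, the sketch below turns sentence-level co-occurrence counts into a dense matrix with one row per target and one column per context. The function name `build_matrix` and the hand-lemmatized toy input are assumptions made for this example, and every word is treated both as a target and as a context.

```python
from collections import Counter

def build_matrix(sentences):
    """Return (targets, contexts, matrix) for sentence-level co-occurrence."""
    counts = {}
    for sentence in sentences:
        for i, target in enumerate(sentence):
            row = counts.setdefault(target, Counter())
            # Every other token in the sentence is a context of the target.
            row.update(w for j, w in enumerate(sentence) if j != i)
    targets = sorted(counts)
    contexts = sorted({c for row in counts.values() for c in row})
    # A Counter returns 0 for unseen contexts, so missing cells become 0.
    matrix = [[counts[t][c] for c in contexts] for t in targets]
    return targets, contexts, matrix

# Toy input: hand-lemmatized nouns and verbs of the example sentences.
sentences = [["dog", "bark", "park"], ["owner", "dog", "put", "leash", "bark"]]
targets, contexts, matrix = build_matrix(sentences)
for t, row in zip(targets, matrix):
    print(f"{t:6}", row)
```

Each row of `matrix` is the vector of one target word in the vector space model.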
  • 17. Semantic Similarity & Relatedness [Figure: vector space with angle θ; labels: car, dog, cat, bark, run, leash]
  • 18. Semantic Similarity & Relatedness  Semantic similarity - two words sharing a high number of salient features (attributes) - synonymy (car/automobile) - hyperonymy (car/vehicle) - co-hyponymy (car/van/truck)  Semantic relatedness (Budanitsky & Hirst 2006) - two words semantically associated without being necessarily similar - function (car/drive) - meronymy (car/tyre) - location (car/road) - attribute (car/fast)
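In a vector space model, similarity or relatedness between two words is commonly measured as the cosine of the angle θ between their vectors. The sketch below is one way to compute it; the vectors for dog, cat and car over the dimensions (bark, run, leash) are made-up numbers used only for illustration.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical counts over the context dimensions (bark, run, leash).
vectors = {
    "dog": [4, 3, 3],
    "cat": [0, 2, 3],
    "car": [0, 1, 0],
}

for a, b in [("dog", "cat"), ("dog", "car"), ("cat", "car")]:
    print(a, b, round(cosine(vectors[a], vectors[b]), 3))
```

A value close to 1 means the two words occur in very similar contexts. Cosine alone does not distinguish similarity (car/van) from relatedness (car/road); which of the two a model captures depends largely on the choice of context.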
  • 19. Distributional Semantic Models (DSMs)  Computational models that build contextual semantic representations from corpus data  Semantic context is represented by a vector  Vectors are obtained through the statistical analysis of the linguistic contexts of a word  Salience of contexts (cf. context weighting scheme)  Semantic similarity/relatedness as the core operation over the model
  • 20. DSMs as Commonsense Reasoning [Figure: vector space with angle θ, annotated "Commonsense is here"; labels: car, dog, cat, bark, run, leash]
  • 21. DSMs as Commonsense Reasoning
  • 22. DSMs as Commonsense Reasoning [Figure: vector space with angle θ; labels: car, dog, cat, bark, run, leash] ... vs. Semantic best-effort
  • 24. Applications  Applications - Semantic search - Question answering - Approximate semantic inference - Word sense disambiguation - Paraphrase detection - Textual entailment - Semantic anomaly detection ...
  • 25. Alternative Names for DSMs  Corpus-based semantics  Statistical semantics  Geometrical models of meaning  Vector semantics  Word (semantic) space models
  • 27. Building a DSM  Pre-process a corpus (target, context)  Count the target-context co-occurrences  Weight the contexts (optional)  Build the distributional matrix  Reduce the matrix dimensions (optional)  Parameters - Corpus - Context type - Weighting scheme - Similarity measure - Number of dimensions  A parameter configuration determines the DSM (LSA, ESA, …)
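The steps above can be sketched as a small parameterized pipeline. This is a minimal sketch under stated assumptions: the corpus is an already tokenized list, the context type is a symmetric word window, the weighting is an optional log transform, and the similarity measure is cosine. The names `build_dsm`, `window` and `log_weight` are hypothetical, and the optional dimensionality reduction step is left out.

```python
import math
from collections import Counter, defaultdict

def build_dsm(tokens, window=2, log_weight=True):
    """Build {target: {context: weight}} from a list of tokens.

    `window` is the number of tokens on each side used as context;
    `log_weight` switches between raw counts and log(1 + count).
    """
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[target][tokens[j]] += 1
    if not log_weight:
        return {t: dict(row) for t, row in counts.items()}
    return {t: {c: math.log1p(n) for c, n in row.items()}
            for t, row in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

# Toy corpus, invented for this example.
tokens = "the dog barked in the park the cat slept in the park".split()
dsm = build_dsm(tokens, window=2)
print(cosine(dsm["dog"], dsm["cat"]))
```

Changing `window` or `log_weight` yields a different model from the same corpus, which is the sense in which a parameter configuration determines the DSM.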
  • 28. Parameters  Corpus pre-processing - Stemming/lemmatization - POS tagging - Syntactic dependencies  Context - Document - Paragraph - Passage - Word windows - Words - Linguistic features - Linguistic patterns - Verbs : contexts nouns - Verbs : contexts adverbs - etc. - Size - Shape Context Engineering
  • 30. Context Weighting  Smoothing frequency differences: from raw counts to log-frequency.  Association measures (Evert 2005) give more weight to contexts that are more significantly associated with a target word
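Both weighting ideas can be illustrated on a small count matrix. The sketch below applies a log transform and, as one example of an association measure, positive pointwise mutual information (PPMI). PPMI is chosen here for illustration only; the slides do not say which measure was used. The counts and function names are made up.

```python
import math

def log_frequency(matrix):
    """Smooth frequency differences: replace each count n with log(1 + n)."""
    return [[math.log1p(n) for n in row] for row in matrix]

def ppmi(matrix):
    """Replace raw counts with positive pointwise mutual information."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    weighted = []
    for i, row in enumerate(matrix):
        new_row = []
        for j, count in enumerate(row):
            if count == 0:
                new_row.append(0.0)
                continue
            # PMI = log2( P(target, context) / (P(target) * P(context)) )
            pmi = math.log2(count * total / (row_sums[i] * col_sums[j]))
            new_row.append(max(0.0, pmi))  # negative associations clipped to 0
        weighted.append(new_row)
    return weighted

# Hypothetical counts: two targets (rows) by three contexts (columns).
counts = [
    [10, 0, 2],
    [0, 8, 2],
]
weighted = ppmi(counts)
for row in weighted:
    print([round(x, 3) for x in row])
```

Contexts that co-occur with a target more often than chance would predict receive a positive weight, while a context shared about equally by all targets ends up close to zero.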
  • 33. What is the best parameter configuration?  The best parameter configuration depends on the task.  Systematic exploration of the parameters
  • 34. DSM Instances  Latent Semantic Analysis (Landauer & Dumais 1996)  Hyperspace Analogue to Language (Lund & Burgess 1996)  Infomap NLP (Widdows 2004)  Random Indexing (Karlgren & Sahlgren 2001)  Dependency Vectors (Padó & Lapata 2007)  Explicit Semantic Analysis (Gabrilovich & Markovitch, 2008)  Distributional Memory (Baroni & Lenci 2009)
  • 36. Paraphrase Detection “I find it rather odd that people are already trying to tie the Commission's hands in relation to the proposal for a directive, while at the same time calling on it to present a Green Paper on the current situation with regard to optional and supplementary health insurance schemes.” “I find it a little strange to now obliging the Commission to a motion for a resolution and to ask him at the same time to draw up a Green Paper on the current state of voluntary insurance and supplementary sickness insurance.” =?
  • 37. Compositional Semantics  Can we extend DS to account for the meaning of phrases and sentences?  Compositionality: The meaning of a complex expression is a function of the meaning of its constituent parts.
  • 38. Compositional Semantics  Words whose meaning is directly determined by their distributional behaviour (e.g., nouns).  Words that act as functions transforming the distributional profile of other words (e.g., verbs, adjectives, …).
  • 40. Compositional Semantics  Take the syntactic structure to constitute the backbone guiding the assembly of the semantic representations of phrases. Example: (CHASE × cats) × dogs, where CHASE is a 3rd order tensor and cats and dogs are vectors; the intermediate result (CHASE × cats) is then applied to dogs. (Baroni et al., 2012)
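The composition (CHASE × cats) × dogs can be read as two tensor contractions: the verb tensor is first applied to the object vector, giving a matrix, and that matrix is then applied to the subject vector, giving a vector for the whole sentence. The sketch below uses two-dimensional toy values. The index layout `verb[out][subject][object]`, the numbers and the function names are assumptions made for illustration; in an actual model the tensor would be estimated from corpus data.

```python
def apply_to_object(verb, obj):
    """Contract verb[out][subj][obj] with the object vector -> matrix[out][subj]."""
    return [[sum(cell * o for cell, o in zip(row, obj)) for row in plane]
            for plane in verb]

def apply_to_subject(matrix, subj):
    """Multiply matrix[out][subj] by the subject vector -> sentence vector."""
    return [sum(m * s for m, s in zip(row, subj)) for row in matrix]

# Toy 2 x 2 x 2 tensor for the transitive verb and toy noun vectors.
CHASE = [
    [[1, 0], [0, 1]],
    [[0, 2], [1, 0]],
]
cats = [1, 2]   # object
dogs = [3, 1]   # subject

chase_cats = apply_to_object(CHASE, cats)      # 3rd order tensor x vector = matrix
sentence = apply_to_subject(chase_cats, dogs)  # matrix x vector = vector
print(chase_cats, sentence)
```

The order of application mirrors the syntactic structure: the verb combines with its object first, and the resulting verb phrase then combines with the subject.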
  • 41. Formal Model  Distributional Semantics & Category Theory
  • 42. Take-away message  Low acquisition effort  Simple way to build a commonsense KB  Semantic approximation as a built-in construct  Semantic best-effort  Simple to use  DSMs are evolving fast (compositional and formal grounding)  Distributional semantics brings a promising approach for building semantic models that work in the real world
  • 43. Great Introductory References  Evert & Lenci, ESSLLI Tutorial on Distributional Semantics, 2009 (many slides were taken or adapted from this great tutorial).  Turney & Pantel, From Frequency to Meaning: Vector Space Models of Semantics, 2010.  Baroni et al., Frege in Space: A Program for Compositional Distributional Semantics, 2012.  Kiela & Clark, A Systematic Study of Semantic Vector Space Model Parameters, 2014.