This document discusses class-based n-gram models of natural language. An n-gram model predicts the probability of a word from the n-1 words that precede it; increasing n captures more context and so improves accuracy, but the number of parameters to estimate grows exponentially with n, making the estimates less reliable for a fixed amount of training data. The document also discusses class-based n-gram models, which group words that occur in similar contexts into a common class and predict the next word from the classes of the preceding words, and a method for finding "sticky pairs": pairs of words that occur near each other significantly more often than their individual frequencies would predict.
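The class-based idea can be sketched as a factored bigram model, where the probability of a word given its predecessor is approximated as the probability of the word given its class times the probability of that class given the predecessor's class. The sketch below is illustrative only; the corpus, the `word_class` mapping, and the function names are assumptions, not from the original, and probabilities are estimated by simple relative frequency.

```python
from collections import Counter

def class_bigram_model(tokens, word_class):
    """Estimate a class-based bigram model:
    p(w2 | w1) ~= p(w2 | c(w2)) * p(c(w2) | c(w1)),
    where c(w) is the class of word w.
    Emission and class-transition probabilities are
    relative frequencies from the training tokens."""
    classes = [word_class[w] for w in tokens]
    class_counts = Counter(classes)            # count of tokens per class
    class_bigrams = Counter(zip(classes, classes[1:]))
    word_counts = Counter(tokens)

    def prob(w1, w2):
        c1, c2 = word_class[w1], word_class[w2]
        # p(w2 | c2): how often w2 accounts for occurrences of its class
        p_word_given_class = word_counts[w2] / class_counts[c2]
        # p(c2 | c1): class transition, normalized over bigrams starting in c1
        from_c1 = sum(n for (a, _), n in class_bigrams.items() if a == c1)
        p_class_given_class = class_bigrams[(c1, c2)] / from_c1
        return p_word_given_class * p_class_given_class

    return prob
```

For example, with tokens `["the", "cat", "ate", "the", "dog", "ate"]` and classes `{"the": "DET", "cat": "N", "dog": "N", "ate": "V"}`, the model gives `prob("the", "cat") == 0.5`: the DET-to-N transition is certain in this corpus, and "cat" accounts for half the N tokens.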
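One standard way to score "sticky pairs" is pointwise mutual information, which compares a pair's joint frequency to what the individual word frequencies would predict. The sketch below applies this to adjacent word pairs; the function name, the `min_count` cutoff, and the toy corpus are assumptions for illustration, not details from the original.

```python
from collections import Counter
from math import log2

def sticky_pairs(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information:
    PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ).
    A large positive PMI means the pair occurs together far more
    often than the words' individual frequencies would predict."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    scores = {}
    for (x, y), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs; PMI over-rewards them
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = log2(p_xy / (p_x * p_y))
    # highest-PMI (stickiest) pairs first
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

On a toy corpus such as `"new york is big . new york is old . the dog is big ."`, the pair `("new", "york")` ranks first, since "new" and "york" always occur together while words like "is" and "big" also appear in other contexts.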