Introduction to Natural Language Processing
Lecture 1. Introduction
Ekaterina Chernyak, Dmitry Ilvovsky
echernyak@hse.ru, dilvovsky@hse.ru
National Research University
Higher School of Economics (Moscow)
July 11, 2015
Brief history of NLP
January 7, 1954 – Georgetown experiment. Russian to English
machine translation;
1957 – Noam Chomsky published “Syntactic Structures”, introducing generative grammar;
since 1961 – Brown Corpus;
the mid-1960s – ELIZA, a simulation of a psychotherapist;
1975 – Vector Space Model by Salton;
up to the early 1980s – rule-based approaches;
after the early 1980s – machine learning, corpus linguistics;
1998 – language modeling approach to information retrieval by Ponte and Croft;
since 1999 – topic modeling (LSI, pLSI, LDA, etc.);
1999 – “Foundations of Statistical Natural Language Processing” by
Manning and Schütze;
2009 – “Natural Language Processing with Python” by Bird, Klein,
and Loper.
Chernyak, Ilvovsky (HSE) Intro to NLP July 11, 2015 2 / 6
Major tasks of NLP
Machine Translation
Text classification
Sentiment analysis
Spam filtering
Classification by topic or by genre
Text clustering
Named entity recognition
Question answering
Automatic summarization
Natural language generation
Speech recognition
Spell checking
User study design and evaluation
NLP techniques
The level of characters:
Word segmentation
Sentence breaking
The level of words – morphology:
Part of speech (POS) tagging
Word sense disambiguation
The level of sentences – syntax:
Parsing
The level of senses – semantics:
Coreference resolution
Discourse analysis
Semantic role labeling
Synonymy detection
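The two character-level steps listed above, sentence breaking and word segmentation, can be sketched with regular expressions. This is a minimal Python sketch, not the course's own implementation: the names split_sentences and tokenize and both patterns are assumptions made for illustration. The rules are deliberately naive and will break on abbreviations such as "Dr." or on numbers like "3.14".

```python
import re

# Both patterns are illustrative assumptions, not taken from the slides.
# A sentence boundary: whitespace that directly follows '.', '!' or '?'.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
# A token: a word (optionally with inner hyphens or apostrophes),
# or a single punctuation character.
_TOKEN = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")


def split_sentences(text):
    """Naive sentence breaking on end-of-sentence punctuation."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def tokenize(sentence):
    """Naive word segmentation into words and punctuation marks."""
    return _TOKEN.findall(sentence)


if __name__ == "__main__":
    text = "Police help dog bite victim. Time flies like an arrow."
    for sentence in split_sentences(text):
        print(tokenize(sentence))
```

Keeping the hyphen inside a token means that "New-York" stays one token, which may or may not be desirable depending on the later processing steps.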
Main problems
Ambiguity
Lexical ambiguity:
Time flies like an arrow; fruit flies like a banana.
Syntactic ambiguity:
Police help dog bite victim.
Wanted: a nurse for a baby about twenty years old.
Neologisms: unfriend, retweet, instagram
Different spellings: NY, New York City, New-York
Non-standard language: HIIII, how are u? miss u SOOOO much:((((
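Two of the problems above, different spellings and non-standard language, are often reduced by normalizing the text before further processing. The following is a minimal Python sketch under stated assumptions: the lookup table SPELLING_VARIANTS, the choice of "New York City" as the canonical form, and the function names are all hypothetical. Collapsing repeated characters to a single one is lossy (it would also turn "coool" into "col"), so it is only one of several possible heuristics.

```python
import re

# Hypothetical lookup table: the variants come from the slide, the choice of
# canonical form is an assumption made for this example.
SPELLING_VARIANTS = {
    "ny": "New York City",
    "new york city": "New York City",
    "new-york": "New York City",
}

# A run of three or more identical characters, e.g. "IIII" or "((((".
_REPEATS = re.compile(r"(.)\1{2,}")


def canonical_name(mention):
    """Map a known spelling variant to its canonical form, else return it unchanged."""
    return SPELLING_VARIANTS.get(mention.strip().lower(), mention)


def squeeze_repeats(text):
    """Collapse each run of three or more identical characters to one character."""
    return _REPEATS.sub(r"\1", text)


if __name__ == "__main__":
    print(canonical_name("New-York"))
    print(squeeze_repeats("HIIII, how are u? miss u SOOOO much:(((("))
```

Double letters such as the "ss" in "miss" are left alone because the pattern only fires on runs of three or more.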
About this course
We will cover the following topics:
Tokenization
POS tagging
Keyword and key phrase extraction
Parsing
Synonym detection
Language resources
Topic modeling
Text visualisation
We will try to use Python and R for various tasks.
Further reading
