Drop me a mail: rushdecoder@yahoo.com
Visit me at: http://rushdishams.googlepages.com
Rushdi Shams, Lecturer, Dept of CSE, KUET, Bangladesh
• Peter mentioned the book I sent to Mary
• We will give medicines to pregnant women and children
• I saw the boy with the telescope
• The painter put on another coat
• We like flying planes
• The judge threw the book at him
• Visiting relatives can be tiresome
• Da Vinci liked to paint his models nude.
• He wrote the note yesterday
• You mean you carried the information by a bus?
• Connecting wires are tiring in DLD lab
• Squad helps dog bite victim
Why use computers in translation?
• Too much translation for humans
• Technical materials too boring for humans
• Greater consistency required
• Need results more quickly
• Not everything needs to be top quality
• Reduce costs
• Any one of these may justify machine translation or computer aids
Components of a Language
• There are three components of a language:
1. Lexicon
2. Categorization
3. Grammar Rules
Lexicon
stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
right | left | east | south | back | smelly | ...
here | there | nearby | ahead | right | left | east | south | back | ...
me | you | I | it | S/HE | Y’ALL | ...
John | Mary | Boston | UCB | PAJC | ...
the | a | an | ...
to | in | on | near | ...
and | or | but | ...
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Categorization
Noun >> stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
Verb >> is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
Adjective >> right | left | east | south | back | smelly | ...
Adverb >> here | there | nearby | ahead | right | left | east | south | back | ...
Pronoun >> me | you | I | it | S/HE | Y’ALL | ...
Name >> John | Mary | Boston | UCB | PAJC | ...
Article >> the | a | an | ...
Preposition >> to | in | on | near | ...
Conjunction >> and | or | but | ...
Digit >> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
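As a minimal sketch, the categorized lexicon could be stored as a dictionary from category to words. The names `CATEGORIES` and `categories_of` are illustrative assumptions, and only part of each word list from the slide is copied here.

```python
# A few words per category, copied from the Categorization slide.
CATEGORIES = {
    "Noun": ["stench", "breeze", "glitter", "nothing", "wumpus", "pit", "pits", "gold", "east"],
    "Verb": ["is", "see", "smell", "shoot", "feel", "stinks", "go", "grab", "carry", "kill", "turn"],
    "Adjective": ["right", "left", "east", "south", "back", "smelly"],
    "Adverb": ["here", "there", "nearby", "ahead", "right", "left", "east", "south", "back"],
    "Pronoun": ["me", "you", "I", "it"],
    "Name": ["John", "Mary", "Boston", "UCB", "PAJC"],
    "Article": ["the", "a", "an"],
    "Preposition": ["to", "in", "on", "near"],
    "Conjunction": ["and", "or", "but"],
    "Digit": list("0123456789"),
}


def categories_of(word):
    """Return every category whose word list contains `word`."""
    return [category for category, words in CATEGORIES.items() if word in words]


print(categories_of("wumpus"))  # a word with a single category
print(categories_of("east"))    # a word listed under several categories
```

A word such as east appears under more than one category, so a lookup alone cannot decide its category in a given sentence.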
Grammar Structure
• Attending this lecture and the one following it carefully does not mean you will know all of the English language
• That would take studying NLP as a single subject for 4 years! ☺
• We will learn how to define the basic grammar structure for NLP systems
• We will also learn what you need to keep in mind while devising such systems
Syntactic Tree
• Humans recognize the organization of the words in a sentence, according to their parts of speech (POS), as trees.
• Do you disagree?
• Well, you can, because you didn’t learn it this way in your childhood.
• No one did!
• But research suggests that our brain builds a tree-like structure when we first develop our language skills
• That research is beyond the scope of this lecture
Syntactic Tree
• So, if you really do that unintentionally, then why not learn it with pen and paper, so that you can understand how you will teach machines to learn languages?
• The tree structure that humans contemplate is called a syntactic tree
Parsing a Syntactic Tree
• Parsing is the process of using grammar rules to determine whether a sentence is legal, and to obtain its syntactic structure
• ‘The large cat eats the small rat’
Parsing
The large cat eats the small rat
Parsing
Article adjective noun Verb Article adjective noun
The large cat eats the small rat
Parsing
Article adjective noun Verb noun phrase (Article adjective noun)
The large cat eats the small rat
Parsing
Noun phrase (Article adjective noun) verb phrase (Verb noun phrase (Article adjective noun))
The large cat eats the small rat
Parsing
sentence (Noun phrase (Article adjective noun) verb phrase (Verb noun phrase (Article adjective noun)))
The large cat eats the small rat
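The parse of ‘The large cat eats the small rat’ can be sketched as a small top-down parser. This is only an illustration under assumptions: the three grammar rules and the tiny lexicon are read off the slides, the names `GRAMMAR`, `LEXICON`, `parse` and `parse_sentence` are hypothetical, and each phrase has exactly one rule, so no backtracking is needed.

```python
# Word categories for the example sentence only.
LEXICON = {
    "the": "Article", "large": "Adjective", "small": "Adjective",
    "cat": "Noun", "rat": "Noun", "eats": "Verb",
}

# Each rule rewrites a phrase label as a sequence of labels.
GRAMMAR = {
    "Sentence": ["NounPhrase", "VerbPhrase"],
    "NounPhrase": ["Article", "Adjective", "Noun"],
    "VerbPhrase": ["Verb", "NounPhrase"],
}


def parse(label, words, pos):
    """Try to read `label` starting at words[pos].

    Returns (tree, next_pos) on success, or None if the words do not fit.
    """
    if label not in GRAMMAR:  # a word category such as Article or Noun
        if pos < len(words) and LEXICON.get(words[pos].lower()) == label:
            return (label, words[pos]), pos + 1
        return None
    children = []
    for part in GRAMMAR[label]:
        result = parse(part, words, pos)
        if result is None:
            return None
        child, pos = result
        children.append(child)
    return (label, *children), pos


def parse_sentence(text):
    """Return the syntactic tree of `text`, or None if it is not legal."""
    words = text.split()
    result = parse("Sentence", words, 0)
    if result is None or result[1] != len(words):
        return None
    return result[0]


print(parse_sentence("The large cat eats the small rat"))
print(parse_sentence("cat the eats"))  # not legal under this grammar
```

The function answers both questions from the definition of parsing: a `None` result means the sentence is not legal under the grammar, and any other result is its syntactic structure.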
Syntactic Tree
• The point where lines begin or end is called a node
• Each node has a label, like S, PP or chased
• If 2 nodes are connected by a line, the upper node is the immediate dominator of the lower node. D is the immediate dominator of ‘the’
• Upper nodes in a branch are called dominators. NP is the dominator of D, N, ‘the’ and ‘dog’
Syntactic Tree
• Two nodes are sisters if they are immediately dominated by the same node. D and N are sisters.
• The immediate dominator of them is called their mother. NP is the mother of D and N. Similarly, D and N are daughters of NP
• The immediate dominator of a node is also called its parent
Syntactic Tree
• Constituents are the terminal nodes that are all dominated by a single non-terminal node. The words ‘chased a cat into the garden’ form a constituent, as they are all dominated by VP
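The node relations above can be sketched with a small tree class. The tree figure itself is not in the text, so the shape used here for ‘the dog chased a cat into the garden’ is an assumption, and the class and method names are illustrative.

```python
class Node:
    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.mother = None  # immediate dominator, filled in by the mother
        for child in self.children:
            child.mother = self

    def dominators(self):
        """All nodes above this one on its branch, nearest first."""
        node, result = self.mother, []
        while node is not None:
            result.append(node)
            node = node.mother
        return result

    def sisters(self):
        """Other nodes immediately dominated by the same mother."""
        if self.mother is None:
            return []
        return [n for n in self.mother.children if n is not self]

    def words(self):
        """Terminal nodes dominated by this node, left to right."""
        if not self.children:
            return [self.label]
        return [w for child in self.children for w in child.words()]


# Assumed tree for: the dog chased a cat into the garden
d = Node("D", Node("the"))
n = Node("N", Node("dog"))
np = Node("NP", d, n)
pp = Node("PP", Node("P", Node("into")),
          Node("NP", Node("D", Node("the")), Node("N", Node("garden"))))
vp = Node("VP", Node("V", Node("chased")),
          Node("NP", Node("D", Node("a")), Node("N", Node("cat"))), pp)
s = Node("S", np, vp)

print(d.mother.label)                    # mother of D
print([x.label for x in d.sisters()])    # sisters of D
print(" ".join(vp.words()))              # the constituent dominated by VP
```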
Label Bracketing
• It is a process of representing the syntactic tree in another way.
Do yourself: Label Bracket the tree
Remember, you may have to practise the reverse: constructing a syntactic tree from label bracketing ☺
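Both directions of label bracketing can be sketched in a few lines. The tuple representation `(label, child, ...)`, the square-bracket notation and the function names are assumptions made for this illustration, and a shorter sentence is used than in the exercise. Malformed input is not handled.

```python
def to_brackets(tree):
    """Turn a (label, child, ...) tuple tree into a labelled bracket string."""
    if isinstance(tree, str):  # a word
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(to_brackets(c) for c in children) + "]"


def from_brackets(text):
    """Rebuild the tuple tree from a labelled bracket string."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "[":
            return token  # a word
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != "]":
            children.append(read())
        pos += 1  # skip the closing bracket
        return (label, *children)

    return read()


tree = ("S",
        ("NP", ("D", "the"), ("N", "dog")),
        ("VP", ("V", "chased"), ("NP", ("D", "a"), ("N", "cat"))))
bracketed = to_brackets(tree)
print(bracketed)
print(from_brackets(bracketed) == tree)
```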
Constituents and Categories
• Tree structure provides two pieces of information:
1. It divides the sentence into constituents (in English, these are called phrases)
2. It puts them into categories (NP, VP, etc.)
Constituents and Categories
• How do we know the right way to group words into the right category?
• How do we know ‘into the garden’ is a constituent, but ‘a cat into’ is not?
• Any words that can be moved as a group are probably constituents: compare the meaning of ‘the dog chased a cat into the garden’ and ‘into the garden, the dog chased a cat’.
• Which words did you move? ‘Into the garden’, right?
• And the meaning did not change
• That’s probably our constituent
Constituents and Categories
• Any string of words that can be deleted is probably a constituent
• If you omit ‘into the garden’ from the sentence, nothing is changed grammatically.
• Usually, the meaning of a unit of words makes sense. ‘Into the garden’ is much more meaningful than ‘a cat into’
Constituents and Categories
• However, we are only talking about syntactic structure, not the semantic one.
• ‘The dog’, ‘the cat’ and ‘the garden’: their grammar structure says they are all noun phrases.
• It means they can be used interchangeably; no linguist can deny that
• Then what about “The garden chased the cat into the dog”? ☺
• “We will not focus on semantics,” you said before!
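The interchangeability of noun phrases can be illustrated by filling the three NP slots of the sentence in every possible order. This is a small sketch with hypothetical names (`NOUN_PHRASES`, `TEMPLATE`, `all_sentences`); every output has the same syntactic structure, whether or not it makes sense.

```python
from itertools import permutations

NOUN_PHRASES = ["the dog", "the cat", "the garden"]
TEMPLATE = "{} chased {} into {}"


def all_sentences():
    """Fill the three NP slots with every ordering of the noun phrases."""
    return [TEMPLATE.format(*order).capitalize()
            for order in permutations(NOUN_PHRASES)]


for sentence in all_sentences():
    print(sentence)
```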
Ambiguity
• There are 2 types of ambiguity:
1. Lexical Ambiguity: the sentence contains an idiom/word/term that has more than one meaning.
Glasses means both drinking glasses and spectacles
2. Structural Ambiguity: the sentence has more than one syntactic tree
I saw the boy with the telescope:
Did you use a telescope to see the boy? Or
Did you see the boy who had a telescope?
Structural Ambiguity
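That ‘I saw the boy with the telescope’ has more than one syntactic tree can be checked mechanically. The sketch below is a chart parser in the CKY style that collects every tree; the grammar rules, the lexicon and the names are assumptions chosen for this one sentence, not rules taken from the slides. Words missing from the lexicon raise a `KeyError`.

```python
from collections import defaultdict

# Binary rules: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # the seeing is done with the telescope
    ("NP", "NP", "PP"),  # the boy has the telescope
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {"I": "NP", "saw": "V", "the": "Det", "boy": "N",
           "with": "P", "telescope": "N"}


def parses(sentence):
    """Return every tree for `sentence` as a labelled bracket string."""
    words = sentence.split()
    n = len(words)
    # chart[(i, j)][label] lists the trees covering words[i:j]
    chart = defaultdict(lambda: defaultdict(list))
    for i, word in enumerate(words):
        label = LEXICON[word]
        chart[(i, i + 1)][label].append(f"[{label} {word}]")
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    for left_tree in chart[(i, k)].get(left, []):
                        for right_tree in chart[(k, j)].get(right, []):
                            chart[(i, j)][parent].append(
                                f"[{parent} {left_tree} {right_tree}]")
    return chart[(0, n)].get("S", [])


for tree in parses("I saw the boy with the telescope"):
    print(tree)
```

Each printed tree corresponds to one reading: in one the prepositional phrase attaches to the noun phrase ‘the boy’, in the other it attaches to the verb phrase.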
Difficulties with Natural Language:
Anaphora
• Using pronouns to refer back to entities already introduced in the text
After Mary proposed to John, they found a preacher and got married.
For the honeymoon, they went to Hawaii
Mary saw a ring through the window and asked John for it
Mary threw a rock at the window and broke it
Difficulties with Natural Language:
Indexicality
• Indexical sentences refer to the utterance situation (place, time, speaker/hearer (S/H), etc.)
I am over here
Why did you do that?
Difficulties with Natural Language:
Metonymy
• Using one noun phrase to stand for another
I've read Shakespeare
Chrysler announced record profits
The ham sandwich on Table 4 wants another beer
Difficulties with Natural Language:
Metaphor
• “Non-literal” usage of words and phrases, often systematic.
I've tried killing the process but it won't die. Its parent keeps it alive.
Semantics in NL
• I can't untie that knot with one hand.
– The sentence is about the abilities of whoever spoke or wrote it. (Call this person the speaker.)
– It's also about a knot, maybe one that the speaker is pointing at
– The sentence denies that the speaker has a certain ability. (This is the contribution of the word `can't'.)
– Untying is a way of making something not tied.
– The sentence doesn't mean that the knot has one hand; it has to do with how many hands are used to do the untying.
Problems in Semantics in NL
• If you do not understand certain characteristics of linguistics, you will not be able to understand the semantics.
• If you do understand them, you need to feel them
• If you do feel them, you need to see the context
• If you see the context, you are dealing with both semantics and pragmatics in NL
• ☺
Synonymy
• Synonyms are different words (or sometimes phrases) with identical or very similar meanings.
• Words that are synonyms are said to be synonymous, and the state of being a synonym is called synonymy
Synonymy
• student and pupil (noun)
• buy and purchase (verb)
• sick and ill (adjective)
• quickly and speedily (adverb)
• on and upon (preposition)
Synonymy
• Note that synonyms are defined with respect to certain senses of words
• pupil as the "aperture in the iris of the eye" is not synonymous with student.
• Similarly, he expired means the same as he died, yet my passport has expired cannot be replaced by my passport has died.
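Because synonymy holds between senses rather than between words, a synonym table can be keyed by (word, sense). The following is only a sketch: the sense labels and the names `SYNONYMS` and `can_replace` are hypothetical, and the entries cover just the two examples above.

```python
# Synonyms are stored per (word, sense), not per word.
SYNONYMS = {
    ("pupil", "learner"): {"student"},
    ("pupil", "part of the eye"): set(),
    ("expire", "die"): {"die"},
    ("expire", "stop being valid"): set(),
}


def can_replace(word, sense, candidate):
    """True if `candidate` is a synonym of `word` in the given sense."""
    return candidate in SYNONYMS.get((word, sense), set())


print(can_replace("pupil", "learner", "student"))
print(can_replace("pupil", "part of the eye", "student"))
print(can_replace("expire", "stop being valid", "die"))
```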
Antonymy
• Antonyms are words with opposite or nearly opposite meanings. For example:
• short and tall
• dead and alive
• increase and decrease
Homonymy
• A homonym is one of a group of words that share the same spelling and the same pronunciation but have different meanings, usually as a result of the two words having different origins.
• The state of being a homonym is called homonymy.
• bark (the sound of a dog) and bark (the skin of a tree).
Heteronymy
• Heteronyms (also known as heterophones) are words with identical spellings (or characters) but different pronunciations and meanings.
Monolingual ambiguity
• morphological ambiguity:
– German -en: noun plural, dative plural, weak noun non-nominative, adjective
masculine non-nominative, etc.
• compound nouns:
– coincide -> coin+cide, cooperate -> cooper+ate
• category ambiguity:
– round: the first round (noun), to round up cattle (verb), the round table (adjective), go on a
voyage round the Mediterranean (preposition), it measures three feet round (adverb), etc.
• homographs and polysemes:
– branch: ‘of a tree’, ‘of a bank’; crane (a bird or lifting machine)
– ball: The ball rolled down the hill, The ball lasted until midnight
Bilingual lexical ambiguity
• English wall: German Mauer (outside) or Wand (inside)
• English river: French fleuve (major) or rivière (general term)
• English leg: French jambe (human), patte (animal, insect), pied (table), étape (journey)
• English blue: Russian goluboi (pale blue) or sinii (dark blue)
• French louer: English hire or rent
• German leihen: English borrow or lend
• English wear: Japanese haoru (coat/jacket), haku (shoes/trousers), kaburu (hat), hameru
(ring/gloves), shimeru (belt/tie/scarf), tsukeru (brooch/clip), kakeru (glasses/necklace)
• resolvable by:
– rules (indicating allowable or usual categories or types of subjects, objects, verbs, etc.)
– collocations (specifying particular adjacent words)
– frequencies (most probable adjacent or dependent words)
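Resolving a bilingual ambiguity from neighbouring words can be sketched for English wear. The table below copies the object nouns and Japanese verbs from the slide; the function `translate_wear` and its simple first-match strategy are assumptions for illustration, not a description of a real translation system.

```python
# Japanese verbs for English "wear", keyed by the object being worn.
WEAR_BY_OBJECT = {
    "coat": "haoru", "jacket": "haoru",
    "shoes": "haku", "trousers": "haku",
    "hat": "kaburu",
    "ring": "hameru", "gloves": "hameru",
    "belt": "shimeru", "tie": "shimeru", "scarf": "shimeru",
    "brooch": "tsukeru", "clip": "tsukeru",
    "glasses": "kakeru", "necklace": "kakeru",
}
WEAR_FORMS = {"wear", "wears", "wearing", "wore"}


def translate_wear(sentence):
    """Pick the Japanese verb from the first known object noun after 'wear'.

    Returns None when no form of 'wear' or no known object is found.
    """
    words = sentence.lower().replace(".", "").split()
    for i, word in enumerate(words):
        if word in WEAR_FORMS:
            for candidate in words[i + 1:]:
                if candidate in WEAR_BY_OBJECT:
                    return WEAR_BY_OBJECT[candidate]
    return None


print(translate_wear("She wears a hat"))
print(translate_wear("He wore black shoes."))
print(translate_wear("They wear it"))
```

When the object is a pronoun, as in the last call, the table gives no answer; that case would need the anaphora to be resolved first.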
Structural ambiguity
• Flying planes can be dangerous
• The man saw the girl with a telescope
• John mentioned the book I sent to Mary
• I told everyone concerned about the strike
– everyone concerned/involved/relevant, or: everyone disturbed/worried
• He noticed her shaking hands
– either which were shaking from cold, or which were shaking other hands
• They complained to the guide that they could not hear
– that as relative pronoun (‘whom they could not hear’) or as complementizer (‘that they could
not hear him’)
• The mathematics students sat their examinations
• The mathematics students study today is very complex
– difficulty of identifying noun compound vs. relative clause
• Gas pump prices rose last time oil stocks fell
– each word potentially noun or verb
Reference
• Richmond Thomason, http://www.eecs.umich.edu/~rthomaso/documents/general/what-is-semantics.html
• NLP for Prolog Programmers by Michael A. Covington, Chapter 4
• Wikipedia, http://en.wikipedia.org/wiki/Thematic_relations

L1 l2 l3 introduction to machine translation

  • 27. Peter mentioned the book I sent to Mary
  • 28. We will give medicines to pregnant women and children
  • 29. I saw the boy with the telescope
  • 34. Da Vinci liked to paint his models nude.
  • 35. He wrote the note yesterday
  • 36. You mean you carried the information by a bus?
  • 37. Connecting wires are tiring in DLD lab
  • 38. Squad helps dog bite victim
  • 39. Why use computers in translation? • Too much translation for humans • Technical materials too boring for humans • Greater consistency required • Need results more quickly • Not everything needs to be top quality • Reduce costs • any one of these may justify machine translation or computer aids
  • 40. Components of a Language • There are three components of a language: 1. Lexicon 2. Categorization 3. Grammar Rules
  • 41. Lexicon
      stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
      is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
      right | left | east | south | back | smelly | ...
      here | there | nearby | ahead | right | left | east | south | back | ...
      me | you | I | it | S/HE | Y'ALL ...
      John | Mary | Boston | UCB | PAJC | ...
      the | a | an | ...
      to | in | on | near | ...
      and | or | but | ...
      0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  • 42. Categorization
      Noun → stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
      Verb → is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
      Adjective → right | left | east | south | back | smelly | ...
      Adverb → here | there | nearby | ahead | right | left | east | south | back | ...
      Pronoun → me | you | I | it | S/HE | Y'ALL ...
      Name → John | Mary | Boston | UCB | PAJC | ...
      Article → the | a | an | ...
      Preposition → to | in | on | near | ...
      Conjunction → and | or | but | ...
      Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
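The categorization slide can be read as a lookup table from category to words. The sketch below is only an illustration: the name `categories_of` is hypothetical, and the word lists are the truncated ones from the slide (the open-ended "..." entries and S/HE, Y'ALL are left out). It shows that a word such as "east" belongs to more than one category.

```python
# Categories and their words, copied from the slide (lists are truncated there too).
CATEGORIES = {
    "Noun": ["stench", "breeze", "glitter", "nothing", "wumpus", "pit", "pits", "gold", "east"],
    "Verb": ["is", "see", "smell", "shoot", "feel", "stinks", "go", "grab", "carry", "kill", "turn"],
    "Adjective": ["right", "left", "east", "south", "back", "smelly"],
    "Adverb": ["here", "there", "nearby", "ahead", "right", "left", "east", "south", "back"],
    "Pronoun": ["me", "you", "I", "it"],
    "Name": ["John", "Mary", "Boston", "UCB", "PAJC"],
    "Article": ["the", "a", "an"],
    "Preposition": ["to", "in", "on", "near"],
    "Conjunction": ["and", "or", "but"],
    "Digit": list("0123456789"),
}


def categories_of(word):
    """Return every category whose word list contains `word` (hypothetical helper)."""
    return [category for category, words in CATEGORIES.items() if word in words]


print(categories_of("wumpus"))
print(categories_of("east"))
```

A word that appears under several categories is exactly the case a parser has to disambiguate from context.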
  • 43. Grammar Structure • Attending this lecture and the next one carefully does not mean you will know all of the English language • That would take four years of studying NLP as a single subject! ☺ • We will learn how to define the basic grammar structure for NLP systems • We will also learn what you need to keep in mind while devising such systems
  • 44. Syntactic Tree • Humans recognize the organization of words in a sentence, according to their parts of speech (POS), with trees. • Do you disagree? • Well, you can, because you did not learn it this way in your childhood. • No one did! • But it has been proved that our brain draws a tree-like structure when we first develop our language skills • That research is beyond this lecture
  • 45. Syntactic Tree • So, if you really do that unintentionally, why not learn it with pen and paper, so that you can understand how you will teach machines to learn languages? • The tree structure that humans contemplate is called a syntactic tree
  • 46. Parsing a Syntactic Tree • Parsing is the process of using grammar rules to determine whether a sentence is legal, and to obtain its syntactic structure • ‘The large cat eats the small rat’
  • 47. Parsing
      The large cat eats the small rat
  • 48. Parsing
      Article Adjective Noun | Verb | Article Adjective Noun
      The large cat eats the small rat
  • 49. Parsing
      Article Adjective Noun | Verb | Noun phrase
      Noun phrase: Article Adjective Noun
      The large cat eats the small rat
  • 50. Parsing
      Noun phrase | Verb phrase
      Noun phrase: Article Adjective Noun | Verb phrase: Verb, Noun phrase
      Noun phrase: Article Adjective Noun
      The large cat eats the small rat
  • 51. Parsing
      Sentence
      Sentence: Noun phrase | Verb phrase
      Noun phrase: Article Adjective Noun | Verb phrase: Verb, Noun phrase
      Noun phrase: Article Adjective Noun
      The large cat eats the small rat
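The parsing steps on these slides can be sketched as a small top-down parser. This is a minimal illustration, not the lecture's own implementation: the three grammar rules are read off the finished tree (Sentence, Noun phrase, Verb phrase), the lexicon covers only the example sentence, and the names `GRAMMAR`, `LEXICON`, `parse` and `parse_sentence` are assumptions.

```python
# Toy grammar read off the slide: each phrase expands to one fixed sequence.
GRAMMAR = {
    "Sentence": ["NounPhrase", "VerbPhrase"],
    "NounPhrase": ["Article", "Adjective", "Noun"],
    "VerbPhrase": ["Verb", "NounPhrase"],
}

# Lexicon for the example sentence only (an assumption for this sketch).
LEXICON = {
    "the": "Article", "large": "Adjective", "small": "Adjective",
    "cat": "Noun", "rat": "Noun", "eats": "Verb",
}


def parse(symbol, words, pos):
    """Match `symbol` starting at words[pos].

    Returns (tree, next_pos) on success and None on failure.
    """
    if symbol in GRAMMAR:
        children = []
        for child in GRAMMAR[symbol]:
            result = parse(child, words, pos)
            if result is None:
                return None
            subtree, pos = result
            children.append(subtree)
        return (symbol, children), pos
    # Otherwise `symbol` is a part of speech: check the next word's category.
    if pos < len(words) and LEXICON.get(words[pos].lower()) == symbol:
        return (symbol, words[pos]), pos + 1
    return None


def parse_sentence(text):
    """Return the syntactic tree if the sentence is legal, else None."""
    words = text.split()
    result = parse("Sentence", words, 0)
    if result is None or result[1] != len(words):
        return None
    return result[0]


print(parse_sentence("The large cat eats the small rat"))
print(parse_sentence("The cat large eats the small rat"))
```

The first call returns a nested tree rooted at Sentence; the second returns None because the word order violates the Noun phrase rule, which is the "is this sentence legal?" half of parsing.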
  • 52. Syntactic Tree • The point where lines begin or end is called a node • Each node has a label such as S, PP or chased • If two nodes are connected by a line, the upper node is the immediate dominator of the lower node. D is the immediate dominator of "the" • Upper nodes in a branch are called dominators. NP is the dominator of D, N, "the" and "dog"
  • 53. Syntactic Tree • Two nodes are sisters if they are immediately dominated by the same node. D and N are sisters. • Their immediate dominator is called their mother. NP is the mother of D and N. Similarly, D and N are daughters of NP • The immediate dominators of nodes are called their parents
  • 54. Syntactic Tree • Constituents are the terminal nodes that are all dominated by a single non-terminal node. "Chased a cat into the garden" is a constituent, as its words are all dominated by VP
  • 55. Label Bracketing • It is a process of representing the syntactic tree in another way.
  • 56. Do it yourself: label-bracket the tree
  • 57. Remember, you may have to practise the reverse: constructing a syntactic tree from label bracketing ☺
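Label bracketing and its reverse can be sketched in a few lines. This is an illustration under assumptions: a tree is stored as a `(label, children)` pair with plain strings as words, the bracket style `[NP [D the] [N dog]]` is one common convention rather than necessarily the one drawn on the slide, and the function names are hypothetical.

```python
def to_brackets(tree):
    """Turn a (label, children) tree into a labelled bracketing string."""
    if isinstance(tree, str):  # a word (terminal node)
        return tree
    label, children = tree
    return "[" + label + " " + " ".join(to_brackets(c) for c in children) + "]"


def from_brackets(text):
    """Rebuild the (label, children) tree from a labelled bracketing string."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()

    def read(i):
        # tokens[i] is "[" and tokens[i + 1] is the node label.
        label = tokens[i + 1]
        children = []
        i += 2
        while tokens[i] != "]":
            if tokens[i] == "[":
                child, i = read(i)
            else:
                child, i = tokens[i], i + 1
            children.append(child)
        return (label, children), i + 1

    tree, _ = read(0)
    return tree


noun_phrase = ("NP", [("D", ["the"]), ("N", ["dog"])])
print(to_brackets(noun_phrase))
print(from_brackets("[NP [D the] [N dog]]") == noun_phrase)
```

In this representation, D and N are sisters because they sit in the same `children` list, and NP is their mother.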
  • 58. Constituents and Categories • A tree structure provides two pieces of information: 1. It divides the sentence into constituents (in English, these are called phrases) 2. It puts them into categories (NP, VP, etc.)
  • 59. Constituents and Categories • How do we know the right way to group words into the right category? • How do we know that "into the garden" is a constituent, but "a cat into" is not? • Any words that can be moved as a group are probably a constituent. Compare the meaning of "The dog chased a cat into the garden" and "Into the garden, the dog chased a cat". • Which group did you move? "Into the garden", right? • And the meaning did not change • That is probably our constituent
  • 60. Constituents and Categories • Any string of words that can be deleted is probably a constituent • If you omit "into the garden" from the sentence, it stays grammatical. • Usually, a constituent also makes sense as a unit of meaning. "Into the garden" is much more meaningful than "a cat into"
  • 61. Constituents and Categories • However, we are only talking about syntactic structure, not semantic structure. • "The dog", "the cat" and "the garden": their grammatical structure says they are all noun phrases. • This means they can be used interchangeably, and no linguist can deny that • Then what about "The garden chased the cat into the dog"? ☺ • "But you said before that we would not focus on semantics!"
  • 62. Ambiguity • There are two types of ambiguity: 1. Lexical ambiguity: the sentence contains an idiom, word or term that has more than one meaning. "Glasses" means both drinking glasses and spectacles 2. Structural ambiguity: the sentence has more than one syntactic tree. "I saw the boy with the telescope": did you use a telescope to see the boy, or did you see the boy who had a telescope?
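Structural ambiguity can be made concrete by counting how many syntactic trees a grammar assigns to a sentence. The sketch below assumes a small hand-written grammar with binary rules (S, NP, VP, PP and the word categories D, N, V, P) that is not taken from the slides; `count_trees` is a hypothetical name. It counts trees span by span, in the style of a CYK chart.

```python
from functools import lru_cache

# Assumed toy grammar: every rule rewrites a parent into exactly two children.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP modifies the seeing
    ("NP", "D", "N"),
    ("NP", "NP", "PP"),   # the PP modifies the boy
    ("PP", "P", "NP"),
]

# Assumed lexicon; the pronoun "I" is treated directly as a noun phrase.
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["D"],
    "boy": ["N"], "telescope": ["N"], "with": ["P"],
}


def count_trees(sentence, root="S"):
    """Count the distinct syntactic trees for `sentence` under the toy grammar."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def count(symbol, start, end):
        if end - start == 1:
            return 1 if symbol in LEXICON.get(words[start], []) else 0
        total = 0
        for parent, left, right in BINARY_RULES:
            if parent != symbol:
                continue
            for split in range(start + 1, end):
                total += count(left, start, split) * count(right, split, end)
        return total

    return count(root, 0, len(words))


print(count_trees("I saw the boy with the telescope"))  # 2
print(count_trees("I saw the boy"))                     # 1
```

The two trees differ only in where "with the telescope" attaches: to the verb phrase (the telescope is the instrument) or to the noun phrase (the boy has the telescope).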
  • 64. Difficulties with Natural Language: Anaphora • Using pronouns to refer back to entities already introduced in the text • After Mary proposed to John, they found a preacher and got married. • For the honeymoon, they went to Hawaii. • Mary saw a ring through the window and asked John for it. • Mary threw a rock at the window and broke it.
  • 66. Difficulties with Natural Language: Metonymy • Using one noun phrase to stand for another • I've read Shakespeare • Chrysler announced record profits • The ham sandwich on Table 4 wants another beer
  • 67. Difficulties with Natural Language: Metaphor • "Non-literal" usage of words and phrases, often systematic. • I've tried killing the process but it won't die. Its parent keeps it alive.
  • 68. Semantics in NL • I can't untie that knot with one hand. – The sentence is about the abilities of whoever spoke or wrote it. (Call this person the speaker.) – It's also about a knot, maybe one that the speaker is pointing at – The sentence denies that the speaker has a certain ability. (This is the contribution of the word `can't'.) – Untying is a way of making something not tied. – The sentence doesn't mean that the knot has one hand; it has to do with how many hands are used to do the untying.
  • 69. Problems in Semantics in NL • If you do not understand certain characteristics of linguistics, you will not be able to understand the semantics. • If you do understand them, you need to feel them • If you do feel them, you need to see the context • If you see the context, you are dealing with both semantics and pragmatics in NL • ☺
  • 70. Synonymy • Synonyms are different words (or sometimes phrases) with identical or very similar meanings. • Words that are synonyms are said to be synonymous, and the state of being a synonym is called synonymy
  • 71. Synonymy • student and pupil (noun) • buy and purchase (verb) • sick and ill (adjective) • quickly and speedily (adverb) • on and upon (preposition)
  • 72. Synonymy • Note that synonyms are defined with respect to certain senses of words • pupil as the "aperture in the iris of the eye" is not synonymous with student. • Similarly, "he expired" means the same as "he died", yet "my passport has expired" cannot be replaced by "my passport has died".
  • 73. Antonymy • Antonyms are words with opposite or nearly opposite meanings. For example: • short and tall • dead and alive • increase and decrease
  • 74. Homonymy • A homonym is one of a group of words that share the same spelling and the same pronunciation but have different meanings, usually as a result of the two words having different origins. • The state of being a homonym is called homonymy. • bark (the sound of a dog) and bark (the skin of a tree).
  • 75. Heteronymy • Heteronyms (also known as heterophones) are words with identical spellings (or characters) but different pronunciations and meanings.
  • 76. Monolingual ambiguity • morphological ambiguity: – German -en: noun plural, dative plural, weak noun non-nominative, adjective masculine non-nominative, etc. • compound nouns: – coincide -> coin+cide, cooperate -> cooper+ate • category ambiguity: – round: the first round (noun), to round up cattle (verb), the round table (adjective), go on a voyage round the Mediterranean (preposition), it measures three feet round (adverb), etc. • homographs and polysemes: – branch: ‘of a tree’, ‘of a bank’; crane (a bird or lifting machine) – ball: The ball rolled down the hill, The ball lasted until midnight
  • 77. Bilingual lexical ambiguity • English wall: German Mauer (outside) or Wand (inside) • English river: French fleuve (major) or rivière (general term) • English leg: French jambe (human), patte (animal, insect), pied (table), étape (journey) • English blue: Russian goluboi (pale blue) or sinii (dark blue) • French louer: English hire or rent • German leihen: English borrow or lend • English wear: Japanese haoru (coat/jacket), haku (shoes/trousers), kaburu (hat), hameru (ring/gloves), shimeru (belt/tie/scarf), tsukeru (brooch/clip), kakeru (glasses/necklace) • resolvable by: – rules (indicating allowable or usual categories or types of subjects, objects, verbs, etc.) – collocations (specifying particular adjacent words) – frequencies (most probable adjacent or dependent words)
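The "rules" route for resolving bilingual lexical ambiguity can be sketched with the English "wear" example: the Japanese verb is chosen from the type of object being worn. The table below copies the pairs from the slide; the function name `translate_wear` and the choice to return `None` for an unlisted object are assumptions for illustration, and a real system would need far broader coverage.

```python
# Object worn -> Japanese verb for "wear", as listed on the slide.
WEAR_BY_OBJECT = {
    "coat": "haoru", "jacket": "haoru",
    "shoes": "haku", "trousers": "haku",
    "hat": "kaburu",
    "ring": "hameru", "gloves": "hameru",
    "belt": "shimeru", "tie": "shimeru", "scarf": "shimeru",
    "brooch": "tsukeru", "clip": "tsukeru",
    "glasses": "kakeru", "necklace": "kakeru",
}


def translate_wear(worn_object):
    """Pick the Japanese verb for 'wear' from the object; None if it is not listed."""
    return WEAR_BY_OBJECT.get(worn_object.lower())


print(translate_wear("hat"))       # kaburu
print(translate_wear("Shoes"))     # haku
print(translate_wear("umbrella"))  # None
```

The same lookup pattern would apply to the other pairs on the slide, for example choosing Mauer or Wand for "wall" from an inside/outside feature.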
  • 78. Structural ambiguity • Flying planes can be dangerous • The man saw the girl with a telescope • John mentioned the book I sent to Mary • I told everyone concerned about the strike – everyone concerned/involved/relevant, or: everyone disturbed/worried • He noticed her shaking hands – either which were shaking from cold, or which were shaking other hands • They complained to the guide that they could not hear – that as relative pronoun (‘whom they could not hear’) or as complementizer (‘that they could not hear him’) • The mathematics students sat their examinations • The mathematics students study today is very complex – difficulty of identifying noun compound vs. relative clause • Gas pump prices rose last time oil stocks fell – each word potentially noun or verb