1. Rushdi Shams, Dept of CSE, KUET, Bangladesh 1
Syntax and Semantics
Artificial Intelligence
Version 1.0
Drop me a mail: rushdecoder@yahoo.com
Find me on web: http://rushdishams.googlepages.com
Gather at: http://groups.google.com/group/csebatchesofrushdi
Introduction
Natural language means any language that humans speak
Natural Language Processing (NLP) means processing such a
language so that it can be represented in a
machine-readable format
Languages are built on grammar
If you do not understand grammar, how can you develop
NLP systems that read (parse) sentences and recognize
every word and its Part of Speech (POS)?
How can you represent the knowledge in a sentence with
knowledge representation software if you yourself do not
understand the way language is formed?
Components of a Language
There are three components of a language:
1. Lexicon
2. Categorization
3. Grammar Rules
Lexicon
stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
right | left | east | south | back | smelly | ...
here | there | nearby | ahead | right | left | east | south | back | ...
me | you | I | it | S/HE | Y’ALL | ...
John | Mary | Boston | UCB | PAJC | ...
the | a | an | ...
to | in | on | near | ...
and | or | but | ...
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Categorization
Noun >> stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | ...
Verb >> is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | ...
Adjective >> right | left | east | south | back | smelly | ...
Adverb >> here | there | nearby | ahead | right | left | east | south | back | ...
Pronoun >> me | you | I | it | S/HE | Y’ALL | ...
Name >> John | Mary | Boston | UCB | PAJC | ...
Article >> the | a | an | ...
Preposition >> to | in | on | near | ...
Conjunction >> and | or | but | ...
Digit >> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
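The categorized lexicon above can be held in a program as a simple lookup table. The following is a minimal sketch, not a full POS tagger: the names LEXICON, categories_of and tag are assumptions made for illustration, the open-ended "..." entries are left out, and the pronoun list is abbreviated.

```python
# Categorized lexicon from the slide; the open-ended "..." entries are omitted
# and the pronoun list is abbreviated.
LEXICON = {
    "Noun": ["stench", "breeze", "glitter", "nothing", "wumpus", "pit", "pits", "gold", "east"],
    "Verb": ["is", "see", "smell", "shoot", "feel", "stinks", "go", "grab", "carry", "kill", "turn"],
    "Adjective": ["right", "left", "east", "south", "back", "smelly"],
    "Adverb": ["here", "there", "nearby", "ahead", "right", "left", "east", "south", "back"],
    "Pronoun": ["me", "you", "I", "it"],
    "Name": ["John", "Mary", "Boston", "UCB", "PAJC"],
    "Article": ["the", "a", "an"],
    "Preposition": ["to", "in", "on", "near"],
    "Conjunction": ["and", "or", "but"],
    "Digit": list("0123456789"),
}


def categories_of(word):
    """Return every category that lists the word; a word may belong to several."""
    return [category for category, words in LEXICON.items() if word in words]


def tag(sentence):
    """Pair each word of a whitespace-separated sentence with its categories."""
    return [(word, categories_of(word)) for word in sentence.split()]


if __name__ == "__main__":
    print(tag("I smell a wumpus"))
    print(categories_of("east"))
```

Note that a word such as east appears under several categories, so the lookup returns a list rather than a single POS; choosing among them needs the grammar rules.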
Grammar Structure
Attending this lecture and the one following it
carefully does not mean you will know all of the English
language
That would take studying NLP as a single subject
for 4 years!
We will learn how to define the basic grammar
structure for NLP systems
We will also learn what you need to keep in
mind while devising such systems
Syntactic Tree
Humans recognize how the words of a sentence are
organized according to their POS by means of trees.
Do you disagree?
Well, you can, because you did not learn it this way in
your childhood.
No one did!
But research suggests that our brain builds a tree-like
structure when we first develop our language skills
That research is beyond the scope of this lecture
Syntactic Tree
So, if you really do that unconsciously, why not
learn it with pen and paper, so that you can understand
how you will teach machines to learn languages?
The tree structure that humans build mentally is called a
syntactic tree
Parsing a Syntactic Tree
Parsing is the process of using grammar rules to
determine whether a sentence is legal, and to obtain
its syntactical structure
‘The large cat eats the small rat’
Parsing: The large cat eats the small rat
1. Label each word with its category: The (Article) large (Adjective) cat (Noun) eats (Verb) the (Article) small (Adjective) rat (Noun)
2. Group the small rat (Article Adjective Noun) into a noun phrase
3. Group The large cat (Article Adjective Noun) into a noun phrase
4. Group the Verb eats and the noun phrase that follows it into a verb phrase
5. Group the noun phrase and the verb phrase into a sentence
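The bottom-up grouping walked through in these slides can be sketched in a few lines of code. This is a minimal illustration, assuming a toy word list (POS) and three rules (RULES) read off the slide sequence; the greedy "apply the first rule that fits" strategy is a simplification and not a general parsing algorithm.

```python
# Word categories for the example sentence (toy lexicon, assumed for this sketch).
POS = {"the": "Article", "large": "Adjective", "small": "Adjective",
       "cat": "Noun", "rat": "Noun", "eats": "Verb"}

# Grammar rules as (right-hand side, left-hand side), read off the slides.
RULES = [
    (("Article", "Adjective", "Noun"), "NounPhrase"),
    (("Verb", "NounPhrase"), "VerbPhrase"),
    (("NounPhrase", "VerbPhrase"), "Sentence"),
]


def reduce_once(symbols):
    """Replace the first occurrence of some rule's right-hand side, or return None."""
    for rhs, lhs in RULES:
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if tuple(symbols[i:i + n]) == rhs:
                return symbols[:i] + [lhs] + symbols[i + n:]
    return None


def parse_bottom_up(sentence):
    """Return the list of symbol sequences produced while reducing the sentence."""
    symbols = [POS[word] for word in sentence.lower().split()]
    steps = [symbols]
    while True:
        reduced = reduce_once(symbols)
        if reduced is None:
            return steps
        symbols = reduced
        steps.append(symbols)


if __name__ == "__main__":
    for step in parse_bottom_up("The large cat eats the small rat"):
        print(" ".join(step))
```

Under these rules the sentence is legal when the last step is the single symbol Sentence. A word that is missing from POS raises a KeyError in this sketch.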
Syntactic Tree
The point where lines begin or end is called a node
Each node has a label such as S, PP or chased
If 2 nodes are connected by a line, the upper node is the immediate
dominator of the lower node. D is the immediate dominator of ‘the’
All the upper nodes in a branch are called dominators. NP is a dominator
of D, N, the, dog
Syntactic Tree
Two nodes are sisters if they are immediately dominated by the
same node. D and N are sisters.
Their immediate dominator is called their mother. NP is
the mother of D and N. Similarly, D and N are daughters of NP
All of their dominators together are called their ancestors
Syntactic Tree
A constituent is a group of terminal nodes that are all
dominated by a single non-terminal node. ‘Chased a
cat into the garden’ is a constituent because all of its words are
dominated by VP
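The tree vocabulary (node, immediate dominator or mother, dominators, sisters, constituent) maps directly onto a small data structure. The sketch below is an illustration only: the Node class and the leaf helper are hypothetical names, and the shape of the tree for ‘the dog chased a cat into the garden’ is an assumption, since the slide's figure is not reproduced in the text.

```python
class Node:
    """A labelled tree node that remembers its mother (immediate dominator)."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.mother = None
        for child in self.children:
            child.mother = self

    def dominators(self):
        """All nodes above this one, nearest first."""
        result, node = [], self.mother
        while node is not None:
            result.append(node)
            node = node.mother
        return result

    def sisters(self):
        """Other nodes immediately dominated by the same mother."""
        if self.mother is None:
            return []
        return [child for child in self.mother.children if child is not self]

    def words(self):
        """Terminal nodes dominated by this node, left to right."""
        if not self.children:
            return [self.label]
        return [word for child in self.children for word in child.words()]


def leaf(category, word):
    """A category node that immediately dominates one word."""
    return Node(category, Node(word))


# Assumed tree for: the dog chased a cat into the garden
d, n = leaf("D", "the"), leaf("N", "dog")
subject = Node("NP", d, n)
vp = Node("VP", leaf("V", "chased"),
          Node("NP", leaf("D", "a"), leaf("N", "cat")),
          Node("PP", leaf("P", "into"),
               Node("NP", leaf("D", "the"), leaf("N", "garden"))))
s = Node("S", subject, vp)

if __name__ == "__main__":
    print(d.mother.label)                    # mother of D
    print([x.label for x in d.sisters()])    # sisters of D
    print(" ".join(vp.words()))              # the constituent under VP
```

Storing the mother on each node keeps the dominator and sister queries short; the price is that a node can belong to only one tree at a time.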
Label Bracketing
Label bracketing is another way of
representing a syntactic tree: each constituent is written
inside square brackets together with its category label.
Do it yourself: label-bracket the tree
Remember, you may also have to practise the reverse:
constructing a syntactic tree from label
bracketing
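Both directions, tree to label bracketing and back, can be sketched as follows. This is a minimal illustration under stated assumptions: trees are written as nested tuples (label, child, ...), the function names to_brackets and from_brackets are hypothetical, the example tree is an assumed one rather than the tree on the slide, and the reader expects well-formed input.

```python
def to_brackets(tree):
    """tree is either a word (str) or a tuple (label, child, child, ...)."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(to_brackets(c) for c in children) + "]"


def from_brackets(text):
    """Rebuild the nested-tuple tree from a well-formed label bracketing."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "[":
            label = tokens[pos + 1]
            pos += 2
            children = []
            while tokens[pos] != "]":
                children.append(read())
            pos += 1  # skip the closing bracket
            return (label, *children)
        word = tokens[pos]
        pos += 1
        return word

    tree = read()
    if pos != len(tokens):
        raise ValueError("unexpected text after the closing bracket")
    return tree


# Assumed example tree for: the dog chased a cat
TREE = ("S",
        ("NP", ("D", "the"), ("N", "dog")),
        ("VP", ("V", "chased"), ("NP", ("D", "a"), ("N", "cat"))))

if __name__ == "__main__":
    text = to_brackets(TREE)
    print(text)
    print(from_brackets(text) == TREE)
```

Converting a tree to brackets and back should give the original tree, which is a handy check when practising by hand.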
Constituents and Categories
A tree structure provides two pieces of information:
1. It divides the sentence into constituents (in
English, these are called phrases)
2. It puts them into categories (NP, VP, etc.)
Constituents and Categories
How do we know the right way to group
words into constituents?
How do we know that ‘into the garden’ is a constituent, but ‘a cat
into’ is not?
Any words that can be moved as a group are probably a
constituent. Compare ‘the dog chased a cat into
the garden’ with ‘into the garden, the dog chased a cat’.
Which words moved? ‘Into the garden’, right?
And the meaning did not change
So ‘into the garden’ is probably a constituent
Constituents and Categories
Any string of words that can be deleted is
probably a constituent
If you omit ‘into the garden’ from the sentence,
the sentence is still grammatical.
Usually, a constituent makes sense as a unit of meaning.
‘Into the garden’ is much more meaningful than ‘a
cat into’
Constituents and Categories
However, we are only talking about syntactic
structure, not semantic structure.
The dog, the cat and the garden: their grammatical
structure says they are all noun phrases.
That means that, syntactically, they can be used interchangeably; no
linguist can deny that
Then what about “The garden chased the cat
into the dog”?
You said before that we will not focus on semantics!
Ambiguity
There are 2 types of ambiguity:
1. Lexical ambiguity: the sentence contains a
word, term or idiom that has more than one meaning.
Glasses means both drinking glasses and spectacles
2. Structural ambiguity: the sentence has more than one
syntactic tree
I saw the boy with the telescope
Did you use a telescope to see the boy? Or
did you see the boy who had a telescope?
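Structural ambiguity can be checked mechanically by counting the syntactic trees a grammar allows for a sentence. The sketch below uses a chart (CKY-style) counter; the grammar in BINARY and the word list in LEXICAL are assumptions chosen for this one example, not rules given in the lecture.

```python
from collections import defaultdict

# Assumed toy grammar: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP attaches to the seeing
    ("NP", "D", "N"),
    ("NP", "NP", "PP"),   # the PP attaches to the boy
    ("PP", "P", "NP"),
]
# Assumed toy lexicon; the pronoun I is treated directly as an NP.
LEXICAL = {"I": ["NP"], "saw": ["V"], "the": ["D"], "boy": ["N"],
           "telescope": ["N"], "with": ["P"]}


def count_trees(words, root="S"):
    """Count the distinct trees with the given root that cover all the words."""
    n = len(words)
    # chart[(i, j, X)] = number of trees of category X over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for category in LEXICAL[word]:
            chart[(i, i + 1, category)] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    chart[(i, j, parent)] += chart[(i, k, left)] * chart[(k, j, right)]
    return chart[(0, n, root)]


if __name__ == "__main__":
    print(count_trees("I saw the boy with the telescope".split()))
    print(count_trees("I saw the boy".split()))
```

A count above one signals structural ambiguity under the chosen grammar. Here the two trees differ only in whether ‘with the telescope’ attaches to the verb phrase or to ‘the boy’.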
Ambiguity
Which of the following examples are lexically
ambiguous and which are structurally
ambiguous? Justify your answers:
1. The painter put on another coat
2. We like flying planes
3. The judge threw the book at him
4. Visiting relatives can be tiresome
Ambiguity
Da Vinci liked to paint his models nude.
He wrote the note yesterday
You mean you carried the information by a bus?
Connecting wires are tiring in DLD lab
Squad helps dog bite victim
Now we will take a look at how
to construct a grammar
Noun Phrase
We will start with a Noun Phrase (NP)
At first glance, we can say that an NP has the form:
NP -> D (Adj) N (PP)
the dog -> D N
the gray cat -> D Adj N
the dog in the garden -> D N PP
the young boy with the telescope -> D Adj N PP
p.s. ( ) marks an optional element
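The optional elements in NP -> D (Adj) N (PP) behave like optional groups in a regular expression over category tags. The following is a minimal sketch, assuming the phrase has already been tagged and that a PP arrives as a single pre-built unit; NP_PATTERN and is_np are names made up for illustration.

```python
import re

# NP -> D (Adj) N (PP), written over space-separated category tags.
NP_PATTERN = re.compile(r"D( Adj)? N( PP)?")


def is_np(tags):
    """tags is a list such as ["D", "Adj", "N"]; PP counts as one unit."""
    return NP_PATTERN.fullmatch(" ".join(tags)) is not None


if __name__ == "__main__":
    print(is_np(["D", "N"]))                 # the dog
    print(is_np(["D", "Adj", "N", "PP"]))    # the young boy with the telescope
    print(is_np(["D", "Adj", "Adj", "N"]))   # two adjectives are not covered by this rule
```

The last call shows the limit of the rule as written: it allows at most one adjective.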
Determiner
Determiners can be NULL as in-
Birds fly
So, for the rule
NP -> D (Adj) N (PP)
we can say that D -> NULL or
D -> a, an, the
p.s. for simplicity, we do not treat ‘the two’ as a
determiner here, although it is eligible to be one
Noun Phrase (continued)
An NP can simply be a pronoun
NP -> Pronoun
An NP can simply be a name as well
NP -> Names
Pronouns and Names are then:
Pronoun -> he, she, we, you, they, I
Names -> Mehedi, Shams, Rushdi
Adjectives
English permits any number of adjectives in front
of a noun.
Therefore, our rule for noun phrase
NP -> D (Adj) N (PP)
will face problems for-
Adjectives
No problem! We can change the rule as follows-
That will help us with a recursion like-
Adjective Phrase
An adjective is not always a single word
It can be a phrase as in-
So, in that case, we need to change the rule slightly-
Sentences within NPs
In NPs such as ‘the fact that birds fly’, there is a
sentence after the complementizer ‘that’
So, we need to change the rule as-
And we also need rules as follows to satisfy such NPs-
Parsing
Now, with these rules in hand, we will see
how a machine parses the following sentence top-down and
bottom-up:
The fact that birds fly surprised him
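A top-down parse starts from S and expands rules until the words are reached. The sketch below covers the top-down direction only, as a backtracking recursive-descent parser. The grammar is an assumption: the rule slides for sentences inside NPs are not reproduced in the text, so the CP rule, the category name C for the complementizer, and the use of NP -> N for a NULL determiner are stand-ins chosen for illustration.

```python
# Assumed grammar; CP and C are hypothetical names for the "that + sentence" part.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N", "CP"], ["D", "N"], ["N"], ["Pronoun"]],
    "CP": [["C", "S"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {"the": "D", "fact": "N", "that": "C", "birds": "N",
           "fly": "V", "surprised": "V", "him": "Pronoun"}


def parse(symbol, words, start):
    """Yield (tree, end) for every way symbol can cover words[start:end]."""
    if symbol not in GRAMMAR:  # a word category: match one word
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, words, start, [])


def expand(symbol, rhs, words, pos, children):
    """Match the remaining right-hand side symbols one after another."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, nxt in parse(rhs[0], words, pos):
        yield from expand(symbol, rhs[1:], words, nxt, children + [child])


def parse_sentence(sentence):
    """Return every tree rooted in S that covers the whole sentence."""
    words = sentence.lower().split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


if __name__ == "__main__":
    for tree in parse_sentence("The fact that birds fly surprised him"):
        print(tree)
```

The parser tries each rule for a symbol in order and backtracks when a rule fails, which is the essence of top-down parsing. It would loop forever on a left-recursive rule such as NP -> NP PP, so the assumed grammar avoids those.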
Difficulties with Natural Language:
Anaphora
Using pronouns to refer back to entities already
introduced in the text
After Mary proposed to John, they found a preacher
and got married.
For the honeymoon, they went to Hawaii
Mary saw a ring through the window and asked John
for it
Mary threw a rock at the window and broke it
Difficulties with Natural Language:
Indexicality
Indexical sentences refer to the utterance situation
(place, time, etc.)
I am over here
Why did you do that?
Difficulties with Natural Language:
Metonymy
Using one noun phrase to stand for another
I've read Shakespeare
Chrysler announced record profits
The ham sandwich on Table 4 wants another beer
Difficulties with Natural Language:
Metaphor
“Non-literal” usage of words and phrases, often
systematic.
I've tried killing the process but it won't die. Its parent
keeps it alive.
Reference
Natural Language Processing for Prolog Programmers by Michael A.
Covington,
Chapter 4
Part II
Semantics
Structure of Language
[Figure: the levels of language structure, Words, Sentences (Phrases) and Discourse, labelled with Syntax, Semantics and Pragmatics]
Semantics in NL
I can't untie that knot with one hand.
The sentence is about the abilities of whoever spoke
or wrote it. (Call this person the speaker.)
It's also about a knot, maybe one that the speaker is
pointing at
The sentence denies that the speaker has a certain
ability. (This is the contribution of the word ‘can't’.)
Untying is a way of making something not tied.
The sentence doesn't mean that the knot has one
hand; it has to do with how many hands are used to do
the untying.
Problems in Semantics in NL
If you do not understand certain characteristics of
linguistics, you will not be able to understand
semantics.
If you do understand them, you need to feel them
If you do feel them, you need to see the context
If you see the context, you are dealing with both
semantics and pragmatics in NL
Synonymy
Synonyms are different words (or sometimes
phrases) with identical or very similar meanings.
Words that are synonyms are said to
be synonymous, and the state of being a synonym
is called synonymy
Synonymy
student and pupil (noun)
buy and purchase (verb)
sick and ill (adjective)
quickly and speedily (adverb)
on and upon (preposition)
Synonymy
Note that synonyms are defined with respect to
certain senses of words
pupil as the "aperture in the iris of the eye" is not
synonymous with student.
Similarly, ‘he expired’ means the same as ‘he died’,
yet ‘my passport has expired’ cannot be replaced
by ‘my passport has died’.
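Because synonymy holds between senses rather than between whole words, a program has to store synonyms per sense. The sketch below shows one possible layout; the table SENSES, the sense labels and the function synonymous are hypothetical and only cover the two examples from the slide.

```python
# word -> sense label -> words that can replace it in that sense.
# The sense labels are made up for this sketch.
SENSES = {
    "pupil": {
        "learner": {"student"},
        "aperture in the iris of the eye": set(),
    },
    "expire": {
        "stop living": {"die"},
        "cease to be valid": set(),
    },
}


def synonymous(word, other, sense):
    """True if other can replace word when word is used in the given sense."""
    return other in SENSES[word][sense]


if __name__ == "__main__":
    print(synonymous("pupil", "student", "learner"))
    print(synonymous("expire", "die", "cease to be valid"))
```

With this layout, ‘expire’ and ‘die’ are synonyms for a person but not for a passport, which matches the examples above.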
Antonymy
Antonyms are words with opposite or nearly
opposite meanings. For example:
short and tall
dead and alive
increase and decrease
Homonymy
A homonym is one of a group of words that
share the same spelling and the same pronunciation but
have different meanings, usually as a result of the two
words having different origins.
The state of being a homonym is called homonymy.
bark (the sound of a dog) and bark (the skin of a
tree).
Heteronymy
Heteronyms (also known as heterophones) are
words with identical spellings (or characters)
but different pronunciations and meanings.
Semantic Features
Nouns can be classified according to the values of a
set of semantic features.
For example, the word boy can be paraphrased as
young male human, man as not-young male human.
The difference in meaning between boy and man
seems to reside in the value of age.
Semantic Features
Thus we could take AGE to be a primitive, here taking
the values young and not-young.
Boy is defined by
AGE = young (among other features), man by AGE =
not-young.
Semantic Features
Some more general semantic features which have been used for nouns
include:
1. ABSTRACT: Nouns with the feature +ABSTRACT are abstract or non-concrete (e.g.
sincerity); those with the feature -ABSTRACT are concrete (e.g. Jane or water).
2. COMMON: All ‘proper nouns’ (names of people, things, etc.) are -COMMON.
Thus Jane (referring to a specific person) is -COMMON, as is London. Other
nouns are +COMMON, e.g. dog, mankind or sincerity.
3. COUNT: Nouns which can be made plural are +COUNT, e.g. dog. Nouns which are
-COUNT include water (in its usual sense), mankind or sincerity.
4. ANIMATE: Nouns with the feature +ANIMATE are alive; those with -ANIMATE
are not.
5. HUMAN: +HUMAN implies human, -HUMAN implies not human.
6. MALE: +MALE implies male, -MALE implies not male (i.e. either female or
neither).
7. FEMALE: +FEMALE implies female, -FEMALE implies not female (i.e. either male
or neither).
Semantic Features
Nouns can then be classified using sets of
these primitive features. For example:
Jane [-ABSTRACT, -COMMON, -COUNT, +ANIMATE, +HUMAN, -MALE, +FEMALE]
boy [-ABSTRACT, +COMMON, +COUNT, +ANIMATE, +HUMAN, +MALE, -FEMALE]
idea [+ABSTRACT, +COMMON, +COUNT, -ANIMATE, -HUMAN, -MALE, -FEMALE]
sincerity [+ABSTRACT, +COMMON, -COUNT, -ANIMATE, -HUMAN, -MALE, -FEMALE]
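The feature sets for these four nouns can be stored as dictionaries of true/false values, which makes it easy to compare nouns or select them by feature. The sketch below is one possible encoding; the helper names parse_features, differences and nouns_with are assumptions for illustration.

```python
def parse_features(spec):
    """Turn "+HUMAN -MALE" into {"HUMAN": True, "MALE": False}."""
    return {item[1:]: item[0] == "+" for item in spec.split()}


# Feature sets copied from the slide.
NOUNS = {
    "Jane": parse_features("-ABSTRACT -COMMON -COUNT +ANIMATE +HUMAN -MALE +FEMALE"),
    "boy": parse_features("-ABSTRACT +COMMON +COUNT +ANIMATE +HUMAN +MALE -FEMALE"),
    "idea": parse_features("+ABSTRACT +COMMON +COUNT -ANIMATE -HUMAN -MALE -FEMALE"),
    "sincerity": parse_features("+ABSTRACT +COMMON -COUNT -ANIMATE -HUMAN -MALE -FEMALE"),
}


def differences(a, b):
    """Names of the features on which two nouns disagree."""
    return sorted(f for f in NOUNS[a] if NOUNS[a][f] != NOUNS[b][f])


def nouns_with(**wanted):
    """Nouns whose features match every requested value, e.g. HUMAN=True."""
    return sorted(noun for noun, feats in NOUNS.items()
                  if all(feats[name] == value for name, value in wanted.items()))


if __name__ == "__main__":
    print(differences("idea", "sincerity"))
    print(nouns_with(ABSTRACT=True, COUNT=False))
```

Comparing two entries shows where their meanings part ways: idea and sincerity differ only in COUNT.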
Acknowledgement
Dr. Adel Elsayed
Research Leader, M3C Lab, University of Bolton, UK
Weiqiang Wei
Former PhD Student, University of Bolton, UK
Images have been stolen from Mrs. Web’s son
Master Google.