This lecture lays the theoretical foundations for declarative syntax formalisms and syntax-based language processors, which we will discuss later in the course. We introduce the notions of formal languages, formal grammars, and syntax trees, starting from Chomsky's work on formal grammars as generative devices.
We start with a formal model of languages and investigate formal grammars and their derivation relations as finite models of infinite productivity. We further discuss several classes of formal grammars and their corresponding classes of formal languages. In a second step, we introduce the word problem, analyse its decidability and complexity for different classes of formal languages, and discuss the consequences of this analysis for language processing. We conclude the lecture with a discussion of parse tree construction, abstract syntax trees, and ambiguities.
16. Formal Grammars 13
vocabulary Σ
finite, nonempty set of elements (words, letters)
alphabet
string over Σ
finite sequence of elements chosen from Σ
word, sentence, utterance
formal language λ
set of strings over a vocabulary Σ
λ ⊆ Σ*
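The definitions above can be made concrete with a small sketch. The two-letter vocabulary and the example language ("strings starting with a") are assumptions chosen for illustration, not taken from the slides; `strings_over` is a hypothetical helper name.

```python
from itertools import product

def strings_over(sigma, max_len):
    """Enumerate all strings over the vocabulary sigma up to length max_len."""
    for n in range(max_len + 1):
        for letters in product(sorted(sigma), repeat=n):
            yield "".join(letters)

# Illustrative vocabulary (an assumption, not from the slides).
SIGMA = {"a", "b"}

# A finite slice of the infinite set Σ*: all strings of length 0, 1 and 2.
sigma_star_2 = list(strings_over(SIGMA, 2))

# A formal language is any set of strings over Σ; here: strings starting with "a".
language = {w for w in sigma_star_2 if w.startswith("a")}

print(sigma_star_2)
print(sorted(language))
```

Σ* itself is infinite, so the sketch only ever materialises a length-bounded part of it; the subset relation λ ⊆ Σ* still holds on that slice.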
23. Formal Grammars 20
G = (N, Σ, P, S)
Num → Digit Num
Num → Digit
Digit → “0”
Digit → “1”
Digit → “2”
Digit → “3”
Digit → “4”
Digit → “5”
Digit → “6”
Digit → “7”
Digit → “8”
Digit → “9”
decimal numbers
morphology
Σ: finite set of terminal symbols
N: finite set of non-terminal symbols
S∈N: start symbol
P ⊆ N × (N∪Σ)*: set of production rules
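As a minimal sketch, the four components of G can be written down as a plain data structure. The `Grammar` type, its field names, and the `is_well_formed` check are illustrative assumptions; only the Num/Digit rules come from the slide.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    nonterminals: frozenset   # N
    terminals: frozenset      # Σ
    productions: tuple        # P, as (left-hand side, right-hand side) pairs
    start: str                # S

DIGITS = "0123456789"

NUM_GRAMMAR = Grammar(
    nonterminals=frozenset({"Num", "Digit"}),
    terminals=frozenset(DIGITS),
    productions=(
        ("Num", ("Digit", "Num")),
        ("Num", ("Digit",)),
        *(("Digit", (d,)) for d in DIGITS),   # Digit → "0" ... Digit → "9"
    ),
    start="Num",
)

def is_well_formed(g):
    """Check S ∈ N and P ⊆ N × (N ∪ Σ)*."""
    symbols = g.nonterminals | g.terminals
    return g.start in g.nonterminals and all(
        lhs in g.nonterminals and all(s in symbols for s in rhs)
        for lhs, rhs in g.productions
    )

print(is_well_formed(NUM_GRAMMAR))
print(len(NUM_GRAMMAR.productions))
```

Representing right-hand sides as tuples of symbols keeps the empty sequence `()` available, which matches the `*` in (N∪Σ)*.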
25. Formal Grammars 21
decimal numbers, leftmost derivation (production applied at each step in parentheses)
Num
⇒ Digit Num (Num → Digit Num)
⇒ 4 Num (Digit → “4”)
⇒ 4 Digit Num (Num → Digit Num)
⇒ 4 3 Num (Digit → “3”)
⇒ 4 3 Digit Num (Num → Digit Num)
⇒ 4 3 0 Num (Digit → “0”)
⇒ 4 3 0 Digit (Num → Digit)
⇒ 4 3 0 3 (Digit → “3”)
27. Formal Grammars 22
decimal numbers, rightmost derivation (production applied at each step in parentheses)
Num
⇒ Digit Num (Num → Digit Num)
⇒ Digit Digit Num (Num → Digit Num)
⇒ Digit Digit Digit Num (Num → Digit Num)
⇒ Digit Digit Digit Digit (Num → Digit)
⇒ Digit Digit Digit 3 (Digit → “3”)
⇒ Digit Digit 0 3 (Digit → “0”)
⇒ Digit 3 0 3 (Digit → “3”)
⇒ 4 3 0 3 (Digit → “4”)
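The two derivations of 4 3 0 3 can be replayed mechanically. The following is a sketch under the assumption that a derivation is given as a list of productions to apply in order; `derive`, `GROW`, `STOP`, and `digit` are hypothetical names introduced for illustration.

```python
NONTERMINALS = {"Num", "Digit"}

def derive(start, choices, leftmost=True):
    """Replay a derivation: at each step, rewrite the leftmost (or rightmost)
    nonterminal of the sentential form with the given production."""
    form = [start]
    steps = [" ".join(form)]
    for lhs, rhs in choices:
        positions = [i for i, s in enumerate(form) if s in NONTERMINALS]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"expected {lhs} at position {i}, found {form[i]}")
        form[i:i + 1] = rhs          # replace one symbol by the right-hand side
        steps.append(" ".join(form))
    return steps

GROW = ("Num", ["Digit", "Num"])     # Num → Digit Num
STOP = ("Num", ["Digit"])            # Num → Digit

def digit(d):
    return ("Digit", [d])            # Digit → "d"

leftmost = derive("Num", [GROW, digit("4"), GROW, digit("3"),
                          GROW, digit("0"), STOP, digit("3")])
rightmost = derive("Num", [GROW, GROW, GROW, STOP,
                           digit("3"), digit("0"), digit("3"), digit("4")],
                   leftmost=False)

print("\n".join(leftmost))
print("\n".join(rightmost))
```

Both runs apply the same productions the same number of times and end in the same sentence; only the order of application differs.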
34. Formal Grammars 29
formal grammar G = (N, Σ, P, S)
nonterminal symbols N
terminal symbols Σ
production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
left-hand side (N∪Σ)* N (N∪Σ)*: a nonterminal symbol N within a context (N∪Σ)* … (N∪Σ)*
right-hand side (N∪Σ)*: replacement
start symbol S∈N
39. Formal Grammars 30
type-0, unrestricted: P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
type-1, context-sensitive: (a A c, a b c)
type-2, context-free: P ⊆ N × (N∪Σ)*
type-3, regular: (A, x) or (A, xB)
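The four restrictions can be checked rule by rule. This is a sketch, not a definitive classifier: it assumes productions are given as pairs of symbol tuples, that b in the type-1 shape (a A c, a b c) is nonempty (the usual convention, not stated on the slide), and that x in the type-3 shapes is a single terminal. All function names are hypothetical.

```python
def is_regular_rule(lhs, rhs, nonterminals):
    """(A, x) or (A, xB): one nonterminal on the left, terminal x, nonterminal B."""
    if len(lhs) != 1 or lhs[0] not in nonterminals:
        return False
    if len(rhs) == 1:
        return rhs[0] not in nonterminals
    return len(rhs) == 2 and rhs[0] not in nonterminals and rhs[1] in nonterminals

def is_context_sensitive_rule(lhs, rhs, nonterminals):
    """(a A c, a b c): some nonterminal A is replaced by nonempty b in context a, c."""
    for i, symbol in enumerate(lhs):
        if symbol not in nonterminals:
            continue
        a, c = lhs[:i], lhs[i + 1:]
        b_len = len(rhs) - len(a) - len(c)
        if b_len >= 1 and rhs[:len(a)] == a and rhs[len(rhs) - len(c):] == c:
            return True
    return False

def chomsky_type(productions, nonterminals):
    """Return the most restrictive type (3, 2, 1 or 0) that all rules satisfy."""
    if all(is_regular_rule(l, r, nonterminals) for l, r in productions):
        return 3
    if all(len(l) == 1 and l[0] in nonterminals for l, _ in productions):
        return 2
    if all(is_context_sensitive_rule(l, r, nonterminals) for l, r in productions):
        return 1
    return 0

N = {"Num", "Digit", "A", "B", "S"}
num_cf = [(("Num",), ("Digit", "Num")), (("Num",), ("Digit",)),
          (("Digit",), ("0",)), (("Digit",), ("1",))]
num_regular = [(("Num",), ("0", "Num")), (("Num",), ("1", "Num")),
               (("Num",), ("0",)), (("Num",), ("1",))]
with_context = [(("S",), ("A", "B")), (("A", "B"), ("A", "b")), (("A",), ("a",))]
swap = [(("A", "B"), ("B", "A"))]

print(chomsky_type(num_cf, N), chomsky_type(num_regular, N),
      chomsky_type(with_context, N), chomsky_type(swap, N))
```

The binary-digit grammars are shortened stand-ins for the decimal-number example; the regular variant shows how the same language can be described with type-3 rules only.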
42. Formal Grammars 33
formal grammar G = (N, Σ, P, S)
derivation relation ⇒_G ⊆ (N∪Σ)* × (N∪Σ)*
w ⇒_G w’ ⇔ ∃(p, q)∈P: ∃u,v∈(N∪Σ)*: w = u p v ∧ w’ = u q v
formal language L(G) ⊆ Σ*
L(G) = {w∈Σ* | S ⇒_G* w}
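The derivation relation translates almost literally into code. The sketch below assumes sentential forms and both sides of a production are tuples of symbols. The bounded search for L(G) additionally assumes that no production shortens a sentential form, so forms longer than the bound can be discarded; that holds for the example but not for arbitrary grammars. The function names and the binary-digit grammar are illustrative.

```python
def one_step(w, productions):
    """All w' with w ⇒ w': w = u p v and w' = u q v for some (p, q) in P."""
    results = set()
    for p, q in productions:
        for i in range(len(w) - len(p) + 1):
            if w[i:i + len(p)] == p:          # u = w[:i], v = w[i + len(p):]
                results.add(w[:i] + q + w[i + len(p):])
    return results

def language_up_to(productions, start, terminals, max_len):
    """Strings w ∈ Σ* with S ⇒* w and at most max_len symbols (bounded search)."""
    seen, frontier, words = {start}, [start], set()
    while frontier:
        w = frontier.pop()
        if all(s in terminals for s in w):
            words.add("".join(w))
            continue
        for w2 in one_step(w, productions):
            # Pruning by length is only sound for non-shortening productions.
            if len(w2) <= max_len and w2 not in seen:
                seen.add(w2)
                frontier.append(w2)
    return words

# Binary-digit variant of the Num grammar (shortened for illustration).
P = [(("Num",), ("Digit", "Num")), (("Num",), ("Digit",)),
     (("Digit",), ("0",)), (("Digit",), ("1",))]

print(sorted(one_step(("Digit", "Num"), P)))
print(sorted(language_up_to(P, ("Num",), {"0", "1"}, 2)))
```

`one_step` tries every production at every position, which is exactly the existential quantification over (p, q), u and v in the definition.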
68. Formal Grammars 46
syntax trees
different trees for same sentence
derivations
different leftmost derivations for same sentence
different rightmost derivations for same sentence
NOT just different derivations for same sentence
context-free grammars
ambiguity
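Ambiguity can be observed by counting syntax trees. The sketch below uses the grammar E → E "+" E | "a", a standard ambiguous example that is an assumption here and does not appear on the slides; `count_parse_trees` is a hypothetical helper specialised to that grammar.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count the parse trees of `tokens` under E → E "+" E | "a"."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Trees for tokens[i:j] rooted in E.
        total = 1 if tokens[i:j] == ("a",) else 0          # E → "a"
        for k in range(i + 1, j - 1):                      # E → E "+" E, "+" at k
            if tokens[k] == "+":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees("a + a".split()))
print(count_parse_trees("a + a + a".split()))
print(count_parse_trees("a + a + a + a".split()))
```

A sentence with more than one tree also has more than one leftmost derivation, since each tree corresponds to exactly one leftmost derivation. By contrast, the leftmost and rightmost derivations of 4 3 0 3 in the Num grammar are different derivations of the same single tree, so they do not indicate ambiguity.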
69. Formal Grammars 47
parse trees
inner (parent) nodes: nonterminal symbols
leaf nodes: terminal symbols
abstract syntax trees (ASTs)
abstract over terminal symbols
convey information at parent nodes
abstract over injective production rules
syntax trees
parse trees & abstract syntax trees
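The difference between the two kinds of tree can be sketched on the decimal-number grammar. The tuple encoding of trees and the particular AST shape (one node carrying the list of digit values) are illustrative assumptions; other abstractions are equally possible.

```python
DIGITS = "0123456789"

def parse_num(text):
    """Parse tree for Num → Digit Num | Digit, as (nonterminal, child, ...)."""
    if not text or any(c not in DIGITS for c in text):
        raise ValueError(f"not a decimal number: {text!r}")
    digit = ("Digit", text[0])                    # Digit → "d", leaf is a terminal
    if len(text) == 1:
        return ("Num", digit)                     # Num → Digit
    return ("Num", digit, parse_num(text[1:]))    # Num → Digit Num

def to_ast(tree):
    """Abstract over terminals and chain rules: keep the digit values at one
    parent node instead of nesting Num and Digit nodes."""
    digits = []
    while True:
        digits.append(int(tree[1][1]))            # value of the Digit child
        if len(tree) == 2:                        # Num → Digit: last digit
            return ("Num", digits)
        tree = tree[2]                            # descend into the nested Num

tree = parse_num("4303")
print(tree)
print(to_ast(tree))
```

The parse tree mirrors every production that was applied, while the AST keeps only the information a later processing stage needs.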
73. Formal Grammars 50
attribution
slide | title | author | license
1 | The Pine, Saint Tropez | Paul Signac | public domain
2, 3, 35, 41 | PICOL icons | Melih Bilgil | CC BY 3.0
9 | Writing | Caitlin Regan | CC BY 2.0
10 | Latin Grammar | Anthony Nelzin |
13, 15, 29-34, 38 | Noam Chomsky | Fellowsisters | CC BY-NC-SA 2.0
Editor's notes
expressions: binary expressions, method calls
object-orientation: classes, single inheritance, methods, no overloading
property of language
say what no-one said before
holds for software languages as well
scientific challenge
linguistic theory: Noam Chomsky
restrictions on production rules => grammar classes
computer science: lexical syntax
can write this as a regular grammar
generative device
other derivations possible
X* for zero or more X
empty word
but no construct for separated repetition
All actual, detailed work on assignments must be individual work.
You are encouraged to discuss assignments, programming languages used to solve the assignments, their libraries, and general solution techniques in these languages with each other.
But if you do so, then you must acknowledge the people with whom you discussed at the top of your submission.
You should not look for assignment solutions elsewhere; but if material is taken from elsewhere, then you must acknowledge it.

You are not permitted to provide or receive any kind of solutions of assignments.
This includes partial, incomplete, or erroneous solutions.
You are also not permitted to provide or receive programming help from people other than the teaching assistants or the instructor of the course.

any kind of solution: an algorithm, code, intermediate solutions
plagiarism: copying from others, from the web, mimicking another's solution

Any violation of these rules will be reported as a suspected case of fraud to the Board of Examiners and handled according to the EEMCS Faculty's fraud procedure.
If the case is proven, a penalty will be imposed: a minimum of exclusion from the course for the duration of one academic year up to a maximum of a one-year exclusion from all courses at TU Delft.
For details on the procedure, see Section 2.1.26 in the faculty's Study Guide for MSc Programmes.
round-up on every lecture
what to take with you
check yourself, pre- and post-paration