3. The parser finds the derivation of a particular input
Grammar:
E -> E + E
   | E * E
   | INT
Input: 10 + 2 * 5
Derivation:
E => E + E
  => E + E * E
  => 10 + E * E
  => 10 + 2 * E
  => 10 + 2 * 5
4. Derivation tree
Derivation:
E => E + E
  => E + E * E
  => 10 + E * E
  => 10 + 2 * E
  => 10 + 2 * 5
Tree:
        E
      / | \
     E  +  E
     |   / | \
    10  E  *  E
        |     |
        2     5
6. A Substitution Rule
Original grammar:
A → a
A → aaA
A → abBc
B → abbA
B → b
Substitute B to obtain the equivalent grammar:
A → a
A → aaA
A → ababbAc
A → abbc
7. In general:
A → xBz
B → y1 | y2 | … | yn
Substitute B to obtain the equivalent grammar:
A → xy1z | xy2z | … | xynz
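The substitution rule can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `substitute` and the representation of a grammar as a list of (head, body) pairs with one-character symbols are assumptions made for this example. The sketch keeps the productions of B in the result; once B has been substituted away in the grammar of slide 6, B no longer occurs in any production of A.

```python
def substitute(productions, b):
    """Apply the substitution rule for variable b.

    productions: list of (head, body) pairs, body a string of
    one-character symbols (hypothetical representation).
    Every production A -> xBz with A != b is replaced by A -> xyz
    for each production B -> y. Only the first occurrence of b in a
    body is substituted.
    """
    b_bodies = [body for head, body in productions if head == b]
    result = []
    for head, body in productions:
        if head != b and b in body:
            i = body.index(b)
            x, z = body[:i], body[i + 1:]
            result.extend((head, x + y + z) for y in b_bodies)
        else:
            result.append((head, body))
    return result


# Grammar from slide 6.
slide6 = [("A", "a"), ("A", "aaA"), ("A", "abBc"), ("B", "abbA"), ("B", "b")]
for head, body in substitute(slide6, "B"):
    print(head, "->", body)
```

The production A → abBc is replaced by A → ababbAc and A → abbc, as on slide 6.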
8. Useless Productions
S → aSb
S → λ
S → A
A → aA     (useless production)
Some derivations never terminate...
S ⇒ A ⇒ aA ⇒ aaA ⇒ … ⇒ aa…aA ⇒ …
9. Another grammar:
S→A
A → aA
A→λ
B → bA     (useless production: B is not reachable from S)
10. In general:
If S ⇒ … ⇒ xAy ⇒ … ⇒ w with w ∈ L(G),
then variable A is useful.
Otherwise, variable A is useless.
A is useful if it occurs in some sentential form and a string of terminals can be derived from it.
11. A production A → x is useful
if all its variables are useful
12. Removing Useless Productions
Example Grammar:
S → aS | A | C
A→a
B → aa
C → aCb
13. Recognizing useless symbols
A variable may be useless because there is no way of getting a terminal string from it (the variable cannot derive a string of terminals).
Another reason for a variable B to be useless may be that there are no x and y such that S ⇒* xBy (the variable does not occur in any sentential form).
14. First: find all variables that can produce strings with only terminals
S → aS | A | C
A → a
B → aa
C → aCb
Round 1: {A, B}
Round 2: {A, B, S}
15. Keep only the variables {A, B, S} that produce terminal strings
Before:
S → aS | A | C
A → a
B → aa
C → aCb
After:
S → aS | A
A → a
B → aa
16. New grammar G’ = (V’, T’, P’, S)
G:
S → aS | A | C
A → a
B → aa
C → aCb
G’:
V’ = {S, A, B}
T’ = {a}
P’: S → aS | A
    A → a
    B → aa
17. Algorithm
Given: a context-free grammar G = (V, T, P, S).
Goal: G’ = (V’, T’, P’, S) such that V’ contains only those variables A for which A ⇒* w with w ∈ T* is possible.
Step 1: Set V’ to the empty set.
Step 2: Repeat the following step until no more variables are added to V’:
For every A ∈ V for which P has a production of the form
A → x1 x2 … xn with all xi in V’ ∪ T,
add A to V’.
18. Step 3: Take P’ to be all the productions in P whose symbols are all in V’ ∪ T.
If a terminal does not occur in any production of P’, remove it from T to get T’.
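Steps 1 to 3 can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function names and the encoding of productions as (head, body) pairs with one-character symbols are hypothetical, and λ would be encoded as the empty string.

```python
def generating_variables(productions, terminals):
    """Steps 1 and 2: variables A with A =>* w for some terminal string w."""
    v_prime = set()                      # Step 1: V' starts empty
    changed = True
    while changed:                       # Step 2: repeat until nothing is added
        changed = False
        for head, body in productions:
            if head not in v_prime and all(
                s in v_prime or s in terminals for s in body
            ):
                v_prime.add(head)
                changed = True
    return v_prime


def remove_non_generating(productions, terminals):
    """Step 3: keep productions whose symbols are all in V' or T."""
    v_prime = generating_variables(productions, terminals)
    p_prime = [
        (head, body)
        for head, body in productions
        if head in v_prime
        and all(s in v_prime or s in terminals for s in body)
    ]
    # Drop terminals that no longer occur in any production of P'.
    t_prime = {s for _, body in p_prime for s in body if s in terminals}
    return v_prime, t_prime, p_prime


# Grammar S -> aS | A | C, A -> a, B -> aa, C -> aCb
grammar = [("S", "aS"), ("S", "A"), ("S", "C"),
           ("A", "a"), ("B", "aa"), ("C", "aCb")]
v, t, p = remove_non_generating(grammar, {"a", "b"})
print(sorted(v), sorted(t), p)
```

For this grammar the sketch yields V’ = {A, B, S} and T’ = {a}, and the production C → aCb is dropped together with S → C.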
19. Example
Grammar:
S → AB
A → b
B → a
B → D
E → a
Computation of V’:
V’ = Φ
V’ = {A, B, E}
V’ = {A, B, E, S}
P’:
S → AB; A → b; B → a; E → a
20. Recognize variables that cannot be reached from the start symbol
Dependency graph:
Its vertices are labeled with the variables.
There is an edge from C to D if and only if there is a production of the form
C → xDy
If a variable is not reachable from the start symbol, removing it and the affected productions and terminals will not change the language generated by the grammar.
21. Second: Find all variables reachable from S
S → aS | A
A → a
B → aa
Dependency graph:
S → A
B (not reachable)
22. Keep only the variables reachable from S
Before:
S → aS | A
A → a
B → aa
Final Grammar:
S → aS | A
A → a
23. Example
V’ = Φ
V’ = {A, B, E}
V’ = {A, B, E, S}
P’:
S → AB; A → b; B → a; E → a
Dependency graph:
S → A
S → B
E (no edge from S, A or B)
Only A and B are reachable from S, so E can be removed.
24. So the new equivalent grammar
Original grammar:
• S → AB
• A → b
• B → a
• B → D
• E → a
New grammar:
V’ = {A, B, S}, T’ = {a, b}
P’: S → AB; A → b; B → a
This grammar now has only useful symbols.
25. Process of removing useless symbols
1. Find an equivalent grammar by removing the symbols that cannot produce a string of terminals, i.e. keep only the symbols A such that A ⇒* w for some w ∈ T*.
2. From the grammar so obtained, get a new grammar having only those symbols A each of which is present in some sentential form:
S ⇒* x A y
28. Final Grammar
Original grammar:
S → aMb
M → aMb
M → λ
Substitute M → λ to obtain the final grammar:
S → aMb
S → ab
M → aMb
M → ab
Both grammars generate the same language
{a^n b^n : n >= 1}
29. Theorem: Let G be any CFG with λ not in L(G). Then there exists an equivalent grammar G’ having no λ-productions.
First find the set Vn of all nullable variables of G:
1. For all productions A → λ, put A in Vn.
2. Repeat the following until no further variables are added to Vn:
For all productions
B → A1A2…An
where all Ai are in Vn, put B into Vn, as B is also nullable.
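The computation of Vn is a fixed-point iteration and can be sketched in Python as below. This is a sketch under assumptions: the function name is hypothetical, productions are (head, body) pairs with one-character symbols, and λ is encoded as the empty string.

```python
def nullable_variables(productions):
    """Return the set Vn of variables that can derive the empty string."""
    # Step 1: heads of lambda-productions (empty body) are nullable.
    v_n = {head for head, body in productions if body == ""}
    # Step 2: add B whenever some B -> A1 A2 ... An has all Ai in Vn.
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in v_n and body and all(s in v_n for s in body):
                v_n.add(head)
                changed = True
    return v_n


# Grammar S -> AB, A -> aAA | lambda, B -> bBB | lambda
grammar = [("S", "AB"), ("A", "aAA"), ("A", ""), ("B", "bBB"), ("B", "")]
print(sorted(nullable_variables(grammar)))
```

For this grammar A and B enter Vn in step 1 and S enters in step 2, because both symbols of S → AB are nullable.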
30. Once the set Vn has been found, we are ready to construct the production set P’.
Look at all productions of P of the form
A → x1 x2 … xm, m >= 1,
where each xi is in V ∪ T.
For each such production of P we put into P’ that production as well as all those generated by replacing nullable variables with λ in all possible combinations.
Exception: if all xi are nullable, the resulting production A → λ is not added to P’.
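The construction of P’ can be sketched in Python by enumerating, for each production, every way of keeping or dropping its nullable variables. This is an illustrative sketch: the function name `expand_nullable` is hypothetical, the set of nullable variables is passed in as an argument, and symbols are assumed to be single characters.

```python
from itertools import product


def expand_nullable(productions, nullable):
    """Build P' from P given the set of nullable variables."""
    p_prime = set()
    for head, body in productions:
        if body == "":
            continue                      # lambda-productions are not copied
        # Each nullable symbol may be kept or replaced by the empty string.
        options = [(s, "") if s in nullable else (s,) for s in body]
        for choice in product(*options):
            new_body = "".join(choice)
            if new_body:                  # never add A -> lambda
                p_prime.add((head, new_body))
    return p_prime


# Grammar S -> ABaC, A -> BC, B -> b | lambda, C -> D | lambda, D -> d
grammar = [("S", "ABaC"), ("A", "BC"), ("B", "b"), ("B", ""),
           ("C", "D"), ("C", ""), ("D", "d")]
for head, body in sorted(expand_nullable(grammar, {"A", "B", "C"})):
    print(head, "->", body)
```

With three nullable symbols in S → ABaC, the sketch produces 2^3 = 8 productions for S, while A → BC yields only BC, B and C because the all-dropped case is skipped.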
31. Example
S → AB
A → aAA | λ
B → bBB | λ
Here A and B are nullable (found in the first step).
S is also nullable (found in the second step).
So Vn = {S, A, B}, and the new set of rules is
S → AB | A | B
A → aAA | aA | a
B → bBB | bB | b
Note: the grammar obtained is not equivalent to the given one: λ is in the language of the given grammar but not in the language generated by the new grammar. The reason is that the condition of the theorem (λ not in L(G)) is not satisfied.
32. Example
S → ABaC
A → BC
B → b| λ
C→D|λ
D→ d
Nullable variables are A,B,C
33. S → ABaC is replaced by
S → ABaC | BaC | AaC | ABa | aC | Aa | Ba | a
A → BC is replaced by
A → BC | B | C
B → b | λ is replaced by
B → b
C → D | λ is replaced by
C → D
D → d is copied as such:
D → d
34. So the new equivalent grammar without any λ-productions is
S → ABaC | BaC | AaC | ABa | aC | Aa | Ba | a
A → BC | B | C
B → b
C → D
D → d
42. First find, for each A, all variables B such that A ⇒* B.
This can be done by drawing a dependency graph with an edge (C, D) whenever the grammar has a unit production
C → D
Then A ⇒* B whenever there is a walk from A to B in the graph.
43. The new grammar G’ is obtained as follows:
• Include in P’ all non-unit productions of P.
• For all A and B with A ⇒* B, add to P’
A → y1 | y2 | … | yn
where B → y1 | y2 | … | yn is the set of all rules in P’ with B on the left.
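The two steps for removing unit productions can be sketched in Python as below. This is a minimal sketch, not a definitive implementation: the function name and the (head, body) encoding with one-character symbols are assumptions, and the grammar is assumed to have no λ-productions.

```python
def remove_unit_productions(productions, variables):
    """Return an equivalent production set without unit productions."""
    # Dependency graph of unit productions: edge C -> D for each C -> D.
    unit = {v: set() for v in variables}
    for head, body in productions:
        if body in variables:
            unit[head].add(body)

    # Step 1: copy all non-unit productions.
    non_unit = [(h, b) for h, b in productions if b not in variables]
    result = set(non_unit)

    # Step 2: for every walk A =>* B, give A the non-unit rules of B.
    for a in variables:
        seen = set()
        stack = list(unit[a])
        while stack:
            b = stack.pop()
            if b not in seen:
                seen.add(b)
                stack.extend(unit[b])
        for b in seen:
            result.update((a, y) for h, y in non_unit if h == b)
    return result


# Grammar S -> Aa | B, B -> A | bb, A -> a | bc | B
grammar = [("S", "Aa"), ("S", "B"), ("B", "A"), ("B", "bb"),
           ("A", "a"), ("A", "bc"), ("A", "B")]
for head, body in sorted(remove_unit_productions(grammar, {"S", "A", "B"})):
    print(head, "->", body)
```

Because the rules copied in step 2 come only from the non-unit productions, no new unit production can be introduced.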
44. Example
S → Aa | B
B → A | bb
A → a | bc | B
Dependency graph (unit productions): S → B, B → A, A → B
S ⇒* A; S ⇒* B; B ⇒* A; and A ⇒* B
In the first step we add the non-unit productions:
• S → Aa
• A → a | bc
• B → bb
45. In the second step we add
S → a | bc | bb
A → bb
B → a | bc
So the new grammar is
S → a | bc | bb | Aa
A → a | bb | bc
B → a | bb | bc
51. Introduce variables for terminals: Ta, Tb, Tc
Before:
S → ABa
A → aab
B → Ac
After:
S → ABTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
52. Introduce intermediate variable: V1
Before:
S → ABTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
After:
S → AV1
V1 → BTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
53. Introduce intermediate variable: V2
Before:
S → AV1
V1 → BTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
After:
S → AV1
V1 → BTa
A → TaV2
V2 → TaTb
B → ATc
Ta → a
Tb → b
Tc → c
54. Final grammar in Chomsky Normal Form:
Initial grammar:
S → ABa
A → aab
B → Ac
Final grammar:
S → AV1
V1 → BTa
A → TaV2
V2 → TaTb
B → ATc
Ta → a
Tb → b
Tc → c
55. In general:
From any context-free grammar that is not in Chomsky Normal Form we can obtain an equivalent grammar in Chomsky Normal Form, provided the empty string is not a member of the language.
57. For every terminal symbol a:
Introduce a new variable Ta
Add the production Ta → a
In productions whose right-hand side has two or more symbols, replace a with Ta
58. Replace any production A → C1C2…Cn (n > 2)
with
A → C1V1
V1 → C2V2
…
Vn−2 → Cn−1Cn
New intermediate variables: V1, V2, …, Vn−2
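The two conversion steps (variables for terminals, then intermediate variables for long right-hand sides) can be sketched in Python as below. This is a sketch under assumptions: the function name `to_cnf` is hypothetical, bodies are tuples of symbol names, the input is assumed to contain no λ-productions or unit productions, and the generated names "T" + terminal and "V" + number are assumed not to clash with existing variables.

```python
def to_cnf(productions, terminals):
    """Convert productions (head, tuple_of_symbols) to Chomsky Normal Form."""
    # Step 1: in bodies of length >= 2, replace each terminal a by Ta.
    used = set()
    step1 = []
    for head, body in productions:
        if len(body) >= 2:
            used.update(s for s in body if s in terminals)
            body = tuple("T" + s if s in terminals else s for s in body)
        step1.append((head, body))

    # Step 2: break bodies longer than 2 with intermediate variables.
    result = []
    counter = 0
    for head, body in step1:
        while len(body) > 2:
            counter += 1
            new_var = "V" + str(counter)
            result.append((head, (body[0], new_var)))
            head, body = new_var, body[1:]
        result.append((head, body))

    # Productions Ta -> a for every terminal that was replaced.
    result.extend(("T" + t, (t,)) for t in sorted(used))
    return result


# Grammar S -> ABa, A -> aab, B -> Ac
grammar = [("S", ("A", "B", "a")), ("A", ("a", "a", "b")), ("B", ("A", "c"))]
for head, body in to_cnf(grammar, {"a", "b", "c"}):
    print(head, "->", " ".join(body))
```

For this grammar the sketch introduces V1 for S → ABTa and V2 for A → TaTaTb, giving the same eight productions as the worked example on slide 54.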
59. Theorem:
For any context-free grammar there is an equivalent grammar in Chomsky Normal Form, provided the empty string is not a member of the language.
If the empty string is in the language L:
First find an equivalent Chomsky Normal Form grammar for L − {λ}. Then add a new start variable S’ and one new production
S’ → S | λ
60. Observations
• Chomsky normal forms are good
for parsing and proving theorems
• It is very easy to find the Chomsky normal
form of any context-free grammar
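70. Input string: aabbb
Substrings by length (Wij is the substring from position i to position j):
Length 1: a  a  b  b  b        (W11 = a)
Length 2: aa  ab  bb  bb
Length 3: aab  abb  bbb
Length 4: aabb  abbb           (W14 = aabb)
Length 5: aabbb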
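71. Grammar:
S → AB
A → BB
A → a
B → AB
B → b
Table (substring: variables that derive it):
Length 1: a: A | a: A | b: B | b: B | b: B
Length 2: aa | ab | bb | bb
Length 3: aab | abb | bbb
Length 4: aabb | abbb
Length 5: aabbb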
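72. Grammar:
S → AB
A → BB
A → a
B → AB
B → b
Table (substring: variables that derive it):
Length 1: a: A | a: A | b: B | b: B | b: B
Length 2: aa: ∅ | ab: S,B | bb: A | bb: A
Length 3: aab | abb | bbb
Length 4: aabb | abbb
Length 5: aabbb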
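73. Grammar:
S → AB
A → BB
A → a
B → AB
B → b
Table (substring: variables that derive it):
Length 1: a: A | a: A | b: B | b: B | b: B
Length 2: aa: ∅ | ab: S,B | bb: A | bb: A
Length 3: aab: S,B | abb: A | bbb: S,B
Length 4: aabb: A | abbb: S,B
Length 5: aabbb: S,B        (this set is V15)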
74. Since S is a member of V15
The given string is a member of L(G)
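The table-filling procedure walked through above can be sketched in Python as below. This is a sketch under assumptions: the function name `cyk_table` is hypothetical, the grammar is assumed to be in Chomsky Normal Form with one-character symbols, and the input string is assumed to be non-empty. Cells are indexed 1-based as (i, j) to match the Vij notation.

```python
def cyk_table(word, productions):
    """Fill the table: table[(i, j)] = variables deriving word[i-1:j]."""
    n = len(word)
    table = {}
    # Length 1: variables with a production A -> a for the single symbol.
    for i in range(1, n + 1):
        table[(i, i)] = {h for h, b in productions if b == word[i - 1]}
    # Longer substrings: try every split point k between i and j.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):
                for head, body in productions:
                    if (len(body) == 2
                            and body[0] in table[(i, k)]
                            and body[1] in table[(k + 1, j)]):
                        cell.add(head)
            table[(i, j)] = cell
    return table


# Grammar S -> AB, A -> BB | a, B -> AB | b and input aabbb
grammar = [("S", "AB"), ("A", "BB"), ("A", "a"), ("B", "AB"), ("B", "b")]
word = "aabbb"
table = cyk_table(word, grammar)
print(sorted(table[(1, len(word))]))
print("member" if "S" in table[(1, len(word))] else "not a member")
```

The string is accepted exactly when the start symbol appears in the cell for the whole string, here table[(1, 5)].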
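75. Grammar:
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
String: baaba
V11 = V44 = {B}
V22 = V33 = V55 = {A, C}
V12 = {S, A}
V23 = {B}
V34 = {S, C}
V45 = {S, A}
Computing V12: W12 = ba, split as b | a
V11 = {B}, V22 = {A, C}
Candidate pairs: BA, BC
A → BA and S → BC, so V12 = {S, A}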
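76. Computing V34: W34 = ab, split as a | b
V33 = {A, C}, V44 = {B}
Candidate pairs: AB, CB
V34 = {S, C}
Computing V13: W13 = baa has two splits
b | aa: V11 = {B}, V23 = {B}
ba | a: V12 = {A, S}, V33 = {A, C}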
77. V13 = empty set
V24 = {B}
V35 = {B}
V14 = empty set
V25 = {S, A, C}
V15 = {S, A, C}, so the string is a member of L(G)
(one split of W15 = baaba is ba | aba)