SlideShare una empresa de Scribd logo
1 de 78
Compiler Components & Generators
Lexical Analysis

Guido Wachsmuth




       Delft
                                    Course IN4303
       University of
       Technology             Compiler Construction
       Challenge the future
Recap: Garbage Collection
lessons learned

How can we collect unreachable records on the heap?

  •   reference counts
  •   mark reachable records, sweep unreachable records
  •   copy reachable records

How can we reduce heap space needed for garbage collection?

  •   pointer-reversal
  •   breadth-first search
  •   hybrid algorithms




                                                          Lexical Analysis   2
Overview
today’s lecture




                  Lexical Analysis   3
Overview
today’s lecture

lexical analysis




                   Lexical Analysis   3
Overview
today’s lecture

lexical analysis

regular languages

   •   regular grammars
   •   regular expressions
   •   finite state automata




                               Lexical Analysis   3
Overview
today’s lecture

lexical analysis

regular languages

   •   regular grammars
   •   regular expressions
   •   finite state automata

equivalence of formalisms

   •   constructive approach




                               Lexical Analysis   3
Overview
today’s lecture

lexical analysis

regular languages

   •   regular grammars
   •   regular expressions
   •   finite state automata

equivalence of formalisms

   •   constructive approach

tool generation


                               Lexical Analysis   3
I
regular grammars




                   Lexical Analysis   4
Recap: A Theory of Language
formal languages




                              Lexical Analysis   5
Recap: A Theory of Language
formal languages

vocabulary Σ
   finite, nonempty set of elements (words, letters)
   alphabet




                                                       Lexical Analysis   5
Recap: A Theory of Language
formal languages

vocabulary Σ
   finite, nonempty set of elements (words, letters)
   alphabet

string over Σ
   finite sequence of elements chosen from Σ
   word, sentence, utterance




                                                       Lexical Analysis   5
Recap: A Theory of Language
formal languages

vocabulary Σ
   finite, nonempty set of elements (words, letters)
   alphabet

string over Σ
   finite sequence of elements chosen from Σ
   word, sentence, utterance

formal language λ
   set of strings over a vocabulary Σ
   λ ⊆ Σ*


                                                       Lexical Analysis   5
Recap: A Theory of Language
formal grammars




                              Lexical Analysis   6
Recap: A Theory of Language
formal grammars

formal grammar G = (N, Σ, P, S)
   nonterminal symbols N
   terminal symbols Σ
   production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
   start symbol S∈N




                                                   Lexical Analysis   6
Recap: A Theory of Language
formal grammars

formal grammar G = (N, Σ, P, S)
   nonterminal symbols N      nonterminal symbol
   terminal symbols Σ
   production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
   start symbol S∈N




                                                   Lexical Analysis   6
Recap: A Theory of Language
formal grammars

formal grammar G = (N, Σ, P, S)
   nonterminal symbols N
   terminal symbols Σ
   production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
   start symbol S∈N
                        context




                                                   Lexical Analysis   6
Recap: A Theory of Language
formal grammars

formal grammar G = (N, Σ, P, S)
   nonterminal symbols N
   terminal symbols Σ
   production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
   start symbol S∈N
                                replacement




                                                   Lexical Analysis   6
Recap: A Theory of Language
formal grammars

formal grammar G = (N, Σ, P, S)
   nonterminal symbols N
   terminal symbols Σ
   production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)*
   start symbol S∈N

grammar classes
   type-0, unrestricted
   type-1, context-sensitive: (a A c, a b c)
   type-2, context-free: P ⊆ N × (N∪Σ)*
   type-3, regular: (A, x) or (A, xB)

                                                   Lexical Analysis   6
Decimal Numbers
right regular grammar

  Num   →   “0”   Num   Num   →   “0”
  Num   →   “1”   Num   Num   →   “1”
  Num   →   “2”   Num   Num   →   “2”
  Num   →   “3”   Num   Num   →   “3”
  Num   →   “4”   Num   Num   →   “4”
  Num   →   “5”   Num   Num   →   “5”
  Num   →   “6”   Num   Num   →   “6”
  Num   →   “7”   Num   Num   →   “7”
  Num   →   “8”   Num   Num   →   “8”
  Num   →   “9”   Num   Num   →   “9”




                                        Lexical Analysis   7
Identifiers
right regular grammar

  Id → “a” R            Id → “a”
     …                     …
  Id → “z” R            Id → “z”

  R → “a” R             R → “a”
     …                     …
  R → “z” R             R → “z”
  R → “0” R             R → “0”
     …                     …
  R → “9” R             R → “9”




                                   Lexical Analysis   8
Recap: A Theory of Language
formal languages




                              Lexical Analysis   9
Recap: A Theory of Language
formal languages

formal grammar G = (N, Σ, P, S)




                                  Lexical Analysis   9
Recap: A Theory of Language
formal languages

formal grammar G = (N, Σ, P, S)

derivation relation     G   ⊆ (N∪Σ)* × (N∪Σ)*

    w   G   w’

            ∃(p, q)∈P: ∃u,v∈(N∪Σ)*:
            w=u p v ∧ w’=u q v




                                                Lexical Analysis   9
Recap: A Theory of Language
formal languages

formal grammar G = (N, Σ, P, S)

derivation relation     G    ⊆ (N∪Σ)* × (N∪Σ)*

    w   G   w’

            ∃(p, q)∈P: ∃u,v∈(N∪Σ)*:
            w=u p v ∧ w’=u q v

formal language L(G) ⊆ Σ*
    L(G) = {w∈Σ* | S     G
                             *
                                 w}




                                                 Lexical Analysis   9
Recap: A Theory of Language
formal languages

formal grammar G = (N, Σ, P, S)

derivation relation     G    ⊆ (N∪Σ)* × (N∪Σ)*

    w   G   w’

            ∃(p, q)∈P: ∃u,v∈(N∪Σ)*:
            w=u p v ∧ w’=u q v

formal language L(G) ⊆ Σ*
    L(G) = {w∈Σ* | S     G
                             *
                                 w}


classes of formal languages

                                                 Lexical Analysis   9
II
regular expressions




                      Lexical Analysis 10
Recap: Regular Expressions
overview

basics

  •      symbol from an alphabet
  •      ε

combinators

  •      alternation: E1 | E2

  •      concatenation: E1 E2
  •      repetition: E*
  •      optional: E? = E | ε
  •      one or more: E+ = E E*



                                   Lexical Analysis 11
Decimal Numbers & Identifiers
regular expressions




         Num: (0|1|2|3|4|5|6|7|8|9)+




          Id: (a|…|z)(a|…|z|0|…|9)*




                                       Lexical Analysis 12
Regular Expressions
formal languages

basics

  •      L(a) = {“a”}
  •      L(ε) = {“”}

combinators

  •      L(E1 | E2) = L(E1) ∪ L(E2)

  •      L(E1 E2) = L(E1) · L(E2)
  •      L(E*) = L(E)*




                                      Lexical Analysis 13
III
finite automata




                  Lexical Analysis 14
Finite Automata
formal definition




                    Lexical Analysis 15
Finite Automata
formal definition

finite automaton M = (Q, Σ, T, q0, F)
   states Q
   input symbols Σ
   transition function T
   start state q0∈Q

   final states F⊆Q




                                        Lexical Analysis 15
Finite Automata
formal definition

finite automaton M = (Q, Σ, T, q0, F)
    states Q
    input symbols Σ
    transition function T
    start state q0∈Q

    final states F⊆Q

transition function
    nondeterministic FA T : Q × Σ → P(Q)
    NFA with ε-moves T : Q × (Σ ∪ {ε}) → P(Q)
    deterministic FA T : Q × Σ → Q

                                                Lexical Analysis 15
Decimal Numbers & Identifiers
finite automata



               0-9
           1         2   0-9




               a-z       a-z
           1         2
                         0-9




                                Lexical Analysis 16
Nondeterministic Finite Automata
formal languages




                             Lexical Analysis 17
Nondeterministic Finite Automata
formal languages

finite automaton M = (Q, Σ, T, q0, F)




                                        Lexical Analysis 17
Nondeterministic Finite Automata
formal languages

finite automaton M = (Q, Σ, T, q0, F)

transition function T : Q × Σ → P(Q)
   T({q1,..., qn}, x) := T(q1, x) ∪ … ∪ T(qn, x)

   T*({q1,..., qn}, ε) := {q1,..., qn}

   T*({q1,..., qn}, xw) := T*(T({q1,..., qn}, x), w)




                                                       Lexical Analysis 17
Nondeterministic Finite Automata
formal languages

finite automaton M = (Q, Σ, T, q0, F)

transition function T : Q × Σ → P(Q)
   T({q1,..., qn}, x) := T(q1, x) ∪ … ∪ T(qn, x)

   T*({q1,..., qn}, ε) := {q1,..., qn}

   T*({q1,..., qn}, xw) := T*(T({q1,..., qn}, x), w)

formal language L(M) ⊆ Σ*

   L(M) = {w∈Σ* | T*({q0}, w) ∩ F ≠ ∅}



                                                       Lexical Analysis 17
Deterministic Finite Automata
formal languages




                                Lexical Analysis 18
Deterministic Finite Automata
formal languages

finite automaton M = (Q, Σ, T, q0, F)




                                        Lexical Analysis 18
Deterministic Finite Automata
formal languages

finite automaton M = (Q, Σ, T, q0, F)

transition function T : Q × Σ → P(Q)
   T*(q, ε) := q

   T*(q, xw) := T*(T(q, x), w)




                                        Lexical Analysis 18
Deterministic Finite Automata
formal languages

finite automaton M = (Q, Σ, T, q0, F)

transition function T : Q × Σ → P(Q)
   T*(q, ε) := q

   T*(q, xw) := T*(T(q, x), w)

formal language L(M) ⊆ Σ*

   L(M) = {w∈Σ* | T*(q0, w) ∈ F}




                                        Lexical Analysis 18
coffee break




               Lexical Analysis 19
IV
equivalence




              Lexical Analysis 20
Regular Languages
formalisms



       left regular   right regular     regular
        grammars        grammars      expressions




       NFAs with
                         NFAs            DFAs
        ε-moves




                                             Lexical Analysis 21
Regular Languages
formalisms



       left regular   right regular     regular
        grammars        grammars      expressions




       NFAs with
                         NFAs            DFAs
        ε-moves




                                             Lexical Analysis 21
Regular Languages
formalisms



       left regular   right regular     regular
        grammars        grammars      expressions




       NFAs with
                         NFAs            DFAs
        ε-moves




                                             Lexical Analysis 21
Regular Languages
formalisms



       left regular   right regular     regular
        grammars        grammars      expressions




       NFAs with
                         NFAs            DFAs
        ε-moves




                                             Lexical Analysis 21
Regular Languages
formalisms



       left regular   right regular     regular
        grammars        grammars      expressions




       NFAs with
                         NFAs            DFAs
        ε-moves




                                             Lexical Analysis 21
Regular Languages
formalisms



       left regular   right regular     regular
        grammars        grammars      expressions




       NFAs with
                         NFAs            DFAs
        ε-moves




                                             Lexical Analysis 22
Regular Languages
formalisms



       left regular   right regular     regular
        grammars        grammars      expressions




       NFAs with
                         NFAs            DFAs
        ε-moves




                                             Lexical Analysis 22
NFA construction
right regular grammar




                        Lexical Analysis 23
NFA construction
right regular grammar

formal grammar G = (N, Σ, P, S)




                                  Lexical Analysis 23
NFA construction
right regular grammar

formal grammar G = (N, Σ, P, S)

finite automaton M = (N ∪ {f}, Σ, T, S, F)




                                             Lexical Analysis 23
NFA construction
right regular grammar

formal grammar G = (N, Σ, P, S)

finite automaton M = (N ∪ {f}, Σ, T, S, F)

transition function T
    (X, aY)∈P : (X, a, Y)∈T
    (X, a)∈P : (X, a, f)∈T




                                             Lexical Analysis 23
NFA construction
right regular grammar

formal grammar G = (N, Σ, P, S)

finite automaton M = (N ∪ {f}, Σ, T, S, F)

transition function T
    (X, aY)∈P : (X, a, Y)∈T
    (X, a)∈P : (X, a, f)∈T

final states F
    (S, ε)∈P : F = {S, f}
    else: F = {f}



                                             Lexical Analysis 23
NFA construction
example

  Num   →   “0”   Num   Num   →   “0”
  Num   →   “1”   Num   Num   →   “1”
  Num   →   “2”   Num   Num   →   “2”
  Num   →   “3”   Num   Num   →   “3”
  Num   →   “4”   Num   Num   →   “4”
  Num   →   “5”   Num   Num   →   “5”
  Num   →   “6”   Num   Num   →   “6”
  Num   →   “7”   Num   Num   →   “7”
  Num   →   “8”   Num   Num   →   “8”
  Num   →   “9”   Num   Num   →   “9”




                                        Lexical Analysis 24
NFA construction
example

  Num   →   “0”   Num   Num   →   “0”
  Num   →   “1”   Num   Num   →   “1”
  Num   →   “2”   Num   Num   →   “2”
                                              N
  Num   →   “3”   Num   Num   →   “3”
  Num   →   “4”   Num   Num   →   “4”
  Num   →   “5”   Num   Num   →   “5”
  Num   →   “6”   Num   Num   →   “6”
  Num   →   “7”   Num   Num   →   “7”         f
  Num   →   “8”   Num   Num   →   “8”
  Num   →   “9”   Num   Num   →   “9”




                                        Lexical Analysis 24
NFA construction
example

  Num   →   “0”   Num   Num   →   “0”
  Num   →   “1”   Num   Num   →   “1”
  Num   →   “2”   Num   Num   →   “2”
                                              N
  Num   →   “3”   Num   Num   →   “3”
  Num   →   “4”   Num   Num   →   “4”
  Num   →   “5”   Num   Num   →   “5”
  Num   →   “6”   Num   Num   →   “6”
  Num   →   “7”   Num   Num   →   “7”         f
  Num   →   “8”   Num   Num   →   “8”
  Num   →   “9”   Num   Num   →   “9”




                                        Lexical Analysis 24
NFA construction
example

  Num   →   “0”   Num   Num   →   “0”
  Num   →   “1”   Num   Num   →   “1”
  Num   →   “2”   Num   Num   →   “2”
                                              N          0-9
  Num   →   “3”   Num   Num   →   “3”
  Num   →   “4”   Num   Num   →   “4”
  Num   →   “5”   Num   Num   →   “5”
  Num   →   “6”   Num   Num   →   “6”
  Num   →   “7”   Num   Num   →   “7”         f
  Num   →   “8”   Num   Num   →   “8”
  Num   →   “9”   Num   Num   →   “9”




                                        Lexical Analysis 24
NFA construction
example

  Num   →   “0”   Num   Num   →   “0”
  Num   →   “1”   Num   Num   →   “1”
  Num   →   “2”   Num   Num   →   “2”
                                              N          0-9
  Num   →   “3”   Num   Num   →   “3”
  Num   →   “4”   Num   Num   →   “4”
  Num   →   “5”   Num   Num   →   “5”          0-9
  Num   →   “6”   Num   Num   →   “6”
  Num   →   “7”   Num   Num   →   “7”         f
  Num   →   “8”   Num   Num   →   “8”
  Num   →   “9”   Num   Num   →   “9”




                                        Lexical Analysis 24
NFA construction
regular expressions

              a
 x



                      x
(x)            ε          ε




                      x
                  ε
x*


                  ε

                              Lexical Analysis 25
NFA construction
 regular expressions

              ε        x       ε


x|y

                       y
              ε                ε




                  x                y
        ε                  ε                         ε


xy

                                       Lexical Analysis 26
NFA construction
ε elimination

additional final states

   •   states with ε-moves into final states
   •   become final states themselves

additional transitions

   •   ε-move from source to target state
   •   transitions from target state
   •   add these transitions to the source state




                                                   Lexical Analysis 27
Powerset construction
eliminating nondeterminism

nondeterministic finite automaton M = (Q, Σ, T, q0, F)

deterministic finite automaton M’ = (P(Q), Σ, T’, {q0}, F’)

transition function T’

   •   T’({q1,..., qn}, x) = T({q1,..., qn}, x) = T(q1, x) ∪ … ∪ T(qn, x)

final states F’ = {S⎮S⊆Q, S∩F ≠ ∅}

   •   all states that include a final state of the original NFA




                                                                   Lexical Analysis 28
Powerset construction
example



          f                      f
 3             4         23                        24


                                 a-e,g-z             a-z
     i                       i     0-9               0-9


         a-z       a-z                                       a-z
 1             2         1                         2
                   0-9           a-h                         0-9
                                 j-z




                                       Lexical Analysis 29
V
summary




          Lexical Analysis 30
Summary
lessons learned




                  Lexical Analysis 31
Summary
lessons learned

What are the formalisms to describe regular languages?
  •   regular grammars
  •   regular expressions
  •   finite state automata




                                                   Lexical Analysis 31
Summary
lessons learned

What are the formalisms to describe regular languages?
  •   regular grammars
  •   regular expressions
  •   finite state automata
Why are these formalisms equivalent?
  •   constructive proofs




                                                   Lexical Analysis 31
Summary
lessons learned

What are the formalisms to describe regular languages?
  •   regular grammars
  •   regular expressions
  •   finite state automata
Why are these formalisms equivalent?
  •   constructive proofs
How can we generate compiler tools from that?
  •   implement DFAs
  •   generate transition tables




                                                   Lexical Analysis 31
Literature
learn more




             Lexical Analysis 32
Literature
learn more

formal languages
   Noam Chomsky: Three models for the description of language. 1956
   J. E. Hopcroft, R. Motwani, J. D. Ullman: Introduction to Automata
   Theory, Languages, and Computation. 2006




                                                           Lexical Analysis 32
Literature
learn more

formal languages
    Noam Chomsky: Three models for the description of language. 1956
    J. E. Hopcroft, R. Motwani, J. D. Ullman: Introduction to Automata
    Theory, Languages, and Computation. 2006

lexical analysis
    Andrew W. Appel, Jens Palsberg: Modern Compiler Implementation in
    Java, 2nd edition. 2002




                                                            Lexical Analysis 32
Outlook
coming next

compiler components and their generators

  •   Lecture 12: Syntactical Analysis
  •   Lecture 13: SDF inside
  •   Lecture 14: Static Analysis

Lab Nov 26

  •   test cases
  •   finish Lab 2, do reference resolution next
  •   finish Lab 3, do hover help next
  •   content completion



                                                   Lexical Analysis 33
questions




            Lexical Analysis 34
credits




          Lexical Analysis 35
Pictures
copyrights

Slide 1:
   Book Scanner by Ben Woosley, some rights reserved

Slides 5, 6, 9:
   Noam Chomsky by Fellowsisters, some rights reserved

Slide 7, 8, 12, 16:
   Tiger by Bernard Landgraf, some rights reserved

Slide 19:
   Coffee Primo by Dominica Williamson, some rights reserved

Slide 33:
   Pine Creek by Nicholas, some rights reserved

Slide 34:
   Questions by Oberazzi, some rights reserved

Slide 35:
   Too Much Credit by Andres Rueda, some rights reserved




                                                               Lexical Analysis 36

Más contenido relacionado

La actualidad más candente

Declare Your Language (at DLS)
Declare Your Language (at DLS)Declare Your Language (at DLS)
Declare Your Language (at DLS)
Eelco Visser
 

La actualidad más candente (20)

Term Rewriting
Term RewritingTerm Rewriting
Term Rewriting
 
Dynamic Semantics
Dynamic SemanticsDynamic Semantics
Dynamic Semantics
 
LL Parsing
LL ParsingLL Parsing
LL Parsing
 
Syntax Definition
Syntax DefinitionSyntax Definition
Syntax Definition
 
Declarative Semantics Definition - Term Rewriting
Declarative Semantics Definition - Term RewritingDeclarative Semantics Definition - Term Rewriting
Declarative Semantics Definition - Term Rewriting
 
Type analysis
Type analysisType analysis
Type analysis
 
Pure and Declarative Syntax Definition: Paradise Lost and Regained
Pure and Declarative Syntax Definition: Paradise Lost and RegainedPure and Declarative Syntax Definition: Paradise Lost and Regained
Pure and Declarative Syntax Definition: Paradise Lost and Regained
 
Syntax Definition
Syntax DefinitionSyntax Definition
Syntax Definition
 
Declare Your Language: Syntactic (Editor) Services
Declare Your Language: Syntactic (Editor) ServicesDeclare Your Language: Syntactic (Editor) Services
Declare Your Language: Syntactic (Editor) Services
 
Declare Your Language: Syntax Definition
Declare Your Language: Syntax DefinitionDeclare Your Language: Syntax Definition
Declare Your Language: Syntax Definition
 
Dynamic Semantics Specification and Interpreter Generation
Dynamic Semantics Specification and Interpreter GenerationDynamic Semantics Specification and Interpreter Generation
Dynamic Semantics Specification and Interpreter Generation
 
Ch04
Ch04Ch04
Ch04
 
Declare Your Language (at DLS)
Declare Your Language (at DLS)Declare Your Language (at DLS)
Declare Your Language (at DLS)
 
Declare Your Language: Type Checking
Declare Your Language: Type CheckingDeclare Your Language: Type Checking
Declare Your Language: Type Checking
 
Parsing in Compiler Design
Parsing in Compiler DesignParsing in Compiler Design
Parsing in Compiler Design
 
Declare Your Language: Name Resolution
Declare Your Language: Name ResolutionDeclare Your Language: Name Resolution
Declare Your Language: Name Resolution
 
Declare Your Language: Transformation by Strategic Term Rewriting
Declare Your Language: Transformation by Strategic Term RewritingDeclare Your Language: Transformation by Strategic Term Rewriting
Declare Your Language: Transformation by Strategic Term Rewriting
 
Ch4a
Ch4aCh4a
Ch4a
 
Ch03
Ch03Ch03
Ch03
 
Regular Languages
Regular LanguagesRegular Languages
Regular Languages
 

Destacado

Survey around Semantics for Programming Languages, and Machine Proof using Coq
Survey around Semantics for Programming Languages, and Machine Proof using CoqSurvey around Semantics for Programming Languages, and Machine Proof using Coq
Survey around Semantics for Programming Languages, and Machine Proof using Coq
bellbind
 

Destacado (15)

Operating Systems & Applications
Operating Systems & ApplicationsOperating Systems & Applications
Operating Systems & Applications
 
Introduction to Compiler Development
Introduction to Compiler DevelopmentIntroduction to Compiler Development
Introduction to Compiler Development
 
A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...
 
A basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and whyA basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and why
 
Research data management
Research data managementResearch data management
Research data management
 
Research Data Management: Part 1, Principles & Responsibilities
Research Data Management: Part 1, Principles & ResponsibilitiesResearch Data Management: Part 1, Principles & Responsibilities
Research Data Management: Part 1, Principles & Responsibilities
 
A basic course on Reseach data management, part 2: protecting and organizing ...
A basic course on Reseach data management, part 2: protecting and organizing ...A basic course on Reseach data management, part 2: protecting and organizing ...
A basic course on Reseach data management, part 2: protecting and organizing ...
 
Software languages
Software languagesSoftware languages
Software languages
 
Language
LanguageLanguage
Language
 
A basic course on Research data management, part 3: sharing your data
A basic course on Research data management, part 3: sharing your dataA basic course on Research data management, part 3: sharing your data
A basic course on Research data management, part 3: sharing your data
 
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
 
A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4
 
Ch6 architectural design
Ch6 architectural designCh6 architectural design
Ch6 architectural design
 
Survey around Semantics for Programming Languages, and Machine Proof using Coq
Survey around Semantics for Programming Languages, and Machine Proof using CoqSurvey around Semantics for Programming Languages, and Machine Proof using Coq
Survey around Semantics for Programming Languages, and Machine Proof using Coq
 
Csci360 08-subprograms
Csci360 08-subprogramsCsci360 08-subprograms
Csci360 08-subprograms
 

Similar a Compiler Components and their Generators - Lexical Analysis

140106 isaim-okayama
140106 isaim-okayama140106 isaim-okayama
140106 isaim-okayama
gumitaro2012
 
Tensor-based Models of Natural Language Semantics
Tensor-based Models of Natural Language SemanticsTensor-based Models of Natural Language Semantics
Tensor-based Models of Natural Language Semantics
Dimitrios Kartsaklis
 
Csr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminskiCsr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminski
CSR2011
 

Similar a Compiler Components and their Generators - Lexical Analysis (20)

Compiler components and their generators traditional parsing algorithms
Compiler components and their generators  traditional parsing algorithmsCompiler components and their generators  traditional parsing algorithms
Compiler components and their generators traditional parsing algorithms
 
Алексей Чеусов - Расчёсываем своё ЧСВ
Алексей Чеусов - Расчёсываем своё ЧСВАлексей Чеусов - Расчёсываем своё ЧСВ
Алексей Чеусов - Расчёсываем своё ЧСВ
 
LPS talk notes
LPS talk notesLPS talk notes
LPS talk notes
 
Introduction of predicate logics
Introduction of predicate  logicsIntroduction of predicate  logics
Introduction of predicate logics
 
L03 ai - knowledge representation using logic
L03 ai - knowledge representation using logicL03 ai - knowledge representation using logic
L03 ai - knowledge representation using logic
 
Pre-Cal 40S June 3, 2009
Pre-Cal 40S June 3, 2009Pre-Cal 40S June 3, 2009
Pre-Cal 40S June 3, 2009
 
Ch02
Ch02Ch02
Ch02
 
Open course(programming languages) 20150211
Open course(programming languages) 20150211Open course(programming languages) 20150211
Open course(programming languages) 20150211
 
Open course(programming languages) 20150211
Open course(programming languages) 20150211Open course(programming languages) 20150211
Open course(programming languages) 20150211
 
140106 isaim-okayama
140106 isaim-okayama140106 isaim-okayama
140106 isaim-okayama
 
Logicrevo15
Logicrevo15Logicrevo15
Logicrevo15
 
A new technique for proving non regularity based on the measure of a language
A new technique for proving non regularity based on the measure of a languageA new technique for proving non regularity based on the measure of a language
A new technique for proving non regularity based on the measure of a language
 
Applied Math 40S June 2 AM, 2008
Applied Math 40S June 2 AM, 2008Applied Math 40S June 2 AM, 2008
Applied Math 40S June 2 AM, 2008
 
ToC_M1L3_Grammar and Derivation.pdf
ToC_M1L3_Grammar and Derivation.pdfToC_M1L3_Grammar and Derivation.pdf
ToC_M1L3_Grammar and Derivation.pdf
 
Tensor-based Models of Natural Language Semantics
Tensor-based Models of Natural Language SemanticsTensor-based Models of Natural Language Semantics
Tensor-based Models of Natural Language Semantics
 
"That scripting language called Prolog"
"That scripting language called Prolog""That scripting language called Prolog"
"That scripting language called Prolog"
 
Context free langauges
Context free langaugesContext free langauges
Context free langauges
 
L_2_apl.pptx
L_2_apl.pptxL_2_apl.pptx
L_2_apl.pptx
 
Scope Graphs: A fresh look at name binding in programming languages
Scope Graphs: A fresh look at name binding in programming languagesScope Graphs: A fresh look at name binding in programming languages
Scope Graphs: A fresh look at name binding in programming languages
 
Csr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminskiCsr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminski
 

Más de Guido Wachsmuth

Domain-Specific Type Systems
Domain-Specific Type SystemsDomain-Specific Type Systems
Domain-Specific Type Systems
Guido Wachsmuth
 

Más de Guido Wachsmuth (9)

Domain-Specific Type Systems
Domain-Specific Type SystemsDomain-Specific Type Systems
Domain-Specific Type Systems
 
Declarative Syntax Definition - Pretty Printing
Declarative Syntax Definition - Pretty PrintingDeclarative Syntax Definition - Pretty Printing
Declarative Syntax Definition - Pretty Printing
 
Compiler Components and their Generators - LR Parsing
Compiler Components and their Generators - LR ParsingCompiler Components and their Generators - LR Parsing
Compiler Components and their Generators - LR Parsing
 
Compiling Imperative and Object-Oriented Languages - Garbage Collection
Compiling Imperative and Object-Oriented Languages - Garbage CollectionCompiling Imperative and Object-Oriented Languages - Garbage Collection
Compiling Imperative and Object-Oriented Languages - Garbage Collection
 
Compiling Imperative and Object-Oriented Languages - Register Allocation
Compiling Imperative and Object-Oriented Languages - Register AllocationCompiling Imperative and Object-Oriented Languages - Register Allocation
Compiling Imperative and Object-Oriented Languages - Register Allocation
 
Compiling Imperative and Object-Oriented Languages - Dataflow Analysis
Compiling Imperative and Object-Oriented Languages - Dataflow AnalysisCompiling Imperative and Object-Oriented Languages - Dataflow Analysis
Compiling Imperative and Object-Oriented Languages - Dataflow Analysis
 
Compiling Imperative and Object-Oriented Languages - Activation Records
Compiling Imperative and Object-Oriented Languages - Activation RecordsCompiling Imperative and Object-Oriented Languages - Activation Records
Compiling Imperative and Object-Oriented Languages - Activation Records
 
Declarative Semantics Definition - Code Generation
Declarative Semantics Definition - Code Generation Declarative Semantics Definition - Code Generation
Declarative Semantics Definition - Code Generation
 
Software Languages
Software LanguagesSoftware Languages
Software Languages
 

Último

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Último (20)

How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 

Compiler Components and their Generators - Lexical Analysis

  • 1. Compiler Components & Generators Lexical Analysis Guido Wachsmuth Delft Course IN4303 University of Technology Compiler Construction Challenge the future
  • 2. Recap: Garbage Collection lessons learned How can we collect unreachable records on the heap? • reference counts • mark reachable records, sweep unreachable records • copy reachable records How can we reduce heap space needed for garbage collection? • pointer-reversal • breadth-first search • hybrid algorithms Lexical Analysis 2
  • 3. Overview today’s lecture Lexical Analysis 3
  • 5. Overview today’s lecture lexical analysis regular languages • regular grammars • regular expressions • finite state automata Lexical Analysis 3
  • 6. Overview today’s lecture lexical analysis regular languages • regular grammars • regular expressions • finite state automata equivalence of formalisms • constructive approach Lexical Analysis 3
  • 7. Overview today’s lecture lexical analysis regular languages • regular grammars • regular expressions • finite state automata equivalence of formalisms • constructive approach tool generation Lexical Analysis 3
  • 8. I regular grammars Lexical Analysis 4
  • 9. Recap: A Theory of Language formal languages Lexical Analysis 5
  • 10. Recap: A Theory of Language formal languages vocabulary Σ finite, nonempty set of elements (words, letters) alphabet Lexical Analysis 5
  • 11. Recap: A Theory of Language formal languages vocabulary Σ finite, nonempty set of elements (words, letters) alphabet string over Σ finite sequence of elements chosen from Σ word, sentence, utterance Lexical Analysis 5
  • 12. Recap: A Theory of Language formal languages vocabulary Σ finite, nonempty set of elements (words, letters) alphabet string over Σ finite sequence of elements chosen from Σ word, sentence, utterance formal language λ set of strings over a vocabulary Σ λ ⊆ Σ* Lexical Analysis 5
  • 13. Recap: A Theory of Language formal grammars Lexical Analysis 6
  • 14. Recap: A Theory of Language formal grammars formal grammar G = (N, Σ, P, S) nonterminal symbols N terminal symbols Σ production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)* start symbol S∈N Lexical Analysis 6
  • 15. Recap: A Theory of Language formal grammars formal grammar G = (N, Σ, P, S) nonterminal symbols N nonterminal symbol terminal symbols Σ production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)* start symbol S∈N Lexical Analysis 6
  • 16. Recap: A Theory of Language formal grammars formal grammar G = (N, Σ, P, S) nonterminal symbols N terminal symbols Σ production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)* start symbol S∈N context Lexical Analysis 6
  • 17. Recap: A Theory of Language formal grammars formal grammar G = (N, Σ, P, S) nonterminal symbols N terminal symbols Σ production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)* start symbol S∈N replacement Lexical Analysis 6
  • 18. Recap: A Theory of Language formal grammars formal grammar G = (N, Σ, P, S) nonterminal symbols N terminal symbols Σ production rules P ⊆ (N∪Σ)* N (N∪Σ)* × (N∪Σ)* start symbol S∈N grammar classes type-0, unrestricted type-1, context-sensitive: (a A c, a b c) type-2, context-free: P ⊆ N × (N∪Σ)* type-3, regular: (A, x) or (A, xB) Lexical Analysis 6
  • 19. Decimal Numbers right regular grammar Num → “0” Num Num → “0” Num → “1” Num Num → “1” Num → “2” Num Num → “2” Num → “3” Num Num → “3” Num → “4” Num Num → “4” Num → “5” Num Num → “5” Num → “6” Num Num → “6” Num → “7” Num Num → “7” Num → “8” Num Num → “8” Num → “9” Num Num → “9” Lexical Analysis 7
  • 20. Identifiers right regular grammar Id → “a” R Id → “a” … … Id → “z” R Id → “z” R → “a” R R → “a” … … R → “z” R R → “z” R → “0” R R → “0” … … R → “9” R R → “9” Lexical Analysis 8
  • 21. Recap: A Theory of Language formal languages Lexical Analysis 9
  • 22. Recap: A Theory of Language formal languages formal grammar G = (N, Σ, P, S) Lexical Analysis 9
  • 23. Recap: A Theory of Language formal languages formal grammar G = (N, Σ, P, S) derivation relation G ⊆ (N∪Σ)* × (N∪Σ)* w G w’ ∃(p, q)∈P: ∃u,v∈(N∪Σ)*: w=u p v ∧ w’=u q v Lexical Analysis 9
  • 24. Recap: A Theory of Language formal languages formal grammar G = (N, Σ, P, S) derivation relation G ⊆ (N∪Σ)* × (N∪Σ)* w G w’ ∃(p, q)∈P: ∃u,v∈(N∪Σ)*: w=u p v ∧ w’=u q v formal language L(G) ⊆ Σ* L(G) = {w∈Σ* | S G * w} Lexical Analysis 9
  • 25. Recap: A Theory of Language formal languages formal grammar G = (N, Σ, P, S) derivation relation G ⊆ (N∪Σ)* × (N∪Σ)* w G w’ ∃(p, q)∈P: ∃u,v∈(N∪Σ)*: w=u p v ∧ w’=u q v formal language L(G) ⊆ Σ* L(G) = {w∈Σ* | S G * w} classes of formal languages Lexical Analysis 9
  • 26. II regular expressions Lexical Analysis 10
  • 27. Recap: Regular Expressions overview basics • symbol from an alphabet • ε combinators • alternation: E1 | E2 • concatenation: E1 E2 • repetition: E* • optional: E? = E | ε • one or more: E+ = E E* Lexical Analysis 11
  • 28. Decimal Numbers & Identifiers regular expressions Num: (0|1|2|3|4|5|6|7|8|9)+ Id: (a|…|z)(a|…|z|0|…|9)* Lexical Analysis 12
  • 29. Regular Expressions formal languages basics • L(a) = {“a”} • L(ε) = {“”} combinators • L(E1 | E2) = L(E1) ∪ L(E2) • L(E1 E2) = L(E1) · L(E2) • L(E*) = L(E)* Lexical Analysis 13
  • 30. III finite automata Lexical Analysis 14
  • 31. Finite Automata formal definition Lexical Analysis 15
  • 32. Finite Automata formal definition finite automaton M = (Q, Σ, T, q0, F) states Q input symbols Σ transition function T start state q0∈Q final states F⊆Q Lexical Analysis 15
  • 33. Finite Automata formal definition finite automaton M = (Q, Σ, T, q0, F) states Q input symbols Σ transition function T start state q0∈Q final states F⊆Q transition function nondeterministic FA T : Q × Σ → P(Q) NFA with ε-moves T : Q × (Σ ∪ {ε}) → P(Q) deterministic FA T : Q × Σ → Q Lexical Analysis 15
  • 34. Decimal Numbers & Identifiers finite automata 0-9 1 2 0-9 a-z a-z 1 2 0-9 Lexical Analysis 16
  • 35. Nondeterministic Finite Automata formal languages Lexical Analysis 17
  • 36. Nondeterministic Finite Automata formal languages finite automaton M = (Q, Σ, T, q0, F) Lexical Analysis 17
  • 37. Nondeterministic Finite Automata formal languages finite automaton M = (Q, Σ, T, q0, F) transition function T : Q × Σ → P(Q) T({q1,..., qn}, x) := T(q1, x) ∪ … ∪ T(qn, x) T*({q1,..., qn}, ε) := {q1,..., qn} T*({q1,..., qn}, xw) := T*(T({q1,..., qn}, x), w) Lexical Analysis 17
  • 38. Nondeterministic Finite Automata formal languages finite automaton M = (Q, Σ, T, q0, F) transition function T : Q × Σ → P(Q) T({q1,..., qn}, x) := T(q1, x) ∪ … ∪ T(qn, x) T*({q1,..., qn}, ε) := {q1,..., qn} T*({q1,..., qn}, xw) := T*(T({q1,..., qn}, x), w) formal language L(M) ⊆ Σ* L(M) = {w∈Σ* | T*({q0}, w) ∩ F ≠ ∅} Lexical Analysis 17
  • 39. Deterministic Finite Automata formal languages Lexical Analysis 18
  • 40. Deterministic Finite Automata formal languages finite automaton M = (Q, Σ, T, q0, F) Lexical Analysis 18
  • 41. Deterministic Finite Automata formal languages finite automaton M = (Q, Σ, T, q0, F) transition function T : Q × Σ → P(Q) T*(q, ε) := q T*(q, xw) := T*(T(q, x), w) Lexical Analysis 18
  • 42. Deterministic Finite Automata formal languages finite automaton M = (Q, Σ, T, q0, F) transition function T : Q × Σ → P(Q) T*(q, ε) := q T*(q, xw) := T*(T(q, x), w) formal language L(M) ⊆ Σ* L(M) = {w∈Σ* | T*(q0, w) ∈ F} Lexical Analysis 18
  • 43. coffee break Lexical Analysis 19
  • 44. IV equivalence Lexical Analysis 20
  • 45. Regular Languages formalisms left regular right regular regular grammars grammars expressions NFAs with NFAs DFAs ε-moves Lexical Analysis 21
  • 46. Regular Languages formalisms left regular right regular regular grammars grammars expressions NFAs with NFAs DFAs ε-moves Lexical Analysis 21
  • 47. Regular Languages formalisms left regular right regular regular grammars grammars expressions NFAs with NFAs DFAs ε-moves Lexical Analysis 21
  • 48. Regular Languages formalisms left regular right regular regular grammars grammars expressions NFAs with NFAs DFAs ε-moves Lexical Analysis 21
  • 49. Regular Languages formalisms left regular right regular regular grammars grammars expressions NFAs with NFAs DFAs ε-moves Lexical Analysis 21
  • 50. Regular Languages formalisms left regular right regular regular grammars grammars expressions NFAs with NFAs DFAs ε-moves Lexical Analysis 22
  • 51. Regular Languages formalisms left regular right regular regular grammars grammars expressions NFAs with NFAs DFAs ε-moves Lexical Analysis 22
  • 52. NFA construction right regular grammar Lexical Analysis 23
  • 53. NFA construction right regular grammar formal grammar G = (N, Σ, P, S) Lexical Analysis 23
  • 54. NFA construction right regular grammar formal grammar G = (N, Σ, P, S) finite automaton M = (N ∪ {f}, Σ, T, S, F) Lexical Analysis 23
  • 55. NFA construction right regular grammar formal grammar G = (N, Σ, P, S) finite automaton M = (N ∪ {f}, Σ, T, S, F) transition function T (X, aY)∈P : (X, a, Y)∈T (X, a)∈P : (X, a, f)∈T Lexical Analysis 23
  • 56. NFA construction right regular grammar formal grammar G = (N, Σ, P, S) finite automaton M = (N ∪ {f}, Σ, T, S, F) transition function T (X, aY)∈P : (X, a, Y)∈T (X, a)∈P : (X, a, f)∈T final states F (S, ε)∈P : F = {S, f} else: F = {f} Lexical Analysis 23
  • 57. NFA construction example Num → “0” Num Num → “0” Num → “1” Num Num → “1” Num → “2” Num Num → “2” Num → “3” Num Num → “3” Num → “4” Num Num → “4” Num → “5” Num Num → “5” Num → “6” Num Num → “6” Num → “7” Num Num → “7” Num → “8” Num Num → “8” Num → “9” Num Num → “9” Lexical Analysis 24
  • 58. NFA construction example Num → “0” Num Num → “0” Num → “1” Num Num → “1” Num → “2” Num Num → “2” N Num → “3” Num Num → “3” Num → “4” Num Num → “4” Num → “5” Num Num → “5” Num → “6” Num Num → “6” Num → “7” Num Num → “7” f Num → “8” Num Num → “8” Num → “9” Num Num → “9” Lexical Analysis 24
  • 59. NFA construction example Num → “0” Num Num → “0” Num → “1” Num Num → “1” Num → “2” Num Num → “2” N Num → “3” Num Num → “3” Num → “4” Num Num → “4” Num → “5” Num Num → “5” Num → “6” Num Num → “6” Num → “7” Num Num → “7” f Num → “8” Num Num → “8” Num → “9” Num Num → “9” Lexical Analysis 24
  • 60. NFA construction example Num → “0” Num Num → “0” Num → “1” Num Num → “1” Num → “2” Num Num → “2” N 0-9 Num → “3” Num Num → “3” Num → “4” Num Num → “4” Num → “5” Num Num → “5” Num → “6” Num Num → “6” Num → “7” Num Num → “7” f Num → “8” Num Num → “8” Num → “9” Num Num → “9” Lexical Analysis 24
  • 61. NFA construction example Num → “0” Num Num → “0” Num → “1” Num Num → “1” Num → “2” Num Num → “2” N 0-9 Num → “3” Num Num → “3” Num → “4” Num Num → “4” Num → “5” Num Num → “5” 0-9 Num → “6” Num Num → “6” Num → “7” Num Num → “7” f Num → “8” Num Num → “8” Num → “9” Num Num → “9” Lexical Analysis 24
  • 62. NFA construction regular expressions a x x (x) ε ε x ε x* ε Lexical Analysis 25
  • 63. NFA construction regular expressions ε x ε x|y y ε ε x y ε ε ε xy Lexical Analysis 26
  • 64. NFA construction ε elimination additional final states • states with ε-moves into final states • become final states themselves additional transitions • ε-move from source to target state • transitions from target state • add these transitions to the source state Lexical Analysis 27
  • 65. Powerset construction eliminating nondeterminism nondeterministic finite automaton M = (Q, Σ, T, q0, F) deterministic finite automaton M’ = (P(Q), Σ, T’, {q0}, F’) transition function T’ • T’({q1,..., qn}, x) = T({q1,..., qn}, x) = T(q1, x) ∪ … ∪ T(qn, x) final states F’ = {S⎮S⊆Q, S∩F ≠ ∅} • all states that include a final state of the original NFA Lexical Analysis 28
  • 66. Powerset construction example f f 3 4 23 24 a-e,g-z a-z i i 0-9 0-9 a-z a-z a-z 1 2 1 2 0-9 a-h 0-9 j-z Lexical Analysis 29
  • 67. V summary Lexical Analysis 30
  • 68. Summary lessons learned Lexical Analysis 31
  • 69. Summary lessons learned What are the formalisms to describe regular languages? • regular grammars • regular expressions • finite state automata Lexical Analysis 31
  • 70. Summary lessons learned What are the formalisms to describe regular languages? • regular grammars • regular expressions • finite state automata Why are these formalisms equivalent? • constructive proofs Lexical Analysis 31
  • 71. Summary lessons learned What are the formalisms to describe regular languages? • regular grammars • regular expressions • finite state automata Why are these formalisms equivalent? • constructive proofs How can we generate compiler tools from that? • implement DFAs • generate transition tables Lexical Analysis 31
  • 72. Literature learn more Lexical Analysis 32
  • 73. Literature learn more formal languages Noam Chomsky: Three models for the description of language. 1956 J. E. Hopcroft, R. Motwani, J. D. Ullman: Introduction to Automata Theory, Languages, and Computation. 2006 Lexical Analysis 32
  • 74. Literature learn more formal languages Noam Chomsky: Three models for the description of language. 1956 J. E. Hopcroft, R. Motwani, J. D. Ullman: Introduction to Automata Theory, Languages, and Computation. 2006 lexical analysis Andrew W. Appel, Jens Palsberg: Modern Compiler Implementation in Java, 2nd edition. 2002 Lexical Analysis 32
  • 75. Outlook coming next compiler components and their generators • Lecture 12: Syntactical Analysis • Lecture 13: SDF inside • Lecture 14: Static Analysis Lab Nov 26 • test cases • finish Lab 2, do reference resolution next • finish Lab 3, do hover help next • content completion Lexical Analysis 33
  • 76. questions Lexical Analysis 34
  • 77. credits Lexical Analysis 35
  • 78. Pictures copyrights Slide 1: Book Scanner by Ben Woosley, some rights reserved Slides 5, 6, 9: Noam Chomsky by Fellowsisters, some rights reserved Slide 7, 8, 12, 16: Tiger by Bernard Landgraf, some rights reserved Slide 19: Coffee Primo by Dominica Williamson, some rights reserved Slide 33: Pine Creek by Nicholas, some rights reserved Slide 34: Questions by Oberazzi, some rights reserved Slide 35: Too Much Credit by Andres Rueda, some rights reserved Lexical Analysis 36

Notas del editor

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. \n
  7. \n
  8. linguistic theory\n\nNoam Chomsky\n
  9. linguistic theory\n\nNoam Chomsky\n
  10. linguistic theory\n\nNoam Chomsky\n
  11. restrictions on production rules => grammar classes\n
  12. restrictions on production rules => grammar classes\n
  13. restrictions on production rules => grammar classes\n
  14. restrictions on production rules => grammar classes\n
  15. restrictions on production rules => grammar classes\n
  16. restrictions on production rules => grammar classes\n
  17. restrictions on production rules => grammar classes\n
  18. restrictions on production rules => grammar classes\n
  19. computer science: lexical syntax\n\ncan write this as a regular grammar\n
  20. computer science: lexical syntax\n\ncan write this as a regular grammar\n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n
  27. computer science: lexical syntax\n\ncan write this as a regular grammar\n
  28. \n
  29. \n
  30. restrictions on production rules => grammar classes\n
  31. restrictions on production rules => grammar classes\n
  32. \n
  33. \n
  34. \n
  35. \n
  36. \n
  37. \n
  38. \n
  39. \n
  40. \n
  41. \n
  42. \n
  43. \n
  44. \n
  45. \n
  46. \n
  47. \n
  48. \n
  49. \n
  50. \n
  51. restrictions on production rules => grammar classes\n
  52. restrictions on production rules => grammar classes\n
  53. restrictions on production rules => grammar classes\n
  54. restrictions on production rules => grammar classes\n
  55. computer science: lexical syntax\n\ncan write this as a regular grammar\n
  56. computer science: lexical syntax\n\ncan write this as a regular grammar\n
  57. computer science: lexical syntax\n\ncan write this as a regular grammar\n
  58. computer science: lexical syntax\n\ncan write this as a regular grammar\n
  59. computer science: lexical syntax\n\ncan write this as a regular grammar\n
  60. \n
  61. \n
  62. \n
  63. \n
  64. \n
  65. \n
  66. \n
  67. \n
  68. \n
  69. \n
  70. \n
  71. \n
  72. \n
  73. \n
  74. \n