1
SPECIFICATION OF TOKENS
2
Strings and Languages
• Regular Expressions are an important notation for specifying patterns.
• Alphabet – any finite set of symbols
e.g. ASCII, binary alphabet, Unicode, EBCDIC, Latin-1
• String – A finite sequence of symbols drawn from an alphabet
– Banana (ASCII Alphabet)
– Length of a string => |s|
– Empty String => ε
• Other terms relating to strings: prefix; suffix; substring; proper prefix,
suffix, or substring (non-empty, not entire string); subsequence
• Language – A set of strings over a fixed alphabet
3
Languages
• A language, L, is simply any set of strings over a
fixed alphabet.
Alphabet                                        Languages
{0, 1}                                          {0, 10, 100, 1000, 10000, …}
                                                {0, 1, 00, 11, 000, 111, …}
{a, b, c}                                       {abc, aabbcc, aaabbbccc, …}
{A, …, Z}                                       {FOR, WHILE, GOTO, …}
{A, …, Z, a, …, z, 0, …, 9, +, -, …, <, >, …}   {all legal PASCAL programs}
Special Languages: ∅ - the EMPTY LANGUAGE
{ε} - contains the empty string ε only
4
String operations
• Given String: banana
• Prefix : ban, banana
• Suffix : ana, banana
• Substring : nan, ban, ana, banana
• Subsequence: bnan, nn
• Proper prefix and suffix: non-empty and not the entire string, e.g. proper prefix ban, proper suffix ana
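The terms on this slide can be sketched in a few lines of Python. This is only an illustration: the function names (`prefixes`, `suffixes`, `substrings`, `is_subsequence`, `proper`) are made up for this example and do not come from the slides.

```python
def prefixes(s):
    """All prefixes of s, from the empty string up to s itself."""
    return [s[:i] for i in range(len(s) + 1)]

def suffixes(s):
    """All suffixes of s, from s itself down to the empty string."""
    return [s[i:] for i in range(len(s) + 1)]

def substrings(s):
    """All substrings of s (each one is a prefix of a suffix), without duplicates."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}

def is_subsequence(t, s):
    """True if t can be obtained from s by deleting zero or more symbols."""
    remaining = iter(s)
    # Each membership test consumes the iterator up to the matched symbol,
    # so the symbols of t must appear in s in the same order.
    return all(ch in remaining for ch in t)

def proper(parts, s):
    """Keep only the parts that are non-empty and differ from s."""
    return [p for p in parts if p and p != s]
```

For the string banana, `proper(prefixes("banana"), "banana")` keeps b, ba, ban, bana and banan, and `is_subsequence("bnan", "banana")` holds because the symbols b, n, a, n occur in that order.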
5
String Operations
• Concatenation
– xy is the concatenation of x and y; sε = εs = s, so ε is the identity for concatenation
– $s^0 = \varepsilon$; for $i > 0$, $s^i = s^{i-1}s$
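A minimal sketch of string exponentiation, following the recursive definition directly. The function name `power` is an assumption made for this example; Python's empty string `""` plays the role of ε.

```python
def power(s, i):
    """Return s^i, using s^0 = ε and s^i = s^(i-1) s for i > 0."""
    if i < 0:
        raise ValueError("exponent must be non-negative")
    result = ""                # s^0 is the empty string ε
    for _ in range(i):
        result = result + s    # s^i = s^(i-1) s
    return result
```

For instance, `power("ab", 3)` gives ababab, and `power("ab", 0)` gives the empty string.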
6
Operations on Languages
OPERATION – DEFINITION
• union of L and M, written L ∪ M – L ∪ M = {s | s is in L or s is in M}
• concatenation of L and M, written LM – LM = {st | s is in L and t is in M}
• Kleene closure of L, written L* – $L^* = \bigcup_{i=0}^{\infty} L^i$; L* denotes “zero or more concatenations of” L
• positive closure of L, written L+ – $L^+ = \bigcup_{i=1}^{\infty} L^i$; L+ denotes “one or more concatenations of” L
• Exponentiation – $L^0 = \{\varepsilon\}$, $L^1 = L$, $L^2 = LL$
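For finite languages, these operations map directly onto Python sets. The sketch below is an illustration under one stated assumption: because L* and L+ are infinite in general, the hypothetical `closure` helper only takes the union of the powers up to `max_power`.

```python
def union(L, M):
    """L ∪ M = {s | s is in L or s is in M}."""
    return L | M

def concat(L, M):
    """LM = {st | s is in L and t is in M}."""
    return {s + t for s in L for t in M}

def power(L, i):
    """L^i, with L^0 = {ε} and L^i = L^(i-1) L."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def closure(L, max_power, positive=False):
    """Union of L^i for i from 0 (Kleene) or 1 (positive) up to max_power."""
    start = 1 if positive else 0
    result = set()
    for i in range(start, max_power + 1):
        result |= power(L, i)
    return result
```

With L = {a, b}, `power(L, 2)` is {aa, ab, ba, bb}; `closure(L, 3)` contains the empty string, while `closure(L, 3, positive=True)` does not.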
7
Operations on Languages
• Let L be the set of letters and D the set of digits
• L ∪ D is the set of letters and digits
• LD is the set of strings consisting of a letter followed by a digit
• $L^4$ is the set of all four-letter strings
• L* is the set of all strings of letters, including ε
• D+ is the set of strings of one or more digits.
8
Say What?
L = {A, B, C, D}, D = {1, 2, 3}
• L ∪ D
{A, B, C, D, 1, 2, 3}
• LD
{A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3}
• $L^2$
{AA, AB, AC, AD, BA, BB, BC, BD, CA, …, DD}
• L*
{all possible strings over L, plus ε}
• L+
L* - {ε}
• L (L ∪ D)
Valid: {A1, AA, B3, CD} Invalid: {321, 4A2, AA2}
• L (L ∪ D)*
Valid: {A, A1, A23, D3, DA3, …} Invalid: {31}
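Membership in the last two languages can be checked mechanically. The sketch below assumes Python's `re` module and writes L = {A, B, C, D} and D = {1, 2, 3} as character classes; the names `ONE_MORE`, `ANY_MORE` and `in_language` are invented for this example.

```python
import re

# L = {A, B, C, D} and D = {1, 2, 3}, written as character classes.
ONE_MORE = re.compile(r"[A-D][A-D1-3]")    # L (L ∪ D): exactly two symbols
ANY_MORE = re.compile(r"[A-D][A-D1-3]*")   # L (L ∪ D)*: a letter, then any mix

def in_language(pattern, s):
    """True if the whole string s belongs to the language of pattern."""
    return pattern.fullmatch(s) is not None
```

Under these definitions A1 is in L (L ∪ D) while 321 and 4A2 are not, and A, A1, A23 and D3 are in L (L ∪ D)* while 31 is not, because every string must start with a symbol of L.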
9
Regular Expressions
• A regular expression is a set of rules for constructing sequences of symbols (strings) from an alphabet.
• Let Σ be an alphabet and r a regular expression over Σ. Then L(r) is the language characterized by the rules of r.
10
Regular Expressions
• Defined over an alphabet Σ
• ε represents {ε}, the set containing the empty string
• If a is a symbol in Σ, then a is a regular expression
denoting {a}, the set containing the string a
• If r and s are regular expressions denoting the
languages L(r) and L(s), then:
– (r)|(s) is a regular expression denoting L(r) ∪ L(s)
– (r)(s) is a regular expression denoting L(r)L(s)
– (r)* is a regular expression denoting (L(r))*
– (r) is a regular expression denoting L(r)
• Precedence: * (left associative), then concatenation (left
associative), then | (left associative)
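The inductive definition above can be read as a recursive evaluator. The following is a sketch under stated assumptions, not a production matcher: regular expressions are encoded as nested tuples (an encoding invented for this example), and because L(r) may be infinite, the hypothetical function `lang` returns only the strings of length at most `max_len`.

```python
def lang(r, max_len):
    """Strings of L(r) whose length is at most max_len.

    r is a nested tuple (encoding assumed for this sketch):
      ("eps",)        ε
      ("sym", a)      a symbol a of the alphabet
      ("alt", r, s)   (r)|(s)
      ("cat", r, s)   (r)(s)
      ("star", r)     (r)*
    """
    kind = r[0]
    if kind == "eps":
        return {""}
    if kind == "sym":
        return {r[1]} if len(r[1]) <= max_len else set()
    if kind == "alt":                       # L(r) ∪ L(s)
        return lang(r[1], max_len) | lang(r[2], max_len)
    if kind == "cat":                       # L(r)L(s)
        left, right = lang(r[1], max_len), lang(r[2], max_len)
        return {s + t for s in left for t in right if len(s + t) <= max_len}
    if kind == "star":                      # (L(r))*
        base = lang(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            # Extend the newest strings by one more string of the base language.
            frontier = {s + t for s in frontier for t in base
                        if len(s + t) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind!r}")
```

The precedence rules decide how an expression is turned into a tree: a|a*b is read as a | ((a*) b), that is `("alt", a, ("cat", ("star", a), b))` with `a = ("sym", "a")` and `b = ("sym", "b")`.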
11
Regular Expressions
Alphabet = {a, b}
1. a|b denotes {a, b}
2. (a|b)(a|b) denotes {ab, aa, ba, bb}
3. a* denotes {ε, a, aa, …}
4. (a|b)* denotes all strings of a’s and b’s, including the empty string ε
5. a|a*b denotes the string a, together with all strings of zero or more a’s followed by b
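These examples can be checked by brute force. The sketch below assumes Python's `re` module, whose syntax covers the operators used here (|, concatenation, * and parentheses); the helper name `matches` is made up for this example.

```python
import re
from itertools import product

def matches(pattern, max_len, alphabet="ab"):
    """All strings over alphabet, up to max_len symbols, matched in full by pattern."""
    found = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            s = "".join(symbols)
            if re.fullmatch(pattern, s):
                found.append(s)
    return found
```

For example, `matches("a|a*b", 3)` returns a, b, ab and aab: the single string a, plus zero or more a's followed by b.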
12
Algebraic Properties of Regular Expressions
AXIOM – DESCRIPTION
• r | s = s | r – | is commutative
• r | (s | t) = (r | s) | t – | is associative
• (r s) t = r (s t) – concatenation is associative
• r (s | t) = r s | r t and (s | t) r = s r | t r – concatenation distributes over |
• εr = r and rε = r – ε is the identity element for concatenation
• r* = (r | ε)* – relation between * and ε
• r** = r* – * is idempotent
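As an illustration of how these axioms are used, the derivation below simplifies a sample expression. The expression itself is chosen for this example and does not appear in the slides.

```latex
\begin{align*}
(a \mid \varepsilon)^{*}\, b \mid a^{*} c
  &= a^{*} b \mid a^{*} c && \text{since } r^{*} = (r \mid \varepsilon)^{*} \\
  &= a^{*} (b \mid c)     && \text{since } r(s \mid t) = rs \mid rt
\end{align*}
```

Both steps replace an expression by an equivalent one, so the language denoted is unchanged.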
13
Regular Definitions
• Names may be given to regular expressions; these names can then be used like symbols
• Let Σ be an alphabet of basic symbols. A regular definition is a sequence of definitions of the form
d1 → r1
d2 → r2
. . .
dn → rn
where each di is a distinct name, and each ri is a regular expression over the symbols in Σ ∪ {d1, d2, …, di-1}
14
Regular Definitions
• Example 1:
– letter → A | B | … | Z | a | b | … | z
– digit → 0 | 1 | … | 9
– id → letter (letter | digit)*
• Example 2:
– digit → 0 | 1 | 2 | … | 9
– digits → digit digit*
– optional_fraction → . digits | ε
– optional_exponent → ( E ( + | - | ε ) digits ) | ε
– num → digits optional_fraction optional_exponent
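A regular definition can be expanded into a single pattern by substituting earlier names into later ones. The sketch below does this for Example 1 with Python's `re` module. It is only one possible approach: the `{name}` placeholder syntax and the `expand` helper are assumptions of this example, and the character classes `[A-Za-z]` and `[0-9]` stand in for the long alternations.

```python
import re

# Each definition may refer only to names defined before it.
definitions = [
    ("letter", r"[A-Za-z]"),
    ("digit", r"[0-9]"),
    ("id", r"{letter}({letter}|{digit})*"),
]

def expand(definitions):
    """Substitute earlier names into later definitions, in order."""
    expanded = {}
    for name, body in definitions:
        # Wrap each expansion in a non-capturing group so it behaves as one unit.
        expanded[name] = "(?:" + body.format(**expanded) + ")"
    return expanded

patterns = expand(definitions)
```

With these patterns, `re.fullmatch(patterns["id"], "count2")` succeeds, while the same call on 2count returns `None` because an id must start with a letter.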
15
Regular Definitions
• Shorthand
– One or more instances: r+ denotes rr*
– Zero or one Instance: r? denotes r|ε
– Character classes: [a-z] denotes a|b|…|z
16
Example
• digit → 0 | 1 | 2 | … | 9
• digits → digit+
• optional_fraction → ( . digits )?
• optional_exponent → ( E ( + | - )? digits )?
• num → digits optional_fraction optional_exponent
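The shorthand version of num translates almost symbol for symbol into Python's `re` syntax. This is a sketch under that assumption; the constant names and the `is_num` helper are invented for this example.

```python
import re

DIGITS = r"[0-9]+"                            # digits → digit+
OPTIONAL_FRACTION = rf"(?:\.{DIGITS})?"       # ( . digits )?
OPTIONAL_EXPONENT = rf"(?:E[+-]?{DIGITS})?"   # ( E ( + | - )? digits )?
NUM = re.compile(DIGITS + OPTIONAL_FRACTION + OPTIONAL_EXPONENT)

def is_num(s):
    """True if the whole string s matches the num definition."""
    return NUM.fullmatch(s) is not None
```

Strings such as 42, 3.14 and 6.02E23 are accepted, while .5 and 1. are rejected because the definition requires digits on both sides of the decimal point.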
17
Limitations of Regular Expressions
• Some languages cannot be described by any regular
expression
• Cannot describe balanced or nested constructs
– Example, all valid strings of balanced parentheses
– This can be done with CFG
• Cannot describe repeated strings
– Example: {wcw | w is a string of a’s and b’s}
– This language cannot be described by a CFG either
• Regular expressions can denote only a fixed number of repetitions or an unspecified number of repetitions of a given construct.
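To see why balanced parentheses fall outside regular expressions, it helps to look at what a recognizer needs. The sketch below (the function name `is_balanced` is made up for this example) keeps a nesting counter that can grow without bound, which is exactly the kind of unbounded memory a regular expression does not have.

```python
def is_balanced(s):
    """True if s consists only of '(' and ')' and the parentheses are balanced."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # a ')' with no matching '(' before it
                return False
        else:
            return False       # any other symbol is outside the alphabet
    return depth == 0          # every '(' must have been closed
```

For example, `is_balanced("(())()")` holds, while `is_balanced("(()")` and `is_balanced(")(")` do not.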
