Language processing patterns
Prof. Dr. Ralf Lämmel
Universität Koblenz-Landau
Software Languages Team
(C) 2010-2013 Prof. Dr. Ralf Lämmel, Universität Koblenz-Landau (where applicable)
An EBNF for the 101companies System
company :
'company' STRING '{' department* '}' EOF;
department :
'department' STRING '{'
('manager' employee)
('employee' employee)*
department*
'}';
employee :
STRING '{'
'address' STRING
'salary' FLOAT
'}';
STRING : '"' (~'"')* '"';
FLOAT : ('0'..'9')+ ('.' ('0'..'9')+)?;
Nonterminal
Terminal
Grouping
Repetition
Context-free syntax
Another EBNF for the 101companies System
COMPANY : 'company';
DEPARTMENT : 'department';
EMPLOYEE : 'employee';
MANAGER : 'manager';
ADDRESS : 'address';
SALARY : 'salary';
OPEN : '{';
CLOSE : '}';
STRING : '"' (~'"')* '"';
FLOAT : ('0'..'9')+ ('.' ('0'..'9')+)?;
WS : (' '|'\r'? '\n'|'\t')+;
Lexical (= token level) syntax
What’s a language processor?
A program that performs language processing:
Acceptor
Parser
Analysis
Transformation
...
Language processing patterns
manual
patterns
generative
patterns
1. The Chopper Pattern
2. The Lexer Pattern
3. The Copy/Replace Pattern
4. The Acceptor Pattern
5. The Parser Pattern
6. The Lexer Generation Pattern
7. The Acceptor Generation Pattern
8. The Parser Generation Pattern
9. The Text-to-object Pattern
10. The Object-to-text Pattern
11. The Text-to-tree Pattern
12. The Tree-walk Pattern
13. The Parser Generation2 Pattern
The Chopper Pattern
Approximates
the Lexer Pattern
The Chopper Pattern
Intent:
Analyze text at the lexical level.
Operational steps (run time):
1. Chop input into “pieces”.
2. Classify each piece.
3. Process classified pieces in a stream.
Chopping input into pieces
with java.util.Scanner
scanner = new Scanner(new File(...));
Default delimiter is whitespace.
The scanner is an iterator over the pieces (it implements Iterator<String>).
Tokens =
classifiers of pieces of input
public enum Token {
	 COMPANY,
	 DEPARTMENT,
	 MANAGER,
	 EMPLOYEE,
	 NAME,
	 ADDRESS,
	 SALARY,
	 OPEN,
	 CLOSE,
	 STRING,
	 FLOAT,
}
	 public static Token classify(String s) {
	 	 if (keywords.containsKey(s))
	 	 	 return keywords.get(s);
	 	 else if (s.matches("\"[^\"]*\""))
	 	 	 return STRING;
	 	 else if (s.matches("\\d+(\\.\\d*)?"))
	 	 	 return FLOAT;
	 	 else
	 	 	 throw new RecognitionException(...);
	 }
Classify chopped pieces into
keywords, floats, etc.
Process token stream to
compute salary total
	 public static double total(String s) throws ... {
	 	 double total = 0;
	 	 Recognizer recognizer = new Recognizer(s);
	 	 Token current = null;
	 	 Token previous = null;
	 	 while (recognizer.hasNext()) {
	 	 	 current = recognizer.next();
	 	 	 if (current==FLOAT && previous==SALARY)
	 	 	 	 total += Double.parseDouble(recognizer.getLexeme());
	 	 	 previous = current;
	 	 }
	 	 return total;
	 }
The test for previous being equal to SALARY is not strictly needed here.
When could it be needed?
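The slides use a Recognizer class without showing it. The following is a minimal sketch of how the pieces could fit together under the Chopper Pattern. The class names ChopperTotal and Recognizer, the contents of the keyword table, and the use of IllegalArgumentException in place of RecognitionException are assumptions made for illustration; total() here takes the input text itself rather than a file name.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Scanner;

final class ChopperTotal {
    enum Token { COMPANY, DEPARTMENT, MANAGER, EMPLOYEE, ADDRESS, SALARY, OPEN, CLOSE, STRING, FLOAT }

    // Assumed keyword table; the slides do not show its contents.
    static final Map<String, Token> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("company", Token.COMPANY);
        KEYWORDS.put("department", Token.DEPARTMENT);
        KEYWORDS.put("manager", Token.MANAGER);
        KEYWORDS.put("employee", Token.EMPLOYEE);
        KEYWORDS.put("address", Token.ADDRESS);
        KEYWORDS.put("salary", Token.SALARY);
        KEYWORDS.put("{", Token.OPEN);
        KEYWORDS.put("}", Token.CLOSE);
    }

    // Step 2 of the pattern: classify one chopped piece.
    static Token classify(String s) {
        if (KEYWORDS.containsKey(s)) return KEYWORDS.get(s);
        if (s.matches("\"[^\"]*\"")) return Token.STRING;
        if (s.matches("\\d+(\\.\\d*)?")) return Token.FLOAT;
        throw new IllegalArgumentException("Unrecognized piece: " + s);
    }

    // Hypothetical stand-in for the Recognizer used on the slide:
    // step 1 (chopping) is delegated to java.util.Scanner.
    static final class Recognizer implements Iterator<Token> {
        private final Scanner scanner;
        private String lexeme;
        Recognizer(String input) { scanner = new Scanner(input); }
        @Override public boolean hasNext() { return scanner.hasNext(); }
        @Override public Token next() { lexeme = scanner.next(); return classify(lexeme); }
        String getLexeme() { return lexeme; }
    }

    // Step 3: process the classified pieces as a stream.
    static double total(String input) {
        double total = 0;
        Recognizer recognizer = new Recognizer(input);
        Token previous = null;
        while (recognizer.hasNext()) {
            Token current = recognizer.next();
            if (current == Token.FLOAT && previous == Token.SALARY)
                total += Double.parseDouble(recognizer.getLexeme());
            previous = current;
        }
        return total;
    }
}
```

One reading of the question on the slide: in the grammar shown, FLOAT occurs only after 'salary', so the test on previous is redundant; it would start to matter if the language gained floats in other positions.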
Demo
http://101companies.org/wiki/Contribution:javaScanner
Summary of implementation aspects
Declare an enum type for tokens.
Set up instance of java.util.Scanner.
Iterate over pieces (strings) returned by scanner.
Classify pieces as tokens.
  Use regular expression matching.
Implement operations by iteration over pieces.
  For example:
  Total: aggregates floats
  Cut: copy tokens, modify floats
A problem with the Chopper Pattern
Input:
company "FooBar Inc." { ...
Pieces:
'company', '"FooBar', 'Inc."', '{', ...
There is no general rule
for chopping the input into pieces.
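The problem can be reproduced in a few lines. This is a sketch only; the class name ChopperProblem and its helper methods are hypothetical. Chopping at whitespace splits the string literal in two, and neither half matches the string regexp used for classification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class ChopperProblem {
    // Chop at whitespace, the default delimiter of java.util.Scanner.
    static List<String> chop(String input) {
        List<String> pieces = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext())
                pieces.add(scanner.next());
        }
        return pieces;
    }

    // The same string test as in the classification step.
    static boolean isString(String piece) {
        return piece.matches("\"[^\"]*\"");
    }
}
```

For the input above, chop returns four pieces, and isString rejects both `"FooBar` and `Inc."`, so classification fails even though the input is lexically fine.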
The Lexer Pattern
Fixes
the Chopper Pattern
The Lexer Pattern
Intent:
Analyze text at the lexical level.
Operational steps (run time):
1. Recognize token/lexeme pairs in input.
2. Process token/lexeme pairs in a stream.
Terminology
Token: classification of lexical unit.
Lexeme: the string that makes up the unit.
Lookahead-based decisions
	 	 if (Character.isDigit(lookahead)) {
	 	 // Recognize float
	 	 	 ...
	 	 	 token = FLOAT;
	 	 	 return;
	 	 }
	 	 if (lookahead=='"') {
	 	 	 // Recognize string
	 	 	 ...
	 	 	 token = STRING;
	 	 	 return;	 	 	
	 	 }
...
Inspect lookahead
and decide on what
token to recognize.
Recognize floats
	 	 if (Character.isDigit(lookahead)) {
	 	 	 do {
	 	 	 	 read();
	 	 	 } while (Character.isDigit(lookahead));
	 	 	 if (lookahead=='.') {
	 	 	 	 read();
	 	 	 	 while (Character.isDigit(lookahead))
	 	 	 	 	 read();
	 	 	 }
	 	 	 token = FLOAT;
	 	 	 return;
	 	 }
The code essentially implements this regexp:
"\\d+(\\.\\d*)?"
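The fragments on the last two slides leave out the surrounding lexer. Below is a minimal, self-contained sketch of what it could look like. The class name LexerSketch, the EOF sentinel, the skip() helper and the scan() driver are assumptions, and every run of letters is classified as KEYWORD without checking it against the keyword list, purely to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

final class LexerSketch {
    enum Token { KEYWORD, OPEN, CLOSE, STRING, FLOAT }

    private static final char EOF = '\0';   // assumed end-of-input sentinel
    private final String input;
    private int pos = 0;
    private char lookahead;
    private StringBuilder lexeme = new StringBuilder();
    private Token token;

    LexerSketch(String input) {
        this.input = input;
        lookahead = input.isEmpty() ? EOF : input.charAt(0);
    }

    // Advance by one character without recording it.
    private void skip() {
        pos++;
        lookahead = pos < input.length() ? input.charAt(pos) : EOF;
    }

    // Consume the lookahead as part of the current lexeme.
    private void read() {
        lexeme.append(lookahead);
        skip();
    }

    // Recognize the next token/lexeme pair; returns false at end of input.
    boolean next() {
        while (Character.isWhitespace(lookahead)) skip();
        if (lookahead == EOF) return false;
        lexeme = new StringBuilder();
        if (Character.isDigit(lookahead)) {
            do { read(); } while (Character.isDigit(lookahead));
            if (lookahead == '.') {
                read();
                while (Character.isDigit(lookahead)) read();
            }
            token = Token.FLOAT;
        } else if (lookahead == '"') {
            read();                                   // opening quote
            while (lookahead != '"') {
                if (lookahead == EOF) throw new IllegalStateException("Unterminated string");
                read();
            }
            read();                                   // closing quote
            token = Token.STRING;
        } else if (lookahead == '{') {
            read(); token = Token.OPEN;
        } else if (lookahead == '}') {
            read(); token = Token.CLOSE;
        } else if (Character.isLetter(lookahead)) {
            do { read(); } while (Character.isLetter(lookahead));
            token = Token.KEYWORD;
        } else {
            throw new IllegalStateException("Unexpected character: " + lookahead);
        }
        return true;
    }

    // Driver for illustration: list all pairs as "TOKEN:lexeme".
    static List<String> scan(String input) {
        LexerSketch lexer = new LexerSketch(input);
        List<String> pairs = new ArrayList<>();
        while (lexer.next())
            pairs.add(lexer.token + ":" + lexer.lexeme);
        return pairs;
    }
}
```

Unlike the Chopper Pattern, this keeps `"FooBar Inc."` together as one STRING lexeme, because the lexer decides where a lexeme ends rather than relying on whitespace.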
Demo
http://101companies.org/wiki/Contribution:javaLexer
Summary of implementation aspects
Declare an enum type for tokens.
Read characters one by one.
Use lookahead for decision making.
Consume all characters for lexeme.
Build token/lexeme pairs.
Implement operations by iteration over pairs.
Other approaches use automata theory (DFAs).
A problem with the Lexer Pattern
(for the concrete approach discussed)
How do we get (back) the conciseness of
regular expressions?
	 	 if (Character.isDigit(lookahead)) {
	 	 	 do {
	 	 	 	 read();
	 	 	 } while (Character.isDigit(lookahead));
	 	 	 if (lookahead=='.') {
	 	 	 	 read();
	 	 	 	 while (Character.isDigit(lookahead))
	 	 	 	 	 read();
	 	 	 }
	 	 	 token = FLOAT;
	 	 	 return;
	 	 }
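The deck lists generative patterns (lexer generation) as the way forward. As a lighter-weight illustration of the same idea, and not something taken from the slides, the token classes can be written as one java.util.regex pattern with a named group per class. The class name RegexLexerSketch, the group names and the scan() driver are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexLexerSketch {
    // One named group per token class; WS is recognized but not reported.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<WS>\\s+)"
            + "|(?<FLOAT>\\d+(\\.\\d*)?)"
            + "|(?<STRING>\"[^\"]*\")"
            + "|(?<OPEN>\\{)"
            + "|(?<CLOSE>\\})"
            + "|(?<KEYWORD>[a-z]+)");

    private static final String[] CLASSES = {"FLOAT", "STRING", "OPEN", "CLOSE", "KEYWORD"};

    // List all token/lexeme pairs as "TOKEN:lexeme".
    static List<String> scan(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt())                       // must match exactly at pos
                throw new IllegalArgumentException("Unexpected input at offset " + pos);
            for (String c : CLASSES)
                if (m.group(c) != null)
                    pairs.add(c + ":" + m.group(c));
            pos = m.end();
        }
        return pairs;
    }
}
```

The float rule is again the regexp from the earlier slide; the hand-written loops are replaced by the regex engine.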
The Copy/Replace Pattern
Builds on top of
the Lexer Pattern
The Copy/Replace Pattern
Intent:
Transform text at the lexical level.
Operational steps (run time):
1. Recognize token/lexeme pairs in input.
2. Process token/lexeme pairs in a stream.
   a. Copy some lexemes.
   b. Replace others.
Precise copy for comparison
	 public Copy(String in, String out) throws ... {
	 	 Recognizer recognizer = new Recognizer(in);
	 	 Writer writer = new OutputStreamWriter(
new FileOutputStream(out));
	 	 String lexeme = null;
	 	 Token current = null;
	 	 while (recognizer.hasNext()) {
	 	 	 current = recognizer.next();
	 	 	 lexeme = recognizer.getLexeme();
	 	 	 writer.write(lexeme);
	 	 }
	 	 writer.close();
	 }
Copy/replace
for cutting salaries in half
...
	 	 	 lexeme = recognizer.getLexeme();
	 	 	 // Cut salary in half
	 	 	 if (current == FLOAT && previous == SALARY)
	 	 	 	 lexeme = Double.toString(
	 	 	 	 	 	 (Double.parseDouble(
recognizer.getLexeme()) / 2.0d));
	 	 	 writer.write(lexeme);
...
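For a self-contained view of Copy/Replace, here is a sketch that works on strings instead of files. It swaps the hand-written lexer for a java.util.regex matcher to stay short, and it treats whitespace as a lexeme so that the copy is precise. The class name CutSketch, the group names and the string-based signature are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CutSketch {
    // Whitespace is a lexeme too, so that untouched text is copied exactly.
    private static final Pattern LEXEME = Pattern.compile(
            "(?<WS>\\s+)|(?<FLOAT>\\d+(\\.\\d*)?)|(?<STRING>\"[^\"]*\")|(?<OTHER>[a-z]+|[{}])");

    static String cut(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = LEXEME.matcher(input);
        String previous = null;                 // last non-whitespace lexeme (the "history")
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt())
                throw new IllegalArgumentException("Unexpected input at offset " + pos);
            String lexeme = m.group();
            // Replace: cut salary in half
            if (m.group("FLOAT") != null && "salary".equals(previous))
                lexeme = Double.toString(Double.parseDouble(lexeme) / 2.0d);
            // Copy: everything else passes through unchanged
            out.append(lexeme);
            if (m.group("WS") == null)
                previous = m.group();
            pos = m.end();
        }
        return out.toString();
    }
}
```

Note that Double.toString may change the spelling of a number (for example, 100 becomes 50.0), so the output is a precise copy only outside the replaced lexemes.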
Demo
http://101companies.org/wiki/Contribution:javaLexer
Summary of implementation aspects
Build on top of the Lexer Pattern.
Processor writes to an output stream.
Processor may maintain history such as “previous”.
The Acceptor Pattern
Builds on top of
the Lexer Pattern
EBNF for 101companies System
company :
'company' STRING '{' department* '}' EOF;
department :
'department' STRING '{'
('manager' employee)
('employee' employee)*
department*
'}';
employee :
STRING '{'
'address' STRING
'salary' FLOAT
'}';
Wanted: an
acceptor
The Acceptor Pattern
Intent:
Verify syntactical correctness of input.
Operational steps (run time):
Recognize lexemes/tokens based on the Lexer Pattern.
Match terminals.
Invoke procedures for nonterminals.
Commit to alternatives based on lookahead.
Verify elements of sequences one after another.
Communicate acceptance failure as exception.
We assume a recursive
descent parser as acceptor.
	 void department() {
	 	 match(DEPARTMENT);
	 	 match(STRING);
	 	 match(OPEN);
	 	 match(MANAGER);
	 	 employee();
	 	 while (test(EMPLOYEE)) {
	 	 	 match(EMPLOYEE);
	 	 	 employee();
	 	 }
	 	 while (test(DEPARTMENT))
	 	 	 department();
	 	 match(CLOSE);
	 }
department :
'department' STRING '{'
('manager' employee)
('employee' employee)*
department*
'}';
Grammar
production
Corresponding
procedure
Recursive descent parsing
	 void department() {
	 	 match(DEPARTMENT);
	 	 match(STRING);
	 	 match(OPEN);
	 	 match(MANAGER);
	 	 employee();
	 	 while (test(EMPLOYEE) || test(DEPARTMENT)) {
	 	 	 if (test(EMPLOYEE)) {
	 	 	 	 match(EMPLOYEE);
	 	 	 	 employee();
	 	 	 } else
	 	 	 	 department();
	 	 }
	 	 match(CLOSE);
	 }
department :
'department' STRING '{'
('manager' employee)
( ('employee' employee)
| department
)*
'}';
Use of
alternatives
Recursive descent parsing
A revised
production
Demo
See class Acceptor.java
http://101companies.org/wiki/Contribution:javaParser
Rules for recursive-descent parsers
Each nonterminal becomes a (void) procedure.
RHS symbols become statements.
Terminals become “match” statements.
Nonterminals become procedure calls.
Symbol sequences become statement sequences.
Star/plus repetitions become while loops with lookahead.
Alternatives are selected based on lookahead.
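The rules above can be applied to the whole 101companies grammar. The following sketch does so; it is not the Acceptor.java of the demo. Tokens are represented as plain strings, a small regex-based tokenizer stands in for the Lexer Pattern, and the names AcceptorSketch and accepts() are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AcceptorSketch {
    private static final Pattern TOKEN =
            Pattern.compile("\\s*(?:(\"[^\"]*\")|(\\d+(?:\\.\\d+)?)|([a-z]+|[{}]))");
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    // Keywords and braces stand for themselves; literals become STRING or FLOAT.
    private AcceptorSketch(String input) {
        String text = input.trim();
        Matcher m = TOKEN.matcher(text);
        for (int offset = 0; offset < text.length(); offset = m.end()) {
            m.region(offset, text.length());
            if (!m.lookingAt())
                throw new IllegalArgumentException("Lexical error at offset " + offset);
            if (m.group(1) != null) tokens.add("STRING");
            else if (m.group(2) != null) tokens.add("FLOAT");
            else tokens.add(m.group(3));
        }
        tokens.add("EOF");
    }

    // Lookahead test: is the next token the given one?
    private boolean test(String token) { return tokens.get(pos).equals(token); }

    // Terminals become match statements.
    private void match(String token) {
        if (!test(token))
            throw new IllegalArgumentException("Expected " + token + " but found " + tokens.get(pos));
        pos++;
    }

    // Each nonterminal becomes a void procedure.
    private void company() {
        match("company"); match("STRING"); match("{");
        while (test("department")) department();
        match("}"); match("EOF");
    }

    private void department() {
        match("department"); match("STRING"); match("{");
        match("manager"); employee();
        while (test("employee")) { match("employee"); employee(); }
        while (test("department")) department();
        match("}");
    }

    private void employee() {
        match("STRING"); match("{");
        match("address"); match("STRING");
        match("salary"); match("FLOAT");
        match("}");
    }

    // Acceptance failure is signalled internally by an exception.
    static boolean accepts(String input) {
        try { new AcceptorSketch(input).company(); return true; }
        catch (IllegalArgumentException e) { return false; }
    }
}
```

A department without a manager is rejected, because match("manager") fails on the closing brace.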
The Parser Pattern
Builds on top of
the Acceptor Pattern
The Parser Pattern
Intent:
Make syntactical structure accessible.
Operational steps (run time):
Accept input based on the Acceptor Pattern.
Invoke semantic actions along acceptance.

	 void department() {
	 	 match(DEPARTMENT);
	 	 match(STRING);
	 	 match(OPEN);
	 	 match(MANAGER);
	 	 employee();
	 	 while (test(EMPLOYEE)) {
	 	 	 match(EMPLOYEE);
	 	 	 employee();
	 	 }
	 	 while (test(DEPARTMENT))
	 	 	 department();
	 	 match(CLOSE);
	 }
For comparison: no
semantic actions

	 void department() {
	 	 match(DEPARTMENT);
	 	 String name = match(STRING);
	 	 match(OPEN);
	 	 openDept(name);
	 	 match(MANAGER);
	 	 employee(true);
	 	 while (test(EMPLOYEE)) {
	 	 	 match(EMPLOYEE);
	 	 	 employee(false);
	 	 }
	 	 while (test(DEPARTMENT))
	 	 	 department();
	 	 match(CLOSE);
	 	 closeDept(name);
	 }
Handle events for
open and close
department
All handlers for companies
protected void openCompany(String name) { }
protected void closeCompany(String name) { }
protected void openDept(String name) { }
protected void closeDept(String name) { }
protected void handleEmployee(
boolean isManager,
String name,
String address,
Double salary) { }
A parser that totals
public class Total extends Parser {
	 private double total = 0;
	 public double getTotal() {
	 	 return total;
	 }
	 protected void handleEmployee(
	 	 	 boolean isManager,
	 	 	 String name,
	 	 	 String address,
	 	 	 Double salary) {
	 	 total += salary;
	 }
}
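To see the handlers in action end to end, the following sketch combines a recursive-descent parser with the semantic actions from the previous slides. It is not the Parser.java of the demo. ParserSketch, TotalSketch and parse() are hypothetical names, the tokenizer is a regex-based stand-in for the Lexer Pattern, and names are passed to handlers as raw lexemes (quotes included).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParserSketch {
    private static final Pattern TOKEN =
            Pattern.compile("\\s*(?:(\"[^\"]*\")|(\\d+(?:\\.\\d+)?)|([a-z]+|[{}]))");
    private final List<String[]> pairs = new ArrayList<>();   // {token, lexeme}
    private int pos = 0;

    void parse(String input) {
        pairs.clear();
        pos = 0;
        String text = input.trim();
        Matcher m = TOKEN.matcher(text);
        for (int offset = 0; offset < text.length(); offset = m.end()) {
            m.region(offset, text.length());
            if (!m.lookingAt())
                throw new IllegalArgumentException("Lexical error at offset " + offset);
            if (m.group(1) != null) pairs.add(new String[] {"STRING", m.group(1)});
            else if (m.group(2) != null) pairs.add(new String[] {"FLOAT", m.group(2)});
            else pairs.add(new String[] {m.group(3), m.group(3)});
        }
        pairs.add(new String[] {"EOF", ""});
        company();
    }

    private boolean test(String token) { return pairs.get(pos)[0].equals(token); }

    // Match a terminal and return its lexeme for use in semantic actions.
    private String match(String token) {
        if (!test(token)) throw new IllegalArgumentException("Expected " + token);
        return pairs.get(pos++)[1];
    }

    private void company() {
        match("company");
        String name = match("STRING");
        match("{");
        openCompany(name);
        while (test("department")) department();
        match("}");
        closeCompany(name);
        match("EOF");
    }

    private void department() {
        match("department");
        String name = match("STRING");
        match("{");
        openDept(name);
        match("manager");
        employee(true);
        while (test("employee")) { match("employee"); employee(false); }
        while (test("department")) department();
        match("}");
        closeDept(name);
    }

    private void employee(boolean isManager) {
        String name = match("STRING");
        match("{");
        match("address");
        String address = match("STRING");
        match("salary");
        Double salary = Double.valueOf(match("FLOAT"));
        match("}");
        handleEmployee(isManager, name, address, salary);
    }

    // Handlers do nothing by default; subclasses override what they need.
    protected void openCompany(String name) { }
    protected void closeCompany(String name) { }
    protected void openDept(String name) { }
    protected void closeDept(String name) { }
    protected void handleEmployee(boolean isManager, String name, String address, Double salary) { }
}

final class TotalSketch extends ParserSketch {
    private double total = 0;
    double getTotal() { return total; }
    @Override
    protected void handleEmployee(boolean isManager, String name, String address, Double salary) {
        total += salary;
    }
}
```

The parser itself knows nothing about totals; the subclass only overrides the one handler it cares about, which is the separation the Parser Pattern aims for.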
Demo
See class Parser.java
http://101companies.org/wiki/Contribution:javaParser
Summary
Language processing is a programming domain.
Grammars may define two levels of syntax:
token/lexeme level (lexical level)
tree-like structure level (context-free level)
Both levels are implementable in parsers:
Recursive descent for parsing
Use generative tools (as discussed soon)

 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

Language processing patterns

  • 1. Language processing patterns. Prof. Dr. Ralf Lämmel, Universität Koblenz-Landau, Software Languages Team
  • 2. (C) 2010-2013 Prof. Dr. Ralf Lämmel, Universität Koblenz-Landau (where applicable)
       An EBNF for the 101companies System (context-free syntax)
         company :
           'company' STRING '{' department* '}' EOF;
         department :
           'department' STRING '{'
             ('manager' employee)
             ('employee' employee)*
             department*
           '}';
         employee :
           STRING '{'
             'address' STRING
             'salary' FLOAT
           '}';
         STRING : '"' (~'"')* '"';
         FLOAT : ('0'..'9')+ ('.' ('0'..'9')+)?;
       Notation illustrated: nonterminal, terminal, grouping, repetition.
  • 3. Another EBNF for the 101companies System (lexical, i.e., token-level, syntax)
         COMPANY : 'company';
         DEPARTMENT : 'department';
         EMPLOYEE : 'employee';
         MANAGER : 'manager';
         ADDRESS : 'address';
         SALARY : 'salary';
         OPEN : '{';
         CLOSE : '}';
         STRING : '"' (~'"')* '"';
         FLOAT : ('0'..'9')+ ('.' ('0'..'9')+)?;
         WS : (' '|'\r'? '\n'|'\t')+;
  • 4. What’s a language processor?
       A program that performs language processing: acceptor, parser, analysis, transformation, ...
  • 5. Language processing patterns (manual patterns and generative patterns)
        1. The Chopper Pattern
        2. The Lexer Pattern
        3. The Copy/Replace Pattern
        4. The Acceptor Pattern
        5. The Parser Pattern
        6. The Lexer Generation Pattern
        7. The Acceptor Generation Pattern
        8. The Parser Generation Pattern
        9. The Text-to-object Pattern
       10. The Object-to-text Pattern
       11. The Text-to-tree Pattern
       12. The Tree-walk Pattern
       13. The Parser Generation2 Pattern
  • 7. The Chopper Pattern
       Intent: Analyze text at the lexical level.
       Operational steps (run time):
         1. Chop input into “pieces”.
         2. Classify each piece.
         3. Process classified pieces in a stream.
  • 8. Chopping input into pieces with java.util.Scanner
         scanner = new Scanner(new File(...));
       The default delimiter is whitespace.
       The scanner can be iterated over (it implements Iterator<String>).
  • 9. Tokens = classifiers of pieces of input
         public enum Token {
           COMPANY,
           DEPARTMENT,
           MANAGER,
           EMPLOYEE,
           NAME,
           ADDRESS,
           SALARY,
           OPEN,
           CLOSE,
           STRING,
           FLOAT,
         }
  • 10. Classify chopped pieces into keywords, floats, etc.
          public static Token classify(String s) {
            if (keywords.containsKey(s))
              return keywords.get(s);
            else if (s.matches("\"[^\"]*\""))
              return STRING;
            else if (s.matches("\\d+(\\.\\d*)?"))
              return FLOAT;
            else
              throw new RecognitionException(...);
          }
  • 11. Process token stream to compute salary total
          public static double total(String s) throws ... {
            double total = 0;
            Recognizer recognizer = new Recognizer(s);
            Token current = null;
            Token previous = null;
            while (recognizer.hasNext()) {
              current = recognizer.next();
              if (current==FLOAT && previous==SALARY)
                total += Double.parseDouble(recognizer.getLexeme());
              previous = current;
            }
            return total;
          }
        The test that previous equals SALARY is not mandatory here. When could it be needed?
  • 12. Demo: http://101companies.org/wiki/Contribution:javaScanner
  • 13. Summary of implementation aspects
          Declare an enum type for tokens.
          Set up an instance of java.util.Scanner.
          Iterate over pieces (strings) returned by the scanner.
          Classify pieces as tokens, using regular expression matching.
          Implement operations by iteration over pieces. For example:
            Total: aggregates floats
            Cut: copies tokens, modifies floats
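The three Chopper steps summarized above can be sketched in one self-contained class. This is a minimal illustration, not the code of the javaScanner contribution: the class name `ChopperSketch`, the keyword map, and the use of `IllegalArgumentException` in place of the slides' `RecognitionException` are assumptions, and the input is read from a string rather than a file.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical sketch of the Chopper Pattern; all names are illustrative.
class ChopperSketch {

    enum Token { COMPANY, DEPARTMENT, MANAGER, EMPLOYEE, ADDRESS, SALARY, OPEN, CLOSE, STRING, FLOAT }

    private static final Map<String, Token> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("company", Token.COMPANY);
        KEYWORDS.put("department", Token.DEPARTMENT);
        KEYWORDS.put("manager", Token.MANAGER);
        KEYWORDS.put("employee", Token.EMPLOYEE);
        KEYWORDS.put("address", Token.ADDRESS);
        KEYWORDS.put("salary", Token.SALARY);
        KEYWORDS.put("{", Token.OPEN);
        KEYWORDS.put("}", Token.CLOSE);
    }

    // Step 2: classify one chopped piece.
    static Token classify(String piece) {
        Token keyword = KEYWORDS.get(piece);
        if (keyword != null) return keyword;
        if (piece.matches("\"[^\"]*\"")) return Token.STRING;
        if (piece.matches("\\d+(\\.\\d*)?")) return Token.FLOAT;
        throw new IllegalArgumentException("Unrecognized piece: " + piece);
    }

    // Steps 1 and 3: chop on whitespace, then process the classified pieces in a stream.
    static double total(String input) {
        double total = 0;
        Token previous = null;
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext()) {
                String piece = scanner.next();
                Token current = classify(piece);
                if (current == Token.FLOAT && previous == Token.SALARY)
                    total += Double.parseDouble(piece);
                previous = current;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Names and addresses deliberately contain no spaces, so chopping works here.
        String input = "company \"Acme\" { department \"Dev\" { manager \"Ann\" { "
                + "address \"Koblenz\" salary 100.5 } employee \"Bob\" { "
                + "address \"Landau\" salary 50 } } }";
        System.out.println(total(input));
    }
}
```

The sample input only works because none of its quoted strings contains whitespace.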
  • 14. A problem with the Chopper Pattern
        Input: company "FooBar Inc." { ...
        Pieces: 'company', '"FooBar', 'Inc."', '{', ...
        There is no general rule for chopping the input into pieces.
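The problem can be reproduced with a few lines of Java. The following is a small sketch (the class name `ChopperProblemSketch` and the helper `chop` are made up for illustration) that chops the slide's input with the default whitespace delimiter of `java.util.Scanner`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demonstration: whitespace chopping splits a quoted string in two.
class ChopperProblemSketch {

    // Chop the input into pieces using the scanner's default delimiter (whitespace).
    static List<String> chop(String input) {
        List<String> pieces = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext()) pieces.add(scanner.next());
        }
        return pieces;
    }

    public static void main(String[] args) {
        // The company name contains a space, so it ends up in two pieces.
        for (String piece : chop("company \"FooBar Inc.\" {"))
            System.out.println(piece);
    }
}
```

Neither `"FooBar` nor `Inc."` matches the STRING pattern, so classification of these pieces would fail.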
  • 15. The Lexer Pattern: fixes the Chopper Pattern
  • 16. The Lexer Pattern
        Intent: Analyze text at the lexical level.
        Operational steps (run time):
          1. Recognize token/lexeme pairs in input.
          2. Process token/lexeme pairs in a stream.
  • 17. Terminology
        Token: classification of a lexical unit.
        Lexeme: the string that makes up the unit.
  • 18. Lookahead-based decisions
        Inspect lookahead and decide on what token to recognize.
          if (Character.isDigit(lookahead)) {
            // Recognize float
            ...
            token = FLOAT;
            return;
          }
          if (lookahead=='"') {
            // Recognize string
            ...
            token = STRING;
            return;
          }
          ...
  • 19. Recognize floats
          if (Character.isDigit(lookahead)) {
            do {
              read();
            } while (Character.isDigit(lookahead));
            if (lookahead=='.') {
              read();
              while (Character.isDigit(lookahead))
                read();
            }
            token = FLOAT;
            return;
          }
        The code essentially implements this regexp: "\\d+(\\.\\d*)?"
  • 20. Demo: http://101companies.org/wiki/Contribution:javaLexer
  • 21. Summary of implementation aspects
          Declare an enum type for tokens.
          Read characters one by one.
          Use lookahead for decision making.
          Consume all characters for the lexeme.
          Build token/lexeme pairs.
          Implement operations by iteration over pairs.
        Other approaches use automata theory (DFAs).
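A hand-written lexer along these lines might look as follows. This is a sketch under assumptions, not the javaLexer contribution itself: the class name `LexerSketch`, the extra `WS` token for whitespace, the helper `keyword`, and the choice of `IllegalStateException` for errors are all made up for illustration. Whitespace is returned as a token so that every character of the input belongs to some lexeme, and `total` skips it when tracking the previous token.

```java
// Hypothetical sketch of the Lexer Pattern; all names are illustrative.
class LexerSketch {
    enum Token { COMPANY, DEPARTMENT, MANAGER, EMPLOYEE, ADDRESS, SALARY, OPEN, CLOSE, STRING, FLOAT, WS }

    private final String input;
    private int pos = 0;      // index of the lookahead character
    private String lexeme;    // lexeme of the most recently recognized token

    LexerSketch(String input) { this.input = input; }

    boolean hasNext() { return pos < input.length(); }
    String getLexeme() { return lexeme; }

    // Lookahead character, or '\0' once the input is exhausted.
    private char lookahead() { return hasNext() ? input.charAt(pos) : '\0'; }
    private void read() { pos++; }

    // Recognize the next token/lexeme pair based on the lookahead.
    Token next() {
        int start = pos;
        Token token;
        char c = lookahead();
        if (Character.isWhitespace(c)) {
            do { read(); } while (Character.isWhitespace(lookahead()));
            token = Token.WS;
        } else if (Character.isDigit(c)) {
            do { read(); } while (Character.isDigit(lookahead()));
            if (lookahead() == '.') {
                read();
                while (Character.isDigit(lookahead())) read();
            }
            token = Token.FLOAT;
        } else if (c == '"') {
            do { read(); } while (hasNext() && lookahead() != '"');
            if (!hasNext()) throw new IllegalStateException("Unterminated string at " + start);
            read(); // consume the closing quote
            token = Token.STRING;
        } else if (c == '{') {
            read(); token = Token.OPEN;
        } else if (c == '}') {
            read(); token = Token.CLOSE;
        } else if (Character.isLetter(c)) {
            do { read(); } while (Character.isLetter(lookahead()));
            token = keyword(input.substring(start, pos));
        } else {
            throw new IllegalStateException("Unexpected character at " + start);
        }
        lexeme = input.substring(start, pos);
        return token;
    }

    private static Token keyword(String word) {
        switch (word) {
            case "company": return Token.COMPANY;
            case "department": return Token.DEPARTMENT;
            case "manager": return Token.MANAGER;
            case "employee": return Token.EMPLOYEE;
            case "address": return Token.ADDRESS;
            case "salary": return Token.SALARY;
            default: throw new IllegalStateException("Unknown keyword: " + word);
        }
    }

    // Process token/lexeme pairs in a stream: total all salaries.
    static double total(String input) {
        LexerSketch lexer = new LexerSketch(input);
        double total = 0;
        Token previous = null;
        while (lexer.hasNext()) {
            Token current = lexer.next();
            if (current == Token.WS) continue; // whitespace does not count as "previous"
            if (current == Token.FLOAT && previous == Token.SALARY)
                total += Double.parseDouble(lexer.getLexeme());
            previous = current;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(total("company \"FooBar Inc.\" { salary 100.5 salary 50 }"));
    }
}
```

Unlike the chopper, this lexer recognizes `"FooBar Inc."` as a single STRING lexeme, because the decision is driven by the opening quote rather than by whitespace.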
  • 22. A problem with the Lexer Pattern (for the concrete approach discussed)
        How do we get (back) the conciseness of regular expressions?
          if (Character.isDigit(lookahead)) {
            do {
              read();
            } while (Character.isDigit(lookahead));
            if (lookahead=='.') {
              read();
              while (Character.isDigit(lookahead))
                read();
            }
            token = FLOAT;
            return;
          }
  • 23. The Copy/Replace Pattern: builds on top of the Lexer Pattern
  • 24. The Copy/Replace Pattern
        Intent: Transform text at the lexical level.
        Operational steps (run time):
          1. Recognize token/lexeme pairs in input.
          2. Process token/lexeme pairs in a stream.
             1. Copy some lexemes.
             2. Replace others.
  • 25. Precise copy for comparison
          public Copy(String in, String out) throws ... {
            Recognizer recognizer = new Recognizer(in);
            Writer writer = new OutputStreamWriter(new FileOutputStream(out));
            String lexeme = null;
            Token current = null;
            while (recognizer.hasNext()) {
              current = recognizer.next();
              lexeme = recognizer.getLexeme();
              writer.write(lexeme);
            }
            writer.close();
          }
  • 26. Copy/replace for cutting salaries in half
          ...
          lexeme = recognizer.getLexeme();
          // Cut salary in half
          if (current == FLOAT && previous == SALARY)
            lexeme = Double.toString(
              (Double.parseDouble(recognizer.getLexeme()) / 2.0d));
          writer.write(lexeme);
          ...
  • 27. Demo: http://101companies.org/wiki/Contribution:javaLexer
  • 28. Summary of implementation aspects
          Build on top of the Lexer Pattern.
          The processor writes to an output stream.
          The processor may maintain history such as “previous”.
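A compact version of the salary-cutting transformation is sketched below. To keep it self-contained, a `java.util.regex` pattern stands in for the hand-written lexer, and the output goes to a string instead of a file; both are assumptions of this sketch, as are the names `CopyReplaceSketch` and `cut`. Every lexeme, including whitespace, is copied verbatim unless it is a float that directly follows the `salary` keyword.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the Copy/Replace Pattern; all names are illustrative.
class CopyReplaceSketch {

    // One alternative per kind of lexeme; named groups tell which one matched.
    private static final Pattern LEXEME = Pattern.compile(
        "(?<WS>\\s+)|(?<FLOAT>\\d+(?:\\.\\d*)?)|(?<STRING>\"[^\"]*\")|(?<WORD>[a-z]+)|(?<BRACE>[{}])");

    static String cut(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = LEXEME.matcher(input);
        String previous = null; // history: the last non-whitespace lexeme
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt())
                throw new IllegalArgumentException("Unexpected input at position " + pos);
            String lexeme = m.group();
            // Replace: cut a salary in half. Everything else is copied as is.
            if (m.group("FLOAT") != null && "salary".equals(previous))
                lexeme = Double.toString(Double.parseDouble(lexeme) / 2.0d);
            out.append(lexeme);
            if (m.group("WS") == null) previous = m.group();
            pos = m.end();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(cut("\"Ann\" { address \"12 Main St\" salary 100.0 }"));
    }
}
```

Here the check on `previous` matters: a number inside a quoted address is part of a STRING lexeme and stays untouched, and a float in any other position would be copied rather than halved.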
  • 29. The Acceptor Pattern: builds on top of the Lexer Pattern
  • 30. EBNF for the 101companies System
          company :
            'company' STRING '{' department* '}' EOF;
          department :
            'department' STRING '{'
              ('manager' employee)
              ('employee' employee)*
              department*
            '}';
          employee :
            STRING '{'
              'address' STRING
              'salary' FLOAT
            '}';
        Wanted: an acceptor
  • 31. The Acceptor Pattern
        Intent: Verify syntactical correctness of input.
        Operational steps (run time):
          Recognize lexemes/tokens based on the Lexer Pattern.
          Match terminals.
          Invoke procedures for nonterminals.
          Commit to alternatives based on lookahead.
          Verify elements of sequences one after another.
          Communicate acceptance failure as exception.
        We assume a recursive descent parser as acceptor.
  • 32. Recursive descent parsing
        Grammar production:
          department :
            'department' STRING '{'
              ('manager' employee)
              ('employee' employee)*
              department*
            '}';
        Corresponding procedure:
          void department() {
            match(DEPARTMENT);
            match(STRING);
            match(OPEN);
            match(MANAGER);
            employee();
            while (test(EMPLOYEE)) {
              match(EMPLOYEE);
              employee();
            }
            while (test(DEPARTMENT))
              department();
            match(CLOSE);
          }
  • 33. Recursive descent parsing: use of alternatives
        A revised production:
          department :
            'department' STRING '{'
              ('manager' employee)
              ( ('employee' employee)
              | department
              )*
            '}';
        Corresponding procedure:
          void department() {
            match(DEPARTMENT);
            match(STRING);
            match(OPEN);
            match(MANAGER);
            employee();
            while (test(EMPLOYEE) || test(DEPARTMENT)) {
              if (test(EMPLOYEE)) {
                match(EMPLOYEE);
                employee();
              } else
                department();
            }
            match(CLOSE);
          }
  • 34. Demo: see class Acceptor.java at http://101companies.org/wiki/Contribution:javaParser
  • 35. Rules for recursive-descent parsers
          Each nonterminal becomes a (void) procedure.
          RHS symbols become statements.
          Terminals become “match” statements.
          Nonterminals become procedure calls.
          Symbol sequences become statement sequences.
          Star/plus repetitions become while loops with lookahead.
          Alternatives are selected based on lookahead.
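Applying these rules to the whole 101companies grammar gives an acceptor like the one sketched here. It is an illustration under assumptions rather than the Acceptor.java of the demo: the names `AcceptorSketch` and `accepts` are made up, a compact regex-based tokenizer stands in for the hand-written lexer, and `IllegalStateException` is used to communicate acceptance failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the Acceptor Pattern as a recursive-descent acceptor.
class AcceptorSketch {
    enum Token { COMPANY, DEPARTMENT, MANAGER, EMPLOYEE, ADDRESS, SALARY, OPEN, CLOSE, STRING, FLOAT, EOF }

    // Assumption: a compact regex stands in for the hand-written lexer.
    private static final Pattern LEXEME = Pattern.compile(
        "\\s*(\\d+(?:\\.\\d*)?|\"[^\"]*\"|[{}]|company|department|manager|employee|address|salary)");

    private final List<Token> tokens = new ArrayList<>();
    private int pos = 0; // index of the lookahead token

    AcceptorSketch(String input) {
        String text = input.trim();
        Matcher m = LEXEME.matcher(text);
        for (int at = 0; at < text.length(); at = m.end()) {
            m.region(at, text.length());
            if (!m.lookingAt()) throw new IllegalStateException("Bad input at " + at);
            String lexeme = m.group(1);
            char c = lexeme.charAt(0);
            tokens.add(Character.isDigit(c) ? Token.FLOAT
                    : c == '"' ? Token.STRING
                    : c == '{' ? Token.OPEN
                    : c == '}' ? Token.CLOSE
                    : Token.valueOf(lexeme.toUpperCase(Locale.ROOT)));
        }
        tokens.add(Token.EOF);
    }

    // Lookahead test and terminal matching.
    private boolean test(Token t) { return tokens.get(pos) == t; }
    private void match(Token t) {
        if (!test(t))
            throw new IllegalStateException("Expected " + t + " but found " + tokens.get(pos));
        pos++;
    }

    // One procedure per nonterminal.
    void company() {
        match(Token.COMPANY); match(Token.STRING); match(Token.OPEN);
        while (test(Token.DEPARTMENT)) department();
        match(Token.CLOSE); match(Token.EOF);
    }

    void department() {
        match(Token.DEPARTMENT); match(Token.STRING); match(Token.OPEN);
        match(Token.MANAGER); employee();
        while (test(Token.EMPLOYEE)) { match(Token.EMPLOYEE); employee(); }
        while (test(Token.DEPARTMENT)) department();
        match(Token.CLOSE);
    }

    void employee() {
        match(Token.STRING); match(Token.OPEN);
        match(Token.ADDRESS); match(Token.STRING);
        match(Token.SALARY); match(Token.FLOAT);
        match(Token.CLOSE);
    }

    // Convenience wrapper that turns the exception into a boolean.
    static boolean accepts(String input) {
        try {
            new AcceptorSketch(input).company();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("company \"A\" { department \"D\" { manager \"M\" { "
                + "address \"X\" salary 1.0 } } }"));
        System.out.println(accepts("company \"A\" { department \"D\" { } }"));
    }
}
```

The second sample is rejected because the grammar requires every department to have a manager.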
  • 36. The Parser Pattern: builds on top of the Acceptor Pattern
  • 37. The Parser Pattern
        Intent: Make syntactical structure accessible.
        Operational steps (run time):
          Accept input based on the Acceptor Pattern.
          Invoke semantic actions along acceptance.
  • 38. For comparison: no semantic actions
          void department() {
            match(DEPARTMENT);
            match(STRING);
            match(OPEN);
            match(MANAGER);
            employee();
            while (test(EMPLOYEE)) {
              match(EMPLOYEE);
              employee();
            }
            while (test(DEPARTMENT))
              department();
            match(CLOSE);
          }
  • 39. Handle events for open and close department
          void department() {
            match(DEPARTMENT);
            String name = match(STRING);
            match(OPEN);
            openDept(name);
            match(MANAGER);
            employee(true);
            while (test(EMPLOYEE)) {
              match(EMPLOYEE);
              employee(false);
            }
            while (test(DEPARTMENT))
              department();
            match(CLOSE);
            closeDept(name);
          }
  • 40. All handlers for companies
          protected void openCompany(String name) { }
          protected void closeCompany(String name) { }
          protected void openDept(String name) { }
          protected void closeDept(String name) { }
          protected void handleEmployee(
            boolean isManager,
            String name,
            String address,
            Double salary) { }
  • 41. A parser that totals
          public class Total extends Parser {
            private double total = 0;
            public double getTotal() {
              return total;
            }
            protected void handleEmployee(
              boolean isManager,
              String name,
              String address,
              Double salary) {
              total += salary;
            }
          }
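The interplay of acceptor, handlers, and a totaling subclass can be sketched end to end as follows. This is not the Parser.java of the demo: `ParserSketch`, `TotalSketch`, `parse`, and `unquote` are hypothetical names, a regex-based tokenizer again stands in for the hand-written lexer, and only the department and employee handlers are included to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the Parser Pattern: an acceptor plus overridable handlers.
class ParserSketch {
    enum Token { COMPANY, DEPARTMENT, MANAGER, EMPLOYEE, ADDRESS, SALARY, OPEN, CLOSE, STRING, FLOAT, EOF }

    // Assumption: a compact regex stands in for the hand-written lexer.
    private static final Pattern LEXEME = Pattern.compile(
        "\\s*(\\d+(?:\\.\\d*)?|\"[^\"]*\"|[{}]|company|department|manager|employee|address|salary)");

    private final List<Token> tokens = new ArrayList<>();
    private final List<String> lexemes = new ArrayList<>();
    private int pos = 0;

    // Semantic actions; subclasses override the ones they need.
    protected void openDept(String name) { }
    protected void closeDept(String name) { }
    protected void handleEmployee(boolean isManager, String name, String address, double salary) { }

    public void parse(String input) {
        tokens.clear(); lexemes.clear(); pos = 0;
        String text = input.trim();
        Matcher m = LEXEME.matcher(text);
        for (int at = 0; at < text.length(); at = m.end()) {
            m.region(at, text.length());
            if (!m.lookingAt()) throw new IllegalStateException("Bad input at " + at);
            String lexeme = m.group(1);
            char c = lexeme.charAt(0);
            tokens.add(Character.isDigit(c) ? Token.FLOAT
                    : c == '"' ? Token.STRING
                    : c == '{' ? Token.OPEN
                    : c == '}' ? Token.CLOSE
                    : Token.valueOf(lexeme.toUpperCase(Locale.ROOT)));
            lexemes.add(lexeme);
        }
        tokens.add(Token.EOF); lexemes.add("");
        company();
    }

    private boolean test(Token t) { return tokens.get(pos) == t; }

    // Match a terminal and return its lexeme for use in semantic actions.
    private String match(Token t) {
        if (!test(t))
            throw new IllegalStateException("Expected " + t + " but found " + tokens.get(pos));
        return lexemes.get(pos++);
    }

    private static String unquote(String s) { return s.substring(1, s.length() - 1); }

    private void company() {
        match(Token.COMPANY); match(Token.STRING); match(Token.OPEN);
        while (test(Token.DEPARTMENT)) department();
        match(Token.CLOSE); match(Token.EOF);
    }

    private void department() {
        match(Token.DEPARTMENT);
        String name = unquote(match(Token.STRING));
        match(Token.OPEN);
        openDept(name);
        match(Token.MANAGER); employee(true);
        while (test(Token.EMPLOYEE)) { match(Token.EMPLOYEE); employee(false); }
        while (test(Token.DEPARTMENT)) department();
        match(Token.CLOSE);
        closeDept(name);
    }

    private void employee(boolean isManager) {
        String name = unquote(match(Token.STRING));
        match(Token.OPEN);
        match(Token.ADDRESS); String address = unquote(match(Token.STRING));
        match(Token.SALARY); double salary = Double.parseDouble(match(Token.FLOAT));
        match(Token.CLOSE);
        handleEmployee(isManager, name, address, salary);
    }
}

// A parser that totals salaries by overriding a single handler.
class TotalSketch extends ParserSketch {
    private double total = 0;
    public double getTotal() { return total; }
    @Override
    protected void handleEmployee(boolean isManager, String name, String address, double salary) {
        total += salary;
    }
}
```

A caller would create a `TotalSketch`, invoke `parse` on the company text, and read the result from `getTotal()`. Because the handlers default to empty bodies, other operations can reuse the same parsing code by overriding different handlers.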
  • 42. Demo: see class Parser.java at http://101companies.org/wiki/Contribution:javaParser
  • 43. Summary
        Language processing is a programming domain.
        Grammars may define two levels of syntax:
          token/lexeme level (lexical level)
          tree-like structure level (context-free level)
        Both levels are implementable in parsers:
          Recursive descent for parsing
          Use generative tools (as discussed soon)