SlideShare una empresa de Scribd logo
1 de 39
(Introduction)
satatpu@yahoo.com
1
Basic Requirements
 Regularity-Be on time
 Attendance
 Assignments-Zero tolerance for plagiarism
 Quizzes-No retake
 No requests at the end of the semester
 DON’T hesitate to speak if you have any
questions/comments/suggestions!
2
Compilers Construction (First Edition)
By: A.A. Puntambekar (Chapter No. 1 to 8)
Reference: Compiler Construction Principles & practices
by Kenneth C. Louden
3
 Increases understanding of language
semantics
 To understand performance issues for
programming languages
 Teaches good language design
 New devices may need device-specific
languages
 New business fields may need domain-specific
languages
4
 Introduce and Describe the Fundamentals
of Compilation Techniques
 Assist you in Writing your Own Compiler
(or a part of it)
 Overview & Basic Concepts
 Lexical Analysis
 Syntax Analysis / Syntax-Directed
Translation
 Semantic Analysis
 Intermediate Code Generation.
5
 Programming languages: Notations for
describing computations to people and to
machines
 e.g. Java, C#, C++, Perl, Prolog etc.
 A program written in any language must be
translated into a form that is understood by the
computer
 This form is typically known as the Machine
Language (ML), Machine Code, or Object Code
 Consists of streams of 0’s and 1’s
 The program that carries out the translation
activity is called the compiler.
6
 The language to be translated: Source language
 Input code is called the Source code
 The language produced: Target language
 Output code is called Target code
 A compiler is a program that translates source
code written in one language into the target
code of another language.
7
Java Machine Language
 Executables: Files that actually do something by
carrying out a set of instructions (as compared
to files that only contain data)
 E.g., .exe files in Windows
 If you look at their data, it won’t make sense
8
Hex Dump of a
boot loader executable
The primary reason for
compiling source code
is to create an
executable ML program
 Once the executable is there, it can be called by
the user to process the given inputs to a
program and produce the desired outputs
 Initially, there is a compile phase which is
followed by the run/execute phase
 This approach is used in many languages,
e.g., Java, C++, C# etc.
 Interpretation is concept parallel to compilation
9
Target
Executable
Program Inputs Desired Outputs
10
 Interpreter: A program that doesn’t produce the
target executable. It can do three things:
1. In a line-by-line fashion, it directly executes
the source code by using the given inputs,
and producing the desired outputs
2. May translate source code into some
intermediate language and then execute that
immediately, e.g., Perl, Python, and Matlab
3. May also execute previously stored pre-
compiled code, made by a compiler that is
part of the interpreter system, e.g., Java and
Pascal
 Some systems, e.g., SmallTalk, may combine 2
and 3.
11
12
Interpreter
Source Code
Desired Outputs
Program Inputs
Compiler
Source Code
Program Inputs
Bytecodes: Intermediate Program Virtual
Machine
(Interpreter)
Desired Outputs
Java combines compilation and interpretation
Portability: Bytecodes compiled on one machine can be interpreted on
another one
13
Preprocessor
Source Code
Library Files
Re-locatable ML Files
Compiler
Assembler
Linker/Loader
Preprocessed Source Code
Target Assembly Program
Re-locatable ML
Target ML
Combines source code stored
in different files
Converts source code into
Assembly Language (which is
easier to generate as an
intermediate representation
and also, easier to debug)
Assembler produces ML
Linker: resolves external references
Loader: Puts all object files into
memory for execution
Combines different ML parts
(of the same program) that
have been compiled
separately
1. Correct code
2. Output runs fast
3. Compiler runs fast
4. Compile time proportional to program size
5. Good diagnostics for syntax errors
6. Works well with the debugger
7. Consistent, predictable optimization
14
15
Lexical Analyzer
Syntax Analyzer
Semantic Analyzer
Intermediate Code Generator
Character Stream
Machine-Independent
Code Optimizer
Machine Code
Generator
Machine-Dependent
Code Optimizer
Token Stream
Syntax Tree
Semantically-correct
Syntax Tree
Intermediate Representation
Intermediate Representation
Target Machine Code
Optimized Target
Machine Code
SYMBOL TABLE
 Phases involved:
 Lexical Analysis
 Syntax Analysis
 Semantic Analysis
 Determines the operations implied by the
source program which are recorded in a tree
structure called the Syntax Tree
 Breaks up the source code into constituent
pieces, while storing information in the symbol
table
 Imposes a grammatical structure on these
pieces.
16
 Phases involved:
 Intermediate Code Generation
 Intermediate Code Optimization
 Machine Code Generator
 Machine Dependent Code Optimization
 Constructs the target code from the syntax tree,
and from the information in the symbol table
 Optimization is another important activity
 Both the intermediate and machine codes are
optimized in order to achieve an efficient
translation, with the least possible use of
computing resources. 17
 Front End phases:
 Lexical Analysis
 Syntax Analysis
 Semantic Analysis
 Intermediate Code Generation
 Intermediate Code Optimization
 Front end is machine independent
 Back End phases:
 Machine Code generation
 Machine Code optimization
 Back end is machine dependent.
18
 Read the character stream of the source code and
break it up into meaningful sequences called
lexemes (also known as “aka” Scanning)
 Each lexeme is represented as a token
 <token_name, attribute value>
 Single, atomic unit of the language
 e.g., force = mass * 60
 Lexeme force token <id,1>
 Lexeme = token <=>
 Lexeme mass token <id,2>
 Lexeme * token <*>
 Lexeme 60 token <60>
 Output: <id,1>=<id,2>*60
19
1 force ..
2 mass ..
SYMBOL TABLE
 Syntax analysis: Parsing the token stream to
identify the grammatical structure of the stream
(a.k.a parsing)
 Typically builds a parse tree, which replaces the
linear sequence of tokens with a tree structure
 The tree is built according to the rules of
a formal grammar which define the language's
syntax
 It is analyzed, augmented, and transformed
by later phases of compilation.
20
<id,1>
=
*
<id,2> 60
Syntax tree for
<id,1>=<id,2>*60
 Semantic analysis: Adding semantic information
to the parse tree
 Performs semantic checks, e.g., type
checking (checking for type errors), object
binding (associating variable and function
references with their definitions), rejecting
incorrect programs or issuing warnings
 Suppose force and mass are floats and
acceleration is integer: type casting required
21
<id,1>
=
*
<id,2> inttofloat
Semantically correct
syntax tree for
<id,1>=<id,2>*60
60
 Intermediate Code Generation: After verifying
the semantics of the source code, we convert it
into a machine-like intermediate representation
i.e., like a program for a machine
 Typically, an assembly language-like form is
used, because it is easy to generate and debug
 Three address code: 3 operands/instruction
 Suppose t1 and t2 are registers
 t1=inttofloat(60)
 t2=id2*t1
 id1=t2.
22
<id,1>
=
*
<id,2> inttofloat
60
 Machine-Independent Optimizer: Optimize the
intermediate code, so that a better target
program can be generated
 Faster, smaller code that consumes less
computation power
 e.g., X = Y * 0 is the same as X = 0
 Example:
 t1=inttofloat(60)
 t2=id2*t1 t1=id2*60.0
 id1=t2 id1=t1
23
Eliminating the one time
use of register t2, and
converting 60 to a float
 Machine Code Generator: Converts the optimized
intermediate code into the target code (typically the
Machine Code)
 This is done through the Assembly Language
 In case of Machine Code, registers (memory locations) are
selected for each variable
 Then, intermediate instructions are translated into
sequences of machine code instructions that perform the
same task
 Example:
 id1=id2*60.0 LDF R2,id2
MULF R1, R2, #60
STF id1, R1
24
Transfer id2 into R2, multiply and
assign to R1
 Machine-Dependent Optimizer: In this phase,
after the Machine Code has been generated, it is
optimized further (if needed)
 This completes the compilation process
 Important: Optimization is an optional activity in
compilation
 One or both of the optimization phases might
be missing.
25
26
27
28
29
Modern optimizers are usually built as a set of passes
 Symbol Table: A data structure for storing the
names of the variables, and their associated
attributes
 Attributes: Storage allocated, type, scope
 Attributes (functions): number and type of
arguments, method of argument passing (by
reference / by value)
 It should be designed so that the compiler can:
 Find the record for each variable quickly
 Store/retrieve data for a record quickly.
30
 One or more phases can be grouped into a pass
that reads an input file and generates an output
file, e.g., front-end phases can be grouped into
one pass
 It is also possible to skip some phase(s), e.g.,
all optimization activity can be avoided
 Typically, compilers produce intermediate code
from a source language that can be translated
into different target languages
 A front end can be interfaced with different
back-ends and vice versa that is several front-
ends (source code in different languages) can
be interfaced with the same back-end.
31
 Compiler Construction Toolkits: Integrated
frameworks for the implementation of various
compilation phases
 Scanner Generators / Scanners:
Automatically produce lexical analyzers from
a description of the language tokens
 Parser Generators / Parsers: Automatically
produce syntax analyzers from a grammatical
description of the programming language
 Syntax-directed Translation Engines: Produce
collection of routines for walking a parse tree
and generating intermediate code.
32
 ANTLR: Another Tool for Language Recognition
 Provides scanning, parsing and tree-walking
functionalities, along with many others
 http://www.antlr.org/
 JACCIE: Java-based Compiler Compiler with
Interactive Environment
 Educational tool for visualizing modern
compiling techniques
 Automatically generates demonstration
compilers from suitable language descriptions
 http://www2.cs.unibw.de/Tools/Syntax/englis
h/index.html
33
 SableCC: A compiler construction toolkit
 http://sablecc.org/
 JavaCC: Very popular parser generator
 Takes as input a language, converts it to a
Java file, and then determines matches to this
language
 Builds syntax trees according to a given
language using Java Tree Builder (JTB)
 https://javacc.dev.java.net/
 JLEX: A Java-based lexical analyzer
 http://www.cs.princeton.edu/~appel/modern/j
ava/JLex/
34
 IRONY: A development kit for implementing
languages on .NET platform
 Uses the flexibility and power of C# language
and .NET Framework to implement a
completely new and streamlined technology of
compiler construction
 Provides a new programming approach to
encode some grammar
 http://www.codeproject.com/KB/recipes/Irony
.aspx
 http://www.codeplex.com/irony
35
 PHC: open source compiler for PHP with support
for plugins
 Not as flexible as libraries for Java/C#
 http://www.phpcompiler.org/
 While browsing, you will come across “YACC”
 YACC: Yet Another Compiler Compiler
 A parser generator for UNIX that outputs in C
 The default parser generator for UNIX for a
long time
 Now replaced by Berkeley YACC, GNU Bison,
MKS Yacc etc.
 Many C-based CC tools require YACC-like
inputs.
36
 First generation language: Machine Language
 Very low-level operations coded in 0’s and
1’s
 Slow, tedious and error prone
 Programs were hard to understand and
modify
 Second generation language: Assembly
Language
 First step towards people-friendly language
 Instructions are mnemonic representations
of machine instructions
 Macros (shorthands) are available for
frequently executed pieces of code.
37
 Third generation languages: More user-friendly
than the second-generation ones
 FORTRAN (scientific computation)
 COBOL (business data processing)
 LISP (symbolic computation)
 C, C++, C#, Java
 Fourth generation languages: designed for
specific applications
 SQL (DB), Postscript (text formatting)
 Fifth generation languages: logic and
constraints-based languages
 Prolog, OPS5.
38
 Object-oriented languages (SmallTalk, C++,
C#, Java etc.)
 Scripting Languages (Perl, Awk, JavaScript etc.)
 Impact of Language on Compilers
 Compilers should (and have) adapt(ed) to the
evolution of languages
 Promote the use of high-level languages by
reducing the execution overhead
 Impact of Computer Architecture on Compilers
 Architecture has evolved immensely, and
compilers are now being used to evaluate a
new architecture, before it can be marketed.
39

Más contenido relacionado

La actualidad más candente

Compiler Design Lecture Notes
Compiler Design Lecture NotesCompiler Design Lecture Notes
Compiler Design Lecture NotesFellowBuddy.com
 
Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?Eelco Visser
 
Type checking in compiler design
Type checking in compiler designType checking in compiler design
Type checking in compiler designSudip Singh
 
Lexical Analysis - Compiler design
Lexical Analysis - Compiler design Lexical Analysis - Compiler design
Lexical Analysis - Compiler design Aman Sharma
 
Operating systems system structures
Operating systems   system structuresOperating systems   system structures
Operating systems system structuresMukesh Chinta
 
Implementation of lexical analyser
Implementation of lexical analyserImplementation of lexical analyser
Implementation of lexical analyserArchana Gopinath
 
Symbol table in compiler Design
Symbol table in compiler DesignSymbol table in compiler Design
Symbol table in compiler DesignKuppusamy P
 
System Programing Unit 1
System Programing Unit 1System Programing Unit 1
System Programing Unit 1Manoj Patil
 
Compiler construction tools
Compiler construction toolsCompiler construction tools
Compiler construction toolsAkhil Kaushik
 
Introduction to MPI
Introduction to MPI Introduction to MPI
Introduction to MPI Hanif Durad
 
A simple approach of lexical analyzers
A simple approach of lexical analyzersA simple approach of lexical analyzers
A simple approach of lexical analyzersArchana Gopinath
 
Computer Organization and Assembly Language
Computer Organization and Assembly LanguageComputer Organization and Assembly Language
Computer Organization and Assembly Languagefasihuddin90
 
Compiler Design
Compiler DesignCompiler Design
Compiler DesignMir Majid
 
Expression and Operartor In C Programming
Expression and Operartor In C Programming Expression and Operartor In C Programming
Expression and Operartor In C Programming Kamal Acharya
 

La actualidad más candente (20)

Compiler Design Lecture Notes
Compiler Design Lecture NotesCompiler Design Lecture Notes
Compiler Design Lecture Notes
 
Type Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLikeType Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLike
 
Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?Compiler Construction | Lecture 1 | What is a compiler?
Compiler Construction | Lecture 1 | What is a compiler?
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
 
Type checking in compiler design
Type checking in compiler designType checking in compiler design
Type checking in compiler design
 
Lexical Analysis - Compiler design
Lexical Analysis - Compiler design Lexical Analysis - Compiler design
Lexical Analysis - Compiler design
 
Compilers
CompilersCompilers
Compilers
 
Operating systems system structures
Operating systems   system structuresOperating systems   system structures
Operating systems system structures
 
Implementation of lexical analyser
Implementation of lexical analyserImplementation of lexical analyser
Implementation of lexical analyser
 
Symbol table in compiler Design
Symbol table in compiler DesignSymbol table in compiler Design
Symbol table in compiler Design
 
Compiler Chapter 1
Compiler Chapter 1Compiler Chapter 1
Compiler Chapter 1
 
Compiler design
Compiler designCompiler design
Compiler design
 
System Programing Unit 1
System Programing Unit 1System Programing Unit 1
System Programing Unit 1
 
Compiler construction tools
Compiler construction toolsCompiler construction tools
Compiler construction tools
 
Introduction to MPI
Introduction to MPI Introduction to MPI
Introduction to MPI
 
A simple approach of lexical analyzers
A simple approach of lexical analyzersA simple approach of lexical analyzers
A simple approach of lexical analyzers
 
Computer Organization and Assembly Language
Computer Organization and Assembly LanguageComputer Organization and Assembly Language
Computer Organization and Assembly Language
 
Compiler Design
Compiler DesignCompiler Design
Compiler Design
 
Expression and Operartor In C Programming
Expression and Operartor In C Programming Expression and Operartor In C Programming
Expression and Operartor In C Programming
 
Run time administration
Run time administrationRun time administration
Run time administration
 

Similar a Compiler Construction introduction

Chapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdfChapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdfDrIsikoIsaac
 
Compiler_Lecture1.pdf
Compiler_Lecture1.pdfCompiler_Lecture1.pdf
Compiler_Lecture1.pdfAkarTaher
 
unit1pdf__2021_12_14_12_37_34.pdf
unit1pdf__2021_12_14_12_37_34.pdfunit1pdf__2021_12_14_12_37_34.pdf
unit1pdf__2021_12_14_12_37_34.pdfDrIsikoIsaac
 
Language translators
Language translatorsLanguage translators
Language translatorsAditya Sharat
 
SSD Mod 2 -18CS61_Notes.pdf
SSD Mod 2 -18CS61_Notes.pdfSSD Mod 2 -18CS61_Notes.pdf
SSD Mod 2 -18CS61_Notes.pdfJacobDragonette
 
Chapter One
Chapter OneChapter One
Chapter Onebolovv
 
Ch 1.pptx
Ch 1.pptxCh 1.pptx
Ch 1.pptxwoldu2
 
Dineshmaterial1 091225091539-phpapp02
Dineshmaterial1 091225091539-phpapp02Dineshmaterial1 091225091539-phpapp02
Dineshmaterial1 091225091539-phpapp02Tirumala Rao
 
Chapter-1.pptx compiler Design Course Material
Chapter-1.pptx compiler Design Course MaterialChapter-1.pptx compiler Design Course Material
Chapter-1.pptx compiler Design Course MaterialgadisaAdamu
 
what is compiler and five phases of compiler
what is compiler and five phases of compilerwhat is compiler and five phases of compiler
what is compiler and five phases of compileradilmehmood93
 
2-Design Issues, Patterns, Lexemes, Tokens-28-04-2023.docx
2-Design Issues, Patterns, Lexemes, Tokens-28-04-2023.docx2-Design Issues, Patterns, Lexemes, Tokens-28-04-2023.docx
2-Design Issues, Patterns, Lexemes, Tokens-28-04-2023.docxvenkatapranaykumarGa
 

Similar a Compiler Construction introduction (20)

Chapter#01 cc
Chapter#01 ccChapter#01 cc
Chapter#01 cc
 
Phases of Compiler
Phases of CompilerPhases of Compiler
Phases of Compiler
 
Chapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdfChapter1pdf__2021_11_23_10_53_20.pdf
Chapter1pdf__2021_11_23_10_53_20.pdf
 
Compiler_Lecture1.pdf
Compiler_Lecture1.pdfCompiler_Lecture1.pdf
Compiler_Lecture1.pdf
 
Compiler Design Material
Compiler Design MaterialCompiler Design Material
Compiler Design Material
 
unit1pdf__2021_12_14_12_37_34.pdf
unit1pdf__2021_12_14_12_37_34.pdfunit1pdf__2021_12_14_12_37_34.pdf
unit1pdf__2021_12_14_12_37_34.pdf
 
Language translators
Language translatorsLanguage translators
Language translators
 
Chapter 1.pptx
Chapter 1.pptxChapter 1.pptx
Chapter 1.pptx
 
SSD Mod 2 -18CS61_Notes.pdf
SSD Mod 2 -18CS61_Notes.pdfSSD Mod 2 -18CS61_Notes.pdf
SSD Mod 2 -18CS61_Notes.pdf
 
Chapter One
Chapter OneChapter One
Chapter One
 
Ch 1.pptx
Ch 1.pptxCh 1.pptx
Ch 1.pptx
 
Assignment1
Assignment1Assignment1
Assignment1
 
Cpcs302 1
Cpcs302  1Cpcs302  1
Cpcs302 1
 
Dineshmaterial1 091225091539-phpapp02
Dineshmaterial1 091225091539-phpapp02Dineshmaterial1 091225091539-phpapp02
Dineshmaterial1 091225091539-phpapp02
 
Cd unit i
Cd unit iCd unit i
Cd unit i
 
Compiler
Compiler Compiler
Compiler
 
Chapter-1.pptx compiler Design Course Material
Chapter-1.pptx compiler Design Course MaterialChapter-1.pptx compiler Design Course Material
Chapter-1.pptx compiler Design Course Material
 
what is compiler and five phases of compiler
what is compiler and five phases of compilerwhat is compiler and five phases of compiler
what is compiler and five phases of compiler
 
Compiler presentaion
Compiler presentaionCompiler presentaion
Compiler presentaion
 
2-Design Issues, Patterns, Lexemes, Tokens-28-04-2023.docx
2-Design Issues, Patterns, Lexemes, Tokens-28-04-2023.docx2-Design Issues, Patterns, Lexemes, Tokens-28-04-2023.docx
2-Design Issues, Patterns, Lexemes, Tokens-28-04-2023.docx
 

Último

Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 

Último (20)

Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 

Compiler Construction introduction

  • 2. Basic Requirements  Regularity-Be on time  Attendance  Assignments-Zero tolerance for plagiarism  Quizzes-No retake  No requests at the end of the semester  DON’T hesitate to speak if you have any questions/comments/suggestions! 2
  • 3. Compilers Construction (First Edition) By: A.A. Puntambekar (Chapter No. 1 to 8) Reference: Compiler Construction Principles & practices by Kenneth C. Louden 3
  • 4.  Increases understanding of language semantics  To understand performance issues for programming languages  Teaches good language design  New devices may need device-specific languages  New business fields may need domain-specific languages 4
  • 5.  Introduce and Describe the Fundamentals of Compilation Techniques  Assist you in Writing your Own Compiler (or a part of it)  Overview & Basic Concepts  Lexical Analysis  Syntax Analysis / Syntax-Directed Translation  Semantic Analysis  Intermediate Code Generation. 5
  • 6.  Programming languages: Notations for describing computations to people and to machines  e.g. Java, C#, C++, Perl, Prolog etc.  A program written in any language must be translated into a form that is understood by the computer  This form is typically known as the Machine Language (ML), Machine Code, or Object Code  Consists of streams of 0’s and 1’s  The program that carries out the translation activity is called the compiler. 6
  • 7.  The language to be translated: Source language  Input code is called the Source code  The language produced: Target language  Output code is called Target code  A compiler is a program that translates source code written in one language into the target code of another language. 7 Java Machine Language
  • 8.  Executables: Files that actually do something by carrying out a set of instructions (as compared to files that only contain data)  E.g., .exe files in Windows  If you look at their data, it won’t make sense 8 Hex Dump of a boot loader executable The primary reason for compiling source code is to create an executable ML program
  • 9.  Once the executable is there, it can be called by the user to process the given inputs to a program and produce the desired outputs  Initially, there is a compile phase which is followed by the run/execute phase  This approach is used in many languages, e.g., Java, C++, C# etc.  Interpretation is concept parallel to compilation 9 Target Executable Program Inputs Desired Outputs
  • 10. 10
  • 11.  Interpreter: A program that doesn’t produce the target executable. It can do three things: 1. In a line-by-line fashion, it directly executes the source code by using the given inputs, and producing the desired outputs 2. May translate source code into some intermediate language and then execute that immediately, e.g., Perl, Python, and Matlab 3. May also execute previously stored pre- compiled code, made by a compiler that is part of the interpreter system, e.g., Java and Pascal  Some systems, e.g., SmallTalk, may combine 2 and 3. 11
  • 12. 12 Interpreter Source Code Desired Outputs Program Inputs Compiler Source Code Program Inputs Bytecodes: Intermediate Program Virtual Machine (Interpreter) Desired Outputs Java combines compilation and interpretation Portability: Bytecodes compiled on one machine can be interpreted on another one
  • 13. 13 Preprocessor Source Code Library Files Re-locatable ML Files Compiler Assembler Linker/Loader Preprocessed Source Code Target Assembly Program Re-locatable ML Target ML Combines source code stored in different files Converts source code into Assembly Language (which is easier to generate as an intermediate representation and also, easier to debug) Assembler produces ML Linker: resolves external references Loader: Puts all object files into memory for execution Combines different ML parts (of the same program) that have been compiled separately
  • 14. 1. Correct code 2. Output runs fast 3. Compiler runs fast 4. Compile time proportional to program size 5. Good diagnostics for syntax errors 6. Works well with the debugger 7. Consistent, predictable optimization 14
  • 15. 15 Lexical Analyzer Syntax Analyzer Semantic Analyzer Intermediate Code Generator Character Stream Machine-Independent Code Optimizer Machine Code Generator Machine-Dependent Code Optimizer Token Stream Syntax Tree Semantically-correct Syntax Tree Intermediate Representation Intermediate Representation Target Machine Code Optimized Target Machine Code SYMBOL TABLE
  • 16.  Phases involved:  Lexical Analysis  Syntax Analysis  Semantic Analysis  Determines the operations implied by the source program which are recorded in a tree structure called the Syntax Tree  Breaks up the source code into constituent pieces, while storing information in the symbol table  Imposes a grammatical structure on these pieces. 16
  • 17.  Phases involved:  Intermediate Code Generation  Intermediate Code Optimization  Machine Code Generator  Machine Dependent Code Optimization  Constructs the target code from the syntax tree, and from the information in the symbol table  Optimization is another important activity  Both the intermediate and machine codes are optimized in order to achieve an efficient translation, with the least possible use of computing resources. 17
  • 18.  Front End phases:  Lexical Analysis  Syntax Analysis  Semantic Analysis  Intermediate Code Generation  Intermediate Code Optimization  Front end is machine independent  Back End phases:  Machine Code generation  Machine Code optimization  Back end is machine dependent. 18
  • 19.  Read the character stream of the source code and break it up into meaningful sequences called lexemes (also known as “aka” Scanning)  Each lexeme is represented as a token  <token_name, attribute value>  Single, atomic unit of the language  e.g., force = mass * 60  Lexeme force token <id,1>  Lexeme = token <=>  Lexeme mass token <id,2>  Lexeme * token <*>  Lexeme 60 token <60>  Output: <id,1>=<id,2>*60 19 1 force .. 2 mass .. SYMBOL TABLE
  • 20.  Syntax analysis: Parsing the token stream to identify the grammatical structure of the stream (a.k.a parsing)  Typically builds a parse tree, which replaces the linear sequence of tokens with a tree structure  The tree is built according to the rules of a formal grammar which define the language's syntax  It is analyzed, augmented, and transformed by later phases of compilation. 20 <id,1> = * <id,2> 60 Syntax tree for <id,1>=<id,2>*60
  • 21.  Semantic analysis: Adding semantic information to the parse tree  Performs semantic checks, e.g., type checking (checking for type errors), object binding (associating variable and function references with their definitions), rejecting incorrect programs or issuing warnings  Suppose force and mass are floats and acceleration is integer: type casting required 21 <id,1> = * <id,2> inttofloat Semantically correct syntax tree for <id,1>=<id,2>*60 60
  • 22.  Intermediate Code Generation: After verifying the semantics of the source code, we convert it into a machine-like intermediate representation i.e., like a program for a machine  Typically, an assembly language-like form is used, because it is easy to generate and debug  Three address code: 3 operands/instruction  Suppose t1 and t2 are registers  t1=inttofloat(60)  t2=id2*t1  id1=t2. 22 <id,1> = * <id,2> inttofloat 60
  • 23.  Machine-Independent Optimizer: Optimize the intermediate code, so that a better target program can be generated  Faster, smaller code that consumes less computation power  e.g., X = Y * 0 is the same as X = 0  Example:  t1=inttofloat(60)  t2=id2*t1 t1=id2*60.0  id1=t2 id1=t1 23 Eliminating the one time use of register t2, and converting 60 to a float
  • 24.  Machine Code Generator: Converts the optimized intermediate code into the target code (typically the Machine Code)  This is done through the Assembly Language  In case of Machine Code, registers (memory locations) are selected for each variable  Then, intermediate instructions are translated into sequences of machine code instructions that perform the same task  Example:  id1=id2*60.0 LDF R2,id2 MULF R1, R2, #60 STF id1, R1 24 Transfer id2 into R2, multiply and assign to R1
  • 25.  Machine-Dependent Optimizer: In this phase, after the Machine Code has been generated, it is optimized further (if needed)  This completes the compilation process  Important: Optimization is an optional activity in compilation  One or both of the optimization phases might be missing. 25
  • 26. 26
  • 27. 27
  • 28. 28
  • 29. 29 Modern optimizers are usually built as a set of passes
  • 30.  Symbol Table: A data structure for storing the names of the variables, and their associated attributes  Attributes: Storage allocated, type, scope  Attributes (functions): number and type of arguments, method of argument passing (by reference / by value)  It should be designed so that the compiler can:  Find the record for each variable quickly  Store/retrieve data for a record quickly. 30
  • 31.  One or more phases can be grouped into a pass that reads an input file and generates an output file, e.g., front-end phases can be grouped into one pass  It is also possible to skip some phase(s), e.g., all optimization activity can be avoided  Typically, compilers produce intermediate code from a source language that can be translated into different target languages  A front end can be interfaced with different back-ends and vice versa that is several front- ends (source code in different languages) can be interfaced with the same back-end. 31
  • 32.  Compiler Construction Toolkits: Integrated frameworks for the implementation of various compilation phases  Scanner Generators / Scanners: Automatically produce lexical analyzers from a description of the language tokens  Parser Generators / Parsers: Automatically produce syntax analyzers from a grammatical description of the programming language  Syntax-directed Translation Engines: Produce collection of routines for walking a parse tree and generating intermediate code. 32
  • 33.  ANTLR: Another Tool for Language Recognition  Provides scanning, parsing and tree-walking functionalities, along with many others  http://www.antlr.org/  JACCIE: Java-based Compiler Compiler with Interactive Environment  Educational tool for visualizing modern compiling techniques  Automatically generates demonstration compilers from suitable language descriptions  http://www2.cs.unibw.de/Tools/Syntax/englis h/index.html 33
  • 34.  SableCC: A compiler construction toolkit  http://sablecc.org/  JavaCC: Very popular parser generator  Takes as input a language, converts it to a Java file, and then determines matches to this language  Builds syntax trees according to a given language using Java Tree Builder (JTB)  https://javacc.dev.java.net/  JLEX: A Java-based lexical analyzer  http://www.cs.princeton.edu/~appel/modern/j ava/JLex/ 34
  • 35.  IRONY: A development kit for implementing languages on .NET platform  Uses the flexibility and power of C# language and .NET Framework to implement a completely new and streamlined technology of compiler construction  Provides a new programming approach to encode some grammar  http://www.codeproject.com/KB/recipes/Irony .aspx  http://www.codeplex.com/irony 35
  • 36.  PHC: open source compiler for PHP with support for plugins  Not as flexible as libraries for Java/C#  http://www.phpcompiler.org/  While browsing, you will come across “YACC”  YACC: Yet Another Compiler Compiler  A parser generator for UNIX that outputs in C  The default parser generator for UNIX for a long time  Now replaced by Berkeley YACC, GNU Bison, MKS Yacc etc.  Many C-based CC tools require YACC-like inputs. 36
  • 37.  First generation language: Machine Language  Very low-level operations coded in 0’s and 1’s  Slow, tedious and error prone  Programs were hard to understand and modify  Second generation language: Assembly Language  First step towards people-friendly language  Instructions are mnemonic representations of machine instructions  Macros (shorthands) are available for frequently executed pieces of code. 37
  • 38.  Third generation languages: More user-friendly than the second-generation ones  FORTRAN (scientific computation)  COBOL (business data processing)  LISP (symbolic computation)  C, C++, C#, Java  Fourth generation languages: designed for specific applications  SQL (DB), Postscript (text formatting)  Fifth generation languages: logic and constraints-based languages  Prolog, OPS5. 38
  • 39.  Object-oriented languages (SmallTalk, C++, C#, Java etc.)  Scripting Languages (Perl, Awk, JavaScript etc.)  Impact of Language on Compilers  Compilers should (and have) adapt(ed) to the evolution of languages  Promote the use of high-level languages by reducing the execution overhead  Impact of Computer Architecture on Compilers  Architecture has evolved immensely, and compilers are now being used to evaluate a new architecture, before it can be marketed. 39

Notas del editor

  1. a