Compiler
1. Fortran 66 → Java Byte Code
Art Wilton
Kevin Chase
Alekhya Dulur
2. What is a Compiler?
A compiler is a program that translates a high-level
programming language (the source code) into
machine language (the target language).
Machine language is a sequence of 0’s and 1’s that
the machine (computer) decodes and executes
as instructions.
3. Translator:
English → German
“Where is Fred?” → “Wo ist Friedrich?”
Programming Language → Machine Code
Programming language:
    name = raw_input('What is your name?\n')
    print 'Hi, %s.' % name
Machine code (hex dump):
    0000000 05ea c000 8c07 8ec8 8ed8 8ec0
    0000010 befc 002d 20ac 74c0 b409 bb0e
    0000020 f2eb c031 16cd 19cd f0ea 00ff
    0000030 6365 2074 6f62 746f 6e69 2067
    0000040 6620 6f6c 7070 2079 7369 6e20
    0000050 676e 7265 7320 7075 6f70 7472
6. FORTRAN 66 – Brief History
Fortran is short for Formula Translation.
Developed by a team led by John Backus at IBM in the
1950s for scientific calculations.
Before Fortran, programmers wrote programs directly in
machine language (0’s and 1’s) or assembly language.
7. An Example of Fortran Code: 99 Bottles of Beer
C
      DO 30 J = 1, 98
      I = 100 - J
      WRITE (6,100) I, I
      WRITE (6,110)
      I=I-1
      IF (I .GE. 2) GO TO 20
      WRITE (6,125) I
      GO TO 30
   20 WRITE (6,120) I
   30 CONTINUE
      I=1
      WRITE (6,105) I, I
      WRITE (6,110)
      WRITE (6,130)
      CALL EXIT
C
  100 FORMAT (1H0,I2,30H BOTTLES OF BEER ON THE WALL,
     1,I2,16H BOTTLES OF BEER)
  105 FORMAT (1H0,I2,29H BOTTLE OF BEER ON THE WALL,
     1,I2,15H BOTTLE OF BEER)
  110 FORMAT (33H TAKE ONE DOWN AND PASS IT AROUND)
  120 FORMAT (1H ,I2,17H BOTTLES OF BEER.)
  125 FORMAT (1H ,I2,16H BOTTLE OF BEER.)
  130 FORMAT (20H NO BOTTLES OF BEER.)
      END
8. The Basic Structure of a Compiler:
Source Code → FRONT END → IR → BACK END → Machine Code
9. Create a Symbol Table, which holds information
about each identifier and which the AST can access
(the AST is explained later).
Identifier   Type      Value
x            int       8
test         String    “Begin Test”
allows       boolean   false
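A minimal sketch of how such a table might be stored, assuming a plain dictionary keyed by identifier. The names Symbol, SymbolTable, declare and lookup are hypothetical and are not taken from the project:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Symbol:
    """One row of the table: identifier, declared type, current value."""
    name: str
    type: str
    value: Any


class SymbolTable:
    """Hypothetical symbol table backed by a dict keyed by identifier."""

    def __init__(self):
        self._symbols = {}

    def declare(self, name, type_, value):
        # Refuse to declare the same identifier twice.
        if name in self._symbols:
            raise KeyError(f"{name!r} is already declared")
        self._symbols[name] = Symbol(name, type_, value)

    def lookup(self, name):
        # Raises KeyError if the identifier was never declared.
        return self._symbols[name]


# The three rows shown on the slide.
table = SymbolTable()
table.declare("x", "int", 8)
table.declare("test", "String", "Begin Test")
table.declare("allows", "boolean", False)

print(table.lookup("x"))  # Symbol(name='x', type='int', value=8)
```

Later stages, such as a tree walk over the AST, would call lookup to find an identifier's type or value.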
10. Intermediate Representation: Abstract Syntax Tree
Represents the source code in a tree form without the
details of syntax (grammar specific to a programming
language).
Each node in the tree represents a type of structure.
Example: (a + n) * 1
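One possible way to hold this expression as a tree, sketched with three hypothetical node classes (Var, Num, BinOp). The parentheses from the source text are not stored anywhere; the grouping is implied by how the nodes nest. The evaluate function and the sample values for a and n are assumptions added for illustration:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Var:
    name: str


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Var, Num, BinOp]

# (a + n) * 1: the "+" node is the left child of the "*" node.
tree = BinOp("*", BinOp("+", Var("a"), Var("n")), Num(1))


def evaluate(node, env):
    """Walk the tree bottom-up, looking variables up in env."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    left = evaluate(node.left, env)
    right = evaluate(node.right, env)
    if node.op == "+":
        return left + right
    if node.op == "*":
        return left * right
    raise ValueError(f"unsupported operator {node.op!r}")


print(evaluate(tree, {"a": 2, "n": 5}))  # 7
```

A back end would walk the same tree to emit target instructions instead of computing a value.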
11. The Abstract Syntax Tree is designed to hold
instructions for any type of language.
The initial goal of our project was to create a
compiler that could take any source code and
translate it into general instructions for any type of
language with the help of the Abstract Syntax Tree.
12. The FRONT END of a compiler:
Lexical Analysis
• Tokenizing
Semantic Analysis
• Parsing / Parse Tree
• Symbol Table
13. Tokenizing a Program:
x = absVal(-7);
LEXEME   TOKEN TYPE
x        Variable
=        Assignment
absVal   Function
-7       Integer
;        Punctuation
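The table above could be produced by a small regular-expression scanner. This is a sketch, not the project's scanner. The pattern list, the tokenize function, and the rule that an identifier followed by "(" is a function are all assumptions. Unlike the slide, it also reports the parentheses as punctuation, and it always treats a leading minus sign as part of the integer:

```python
import re

# Token patterns, tried in order; names mirror the token types on the slide.
TOKEN_SPEC = [
    ("integer", r"-?\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assignment", r"="),
    ("punctuation", r"[();]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan source left to right and return (lexeme, token_type) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        kind = match.lastgroup
        lexeme = match.group()
        pos = match.end()
        if kind == "skip":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "identifier":
            # Assumption: an identifier directly followed by "(" names a function.
            kind = "function" if source[pos:pos + 1] == "(" else "variable"
        tokens.append((lexeme, kind))
    return tokens


for lexeme, kind in tokenize("x = absVal(-7);"):
    print(lexeme, kind)
```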
14. Lexical Analysis (scanner): the process of reading a
program from left to right and grouping its characters into tokens.
A token is a group of characters that forms a single
meaningful unit of the program.
For example, the words of an English sentence can be
classified into token types such as nouns and
adjectives.
“The tree is tall.”
Lexeme   Token Type
tree     Noun
tall     Adjective
15. Semantic Analysis:
Does this have any meaning?
Detect errors in the source code that would
prevent it from being executed.
“Tree tall the is.”
x = 3 +
…but not all errors can be detected:
x = 3 ÷ 0
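The same distinction can be observed with Python's own compiler, used here only as a stand-in for the project's compiler. The helper name check is hypothetical. The incomplete statement is rejected before anything runs, while the division by zero compiles and only fails during execution:

```python
def check(source):
    """Report whether source is rejected by the compiler or fails only when run."""
    try:
        code = compile(source, "<example>", "exec")
    except SyntaxError:
        return "rejected at compile time"
    try:
        exec(code, {})
    except ZeroDivisionError:
        return "compiles, but fails at run time"
    return "compiles and runs"


print(check("x = 3 +"))    # rejected at compile time
print(check("x = 3 / 0"))  # compiles, but fails at run time
print(check("x = 3 + 1"))  # compiles and runs
```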
16. We focused on specializing our compiler to
translate Fortran 66 code into modern Java Byte Code.
17. Java Byte Code is not Java…
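Java:
// Print every prime number below 1000.
outer:
for (int i = 2; i < 1000; i++) {
    for (int j = 2; j < i; j++) {
        // i has a divisor, so it is not prime: skip to the next i.
        if (i % j == 0) continue outer;
    }
    System.out.println(i);
}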