3. ERROR
• Programs submitted to a compiler often contain errors of various kinds.
• A good compiler should therefore detect as many errors as possible and also recover from them.
• That is, even in the presence of errors, the compiler should scan the whole program and try to compile all of it (error recovery).
4. • When the scanner or parser finds an error and cannot proceed, the compiler must modify the input
• so that the correct portions of the program can be pieced together and successfully processed in the syntax analysis phase.
• This requires considerable forethought from the very beginning of compiler design.
5. • After detecting an error, a compiler can respond in various ways:
• * A simple compiler may stop all activities other than lexical and syntactic analysis.
• * A more complex compiler may transform the erroneous input into a similar legal input on which normal processing can be resumed (repair).
6. • * An even more sophisticated compiler may correct the erroneous input by guessing what the user intended (correction).
• However, no compiler can do true correction, because the compiler cannot know the programmer’s intent from erroneous input. Completely accurate error correction can be done only by the programmer.
7. DIAGNOSTIC COMPILER
• A compiler that gives due importance to all three aspects is called a “diagnostic compiler”. It performs a complex analysis task, which may require extra time or memory space.
• E.g.: WATFOR, IITFORT
8. CORRECTING COMPILER
• These compilers perform error recovery not only from the compiler’s point of view but also from the programmer’s point of view, i.e. they generate code that can be executed, which eases the programmer’s work. E.g.: PL/C
• At the same time, error recovery should not lead to misleading or spurious error messages elsewhere (error propagation).
9. • Indication of run-time errors is another neglected area in compiler design, because the code generated to monitor these violations increases the size of the target program and slows its execution.
• These checks are therefore offered as “debugging options” (which also include intermediate display of values and traces of procedure calls), at additional cost in space and time.
10. • Good error reporting plays an important role in the construction of reliable programs.
• PROPERTIES OF GOOD ERROR DIAGNOSIS:
• * The messages should pinpoint the errors in terms of the original source program rather than in some internal representation, which is unknown to the user.
• * The error messages should be tasteful and understandable by the user.
• E.g.: Showing the error as “missing right parenthesis in line 5” rather than as a cryptic code “OH17”.
• * The messages should be specific and should localize the problem.
• E.g.: Showing the error as “Z not declared in procedure add” rather than “missing declaration”.
• * The messages should not be redundant.
• E.g.: If z is not declared, this should be reported once, not every time z appears in the program.
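A minimal sketch of the first three properties, assuming a hypothetical message table that maps internal codes to plain-language text. The code "UD01", the table `MESSAGES` and the function `describe` are invented for illustration; only "OH17" comes from the slide.

```python
# Hypothetical message table: internal code -> plain-language text.
MESSAGES = {
    "OH17": "missing right parenthesis",
    "UD01": "{name} not declared",
}

def describe(code, line, name=None, procedure=None):
    """Turn an internal code into a message tied to the source program."""
    text = MESSAGES[code].format(name=name)
    # Localize the problem: name the enclosing procedure when it is known.
    where = f" in procedure {procedure}" if procedure else ""
    return f"{text}{where} (line {line})"
```

The user sees the source line and the offending name, never the internal code.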
13. SOURCES OF ERROR
• * ALGORITHMIC ERRORS:
The algorithm used to meet the design may be inadequate or incorrect.
• * CODING ERRORS:
The programmer may introduce errors in implementing the algorithms, either by introducing logical errors or by using the programming language constructs improperly.
14. • * The program may exceed a compiler or machine limit not implied by the definition of the programming language.
• E.g.:
• An array may be declared with too many dimensions to fit in the symbol table,
• or an array may be declared too large to be allocated at run time.
• * COMPILER ERRORS:
• The compiler can insert errors as it translates the source program into an object program.
15. • CRITERIA FOR THE CLASSIFICATION OF ERRORS:
• * Compile-time errors
• * Link/load-time errors
• * Run-time errors
16. • CLASSIFICATION OF COMPILE-TIME ERRORS:
• * Lexical errors
• * Syntactic errors
• * Semantic errors
17. • Lexical and syntactic errors are found during compilation of the program.
• Most run-time errors are semantic in nature.
• In compile-link-go systems, compile and link errors are trapped separately by the compiler and the linkage editor/loader.
• In compile-and-go systems, compile and link errors are trapped by the compiler itself.
18. • Execution-time errors are detected by the run-time environment, which includes the run-time control routine, the machine hardware and the standard OS interfaces through which the status of the hardware can be accessed or monitored as and when required.
19. PLAN OF ERROR DETECTION IN A PORTION OF THE COMPILER
• It consists of routines to recover from lexical and syntactic errors, a routine to detect semantic errors and a routine to print the diagnostics.
• The diagnostic routine communicates with the symbol table to avoid printing redundant messages.
• The message printer must defer its diagnostics until after a complete line of source text has been read and listed.
9/3/2012 19
20. ERRORS SEEN BY EACH PHASE
Each phase of the compiler expects its input to conform to a certain specification.
When the input does not, the phase has detected an inconsistency or error, which it should report to the user.
Moreover, in order to continue processing its input, the phase has to recover from each error. Errors are classified as lexical-phase, syntactic-phase or semantic-phase errors depending on which compiler phase detects them.
21. • After detecting and reporting an error, a phase can either repair it or pass it along to the subsequent modules.
• If a phase attempts a repair, it should take care that the repair does not introduce a flurry of other errors.
• If a phase passes the error on, the subsequent phases should be able to deal with the erroneous input they receive.
22. PLAN OF ERROR DETECTOR / CORRECTOR
[Figure: pipeline Source code → Lexical analyzer → Tokens → Parser → Intermediate code → Semantic checker, together with a Lexical corrector, a Syntactic corrector, the Symbol table and the Diagnostic message printer.]
23. LEXICAL AND SYNTAX ERRORS
• Two frequent sources of these errors are:
1. Spelling errors
2. Missing operators and keywords
• These errors can result from genuine oversight or from typing mistakes.
• They are common even among professional programmers.
24. SPELLING ERRORS: WHEN DO THEY OCCUR?
If a program uses variable names that differ in only one or two characters, there is great scope for spelling errors.
Automatic procedures have little chance of detecting and correcting such errors.
Only the programmer can tackle the problem, because only the programmer knows the significance of the variable names.
25. MAJORITY OF SPELLING ERRORS
1. One character is wrong,
2. One character is missing,
3. One character is extra,
4. Two adjacent characters are transposed.
• Testing for these four types of error will not catch all spelling mistakes, but practical considerations limit the search to these four.
• Implementing these four checks is quite expensive, because an associative search has to be performed over all names in the symbol table to locate a resembling name.
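The four tests can be sketched as a single comparison between the typed name and one candidate name. This is an illustrative sketch only. The function name `spelling_error_kind` and the returned labels are assumptions, not part of the slides.

```python
def spelling_error_kind(typed, intended):
    """Return which single-error class turns `intended` into `typed`, or None."""
    if typed == intended:
        return None
    lt, li = len(typed), len(intended)
    if lt == li:
        diffs = [i for i in range(lt) if typed[i] != intended[i]]
        if len(diffs) == 1:
            return "wrong character"
        # Two adjacent positions that hold each other's characters.
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and typed[diffs[0]] == intended[diffs[1]]
                and typed[diffs[1]] == intended[diffs[0]]):
            return "transposed characters"
        return None
    if lt + 1 == li:  # typed is one shorter: a character is missing
        if any(intended[:i] + intended[i + 1:] == typed for i in range(li)):
            return "missing character"
    if lt == li + 1:  # typed is one longer: a character is extra
        if any(typed[:i] + typed[i + 1:] == intended for i in range(lt)):
            return "extra character"
    return None
```

Running this against every name in the symbol table is the expensive associative search the slide mentions.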
26. CORRECTION ALGORITHM
• The search masks off one or more adjacent characters of the symbol and tries to locate a matching symbol in the symbol table.
• The located symbol can be used instead of the erroneous symbol if only one character was masked off.
• If two characters were masked off, a transposition should be checked for.
• If the associative search for an erroneous name matches more than one symbol in the symbol table, their attributes are used to decide the final choice.
• If more than one symbol with matching attributes results, the correction is not safe and should not be attempted.
• This algorithm may fail if an unusual use of names results from a valid use of language facilities.
• So it is necessary to inform the user whenever a correction is made.
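One possible reading of the masking search, as a sketch. The symbol table is assumed to be a plain dictionary from name to a single attribute string. The helper names `mask`, `candidates` and `correct` are hypothetical.

```python
def mask(name, i, width):
    """Return `name` with `width` characters starting at index i removed."""
    return name[:i] + name[i + width:]

def candidates(bad, symbols):
    """Names in `symbols` that match `bad` once characters are masked off."""
    found = set()
    for sym in symbols:
        if sym == bad:
            continue
        if len(sym) == len(bad):
            # One character masked in both names: a wrong character.
            if any(mask(bad, i, 1) == mask(sym, i, 1) for i in range(len(bad))):
                found.add(sym)
            # Two adjacent characters masked: accept only a transposition.
            elif any(mask(bad, i, 2) == mask(sym, i, 2)
                     and bad[i] == sym[i + 1] and bad[i + 1] == sym[i]
                     for i in range(len(bad) - 1)):
                found.add(sym)
        elif len(sym) == len(bad) + 1:  # a character is missing from `bad`
            if any(mask(sym, i, 1) == bad for i in range(len(sym))):
                found.add(sym)
        elif len(sym) + 1 == len(bad):  # `bad` has an extra character
            if any(mask(bad, i, 1) == sym for i in range(len(bad))):
                found.add(sym)
    return found

def correct(bad, symbol_table, expected_kind):
    """symbol_table maps name -> attribute (here just a kind string)."""
    found = candidates(bad, symbol_table)
    if len(found) > 1:  # several matches: let the attributes decide
        found = {s for s in found if symbol_table[s] == expected_kind}
    if len(found) == 1:
        fixed = found.pop()
        print(f"note: '{bad}' replaced by '{fixed}'")  # always inform the user
        return fixed
    return None  # no match, or still ambiguous: correction is not safe
```

With the table `{"count": "int", "court": "array"}`, the name `coumt` matches both entries, and the expected attribute `"int"` selects `count`.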
27. MISSING OPERATORS AND KEYWORDS
• They can be detected from their context.
• Detection is not perfect, because certain contexts tend to hide the absence of an operator.
• E.g.: G=H(A+B) typed instead of G=H*(A+B).
• In this case an error is reported only if H already exists in the symbol table (as a simple variable).
• Otherwise the compiler cannot report an error, since H could later be declared to be a function or an array.
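The G=H(A+B) case can be sketched as a lookup that reports an error only when the symbol table already says the name is a simple variable. The kind strings and the function `check_name_before_paren` are assumptions made for this sketch.

```python
def check_name_before_paren(name, symbol_table):
    """Classify `name(` given what is known about `name` so far.

    symbol_table maps declared names to a kind: "scalar", "array" or "function".
    """
    kind = symbol_table.get(name)
    if kind == "scalar":
        return f"missing operator after {name}"
    # Array, function, or not yet declared: the context hides any missing operator.
    return None
```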
28. DUPLICATE MESSAGES
• Often many messages appear owing to the same error.
• E.g.: If a is used as a simple variable and the program later declares and uses it as an array a[10,10], then every reference to the array will be flagged as an erroneous use of the variable name.
• Such duplicates can be avoided by setting a flag in the symbol table entry of a.
• The flag still allows the compiler to detect and indicate all possible illegal uses of the identifier.
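A sketch of the flag idea, assuming a hypothetical `SymbolEntry` class with a `misuse_reported` field. Every illegal use is still detected, but only the first one produces a message.

```python
class SymbolEntry:
    def __init__(self, name, kind):
        self.name = name
        self.kind = kind              # e.g. "scalar" or "array"
        self.misuse_reported = False  # flag set after the first diagnostic

def check_use(entry, used_as, line, messages):
    """Record at most one message per identifier; return True if the use is illegal."""
    if used_as == entry.kind:
        return False
    if not entry.misuse_reported:
        messages.append(
            f"line {line}: {entry.name} used as {used_as} but declared as {entry.kind}")
        entry.misuse_reported = True
    return True
```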
29. RECOVERING FROM SYNTAX ERRORS
• The chief concern while recovering from a syntax error is to reach a parser state from which the parser can safely resume parsing the input string.
• Many parsers detect an error when they have no legal move from their current configuration, which is determined by the state, the stack contents and the current input symbol.
• To recover from an error, a parser should ideally locate it, correct it and resume parsing.
30. TIME OF DETECTION – VALID PREFIX
PROPERTY
• LL(1) and LR(1) parsers announce an error as soon as they have seen a prefix of the input for which there is no valid continuation.
• This is the earliest time at which a parser that reads its input from left to right can announce an error.
• Advantages: reports errors as soon as possible
• and limits the amount of erroneous output.
31. Panic mode recovery
• The parser discards input symbols until a synchronizing token, usually a statement delimiter such as a semicolon, is found.
• The parser then deletes stack entries until it finds an entry from which it can continue parsing, given the synchronizing token on the input.
• I.e. skip until we encounter a symbol that tells us what the parser state should be in order to recognize it.
• Advantages: simple to implement
• and never enters an infinite loop.
32. On discovering an error
the parser discards input symbols
one at a time
until one of a designated set of synchronizing tokens is found
◦ delimiters such as ; or }
◦ have a clear and unambiguous role
◦ must be selected by the compiler designer
skips a considerable amount of input
no checking for additional errors
simple
guaranteed not to go into an infinite loop
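A minimal sketch of panic mode on a toy language whose only statement form is NAME = NUMBER ; (an assumption made for this example). The set `SYNC` and the function `parse_statements` are hypothetical. The parser here has no real stack, so only the input-skipping half of the method is shown.

```python
SYNC = {";", "}"}  # assumed synchronizing tokens

def parse_statements(tokens):
    """Parse statements of the toy form  NAME = NUMBER ;  and return (ok, errors)."""
    pos, ok, errors = 0, [], []
    while pos < len(tokens):
        stmt = tokens[pos:pos + 4]
        if (len(stmt) == 4 and stmt[0].isidentifier() and stmt[1] == "="
                and stmt[2].isdigit() and stmt[3] == ";"):
            ok.append((stmt[0], int(stmt[2])))
            pos += 4
            continue
        errors.append(f"syntax error near token {pos}")
        # Panic mode: discard tokens one at a time up to a synchronizing token.
        while pos < len(tokens) and tokens[pos] not in SYNC:
            pos += 1
        pos += 1  # consume the synchronizing token itself; guarantees progress
    return ok, errors
```

Because every error path advances `pos` by at least one token, the loop cannot run forever. The cost is that the skipped tokens are never checked for further errors.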
33. • Three basic policies for recovering from a syntax error:
• 1. Deletion of a source symbol
• 2. Insertion of a synthetic symbol
• 3. Replacement of a source symbol
34. • The motive behind all these actions is to present a new string to the parser that bypasses the error situation and lets parsing continue.
• Multiple recovery possibilities may exist.
• We should choose the one that requires the smallest number of changes (minimum distance recovery).
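Minimum distance recovery can be sketched as a breadth-first search over single-token deletions, insertions and replacements. The check `balanced` is only a stand-in for a real parser (it accepts any token list with balanced parentheses). All names here are assumptions, and a production compiler would use something far cheaper than exhaustive search.

```python
def balanced(tokens):
    """Toy validity check standing in for the parser."""
    depth = 0
    for t in tokens:
        if t == "(":
            depth += 1
        elif t == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def single_edits(tokens, alphabet):
    """Yield (description, new_tokens) for every one-token change."""
    for i in range(len(tokens)):
        yield f"delete {tokens[i]!r} at {i}", tokens[:i] + tokens[i + 1:]
    for i in range(len(tokens) + 1):
        for s in alphabet:
            yield f"insert {s!r} at {i}", tokens[:i] + [s] + tokens[i:]
    for i in range(len(tokens)):
        for s in alphabet:
            if s != tokens[i]:
                yield (f"replace {tokens[i]!r} at {i} by {s!r}",
                       tokens[:i] + [s] + tokens[i + 1:])

def minimum_distance_repair(tokens, is_valid, alphabet, max_changes=2):
    """Breadth-first search: the first valid string found needs the fewest changes."""
    frontier = [([], list(tokens))]
    for _ in range(max_changes + 1):
        next_frontier = []
        for edits, toks in frontier:
            if is_valid(toks):
                return edits, toks
            for desc, new in single_edits(toks, alphabet):
                next_frontier.append((edits + [desc], new))
        frontier = next_frontier
    return None  # no repair within max_changes
```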
35. RECOVERY IN TOP DOWN PARSING
• There are two methods.
• The first is to try to successfully complete the predictions existing in the stack at the error point.
Ex: Input string: Aα
• Last prediction was W:…ABν
• If no other rules exist with A on the right hand side, then recovery can be effected by inserting B and deleting parts of α until a ν is recognized in the source string.
• The other is to unstack symbols from the parser stack until the TOS (top of stack) symbol can produce one of the synchronizing symbols.
• We can then skip input until we find a synchronizing symbol in the input.
36. RECOVERY IN BOTTOM UP PARSING
• In bottom-up parsing, insertion of symbols is preferred to deletion,
• because it is easy to determine what symbol is to be inserted. Routines may be devised to carry out the specific recovery action.
• Replacing or deleting the next few source symbols is also done.
37. OPERATOR PRECEDENCE PARSING
An operator precedence parser uses a set of production rules and an operator precedence table to parse an arithmetic expression.
E→E+E
|E–E
|E*E
|E/E
|E^E
|(E)
|-E
| id
38. ERROR RECOVERY IN OPERATOR PRECEDENCE PARSING
• There are two types of operator precedence parsing errors:
character pair errors
reducibility errors
• A character pair error occurs when no operator precedence relation holds between a pair of symbols (the terminal on top of the stack and the current input symbol).
• A reducibility error occurs when the handle cannot be reduced to the left hand side of any production.
39. CHARACTER PAIR ERROR RECOVERY
Fill each empty entry of the precedence table with a pointer to an error routine.
Example:
E1: ‘missing operand’ (the whole expression is missing)
E2: ‘unbalanced right parenthesis’
E3: ‘missing operator’
E4: ‘missing right parenthesis’
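As an illustration, here is a precedence table for a deliberately tiny terminal set (id, parentheses and the end marker $) in which the otherwise blank entries hold error routines. The table contents and the function `relation` are assumptions made for this sketch. The routines only return their message and perform no repair of the stack or input.

```python
def e1(): return "missing operand"
def e2(): return "unbalanced right parenthesis"
def e3(): return "missing operator"
def e4(): return "missing right parenthesis"

# Rows: terminal on top of the stack; columns: current input symbol.
# Entries that would otherwise be blank hold an error routine instead.
TABLE = {
    "id": {"id": e3,  "(": e3,  ")": ">", "$": ">"},
    "(":  {"id": "<", "(": "<", ")": "=", "$": e4},
    ")":  {"id": e3,  "(": e3,  ")": ">", "$": ">"},
    "$":  {"id": "<", "(": "<", ")": e2,  "$": e1},
}

def relation(top, lookahead):
    """Look up the precedence relation, calling the error routine if there is none."""
    entry = TABLE[top][lookahead]
    if callable(entry):
        return ("error", entry())
    return ("ok", entry)
```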
40. REDUCIBILITY ERROR RECOVERY
• The parser decides which right hand side the popped handle “looks like” and tries to recover from that situation.
• This is similar to the handling of shift-reduce errors.
41. HANDLING SHIFT-REDUCE ERRORS
• Generic shift-reduce strategy:
– If there is a handle on top of the stack, reduce
– Otherwise, shift
• But what if there is a choice?
– If it is legal to shift or reduce, there is a shift-reduce
conflict
– If it is legal to reduce by two different
productions, there is a reduce-reduce conflict
42. HANDLING SHIFT-REDUCE ERRORS
• Ambiguous grammars always cause conflicts
• But beware, so do many non-ambiguous grammars
• To resolve such conflicts, the grammar should be modified.
43. SEMANTIC ERRORS
• Can be both local and global in scope.
• Types
– Immediate errors
• Can be detected while processing the erroneous
statement itself.
– Delayed errors
• Can’t be detected while processing the statement.
• But can be detected at a later stage when its effect is
felt.
44. EXAMPLES FOR SEMANTIC ERRORS
• Illegal Operator or Operand (immediate)
• Control Structure Violation (both)
• Missing Labels (delayed at the end of the
program)
• Duplicate Labels
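A sketch of how label checking separates immediate from delayed detection. The statement encoding (line, kind, label) and the function `check_labels` are hypothetical. Treating duplicate labels as immediate is this sketch's own assumption, since the slide does not classify them.

```python
def check_labels(statements):
    """statements: list of (line, kind, label) with kind 'label' or 'goto'."""
    defined, used, errors = {}, [], []
    for line, kind, label in statements:
        if kind == "label":
            if label in defined:
                # Immediate: visible while processing this statement.
                errors.append(
                    f"line {line}: duplicate label {label} "
                    f"(first defined at line {defined[label]})")
            else:
                defined[label] = line
        elif kind == "goto":
            used.append((line, label))
    # Delayed: a forward reference may still be satisfied later,
    # so missing labels can only be reported at the end of the program.
    for line, label in used:
        if label not in defined:
            errors.append(f"line {line}: missing label {label}")
    return errors
```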
45. THE ERROR PRINT ROUTINE
• Messages have to be displayed for all errors that are detected, or detected and corrected, in the source program.
• The error print routine is the common agency used by all individual compiler routines for this purpose.
• The text of each error message is normally stored in a table local to this routine.
• Associated with each message is a numerical value indicating its error severity.
• This value is mainly used for purposes internal to the compiler’s operation (such as deciding whether or not to allow the program to reach the execution stage).
46. Error severity levels:
1 Warning and correction. Compilation continues and the compiled program will execute.
2 Warning only. Compilation continues and the compiled program will execute.
3 Fatal error. Compilation continues but the compiled program will not execute.
4 Compiler error. Compilation is terminated.
47. • For each individual error, two items of information need to be passed to this routine:
• the error number and the statement number.
• The structure and logic of the routine depend largely on the decisions regarding the place where the message is to be printed.
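The error print routine described on the last three slides might look roughly like this. The message numbers and texts in `MESSAGE_TABLE` are invented for illustration. Only the four severity levels and the two parameters (error number, statement number) come from the slides.

```python
# Hypothetical message table: error number -> (severity, text).
# Severity: 1 warning and correction, 2 warning only, 3 fatal error, 4 compiler error.
MESSAGE_TABLE = {
    101: (1, "misspelled identifier corrected"),
    205: (2, "variable declared but never used"),
    310: (3, "undeclared identifier"),
    900: (4, "internal compiler error"),
}

class ErrorPrinter:
    def __init__(self):
        self.worst = 0   # highest severity seen so far
        self.lines = []  # collected message lines

    def report(self, error_number, statement_number):
        """Record one error; return False if compilation must terminate."""
        severity, text = MESSAGE_TABLE[error_number]
        self.worst = max(self.worst, severity)
        self.lines.append(
            f"statement {statement_number}: error {error_number} "
            f"(severity {severity}): {text}")
        return severity < 4

    def may_execute(self):
        """Only warnings (severity 1 or 2) let the program reach execution."""
        return self.worst <= 2
```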
48. Desirable place for printing error messages
• The messages are best printed against the erroneous statement itself.
• Single-pass compilers find it difficult to indicate all errors against the offending statement.
• Multipass compilers can provide such error indication.
49. • Many Fortran compilers indicate errors on a line-by-line basis as far as possible, since syntax analysis and output listing are both performed in the same pass, normally the first.
• Some compilers group all error messages at the end of the program.
• This has the advantage that the problem of duplicate messages for similar misuse of an identifier can be satisfactorily solved.
50. • The compiler error table has the form:
Error number (message identifier)    Erroneous statement    Auxiliary information
Message text
51. Runtime errors
• Runtime errors are detected by:
1. The run-time control routine, which is interfaced with the generated code in a standard manner
2. The machine hardware
3. Operating system interfaces to I/O
52. • The agency required to detect a particular type of error depends on the nature of the error and in general varies from machine to machine and compiler to compiler.
53. Detection of runtime errors
• Arithmetic exceptions
Arise from violations of the semantics of machine computations.
Include frequently occurring error conditions such as overflow, underflow and loss of precision.
Present-day architectures detect most of these conditions at the machine hardware level and indicate their presence through interrupts or traps.
54. • Input/output errors
Device error conditions and end-of-file conditions on an input file are usually detected by the IOCS (input/output control system), which sets appropriate flags to indicate their occurrence.
The run-time control routine should make appropriate provisions to obtain control when such conditions arise.
Ex: Fortran statement
read(5,100,err=110,end=120) A,B
The generated code transfers control to the statement labelled 120 on end of file and to 110 on error.
55. • Dimension overruns
Overall array bound check
Individual subscript bound check
WATFOR compilers perform these checks.
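The difference between the two checks can be sketched as follows, assuming 1-based subscripts and row-major storage (both are assumptions made for this example; the function `checked_offset` is hypothetical). The overall check only verifies that the computed offset lies inside the array, so it can miss a bad subscript that still lands in bounds.

```python
def checked_offset(subscripts, bounds, individual=True):
    """Row-major offset of a 1-based element, with a run-time bound check.

    bounds is the declared extent of each dimension, e.g. (10, 10).
    """
    if len(subscripts) != len(bounds):
        raise IndexError("wrong number of subscripts")
    if individual:
        # Individual check: every subscript must lie within its own dimension.
        for pos, (s, n) in enumerate(zip(subscripts, bounds), start=1):
            if not 1 <= s <= n:
                raise IndexError(f"subscript {pos} = {s} outside 1..{n}")
    offset = 0
    for s, n in zip(subscripts, bounds):
        offset = offset * n + (s - 1)
    size = 1
    for n in bounds:
        size *= n
    # Overall check: the final offset must fall inside the array storage.
    if not 0 <= offset < size:
        raise IndexError(f"offset {offset} outside array of {size} elements")
    return offset
```

For a 10 by 10 array, the reference (1, 11) passes the overall check, because it silently aliases element (2, 1), but it fails the individual check.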
56. Programmer Recovery Options
• The difference between compile-time and run-time errors lies in the type of recovery possible and its implications for the programmer.
• Syntax errors can be patched up in a standard manner in order to extend the life of the program and push it to execution.
• The same holds for run-time errors, but here the programmer can foresee the run-time errors and provide for their correction.
• A standard recovery action may not suit the programmer.
• Some languages, such as PL/I, provide such options.
• Whenever an exception occurs, the run-time control routine has to decide what action to take.
• For this purpose it maintains a run-time exception table.
57. • Ex.
ON SUBSCRIPTRANGE I = 5;
ON OVERFLOW I = 25;
……………………………
I = A(J,K-4)/X;
……………………………
Type of Exception    Program Action    System Action
Overflow             I = 25            Make the standard assumption regarding the resulting value. Return.
Subscript Range      I = 5             Cancel the program.
58. • The compiler generates code for inserting and deleting entries in the program action fields, depending on the scope of the program-indicated recovery actions.
Type of Exception    Scope       Recovery action stack pointer
Overflow             --------    -------------
Subscript range      --------    -------------
• The Scope column indicates where the scope of the programmer-indicated recovery action ends.
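The run-time exception table can be sketched as a stack of program actions per exception type, with the system action as the fallback. The class `ExceptionTable`, its method names and the use of a dictionary `env` for program variables are all assumptions made for this sketch, not PL/I semantics in detail.

```python
class ExceptionTable:
    """Run-time exception table: a stack of program actions per exception type."""

    def __init__(self, system_actions):
        self.system_actions = system_actions  # type -> default (system) action
        self.program_actions = {t: [] for t in system_actions}

    def enter_on_unit(self, exc_type, action):
        # The scope of a programmer-indicated recovery action begins.
        self.program_actions[exc_type].append(action)

    def leave_on_unit(self, exc_type):
        # The scope ends: the previous action (if any) becomes current again.
        self.program_actions[exc_type].pop()

    def raise_condition(self, exc_type, env):
        """Run the innermost program action, or the system action if none is active."""
        stack = self.program_actions[exc_type]
        action = stack[-1] if stack else self.system_actions[exc_type]
        return action(env)
```

The calls to `enter_on_unit` and `leave_on_unit` correspond to the code the compiler generates at the start and end of each scope.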
59. Debugging aids and options
• Run-time checks are costly in terms of code space and execution time.
• These checks are therefore provided as debugging options.
Trace and sub-traces
• Procedure calls are printed out at the user’s option to indicate the flow of control.
• The trace is written into special debugging files.
• The debugging file contains output for the selected statements and variables.
Assignment checks
• Assignments to a variable are monitored by the system.
• Whenever a new value is assigned to the variable, that value is output.
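A small sketch of both aids in Python terms. The list `TRACE` stands in for the special debugging file, and the names `traced` and `Watched` are hypothetical. A real compiler would emit equivalent monitoring code itself when the debugging option is switched on.

```python
import functools

TRACE = []  # stands in for the special debugging file

def traced(func):
    """Record every call to `func` so the flow of control can be inspected."""
    @functools.wraps(func)
    def wrapper(*args):
        TRACE.append(f"call {func.__name__}{args}")
        result = func(*args)
        TRACE.append(f"return {func.__name__} -> {result!r}")
        return result
    return wrapper

class Watched:
    """A variable whose assignments are monitored."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def assign(self, value):
        # Assignment check: output each new value as it is stored.
        TRACE.append(f"{self.name} := {value!r}")
        self.value = value
```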
60. Intermediate and error dumps
• Intermediate dumps can be produced during execution.
• A dump may also be produced on abnormal program termination.
Conversational debugging
• Facilities are provided through which the programmer can set breakpoints in the program.
• When the program reaches a breakpoint, a conversation is initiated with the programmer.
E.g.: assigning values and displaying values.