
Syntax error handling

Errors can occur at many levels

lexical: unknown operator
syntactic: unbalanced parentheses
semantic: variable never declared
runtime: dereferencing a NULL pointer

Goals of error-handling in a parser


To detect and report the presence of errors
To recover from an error and detect subsequent errors
To not slow down the processing of correct programs

Error recovery strategies


Panic mode recovery
On discovering an error, discard input symbols
one at a time until one of a designated set of
synchronizing tokens is found.

Phrase-level recovery
On discovering an error, perform a local fix to
allow the parser to continue.
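
The skipping step of panic mode recovery can be sketched in a few lines of C. This is only an illustration under stated assumptions: the token codes, the array-based token stream, and the names is_sync and skip_to_sync are hypothetical and do not come from any parser generator.

```c
#include <stddef.h>

/* Hypothetical token codes, chosen only for this sketch. */
enum { TOK_EOF = 0, TOK_SEMI = 101, TOK_RBRACE = 102, TOK_ID = 103, TOK_PLUS = 104 };

/* Returns 1 if tok is one of the designated synchronizing tokens. */
int is_sync(int tok, const int *sync, size_t nsync)
{
    for (size_t i = 0; i < nsync; i++)
        if (sync[i] == tok)
            return 1;
    return 0;
}

/*
 * Panic-mode skip: starting at index pos, discard tokens one at a time
 * until a synchronizing token or the end of the input is reached.
 * Returns the index at which parsing can resume.
 */
size_t skip_to_sync(const int *toks, size_t ntoks, size_t pos,
                    const int *sync, size_t nsync)
{
    while (pos < ntoks && toks[pos] != TOK_EOF &&
           !is_sync(toks[pos], sync, nsync))
        pos++;
    return pos;
}
```

A typical choice of synchronizing tokens is a statement terminator such as a semicolon or a closing brace, so that one bad statement does not hide errors in the statements after it.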

Error recovery in predictive parsing


Recovery in a non-recursive predictive parser is
easier than in a recursive descent parser.
Panic mode recovery
If a terminal is on top of the stack and does not match the input, pop the terminal.
If a non-terminal is on top of the stack, skip input symbols until the
non-terminal can be expanded.

Phrase-level recovery
Carefully fill in the blank entries of the parsing table with
what to do in each error case.
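
The panic mode rules for a table-driven predictive parser can be tried out on a toy grammar. The sketch below is an assumption-laden illustration, not a reference implementation: the grammar, the single-character symbol encoding ('i' for an identifier, 'R' standing in for E'), and the function names table and parse are all made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Toy grammar (hypothetical, for this sketch only):
 *   E -> T R        R -> + T R | epsilon        T -> i | ( E )
 * Input symbols: 'i', '+', '(', ')', and the end marker '$'. */

/* Parsing table: the right-hand side to push for (nonterminal, lookahead),
 * "" for an epsilon production, NULL for a blank (error) entry. */
const char *table(char nt, char a)
{
    switch (nt) {
    case 'E': return (a == 'i' || a == '(') ? "TR" : NULL;
    case 'R': if (a == '+') return "+TR";
              return (a == ')' || a == '$') ? "" : NULL;
    case 'T': if (a == 'i') return "i";
              return (a == '(') ? "(E)" : NULL;
    }
    return NULL;
}

int is_nonterminal(char x) { return x == 'E' || x == 'R' || x == 'T'; }

/* Parses input, which must end in '$'. Returns the number of errors
 * detected, or -1 if the fixed-size stack would overflow. */
int parse(const char *input)
{
    char stack[128];
    int top = 0, errors = 0;
    size_t pos = 0;

    stack[top++] = '$';
    stack[top++] = 'E';
    while (top > 0) {
        char x = stack[top - 1], a = input[pos];
        if (!is_nonterminal(x)) {
            if (x == a)
                pos++;          /* terminal matches: consume the input symbol */
            else
                errors++;       /* mismatch: report, then just pop the terminal */
            top--;
        } else {
            const char *rhs = table(x, a);
            if (rhs == NULL) {
                errors++;
                /* Skip input until the nonterminal can be expanded. */
                while (rhs == NULL && a != '$' && a != '\0') {
                    a = input[++pos];
                    rhs = table(x, a);
                }
            }
            top--;              /* pop the nonterminal */
            if (rhs != NULL) {
                if (top + 3 > (int)sizeof stack)
                    return -1;
                /* Push the right-hand side in reverse order. */
                for (size_t n = strlen(rhs); n > 0; n--)
                    stack[top++] = rhs[n - 1];
            }
        }
    }
    return errors;
}
```

With this table, parse("i+i$") reports no errors, while parse("i++i$") reports one error for the extra '+' and then continues with the rest of the input.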

Error recovery in LR parsing


Canonical LR parsers never make extra reductions
when recognizing an error.
SLR and LALR may make extra reductions, but will
never shift an erroneous input symbol on the stack.
Panic mode recovery
Scan down the stack until a state representing a major
program construct (a state with a goto on some nonterminal) is found.
Input symbols are discarded until one is found that is in the
FOLLOW set of that nonterminal.
This tries to isolate the phrase containing the error.

Phrase-level recovery
Implement an error recovery routine for each error entry
in the table.

Writing a parser with YACC (Yet Another Compiler Compiler).
Generates LALR parsers
Works with lex. YACC calls yylex to get the next token.
YACC and lex must agree on the values for each token.

Running yacc yaccfile produces the file y.tab.c, which contains a
routine yyparse().
yyparse() returns 0 if the program is ok, non-zero otherwise
YACC file format:
declarations
%%
translation rules
%%
supporting C-routines

The declarations part specifies tokens, non-terminal
symbols, and other C constructs.
To specify tokens AAA and BBB
%token AAA BBB

To assign a token number to a token (needed when using lex), place a
nonnegative integer immediately after the first appearance
of the token
%token EOFnumber 0
%token SEMInumber 101

Non-terminals do not need to be declared unless you want to
associate them with a type (will be discussed later).

Translation rules specify the grammar productions

exp : exp PLUSnumber exp
| exp MINUSnumber exp
| exp TIMESnumber exp
| exp DIVIDEnumber exp
| LPARENnumber exp RPARENnumber
| ICONSTnumber
;
exp : exp PLUSnumber exp
;
exp : exp MINUSnumber exp
;

Yacc environment
Yacc processes the specification file and produces a y.tab.c file.
An integer function yyparse() is produced by Yacc.
Calls yylex() to get tokens.
Returns non-zero when an error is found.
Returns 0 if the program is accepted.

Need main() and yyerror() functions.


Example:
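#include <stdio.h>
extern int yyline;   /* current line number, assumed to be maintained by the lexer */
int yyparse(void);   /* generated by yacc in y.tab.c */
/* Called by yyparse() when a syntax error is found. */
void yyerror(const char *str)
{
    printf("yyerror: %s at line %d\n", str, yyline);
}
int main(void)
{
    /* yyparse() returns 0 on success, non-zero on error. */
    if (!yyparse()) printf("accept\n");
    else printf("reject\n");
    return 0;
}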

YACC builds a LALR parser for the grammar.
May have shift/reduce and reduce/reduce conflicts if there are
problems with the grammar.
Default conflict resolution:
shift/reduce --> shift
reduce/reduce --> reduce by the production listed first in the grammar
Reduce/reduce conflicts should always be avoided.
yacc -v *.y will generate a report in file y.output.
See example1.y
The programmer MUST resolve all conflicts (unless you really
know what you are doing).
Modify the grammar. See example2.y
Use precedence and associativity of operators.

Use precedence and associativity of operators.
Using keywords %left, %right, %nonassoc in
the declarations section.
All tokens on the same line have the same precedence
level and associativity.
The lines are listed in order of increasing
precedence.
%left PLUSnumber, MINUSnumber
%left TIMESnumber, DIVIDEnumber

See example3.y
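
To see what the two %left lines mean for evaluation order, the small C evaluator below uses precedence climbing. This is a different technique from the LALR tables yacc builds, swapped in here because it fits in a few lines; the function names and the character-based input are assumptions made for the sketch.

```c
#include <ctype.h>

/* Precedence levels mirroring the two %left lines: '+' and '-' share the
 * lower level, '*' and '/' the higher one. 0 means "not a binary operator". */
int prec_of(char op)
{
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;
    }
}

int parse_expr(const char **s, int min_prec);

/* Parses an integer constant or a parenthesized expression. */
int parse_primary(const char **s)
{
    while (**s == ' ') (*s)++;
    if (**s == '(') {
        (*s)++;
        int inner = parse_expr(s, 1);
        while (**s == ' ') (*s)++;
        if (**s == ')') (*s)++;
        return inner;
    }
    int v = 0;
    while (isdigit((unsigned char)**s))
        v = v * 10 + (*(*s)++ - '0');
    return v;
}

/* Parses and evaluates operators whose precedence is at least min_prec. */
int parse_expr(const char **s, int min_prec)
{
    int lhs = parse_primary(s);
    for (;;) {
        while (**s == ' ') (*s)++;
        char op = **s;
        int p = prec_of(op);
        if (p == 0 || p < min_prec)
            return lhs;
        (*s)++;
        /* Left associativity: the right operand may only contain
         * operators of strictly higher precedence. */
        int rhs = parse_expr(s, p + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs = rhs ? lhs / rhs : 0; break; /* sketch: x/0 yields 0 */
        }
    }
}

int eval(const char *text) { return parse_expr(&text, 1); }
```

Because TIMESnumber and DIVIDEnumber are declared on the later line, 2+3*4 groups as 2+(3*4); because both lines are %left, 8-3-2 groups as (8-3)-2.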

Symbol attributes
Each symbol can be associated with some
attributes.
Data structure of the attributes can be specified in the union in
the declarations. (see example4.y).
%union {
int semantic_value;
}
%token <semantic_value> ICONSTnumber
%type <semantic_value> exp
%type <semantic_value> term
%type <semantic_value> item
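
On the lexer side, a token's attribute is passed to the parser through the global yylval, whose type is built from the %union declaration. The hand-written yylex below is a minimal sketch of that handshake. The token values other than EOFnumber, the ERRORnumber token, and the input_cursor variable are hypothetical; in a real build the token values and the union type come from the yacc-generated files, and yylval is defined in y.tab.c rather than here.

```c
#include <ctype.h>

/* Token numbers: every value except EOFnumber is hypothetical. */
enum {
    EOFnumber = 0, ICONSTnumber = 102, PLUSnumber = 103, MINUSnumber = 104,
    TIMESnumber = 105, DIVIDEnumber = 106, LPARENnumber = 107,
    RPARENnumber = 108, ERRORnumber = 109 /* hypothetical catch-all */
};

/* Roughly what the %union declaration turns into. */
typedef union {
    int semantic_value;
} YYSTYPE;

YYSTYPE yylval;            /* attribute of the token just returned */
const char *input_cursor;  /* hypothetical input source for this sketch */

/* Returns the next token number; for ICONSTnumber the value is left
 * in yylval.semantic_value, matching %token <semantic_value>. */
int yylex(void)
{
    while (*input_cursor == ' ')
        input_cursor++;
    if (*input_cursor == '\0')
        return EOFnumber;
    if (isdigit((unsigned char)*input_cursor)) {
        int v = 0;
        while (isdigit((unsigned char)*input_cursor))
            v = v * 10 + (*input_cursor++ - '0');
        yylval.semantic_value = v;
        return ICONSTnumber;
    }
    switch (*input_cursor++) {
    case '+': return PLUSnumber;
    case '-': return MINUSnumber;
    case '*': return TIMESnumber;
    case '/': return DIVIDEnumber;
    case '(': return LPARENnumber;
    case ')': return RPARENnumber;
    default:  return ERRORnumber;
    }
}
```

The tag in angle brackets names the union member, so a rule that reads $1 for an ICONSTnumber sees the integer stored by this function.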


Semantic actions
Semantic actions associated with productions can be
specified.
item : LPARENnumber exp RPARENnumber
{$$ = $2;}
| ICONSTnumber
{$$ = $1;}
;
$$ is the attribute associated with the left-hand side of the
production
$1 is the attribute associated with the first symbol in the
right-hand side, $2 with the second symbol, and so on.
An action can appear anywhere in the production; it is also
counted as a symbol.
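
One way to picture $$ and $n is as positions on a value stack that the parser keeps alongside its state stack. The C sketch below mimics the two item reductions by hand; the stack layout and all names in it are assumptions for illustration, and the code yacc actually generates differs in detail. Bounds checks are left out to keep it short.

```c
/* A minimal value stack; all names are hypothetical. */
#define STACK_MAX 64
int val_stack[STACK_MAX];
int val_top = 0;            /* number of values currently on the stack */

void push_value(int v) { val_stack[val_top++] = v; }

/* Reduce by  item : LPARENnumber exp RPARENnumber  { $$ = $2; }
 * The three right-hand-side values are on top of the stack, $1 deepest. */
void reduce_paren_item(void)
{
    int *rhs = &val_stack[val_top - 3]; /* rhs[0] is $1, rhs[1] is $2, rhs[2] is $3 */
    int result = rhs[1];                /* $$ = $2 */
    val_top -= 3;                       /* pop the right-hand side */
    push_value(result);                 /* push the left-hand side's value */
}

/* Reduce by  item : ICONSTnumber  { $$ = $1; } */
void reduce_const_item(void)
{
    int result = val_stack[val_top - 1]; /* $$ = $1 */
    val_top -= 1;
    push_value(result);
}
```

After shifting "( 42 )", with the parentheses carrying dummy values, reduce_paren_item leaves a single 42 on the stack as the attribute of item.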

Check out example5.y for examples with multiple
types associated with different symbols.
