Principles of Bottom-Up Parsing


9th lecture, June 14, 2019

Language Theory and Compilers

Martin J. Dürst


© 2005-19 Martin J. Dürst, Aoyama Gakuin University


Today's Schedule


Summary of Last Lecture


Last Week's Homework 1



Last Week's Homework 2



Problems with Top-Down Parsing


Additional Requirements for Grammars

Depending on parsing method, additional requirements become necessary:

⇒ A grammar that just produces/recognizes 'words' is not enough


Limits of Top-Down Parsing

⇒ Bottom-up parsing


Top-Down and Bottom-Up Parsing

Parsing direction   Top-Down                                 Bottom-Up
Start               top (start symbol)                       bottom (terminal symbols)
Parse tree(s)       single tree, some branches incomplete    multiple small trees
General method      backtracking                             dynamic programming (CYK algorithm)
Practical method    recursive descent                        LR parsing
Grammar type        (E)BNF (incl. repetitions)               no explicit repetitions
Tools               usually by hand; ANTLR,...               yacc, bison,...
Main problem        left recursion                           shift/... conflicts


bison Overview


Exercise: A Simple Pocket Calculator

Files to start: makefile, calc.y, calc.lex


Use of make


Example of bison Input Format

%{
#include <stdio.h>
#define YYSTYPE double
int yylex (void);
void yyerror (char const *);
%}
%token NUM PLUS
%%
statement: exp { printf ("Result is %g\n", $1); }
;
exp: exp PLUS exp { $$ = $1 + $3; }
   | NUM { $$ = $1; }
;
%%
int main (void)
{ return yyparse (); }

void yyerror (char const *s)
{ fprintf (stderr, "%s\n", s); }


Structure of bison Input Format

Declarations etc. for bison
%{ C declarations etc. %}
%%
Rewriting rule { statements (C) }
Rewriting rule { statements (C) }
Rewriting rule { statements (C) }
%%
Functions etc. (C)
Functions etc. (C)


Details of bison Input Format

Mixture of bison-specific directives and C program fragments

There are three main parts, separated by %%:

  1. Preparation/settings:
    Settings for using bison
    C #include statements, definition/initialization of global variables,...
    C parts have to be surrounded by %{ ... %}
  2. Rewriting rules and program fragments (in { ... }) that get executed when a rule is matched
    (the first nonterminal symbol is the start symbol)
  3. Rest of C program (functions,...)

Newlines and indentation can be significant


Details of bison Rewriting Rules

exp: exp PLUS exp { $$ = $1 + $3; }
   | NUM { $$ = $1; }


Attribute(d) Grammar


bison Manual


Hints for Developing bison Programs



Deadline: June 20, 2019 (Thursday), 19:00

Where to submit: Box in front of room O-529 (building O, 5th floor)

Format: A4 double-sided printout of parser program. Stapled in upper left if more than one page, no cover page, no wrapping lines, legible font size, non-proportional font, portrait (not landscape), formatted (indents,...) for easy visibility, name (kanji and kana) and student number as a comment at the top right

Collaboration: The same rules as for Computer Practice I (計算機実習 I) apply

Complete calc.y so that it can be used as a calculator for the four basic arithmetic operations (+, -, *, /), including prefix minus for negative numbers and parentheses for grouping. See the example input, and test.check for example output. Use the grammar to define priorities and associativities (do NOT use %left, %right,...); submit calc.y only.

Hint: If there are differences with newlines, make sure that all files use Unix line ending convention
(In Notepad2, choose File → Line Endings → Unix (LF))



tabulator (tab character)
attribute(d) grammar