Principles of top-down parsing

(Implementation of Top-Down Parsing)

8th lecture, June 8, 2018

Language Theory and Compilers

http://www.sw.it.aoyama.ac.jp/2018/Compiler/lecture8.html

Martin J. Dürst

AGU

© 2005-18 Martin J. Dürst, Aoyama Gakuin University

Today's Schedule

 

Remainders from Last Lecture

 

Summary of Last Lecture

 

Last Week's Homework 1

(In the problems below, n, +, -, *, and / are terminal symbols, and any other letters are non-terminal symbols. n denotes an arbitrary number, and the other symbols denote the four basic arithmetic operations.)

For the three grammars below, construct all the possible parse trees for words of length 5. Find the grammar that allows all and only those parse trees that lead to correct results.

  1. E → n | E - E
  2. E → n | n - E
  3. E → n | E - n

(Removed due to circumstances)

 

Last Week's Homework 2

Same as in problem 1 for the four grammars below.

  1. E → n | E + E | E * E
  2. E → n | E + n | E * n
  3. E → T | T + T; T → n | n * n
  4. E → T | T * T; T → n | n + n

(Removed due to circumstances)

 

Last Week's Homework 3

(Bonus problem) Based on the knowledge obtained when solving problems 1 and 2, create a grammar with which expressions using the four arithmetic operations (without parentheses) are calculated correctly. Check this grammar with expressions of length 5.

(Removed due to circumstances)

 

About Ambiguous Grammars

 

General Top-Down Parsing

 

Main Points of Backtracking

Backtracking tries all possible pathways (similar to finding the exit of a labyrinth without a map)
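As a rough illustration of "trying a pathway and going back", here is a minimal sketch of a recognizer for the grammar E → n '-' E | n. It is not taken from the lecture's program files: all names (bt_input, bt_pos, bt_match, bt_parse_E, bt_accepts) are hypothetical, and tokens are assumed to be single characters. Each alternative is tried in order, and the input position is restored when an alternative turns out to be a dead end.

```c
#include <stddef.h>

/* Hypothetical recognizer for  E -> n '-' E | n  (single-character tokens). */
const char *bt_input;   /* the word being parsed */
size_t bt_pos;          /* index of the next unread token */

/* Match one terminal symbol; advance only on success. */
int bt_match(char terminal)
{
    if (bt_input[bt_pos] == terminal) {
        bt_pos++;
        return 1;
    }
    return 0;
}

int bt_parse_E(void)
{
    size_t saved = bt_pos;          /* remember where this attempt started */

    /* First alternative: E -> n '-' E */
    if (bt_match('n') && bt_match('-') && bt_parse_E())
        return 1;

    bt_pos = saved;                 /* dead end: go back (backtrack) */

    /* Second alternative: E -> n */
    if (bt_match('n'))
        return 1;

    bt_pos = saved;                 /* no alternative fits here */
    return 0;
}

/* A word is accepted only if E matches and no input is left over. */
int bt_accepts(const char *word)
{
    bt_input = word;
    bt_pos = 0;
    return bt_parse_E() && bt_input[bt_pos] == '\0';
}
```

Calling bt_accepts("n-n-n") succeeds, while bt_accepts("n-") fails. Note that this sketch only backtracks within a single rule, which is enough for this small grammar; a general backtracking parser would also have to revisit choices made earlier when a later part of the input fails.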

Backtracking may be very slow, but this can be improved:

 

Recursive Descent Parsing

 

Recursive Descent Parsing: Simple Hand-Written Parser

Program files: scanner.h, scanner.c, parser1.c

How to compile: gcc scanner.c parser1.c && ./a

 

Details of Recursive Descent Parsing: Lexical Analysis

(see scanner.c)

 

Details of Recursive Descent Parsing: Parsing

(see parser1.c)

 

Details of Recursive Descent Parsing: Non-Terminal Symbols

 

How to Deal with Left Recursion

Example of left recursion:

E → E '-' integer | integer

Wrong solution (makes '-' right associative, which changes the result):

E → integer '-' E | integer

Correct solution:

E → integer EE

EE → '-' integer EE | ε

In (E)BNF:

E → integer {'-' integer}
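The (E)BNF form maps directly onto a loop. The following is a minimal sketch under some assumptions: it does not use the interface of scanner.h (it reads characters directly from a string), and the names ld_pos, ld_integer, and ld_expression are hypothetical. The repetition {'-' integer} becomes a while loop, and subtracting inside the loop keeps '-' left associative.

```c
#include <ctype.h>

/* Hypothetical helper state (not from scanner.c): current input position. */
const char *ld_pos;

/* Skip blanks, then read one unsigned integer; returns 0 on failure. */
int ld_integer(long *value)
{
    while (*ld_pos == ' ')
        ld_pos++;
    if (!isdigit((unsigned char)*ld_pos))
        return 0;
    *value = 0;
    while (isdigit((unsigned char)*ld_pos))
        *value = *value * 10 + (*ld_pos++ - '0');
    while (*ld_pos == ' ')
        ld_pos++;
    return 1;
}

/* E -> integer {'-' integer}
   The repetition {...} becomes a loop; the running result is updated
   after each operand, which evaluates as ((a - b) - c). */
int ld_expression(const char *input, long *result)
{
    long operand;

    ld_pos = input;
    if (!ld_integer(result))
        return 0;
    while (*ld_pos == '-') {
        ld_pos++;                   /* consume '-' */
        if (!ld_integer(&operand))
            return 0;               /* '-' must be followed by an integer */
        *result -= operand;
    }
    return *ld_pos == '\0';         /* succeed only if all input was used */
}
```

For the input "10 - 4 - 3", ld_expression stores (10 - 4) - 3 = 3 in result. The right-recursive rule E → integer '-' E would instead compute 10 - (4 - 3) = 9.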

 

Differences between Grammars and Regular Expressions

Grammar:

Regular Expression:

A simple regular expression corresponds to a single rewriting rule in a (BNF, ...) grammar

 

Homework

Deadline: June 21, 2018 (Thursday), 19:00

Where to submit: Box in front of room O-529 (building O, 5th floor)

Format: A4 double-sided printout of the parser program. Stapled in the upper left if more than one page; no cover page; no wrapped lines; legible font size; non-proportional (monospaced) font; portrait (not landscape); formatted (indentation, ...) for easy readability; name (kanji and kana) and student number as a comment at the top

Collaboration: The same rules as for Computer Practice I (計算機実習 I) apply

  1. Expand the top-down parser of parser1.c to correctly deal with the four basic arithmetic operations.
    (scanner.h/c do not change, so no need to submit them)
  2. (bonus problem) Add more operations to the top-down parser, and/or deal with parentheses.
    (If you solve this problem, also submit the scanner.h/c files, but only one parser.c file for both problems.)
  3. Bring your notebook computer to the next lecture. Check again that flex, bison, make, and gcc are installed.

 

Glossary

ambiguous grammar: 曖昧な文法
recursive descent parsing: 再帰的下向き構文解析
depth-first: 深さ優先
lookahead: 先読み
backtracking: バックトラック
labyrinth: 迷路
right associative: 右結合
invariant: 不変条件
left recursion: 左再帰