Context-free languages and parsing

(Principles of Top-Down Parsing)

7th lecture, May 20, 2016

Language Theory and Compilers

http://www.sw.it.aoyama.ac.jp/2016/Compiler/lecture7.html

Martin J. Dürst

AGU

© 2005-16 Martin J. Dürst, Aoyama Gakuin University

Today's Schedule

 

Summary of Last Lecture

 

Compound Statement for C

compound_statement
        : '{' '}'
        | '{' statement_list '}'
        | '{' declaration_list '}'
        | '{' declaration_list statement_list '}'
        ;

declaration_list
        : declaration
        | declaration_list declaration
        ;

statement_list
        : statement
        | statement_list statement
        ;

 

Block for Java

Block: 
     { BlockStatements }

 BlockStatements: 
     { BlockStatement }

 BlockStatement:
     LocalVariableDeclarationStatement
     ClassOrInterfaceDeclaration
     [Identifier :] Statement

 

Different Ways to Express a Grammar

Simple Grammar

BNF

 

Formal Rewriting Rules and BNF

There are many different ways to write a grammar:

  1. Simplest: Only a list of rewriting rules
  2. Connect all the right-hand sides that share the same left-hand side with |
    ⇒ No fundamental change from 1. (syntactic sugar)
  3. Add equivalent of ? in regular expressions (present/absent, often written with [...])
    ⇒ Can be written as two different rules
  4. Add equivalent of * in regular expressions (often written {...})
    ⇒ Can be rewritten (see next slide)

This kind of grammar is often called BNF (Backus-Naur Form), EBNF (Extended...), or ABNF (Augmented...),...
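As a small illustration of point 3, the sketch below expands optional symbols into separate plain rules. It assumes a hypothetical encoding in which a plain string is a required symbol and a one-element tuple marks an optional symbol; the function name `expand_optionals` is likewise made up for this example. Applied to a compact form of the C compound statement, it should reproduce the four alternatives shown earlier.

```python
from itertools import product

def expand_optionals(lhs, rhs):
    """Expand a rule whose right-hand side may contain optional symbols.

    rhs is a list of items: a plain string is a required symbol, and a
    one-element tuple such as ("statement_list",) marks an optional
    symbol (an encoding assumed for this sketch only).
    Returns a list of (lhs, symbols) plain rules.
    """
    # Each item offers one choice (required) or two (absent / present).
    choices = [[[item]] if isinstance(item, str) else [[], list(item)]
               for item in rhs]
    return [(lhs, [sym for part in combo for sym in part])
            for combo in product(*choices)]

# compound_statement → '{' [declaration_list] [statement_list] '}'
c_rules = expand_optionals(
    "compound_statement",
    ["'{'", ("declaration_list",), ("statement_list",), "'}'"],
)
for lhs, symbols in c_rules:
    print(lhs, "→", " ".join(symbols))
```

Each optional symbol doubles the number of plain rules, so two optional symbols give four alternatives.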

 

Rewriting BNF to Simple Grammar Rules

M → a {N} b

M → a b | a NList b
NList → N | NList N
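The rewriting above can also be written as a small function. This is only a sketch: the function name `rewrite_repetition` and the convention of naming the helper nonterminal by appending `List` are assumptions for this example, chosen to match the `NList` on the slide.

```python
def rewrite_repetition(lhs, before, repeated, after):
    """Rewrite  lhs → before {repeated} after  into plain rules.

    Follows the pattern on the slide: one alternative without the
    repeated part, one alternative with a list nonterminal, and a
    left-recursive definition of that list.
    """
    list_name = repeated + "List"   # naming convention assumed here
    return [
        (lhs, before + after),                  # zero repetitions
        (lhs, before + [list_name] + after),    # one or more repetitions
        (list_name, [repeated]),                # a list with one element
        (list_name, [list_name, repeated]),     # a longer list
    ]

plain_rules = rewrite_repetition("M", ["a"], "N", ["b"])
for lhs, rhs in plain_rules:
    print(lhs, "→", " ".join(rhs))
```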

 

How to Create a Grammar

  1. Write down a simple example word of the language
  2. Convert the example to token types (result of lexical analysis)
  3. Give names to the various phenomena in the example (e.g.: ...expression, ...statement, etc.)
  4. Create draft rewriting rules
  5. Repeat 1.-4. with more difficult examples, check, and fix
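Step 2 can be tried out with a few lines of code. The sketch below uses Python's `re` module; the token type names (`NUMBER`, `IDENT`, `OP`, `PUNCT`) and the example input are assumptions made for illustration and are not taken from the lecture.

```python
import re

# Hypothetical token types for a small C-like example.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=]"),
    ("PUNCT", r"[{};()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def token_types(text):
    """Return the list of token types for text, ignoring whitespace."""
    types = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            types.append(match.lastgroup)
        pos = match.end()
    return types

print(token_types("{ x = y + 42; }"))
```

The resulting sequence of token types, rather than the raw characters, is what the draft rewriting rules in step 4 are written over.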

 

Example of Grammar Creation

 
 
 
 

Goal of Parsing

 

Result of Parsing: Parse Tree and Abstract Syntax Tree

Parse tree (concrete syntax tree):
Abstract syntax tree:

 

Examples of Parse Tree and Abstract Syntax Tree
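
A minimal sketch of the difference, using nested tuples. The grammar in the comment is hypothetical and chosen only for this illustration. The parse tree records every grammar symbol, including parentheses and nonterminal names, while the abstract syntax tree keeps only operators and operands.

```python
# Parse tree for "(1 + 2) + 3" under the hypothetical grammar
#   Sum  → Term '+' Term
#   Term → n | '(' Sum ')'
# Inner nodes are (nonterminal, children...); leaves are token strings.
parse_tree = (
    "Sum",
    ("Term", "(", ("Sum", ("Term", "1"), "+", ("Term", "2")), ")"),
    "+",
    ("Term", "3"),
)

def to_ast(node):
    """Drop nonterminal names and parentheses; keep operators and numbers."""
    if isinstance(node, str):
        return int(node)                    # a number token
    name, *children = node
    if name == "Term":
        if len(children) == 1:
            return to_ast(children[0])      # Term → n
        return to_ast(children[1])          # Term → '(' Sum ')'
    left, operator, right = children        # Sum → Term '+' Term
    return (operator, to_ast(left), to_ast(right))

print(to_ast(parse_tree))
```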


 
 
 
 
 

Parser Implementation: Top-Down or Bottom-Up

Top-down parsing:
Build the parse tree from the top (root, start symbol)
Bottom-up parsing:
Build the parse tree from the bottom (terminal symbols)
During parsing, there may be several (small) parse trees
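
The top-down direction can be sketched with a small recursive-descent parser, one common way to parse top-down. The grammar `Block → '{' {Item} '}'`, `Item → 'x' | Block` and the function names are assumptions made for this example. Each function decides on its node first and then fills in the children, so the tree grows from the root.

```python
def parse_block(tokens, pos=0):
    """Parse  Block → '{' {Item} '}'  starting at tokens[pos].

    Returns (tree, next_pos). The Block node is decided first,
    then its children are filled in from left to right.
    """
    if pos >= len(tokens) or tokens[pos] != "{":
        raise SyntaxError(f"expected '{{' at position {pos}")
    children = ["{"]
    pos += 1
    while pos < len(tokens) and tokens[pos] != "}":
        item, pos = parse_item(tokens, pos)
        children.append(item)
    if pos >= len(tokens):
        raise SyntaxError("expected '}' before end of input")
    children.append("}")
    return ("Block", *children), pos + 1

def parse_item(tokens, pos):
    """Parse  Item → 'x' | Block."""
    if tokens[pos] == "x":
        return ("Item", "x"), pos + 1
    block, pos = parse_block(tokens, pos)
    return ("Item", block), pos

tree, end = parse_block(list("{x{x}}"))
print(tree)
```

A bottom-up parser would instead start from the tokens and combine small trees into larger ones until only one tree with the start symbol at its root remains.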

 

Difficulty of Parsing

Very General Parsing Method

(Cocke–Younger–Kasami (CYK) algorithm)
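
A minimal sketch of a CYK recognizer follows. It assumes the grammar is already in Chomsky normal form (only rules of the form A → B C and A → a) and only answers whether a word belongs to the language, without building a parse tree. The rule encoding and the example grammar are assumptions made for this illustration.

```python
def cyk(word, unit_rules, pair_rules, start="S"):
    """Return True if word can be derived from start.

    The grammar must be in Chomsky normal form:
      unit_rules: list of (A, a) for rules A → a (a is a terminal)
      pair_rules: list of (A, B, C) for rules A → B C
    """
    n = len(word)
    if n == 0:
        return False  # the empty word is not handled in this sketch
    # table[i][length] = set of nonterminals deriving word[i:i+length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, symbol in enumerate(word):
        for lhs, terminal in unit_rules:
            if terminal == symbol:
                table[i][1].add(lhs)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for lhs, left, right in pair_rules:
                    if (left in table[i][split]
                            and right in table[i + split][length - split]):
                        table[i][length].add(lhs)
    return start in table[0][n]

# Hypothetical grammar for a^k b^k (k ≥ 1) in Chomsky normal form:
#   S → A B | A T,  T → S B,  A → a,  B → b
UNIT = [("A", "a"), ("B", "b")]
PAIR = [("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")]

print(cyk("aabb", UNIT, PAIR))
print(cyk("abab", UNIT, PAIR))
```

The three nested loops over length, start position, and split point give a running time cubic in the length of the word, which is why more specialized parsing methods are preferred for programming languages.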

Homework

Deadline: May 26, 2016 (Thursday), 19:00

Where to submit: Box in front of room O-529 (building O, 5th floor)

Format: A4 single page (using both sides is okay; NO cover page, staple in top left corner if more than one page is necessary), easily readable handwriting (NO printouts), name (kanji and kana) and student number at the top right

In the problems below, n, +, -, *, and / are terminal symbols, and any other letters are non-terminal symbols. n denotes an arbitrary number, and the other symbols denote the four basic arithmetic operations.

  1. For the three grammars below, construct all the possible parse trees for words of length 5. Find the grammar that allows all and only those parse trees that lead to correct results.
    1. E → n | E - E
    2. E → n | n - E
    3. E → n | E - n
  2. Same as in problem 1. for the four grammars below.
    1. E → n | E + E | E * E
    2. E → n | E + n | E * n
    3. E → n | T + T; T → n | n * n
    4. E → T | T + T; T → n | n * n
  3. (bonus problem) Based on the knowledge obtained when solving problems 1. and 2., create a grammar that allows expressions with the four arithmetic operations (without parentheses) to be calculated correctly. Check this grammar with expressions of length 5.
  4. Bring your notebook computer to the next lecture

 

Glossary

parse tree (concrete syntax tree)
解析木
abstract syntax tree
構文木 (抽象構文木)
syntactic sugar
糖衣構文
top-down parsing
下向き構文解析
bottom-up parsing
上向き構文解析
pocket calculator
電卓
Chomsky normal form
Chomsky 標準形
four (basic) arithmetic operations
四則演算