From Lexical Analysis to Parsing

(字句解析から構文解析へ)

6th lecture, May 19, 2017

Language Theory and Compilers

http://www.sw.it.aoyama.ac.jp/2017/Compiler/lecture6.html

Martin J. Dürst

AGU

© 2005-17 Martin J. Dürst 青山学院大学

Today's Schedule

 

Minitest

 

flex Homework: Lexical Analysis for C

Deadline: May 25, 2017 (Thursday), 19:00, box in front of room O-529 (building O, 5th floor)

 

Additional Hints for Homework

 

Compilation Stages

  1. Lexical analysis
  2. Parsing (syntax analysis)
  3. Semantic analysis
  4. Optimization (or 5)
  5. Code generation (or 4)

 

Table of Formal Language Types

Grammar                           Type   Language type                Automaton
phrase structure grammar (psg)    0      phrase structure language    Turing machine
context-sensitive grammar (csg)   1      context-sensitive language   linear-bounded automaton
context-free grammar (cfg)        2      context-free language        push-down automaton
regular grammar (rg)              3      regular language             finite state automaton

 

Limitations of Regular Languages/Grammars and FSAs

Can the following languages be represented with a regular expression?

None of these languages can be accepted by an FSA, because an FSA has only limited (finite) memory.

 

Comparing Lexical Analysis and Parsing


                                  Lexical Analysis                                  Parsing
Targets of analysis               literals, identifiers, keywords, operators,...    expressions, statements, functions, declarations, definitions,...
Requirement                       speed                                             descriptive power
Notation                          regular expression                                context-free grammar
Device for (automatic) analysis   finite state automaton                            push-down automaton
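To make the lexical-analysis column more concrete, here is a minimal sketch of a tokenizer driven by regular expressions. It is written in Python rather than flex, and the token classes (number, identifier, operator) and the function name `tokenize` are assumptions chosen for illustration; they are not the token set of the C homework.

```python
import re

# Hypothetical token classes for illustration only.
TOKEN_RE = re.compile(r"""
    (?P<number>\d+)
  | (?P<identifier>[A-Za-z_]\w*)
  | (?P<operator>[-+*/=()])
  | (?P<space>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, skipping white space."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "space":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("x = 3 + y1"))
```

Each token class is a regular expression, so a finite state automaton is enough to recognize it. Checking that the resulting token sequence forms a well-nested expression is the parser's job and needs a context-free grammar.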

 

Regular Grammars and Context-Free Grammars

Regular grammar:

Context-free grammar:

 

Example of Context-Free Grammar

S → aSa | bSb | c

Examples of generated words: c, aca, bcb, abaabcbaaba

Language generated: words with a single c in the middle, preceded by zero or more a's and b's and followed by the same symbols in reverse order, so that the whole word is a palindrome

This language cannot be accepted by an automaton with finite memory (e.g. FSA)

We need to extend FSAs to create more powerful automata

We will add a push-down stack
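As a rough illustration of why a stack helps, the following Python sketch recognizes the language of S → aSa | bSb | c with an explicit stack. It is a simplified simulation under stated assumptions, not the automaton from the lecture figure: the function name `accepts` is hypothetical, and the single flag `seen_c` plays the role of a two-state finite control.

```python
def accepts(word: str) -> bool:
    """Recognize the language generated by S -> aSa | bSb | c."""
    stack = []       # the push-down stack
    seen_c = False   # finite control: before or after the middle c

    for ch in word:
        if not seen_c:
            if ch in "ab":
                stack.append(ch)   # first half: push each symbol
            elif ch == "c":
                seen_c = True      # middle marker: switch to matching
            else:
                return False       # symbol outside the alphabet
        else:
            # second half: each symbol must equal the symbol popped from the stack
            if not stack or stack.pop() != ch:
                return False

    # accept only if the middle c was seen and the stack is empty again
    return seen_c and not stack


if __name__ == "__main__":
    for w in ["c", "aca", "bcb", "abaabcbaaba", "abcab", "ac", ""]:
        print(repr(w), accepts(w))
```

The stack can grow without bound, which is exactly the memory an FSA lacks. Because the c marks the middle of the word, the switch from pushing to popping never requires guessing.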

 

Push-Down Stack and Push-Down Symbols

Example of Push-Down Automaton

(Figure: diagram of a push-down automaton)

 

How a Push-Down Automaton Works

 

Deterministic and Nondeterministic Push-Down Automata

(there are other aspects of context-free languages/grammars that affect parsing speed)

 

Homework

(bring to next lecture, will be collected)

For a programming language that you know (e.g. C, Java, Ruby,...), search for a grammar on the Web, print it out, and carefully study it.

 

Glossary

unsigned
符号無し
nested
入れ子 (になっている)
palindrome
回文、左右対称な語
push-down symbol
プッシュダウン記号
bottom marker
ボトムマーカ
deterministic push-down automaton
決定性プッシュダウンオートマトン
nondeterministic push-down automaton
非決定性プッシュダウンオートマトン