From Lexical Analysis to Parsing


6th lecture, May 25, 2018

Language Theory and Compilers

Martin J. Dürst


© 2005-18 Martin J. Dürst 青山学院大学

Today's Schedule




flex Homework: Lexical Analysis for C

Deadline: May 31, 2018 (Thursday), 19:00, box in front of room O-529 (building O, 5th floor)


Hints for flex Homework (Leftovers)


Additional Hints for Homework


Compilation Stages

  1. Lexical analysis
  2. Parsing (syntax analysis)
  3. Semantic analysis
  4. Optimization (or 5)
  5. Code generation (or 4)


Table of Formal Language Types

Grammar                          Type  Language type               Automaton
phrase structure grammar (psg)   0     phrase structure language   Turing machine
context-sensitive grammar (csg)  1     context-sensitive language  linear-bounded automaton
context-free grammar (cfg)       2     context-free language       push-down automaton
regular grammar (rg)             3     regular language            finite state automaton


Limitations of Regular Languages/Grammars and FSAs

Can the following languages be represented with a regular expression?

None of these languages can be accepted by an FSA, because an FSA has only limited (finite) memory.


Comparing Lexical Analysis and Parsing

                                 Lexical Analysis                                Parsing
Targets of analysis              literals, identifiers, keywords, operators,...  expressions, statements, functions, declarations, definitions,...
Requirement                      speed                                           descriptive power
Notation                         regular expression                              context-free grammar
Device for (automatic) analysis  finite state automaton                          push-down automaton
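
To make the lexical analysis column concrete, here is a minimal sketch of a tokenizer driven by a single regular expression. The token classes (number, identifier, operator), the pattern, and the function name tokenize are assumptions chosen for illustration; they are not taken from the flex homework. Note that the tokenizer happily splits unbalanced parentheses into tokens: checking nesting is left to the parser.

```python
import re

# Hypothetical token classes for illustration: each alternative is a regular expression.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<number>\d+)"
    r"|(?P<identifier>[A-Za-z_]\w*)"
    r"|(?P<operator>[-+*/=()]))"
)

def tokenize(text):
    """Split text into (token class, lexeme) pairs, scanning left to right."""
    tokens = []
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        # lastgroup is the name of the alternative that matched
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

print(tokenize("x = 3 * (y + 42)"))
```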


Regular Grammars and Context-Free Grammars

Regular grammar:

Context free grammar:


Example of Context-Free Grammar

S → aSa | bSb | c

Examples of generated words: c, aca, bcb, abaabcbaaba
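
The way the grammar produces such words can be sketched in a few lines of Python. The function name derive and the encoding of rule choices as a string of a's and b's are assumptions made for this illustration: each letter selects S → aSa or S → bSb, and the derivation ends with S → c.

```python
def derive(choices):
    """Apply S -> aSa or S -> bSb once per letter in choices, then finish with S -> c."""
    form = "S"
    for letter in choices:
        if letter not in ("a", "b"):
            raise ValueError("choices may contain only 'a' and 'b'")
        # Replace the single nonterminal S by letter S letter
        form = form.replace("S", letter + "S" + letter)
        print(form)
    # Terminate the derivation with S -> c
    return form.replace("S", "c")

print(derive("abaab"))  # prints the sentential forms, then abaabcbaaba
```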

Language being generated: words with a single c in the middle, preceded by zero or more a's and b's and followed by the same letters in reverse order, so that the whole word is a palindrome

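This language cannot be accepted by an automaton with finite memory (e.g. an FSA), because the automaton has to remember an arbitrarily long first half of the word in order to check the second half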

We need to extend FSAs to create more powerful automata

We will add a push-down stack
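
As a rough sketch of what the added stack buys us, the following Python function recognizes the language of the grammar above with two states and a stack. The function name accepts, the state names, and the use of "$" as bottom marker are assumptions for this illustration, not the notation used in the lecture.

```python
def accepts(word):
    """Simulate a deterministic push-down automaton for S -> aSa | bSb | c."""
    stack = ["$"]    # "$" serves as the bottom marker (assumed symbol)
    state = "push"   # before the c: push every letter read
    for ch in word:
        if state == "push":
            if ch in ("a", "b"):
                stack.append(ch)
            elif ch == "c":
                state = "pop"   # middle reached: start matching
            else:
                return False
        else:
            # after the c: each letter must match the top of the stack
            if ch in ("a", "b") and stack[-1] == ch:
                stack.pop()
            else:
                return False
    # accept only if the c was seen and only the bottom marker remains
    return state == "pop" and stack == ["$"]

for w in ["c", "aca", "abaabcbaaba", "abcab", "ab", ""]:
    print(repr(w), accepts(w))
```

The stack can grow without bound, which is exactly the memory that a finite state automaton lacks.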


Push-Down Stack and Push-Down Symbols

Example of Push-Down Automaton



How a Push-Down Automaton Works


Deterministic and Nondeterministic Push-Down Automata

(there are other aspects of context-free languages/grammars that affect parsing speed)



(bring to next lecture, will be collected)

For a programming language that you know (e.g. C, Java, Ruby,...), search for a grammar on the Web, print it out, and carefully study it.



nesting (nested)
push-down symbol
bottom marker
deterministic push-down automaton
nondeterministic push-down automaton