Course Overview
Overall Compiler Structure

(授業の概要; コンパイラ全体の仕組み)

Language Theory and Compilers
(言語理論とコンパイラ)

1st lecture, April 8, 2022 / on demand

https://www.sw.it.aoyama.ac.jp/2022/Compiler/lecture1.html

Martin J. Dürst (duerst@it.aoyama.ac.jp)

AGU

© 2005-22 Martin J. Dürst 青山学院大学

 

Today's Schedule

 

Self-Introduction

TA: YU JINSONG (于 津松、M1)

 


Position of This Course

 

Course Goals

 

The Importance of Compilers

Blog by Steve Yegge

Summary:

 

ACM Turing Award

On March 30, 2021, ACM announced that the 2020 Turing Award ("Nobel Prize for Computer Science")
was awarded to Alfred V. Aho and Jeffrey D. Ullman
for their work on programming language implementation and their highly influential books.

 

How the Course Is Conducted

 

Grading Method

(rough guideline)

 

Collaboration with Others

For homework, reports, etc.:

Anyone who does not follow these rules will receive 0 points for part or all of the submission!

 

Course Schedule

Schedule
(https://www.sw.it.aoyama.ac.jp/2022/Compiler)

Books/References
(https://www.sw.it.aoyama.ac.jp/2022/Compiler/biblio.html)

(The course covers both language theory and compilers, but each reference book concentrates on only one of the two.)

 

Course Contents


Front end
  Theory: language theory, automata (2, 3, 6, 12)
  Compilers: lexical analysis (4, 5), parsing (7-10)
  Other applications: regular expressions, text/data formats (4)
Back end
  Compilers: optimization, code generation (13, 14)

(numbers indicate the lectures in which each topic is discussed)

 


Example of Difference
between Input and Output

Character itself            HTML/XML escaping
(internal representation)   (external representation)
'                           '
"                           "
<                           &lt;
>                           &gt;
&                           &amp;

 

Which direction is more difficult?

Input: HTML escaping → characters
("AT&amp;T, 3&lt;5" → "AT&T, 3<5")

Output: Characters → HTML escaping
("AT&T, 3<5" → "AT&amp;T, 3&lt;5")

 

Difficulties for Input

 


The Function of a Compiler

Bridge between software and hardware

 

Example Compiler Input/Output

Input fragment:

sum += price * 25;

Output (assembly language):

LOAD  R1, price   ; load from price into R1 (register 1)
CONST R2, 25      ; put constant 25 into R2 (register 2)
MUL   R1, R1, R2  ; put the product of R1 and R2 into R1
LOAD  R2, sum     ; load from sum into R2
ADD   R2, R1, R2  ; put the sum of R1 and R2 into R2
STORE sum, R2     ; store the contents of R2 into sum
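To check that these six instructions really compute sum += price * 25, they can be executed by a tiny simulator. The following Python sketch is only an illustration: the function `run`, the dictionary-based memory, and the sample values for sum and price are assumptions made here, and the instruction semantics are read off the comments in the listing above, not taken from any real machine.

```python
PROGRAM = """
LOAD  R1, price
CONST R2, 25
MUL   R1, R1, R2
LOAD  R2, sum
ADD   R2, R1, R2
STORE sum, R2
"""


def run(program, memory):
    """Execute the toy assembly program; return the resulting memory."""
    memory = dict(memory)   # work on a copy, leave the caller's dict alone
    regs = {}               # register name -> current value
    for line in program.strip().splitlines():
        code = line.split(";", 1)[0].strip()   # drop a trailing comment
        if not code:
            continue
        op, rest = code.split(None, 1)
        args = [arg.strip() for arg in rest.split(",")]
        if op == "LOAD":        # LOAD  Rd, var
            regs[args[0]] = memory[args[1]]
        elif op == "CONST":     # CONST Rd, number
            regs[args[0]] = int(args[1])
        elif op == "MUL":       # MUL   Rd, Ra, Rb
            regs[args[0]] = regs[args[1]] * regs[args[2]]
        elif op == "ADD":       # ADD   Rd, Ra, Rb
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "STORE":     # STORE var, Rs
            memory[args[0]] = regs[args[1]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# Sample values chosen for this sketch: 100 + 3 * 25 = 175
print(run(PROGRAM, {"sum": 100, "price": 3}))   # {'sum': 175, 'price': 3}
```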

 

Logical Structure of a Compiler

  1. [preprocessor]
  2. Lexical analysis
  3. Parsing (syntax analysis)
  4. Semantic analysis
  5. Optimization (or 6.)
  6. Code generation (or 5.)
  7. [assembler]
  8. [linker, loader]

 

Compiler Types and
Related Software

 


Example of Lexical Analysis

Fragment of input program:

sum += price * 25;

This is a sequence of characters:

s u m   + =   p r i c e   *   2 5 ; \n

Corresponding output (sequence of tokens):

id("sum"), plusequal, id("price"), asterisk, int(25), semicolon
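The step from characters to tokens can be sketched with regular expressions. The Python code below is a minimal illustration, not the lexer used in this course: the token names are the ones shown above, while `TOKEN_SPEC`, `tokenize`, and the representation of tokens as tuples are assumptions made for this example. Only the token kinds that occur in the fragment are covered.

```python
import re

# One (name, pattern) pair per kind of token; "skip" covers white space,
# which separates tokens but does not become a token itself.
TOKEN_SPEC = [
    ("int", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("plusequal", r"\+="),
    ("asterisk", r"\*"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Turn a string of characters into a list of tokens (tuples)."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind = match.lastgroup          # name of the pattern that matched
        if kind == "id":
            tokens.append(("id", match.group()))        # attribute: the name
        elif kind == "int":
            tokens.append(("int", int(match.group())))  # attribute: the value
        elif kind != "skip":
            tokens.append((kind,))                      # no attribute needed
        pos = match.end()
    return tokens


print(tokenize("sum += price * 25;\n"))
# [('id', 'sum'), ('plusequal',), ('id', 'price'), ('asterisk',), ('int', 25), ('semicolon',)]
```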

 

Overview of Lexical Analysis

 

Lexical Analysis Details

 

Example of Parsing

Program fragment: sum += price * 25;

Input (sequence of tokens):

id("sum"), plusequal, id("price"), asterisk, int(25), semicolon

Corresponding output (syntax tree):
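In text form, a syntax tree can be written as nested tuples, with the operator first and its operands after it. The Python sketch below builds such a tree from the token sequence above. It is a hand-written illustration under several assumptions: the function names, the tuple representation, and the restriction to statements of the form id += product of operands are choices made for this example, and the resulting tree is one reasonable shape, not necessarily identical to the figure used in the lecture.

```python
def parse_statement(tokens):
    """Parse: id plusequal operand (asterisk operand)* semicolon."""
    pos = 0   # index of the next token to read

    def expect(kind):
        # Consume the next token, which must be of the given kind.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            raise SyntaxError(f"expected {kind} at token {pos}")
        token = tokens[pos]
        pos += 1
        return token

    def operand():
        # An operand is an identifier or an integer constant (a leaf).
        nonlocal pos
        if pos < len(tokens) and tokens[pos][0] in ("id", "int"):
            token = tokens[pos]
            pos += 1
            return token
        raise SyntaxError(f"expected id or int at token {pos}")

    def product():
        # operand (asterisk operand)*, grouped from the left
        node = operand()
        while pos < len(tokens) and tokens[pos][0] == "asterisk":
            expect("asterisk")
            node = ("*", node, operand())
        return node

    target = expect("id")
    expect("plusequal")
    value = product()
    expect("semicolon")
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {pos} after the statement")
    return ("+=", target, value)


tokens = [("id", "sum"), ("plusequal",), ("id", "price"),
          ("asterisk",), ("int", 25), ("semicolon",)]
print(parse_statement(tokens))
# ('+=', ('id', 'sum'), ('*', ('id', 'price'), ('int', 25)))
```

The multiplication ends up as a subtree below the += node, which reflects that price * 25 has to be computed before the result can be added to sum.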

 

 

Details of Syntax Trees

More Examples

price = pretax / 100 * (108 - discount);

score = theory * 2 - errors / 2 + practice * 3;

if (a > 5)
    b = 15;

 


Example of an Automaton

Very simple automatic vending machine:

State transition diagram:

 
 
 

 


Example Language: Commands

Grammar of Commands

Command → Verb Noun '!'
Verb → eat
Verb → read
Verb → play | stay
Noun → bread
Noun → music | books | home

How to produce a command:
Start with Command, and use grammar rules to replace concepts with words
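The production process described above can be carried out mechanically. The following Python sketch stores the grammar rules in a dictionary and replaces concepts with words until only words remain. The names `GRAMMAR` and `expand` are assumptions made for this illustration, and the approach of listing every sentence only works because this particular grammar is not recursive.

```python
from itertools import product

# Each concept maps to its alternatives; each alternative is a sequence
# of concepts and/or words. Anything that is not a key is a word.
GRAMMAR = {
    "Command": [["Verb", "Noun", "!"]],
    "Verb": [["eat"], ["read"], ["play"], ["stay"]],
    "Noun": [["bread"], ["music"], ["books"], ["home"]],
}


def expand(symbol):
    """Return all word sequences that can be produced from symbol."""
    if symbol not in GRAMMAR:          # a word: nothing left to replace
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Combine every expansion of each symbol in this alternative.
        for parts in product(*(expand(s) for s in alternative)):
            results.append([word for part in parts for word in part])
    return results


commands = [" ".join(words).replace(" !", "!") for words in expand("Command")]
print(len(commands))    # 16
print(commands[0])      # eat bread!
```

Every one of the 16 results is a command of the language, including combinations such as "eat music!" that follow the grammar even though they make little sense.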

Language and Grammar

 

Homework Submission

Deadline: April 14, 2022 (Thursday), 18:40

Where to submit: Box in front of room O-529 (building O, 5th floor)

Format: A4 single page (using both sides is okay; NO cover page), easily readable handwriting (NO printouts), name (kanji and kana) and student number at the top right

Problem: For the one-line C program fragment below, based on the examples given in this lecture, write down:

  1. the result of lexical analysis
  2. the result of parsing
  3. the output of the compiler (in assembly language; comments are not needed; use SUB for subtraction, and DIV for division)
grade = english - absent * 5 + math / 3;

 

Schedule From Now On

April 14 (Thursday), 18:40: Homework deadline, box in front of O-529

April 15 (Friday), 11:00-12:30: Second lecture, face-to-face, E-202

 

Glossary

lexical analysis
字句解析
parsing, syntax analysis
構文解析
automaton
オートマトン
formal language
形式言語
grammar
文法
executive summary
役員 (時間がない人) のための要約
front end
フロントエンド
back end
バックエンド
optimization
最適化
code generation
コード生成
regular expression
正規表現
text format
文書形式
data format
データ形式
internal representation
内部表現
external representation
外部表現
(e.g. face) recognition
(顔) 認識
high-level program language
高級プログラム言語
source (file/program)
ソース (ファイル・プログラム)、原始プログラム
object code
目的プログラム
machine code
実行プログラム
assembly language
アセンブリ言語
register
レジスタ
preprocessor
プリプロセッサ
semantic analysis
意味解析
assembler
アセンブラ (アセンブリ言語を処理するソフト)
linker
リンカ
loader
ローダ
one pass compiler
ワンパス・コンパイラ
x-pass compiler
x-パス・コンパイラ
cross-compiler
クロスコンパイラ
dynamic/just-in-time (JIT) compiler
動的コンパイラ
interpreter
インタプリタ、通訳系
token
トークン、記号、符
natural language
自然言語
attribute
属性
identifier (発音: アイデンティファイア)
識別子
syntax tree
構文木
operator
演算子
operand
被演算子
expression
式
subexpression
部分式
statement (of a program)
文
separators
区切り記号
automatic vending machine
自動販売機
state transition diagram
状態遷移図
structure
構造
syntax
構文
semantics
意味 (論)
command
命令 (文)
verb
動詞
noun
名詞