# Code Optimization

(最適化)

## Language Theory and Compilers

http://www.sw.it.aoyama.ac.jp/2018/Compiler/lecture14.html

### Martin J. Dürst

© 2005-18 Martin J. Dürst 青山学院大学

# Today's Schedule

• Remaining schedule
• Summary, homework, and leftovers from last time
• Code optimization
  • Goals, requirements
  • Methods
  • Techniques
• Student survey for course improvement

# Remaining Schedule

• July 20: Optimization
• July 24 (Friday lectures on Tuesday): Execution environment: virtual machines, garbage collection, ...
• July 27, 11:10-12:35: Term final exam

# Summary of Previous Lecture

• Code generation is machine dependent
• The resulting code can be represented in assembly language
• There are no machine instructions for structured statements such as `if`, `for`, ...
  ⇒ we have to use conditional `jump` instructions
• `jump` uses comparison with 0 as a condition
• In C, `jump` can be expressed with `goto`
• Restricted C is an intermediate representation between (full) C and assembly language

# Homework from Two Weeks Ago

For the Turing machine given by the following state transition table:

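| Current state | Current tape symbol | New tape symbol | Movement direction | Next state |
|---|---|---|---|---|
| →1 | 0 | 1 | L | 1 |
| →1 | 1 | 0 | R | 2 |
| →1 | _ | _ | L | 3* |
| 2 | 0 | 0 | R | 2 |
| 2 | 1 | 1 | R | 2 |
| 2 | _ | _ | L | 3* |
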
1. Draw the state transition diagram for this machine
2. Show in detail how this machine processes the input ..._1101000_...
3. Guess and explain what kind of calculation this machine does if the tape contains only a single contiguous sequence of '0's and '1's (surrounded by blanks) and the leftmost non-blank symbol is a '1'

(this Turing machine always starts on the rightmost non-blank symbol)
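To check a hand trace for item 2, the transition table above can be run mechanically. The following is a minimal simulator sketch, not part of the original homework; the names `rule`, `rules`, and `run_tm` are assumptions made for illustration, and the tape is modelled as a fixed string with one `_` on each side.

```c
#include <stddef.h>
#include <string.h>

/* One row of the transition table: in `state`, reading `read`,
   write `write`, move the head by `move` (-1 = L, +1 = R), go to `next`. */
struct rule { int state; char read; char write; int move; int next; };

static const struct rule rules[] = {
    {1, '0', '1', -1, 1},
    {1, '1', '0', +1, 2},
    {1, '_', '_', -1, 3},
    {2, '0', '0', +1, 2},
    {2, '1', '1', +1, 2},
    {2, '_', '_', -1, 3},
};

/* Runs the machine on `tape` (modified in place), starting in state 1
   on the rightmost non-blank symbol. Returns the final state (3 on
   acceptance), or -1 if no rule applies or the head leaves the string. */
int run_tm(char *tape)
{
    int len = (int)strlen(tape);
    int head = len - 1;
    int state = 1;

    while (head > 0 && tape[head] == '_')   /* find rightmost non-blank */
        head--;

    while (state != 3) {
        int found = 0;
        if (head < 0 || head >= len)
            return -1;
        for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
            if (rules[i].state == state && rules[i].read == tape[head]) {
                tape[head] = rules[i].write;
                head += rules[i].move;
                state = rules[i].next;
                found = 1;
                break;
            }
        }
        if (!found)
            return -1;
    }
    return state;
}
```

Called on a buffer holding `_1101000_`, this sketch stops in state 3 with the buffer changed to `_1100111_`.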

# Homework 1: Code Generation for Logical And

C fragment:

```
if (c<4 && f>=12)
    a=d;
```

Hint: convert the condition to nested `if` statements:

```
if (c<4)
    if (f>=12)
        a = d;
```
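One possible translation of the nested `if` into restricted C, using only `goto` and comparisons with 0 as described in the summary. This is a sketch and not necessarily the model solution; the wrapper function `logical_and`, the temporary `t`, and the label `end` are names assumed here. Overflow of the subtractions for extreme values is ignored.

```c
/* Returns the value of a after executing: if (c<4 && f>=12) a = d; */
int logical_and(int a, int c, int d, int f)
{
    int t;

    t = c - 4;              /* c < 4    is the same as  c - 4 < 0   */
    if (t >= 0) goto end;   /* first condition false: skip the rest */
    t = f - 12;             /* f >= 12  is the same as  f - 12 >= 0 */
    if (t < 0) goto end;    /* second condition false: skip         */
    a = d;
end:
    return a;
}
```

Each `if (...) goto` line corresponds to one conditional jump instruction, and each jump tests the negation of the source condition because it skips the assignment.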

# Homework 2: Code Generation for `for` Statement

(bonus problem)

C fragment:

```
for (i=0; i<20; i++)
    x *= y;
```

Preparation (hint): convert to a `while` statement:

```
i = 0;
while (i<20) {
    x *= y;
    i++;
}
```
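The `while` form can then be lowered to restricted C with one conditional and one unconditional jump. The following is a sketch under assumptions, not necessarily the model solution: the wrapper `power_loop`, the temporary `t`, the labels `loop` and `end`, and the use of `long` for `x` and `y` are all chosen here for illustration.

```c
/* Returns the value of x after executing: for (i=0; i<20; i++) x *= y; */
long power_loop(long x, long y)
{
    int i, t;

    i = 0;
loop:
    t = i - 20;             /* i < 20  is the same as  i - 20 < 0 */
    if (t >= 0) goto end;   /* condition false: leave the loop    */
    x *= y;
    i++;
    goto loop;              /* back to the test                   */
end:
    return x;
}
```

For example, `power_loop(1, 2)` multiplies by 2 twenty times and returns 1048576.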

# Goals of Optimization

• Make program execution faster
• Reduce code size
• Preserve program functionality (do not change the meaning of the program)
• Keep compilation reasonably fast
• Keep debugging easy
• Take the properties of the target CPU/machine into account

Compilers provide options to choose different levels of optimization

# Optimization Methods

• Peephole optimization: Local exchange of instructions, ...
• Control flow analysis
  • Split the program into basic blocks:
    • No jumps into the block from outside (except to its first instruction)
    • No jumps out of the block to other locations (except from its last instruction)
    → linear control flow inside the block
  • Create a graph where:
    • Basic blocks are nodes
    • Jumps from (the end of) one block to (the start of) another are directed edges
  • Example: Control flow analysis for Homework 2
• Data flow analysis
  Use the control flow graph to analyse which variable/register assignments affect which variable/register usages
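As an illustration of the control flow graph for the loop in Homework 2, the sketch below assumes one possible restricted-C lowering (shown in the comment) and splits it into four basic blocks. The block names `B0` to `B3`, the matrix `edge`, and the two helper functions are assumptions made for this example.

```c
/* Assumed lowering of: for (i=0; i<20; i++) x *= y;
 *
 *   B0:        i = 0;
 *   B1: loop:  t = i - 20;
 *              if (t >= 0) goto end;
 *   B2:        x *= y;
 *              i++;
 *              goto loop;
 *   B3: end:   ;
 */
enum { B0, B1, B2, B3, NBLOCKS };

/* edge[from][to] is 1 if control can pass from block `from` to block `to`. */
static const int edge[NBLOCKS][NBLOCKS] = {
    /* B0 */ {0, 1, 0, 0},   /* falls through into the loop test */
    /* B1 */ {0, 0, 1, 1},   /* loop body, or exit               */
    /* B2 */ {0, 1, 0, 0},   /* jump back to the loop test       */
    /* B3 */ {0, 0, 0, 0},   /* end of the fragment              */
};

/* Number of blocks that can follow block b. */
int successor_count(int b)
{
    int n = 0;
    for (int to = 0; to < NBLOCKS; to++)
        n += edge[b][to];
    return n;
}

/* Number of blocks from which control can reach block b directly. */
int predecessor_count(int b)
{
    int n = 0;
    for (int from = 0; from < NBLOCKS; from++)
        n += edge[from][b];
    return n;
}
```

In this graph `B1` has two predecessors (`B0` and `B2`); the edge from `B2` back to `B1` is what makes the fragment a loop.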

# Optimization Techniques: Faster and Smaller

Repeatedly try to use different techniques to continuously optimize the code further
(good example for Ruby implementation (video))

• Constant folding
  `24 * 60 * 60` ⇒ `86400`
• Constant propagation
  `a = 3; b = a * 4;` ⇒ `a = 3; b = 3 * 4;`
• Move common code out of loops
  `while (a<500) { a += b*c*d; }`
  ⇒ `e = b*c*d; while (a<500) { a += e; }`
• Dead code elimination: Eliminate code that will never be executed
• Reuse of register values
  `STORE a, R1` followed by `LOAD R1, a` ⇒ `STORE a, R1`
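The transformations above can be written out by hand as pairs of C functions that must return the same result. This is only a sketch of what the optimizer does internally; all function names are assumptions made for this example, and a real compiler may already perform these steps on the "before" versions.

```c
/* Constant folding: the product is computed at compile time. */
int seconds_before(void) { return 24 * 60 * 60; }
int seconds_after(void)  { return 86400; }

/* Constant propagation: the known value of a replaces its use;
   constant folding can then reduce 3 * 4 to 12. */
int propagate_before(void) { int a = 3; int b = a * 4; return b; }
int propagate_after(void)  { int b = 3 * 4; return b; }

/* Moving common code out of a loop: b*c*d does not change inside
   the loop, so it is computed once. Assumes b*c*d > 0 so that the
   loop terminates. */
int loop_before(int a, int b, int c, int d)
{
    while (a < 500) { a += b * c * d; }
    return a;
}

int loop_after(int a, int b, int c, int d)
{
    int e = b * c * d;
    while (a < 500) { a += e; }
    return a;
}
```

Comparing the assembly produced by `gcc -O0 -S` for each pair is one way to see the difference these transformations make.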

# Optimization Techniques: Faster but Larger

Make execution faster even if the amount of code may increase

• Function inlining:
  `double square(double x) { return x*x; }` with the call `square(a);`
  ⇒ `a*a`
• Loop unrolling:
  • Unrolling a loop with a small number of repetitions
    `for (i=0; i<5; i++) a+=b;`
    ⇒ `a = a+b+b+b+b+b;`
  • Partial unrolling (e.g. 20 repetitions are split into 4 repetitions of 5 calculations)
    `for (i=0; i<20; i++) a+=b;`
    ⇒ `for (i=0; i<20; i+=5) a = a+b+b+b+b+b;`
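Written out as complete C functions, the inlined and unrolled forms compute the same values as the originals while trading code size for fewer calls or fewer loop tests. The function names below (other than `square`) are assumptions made for this sketch.

```c
static double square(double x) { return x * x; }

/* Function inlining: the call is replaced by the function body. */
double with_call(double a) { return square(a); }
double inlined(double a)   { return a * a; }

/* Partial unrolling: 20 iterations become 4 iterations of 5 additions,
   so the loop test and increment run 4 times instead of 20. */
int loop_rolled(int a, int b)
{
    int i;
    for (i = 0; i < 20; i++)
        a += b;
    return a;
}

int loop_unrolled(int a, int b)
{
    int i;
    for (i = 0; i < 20; i += 5)
        a = a + b + b + b + b + b;
    return a;
}
```

Partial unrolling in this simple form only works because the repetition count (20) is a multiple of the unrolling factor (5); otherwise the leftover iterations need separate handling.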

# Optimization Techniques: Machine Dependent

Highly dependent on machine type

• Exchange of instructions (different execution time for different instructions)
  Example: `x*2` ⇒ `x+x` or `x<<1`
  Example: `x*5` ⇒ `(x<<2) + x` (the parentheses are needed in C because `+` binds more tightly than `<<`)
• Instruction reordering (example: instructions such as `LOAD` may take longer, but can run in parallel with other instructions)
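The instruction exchanges in the first bullet can be checked in C. The sketch below uses `unsigned` operands so that the shifts are well defined, and the function names are assumptions made for illustration. Whether the shifted form is actually faster depends on the target machine.

```c
/* x*2 rewritten as an addition or as a shift by one bit. */
unsigned times2_mul(unsigned x)   { return x * 2; }
unsigned times2_add(unsigned x)   { return x + x; }
unsigned times2_shift(unsigned x) { return x << 1; }

/* x*5 rewritten as x*4 + x. The parentheses matter in C:
   without them the expression would be read as x << (2 + x). */
unsigned times5_mul(unsigned x)   { return x * 5; }
unsigned times5_shift(unsigned x) { return (x << 2) + x; }
```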

# Example of Instruction Reordering

Expression: `743 * a`

Before optimization:

```
        CONST   R1, 743
        LOAD    R2, a
; invisible waiting time
        MUL     R3, R1, R2
```

After optimization:

```
        LOAD    R2, a
        CONST   R1, 743    ; executed while LOAD still in progress
        MUL     R3, R1, R2
```

# Dynamic Compilation

(also: Just-in-time compilation)

• Check execution count of functions and blocks
• Check use of frequent parameter values (e.g. 0, 1)
• Recompile to take advantage of this knowledge
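The three steps above can be imitated in plain C, although a real just-in-time compiler generates new machine code at run time rather than switching between precompiled functions. The sketch below is only a simulation under that assumption; `HOT_THRESHOLD`, `scale`, and the other names are hypothetical.

```c
#define HOT_THRESHOLD 1000   /* hypothetical call count before "recompiling" */

static long calls_total;     /* execution count of the function    */
static long calls_with_one;  /* how often the parameter k was 1    */

static long scale_generic(long x, long k) { return x * k; }

/* "Recompiled" version: fast path for the frequent value k == 1,
   with a guard that falls back to the general case. */
static long scale_specialized(long x, long k)
{
    if (k == 1)
        return x;
    return x * k;
}

static long (*scale_impl)(long, long) = scale_generic;

/* Counts calls and parameter values, and switches implementation once
   the function is hot and k == 1 dominates. */
long scale(long x, long k)
{
    calls_total++;
    if (k == 1)
        calls_with_one++;
    if (scale_impl == scale_generic
            && calls_total >= HOT_THRESHOLD
            && 2 * calls_with_one > calls_total)
        scale_impl = scale_specialized;
    return scale_impl(x, k);
}

/* Reports whether the specialized version is in use. */
int scale_is_specialized(void) { return scale_impl == scale_specialized; }
```

The guard in `scale_specialized` keeps the result correct for parameter values other than the frequent one, which matches the requirement that optimization must not change the meaning of the program.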

# Examples of Actual Optimization

source

without optimization (`gcc -O0 -S code.c`)

with optimization (`gcc -O1 -S code.c`)

more optimized (`gcc -O2 -S code.c`)

much more optimized (`gcc -O3 -S code.c`, with parallel processing instructions)

(assembly language for Intel PCs (CISC))

Optimization options for `gcc`: English, Japanese

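# Student Survey for Course Improvement

Web survey

Request: In the free-text section, be sure to describe good points and problems concretely

(Bad example: The pronunciation is hard to understand; good example: It is hard to tell whether the sounds of the さ row (sa, shi, su, se, so) are voiced or not)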

# Homework

(no need to submit)

Prepare for term final exam. In particular, have a look at some optimization problems.

# Glossary

peephole optimization
ピープホール最適化
control flow analysis

basic block

data flow analysis
データフロー解析
constant folding

constant propagation