(Asymptotic Complexity and Big-O Notation)

http://www.sw.it.aoyama.ac.jp/2018/DA/lecture3.html

© 2009-18 Martin J. Dürst 青山学院大学

- Summary/leftovers from last lecture, last week's homework
- Comparing execution times: From concrete to abstract
- Classification of Functions by Asymptotic Growth
- Big-`O` notation

- There are many ways of describing algorithms: natural language text, diagrams, pseudocode, programs
- Each description has advantages and disadvantages
- Pseudocode is close to structured programming, but ignores unnecessary details
- In this course, we will use Ruby as "executable pseudocode"
- The main criterion to evaluate algorithms is time complexity as a function of the number of (input) data items
- Time complexity is the most important criterion when comparing algorithms

- Identify basic operations (arithmetic operations, assignments, comparisons, ...)
- Count or calculate the number of times each operation is executed
- If there is a choice, use the worst case

  (e.g. for linear search, the 'not found' case)
- For branches, count the worst branch
- For loops, include the loop logic and multiply by the number of times the loop is executed
- For functions, include some steps for function overhead and multiply by the number of times the function is called
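The counting rules above can be sketched in Ruby, the course's "executable pseudocode". The helper below (`linear_search_count` is a hypothetical name, not part of the course material) counts the basic operation — one comparison per loop iteration — and returns the worst-case count when the target is not found:

```ruby
# Sketch: instrument a linear search so that the basic operation
# (the comparison) is counted explicitly.
def linear_search_count(array, target)
  steps = 0
  array.each do |item|
    steps += 1                    # one comparison per loop iteration
    return steps if item == target
  end
  steps                           # worst case: target not found
end

data = (1..100).to_a
puts linear_search_count(data, 1)   # best case: 1 comparison
puts linear_search_count(data, 0)   # worst case ('not found'): 100 comparisons
```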

Very concrete

- Measure actual execution time

- Count operation steps

- Estimate worst case number of steps

- Think about asymptotic behavior

Very abstract

- For the same **input size**, some algorithms always take the same number of steps.

  Example: Sum of an array of numbers
- Other algorithms' execution time depends on the **input values**.

  Example: Linear search: Finding 'Aargau' is very fast, finding 'Zug' is much slower.
- An algorithm that is sometimes fast, but often slow, is not very good.
- It is best to consider the **worst case** behavior.

  Example for linear search: Search the whole dictionary without finding the target word. (We will see exceptions later in this course.)

- The execution time of an algorithm and the number of executed steps depend on the size of the input (the number of data items in the input)
- We can express this dependency as a function: `f`(`n`) (`n` is the size of the input)
- Rules for comparing functions:
  - Concentrate on what happens when `n` increases (gets really big)

    → Ignore special cases for small `n`

    → Ignore constant(-time) differences (example: initialization time)
  - Concentrate on the essence of the algorithm

    → Ignore hardware differences and implementation differences

    → Ignore constant factors

⇒ Independent of hardware, implementation details, step counting details

⇒ Simple expression of essential differences between algorithms

Fill in the following table

(use engineering notation (e.g. 1.5E+20) if the numbers get very big;

round liberally, the magnitude of the number is more important than the exact
value)

n | 1 | 10 | 100 | 1'000 | 10'000 | 100'000 |
---|---|---|---|---|---|---|
5n | 5 | 50 | 500 | 5'000 | 50'000 | 500'000 |
n^{1.2} | 1 | 15.8 | 251.2 | 3'981 | 63'096 | 1'000'000 |
n^{2} | 1 | 100 | 10'000 | 1'000'000 | 100'000'000 | 1e+10 |
n log_{2} n | 0 | 33.2 | 664.4 | 9'966 | 132'877 | 1'660'964 |
1.01^{n} | 1.01 | 1.1046 | 2.7 | 20'959 | 1.636e+43 | 1.372e+432 |
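The table can be reproduced with a short Ruby script (a sketch; rounding may differ slightly from the hand-filled values). Note that 1.01^100'000 overflows `Float`, so that row is rebuilt from its base-10 logarithm:

```ruby
# Reproduce the growth table above. 1.01**100_000 exceeds Float range,
# so the mantissa and exponent are reconstructed from log10.
sizes = [1, 10, 100, 1_000, 10_000, 100_000]
rows = {
  '5n'       => ->(n) { format('%.4g', 5.0 * n) },
  'n^1.2'    => ->(n) { format('%.4g', n**1.2) },
  'n^2'      => ->(n) { format('%.4g', n.to_f**2) },
  'n log2 n' => ->(n) { format('%.4g', n * Math.log2(n)) },
  '1.01^n'   => ->(n) { e = n * Math.log10(1.01)
                        format('%.4ge%+d', 10**(e % 1), e.floor) },
}
rows.each do |name, f|
  puts format('%-9s %s', name, sizes.map { |n| f.call(n) }.join(' | '))
end
```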

Which function of each pair (left/right column) grows larger if `n`
increases?

left | right | answer |
---|---|---|
100n | n^{2} | right (n ≥ 100) |
1.1^{n} | n^{20} | left (n ≥ 1541) |
5 log_{2} n | 10 log_{4} n | same (10 log_{4} n = 5 log_{2} n) |
20^{n} | n! | right (n ≥ 52) |
100·2^{n} | 2.1^{n} | right (n ≥ 95) |

- Start `irb` (Interactive Ruby)
- Write a loop: `(start..end).each { |n| comparison }`
- Example of `comparison`: `puts n, 1.1**n, n**20`
- Change the `start` and `end` values until appropriate
- If necessary, convert integers to floating point numbers for easier comparison
- Define the factorial function: `def fac(n) n<2 ? 1 : n*fac(n-1) end`

Caution: Use only when you understand which function will eventually grow larger
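The same experiment can be run as a small script instead of interactively. The sketch below finds the first `n` (from `n` = 2) at which 1.1^n overtakes n^20, comparing logarithms to avoid huge floating-point values, and uses the `fac` function from above with Ruby's exact integers for the 20^n vs. n! pair:

```ruby
# Find the first n >= 2 where 1.1**n > n**20, i.e. where
# n*log(1.1) > 20*log(n); logarithms avoid Float overflow.
n = 2
n += 1 while n * Math.log(1.1) < 20 * Math.log(n)
puts n                   # => 1541, as claimed in the table above

# For 20**n vs n!, Ruby's exact integer arithmetic works directly:
def fac(n) n < 2 ? 1 : n * fac(n - 1) end
puts fac(52) > 20**52    # n! has overtaken 20**n here
puts fac(51) > 20**51    # ... but not yet here
```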

Various growth classes with example functions:

- Linear growth: `n`, 2`n`+15, 100`n`-40, 0.001`n`, ...
- Quadratic growth: `n`^{2}, 500`n`^{2}+30`n`+3000, ...
- Cubic growth: `n`^{3}, 5`n`^{3}+7`n`^{2}+80, ...
- Logarithmic growth: ln `n`, log_{2} `n`, 5 log_{10} `n`^{2}+30, ...
- Exponential growth: 1.1^{n}, 2^{n}, 2^{0.5n}+1000`n`^{15}, ...
- ...

Big-O notation is a notation for expressing the order of growth of a function (e.g. time complexity of an algorithm).

`O`(`g`): Set of functions with lower or same order of
growth as function `g`

Example:

Set of functions that grow slower or as slow as `n`^{2}:
`O`(`n`^{2})

Usage examples:

3`n`^{1.5} ∈ `O`(`n`^{2}),
15`n`^{2} ∈ `O`(`n`^{2}),
2.7`n`^{3} ∉ `O`(`n`^{2})

∃`c`>0: ∃`n`_{0}≥0:
∀`n`≥`n`_{0}:
`f`(`n`)≤`c`·`g`(`n`) ⇔ `f`(`n`)∈`O`(`g`(`n`))
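The definition can be illustrated numerically for a specific case. A minimal sketch, assuming the witnesses `c` = 3 and `n`_{0} = 1 for 3`n`^{1.5} ∈ `O`(`n`^{2}) (a finite check like this illustrates the definition, it does not prove membership):

```ruby
# Check f(n) <= c * g(n) over a finite range, with
# f(n) = 3*n**1.5, g(n) = n**2, c = 3, n0 = 1.
f = ->(n) { 3 * n**1.5 }
g = ->(n) { n**2 }
c, n0 = 3, 1
ok = (n0..10_000).all? { |n| f.call(n) <= c * g.call(n) }
puts ok   # => true
```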

- `g`(`n`) is an *asymptotic upper bound* of `f`(`n`)
- In some references (books, ...), `f`(`n`)∈`O`(`g`(`n`)) is written `f`(`n`)＝`O`(`g`(`n`))
  - In this case, O(`g`(`n`)) is always on the right side
  - However, `f`(`n`)∈`O`(`g`(`n`)) is more precise and easier to understand
- Role of `c`: Ignore constant-factor differences (e.g. one computer or programming language being twice as fast as another)
- Role of `n`_{0}: Ignore initialization costs and behavior for small values of `n`

- The number of steps in linear search is `a``n`+`b`

  ⇒ Linear search has time complexity `O`(`n`)

  (linear search is `O`(`n`); linear search has linear time complexity)
- The number of steps in binary search is `c` log_{2} `n`+`d`

  ⇒ Binary search has a time complexity of `O`(log `n`)
- Because `O`(log `n`) ⊊ `O`(`n`), binary search is faster
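The difference is easy to see by counting comparisons. The sketch below uses hypothetical step-counting helpers (not the course's own code): linear search needs about `n` comparisons in the worst case, binary search about log_{2} `n`:

```ruby
# Worst-case comparison counts: O(n) for linear, O(log n) for binary search.
def linear_steps(a, x)
  a.each_with_index { |v, i| return i + 1 if v == x }
  a.length                       # worst case: not found
end

def binary_steps(a, x)
  lo, hi, steps = 0, a.length - 1, 0
  while lo <= hi
    steps += 1                   # one comparison per halving
    mid = (lo + hi) / 2
    return steps if a[mid] == x
    if a[mid] < x
      lo = mid + 1
    else
      hi = mid - 1
    end
  end
  steps
end

sorted = (1..1_000_000).to_a
puts linear_steps(sorted, -1)    # 1000000 comparisons ('not found')
puts binary_steps(sorted, -1)    # about 20 (roughly log2 of a million)
```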

(from last lecture)

Possible questions:

- How many *seconds* faster is binary search when compared to linear search?
- How many *times* faster is binary search when compared to linear search?

Problem: These questions do not have a single answer.

When we compare algorithms, we want a simple answer.

The simple and general answer is using big-O notation:

Linear search is `O`(`n`), binary search is `O`(log
`n`).

- Linear growth:

  `n`∈`O`(`n`); 2`n`+15∈`O`(`n`); 100`n`-40∈`O`(`n`); 5 log_{10}`n`+30∈`O`(`n`), ...

  `O`(1)⊂`O`(`n`); `O`(log`n`)⊂`O`(`n`); `O`(20`n`)=`O`(4`n`+13), ...
- Quadratic growth:

  `n`^{2}∈`O`(`n`^{2}); 500`n`^{2}+30`n`+3000∈`O`(`n`^{2}), ...

  `O`(`n`)⊂`O`(`n`^{2}); `n`^{3}∉`O`(`n`^{2}), ...
- Cubic growth:

  `n`^{3}∈`O`(`n`^{3}); 5`n`^{3}+7`n`^{2}+80∈`O`(`n`^{3}), ...
- Logarithmic growth:

  ln `n`∈`O`(log`n`); log_{2}`n`∈`O`(log`n`); 5 log_{10}`n`^{2}+30∈`O`(log`n`), ...

`O`(`g`(`n`)): Set of functions with lower or same order of growth as `g`(`n`)

`Ω`(`g`(`n`)): Set of functions with larger or same order of growth as `g`(`n`)

`Θ`(`g`(`n`)): Set of functions with same order of growth as `g`(`n`)

Examples:

3`n`^{1.5} ∈ `O`(`n`^{2}), 15`n`^{2} ∈ `O`(`n`^{2}), 2.7`n`^{3} ∉ `O`(`n`^{2})

3`n`^{1.5} ∉ `Ω`(`n`^{2}), 15`n`^{2} ∈ `Ω`(`n`^{2}), 2.7`n`^{3} ∈ `Ω`(`n`^{2})

3`n`^{1.5} ∉ `Θ`(`n`^{2}), 15`n`^{2} ∈ `Θ`(`n`^{2}), 2.7`n`^{3} ∉ `Θ`(`n`^{2})

∃`c`>0: ∃`n`_{0}≥0:
∀`n`≥`n`_{0}: `c`·`g`(`n`)≤`f`(`n`) ⇔
`f`(`n`)∈`Ω`(`g`(`n`))

∃`c`_{1}>0: ∃`c`_{2}>0:
∃`n`_{0}≥0:
∀`n`≥`n`_{0}:
`c`_{1}·`g`(`n`)≤`f`(`n`)≤`c`_{2}·`g`(`n`) ⇔
`f`(`n`)∈`Θ`(`g`(`n`))

`f`(`n`)∈`O`(`g`(`n`)) ∧
`f`(`n`)∈`Ω`(`g`(`n`)) ⇔ `f`(`n`)∈`Θ`(`g`(`n`))

`Θ`(`g`(`n`)) =
`O`(`g`(`n`)) ∩
`Ω`(`g`(`n`))

- `O`: Maximum (worst-case) time complexity of algorithms
- `Ω`: Minimally needed time complexity to solve a problem
- `Θ`: Used to express the fact that a time complexity is not only possible, but actually reached

In general as well as in this course, mainly `O` will be used.

- Method 1: Use the definition

  Find appropriate values for `n`_{0} and `c`, and check the definition
- Method 2: Use the limit of a function

  lim_{n→∞} (`f`(`n`)/`g`(`n`)):
  - If the limit is 0: `O`(`f`(`n`))⊊`O`(`g`(`n`)), `f`(`n`)∈`O`(`g`(`n`))
  - If the limit is 0 < `d` < ∞: `O`(`f`(`n`))=`O`(`g`(`n`)), `f`(`n`)∈`O`(`g`(`n`))
  - If the limit is ∞: `O`(`g`(`n`))⊊`O`(`f`(`n`)), `f`(`n`)∉`O`(`g`(`n`))
- Method 3: Simplification

- Big-`O` notation should be as simple as possible
- Examples (for all functions except constant functions, we assume they are increasing):
  - Constant functions: `O`(1)
  - Linear functions: `O`(`n`)
  - Quadratic functions: `O`(`n`^{2})
  - Cubic functions: `O`(`n`^{3})
  - Logarithmic functions: `O`(log `n`)
- For polynomials, all terms except the term with the biggest exponent can be ignored
- For logarithms, the base is left out (irrelevant)
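Method 2 above (the limit of `f`(`n`)/`g`(`n`)) can be estimated numerically. A sketch using `f`(`n`) = 500`n`^{2}+30`n` and `g`(`n`) = `n`^{2}; the ratio tends to 500, a finite nonzero constant, so both functions are in the same class `O`(`n`^{2}):

```ruby
# Estimate lim f(n)/g(n) by evaluating the ratio for growing n.
f = ->(n) { 500.0 * n**2 + 30 * n }
g = ->(n) { n.to_f**2 }
[10, 1_000, 100_000, 10_000_000].each do |n|
  puts format('n = %-10d f(n)/g(n) = %.6f', n, f.call(n) / g.call(n))
end
# The ratio approaches 500, i.e. 0 < d < infinity: O(f(n)) = O(g(n)).
```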

Concrete example: 500`n`^{2}+30`n` ∈ `O`(`n`^{2})

Derivation for the general case: `f`(`n`) = `d``n`^{a} + `e``n`^{b} [`a` > `b` > 0; `d`, `e` > 0]

Definition of `O`: `f`(`n`) ≤ `c``g`(`n`) [`n` > `n`_{0}; `n`_{0}, `c` > 0]

`d``n`^{a} + `e``n`^{b} ≤ `c``n`^{a}

`d` + `e``n`^{b}/`n`^{a} = `d` + `e``n`^{b-a} ≤ `c`

(dividing both sides by `n`^{a}; because `b` < `a`, `n`^{b-a} shrinks as `n` grows, so any `c` ≥ `d` + `e``n`_{0}^{b-a} works for all `n` ≥ `n`_{0})
Some possible values for `c` and `n`_{0}:

- `n`_{0} = 1, `c` ≥ `d`+`e`
- `n`_{0} = 2, `c` ≥ `d`+2^{b-a}`e`
- `n`_{0} = 10, `c` ≥ `d`+10^{b-a}`e`

Some possible values for the concrete example (500`n`^{2}+30`n`):

- `n`_{0} = 1, `c` ≥ 530 → 500`n`^{2}+30`n` ≤ 530`n`^{2} [`n`≥1]
- `n`_{0} = 2, `c` ≥ 515 → 500`n`^{2}+30`n` ≤ 515`n`^{2} [`n`≥2]
- `n`_{0} = 10, `c` ≥ 503 → 500`n`^{2}+30`n` ≤ 503`n`^{2} [`n`≥10]

In general: `a` > `b` > 0 ⇒
`O`(`n`^{a} +
`n`^{b}) =
`O`(`n`^{a})
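The (`n`_{0}, `c`) pairs for the concrete example can be sanity-checked with exact integer arithmetic (a finite check over a range, not a proof):

```ruby
# Check 500n^2 + 30n <= c * n^2 for the claimed (n0, c) pairs.
f = ->(n) { 500 * n**2 + 30 * n }
{ 1 => 530, 2 => 515, 10 => 503 }.each do |n0, c|
  ok = (n0..100_000).all? { |n| f.call(n) <= c * n**2 }
  puts "n0 = #{n0}, c = #{c}: #{ok}"
end
```

Note that the smaller constants really do need the larger `n`_{0}: at `n` = 1, 500+30 = 530 already exceeds 515 and 503.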

How do `O`(log_{2} `n`) and
`O`(log_{10} `n`) differ?

(Hint: log_{b} `a` = log_{c}
`a` / log_{c} `b` =
log_{c} `a` ·
log_{b} `c`)

log_{10} `n` = log_{2}
`n` · log_{10} 2 ≅ 0.301 · log_{2}
`n`

`O`(log_{10} `n`) = `O`(0.301... · log_{2} `n`) =
`O`(log_{2} `n`)

∀ `a`>1, `b`>1:
`O`(log_{a} `n`) = `O`(log_{b} `n`) =
`O`(log `n`)
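This constant-factor relationship is easy to confirm numerically: the ratio log_{10} `n` / log_{2} `n` is the same for every `n`, namely log_{10} 2:

```ruby
# The ratio of logarithms in different bases is a constant
# (log10(n)/log2(n) = log10(2) ≈ 0.30103), so the base is
# irrelevant for big-O classification.
[10, 1_000, 1_000_000].each do |n|
  puts format('n = %-9d log10(n)/log2(n) = %.5f',
              n, Math.log10(n) / Math.log2(n))
end
puts Math.log10(2)   # the constant factor
```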

- To compare the time complexity of algorithms:
- Ignore constant terms (initialization,...)
- Ignore constant factors (differences due to hardware or implementation)
- Count basic steps executed in the worst case
- Look at asymptotic growth when input size increases

- Asymptotic growth can be expressed with big-`O` notation
- The time complexity of algorithms can be expressed as O(log `n`), O(`n`), O(`n`^{2}), O(2^{n}), ...

(no need to submit)

Review this lecture's material and the additional handout **every
day**!

On the Web, find algorithms with time complexity O(1), O(log `n`),
O(`n`), O(`n` log `n`), O(`n`^{2}),
O(`n`^{3}), O(2^{n}), O(`n`!), and
so on.

- big-O notation
- O 記法 (O そのものは漸近記号ともいう)
- asymptotic growth
- 漸近的な増加
- approximate
- 近似する
- essence
- 本質
- constant factor
- 一定の係数、定倍数
- eventually
- 最終的に
- linear growth
- 線形増加
- quadratic growth
- 二次増加
- cubic growth
- 三次増加
- logarithmic growth
- 対数増加
- exponential growth
- 指数増加
- Omega (Ω)
- オメガ (大文字)
- capital letter
- 大文字
- Theta (Θ)
- シータ (大文字)
- asymptotic upper bound
- 漸近的上界
- asymptotic lower bound
- 漸近的下界
- appropriate
- 適切
- limit
- 極限
- polynomial
- 多項式
- term
- (式の) 項
- logarithm
- 対数
- base
- (対数の) 底