Dynamic Programming 
(動的計画法)
Data Structures and Algorithms
12th lecture, December 15, 2022
https://www.sw.it.aoyama.ac.jp/2022/DA/lecture12.html
Martin J. Dürst

© 2009-22 Martin J. Dürst 青山学院大学
 
Today's Schedule
  - Leftovers and summary of last lecture
 
  - Algorithm design strategies
 
  - Overview of dynamic programming
 
  - Example application: Order of evaluation of chain matrix
  multiplication
 
  - Dynamic programming in Ruby
 
 
Remaining Schedule
  - December 15 (today): 12th lecture (Dynamic Programming)
 
  - December 22: 13th lecture (Algorithm Design Strategies)
 
  - January 12: 14th lecture (NP-Completeness, Reducibility)
   
  - January 19: 15th lecture (Approximation Algorithms)
   
  - January 26, 09:30-10:55 (85 min): Term Final Exam
 
 
Leftovers of Last Lecture
(Boyer-Moore algorithm, string matching context,...)
 
Summary of Last Lecture
  - A simplistic implementation of string matching is
    O(nm) in the worst case
 
  - The Rabin-Karp algorithm is O(n) on average, using a hash
    function that can be extended to 2D matching
 
  - The Knuth-Morris-Pratt algorithm is O(n), and views
    the text strictly in input order after a precomputation step
 
  - The Boyer-Moore algorithm is O(n/m) in
    most cases
 
 
Algorithm Design Strategies
  - Simple/simplistic algorithms
 
  - Divide and conquer
   
  - Dynamic programming
 
 
Overview of Dynamic Programming
  - Investigate and clarify structure of (optimal) solution
 
  - Recursively define (optimal) solution
 
  - Calculate (optimal) solutions bottom-up
 
  - Construct (optimal) solution from calculation results
 
  - Proposed by Richard Bellman in the 1950s
 
  - The name now sounds arbitrary, but it is firmly established
 
 
Simple Example of Dynamic Programming
 
Matrix Multiplication
  - Multiplying a matrix ₀M₁ (r₀ by r₁) and a matrix ₁M₂ (r₁ by r₂)
    results in an r₀ by r₂ matrix ₀M₂
    (₀M₁·₁M₂ ⇒ ₀M₁M₂)
   
  - This multiplication needs r₀r₁r₂ scalar multiplications and
    r₀r₁r₂ scalar additions,
    so its time complexity is O(r₀r₁r₂)
 
  - Actual example: r₀=100, r₁=2, r₂=200
    ⇒ number of multiplications: 100×2×200 = 40'000
 
  - Because the number of scalar multiplications and additions is the same,
    we will only consider multiplications
 
 
Matrix Multiplication Program Skeleton
/* Multiply M01 (r0 by r1) by M12 (r1 by r2), storing the result in
   M02 (r0 by r2); M01, M12, M02 stand for ₀M₁, ₁M₂, ₀M₂. */
for (int i = 0; i < r0; i++)
    for (int j = 0; j < r2; j++) {
        double sum = 0;
        for (int k = 0; k < r1; k++)
            sum += M01[i][k] * M12[k][j];
        M02[i][j] = sum;
    }
 
Chain Multiplication of Scalars
  - A series of multiplications (e.g. 163·4·25) is called chain
    multiplication
 
  - Multiplication of scalars is associative (i.e. (163·4)·25 =
    163·(4·25))
 
  - Not all multiplication orders are equally fast:
    for humans, 163·(4·25) = 163·100 = 16300 is faster than (163·4)·25 = 652·25
  - Conclusion: Choosing a good order of multiplication can speed up
    calculation
 
Chain Multiplication of Matrices
 
Number of Matrix Multiplication Orders

    | Multiplications | Orders |
    |----------------:|-------:|
    |               0 |      1 |
    |               1 |      1 |
    |               2 |      2 |
    |               3 |      5 |
    |               4 |     14 |
    |               5 |     42 |
    |               6 |    132 |
    |               7 |    429 |
    |               8 |   1430 |
    |               9 |   4862 |
  - The number of orders for multiplying n matrices is small for
    small n, but grows exponentially
 
  - The number of orders is equal to the numbers in the middle of Pascal's
    triangle (1, 2, 6, 20, 70,...)
    divided by increasing natural numbers (1, 2, 3, 4, 5,...) 
  - These numbers are called Catalan numbers:
    Cₙ = (2n)! / (n!·(n+1)!) = Ω(4ⁿ/n^(3/2))
    (a small Ruby check follows this list)
  - Catalan numbers have many applications: 
     
      - Combinations of n pairs of properly nested parentheses
        (n=3: ()()(), (())(), ()(()), ((())), (()()))
 
      - Number of shapes of binary trees of size n
 
      - Number of triangulations of a (convex) polygon with n+2
        vertices
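The closed formula can be checked directly in Ruby; a minimal sketch (the
names factorial and catalan are assumed for illustration, matching the
table above):

def factorial(n)
  (1..n).reduce(1, :*)    # 0! = 1, since reduce over an empty range returns 1
end

def catalan(n)
  factorial(2*n) / (factorial(n) * factorial(n+1))
end

p (0..9).map { |n| catalan(n) }
# => [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862]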
 
 
Optimal Order of Multiplications
  - Checking all orders is very slow
    (Ω(n·4ⁿ/n^(3/2)) = Ω(4ⁿ/n^(1/2)))
 
  - Minimal evaluation cost (number of scalar multiplications): 
     
      - mincost(a, c): minimal cost for evaluating aMc 
         
          - if a+1 ≥ c, mincost(a, c) = 0
 
          - if a+1 < c, mincost(a, c) = min_{b=a+1..c-1} cost(a, b, c)
         
      - split(a, c): optimal splitting point
         
          - split(a, c) = arg min_b cost(a, b, c)
         
      - cost(a, b, c): cost for calculating aMbMc 
         
          - i.e. cost for splitting the evaluation of aMc at b
 
          - cost(a, b, c) = mincost(a, b) + mincost(b, c) + r_a·r_b·r_c
  - Simple implementation in Ruby: MatrixSlow in Cmatrix.rb
    (a sketch in the same spirit follows below)
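As an illustration of these definitions, a minimal recursive sketch
(assumed code, not the actual MatrixSlow class; r is the array of
dimensions r₀..rₙ):

def cost(r, a, b, c)
  # cost of splitting the evaluation of aMc at b
  mincost(r, a, b) + mincost(r, b, c) + r[a] * r[b] * r[c]
end

def mincost(r, a, c)
  return 0 if a + 1 >= c
  (a+1 .. c-1).map { |b| cost(r, a, b, c) }.min   # recomputes subresults repeatedly
end

p mincost([4, 2, 6, 10, 5, 3], 0, 5)   # => 274 (see the example calculation below)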
 
Inverting Optimization Order and Storing Intermediate Results
  - The solution can be evaluated from split(0, n)
    top-down using recursion
 
  - The problem with top-down evaluation is that intermediate results
    (mincost(x, y)) are calculated repeatedly
 
  - Bottom-up calculation: 
    
      - Calculate the minimal costs and optimal splitting points for chains
        of length k, starting with k=2 and increasing to
        k=n
 
      - Store intermediate results for reuse
 
    
   
  - Implementation in Ruby: MatrixPlan in Cmatrix.rb
    (a bottom-up sketch follows below)
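A bottom-up sketch under the same assumptions (illustrative only, not the
actual MatrixPlan class):

def matrix_chain_order(r)
  n = r.length - 1                      # number of matrices
  mincost = Array.new(n+1) { Array.new(n+1, 0) }
  split   = Array.new(n+1) { Array.new(n+1) }
  (2..n).each do |k|                    # chain length, from short to long
    (0 .. n-k).each do |a|
      c = a + k
      # each intermediate result is calculated once and stored for reuse
      mincost[a][c], split[a][c] =
        (a+1 .. c-1).map { |b| [mincost[a][b] + mincost[b][c] + r[a]*r[b]*r[c], b] }
                    .min_by(&:first)
    end
  end
  [mincost, split]
end

mincost, split = matrix_chain_order([4, 2, 6, 10, 5, 3])
p mincost[0][5]   # => 274
p split[0][5]     # => 1, i.e. split ₀M₅ into ₀M₁ and ₁M₅, as in the overall solution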
 
Example Calculation

Costs of all subchains, by increasing chain length
(each entry aMbMc gives the cost of splitting aMc at b):

    Length 5:  ₀M₅: min(₀M₁M₅: 274, ₀M₂M₅: 450, ₀M₃M₅: 470, ₀M₄M₅: 320) = 274

    Length 4:  ₀M₄: min(₀M₁M₄: 260, ₀M₂M₄: 468, ₀M₃M₄: 400) = 260
               ₁M₅: min(₁M₂M₅: 366, ₁M₃M₅: 330, ₁M₄M₅: 250) = 250

    Length 3:  ₀M₃: min(₀M₁M₃: 200, ₀M₂M₃: 288) = 200
               ₁M₄: min(₁M₂M₄: 360, ₁M₃M₄: 220) = 220
               ₂M₅: min(₂M₃M₅: 330, ₂M₄M₅: 390) = 330

    Length 2:  ₀M₂ = ₀M₁M₂: 48
               ₁M₃ = ₁M₂M₃: 120
               ₂M₄ = ₂M₃M₄: 300
               ₃M₅ = ₃M₄M₅: 150

    Length 1:  ₀M₁: 0,  ₁M₂: 0,  ₂M₃: 0,  ₃M₄: 0,  ₄M₅: 0

    Dimensions: r₀ = 4, r₁ = 2, r₂ = 6, r₃ = 10, r₄ = 5, r₅ = 3

Overall solution (optimal order of multiplications):
₀M₁·(((₁M₂·₂M₃)·₃M₄)·₄M₅)
 
Complexity of Optimizing Evaluation Order
  - The calculation of mincost(a, c), given the results for all shorter
    chains, is O(c-a)
 
  - Evaluating all mincost(a, a+k) is O((n-k)·k)  
 
  - Evaluation cost has the shape of a tetrahedron,
    with one edge left-right at the bottom,
    and another edge front-back at the top 
  - Total time complexity:
    ∑_{k=1}^{n} O((n-k)·k) = O(n³)
    (indeed, ∑_{k=1}^{n} (n-k)·k = (n³-n)/6)
 
The time complexity of dynamic programming depends on the structure of the
problem;
O(n³), O(n²), O(n), O(nm),... are frequent time complexities
(for this problem, there is a different, more difficult algorithm
with O(n log n))
 
Overview of Dynamic Programming 
  - Investigate and clarify structure of (optimal) solution
 
  - Recursively define (optimal) solution (e.g. MatrixSlow)

  - Calculate (optimal) solutions bottom-up (e.g. MatrixPlan)
  - Construct (optimal) solution from calculation results
 
 
Conditions for Using Dynamic Programming
  - Optimal substructure:
    The global (optimal) solution can be constructed from the (optimal)
    solutions of subproblems
    (common with divide and conquer) 
  - Overlapping subproblems: the same subproblems are needed repeatedly
    (different from divide and conquer) 
 
Memoization
  - The key in dynamic programming is to reuse intermediate results
 
  - Many functions can be changed so that they remember results
 
  - This is called memoization (a minimal sketch follows this list):
    
      - Add a data structure that stores results
        (a dictionary with arguments as key and result as value) 
      - Check the dictionary
 
      - If the result is stored, return it immediately
 
      - If the result is not stored, calculate it, store it, and return
      it
 
    
   
  - Only possible for pure functions (functions with no side effects)
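As a rough sketch of these steps in Ruby (the function fib and its memo
dictionary are assumed names for illustration):

def fib(n, memo = {})
  return n if n < 2                    # base cases need no lookup
  # check the dictionary; calculate, store, and return only if missing
  memo[n] ||= fib(n - 1, memo) + fib(n - 2, memo)
end

p fib(100)   # => 354224848179261915075, each fib(k) calculated only once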
 
 
Memoization in Ruby
  - Use metaprogramming to modify a function so that:
     
      - On first calculation, the result is stored (e.g. in a Hash,
        using the function arguments as the key)

      - Before each calculation, the storage is checked, and the stored
        result is used if available
   
  - Metaprogramming changes the program while it runs
 
  - Simple application example: Cfibonacci.rb
    (caution: for very big numbers, calculation times are also affected by the
    size of the numbers themselves); a sketch of this technique follows below
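A memoizing wrapper in this style might look as follows (an assumed sketch,
not the actual code of Cfibonacci.rb; Module#memoize, the _plain suffix,
and the Demo class are illustrative names):

class Module
  def memoize(name)
    plain = :"#{name}_plain"            # keep the original method under a new name
    alias_method plain, name
    cache = {}                          # results, keyed by argument list
    define_method(name) do |*args|      # redefine the method while the program runs
      cache.fetch(args) { cache[args] = send(plain, *args) }
    end
  end
end

class Demo
  def fib(n)
    n < 2 ? n : fib(n - 1) + fib(n - 2) # recursive calls also go through the cache
  end
  memoize :fib
end

p Demo.new.fib(100)   # immediate; plain recursion would need about 10²⁰ calls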
 
Summary
  - Dynamic programming is an algorithm design strategy
 
  - Dynamic programming is suited for problems where the overall (optimal)
    solution can be obtained from solutions for subproblems, but the
    subproblems overlap
 
  - The time complexity of dynamic programming depends on the structure of
    the actual problem
 
 
Homework
  - Review this lecture (including the 'Example Calculation' and the
  programs)
 
  - Find three problems that can be solved using dynamic programming, and
    investigate the algorithms used
 
  - Prepare for final exam
 
 
Glossary
  - dynamic programming
 
    - 動的計画法
 
  - algorithm design strategies
 
    - アルゴリズムの設計方針
 
  - optimal solution
 
    - 最適解
 
  - scalar
 
    - スカラー
 
  - Catalan number
 
    - カタラン数
 
  - matrix chain multiplication
 
    - 連鎖行列積、行列の連鎖乗算
 
  - triangulations
 
    - (多角形の) 三角分割
 
  - (convex) polygon
 
    - (凸) 多角形
 
  - intermediate result
 
    - 途中結果
 
  - splitting point
 
    - 分割点
 
  - arg min (argument of the minimum)
 
    - 最小値点
 
  - top-down
 
    - 下向き、トップダウン
 
  - bottom-up
 
    - 上向き、ボトムアップ
 
  - (regular) tetrahedron
 
    - (正)四面体
 
  - optimal substructure
 
    - 部分構造の最適性
 
  - overlapping subproblems
 
    - 部分問題の重複
 
  - memoization (verb: memoize)
 
    - 履歴管理
 
  - pure function
 
    - 純粋関数
 
  - metaprogramming
 
    - メタプログラミング