(Representation and Evaluation of Algorithms)

http://www.sw.it.aoyama.ac.jp/2019/DA/lecture2.html

© 2009-19 Martin J. Dürst Aoyama Gakuin University

- Summary of last lecture and last week's homework
- Representation of algorithms
- The programming language Ruby
- Evaluation of algorithms

- Course overview
- The fascination of algorithms and data structures
- Data structures: A number of data items and their relationships.

  Example: linked list
- Algorithm: A *clear set of instructions* for how to solve a *well-defined problem* in *finite time*.

  Examples: linear search, binary search

- Companies traded on the Tokyo Stock Exchange (TSE), first section: 2'150 companies
- Morning trading session 09:00-11:30, afternoon trading session 12:30-15:00
- Daily trading: 5 hours = 300 minutes; at 20 trades per minute, this gives 6'000 trades per company
- Saturdays/Sundays in 2018: 104; additional holidays: 20; ⇒ business days: 241
- Total number of data items: 241·6'000·2'150 = 3'108'900'000 trades ≈ 3.1 billion trades
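The estimate above is plain arithmetic and can be checked with a few lines of Ruby. This is only a sketch: the variable names are made up for illustration, and the 365 days of the year 2018 are taken from the calendar, not from the list.

```ruby
# Rough estimate of the number of trades in 2018 (figures from the list above)
business_days  = 365 - 104 - 20    # weekends and holidays removed => 241
trades_per_day = 6_000             # trades per company per day
companies      = 2_150             # TSE first section

total = business_days * trades_per_day * companies
puts total                         # prints 3108900000, about 3.1 billion
```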

Caution: Make sure you provide references for the above data

Actual system capacity (source: The Japan News, 2015/9/22, p. 6):

| | old system | new system |
|---|---|---|
| used | until 2015/9/18 | from 2015/9/24 |
| Orders per day (max) | 137 million | 270 million |
| Orders per second (max) | 30'000~40'000 | >50'000 |
| Time to process orders | 0.001s | 0.0005s |

Design an efficient (=fast) algorithm for Ms. Noda's problem.

Hint: Can you use an algorithm that you already know?

One idea/outline:

- Order products by product price
- For each day with target amount `t`, find the winning product combination as follows:
  - For each product price `p`, calculate the remaining maximum amount `r` = `t` - `p`
  - Find the second product with maximum price `s` ≦ `r` using (a variant of) binary search
  - Take the products so that `p` + `s` is the maximum
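The outline above can be sketched in Ruby as follows. This is one possible reading, not the official solution: it assumes that exactly two different products are bought and that their total price must not exceed the target amount. The method names `last_at_most` and `best_pair` and the sample prices are made up for illustration.

```ruby
# Binary search variant (hypothetical helper): returns the index of the
# largest price <= limit in sorted[lo..], or nil if there is none.
def last_at_most(sorted, limit, lo)
  hi = sorted.length - 1
  found = nil
  while lo <= hi
    mid = (lo + hi) / 2
    if sorted[mid] <= limit
      found = mid        # candidate; a larger one may be further right
      lo = mid + 1
    else
      hi = mid - 1
    end
  end
  found
end

# Returns the prices [p, s] of two different products with the largest
# sum p + s <= target, or nil if no pair fits.
def best_pair(prices, target)
  sorted = prices.sort                       # order products by price
  best = nil
  sorted.each_with_index do |p, i|
    # search only to the right of i, so that a product is not paired with itself
    j = last_at_most(sorted, target - p, i + 1)
    next if j.nil?
    s = sorted[j]
    best = [p, s] if best.nil? || p + s > best.sum
  end
  best
end

p best_pair([300, 120, 450, 80, 210], 500)   # prints [120, 300]
```

Sorting takes on the order of n log n steps, and each of the n binary searches takes on the order of log n steps, so this is much faster than trying all pairs of products.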

Examine the algorithm representations on the additional handout, and think about each representation's advantages and disadvantages.

(no need to submit)

- Algorithms are abstract ideas
- But we need some way to represent them
- The main representation methods are:
  - Text(ual description)
  - Diagrams
  - Pseudocode
  - Programming languages

- Each method has advantages and disadvantages

Principle: Describe the algorithm in a natural language

Advantages: Possible to write and understand for non-experts

Disadvantages:

- Ambiguity of natural language
- Difficulty to be precise
- Structure is unclear
- Dependent on natural language choice (e.g. Japanese vs. English)

Examples: Flowchart, ...

Advantages: Visual expression

Disadvantages:

- Creation is time-consuming
- Problem of data exchange (many different formats)
- Gap to structured programming

(structured programming: replacing `goto` (jump to an arbitrary location in a program) with structured branches (`if`/`switch`/...) and loops (`for`/`while`/...) only)

- Notation that is close to actual program code
- Commonalities (not required):
  - Simple notation (e.g.: no semicolon at end of line)
  - No variable declarations
  - Ignoring type details
  - Use of special symbols (e.g. ←, ∨, ∧, ≤, ∈,...)
  - Partial use of natural language (including comments)

- Important: You can create **your own** pseudocode!

Advantages:

- Middle ground between (natural language) text and program
- Adjustable to author's preferences
- Adjustable to the characteristics of the algorithm
- Strong connection to structured programming
- Possible to concentrate on the essence of the algorithm

Disadvantages:

- Many different variations
- Obfuscation is possible
- Not executable

Advantages:

- Precision
- Executability

Disadvantages:

- Many different programming languages
- Dependent on execution environment
- More precise than necessary
- Different from essence of algorithm

- Created in Japan by Yukihiro Matsumoto (松本 行弘; nickname: Matz) starting in 1993
- Gaining popularity outside Japan since 2000
- Increased attention after publication of Ruby on Rails in 2004

- Completely object-oriented
- Easy-to-use scripting language
- Easy for beginners, satisfying for experts
- Contributions to Ruby implementation by Master/Bachelor students (internationalization,...)

Install Ruby on your notebook computer (and/or on your computer at home)

Main installation methods (choose one):

- Add Ruby to your Cygwin installation (use cygwin setup.exe; detailed instructions)
  (maybe already installed for Computer Programming I, please check)
- Download and install Ruby 2.5.1-2 (or any other version) from RubyInstaller

How to check: Open a `Cygwin Terminal` or start `Command Prompt with Ruby` and execute `ruby -v`

Important: If you have problems with installing Ruby, come to my lab to fix
it **before** the next lecture.

- Simple syntax
- No need to declare variables
- Data structures (e.g. array) are orthogonal to data types (e.g. `int`)
  (i.e. no need for "array of int", "array of float", ...)
- Ruby is often called "executable pseudocode"
- Very flexible (can change almost everything)
- Excellent environment (profiler, ...)
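The first points of the list above can be seen in a very small sketch. The variable names are made up for illustration.

```ruby
count = 3                         # no declaration and no type needed
items = [1, 2.5, "three"]         # one array holds integers, floats, and strings
items << count                    # append another item to the same array

items.each do |item|
  puts "#{item} is a #{item.class}"   # every item knows its own class
end
```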

Important: In this course, you have to learn how to **read**
Ruby programs as pseudocode to understand algorithms.

But you do not need to **write** Ruby programs.

- For this course, you have to learn how to **read** Ruby programs (as pseudocode to understand algorithms)
- For this course, you do not have to learn how to write Ruby programs
- We will **write** Ruby programs in the spring term of 2020 (Projects in Information Technology II/情報総合プログラミング実習 II)
- Writing Ruby programs now will help you both for this course and next year

Linear search and binary search: 2search.rb

How to execute:

- Open a `Cygwin Terminal` or `Start Command Prompt with Ruby`
- Use the `cd` command to move to a directory of your choice
- Download the file into the chosen directory
- Execute `ruby 2search.rb`

- `#` starts a comment (up to end of line)
  (we use `#` for comments about the algorithms, `##` for comments about Ruby itself)
- No need for `;` (semicolon) at end of line
- No need for `()` for conditions (`if`/`while`, ...) and most method calls
- Methods (functions) are defined using `def`
- `class`/`def`/`if`/`while`/`do` all end with `end` (no `{}`!)
- Classes can be reopened, methods can be added or redefined
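The syntax points above appear together in the following short sketch. It is not the content of 2search.rb; the method `linear_search` and the method `second` added to `Array` are made up for illustration.

```ruby
# Linear search: look at each item in turn until the target is found
def linear_search(array, target)     ## methods are defined with def
  index = 0                          ## no declaration, no semicolon
  while index < array.length         ## no () around the condition
    return index if array[index] == target
    index += 1
  end                                ## while ends with end
  nil                                ## nil means "not found"
end                                  ## def ends with end, too

class Array                          ## reopening an existing class
  def second                         ## adding a new method to it
    self[1]
  end
end

data = [3, 1, 4, 1, 5]
puts linear_search(data, 4)          # prints 2
position = linear_search data, 5     ## () are optional for most method calls
puts position                        # prints 4
puts data.second                     # prints 1
```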

Main evaluation criteria:

- Execution time (computational (time) complexity)
  (this is the main focus of this lecture, and of research on algorithms)
- Amount of memory used (space complexity)
  (in general, the difference between different algorithms or data structures for the same problem is minor)
- Ease of implementation
  (this is highly subjective, but in general, faster algorithms may be more difficult to implement)

Contextual information used for evaluation:

- *How often* will the algorithm be used?
- With *how much data* will the algorithm be used?

Example: Comparing linear search and binary search

Possible questions:

- How many *seconds* faster is binary search when compared to linear search?
- How many *times* faster is binary search when compared to linear search?

Problem: These questions do not have a single, clear answer.

When we compare algorithms, we want a single, clear answer.

Our answer will be: Linear search is **O(n)**, binary search is **O(log n)**.

We will arrive at this answer, and understand it, in the next two weeks.

Very concrete

- Measure actual execution time

- Count operation steps

- Calculate worst case number of steps

- Think about asymptotic behavior

Very abstract

- Advantages: exact results (if conditions match)
- Problems:
  - Results depend on hardware used
  - Results depend on implementation details
  - Results vary slightly with each execution
  - Results depend on problem size (number of data items in the input) and on the details of the data
  - Impossible without implementation

⇒ We need a better method to compare algorithms, not implementations or hardware
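Measuring actual execution time can be sketched with the `benchmark` library that comes with Ruby. The data size and the use of `Array#index` as a stand-in for linear search are assumptions made for this illustration; the printed times will differ between computers and between runs, which is exactly the problem described above.

```ruby
require 'benchmark'

data = (0...100_000).to_a           # sorted array 0, 1, ..., 99999
target = data.last                  # the last item: worst case for a scan from the front

3.times do
  # Benchmark.realtime returns the elapsed wall-clock time in seconds
  seconds = Benchmark.realtime { data.index(target) }
  puts "linear scan: #{seconds} s"  # a slightly different value each time
end
```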

- Step:
  - Operation executed in constant time
  - Examples:
    - Additions/multiplications/... of integers
    - Comparisons
    - Memory access

- Advantage: Independent of hardware
- Problems:
  - The exact number of steps depends on implementation details
  - There are many ways to count steps
    (e.g. is an assignment one step or two steps (two memory accesses)?)
  - Results depend on problem size and on the details of the data

⇒ We need a more abstract way to compare algorithms
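Counting steps can be sketched as follows. This is not how 2search.rb counts: here, as an assumption for illustration, one step is one look at an array item, and the method names `linear_steps` and `binary_steps` are made up. A different definition of "step" would give different numbers.

```ruby
# Number of items a linear search looks at before it finds target
def linear_steps(array, target)
  steps = 0
  array.each do |item|
    steps += 1                       # one comparison with the target
    break if item == target
  end
  steps
end

# Number of items a binary search looks at (array must be sorted)
def binary_steps(array, target)
  low = 0
  high = array.length - 1
  steps = 0
  while low <= high
    steps += 1                       # one look at the middle item
    middle = (low + high) / 2
    case array[middle] <=> target
    when 0 then break                # found
    when -1 then low = middle + 1    # continue in the upper half
    else high = middle - 1           # continue in the lower half
    end
  end
  steps
end

[16, 1024].each do |n|
  data = (1..n).to_a
  # searching for the last item is the worst case for linear search
  puts "n=#{n}: linear #{linear_steps(data, n)}, binary #{binary_steps(data, n)}"
end
```

The counts are the same on every computer, but they still depend on `n` and on which item is searched for.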

Use 2search.rb to fill in the following table. Set the `COUNT` variable to `n`. Divide the number of operations shown in the profiles by 2, because each operation is counted twice.

n (number of data items) | 1 | 8 | 64 | 512 | 4'096 | 32'768 | 262'144 |
---|---|---|---|---|---|---|---|
linear search | | | | | | | |
binary search | | | | | | | |

Fill in the following table

(use scientific E notation (e.g. 1.5E+20) if the numbers get big; round liberally, the magnitude of the number is more important than the exact value)

n | 1 | 10 | 100 | 1'000 | 10'000 | 100'000 |
---|---|---|---|---|---|---|
5n | | | | | | |
n^{1.2} | | | | | | |
n^{2} | | | | | | |
n log_{2} n | | | | | | |
1.01^{n} | | | | | | |
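Values like those asked for in the table above can also be computed with a short Ruby program instead of a calculator. This is only a sketch; the names `functions` and `sizes` and the output layout are made up for illustration.

```ruby
# The functions from the table, each with a label for printing
functions = {
  '5n'       => ->(n) { 5 * n },
  'n^1.2'    => ->(n) { n**1.2 },
  'n^2'      => ->(n) { n**2 },
  'n log2 n' => ->(n) { n * Math.log2(n) },
  '1.01^n'   => ->(n) { 1.01**n }
}

sizes = [1, 10, 100, 1_000, 10_000, 100_000]

functions.each do |label, function|
  # '%.1E' rounds liberally: one digit after the decimal point plus exponent
  values = sizes.map { |n| format('%.1E', function.call(n)) }
  puts "#{label.ljust(9)} #{values.join('  ')}"
end
```

For `n` = 100'000, the value of 1.01^{n} is too big for Ruby's floating point numbers and is printed as `Inf`; for this value, the magnitude has to be found in another way, for example with logarithms.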

Which function of each pair (left/right column) grows larger if `n` increases? And why?

left column | right column | column with larger growth | reason |
---|---|---|---|
100n | n^{2} | | |
1.1^{n} | n^{20} | | |
5 log_{2} n | 10 log_{4} n | | |
20^{n} | n! | | |
100·2^{n} | 2.1^{n} | | |

Review logarithms (`ln`, `log`_{10}, `log`_{2}, ...) and limits (lim_{n→∞}, ...) based on high school books/notes and Web resources

- There are four main ways of describing algorithms: text, diagrams, pseudocode, and programs
- Each description has its advantages and disadvantages
- In this course, we will use Ruby as *executable pseudocode*
- The main criteria to evaluate algorithms are time complexity, space complexity, and difficulty of implementation
- Time complexity is most important
- Measuring execution time and counting steps does not really compare algorithms, but counting steps is important when evaluating time complexity

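(No need to submit. However, bring the tables etc. that you filled in to next week's lecture.)

- Use Ruby to investigate the number of steps of the search algorithms and complete the table (you may also create a scatter plot)
- Calculate the values of several functions for increasing `n`
- For the comparison of function growth, find out which function "wins in the end", and why
- Review logarithms (`ln`, `log`_{10}, `log`_{2}, etc.) and limits (lim_{n→∞}, etc.) using high school textbooks, the Web, etc.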

- representation: 表現
- evaluation: 評価
- pseudocode: 疑似コード
- non-expert: 素人
- ambiguity: 曖昧さ
- natural language: 自然言語 (すなわち、人間が話す言語)
- flowchart: 流れ図
- structured programming: 構造化プログラミング
- obfuscation: ごまかし
- object-oriented (programming language): オブジェクト指向 ((プログラム) 言語)
- scripting language: スクリプト言語
- criterion (plural: criteria): 基準
- execution time: 実行時間
- computational complexity: 計算量
- worst case: 最悪の場合
- asymptotic: 漸近的
- round: 四捨五入する、概数で表す
- logarithm: 対数
- limit: 極限