Representation and Evaluation of Algorithms

Today's Schedule

Summary of Last Lecture

Solution to Homework 1, Question 1

Last Week's Homework 3: Help Ms. Noda

Last Week's Homework 2: Representation of Algorithms

Methods for Representing Algorithms

Text

Diagrams

What is Pseudocode

Evaluation of Pseudocode

Programming Language

The Programming Language Ruby

Last Week's Homework 4: Install Ruby

Ruby for Algorithm Representation

Reading and Writing Ruby

First Ruby Example

Basics of Ruby Syntax

Overview of Algorithm Evaluation

Comparing the Execution Time of Algorithms

Comparing Execution Times: From Concrete to Abstract

Measuring Execution Time

Counting Operation Steps

Homework 1: Example for Asymptotic Growth of Number of Steps

Homework 2: Calculate Function Values

Homework 3: Compare Function Growth

Homework 4: Logarithms and Limits

Summary of Today's Lecture

This Week's Homework

Glossary

Data Structures and Algorithms

2nd lecture, October 3, 2019

Martin J. Dürst

(アルゴリズムの表現と評価)

http://www.sw.it.aoyama.ac.jp/2019/DA/lecture2.html

Summary of last lecture and last week's homework
Representation of algorithms
The programming language Ruby
Evaluation of algorithms

Course overview
The fascination of algorithms and data structures
Data structures: A number of data items and their relationships.
Example: linked list
Algorithm: A clear set of instructions for how to solve a well-defined problem in finite time.
Examples: linear search, binary search
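As a reminder of what "data items and their relationships" means for a linked list, here is a minimal sketch in Ruby. The names Node, next_node, and list_values are made up for this illustration and are not taken from the lecture material.

```ruby
# A node holds one data item and a reference to the next node (nil at the end).
Node = Struct.new(:value, :next_node)

# Follow the references from the head and collect the values in order.
def list_values(head)
  values = []
  node = head
  while node
    values << node.value
    node = node.next_node
  end
  values
end

# Build the list 3 -> 1 -> 4 and print its values.
list = Node.new(3, Node.new(1, Node.new(4, nil)))
p list_values(list)
```

The data items are the values; the relationship is the next_node reference that links each node to its successor.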

[Sorry, removed!]

Design an efficient (=fast) algorithm for Ms. Noda's problem.

Hint: Can you use an algorithm that you already know?

One idea/outline:

Order products by product price
For each day with target amount t, find the winning product combination as follows:
- For each product price p, calculate the remaining maximum amount r = t - p
- Find the second product with maximum price s ≤ r using (a variant of) binary search
- Take the products so that p + s is the maximum
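The outline above can be sketched in Ruby as follows. Since the original problem statement is not reproduced here, the sketch assumes that exactly two different products are chosen and that their total must not exceed the target amount. The method names last_index_at_most and best_pair are hypothetical.

```ruby
# Variant of binary search: index of the largest element <= limit in a
# sorted array, or nil if every element is larger than limit.
def last_index_at_most(sorted, limit)
  low = 0
  high = sorted.length - 1
  result = nil
  while low <= high
    mid = (low + high) / 2
    if sorted[mid] <= limit
      result = mid        # candidate found; try to find a larger one
      low = mid + 1
    else
      high = mid - 1
    end
  end
  result
end

# Assumption: two different products, total price at most target.
# Returns the best pair of prices, or nil if no pair fits.
def best_pair(prices, target)
  sorted = prices.sort                 # order products by price
  best = nil
  sorted.each_with_index do |p, i|
    r = target - p                     # remaining maximum amount
    j = last_index_at_most(sorted, r)
    j -= 1 if j == i                   # do not pair a product with itself
    next if j.nil? || j < 0
    s = sorted[j]
    best = [p, s] if best.nil? || p + s > best.sum
  end
  best
end

p best_pair([300, 120, 450, 80, 210], 500)
```

Sorting takes O(n log n) once; each day then needs n binary searches, which is much less work than trying all pairs of products.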

Examine the algorithm representations on the additional handout, and think about each representation's advantages and disadvantages.

(no need to submit)

Algorithms are abstract ideas
But we need some way to represent them
The main representation methods are:
1. Text(ual description)
2. Diagrams
3. Pseudocode
4. Programming languages
Each method has advantages and disadvantages

Principle: Describe the algorithm in a natural language

Advantages: Possible to write and understand for non-experts

Disadvantages:

Ambiguity of natural language
Difficulty to be precise
Structure is unclear
Dependent on natural language choice (e.g. Japanese vs. English)

Examples: Flowchart, ...

Advantages: Visual expression

Disadvantages:

Creation is time-consuming
Problem of data exchange (many different formats)
Gap to structured programming

(structured programming: replacing goto (jump to an arbitrary location in a program) with structured branches (if/switch/...) and loops (for/while/...) only)

Notation that is close to actual program code
Commonalities (not required):
- Simple notation (e.g.: no semicolon at end of line)
- No variable declarations
- Ignoring type details
- Use of special symbols (e.g. ←, ∨, ∧, ≤, ∈,...)
- Partial use of natural language (including comments)
Important: You can create your own pseudocode!

Advantages:

Middle ground between (natural language) text and program
Adjustable to author's preferences
Adjustable to the characteristics of the algorithm
Strong connection to structured programming
Possible to concentrate on the essence of the algorithm

Disadvantages:

Many different variations
Obfuscation is possible
Not executable

Advantages:

Precision
Executability

Disadvantages:

Many different programming languages
Dependent on execution environment
More precise than necessary
Different from essence of algorithm

Created in Japan by Yukihiro Matsumoto (松本行弘; nickname: Matz) starting in 1993
Gaining popularity outside Japan since 2000
Increased attention after publication of Ruby on Rails in 2004

Completely object-oriented
Easy-to-use scripting language
Easy for beginners, satisfying for experts
Contributions to Ruby implementation by Master/Bachelor students (internationalization,...)

Install Ruby on your notebook computer (and/or on your computer at home)

Main installation methods (choose one):

Add Ruby to your Cygwin installation (use cygwin setup.exe; detailed instructions)
(maybe already installed for Computer Programming I, please check)
Download and install Ruby 2.5.1-2 (or any other version) from RubyInstaller

How to check: Open a Cygwin Terminal or start Command Prompt with Ruby and execute ruby -v

Important: If you have problems with installing Ruby, come to my lab to fix it before the next lecture.

Simple syntax
No need to declare variables
Data structures (e.g. array) are orthogonal to data types (e.g. int)
(i.e. no need for "array of int", "array of float", ...)
Ruby is often called "executable pseudocode"
Very flexible (can change almost everything)
Excellent environment (profiler, ...)
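The points about variable declarations and about arrays being independent of element types can be illustrated with a small sketch. The method name describe is made up for this example.

```ruby
# No declaration needed: a variable exists once a value is assigned to it.
items = [3, 1.5, "four", [1, 2]]   # one array, elements of different types

# The same method works for any element type.
def describe(array)
  array.map { |item| "#{item.inspect} (#{item.class})" }
end

puts describe(items)
```

There is no separate "array of int" or "array of float"; a single Array class holds any kind of object.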

Important: In this course, you have to learn how to read Ruby programs as pseudocode to understand algorithms.
But you do not need to write Ruby programs.

For this course, you have to learn how to read Ruby programs
(as pseudocode to understand algorithms)
For this course, you do not have to learn how to write Ruby programs
We will write Ruby programs in the spring term of 2020
(Projects in Information Technology II/情報総合プログラミング実習 II)
Writing Ruby programs now will help you both for this course and next year

Linear search and binary search: 2search.rb

How to execute:

Open a Cygwin Terminal or Start Command Prompt with Ruby
Use the cd command to move to a directory of your choice
Download the file into the chosen directory
Execute ruby 2search.rb

# starts a comment (up to end of line)
(we use # for comments about the algorithms,
## for comments about Ruby itself)
No need for ; (semicolon) at end of line
No need for () for conditions (if/while, ...) and most method calls
Methods (functions) are defined using def
class/def/if/while/do all end with end (no {}!)
Classes can be reopened, methods can be added or redefined
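The syntax points above appear together in the following short sketch. It is an illustration written for this summary, not the contents of 2search.rb, and the method name second is made up.

```ruby
# linear search: examine the items one by one until the target is found
def linear_search(items, target)    ## def starts a method definition
  index = 0                         ## no declaration, no semicolon
  while index < items.length        ## no parentheses around the condition
    return index if items[index] == target
    index += 1
  end                               ## closes while
  nil                               ## returned when nothing is found
end                                 ## closes def

## Array is a built-in class; it is reopened here to add a method.
class Array
  def second
    self[1]
  end
end

data = [3, 1, 4, 1, 5]
position = linear_search data, 4    ## method call without parentheses
puts position
puts data.second
```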

Main evaluation criteria:

Execution time (computational (time) complexity)
(this is the main focus of this lecture, and of research on algorithms)
Amount of memory used (space complexity)
(in general, the difference between different algorithms or data structures for the same problem is minor)
Ease of implementation
(this is highly subjective, but in general, faster algorithms may be more difficult to implement)

Contextual information used for evaluation:

How often will the algorithm be used?
With how much data will the algorithm be used?

Example: Comparing linear search and binary search

Possible questions:

How many seconds faster is binary search when compared to linear search?
How many times faster is binary search when compared to linear search?

Problem: These questions do not have a single, clear answer.

When we compare algorithms, we want a single, clear answer.

Our answer will be: Linear search is O(n), binary search is O(log n).

We will arrive at this answer, and understand it, in the next two weeks.

Very concrete

Measure actual execution time
Count operation steps
Calculate worst case number of steps
Think about asymptotic behavior

Very abstract

Advantages:
exact results (if conditions match)
Problems:
- Results depend on hardware used
- Results depend on implementation details
- Results vary slightly with each execution
- Results depend on problem size (number of data items in the input) and on the details of the data
- Impossible without implementation

⇒ We need a better method to compare algorithms, not implementations or hardware
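Measuring actual execution time, as described above, might look like the following sketch. It uses Ruby's standard Benchmark library and the built-in methods Array#index (linear) and Array#bsearch (binary) instead of the course's own implementations; the method name measure_searches and the data size are assumptions made for this example.

```ruby
require 'benchmark'

# Measure the wall-clock time (in seconds) of one linear and one binary
# search for the last item of a sorted array with item_count items.
def measure_searches(item_count)
  data = (1..item_count).to_a
  target = item_count
  linear = Benchmark.realtime { data.index(target) }
  binary = Benchmark.realtime { data.bsearch { |x| x >= target } }
  { linear: linear, binary: binary }
end

times = measure_searches(100_000)
puts "linear search: #{times[:linear]} s"
puts "binary search: #{times[:binary]} s"
```

Running the script several times, or on a different computer, gives different numbers, which shows the problems listed above.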

Step:
- Operation executed in constant time
- Examples:
  - Additions/multiplications/... of integers
  - Comparisons
  - Memory access
Advantage: Independent of hardware
Problems:
- The exact number of steps depends on implementation details
- There are many ways to count steps
  (e.g. is an assignment one step or two steps (two memory accesses))
- Results depend on problem size and on the details of the data

⇒ We need a more abstract way to compare algorithms
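Counting steps can be sketched as follows. This version counts one step for each item examined, which is only one of the many possible ways to count and is not the counting used in 2search.rb. The method names are hypothetical.

```ruby
# Number of items examined by linear search before target is found.
def linear_search_steps(items, target)
  steps = 0
  items.each do |item|
    steps += 1                      # one step per item examined
    return steps if item == target
  end
  steps
end

# Number of items examined by binary search on a sorted array.
def binary_search_steps(sorted, target)
  steps = 0
  low = 0
  high = sorted.length - 1
  while low <= high
    mid = (low + high) / 2
    steps += 1                      # one step per item examined
    if sorted[mid] == target
      return steps
    elsif sorted[mid] < target
      low = mid + 1
    else
      high = mid - 1
    end
  end
  steps
end

# Search for the last item, which is the worst case for linear search.
[8, 64, 512].each do |n|
  data = (1..n).to_a
  puts "n = #{n}: linear #{linear_search_steps(data, n)}, " \
       "binary #{binary_search_steps(data, n)}"
end
```

The counts no longer depend on the hardware, but they still depend on how steps are counted and on which item is searched for.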

Use 2search.rb to fill in the following table. Set the COUNT variable to n. Divide the number of operations shown in the profiles by 2, because each operation is counted twice.

number of steps (counting additions and divisions)
`n (number of data items)`	1	8	64	512	4'096	32'768	262'144
linear search
binary search

Fill in the following table
(use engineering notation (e.g. 1.5E+20) if the numbers get big;
round liberally, the magnitude of the number is more important than the exact value)
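If you want to let Ruby do the calculations, a sketch like the following may help. The function n log₂ n is used only as an illustration, and the method names engineering and n_log_n are made up for this example.

```ruby
# Format a number in the E notation used above (e.g. 1.5E+20).
def engineering(value)
  format('%.1E', value)
end

# Example function: n times the base-2 logarithm of n.
def n_log_n(n)
  n * Math.log2(n)
end

[1, 8, 64, 512].each do |n|
  puts "n = #{n}: #{engineering(n_log_n(n))}"
end
```

One digit after the decimal point is enough here, because the magnitude matters more than the exact value.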

Which function of each pair (left/right column) grows larger if n increases? And why?

left column	right column	column with larger growth	reason
100`n`	`n`²
1.1ⁿ	`n`²⁰
5 log₂ `n`	10 log₄ `n`
20ⁿ	`n`!
100·2ⁿ	2.1ⁿ

Review logarithms (ln, log₁₀, log₂,...) and limits (lim_n→∞,...) based on high school books/notes and Web resources

There are four main ways of describing algorithms:
text, diagrams, pseudocode, and programs
Each description has its advantages and disadvantages
In this course, we will use Ruby as executable pseudocode
The main criteria to evaluate algorithms are time complexity, space complexity, and difficulty of implementation
Time complexity is most important
Measuring execution time and counting steps does not really compare algorithms,
but counting steps is important when evaluating time complexity

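(No need to submit. However, bring the completed tables etc. to next week's lecture.)

Use Ruby to investigate the number of steps of the search algorithms, and complete the table
(you may also create a scatter plot)
Calculate the values of several functions for increasing n
For the comparison of function growth, find out which function "eventually wins", and why
Review logarithms (ln, log₁₀, log₂, ...) and limits (lim_n→∞, ...) using high school textbooks, the Web, and so on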

n log₂ n

representation: 表現
evaluation: 評価
pseudocode: 疑似コード
non-expert: 素人
ambiguity: 曖昧さ
natural language: 自然言語 (すなわち、人間が話す言語)
flowchart: 流れ図
structured programming: 構造化プログラミング
obfuscation: ごまかし
object-oriented (programming language): オブジェクト指向 ((プログラム) 言語)
scripting language: スクリプト言語
criterion (plural: criteria): 基準
execution time: 実行時間
computational complexity: 計算量
worst case: 最悪の場合
asymptotic: 漸近的
round: 四捨五入する、概数で表す
logarithm: 対数
limit: 極限