Representation and Evaluation of Algorithms

(アルゴリズムの表現と評価)

Data Structures and Algorithms

2nd lecture, October 6, 2016

http://www.sw.it.aoyama.ac.jp/2016/DA/lecture2.html

Martin J. Dürst

Today's Schedule

Summary of last lecture, last week's homework
Representation of algorithms
The programming language Ruby
Evaluation of algorithms

Summary of Last Lecture

Course overview
The fascination of algorithms and data structures
Data structures: concept (example: linked list)
Algorithms: concept (example: linear search, binary search)

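Last Week's Homework 1: Huge Amounts of Data

Assume that in trading on the First Section of the Tokyo Stock Exchange, the stock of each company is traded on average once every 30 seconds during business hours. Calculate how many data items are collected in total per year (counting one trade as one item).
Think of and describe a problem that has even more data items than the result of problem 1 and that could realistically be handled by a computer (points will be deducted if your answer is the same as somebody else's).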
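The arithmetic for problem 1 can be sketched in Ruby as follows. Only the 30-second interval comes from the problem statement. The number of companies, the trading hours per day, and the trading days per year are assumptions chosen for illustration, and the method name items_per_year is hypothetical.

```ruby
# Rough estimate for homework problem 1.
# Returns the number of trades (data items) per year.
def items_per_year(companies:, hours_per_day:, days_per_year:, seconds_per_trade: 30)
  trades_per_company_per_day = hours_per_day * 3600 / seconds_per_trade
  trades_per_company_per_day * companies * days_per_year
end

# All three values below are assumptions, not official figures.
total = items_per_year(companies: 2_000, hours_per_day: 5, days_per_year: 245)
puts "Estimated data items per year: #{total}"
```

With other assumptions the result changes, but the order of magnitude (hundreds of millions of items) stays the same.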

Methods for Representing Algorithms

Algorithms are abstract ideas
But we need some way to represent them
The main representation methods are:
- Text(ual description)
- Diagrams
- Pseudocode
- Programming languages
Each method has advantages and disadvantages

Text

Principle: Describe the algorithm in a natural language

Advantages: Non-experts can understand and write it

Disadvantages:

Ambiguity of natural language
Difficult to be precise
Structure is unclear
Dependent on natural language choice (e.g. Japanese vs. English)

Diagrams

Examples: Flowchart, ...

Advantages: Visual expression

Disadvantages:

Creation is time-consuming
Problem of data exchange (many different formats)
Does not map well to structured programming

(structured programming: replacing goto (jump to an arbitrary location in a program) with structured branches (if/switch/...) and loops (for/while/...) only)

What is Pseudocode?

Notation that is close to actual program code
Common features (none of them required):
- Simple notation (e.g.: no semicolon at end of line)
- No variable declarations
- Use of special symbols (e.g. ←, ∨, ∧, ≦, ...)
- Partial use of natural language (incl. comments)
Important: You can create your own pseudocode!

Evaluation of Pseudocode

Advantages:

Middle ground between (natural language) text and program
Adjustable to author's preferences
Adjustable to the characteristics of the algorithm
Strong connection to structured programming
Possible to concentrate on the essence of the algorithm

Disadvantages:

Many different variations
Obfuscation is possible
Not executable

Programming Languages

Advantages:

Precision
Executable

Disadvantages:

Many different languages
Dependent on execution environment
More precise than necessary
Different from essence of algorithm

The Programming Language Ruby

Created in Japan by Yukihiro Matsumoto (松本行弘; nickname: Matz) starting in 1993
Gaining popularity outside Japan since 2000
Increased attention after publication of Ruby on Rails in 2004

Completely object-oriented
Easy-to-use scripting language
Easy for beginners, satisfying for experts
Contributions to Ruby implementation by Master/Bachelor students (internationalization,...)
Lab work (see last week's slides)

Ruby for Algorithm Representation

Simple syntax
No need to declare variables
Data structures (e.g. array) are orthogonal to data types (e.g. int)
(i.e. no need for "array of int", "array of float", ...)
Ruby is often called "executable pseudocode"
Excellent environment (profiler, ...)
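As a small illustration of the point about data structures and data types, the following sketch uses one method on arrays of integers, floats, strings, and a mix of numbers. The method name largest is a hypothetical example, not part of the course material.

```ruby
# Returns the largest item of a non-empty array.
# The same code works for any element type that supports >.
def largest(items)
  result = items[0]
  items.each do |item|
    result = item if item > result
  end
  result
end

p largest([3, 1, 4, 1, 5])      # array of integers
p largest([2.5, 0.1, 9.75])     # array of floats
p largest(["pear", "apple"])    # array of strings
p largest([1, 2.5, 2])          # integers and floats in one array
```

There is only one kind of array, and no element type is declared anywhere.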

Important: In this course, you have to learn how to read Ruby programs. But you do not need to write Ruby programs.

First Ruby Example

Linear search and binary search: 2search.rb

How to execute:

Open a Cygwin Terminal or "Start Command Prompt with Ruby"
Use the cd command to move to a directory of your choice
Download the file into the chosen directory
Execute ruby 2search.rb

Basics of Ruby Syntax

# starts a comment (up to end of line)
(we use # for comments about the algorithms, ## for comments about Ruby itself)
No need for ; (semicolon) at end of line
No need for () for conditions (if/while, ...) and most method calls
Methods (functions) are defined using def
class/def/if/while/do all end with end (no {}!)
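The syntax points above can be seen together in a short linear search. This is a minimal sketch written for this summary and is not the content of 2search.rb. As on the slides, # marks comments about the algorithm and ## marks comments about Ruby itself.

```ruby
# Linear search: look at the items one by one from the start.
# Returns the index of target in items, or nil if it is not found.
def linear_search(items, target)   ## methods are defined using def
  index = 0                        ## no variable declaration needed
  while index < items.length       ## no () around the condition
    return index if items[index] == target
    index += 1                     ## no ; at the end of the line
  end                              ## while ends with end
  nil                              # target was not found
end                                ## def ends with end

puts linear_search([7, 3, 9, 4], 9)   ## no () needed for the call to puts; prints 2
```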

Overview of Algorithm Evaluation

Main evaluation criteria:

Execution time (computational (time) complexity)
(this is the main focus of this lecture, and of research on algorithms in general)
Amount of memory used (space complexity)
(in general, memory use does not differ that much between different algorithms or data structures)
Ease of implementation
(this is highly subjective, but in general, faster algorithms may be more difficult to implement)

Contextual information used for evaluation:

How often will the algorithm be used?
With how much data will the algorithm be used?

Comparing the Execution Time of Algorithms

Example: Comparing linear search and binary search

Possible questions:

How many seconds faster is binary search when compared to linear search?
How many times faster is binary search when compared to linear search?

Problem: These questions do not have a single, clear answer.

When we compare algorithms, we want a simple answer.

Comparing Execution Times: From Concrete to Abstract

Concrete ↑

Measure actual execution time
Count operation steps
Estimate worst case number of steps
Think about asymptotic behavior

Abstract ↓

Measuring Execution Time

Advantages:
exact results (if conditions match)
Problems:
- Results depend on hardware used
- Results depend on implementation details
- Results vary slightly with each execution
- Impossible without implementation

⇒ We need a better method, one that compares algorithms rather than implementations or hardware
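The variation between runs is easy to observe. The following sketch times Ruby's built-in Array#index (a search from the front of the array) and Array#bsearch (a binary search on sorted data) three times each. The helper measure_seconds and the data size are assumptions for illustration, and the printed times will differ on every machine and every run.

```ruby
# Returns the elapsed wall-clock time in seconds for running the given block.
def measure_seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

data = (0...1_000_000).to_a     # sorted data 0, 1, ..., 999999
target = data.last              # last item: worst case for a search from the front

3.times do
  linear = measure_seconds { data.index(target) }
  binary = measure_seconds { data.bsearch { |x| x >= target } }
  puts "linear: #{linear} s, binary: #{binary} s"
end
```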

Counting Operation Steps

(Step: An operation that can be calculated in constant time; examples: arithmetic operations (addition, multiplication, ...), comparisons, memory access, ...)

Advantages:
Independent of hardware
Problems:
- The number of steps depends on implementation details
- There are many ways to count steps
  (e.g. is an assignment one step or two steps (two memory accesses)?)
- The number of steps changes depending on the number of data items in the input and on the details of the data

⇒ We need a more abstract way to compare algorithms
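A minimal sketch of step counting for the two search algorithms follows. It makes one particular choice of what a step is: one comparison with the target for linear search, and one round of the loop for binary search. Other choices are equally valid, which is exactly the problem noted above. The method names are hypothetical.

```ruby
# Linear search; returns [index or nil, number of steps].
def linear_search_steps(items, target)
  steps = 0
  items.each_with_index do |item, index|
    steps += 1                        # one comparison with the target
    return [index, steps] if item == target
  end
  [nil, steps]
end

# Binary search on a sorted array; returns [index or nil, number of steps].
def binary_search_steps(items, target)
  steps = 0
  low = 0
  high = items.length - 1
  while low <= high
    steps += 1                        # one round of halving the search range
    middle = (low + high) / 2
    if items[middle] == target
      return [middle, steps]
    elsif items[middle] < target
      low = middle + 1
    else
      high = middle - 1
    end
  end
  [nil, steps]
end

[10, 100, 1000, 10_000].each do |n|
  items = (1..n).to_a                          # sorted data 1, 2, ..., n
  _, linear = linear_search_steps(items, n)    # search for the last item
  _, binary = binary_search_steps(items, n)
  puts "n=#{n}: linear #{linear} steps, binary #{binary} steps"
end
```

Even with this simple counting rule, the number of steps depends on both n and the position of the target in the data.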

Summary of Today's Lecture

There are four main ways of describing algorithms:
text, diagrams, pseudocode, and programs
Each description has its advantages and disadvantages
In this course, we will use Ruby as "executable pseudocode"
The main criterion to evaluate algorithms is time complexity as a function of the number of (input) data items

This Week's Homework

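(No need to submit, but bring the handout with the tables etc. filled in)

Using Ruby, investigate the number of steps of the search algorithms and complete the table
(you may also create a scatter plot)
Calculate the values of several functions as n increases
Comparing the growth of the functions, find out which function "eventually wins", and why
Using high school textbooks, the Web, etc., look up and review logarithms (ln, log₁₀, log₂, etc.) and limits (lim_{n→∞}, etc.)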
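As a starting point for comparing the growth of functions, the following sketch tabulates two functions and a logarithm for increasing n. The functions f(n) = 100n and g(n) = n² are arbitrary examples chosen for illustration and are not taken from the homework table.

```ruby
# Two example functions; both are arbitrary choices for illustration.
def f(n)
  100 * n
end

def g(n)
  n * n
end

# Smallest n (starting from 1) at which g(n) exceeds f(n).
def first_n_where_g_wins
  n = 1
  n += 1 until g(n) > f(n)
  n
end

[1, 10, 100, 1000, 10_000].each do |n|
  puts "n=#{n}: f(n)=#{f(n)}, g(n)=#{g(n)}, log2(n)=#{Math.log2(n)}"
end
puts "g(n) is larger than f(n) from n=#{first_n_where_g_wins}"
```

For small n, f(n) is larger, but g(n) is larger for every n above 100. In that sense g "eventually wins".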

Glossary

representation: 表現
evaluation: 評価
pseudocode: 疑似コード
non-expert: 素人
ambiguity: 曖昧さ
natural language: 自然言語 (すなわち、人間が話す言語)
flowchart: 流れ図
structured programming: 構造化プログラミング
obfuscation: ごまかし
object-oriented (programming language): オブジェクト指向 ((プログラム) 言語)
scripting language: スクリプト言語
criterion (plural: criteria): 基準
execution time: 実行時間
computational complexity: 計算量
worst case: 最悪の場合
asymptotic: 漸近的
round: 四捨五入する、概数で表す
logarithm: 対数
limit: 極限