# Representation and Evaluation of Algorithms

(アルゴリズムの表現と評価)

## Data Structures and Algorithms

### 2nd lecture, September 29, 2022

https://www.sw.it.aoyama.ac.jp/2022/DA/lecture2.html

# Today's Schedule

• Summary of last lecture and last week's homework
• Representation of algorithms
• The programming language Ruby
• Evaluation of algorithms

# Introducing the TA

Teaching Assistant (TA): YU JINSONG (于 津松、M1)

# Covid Precautions

• Every morning, measure your body temperature
• If your temperature is elevated (above 37.5 °C), contact the health center
• Observe social distance
• Always wear a mask (correctly!)
• Regularly wash/disinfect your hands thoroughly
• Eat/drink quietly, alone
• If you are not vaccinated yet, get vaccinated as soon as possible

• There are some differences between official registration and Moodle enrollment.
  Both are necessary if you want to take this course.
• In Moodle, some Japanese students are listed with names in Latin letters,...
  These also need to be fixed.
• I will call out the affected students at the end of this lecture

# Summary of Last Lecture

• Course overview
• The fascination of algorithms and data structures
• Data structures: A number of data items and their relationships.
• Algorithm: A clear set of instructions for how to solve a well-defined problem in finite time.
Examples: linear search, binary search

# Solution to Homework 1, Question 1

For trades in all Sections of the Tokyo Stock Exchange, calculate the total number of data items (counting one trade as one data item) during 2022. Assume that during operating hours, each stock is traded once every second.

• Companies traded on the Tokyo Stock Exchange (TSE), all sections (Prime/Standard/Growth/PRO Market, as of 2022/9/28): 3'834 companies
• Saturdays/Sundays in 2022: 105; additional holidays: 16; ⇒ business days: 365 - 105 - 16 = 244
• Total number of data items: 244·18'000·3'834 = 16'838'928'000 trades ≊ 17 billion (17G) trades
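The arithmetic above can be checked with a few lines of Ruby. This is only a sketch: the variable names are made up for illustration, and reading the factor 18'000 as 5 trading hours × 3'600 seconds per day is an assumption about how the slide arrived at it.

```ruby
# Figures taken from the solution above
companies       = 3_834        # companies on the TSE, all sections
business_days   = 365 - 105 - 16
seconds_per_day = 5 * 3_600    # assumption: 5 trading hours per day = 18'000 s

# One trade per stock per second during operating hours
total_trades = business_days * seconds_per_day * companies

puts business_days    # 244
puts total_trades     # 16838928000
```

Ruby integers have no fixed size, so the result does not overflow even though it is larger than a 32-bit integer.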

Caution: Make sure you provide references for the above data

# Tokyo Stock Exchange: Actual system capacity

(source: The Japan News, 2015/9/22, p. 6)

| | old system | new system |
|---|---|---|
| used | until 2015/9/18 | from 2015/9/24 |
| Orders per day (max) | 137 million | 270 million |
| Orders per second (max) | 30'000~40'000 | >50'000 |
| Time to process orders | 0.001s | 0.0005s |

# Last Week's Homework 3: Help Ms. Noda

Design an efficient (=fast) algorithm for Ms. Noda's problem.

Hint: Can you use an algorithm that you already know?

One idea/outline:

• Order products by product price
• For each day with target amount t, find the winning product combination as follows:
  • For each product price p, calculate the remaining maximum amount r = t - p
  • Find the second product with maximum price s ≤ r using (a variant of) binary search
  • Take the pair of products for which p+s is the maximum
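The outline can be sketched in Ruby as follows. This is a minimal sketch, not the reference solution: it assumes that exactly two different products are chosen and that their total must not exceed t. The method names `last_at_most` and `best_pair` and the sample prices are hypothetical.

```ruby
# Variant of binary search: largest index in sorted[low..high]
# whose price is <= limit, or nil if there is none
def last_at_most(sorted, limit, low, high)
  result = nil
  while low <= high
    middle = (low + high) / 2
    if sorted[middle] <= limit
      result = middle          # candidate found, look for a larger one
      low = middle + 1
    else
      high = middle - 1
    end
  end
  result
end

# Best pair [p, s] with p + s <= target, or nil if no pair fits
def best_pair(sorted, target)
  best = nil
  sorted.each_with_index do |price, i|
    remaining = target - price
    # search only after position i, so no product is used twice
    j = last_at_most(sorted, remaining, i + 1, sorted.length - 1)
    next if j.nil?
    second = sorted[j]
    best = [price, second] if best.nil? || price + second > best.sum
  end
  best
end

prices = [300, 120, 450, 980, 210].sort   # hypothetical product prices
puts best_pair(prices, 700).inspect       # [210, 450]
```

Sorting is done once. For each day, the sketch then performs one binary search per product instead of comparing every product with every other product.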

# Last Week's Homework 2: Representation of Algorithms

(no need to submit)

# Methods for Representing Algorithms

• Algorithms are abstract ideas
• But we need some way to represent them
• There are 4 main representation methods:
1. Text
2. Diagrams
3. Pseudocode
4. Programming languages

# Text

Principle: Describe the algorithm in a natural language

Advantages: Non-experts can write and understand it

Disadvantages:

• Ambiguity of natural language
• Difficulty to be precise
• Structure is unclear
• Dependent on natural language choice (e.g. Japanese vs. English)

# Diagrams

Examples: Flowchart, ...

Disadvantages:

• Creation is time-consuming
• Problem of data exchange (many different formats)
• Gap to structured programming

(structured programming: replacing `goto` (jump to an arbitrary location in a program) with structured branches (`if`/`switch`/...) and loops (`for`/`while`/...) only)

# What is Pseudocode

• Notation that is similar to actual program code
• Common features (none of them required):
• Simple notation (e.g.: no semicolon at end of line)
• Ignoring type details
• No variable declarations
• Use of special symbols (e.g. ←, ∨, ∧, ≤, ∈,...)
• Partial use of natural language (including comments)
• Important: You can create your own pseudocode!

Advantages:

• Middle ground between (natural language) text and program
• Adjustable to the characteristics of the algorithm
• Strong connection to structured programming
• Possible to concentrate on the essence of the algorithm

Disadvantages:

• Many different variations
• Obfuscation is possible
• Not executable

# Programming Language

Advantages:

• Precision
• Executability

Disadvantages:

• Many different programming languages
• Dependent on execution environment
• More precise than necessary
• Different from essence of algorithm

# The Programming Language Ruby

• Created in Japan by Yukihiro Matsumoto (松本 行弘; nickname: Matz) starting in 1993
• Gaining popularity outside Japan since 2000
• Increased attention after publication of Ruby on Rails in 2004
• Completely object-oriented
• Easy-to-use scripting language
• Easy for beginners, satisfying for experts
• Contributions to Ruby implementation by Master/Bachelor students of my Lab (internationalization,...)

# Ruby for Algorithm Representation

• Simple syntax
• No need to declare variables
• Data structures (e.g. array) are orthogonal to data types (e.g. `int`)
(i.e. no need for "array of int", "array of float", ...)
• Ruby is often called "executable pseudocode"
• Very flexible (can change almost everything)
• Excellent environment
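As a small illustration of the points about variables and arrays, the following sketch puts values of several types into one array without declaring anything. The variable name `items` is chosen only for this example.

```ruby
items = [3, 2.5, "four"]   # one array, three different element types
items << 10**30            # integers grow as needed, no special type

items.each do |item|
  puts "#{item} is a #{item.class}"
end
```

The array itself does not care what it contains, which is why the same search or sort code can be read without thinking about element types.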

# Last Week's Homework 4: Install Ruby

Install Ruby on your notebook computer (and/or on your computer at home)

Main installation methods (choose at least one):

How to check: Open a `Cygwin Terminal` or start `Command Prompt with Ruby` and execute `ruby -v`

Important: If you have problems with installing Ruby, contact me before the next lecture.

• For this course, you have to learn how to read Ruby programs
(as pseudocode to understand algorithms)
• For this course, you do not have to learn how to write Ruby programs
• But you have to be able to write algorithms in (your own) pseudocode
• We will write Ruby programs in the spring term of 2023
(Projects in Information Technology II/情報総合プログラミング実習 II)
• Writing Ruby programs now will help you both for this course and next year

# First Ruby Example

Linear search and binary search: 2search.rb

How to execute:

• Open a `Cygwin Terminal` or `Start Command Prompt with Ruby`
• Use the `cd` command to move to a directory of your choice
• Execute `ruby 2search.rb`

# Basics of Ruby Syntax

• `#` starts a comment (up to end of line)
(we use `#` for comments about the algorithms,
`##` for comments about Ruby itself)
• No need for `;` (semicolon) at end of line
• No need for `()` for conditions (`if`/`while`, ...) and most method calls
• Methods (functions) are defined using `def`
• `class`/`def`/`if`/`while`/`do` all end with `end` (no `{}`!)
• Classes can be reopened to add or redefine methods
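The syntax points above can be seen together in a short sketch. It is not the content of 2search.rb; the method names `linear_search` and `find_index_linear` are hypothetical and only serve as an illustration.

```ruby
# Linear search: check the items one by one
## `def` defines a method; there are no type declarations
def linear_search(items, key)
  index = 0
  while index < items.length   ## no () around the condition, no ; at the end
    return index if items[index] == key
    index += 1
  end                          ## `while` ends with `end`
  nil                          ## the value of the last expression is returned
end

## Reopen the built-in class Array to add a method
class Array
  def find_index_linear(key)
    linear_search self, key    ## method call without ()
  end
end

position = [5, 3, 8].find_index_linear(8)
puts position                  # 2
```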

# Overview of Algorithm Evaluation

Main evaluation criteria:

1. Execution time (computational (time) complexity)
(this is the main focus of this lecture, and of research on algorithms)
2. Amount of memory used (space complexity)
(in general, the difference between different algorithms or data structures for the same problem is small)
3. Ease of implementation
(this is highly subjective, but in general, faster algorithms may be more difficult to implement)

Contextual information used for evaluation:

• How often will the algorithm be used?
• With how much data will the algorithm be used?

# Comparing the Execution Time of Algorithms

Example: Comparing linear search and binary search

Possible questions:

• How many seconds faster is binary search compared to linear search?
• How many times faster is binary search compared to linear search?

Problem: These questions do not have a single, clear answer.

When we compare algorithms, we want a single, clear answer.

Our answer will be: Linear search is O(n), binary search is O(log n).

We will arrive at this answer, and understand it, in the next two weeks.

# Comparing Execution Times: From Concrete to Abstract

Very concrete

• Measure actual execution time
• Count operation steps
• Calculate worst case number of steps

Very abstract

# Measuring Execution Time

• Advantage: Exact results (if conditions match)
• Problems:
• Results depend on hardware used
• Results depend on implementation details
• Results vary slightly with each execution
• Results depend on problem size (number of data items in the input) and on the details of the data
• Impossible without implementation

⇒ We need a better method to compare algorithms, not implementations and not hardware
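To see these problems in practice, execution time can be measured with Ruby's standard `benchmark` library. The following is a minimal sketch: it uses the built-in methods `Array#index` (a linear search) and `Array#bsearch` (a binary search) rather than the course's own implementations, and the data size and repetition count are arbitrary choices.

```ruby
require 'benchmark'

n = 100_000
data = (0...n).to_a    # sorted data, as required for binary search
key = n - 1            # last item: worst case for linear search

linear_time = Benchmark.realtime do
  100.times do
    data.index(key)
  end
end

binary_time = Benchmark.realtime do
  100.times do
    data.bsearch do |item|
      item >= key
    end
  end
end

puts "linear search: #{linear_time} s"
puts "binary search: #{binary_time} s"
```

Running this several times, or on a different computer, gives different numbers each time, which is exactly why measured times say more about one implementation on one machine than about the algorithm itself.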

# Counting Operation Steps

• Step:
• Operation executed in constant time
• Examples:
• Comparison
• Memory access
• Problems:
• The exact number of steps depends on implementation details
• There are many ways to count steps
(e.g. is an assignment one step or two steps (two memory accesses))
• Results depend on problem size and on the details of the data

⇒ We need a more abstract way to compare algorithms
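Counting steps can be sketched by adding a counter to each search. The choice made here, one step per item examined, is only one of the many possible ways to count, so the numbers will generally differ from those reported by 2search.rb. The method names are hypothetical.

```ruby
# Number of items examined by linear search
def linear_search_steps(items, key)
  steps = 0
  items.each do |item|
    steps += 1
    break if item == key
  end
  steps
end

# Number of items examined by binary search (items must be sorted)
def binary_search_steps(sorted, key)
  steps = 0
  low = 0
  high = sorted.length - 1
  while low <= high
    middle = (low + high) / 2
    steps += 1                 # one step per item examined
    if sorted[middle] == key
      break
    elsif sorted[middle] < key
      low = middle + 1
    else
      high = middle - 1
    end
  end
  steps
end

[10, 100, 1_000].each do |n|
  data = (1..n).to_a
  key = n                      # searching for the last item
  linear = linear_search_steps(data, key)
  binary = binary_search_steps(data, key)
  puts "n=#{n}: linear #{linear} steps, binary #{binary} steps"
end
```

Changing what counts as a step (for example, counting each comparison inside the `if`/`elsif` separately) changes the numbers, but not the way they grow with n.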

# Homework 1: Example for Asymptotic Growth of Number of Steps

Use 2search.rb to fill in the following table. Set the `COUNT` variable to n.

| n (number of data items) | linear search | binary search |
|---|---|---|
| 1 | | |
| 8 | | |
| 64 | | |
| 512 | | |
| 4'096 | | |
| 32'768 | | |
| 262'144 | | |

# Homework 2: Calculate Function Values

Fill in the following table
(use engineering notation (e.g. 1.5E+20) if the numbers get big;
round liberally, the magnitude of the number is more important than the exact value)

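| n | 1 | 10 | 100 | 1'000 | 10'000 | 100'000 |
|---|---|---|---|---|---|---|
| $5n$ | | | | | | |
| $n^{1.2}$ | | | | | | |
| $n^2$ | | | | | | |
| $n \log_2 n$ | | | | | | |
| $1.01^n$ | | | | | | |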
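Such values can be calculated with Ruby itself. The sketch below reads the table rows as 5n (linear), n^1.2, n^2, n log2 n and 1.01^n, and uses n = 50, a value that is not in the table, as the example. The method name `function_values` is hypothetical.

```ruby
# Value of each function for a given n
def function_values(n)
  {
    "5n"       => 5.0 * n,
    "n^1.2"    => n**1.2,
    "n^2"      => n.to_f**2,
    "n log2 n" => n * Math.log2(n),
    "1.01^n"   => 1.01**n
  }
end

n = 50
function_values(n).each do |name, value|
  puts format("%-9s %.1E", name, value)   # engineering notation
end

# For large n, 1.01**n exceeds the range of floating point numbers,
# but its base-10 logarithm is still easy to calculate
log10_value = 100_000 * Math.log10(1.01)
puts "1.01^100000 is about 10^#{log10_value.round}"
```

Working with the logarithm instead of the value itself is a common way to handle numbers that are too large to represent directly, and it matches the advice that the magnitude matters more than the exact value.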

# Homework 3: Compare Function Growth

For each pair, which function (left or right column) eventually grows larger as n keeps increasing? And why?

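| left column | right column | column with larger growth | reason |
|---|---|---|---|
| $100n$ | $n^2$ | | |
| $1.1^n$ | $n^{20}$ | | |
| $5 \log_2 n$ | $10 \log_4 n$ | | |
| $20^n$ | $n!$ | | |
| $100 \cdot 2^n$ | $2.1^n$ | | |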

# Homework 4: Logarithms and Limits

Review logarithms ($\ln$, $\log_{10}$, $\log_2$, ...) and limits ($\lim_{n\to\infty}$, ...) based on high school books/notes and Web resources

# Summary of Today's Lecture

• There are four main ways of describing algorithms:
text, diagrams, pseudocode, and programs
• In this course, we will use Ruby as executable pseudocode
• The main criteria to evaluate algorithms are time complexity, space complexity, and difficulty of implementation
• Time complexity is most important
• Measuring execution time and counting steps does not really compare algorithms,
but counting steps is important when evaluating time complexity

# This Week's Homework

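(No need to submit. However, bring the completed tables to the next lecture.)

1. Use Ruby to investigate the number of steps of the search algorithms and complete the table
(you may also create a scatter plot)
2. Calculate the values of several functions as n increases
3. For the comparison of function growth, investigate which function "eventually wins", and why
4. Review logarithms ($\ln$, $\log_{10}$, $\log_2$, ...) and limits ($\lim_{n\to\infty}$, ...) using high school textbooks, the Web, etc.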

# Glossary

representation

evaluation

pseudocode

non-expert

ambiguity

natural language

flowchart

structured programming

obfuscation
ごまかし
object-oriented (programming language)
オブジェクト指向 ((プログラム) 言語)
scripting language
スクリプト言語
criterion (plural: criteria)

execution time

computational complexity

worst case

asymptotic

round

logarithm

limit