# Algorithms and Data Structures: Concepts and Applications

(アルゴリズムとデータ構造の概要と応用分野)

## Data Structures and Algorithms

(データ構造とアルゴリズム)

### 1st lecture, September 22/on demand, 2022

https://www.sw.it.aoyama.ac.jp/2022/DA/lecture1.html

### Martin J. Dürst

(テュールスト マーティン ヤコブ)

duerst@it.aoyama.ac.jp

Building O, Room 529

# Today's Schedule

• Data Structures: Concept and Example
• Algorithms: Concept and Example
• Course Schedule

# Covid Precautions

• Every morning, measure your body temperature
• If you have an elevated temperature (above 37.5 °C) or feel ill, follow the instructions from the University
• Observe social distance
• Always wear a mask (correctly!)
• Regularly wash/disinfect your hands thoroughly
• Eat/drink quietly, alone
• If you are not vaccinated/boosted yet, get vaccinated/boosted as soon as possible

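# Positioning of the Course

• Department of Integrated Information Technology (情報テクノロジー学科): 2nd year, second semester, required (◉)
• Department of Industrial and Systems Engineering (経営システム工学科):
  • Only for students who enrolled in 2012 or earlier
• Department of Mechanical Engineering (機械創造工学科): 3rd year, second semester, required elective, first subject group (△)
• Students of the Department of Physics and Mathematics (物理・数理学科) and the Department of Electrical Engineering and Electronics (電気電子工学科) can also take the course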

This is a JE course (理工学国際プログラム JE 科目): The explanations are in Japanese, the materials (mostly) in English

# Grading

Approximate weights:

• Mini tests and quizzes during lectures: ~30%
• Exercises/assignments: ~20%
• Final exam: ~50%

# Lecture Schedule and Bibliography

Data Structures and Algorithms: Schedule

Bibliography (参考書)

# Glossary

• Each lecture handout comes with a glossary at the end
• The glossary contains:
  • technical terms for this lecture (e.g. data structure/データ構造) ⇐ part of examination
  • technical terms for computer science (e.g. compiler/コンパイラ)
  • technical terms from other fields (e.g. topology/位相幾何学)
  • selected general terms/expressions (e.g. technical term/専門用語)
• Please report missing terms

# Positioning of Algorithms and Data Structures

(Figure: Algorithms and Data Structures positioned among Applications, Theory, Programming, and Hardware)

# Why Algorithms and Data Structures?

Example of what happens without data structures and algorithms:

• Don't know how to solve a problem at all
• Long program, difficult to maintain
• Slow program, especially when the number of data items increases

# One More Example

```
// target and pattern are very long strings
match = 0;
// stop where pattern no longer fits, so that target[i+j] stays in bounds
for (i = 0; i + strlen(pattern) <= strlen(target); i++) {
    match_char = 0;
    for (j = 0; j < strlen(pattern); j++)
        if (target[i+j] == pattern[j])
            match_char++;
    if (match_char == strlen(pattern))
        match++;
}
```

Very slow!

# Where is the Problem?

```
// target and pattern are very long strings
match = 0;

// stop where pattern no longer fits, so that target[i+j] stays in bounds
for (i = 0; i + strlen(pattern) <= strlen(target); i++) {
    match_char = 0;
    for (j = 0; j < strlen(pattern); j++)
        if (target[i+j] == pattern[j])
            match_char++;
    if (match_char == strlen(pattern))
        match++;
}
```

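`strlen` takes more time for longer strings, because each call scans the string up to its terminating null character, and here it is called again on every loop iteration!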
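To see where the time goes, here is a minimal sketch of what a `strlen`-like function has to do. The name `my_strlen` is made up for illustration; real C library versions are more heavily optimized, but they still have to look for the terminating null character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for strlen (not the library implementation):
   walks the string until the terminating '\0', so the work grows
   with the length of the string. */
size_t my_strlen(const char *s)
{
    assert(s != NULL);       /* precondition: a valid C string */
    size_t n = 0;
    while (s[n] != '\0')     /* one comparison per character */
        n++;
    return n;
}
```

Calling such a function in a loop condition repeats this whole scan on every iteration.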

# Solution

```
// target and pattern are very long strings
match = 0;
p_length = strlen(pattern);   // computed only once

// stop where pattern no longer fits, so that target[i+j] stays in bounds
for (i = 0; i + p_length <= strlen(target); i++) {
    match_char = 0;
    for (j = 0; j < p_length; j++)
        if (target[i+j] == pattern[j])
            match_char++;
    if (match_char == p_length)
        match++;
}
```

# Further Improvement

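```
// target and pattern are very long strings
match = 0;
p_length = strlen(pattern);   // computed only once
t_length = strlen(target);    // computed only once
// stop where pattern no longer fits, so that target[i+j] stays in bounds
for (i = 0; i + p_length <= t_length; i++) {
    match_char = 0;
    for (j = 0; j < p_length; j++)
        if (target[i+j] == pattern[j])
            match_char++;
    if (match_char == p_length)
        match++;
}
```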
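The fragment above leaves out declarations. As a rough, self-contained sketch, it could be wrapped into a function like the one below. The function name `count_matches`, the use of `size_t`, the handling of an empty pattern, and the loop bound that stops where the pattern no longer fits into the target are assumptions made for this sketch, not part of the original slide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the positions in target at which pattern occurs
   (overlapping occurrences are counted separately). */
size_t count_matches(const char *target, const char *pattern)
{
    assert(target != NULL && pattern != NULL);
    size_t t_length = strlen(target);    /* computed only once */
    size_t p_length = strlen(pattern);   /* computed only once */
    size_t match = 0;

    /* assumption of this sketch: an empty pattern never matches */
    if (p_length == 0 || p_length > t_length)
        return 0;

    /* the bound keeps target[i + j] inside the string */
    for (size_t i = 0; i + p_length <= t_length; i++) {
        size_t match_char = 0;
        for (size_t j = 0; j < p_length; j++)
            if (target[i + j] == pattern[j])
                match_char++;
        if (match_char == p_length)
            match++;
    }
    return match;
}
```

With this sketch, `count_matches("abababa", "aba")` counts the occurrences starting at positions 0, 2, and 4.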

# The Fascination of Algorithms and Data Structures

• At the core of Computer Science/Information Technology
• At the intersection of theory and practice
• Clear evaluation criteria
• Lots of interesting ideas
• Highly practical (Examples: Google PageRank algorithm, Netflix prize, algorithmic trading)
• Advantageous for job search (Get that job at Google, Steve Yegge, 2008)
• Advantageous for Master Course entrance exams

# Lecture Goals

Understand

• Way of thinking for algorithms and data structures
• Design of algorithms and data structures
• Well-known algorithms and data structures

# Example of Data Structure: Linked List


• Each data item points to the next data item in the list
• Data items consist of a pointer/reference and some payload
• There is an external pointer to the start of the list
• The last data item contains a special null pointer/reference to indicate the end of the list
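The four points above might look as follows in C. This is only a sketch: the names `struct node`, `list_prepend`, `list_length`, and `list_free`, as well as the choice of an `int` payload, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One data item: some payload plus a pointer to the next item. */
struct node {
    int payload;
    struct node *next;   /* NULL marks the end of the list */
};

/* Adds a new item in front of the list and returns the new start. */
struct node *list_prepend(struct node *head, int payload)
{
    struct node *item = malloc(sizeof *item);
    assert(item != NULL);   /* sketch only: give up if out of memory */
    item->payload = payload;
    item->next = head;
    return item;
}

/* Follows the next pointers from the start until the null pointer. */
size_t list_length(const struct node *head)
{
    size_t n = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        n++;
    return n;
}

/* Releases all items of the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

The external pointer to the start of the list is simply a `struct node *` variable held by the caller; an empty list is represented by `NULL`.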

# Data Structure: Concept

A data structure consists of:

• A number of data items
Examples: Numbers, student data, ...
• Relationships (connections) between the data items

The term data structure is used mostly for structures inside a computer (in main memory).

There are two different views of data structures:

• Internal view (implementation): Construction out of arrays, structures, pointers, ...
• External view (functionality provided): Abstract data type (ADT)
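The two views can be illustrated with a small stack. The stack is chosen here only as an example (the lecture does not name a specific ADT at this point), and all names as well as the fixed capacity are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 100   /* arbitrary limit for this sketch */

/* Internal view: the stack is built from an array and a counter. */
struct stack {
    int items[STACK_CAPACITY];
    size_t count;
};

/* External view: users only need the operations below. */
void stack_init(struct stack *s)
{
    s->count = 0;
}

bool stack_is_empty(const struct stack *s)
{
    return s->count == 0;
}

void stack_push(struct stack *s, int value)
{
    assert(s->count < STACK_CAPACITY);   /* precondition: not full */
    s->items[s->count++] = value;
}

int stack_pop(struct stack *s)
{
    assert(s->count > 0);                /* precondition: not empty */
    return s->items[--s->count];
}
```

Code that uses only the four functions keeps working if the array is later replaced by, say, a linked list; that independence is the point of the external view.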

# Algorithm Examples

Problem: Searching for a word (target) in a (real!) dictionary

• Linear search:
  • Starting with page 1, proceed page by page until the target is found or the end is reached
  • For each page, search columns from left to right and entries from top to bottom
• Binary search:
  • Repeatedly split the dictionary in half
  • Check the word in the middle
  • If the target comes after the word in the middle (in dictionary order), keep the second half of the pages/words
  • If the target comes before the word in the middle, keep the first half of the pages/words
  • If the target is equal to the word in the middle, return that entry (success)
  • If only one word is left and it is not the target, return failure
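Both searches can be sketched in C if the dictionary is modeled as a sorted array of words. That model, the function names, the return convention (index of the word, or -1 for failure), and the use of `strcmp` order as "dictionary order" are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Linear search: check the entries one by one from the start.
   Returns the index of target, or -1 if it is not present. */
long linear_search(const char *const words[], size_t n, const char *target)
{
    assert(target != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(words[i], target) == 0)
            return (long)i;
    return -1;
}

/* Binary search: words must be sorted in strcmp order.
   Returns the index of target, or -1 if it is not present. */
long binary_search(const char *const words[], size_t n, const char *target)
{
    assert(target != NULL);
    size_t low = 0, high = n;   /* candidates are words[low..high-1] */
    while (low < high) {
        size_t mid = low + (high - low) / 2;
        int cmp = strcmp(target, words[mid]);
        if (cmp == 0)
            return (long)mid;   /* found the target */
        else if (cmp > 0)
            low = mid + 1;      /* keep the second half */
        else
            high = mid;         /* keep the first half */
    }
    return -1;                  /* no candidates left: failure */
}
```

Linear search may have to look at every entry, whereas binary search halves the number of remaining candidates in each step.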

# Algorithm: Concept

An algorithm is a clear set of instructions for how to solve a well-defined problem in finite time.

Requirements:

1. Clear definition of problem and result (well-defined problem)
2. Detailed and precise step-by-step instructions (clear set of instructions)
3. Termination in a finite number of steps (finite time)

# Counterexamples

1. "Let's create world peace!"
→ Not a well-defined problem
2. "Just look it up in the dictionary!"
→ No clear set of instructions
3. Random dictionary search: Open the dictionary at random locations, stop if you find the target word.
→ No guarantee of finite time (may take an unbounded number of steps)

# Difference between Algorithms and Programs

• Algorithms cannot be executed directly;
they have to be implemented as programs in order to be executed.
• The same algorithm can be implemented in many different programming languages,
and in many different ways in the same programming language.
• Programs concentrate on details; algorithms are concepts (ideas).

# Relationship between Data Structures and Algorithms

• Data structures represent state (static aspect of a computation)
• Algorithms represent processing (dynamic aspect of a computation)
• Some algorithms use more than one data structure
• More than one algorithm may use the same data structure

# History of Algorithms

• Land area calculations in ancient Egypt
• Abstraction in ancient Greece (example: Euclid's algorithm for the greatest common divisor)
• Origin of the name algorithm: Persian mathematician Muhammad ibn Mūsā al-Khwārizmī (الخوارزمي, ca. 800 A.D.)
• In the 1930s: Establishing the mathematical foundations of algorithms (Gödel, Turing, ...)
• From the 1950s: Used in practice with computers, dramatic increase in the number of algorithms
• From the 1990s: Increased economic importance
• Very recently: Increasing criticism of some algorithms (social media, ...)
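Euclid's algorithm, mentioned above, computes the greatest common divisor of two numbers and still fits in a few lines today. The sketch below uses the common remainder-based formulation; the function name `gcd` and the choice to exclude the case where both inputs are 0 are assumptions made here.

```c
#include <assert.h>

/* Euclid's algorithm: replace (a, b) by (b, a mod b) until b is 0. */
unsigned long gcd(unsigned long a, unsigned long b)
{
    assert(a != 0 || b != 0);   /* gcd(0, 0) is excluded in this sketch */
    while (b != 0) {
        unsigned long r = a % b;   /* r is always smaller than b */
        a = b;
        b = r;
    }
    return a;
}
```

The problem is well-defined, each step is precise, and because the second number gets strictly smaller in every iteration, the loop ends after a finite number of steps.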

# Homework 1: Huge Amounts of Data

Submission:

• Deadline: September 28 (Wednesday), 18:40
• Place: Box in front of room O-529
• Format: One page, A4 (both sides okay, legible handwriting, name (incl. reading) and student number at top right)

(For each subproblem, give the reasons for your assumptions, and cite references. When citing Wikipedia etc., use IRIs, not URIs, e.g. http://ja.wikipedia.org/wiki/情報, not http://ja.wikipedia.org/wiki/%E6%83%85%E5%A0%B1.)

1. For trades in all sections of the Tokyo Stock Exchange, calculate the total number of data items (counting one trade as one data item) during 2022. Assume that during operating hours, the stock of each company is traded on average once every second.
2. Imagine and explain some kind of data where the number of data items is much higher than in subproblem 1, and where the data may actually be processed on a computer (there will be a deduction if different students submit similar solutions).


# Homework 2: Representation of Algorithms

Examine the algorithm representations in the separate document, and think about each representation's advantages and disadvantages.

(no need to submit)

# Homework 3: Help Ms. Noda

Design an efficient (=fast) algorithm for Ms. Noda's problem.

Hint: Can you use an algorithm that you already know?

(no need to submit)

# Homework 4: Install Ruby

Install Ruby on your notebook computer (and/or on your computer at home)

Main installation methods (choose one):

How to check: Open a `Cygwin Terminal` or start `Command Prompt with Ruby` and execute `ruby -v`

If the Ruby version is printed, your Ruby installation was successful. If you see a message such as "command not found", the installation did not succeed.

Important: If you have problems with installing Ruby, contact me before the next lecture.

Bring your notebook computer with you to the next lecture

# Preparation for the Next Lecture

• Submit Homework 1 (deadline: September 28, 18:40)
• Complete Homework 2 (no need to submit)
• Complete Homework 3 (no need to submit)
• Review today's lecture's content
• Complete Homework 4
• Make sure you can use Ruby on your computer next time

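# Invitation to Lab Work

• Topics:
  • Practice for competitive programming (e.g. AtCoder, ACM ICPC)
  • Contributing to Red Data Tools: data analysis with Ruby
  • Other topics
• Time etc. can be chosen freely
• Participation from home is possible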

# Summary of this Lecture

• Data Structures and Algorithms are core concepts of Computer Science.
• A data structure describes data items with their relationships.
• An algorithm is a clear set of instructions for how to solve a well-defined problem in finite time.
• Algorithms are not programs: Algorithms are abstract ideas, programs can be executed.
• Algorithms are more than 2000 years old, but have gained enormous economic importance recently.

# Glossary

job search

data structure
データ構造

data item
データ項目

abstract data type (ADT)

algorithm
アルゴリズム

linear search

binary search

counterexample

implement/implementation

land area calculations

ancient Egypt

Euclid
ユークリッド

greatest common divisor (GCD)

mathematician

ancient Greece