# Abstract Datatypes and Data Structures: Stacks, Queues, ...

(抽象データ型とデータ構造、スタック、キューなど)

## Data Structures and Algorithms

### 4th lecture, October 8, 2015

http://www.sw.it.aoyama.ac.jp/2015/DA/lecture4.html

### Martin J. Dürst

© 2009-15 Martin J. Dürst, Aoyama Gakuin University

# Today's Schedule

• Summary and homework of last lecture
• Polynomial vs. exponential time
• Finding the (asymptotic) time complexity of an algorithm
• Recurrence relations
• Abstract Data Types
• Stack
• Queue

# 50th Anniversary Celebration

The College of Science and Engineering (理工学部) will celebrate its 50th Anniversary on October 10th (Saturday, Sagamihara Festival).

I strongly recommend attending the following events:

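• Open laboratories (10:00-15:00)
• Commemorative exhibition (10:00-16:30)
• Special lecture ("LEDs Lighting Up the World", 15:10-16:10)
• Department alumni gatherings (Integrated Information Technology: 16:30-17:30)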

# Summary of Last Lecture

The asymptotic growth (order of growth) of a function and the time (and space) complexity of an algorithm can be expressed with the Big-O/Ω/Θ notation:

• O(g(n)): Set of functions with lower or same order of growth as g(n)
• Ω(g(n)): Set of functions with larger or same order of growth as g(n)
• Θ(g(n)): Set of functions with same order of growth as g(n)

$f(n) \in O(g(n)) \Leftrightarrow \exists c > 0: \exists n_0 \ge 0: \forall n \ge n_0: f(n) \le c \cdot g(n)$

The order of growth of a function can be found by:

• Looking for appropriate $c$ and $n_0$
• Calculating $\lim_{n \to \infty} f(n)/g(n)$

When using Big-O notation, always try to simplify g() as much as possible.
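
As a hedged illustration of looking for appropriate constants, the following Ruby sketch tests the inequality from the definition over a finite range of n. The functions `f` and `g` and the constants c = 4 and n0 = 5 are assumptions chosen for this example. A finite check cannot prove the bound; it can only fail to find a counterexample.

```ruby
# Hypothetical example functions: f(n) = 3n^2 + 5n and g(n) = n^2.
def f(n)
  3 * n * n + 5 * n
end

def g(n)
  n * n
end

# Check f(n) <= c * g(n) for every n from n0 up to limit.
def bounded?(c, n0, limit)
  (n0..limit).all? { |n| f(n) <= c * g(n) }
end

puts bounded?(4, 5, 10_000)   # true: 3n^2 + 5n <= 4n^2 whenever n >= 5
puts bounded?(4, 1, 10_000)   # false: the inequality fails for n < 5
puts f(10_000).to_f / g(10_000)   # the ratio f(n)/g(n) gets close to 3
```

The ratio printed last corresponds to the second method: a finite, non-zero limit of f(n)/g(n) suggests that both functions have the same order of growth.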

# Frequent Orders

O(1): Simple formulæ (e.g. interest calculation)

O(log n) (logarithmic order/time): binary search, other "divide and conquer" algorithms

O(n) (linear order, linear time): proportional to size of data, checking all data items once (or a finite number of times)

O(n log n): Sort, other "divide and conquer" algorithms

$O(n^2)$ (quadratic order/time), $O(n^3)$ (cubic order/time): Considering (almost) all combinations of 2 or 3 data items

$O(2^n)$: Considering all subsets of data items

O(n!): Considering all permutations of data items
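
To make the last two orders concrete, here is a small Ruby sketch that enumerates all subsets and all permutations of a list. The method names `subsets` and `permutations` are made up for this illustration; `Array#combination` and `Array#permutation` are standard Ruby methods.

```ruby
# All subsets of items: the combinations of every size from 0 to n.
def subsets(items)
  (0..items.size).flat_map { |k| items.combination(k).to_a }
end

# All orderings of items.
def permutations(items)
  items.permutation.to_a
end

items = %w[a b c]
puts subsets(items).size        # 8 = 2^3
puts permutations(items).size   # 6 = 3!
```

Adding a single item doubles the number of subsets, and adding an item to a list of n items multiplies the number of permutations by n+1, which is why such algorithms are only usable for very small inputs.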

# Polynomial versus Exponential Growth

Example:

$1.1^n \lessgtr n^{20}$

$\log(1.1) \cdot n \lessgtr \log(n) \cdot 20$

$n / \log_{10}(n) \lessgtr 20 / \log_{10}(1.1) \approx 483.2$

$n_0 \approx 1541$

Conclusion: For $a, b > 1$, $a^n$ will always eventually grow faster than $n^b$

($n^b$ is polynomial, $a^n$ is exponential)
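
The crossover point in the example can also be found numerically. The following is a minimal Ruby sketch; the method name `crossover` and the starting value 3 are assumptions of this illustration. It compares logarithms, as in the derivation above, so that no huge numbers have to be computed.

```ruby
# Smallest n >= 3 for which a**n > n**b, tested as n * log(a) > b * log(n).
# Starting at 3 is an assumption of this sketch: n / log(n) is increasing
# from there on, so once the exponential is ahead it stays ahead.
def crossover(a, b)
  n = 3
  n += 1 until n * Math.log(a) > b * Math.log(n)
  n
end

puts crossover(1.1, 20)   # 1541
```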

# The Importance of Polynomial Time

• What can be called a "realistic" time complexity depends on the problem
• In general:
  • Polynomial time is realistic
  • Exponential time is unrealistic

[We will discuss this in more detail in lecture 14]

# Finding the (Asymptotic) Time Complexity of an Algorithm

• Find/define the variables that determine the size of the input (e.g. n)
• Find the basic operations (steps) in the algorithm that are most frequently executed
• Express the total number of basic operations (steps) using summation or a recurrence relation
• Determine the time complexity expressed with big-O notation

Simplifications possible for big-O notation can be applied early.
Example: Because constant factors are irrelevant in big-O notation, they can be eliminated when counting steps.

# How to Define Input Size Variables

• In many cases, the input size is the number of data items (examples: search, sort)
• For matrices, ..., often the number of rows or columns is used (matrices of size n × n or n × m)
• In some cases, the size of individual data items has to be considered
Examples: Size in bits of integers with unlimited precision; length of variable-length strings, ...
• Sometimes, there are two or more kinds of data, with different size
Example: String matching (text size n and pattern size m)

# How to Identify the Most Frequent Basic Operations

• Usually inside a loop (especially inside multiple loops)
• If there are several independent loops, check all of them
• If the number of operations depends on the values in the input, check the worst case
• When methods/functions are called, consider the content of the function

Caution: Some methods/functions may hide complexity (e.g. Ruby `sort`, ...)
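
A hedged example of such hidden complexity: in the sketch below, `duplicates_slow` looks like a single loop, but `Array#include?` scans the array, so the total work grows quadratically in the worst case. The method names are made up for this illustration, and `duplicates_fast` is just one possible alternative.

```ruby
require 'set'

# One visible loop, but Array#include? is itself a linear scan:
# roughly n * n / 2 comparisons in the worst case.
def duplicates_slow(items)
  seen = []
  dups = []
  items.each do |x|
    dups << x if seen.include?(x)   # hidden inner loop
    seen << x
  end
  dups
end

# Same result using a Set, where membership tests take expected constant time.
def duplicates_fast(items)
  seen = Set.new
  dups = []
  items.each do |x|
    dups << x unless seen.add?(x)   # add? returns nil if x was already present
  end
  dups
end

p duplicates_slow([1, 2, 1, 3, 2, 1])   # [1, 2, 1]
p duplicates_fast([1, 2, 1, 3, 2, 1])   # [1, 2, 1]
```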

# Counting Basic Operations using Summation

• Example program:
```c
/* The innermost statement runs once for every pair (i, j) with i <= j < n. */
for (i = 0; i < n; i++)
    for (j = i; j < n; j++)
        sum += i * j;
```
• Most frequent operation: Addition or multiplication in last line (inner loop)
• Expressing the number of operations as a sum:
  $\sum_{i=0}^{n-1} \sum_{j=i}^{n-1} 1$
• Evaluating the sum:
  $\sum_{i=0}^{n-1} \sum_{j=i}^{n-1} 1 = \sum_{i=0}^{n-1} (n-i) = n + (n-1) + (n-2) + \dots + 2 + 1 = n \cdot (n+1) / 2$
• Asymptotic time complexity:
  $n \cdot (n+1) / 2 = 0.5 n^2 + 0.5 n \in O(n^2)$
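
The closed form can be checked empirically. The following Ruby sketch mirrors the loop structure of the example program but counts the executions of the innermost statement instead of computing the sum; the method name `count_inner_steps` is an assumption of this illustration.

```ruby
# Count how often the innermost statement of the double loop is executed.
def count_inner_steps(n)
  steps = 0
  (0...n).each do |i|
    (i...n).each do |_j|
      steps += 1
    end
  end
  steps
end

[1, 10, 100].each do |n|
  puts "n=#{n}: counted #{count_inner_steps(n)}, formula #{n * (n + 1) / 2}"
end
```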

# Counting Basic Operations using Recurrence Relations

• Example program (recursive version of binary search):
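```ruby
# Recursive binary search in a sorted array, between indices low and high (inclusive).
# Returns the index of key, or nil if key is not present.
def binsearch(array, low, high, key)
  middle = (high + low) / 2
  if low == high
    if array[low] == key
      return low
    else
      return nil
    end
  elsif key > array[middle]
    return binsearch(array, middle + 1, high, key)
  else
    return binsearch(array, low, middle, key)
  end
end
```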
• Expressing the number of operations as a recurrence:
B(n) = B(⌈n/2⌉) + 1
B(1) = 1
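
To see where the recurrence comes from, one can count the calls directly. The sketch below follows the same algorithm as the example program, but additionally returns the number of calls; the method name `counting_binsearch` and the extra `calls` parameter are assumptions of this illustration. Searching for the smallest element always continues in the lower half of size ⌈n/2⌉, which is the case the recurrence describes.

```ruby
# Binary search that returns [index_or_nil, number_of_calls].
def counting_binsearch(array, low, high, key, calls = 1)
  middle = (high + low) / 2
  if low == high
    [array[low] == key ? low : nil, calls]
  elsif key > array[middle]
    counting_binsearch(array, middle + 1, high, key, calls + 1)
  else
    counting_binsearch(array, low, middle, key, calls + 1)
  end
end

data = (1..1000).to_a
index, calls = counting_binsearch(data, 0, data.size - 1, 1)
puts "found at index #{index} after #{calls} calls"   # index 0, 11 calls
```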

# Recurrence Relations

• A recurrence (relation) is a recursive definition of a mathematical function
• There are various ways to solve recurrences
• One way to solve a recurrence is to discover a pattern by repeated substitution:
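  $B(n) = B(\lceil n/2 \rceil) + 1 = B(\lceil \lceil n/2 \rceil / 2 \rceil) + 1 + 1 = B(\lceil n/2^2 \rceil) + 2 = B(\lceil n/2^3 \rceil) + 3 = \dots = B(\lceil n/2^k \rceil) + k$
• Using $B(1) = 1$, for the smallest $k$ that reduces the argument to 1:
  $\lceil n/2^k \rceil = 1 \Rightarrow 1 \ge n/2^k \; (> 1/2) \Rightarrow 2^k \ge n \; (> 2^{k-1}) \Rightarrow k \ge \log_2 n \; (> k-1) \Rightarrow k = \lceil \log_2 n \rceil$
• $B(n) = 1 + \lceil \log_2 n \rceil \in O(\log n)$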
• The asymptotic time complexity of binary search is O(log n)
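
The solution of the recurrence can be cross-checked by evaluating the recurrence itself and comparing it with the closed form. This is a minimal Ruby sketch; the method names `b` and `closed_form` are assumptions of this illustration, and `(n - 1).bit_length` is used as an integer-only way to compute ⌈log₂ n⌉ for n ≥ 1.

```ruby
# The recurrence B(n) = B(ceil(n/2)) + 1 with B(1) = 1, evaluated directly.
def b(n)
  n == 1 ? 1 : b((n + 1) / 2) + 1   # (n + 1) / 2 is ceil(n/2) for integers
end

# The closed form 1 + ceil(log2(n)), computed without floating point.
def closed_form(n)
  1 + (n - 1).bit_length
end

puts (1..1000).all? { |n| b(n) == closed_form(n) }   # true
```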

# Comparing the Execution Time of Algorithms

(from previous lectures)

Possible questions:

• How many seconds faster is binary search when compared to linear search?
• How many times faster is binary search when compared to linear search?
• What is the order [of growth of the execution time] of linear search and binary search?
Linear search is O(n), binary search is O(log n).

Conclusion: Expressing time complexity with O() notation lets us evaluate the essence of an algorithm, ignoring hardware and implementation differences.

# Abstract Data Type (ADT)

• Combination of data with functions operating on data
• The data can only be accessed/changed using the functions (encapsulation)
• Goals
  • Data integrity (example: birthday and age)
  • Modularization of big software projects
• Related to type theory
• Often implemented by objects in object-oriented programming languages
  • Type → class
  • Function → member function/method
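
The birthday-and-age example can be sketched as follows. The class name `Person` and its methods are hypothetical. The point of the sketch is that only the birthday is stored and the age is always derived from it through a method, so the two can never become inconsistent.

```ruby
require 'date'

# Hypothetical ADT: the birthday is hidden inside the object,
# and the age is computed on demand instead of being stored.
class Person
  def initialize(birthday)
    @birthday = birthday
  end

  # Age in full years on the given day (defaults to today).
  def age(today = Date.today)
    years = today.year - @birthday.year
    had_birthday = today.month > @birthday.month ||
                   (today.month == @birthday.month && today.day >= @birthday.day)
    had_birthday ? years : years - 1
  end
end

person = Person.new(Date.new(2000, 10, 8))
puts person.age(Date.new(2015, 10, 8))   # 15
puts person.age(Date.new(2015, 10, 7))   # 14
```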

# Typical Examples of Abstract Data Types

• Stack
• Queue
• Linear list
• Dictionary
Caution: A dictionary ADT is not exactly the same as a dictionary book
• Priority queue

# Stack

• Principle: last-in-first-out (LIFO)
• General example: stack of trays in a cafeteria
• Example from IT: function stack (local variables, return address, ...)
• Main methods: `new`, `add`/`push`, `delete`/`pop`
• Other methods:
  • `empty?` (check whether the stack is empty or not)
  • `top` (return the topmost element without removing it from the stack)
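
The methods listed above can be sketched on top of Ruby's built-in `Array`. The class name `ArrayStack` is hypothetical; this is not the code from 4ADTs.rb, only a minimal illustration of the LIFO principle.

```ruby
# Hypothetical minimal stack; the end of the array is the top of the stack.
class ArrayStack
  def initialize
    @items = []
  end

  # Add an item on top; returns the stack itself so that calls can be chained.
  def push(item)
    @items.push(item)
    self
  end

  # Remove and return the topmost item (nil if the stack is empty).
  def pop
    @items.pop
  end

  # Return the topmost item without removing it.
  def top
    @items.last
  end

  def empty?
    @items.empty?
  end
end

stack = ArrayStack.new
stack.push(1).push(2).push(3)
puts stack.pop   # 3: the item added last leaves first
puts stack.top   # 2
```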

# Axioms for Stacks

It is possible to define a stack using the following four axioms:

1. Stack.new.empty? ↔ true
2. s.push(e).empty? ↔ false
3. s.push(e).top ↔ e
4. s.push(e).pop ↔ s (here, pop returns the new stack, not the top element)

(s is any arbitrary stack, e is any arbitrary data item)

Axioms can define a contract between implementation and users
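
The four axioms can be turned into an executable check. Because axiom 4 treats `pop` as returning the remaining stack, the sketch below uses an immutable stack in which `push` and `pop` both return stacks. The class name `PersistentStack`, its `rest` attribute, and the helper `axioms_hold?` are assumptions of this illustration; the behavior of `pop` on an empty stack is not covered by the axioms and is left open here.

```ruby
# Hypothetical immutable stack: every stack is either empty
# or consists of a top element and the rest of the stack below it.
class PersistentStack
  attr_reader :top, :rest

  def initialize(top = nil, rest = nil)
    @top = top
    @rest = rest
  end

  def empty?
    @rest.nil?
  end

  def push(e)
    PersistentStack.new(e, self)
  end

  # Returns the stack without its top element (nil for an empty stack).
  def pop
    @rest
  end

  def ==(other)
    other.is_a?(PersistentStack) && top == other.top && rest == other.rest
  end
end

# Check the four axioms for one stack s and one data item e.
def axioms_hold?(s, e)
  PersistentStack.new.empty? == true &&
    s.push(e).empty? == false &&
    s.push(e).top == e &&
    s.push(e).pop == s
end

puts axioms_hold?(PersistentStack.new, 42)                  # true
puts axioms_hold?(PersistentStack.new.push(1).push(2), 3)   # true
```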

# Queue

• Principle: first-in-first-out (FIFO)
• General example: queue in a cafeteria waiting for food
• Example from IT: queue of processes waiting for execution
• Main methods: `add`/`enqueue`, `remove`/`delete`/`dequeue`

Explain the meaning of GIGO: Garbage in, garbage out.
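
As an illustration of the FIFO principle, here is a minimal queue built on a singly linked list that keeps references to both ends, so that adding and removing each take a constant number of steps. The names `LinkedQueue` and `Node` are hypothetical; this is a sketch, not the implementation used in the course.

```ruby
# Hypothetical queue on a singly linked list with head and tail references.
class LinkedQueue
  Node = Struct.new(:value, :next_node)

  def initialize
    @head = nil   # the item that leaves next
    @tail = nil   # the item added most recently
  end

  def empty?
    @head.nil?
  end

  # Add a value at the tail; returns the queue so that calls can be chained.
  def enqueue(value)
    node = Node.new(value, nil)
    if empty?
      @head = node
    else
      @tail.next_node = node
    end
    @tail = node
    self
  end

  # Remove and return the value at the head (nil if the queue is empty).
  def dequeue
    return nil if empty?
    value = @head.value
    @head = @head.next_node
    @tail = nil if @head.nil?
    value
  end
end

queue = LinkedQueue.new
queue.enqueue(:a).enqueue(:b).enqueue(:c)
p queue.dequeue   # :a, the item added first leaves first
```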

# Comparing ADTs

Implementation: 4ADTs.rb

| Operation | Stack (`Array`) | Stack (`LinearList`) | Queue (`Array`) | Queue (`LinearList`) |
|---|---|---|---|---|
| create | O(n) | O(1) | O(n) | O(1) |
| add | O(1) | O(1) | O(1) or O(n)* | O(1) |
| delete | O(1) | O(1) | O(n)* or O(1) | O(1) |
| `empty?` | O(1) | O(1) | O(1) | O(1) |
| length | O(1) | O(n) | O(1) | O(n) |

*) Can be improved to O(1) by using a ring buffer
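
The ring buffer idea can be sketched as follows: the items stay in place inside a fixed-size array, and only the index of the oldest item and the item count change, so neither adding nor removing has to move other items. The class name `RingBufferQueue` and the choice of a fixed capacity are assumptions of this illustration.

```ruby
# Hypothetical fixed-capacity queue using an array as a ring buffer.
class RingBufferQueue
  attr_reader :length

  def initialize(capacity)
    @buffer = Array.new(capacity)
    @head = 0      # index of the oldest item
    @length = 0    # number of items currently stored
  end

  def empty?
    @length.zero?
  end

  def full?
    @length == @buffer.size
  end

  # Store the item in the first free slot after the newest item, wrapping around.
  def enqueue(item)
    raise "queue is full" if full?
    @buffer[(@head + @length) % @buffer.size] = item
    @length += 1
    self
  end

  # Remove and return the oldest item (nil if the queue is empty).
  def dequeue
    return nil if empty?
    item = @buffer[@head]
    @buffer[@head] = nil
    @head = (@head + 1) % @buffer.size
    @length -= 1
    item
  end
end

queue = RingBufferQueue.new(3)
queue.enqueue(1).enqueue(2).enqueue(3)
puts queue.dequeue   # 1
queue.enqueue(4)     # reuses the slot freed by the dequeue
puts queue.length    # 3
```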

# Summary

• The order (of growth)/(asymptotic) time complexity of an algorithm can be calculated from the number of the most frequent basic operations
• Calculation can use a summation or a recurrence (relation)
• Big-O notation compactly expresses the inherent efficiency of an algorithm
• An abstract data type (ADT) combines data and the operations on this data
• Stack and queue are typical examples of ADTs

# Homework

(no need to submit)

1. Order the following orders of growth, and explain the reason for your order:

$O(n^2)$, $O(n!)$, $O(n \log \log n)$, $O(n \log n)$, $O(20^n)$

2. Write a simple program that uses the classes in 4ADTs.rb.
Use this program to compare the implementations.
Hint: Use the second part of 2search.rb as an example.
3. Implement the priority queue ADT (Ruby or any other programming language is okay)

A priority queue keeps a priority (e.g. integer) for each data item.
In the simplest case, the only data is the priority.
The items with the highest priority leave the queue first.
Implementation can use an array or a linked list or any other data structure.

# Glossary

• polynomial growth
• exponential growth
• integers with unlimited precision
• recurrence (relation)
• substitution
• abstract data type
• encapsulation: カプセル化
• data integrity: データの完全性
• modularization: モジュール化
• type theory
• object-oriented: オブジェクト指向 (adjective)
• type
• class: クラス
• member function: メンバ関数
• method: メソッド
• stack: スタック
• cafeteria
• axiom
• queue
• ring buffer: リングバッファ
• priority queue