# Heaps

(ヒープ)

## Data Structures and Algorithms

### 5th lecture, October 31, 2019

http://www.sw.it.aoyama.ac.jp/2019/DA/lecture5.html

### Martin J. Dürst

© 2009-19 Martin J. Dürst, Aoyama Gakuin University (青山学院大学)

# Today's Schedule

• Summary of last lecture, homework
• Priority queue as an ADT
• Efficient implementation of priority queue
• Complete binary tree
• Heap
• Heap sort
• How to use `irb`

# Summary of Last Lecture

• The order (of growth)/(asymptotic) time complexity of an algorithm can be calculated from the number of the most frequent basic operations
• Calculation can use a summation or a recurrence (relation)
• Big-O notation compactly expresses the inherent efficiency of an algorithm
• An abstract data type (ADT) combines data and the operations on this data
• Stack and queue are typical examples of ADTs
• Most ADTs can be implemented in different ways
• Depending on implementation, the time complexity of each operation of an ADT can change

# Last Week's Homework 1

Order the following orders of growth, and explain the reason for your order:

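O(n²), O(n!), O(n log log n), O(n log n), O(20ⁿ)

Solution: O(n log log n) ⊂ O(n log n) ⊂ O(n²) ⊂ O(20ⁿ) ⊂ O(n!)

For each pair, f(n) ≤ c·g(n) holds for all n ≥ n₀:

| f(n) | g(n) | n₀ (example solution) | c (example solution) |
|---|---|---|---|
| n log log n | n log n | 2 | 1 |
| n log n | n² | 2 | 1 |
| n² | 20ⁿ | 1 | 1 |
| 20ⁿ | n! | 52 | 1 |
| 20ⁿ | n! | 20 | 20²⁰/20! (=43099804) |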
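
The example constants for the pair 20ⁿ and n! can be checked numerically. The following Ruby sketch is an illustration only: the helper name `factorial` and the variable names are introduced here and are not part of the lecture files. It relies on Ruby's arbitrary-precision integers and exact `Rational` numbers.

```ruby
# Hypothetical helper (not from the lecture files): n! as an exact integer.
def factorial(n)
  (1..n).reduce(1, :*)
end

# Smallest n with 20**n <= 1 * n!, i.e. a candidate n0 for c = 1.
n0_for_c1 = (1..100).find { |n| 20**n <= factorial(n) }

# c chosen so that 20**n <= c * n! holds with equality at n = 20.
c = Rational(20**20, factorial(20))

# Check the bound 20**n <= c * n! on a sample range starting at n0 = 20.
holds_from_20 = (20..100).all? { |n| 20**n <= c * factorial(n) }

puts "n0 for c = 1: #{n0_for_c1}"
puts "c for n0 = 20 (integer part): #{c.floor}"
puts "bound holds for n = 20..100: #{holds_from_20}"
```

A finite check is not a proof, but the step from n to n+1 multiplies the left side by 20 and the right side by n+1, so once the bound holds for some n ≥ 20 it keeps holding.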

# Last Week's Homework 2

Write a simple program that uses the classes in 4ADTs.rb.
Use this program to compare the implementations.
Hint: Use the second part of 2search.rb as an example.

# Last Week's Homework 3

Implement the priority queue ADT (Ruby or any other programming language is okay)

A priority queue keeps a priority for each data item.
In the simplest case, the only data is the priority.
The items with the highest priority leave the queue first.
Your implementation can use an array or a linked list or any other data structure.

Example solution: 5prioQ.rb

# Priority Queue

Example from IT:
Queue for process management, ...
Operations:
• Creation: new, init
• Check for emptiness: empty?
• Insert an item (with its priority): insert
• Return and remove item with highest priority: getNext/delMax/dequeue...
• Return item with highest priority (without removal): peekAtNext/findMax/...

# Simple Implementations

| Implementation | Ordered array or linked list | Unordered array or linked list |
|---|---|---|
| `insert` | O(n) | O(1) |
| `getNext`/`findMax` | O(1) | O(n) |

Time complexity for each operation differs for different implementations.

But there is always an operation that needs O(n).
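
To make the trade-off concrete, here is a minimal Ruby sketch of both simple implementations. It is not the example solution 5prioQ.rb; the class names `UnorderedPQ` and `OrderedPQ` and the Ruby-style method names `get_next` and `find_max` are assumptions made for this illustration. As in the simplest case described earlier, each item is just its priority.

```ruby
# Unordered array: insert is O(1), finding the maximum is O(n).
class UnorderedPQ
  def initialize
    @items = []
  end

  def empty?
    @items.empty?
  end

  def insert(priority)
    @items.push(priority)            # append at the end, no ordering work
    self
  end

  def find_max
    @items.max                       # linear scan
  end

  def get_next
    return nil if empty?
    @items.delete_at(@items.index(@items.max))   # linear scan, then removal
  end
end

# Ordered array (ascending): insert is O(n), the maximum is always last.
class OrderedPQ
  def initialize
    @items = []
  end

  def empty?
    @items.empty?
  end

  def insert(priority)
    # Find the first larger item and insert before it (linear scan and shift).
    index = @items.index { |x| x > priority } || @items.size
    @items.insert(index, priority)
    self
  end

  def find_max
    @items.last                      # O(1)
  end

  def get_next
    @items.pop                       # O(1), returns nil if empty
  end
end
```

Both classes return items in the same order; only the distribution of the work between `insert` and `get_next` differs.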

Is it possible to improve this?

# Heap

A heap is a binary tree where each parent always has higher priority than its children

The root always has the highest priority

# Complete Binary Tree

Definition based on tree structure:

• Almost all internal nodes (all except possibly one) have 2 children
• All tree layers except the lowermost are full
• The lowermost tree layer is filled from the left

# Heap

A heap is (full definition):

• A complete binary tree where
• Each parent always has higher priority than its children (or equal priority, if ties occur)

The root always has the highest priority

We need the following operations for implementing a heap:

• Addition and removal of data items
• Restoration of invariants

# Invariant

• A condition that is always maintained in a data structure or an algorithm (especially in a loop)
• Very important for data structures
• Can be used in proofs (properties of data structures, correctness of algorithms, ...)
• After an operation on (change to) a data structure, it may be necessary to restore invariants
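
As a small illustration of a loop invariant, the following Ruby sketch finds the largest item of an array and checks its invariant explicitly on every iteration. The function name `find_max_with_invariant` is made up for this example, and the explicit check is only there to show the idea (it makes the loop slower).

```ruby
# Returns the largest item of a non-empty array.
# Loop invariant: before the iteration with index i,
# max_so_far equals the maximum of items[0...i].
def find_max_with_invariant(items)
  max_so_far = items[0]
  (1...items.size).each do |i|
    # Explicit invariant check, for illustration only.
    raise 'invariant violated' unless max_so_far == items[0...i].max
    max_so_far = items[i] if items[i] > max_so_far
  end
  max_so_far
end

puts find_max_with_invariant([3, 7, 5, 1])
```

When the loop ends, the invariant covers the whole array, which is exactly the correctness claim for the function.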

# Implementing a Complete Binary Tree with an Array

• Implementing a tree by allocating individual nodes and connecting them with pointers is complicated
• Compared to this, operations on an array are simple
• A complete binary tree can be implemented with an array as follows:
(this is also how Knuth defines a complete binary tree)

Give each node in the complete binary tree with n nodes a number so that:

• Number 0 stays unused
• Each node has a number between 1 and n (inclusive)
• The root has number 1
• The number of the parent of node i is ⌊i/2⌋ (i>1)
• The numbers of the children of node i are 2i and 2i+1 (as far as these numbers do not exceed n)
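
The numbering scheme translates directly into index arithmetic. The Ruby sketch below uses helper names (`parent_index`, `child_indices`, `heap?`) that are assumptions for this illustration; it also shows how the heap condition can be checked on an array whose slot 0 is unused.

```ruby
# Index arithmetic for a complete binary tree stored in an array
# whose slot 0 is unused (nodes are numbered 1..n).
def parent_index(i)
  i / 2                                     # integer division is floor(i/2)
end

def child_indices(i, n)
  [2 * i, 2 * i + 1].select { |c| c <= n }  # keep only children that exist
end

# Checks the heap condition: no node has higher priority than its parent.
def heap?(array)
  n = array.size - 1                        # slot 0 is unused
  (2..n).all? { |i| array[parent_index(i)] >= array[i] }
end

p parent_index(5)
p child_indices(2, 5)
p heap?([nil, 7, 3, 5])
```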

# Restoring Heap Invariants

If the priority at a given node is too high: Use `heapify_up`

• Compare priority with parent
• If parent priority is lower, exchange with parent
• Continue until parent priority is higher or the root is reached

If the priority at a given node is too low: Use `heapify_down`

• Compare priority with both children
• If necessary, exchange with the child that has higher priority
• Continue at exchanged child until exchange becomes unnecessary

Implementation: 5heap.rb
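
For readers without access to that file, the two procedures can be sketched as standalone Ruby functions. This is a minimal sketch written for this text, not the contents of 5heap.rb; it assumes the array layout with slot 0 unused, items that are plain comparable priorities, and an optional `size` parameter introduced here for illustration.

```ruby
# The heap occupies array[1..size]; array[0] is unused.

# Moves the item at index i up while it has higher priority than its parent.
def heapify_up(array, i)
  while i > 1 && array[i / 2] < array[i]
    array[i / 2], array[i] = array[i], array[i / 2]
    i /= 2
  end
  array
end

# Moves the item at index i down while one of its children has higher priority.
def heapify_down(array, i, size = array.size - 1)
  loop do
    largest = i
    [2 * i, 2 * i + 1].each do |child|
      largest = child if child <= size && array[child] > array[largest]
    end
    break if largest == i                   # no exchange necessary
    array[i], array[largest] = array[largest], array[i]
    i = largest                             # continue at the exchanged child
  end
  array
end

p heapify_up([nil, 5, 3, 7], 3)
p heapify_down([nil, 1, 7, 5, 3], 1)
```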

# Implementing a Priority Queue with a Heap

• Insertion of a new element (`insert`):
   1. Insert the new element at the end of the heap (next empty place in lowermost tree layer, or new layer if necessary)
   2. Restore heap invariants for newly inserted element using `heapify_up`
• Removal of element with highest priority (`getNext`):
   1. Remove the root element and store it separately
   2. Move the last element of the heap (rightmost element in lowermost layer) to the root
   3. Restore heap invariants for root element using `heapify_down`
   4. Return the original root element
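
The steps above can be put together in a small class. The following is a hedged sketch, not the lecture's `Heap` class: the class name `SketchHeap` and the Ruby-style method names `get_next` and `find_max` are assumptions, and items are plain comparable priorities.

```ruby
class SketchHeap
  def initialize
    @array = [nil]        # slot 0 stays unused
    @size = 0
  end

  def empty?
    @size.zero?
  end

  def find_max
    @array[1]             # the root; nil if the heap is empty
  end

  def insert(item)
    @size += 1
    @array[@size] = item  # next free place in the lowermost layer
    heapify_up(@size)
    self
  end

  def get_next
    return nil if empty?
    root = @array[1]      # store the root separately
    last = @array.pop     # rightmost item of the lowermost layer
    @size -= 1
    unless empty?
      @array[1] = last    # move the last item to the root
      heapify_down(1)
    end
    root
  end

  private

  def swap(i, j)
    @array[i], @array[j] = @array[j], @array[i]
  end

  def heapify_up(i)
    while i > 1 && @array[i / 2] < @array[i]
      swap(i, i / 2)
      i /= 2
    end
  end

  def heapify_down(i)
    loop do
      largest = i
      [2 * i, 2 * i + 1].each do |child|
        largest = child if child <= @size && @array[child] > @array[largest]
      end
      break if largest == i
      swap(i, largest)
      i = largest
    end
  end
end

h = SketchHeap.new
[3, 5, 7, 1, 9].each { |x| h.insert(x) }
puts h.get_next until h.empty?
```

Keeping `heapify_up` and `heapify_down` private means callers can only change the heap through operations that restore the invariants.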

# Time Complexity of Heap Operations

| Implementation | `Heap` (implemented as an `Array`) |
|---|---|
| `insert` | O(log n) |
| `findMax` | O(1) |
| `getNext` | O(log n) |
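
The O(log n) entries follow from the height of a complete binary tree: an item moves along at most one path between the root and the lowermost layer. The Ruby sketch below counts the parent links on that path, assuming the numbering where the parent of node i is ⌊i/2⌋; the function name `steps_to_root` is made up for this illustration.

```ruby
# Number of parent links between node i and the root (node 1).
def steps_to_root(i)
  steps = 0
  while i > 1
    i /= 2              # go to the parent
    steps += 1
  end
  steps
end

[1, 2, 3, 7, 8, 1000, 1_000_000].each do |n|
  puts "n = #{n}: at most #{steps_to_root(n)} exchanges along one path"
end
```

For a heap with n items, this count is ⌊log₂ n⌋, so even a million items need only about twenty exchanges per operation.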

# Heap Sort

• Use priority queue to sort by (decreasing) priority
   1. Create a heap from all the items to be sorted
   2. Remove items from heap one-by-one: They will be ordered by (decreasing) priority
• Implementation optimization:
   Use space at the end of the array to store removed items
   ⇒ The items will end up in the array in increasing order
• Time complexity: O(n log n)
   • Addition and removal of items is O(log n) for each item
   • To sort n items, the total complexity is O(n log n)
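
A possible in-place version of this idea is sketched below. It is an illustration under stated assumptions, not the lecture's implementation: it sorts an ordinary 0-based Ruby array, so the children of index i are 2i+1 and 2i+2 instead of 2i and 2i+1, and the names `sift_down` and `heap_sort!` are chosen here.

```ruby
# Restores the heap condition below index i within array[0...size]
# (0-based layout: the children of i are 2i+1 and 2i+2).
def sift_down(array, i, size)
  loop do
    largest = i
    [2 * i + 1, 2 * i + 2].each do |child|
      largest = child if child < size && array[child] > array[largest]
    end
    break if largest == i
    array[i], array[largest] = array[largest], array[i]
    i = largest
  end
end

# Sorts the array in place into increasing order and returns it.
def heap_sort!(array)
  n = array.size
  # 1. Create a heap: fix every node that has at least one child, bottom up.
  (n / 2 - 1).downto(0) { |i| sift_down(array, i, n) }
  # 2. Move the current maximum to the end of the shrinking heap, n-1 times.
  (n - 1).downto(1) do |last|
    array[0], array[last] = array[last], array[0]
    sift_down(array, 0, last)
  end
  array
end

p heap_sort!([5, 3, 8, 1, 9, 2])
```

The removed maxima fill the array from the back, which is why the result is in increasing order even though items leave the heap in decreasing order.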

# How to use `irb`

`irb`: Interactive Ruby, a 'command prompt' for Ruby

Example usage:

```
C:\Algorithms>irb
irb(main):001:0> load '5heap.rb'
=> true
irb(main):002:0> h = Heap.new
=> #<Heap:0x2833d60 @array=[nil], @size=0>
irb(main):003:0> h.insert 3
=> #<Heap:0x2833d60 @array=[nil, 3], @size=1>
irb(main):004:0> h.insert_many 5, 7
=> #<Heap:0x2833d60 @array=[nil, 7, 3, 5], @size=3>
...
```

Alternative to `load '5heap.rb'`: `irb -r./5heap`

# Other Kinds of Heaps

• Priority queues can be used as components in many different algorithms
• Often, two priority queues need to be joined
• With the 'usual' heap, joining is O(n)
• With a binomial queue, joining is O(log n)
• With a Fibonacci heap, joining can be improved to O(1)

# Ideas to Improve Implementation of Priority Queue

• Started with two simple implementations:
completely ordered, completely unordered
• New idea: Combining both implementations/finding a balance between the two implementations
• Not completely ordered, but also not completely unordered
→ Partially ordered, just to the extent necessary to find highest priority item

# Conceptual Layers

• Application: Heap sort
• Conceptual data structure: Heap
• Actual data structure: Complete binary tree
• Internal implementation: Array

Caution: Implementation of heap sort directly uses the array. This is a layer violation.

# Summary

• A priority queue is an important ADT
• Implementing a priority queue with a simple (ordered or unordered) array or linked list is not efficient: one operation always needs O(n)
• In a heap, each parent has higher priority than its children
• In a heap, the highest priority item is at the root of a complete binary tree
• A heap is an efficient implementation of a priority queue
• Many data structures are defined using invariants
• A heap can be used for sorting, using heap sort

# Homework

1. Cut the sorting cards, and bring them with you to the next lecture
2. Shuffle the sorting cards, and try to find a fast way to sort them. Play against others (who is fastest?).
3. Find five different applications of sorting (no need to submit)
4. Implement joining two (normal) heaps (no need to submit)
5. Think about the time complexity of creating a heap:
`heapify_down` will be called n/2 times and may take up to O(log n) each time.
Therefore, one guess for the overall time complexity is O(n log n).
However, this upper bound can be improved by careful analysis.
(no need to submit)

# Glossary

• priority queue
• complete binary tree
• heap (ヒープ)
• internal node
• restoration
• invariant
• sort
• decreasing (order)
• increasing (order)
• join
• binomial queue/heap (2項キュー、2項ヒープ)
• distribution
• layer violation