(Heap)

http://www.sw.it.aoyama.ac.jp/2017/DA/lecture5.html

© 2009-17 Martin J. Dürst 青山学院大学

- Summary of last lecture, homework
- Priority queue as an ADT
- Efficient implementation of priority queue
- Complete binary tree
- Heap
- Heap sort
- How to use `irb`

- The order (of growth)/(asymptotic) time complexity of an algorithm can be calculated from the number of the most frequent basic operations
- Calculation can use a summation or a recurrence (relation)
- The big-`O` notation compactly expresses the inherent efficiency of an algorithm
- An *abstract data type* (ADT) combines data and the operations on this data
- *Stack* and *queue* are typical examples of ADTs
- Each ADT can be implemented in different ways
- Depending on implementation, the time complexity of each operation of an ADT can change

- Example from IT:
  - Queue for process management, ...
- Operations:
  - Creation: new, init
  - Check for emptiness: empty?
  - Insert additional item: add, ...
  - Return and remove item with highest priority: getNext/delMax/...
  - Return item with highest priority (without removal): findMax/peekAtNext/...

| Implementation | Array or linked list (ordered) | Array or linked list (unordered) |
|---|---|---|
| `insert` | O(n) | O(1) |
| `getNext`/`findMax` | O(1) | O(n) |

Time complexity for each operation differs for different implementations.

But there is always an operation that needs `O`(`n`).
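The trade-off in the table can be sketched with two small Ruby classes. This is only an illustration, not code from the lecture: the class names `UnorderedPQ` and `OrderedPQ` and the snake_case method names are assumptions, and larger numbers are assumed to mean higher priority.

```ruby
# Hypothetical sketch; larger numbers mean higher priority.

# Unordered array: insert is O(1), finding the maximum is O(n).
class UnorderedPQ
  def initialize
    @items = []
  end

  def empty?
    @items.empty?
  end

  def insert(item)
    @items.push(item)                  # O(1): append at the end
    self
  end

  def find_max
    @items.max                         # O(n): looks at every item
  end

  def get_next
    return nil if empty?
    index = @items.index(@items.max)   # O(n) scan for the maximum
    @items.delete_at(index)            # removes and returns that item
  end
end

# Ordered array (increasing order): insert is O(n), the maximum is at the end.
class OrderedPQ
  def initialize
    @items = []
  end

  def empty?
    @items.empty?
  end

  def insert(item)
    # First position whose element is not smaller than item,
    # or the end of the array if there is no such element.
    index = @items.bsearch_index { |x| x >= item } || @items.size
    @items.insert(index, item)         # O(n): later elements are shifted
    self
  end

  def find_max
    @items.last                        # O(1)
  end

  def get_next
    @items.pop                         # O(1); nil if empty
  end
end
```

In both classes one of the two operations has to touch up to `n` items, which is the `O`(`n`) cost noted above.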

Is it possible to improve this?

A heap is a binary tree where each parent always has higher priority than its children

⇒ The root always has the highest priority

Definition based on tree structure:

- All internal nodes (except possibly one) have 2 children
- All tree layers except the lowermost are full
- The lowermost tree layer is filled from the left

A heap is (full definition):

- A complete binary tree where
- Each parent always has higher priority than its children

⇒ The root always has the highest priority

We need the following operations for implementing a heap:

- Addition and removal of data items
- Restoration of invariants

- A condition that is always maintained in a data structure or algorithm (especially loop)
- Very important for data structures
- Can be used in proofs (properties of data structures, correctness of algorithms, ...)
- After an operation on (change to) a data structure, it may be necessary to restore invariants
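As a small illustration of checking and restoring an invariant, consider the ordered array from the comparison above, whose invariant is "the items are in increasing order". The helper names `invariant_holds?` and `restore_invariant` are assumptions made for this sketch.

```ruby
# Invariant of the ordered-array priority queue:
# the items are in increasing order.
def invariant_holds?(items)
  items.each_cons(2).all? { |a, b| a <= b }
end

# Restore the invariant after one item was appended at the end:
# move the last item to the left until its left neighbour is not larger.
def restore_invariant(items)
  i = items.size - 1
  while i > 0 && items[i - 1] > items[i]
    items[i - 1], items[i] = items[i], items[i - 1]
    i -= 1
  end
  items
end

items = [1, 4, 9]
invariant_holds?(items)    # => true
items.push(5)              # this change breaks the invariant
invariant_holds?(items)    # => false
restore_invariant(items)   # => [1, 4, 5, 9]
invariant_holds?(items)    # => true
```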

- Implementing a tree by allocating individual nodes and connecting them with pointers is complicated
- Compared to this, operations on an array are simple
- A complete binary tree can be implemented with an array as follows:

Give each node in the complete binary tree with `n` nodes a number so that (this is also how Knuth defines a complete binary tree):

- Number 0 stays unused
- Each node has a number between 1 and `n` (inclusive)
- The root has number 1
- The number of the parent of node `i` is ⌊`i`/2⌋ (`i` > 1)
- The numbers of the children of node `i` are 2`i` and 2`i`+1
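The numbering rules translate directly into index arithmetic on a Ruby `Array`. The helper names `parent`, `left_child` and `right_child` are assumptions for this sketch, as is the example tree.

```ruby
# Index helpers for a complete binary tree stored in an array
# (index 0 unused, root at index 1).
def parent(i)
  i / 2            # integer division gives the floor of i/2
end

def left_child(i)
  2 * i
end

def right_child(i)
  2 * i + 1
end

# Example tree with n = 6 nodes, stored layer by layer:
#         'a'
#       /     \
#     'b'     'c'
#    /  \     /
#  'd'  'e' 'f'
tree = [nil, 'a', 'b', 'c', 'd', 'e', 'f']
n = tree.size - 1

tree[parent(5)]        # => "b"
tree[left_child(3)]    # => "f"
right_child(3) <= n    # => false: node 3 has no right child
```

A child index greater than `n` means that the child does not exist, so no pointers or special markers are needed.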

If the priority at a given node is too high: Use `heapify_up`

- Compare priority with parent
- If parent priority is lower, exchange with parent
- Continue until parent priority is higher (or the root is reached)

If the priority at a given node is too low: Use `heapify_down`

- Compare priority with both children
- If necessary, exchange with child with higher priority
- Continue at exchanged child until exchange becomes unnecessary
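The two procedures might look roughly as follows in Ruby. This is a sketch written for these notes, not the contents of `5heap.rb`: it assumes the array numbering described above (index 0 unused), that larger values mean higher priority, and that the heap occupies `array[1..size]`.

```ruby
# Move the item at index i up while its parent has lower priority.
def heapify_up(array, i)
  while i > 1 && array[i / 2] < array[i]
    array[i / 2], array[i] = array[i], array[i / 2]
    i /= 2
  end
end

# Move the item at index i down while a child has higher priority.
def heapify_down(array, size, i)
  loop do
    largest = i
    left = 2 * i
    right = 2 * i + 1
    largest = left if left <= size && array[left] > array[largest]
    largest = right if right <= size && array[right] > array[largest]
    break if largest == i              # no exchange necessary
    array[i], array[largest] = array[largest], array[i]
    i = largest                        # continue at the exchanged child
  end
end

a = [nil, 7, 3, 5, 9]   # 9 at index 4 has too high a priority
heapify_up(a, 4)
a                       # => [nil, 9, 7, 5, 3]

b = [nil, 1, 8, 6, 4]   # 1 at the root has too low a priority
heapify_down(b, 4, 1)
b                       # => [nil, 8, 4, 6, 1]
```

Both loops follow a single path between the root and a leaf, so each takes at most about log `n` steps.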

Implementation: 5heap.rb

- Insertion of a new element (`insert`):
  - Insert the new element at the end of the heap (next empty place in lowermost tree layer, or new layer if necessary)
  - Restore heap invariants for newly inserted element using `heapify_up`
- Removal of element with highest priority (`getNext`):
  - Remove the root element and store it separately
  - Move the last element of the heap (rightmost element in lowermost layer) to the root
  - Restore heap invariants for root element using `heapify_down`
  - Return the original root element
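The steps above can be put together in a small class. The following is a minimal sketch under stated assumptions, not the lecture's `5heap.rb`: the class name `SketchHeap` and the snake_case method names `find_max` and `get_next` are chosen for illustration, and larger values are assumed to mean higher priority.

```ruby
class SketchHeap
  def initialize
    @array = [nil]   # index 0 stays unused
    @size = 0
  end

  def empty?
    @size == 0
  end

  def find_max
    @array[1]        # the root; nil if the heap is empty
  end

  def insert(item)
    @size += 1
    @array[@size] = item       # next empty place in the lowermost layer
    heapify_up(@size)
    self
  end

  def get_next
    return nil if empty?
    top = @array[1]            # store the root separately
    @array[1] = @array[@size]  # move the last element to the root
    @array.pop
    @size -= 1
    heapify_down(1)
    top                        # return the original root
  end

  private

  def heapify_up(i)
    while i > 1 && @array[i / 2] < @array[i]
      @array[i / 2], @array[i] = @array[i], @array[i / 2]
      i /= 2
    end
  end

  def heapify_down(i)
    loop do
      largest = i
      left = 2 * i
      right = 2 * i + 1
      largest = left if left <= @size && @array[left] > @array[largest]
      largest = right if right <= @size && @array[right] > @array[largest]
      break if largest == i
      @array[i], @array[largest] = @array[largest], @array[i]
      i = largest
    end
  end
end

h = SketchHeap.new
h.insert(3).insert(5).insert(7)
h.find_max   # => 7
h.get_next   # => 7
h.get_next   # => 5
```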

| Implementation | `Heap` (implemented as an `Array`) |
|---|---|
| `insert` | O(log n) |
| `findMax` | O(1) |
| `getNext` | O(log n) |

- Use priority queue to sort by (decreasing) priority
  - Create a heap from all the items to be sorted
  - Remove items from heap one-by-one: They will be ordered by (decreasing) priority
- Implementation optimization: Use space at the end of the array to store removed items

  ⇒ The items will end up in the array in increasing order
- Time complexity: `O`(`n` log `n`)
  - Addition and removal of items is `O`(log `n`) for each item
  - To sort `n` items, the total complexity is `O`(`n` log `n`)
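Heap sort with removed items stored at the end of the array might be sketched as follows. The function name `heap_sort` is an assumption. The sketch copies the input into an array with index 0 unused, and it builds the initial heap by calling `heapify_down` on the first `n`/2 nodes rather than by repeated insertion.

```ruby
# Move the item at index i down within array[1..size]
# while a child has higher priority (a larger value).
def heapify_down(array, size, i)
  loop do
    largest = i
    left = 2 * i
    right = 2 * i + 1
    largest = left if left <= size && array[left] > array[largest]
    largest = right if right <= size && array[right] > array[largest]
    break if largest == i
    array[i], array[largest] = array[largest], array[i]
    i = largest
  end
end

# Returns a new array with the items in increasing order.
def heap_sort(items)
  array = [nil] + items      # index 0 stays unused
  size = items.size

  # Create a heap from all the items (leaves are already heaps).
  (size / 2).downto(1) { |i| heapify_down(array, size, i) }

  # Repeatedly remove the root and store it in the space
  # that becomes free at the end of the shrinking heap.
  size.downto(2) do |last|
    array[1], array[last] = array[last], array[1]
    heapify_down(array, last - 1, 1)
  end

  array.drop(1)              # drop the unused index 0
end

heap_sort([5, 3, 8, 1, 9, 2])   # => [1, 2, 3, 5, 8, 9]
```

Because the largest remaining item always goes to the last free position, the array ends up in increasing order although the heap yields items in decreasing order.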

`irb`: Interactive Ruby, a 'command prompt' for Ruby

Example usage:

    C:\Algorithms>irb
    irb(main):001:0> load './5heap'
    => true
    irb(main):002:0> h = Heap.new
    => #<Heap:0x2833d60 @array=[nil], @size=0>
    irb(main):003:0> h.insert 3
    => #<Heap:0x2833d60 @array=[nil, 3], @size=1>
    irb(main):004:0> h.insert_many [5, 7]
    => #<Heap:0x2833d60 @array=[nil, 7, 3, 5], @size=3>
    ...

- Priority queues can be used as components in many different algorithms
- Often, two priority queues need to be joined
- With the 'usual' heap, joining is `O`(`n`)
- With a *binomial queue*, joining is `O`(log `n`)
- With a *Fibonacci heap*, joining can be improved to `O`(1)

- Started with two simple implementations
- Advantages and disadvantages for each implementation
- New idea: Combining both implementations/finding a balance between the two implementations
- Not completely ordered, but also not completely unordered

→ Partially ordered, just to the extent necessary to find highest priority item

- Application: Heap sort
- ADT: Priority queue
- Conceptual data structure: Heap
- Actual data structure: Complete binary tree
- Internal implementation: Array

- A priority queue is an important ADT
- Implementing a priority queue with an array or a linked list is not efficient
- In a heap, each parent has higher priority than its children
- In a heap, the highest priority item is at the root of a complete binary tree
- A heap is an efficient implementation of a priority queue
- Many data structures are defined using invariants
- A heap can be used for sorting, using heap sort

(for next week, no need to submit, but bring the sorting cards)

- Cut the sorting cards, shuffle them, and try to find a fast way to sort them. Play against others (who is fastest?).

  (Example: Two players, one player uses selection sort, one player uses insertion sort, who wins?)
- Find five different applications of sorting.
- Implement joining two (normal) heaps.
- Think about the time complexity of creating a heap:

  `heapify_down` will be called `n`/2 times and may take up to `O`(log `n`) each time.

  Therefore, one guess for the overall time complexity is `O`(`n` log `n`).

  However, this upper bound can be improved by careful analysis.
- Continue to work on the report (manual sorting)

- priority queue
- 順位キュー、優先順位キュー、優先順位付き待ち行列
- complete binary tree
- 完全二分木
- heap
- ヒープ
- internal node
- 内部節
- restoration
- 修復
- invariant
- 不変条件
- sort
- 整列、ソート
- decreasing (order)
- 降順
- increasing (order)
- 昇順
- join
- 合併
- binomial queue/heap
- 2項キュー、2 項ヒープ
- distribution
- 分布