Binary Heaps
The heap data structure is an array
Because the heap is based on a complete binary tree, the height of a heap of
Finding the Parent, Left and Right Children
Given an element
int parent(i) {
return i/2; // we know that i/2 is rounded down so we don't need to floor
}
int left(i) {
return i << 1; // 2*i;
}
int right(i) {
return (i << 1) + 1; // 2*i+1
}
The Heap Property
There are two kinds of binary heaps, min heaps and max heaps. Depending on the heap type the array must satisfy a heap property. The heap property depends on the type of the heap. For any given element
If this is a max-heap then,
If this is a min-heap then,
The heap property is crucial. Because of it, we know that the root of the heap must be the smallest or the largest element in the heap and therefore, extracting the minimum or the maximum depending on the heap type can be done in constant time!
Maintaining the Max-Heap Property
Suppose we have an element
void max_heapify(int i) {
int l = left(i);
int r = right(i);
// (1) check if the left child is greater
int largest = i;
if (l <= length && A[l] > A[i]) {
largest = l;
} else {
largest = i;
}
// (2) check if the right child is greater
if (r <= length && A[r] > A[largest]) {
largest = r;
}
// if one of the children is greater, swap it
// with i and then call heapify again on the child
// we swapped with
if (largest != i) {
std::swap(A[i], A[largest]);
max_heapify(largest);
}
}
How long does max_heapify take? Well, in the worst case, we will go down all the way to a leaf and so the runtime is
Building a Max Heap
Based on how we store the heap elements in the array, the leaves of the tree are located stating at
void build_max_heap(A) {
for (int i = A.length/2; i >=1; i--) {
max_heapify(A, i);
}
}
Example
Let’s look at an example of building a max-heap using the above idea. We’re given the following array and we want to build a max-heap out of it.
To build a max-heap we’ll only consider the non-leaf nodes, highlighted below.
We’ll start with node 5 and call max_heapify on it. In max_heapify, we’ll swap both nodes 5 and 11. We will then recusively call max_heapify node 5.
Notice that node 5 is now a leaf node so we can’t push it further down the tree and so we’ll move to the next node, node 3.
Node 3 above will be swapped with the larger of its children, node 13.
Notice how nodes 3 and 13 are now swapped. Next we will recursively call max_heapify on node 3 but node 3 is a just a leaf node and we will stop. The final node in the build_max_heap for loop is node 1.
We will swap node 1 with the larger of its children, node 13.
We will then recusively call max_heapify on node 1.
And finally we will swap 1 with the larger of its children, node 7.
Finally, you can see now that the tree/array is a max-heap:
Correctness Proof
Why should we believe that build_max_heap works? This is going to be exactly what’s in CLRS (my notes for myself to quickly remember). We’ll show that it works by proving that the following loop invariant is maintained prior to the first iteration, before each iteration and when the loop terminates.
At the start of each iteration of the for loop in build_max_heap, each node |
Initialization:
Before the first iteration that starts at
Maintenance: (so the gist here is that max_heapify will maintain that node
At each iteration, we call max-heapify on the node
Termination:
At termination when
I would obviously recommend looking at CLRS’s way unless I’m in a rush and this is easily accessible on my phone.
Running Time
The most exciting question is how long does it take to build a max heap? We know that max_heapify takes
(1) Binary heaps with
(2) At any height
Aside: What does (2) mean? Suppose we have a binary heap with
The number of nodes at height 0 (leaves) is
The number of nodes at height 1 is
The number of nodes at height 2 (root) is
Formal Proof:
TODO
We can now sum the work we’re doing at each level which is just the height of the level multipled by the number of nodes in that level. We do that for each level in the tree, from the 0th level to the
We can use the following summation:
By substituting
And therefore, we can bound the earlier summation to just:
and we’re done!
Heapsort
TODO
References
These are my study notes / summary on chapter 6 in CLRS.