“HEAPS ,the cutest little data structure ever invented”

Jay Prakash
8 min readNov 1, 2020

--

“HEAPS ,the cutest little data structure ever invented”

Heaps are a great example of many core computer science concepts working together in order to form one very large abstraction, which is made up of smaller parts that we’re already familiar with!

In this article we’ll be having an in-depth discussion about heaps as a data structure and in order to understand them completely let’s first discuss the below mentioned topics :
1. Array representation of a Binary Tree.
2. Complete Binary Tree.
Post that, we will broaden our knowledge by understanding the following:
3. What are heaps?
4. Insert/Delete in a heap.
5. Heapify and HeapSort.
6. Priority Queues.

So, let’s get started.

1. Array representation of a Binary Tree.

Talking about representation, trees can be represented in two ways:
1) Dynamic Node Representation (Linked Representation).
2) Array Representation (Sequential Representation).
We are going to talk about the sequential representation of the trees.
To represent tree using an array, the numbering of nodes can start either from 0–(n-1) or 1– n(this indexing is used in here).

The (binary) heap data structure is an array object that we can view as a nearly complete binary tree . Each node of the tree corresponds to an element of the array. An array A that represents a heap is an object with two attributes: A.length, which (as usual) gives the number of elements in the array, and A.heap-size, which represents how many elements in the heap are stored within array A. The root of the tree is A[1] , and given the index i of a node, we can easily compute the indices of its parent, left child, and right child:
If we are looking at the i-th index in an array:

  • It’s parent is at the floor (i)/2 index.
  • It’s left child is at 2 * i index.
  • It’s right child is at 2 *i + 1 index.
Array Representation of a Binary Tree

A max-heap viewed as
(a) a binary tree and
(b) an array. The number within the circle at each node in the tree is the value stored at that node. The number above a node is the corresponding index in the array. Above and below the array are lines showing parent-child relationships; parents are always to the left of their children. The tree has height three; the node at index 4 (with value 8) has height one.

2. Complete Binary Tree.

A complete binary tree is a binary tree in which every level of the tree is fully filled, except for perhaps the last level. To the extent that the last level is filled, it is filled left to right.

               18
/ \
15 30
/ \ / \
40 50 100 40


18
/ \
15 30
/ \ / \
40 50 100 40
/ \ /
8 7 9

Practical examples of Complete Binary Trees are Binary Heaps.

3.What are heaps?

A heap is really nothing more than a binary tree with some additional rules that it has to follow. These two rules are the two properties that define what differentiates a heap structure from any other tree structure. So, what are the two properties of a heap?

Well, it all boils down to the shape of the tree, and the order of the tree’s nodes. If we can remember these two aspects, it will be easy for us to identify a heap from any other binary tree.

Let’s talk about shape first. In order for a binary tree to qualify as a heap, it must be complete tree; in other words, every single level of the tree must be completely filled — with the last level being the only exception. Additionally, the last level of the tree must always have the left-most nodes filled first.
What about the order of a heap?The basic rule for the order property of a heap is this: a parent nodes (including the root node) of a heap must either be greater than or equal to the value of its children nodes(max-heap), or less than or equal to the value of its children nodes(min-heap).

Let’s visualize what these two types look like:

Max and Min Heaps

4. Insert/Delete in a heap

We’ll just discuss min-heaps here. Max-heaps are essentially equivalent, but the elements are in descending order rather than ascending order.

Insertion

To insert a node:

  1. Add the node to the bottom of the tree.
  2. Look at the parent node. If the parent is greater than the node, swap them.
  3. Continue comparing and swapping to allow the node to bubble up until it finds a parent node that is smaller than it.

Cost: As we know that, the height of a tree is log(n). Therefore, the worst case scenario is that the newly added node is smaller than every single parent node, causing us to traverse all the way to the top of the tree. This will cost us O(log(n)) time.

Deletion (Removing the smallest element)

Because our min-heap has the smallest element at the root node, we know exactly where it is in our array. Accessing this element takes O(1) time.

If we want to delete the element, we must shift the entire tree upwards to fill the root node’s place. To do this:

  1. Take the bottom level’s right most node (the last element in the array) and move it to top, replacing the deleted node.
  2. Compare the new root to its children. If it is larger than either child, swap the item with the smaller of the two children.
  3. Continue comparing and swapping, bubbling down the node until it is smaller than both of its children.

Like insertion, the worst case scenario is that we have to traverse the entire tree, but this time we are moving downwards. The cost of this is O(log(n)).

While inserting an element, we adjust(bubble up) the values from leaf to root node. Whereas, while deleting an element, we adjust(bubble down) the values from root to leaf node.

5. Heapify and HeapSort.

In order to maintain the min-heap property, we call the procedure MIN-HEAPIFY. Its inputs are an array A and an index i into the array. When it is called, MINHEAPIFY assumes that the binary trees rooted at LEFT(i) and RIGHT(i)are minheaps, but that A[i] might be greater than its children, thus violating the min-heap property. MIN-HEAPIFY lets the value at A[i] “float down” in the min-heap so that the subtree rooted at index i obeys the min-heap property.

Algorithm for maintaining the heap property:

We can describe the running time of MIN-HEAPIFY by the recurrence

T (n )≤T (2n/3)+O(1)

The solution to this recurrence, is T (n)=O(log n). Alternatively, we can characterize the running time of MINHEAPIFY on a node of height h as O(h).

Algorithm for building a heap:

We build from [A.length/2] to 1 and not from A.length to 1, because the last level has leaf nodes, and they do satisfy the min-heap property.

We build in a bottom up way, from [A.length/2] to 1 and not the other way around because for Min_heapify we assumed that the left and right subtrees also satisfy the min-heap property.

For the heapsort algorithm, we use max-heaps. Min-heaps commonly implement priority queues.

Heap sort is a comparison based sorting technique based on Binary Heap data structure. It is similar to selection sort where we first find the maximum element and place the maximum element at the end. We repeat the same process for the remaining elements.
The heapsort algorithm starts by using BUILD-MAX-HEAP to build a max-heap on the input array A[1… to n] where n =A.length. Since the maximum element of the array is stored at the root A[1], we can put it into its correct final position by exchanging it with A[n]. If we now discard node n from the heap — and we can do so by simply decrementing A.heap_size. We observe that the children of the root remain max-heaps, but the new root element might violate the max-heap property. All we need to do to restore the max-heap property, is calling MAX-HEAPIFY(A,1), which leaves a max-heap in A[1…to n-1] . The heapsort algorithm then repeats this process for the max-heap of size n-1 down to a heap of size 2.

Heap Sort Algorithm for sorting in increasing order:
1. Build a max heap from the input data.
2. At this point, the largest item is stored at the root of the heap. Replace it with the last item of the heap followed by reducing the size of heap by 1. Finally, heapify the root of the tree.
3. Repeat step 2 while size of heap is greater than 1.

The following figure shows an example of the operation of HEAPSORT after the initial max-heap has been built by BuildMaxHeap . The figure shows the max-heap before the first iteration of the for loop and after each iteration.

The HEAPSORT procedure takes time O(n logn), since the call to BUILD-MAXHEAP takes time O(n) and each of the n-1 calls to MAX-HEAPIFY takes time O(log n).

6. Priority Queues

In this section, we present one of the most popular applications of a heap: as an efficient priority queue.

A priority queue is a queue data structure with some additional properties. Every item in a priority queue has a “priority” associated with (usually a numerical value). An item with a high priority is dequeued before an item with a lower priority. If two items have the same priority, they’re dequeued based on their order in the queue; in other words, they’re removed according to their location in the array.

Binary heaps are super efficient for implementing priority queues because it’s very easy to know and retrieve/remove the element with the highest priority: it will always be the root node!

Resources

The heap data structure is really worth learning since it is used for fundamental things like CPU scheduling. If you want to dig deeper into priority queues and heaps, I recommend starting out by choosing from this plethora of resources!

If you like such articles, check out my hashnode profile and my blog website.

Thanks for reading.

--

--