Fast queries using heaps

If you are an Operating System (OS) and from the pool of tasks with attached priorities waiting to be executed, how would you pick the one that needs to be executed next? Sure, you could have them in an array, sort them (O(n logn)) and pick the highest priority one from the sorted data structure. Note that you don’t care about the task with the second highest priority at the moment (if you can execute only one task at a time). You only need to pick the highest priority task, which intuitively tells us that we do not need all the tasks sorted, but just a way to pick the one with the highest priority and that the ordering of the remaining tasks is not important.

Now suppose you are generating tasks to be sent to the operating system. You need to keep inserting new tasks with attached priorities to the pool of tasks. If you want to add a new high-priority task to the pool, you would like to put it in such a position that would make it quick for the OS to query and fetch it. Again, you should not have to go through the priorities of all of the tasks in the pool in order to decide the position of the new task.

What data structure would fit the above task? You do not even have to be an OS, you could be a system allowing passengers (high-priority ones have first-class boarding passes, for example) to board an aircraft. That brings us to Priority Queue data structure that has heap property - i.e. it is a tree where the key with the highest (Max Heap) or lowest value (Min Heap) is at the root and the children nodes’ keys are smaller (or larger in the case of Min Heap) than the root’s key. So, every fetch operation can just get the key at the root instead of traversing the entire tree (looking at all tasks in the pool). Let us see how to implement a heap in the next post.