Skill Set - Two Heaps


Skills for tackling LeetCode problems - Two Heaps.

Two heaps pattern is efficient in finding the smallest element in one part and the biggest element in the other part. One heap is a Min Heap, the other is a Max Heap.

1. Find the Median of a Number Stream

Design a class to calculate the median of a number stream. The class should have the following two methods.

  1. insertNum(num): stores the number in the class
  2. findMedian(): returns the median of all numbers inserted in the class

If the count of numbers inserted in the class is even, the median will be the average of the middle two numbers.

# Time complexity O(log N) for "insertNum", O(N) for "findMedian". Space complexity O(N).
from heapq import *

class MedianOfAStream:

    maxHeap = []  
    minHeap = []  

    def insert_num(self, num):
        # keep max heap decreasing and min heap increasing
        if not self.maxHeap or -self.maxHeap[0] >= num: 
            heappush(self.maxHeap, -num)
        else:
            heappush(self.minHeap, num)

        # max heap will have one more element or equal element than the min-heap
        if len(self.maxHeap) > len(self.minHeap) + 1:
            heappush(self.minHeap, -heappop(self.maxHeap))
        elif len(self.maxHeap) < len(self.minHeap):
            heappush(self.maxHeap, -heappop(self.minHeap))

    def find_median(self):
        if len(self.maxHeap) == len(self.minHeap):
            # we have even number of elements, take the average of middle two elements
            return -self.maxHeap[0] / 2.0 + self.minHeap[0] / 2.0

        # because max-heap will have one more element than the min-heap
        return -self.maxHeap[0] / 1.0

2. Sliding Window Median

Given an array of numbers and a number ‘k’, find the median of all the ‘k’ sized sub-arrays (or windows) or the array.

# Time complexity O(NK), where N is the total number of elements in the input array, K is the size of the slidng window. (It takes O(log K) to insert/remove elements from heaps of size K. It takes O(K) to search element in an array of size K ).
# Space complexity O(K) for storing numbers within the sliding window.

from heapq import *
import heapq

class SlidingWindowMedian:
  def __init__(self):
    self.maxHeap, self.minHeap = [], []

  def find_sliding_window_median(self, nums, k):
    result = [0.0 for _ in range(len(nums) - k + 1)] # sliding window

    for i in range(0, len(nums)):
      # keep max heap decreasing and min heap increasing
      if not self.maxHeap or nums[i] <= -self.maxHeap[0]:
        heappush(self.maxHeap, -nums[i])
      else:
        heappush(self.minHeap, nums[i])

      self.rebalance_heaps()

      if i - k + 1 >= 0: # if start point of the sliding window is >= 0
        if len(self.maxHeap) == len(self.minHeap):
          result[i - k + 1] = -self.maxHeap[0] / 2.0 + self.minHeap[0] / 2.0
        else:
          result[i - k + 1] = -self.maxHeap[0] / 1.0

        # remove the element going out of the sliding window
        removed = nums[i - k + 1]
        if removed <= -self.maxHeap[0]:
          self.remove(self.maxHeap, -removed)
        else:
          self.remove(self.minHeap, removed)

        self.rebalance_heaps()

    return result

  def remove(self, heap, element):
    ind = heap.index(element)

    # move the element to the end and delete it
    heap[ind] = heap[-1]
    heappop(heap)

    # we can use heapify to readjust the elements but that would be O(N),
    # instead, we will adjust only one element which will O(logN)
    if ind < len(heap):
      heapq._siftup(heap, ind)
      heapq._siftdown(heap, 0, ind)
    # heapify(heap)

  def rebalance_heaps(self):
    if len(self.maxHeap) > len(self.minHeap) + 1:
      heappush(self.minHeap, -heappop(self.maxHeap))
    elif len(self.maxHeap) < len(self.minHeap):
      heappush(self.maxHeap, -heappop(self.minHeap))

Note: It is easy to delete the ith element in heap. It takes O(N) time complexity.

h[i] = h[-1]
h.pop()
heapq.heapify(h)

If we want to take O(Log N) to delete an element in heap, we need source code of “heap.py” siftup and siftdown.

h[i] = h[-1]
h.pop()
if i < len(h):
    heapq._siftup(h, i)
    heapq._siftdown(h, 0, i)  

  TOC