Skills for tackling LeetCode problems - Two Heaps.
Two heaps pattern is efficient in finding the smallest element in one part and the biggest element in the other part. One heap is a Min Heap, the other is a Max Heap.
1. Find the Median of a Number Stream
Design a class to calculate the median of a number stream. The class should have the following two methods.
insertNum(num)
: stores the number in the classfindMedian()
: returns the median of all numbers inserted in the class
If the count of numbers inserted in the class is even, the median will be the average of the middle two numbers.
# Time complexity O(log N) for "insertNum", O(N) for "findMedian". Space complexity O(N).
from heapq import *
class MedianOfAStream:
maxHeap = []
minHeap = []
def insert_num(self, num):
# keep max heap decreasing and min heap increasing
if not self.maxHeap or -self.maxHeap[0] >= num:
heappush(self.maxHeap, -num)
else:
heappush(self.minHeap, num)
# max heap will have one more element or equal element than the min-heap
if len(self.maxHeap) > len(self.minHeap) + 1:
heappush(self.minHeap, -heappop(self.maxHeap))
elif len(self.maxHeap) < len(self.minHeap):
heappush(self.maxHeap, -heappop(self.minHeap))
def find_median(self):
if len(self.maxHeap) == len(self.minHeap):
# we have even number of elements, take the average of middle two elements
return -self.maxHeap[0] / 2.0 + self.minHeap[0] / 2.0
# because max-heap will have one more element than the min-heap
return -self.maxHeap[0] / 1.0
2. Sliding Window Median
Given an array of numbers and a number ‘k’, find the median of all the ‘k’ sized sub-arrays (or windows) or the array.
# Time complexity O(NK), where N is the total number of elements in the input array, K is the size of the slidng window. (It takes O(log K) to insert/remove elements from heaps of size K. It takes O(K) to search element in an array of size K ).
# Space complexity O(K) for storing numbers within the sliding window.
from heapq import *
import heapq
class SlidingWindowMedian:
def __init__(self):
self.maxHeap, self.minHeap = [], []
def find_sliding_window_median(self, nums, k):
result = [0.0 for _ in range(len(nums) - k + 1)] # sliding window
for i in range(0, len(nums)):
# keep max heap decreasing and min heap increasing
if not self.maxHeap or nums[i] <= -self.maxHeap[0]:
heappush(self.maxHeap, -nums[i])
else:
heappush(self.minHeap, nums[i])
self.rebalance_heaps()
if i - k + 1 >= 0: # if start point of the sliding window is >= 0
if len(self.maxHeap) == len(self.minHeap):
result[i - k + 1] = -self.maxHeap[0] / 2.0 + self.minHeap[0] / 2.0
else:
result[i - k + 1] = -self.maxHeap[0] / 1.0
# remove the element going out of the sliding window
removed = nums[i - k + 1]
if removed <= -self.maxHeap[0]:
self.remove(self.maxHeap, -removed)
else:
self.remove(self.minHeap, removed)
self.rebalance_heaps()
return result
def remove(self, heap, element):
ind = heap.index(element)
# move the element to the end and delete it
heap[ind] = heap[-1]
heappop(heap)
# we can use heapify to readjust the elements but that would be O(N),
# instead, we will adjust only one element which will O(logN)
if ind < len(heap):
heapq._siftup(heap, ind)
heapq._siftdown(heap, 0, ind)
# heapify(heap)
def rebalance_heaps(self):
if len(self.maxHeap) > len(self.minHeap) + 1:
heappush(self.minHeap, -heappop(self.maxHeap))
elif len(self.maxHeap) < len(self.minHeap):
heappush(self.maxHeap, -heappop(self.minHeap))
Note: It is easy to delete the ith element in heap. It takes O(N) time complexity.
h[i] = h[-1]
h.pop()
heapq.heapify(h)
If we want to take O(Log N) to delete an element in heap, we need source code of “heap.py” siftup
and siftdown
.
h[i] = h[-1]
h.pop()
if i < len(h):
heapq._siftup(h, i)
heapq._siftdown(h, 0, i)