Skills for tackling LeetCode problems - Heap for Top K Numbers.
Heap can be used to efficiently solve problems dealing with K elements at a time from a set of given elements.
1. Top K Numbers
Given an unsorted array of numbers, find the K largest numbers in it.
We use a Min-Heap storing numbers in ascending order. Every time we find a larger number than the smallest number in the heap, we take out the smallest number and insert the larger number into the heap.
# Time complexity O(Klog K + (N - K)*log K) = O(N log K). It takes O(log K) to extract or insert the min number from the heap. We insert K numbers in the heap and then iterate through the remaining numbers, at every step, we extract the min and insert a new.
# Space complexity O(K)
from heapq import *
def find_k_largest_numbers(nums, k):
minHeap = []
for i in range(k): # add first k numbers
heappush(minHeap, nums[i])
for i in range(k, len(nums)): # replace the min in heap with the larger number in array.
if nums[i] > minHeap[0]:
heappop(minHeap)
heappush(minHeap, nums[i])
return list(minHeap)
2. Top K Elements
Given an unsorted array of numbers, find kth smallest number in it. Please note that it is the kth smallest number in the sorted order, not the kth distinct element.
We use a Max-Heap storing numbers in descending order. Every time we find a smaller number than the max number in the heap, we take out that largest number and insert the smaller number into the heap.
# Time complexity O(N log K). Space complexity O(K).
def find_Kth_smallest_number(nums, k):
maxHeap = []
for i in range(k): # add first k numbers
heappush(maxHeap, -nums[i])
for i in range(k, len(nums)): # replace the max in heap with the smaller number in array
if -nums[i] > maxHeap[0]:
heappop(maxHeap)
heappush(maxHeap, -nums[i])
return -maxHeap[0]
Alternative approach: we can also use a Min Heap storing all the numbers and extract top K from it.
# Time complexity O(N). Space complexity O(N).
def find_Kth_smallest_number(nums, k):
minHeap = []
for i in range(len(nums)):
heappush(minHeap, nums[i])
return minHeap[k-1]
3. K Closest Points to the Origin
Given an array of points in a 2D plane, find K closest points to the origin.
# Time complexity O(N log K). Space complexity O(K).
# class Point:
# def __init__(self, x, y):
# self.x = x
# self.y = y
# def __lt__(self, other):
# return self.distance_from_origin() > other.distance_from_origin()
# def distance_from_origin(self):
# return (self.x * self.x) + (self.y * self.y)
# def print_point(self):
# print("[" + str(self.x) + ", " + str(self.y) + "] ", end='')
def find_closest_points(points, k):
maxHeap = []
for i in range(k):
heappush(maxHeap, points[i])
for i in range(k, len(points)):
if points[i].distance_from_origin() < maxHeap[0].distance_from_origin():
heappop(maxHeap)
heappush(maxHeap, points[i])
return list(maxHeap) # the numbers are not sorted
4. Connect Ropes
Given N ropes with different lengths, we need to connect these ropes into one big rope with minimum cost. The cost of connecting two ropes is equal to the sum of their lengths.
Following a greedy approach to connect the smallest ropes first will ensure the lowest cost. We use Min Heap to find the smallest ropes. Once we connect two ropes, we need to insert the resulting rope back in the heap so that it can connect with the remaining ropes.
# Time complexity O(N log N). Space complexity O(N).
def minimum_cost_to_connect_ropes(ropeLengths):
minHeap = []
for i in ropeLengths: # add all numbers to the heap
heappush(minHeap, i)
result, tmp = 0, 0
while len(minHeap) > 1:
tmp = heappop(minHeap) + heappop(minHeap) # add two smallest numbers
result += tmp
heappush(minHeap, tmp) # insert back to the heap
return result
5. Top K Frequent Numbers
Given an unsorted array of numbers, find the top K frequently occurring numbers in it.
We use a HeapMap to count the frequency of numbers. We then use Min Heap to find the K most frequently occurring numbers.
# Time complexity O(N + N log K). Space complexity O(N).
def find_k_frequent_numbers(nums, k):
Hash = {} # get the frequency of numbers
for i in nums:
Hash[i] = Hash.get(i, 0) + 1
minHeap = []
for num, count in Hash.items():
heappush(minHeap, (count, num))
if len(minHeap) > k:
heappop(minHeap) # remove smallest numbers
result = []
while minHeap:
result.append(heappop(minHeap)[1])
return result
6. Frequency Sort
Given a string, sort it based on the decreasing frequency of its characters.
# Time complexity O(D log D) where D is the number of distinct characters in the input string.
# Space complexity O(N).
def sort_character_by_frequency(str):
Hash = {}
for char in str:
Hash[char] = Hash.get(char, 0) + 1
maxHeap = []
for char, count in Hash.items():
heappush(maxHeap, (-count, char))
result = []
while maxHeap:
count, char = heappop(maxHeap)
for _ in range(-count):
result.append(char)
return ''.join(result)
7. Kth Largest Number in a Stream
Design a class to find the kth largest element in a stream of numbers. The class should have the following two things.1) the constructor of the class should accept an integer array containing initial numbers from the stream and an integer K. 2) The class should expose a function add(int num)
which will store the given number and return the kth largest number.
# Time complexity O(log K) taken by 'add' to insert element to heap. Space complexity O(K).
class KthLargestNumberInStream:
minHeap = []
def __init__(self, nums, k):
self.k = k
for num in nums:
self.add(num)
def add(self, num):
heappush(self.minHeap, num)
if len(self.minHeap) > self.k:
heappop(self.minHeap)
return self.minHeap[0] # the kth largest number
8. K Closest Numbers
Given a sorted number array and two integers ‘K’ and ‘X’, find ‘K’ closest numbers to ‘X’ in the array. Return the numbers in the sorted order. ‘X’ is not necessarily present in the array.
First find the closest number ‘Y’ to ‘X’ by binary search. We then can search in both direction to find the K closest numbers. This can be done by pushing the neighbor numbers in a Min Heap, sorted by the absolute difference from ‘X’. Finally we can extract the top ‘K’ numbers from the Min Heap.
# Time complexity O(log N + K log K). O(log N) for binary search and O(K log K) for inserting numbers in Min Heap, and sort the result list.
# Space complexity O(K).
def find_closest_elements(arr, K, X):
index = binary_search(arr, X) # find X by binary search
l, r = index - K, index + K # search the neighbors
l = max(l, 0)
r = min(r, len(arr) - 1)
minHeap = []
for i in range(l, r + 1):
heappush(minHeap, (abs(arr[i] - X), arr[i])) # sort by the abs difference between the neighbor and X
result = []
for _ in range(K): # append the closest K number into result
result.append(heappop(minHeap)[1])
result.sort()
return result
def binary_search(arr, target):
l, r = 0, len(arr) - 1
while l < r:
m = l + (r - l) // 2
if arr[m] == target:
return m
if arr[m] < target:
l = m + 1
else:
r = m
return l
Alternative approach using two pointers: Use binary search to find the number X. Then use two pointers to move back and forward to find the closest neighbors which gives the smallest difference. To keep the resulting list sorted we can use a Queue.
# Time complexity O(log N + K). O(log N) for binary search and O(K) for using two pointers.
# Space complexity O(1).
from collections import deque
def find_closest_elements(arr, K, X):
result = deque()
index = binary_search(arr,X) # find the number X by binary search
p1, p2 = index, index + 1 # two pointers
n = len(arr)
for _ in range(K):
if p1 >= 0 and p2 < n: # two pointers are in the range
if abs(X - arr[p1]) <= abs(X - arr[p2]): # arr[p1] is closer
result.appendleft(arr[p1]) # since arr[p1] is smaller than X
p1 -= 1 # move to the left
else: # arr[p2] is closer
result.append(arr[p2]) # since arr[p2] is larger than X
p2 += 1 # move to the right
elif p1 >= 0: # p2 is out of range
result.appendleft(arr[p1]) # only consider arr[p1]
p1 -= 1 # move to the left
elif p2 < n: # p1 is out of range
result.append(arr[p2]) # only ocnsider arr[p2]
p2 += 1 # move to the right
return result
9. Maximum Distinct Elements
Given an array of numbers and a number K, we need to remove K numbers from the array such that we are left with maximum distinct numbers.
We first count the frequency by HashMap and then use a Min Heap sorted by the frequencies. Follow a greedy approach, we will remove the least frequent number from the heap and try to make it distinct.
# Time complexity O(N log N + K log N). Inserting N numbers to the heap takes O(N log N). Extracting K numbers will take O(K log N).
# Space complexity O(N) taken by HashMap.
def find_maximum_distinct_elements(nums, k):
distinctCount = 0
if len(nums) <= k: # nothing left after removal
return distinctCount
Hash = {} # count the frequencies
for i in nums:
Hash[i] = Hash.get(i, 0) + 1
minHeap = []
for num, count in Hash.items():
if count == 1: # distinct
distinctCount += 1
else: # only include duplicates
heappush(minHeap, (count, num))
while k > 0 and minHeap:
count, num = heappop(minHeap)
k -= count - 1 # delete duplicates to make num distinct
if k >= 0: # if k < 0, we cannot remove anymore
distinctCount += 1
if k > 0: # if we still need to remove, remove distinct numbers
distinctCount -= k
return distinctCount
10. Sum of Elements
Given an array, find the sum of all numbers between the k1’th and k2’th smallest elements of that array.
We first insert all numbers into a Min Heap, then remove the first ‘k1’ smallest numbers, then extract the next ‘k2 - k1 - 1’ numbers out of the heap. The sum of them will be the answer.
# Time complexity O(N log N). Space complexity O(N). N is the total input numbers.
def find_sum_of_elements(nums, k1, k2):
minHeap = []
for i in nums:
heappush(minHeap, i)
for _ in range(k1):
heappop(minHeap)
SUM = 0
for _ in range(k2 - k1 - 1):
SUM += heappop(minHeap)
return SUM
Alternative approach: we can use a Max Heap to keep track of the top k2 numbers, and then add the top k2-k1-1 numbers in the max heap to find the sum of all numbers.
# Time complexity O(N log K2). Space complexity O(K2)
def find_sum_of_elements(nums, k1, k2):
maxHeap = []
for i in range(len(nums)):
if i < k2 - 1: # add k2 numbers into Max Heap
heappush(maxHeap, -nums[i])
elif nums[i] < -maxHeap[0]: # add smaller numbers
heappop(maxHeap)
heappush(maxHeap, -nums[i])
SUM = 0
for _ in range(k2 - k1 -1):
SUM += -heappop(maxHeap)
return SUM
11. Rearranging String
Given a string, find if its letters can be rearranged in such a way that no two same characters come next to each other.
Use a HashMap to count the frequencies of the letters. In each step, we should append the letter to the result list in the order from the highest frequency to the lowest frequency.
# Time complexity O(N log N). Space complexity O(N).
def rearrange_string(str):
Hash = {}
for char in str: # count the frequencies of the letters
Hash[char] = Hash.get(char, 0) + 1
maxHeap = []
for char, count in Hash.items(): # append to max heap ordered by frequencies
heappush(maxHeap, (-count, char))
preChar, preCount = None, 0
result = []
while maxHeap:
count, char = heappop(maxHeap)
if preChar and -preCount > 0: # add back if the frequency is still greater than 0
heappush(maxHeap, (preCount, preChar))
result.append(char)
preChar = char
preCount = count + 1
# there cannot be two same letters near to each other, if it happens the while loop will terminate early
return ''.join(result) if len(result) == len(str) else ""