LC 28 & LC 459


Solution to LeetCode 28 Implement strStr() and LeetCode 459 Repeated Substring Pattern.

LeetCode 28

Implement strStr(). (Easy) [link]

1] The Brute Force solution is to compare each substring of haystack of the same length as the needle to the needle and find the match. The time complexity O(nm) where n is the length of haystack and m is the length of the needle.

class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        n = len(haystack)
        m = len(needle)
        for i in range(n-m+1):
            if haystack[i:i+m] == needle:
                return i
        return -1

2] Knuth–Morris–Pratt (KMP): Prefix-table tells us where to restart when we have mismatch. To construct a prefix table, for each prefix of the whole string, find the index of the longest common prefix for the suffix. If there is no common prefix and suffix, after while loops and updating k, the number would be -1. Using the prefix table, we can find the match by moving double pointers, one on haystack the other on needle. The time complexity is O(n+m). It takes O(m) to construct a prefix table and O(n) to find the match.

class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        n = len(haystack)
        m = len(needle)
        if m == 0: return 0
        tab =self.prefix_table(needle)
        p = -1
        for i in range(n):
            while p > -1 and haystack[i] != needle[p+1]:
                p=tab[p]
            if haystack[i] == needle[p+1]:
                p += 1
            if p == m-1: # p moving to the end of the tab
                return i-m+1 # first index
        return -1

    def prefix_table(self, needle):
        a = len(needle)
        k = -1
        tab = ['' for i in range(a)]
        tab[0] = k
        for i in range(1,a):
            while k > -1 and needle[k+1] != needle[i]:
                k = tab[k]
            if needle[k+1] == needle[i]:
                k += 1
            tab[i] = k
        return tab

LeetCode 459

Repeated Substring Pattern (Easy) [link]

1] Brute Force Solution. The idea is for each possible pattern, check whether it is the pattern of the whole string. Efficiency can be improved if 1) only iterate through half of the string to define a pattern because a pattern can not be longer than the string, 2) a substring cannot be a pattern if the length of the string is not a multiple of the substring . Time complexity O(n^2). Space complexity O(1).

class Solution:
    def repeatedSubstringPattern(self, s: str) -> bool:
        n = len(s)
        for i in range(1, n//2+1): # pattern cannot be longer than a half of s
            if n % i == 0: # s must be a multiple of the pattern
                if all(s[j] == s[j-i] for j in range(i, n)):
                    return True
        return False

2] Knuth–Morris–Pratt (KMP): Construct a prefix table. For example, the prefix table for ‘asdfasdfasdf’ is [-1,-1,-1,-1,0,1,2,3,4,5,6,7]. The length of the longest common prefix and suffix is tab[-1]+1. len % (len - (tab[-1]+1)) == 0 means there are repeated substrings. The time complexity is O(n). The space complexity is O(n).

class Solution:
    def repeatedSubstringPattern(self, s: str) -> bool:  
        if len(s) == 0:
            return False
        tab = self.prefix_table(s)
        if tab[-1] != -1 and len(s) % (len(s) - (tab[-1] + 1)) == 0:
            return True
        return False
    
    def prefix_table(self, s):
        tab = [0] * len(s)
        tab[0] = -1
        j = -1
        for i in range(1, len(s)):
            while j >= 0 and s[i] != s[j+1]:
                j = tab[j]
            if s[i] == s[j+1]:
                j += 1
            tab[i] = j
        return tab

  TOC