Permutation of a string as substring of another
Problem
Given a string A and another string B, Find whether any permutation of B exists as a substring of A.
For example,
if A = "encyclopedia"
if B="dep" then return true as ped is a permutation of dep and ped is a substring of A
if B="ycc" return true as cyc is a substring
if B="yyc" return false.
For example,
if A = "encyclopedia"
if B="dep" then return true as ped is a permutation of dep and ped is a substring of A
if B="ycc" return true as cyc is a substring
if B="yyc" return false.
Solution
We need to find whether a permutation of B exists as a substring of A. So there would be a substring of length of B in the string A which has exact same letter count as B. We will calculate the letter count of B in a hashmap. We will maintain 3 variables, existing letter positive count, existing letter negative count and non-existing letter. We will first initialize the hash map and the existing letter positive count with the letters of B. Now we will iterate over A with a window of length of B. As we come across new letters, we will decrease its count from the hash map. As letters are excluded from the window we will again add it to the hashmap. If at any point the hashmap contains zero for all letters, then we can say that in the given window we have found a substring which has exactly same count of letters as the word B. So we return true. If the length of B is k then we can search the hashmap and find out whether all letter count is zero or not in O(k) time. But by maintaining three extra variable we can do it in O(1) time. Whenever a letter count goes down by 1 but still stays positive we decrease the existing count positive. Whenever it becomes negative we increase the existing negative count. Similarly when letter count becomes 0 from -1 we again decrease the existing negative count. When some letter that is not present in the hashmap comes in the considered window we decrease non-existing count. And when it leaves the window we increase it. So when non-existing count becomes zero that means the considered window has only letters of B. Now if existing positive count and existing negative count becomes zero, that means all the letter count are zero. Why do we need positive and negative count? This is because one letter count can become -1 and one letter count becomes +1, then sum will be zero. although the substring is not having exact letter count for all letters.
Code
import java.util.HashMap; public class PermutationSubstring { public static void main(String[] args) { String a = "encyclopedia"; String b = "dope"; System.out.println(isPermutationSubstring(a, b)); } private static boolean isPermutationSubstring(String a, String b) { if (a.length() < b.length()) return false; int sameLetterPositive = b.length(); int sameLetterNegative = 0; int otherLetter = 0; HashMaphashMap = new HashMap (); for (char ch : b.toCharArray()) { if (hashMap.containsKey(ch)) { hashMap.put(ch, hashMap.get(ch) + 1); } else { hashMap.put(ch, 1); } } for (int i = 0; i < b.length(); ++i) { char ch = a.charAt(i); if (hashMap.containsKey(ch)) { int count = hashMap.get(ch); hashMap.put(ch, count - 1); if (count == 0) sameLetterNegative++; else sameLetterPositive--; } else { otherLetter--; } } if (sameLetterPositive == 0 && sameLetterNegative == 0 && otherLetter == 0) return true; for (int i = b.length(); i < a.length(); ++i) { char inclusion = a.charAt(i); char exclusion = a.charAt(i - b.length()); if (hashMap.containsKey(exclusion)) { int count = hashMap.get(exclusion); hashMap.put(exclusion, count + 1); if (count == -1) sameLetterNegative--; else sameLetterPositive++; } else { otherLetter++; } if (hashMap.containsKey(inclusion)) { int count = hashMap.get(inclusion); hashMap.put(inclusion, count - 1); if (count == 0) sameLetterNegative++; else sameLetterPositive--; } else { otherLetter--; } if (sameLetterPositive == 0 && sameLetterNegative == 0 && otherLetter == 0) return true; } return false; } }