One way to optimize this implementation is to cache every substring of S that has already been checked, together with its result (matched / not matched). If the same substring comes up again later, we can skip processing it and reuse the stored result: if it matched before, the current substring matches too; if it did not, the current one does not either. This gets us past the test case S = "aaaaa.....aaaa" of length 10000 with words = {"a", "a", ... "a"} of length 5000. It increases memory usage but reduces latency.
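A rough sketch of that caching idea in C++, in case it helps. The names findSubstringMemo, matchesAllWords and verdict are made up for illustration and do not come from the video. It assumes all words have the same length, as in the problem statement.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Brute-force check: does `window` split into exactly the words counted in `need`?
// The window holds as many words as `need` does in total, so "no word is
// over-used" already implies "every word is used the right number of times".
static bool matchesAllWords(const std::string& window, std::size_t wordSize,
                            const std::unordered_map<std::string, int>& need) {
    std::unordered_map<std::string, int> seen;
    for (std::size_t pos = 0; pos < window.size(); pos += wordSize) {
        const std::string word = window.substr(pos, wordSize);
        const auto it = need.find(word);
        if (it == need.end()) return false;           // unknown word
        if (++seen[word] > it->second) return false;  // word used too often
    }
    return true;
}

// Hypothetical helper: checks every window, but remembers the verdict for each
// distinct window text so that a repeated window is never re-evaluated.
std::vector<int> findSubstringMemo(const std::string& s,
                                   const std::vector<std::string>& words) {
    std::vector<int> result;
    if (s.empty() || words.empty() || words[0].empty()) return result;

    const std::size_t wordSize = words[0].size();
    const std::size_t windowSize = wordSize * words.size();
    if (windowSize > s.size()) return result;

    std::unordered_map<std::string, int> need;  // required count per word
    for (const auto& w : words) ++need[w];

    std::unordered_map<std::string, bool> verdict;  // window text -> matched?
    for (std::size_t i = 0; i + windowSize <= s.size(); ++i) {
        const std::string window = s.substr(i, windowSize);
        bool ok;
        const auto cached = verdict.find(window);
        if (cached != verdict.end()) {
            ok = cached->second;  // seen before: reuse the stored result
        } else {
            ok = matchesAllWords(window, wordSize, need);
            verdict.emplace(window, ok);
        }
        if (ok) result.push_back(static_cast<int>(i));
    }
    return result;
}
```

For S = "aaaa...a" every window has the same text, so the word-by-word check runs only once. Note that copying and hashing each window still costs time proportional to the window size, so this only saves the per-word work, and the cache can grow to one entry per distinct window.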
Basically, one run starts at index i and walks through the entire string one word at a time, checking the windows (i, i + window_size), (i + word_size, i + word_size + window_size) and so on; in general (i + k*word_size, i + k*word_size + window_size) for each possible value of k. Now, he is saying that i only needs to be 0, 1, 2 in this example, and (0, 1, 2, . . . , word_size - 1) in the general case. Why this suffices is what you are asking, right? Let me explain it this way: suppose your answer lies at some (i_0, i_0 + window_size). Then the run that started at i_0 % word_size will find it, because i_0 has the form i + k*word_size with i = i_0 % word_size and k = i_0 // word_size. Please think along these lines and let me know if this makes sense.
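If it helps to see the argument in code, here is a small C++ sketch. windowStartsByOffset, locate and RunPosition are hypothetical names used only for this illustration, and word_size is assumed to be greater than 0. It lists the window starts that each run visits and computes which run, and which step k of that run, reaches a given start i_0.

```cpp
#include <cstddef>
#include <vector>

// For each starting offset i in [0, wordSize), list the window starts
// i, i + wordSize, i + 2*wordSize, ... that still fit inside a string of length n.
std::vector<std::vector<std::size_t>> windowStartsByOffset(std::size_t n,
                                                           std::size_t wordSize,
                                                           std::size_t windowSize) {
    std::vector<std::vector<std::size_t>> runs(wordSize);
    for (std::size_t i = 0; i < wordSize; ++i) {
        for (std::size_t start = i; start + windowSize <= n; start += wordSize) {
            runs[i].push_back(start);
        }
    }
    return runs;
}

// The run and the step inside that run that visit window start i0.
struct RunPosition {
    std::size_t offset;  // i = i0 % wordSize
    std::size_t k;       // k = i0 / wordSize (integer division)
};

// Assumes wordSize > 0.
RunPosition locate(std::size_t i0, std::size_t wordSize) {
    return {i0 % wordSize, i0 / wordSize};
}
```

With n = 18, word_size = 3 and window_size = 6, the three runs visit {0, 3, 6, 9, 12}, {1, 4, 7, 10} and {2, 5, 8, 11}. That is every valid start from 0 to 12, each exactly once. For example, i_0 = 10 gives offset 1 and k = 3, and the run with offset 1 does have 10 as its entry at position 3.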
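Java code explained the same way (not my solution):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> ans = new ArrayList<>();
        int n = s.length();
        int m = words.length;
        int w = words[0].length();
        // Required count of every word
        HashMap<String, Integer> map = new HashMap<>();
        for (String x : words) map.put(x, map.getOrDefault(x, 0) + 1);
        // One run per starting offset 0 .. w-1
        for (int i = 0; i < w; i++) {
            HashMap<String, Integer> temp = new HashMap<>();
            int count = 0;
            // j is the right edge of the window, k is the left edge
            for (int j = i, k = i; j + w <= n; j = j + w) {
                String word = s.substring(j, j + w);
                temp.put(word, temp.getOrDefault(word, 0) + 1);
                count++;
                if (count == m) {
                    if (map.equals(temp)) ans.add(k);
                    // Drop the leftmost word before sliding on
                    String remove = s.substring(k, k + w);
                    temp.computeIfPresent(remove, (a, b) -> (b > 1) ? b - 1 : null);
                    count--;
                    k = k + w;
                }
            }//inner for loop
        }//outer for loop
        return ans;
    }//method
}//class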
LeetCode gives a 3-second time limit for it, and this code passes in approximately 2900 ms. But yes, you can try optimizing the string operations, as the top performers in the submissions do. The idea stays the same and the code can be modified, but this version makes life easier for someone new :)
@techdose4u I'm a newbie myself, and your video helped me understand the algorithm. As far as I understand, most of the time is spent on the copy operation curr = freq, so it's faster to add and remove words from the curr dictionary manually. I borrowed some other code, very similar to yours, and it runs in 23 ms instead of 3 sec:

#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        vector<int> result;                             // array of answers
        if (s.empty() or words.empty()) return result;  // checked before words[0] is touched

        unordered_map<string, int> dict_reference;      // reference dictionary
        for (string& word : words)                      // go through every word in words
            dict_reference[word]++;                     // fill the reference dictionary (the current dictionary is compared against it on every iteration)

        int s_size = s.size();                          // length of s
        int word_size = words[0].size();                // length of one word
        int words_count = words.size();                 // number of words
        int window_size = word_size * words_count;      // size of the sliding window
        if (window_size > s_size) return result;

        for (int i = 0; i < word_size; i++)             // try every starting offset (walk through s words[0].size() times)
        {
            unordered_map<string, int> dictionary;      // current dictionary
            int left = i;                               // left edge of the sliding window
            int right = i;                              // right edge of the sliding window
            int count = 0;                              // number of words found so far
            while (right + word_size <= s_size)
            {
                string word = s.substr(right, word_size);  // next word entering the window
                right += word_size;
                if (dict_reference.count(word))
                {
                    dictionary[word]++;
                    count++;
                    while (dictionary[word] > dict_reference[word])  // while the current dictionary holds more of this word than the reference
                    {
                        string leftWord = s.substr(left, word_size);
                        left += word_size;
                        dictionary[leftWord]--;         // remove the leftmost word (the one the sliding window has moved past)
                        count--;                        // one word fewer found
                    }
                    if (count == words_count)
                        result.push_back(left);         // all words found, add the answer to the result
                }
                else
                {
                    dictionary.clear();                 // discard the current dictionary
                    count = 0;
                    left = right;
                }
            }
        }
        return result;
    }
};