Learn the basics of tries. This video is a part of HackerRank's Cracking The Coding Interview Tutorial with Gayle Laakmann McDowell. www.hackerrank.com/domains/tut...
Another key phrase to keep in mind (asked during an interview a few years back): "Imagine that you're implementing autocomplete for a video search engine..."
@kumarmanish9046 That's what a person with low self-esteem and confidence says. When they can't better themselves, they try to pull others down and scoff at them, and you're just one of those people. Think about what you wrote for a minute. You can never achieve anything in your life, and you know it's the truth.
Quick notes.

Motivation: String manipulation problems often involve searching through a list of words or sentences. As a standard solution, we would store the words in strings or character arrays and iterate through the characters to perform operations. The cost depends on the number of words in the list and the number of characters in the input string. If the length of the input string is l and the number of words to search through is n, the time complexity is O(l*n). For example, with 5 million words and an input string of length 10, that is up to 50 million character comparisons, which is computationally expensive. Our goal is to reduce the lookup time to O(l), which depends only on the length of the input string. So, for the above example, a lookup takes about 10 steps once the trie has been built.

• Generally, we use a tree-like data structure when we can arrange data hierarchically using a property of the data. For example, a BST exploits the ordering of numbers, i.e. everything in a node's left subtree is smaller than the node and everything in its right subtree is larger. For words in a dictionary, we can exploit the fact that many words share a common prefix, e.g. house, housemaid.
• This property of a tree data structure massively reduces the search space.
• A balanced binary search tree reduces the search time from O(n) to O(log n). Because of the way it is organized, every step down the tree cuts the remaining search space roughly in half, and that is where the big time saving comes from.
• In a binary tree, every node can lead to up to two children. In a trie, every node can lead to up to 26 children (for the lowercase alphabet a-z), so each step narrows the search to one of up to 26 branches.

The character set allowed for a trie node is decided by the problem statement. So far we assumed the simple case of lowercase letters (a-z). If we also allow uppercase letters (A-Z), each node can have up to 52 children. Similarly, we can decide to allow special characters, digits, and so on.
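For anyone who wants to see the O(l) lookup from these notes in code, here is a minimal Python sketch. The names `TrieNode`, `Trie`, `insert`, and `contains` are my own choices for illustration, not the ones used in the video, and the children are kept in a dict rather than a fixed 26-slot array, so any character set works.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # maps a character to the child TrieNode
        self.is_word = False   # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Add a word, reusing nodes for any prefix that already exists."""
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    def contains(self, word):
        """Walk one node per character: O(len(word)), independent of word count."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word


trie = Trie()
for w in ["house", "housemaid", "horse"]:
    trie.insert(w)

print(trie.contains("house"))      # True
print(trie.contains("hous"))       # False: a prefix, but not a stored word
print(trie.contains("housemaid"))  # True
```

The lookup cost only applies after the trie exists; building it still touches every character of every word once.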
Not really, you didn't take into account that you need to build the trie with 5 million words first, which takes O(n*m), where n is the number of words and m is the length of the longest word in that list.
My uni doesn't teach this data structure in class, but they gave me a project to build a mini search engine, and a trie is what I was recommended to use underneath the program.
I used this in my game engine, where I divided the world into a bunch of cubes in 3D space. Instead of searching for a word using characters (like the example in the video), I pass the (x, y, z) of the cube and get quick access to the object in that world chunk.
Huh, never heard of this before. I wonder if this would be handy for things like auto-complete mechanisms, or parsing partial commands and choosing the "closest" one, in an efficient way.
Guys, I'm trying to implement a spellchecker program that loads a dictionary file with over 140k words, the longest word having 45 letters. Correct me if I'm wrong: if I use a trie, at most I would be using 46 * 26 = 1196 bytes of memory, right? And this data structure would be more efficient than a hash table, right? In C, btw.
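On the memory question: the 46 * 26 estimate only covers a single path through the trie. Total memory is roughly (number of nodes) x (bytes per node), and the node count grows with the whole word list, not with the longest word. Below is a rough Python sketch of that estimate. The per-node size is an assumption on my part: a hypothetical C struct with 26 child pointers on a 64-bit machine plus an end-of-word flag padded to 8 bytes. `count_nodes` is a helper name I made up.

```python
def count_nodes(words):
    """Count trie nodes (including the root) needed to store the given words."""
    root = {}
    nodes = 1  # the root node
    for word in words:
        node = root
        for ch in word:
            if ch not in node:
                node[ch] = {}   # a new node is only created for an unseen branch
                nodes += 1
            node = node[ch]
    return nodes


# Assumed layout of a C node: struct { struct node *children[26]; bool is_word; }
POINTER_BYTES = 8                              # assumption: 64-bit pointers
ALPHABET = 26                                  # assumption: lowercase a-z only
NODE_BYTES = ALPHABET * POINTER_BYTES + 8      # 208 bytes of pointers + padded flag

words = ["car", "card", "care", "cat"]
n = count_nodes(words)
print(n)               # 7 nodes: root, c, a, r, d, e, t
print(n * NODE_BYTES)  # 1512 bytes under the assumptions above
```

The node count can never exceed the total number of characters in the word list plus one for the root, so running `count_nodes` on the real dictionary file gives the actual figure to multiply by the node size.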
We need a Non-deterministic Finite Automaton (NFA) for the word validation problem, and clearly it would be more complex than implementing a trie, because for this automaton we need to specify all transition functions (to the next valid state or to a stop state). That means one node will have at least 26 children in a corresponding trie.
"The term trie was coined two years later by Edward Fredkin, who pronounces it /ˈtriː/ (as "tree"), after the middle syllable of retrieval" -- Wikipedia
Tries were first described by René de la Briandais in 1959. The term trie was coined two years later by Edward Fredkin, who pronounces it /ˈtriː/ (as "tree"), after the middle syllable of retrieval. However, other authors pronounce it /ˈtraɪ/ (as "try"), in an attempt to distinguish it verbally from "tree". --Wikipedia
She seemed to not include it for some reason but yes you're right. There would be another field in that Node class that stores the actual value (character) for that given node.
What did u say instead? Also what was the specific question? I’m gonna try and apply for internships soon and I wanna get a grasp of what kind of questions people get.
You wouldn't really have duplicates with a trie (or I guess you could say you have many). Let's say that you had a trie with the entire English dictionary as data. For each letter from a..z, there is a word that starts with that letter, so the first level of your trie would have the letters a..z. While there are many words that start with 'a', they would all share the same 'a' node within the trie. So when you are adding a new word, you check at each letter whether a node for that letter has already been created. If so, you move down the tree; if not, you create that node.
@randomizednamme Even if two words have the same spelling, they show up in the same dictionary entry. Besides, there would be no way to distinguish them, unless you wanted to have ids for those words? Then you'd just create an extra level.
Awesome, never heard of this sort of tree. One thing that comes to mind immediately is recursion: if I am looking for all words starting with the prefix 'Ca', then at the subtree under 'Ca' I feel you could make some recursive calls. Please share your view on this. I haven't done recursion since my 2nd year CS class (back in 1991 lol).
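Recursion does fit here. One way it could look, as a small Python sketch: walk down to the node for the prefix, then recursively collect every word in the subtree below it. The function names (`insert`, `words_with_prefix`, `_collect`) are my own for illustration, and I use lowercase 'ca' since the sample words are lowercase.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # maps a character to the child TrieNode
        self.is_word = False   # True if a stored word ends at this node


def insert(root, word):
    node = root
    for ch in word:
        if ch not in node.children:
            node.children[ch] = TrieNode()
        node = node.children[ch]
    node.is_word = True


def words_with_prefix(root, prefix):
    """Return all stored words that start with prefix, in alphabetical order."""
    node = root
    for ch in prefix:              # step 1: walk down to the prefix node
        node = node.children.get(ch)
        if node is None:
            return []              # no stored word has this prefix
    results = []
    _collect(node, prefix, results)
    return results


def _collect(node, path, results):
    """Step 2: depth-first recursion over the subtree under the prefix node."""
    if node.is_word:
        results.append(path)
    for ch in sorted(node.children):
        _collect(node.children[ch], path + ch, results)


root = TrieNode()
for w in ["cat", "car", "card", "care", "dog"]:
    insert(root, w)

print(words_with_prefix(root, "ca"))  # ['car', 'card', 'care', 'cat']
print(words_with_prefix(root, "x"))   # []
```

This is also roughly the shape an autocomplete lookup would take: the prefix walk is cheap, and the recursion only visits words that actually match.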
Right. You cannot balance a trie, because the position of a node encodes information -- if you were to move a node, you would necessarily change the data stored in the trie.
Her data structure implementation might have been in Java syntax, but it makes perfect sense in basically any other high level OOP language. Not sure how this is bound to Java as it's a pretty good high-level, language generic, explanation.
"This isn't something CS students might have spent that much time in school, but it's really really important for interview" Translation: You'll never use this in the real world but employers love to make impractical problems part of their interview process. Can't have enough hoops to jump through, especially if it has nothing to do with the role you're hiring for!
I graduated with a CS degree from a top tier school and now work for a big 4 company. I have never used this in 10 years of actual work. It certainly has applications, but it is pretty much irrelevant for 99% of CS grads. If you were to actually use it in the real world, it wouldn't look like this anyway. It's good to know, but silly to test for in an SDE role.
Why are people saying "Thank you so much these are life savers"? She jumps between topics and doesn't even explain them. I have the playback speed turned down and still can't understand the material.
Not taught in schools yet really important for interviews? Yeah, that's bullshit. That is what's wrong with jobs wanting people to know stuff that schools didn't even show them.