For the first feature, your company wants you to design a module for the search engine that can be used to store and fetch words efficiently. This module will act as a dictionary with insert and search functionalities. Moreover, this dictionary should also have a feature for searching whether a given prefix exists in the dictionary or not. This feature can be represented by the startsWith function because a prefix comes at the beginning of the word.
Let’s say we insert the following words in the dictionary:
abc by calling the
insertWord() function. Then, by calling the
searchWord() function with input,
there should return
True. Similarly, calling the
startsWith() function with the prefix
by should also return
To implement this dictionary with efficient lookup times, we will use a trie data structure. Maybe you had already come to this conclusion, or maybe you were considering using a hash table instead. However, we will not use a hash table for this particular scenario is because it would be very inefficient for the prefix searching. Additionally, scaling hashtables for large datasets also increases collisions and space complexity. On the contrary, a trie is efficient in both time and space.
Briefly, a trie is a tree of characters. The tree takes a word as a parameter and creates a new node for each character of that word. The node registered for the last character of the word is marked as the end of the word, using a Boolean variable.
To implement the
WordDictionary class, we will do the following:
- Constructor: We will initialize the root node of the tree in the constructor. This node will be of type
Nodeclass contains a dictionary of nodes and the Boolean
isWord, which determines if the character is at the end of a word or not.
insertWord()function: In this function, we will take in a
wordas input. Starting from the
rootnode, we will add the word’s characters to the
childrendictionary of each character as nested dictionaries. We will check if the child node with the character is present or not at each step. If it’s not present, a new node is initialized. For the last character of the
word, we also set the
Truefor the corresponding node.
searchWord()function: In this function, we begin checking from the
rootnode to see if the first character exists in
children. If it exists, we move on to that node and check its
childrenfor the next character. If at any point the node corresponding to a character is not found, we return
False. If the whole
wordis found but
isWordis not set to
Truefor the last node, we return
Trueis only returned if all the characters match and the word also ends at that point.
startsWith()function: This function is mostly the same as the
searchWordfunction. The only exception is that we do not check if
isWordis also set to
Truein the last-found node because we are not looking for a complete word, a prefix is enough. In the above trie, for instance,
Let’s look at the code for the solution below:
insertWord(): The time complexity is O(l), where l is the length of the word being inserted.
searchWord(): The time complexity is O(l), where l is the length of the word that we need to search in the trie.
startsWith(): The time complexity is O(l), where l is the length of the prefix that we need to search in the trie.
insertWord(): The space complexity is O(l) because, in the worst case, we will add l nodes in the trie.
searchWord(): The space complexity is O(1) because constant space is used while searching the trie.
startsWith(): The space complexity is O(1) because constant space is used while searching the trie.