we want to implement the auto-complete query. This is the feature that prompts the search engine to give you some suggestions to complete your query when you start typing something in the search bar.
Assume the search engine has the following history of queries: {"beautiful", "best quotes", "best friend", "best birthday wishes", "instagram", "internet"}
. Additionally, you have access to the number of times each query was searched. The following list shows the number each query string occurred, respectively: {30, 14, 21, 10, 10, 15}
.
Given these parameters, we want to implement an autoComplete()
a function that takes in an incomplete query a user is typing and recommends the top three queries that match the prefix and are most popular.
The system should consider the inputs of the
autoComplete()
function as a continuous stream. For example, if theautoComplete("be")
is called, and then theautoComplete("st")
is called, the complete input at that moment will be"best"
. The input stream will end when a specific delimiter is passed. In our case, the delimiter is"#"
, and,autoComplete("#")
will be called. This marks the end of the query string. At this point, your system should store this input string in the record, or if it already exists, it should increase its number of instances.
Suppose, the current user has typed "be"
in the search bar; this will be the input for the autoComplete()
function, meaning autoComplete("be")
will be called. It will return {'beautiful', 'best friend', 'best quotes'}
because these queries match the prefix and are the most popular. The order of queries in the output list is determined by popularity. Then, the user adds "st"
to the query, making the string "best"
, and the autoComplete("st")
will be called. Now, the output should be {'best friend', 'best quotes', 'best birthday wishes'}
. Lastly, the user finishes the query, so the autoComplete("#")
will be called. It will output []
and "best"
will be added to the record to be used in future suggestions.
Solution
To design this system, we will again use the trie data structure. Instead of simply storing the words in the prefix tree, we will now store the query strings. The AutocompleteSystem
class will act as a trie that keeps a record of the previous queries and assigns them a rank based on their number of occurrences.
We will implement the AutocompleteSystem
class as follows:
AddRecord()
: This function inserts records in the trie by creating new nodes. Each node of the trie will have:
- A
children
dictionary - A Boolean called
isEnd
to set the end of the query sentence - A new variable called
data
that is optional, but we can use it to store the whole query sentence in the last character of the sentence. This will increase space complexity but can make computation easier. - A
rank
variable to store the number of occurrences.
AutoComplete()
: This function checks if the input string is not the end of the string delimiter "#"
. If it is not the end of the input string, we append the value to the keyword
member variable. Then, we call the search()
function, which returns the list of tuples as discussed above. On the other hand, if the input string is the end of the input, we add the value present in keyword
to the trie by calling addRecord()
. Next, the value of the keyword
variable is reset to an empty string. Before returning, you can see that we do some computation on the result
list. The list that we received from the search()
function was a list of tuples with rank
and sentence
as elements. We will sort the array in ascending order using the sorted()
function and fetch the first three elements of the sorted list. From this list of three tuples, we will create a list containing only the second element of each tuple, meaning only the sentences.
Time complexity
- constructor: The time complexity is O(n×l), where n records of average length l are traversed to create the trie.
autoComplete()
: The time complexity is O(q+m+[l×log(l)]), where q is the length of the query string at that moment and m is the number of nodes in the trie. The factor l×log(l) indicates the time taken to sort the list of queries to obtain the top three ranking queries.
Space complexity
- constructor: The space complexity is O(n×l), where n records of average length l are stored in the trie.
autoComplete()
: The space complexity is O(n×l), where n records of average length l are stored in the list to return at the end.