Signup/Sign In

Trie Data Structure

Posted in Programming   LAST UPDATED: OCTOBER 11, 2021

    In the previous post we covered the the Minimum Spanning Tree. many of you must have gone through it, and would have liked the algorithms explained there, but today you many of you must have already forgotten their implementation details. So what would you do? You will open your favorite browser and just start typing minimum... , and voila, the link to the blog will be automatically suggested to you by the browser. This happens to us very often, right? We open a link, read something, and then when we again type in something similar in the search bar of our browser, the browser automatically suggests the link.

    But how does our browser remembers all this? How does it know what we are looking for?

    A network browser keeps a history of the URLs of sites that you have visited. By organizing this history as a trie data stucture, you need only type the prefix of a previously used URL and the browser can complete the URL very easily.

    Does this sound something familiar? The autofill feature also uses the same technique. It maintains trie of strings and auto completes the word by searching for a string with the same prefix and having maximum frequency count.


    The Trie Data Structure

    Tries are made up of a set of nodes connected by pointers, which indicate parent-child relationship between them. Each node can contain multiple branches, with each of the branches representing a possible key character.

    Let us consider a trie which stores strings of English alphabets, in this case we will have a total 26 possible characters. In case of binary tries, we would have only 2 characters.

    In the usual implementation, we have a blank root node which points to all other starting characters of the set of strings, which then point to the corresponding next character.

    So let us build a trie. Suppose we have a list of strings ,

    List = { "study", "student", "stuff" , "ton", "tight" }

    So we start with a blank node and point it to character 's'. Now we will keep on linking it to the next characters until the end is reached, i.e., we finish the 'study' string. Now for 'student', we start from blank node, search if there is any 's' there. Since we already have one, now we search if it has a connected node with character 't'. We proceed in the same way and create a new node if we are not able to find an existing one.

    trie data structure example


    Trie Implementation in C++

    Now that we know how a trie data structure is formed, let us see its implementation in C++.

    #include<bits/stdc++.h>
    
    using namespace std;
    
    class trie {
    
        public:
        // pointers to all possible child nodes
        trie *node[total_size];
        int count_words;
    
        /* 
           a count variable storing frequency of string ending at that node
           function to make a new node and initialise it
        */
        trie *make_new_node() {
            trie *new_node = new trie;
            new_node->count_words = 0 ;
            for(int i=0;i<total_size;i++)
                new_node->node[i]= NULL;
            return new_node;
        }
    
        // function to insert a new string in trie
        void insert(trie *root, string s)
        {
            trie *temp = root;
            int i,id;
            bool flag = true; // to check if string is already present
            for(i=0;i<s.size();i++)
            {
                id = s[i] - 'a';
                // if there is no next node, make one
                if(!temp->node[id])
                {
                    flag = false;
                    temp->node[id] = make_new_node();
                }
                temp = temp->node[id];
            }
            // increment count of the string
            temp->count_words = temp->count_words + 1 ;
            if(flag)
                cout<<"String already present in trie , count updated .\n";
            else
                cout<<"String inserted !\n";
        }
    
        // function to search if a given string exists in trie
        bool search(trie *root,string s)
        {
            trie *temp = root;
            int i, id;
            for(i=0;i<s.size();i++)
            {
                id = s[i]-'a';
                // if next node is null , but string has more characters left, return false
                if(!temp->node[id])
                    return false;
             
                temp = temp->node[id];
            }
    
            // if node has a positive count , return true
            if(temp->count_words > 0)
                return true;
            else
                return false;
            }
        }
    
        // function to count frequency of strings
        int frequency(trie *root,string s)
        {
            trie* temp = root;
            int i, id;
            for(i=0;i<s.size();i++)
            {
                id = s[i] - 'a';
                
                if(!temp->node[id])
                    return 0;
               
                temp = temp->node[id];
            }
    
            return temp->count_words;
        }
    };
    
    // main method
    int main()
    {
        int t, q, i, p;
        string s;
        trie node;
        trie *root = new trie ;
        root = node.make_new_node();
    
        // insertion
        cin >> t;
        while(t--)
        {
            cin >> s;
            node.insert(root,s);
        }
    
        //searching
        cin >> q;
        while(q--)
        {
            cin >> s;
            
            if(node.search(root,s))
                cout << "String Found\n";
            else
                cout << "String Not Found\n";
        }
    
        // output frequency
        cin >> q;
        while(q--)
        {
            cin >> s;
            p = node.frequency(root,s);
            cout << p << endl;
        }
    
        return 0;
    }

    Sample Input:

    3
    study
    student
    study
    2
    top
    study
    3
    lol
    study
    ss

    Output:

    String inserted !
    String inserted !
    String already present in trie , count updated .
    String Not Found
    String Found
    0
    2
    0
    


    Time Complexity of Trie

    Time Complexity of trie creation: O(Length of longest string * Number of strings)

    Time Complexity of searching: O(Length of string)

    You may also like:

    About the author:
    In love with Algos !
    Tags:Data Structures
    IF YOU LIKE IT, THEN SHARE IT
     

    RELATED POSTS