Efficient string handling is very important in coding tests and algorithm problem solving. Understanding string-related problems is essential as they are frequently included by many companies. Today, we will learn how to efficiently store and retrieve strings using a data structure called ‘Trie’.

What is Trie?

A trie, also known as a ‘Prefix Tree’ or ‘Digital Tree’, is a tree-based data structure used to store a collection of strings. It is primarily suitable for string searching or implementing auto-complete features. A key feature of the trie is that each node contains information about each character of the string, allowing for very efficient storage and retrieval of strings that share common prefixes. For example, if there are the strings “hello”, “helicopter”, and “helic”, they can be represented with a single node because they share the first two characters “he”.

Problem Statement

Here is a problem where the trie data structure can be used:

Problem: Implement a function that counts how many words in the given list have a specific prefix.

Input: An array of strings words (e.g., [“apple”, “app”, “apricot”, “banana”, “band”, “bandana”]) and a string prefix (e.g., “ap”).

Output: The number of words that start with the given prefix.

Solution Process

1. Implementing the Trie

First, we will define a node class to implement the trie data structure. Each node will have the following properties:

  • children: The child nodes for moving to the next character.
  • isEndOfWord: A boolean variable indicating whether the node is the end of a word.
class TrieNode {
        Map children;
        boolean isEndOfWord;

        public TrieNode() {
            children = new HashMap<>();
            isEndOfWord = false;
        }
    }

2. Trie Insert Method

We will implement a method to insert a word into the trie. We will traverse each character of the string and create a new child node if the child node for that character does not exist, and set isEndOfWord to true when we reach the last character.

class Trie {
        private final TrieNode root;

        public Trie() {
            root = new TrieNode();
        }

        public void insert(String word) {
            TrieNode node = root;
            for (char ch : word.toCharArray()) {
                node = node.children.computeIfAbsent(ch, c -> new TrieNode());
            }
            node.isEndOfWord = true;
        }
    }

3. Prefix Search Method

To count the number of words that start with a given prefix, we first need to find the node corresponding to the prefix. After finding the prefix node, we will count the number of words in the subtree using DFS.

public int countWordsWithPrefix(String prefix) {
        TrieNode node = root;
        for (char ch : prefix.toCharArray()) {
            if (!node.children.containsKey(ch)) {
                return 0; // Return 0 if prefix does not exist
            }
            node = node.children.get(ch);
        }
        return countWords(node);
    }

    private int countWords(TrieNode node) {
        int count = node.isEndOfWord ? 1 : 0; // Add 1 if current node is the end of a word
        for (TrieNode child : node.children.values()) {
            count += countWords(child); // Recursive call for child nodes
        }
        return count;
    }

Complete Code

Now, I will present the entire complete code by integrating all the methods above.

import java.util.HashMap;
    import java.util.Map;

    class TrieNode {
        Map children;
        boolean isEndOfWord;

        public TrieNode() {
            children = new HashMap<>();
            isEndOfWord = false;
        }
    }

    class Trie {
        private final TrieNode root;

        public Trie() {
            root = new TrieNode();
        }

        public void insert(String word) {
            TrieNode node = root;
            for (char ch : word.toCharArray()) {
                node = node.children.computeIfAbsent(ch, c -> new TrieNode());
            }
            node.isEndOfWord = true;
        }

        public int countWordsWithPrefix(String prefix) {
            TrieNode node = root;
            for (char ch : prefix.toCharArray()) {
                if (!node.children.containsKey(ch)) {
                    return 0;
                }
                node = node.children.get(ch);
            }
            return countWords(node);
        }

        private int countWords(TrieNode node) {
            int count = node.isEndOfWord ? 1 : 0;
            for (TrieNode child : node.children.values()) {
                count += countWords(child);
            }
            return count;
        }
    }

    public class Main {
        public static void main(String[] args) {
            Trie trie = new Trie();
            trie.insert("apple");
            trie.insert("app");
            trie.insert("apricot");
            trie.insert("banana");
            trie.insert("band");
            trie.insert("bandana");
            
            System.out.println(trie.countWordsWithPrefix("ap")); // Output: 3
        }
    }

Conclusion

Today, we looked at the trie data structure. Tries are very useful for string searching and implementing auto-complete features, with efficient storage and retrieval capabilities. It would be beneficial to actively utilize tries to solve various string-related problems. In particular, for those preparing for coding tests, tries are essential data structures that must be understood.

Not only for direct problem-solving, but the structure and utilization of tries are techniques widely used in practice. Challenge yourself with various algorithm problems starting with tries!