[Data Structures] Tries (with Full Code)

4/27/2020

To watch the youtube video that goes over this topic, click here.

Introduction: Tries

A Trie is a tree data structure that stores sequences in a way such that overlapping elements (in the same position) can share the same node.

For example, "apple" and "apply" are two distinct sequences of characters, but they both share the initial sequence "appl".

A Trie data structure is a tree that holds sequences.

A Trie is implemented as a tree. Each node of the tree holds:

The data that's part of the sequence
A boolean that says whether the node is an end of a sequence
A list of pointers to other nodes

Trie's Underlying Data Model

As the picture above shows, a Trie is represented as a tree of nodes. However, unlike a binary trees where each node can just have a left and right pointer, you have to decide how many pointers each Trie node is going to have.

For example, a Trie that holds characters only containing the lowercase English alphabet will need exactly 26 pointers to other nodes. Or maybe our Trie holds phone numbers, in which case we expect only digits and each node holds 10 pointers (one for each digit).

We could hardcode each pointer if we know exactly what our Trie is going to hold.

We can clean this up by using an array (which still requires knowing the mapping. Ex. next[0] maps to the character 'a').

A more flexible approach (but potentially less efficient) is using a hash map, which means your Trie can scale to as many nodes as required without needing to know exactly how many ahead of time.

Now we can handle unexpected character inputs like punctuation (! ? &) or special characters.

We initialize our Trie with an empty root TrieNode, and from that rootnode we add/update nodes as necessary with methods designed to traverse and edit TrieNodes.

Trie Operations (with Pseudocode)

As with most tree data structures, traversing nodes is best done with recursion or iteration. Most of the operations we'll see utilize iteration, but recursion is also fine.

Initializing the Trie
Assuming we have our TrieNode structure set up, we start by initializing an empty node to be the root of our tree.

Insert
If we want to insert a sequence into our Trie, we iteratively traverse the Trie for each element in the sequence. If the node doesn't exist for a certain element in the Trie, then we create it. At the last element of the sequence, we mark the node as endOfSequence=true to indicate that node ends a sequence.

Contains
To check if a sequence was inserted into our Trie, we iteratively traverse the Trie. If we ever find that one of the elements in the sequence doesn't have a node for it in the Trie, we return false. If we get to the end of the sequence and the node in Trie doesn't have endOfSequence==true then we return false. (This can happen in cases where we inserted longer sequences. Ex. Insert "antler" but check if our Trie contains "ant".)

Prefix Exists
To check if there's any sequence in our Trie that starts with a sequence, we can iteratively traverse the Trie and make sure each element of the sequence has a node in the Trie. For example, check if any sequences begin with "ant".

(This is very similar to the Contains operation so code is omitted in this blog post. See the full code on github to see how it's implemented.)

The time and space complexity for all of these operations are O(number of elements in the sequence). Tries are very efficient.

To see full Trie code, see the code I wrote on github that creates a Trie class.

When To Use Tries

Tries are really good data structures if you need to:

Autocomplete a word. (You can pass a your sequence into the Trie and the Trie can look down the node paths to find matching words.)
Create a dictionary. Any time you need to insert words and check if they exist, tries are the best data structure. They're even more efficient on memory than hash maps.

Tries are good for any type of sequence, not just words.

To test your knowledge, Leetcode has a problem on implementing the methods of a Trie.

Like this content and want more? Feel free to look around and find another blog post that interests you. You can also contact me through one of the various social media channels.

Twitter: @srcmake
Discord: srcmake#3644
Youtube: srcmake
Twitch: www.twitch.tv/srcmake
Github: srcmake

Comments are closed.

Author

Hi, I'm srcmake. I play video games and develop software.

Pro-tip: Click the "DIRECTORY" button in the menu to find a list of blog posts.

License: All code and instructions are provided under the MIT License.

Discord

Chat with me.

Youtube

Watch my videos.

Twitter

Get the latest news.

Twitch

See the me code live.

Github

My latest projects.