Post

Trie 前缀树

Data Structure: Trie 数据结构——前缀树

Trie 前缀树

Trie

A Trie(/ˈtraɪ/, to distinguish from ‘tree’) is a specialized data structure to retrieve strings. It works better than normal Binary Search Trees or Binary Trees

1 Structure

Trie is a k-way tree with each of its nodes having a label. This label uses true or false to distinguish Terminal Nodes and Non-Terminal Nodes. Terminal Nodes stand for a word from the letters starting from the root, while Non-Terminal Nodes are part of a word and do not form a word. The root node is labeled false and stands for an empty string. Each node is connected to its children nodes letter-wise.

TrieExample An Example Of Trie

2 Operations

1
2
3
4
5
6
7
8
function search(node,queryStr):
    if( queryStr == "" ):
        return node.label
    c, tail = queryStr.splitAt(0)
    if(node.children[c] == null):
        return false
    else
        return search(node.children[c],tail)

2.2 Insert

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
function insert(node,str):
    if(str == ""):
        node.label = true
        return
    c,tail = str.splitAt(0)
    if(node.children[c] != null):
        return insert(node.children[c],tail)
    else
        return addNewBranch(node,str)

function addNewBranch(node,str):
    if(str == ""):
        node.label = true
        return
    c,tail = str.splitAt(0)
    node.children[c] = new Node(false)
    return addNewBranch(node.children[c],tail)

2.3 Delete

1
2
3
4
5
6
7
8
9
# No trimming:
function remove(str):
    node = searchNode(this::root,str)
    if(node == null || node.label == false):
        return false
    else
        node.label=false
        return true

2.4 Longest Prefix

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
function longestPrefix(node, str, res):
    if(str == ""):
        if(node.label):
            return res
        else
            return null
    c,tail = s.splitAt(0)
    if(node.children[c]==null):
        if(node.label):
            return res
        else
            return null
    else
        child = longestPrefix(node.children[c], tail, res+c)
        if(child != null): # child result is longer than current result
            return child
        else if(node.label): # tail is not found in child nodes, but current prefix is not null
            return res
        else
            return null

2.5 All Keys

1
2
3
4
5
6
7
8
9
10
11
12
13
14
function keysStartsWith(prefix):
    node = searchNode(this.root, prefix)
    if(node == null):
        return null
    else
        return findAll(node, prefix)

function findAll(node,prefix):
    keys = []
    if(node.label):
        keys.add(prefix)
    for c in node.children.keys():
        keys.append(findAll(node.children[c],prefix+c))
    return keys

3 Complexity

MethodBSTBST with HashTrie
search$O(m\times log(n))$$O(m + log(n))$$O(m)$
insert$O(m\times log(n))$$O(m + log(n))$$O(m)$
remove$O(m\times log(n))$$O(m + log(n))$$O(m)$
longestPrefix$O(m\times n)$$O(m + n)$$O(m)$
keysStartsWith$O(m\times n)$$O(m + n)$$O(n+m)$(average)
This post is licensed under CC BY 4.0 by the author.