Given a string, S, of length m, a suffix tree for S is a rooted directed tree with m leaves corresponding to the m suffixes of S. Each edge is labeled with a substring of S in such a way that the edge labels on any path from the root to a leaf spell out the suffix corresponding to that leaf when concatenated. No two edges out of a node begin with the same letter, and every internal node has at least two children (except possibly the root).
Definition: The string-depth of a node is the length of the string that leads to that node. A path that ends in the middle of an edge is said to split that edge.
Implicit suffix tree: An implicit suffix tree can be formed by building a suffix tree for S$ (the string S followed by a character not occurring in S) then removing every $, edges with no labels, and nodes with fewer than two children.
Ukkonen's algorithm forms the implicit suffix tree for the prefix S[1..i+1] from the implicit suffix tree for S[1..i] in phase i, which consists of i+1 extensions, extending the suffix S[j..i] to S[j..i+1] according to the following rules:
Extension rule 1: If S[j..i] ends at a leaf, the character S(i+1) is added to the label of that leaf.
Extension rule 2: If no path from the end of S[j..i] starts with S(i+1) a new leaf edge (to a leaf numbered j) is created. A new node needs to be created if S[j..i] ended in the middle of an edge.
Extension rule 3: If a path from the end of S[j..i] already starts with S(i+1) nothing needs to be done.
Definition: A suffix link goes from a node labeled xw to a node, s(v), labelled w. x is a single letter and w is a string (possibly empty).
Single extension algorithm:
Speedup trick 1: When following a path that must exist in the tree, just match the first letter on each edge
Edge label compression: Label edges with the pair of integers (begin position, end position) instead of explicitly giving the letters labeling the edge
Speedup trick 2: In any phase, when extension rule 3 first applies, that phase can be ended
Speedup trick 3: Label leaves (begin position, end of string) where "end of string" is a global position variable that is equal to i+1 in phase i+1.