In phase , the naive method traverses characters, since each extension follows the pathstr[j...i] from the root. The next extension follows str[j+1...i], which overlaps entirely with the previous path except for str[j].
This redundancy can be avoided using shortcuts that exploit the shared structure between successive extensions to avoid repeated traversal.
Suffix Links
A suffix link is a pointer from one internal node u to another internal node v in an implicit suffix tree, where:
- The path to
uis labelled bystr[j...k−1] - The path to
vis labelled bystr[j+1...k−1]
If the path str[j...k−1] is decomposed into x = str[j] and β = str[j+1...k−1], so that str[j...k−1] = xβ. Then if an internal node’s path corresponds to xβ, its suffix link points to the node whose path is β.
These links serve as shortcuts that speed up traversal by removing the need to re-traverse shared substrings.

Corner Cases To Consider
- If the path from
rtouyields a single character substring, the path fromrtovwill yield an empty substring (i.e.r = v)- If
u = rthen the substring traverse is empty, and sov = ras well, therefore the root always links to itself.
Example Of A Suffix Link
In the implicit suffix tree for
str[1...8] = abacabad, examining each internal node reveals the suffix links:
- The path from
rtourepresents the substring “a” (x = a,β = ε) and has a suffix link pointing back to the rootr, as it follows the initial path.- The path from
rtowrepresents the substring “ba” (x = b,β = a) and has a suffix link pointing back tou, which corresponds to the pathβ = a.- The path from
rtovrepresents the substring “aba” (x = a,β = ba) and has a suffix link pointing back tow, which corresponds to the pathβ = ba.
Active Node & Remainder
In any given extension:
- The active node is the internal node at the end of the path
str[j...k-1]under whose direct edge the suffix extension rules are applied. - The remainder is the substring
str[k...i]remaining below the active node, and before the extended characterstr[i+1].
These variables are used to keep track of the extension point—the point in the tree where the next extension rule is applied (i.e. the end of the path labelled str[j...i]).
If the active node is the root, the remainder is updated by removing the first character of the current substring
- The remainder is the portion of the string that remains to be extended.
- When the active node is the root, its suffix link points to itself.
- The suffix link points to a node with one less character (the first character) in its path
- This leads to the removal of the first character from the remainder
- E.g. if the active node is
rand the remainder isstr[k...i] = cabad, following the suffix link updates the remainder tostr[k+1...i] = abad, discardingc.
Using Suffix Links
Starting at extension of phase , there isn’t enough information to use suffix links, the algorithm must first naively locate the initial extension point by traversing the tree from the root.
First Extension Of A Phase
The First Extension of a Phase is Always Rule 1
At the start of any phase , the first extension () always follows Rule 1.
This is because the longest existing path in the tree at this point is
str[1...i], which ends at a leaf and has not yet been extended.Therefore, the next character
str[i+1]is simply appended to this leaf.This behaviour follows from a more general structural property of suffix trees—see Lemma 2.
This provides a guaranteed, well-structured starting point for the phase as, since the path ends at a leaf:
- The active node is the parent of this leaf, and
- The remainder is the substring remaining below the parent (excluding
str[i+1]).
After this first extension, the tree has sufficient structure, and the algorithm has enough information to support suffix link traversal in subsequent extensions.
Extension of phase
The extension
j = 1in phasei + 1 = 9of the implicit suffix tree construction forstr[1...9] = abacabaderequires traversingstr[1...8] = abacabad, which leads to leaf where a rule 1 extension is performed to addstr[9] = e.Here the active node would be
vand the remainder isstr[4...8] = cabad
The active node and remainder for extension are then used for the next extension
Subsequent Extensions
Extension of phase
To find the end of
str[2...i]and extend it bystr[i + 1]:
- From the active node
v, follow its suffix link tow- From
w, traverse down along the remainder substringstr[k...i] = str[4...8] = cabad- Apply the pertinent suffix extension rule
The path to v is the substring str[1...3] = aba, and the remainder substring is str[4...8] = cabad, which is the remaining portion of the path to the extension point (the end of the path labelled str[1...8])
By following the suffix link to w, the algorithm avoids re-traversing the path str[2...3] = ba, allowing it to directly traverse down the remainder substring str[4...8] to reach the extension point (the end of the path labelled str[2...8])
After the extension, the new active node is w, while the remainder remains as str[4...8] = cabad
Extension of phase
To find the end of
str[3...i]and extend it bystr[i + 1]:
- From the active node
w, follow its suffix link tou- From
u, traverse down along the remainder substringstr[4...8] = cabad- Apply the pertinent suffix extension rule
Once again, note that if the active node is the root r, then the suffix link points to itself. Due to this, the first character of the remainder substring needs to be removed, to update the extension point from the end of the path str[1...i] to the end of the path str[2...i].
General Extension Procedure Using Suffix Links
For any extension of phase , where :
- Base Case: At extension , suffix links are not used. Instead:
- Naively locate the initial extension point by traversing the tree from the root.
- Determine the active node
uand the remainderstr[k...i]. - Apply a rule 1 extension (always the case for the first extension).
- For extensions ,
- Given the active node
uand the remainderstr[k...i](which may be empty) from the previous extension : - Follow the suffix link from
uto the receiving nodev, and traverse fromvto the end of the remainder. - Note that, If
u = r, thenv = r, and the first character must be removed from the remainder before traversing.
- Given the active node
- Once the remainder is fully traversed, the extension point is reached, and the appropriate suffix extension rule is applied.
- After each extension, the active node and remainder are updated, and the algorithm moves to the next extension .
This procedure enables efficient traversal between extension points by removing the need to traverse the entire path str[j...i] for each extension.
Every Implicit Suffix Tree Has An Outgoing Suffix Link
The general extension procedure using suffix links assumes that every internal node in the implicit suffix tree has an outgoing suffix link. To prove this, there are some important lemmas to consider:
Lemma 1: Rule 2 extensions are never followed by rule 1 extensons
If extension of phase is performed using rule 2, then extension will not be performed using rule 1
- If rule 2 applies at extension , the path
str[j...i]doesn’t end at a leaf and is followed by other characters- Since
str[j...i]was not a leaf and is a prefix of some longer string, in the next extension , the pathstr[j+1...i]must be the same- Rule 1 only applies if the path ends at a leaf, so it cannot apply here.
Therefore, a rule 2 extension is always followed by either rule 2 or rule 3 in the next extension of the same phase, never rule 1.
Rule 2 Regular:
Rule 2 Alternate:
This lemma is actually a corollary of Lemma 2
Lemma 2: There's more structure below the shorter path
The subtree below some path
str[j...i]in the is a subtree of the subtree below the pathstr[j+1...i].“Subtree below some path
str[j...i]” → Follow the pathstr[j...i]in the tree, the subtree referenced is the subtree formed by any subsequent characters after that path.
- If
str[j...i]is added to the tree at extension of phase ,
- Then
str[j+1...i]is added in the next extension .- If in a later phase,
str[j...i]xis inserted (where is non-empty),
- Then in the next extension
str[j+1...i]xwill also be inserted.- Thus, by the end of any later phase: Every extension under
str[j...i]also exists understr[j+1...i].- However, not all instances of
str[j+1...i]are extensions ofstr[j...i]asstr[j+1...i]may include additional branches in its subtree thatstr[j...i]does not.
- The subtree below
str[1...3] = abahas two branches: one tod, another tocabad.
- It is identical to the subtree below
str[2...3] = babecause every extension ofabais followed by an extension ofba.- In contrast,
str[3] = aalso contains the subtree forba, but has extra structures.
- That’s because
aappears in other places (e.g.,str[1],str[5]) where it’s not part ofba, and those branches (like one beginning withb) are present only undera, notba.- This additional structure is what allows rule 3 to follow rule 2 — even if
str[j...i]isn’t followed by the next character,str[j+1...i]might be, due to these additional branches
Theorem 1: Every internal node in the implicit suffix tree has an outgoing suffix link.
Let be the statement that every internal node in the implicit suffix tree has an outgoing suffix link, we will prove this statement by mathematical induction on .
Base Case:
Let and consider , which contains a single internal node, the root. By definition the root has an outgoing suffix link that points to itself and so is true.
Inductive Hypothesis
Assume that is true, where . That is, assume that _every internal node in _ has an outgoing suffix link.
Inductive Step
In phase
i + 1the implicit suffix tree is constructed from .By the inductive hypothesis every internal node in has an outgoing suffix link at the beginning of the phase and no suffix links are ever removed from the tree.
Thus, all that remains is to prove that every new internal node created in phase has an outgoing suffix link by the conclusion of the phase.
At some extension , a new internal node
vis only added to the tree via Rule 2 Regular extension that splits an existing edge. For this to occur the path labelledstr[j...i]must have been continued by a characterx, such thatx != str[i+1].![[Speeding Up Traversals With Suffix Links.png|An example of a Rule 2 extension adding a new internal node
vto the implicit suffix tree]]By Lemmas 1 and 2 we know that extension can only be performed using a rule 2, or a rule 3, thus there are only two cases to consider:
Case 1
The path labelled
str[j+1...i]continues only viax != str[i+1]and so extensionj + 1will also be performed via rule 2 regular and a new internal nodewwill be created.The path to this new internal node is
str[j+1...i]and so by definition the outgoing suffix link fromvpoints tow![[suffix_link_existence_2.png|If a new internal node
vis created in extensionjand a new internal nodewis created in extensionj + 1, then the outgoing suffix link fromvpoints tow]]Case 2
The path labelled
str[j+1...i]ends in an existing internal nodeu, with one branch extending below viax != str[i+1]and other branches via other characters, i.e. the subtree understr[j+1...i]is larger than the subtree understr[j...i].If none of these characters are
str[i + 1], then thej + 1extension is performed using rule 2 alternate. Otherwise, if one of the characters isstr[i + 1], then the extension is performed using rule 3.In both cases the outgoing suffix link from
wpoints touby definition.![[suffix_link_existence_3.png|If a new internal node
wis created in extension and the pathstr[j+1...i]ends in an existing nodeudirectly below which either a rule 2 alternate, or rule 3 is performed, then the outgoing suffix link fromwpoints touby definition]]Thus, whenever a new internal node is created, its suffix link is resolved in the very next extension of the phase. This however relies on the fact a new node is never created in the last extension of a phase.
- Note the last extension in any phase traverses the path
str[i+i...i] = ""so the extension is made directly from the root.- The suffix
str[i+1]is either added via a rule 2 alternate extension, or via a rule 3 extension, and so no new internal node is created.With this we have shown that by the end of phase , every internal node in will have an outgoing suffix link and so is true.
Therefore, by the principle of mathematical induction is true for all .
Theorem 1 proves that the general extension process is sound as the active node is never a newly created internal node and so its outgoing suffix link is always defined.
However, even when using suffix links, the method still requires -time as it naively traverse the remainder substring character by character in every iteration.
There are four optimisation tricks to ensure the algorithm performs more efficiently.






