In phase , the naive method traverses characters, since each extension follows the pathstr[j...i] from the root. The next extension follows str[j+1...i], which overlaps entirely with the previous path except for str[j].

This redundancy can be avoided using shortcuts that exploit the shared structure between successive extensions to avoid repeated traversal.

A suffix link is a pointer from one internal node u to another internal node v in an implicit suffix tree, where:

  • The path to u is labelled by str[j...k−1]
  • The path to v is labelled by str[j+1...k−1]

If the path str[j...k−1] is decomposed into x = str[j] and β = str[j+1...k−1], so that str[j...k−1] = xβ. Then if an internal node’s path corresponds to , its suffix link points to the node whose path is β.

These links serve as shortcuts that speed up traversal by removing the need to re-traverse shared substrings.

Active Node & Remainder

In any given extension:

  • The active node is the internal node at the end of the path str[j...k-1] under whose direct edge the suffix extension rules are applied.
  • The remainder is the substring str[k...i] remaining below the active node, and before the extended character str[i+1].

These variables are used to keep track of the extension point—the point in the tree where the next extension rule is applied (i.e. the end of the path labelled str[j...i]).

Starting at extension of phase , there isn’t enough information to use suffix links, the algorithm must first naively locate the initial extension point by traversing the tree from the root.

First Extension Of A Phase

The First Extension of a Phase is Always Rule 1

At the start of any phase , the first extension () always follows Rule 1.

This is because the longest existing path in the tree at this point is str[1...i], which ends at a leaf and has not yet been extended.

Therefore, the next character str[i+1] is simply appended to this leaf.

This behaviour follows from a more general structural property of suffix trees—see Lemma 2.

This provides a guaranteed, well-structured starting point for the phase as, since the path ends at a leaf:

  • The active node is the parent of this leaf, and
  • The remainder is the substring remaining below the parent (excluding str[i+1]).

After this first extension, the tree has sufficient structure, and the algorithm has enough information to support suffix link traversal in subsequent extensions.

Extension of phase

The extension j = 1 in phase i + 1 = 9 of the implicit suffix tree construction for str[1...9] = abacabade requires traversing str[1...8] = abacabad, which leads to leaf where a rule 1 extension is performed to add str[9] = e.

Here the active node would be v and the remainder is str[4...8] = cabad

The active node and remainder for extension are then used for the next extension

Subsequent Extensions

Extension of phase

To find the end of str[2...i] and extend it by str[i + 1]:

  1. From the active node v, follow its suffix link to w
  2. From w, traverse down along the remainder substring str[k...i] = str[4...8] = cabad
  3. Apply the pertinent suffix extension rule

The path to v is the substring str[1...3] = aba, and the remainder substring is str[4...8] = cabad, which is the remaining portion of the path to the extension point (the end of the path labelled str[1...8])

By following the suffix link to w, the algorithm avoids re-traversing the path str[2...3] = ba, allowing it to directly traverse down the remainder substring str[4...8] to reach the extension point (the end of the path labelled str[2...8])

After the extension, the new active node is w, while the remainder remains as str[4...8] = cabad

Extension of phase

To find the end of str[3...i] and extend it by str[i + 1]:

  1. From the active node w, follow its suffix link to u
  2. From u, traverse down along the remainder substring str[4...8] = cabad
  3. Apply the pertinent suffix extension rule

Once again, note that if the active node is the root r, then the suffix link points to itself. Due to this, the first character of the remainder substring needs to be removed, to update the extension point from the end of the path str[1...i] to the end of the path str[2...i].

For any extension of phase , where :

  1. Base Case: At extension , suffix links are not used. Instead:
    • Naively locate the initial extension point by traversing the tree from the root.
    • Determine the active node u and the remainder str[k...i].
    • Apply a rule 1 extension (always the case for the first extension).
  2. For extensions ,
    • Given the active node u and the remainder str[k...i] (which may be empty) from the previous extension :
    • Follow the suffix link from u to the receiving node v, and traverse from v to the end of the remainder.
    • Note that, If u = r, then v = r, and the first character must be removed from the remainder before traversing.
  3. Once the remainder is fully traversed, the extension point is reached, and the appropriate suffix extension rule is applied.
  4. After each extension, the active node and remainder are updated, and the algorithm moves to the next extension .

This procedure enables efficient traversal between extension points by removing the need to traverse the entire path str[j...i] for each extension.

The general extension procedure using suffix links assumes that every internal node in the implicit suffix tree has an outgoing suffix link. To prove this, there are some important lemmas to consider:

Theorem 1 proves that the general extension process is sound as the active node is never a newly created internal node and so its outgoing suffix link is always defined.

However, even when using suffix links, the method still requires -time as it naively traverse the remainder substring character by character in every iteration.

There are four optimisation tricks to ensure the algorithm performs more efficiently.