To ensure Ukkonens Algorithm maintains linear-time complexity, the general incremental procedure is optimised through four implementation tricks.

Space-Efficient Representation Of Edge Labels

Explicitly storing the substring labels of each edge in a suffix tree requires space.

Such a representation prevents the development of a linear-time construction method, as it would be impossible to even output the resulting tree in time.

A more efficient approach involves implicitly representing edge labels, as follows:

Define a tuple to denote the substring str[j...i], where .

This approach reduces the space complexity to , assuming a fixed-size alphabet.

Skip Counting

When extending a suffix tree, a path labelled by the remainder substring must already exist below the active node. Therefore, it is unnecessary to traverse this path character by character.

The skip-counting process optimises this redundancy by following these steps:

  1. At each internal node, check the first character of each outgoing edge against the current position in the remainder, to determine which edge continues the path.
  2. Use the number of characters along each edge to determine how far the traversal progresses through the remainder substring.
  3. Skip to the next internal node along the path, rather than traversing one character at a time.
  4. Repeat this skip-count process, possibly passing through multiple internal nodes, until the extension point is reached.

By applying this optimisation, the time required for traversal depends on the number of nodes along the path, rather than the sum of the edge lengths.

When combined with suffix links, skip-counting reduces the worst-case time complexity of the extension process to

The Show Stopper Rule

The showstopper rule states that if extension in phase is performed using rule 3, the phase can be immediately terminated and extension for phase can begin.

As rule 3 extension don’t alter the state of the suffix tree, it would remain unchanged until the first extension of the next phase (which is always rule 1 after the base case, phase ).

Therefore, the traversal of all extension points for in phase can be skipped, and the process can move directly to the first extension of phase .

Rapid Leaf Extension

The regular suffix tree for str[1...n] contains leaves, one for each suffix.

The only extensions that create a leaf are rule 2 extensions, and the only extensions that update the path to a leaf are rule 1 extensions.

Intuitively, Lemma 5 shows that a suffix str[j...n] is added to the tree via a rule 2 extension in some phase .

In combination with Lemma 4, the prefix str[j...i] of the full suffix is then extended by one character in each subsequent phase via rule 1 extensions, until phase where the full suffix exists implicitly in the tree.

In summary:

  • Rule 2 extensions add new suffixes to the tree.
  • Rule 1 extensions grow existing suffixes in the tree.

Once a suffix is added to the tree, it is never added again and so if extension in phase is performed using rule 2, then any extension for subsequent phases will be performed using rule 1.

Optimising This Redundancy

The goal is to track how many extensions have already been added using rule 2 before phase . To do this, define:

  • last_j as the last extension that created a leaf via rule 2.

Such that, in each phase, all extensions will be rule 1 extensions.

Once a leaf is added, the only change to the edge involves extending it by one character in each phase, using space-optimal representation, this involves incrementing the end index.

As all leaf edges share the same end index in each phase, simply have a global pointer that is incremented in each phase. That is define:

  • globalEnd as a shared end index for all leaves

By setting globalEnd = i at the start of phase , all necessary rule 1 extensions are performed implicitly without traversal. Therefore, the first last_j extensions in phase are handled in time, and explicit extensions start from

Rapid Leaf Extensions

Given str[1...n], in Ukkonen’s algorithm, all rule 1 extensions are handled implicitly as follows:

  • Initialise last_j = 0 and globalEnd = 0.
  • At the start of each phase , increment globalEnd.
  • Begin explicit extensions from .
  • If an explicit extension uses rule 2:
    • Create a new leaf numbered and attach it appropriately.
    • Set the edge’s endIndex as a pointer to globalEnd.
    • Increment last_j.

This mechanism allows all early rule 1 extensions to be performed in constant time.

Additional Steps To Rule 3s

Intuitively, Lemma 6 shows that although a rule 3 extension doesn’t alter the tree, there is still “work waiting to happen”.

If extension in phase is a rule 3 extension:

  • As no internal node is created, last_j is not incremented
  • By the show stopper rule, the next phase immediately begins

Therefore, the next explicit (involves traversal) extension occurs at extension in phase (all extensions up to and including last_j, are rule 1’s that don’t require traversal due to globalEnd)

An issue with this is that:

  • Extension in phase traverses the path str[j...i-1], storing str[k...i-1] as the remainder
  • Whereas, extension of phase traverses the path str[j...i]
  • However, the remainder is one character short — it lackstr[i].

Additionally, the remainder below the active node u is only traversed by skip counting, and as rule 3 doesn’t create an internal node, no suffix link is followed.

Rule 3 Extensions In Ukkonen's

If extension of phase is performed using rule 3, the following steps are taken:

  1. Append str[i] to the remainder, extending it from str[k...i−1] to str[k...i].
  2. If a previous extension created an internal node v, set its outgoing suffix link from v to the current active node u
  3. Terminate phase using the showstopper rule.

Summarising Each Optimisation

The first two optimisations target cost per extension:

  • The space-efficient representation reduces the cost of updating and storing leaf edges, enabling rapid leaf extension by allowing all leaves to share a common end index.
  • Skip counting reduces traversal time by limiting traversals to the number of nodes visited rather than the total number of characters.

The next two optimisations target the total number of extensions

  • The show stopper rule terminates phases early when further extensions are guaranteed to be redundant.
  • Rapid leaf extension eliminates the need to explicitly handle many rule 1 extensions,

Together, these strategies transform Ukkonen’s algorithm from to by both reducing the cost per extension and the total number of extensions.