The Burrow-Wheelers Transform of a string is a useful pre processing step to parse large strings, and match very short patterns.
The semantics of LF mapping is the same for inversion and pattern matching. The minus 1 in ep.
When doing inversion, you have the burrow wheelers transform, the string, you have the rank array and the nocc pre processed table. Thus pattern matching is proportional to the pattern length.
To find out where the pat appear in the original text you use the suffix array needed to compute the burrow wheelers transform
For go, sp = 2 and ep = 3, SA[2] and SA[3] gives the start position of the occurrences
Matching ogo In The BWT lo$oogg
pat = gogo
bwt = lo$oogg
Preprocess the rank matrix just as in inversion to get:
rank = [$ = 1, g = 2, l = 4, o = 5]
Initialise the start point sp and end point ep to 1 and 7 respectively.
Searching backwards in pat, the first letter to search for is o. The start point is updated using the rule
So to search for go, we want to minimise the range covered by the start and end point. So starting at o:
sp = rank(o) + nocc(o, BWT[1...0]) = 5 + nocc(o, "") = 5 + 0 = 5ep = rank(o) + nocc(o, BWT[1...7]) - 1 = 5 + 3 - 1 = 7
Next we need to search for g:
sp = rank(g) + nocc(g, BWT[1...5-1]) = 2 + 0 = 2ep = rank(g) + nocc(g, BWT[1...7]) - 1 = 2 + 2 - 1 = 3
So cyclically sp and ep points to the occurrence of go
Next we nee to search for o,
sp = rank(o) + nocc(o, BWT[1...2-1]) = 5 + 0 = 5ep = rank(o) + nocc(o, BWT[1...3]) - 1 = 5 + 1 - 1 = 5
So cyclically sp and ep point to an occurrence of ogo. More specifically the range bounded by sp and ep are all occurrences
If we search for g now
sp = rank(g) + nocc(o, BWT[1...5-1]) = 2 + 0 = 2ep = rank(g) + nocc(o, BWT[1...5]) - 1 = 2 + 0 - 1 = 1
As gogo doesn’t exist, the ep is greater than the sp.