This old note about suffix arrays which provides O(N) neighbor pair match lengths is exactly analogous to using "follow pointers" for O(N) string matching in suffix tries.
(their paper also contains a proof of O(N)'ness , though it is obvious if you think about it a bit; see comments on previous post about this).
Doing Judy-ish stuff for a suffix tree is exacly analogous to the "introspective" stuff that's done in good suffix array sorters like divsufsort.
By Judy-ish I mean using a variety of tree structures and selecting one for the local area based on its properties. (eg. nodes with > 100 children switch to just using a radix array of 256 direct links to kids).
Suffix tries are annoying because it's easy to slide the head (adding nodes) but hard to slide the tail (removing nodes). Suffix arrays are even worse in that they don't slide at all.
The normal way to adapt suffix arrays to LZ string matching is just to use chunks of arrays (possibly a power-of-2 cascade). There are two problems I haven't found a good solution to. One is how to look up a string in the chunk that it is not a member of (eg. a chunk that's behind you). The other is how to deal with offsets that are in front of you.
If you just put your whole file in one suffix array, I believe that is unboundedly bad. If you were
allowed to match forwards, then finding the best match would be O(1) - you only have to look at the
two slots before you and after you in the sort order. But since we can't match forward, you have to
scan. The pseudocode is like this :
do both forward and backward :
start at the sort position of the string I want to match
walk to the next closest in sort order (this is an O(1) table lookup)
if it's a legal match (eg. behind me) - I'm done, it's the best
if not, keep walking
the problem is the walk is unbounded. When you are somewhere early in the array, there can be an arbitrary
number (by which I mean O(N)) of invalid matches between you and your best match in the sort order.
Other than these difficulties, suffix arrays provide a much simpler way of getting the advantages of suffix tries.
Suffix arrays also have implementation advantages. Because you separate the suffix string work from the rest of your coder it makes it easier to optimize each one in isolation, you get better cache use and better register allocation. Also, the suffix array can use more memory during the sort, or use scratch space, while a trie has to hold its structure around all the time. For example some suffix sorts will do things like use a 2-byte radix in parts of the sort where that makes sense (and then they can get rid of it and use it on another part of the sort), and that's usually impossible for a tree that you're holding in memory as you scan.