Suffix arrays with a twist
Tomasz Kowalski, Szymon Grabowski, Kimmo Fredriksson, Marcin Raniszewski

TL;DR
This paper explores various enhancements to suffix arrays, including navigation strategies, data layout optimizations, and auxiliary data structures, to improve search efficiency and space-time tradeoffs.
Contribution
It introduces modifications to suffix array layout and querying: optimized navigation (in particular, the search for the right interval boundary), a B-tree data layout, compressed prefix lookup tables, and caching of suffix prefixes in a helper array.
Findings
B-tree data layout improves search speed
Optimized interval boundary search significantly reduces query time
Compressed prefix lookup tables save space without sacrificing performance
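To make the "interval boundary" finding concrete, here is a minimal Python sketch of pattern search in a suffix array. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the construction is deliberately naive, and the galloping (doubling) probe used for the right boundary is just one plausible navigation strategy, not necessarily the one the authors evaluate.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; fine for small illustrations only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def _prefix(text, sa, rank, m):
    # The first m characters of the suffix stored at position `rank` of the array.
    return text[sa[rank]:sa[rank] + m]


def left_boundary(text, sa, pattern, lo, hi):
    # First rank in [lo, hi) whose suffix prefix is >= pattern.
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        if _prefix(text, sa, mid, m) < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo


def right_boundary(text, sa, pattern, lo, hi):
    # First rank in [lo, hi) whose suffix prefix is > pattern.
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        if _prefix(text, sa, mid, m) <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo


def search_plain(text, sa, pattern):
    # Baseline: two independent binary searches over the whole array.
    n = len(sa)
    left = left_boundary(text, sa, pattern, 0, n)
    return left, right_boundary(text, sa, pattern, left, n)


def search_galloping(text, sa, pattern):
    # Assumed alternative: find the left boundary, then probe ranks
    # left, left+1, left+3, left+7, ... until a suffix stops matching,
    # and binary-search only inside the last gap.
    n, m = len(sa), len(pattern)
    left = left_boundary(text, sa, pattern, 0, n)
    bound = 1
    while left + bound - 1 < n and _prefix(text, sa, left + bound - 1, m) == pattern:
        bound *= 2
    lo = left + bound // 2
    hi = min(left + bound - 1, n)
    return left, right_boundary(text, sa, pattern, lo, hi)


text = "abracadabra"
sa = build_suffix_array(text)
left, right = search_galloping(text, sa, "abra")
occurrences = sorted(sa[left:right])  # text positions where the pattern occurs
```

Both functions return the same half-open interval of ranks; they differ only in how many suffixes are compared while locating the right end. Galloping tends to pay off when the matching interval is short relative to the array, which is the common case for longer patterns.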
Abstract
The suffix array is a classic full-text index, combining effectiveness with simplicity. We discuss three approaches aiming to improve its efficiency even further: changes to the navigation, changes to the data layout, and adding extra data. In short, we show that (i) how we search for the right interval boundary significantly impacts the overall search speed, (ii) a B-tree data layout easily wins over the standard one, (iii) the well-known idea of a lookup table for the prefixes of the suffixes can be refined using compression, and (iv) caching prefixes of the suffixes in a helper array can offer another practical space-time tradeoff.
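The lookup-table idea mentioned in the abstract can be sketched as follows. This is a hedged illustration of the well-known uncompressed variant only: the table is a plain Python dict keyed by the first k characters of each suffix, the names `build_prefix_lut` and `search_with_lut` are hypothetical, and the compression the paper proposes is not modelled here.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; fine for small illustrations only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_prefix_lut(text, sa, k):
    # Map every k-character prefix to the half-open range of ranks
    # whose suffixes start with it. Suffixes shorter than k are skipped,
    # since they cannot match a pattern of length >= k.
    lut = {}
    for rank, pos in enumerate(sa):
        key = text[pos:pos + k]
        if len(key) < k:
            continue
        if key in lut:
            lut[key] = (lut[key][0], rank + 1)
        else:
            lut[key] = (rank, rank + 1)
    return lut


def search_with_lut(text, sa, lut, k, pattern):
    # Narrow the search range with the table, then binary-search inside it.
    m = len(pattern)
    if m >= k:
        lo, hi = lut.get(pattern[:k], (0, 0))  # unknown prefix: empty range
    else:
        lo, hi = 0, len(sa)  # pattern too short for the table: full range
    # Left boundary: first rank whose m-character prefix is >= pattern.
    a, b = lo, hi
    while a < b:
        mid = (a + b) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            a = mid + 1
        else:
            b = mid
    left = a
    # Right boundary: first rank whose m-character prefix is > pattern.
    b = hi
    while a < b:
        mid = (a + b) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            a = mid + 1
        else:
            b = mid
    return left, a


text = "abracadabra"
sa = build_suffix_array(text)
lut = build_prefix_lut(text, sa, 2)
left, right = search_with_lut(text, sa, lut, 2, "bra")
occurrences = sorted(sa[left:right])  # text positions where "bra" occurs
```

The table trades extra memory for skipping the first few binary-search steps, which are the ones that touch distant parts of the array. A dict is used here for brevity; an array indexed by the numeric value of the k-gram would be the more typical choice for small alphabets.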
