Linear-time string indexing and analysis in small space
Djamal Belazzougui, Fabio Cunial, Juha Kärkkäinen, and Veli Mäkinen

TL;DR
This paper presents deterministic and randomized algorithms for constructing the Burrows-Wheeler transform and related string indexes in linear time using small space, enabling efficient string analysis and indexing in large-scale and DNA sequencing applications.
Contribution
It introduces the first deterministic linear-time algorithms for BWT and related indexes within small space, improving over previous methods with higher time complexity.
Findings
Deterministic O(n)-time BWT construction using O(n log σ) bits of space
Linear-time enumeration of suffix tree nodes from BWT
Constant-time operations in bidirectional BWT index
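
For readers unfamiliar with the transform these findings refer to, here is a minimal sketch of what the BWT computes and how it is inverted. It is not the paper's algorithm: it sorts all rotations naively, which takes far more than linear time and space. The function names `bwt` and `inverse_bwt` and the `$` sentinel are assumptions made for illustration, and the sentinel is assumed to sort before every character of the input.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: last column of the sorted rotations of text + sentinel.

    Illustrative only; this is NOT the linear-time construction from the paper.
    Assumes the sentinel sorts before every character in `text`.
    """
    if sentinel in text:
        raise ValueError("text must not contain the sentinel")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Recover the original text from its BWT using the LF-mapping."""
    # rank[i] = number of earlier occurrences of last[i] in the last column.
    counts: dict[str, int] = {}
    rank = []
    for ch in last:
        rank.append(counts.get(ch, 0))
        counts[ch] = counts.get(ch, 0) + 1

    # first_pos[c] = index of the first row that starts with character c.
    first_pos: dict[str, int] = {}
    total = 0
    for ch in sorted(counts):
        first_pos[ch] = total
        total += counts[ch]

    # Row 0 starts with the sentinel, so its last character is the final
    # character of the text. Follow the LF-mapping to walk the text backwards.
    row = 0
    reversed_text = []
    for _ in range(len(last) - 1):
        ch = last[row]
        reversed_text.append(ch)
        row = first_pos[ch] + rank[row]
    return "".join(reversed(reversed_text))


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

The backward walk in `inverse_bwt` uses the same LF-mapping that BWT-based indexes such as the FM-index rely on for pattern search.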
Abstract
The field of succinct data structures has flourished over the last 16 years. Starting from the compressed suffix array (CSA) by Grossi and Vitter (STOC 2000) and the FM-index by Ferragina and Manzini (FOCS 2000), a number of generalizations and applications of string indexes based on the Burrows-Wheeler transform (BWT) have been developed, all taking an amount of space that is close to the input size in bits. In many large-scale applications, the construction of the index and its usage need to be considered as one unit of computation. Efficient string indexing and analysis in small space lies also at the core of a number of primitives in the data-intensive field of high-throughput DNA sequencing. We report the following advances in string indexing and analysis. We show that the BWT of a string can be built in deterministic time using just…
