Optimal-Time Queries on BWT-runs Compressed Indexes
Takaaki Nishimoto, Yasuo Tabei

TL;DR
This paper introduces OptBWTR, a new compressed index for highly repetitive strings that supports various queries in optimal time using run-length BWT compression, significantly improving query efficiency.
Contribution
It improves the computation time of the LF and ϕ^{-1} functions to constant time and presents the first index supporting multiple queries in optimal time with space proportional to the number of BWT runs.
Findings
LF and ϕ^{-1} are computed in constant time.
OptBWTR supports locate, count, extract queries in optimal time.
The index uses O(r) words of space, where r is the number of BWT runs.
Abstract
Indexing highly repetitive strings (i.e., strings with many repetitions) for fast queries has become a central research topic in string processing, because it has a wide variety of applications in bioinformatics and natural language processing. Although a substantial number of indexes for highly repetitive strings have been proposed thus far, developing compressed indexes that support various queries remains a challenge. The run-length Burrows-Wheeler transform (RLBWT) is a lossless data compression by a reversible permutation of an input string and run-length encoding, and it has received interest for indexing highly repetitive strings. LF and ϕ^{-1} are two key functions for building indexes on RLBWT, and the best previous result computes LF and ϕ^{-1} in O(log log n) time with O(r) words of space for the string length n and the number r of runs in RLBWT. In this…
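To make the terms in the abstract concrete, here is a minimal Python sketch of the BWT, its run-length encoding, and the LF function. It is a naive textbook construction that uses space linear in the string length and is not the constant-time, O(r)-space structure of OptBWTR. The function names (`bwt`, `run_length_encode`, `lf_mapping`, `invert_bwt`) are hypothetical, and the input is assumed to end with a unique sentinel `$` that sorts before every other character.

```python
def bwt(text):
    """Return the BWT of text, built naively from its sorted rotations.

    Assumes text ends with a unique sentinel '$' smaller than all other characters.
    """
    n = len(text)
    rotations = sorted(text[i:] + text[:i] for i in range(n))
    return "".join(rot[-1] for rot in rotations)


def run_length_encode(s):
    """Collapse s into (character, run length) pairs; len(result) is the number of runs r."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def lf_mapping(last_column):
    """Return LF as a list: LF[i] is the row whose first character is the occurrence last_column[i].

    Equal characters keep their relative order, which is what a stable sort gives.
    """
    order = sorted(range(len(last_column)), key=lambda i: (last_column[i], i))
    lf = [0] * len(last_column)
    for first_pos, last_pos in enumerate(order):
        lf[last_pos] = first_pos
    return lf


def invert_bwt(last_column):
    """Recover the original text by following LF from row 0 (the row starting with '$')."""
    lf = lf_mapping(last_column)
    i = 0
    out = []
    for _ in range(len(last_column) - 1):
        out.append(last_column[i])  # character preceding the current suffix
        i = lf[i]
    return "".join(reversed(out)) + "$"


if __name__ == "__main__":
    transformed = bwt("banana$")
    print(transformed)                      # annb$aa
    print(run_length_encode(transformed))   # 5 runs
    print(invert_bwt(transformed))          # banana$
```

Each LF step here is a plain array lookup, but the array has one entry per character of the input. The setting described in the abstract is harder: LF must be answered from a representation whose size depends on the number of runs r rather than on the string length n.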
