An Elegant Algorithm for the Construction of Suffix Arrays
Sanguthevar Rajasekaran, Marius Nicolae

TL;DR
This paper introduces a simple, parallelizable suffix array construction algorithm that runs in linear time with high probability and is among the fastest known methods in practice.
Contribution
The paper presents a new, simple suffix array construction algorithm that runs in linear time with high probability, together with parallel implementations, advancing both the theory and the practice of suffix array construction.
Findings
The algorithm runs in linear time with high probability, where the probability is over the space of all possible inputs.
Empirical tests show it is among the fastest known algorithms.
Parallel implementations demonstrate scalability and efficiency.
Abstract
The suffix array is a data structure that finds numerous applications in string processing problems for both linguistic texts and biological data. It has been introduced as a memory efficient alternative for suffix trees. The suffix array consists of the sorted suffixes of a string. There are several linear time suffix array construction algorithms (SACAs) known in the literature. However, one of the fastest algorithms in practice has a worst case run time of $O(n^2)$. The problem of designing practically and theoretically efficient techniques remains open. In this paper we present an elegant algorithm for suffix array construction which takes linear time with high probability; the probability is on the space of all possible inputs. Our algorithm is one of the simplest of the known SACAs and it opens up a new dimension of suffix array construction that has not been explored until now.…
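To make the definition in the abstract concrete, here is a minimal Python sketch of what a suffix array contains. It is not the algorithm proposed in the paper: it is a naive construction that sorts the suffixes directly, and the function name `naive_suffix_array` is a hypothetical one chosen for illustration. Direct sorting costs far more than linear time on long inputs, which is the gap that linear-time SACAs are designed to close.

```python
def naive_suffix_array(text: str) -> list[int]:
    """Return the start positions of the suffixes of `text` in sorted order.

    Illustrative only: this sorts the suffixes by direct comparison and
    is NOT the linear-time construction described in the paper.
    """
    # Each suffix is identified by its start index; sort the indices by
    # the suffix they denote.
    return sorted(range(len(text)), key=lambda i: text[i:])


if __name__ == "__main__":
    word = "banana"
    for start in naive_suffix_array(word):
        # Print each start position next to the suffix it refers to.
        print(start, word[start:])
```

For the input "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so the suffix array is [5, 3, 1, 0, 4, 2]. Storing only these integer positions, rather than a tree of nodes and edges, is why the structure is described as a memory efficient alternative to suffix trees.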
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Genomics and Phylogenetic Studies
