Fast and Lightweight Distributed Suffix Array Construction -- First Results
Manuel Haag, Florian Kurpicz, Peter Sanders, Matthias Schimek

TL;DR
This paper introduces a practical, lightweight distributed suffix array construction algorithm that uses less memory than existing methods while being competitive with or faster than them in running time.
Contribution
The paper adapts the DCX suffix array construction algorithm to distributed-memory systems using a bucketing technique, which cuts memory usage to less than half of that required by PSAC while keeping running time competitive.
Findings
Uses less than half the memory of PSAC
Achieves running times comparable to or faster than PSAC
Demonstrates practical efficiency in distributed environments
Abstract
We present first algorithmic ideas for a practical and lightweight adaptation of the DCX suffix array construction algorithm [Sanders et al., 2003] to the distributed-memory setting. Our approach relies on a bucketing technique that enables a lightweight implementation using less than half of the memory required by the currently fastest distributed-memory suffix array algorithm, PSAC [Flick and Aluru, 2015], while being competitive or even faster in terms of running time.
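As background for the DCX family named in the abstract, the following is a minimal sequential Python sketch. It is not taken from the paper and does not show its distributed algorithm or its bucketing technique. It shows what a suffix array is, using a naive reference construction, and the difference cover property that DCX-style algorithms rely on: for a cover D modulo X, any two positions i and j have a shift l < X such that both i + l and j + l fall on sampled positions. All function names here are hypothetical and chosen for illustration only.

```python
def naive_suffix_array(text):
    """Reference suffix array: start positions of all suffixes in sorted order.

    Naive comparison-based construction, for illustration only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def is_difference_cover(cover, x):
    """Return True if every residue modulo x is a difference of two cover elements."""
    diffs = {(a - b) % x for a in cover for b in cover}
    return diffs == set(range(x))


def common_shift(i, j, cover, x):
    """Smallest shift l < x such that (i + l) % x and (j + l) % x are both in cover.

    Returns None if no such shift exists, which cannot happen when
    cover is a difference cover modulo x.
    """
    for l in range(x):
        if (i + l) % x in cover and (j + l) % x in cover:
            return l
    return None


if __name__ == "__main__":
    print(naive_suffix_array("banana"))
    print(is_difference_cover({1, 2}, 3))     # cover commonly used for X = 3
    print(is_difference_cover({1, 2, 4}, 7))  # one valid cover for X = 7
    print(common_shift(0, 1, {1, 2}, 3))
```

The shift property is what lets a comparison of two arbitrary suffixes be reduced to comparing at most X characters followed by a lookup of the ranks of two sampled suffixes. A larger X means a smaller sample relative to the text, which is one reason DCX variants are a natural starting point for memory-lean construction.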
