Scalable Distributed String Sorting

Florian Kurpicz, Pascal Mehnert, Peter Sanders, Matthias Schimek

arXiv:2404.16517·cs.DS·April 26, 2024

TL;DR

This paper introduces distributed-memory string sorting algorithms that scale to massively parallel machines, showing good scaling on up to 49152 cores and speedups of up to 5 over the previous state of the art.

Contribution

The paper presents practical distributed-memory string sorting algorithms whose latency is about $p^{1/k}$ on $p$ processors when the data is communicated $k$ times, enabling efficient sorting on tens of thousands of cores.

Findings

01

Speedups of up to 5x over the current state-of-the-art distributed string sorting algorithms.

02

Good scaling behavior on a wide range of inputs on up to 49152 cores.

03

Latency of about $p^{1/k}$ on $p$ processors when the data is communicated $k$ times.

Abstract

String sorting is an important part of tasks such as building index data structures. Unfortunately, current string sorting algorithms do not scale to massively parallel distributed-memory machines since they either have latency (at least) proportional to the number of processors $p$ or communicate the data a large number of times (at least logarithmic). We present practical and efficient algorithms for distributed-memory string sorting that scale to large $p$. Similar to state-of-the-art sorters for atomic objects, the algorithms have latency of about $p^{1/k}$ when allowing the data to be communicated $k$ times. Experiments indicate good scaling behavior on a wide range of inputs on up to 49152 cores. Overall, we achieve speedups of up to 5 over the current state-of-the-art distributed string sorting algorithms.
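
To make the latency trade-off in the abstract concrete, the following minimal sketch evaluates $p^{1/k}$ for a few values of $k$ at the largest core count reported. It is plain arithmetic on the stated bound, not code from the paper or its repository; the function names `latency_term` and `tradeoff_table` are hypothetical, and constant factors and lower-order terms are ignored.

```python
def latency_term(p: int, k: int) -> float:
    """Return p^(1/k), the approximate latency term when the data is
    communicated k times on p processors (constants ignored; assumption)."""
    if p < 1 or k < 1:
        raise ValueError("p and k must be positive integers")
    return p ** (1.0 / k)


def tradeoff_table(p: int, max_k: int) -> list[tuple[int, int]]:
    """List (k, rounded p^(1/k)) for k = 1 .. max_k."""
    return [(k, round(latency_term(p, k))) for k in range(1, max_k + 1)]


if __name__ == "__main__":
    p = 49152  # largest core count mentioned in the abstract
    for k, term in tradeoff_table(p, 4):
        print(f"k = {k}: data communicated {k} time(s), latency term ~ {term}")
```

For $p = 49152$, the term drops from 49152 at $k = 1$ to roughly 222 at $k = 2$ and roughly 37 at $k = 3$, which illustrates why communicating the data a small constant number of times can pay off on large machines.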

Code & Models

Repositories

pmehnert/distributed-string-sorting
Official
