Perfectly load-balanced, optimal, stable, parallel merge
Christian Siebert, Jesper Larsson Träff

TL;DR
This paper introduces a simple, work-optimal, and synchronization-free parallel merge algorithm that ensures perfect load balancing and stability, significantly improving efficiency over previous methods.
Contribution
The paper presents a new, direct co-ranking algorithm for stable merging that is simpler and faster than previous methods and preserves stability without extra space or time overhead.
Findings
The co-ranking algorithm runs in O(log min(m,n)) time.
The parallel merge achieves optimal speedup under certain conditions.
The algorithm is easy to implement on various parallel systems.
Abstract
We present a simple, work-optimal and synchronization-free solution to the problem of stably merging in parallel two given, ordered arrays of m and n elements into an ordered array of m+n elements. The main contribution is a new, simple, fast and direct algorithm that determines, for any prefix of the stably merged output sequence, the exact prefixes of each of the two input sequences needed to produce this output prefix. More precisely, for any given index (rank) in the resulting, but not yet constructed output array representing an output prefix, the algorithm computes the indices (co-ranks) in each of the two input arrays representing the required input prefixes without having to merge the input arrays. The co-ranking algorithm takes O(log min(m,n)) time steps. The algorithm is used to devise a perfectly load-balanced, stable, parallel merge algorithm where each of p processing…
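The co-ranking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `co_rank` and `blockwise_merge`, the tie-breaking rule (equal elements are taken from the first array), and the even splitting of the output into p blocks are assumptions made for illustration, and the abstract is truncated before it describes how the processing elements use the co-ranks. The blocks are processed in a plain loop here; since each block reads and writes disjoint ranges, the iterations are independent of each other.

```python
def co_rank(i, a, b):
    """Return (j, k) with j + k == i such that a[:j] and b[:k] are exactly
    the input prefixes that form the first i elements of the stable merge.
    Assumes a and b are sorted; ties are resolved in favour of a."""
    m, n = len(a), len(b)
    j = min(i, m)
    k = i - j
    j_low = max(0, i - n)  # smallest feasible j
    k_low = max(0, i - m)  # smallest feasible k
    while True:
        if j > 0 and k < n and a[j - 1] > b[k]:
            # Too many elements taken from a: halve the search interval.
            delta = (j - j_low + 1) // 2
            k_low = k
            j -= delta
            k += delta
        elif k > 0 and j < m and b[k - 1] >= a[j]:
            # Too many elements taken from b: halve the search interval.
            delta = (k - k_low + 1) // 2
            j_low = j
            k -= delta
            j += delta
        else:
            return j, k


def blockwise_merge(a, b, p):
    """Stable merge of sorted a and b, computed as p independent blocks
    of (almost) equal size; p is a hypothetical number of workers."""
    total = len(a) + len(b)
    out = [None] * total
    for r in range(p):  # each iteration could run on its own worker
        i0 = r * total // p
        i1 = (r + 1) * total // p
        j, k = co_rank(i0, a, b)
        j1, k1 = co_rank(i1, a, b)
        # Sequential stable merge of a[j:j1] and b[k:k1] into out[i0:i1].
        for i in range(i0, i1):
            if j < j1 and (k >= k1 or a[j] <= b[k]):
                out[i] = a[j]
                j += 1
            else:
                out[i] = b[k]
                k += 1
    return out
```

In this sketch every block holds either the floor or the ceiling of (m+n)/p output elements, which is what makes the load balance even, and no block needs to wait for another because the two co-rank calls fix its input ranges up front.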
Taxonomy
Topics: Interconnection Networks and Systems · Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance
