Low-Depth Parallel Algorithms for the Binary-Forking Model without Atomics
Zafar Ahmad, Rezaul Chowdhury, Rathish Das, Pramod Ganapathi, Aaron Gregory, and Mohammad Mahdi Javanmard

TL;DR
This paper develops efficient parallel algorithms for matrix multiplication, sorting, and FFT in the binary-forking model without relying on atomic instructions, improving performance bounds over previous methods.
Contribution
It introduces lock-free parallel algorithms for key problems in the binary-forking model, avoiding atomic instructions except in join operations, and achieves better bounds than existing algorithms.
Findings
The algorithms for matrix multiplication, sorting, and FFT achieve better bounds than previous results in the binary-forking model.
All algorithms avoid atomic instructions except in join operations.
The bounds are stated in a model that accounts for the cost of spawning and synchronizing threads, which the PRAM model ignores.
Abstract
The binary-forking model is a parallel computation model, formally defined by Blelloch et al. very recently, in which a thread can fork a concurrent child thread, recursively and asynchronously. The model incurs a cost of Θ(log n) to spawn or synchronize n tasks or threads. The binary-forking model realistically captures the performance of parallel algorithms implemented using modern multithreaded programming languages on multicore shared-memory machines. In contrast, the widely studied theoretical PRAM model does not consider the cost of spawning and synchronizing threads, and as a result, algorithms achieving optimal performance bounds in the PRAM model may not be optimal in the binary-forking model. Often, algorithms need to be redesigned to achieve optimal performance bounds in the binary-forking model, and the non-constant synchronization cost makes the task challenging.…
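To make the spawning cost concrete, here is a minimal Python sketch (not taken from the paper) that counts how many sequential fork steps are needed before n tasks can all be running when each thread may fork only one child at a time. The function name `fork_depth` and the even-split strategy are assumptions made for illustration; the point is only that the depth of the resulting fork tree grows logarithmically in n.

```python
def fork_depth(n: int) -> int:
    """Depth of a balanced binary fork tree that launches n tasks.

    Hypothetical helper for illustration: a thread responsible for n tasks
    forks once, handing roughly half of them to the child, and both halves
    then proceed in parallel. The deeper half determines the total depth.
    """
    if n <= 1:
        return 0  # a single task needs no further forks
    left = n // 2
    right = n - left
    # One fork step, then the two halves are spawned concurrently.
    return 1 + max(fork_depth(left), fork_depth(right))


if __name__ == "__main__":
    for n in (1, 2, 8, 1000, 1_000_000):
        print(n, fork_depth(n))
```

Under this even-split assumption the depth equals the ceiling of log2(n), so launching a million tasks takes 20 sequential fork steps rather than one. A PRAM-style analysis would charge nothing for that step, which is why bounds can differ between the two models.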
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Interconnection Networks and Systems
