# Optimal (Randomized) Parallel Algorithms in the Binary-Forking Model

**Authors:** Guy E. Blelloch, Jeremy T. Fineman, Yan Gu, Yihan Sun

arXiv: 1903.04650 · 2020-06-26

## TL;DR

This paper develops the first algorithms with optimal work and span in the binary-forking model for fundamental problems such as sorting, list ranking, and tree contraction. The model reflects realistic multithreaded environments, and the techniques are simple and often randomized.

## Contribution

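The paper gives the first algorithms with optimal work and span in the binary-forking model for sorting, semisorting, list ranking, tree contraction, range minima, and ordered set union, intersection, and difference. These avoid the $\Omega(\log n)$ span overhead that natural simulations of optimal PRAM algorithms incur.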

## Key findings

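- All algorithms presented are the first with optimal work and span in the binary-forking model.
- Most of the algorithms are simple, and many are randomized.
- Natural simulations of optimal PRAM algorithms in the binary-forking model incur an $\Omega(\log n)$ overhead in span, because PRAM algorithms rely on arbitrary-way forking and strict synchronization.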
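
A rough way to see where such a span overhead can come from, assuming a step-by-step simulation of a synchronous PRAM algorithm (the symbols $T$, $p$, and $S_{\text{sim}}$ are introduced here for illustration and are not taken from the paper): each PRAM step that activates $p$ processors has to be launched through a tree of binary forks of depth $\lceil \log_2 p \rceil$ and joined before the next step begins. Under that assumption, a PRAM algorithm with $T(n)$ steps is simulated with span

$$
S_{\text{sim}}(n) = O\big(T(n) \cdot \log p\big).
$$

For instance, with $p = n$ and $T(n) = \Theta(\log n)$ this bound is $O(\log^2 n)$, a $\log n$ factor above an $O(\log n)$ span target.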

## Abstract

In this paper we develop optimal algorithms in the binary-forking model for a variety of fundamental problems, including sorting, semisorting, list ranking, tree contraction, range minima, and ordered set union, intersection and difference. In the binary-forking model, tasks can only fork into two child tasks, but can do so recursively and asynchronously. The tasks share memory, supporting reads, writes and test-and-sets. Costs are measured in terms of work (total number of instructions), and span (longest dependence chain).

The binary-forking model is meant to capture both algorithm performance and algorithm-design considerations on many existing multithreaded languages, which are also asynchronous and rely on binary forks either explicitly or under the covers. In contrast to the widely studied PRAM model, it does not assume arbitrary-way forks nor synchronous operations, both of which are hard to implement in modern hardware. While optimal PRAM algorithms are known for the problems studied herein, it turns out that arbitrary-way forking and strict synchronization are powerful, if unrealistic, capabilities. Natural simulations of these PRAM algorithms in the binary-forking model (i.e., implementations in existing parallel languages) incur an $\Omega(\log n)$ overhead in span. This paper explores techniques for designing optimal algorithms when limited to binary forking and assuming asynchrony. All algorithms described in this paper are the first algorithms with optimal work and span in the binary-forking model. Most of the algorithms are simple. Many are randomized.
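
The work and span measures defined in the abstract can be made concrete with a small cost calculator. The sketch below is illustrative only and is not code from the paper: the names `Cost`, `leaf`, `fork2`, and `parallel_for` are hypothetical, and charging one unit of cost per fork is an assumption. It computes the cost of running `n` unit-cost tasks when the only way to create parallelism is a binary fork.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    work: int  # total number of instructions
    span: int  # length of the longest dependence chain


def leaf(instructions: int = 1) -> Cost:
    """A sequential task: its span equals its work."""
    return Cost(instructions, instructions)


def fork2(left: Cost, right: Cost, overhead: int = 1) -> Cost:
    """Binary fork: work adds up, span follows the slower child.

    The unit `overhead` charged for the fork itself is an assumption.
    """
    return Cost(
        work=left.work + right.work + overhead,
        span=max(left.span, right.span) + overhead,
    )


def parallel_for(n: int) -> Cost:
    """Cost of n unit tasks launched by balanced recursive binary forking."""
    if n == 1:
        return leaf()
    mid = n // 2
    return fork2(parallel_for(mid), parallel_for(n - mid))


if __name__ == "__main__":
    for n in (1, 3, 1024):
        print(n, parallel_for(n))
```

Under these assumptions, launching `n` tasks costs `2n - 1` work and `ceil(log2(n)) + 1` span, so merely starting `n` parallel tasks already takes logarithmic span when only binary forks are available.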

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.04650/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1903.04650/full.md

## References

95 references — full list in the complete paper: https://tomesphere.com/paper/1903.04650/full.md

---
Source: https://tomesphere.com/paper/1903.04650