Extending the Nested Parallel Model to the Nested Dataflow Model with Provably Efficient Schedulers
David Dinh, Harsha Vardhan Simhadri, Yuan Tang

TL;DR
This paper introduces a new dataflow construct to extend the nested parallel model, enabling more expressive algorithms with optimal performance and cache efficiency on multicore architectures, supported by provably efficient schedulers.
Contribution
It proposes the nested dataflow model with a new composition construct, designs optimal algorithms within this model, and develops schedulers that guarantee locality and load balancing.
Findings
Algorithms in the ND model achieve optimal span and cache complexity.
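Space-bounded (SB) schedulers can use the increased parallelism of ND programs to achieve asymptotically optimal cache-miss and running-time bounds on more processors than in the NP model.
The running time scales as the sum, over the levels of the cache hierarchy, of the cache complexity at each level weighted by that level's miss cost, divided by the number of processors.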
Abstract
The nested parallel (a.k.a. fork-join) model is widely used for writing parallel programs. However, the two composition constructs, i.e., "∥" (parallel) and ";" (serial), are insufficient for expressing "partial dependencies" or "partial parallelism" in a program. We propose a new dataflow composition construct "⇝" to express partial dependencies in algorithms in a processor- and cache-oblivious way, thus extending the Nested Parallel (NP) model to the Nested Dataflow (ND) model. We redesign several divide-and-conquer algorithms ranging from dense linear algebra to dynamic programming in the ND model and prove that they all have optimal span while retaining optimal cache complexity. We propose the design of runtime schedulers that map ND programs to multicore processors with multiple levels of possibly shared caches (i.e., Parallel Memory Hierarchies) and…
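To make the idea of a "partial dependency" concrete, here is a minimal Python sketch. It is not the paper's construct or scheduler: the task names, costs, and the `span` helper are hypothetical, chosen only to show how a serial barrier between two parallel phases can lengthen the critical path compared with per-task dependencies.

```python
from functools import lru_cache

# Hypothetical task costs in abstract time steps (illustration only).
cost = {"A0": 1, "A1": 5, "B0": 5, "B1": 1}

# Fork-join style: (A0 parallel A1) then (B0 parallel B1).
# The serial composition acts as a barrier, so every B waits for every A.
np_deps = {"A0": [], "A1": [], "B0": ["A0", "A1"], "B1": ["A0", "A1"]}

# Dataflow style: each B_i waits only for the A_i it actually reads.
nd_deps = {"A0": [], "A1": [], "B0": ["A0"], "B1": ["A1"]}

def span(deps, cost):
    """Return the length of the longest dependency chain (critical path)."""
    @lru_cache(maxsize=None)
    def finish(task):
        # A task starts when its latest prerequisite finishes (0 if none).
        start = max((finish(d) for d in deps[task]), default=0)
        return start + cost[task]
    return max(finish(t) for t in deps)

np_span = span(np_deps, cost)
nd_span = span(nd_deps, cost)
print(np_span, nd_span)  # prints: 10 6
```

Both task graphs perform the same total work (12 steps). Only the dependency structure differs, which is why the dataflow-style graph has the shorter span in this toy example.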
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Interconnection Networks and Systems · Distributed and Parallel Computing Systems
