Scheduling and Tiling Reductions on Realistic Machines
Nirmal Prajapati

TL;DR
This paper examines scheduling and tiling strategies for reductions on realistic machines with bounded fan-in, identifies issues in existing algorithms, and proposes improvements and extensions to enhance parallel reduction computations.
Contribution
It analyzes Gupta et al.'s scheduling algorithm, identifies a potential flaw, and offers a corrected approach along with methods to extend scheduling to tiled reductions.
Findings
Identified a potential issue in Gupta et al.'s scheduling algorithm.
Proposed a corrected scheduling technique for reductions.
Extended scheduling methods to support reduction tiling.
Abstract
Computations in which the number of results is much smaller than the input data, and in which the results are produced through some form of accumulation, are called reductions. Reductions appear in many scientific applications. Usually, the accumulation in a reduction uses an associative and commutative binary operator, so reductions are highly parallel. Given unbounded fan-in, one can execute a reduction in constant/linear time provided that the data is available. However, because real machines have bounded fan-in, accumulations cannot be performed in one time step and have to be broken into parts. Thus, a (partial) serialization of reductions becomes necessary, which makes scheduling reductions a difficult and interesting problem. A number of research works have addressed scheduling reductions. We focus on the scheduling techniques presented in Gupta et al., identify a…
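To make the bounded fan-in constraint concrete, the following is a minimal sketch of a tree-shaped reduction in which each combining node accepts at most `fan_in` operands per time step. It is not the scheduling algorithm of Gupta et al., nor the corrected approach proposed in the paper; the function name `bounded_fanin_reduce` and its parameters are hypothetical, and the sketch assumes enough processors to evaluate every group of a level in the same time step.

```python
from functools import reduce
import operator


def bounded_fanin_reduce(values, op, fan_in=2):
    """Reduce `values` with an associative, commutative `op`.

    Hypothetical illustration: each node combines at most `fan_in`
    operands, and one pass over the current level counts as one time
    step (assumes unlimited processors per step).
    Returns a pair (result, number_of_time_steps).
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(values)
    if not level:
        raise ValueError("cannot reduce an empty input")

    steps = 0
    while len(level) > 1:
        # Split the current level into groups of at most `fan_in`
        # operands and combine each group into one partial result.
        level = [
            reduce(op, level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
        steps += 1
    return level[0], steps


if __name__ == "__main__":
    data = list(range(1, 9))  # 8 inputs, sum is 36
    for k in (2, 4, 8):
        total, steps = bounded_fanin_reduce(data, operator.add, fan_in=k)
        print(f"fan_in={k}: result={total}, steps={steps}")
```

Under these assumptions, 8 inputs need 3 steps with fan-in 2, 2 steps with fan-in 4, and a single step only when the fan-in is at least 8, which is why bounded fan-in forces the partial serialization described in the abstract.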
Taxonomy
Topics: Algorithms and Data Compression · Parallel Computing and Optimization Techniques · Cellular Automata and Applications
