Reverse-Mode AD of Reduce-by-Index and Scan in Futhark

Lotte Maria Bruun; Ulrik Stuhr Larsen; Nikolaj Hinnerskov; Cosmin Oancea

arXiv:2310.03568·cs.PL·October 6, 2023

Open Access

TL;DR

This paper presents reverse-mode automatic differentiation (AD) in Futhark for three core parallel constructs (reduce, scan, and reduce by index), describes specializations that make most practical cases efficient, and compares differentiating at a high level with differentiating at a low level in GPU execution.

Contribution

It provides new algorithms for reverse-mode AD of reduce, scan, and reduce by index in Futhark, with practical specializations for efficient differentiation in GPU contexts.

Findings

01. The proposed specializations improve the efficiency of the differentiated code on GPUs.

02. Differentiating at a high level and at a low level ("differentiating the memory") each has distinct strengths and weaknesses.

03. A GPU experiment evaluates the performance of the differentiated code and the impact of the specializations.

Abstract

We present and evaluate the Futhark implementation of reverse-mode automatic differentiation (AD) for the basic blocks of parallel programming: reduce, prefix sum (scan), and reduce by index. We first present derivations of general-case algorithms and then discuss several specializations that result in efficient differentiation of most cases of practical interest. We report an experiment that evaluates the performance of the differentiated code in the context of GPU execution and highlights the impact of the proposed specializations as well as the strengths and weaknesses of differentiating at high level vs. low level (i.e., "differentiating the memory").
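
As a rough illustration of what reverse-mode AD of these three constructs has to compute, the sequential Python sketch below covers only simple operators: multiplication for reduce, addition for scan and for reduce by index. It is not the paper's Futhark implementation and does not reproduce its general-case algorithms; the function names are hypothetical, and the handling of out-of-range indices is an assumption of this sketch.

```python
from itertools import accumulate
import operator


def reduce_mul_vjp(xs, out_bar):
    """Adjoints of xs for y = xs[0] * xs[1] * ... * xs[n-1] (hypothetical helper)."""
    if not xs:
        return []
    # Exclusive prefix products: left[i] = xs[0] * ... * xs[i-1]
    left = [1.0] + list(accumulate(xs, operator.mul))[:-1]
    # Exclusive suffix products: right[i] = xs[i+1] * ... * xs[n-1]
    right = ([1.0] + list(accumulate(reversed(xs), operator.mul))[:-1])[::-1]
    # dy/dxs[i] is the product of everything except xs[i]
    return [out_bar * l * r for l, r in zip(left, right)]


def scan_add_vjp(ys_bar):
    """Adjoints of xs for the inclusive prefix sum ys[i] = xs[0] + ... + xs[i]."""
    # xs[j] contributes to every ys[i] with i >= j, so its adjoint is a suffix sum
    return list(accumulate(reversed(ys_bar)))[::-1]


def reduce_by_index_add_vjp(inds, hist_bar):
    """Adjoints of vals for hist[inds[i]] += vals[i]."""
    m = len(hist_bar)
    # Assumption of this sketch: out-of-range indices are skipped in the
    # forward pass, so the corresponding values receive a zero adjoint.
    return [hist_bar[k] if 0 <= k < m else 0.0 for k in inds]
```

In each case the reverse pass is itself expressed with scans or a gather, which is why these operators remain parallel after differentiation. Operators that are not as simple as addition or multiplication need more than this sketch provides.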

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Interconnection Networks and Systems