Local Adjoints for Simultaneous Preaccumulations with Shared Inputs
Johannes Blühdorn, Nicolas R. Gauger

TL;DR
This paper introduces local adjoints for shared-memory parallel automatic differentiation, which make simultaneous thread-local preaccumulations with shared inputs safe. It compares vector- and map-based approaches for storing the local adjoint variables and benchmarks them in CoDiPack and SU2.
Contribution
It proposes and evaluates vector- and map-based storage strategies for local adjoint variables, addressing the data races that arise when simultaneous preaccumulations with shared inputs accumulate Jacobians in a single, shared adjoint vector.
Findings
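Local adjoints re-enable simultaneous thread-local preaccumulations with shared inputs, without data races on the adjoint variables.
The vector- and map-based storage approaches differ in memory consumption, memory allocation, and adjoint variable access times.
Benchmarks of parallel discrete adjoint computations in SU2 show the tradeoffs between memory use and performance.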
Abstract
In shared-memory parallel automatic differentiation, inputs that are shared among simultaneous thread-local preaccumulations lead to data races if Jacobians are accumulated with a single, shared vector of adjoint variables. In this work, we discuss the benefits and tradeoffs of re-enabling such preaccumulations by a transition to suitable local adjoints. We propose different vector- and map-based approaches for storing local adjoint variables and analyze them with respect to memory consumption, memory allocation, and adjoint variable access times in the context of simultaneous preaccumulations in multiple threads. We implement the approaches in CoDiPack and benchmark them in parallel discrete adjoint computations in the multiphysics simulation suite SU2.
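The data race described in the abstract, and the way local adjoints avoid it, can be illustrated with a small sketch. The code below is not CoDiPack's API; the toy tape and every name in it (`Statement`, `Segment`, `jacobianRow`, `preaccumulateInParallel`, `exampleSegments`) are hypothetical and chosen only for illustration. It assumes a map-based store of local adjoints, one per preaccumulation. Two segments share the input with identifier 0. With a single shared adjoint vector, both reverse sweeps would update the entry of that identifier concurrently; here each sweep works on its own private map.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical toy tape entry for w = phi(u_1, ..., u_k): the identifier of w
// and pairs (identifier of u_j, partial derivative dw/du_j).
struct Statement {
  int lhs;
  std::vector<std::pair<int, double>> args;
};

// Hypothetical recorded code section with one output, to be preaccumulated.
struct Segment {
  std::vector<Statement> tape;
  std::vector<int> inputs;
  int output;
};

// Reverse sweep for one output. All adjoints live in a map that is local to
// this call, so no adjoint entry is ever shared between threads.
std::vector<double> jacobianRow(const Segment& seg) {
  assert(!seg.tape.empty());
  std::unordered_map<int, double> adjoints;  // map-based local adjoints
  adjoints[seg.output] = 1.0;                // seed the output
  for (auto it = seg.tape.rbegin(); it != seg.tape.rend(); ++it) {
    auto found = adjoints.find(it->lhs);
    if (found == adjoints.end()) continue;   // statement does not contribute
    const double lhsAdjoint = found->second;
    adjoints.erase(found);
    for (const auto& arg : it->args) adjoints[arg.first] += arg.second * lhsAdjoint;
  }
  std::vector<double> row;
  for (int id : seg.inputs) {
    auto found = adjoints.find(id);
    row.push_back(found == adjoints.end() ? 0.0 : found->second);
  }
  return row;
}

// One preaccumulation per loop iteration. Compiled with OpenMP enabled, the
// iterations run in different threads; otherwise the pragma is ignored and
// the loop runs serially with the same result.
std::vector<std::vector<double>> preaccumulateInParallel(
    const std::vector<Segment>& segments) {
  std::vector<std::vector<double>> rows(segments.size());
  const int n = static_cast<int>(segments.size());
#pragma omp parallel for
  for (int t = 0; t < n; ++t) rows[t] = jacobianRow(segments[t]);
  return rows;
}

// Two example segments that share input 0 (x = 3); input 1 is y = 2.
std::vector<Segment> exampleSegments() {
  Segment a;  // w2 = x * y, w3 = 2 * w2
  a.tape.push_back({2, {{0, 2.0}, {1, 3.0}}});
  a.tape.push_back({3, {{2, 2.0}}});
  a.inputs = {0, 1};
  a.output = 3;
  Segment b;  // w4 = x * x, w5 = w4 + x
  b.tape.push_back({4, {{0, 6.0}}});
  b.tape.push_back({5, {{4, 1.0}, {0, 1.0}}});
  b.inputs = {0};
  b.output = 5;
  return {a, b};
}
```

Calling `preaccumulateInParallel(exampleSegments())` yields one Jacobian row per segment. A vector-based variant of the same idea would replace the map with a thread-local vector indexed by identifier; the abstract names memory consumption, memory allocation, and adjoint variable access times as the criteria on which such choices are compared.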
