Divide-and-Conquer Checkpointing for Arbitrary Programs with No User Annotation
Jeffrey Mark Siskind, Barak A. Pearlmutter

TL;DR
This paper introduces a fully automated divide-and-conquer checkpointing method for reverse-mode automatic differentiation that applies at the language implementation level, enabling efficient memory management for arbitrary programs.
Contribution
It extends automated checkpointing to arbitrary computations by building the technique into the language implementation, so checkpoints can span arbitrary execution intervals without user annotation.
Findings
Reduces the worst-case storage growth of reverse-mode AD from linear to sublinear in the running time of the original computation.
Automates checkpointing for complex, arbitrary programs.
Improves memory efficiency without manual intervention.
Abstract
Classical reverse-mode automatic differentiation (AD) imposes only a small constant-factor overhead in operation count over the original computation, but has storage requirements that grow, in the worst case, in proportion to the time consumed by the original computation. This storage blowup can be ameliorated by checkpointing, a process that reorders application of classical reverse-mode AD over an execution interval to trade off space vs. time. Application of checkpointing in a divide-and-conquer fashion to strategically chosen nested execution intervals can break classical reverse-mode AD into stages which can reduce the worst-case growth in storage from linear to sublinear. Doing this has been fully automated only for computations of particularly simple form, with checkpoints spanning execution intervals resulting from a limited set of program constructs. Here we show how the…
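The space-versus-time trade-off described in the abstract can be illustrated with a minimal sketch. It assumes the simple case the abstract contrasts against: a loop that applies one fixed step function n times. It is not the paper's language-level method for arbitrary programs. The names `step`, `step_vjp`, `advance`, `checkpointed_vjp`, and `taped_vjp` are hypothetical, and `sin` merely stands in for an arbitrary differentiable step.

```python
import math


def step(x):
    # One step of the primal computation (sin is a stand-in, chosen for illustration).
    return math.sin(x)


def step_vjp(x, xbar):
    # Reverse-mode rule for one step: multiply the cotangent by d(sin x)/dx.
    return math.cos(x) * xbar


def advance(x, n):
    # Run n primal steps forward without recording a tape.
    for _ in range(n):
        x = step(x)
    return x


def checkpointed_vjp(x, n, xbar):
    # Binary divide-and-conquer checkpointing over an interval of n steps.
    # Only one snapshot per recursion level is live, so storage is O(log n),
    # at the price of O(n log n) primal steps due to recomputation.
    if n == 0:
        return xbar
    if n == 1:
        return step_vjp(x, xbar)
    mid = n // 2
    x_mid = advance(x, mid)  # snapshot at the midpoint of the interval
    xbar_mid = checkpointed_vjp(x_mid, n - mid, xbar)  # reverse the second half
    return checkpointed_vjp(x, mid, xbar_mid)  # then reverse the first half


def taped_vjp(x, n, xbar):
    # Classical reverse mode for comparison: the tape grows linearly with n.
    tape = []
    for _ in range(n):
        tape.append(x)
        x = step(x)
    for xi in reversed(tape):
        xbar = step_vjp(xi, xbar)
    return xbar
```

Both functions return the same derivative of the n-step result with respect to the initial `x`. The checkpointed version holds one snapshot per recursion level rather than one tape entry per step, and pays for that with repeated calls to `advance`.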
