GPU Accelerated Automatic Differentiation With Clad
Ioana Ifrim, Vassil Vassilev, David J Lange

TL;DR
This paper presents Clad, a GPU-accelerated automatic differentiation tool for C/C++ and CUDA, enabling efficient parallel gradient computations with significant performance improvements in scientific applications.
Contribution
Clad extends automatic differentiation to GPU architectures for C/C++ and CUDA, providing a compiler-assisted tool that enhances performance in scientific computing tasks.
Findings
Achieved approximately 10x speedup in ROOT histogram fitting.
Demonstrated effective GPU-based automatic differentiation for C++ functions.
Enabled seamless integration with existing frameworks and interactive environments.
Abstract
Automatic Differentiation (AD) is instrumental for science and industry. It is a technique for evaluating the derivative of a function specified as a computer program. AD application domains range from Machine Learning to Robotics to High Energy Physics. Computing gradients with AD is guaranteed to be more precise than the numerical alternative and requires only a low, constant factor more arithmetic operations than the original function. Moreover, AD applications to domain problems are typically compute-bound. They are often limited by the computational requirements of high-dimensional parameters and thus can benefit from parallel implementations on graphics processing units (GPUs). Clad aims to enable differential analysis for C/C++ and CUDA and is a compiler-assisted AD tool available both as a compiler extension and in ROOT. Moreover, Clad works as a…
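The abstract's claim that AD is more precise than numerical differentiation can be illustrated with a minimal sketch. This is not Clad's method or API: Clad is compiler-assisted, whereas the sketch below uses a hand-written forward-mode dual-number type with operator overloading. The namespace `sketch`, the type `Dual`, the test function `f`, and all helper names are hypothetical and chosen only for illustration.

```cpp
#include <cmath>

namespace sketch {

// Hypothetical dual number: a value and its derivative w.r.t. one input.
struct Dual {
    double val;  // function value
    double der;  // derivative value
};

// Sum rule and product rule applied alongside the ordinary arithmetic.
Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Chain rule for sin: d/dx sin(u) = cos(u) * u'.
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }

// Illustrative function f(x) = x * x * sin(x); works for double and Dual.
template <typename T>
T f(T x) {
    using std::sin;  // double uses std::sin, Dual uses sketch::sin
    return x * x * sin(x);
}

// Forward-mode AD: seed the input derivative with 1 and read off f'(x).
double derivative_ad(double x) { return f(Dual{x, 1.0}).der; }

// Numerical alternative: forward finite difference with step h.
double derivative_fd(double x, double h) { return (f(x + h) - f(x)) / h; }

// Hand-derived reference: f'(x) = 2x sin(x) + x^2 cos(x).
double derivative_exact(double x) {
    return 2.0 * x * std::sin(x) + x * x * std::cos(x);
}

}  // namespace sketch
```

Calling `sketch::derivative_ad(1.5)` and `sketch::derivative_fd(1.5, 1e-4)` and comparing both against `sketch::derivative_exact(1.5)` shows the difference: the AD result carries only floating-point rounding error, while the finite-difference result also carries a truncation error that depends on the step size `h`.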
