Adjoint-based exact Hessian computation
Shin-ichi Ito, Takeru Matsuda, Yuto Miyatake

TL;DR
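This paper gives a concise new derivation of the second-order adjoint system and uses it to build an algorithm that computes the Hessian-vector product exactly for a scalar function of a numerical solution of an initial value problem. This removes the approximation error that can otherwise affect applications such as optimization, Bayesian estimation, and uncertainty quantification.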
Contribution
The paper presents a new, concise derivation of the second-order adjoint system and demonstrates that specific numerical methods can compute Hessian-vector products exactly.
Findings
Exact Hessian-vector multiplication can be achieved with the proposed algorithm.
Symplectic partitioned Runge-Kutta methods are effective for this computation.
The method avoids the error incurred when the Hessian-vector product is only approximated by numerically integrating the second-order adjoint system.
Abstract
We consider a scalar function depending on a numerical solution of an initial value problem, and its second-derivative (Hessian) matrix for the initial value. The need to extract the information of the Hessian or to solve a linear system having the Hessian as a coefficient matrix arises in many research fields such as optimization, Bayesian estimation, and uncertainty quantification. From the perspective of memory efficiency, these tasks often employ a Krylov subspace method that does not need to hold the Hessian matrix explicitly and only requires computing the multiplication of the Hessian and a given vector. One of the ways to obtain an approximation of such Hessian-vector multiplication is to integrate the so-called second-order adjoint system numerically. However, the error in the approximation could be significant even if the numerical integration to the second-order adjoint…
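To make the setting in the abstract concrete, here is a minimal sketch of an exact Hessian-vector product for a function of a numerical solution. It is not the paper's algorithm: it hand-differentiates the explicit Euler scheme for one illustrative problem, and everything in it is an assumption chosen for the example (the pendulum vector field `f`, the objective C(x) = (q^2 + p^2)/2, the step size, and all function names). The backward recursion for `xi` is the discrete second-order adjoint of this particular scheme, so the result is the Hessian of the discrete map up to rounding, not an approximation to a continuous-time quantity.

```python
import math

# Illustrative problem (an assumption, not from the paper): pendulum
# dq/dt = p, dp/dt = -sin(q), discretized by explicit Euler.

def f(x):
    q, p = x
    return (p, -math.sin(q))

def jac_vec(x, d):
    # J(x) d with J = [[0, 1], [-cos q, 0]]
    q, _ = x
    return (d[1], -math.cos(q) * d[0])

def jac_T_vec(x, w):
    # J(x)^T w
    q, _ = x
    return (-math.cos(q) * w[1], w[0])

def second_derivative_term(x, d, lam):
    # sum_i lam_i * (Hessian of f_i at x) d.
    # Only f_2 = -sin q is nonlinear, with d^2 f_2 / dq^2 = sin q.
    q, _ = x
    return (lam[1] * math.sin(q) * d[0], 0.0)

def axpy(a, x, y):
    # a * x + y for 2-tuples
    return tuple(a * xi + yi for xi, yi in zip(x, y))

def forward(theta, h, n_steps):
    # x_{n+1} = x_n + h f(x_n), x_0 = theta
    xs = [theta]
    for _ in range(n_steps):
        xs.append(axpy(h, f(xs[-1]), xs[-1]))
    return xs

def gradient(theta, h, n_steps):
    # Discrete first-order adjoint; C(x) = 0.5 * |x|^2, so grad C(x_N) = x_N.
    xs = forward(theta, h, n_steps)
    lam = xs[-1]
    for n in reversed(range(n_steps)):
        lam = axpy(h, jac_T_vec(xs[n], lam), lam)
    return lam

def hessian_vector(theta, v, h, n_steps):
    # Tangent (variational) sweep forward, then first- and second-order
    # adjoint sweeps backward. Returns (d^2 C(x_N) / d theta^2) v.
    xs = forward(theta, h, n_steps)
    ds = [v]
    for n in range(n_steps):
        ds.append(axpy(h, jac_vec(xs[n], ds[-1]), ds[-1]))
    lam = xs[-1]   # lambda_N = grad C(x_N)
    xi = ds[-1]    # xi_N = Hess C(x_N) delta_N, and Hess C is the identity here
    for n in reversed(range(n_steps)):
        # Both updates use lambda_{n+1}, so xi is updated before lam.
        curvature = second_derivative_term(xs[n], ds[n], lam)
        xi = axpy(h, axpy(1.0, jac_T_vec(xs[n], xi), curvature), xi)
        lam = axpy(h, jac_T_vec(xs[n], lam), lam)
    return xi

theta0 = (0.7, -0.3)   # initial value (assumed for the demo)
direction = (1.0, 2.0)
print(hessian_vector(theta0, direction, h=0.05, n_steps=40))
```

A function like `hessian_vector` is all a Krylov subspace solver needs: the Hessian itself is never formed or stored, and each product costs one forward and one backward sweep over the stored trajectory.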
Taxonomy
Topics: Matrix Theory and Algorithms · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques
