Fast and Stable Gradient Approximation for Bilinear Forms of Hermitian Matrix Functions

Navjot Singh; Kipton Barros; Xiaoye Sherry Li

arXiv:2605.12801·math.NA·May 14, 2026

TL;DR

This paper introduces a fast, stable, forward-only gradient approximation method for bilinear forms of Hermitian matrix functions, reducing computational costs and avoiding reorthogonalization.

Contribution

The authors propose a forward-only gradient approximation that reuses the forward Lanczos pass, avoiding both backpropagation through the Lanczos recurrence and Arnoldi on an augmented block matrix of twice the original size.

Findings

01

The new method's error is proportional to the Lanczos residual norm.

02

In the authors' tests, it remains stable without any reorthogonalization.

03

The approach is faster than current state-of-the-art techniques.

Abstract

Objectives involving bilinear forms $u^{\top} f(A(\theta)) v$ for Hermitian $A$ arise widely in scientific computing and probabilistic machine learning. For large matrices, Lanczos efficiently approximates these quantities, but differentiating them with respect to $\theta$ is challenging. Existing approaches either backpropagate through the Lanczos recurrence, requiring reorthogonalization for stability, or apply Arnoldi to an augmented block matrix of twice the original size. Both introduce extra computation and orthogonalization costs that can limit performance on modern hardware. We propose a forward-only gradient approximation that reuses the Lanczos pass and adds very minimal overhead in most cases. We prove that its error is proportional to the Lanczos residual norm, the same quantity controlling the forward approximation. Whereas a traditional adjoint-based calculation would be…
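
For orientation, the forward quantity mentioned in the abstract can be sketched with the textbook Lanczos approximation of $u^{\top} f(A) v$. This is only an illustration of the forward pass, not the authors' gradient method. It assumes a real symmetric $A$ and a scalar function `f` that acts elementwise on eigenvalues; the function names `lanczos` and `bilinear_form` and the test matrix are hypothetical choices made for this example.

```python
import numpy as np

def lanczos(A, v, k):
    """Up to k Lanczos steps on real symmetric A from start vector v.

    No reorthogonalization is performed. Returns the basis Q, the diagonal
    alpha and the off-diagonal beta of the tridiagonal matrix; the last
    entry of beta is the Lanczos residual norm.
    """
    n = v.shape[0]
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    q = v / np.linalg.norm(v)
    q_prev = np.zeros(n)
    b_prev = 0.0
    for j in range(k):
        Q[:, j] = q
        w = A @ q - b_prev * q_prev      # three-term recurrence
        alpha[j] = q @ w
        w = w - alpha[j] * q
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-14:              # invariant subspace reached: stop early
            return Q[:, :j + 1], alpha[:j + 1], beta[:j + 1]
        q_prev, q, b_prev = q, w / beta[j], beta[j]
    return Q, alpha, beta

def bilinear_form(A, u, v, f, k):
    """Approximate u^T f(A) v as ||v|| * u^T Q f(T) e_1."""
    Q, alpha, beta = lanczos(A, v, k)
    m = len(alpha)
    T = np.diag(alpha) + np.diag(beta[:m - 1], 1) + np.diag(beta[:m - 1], -1)
    evals, S = np.linalg.eigh(T)
    fT_e1 = S @ (f(evals) * S[0, :])     # f(T) e_1 via eigendecomposition of T
    estimate = np.linalg.norm(v) * (u @ (Q @ fT_e1))
    return estimate, beta[m - 1]

# Small demo: symmetric matrix with eigenvalues in [0, 1], f = exp.
rng = np.random.default_rng(0)
n = 50
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.linspace(0.0, 1.0, n)) @ U.T
A = (A + A.T) / 2                        # symmetrize against rounding
u = rng.standard_normal(n)
v = rng.standard_normal(n)

est, residual_norm = bilinear_form(A, u, v, np.exp, k=20)

# Dense reference value, only feasible for small n.
lam, V = np.linalg.eigh(A)
exact = u @ (V @ (np.exp(lam) * (V.T @ v)))
```

Only `k` matrix-vector products with `A` are needed, which is why this kind of approximation is attractive for large matrices. The second return value is the Lanczos residual norm, the quantity the abstract says controls the forward approximation error.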
