Better Training Data Attribution via Better Inverse Hessian-Vector Products

Andrew Wang; Elisa Nguyen; Runshi Yang; Juhan Bae; Sheila A. McIlraith; Roger Grosse

arXiv:2507.14740 · cs.LG · July 22, 2025

TL;DR

This paper introduces ASTRA, an algorithm that approximates inverse Hessian-vector products (iHVPs) accurately and efficiently, which in turn improves gradient-based training data attribution (TDA).

Contribution

ASTRA applies an EKFAC preconditioner to Neumann series iterations to approximate iHVPs. It needs fewer iterations than unpreconditioned Neumann series iterations, is more accurate than EKFAC-based approximations, and is easy to tune.
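To make the idea of preconditioning a Neumann series concrete, here is a minimal NumPy sketch on a toy problem. It is not the paper's implementation: the diagonal (Jacobi) preconditioner is only a stand-in for EKFAC, and the matrix `H`, the function names, and the iteration counts are all illustrative assumptions.

```python
import numpy as np

def neumann_ihvp(hvp, v, alpha, num_iters):
    """Plain Neumann series iteration: x <- x + alpha * (v - H x), from x = 0."""
    x = np.zeros_like(v)
    for _ in range(num_iters):
        x = x + alpha * (v - hvp(x))
    return x

def preconditioned_neumann_ihvp(hvp, precond_solve, v, num_iters):
    """Preconditioned iteration: x <- x + P^{-1} (v - H x).

    precond_solve(r) applies an approximate inverse of H to r.
    """
    x = precond_solve(v)  # start from the preconditioner's own estimate
    for _ in range(num_iters):
        x = x + precond_solve(v - hvp(x))
    return x

# Toy ill-conditioned symmetric positive definite matrix (hypothetical, for illustration).
dim = 20
scales = np.logspace(0, 3, dim)
H = np.diag(scales) + 0.5 * np.ones((dim, dim)) / dim
v = np.ones(dim)
x_true = np.linalg.solve(H, v)

def hvp(x):
    # Stand-in for a Hessian-vector product; a real model would use autodiff.
    return H @ x

# ASSUMPTION: a Jacobi (diagonal) preconditioner stands in for EKFAC here.
diag_H = np.diag(H).copy()

def precond_solve(r):
    return r / diag_H

def rel_err(x):
    return np.linalg.norm(x - x_true) / np.linalg.norm(x_true)

# Step size for the plain iteration; computed exactly here, tuned in practice.
alpha = 1.0 / np.linalg.eigvalsh(H).max()

err_plain = rel_err(neumann_ihvp(hvp, v, alpha, num_iters=20))
err_precond = rel_err(preconditioned_neumann_ihvp(hvp, precond_solve, v, num_iters=20))
print(f"plain Neumann:          relative error {err_plain:.3e}")
print(f"preconditioned Neumann: relative error {err_precond:.3e}")
```

On this toy matrix the preconditioned iteration gets close to the exact solve within the same 20 steps that leave the plain iteration far from converged. That mirrors the qualitative claim above, though it says nothing about how EKFAC itself behaves on a neural network.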

Findings

1. ASTRA approximates iHVPs more accurately than EKFAC-based approximations.

2. A more accurate iHVP approximation leads to significantly better training data attribution performance.

3. ASTRA requires fewer iterations than plain Neumann series iterations and is easy to tune.

Abstract

Training data attribution (TDA) provides insights into which training data is responsible for a learned model behavior. Gradient-based TDA methods such as influence functions and unrolled differentiation both involve a computation that resembles an inverse Hessian-vector product (iHVP), which is difficult to approximate efficiently. We introduce an algorithm (ASTRA) which uses the EKFAC-preconditioner on Neumann series iterations to arrive at an accurate iHVP approximation for TDA. ASTRA is easy to tune, requires fewer iterations than Neumann series iterations, and is more accurate than EKFAC-based approximations. Using ASTRA, we show that improving the accuracy of the iHVP approximation can significantly improve TDA performance.
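For orientation, the iHVP can be seen in the standard influence-function formulation. The notation below is an assumption for illustration and may differ from the paper's, which may also use a damped or approximate Hessian. With training loss $\mathcal{L}$, trained parameters $\theta^\star$, Hessian $H = \nabla_\theta^2 \mathcal{L}(\theta^\star)$, a training example $z_m$, and a query $z_q$, the effect of upweighting $z_m$ on the query loss is

$$
\mathcal{I}(z_m, z_q) = -\,\nabla_\theta \mathcal{L}(z_q; \theta^\star)^\top \, H^{-1} \, \nabla_\theta \mathcal{L}(z_m; \theta^\star).
$$

The costly term is the product $H^{-1} v$ with a gradient vector $v$. When every eigenvalue of $\alpha H$ lies in $(0, 2)$, the Neumann series expresses it using only Hessian-vector products:

$$
H^{-1} v = \alpha \sum_{k=0}^{\infty} (I - \alpha H)^k v,
\qquad
x_{k+1} = x_k + \alpha\,(v - H x_k), \quad x_0 = 0 .
$$

Replacing the scalar $\alpha$ with an approximate inverse $P^{-1} \approx H^{-1}$ gives the preconditioned update $x_{k+1} = x_k + P^{-1}(v - H x_k)$, which converges when the spectral radius of $I - P^{-1} H$ is below 1.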
