Revisiting inverse Hessian vector products for calculating influence functions

Yegor Klochkov; Yang Liu

arXiv:2409.17357 · cs.LG · September 27, 2024

TL;DR

This paper revisits inverse Hessian-vector products for influence functions, showing how hyperparameters can be effectively chosen based on spectral properties, improving practicality for large models.

Contribution

It demonstrates that hyperparameters for LiSSA can be set based on Hessian spectral properties, making influence function computation more feasible for large models.

Findings

01. Hyperparameters depend on Hessian trace and eigenvalues

02. Sufficiently large batch size is needed for convergence

03. Empirical validation confirms theoretical insights
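
The spectral quantities named in the first finding can be estimated from Hessian-vector products alone, without forming the Hessian. The following is a minimal sketch, not the paper's procedure: it uses power iteration for the largest eigenvalue and a Hutchinson estimator for the trace. The function names `power_iteration` and `hutchinson_trace`, the toy matrix `H`, and the iteration counts are all assumptions made for illustration.

```python
import numpy as np

def power_iteration(hvp, dim, n_iters=500, seed=0):
    """Estimate the largest eigenvalue of a symmetric PSD operator from HVPs."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        w = hvp(v)
        v = w / np.linalg.norm(w)
    # Rayleigh quotient of the (unit-norm) final iterate.
    return float(v @ hvp(v))

def hutchinson_trace(hvp, dim, n_probes=2000, seed=0):
    """Estimate the trace as the average of z^T H z over Rademacher probes z."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=dim)
        total += z @ hvp(z)
    return total / n_probes

# Toy stand-in for a model Hessian (hypothetical data, small enough to check).
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
H = A.T @ A / 50 + 0.1 * np.eye(10)

def hvp(v):
    return H @ v

lam_max_est = power_iteration(hvp, dim=10)
trace_est = hutchinson_trace(hvp, dim=10)
print(f"largest eigenvalue ~ {lam_max_est:.4f}, trace ~ {trace_est:.4f}")
```

For a real network, `hvp` would be replaced by an automatic-differentiation Hessian-vector product; the estimators themselves stay the same.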

Abstract

Influence functions are a popular tool for attributing a model's output to training data. The traditional approach relies on the calculation of inverse Hessian-vector products (iHVP), but the classical solver "Linear time Stochastic Second-order Algorithm" (LiSSA, Agarwal et al. (2017)) is often deemed impractical for large models due to expensive computation and hyperparameter tuning. We show that the three hyperparameters -- the scaling factor, the batch size, and the number of steps -- can be chosen depending on the spectral properties of the Hessian, particularly its trace and largest eigenvalue. By evaluating with random sketching (Swartworth and Woodruff, 2023), we find that the batch size has to be sufficiently large for LiSSA to converge; however, for all of the models we consider, the requirement is mild. We confirm our findings empirically by comparing to Proximal Bregman…
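
To make the roles of the three hyperparameters concrete, here is a minimal sketch of a LiSSA-style iteration on a toy least-squares problem, where the Hessian is `X.T @ X / n + damping * I`. It is an illustration under stated assumptions, not the authors' implementation: the function name `lissa_ihvp`, the damping term, the toy data, and the choice `scale = 2 * lambda_max` are hypothetical and are not the selection rule derived in the paper. The update `x <- x + (b - H_B x) / scale` is the Neumann-series recursion written for the rescaled iterate.

```python
import numpy as np

def lissa_ihvp(X, b, damping, scale, batch_size, n_steps, seed=0):
    """Approximate H^{-1} b for H = X^T X / n + damping * I.

    Each step uses a mini-batch Hessian H_B and applies
        x <- x + (b - H_B x) / scale,
    which is the truncated Neumann series for (H / scale)^{-1} b / scale.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    x = b / scale
    for _ in range(n_steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        Xb = X[idx]
        # Mini-batch Hessian-vector product, computed without forming H_B.
        hvp = Xb.T @ (Xb @ x) / batch_size + damping * x
        x = x + (b - hvp) / scale
    return x

# Toy problem (hypothetical data), small enough to solve exactly.
rng = np.random.default_rng(0)
n, d, damping = 512, 10, 0.1
X = rng.standard_normal((n, d))
b = rng.standard_normal(d)
H = X.T @ X / n + damping * np.eye(d)

# Illustrative scaling factor: twice the largest eigenvalue of H.
scale = 2.0 * np.linalg.eigvalsh(H)[-1]

x_exact = np.linalg.solve(H, b)
x_full = lissa_ihvp(X, b, damping, scale, batch_size=n, n_steps=300)
x_mini = lissa_ihvp(X, b, damping, scale, batch_size=128, n_steps=300)

rel_err_full = np.linalg.norm(x_full - x_exact) / np.linalg.norm(x_exact)
rel_err_mini = np.linalg.norm(x_mini - x_exact) / np.linalg.norm(x_exact)
print(f"full batch relative error: {rel_err_full:.2e}")
print(f"mini-batch relative error: {rel_err_mini:.2e}")
```

With the full batch the iteration is deterministic and contracts toward the exact solution. With mini-batches the last iterate keeps fluctuating around it, which is one way to see why the batch size matters for convergence.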

Code & Models

Repositories

yklochkov-bytedance/gnhtools (PyTorch, official)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference