Revisit, Extend, and Enhance Hessian-Free Influence Functions
Ziao Yang, Han Yue, Jian Chen, Hongfu Liu

TL;DR
This paper revisits the TracIn influence estimation method, providing insights into its effectiveness, extending its applications to fairness and robustness, and enhancing it with ensemble strategies for better performance in various deep learning tasks.
Contribution
It offers a deeper understanding of the simple Hessian approximation in TracIn, extends its use to fairness and robustness, and introduces an ensemble enhancement for improved influence estimation.
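As a rough illustration of the Hessian-free idea, the sketch below scores training samples by the inner product between their loss gradient and the mean validation gradient, both taken at the final model. It is a minimal NumPy sketch on logistic regression, not the authors' implementation: the function names (`per_sample_grads`, `ip_influence`), the synthetic data, and the sign convention are all assumptions made for this example.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def per_sample_grads(w, X, y):
    """Per-sample gradient of the logistic loss w.r.t. w; shape (n, d)."""
    return (sigmoid(X @ w) - y)[:, None] * X

def ip_influence(w, X_train, y_train, X_val, y_val):
    """Hessian-free score: -(mean validation gradient) . (training gradient).

    Assumed sign convention: a negative score means that upweighting the
    sample is expected, to first order, to lower the validation loss.
    """
    g_val = per_sample_grads(w, X_val, y_val).mean(axis=0)
    return -(per_sample_grads(w, X_train, y_train) @ g_val)

if __name__ == "__main__":
    # Hypothetical synthetic data, for illustration only.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(100, 5))
    y_train = (X_train[:, 0] > 0).astype(float)
    X_val = rng.normal(size=(40, 5))
    y_val = (X_val[:, 0] > 0).astype(float)
    w = rng.normal(size=5) * 0.1  # stands in for the final trained weights

    scores = ip_influence(w, X_train, y_train, X_val, y_val)
    print("most helpful samples:", np.argsort(scores)[:3])
```

Only one gradient per sample and one matrix-vector product are needed, which is where the efficiency over inverse-Hessian methods comes from.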
Findings
TracIn performs well despite its naive Hessian approximation.
The extended TracIn improves fairness and robustness in models.
Ensemble strategies enhance influence estimation accuracy.
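The fairness extension and the ensemble idea in these findings can be sketched together. This page does not spell out the exact fairness metric or ensemble strategy, so the code below rests on two assumptions: a squared demographic-parity gap as a differentiable fairness surrogate, and plain averaging of inner-product scores over several weight vectors. All names (`dp_gap`, `fairness_grad`, `ip_scores`, `ensemble_ip_scores`) are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def per_sample_grads(w, X, y):
    """Per-sample gradient of the logistic loss w.r.t. w; shape (n, d)."""
    return (sigmoid(X @ w) - y)[:, None] * X

def dp_gap(w, X, group):
    """Assumed surrogate: mean predicted probability, group 1 minus group 0."""
    p = sigmoid(X @ w)
    return p[group == 1].mean() - p[group == 0].mean()

def fairness_grad(w, X, group):
    """Gradient of dp_gap(w)**2 with respect to w."""
    p = sigmoid(X @ w)
    dp = (p * (1 - p))[:, None] * X  # d p_i / d w for each row
    dgap = dp[group == 1].mean(axis=0) - dp[group == 0].mean(axis=0)
    return 2.0 * dp_gap(w, X, group) * dgap

def ip_scores(w, X_train, y_train, target_grad):
    """Inner-product score against an arbitrary target gradient.

    Negative score: upweighting the sample is expected, to first order,
    to decrease the target metric (here, the squared parity gap).
    """
    return -(per_sample_grads(w, X_train, y_train) @ target_grad)

def ensemble_ip_scores(weights, X_train, y_train, target_grad_fn):
    """Average the scores over several models (assumed ensemble rule)."""
    per_model = [ip_scores(w, X_train, y_train, target_grad_fn(w))
                 for w in weights]
    return np.mean(per_model, axis=0)
```

A call such as `ensemble_ip_scores(ws, X_train, y_train, lambda w: fairness_grad(w, X_val, group_val))` would rank training samples by their estimated effect on the parity gap, where `ws` is a list of weight vectors from, for example, different random seeds.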
Abstract
Influence functions serve as crucial tools for assessing sample influence in model interpretation, training subset selection, noisy label detection, and more. By employing a first-order Taylor expansion, influence functions can estimate sample influence without the need for expensive model retraining. However, applying influence functions directly to deep models presents challenges, primarily due to the non-convex nature of the loss function and the large number of model parameters. This difficulty not only makes computing the inverse of the Hessian matrix costly but also renders it non-existent in some cases. Various approaches, including matrix decomposition, have been explored to expedite and approximate the inversion of the Hessian matrix, with the aim of making influence functions applicable to deep models. In this paper, we revisit a specific, albeit naive, yet effective…
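For orientation, the classical influence function that the abstract refers to can be written next to its Hessian-free counterpart. The notation below is an assumption introduced here (it does not appear on this page): $\hat\theta$ is the trained parameter vector, $\ell$ the per-sample loss, $D_{\mathrm{val}}$ a validation set, and $n$ the training set size. Sign conventions vary across papers; in this one, a negative value means that upweighting $z$ lowers the validation loss to first order.

```latex
% Classical influence of upweighting a training sample z
\mathcal{I}_{\mathrm{IF}}(z)
  = -\,\nabla_\theta \mathcal{L}(D_{\mathrm{val}};\hat\theta)^{\top}
     H_{\hat\theta}^{-1}\,
     \nabla_\theta \ell(z;\hat\theta),
\qquad
H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^{2}\,\ell(z_i;\hat\theta)

% Hessian-free (inner product) variant: H_{\hat\theta} replaced by the identity
\mathcal{I}_{\mathrm{IP}}(z)
  = -\,\nabla_\theta \mathcal{L}(D_{\mathrm{val}};\hat\theta)^{\top}
     \nabla_\theta \ell(z;\hat\theta)
```

The second expression needs neither the Hessian nor its inverse, so it stays defined even when $H_{\hat\theta}$ is singular.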
Peer Reviews
Decision: Submitted to ICLR 2025
- The paper is well written, and the reader can grasp its main idea quickly.
- Efficiency has become a very important topic for TDA (training data attribution).

- There is a large gap between the contribution claimed in this paper and the actual literature. The major problem lies in the first and second contribution bullet points in Section 1 (line 59 - line 62).
- The Inner Product (IP) proposed by this paper (as a simplified version of TracIn) was proposed long ago [1] and has been used in a large number of papers [2].
- Replacing the loss gradient with other metrics for fairness and robustness is also something tried for influence function or related me
The paper includes a lot of experiments and provides statistical ranges for the reported results.
However, the paper lacks a clear problem statement and positioning of the method within existing approaches. The formula provided to describe the method is the TracIn formula, with the only difference being the reduction of calculations to the final trained model. This raises several questions: (i) the explanation of the benefits of this simplification is vague and could be questioned, (ii) the intuition behind this approximation rests on a comparison with a method involving the Hessian’s invers
1. The proposed method is particularly simple, easy to implement and efficient to compute.
2. The experimental evaluations presented in the paper are fairly thorough and rigorous, with appropriate repeat experiments to establish confidence intervals.
3. The extension of influence estimation to algorithmic fairness metrics is interesting.
1. The order consistency argument for why IP is a good approximation to inverse-Hessian influence is quite weak. In Figure 1, data points with vectors in regions I and III would not satisfy order consistency. The authors argue that since IP and IF both rate such points as beneficial/detrimental, the order doesn’t matter. This is clearly not true, as many applications of influence estimation involve determining set membership at the extreme ends of the influence spectrum (e.g. removing x% detrime
Taxonomy
Topics: Parallel Computing and Optimization Techniques
Methods: Sparse Evolutionary Training
