Faithful and Fast Influence Function via Advanced Sampling

Jungyeon Koh; Hyeonsu Lyu; Jonggyu Jang; Hyun Jong Yang

arXiv:2510.26776 · cs.LG · November 3, 2025

Open Access

TL;DR

This paper introduces two sampling techniques, one based on features and one on logits, that make influence function estimates for black-box models more accurate and cheaper to compute, with reported reductions of 30.1% in computation time and 42.2% in memory usage.

Contribution

The paper proposes two sampling methods that select a small, representative subset of the training data for influence function estimation, improving accuracy and efficiency over the common practice of random sampling.

Findings

01. Reduces computation time by 30.1%

02. Decreases memory usage by 42.2%

03. Improves F1-score by 2.5% in class removal experiments

Abstract

How can we explain the influence of training data on black-box models? Influence functions (IFs) offer a post-hoc solution by utilizing gradients and Hessians. However, computing the Hessian for an entire dataset is resource-intensive, necessitating a feasible alternative. A common approach involves randomly sampling a small subset of the training data, but this method often results in highly inconsistent IF estimates due to the high variance in sample configurations. To address this, we propose two advanced sampling techniques based on features and logits. These samplers select a small yet representative subset of the entire dataset by considering the stochastic distribution of features or logits, thereby enhancing the accuracy of IF estimations. We validate our approach through class removal experiments, a typical application of IFs, using the F1-score to measure how effectively the…
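To make the cost and variance issue concrete, here is a minimal sketch of the standard influence function estimate, I(z, z_test) = -∇L(z_test)ᵀ H⁻¹ ∇L(z), with the Hessian computed either on the full training set or on a small random subset. It is not the paper's code and does not implement the proposed feature or logit samplers. The logistic regression model, the synthetic data, the regularization strength `lam`, and the subset size `m = 50` are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logreg(X, y, lam=1e-2, lr=1.0, steps=500):
    """Gradient descent on mean log loss plus (lam / 2) * ||w||^2."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y) + lam * w
        w -= lr * grad
    return w

def hessian(X, w, lam=1e-2):
    """Hessian of the regularized mean log loss over the rows of X."""
    p = sigmoid(X @ w)
    s = p * (1.0 - p)
    return (X * s[:, None]).T @ X / len(X) + lam * np.eye(X.shape[1])

def influence(X_train, y_train, x_test, y_test, w, H):
    """Influence of each training point on the test loss: -g_test^T H^{-1} g_i."""
    g_test = (sigmoid(x_test @ w) - y_test) * x_test
    v = np.linalg.solve(H, g_test)  # H^{-1} g_test without forming the inverse
    G = (sigmoid(X_train @ w) - y_train)[:, None] * X_train  # per-example gradients
    return -G @ v

# Synthetic binary classification data (illustrative assumption).
rng = np.random.default_rng(0)
n, d, m = 2000, 5, 50
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (rng.random(n) < sigmoid(X @ w_true)).astype(float)
x_test = rng.normal(size=d)
y_test = float(rng.random() < sigmoid(x_test @ w_true))

w = fit_logreg(X, y)

# Reference scores: Hessian over the entire training set.
H_full = hessian(X, w)
full_scores = influence(X, y, x_test, y_test, w, H_full)

# Random-sampling baseline: Hessian over m random training points.
# Different draws give different Hessians, hence different scores.
corrs = []
for seed in range(5):
    idx = np.random.default_rng(seed).choice(n, size=m, replace=False)
    H_sub = hessian(X[idx], w)
    sub_scores = influence(X, y, x_test, y_test, w, H_sub)
    corrs.append(np.corrcoef(full_scores, sub_scores)[0, 1])

print("correlation with full-Hessian scores:", np.round(corrs, 3))
```

The spread of the printed correlations across seeds is one rough way to see how much a random subset can change the estimate. In this sketch, a sampler of the kind the abstract describes would replace only the line that draws `idx`.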

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Imbalanced Data Classification Techniques · Machine Learning and Data Classification