Rethinking Influence Functions of Neural Networks in the Over-parameterized Regime
Rui Zhang, Shihua Zhang

TL;DR
This paper uses neural tangent kernel theory to improve the understanding and calculation of influence functions in over-parameterized neural networks, revealing their dependence on regularization and training data density.
Contribution
It introduces an NTK-based approach to accurately compute influence functions and analyzes the limitations of the classic IHVP method in the over-parameterized regime.
Findings
IHVP accuracy depends on regularization strength
Influence functions are affected by training data density
NTK theory helps quantify influence and training dynamics
Abstract
Understanding the black-box predictions of neural networks is challenging. To this end, early studies designed the influence function (IF) to measure the effect of removing a single training point on a neural network. However, the classic implicit Hessian-vector product (IHVP) method for calculating IF is fragile, and theoretical analysis of IF in the context of neural networks is still lacking. We therefore use neural tangent kernel (NTK) theory to calculate IF for neural networks trained with regularized mean-square loss, and prove that the approximation error can be made arbitrarily small when the width is sufficiently large for two-layer ReLU networks. We analyze the error bound of the classic IHVP method in the over-parameterized regime to understand when and why it fails. In detail, our theoretical analysis reveals that (1) the accuracy of IHVP depends on the…
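As a minimal illustration of the quantities the abstract names, the sketch below computes an influence estimate through an inverse-Hessian-vector product and compares it with exact leave-one-out retraining. It uses a linear model with regularized mean-square loss, not the paper's NTK-based method or its two-layer ReLU setting. The function names (`fit_ridge`, `influence_on_prediction`, `exact_removal_effect`), the synthetic data, and the choice of `lam` are all assumptions made for illustration.

```python
import numpy as np

def fit_ridge(X, y, lam, n_total):
    """Minimize (1/n_total) * sum_i 0.5*(x_i @ theta - y_i)**2 + 0.5*lam*||theta||^2.

    Returns the minimizer and the Hessian of the objective.
    """
    d = X.shape[1]
    H = X.T @ X / n_total + lam * np.eye(d)
    theta = np.linalg.solve(H, X.T @ y / n_total)
    return theta, H

def influence_on_prediction(X, y, lam, i, x_test):
    """First-order estimate of the change in the prediction at x_test
    when training point i is removed (toy linear model, an assumption)."""
    n = X.shape[0]
    theta, H = fit_ridge(X, y, lam, n)
    grad_i = X[i] * (X[i] @ theta - y[i])   # gradient of the loss at point i
    ihvp = np.linalg.solve(H, grad_i)       # H^{-1} grad_i without forming H^{-1}
    return x_test @ ihvp / n

def exact_removal_effect(X, y, lam, i, x_test):
    """Ground truth: refit without point i, keeping the 1/n normalization."""
    n = X.shape[0]
    theta, _ = fit_ridge(X, y, lam, n)
    keep = np.arange(n) != i
    theta_loo, _ = fit_ridge(X[keep], y[keep], lam, n)
    return x_test @ (theta_loo - theta)

# Synthetic data, chosen only for the demonstration.
rng = np.random.default_rng(0)
n, d, lam = 200, 5, 0.1
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
x_test = rng.normal(size=d)

approx = influence_on_prediction(X, y, lam, 0, x_test)
exact = exact_removal_effect(X, y, lam, 0, x_test)
print(f"influence estimate: {approx:.3e}, exact retraining: {exact:.3e}")
```

In this linear toy the two values differ only by the factor 1 / (1 - h), where h = x_i^T H^{-1} x_i / n, and h shrinks as `lam` grows. Whether and how such behavior carries over to over-parameterized neural networks is the question the paper analyzes.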
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
Methods: Neural Tangent Kernel
