Extending Kernel Trick to Influence Functions
Zhenhuan Sun, Shahrokh Valaee

TL;DR
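This paper introduces a dual representation of influence functions whose computational cost scales with dataset size rather than model size, enabling efficient estimation of the effect of removing a data point when the model is large relative to the dataset. The representation is limited to linearizable models.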
Contribution
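The paper extends the kernel trick to influence functions, providing an alternative to parameter-space evaluation when the model is large relative to the dataset or when parameter-space evaluation is infeasible. The approach applies to linearizable models.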
Findings
Dual representation scales with dataset size, not model size.
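Efficient estimation of changes in parameters, model outputs, and loss due to data point removal.
Applicable only to linearizable models, and requires materializing a matrix whose size grows with the product of model output dimension and dataset size.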
Abstract
In this paper, we present a dual representation of the influence functions, whose computational complexity scales with dataset size rather than model size. Both analytically and experimentally, we show that this representation can be an efficient alternative to the original influence functions for estimating changes in parameters, model outputs and loss due to data point removal, when model size is large relative to dataset size, or when evaluating the original influence functions in parameter space is infeasible. The dual representation, however, is limited to linearizable models, which are models whose behavior can be approximated by their linearizations throughout training, and requires materializing a matrix, whose size grows with the product of model output dimension and dataset size.
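To make the primal-versus-dual distinction concrete, the following is a minimal sketch for ridge regression on fixed features, not the paper's formulation. It assumes the objective 0.5 * sum of squared errors + 0.5 * lam * ||theta||^2 and uses the standard push-through identity (Phi^T Phi + lam I)^{-1} Phi^T = Phi^T (Phi Phi^T + lam I)^{-1} to move the linear solve from a p x p system (model size) to an n x n system (dataset size). The function names, `lam`, and the toy data are illustrative assumptions.

```python
import numpy as np

def primal_influence(Phi, residuals, lam, i):
    """First-order estimate of the parameter change from removing point i,
    computed in parameter space (solves a p x p system)."""
    p = Phi.shape[1]
    H = Phi.T @ Phi + lam * np.eye(p)        # Hessian of the ridge objective
    g_i = Phi[i] * residuals[i]              # gradient of the i-th squared-error term
    return np.linalg.solve(H, g_i)

def dual_influence(Phi, residuals, lam, i):
    """Same estimate computed through the n x n Gram matrix (dual form)."""
    n = Phi.shape[0]
    K = Phi @ Phi.T                          # Gram (kernel) matrix, n x n
    rhs = np.zeros(n)
    rhs[i] = residuals[i]                    # residual of point i on the i-th coordinate
    alpha = np.linalg.solve(K + lam * np.eye(n), rhs)
    return Phi.T @ alpha                     # map dual coefficients back to parameters

# Toy setting with far more parameters than data points (n=5, p=50).
rng = np.random.default_rng(0)
Phi = rng.normal(size=(5, 50))
y = rng.normal(size=5)
lam = 0.1

theta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(50), Phi.T @ y)  # ridge solution
residuals = Phi @ theta - y

d_primal = primal_influence(Phi, residuals, lam, i=2)
d_dual = dual_influence(Phi, residuals, lam, i=2)
print(np.allclose(d_primal, d_dual))
```

In this sketch the dual path never forms the 50 x 50 Hessian; it only solves a 5 x 5 system, which is the regime the abstract describes (model size large relative to dataset size). For a model with multiple outputs, the corresponding matrix would grow with the product of output dimension and dataset size, as the abstract notes.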
