Imperfect Influence, Preserved Rankings: A Theory of TRAK for Data Attribution
Han Tong, Shubhangi Ghosh, Haolin Zou, Arian Maleki

TL;DR
This paper gives a theoretical analysis of the TRAK data attribution algorithm, showing that it preserves the influence ranking of training data despite significant approximation errors. The theory is supported by simulations and empirical studies.
Contribution
It offers the first theoretical characterization of TRAK's performance, explaining when and how it accurately preserves data influence rankings.
Findings
TRAK's influence estimates are highly correlated with true influence.
Approximation errors in TRAK can be significant, yet the ranking of training data by influence is largely preserved.
Theoretical results are validated through extensive simulations and empirical data.
Abstract
Data attribution, tracing a model's prediction back to specific training data, is an important tool for interpreting sophisticated AI models. The widely used TRAK algorithm addresses this challenge by first approximating the underlying model with a kernel machine and then leveraging techniques developed for approximate leave-one-out (ALO) risk estimation. Despite its strong empirical performance, the theoretical conditions under which the TRAK approximations are accurate as well as the regimes in which they break down remain largely unexplored. In this paper, we provide a theoretical analysis of the TRAK algorithm, characterizing its performance and quantifying the errors introduced by the approximations on which the method relies. We show that although the approximations incur significant errors, TRAK's estimated influence remains highly correlated with the original influence and therefore…
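The abstract's central claim, that an influence estimate can be numerically off yet still rank training points correctly, can be illustrated with a toy sketch. The example below is not the TRAK algorithm and not the paper's experimental setup: it assumes a plain ridge regression in place of the kernel machine, omits TRAK's gradient features, random projections, and ensembling, and uses hypothetical sizes (`n`, `d`, `lam`) chosen only for illustration. It compares the exact leave-one-out change in a test prediction (obtained by refitting) with a one-step estimate that drops the leverage correction, then reports the relative error and the Spearman rank correlation between the two.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 50, 1.0  # hypothetical sizes, for illustration only

# Synthetic linear-regression data and a single test point.
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + rng.standard_normal(n)
x_test = rng.standard_normal(d)


def ridge_fit(X, y, lam):
    """Closed-form ridge solution (X^T X + lam * I)^{-1} X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)


beta = ridge_fit(X, y, lam)

# "Original" influence: refit without point i and record how much
# the prediction at x_test changes.
exact = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    exact[i] = x_test @ (beta - ridge_fit(X[keep], y[keep], lam))

# One-step estimate with no refitting: x_test^T H^{-1} x_i * residual_i,
# where H = X^T X + lam * I. The leverage correction 1 / (1 - h_i) is
# deliberately dropped, so this estimate is systematically too small.
H_inv_x_test = np.linalg.solve(X.T @ X + lam * np.eye(d), x_test)
residuals = y - X @ beta
approx = (X @ H_inv_x_test) * residuals


def spearman(a, b):
    """Spearman rank correlation (assumes no ties, true for continuous data)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]


rel_err = np.mean(np.abs(approx - exact) / np.abs(exact))
rho = spearman(exact, approx)
print(f"mean relative error of the estimate: {rel_err:.3f}")
print(f"Spearman rank correlation:           {rho:.3f}")
```

In this toy setting the estimate differs from the exact value by a per-point factor tied to the leverage of each training point, which changes magnitudes but leaves the ordering mostly intact. Whether and when the same holds for TRAK itself is the question the paper's analysis addresses.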
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Data Quality and Management · Adversarial Robustness in Machine Learning
