In-Run Data Shapley for Adam Optimizer
Meng Ding, Zeqing Zhang, Di Wang, Lijie Hu

TL;DR
This paper introduces Adam-Aware In-Run Data Shapley, a data attribution method that accounts for the optimizer's influence on per-sample contributions. It targets adaptive optimizers such as Adam, improving attribution fidelity while keeping training overhead low.
Contribution
We develop a new Adam-aware data attribution method that approximates true contributions under the Adam optimizer, overcoming the limitations of SGD-based proxies.
Findings
Achieves near-perfect fidelity to ground-truth contributions ($R > 0.99$).
Retains approximately 95% of standard training throughput.
Outperforms SGD-based baselines in downstream data attribution tasks.
Abstract
Reliable data attribution is essential for mitigating bias and reducing computational waste in modern machine learning, with the Shapley value serving as the theoretical gold standard. While recent "In-Run" methods bypass the prohibitive cost of retraining by estimating contributions dynamically, they heavily rely on the linear structure of Stochastic Gradient Descent (SGD) and fail to capture the complex dynamics of adaptive optimizers like Adam. In this work, we demonstrate that data attribution is inherently optimizer-dependent: we show that SGD-based proxies diverge significantly from true contributions under Adam (Pearson ), rendering them ineffective for modern training pipelines. To bridge this gap, we propose Adam-Aware In-Run Data Shapley. We derive a closed-form approximation that restores additivity by redefining utility under a fixed-state assumption and…
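To make the optimizer dependence concrete, here is a minimal NumPy sketch. It is not the paper's derivation: the abstract is truncated before the method is spelled out, so the "fixed-state" score below is only one plausible reading, in which Adam's second-moment estimate is frozen during the step so that the update becomes linear (and therefore additive) in the per-sample gradients. The function names, the toy gradients, the sum-of-gradients batch convention, and the zero previous momentum are all assumptions made for illustration.

```python
import numpy as np

def sgd_scores_first_order(per_sample_grads, val_grad, lr):
    # The SGD step is w - lr * sum_i g_i, so the first-order change in
    # validation loss splits per sample as -lr * <g_val, g_i>.
    return -lr * (per_sample_grads @ val_grad)

def adam_scores_fixed_state(per_sample_grads, val_grad, lr, v_hat, step,
                            beta1=0.9, eps=1e-8):
    # ASSUMPTION (illustrative, not taken from the paper): the second-moment
    # estimate v_hat is held fixed, so the gradient-dependent part of the
    # Adam step is a per-coordinate rescaling of each sample's gradient.
    precond = 1.0 / (np.sqrt(v_hat) + eps)
    scale = lr * (1.0 - beta1) / (1.0 - beta1 ** step)  # bias-corrected momentum weight
    return -scale * (per_sample_grads @ (precond * val_grad))

# Hypothetical toy batch: 4 samples, 2 parameters.
grads = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]])
val_grad = np.array([1.0, 1.0])
v_hat = np.array([100.0, 0.01])  # very different curvature per coordinate
lr, step, beta1, eps = 0.1, 1, 0.9, 1e-8

sgd_scores = sgd_scores_first_order(grads, val_grad, lr)
adam_scores = adam_scores_fixed_state(grads, val_grad, lr, v_hat, step, beta1, eps)

# Additivity check: first-order effect of the whole-batch Adam step with
# v_hat frozen and zero previous momentum (both assumptions of this sketch).
m_hat = (1.0 - beta1) * grads.sum(axis=0) / (1.0 - beta1 ** step)
delta_w = -lr * m_hat / (np.sqrt(v_hat) + eps)
batch_effect = float(val_grad @ delta_w)

print("SGD proxy scores: ", sgd_scores)
print("Adam-aware scores:", adam_scores)
print("additive:", np.isclose(adam_scores.sum(), batch_effect))
```

In this toy setup the SGD proxy assigns the same score to three of the four samples, while the preconditioned score separates them because the two coordinates are scaled very differently. That is the kind of mismatch the abstract attributes to SGD-based proxies under Adam, shown here only under the stated assumptions.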
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Advanced Bandit Algorithms Research
