Enhancing Training Data Attribution with Representational Optimization
Weiwei Sun, Haokun Liu, Nikhil Kandpal, Colin Raffel, Yiming Yang

TL;DR
AirRep is a scalable, representation-based method for training data attribution. It learns task-specific, model-aligned representations optimized for attribution, matching state-of-the-art gradient-based methods in attribution performance while being nearly 100 times more efficient at inference.
Contribution
We introduce AirRep, a novel approach that optimizes representations specifically for training data attribution, combining scalability with high fidelity.
Findings
AirRep matches state-of-the-art gradient-based methods in attribution performance.
AirRep is nearly 100 times more efficient at inference than gradient-based approaches.
AirRep demonstrates robustness and generalization across different tasks and models.
Abstract
Training data attribution (TDA) methods aim to measure how training data impacts a model's predictions. While gradient-based attribution methods, such as influence functions, offer theoretical grounding, their computational costs make them impractical for large-scale applications. Representation-based approaches are far more scalable, but typically rely on heuristic embeddings that are not optimized for attribution, limiting their fidelity. To address these challenges, we propose AirRep, a scalable, representation-based approach that closes this gap by learning task-specific and model-aligned representations optimized explicitly for TDA. AirRep introduces two key innovations: a trainable encoder tuned for attribution quality, and an attention-based pooling mechanism that enables accurate estimation of group-wise influence. We train AirRep using a ranking objective over automatically…
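The truncated abstract names attention-based pooling for group-wise influence but does not give its exact form. The following is a minimal sketch of one possible reading, not the authors' implementation: fixed vectors stand in for the output of the trained encoder, the influence of a group of training examples on a test example is taken to be the inner product between a pooled group embedding and the test embedding, and the pooling weights are a softmax over each training example's similarity to the test example. All function names (`attention_pool`, `group_influence`, `mean_influence`) are hypothetical and chosen for illustration only.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(group, query):
    """Pool a group of training embeddings into a single vector.

    Assumption: each embedding is weighted by the softmax of its
    inner product with the test (query) embedding.
    """
    weights = softmax([dot(e, query) for e in group])
    dim = len(query)
    return [sum(w * e[i] for w, e in zip(weights, group)) for i in range(dim)]


def group_influence(group, query):
    """Assumed group score: pooled embedding dotted with the query."""
    return dot(attention_pool(group, query), query)


def mean_influence(group, query):
    """Baseline for comparison: plain average of per-example scores."""
    return sum(dot(e, query) for e in group) / len(group)


# Toy embeddings standing in for encoder outputs (illustrative values).
train_group = [[1.0, 0.0], [0.0, 1.0]]
test_example = [1.0, 0.0]

attn_score = group_influence(train_group, test_example)
mean_score = mean_influence(train_group, test_example)
print(attn_score, mean_score)
```

In this toy case the attention weights favor the training example aligned with the test example, so the pooled score is higher than the uniform average. Whether the actual method weights examples this way cannot be determined from the abstract alone.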
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
