Scalable Data Attribution via Forward-Only Test-Time Inference
Sibo Ma, Julian Nyarko

TL;DR
This paper introduces a scalable data attribution method that moves the expensive computation out of inference and into a training-time simulation, so attributions for large models can be read out in real time using only forward evaluations.
Contribution
The proposed method eliminates per-query backward passes: it simulates each training example's influence on the parameters through short-horizon gradient propagation during training, then reads out attributions for any query with forward evaluations only.
Findings
Matches or surpasses state-of-the-art baselines on attribution metrics
Offers orders-of-magnitude lower inference cost
Applicable to large pretrained models for real-time attribution
Abstract
Data attribution seeks to trace model behavior back to the training examples that shaped it, enabling debugging, auditing, and data valuation at scale. Classical influence-function methods offer a principled foundation but remain impractical for modern networks because they require expensive backpropagation or Hessian inversion at inference. We propose a data attribution method that preserves the same first-order counterfactual target while eliminating per-query backward passes. Our approach simulates each training example's parameter influence through short-horizon gradient propagation during training and later reads out attributions for any query using only forward evaluations. This design shifts computation from inference to simulation, reflecting real deployment regimes where a model may serve billions of user queries but originate from a fixed, finite set of data sources (for…
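The two-phase idea in the abstract (simulate each training example's parameter influence once, then score any query with forward passes only) can be illustrated with a toy sketch. This is not the authors' implementation: the linear model, the squared-error loss, the single gradient step standing in for short-horizon propagation, the step size `eta`, and all function names are assumptions made for illustration.

```python
import numpy as np

def loss(theta, x, y):
    """Squared-error loss of a linear model on one example (forward pass only)."""
    return 0.5 * (x @ theta - y) ** 2

def grad(theta, x, y):
    """Analytic gradient of the loss w.r.t. theta; used only in the simulation phase."""
    return (x @ theta - y) * x

def simulate_perturbations(theta, X_train, y_train, eta=1e-4):
    """Assumed one-step stand-in for short-horizon propagation: the parameter
    shift that a single extra gradient step on each training example causes."""
    return np.stack([-eta * grad(theta, x, y) for x, y in zip(X_train, y_train)])

def forward_only_scores(theta, deltas, x_q, y_q):
    """Attribution readout: change in query loss under each stored shift,
    computed with forward evaluations only (no query-time gradients)."""
    base = loss(theta, x_q, y_q)
    return np.array([loss(theta + d, x_q, y_q) - base for d in deltas])

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(8, 3))
y_train = rng.normal(size=8)
theta = rng.normal(size=3)
x_q, y_q = rng.normal(size=3), 0.5

# Phase 1 (done once, offline): simulate each training example's influence.
deltas = simulate_perturbations(theta, X_train, y_train)

# Phase 2 (per query): forward-only readout.
scores = forward_only_scores(theta, deltas, x_q, y_q)

# First-order reference that needs a backward pass on the query.
reference = deltas @ grad(theta, x_q, y_q)
```

In this sketch a negative score means that an extra step on that training example would lower the query loss. For small `eta`, `scores` agrees with `reference` up to a second-order term, which is the sense in which a forward-only readout can target the same first-order counterfactual as a gradient-based one.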
Taxonomy
Topics: Data Quality and Management · Complex Network Analysis Techniques · Information Retrieval and Search Behavior
