Learning to Weight Parameters for Training Data Attribution

Shuangqi Li; Hieu Le; Jingyi Xu; Mathieu Salzmann

arXiv:2506.05647·cs.LG·February 23, 2026

TL;DR

This paper introduces a method to explicitly learn parameter importance weights for gradient-based data attribution, improving the identification of influential training examples across various tasks without needing labeled data.

Contribution

It presents a novel approach to directly learn parameter importance weights, addressing limitations of existing methods that rely on uniform or Hessian-based implicit weighting.
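As a rough illustration of the three weighting schemes contrasted here, the scores can be written side by side. The notation is assumed for this sketch and is not taken from the paper: $g_i$ and $g_{\text{test}}$ denote loss gradients for training example $z_i$ and for the query output, $H$ is the (approximate) Hessian, the parameters are split into $K$ groups with gradient slices $g^{(k)}$, and $w_k \ge 0$ is a hypothetical per-group importance weight. Signs follow one common convention and may differ between methods.

$$
\begin{aligned}
\tau_{\text{uniform}}(z_{\text{test}}, z_i) &= g_{\text{test}}^{\top} g_i = \sum_{k=1}^{K} g_{\text{test}}^{(k)\top} g_i^{(k)} \\
\tau_{\text{Hessian}}(z_{\text{test}}, z_i) &= g_{\text{test}}^{\top} H^{-1} g_i \\
\tau_{w}(z_{\text{test}}, z_i) &= \sum_{k=1}^{K} w_k \, g_{\text{test}}^{(k)\top} g_i^{(k)}
\end{aligned}
$$

In the first form every parameter group contributes equally, in the second the weighting is implicit in the curvature, and in the third it is an explicit quantity that can be learned.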

Findings

01. Improves attribution accuracy across image, language, and diffusion tasks.

02. Enables fine-grained attribution for concepts like subject and style.

03. Does not require annotated labels for learning parameter importance.

Abstract

We study gradient-based data attribution, aiming to identify which training examples most influence a given output. Existing methods for this task either treat network parameters uniformly or rely on implicit weighting derived from Hessian approximations, which do not fully model functional heterogeneity of network parameters. To address this, we propose a method to explicitly learn parameter importance weights directly from data, without requiring annotated labels. Our approach improves attribution accuracy across diverse tasks, including image classification, language modeling, and diffusion, and enables fine-grained attribution for concepts like subject and style.
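A minimal sketch of how explicit parameter weights could enter a gradient-similarity attribution score, assuming one scalar weight per parameter group and a plain dot-product score. The function name `weighted_attribution`, the group names, and the toy gradients and weights are all hypothetical; the sketch does not show how the paper learns the weights.

```python
import numpy as np

def weighted_attribution(test_grads, train_grads, weights):
    """Score training examples by a weighted sum of per-group gradient dot products.

    test_grads:  dict mapping group name -> array of shape (d_g,)
    train_grads: dict mapping group name -> array of shape (n_train, d_g)
    weights:     dict mapping group name -> non-negative float (assumed given)
    """
    n_train = next(iter(train_grads.values())).shape[0]
    scores = np.zeros(n_train)
    for group, g_test in test_grads.items():
        # Dot product of every training gradient with the test gradient,
        # restricted to this parameter group and scaled by its weight.
        scores += weights[group] * (train_grads[group] @ g_test)
    return scores

# Toy gradients for two hypothetical parameter groups and three training examples.
test_grads = {
    "layer1": np.array([1.0, 0.0]),
    "layer2": np.array([0.0, 2.0]),
}
train_grads = {
    "layer1": np.array([[1.0, 0.0], [0.0, 1.0], [4.0, 0.0]]),
    "layer2": np.array([[0.0, 1.0], [1.0, 1.5], [0.0, 0.0]]),
}

# Uniform weighting treats all groups alike (a plain gradient dot product).
uniform_scores = weighted_attribution(
    test_grads, train_grads, {"layer1": 1.0, "layer2": 1.0}
)
# Illustrative non-uniform weights; in the paper these would be learned from data.
learned_scores = weighted_attribution(
    test_grads, train_grads, {"layer1": 0.1, "layer2": 2.0}
)

print("uniform :", uniform_scores, "top example:", int(np.argmax(uniform_scores)))
print("weighted:", learned_scores, "top example:", int(np.argmax(learned_scores)))
```

In this toy setup, down-weighting `layer1` changes which training example ranks first, which is the kind of effect explicit weighting is meant to expose.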

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01·Rating 6·Confidence 3

Strengths

- This work presents a methodology to automatically learn weights for model parameter groups based on their attribution abilities. Prior work on attribution using influence functions focuses on a predefined set of parameter groups. This work presents a more principled solution.
- Experiments include comparisons against multiple strong attribution baselines.
- The analysis involving subject, style, and background variations in a synthetic dataset shows strong results.

Weaknesses

- The language modeling experiments use GPT-2 small, and it's unclear to me whether the gains will transfer to larger LMs. As the authors highlight in Section 2, recent methods (LoGRA, TrackStar) have improved efficiency. Is there a reason for not experimenting with larger models in this work?
- The paper could benefit from analyzing its relative performance gains across different baseline attribution methods. For instance, they show strong results with TracIn but moderate results with recent meth

Reviewer 02·Rating 6·Confidence 3

Strengths

1. Identifies parameter heterogeneity as a fundamental but overlooked factor in data attribution.
2. Generalizes across TracIn, TRAK, DAS, and others with a single weighting formulation.
3. Strong, consistent results across image and text tasks.
4. Learned weights provide semantic insights into layer-level specialization.

Weaknesses

1. Technically, TRAK and similar Hessian-based methods already introduce an implicit parameter weighting through the approximation of $H^{-1}$, which scales gradients by local curvature. The proposed explicit weighting can thus be viewed as learning functional heterogeneity on top of curvature-based scaling.
2. The self-supervised loss bootstraps from existing methods (e.g., TRAK), so its ultimate accuracy may inherit their biases.
3. Equation (6) defines the weighted attribution as $\til

Reviewer 03·Rating 6·Confidence 3

Strengths

- The paper poses an interesting question (and opportunity) for improving the effectiveness of data attribution. This direction has been discussed but not examined in depth in previous literature.
- The self-supervised mechanism for learning parameter weights is practical and easy to use.
- The improvements over standard data attribution methods are clear.

Weaknesses

- The main weakness lies in the design of the self-supervised weight-learning loss. The analysis uses a signal-to-noise ratio model of the attribution scores and optimizes the parameter weights to maximize that ratio.
- The problem is that the definition of signal is tied to the top-k attribution scores. This choice is not well justified in Section 4.2 or in Appendix A.
- Intuitively, top-k attribution score is very important (to the counterfactual prediction), while the least

Taxonomy

Topics·Machine Learning and Data Classification