Fast Training Dataset Attribution via In-Context Learning
Milad Fotouhi, Mohammad Taha Bahadori, Oluwaseyi Feyisetan, Payman Arabshahi, David Heckerman

TL;DR
This paper introduces two new methods leveraging in-context learning to attribute the influence of training data on large language model outputs, with the mixture model approach showing superior robustness.
Contribution
It presents two novel approaches for training data attribution in LLMs, including a mixture distribution model that outperforms similarity-based methods under noisy conditions.
Findings
The mixture model approach is more robust to retrieval noise than the similarity-based approach.
The methods effectively estimate data contributions in instruction-tuned LLMs.
Abstract
We investigate the use of in-context learning and prompt engineering to estimate the contributions of training data in the outputs of instruction-tuned large language models (LLMs). We propose two novel approaches: (1) a similarity-based approach that measures the difference between LLM outputs with and without provided context, and (2) a mixture distribution model approach that frames the problem of identifying contribution scores as a matrix factorization task. Our empirical comparison demonstrates that the mixture model approach is more robust to retrieval noise in in-context learning, providing a more reliable estimation of data contributions.
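The two ideas in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: it assumes the LLM's answer distribution over a small fixed set of candidate outputs is available both without context and with context retrieved from each of K candidate datasets. The function names `context_shift` and `mixture_weights` are hypothetical. Total variation distance is an assumed stand-in for the paper's similarity measure, and the EM update for mixture weights is an assumed stand-in for the paper's matrix factorization formulation.

```python
import numpy as np


def context_shift(p_no_ctx, p_with_ctx):
    """Total variation distance between the no-context output distribution
    and each context-conditioned distribution.

    p_no_ctx:   shape (V,), distribution over V candidate outputs.
    p_with_ctx: shape (K, V), one distribution per candidate dataset.
    Returns shape (K,). How a shift maps to a contribution score is left
    open here; the distance itself is an assumption of this sketch.
    """
    p_no_ctx = np.asarray(p_no_ctx, dtype=float)
    p_with_ctx = np.asarray(p_with_ctx, dtype=float)
    return 0.5 * np.abs(p_with_ctx - p_no_ctx).sum(axis=1)


def mixture_weights(p_no_ctx, p_with_ctx, n_iter=1000):
    """Fit weights pi on the simplex so that pi @ p_with_ctx approximates
    p_no_ctx, using the standard EM update for mixture weights with fixed
    components (assumed formulation, not taken from the paper).
    """
    p0 = np.asarray(p_no_ctx, dtype=float)
    P = np.asarray(p_with_ctx, dtype=float)
    K = P.shape[0]
    pi = np.full(K, 1.0 / K)  # start from uniform weights
    for _ in range(n_iter):
        mix = pi @ P  # current mixture over outputs, shape (V,)
        # Ratio of target to mixture, with 0 where the mixture has no mass.
        ratio = np.divide(p0, mix, out=np.zeros_like(p0), where=mix > 0)
        pi = pi * (P @ ratio)  # multiplicative EM update
    return pi


if __name__ == "__main__":
    # Toy data: two candidate datasets, three candidate outputs.
    P = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1]])
    p0 = np.array([0.7, 0.3]) @ P  # no-context distribution built as a mixture
    print("shift per dataset:", context_shift(p0, P))
    print("estimated weights:", mixture_weights(p0, P))
```

In this toy setup the no-context distribution is constructed as an exact mixture of the context-conditioned ones, so the fitted weights can be checked against the weights used to build it. With real model outputs the fit would only be approximate.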
Peer Reviews
Decision·Submitted to ICLR 2026
- Using in-context learning to measure the dataset contribution is an interesting idea that has efficiency advantages.
- The authors conducted extensive experiments for empirical evaluation.
- The paper is easy to follow.
- The proposed methods have strong assumptions that are not explicitly validated.
- For the Shapley Context Model, a key (implicit) assumption is that adding the dataset to the context would achieve a similar effect as if the model is finetuned on this dataset. This could be validated empirically (e.g., by comparing the outputs of a base model + dataset context vs a model finetuned on the dataset).
- For the Context Matrix Factorization, the assumptions that in-context learning (and finetuning)
The potential to create a reliable data attribution method that is computationally efficient is good.
I found the explanation of the methods in this paper highly confusing. For 2.1, the most salient issues are:
* Line 074: "context" is not clearly defined. Based on the datasets used, it seems that "context" is a passage that is relevant for answering the question, but in other places "context" appears to refer to in-context demonstrations from a dataset.
* Line 092/Equation 1: While claimed to be a definition of $s_k$, this is not actually a definition as "sim" is also not defined. As a more min
1. The proposed method can be applied to black-box LLMs and does not require re-training.
2. Thoughtful dataset design covering likely seen, synthetic unseen but similar, and guaranteed unseen settings.
3. Demonstrated downstream use for evaluating unlearning methods and strong runtime efficiency for CMF.
1. The presentation of the proposed methodology is not smooth, which hinders comprehension. If the author intends to use mathematical symbols, it is necessary to clearly define them beforehand. Furthermore, many of the mathematical notations appear to be inaccurate. For example, denoting $p(y|q,c)$ as $M(q|c)$ is unconventional, and the notation used in Eq. (1) is imprecise, which may lead to misinterpretation of the author's intended meaning.
2. The rationale behind the described methods, SCM
Taxonomy
Topics: Anomaly Detection Techniques and Applications
