TL;DR
This paper introduces kernel surrogate models for task attribution. They are more accurate than linear surrogate models and influence functions, and they allow scalable estimation of how much each training task influences performance on a target task.
Contribution
It develops a kernel surrogate modeling approach with a gradient-based estimation method, capturing nonlinear task interactions more effectively than prior linear models.
Findings
Kernel surrogate models achieve less than 2% relative error in performance prediction.
They outperform linear surrogates and influence functions by 25% in correlation with ground truth.
Using kernel surrogates for data selection improves downstream task performance by 40%.
Abstract
Modern AI agents such as large language models are trained on diverse tasks -- translation, code generation, mathematical reasoning, and text prediction -- simultaneously. A key question is how to quantify the influence of each individual training task on performance on a target task, a problem we refer to as task attribution. The direct approach, leave-one-out retraining, measures the effect of removing each task, but is computationally infeasible at scale. An alternative approach that builds surrogate models to predict the performance on a target task for any subset of training tasks has emerged in the recent literature. Prior work focuses on linear surrogate models, which capture first-order relationships but miss nonlinear interactions such as XOR-type effects. In this paper, we first consider a unified task-weighting framework for analyzing task-attribution methods and establish a…
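The limitation of linear surrogates described in the abstract can be illustrated with a small toy sketch. This is not the paper's gradient-based estimation method: it assumes a generic kernel ridge regression with an RBF kernel, a made-up target performance that follows an XOR pattern over two of four hypothetical training tasks, and illustrative values for `gamma` and `lam`. All names in the code are hypothetical.

```python
import itertools
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)


# Enumerate all 2^k subsets of k hypothetical training tasks as 0/1 inclusion vectors.
k = 4
S = np.array(list(itertools.product([0, 1], repeat=k)), dtype=float)

# Toy target performance (an assumption for illustration only):
# tasks 0 and 1 help only when exactly one of them is included (XOR).
y = np.logical_xor(S[:, 0] == 1, S[:, 1] == 1).astype(float)

# Linear surrogate: ordinary least squares with an intercept term.
X = np.hstack([np.ones((len(S), 1)), S])
w, *_ = np.linalg.lstsq(X, y, rcond=None)
linear_mse = float(np.mean((X @ w - y) ** 2))

# Kernel surrogate: kernel ridge regression with an RBF kernel.
lam = 1e-6  # small ridge term, chosen arbitrarily for this sketch
K = rbf_kernel(S, S)
alpha = np.linalg.solve(K + lam * np.eye(len(S)), y)
kernel_mse = float(np.mean((K @ alpha - y) ** 2))

print(f"linear surrogate MSE: {linear_mse:.4f}")
print(f"kernel surrogate MSE: {kernel_mse:.2e}")
```

Both surrogates are fit on every one of the 16 subsets, so the gap reflects what each model class can represent rather than how well it generalizes. The best linear fit to an XOR pattern is the constant 0.5, which leaves a mean squared error of 0.25, while the kernel surrogate reproduces the toy target almost exactly.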