One Algorithm, Two Goals: Dual Scoring for Parameter and Data Selection in LLM Fine-Tuning
Xinrui Chen, Liu Yang, Ou Wu

TL;DR
This paper introduces DualSFT, a one-shot dual-scoring algorithm that selects parameters and data jointly for LLM fine-tuning, aiming to cut fine-tuning cost while improving target-task performance.
Contribution
The paper casts parameter selection and data selection as two bilevel selection problems under a common validation objective and derives a shared scoring rule, so a single procedure serves both tasks.
Findings
DualSFT enhances target-task performance on large language models.
It achieves a better stability-plasticity trade-off than baseline methods.
Full DualSFT outperforms sequential hybrid baselines under equal budgets.
Abstract
In Large Language Model (LLM) fine-tuning, parameter and data selection are common strategies for reducing fine-tuning cost, yet they are typically driven by separate scoring mechanisms. When a parameter mask and data subset jointly determine restricted fine-tuning, this separation incurs redundant overhead and makes coordinated selection difficult. We cast parameter and data selection as two bilevel selection problems under a common validation objective and derive a shared local response-surrogate scoring rule. Under first- and second-order validation-improvement approximations, parameter importance and data utility emerge as column-wise and row-wise aggregations of a single gradient interaction matrix, yielding a closed-form row-column correspondence for co-extracting both signals. Building on this structure, we propose DualSFT (Dual-Selection Fine-Tuning), a one-shot dual-scoring…
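The row-column structure described in the abstract can be illustrated with a small sketch. This is not the paper's scoring rule: it assumes a first-order interaction matrix whose entry (i, j) is the product of training sample i's gradient and the validation gradient at parameter j, and it assumes plain signed sums as the aggregation. The names interaction_matrix, dual_scores, and top_k, and the toy gradient values, are hypothetical.

```python
# Illustrative sketch only. ASSUMPTION: the interaction matrix is
# M[i][j] = train_grads[i][j] * val_grad[j] (a first-order alignment term).
# The paper's actual matrix and aggregation may differ.

def interaction_matrix(train_grads, val_grad):
    """Build M with one row per training sample and one column per parameter."""
    return [[g * v for g, v in zip(row, val_grad)] for row in train_grads]


def dual_scores(train_grads, val_grad):
    """Return (data_scores, param_scores) from a single matrix.

    Row sums give one score per sample (here, the dot product of the
    sample gradient with the validation gradient). Column sums give one
    score per parameter.
    """
    m = interaction_matrix(train_grads, val_grad)
    data_scores = [sum(row) for row in m]          # row-wise aggregation
    param_scores = [sum(col) for col in zip(*m)]   # column-wise aggregation
    return data_scores, param_scores


def top_k(scores, k):
    """Indices of the k largest scores, highest first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy per-sample gradients (3 samples x 3 parameters) and a validation gradient.
train_grads = [
    [0.5, -0.2, 0.0],
    [-0.1, 0.4, 0.3],
    [0.2, 0.1, -0.6],
]
val_grad = [1.0, 0.5, -1.0]

data_scores, param_scores = dual_scores(train_grads, val_grad)
keep_samples = top_k(data_scores, k=2)  # data subset
keep_params = top_k(param_scores, k=2)  # parameter mask
```

Both score vectors come from one pass over the same matrix, which is the point of the shared rule: the gradients are computed once and then aggregated along two axes. Whether to use signed sums, magnitudes, or a second-order correction is a design choice this sketch does not settle.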
