ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMs

Zige Wang; Qi Zhu; Fei Mi; Minghui Xu; Ruochun Jin; Wenjing Yang

arXiv:2506.10288 · cs.CL · June 13, 2025

TL;DR

ClusterUCB is an efficient gradient-based data selection framework for targeted fine-tuning of large language models. It clusters the training data pool and allocates the gradient-computation budget across clusters with a modified UCB algorithm, reducing computational cost while maintaining performance.

Contribution

The paper proposes a novel data selection framework combining clustering and a modified UCB algorithm to improve efficiency in gradient-based data influence approximation.

Findings

01. Achieves comparable fine-tuning results with reduced computational resources.

02. Effectively balances exploration and exploitation during data selection.

03. Demonstrates efficiency across various benchmark datasets.

Abstract

Gradient-based data influence approximation has been leveraged to select useful data samples in the supervised fine-tuning of large language models. However, the computation of gradients throughout the fine-tuning process requires too many resources to be feasible in practice. In this paper, we propose an efficient gradient-based data selection framework with clustering and a modified Upper Confidence Bound (UCB) algorithm. Based on the intuition that data samples with similar gradient features will have similar influences, we first perform clustering on the training data pool. Then, we frame the inter-cluster data selection as a constrained computing budget allocation problem and consider it a multi-armed bandit problem. A modified UCB algorithm is leveraged to solve this problem. Specifically, during the iterative sampling process, historical data influence information is recorded to…
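
The two stages described in the abstract, clustering followed by bandit-style budget allocation, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract is truncated before it explains how UCB is modified, so the code falls back on the standard UCB1 index. The names `kmeans`, `ucb_select`, and `influence_fn` are hypothetical, k-means is an assumed choice of clustering algorithm, and `influence_fn` stands in for whatever expensive gradient-based influence computation is being budgeted.

```python
import math
import random


def kmeans(features, k, iters=20, seed=0):
    """Tiny k-means (assumed clustering choice); returns one cluster id per sample."""
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(features, k)]
    assign = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, x in enumerate(features):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])),
            )
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [features[i] for i in range(len(features)) if assign[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def ucb_index(mean, pulls, t, c):
    """Standard UCB1 index; unvisited arms get priority."""
    if pulls == 0:
        return math.inf
    return mean + c * math.sqrt(2 * math.log(t) / pulls)


def ucb_select(clusters, influence_fn, budget, c=1.0, seed=0):
    """Spend `budget` influence evaluations across clusters (arms) with plain UCB1.

    clusters: dict mapping cluster id -> list of sample indices.
    influence_fn: hypothetical stand-in for a gradient-based influence score.
    Returns a dict mapping each evaluated sample index to its score.
    """
    rng = random.Random(seed)
    # Shuffle each cluster so that a pull draws a random unevaluated sample.
    remaining = {cid: rng.sample(idxs, len(idxs)) for cid, idxs in clusters.items()}
    pulls = {cid: 0 for cid in clusters}
    mean = {cid: 0.0 for cid in clusters}
    scores = {}
    for t in range(1, budget + 1):
        live = [cid for cid in remaining if remaining[cid]]
        if not live:
            break  # every sample has already been evaluated
        cid = max(live, key=lambda a: ucb_index(mean[a], pulls[a], t, c))
        i = remaining[cid].pop()
        reward = influence_fn(i)
        scores[i] = reward
        # Record the historical influence as a running mean for this cluster.
        pulls[cid] += 1
        mean[cid] += (reward - mean[cid]) / pulls[cid]
    return scores
```

In this sketch, the samples finally selected for fine-tuning would be the highest-scoring entries of the returned dictionary. Clusters whose sampled members show low influence receive fewer further evaluations, which is where the compute saving would come from.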

Taxonomy

Topics: Machine Learning and Data Classification · Natural Language Processing Techniques · Machine Learning and Algorithms