Improving Task Diversity in Label Efficient Supervised Finetuning of LLMs

Abhinav Arabelly; Jagrut Nemade; Robert D Nowak; Jifan Zhang

arXiv:2507.21482 · cs.CL · July 30, 2025

TL;DR

This paper introduces a simple, confidence-based sampling method leveraging task diversity for label-efficient supervised finetuning of LLMs, achieving comparable or better performance with significantly reduced annotation costs.

Contribution

It proposes a novel inverse confidence weighting sampling strategy that improves data efficiency in supervised finetuning of LLMs by utilizing task labels and model confidence levels.

Findings

01 Achieves up to 4% higher accuracy than full dataset training.

02 Reduces annotation costs by up to 80%.

03 Performs consistently at or above state-of-the-art across datasets.

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse domains, but developing high-performing models for specialized applications often requires substantial human annotation, a process that is time-consuming, labor-intensive, and expensive. In this paper, we address the label-efficient learning problem for supervised finetuning (SFT) by leveraging task diversity as a fundamental principle for effective data selection. This is markedly different from existing methods, which are based on prompt diversity. Our approach rests on two key observations: 1) task labels for different prompts are often readily available; 2) pre-trained models have significantly varying levels of confidence across tasks. We combine these facts to devise a simple yet effective sampling strategy: we select examples across tasks using an inverse confidence weighting strategy. This…
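
To make the inverse confidence weighting idea concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact weighting formula, so the sketch assumes each task's share of the annotation budget is proportional to 1 / confidence, where confidence is some per-task score in (0, 1] obtained from the pre-trained model. The function names `inverse_confidence_allocation` and `select_prompts`, the largest-remainder rounding, and the uniform random sampling within each task are all illustrative assumptions.

```python
import random


def inverse_confidence_allocation(task_confidence, budget):
    """Split an annotation budget across tasks in proportion to 1 / confidence.

    Assumption: `task_confidence` maps task label -> mean model confidence in (0, 1].
    """
    # Lower confidence -> larger weight. The floor avoids division by zero.
    inv = {t: 1.0 / max(c, 1e-8) for t, c in task_confidence.items()}
    total = sum(inv.values())
    raw = {t: budget * w / total for t, w in inv.items()}

    # Largest-remainder rounding so the integer allocations sum to `budget`.
    alloc = {t: int(r) for t, r in raw.items()}
    leftover = budget - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda t: raw[t] - alloc[t], reverse=True)
    for t in by_remainder[:leftover]:
        alloc[t] += 1
    return alloc


def select_prompts(prompts_by_task, task_confidence, budget, seed=0):
    """Pick prompts to annotate: uniform sampling within each task's allocation.

    If a task has fewer prompts than its allocation, all of its prompts are
    taken and the shortfall is not redistributed (a simplification).
    """
    rng = random.Random(seed)
    alloc = inverse_confidence_allocation(task_confidence, budget)
    selected = []
    for task, k in alloc.items():
        pool = prompts_by_task[task]
        selected.extend(rng.sample(pool, min(k, len(pool))))
    return selected


# Hypothetical example: the model is least confident on "math".
confidence = {"math": 0.25, "history": 0.5, "law": 0.5}
pools = {t: [f"{t}-prompt-{i}" for i in range(10)] for t in confidence}
allocation = inverse_confidence_allocation(confidence, budget=8)
chosen = select_prompts(pools, confidence, budget=8)
```

In this toy setup the task with half the confidence of the others receives twice the budget share, which is the intended effect: annotation effort shifts toward tasks where the pre-trained model is least sure.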
