SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe

Yuxin Xiao; Shujian Zhang; Wenxuan Zhou; Marzyeh Ghassemi; Sanqiang Zhao

arXiv:2410.05248·cs.CL·April 21, 2026

TL;DR

SFTMix is a Mixup-based regularization method for instruction tuning of large language models. It interpolates examples with different confidence levels to improve performance without relying on well-curated datasets.

Contribution

The paper proposes SFTMix, a novel Mixup recipe that leverages training dynamics to interpolate examples with different confidence levels, enhancing instruction tuning without curated datasets.

Findings

01 SFTMix improves instruction-following performance across multiple LLMs.

02 SFTMix improves performance on healthcare-specific SFT tasks.

03 The method is compatible with data selection and scalable to various applications.

Abstract

To acquire instruction-following capabilities, large language models (LLMs) undergo instruction tuning, where they are trained on instruction-response pairs using next-token prediction (NTP). Efforts to improve instruction tuning often focus on higher-quality supervised fine-tuning (SFT) datasets, typically requiring data filtering with proprietary LLMs or human annotation. In this paper, we take a different approach by proposing SFTMix, a novel Mixup-based recipe that elevates LLM instruction tuning without relying on well-curated datasets. We observe that LLMs exhibit uneven confidence across the semantic representation space. We argue that examples with different confidence levels should play distinct roles in instruction tuning: Confident data is prone to overfitting, while unconfident data is harder to generalize. Based on this insight, SFTMix leverages training dynamics to…
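To make the idea of interpolating examples with different confidence levels more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract is truncated before it describes the actual procedure, so every detail below is an assumption made for illustration. In particular, the per-checkpoint confidence scores, the median split, the helper names (`split_by_confidence`, `mixup`, `sample_lambda`), and the Beta-distributed mixing weight (borrowed from the standard Mixup formulation) are all hypothetical choices.

```python
import random
from statistics import mean, median


def split_by_confidence(confidences):
    """Split example indices into (confident, unconfident) groups.

    ASSUMPTION: `confidences` maps an example index to a list of confidence
    scores collected at several training checkpoints. How the paper measures
    confidence is not stated in the text above.
    """
    avg = {i: mean(scores) for i, scores in confidences.items()}
    cutoff = median(avg.values())  # ASSUMPTION: split at the median
    confident = [i for i, c in avg.items() if c >= cutoff]
    unconfident = [i for i, c in avg.items() if c < cutoff]
    return confident, unconfident


def mixup(vec_a, vec_b, lam):
    """Standard Mixup interpolation: lam * a + (1 - lam) * b, elementwise."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(vec_a, vec_b)]


def sample_lambda(alpha=0.5, rng=random):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha).

    ASSUMPTION: this follows the generic Mixup recipe, not a value or
    distribution confirmed by the source.
    """
    return rng.betavariate(alpha, alpha)


if __name__ == "__main__":
    # Toy confidence traces for four examples over two checkpoints.
    traces = {0: [0.9, 0.95], 1: [0.2, 0.3], 2: [0.8, 0.85], 3: [0.1, 0.4]}
    confident, unconfident = split_by_confidence(traces)

    # Toy representations (and one-hot targets) for one pair of examples.
    rep_confident, rep_unconfident = [1.0, 0.0], [0.0, 1.0]
    lam = sample_lambda()
    print(confident, unconfident)
    print(mixup(rep_confident, rep_unconfident, lam))
```

In this sketch, each confident example would be paired with an unconfident one and the same mixing weight applied to both the representations and the targets, which is how generic Mixup acts as a regularizer. Which representations SFTMix mixes, and how the mixed examples enter the training loss, is not recoverable from the truncated abstract.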
