DynamixSFT: Dynamic Mixture Optimization of Instruction Tuning Collections

Haebin Shin; Lei Ji; Xiao Liu; Zhiwei Yu; Qi Chen; Yeyun Gong

arXiv:2508.12116 · cs.LG · August 19, 2025

TL;DR

DynamixSFT is a dynamic, automated method for optimizing instruction-tuning dataset mixtures, formulated as a multi-armed bandit problem, leading to performance improvements across multiple benchmarks.

Contribution

We introduce a novel bandit-based approach with Prior-scaled Boltzmann Exploration for dataset mixture optimization in instruction tuning, preserving diversity and enhancing performance.
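The summary above does not give the exact form of Prior-scaled Boltzmann Exploration. One plausible reading, offered here as an assumption rather than the paper's stated formula, scales a standard Boltzmann (softmax) distribution by the original dataset proportions. The symbols are illustrative: \pi_i is the original proportion of dataset i, Q_i is its current estimated reward, and \tau is a temperature.

```latex
p_i \;=\; \frac{\pi_i \, \exp\!\left(Q_i / \tau\right)}{\sum_{j=1}^{K} \pi_j \, \exp\!\left(Q_j / \tau\right)},
\qquad i = 1, \dots, K
```

Under this form, equal reward estimates across all K datasets give p_i = \pi_i, so the sampler falls back to the original mixture. That matches the described goal of anchoring the distribution to the original proportions.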

Findings

1. Achieves up to a 2.2% performance improvement across 10 benchmarks
2. Effectively balances dataset diversity and model performance
3. Provides insights into adaptive dataset mixture dynamics

Abstract

As numerous instruction-tuning datasets continue to emerge during the post-training stage, dynamically balancing and optimizing their mixtures has become a critical challenge. To address this, we propose DynamixSFT, a dynamic and automated method for instruction-tuning dataset mixture optimization. We formulate the problem as a multi-armed bandit setup and introduce a Prior-scaled Boltzmann Exploration that softly anchors the updated sampling distribution to the original dataset proportions, thereby preserving the inherent diversity and coverage of the collection. Sampling probabilities are updated using a lightweight 1-Step Look-ahead Reward, reflecting how much the dataset contributes to improving the model's performance at its current state. When applied to the Tulu-v2-mixture collection comprising 16 instruction-tuning datasets, DynamixSFT achieves up to a 2.2% performance…
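The loop described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the sampling rule (prior times exp(value / temperature)), the moving-average value update, and every name here (`prior_scaled_boltzmann`, `update_value`, `run_toy_bandit`, `true_gain`) are hypothetical. The reward is simulated with fixed per-dataset gains plus noise. In the actual method it would be the 1-Step Look-ahead Reward, whose definition is not given in the text above.

```python
import math
import random


def prior_scaled_boltzmann(priors, values, temperature=1.0):
    """Assumed form: p_i proportional to prior_i * exp(value_i / temperature).

    Priors must be strictly positive because the computation runs in log space.
    """
    logits = [math.log(p) + v / temperature for p, v in zip(priors, values)]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]


def update_value(values, arm, reward, step_size=0.1):
    """Move one arm's reward estimate toward the observed reward (assumed EMA rule)."""
    updated = list(values)
    updated[arm] += step_size * (reward - updated[arm])
    return updated


def run_toy_bandit(num_steps=200, temperature=0.5, seed=0):
    """Simulate the mixture update on three hypothetical datasets."""
    rng = random.Random(seed)
    sizes = [1000, 300, 50]          # hypothetical dataset sizes
    true_gain = [0.1, 0.5, 0.2]      # simulated reward per dataset (toy stand-in)
    total = sum(sizes)
    priors = [s / total for s in sizes]  # original mixture proportions
    values = [0.0] * len(sizes)

    for _ in range(num_steps):
        probs = prior_scaled_boltzmann(priors, values, temperature)
        arm = rng.choices(range(len(sizes)), weights=probs, k=1)[0]
        reward = true_gain[arm] + rng.gauss(0.0, 0.01)
        values = update_value(values, arm, reward)

    return priors, prior_scaled_boltzmann(priors, values, temperature)


if __name__ == "__main__":
    original, adapted = run_toy_bandit()
    print("original mixture:", [round(p, 3) for p in original])
    print("adapted mixture: ", [round(p, 3) for p in adapted])
```

With all value estimates at zero, the sketch samples exactly from the original proportions. Datasets with higher estimated reward are then upweighted relative to their prior, while small datasets keep a nonzero share.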
