SPICE: Submodular Penalized Information-Conflict Selection for Efficient Large Language Model Training

Powei Chang; Jinpeng Zhang; Bowen Chen; Chenyu Wang; Chenlu Guo; Yixing Zhang; Yukang Gao; JianXiang Xiang; Yue Gao; Chaoqun Sun; Yiyi Chen; and Dongying Kong

arXiv:2601.23155 · cs.LG · April 9, 2026

TL;DR

SPICE is a novel data selection method for large language model training that reduces gradient conflicts, leading to more efficient training with less data and comparable or better performance.

Contribution

The paper introduces a conflict-aware data selector that maximizes information gain while penalizing gradient misalignment, improving the efficiency of instruction tuning.

Findings

01. SPICE selects higher-information subsets than existing criteria.

02. Using only 10% of data, SPICE matches or exceeds full-data tuning performance.

03. SPICE achieves significant training cost reductions across multiple benchmarks.

Abstract

Information-based data selection for instruction tuning is compelling: maximizing the log-determinant of the Fisher information yields a monotone submodular objective, enabling greedy algorithms to achieve a $(1 - 1/e)$ approximation under a cardinality budget. In practice, however, we find that alleviating gradient conflicts (misalignment between per-sample gradients) is a key factor in slowing the decay of marginal log-determinant information gains, thereby preventing significant loss of information. We formalize this via an $\varepsilon$-decomposition that quantifies the deviation from ideal submodularity as a function of conflict statistics, yielding data-dependent approximation factors that tighten as conflicts diminish. Guided by this analysis, we propose SPICE, a conflict-aware selector that maximizes information while penalizing misalignment, and that supports early stopping…
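To make the idea of greedy log-determinant selection with a conflict penalty concrete, here is a minimal sketch. It is not the authors' implementation. The function name `greedy_logdet_select`, the inputs `G` (one per-sample gradient feature vector per row), `lam` (penalty weight), and `ridge` (regularizer), and in particular the form of the penalty (mean negative cosine similarity to already-selected samples) are all assumptions made for illustration; the abstract does not specify how SPICE measures or penalizes misalignment.

```python
import numpy as np


def greedy_logdet_select(G, k, lam=0.0, ridge=1.0):
    """Greedily pick k rows of G under a log-det objective with a conflict penalty.

    G     : (n, d) array of per-sample gradient features (hypothetical input).
    lam   : weight of the ASSUMED conflict penalty (not taken from the paper).
    ridge : regularizer so the information matrix ridge*I + sum g g^T is invertible.
    """
    G = np.asarray(G, dtype=float)
    n, d = G.shape
    k = min(k, n)

    A_inv = np.eye(d) / ridge  # inverse of the current information matrix
    norms = np.maximum(np.linalg.norm(G, axis=1, keepdims=True), 1e-12)
    unit = G / norms           # unit-norm gradients for cosine similarity

    selected = []
    remaining = np.ones(n, dtype=bool)
    conflict_sum = np.zeros(n)  # running sum of max(0, -cos) against selected rows

    for _ in range(k):
        # Matrix determinant lemma:
        # log det(A + g g^T) - log det(A) = log(1 + g^T A^{-1} g)
        quad = np.einsum("ij,jk,ik->i", G, A_inv, G)
        gain = np.log1p(quad)

        # Assumed penalty: average conflict with the samples chosen so far.
        penalty = conflict_sum / max(len(selected), 1)
        score = np.where(remaining, gain - lam * penalty, -np.inf)

        i = int(np.argmax(score))
        selected.append(i)
        remaining[i] = False

        # Sherman-Morrison rank-one update of the inverse.
        Ag = A_inv @ G[i]
        A_inv -= np.outer(Ag, Ag) / (1.0 + G[i] @ Ag)

        # Accumulate conflict of every sample with the newly selected one.
        conflict_sum += np.maximum(0.0, -(unit @ unit[i]))

    return selected
```

With `lam=0` this reduces to plain greedy log-determinant maximization, the setting in which the $(1 - 1/e)$ guarantee is stated. With `lam > 0` the score is no longer guaranteed to be submodular, which is the kind of deviation the abstract's $\varepsilon$-decomposition is meant to quantify.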

Videos

SPICE: Submodular Penalized Information-Conflict Selection for Efficient Large Language Model Training · SlidesLive