ProDS: Preference-oriented Data Selection for Instruction Tuning

Wenya Guo; Zhengkun Zhang; Xumeng Liu; Ying Zhang; Ziyu Lu; Haoze Zhu; Xubo Liu; Ruxue Yan

arXiv:2505.12754 · cs.LG · May 20, 2025

Open Access

TL;DR

ProDS introduces a preference-oriented data selection approach for instruction tuning that aligns training data with human preferences, improving task performance by focusing on response diversity and preference modeling.

Contribution

The paper presents ProDS, a data selection method for instruction tuning that, unlike existing methods, incorporates human preferences into the selection of training data.

Findings

01 ProDS outperforms existing data selection methods in experiments.

02 Preference alignment improves instruction tuning effectiveness.

03 Bidirectional preference synthesis enhances sample scoring accuracy.

Abstract

Instruction data selection aims to identify a high-quality subset from the training set that matches or exceeds the performance of the full dataset on target tasks. Existing methods focus on the instruction-to-response mapping, but neglect the human preference for diverse responses. In this paper, we propose a Preference-oriented Data Selection method (ProDS) that scores training samples based on their alignment with preferences observed in the target set. Our key innovation lies in shifting the data selection criteria from merely estimating features for accurate response generation to explicitly aligning training samples with human preferences in target tasks. Specifically, direct preference optimization (DPO) is employed to estimate human preferences across diverse responses. In addition, a bidirectional preference synthesis strategy is designed to score training samples according to both…
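For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair, computed from sequence log-probabilities under a trainable policy and a frozen reference model. It illustrates only the generic objective named in the abstract; it is not the authors' scoring procedure, which this page does not describe. The function names, argument names, and the default beta=0.1 are assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    All inputs are log-probabilities of a full response given the prompt.
    beta is an assumed illustrative value, not one taken from the paper.
    """
    # Implicit reward of each response: beta times the policy/reference log-ratio.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) equals softplus(-margin).
    return softplus(-margin)
```

The loss equals log 2 when the policy and the reference model agree, and it falls as the policy assigns relatively more probability to the preferred response than the reference does.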

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Intelligent Tutoring Systems and Adaptive Learning

Methods: Focus · Sparse Evolutionary Training