Data Selection for LLM Alignment Using Fine-Grained Preferences

Jia Zhang; Yao Liu; Chen-Xi Zhang; Yi Liu; Yi-Xuan Jin; Lan-Zhe Guo; Yu-Feng Li

arXiv:2508.07638·cs.LG·March 3, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces a data selection method based on preference divergence to improve LLM alignment with fine-grained human preferences, achieving better results with less data.

Contribution

It quantifies conflicts between fine-grained preference aspects as preference divergence (PD) and proposes a PD-based data selection strategy that improves both the efficiency and the effectiveness of alignment.

Findings

01

Achieves better alignment using 30% of the data than full-data methods do.

02

Theoretically guarantees near-optimal data selection based on preference divergence.

03

Empirically demonstrates consistent improvements across various datasets.

Abstract

Large language model (LLM) alignment aims to ensure that the behavior of LLMs meets human preferences. While collecting data from multiple fine-grained, aspect-specific preferences is becoming increasingly feasible, existing alignment methods typically work on a single preference and thus struggle with the conflicts inherent in such aggregated datasets. As an early attempt, in this paper we propose a data-centric approach to align LLMs through the effective use of fine-grained preferences. Specifically, we formulate the problem as direct fine-grained preference optimization and introduce preference divergence (PD), which quantifies inter-aspect preference conflicts. Instead of directly tackling the resulting complicated optimization, we recast it as a data selection problem and propose a simple yet effective strategy, which identifies a subset of data corresponding to the most negative PD…
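The selection step described in the abstract can be illustrated with a minimal sketch. It assumes each training sample already has a scalar PD score; how PD is computed is not given in this excerpt, so the scores below are made-up placeholders. The function name `select_most_negative_pd` and the `ratio` parameter are hypothetical, and the default of 0.3 simply mirrors the 30% figure quoted in the findings. This is not the authors' implementation.

```python
def select_most_negative_pd(pd_scores, ratio=0.3):
    """Return indices of the `ratio` fraction of samples with the most negative PD.

    pd_scores: list of per-sample PD values (assumed precomputed elsewhere).
    ratio: fraction of the dataset to keep; 0.3 is an illustrative default.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    if not pd_scores:
        return []
    # Keep at least one sample so a tiny dataset never yields an empty subset.
    k = max(1, int(round(ratio * len(pd_scores))))
    # Ascending sort puts the most negative PD values first.
    order = sorted(range(len(pd_scores)), key=lambda i: pd_scores[i])
    return order[:k]


# Placeholder PD scores for ten samples (illustrative values only).
pd_scores = [0.8, -1.2, 0.1, -0.4, 2.0, -2.5, 0.0, 0.3, -0.1, 1.1]
selected = select_most_negative_pd(pd_scores, ratio=0.3)
print(selected)  # [5, 1, 3]
```

The returned indices would then define the subset passed to whatever preference-optimization trainer is in use; that downstream step is outside the scope of this sketch.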

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

A balanced sampling strategy is applied to mitigate the intrinsic bias towards longer responses that are favored regardless of quality. A penalty term is introduced into the reward model to discourage length bias as well. The paper is well written and easy to follow.

Weaknesses

See the questions below.

Reviewer 02 · Rating 4 · Confidence 3

Strengths

1. The authors clearly demonstrate motivation for the problem, where aggregating fine-grained preferences introduces conflicts, redundancy, and noise that degrade LLM alignment.
2. The development of loss bounds and the selection optimality result underpin the proposed data selection strategy with rigorous analysis, providing compelling mathematical justification for selecting samples by most-negative PD.
3. Extensive evaluation: The method is thoroughly evaluated against full-data and alternati

Weaknesses

1. I am not familiar with this research scope, but the current evaluation focuses on UltraFeedback and HelpSteer, and their derived conflict settings are limited. The authors should conduct experiments on more advanced benchmarks to demonstrate the method's effectiveness more clearly.
2. The empirical studies do not report in depth on the sensitivity of the method to hyperparameters (e.g., $\lambda$, quantile level $\gamma$, length penalty $\rho$, sampling ratio $p_r$), aside from the generic selec

Reviewer 03 · Rating 8 · Confidence 3

Strengths

1. The detailed problem formulation and the novel recasting of the problem as data selection, rather than algorithm development, are interesting.
2. Empirical coverage is thorough.

Weaknesses

1. With ever-larger models and compute, scaling laws may simply “wash out” moderate preference noise; the urgency of the problem is not demonstrated.
2. The method operates on a fixed dataset and is demonstrated only with the now “classical” DPO pipeline. Readers working with on-policy RL extensions are unlikely to see an immediate hook. Extending DFPO to iterative regimes like iterative DPO would greatly widen its appeal.
3. If a dataset contains several conflicting preferences, DFPO appear

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques