Alignment Data Map for Efficient Preference Data Selection and Diagnosis

Seohyeong Lee; Eunwon Kim; Hwaran Lee; Buru Chang

arXiv:2505.23114·cs.CL·April 21, 2026

TL;DR

The paper introduces Alignment Data Map, a tool for selecting high-quality preference data to efficiently train aligned language models, reducing costs and improving annotation accuracy.

Contribution

It proposes a novel data analysis method that identifies effective preference data and detects label misannotations, enhancing alignment training efficiency.

Findings

01

Training on only the 33% of samples with high quality and low variability achieves comparable or superior alignment performance.

02

Alignment Data Map detects label misannotations by analyzing label-score correlations.

03

Results on MT-Bench, Evol-Instruct, and AlpacaEval demonstrate the improved training efficiency.
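
To make the misannotation finding concrete, here is a minimal sketch of one way a label-score check could work. It is not the authors' procedure: it assumes each preference pair already carries an alignment score for its chosen and rejected response, and it simply flags pairs where the rejected response outscores the chosen one. The field names (`chosen_score`, `rejected_score`), the function name, and the `margin` parameter are all hypothetical.

```python
def flag_possible_misannotations(pairs, margin=0.0):
    """Return ids of pairs whose label disagrees with the alignment scores.

    Assumption (not from the paper): a pair is suspicious when the response
    labeled "rejected" scores higher than the one labeled "chosen" by more
    than `margin`.
    """
    flagged = []
    for pair in pairs:
        gap = pair["rejected_score"] - pair["chosen_score"]
        if gap > margin:
            flagged.append(pair["id"])
    return flagged


# Hypothetical scores on a 1-10 scale.
example_pairs = [
    {"id": "p1", "chosen_score": 8.0, "rejected_score": 3.0},
    {"id": "p2", "chosen_score": 2.0, "rejected_score": 9.0},  # label contradicts scores
    {"id": "p3", "chosen_score": 6.0, "rejected_score": 6.5},  # within the margin
]
suspicious = flag_possible_misannotations(example_pairs, margin=1.0)
```

Flagged pairs are only candidates for review; a large score gap can also reflect an unreliable scorer rather than a wrong label.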

Abstract

Human preference data is essential for aligning large language models (LLMs) with human values, but collecting such data is often costly and inefficient, which motivates efficient data selection methods that reduce annotation costs while preserving alignment effectiveness. To address this issue, we propose Alignment Data Map, a data analysis tool for identifying and selecting effective preference data. We first evaluate alignment scores of the preference data using LLM-as-a-judge, explicit reward model, and reference-based approaches. The Alignment Data Map then considers both response quality and inter-response variability, computed from these alignment scores. In our experiments, training on only the 33% of samples that exhibit high quality and low variability achieves comparable or superior alignment performance on MT-Bench, Evol-Instruct, and AlpacaEval, compared to training with…
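
The selection step described in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: quality is taken to be the mean of a sample's per-response alignment scores, variability their population standard deviation, and the two are combined by summing ranks. The exact definitions and selection rule used by the authors are not given in the text above, and the names `data_map_coordinates`, `select_subset`, and the `scores` field are hypothetical.

```python
from statistics import mean, pstdev


def data_map_coordinates(scores):
    """Map one sample's response scores to (quality, variability).

    Assumption: quality = mean score, variability = population std. dev.
    """
    return mean(scores), pstdev(scores)


def select_subset(samples, fraction=0.33):
    """Keep `fraction` of samples, preferring high quality and low variability."""
    coords = [data_map_coordinates(s["scores"]) for s in samples]
    n = len(samples)
    # Rank 0 is best: highest quality, lowest variability.
    by_quality = sorted(range(n), key=lambda i: -coords[i][0])
    by_variability = sorted(range(n), key=lambda i: coords[i][1])
    quality_rank = {i: r for r, i in enumerate(by_quality)}
    variability_rank = {i: r for r, i in enumerate(by_variability)}
    # Hypothetical combination rule: sum of the two ranks.
    order = sorted(range(n), key=lambda i: quality_rank[i] + variability_rank[i])
    keep = max(1, round(fraction * n))
    return [samples[i] for i in order[:keep]]


# Hypothetical alignment scores for the responses of six prompts.
samples = [
    {"id": "a", "scores": [9, 8]},
    {"id": "b", "scores": [9, 1]},
    {"id": "c", "scores": [2, 1]},
    {"id": "d", "scores": [8, 8]},
    {"id": "e", "scores": [5, 3]},
    {"id": "f", "scores": [7, 2]},
]
selected_ids = {s["id"] for s in select_subset(samples)}
```

With these toy scores, the two samples kept are the ones whose responses are both highly scored and close to each other; a sample with one strong and one weak response is ranked lower despite its high best score.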

Code & Models

Repositories

01choco/Alignment-Data-Map
github
