REAL: Response Embedding-based Alignment for LLMs

Honggen Zhang; Xufeng Zhao; Igor Molybog; June Zhang

arXiv:2409.17169·cs.CL·June 5, 2025

Open Access · 1 Repo

TL;DR

This paper introduces REAL, a response embedding-based method that improves LLM alignment by selecting less ambiguous response pairs, reducing annotation bias and effort, and enhancing alignment quality.

Contribution

REAL proposes a novel embedding-based selection strategy for response pairs that improves annotation efficiency and alignment accuracy in LLMs.

Findings

01. Selecting dissimilar response pairs improves LLM alignment.
02. The method reduces labeling errors and annotation effort.
03. Enhanced performance on dialogue tasks with less annotation work.

Abstract

Aligning large language models (LLMs) to human preferences is a crucial step in building helpful and safe AI tools, and it usually involves training on supervised datasets. Popular algorithms such as Direct Preference Optimization (DPO) rely on pairs of AI-generated responses ranked according to human annotation. The response-pair annotation process can introduce human bias, and building a correct preference dataset is the costly part of the alignment pipeline. To improve annotation efficiency and quality in LLM alignment, we propose REAL: Response Embedding-based Alignment for LLMs, a strategy for constructing a high-quality training dataset that focuses on acquiring the less ambiguous preference pairs for labeling out of a set of response candidates. Our selection process is based on the similarity of response embeddings, independently of prompts, which guarantees the selection process in…
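The selection idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the excerpt does not state the exact criterion, so the code assumes cosine similarity as the measure and assumes the pair with the lowest similarity is the one sent for labeling, in line with the "dissimilar response pairs" finding above. The function names `cosine_similarity` and `select_most_dissimilar_pair` and the toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must have non-zero norm")
    return dot / (norm_u * norm_v)


def select_most_dissimilar_pair(embeddings):
    """Return (i, j, similarity) for the least similar pair of responses.

    Assumption: `embeddings` holds one vector per candidate response to the
    same prompt. Only the responses are embedded; the prompt is not used.
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two candidate responses")
    scored = (
        (i, j, cosine_similarity(embeddings[i], embeddings[j]))
        for i, j in combinations(range(len(embeddings)), 2)
    )
    # The lowest similarity is treated here as the least ambiguous pair.
    return min(scored, key=lambda item: item[2])


# Toy embeddings for three candidate responses (made-up values).
candidates = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
]
i, j, similarity = select_most_dissimilar_pair(candidates)
print(f"label responses {i} and {j} (cosine similarity {similarity:.3f})")
```

In this toy case the first two responses are near-duplicates, so a pair containing both would be hard to rank; the sketch instead picks the first and third responses, whose embeddings are orthogonal.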

Code & Models

Repositories

honggen-zhang/real-alignment
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Sparse Evolutionary Training