Preference-Guided Reflective Sampling for Aligning Language Models

Hai Ye; Hwee Tou Ng

arXiv:2408.12163·cs.CL·October 7, 2024

TL;DR

This paper introduces Preference-Guided Reflective Sampling (PRS), a novel sampling method that improves the alignment of large language models to human preferences through adaptive self-refinement and natural language preference specification.

Contribution

PRS offers a tree-based, adaptive sampling framework that enhances response quality and preference alignment in large language models, outperforming traditional random sampling methods.

Findings

1. PRS generates higher-quality responses with better rewards.
2. PRS outperforms repeated random sampling in best-of-N scenarios.
3. PRS shows strong performance in iterative offline RL training.

Abstract

Iterative data generation and model re-training can effectively align large language models (LLMs) to human preferences. The process of data sampling is crucial, as it significantly influences the success of policy improvement. Repeated random sampling is a widely used method that independently queries the model multiple times to generate outputs. In this work, we propose a more effective sampling method, named Preference-Guided Reflective Sampling (PRS). Unlike random sampling, PRS employs a tree-based generation framework to enable more efficient sampling. It leverages adaptive self-refinement techniques to better explore the sampling space. By specifying user preferences in natural language, PRS can further optimize response generation according to these preferences. As a result, PRS can align models to diverse user preferences. Our experiments demonstrate that PRS generates…
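
To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It compares repeated random sampling (best-of-N) with a two-layer, tree-shaped sampler that drafts, critiques the best draft, and then refines it. Everything here is an assumption for illustration: the function names, the even split of the budget between drafts and refinements, the way the preference is appended to the prompt, and the toy stand-ins that replace a real language model and reward model.

```python
import random
from typing import Callable, List, Tuple

Generate = Callable[[str], str]            # prompt -> response
Reward = Callable[[str, str], float]       # (prompt, response) -> score
Critique = Callable[[str, str], str]       # (prompt, response) -> feedback
Refine = Callable[[str, str, str], str]    # (prompt, draft, feedback) -> response


def best_of_n(prompt: str, generate: Generate, reward: Reward,
              n: int) -> Tuple[str, List[str]]:
    """Baseline: n independent samples, keep the highest-reward one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda y: reward(prompt, y)), candidates


def reflective_sample(prompt: str, preference: str, generate: Generate,
                      critique: Critique, refine: Refine, reward: Reward,
                      n: int) -> Tuple[str, List[str]]:
    """Hypothetical two-layer tree: drafts first, then refinements of the best draft."""
    if n < 2:
        raise ValueError("need a budget of at least 2 samples")
    # Assumption: the preference is stated in natural language next to the prompt.
    conditioned = f"{prompt}\n\nPreference: {preference}"
    n_draft = n // 2  # assumption: split the budget evenly across the two layers
    drafts = [generate(conditioned) for _ in range(n_draft)]
    best_draft = max(drafts, key=lambda y: reward(conditioned, y))
    # Self-refinement step: critique the best draft, then resample conditioned on it.
    feedback = critique(conditioned, best_draft)
    refinements = [refine(conditioned, best_draft, feedback)
                   for _ in range(n - n_draft)]
    candidates = drafts + refinements
    return max(candidates, key=lambda y: reward(conditioned, y)), candidates


# Toy stand-ins so the sketch runs without a model: a "response" is a number
# rendered as text, and its reward is that number.
rng = random.Random(0)


def toy_generate(prompt: str) -> str:
    return f"{rng.random():.4f}"


def toy_reward(prompt: str, response: str) -> float:
    return float(response)


def toy_critique(prompt: str, response: str) -> str:
    return "be more concise"


def toy_refine(prompt: str, draft: str, feedback: str) -> str:
    # Refinements stay close to the draft they start from.
    return f"{float(draft) + rng.uniform(-0.05, 0.15):.4f}"


baseline, baseline_pool = best_of_n("q", toy_generate, toy_reward, 8)
reflective, reflective_pool = reflective_sample(
    "q", "concise answers", toy_generate, toy_critique, toy_refine, toy_reward, 8)
```

Both samplers spend the same budget of 8 generations. In the sketch, the only structural difference is that the second half of the reflective budget is conditioned on the best draft and its feedback instead of being drawn independently.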

Code & Models

Repositories

nusnlp/prs (PyTorch, official)

Datasets

oceanpty/alpaca_user_preference (dataset, 19 downloads)

Taxonomy

Topics: Topic Modeling

Methods: ALIGN