Towards Comprehensive Preference Data Collection for Reward Modeling

Yulan Hu; Qingyang Li; Sheng Ouyang; Ge Chen; Kaihui Chen; Lijun Mei; Xucheng Ye; Fuzheng Zhang; Yong Liu

arXiv:2406.16486·cs.AI·June 25, 2024

TL;DR

This paper introduces a structured, four-step framework for collecting high-quality preference data in reinforcement learning from human feedback, aiming to improve reward models for language models.

Contribution

The paper proposes a comprehensive framework for preference data collection that decomposes the process into four steps, with the aim of improving data quality and reducing reliance on human labor.

Findings

01. The framework improves the quality of preference data.

02. Experiments show the effectiveness of the proposed method.

03. Structured data collection enhances reward model training.

Abstract

Reinforcement Learning from Human Feedback (RLHF) facilitates the alignment of large language models (LLMs) with human preferences, thereby enhancing the quality of responses generated. A critical component of RLHF is the reward model, which is trained on preference data and outputs a scalar reward during the inference stage. However, the collection of preference data still lacks thorough investigation. Recent studies indicate that preference data is collected either by AI or humans, where chosen and rejected instances are identified among pairwise responses. We question whether this process effectively filters out noise and ensures sufficient diversity in collected data. To address these concerns, for the first time, we propose a comprehensive framework for preference data collection, decomposing the process into four incremental steps: Prompt Generation, Response Generation, Response…
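The abstract says the reward model is trained on chosen/rejected pairs and emits a scalar reward, but it does not state the training objective. As an illustration only, the sketch below uses the pairwise logistic (Bradley-Terry style) loss that is a common choice for this setup; it is an assumption, not a method confirmed by the paper. The names `PreferencePair`, `pairwise_loss`, and `toy_reward` are hypothetical, and `toy_reward` is a stand-in for a learned model.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """One preference record: a prompt with a chosen and a rejected response."""
    prompt: str
    chosen: str
    rejected: str


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise logistic loss: -log(sigmoid(r_chosen - r_rejected)).

    Assumed objective (not taken from the paper). Computed as
    softplus(-margin) in a numerically stable way.
    """
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def toy_reward(response: str) -> float:
    """Hypothetical scalar reward: word count, standing in for a trained model."""
    return float(len(response.split()))


pair = PreferencePair(
    prompt="Explain RLHF in one sentence.",
    chosen="RLHF fine-tunes a language model using a learned reward signal.",
    rejected="It is a thing.",
)

# The loss is small when the chosen response scores higher than the rejected one.
loss = pairwise_loss(toy_reward(pair.chosen), toy_reward(pair.rejected))
```

Under this objective a noisy pair (one where the labeled "chosen" response is actually worse) pushes the reward model in the wrong direction, which is one way to see why the abstract stresses filtering noise during collection.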

Taxonomy

Topics: Data Management and Algorithms