RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Hanze Dong; Wei Xiong; Deepanshu Goyal; Yihan Zhang; Winnie Chow; Rui Pan; Shizhe Diao; Jipeng Zhang; Kashun Shum; Tong Zhang

arXiv:2304.06767·cs.LG·December 4, 2023·31 cites

Open Access · 1 Repo · 8 Models

TL;DR

This paper introduces RAFT (Reward rAnked FineTuning), a fine-tuning framework that aligns generative models by using a reward model to select high-quality samples and fine-tuning on them, avoiding the inefficiency and instability of earlier reinforcement learning methods.

Contribution

RAFT offers a robust and efficient alternative to RLHF by using reward-ranked sample filtering for better model alignment with human preferences.
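
The filtering idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function `select_top_by_reward`, the parameter `keep_per_prompt`, and the keyword-counting `toy_reward` are all hypothetical names introduced here, and the toy reward merely stands in for a learned reward model. The sketch assumes several candidate responses have already been sampled per prompt, and it shows only the ranking and selection step whose output would then feed supervised fine-tuning.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def select_top_by_reward(
    samples_per_prompt: Dict[str, Sequence[str]],
    reward_fn: Callable[[str, str], float],
    keep_per_prompt: int = 1,
) -> List[Tuple[str, str, float]]:
    """Keep the highest-reward responses for each prompt.

    Returns (prompt, response, reward) triples that could serve as a
    supervised fine-tuning set in a later step (not shown here).
    """
    selected: List[Tuple[str, str, float]] = []
    for prompt, responses in samples_per_prompt.items():
        scored = [(reward_fn(prompt, r), r) for r in responses]
        # Sort by reward, highest first; the sort is stable, so ties
        # keep their original sampling order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for reward, response in scored[:keep_per_prompt]:
            selected.append((prompt, response, reward))
    return selected


def toy_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a reward model: counts polite words."""
    text = response.lower()
    return float(text.count("please") + text.count("thank"))


# Assumed toy data: one prompt with three sampled candidate responses.
candidates = {
    "Ask for the report.": [
        "Send the report now.",
        "Could you please send the report? Thank you.",
        "Report. Please.",
    ],
}
finetune_set = select_top_by_reward(candidates, toy_reward, keep_per_prompt=1)
```

In this sketch, `keep_per_prompt` controls how aggressively samples are filtered: a smaller value keeps only the best-scoring responses, at the cost of a smaller fine-tuning set.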

Findings

01 RAFT improves reward learning performance

02 RAFT enhances automated metric scores

03 RAFT is effective for language and diffusion models

Abstract

Generative foundation models are susceptible to implicit biases that can arise from extensive unsupervised training data. Such biases can produce suboptimal samples, skewed outcomes, and unfairness, with potentially serious consequences. Consequently, aligning these models with human ethics and preferences is an essential step toward ensuring their responsible and effective deployment in real-world applications. Prior research has primarily employed Reinforcement Learning from Human Feedback (RLHF) to address this problem, where generative models are fine-tuned with RL algorithms guided by a human-feedback-informed reward model. However, the inefficiencies and instabilities associated with RL algorithms frequently present substantial obstacles to successful alignment, necessitating the development of a more robust and streamlined approach. To this end, we introduce a new framework,…

Code & Models

Repositories

optimalscale/lmflow · PyTorch · Official

Taxonomy

Topics: Topic Modeling · Music and Audio Processing · Speech Recognition and Synthesis

Methods: Diffusion · ALIGN