Aligning Language Models with Offline Learning from Human Feedback

Jian Hu; Li Tao; June Yang; Chandler Zhou

arXiv:2308.12050·cs.CL·December 12, 2023

TL;DR

This paper introduces an offline framework for aligning language models with human preferences. It avoids the instability and complexity of online learning and achieves comparable results with fewer computational resources.

Contribution

It explores three offline methods, filtering alignment (FA), reward-weighted regression (RWR), and conditional alignment (CA), for stable, resource-efficient language model alignment.

Findings

1. Conditional alignment outperforms the other offline methods.
2. The proposed methods require around 9% fewer computing resources.
3. Conditional alignment achieves results comparable to PPO.

Abstract

Learning from human preferences is crucial for language models (LMs) to effectively cater to human needs and societal values. Previous research has made notable progress by leveraging human feedback to follow instructions. However, these approaches rely primarily on online learning techniques like Proximal Policy Optimization (PPO), which have been proven unstable and challenging to tune for language models. Moreover, PPO requires complex distributed system implementation, hindering the efficiency of large-scale distributed training. In this study, we propose an offline learning from human feedback framework to align LMs without interacting with environments. Specifically, we explore filtering alignment (FA), reward-weighted regression (RWR), and conditional alignment (CA) to align language models to human preferences. By employing a loss function similar to supervised fine-tuning, our…
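The three strategies named in the abstract can be illustrated with a toy sketch. The abstract does not give their exact formulations, so everything below is an assumption made for illustration: the sample data, the reward threshold, the temperature `beta`, the `<reward: x>` tag format, and all function names are hypothetical and not taken from the paper.

```python
import math

# Hypothetical toy dataset of (prompt, response, reward) triples.
# The rewards are assumed to come from a separately trained reward model.
samples = [
    ("Q1", "good answer", 0.9),
    ("Q1", "poor answer", 0.1),
    ("Q2", "okay answer", 0.5),
    ("Q2", "great answer", 1.0),
]


def filtering_alignment(data, threshold):
    """Assumed FA rule: keep only samples whose reward is >= threshold."""
    return [(prompt, resp) for prompt, resp, reward in data if reward >= threshold]


def rwr_weights(data, beta):
    """Assumed RWR rule: loss weights proportional to exp(reward / beta).

    The maximum reward is subtracted before exponentiating for numerical
    stability; the weights are then normalised to sum to 1.
    """
    max_reward = max(reward for _, _, reward in data)
    raw = [math.exp((reward - max_reward) / beta) for _, _, reward in data]
    total = sum(raw)
    return [w / total for w in raw]


def conditional_alignment(data):
    """Assumed CA rule: prefix each prompt with a tag carrying its reward.

    At inference time one would prepend a high-reward tag to the prompt.
    """
    return [(f"<reward: {reward:.1f}> {prompt}", resp) for prompt, resp, reward in data]


filtered = filtering_alignment(samples, threshold=0.5)
weights = rwr_weights(samples, beta=0.5)
conditioned = conditional_alignment(samples)
```

In each case the output would feed an ordinary supervised fine-tuning loop (as filtered data, per-sample loss weights, or reward-tagged prompts), which is consistent with the abstract's description of a loss similar to supervised fine-tuning.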

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Entropy Regularization · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Layer Normalization · Dense Connections · Absolute Position Encodings