Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control

Xiang Fan; Yiwei Lyu; Paul Pu Liang; Ruslan Salakhutdinov; Louis-Philippe Morency

arXiv:2211.05750·cs.CL·September 26, 2023

TL;DR

Nano is a few-shot human-in-the-loop training method that lets language models generate text following both quantified and unquantified distributions, including personal preferences, with high sample efficiency.

Contribution

Nano is a novel few-shot training algorithm that learns from human feedback to control the distribution of generated text, including unquantified distributions such as personal preferences.

Findings

01. Achieves state-of-the-art control over single attributes and distributions.
02. Effectively learns unquantified distributions and personal preferences.
03. Demonstrates high sample efficiency in personalization tasks.

Abstract

Pretrained language models have demonstrated extraordinary capabilities in language generation. However, real-world tasks often require controlling the distribution of generated text in order to mitigate bias, promote fairness, and achieve personalization. Existing techniques for controlling the distribution of generated text only work with quantified distributions, which require pre-defined categories, proportions of the distribution, or an existing corpus following the desired distributions. However, many important distributions, such as personal preferences, are unquantified. In this work, we tackle the problem of generating text following arbitrary distributions (quantified and unquantified) by proposing Nano, a few-shot human-in-the-loop training algorithm that continuously learns from human feedback. Nano achieves state-of-the-art results on single topic/attribute as well as…

Code & Models

Repositories

sfanxiang/nano (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification