ZYN: Zero-Shot Reward Models with Yes-No Questions for RLAIF

Victor Gallego

arXiv:2308.06385 · cs.CL · December 15, 2023

TL;DR

This paper introduces ZYN, a zero-shot reward modeling approach using Yes-No questions with instruction-tuned language models to align text generation with human preferences without labeled data.

Contribution

The paper presents a novel zero-shot reward model framework that leverages Yes-No prompts for guiding language models, applicable across various text generation tasks.

Findings

01. Effective in detoxification and sentiment optimization
02. Compatible with quality-diversity search methods
03. Enables personalized prompt generation for text-to-image tasks

Abstract

In this work, we address the problem of directing the text generation of a language model (LM) towards a desired behavior, aligning the generated text with the preferences of the human operator. We propose using another, instruction-tuned language model as a critic reward model in a zero-shot way: the critic is prompted with a Yes-No question that represents the user preferences, so no further labeled data is required. This zero-shot reward model provides the learning signal to further fine-tune the base LM using Reinforcement Learning from AI Feedback (RLAIF); our approach is also compatible with other contexts, such as quality-diversity search. Extensive evidence of the capabilities of the proposed ZYN framework is provided through experiments in different domains related to text generation, including detoxification; optimizing sentiment of movie reviews, or any other attribute; steering…
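The idea of scoring text with a Yes-No question can be illustrated with a minimal sketch. It assumes the reward is taken as the critic's probability of answering "Yes", normalized over the two answers "Yes" and "No"; this page does not spell out the exact formula, so treat that choice as an assumption. The prompt template, the function names, and `toy_critic` (a keyword-based stand-in for an instruction-tuned LM that would return answer logits) are all hypothetical and exist only to keep the example runnable.

```python
import math
from typing import Callable, Dict


def build_prompt(question: str, text: str) -> str:
    # Hypothetical template: show the generated text, then ask the Yes-No question.
    return f"Text: {text}\nQuestion: {question}\nAnswer (Yes or No):"


def yes_no_reward(logits: Dict[str, float]) -> float:
    """Two-way softmax over the 'Yes' and 'No' logits; returns P(Yes) in [0, 1]."""
    yes, no = logits["Yes"], logits["No"]
    m = max(yes, no)  # subtract the max for numerical stability
    e_yes = math.exp(yes - m)
    e_no = math.exp(no - m)
    return e_yes / (e_yes + e_no)


def toy_critic(prompt: str) -> Dict[str, float]:
    # Stand-in for an instruction-tuned LM (assumption): a real critic would
    # return the next-token logits for "Yes" and "No" given the prompt.
    score = 2.0 if "great" in prompt.lower() else -2.0
    return {"Yes": score, "No": -score}


def zero_shot_reward(
    question: str,
    text: str,
    critic: Callable[[str], Dict[str, float]] = toy_critic,
) -> float:
    # No labeled data is involved: the question alone encodes the preference.
    return yes_no_reward(critic(build_prompt(question, text)))


if __name__ == "__main__":
    question = "Is this movie review positive?"
    for review in ["A great film.", "Two hours I will never get back."]:
        print(review, round(zero_shot_reward(question, review), 3))
```

In an RLAIF loop, a scalar like this would be the reward passed to the policy-gradient update for each generated sample; swapping the question changes the target behavior without retraining the critic.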

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques

Methods: Balanced Selection