Human-centric Dialog Training via Offline Reinforcement Learning

Natasha Jaques; Judy Hanwen Shen; Asma Ghandeharioun; Craig Ferguson; Agata Lapedriza; Noah Jones; Shixiang Shane Gu; and Rosalind Picard

arXiv:2010.05848·cs.CL·October 13, 2020

TL;DR

This paper introduces a novel offline reinforcement learning approach to train dialog models using human feedback, addressing exploration and overestimation challenges, and demonstrates improved conversational quality in real-world tests.

Contribution

The paper develops a new offline RL algorithm that combines KL-control with pessimistic value estimation, enabling effective training of dialog models from static datasets of human feedback.
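A minimal sketch of the two ingredients named above, under stated assumptions: a KL-control penalty that lowers the reward when the policy drifts away from a pretrained prior, and a pessimistic target that uses the lowest of several noisy value estimates. The function names, the `kl_weight` parameter, and the toy numbers are illustrative only and are not taken from the paper or its repository.

```python
import math


def kl_penalized_reward(reward, policy_prob, prior_prob, kl_weight=1.0):
    """Subtract a per-action KL-style penalty, log(pi(a|s)) - log(p(a|s)).

    The penalty is zero when the policy assigns the same probability as the
    prior, and positive when the policy is more confident than the prior.
    """
    penalty = math.log(policy_prob) - math.log(prior_prob)
    return reward - kl_weight * penalty


def pessimistic_target(reward, next_q_samples, gamma=0.99):
    """Bellman-style target built from the lowest next-state value estimate.

    Taking the minimum over several estimates is one simple way to counter
    over-optimistic value estimates when no further exploration is possible.
    """
    return reward + gamma * min(next_q_samples)


# Hypothetical numbers: the policy is more confident than the prior,
# so the shaped reward ends up below the raw reward of 1.0.
shaped = kl_penalized_reward(1.0, policy_prob=0.8, prior_prob=0.2)
target = pessimistic_target(shaped, next_q_samples=[2.0, 3.0, 1.0])
```

Raising `kl_weight` keeps the learned policy closer to the prior language model; lowering it lets the reward dominate.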

Findings

01

Significant improvement in dialog quality over existing offline RL methods.

02

Effective use of implicit human feedback cues as reward signals.

03

Validated with ratings from 80 users in open-domain conversations.

Abstract

How can we train a dialog model to produce better conversations by learning from human feedback, without the risk of humans teaching it harmful chat behaviors? We start by hosting models online, and gather human feedback from real-time, open-ended conversations, which we then use to train and improve the models using offline reinforcement learning (RL). We identify implicit conversational cues including language similarity, elicitation of laughter, sentiment, and more, which indicate positive human feedback, and embed these in multiple reward functions. A well-known challenge is that learning an RL policy in an offline setting usually fails due to the lack of ability to explore and the tendency to make over-optimistic estimates of future reward. These problems become even harder when using RL for language models, which can easily have a 20,000 action vocabulary and many possible reward…
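The idea of turning implicit conversational cues into reward signals can be illustrated with a small sketch. Everything here is an assumption made for illustration: the laughter pattern, the word-overlap measure used as a stand-in for language similarity, the equal weights, and all function names are hypothetical, and sentiment is left out because it would need a trained classifier.

```python
import re

# Hypothetical laughter detector: "haha", "hahaha", ... or "lol".
LAUGHTER = re.compile(r"\b(?:ha){2,}\b|\blol\b", re.IGNORECASE)
WORD = re.compile(r"[a-z']+")


def laughter_cue(user_reply):
    """Return 1.0 if the user's reply contains laughter, else 0.0."""
    return 1.0 if LAUGHTER.search(user_reply) else 0.0


def word_overlap(bot_utterance, user_reply):
    """Jaccard overlap of word sets, a crude proxy for language similarity."""
    bot_words = set(WORD.findall(bot_utterance.lower()))
    user_words = set(WORD.findall(user_reply.lower()))
    if not bot_words or not user_words:
        return 0.0
    return len(bot_words & user_words) / len(bot_words | user_words)


def combined_reward(bot_utterance, user_reply, weights=None):
    """Weighted sum of the cue scores for one bot turn and the user's reply."""
    weights = weights or {"laughter": 0.5, "similarity": 0.5}
    return (
        weights["laughter"] * laughter_cue(user_reply)
        + weights["similarity"] * word_overlap(bot_utterance, user_reply)
    )
```

Because each cue is computed from logged user replies, rewards of this kind can be assigned to a static conversation dataset after the fact, which is what makes them usable in an offline setting.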

Code & Models

Repositories

natashamjaques/neural_chat
PyTorch · Official
