Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations

Joey Hong; Jessica Lin; Anca Dragan; Sergey Levine

arXiv:2411.05194 · cs.LG · November 11, 2024

TL;DR

This paper introduces a reinforcement learning approach that trains dialogue agents on hindsight-regenerated conversations so that they steer dialogues more effectively. In sensitive domains such as mental health and charity, the resulting agents outperform existing methods.

Contribution

The paper proposes a novel offline reinforcement learning method that rewrites and augments dialogue data using hindsight, enabling agents to learn effective conversational strategies without extensive expert data.
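
As a rough illustration of what "rewriting and augmenting dialogue data using hindsight" could look like, the sketch below builds extra training dialogues from one logged dialogue. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Dialogue` class, the `hindsight_augment` function, and the three callables (`rewrite_agent_turn`, `simulate_rest`, `score`) are hypothetical names introduced here, and the toy stand-ins replace what would in practice be language-model calls. The offline reinforcement learning step that would consume the augmented data is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance); speaker is "agent" or "user"


@dataclass
class Dialogue:
    turns: List[Turn]
    reward: float  # scalar outcome assigned to the whole dialogue


def hindsight_augment(
    dialogue: Dialogue,
    rewrite_agent_turn: Callable[[List[Turn], int], str],
    simulate_rest: Callable[[List[Turn]], List[Turn]],
    score: Callable[[List[Turn]], float],
) -> List[Dialogue]:
    """Return the original dialogue plus one regenerated variant per agent turn.

    All three callables are hypothetical placeholders (assumptions):
    - rewrite_agent_turn sees the full dialogue (hindsight) and proposes a
      replacement for the agent utterance at index i;
    - simulate_rest continues the conversation from a rewritten prefix;
    - score assigns a reward to the regenerated dialogue.
    """
    augmented = [dialogue]
    for i, (speaker, _) in enumerate(dialogue.turns):
        if speaker != "agent":
            continue
        new_utterance = rewrite_agent_turn(dialogue.turns, i)
        prefix = dialogue.turns[:i] + [("agent", new_utterance)]
        new_turns = prefix + simulate_rest(prefix)
        augmented.append(Dialogue(new_turns, score(new_turns)))
    return augmented


# Toy stand-ins so the sketch runs without any language model.
def toy_rewrite(turns: List[Turn], i: int) -> str:
    return turns[i][1] + " (revised)"


def toy_simulate(prefix: List[Turn]) -> List[Turn]:
    return [("user", "ok")]


def toy_score(turns: List[Turn]) -> float:
    return float(len(turns))


logged = Dialogue(
    turns=[("agent", "Hi"), ("user", "Hello"), ("agent", "Donate?"), ("user", "No")],
    reward=0.0,
)
augmented_data = hindsight_augment(logged, toy_rewrite, toy_simulate, toy_score)
```

In this sketch the original dialogue is kept alongside the regenerated variants, so a downstream offline learner would see both the logged behavior and the alternatives with their relabeled rewards.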

Findings

01 Outperforms state-of-the-art dialogue agents in user studies

02 Effective in domains requiring understanding of human mental states

03 Enhances dialogue steering capabilities through hindsight-based learning

Abstract

Recent progress on large language models (LLMs) has enabled dialogue agents to generate highly naturalistic and plausible text. However, current LLM language generation focuses on responding accurately to questions and requests with a single effective response. In reality, many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion. Accounting for how an agent can effectively steer a conversation is a crucial ability in many dialogue tasks, from healthcare to preference elicitation. Existing methods for fine-tuning dialogue agents to accomplish such tasks would rely on curating some amount of expert data. However, doing so often requires understanding the underlying cognitive processes of the conversational partner, which is a skill neither humans nor LLMs trained on human data can reliably…

Taxonomy

Topics: Social Robot Interaction and HRI