Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations
Joey Hong, Jessica Lin, Anca Dragan, Sergey Levine

TL;DR
This paper introduces a reinforcement learning approach that trains dialogue agents on hindsight regenerations of past conversations so that they learn to steer interactions more effectively. It targets sensitive domains such as mental health support and soliciting charitable donations, and it outperforms existing dialogue agents in user studies.
Contribution
The paper proposes a novel offline reinforcement learning method that rewrites and augments dialogue data using hindsight, enabling agents to learn effective conversational strategies without extensive expert data.
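As a rough illustration of what "rewrites and augments dialogue data using hindsight" could look like in code, the sketch below regenerates one agent turn per logged dialogue, re-simulates the rest of the conversation, and relabels the outcome. It is not the authors' implementation: the summary does not say how turns are chosen, rewritten, continued, or scored, so `pick_turn`, `rewrite_turn`, `simulate_rest`, and `score` are hypothetical placeholders (in practice they would presumably be backed by language models), and the offline RL training step on the augmented data is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Dialogue:
    turns: Tuple[str, ...]  # alternating utterances, agent speaks first (assumption)
    reward: float           # scalar outcome label for the whole dialogue (assumption)


def hindsight_augment(
    dataset: List[Dialogue],
    pick_turn: Callable[[Dialogue], int],                  # hypothetical: index of the agent turn to redo
    rewrite_turn: Callable[[Dialogue, int], str],          # hypothetical: better utterance, given the full dialogue
    simulate_rest: Callable[[Tuple[str, ...]], Tuple[str, ...]],  # hypothetical: continues a dialogue prefix
    score: Callable[[Tuple[str, ...]], float],             # hypothetical: reward label for a dialogue
) -> List[Dialogue]:
    """Return the original dialogues plus one regenerated variant of each."""
    augmented = list(dataset)  # keep the logged data unchanged
    for dialogue in dataset:
        t = pick_turn(dialogue)
        # Keep everything before turn t, then substitute the rewritten utterance.
        prefix = dialogue.turns[:t] + (rewrite_turn(dialogue, t),)
        # Re-simulate how the conversation might continue from the new prefix.
        turns = prefix + simulate_rest(prefix)
        augmented.append(Dialogue(turns=turns, reward=score(turns)))
    return augmented


# Toy usage with hard-coded stand-ins for the hypothetical components.
original = Dialogue(
    turns=("How are you?", "Not great.", "Okay, bye.", "Bye."),
    reward=0.0,
)
augmented = hindsight_augment(
    [original],
    pick_turn=lambda d: 2,
    rewrite_turn=lambda d, t: "I'm sorry to hear that. What happened?",
    simulate_rest=lambda prefix: ("Work has been stressful.",),
    score=lambda turns: 1.0,
)
```

The combined list (logged plus regenerated dialogues, each with a reward label) is the kind of dataset an offline RL method could then be trained on; which offline RL algorithm is used is not stated in this summary.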
Findings
Outperforms state-of-the-art dialogue agents in user studies
Effective in domains requiring understanding of human mental states
Enhances dialogue steering capabilities through hindsight-based learning
Abstract
Recent progress on large language models (LLMs) has enabled dialogue agents to generate highly naturalistic and plausible text. However, current LLM language generation focuses on responding accurately to questions and requests with a single effective response. In reality, many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion. Accounting for how an agent can effectively steer a conversation is a crucial ability in many dialogue tasks, from healthcare to preference elicitation. Existing methods for fine-tuning dialogue agents to accomplish such tasks would rely on curating some amount of expert data. However, doing so often requires understanding the underlying cognitive processes of the conversational partner, which is a skill neither humans nor LLMs trained on human data can reliably…
Taxonomy
Topics: Social Robot Interaction and HRI
