Two Approaches to Building Collaborative, Task-Oriented Dialog Agents through Self-Play
Arkady Arkhangorodsky, Scot Fang, Victoria Knight, Ajay Nagesh, Maria Ryskina, Kevin Knight

TL;DR
This paper explores two self-play approaches to training task-oriented dialog agents: reinforcement learning and game-theoretic equilibrium finding. In both, agent-bots and user-bots autonomously explore an API environment and discover communication strategies for solving the task.
Contribution
The paper introduces and empirically evaluates two self-play approaches for training dialog agents without relying on large human/human dialog corpora.
Findings
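The reinforcement learning approach trains agent-bots and user-bots through self-play, with empirical results reported.
The game-theoretic approach finds equilibrium strategies for the dialog agents, with empirical results reported.
In both approaches, agents autonomously explore an API environment and discover communication strategies that enable them to solve the task.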
Abstract
Task-oriented dialog systems are often trained on human/human dialogs, such as those collected from Wizard-of-Oz interfaces. However, human/human corpora are frequently too small for supervised training to be effective. This paper investigates two approaches to training agent-bots and user-bots through self-play, in which they autonomously explore an API environment, discovering communication strategies that enable them to solve the task. We give empirical results for both reinforcement learning and game-theoretic equilibrium finding.
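As a rough illustration of what "equilibrium finding" can mean for a cooperative dialog task, the sketch below enumerates pure-strategy Nash equilibria in a toy signaling game. This is not the paper's environment or algorithm: the two goals, two messages, two API actions, the uniform goal distribution, and the names `payoff` and `pure_equilibria` are all assumptions made for this example. A user-bot knows the goal and sends one message; an agent-bot sees only the message and picks an API action; both share a reward of 1 when the action matches the goal.

```python
from itertools import product

GOALS = (0, 1)     # hypothetical user goals
MESSAGES = (0, 1)  # hypothetical utterances the user-bot can send
ACTIONS = (0, 1)   # hypothetical API calls the agent-bot can make


def payoff(user, agent):
    """Shared expected reward, assuming goals are uniformly distributed.

    user[g] is the message sent for goal g; agent[m] is the action taken
    after hearing message m. The task succeeds when the action equals the goal.
    """
    hits = sum(1 for g in GOALS if agent[user[g]] == g)
    return hits / len(GOALS)


def pure_equilibria():
    """Return every (user, agent, payoff) where no unilateral deviation helps."""
    users = list(product(MESSAGES, repeat=len(GOALS)))
    agents = list(product(ACTIONS, repeat=len(MESSAGES)))
    found = []
    for user, agent in product(users, agents):
        value = payoff(user, agent)
        user_is_best = all(payoff(alt, agent) <= value for alt in users)
        agent_is_best = all(payoff(user, alt) <= value for alt in agents)
        if user_is_best and agent_is_best:
            found.append((user, agent, value))
    return found


if __name__ == "__main__":
    for user, agent, value in pure_equilibria():
        print(f"user={user} agent={agent} payoff={value}")
```

In this toy game the search finds six pure equilibria: two "separating" ones with payoff 1.0, where each goal gets its own message and the agent-bot decodes it correctly, and four "pooling" ones with payoff 0.5, where the message carries no information. Exhaustive enumeration only works at this scale; it is meant to show the equilibrium condition, not a practical solver.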
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Speech and dialogue systems · Reinforcement Learning in Robotics
