Building a Conversational Agent Overnight with Dialogue Self-Play

Pararth Shah; Dilek Hakkani-Tür; Gokhan Tür; Abhinav Rastogi; Ankur Bapna; Neha Nayak; Larry Heck

arXiv:1801.04871 · cs.AI · January 16, 2018 · 161 cites

TL;DR

This paper introduces M2M (Machines Talking To Machines), a framework that combines machine self-play with crowdsourcing to quickly build goal-oriented dialogue agents, producing diverse dialogue flows with natural-sounding utterances.

Contribution

The paper presents a novel framework combining automation and crowdsourcing to quickly bootstrap dialogue agents with high diversity and naturalness, applicable to arbitrary domains.

Findings

01. M2M generates more diverse dialogue flows than Wizard-of-Oz data collection.

02. The framework produces natural-sounding utterances with high coverage of salient dialogue flows.

03. A new dataset of 3,000 dialogues across 2 domains was created using M2M.

Abstract

We propose Machines Talking To Machines (M2M), a framework combining automation and crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues in arbitrary domains. M2M scales to new tasks with just a task schema and an API client from the dialogue system developer, but it is also customizable to cater to task-specific interactions. Compared to the Wizard-of-Oz approach for data collection, M2M achieves greater diversity and coverage of salient dialogue flows while maintaining the naturalness of individual utterances. In the first phase, a simulated user bot and a domain-agnostic system bot converse to exhaustively generate dialogue "outlines", i.e. sequences of template utterances and their semantic parses. In the second phase, crowd workers provide contextual rewrites of the dialogues to make the utterances more natural while preserving their meaning.…
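The first phase described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the slot names in `SCHEMA`, the act labels (`inform`, `request`, `confirm`), the `Turn` record, and the fixed ask-for-each-missing-slot policy of the system bot. The sketch only shows the general shape of an outline, that is, a sequence of template utterances paired with their semantic parses; the second phase (crowd workers rewriting the templates) is not modelled.

```python
from dataclasses import dataclass

# Hypothetical task schema: slot names are illustrative, not from the paper.
SCHEMA = ["restaurant_name", "date", "time"]


@dataclass
class Turn:
    speaker: str   # "user" or "system"
    act: str       # dialogue act label (hypothetical set: inform/request/confirm)
    slots: dict    # semantic parse: slot -> value, None when the value is requested
    template: str  # template utterance that crowd workers would later rewrite


def render(act, slots):
    """Render a dialogue act and its slots as a flat template utterance."""
    args = ", ".join(k if v is None else f"{k}={v}" for k, v in slots.items())
    return f"{act.upper()}({args})"


def self_play(goal, first_slots):
    """Generate one dialogue outline for a user goal (slot -> value).

    The simulated user opens with the slots in `first_slots`; the system bot
    then requests every remaining schema slot and finally confirms the task.
    """
    outline = []
    known = {}

    opening = {s: goal[s] for s in first_slots}
    outline.append(Turn("user", "inform", opening, render("inform", opening)))
    known.update(opening)

    for slot in SCHEMA:
        if slot in known:
            continue
        request = {slot: None}
        outline.append(Turn("system", "request", request, render("request", request)))
        answer = {slot: goal[slot]}
        outline.append(Turn("user", "inform", answer, render("inform", answer)))
        known.update(answer)

    outline.append(Turn("system", "confirm", dict(known), render("confirm", known)))
    return outline


if __name__ == "__main__":
    goal = {"restaurant_name": "Cascal", "date": "Friday", "time": "7pm"}
    for turn in self_play(goal, first_slots=["restaurant_name"]):
        print(f"{turn.speaker:>6}: {turn.template}")
```

Varying `goal` and `first_slots` yields different outlines from the same schema, which is one simple way to think about how self-play can cover many dialogue flows before any human rewriting takes place.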

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Mobile Crowdsensing and Crowdsourcing