TicketTalk: Toward human-level performance with end-to-end, transaction-based dialog systems

Bill Byrne; Karthik Krishnamoorthi; Saravanan Ganesh; Mihir Sanjay Kale

arXiv:2012.12458·cs.CL·December 29, 2020

TL;DR

This paper introduces TicketTalk, a large movie ticketing dialog dataset and an end-to-end neural model that achieves near-human response quality and high API call accuracy, advancing transaction-based dialog systems.

Contribution

The paper presents the TicketTalk dataset and demonstrates an end-to-end neural approach that reaches near-human response quality and factual grounding accuracy in transaction-based dialog systems.

Findings

01

Human raters judged that 86.5% of responses "make sense" for a model trained on 10,000 TicketTalk dialogs.

02

API call predictions were correct 93.9% of the time.

03

Larger training sets yield higher response quality and API prediction scores.

Abstract

We present a data-driven, end-to-end approach to transaction-based dialog systems that performs at near-human levels in terms of verbal response quality and factual grounding accuracy. We show that two essential components of the system produce these results: a sufficiently large and diverse, in-domain labeled dataset, and a neural network-based, pre-trained model that generates both verbal responses and API call predictions. In terms of data, we introduce TicketTalk, a movie ticketing dialog dataset with 23,789 annotated conversations. The movie ticketing conversations range from completely open-ended and unrestricted to more structured, both in terms of their knowledge base, discourse features, and number of turns. In qualitative human evaluations, model-generated responses trained on just 10,000 TicketTalk dialogs were rated to "make sense" 86.5 percent of the time, almost the same…

Code & Models

Datasets

GEM/Taskmaster
dataset · 210 dl
