Modeling Performance in Open-Domain Dialogue with PARADISE
Marilyn Walker, Colin Harmon, James Graupera, Davan Harrison, Steve Whittaker

TL;DR
This paper develops a PARADISE model to predict dialogue system performance using user ratings and dialogue length, aiming to optimize open-domain conversational agents like Athena in real time.
Contribution
It introduces a general objective function for evaluating and optimizing open-domain dialogue systems based on automatic features and real user data.
Findings
The best user-rating prediction model achieves an R^2 of 0.136, using DistilBert.
The best dialogue-length prediction model achieves an R^2 of 0.865, using system-independent features.
Dialogue length may therefore be a more reliable automatic metric than user ratings for training dialogue systems.
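To make the R^2 figures above concrete, here is a minimal sketch of how a PARADISE-style linear regression could be fitted and scored on held-out conversations. Everything in it is an illustrative assumption: the data are synthetic, the three features are hypothetical stand-ins for automatically computed, z-scored dialogue features, and the helper names (fit_linear, predict, r_squared) are not taken from the paper. It does not reproduce the paper's features, models, or results.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns weights (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(w, X):
    """Apply fitted weights (intercept first) to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ w

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): 500 conversations, 3 hypothetical
# z-scored features, and a target such as dialogue length in turns.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
true_w = np.array([2.0, -1.0, 0.5])          # made-up effect sizes
y = 10.0 + X @ true_w + rng.normal(scale=1.0, size=n)

# Fit on the first 400 conversations, score on the remaining 100.
train, test = slice(0, 400), slice(400, n)
w = fit_linear(X[train], y[train])
r2 = r_squared(y[test], predict(w, X[test]))
print(f"held-out R^2: {r2:.3f}")
```

An R^2 of 0.136 means the model explains roughly 14% of the variance in the target, while 0.865 means it explains roughly 87%, which is why the length model is described as the more reliable signal.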
Abstract
There has recently been an explosion of work on spoken dialogue systems, along with an increased interest in open-domain systems that engage in casual conversations on popular topics such as movies, books and music. These systems aim to socially engage, entertain, and even empathize with their users. Since the achievement of such social goals is hard to measure, recent research has used dialogue length or human ratings as evaluation metrics, and developed methods for automatically calculating novel metrics, such as coherence, consistency, relevance and engagement. Here we develop a PARADISE model for predicting the performance of Athena, a dialogue system that has participated in thousands of conversations with real users, while competing as a finalist in the Alexa Prize. We use both user ratings and dialogue length as metrics for dialogue quality, and experiment with predicting these…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Weight Decay · Residual Connection · Linear Warmup With Linear Decay · WordPiece · Attention Dropout
