Topic-based Evaluation for Conversational Bots

Fenfei Guo; Angeliki Metallinou; Chandra Khatri; Anirudh Raju; Anu Venkatesh; Ashwin Ram

arXiv:1801.03622 · cs.CL · January 12, 2018 · 41 cites

Open Access · 1 Repo

TL;DR

This paper introduces topic-based metrics for evaluating conversational bots, focusing on coherence, engagement, and topic diversity, using a novel deep learning approach to classify conversation topics.

Contribution

It presents a new topic classification method with a topic-word attention mechanism and demonstrates that topic-based metrics align with human judgments in bot evaluation.
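One way a topic-word attention table could sit on top of a Deep Average Network is sketched below. This is a minimal illustration with untrained random weights, not the authors' implementation: the class name `AttentionalDAN`, the embedding size, the per-topic softmax over words, and the per-topic scoring vectors are all assumptions made for the example. Each word gets one attention score per topic; for each topic, the word embeddings are averaged with those attention weights, so the weights double as a soft indicator of topic keywords.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class AttentionalDAN:
    """Hypothetical DAN variant with a topic-word attention table (forward pass only)."""

    def __init__(self, vocab, topics, dim=8, seed=0):
        rng = random.Random(seed)
        self.vocab = {w: i for i, w in enumerate(vocab)}
        self.topics = list(topics)
        # Word embeddings: one dim-sized vector per vocabulary word.
        self.emb = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in vocab]
        # Topic-word attention table: one score per (word, topic) pair.
        self.att = [[rng.gauss(0, 0.1) for _ in topics] for _ in vocab]
        # One scoring vector per topic (assumed output layer).
        self.out = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in topics]

    def forward(self, tokens):
        """Return (topic probabilities, per-topic attention weights over the known tokens)."""
        ids = [self.vocab[t] for t in tokens if t in self.vocab]
        if not ids:
            raise ValueError("no in-vocabulary tokens in utterance")
        dim = len(self.emb[0])
        scores, weights = [], []
        for k in range(len(self.topics)):
            # Attention over the utterance's words, specific to topic k.
            w_k = softmax([self.att[i][k] for i in ids])
            # Attention-weighted average of word embeddings (replaces DAN's plain mean).
            s_k = [sum(w * self.emb[i][d] for w, i in zip(w_k, ids)) for d in range(dim)]
            scores.append(sum(a * b for a, b in zip(s_k, self.out[k])))
            weights.append(w_k)
        return softmax(scores), weights


# Toy vocabulary and topic labels, chosen only for illustration.
model = AttentionalDAN(["play", "some", "jazz", "score"], ["Music", "Sports"])
probs, attn = model.forward(["play", "some", "jazz"])
```

After training, the row of `attn` for the predicted topic would be expected to put most of its mass on that topic's keywords; with the random weights used here the values carry no meaning.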

Findings

01 Metrics correlate with human ratings

02 Topic diversity improves user engagement

03 Proposed method outperforms baseline classifiers

Abstract

Dialog evaluation is a challenging problem, especially for non-task-oriented dialogs where conversational success is not well-defined. We propose to evaluate dialog quality using topic-based metrics that describe the ability of a conversational bot to sustain coherent and engaging conversations on a topic, and the diversity of topics that a bot can handle. To detect conversation topics per utterance, we adopt Deep Average Networks (DAN) and train a topic classifier on a variety of question and query data categorized into multiple topics. We propose a novel extension to DAN by adding a topic-word attention table that allows the system to jointly capture topic keywords in an utterance and perform topic classification. We compare our proposed topic-based metrics with the ratings provided by users and show that our metrics both correlate with and complement human judgment. Our analysis is…
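Given per-utterance topic labels from such a classifier, conversation-level metrics of the kind the abstract describes might be computed as in the sketch below. The exact definitions are assumptions made for illustration: "breadth" is taken as the number of distinct topics in a conversation and "depth" as the mean length of consecutive same-topic runs, with a hypothetical catch-all label `"Other"` excluded from both. The function names and example labels are likewise invented.

```python
from itertools import groupby


def topic_runs(topics):
    """Collapse per-utterance topic labels into (topic, run_length) pairs."""
    return [(topic, sum(1 for _ in group)) for topic, group in groupby(topics)]


def topic_breadth(topics, ignore=("Other",)):
    """Number of distinct topics in a conversation, skipping catch-all labels."""
    return len({t for t in topics if t not in ignore})


def topic_depth(topics, ignore=("Other",)):
    """Mean length of consecutive same-topic runs, skipping catch-all labels."""
    runs = [n for t, n in topic_runs(topics) if t not in ignore]
    return sum(runs) / len(runs) if runs else 0.0


# Hypothetical topic labels for a seven-utterance conversation.
labels = ["Music", "Music", "Music", "Other", "Sports", "Sports", "Music"]
print(topic_breadth(labels))  # 2 distinct topics: Music, Sports
print(topic_depth(labels))    # runs of 3, 2, 1 -> mean 2.0
```

Scores like these, computed per conversation, are what would then be compared against user ratings.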

Code & Models

Repositories

knights207210/Deep-Learning-for-VUI (PyTorch)

Taxonomy

Topics: Advanced Text Analysis Techniques · Topic Modeling · Sentiment Analysis and Opinion Mining