Evaluating Open-Domain Dialogues in Latent Space with Next Sentence Prediction and Mutual Information

Kun Zhao, Bohao Yang, Chenghua Lin, Wenge Rong, Aline Villavicencio, and Xiaohui Cui

arXiv:2305.16967 · cs.CL · June 13, 2023 · 1 citation

TL;DR

This paper introduces a new automatic evaluation metric for open-domain dialogues that leverages latent space modeling, next sentence prediction, and mutual information to better assess semantic similarity and handle diverse responses.

Contribution

The proposed CMN metric combines CVAEs with NSP and MI to improve the robustness and accuracy of dialogue evaluation, addressing the one-to-many response issue.

Findings

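01 Outperforms a wide range of baselines on two open-domain dialogue datasets.

02 Is especially effective on responses that are semantically distant from the gold reference responses.

03 Evaluates robustly despite the one-to-many nature of open-domain dialogue, where several different responses can suit the same context.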

Abstract

The long-standing one-to-many issue of open-domain dialogue poses significant challenges for automatic evaluation methods: for a given conversational context, there may be multiple suitable responses that differ in semantics. To tackle this challenge, we propose a novel learning-based automatic evaluation metric (CMN), which can robustly evaluate open-domain dialogues by augmenting Conditional Variational Autoencoders (CVAEs) with a Next Sentence Prediction (NSP) objective and employing Mutual Information (MI) to model the semantic similarity of text in the latent space. Experimental results on two open-domain dialogue datasets demonstrate the superiority of our method compared with a wide range of baselines, especially in handling responses that are semantically distant from the golden reference responses.
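
The one-to-many issue described in the abstract can be illustrated with a small sketch. This is not the CMN metric: it uses a plain unigram-overlap F1 as a stand-in for reference-based scoring, and the function name `unigram_f1` and the example sentences are made up for illustration. It shows how a response that suits the context but differs from the reference in wording and meaning receives a score of zero.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate response and a reference.

    Hypothetical helper for illustration only; it is not part of CMN.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection: each shared token counts min(count) times.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up example: both responses are plausible replies to the context.
context = "What are you doing this weekend?"
reference = "I am going hiking with my friends."
close_response = "I am going hiking with my family."
distant_response = "Probably just catching up on sleep."

close_score = unigram_f1(close_response, reference)
distant_score = unigram_f1(distant_response, reference)

print(f"close to reference:     {close_score:.2f}")
print(f"distant from reference: {distant_score:.2f}")
```

Both candidate responses answer the question, yet only the one that shares words with the reference scores well. This is the failure mode that motivates comparing responses in a learned latent space instead of on the surface form.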

Code & Models

Repositories

bernard-yang/cmn-acl2023
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems