GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems

Lishan Huang; Zheng Ye; Jinghui Qin; Liang Lin; Xiaodan Liang

arXiv:2010.03994·cs.CL·October 9, 2020·6 cites

TL;DR

GRADE is a graph-based evaluation metric that assesses dialogue coherence by modeling topic transitions and reasoning over topic-level dialogue graphs; it correlates more strongly with human judgments than existing metrics.

Contribution

The paper introduces GRADE, a new metric that combines utterance-level contextualized representations with topic-level graph representations to evaluate dialogue coherence.

Findings

01 GRADE outperforms state-of-the-art metrics in correlation with human judgments.

02 Incorporates topic transition dynamics via dialogue graphs.

03 Provides a large-scale human evaluation benchmark.
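
The first finding rests on correlation between metric scores and human ratings. As a minimal sketch of how such a comparison is commonly computed, the snippet below implements Pearson and Spearman correlation in plain Python. The function names and the toy numbers are illustrative assumptions; they are not taken from the paper or its benchmark.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical toy data: human coherence ratings and one metric's scores.
human_ratings = [1, 2, 3, 4, 5]
metric_scores = [0.1, 0.3, 0.2, 0.8, 0.9]
rho = spearman(human_ratings, metric_scores)
```

A metric with a higher correlation against the human ratings tracks human judgment more closely on that sample, which is the sense in which one metric is said to outperform another here.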

Abstract

Automatically evaluating dialogue coherence is a challenging but high-demand ability for developing high-quality open-domain dialogue systems. However, current evaluation metrics consider only surface features or utterance-level semantics, without explicitly considering the fine-grained topic transition dynamics of dialogue flows. Here, we first consider that the graph structure constituted with topics in a dialogue can accurately depict the underlying communication logic, which is a more natural way to produce persuasive metrics. Capitalizing on the topic-level dialogue graph, we propose a new evaluation metric GRADE, which stands for Graph-enhanced Representations for Automatic Dialogue Evaluation. Specifically, GRADE incorporates both coarse-grained utterance-level contextualized representations and fine-grained topic-level graph representations to evaluate dialogue coherence. The…
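
To make the idea of pairing an utterance-level representation with a topic-level graph representation concrete, here is a toy sketch in plain Python. It is not the authors' implementation. The edge rule (linking every context keyword to every response keyword), the mean aggregation, the concatenation, and the sigmoid scorer are all assumptions chosen for illustration, and every function name is hypothetical.

```python
from math import exp


def build_topic_graph(context_keywords, response_keywords):
    """Nodes are keywords; each context keyword is linked to each response keyword."""
    nodes = list(dict.fromkeys(context_keywords + response_keywords))
    index = {word: i for i, word in enumerate(nodes)}
    edges = {i: set() for i in range(len(nodes))}
    for c in context_keywords:
        for r in response_keywords:
            if c != r:
                edges[index[c]].add(index[r])
                edges[index[r]].add(index[c])
    return nodes, edges


def aggregate(features, edges):
    """One round of mean aggregation over each node and its neighbours."""
    out = []
    for i, vec in enumerate(features):
        group = [vec] + [features[j] for j in sorted(edges[i])]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


def mean_pool(vectors):
    """Average node vectors into a single graph-level vector."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def coherence_score(utterance_vec, topic_vec, weights, bias=0.0):
    """Concatenate both representations and squash a linear score into (0, 1)."""
    fused = utterance_vec + topic_vec
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1 / (1 + exp(-z))


# Hypothetical inputs: keywords and made-up 2-d node features.
nodes, edges = build_topic_graph(["music", "guitar"], ["band"])
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
topic_vec = mean_pool(aggregate(node_features, edges))
utterance_vec = [0.2, 0.4]  # stand-in for a contextualized encoder output
score = coherence_score(utterance_vec, topic_vec, weights=[0.5, -0.1, 0.3, 0.2])
```

In a trained system the utterance vector would come from a contextual encoder and the weights would be learned; the fixed numbers above only show how the two representations could feed a single score.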

Code & Models

Repositories

li3cmz/GRADE
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech and dialogue systems