Rethinking and Refining the Distinct Metric

Siyang Liu; Sahand Sabour; Yinhe Zheng; Pei Ke; Xiaoyan Zhu; Minlie Huang

arXiv:2202.13587·cs.CL·April 5, 2022


TL;DR

This paper introduces the Expectation-Adjusted Distinct (EAD) metric, which refines the calculation of diversity scores in language generation by removing biases related to sequence length, leading to better correlation with human judgments.

Contribution

The paper proposes a novel bias-corrected distinct score, EAD, with empirical and theoretical validation, improving diversity evaluation in language models.

Findings

01. EAD correlates better with human judgments

02. Original distinct scores are biased against longer sequences

03. EAD effectively removes length-related biases
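The length bias in the second finding can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes tokens are drawn uniformly at random from a fixed vocabulary, and the names `distinct_n`, `sample_tokens`, and `VOCAB_SIZE` are hypothetical. Under that assumption the original score falls as sequences get longer, even though the sampling process is equally diverse at every length.

```python
import random


def distinct_n(tokens, n=1):
    """Original Distinct-n: unique n-grams divided by total n-grams."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def sample_tokens(length, vocab_size, rng):
    """Hypothetical generator: draws token ids uniformly from the vocabulary."""
    return [rng.randrange(vocab_size) for _ in range(length)]


rng = random.Random(0)   # fixed seed so the run is repeatable
VOCAB_SIZE = 1000        # assumed vocabulary size, chosen for illustration

# Same sampling process, three different lengths.
scores = {
    length: distinct_n(sample_tokens(length, VOCAB_SIZE, rng))
    for length in (100, 1000, 10000)
}

for length, score in scores.items():
    print(f"length={length:>6}  distinct-1={score:.3f}")
```

The score must shrink once the sequence is longer than the vocabulary, because the number of unique tokens is capped at `VOCAB_SIZE` while the denominator keeps growing.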

Abstract

Distinct-$n$ score (Li et al., 2016) is a widely used automatic metric for evaluating diversity in language generation tasks. However, we observed that the original approach for calculating distinct scores has evident biases that tend to assign higher penalties to longer sequences. We refine the calculation of distinct scores by scaling the number of distinct tokens based on their expectations. We provide both empirical and theoretical evidence to show that our method effectively removes the biases existing in the original distinct score. Our experiments show that our proposed metric, Expectation-Adjusted Distinct (EAD), correlates better with human judgment in evaluating response diversity. To foster future research, we provide an example implementation at https://github.com/lsy641/Expectation-Adjusted-Distinct.
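The scaling step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' reference implementation, which lives in the repository linked above. It assumes the expectation is taken under uniform sampling from a vocabulary of size V, in which case the expected number of distinct tokens among C tokens is V·[1 − ((V − 1)/V)^C]. The function names are hypothetical, and the sketch covers unigrams only.

```python
import random


def expected_distinct(total, vocab_size):
    """Expected number of distinct tokens among `total` uniform draws
    from a vocabulary of `vocab_size` tokens (assumed sampling model)."""
    return vocab_size * (1.0 - ((vocab_size - 1) / vocab_size) ** total)


def ead_unigram(tokens, vocab_size):
    """Sketch of an expectation-adjusted distinct score for unigrams:
    observed distinct tokens divided by their expected count,
    instead of by the raw sequence length."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / expected_distinct(len(tokens), vocab_size)


# Usage: under uniform sampling the adjusted score stays close to 1
# at every length, unlike the original length-normalized score.
rng = random.Random(0)
VOCAB_SIZE = 1000  # assumed vocabulary size, chosen for illustration
for length in (100, 1000, 10000):
    sample = [rng.randrange(VOCAB_SIZE) for _ in range(length)]
    print(f"length={length:>6}  EAD={ead_unigram(sample, VOCAB_SIZE):.3f}")
```

In practice the choice of `vocab_size` matters, since it sets the upper bound on how many distinct tokens can appear. Which vocabulary to use (for example, the model's tokenizer vocabulary) is a decision this sketch leaves open.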


Code & Models

Repositories

lsy641/expectation-adjusted-distinct (official)


Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Speech and dialogue systems