SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation

Longxuan Ma; Ziyu Zhuang; Weinan Zhang; Mingda Li; Ting Liu

arXiv:2208.08094 · cs.CL · September 19, 2022 · 1 citation

TL;DR

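SelF-Eval is a self-supervised framework for fine-grained dialogue evaluation. It models the correlation between the quality of individual turns and the quality of the whole dialogue, outperforms state-of-the-art models on multiple benchmarks, and is highly consistent with human judgments.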

Contribution

It introduces a novel automatic data construction method and a multi-level contrastive learning schema for dialogue quality assessment.
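
As a rough illustration of what automatic fine-grained scoring could look like, the sketch below corrupts a dialogue at several levels and labels each version with the fraction of turns left intact. The corruption strategy (swapping turns for random distractor turns), the scoring rule, and the names `build_graded_samples`, `distractor_turns`, and `num_levels` are assumptions made for this example; the text above does not specify the actual construction procedure.

```python
import random


def build_graded_samples(dialogue, distractor_turns, num_levels, seed=0):
    """Return (dialogue_variant, score) pairs, one per corruption level.

    Assumption for illustration only: a variant's score is the fraction
    of original turns it keeps, so more replaced turns means lower quality.
    """
    rng = random.Random(seed)
    n = len(dialogue)
    samples = []
    for level in range(num_levels + 1):
        # Number of turns to replace at this level (0 at level 0, n at the top level).
        k = round(level * n / num_levels)
        positions = set(rng.sample(range(n), k))
        variant = [
            rng.choice(distractor_turns) if i in positions else turn
            for i, turn in enumerate(dialogue)
        ]
        samples.append((variant, 1 - k / n))
    return samples


dialogue = ["Hi!", "Hello, how are you?", "Fine, thanks.", "Glad to hear it."]
distractors = ["The train leaves at noon.", "I prefer tea."]
samples = build_graded_samples(dialogue, distractors, num_levels=4)
for variant, score in samples:
    print(score, variant)
```

Because the labels come from the corruption level rather than from annotators, a scheme of this kind can be applied to any dialogue corpus without human scoring.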

Findings

01 Highly consistent with human evaluations

02 Outperforms state-of-the-art models

03 Effective on multiple benchmarks

Abstract

This paper introduces a novel Self-supervised Fine-grained Dialogue Evaluation framework (SelF-Eval). The core idea is to model the correlation between turn quality and the quality of the entire dialogue. We first propose a novel data construction method that automatically assigns fine-grained scores to arbitrary dialogue data. Then we train SelF-Eval with a multi-level contrastive learning schema, which helps it distinguish different score levels. Experimental results on multiple benchmarks show that SelF-Eval is highly consistent with human evaluations and better than the state-of-the-art models. We give a detailed analysis of the experiments in this paper. Our code is available on GitHub.
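
To make the idea of separating score levels concrete, here is a minimal sketch of a pairwise margin ranking loss over graded samples. This is a stand-in chosen for illustration, not the training objective from the paper: the abstract does not give the loss, and the function name `multi_level_margin_loss`, the `margin` value, and the rule that the required gap grows with the level difference are all assumptions.

```python
def multi_level_margin_loss(predicted, levels, margin=0.1):
    """Average hinge loss over all pairs of samples with different levels.

    For a pair where levels[i] > levels[j], the predicted score of i should
    exceed that of j by at least margin * (levels[i] - levels[j]).
    Hypothetical objective, used here only to illustrate multi-level separation.
    """
    total, pairs = 0.0, 0
    for i in range(len(predicted)):
        for j in range(len(predicted)):
            gap = levels[i] - levels[j]
            if gap > 0:
                required = margin * gap
                total += max(0.0, required - (predicted[i] - predicted[j]))
                pairs += 1
    return total / pairs if pairs else 0.0


levels = [2, 1, 0]  # higher level = better dialogue
well_ordered = multi_level_margin_loss([0.9, 0.5, 0.1], levels)
mis_ordered = multi_level_margin_loss([0.1, 0.5, 0.9], levels)
print(well_ordered, mis_ordered)
```

Predictions that respect the level ordering with enough separation incur no loss, while predictions that invert the ordering are penalized more heavily the further apart the two levels are.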

Code & Models

Repositories

royny/self-eval (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques

Methods: Contrastive Learning