SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation
Longxuan Ma, Ziyu Zhuang, Weinan Zhang, Mingda Li, Ting Liu

TL;DR
SelF-Eval is a self-supervised framework for fine-grained dialogue evaluation that models the correlation between turn-level quality and the quality of the whole dialogue. On multiple benchmarks it outperforms state-of-the-art models and is highly consistent with human judgments.
Contribution
It introduces a novel automatic data construction method and a multi-level contrastive learning schema for dialogue quality assessment.
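The automatic data construction idea can be illustrated with a minimal sketch. This summary does not spell out the paper's actual procedure, so everything here is an assumption made for illustration: the hypothetical function `build_graded_samples`, the corruption strategy (replacing k turns with random distractor utterances), and the scoring rule (fraction of original turns kept).

```python
import random


def build_graded_samples(dialogue, distractor_pool, seed=0):
    """Return (perturbed_dialogue, score) pairs, one per corruption level.

    Assumption (not taken from the paper): a dialogue with k of its n turns
    replaced by unrelated utterances gets the score 1 - k / n.
    """
    if not dialogue:
        raise ValueError("dialogue must contain at least one turn")
    rng = random.Random(seed)
    n = len(dialogue)
    samples = []
    for k in range(n + 1):  # k = number of turns to replace
        positions = set(rng.sample(range(n), k))
        perturbed = [
            rng.choice(distractor_pool) if i in positions else turn
            for i, turn in enumerate(dialogue)
        ]
        samples.append((perturbed, 1.0 - k / n))
    return samples


if __name__ == "__main__":
    dialogue = ["Hi!", "Hello, how are you?", "Fine, thanks.", "Glad to hear it."]
    pool = ["The train leaves at noon.", "I prefer tea."]
    for turns, score in build_graded_samples(dialogue, pool):
        print(f"{score:.2f}", turns)
```

Because the labels come from the corruption level itself, no human annotation is needed, which is what makes a scheme of this kind self-supervised.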
Findings
Highly consistent with human evaluations
Outperforms state-of-the-art models
Effective on multiple benchmarks
Abstract
This paper introduces a novel Self-supervised Fine-grained Dialogue Evaluation framework (SelF-Eval). The core idea is to model the correlation between turn quality and the entire dialogue quality. We first propose a novel automatic data construction method that can automatically assign fine-grained scores to arbitrary dialogue data. Then we train SelF-Eval with a multi-level contrastive learning schema which helps to distinguish different score levels. Experimental results on multiple benchmarks show that SelF-Eval is highly consistent with human evaluations and better than the state-of-the-art models. We give a detailed analysis of the experiments in this paper. Our code is available on GitHub.
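One way to picture an objective that "helps to distinguish different score levels" is a pairwise margin ranking loss over samples with different levels. The sketch below is a stand-in, not the paper's multi-level contrastive loss, whose exact form is not given in this abstract. The function name `multi_level_ranking_loss`, the `margin` parameter, and the choice to scale the margin by the level gap are all assumptions.

```python
def multi_level_ranking_loss(pred_scores, levels, margin=0.1):
    """Average hinge loss over all pairs whose score levels differ.

    For a pair (i, j) with levels[i] > levels[j], the predicted score of i
    should exceed that of j by at least margin * (levels[i] - levels[j]).
    This scaling is an illustrative assumption, not the paper's formula.
    """
    if len(pred_scores) != len(levels):
        raise ValueError("pred_scores and levels must have the same length")
    total, pairs = 0.0, 0
    for i in range(len(levels)):
        for j in range(len(levels)):
            if levels[i] > levels[j]:
                required_gap = margin * (levels[i] - levels[j])
                actual_gap = pred_scores[i] - pred_scores[j]
                total += max(0.0, required_gap - actual_gap)
                pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    levels = [2, 1, 0]  # higher level = less corrupted dialogue
    print(multi_level_ranking_loss([0.9, 0.5, 0.1], levels))  # correctly ordered
    print(multi_level_ranking_loss([0.1, 0.5, 0.9], levels))  # inverted order
```

Scaling the margin with the level gap asks the model to separate distant levels more strongly than adjacent ones, which is one simple way to encode fine-grained rather than binary quality.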
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Contrastive Learning
