HaRiM$^+$: Evaluating Summary Quality with Hallucination Risk

Seonil Son; Junsoo Park; Jeong-in Hwang; Junghwa Lee; Hyungjong Noh; Yeonsoo Lee

arXiv:2211.12118·cs.CL·November 28, 2022

TL;DR

HaRiM+ is a reference-free, model-based metric for evaluating summary quality by measuring hallucination risk, achieving high correlation with human judgments without additional training.

Contribution

We introduce HaRiM+, a novel hallucination risk metric that requires no extra training and correlates well with human assessments of summary quality.

Findings

01

HaRiM+ achieves state-of-the-art correlation with human judgments on three summary-quality annotation sets: FRANK, QAGS, and SummEval.

02

It requires only an off-the-shelf summarization model and token likelihoods.

03

HaRiM+ is reference-free and easy to deploy.

Abstract

One of the challenges of developing a summarization model arises from the difficulty in measuring the factual inconsistency of the generated text. In this study, we reinterpret the decoder overconfidence-regularizing objective suggested by Miao et al. (2021) as a hallucination risk measurement to better estimate the quality of generated summaries. We propose a reference-free metric, HaRiM+, which only requires an off-the-shelf summarization model to compute the hallucination risk based on token likelihoods. Deploying it requires no additional training of models or ad-hoc modules, which usually need alignment to human judgments. For summary-quality estimation, HaRiM+ records state-of-the-art correlation to human judgment on three summary-quality annotation sets: FRANK, QAGS, and SummEval. We hope that our work, which merits the use of summarization models, facilitates the progress of…
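The abstract says the score is computed from token likelihoods but does not give the formula. The sketch below shows one way such a score could be assembled. It assumes a margin-based risk term, (1 - p_s2s) * (1 - (p_s2s - p_lm)) averaged over summary tokens, subtracted from the average token log-likelihood. Here p_s2s is the likelihood of each summary token given the source, and p_lm is the likelihood of the same token when the source is withheld. The function names, the exact form of the risk term, and the weight `lam` (with its placeholder default) are assumptions made for illustration, not details confirmed by this page.

```python
import math
from typing import Sequence


def harim(p_s2s: Sequence[float], p_lm: Sequence[float]) -> float:
    """Assumed hallucination-risk term, averaged over summary tokens.

    p_s2s[i]: likelihood of token i given the source and previous tokens.
    p_lm[i]:  likelihood of token i given previous tokens only (no source).
    The risk is high when the model is unsure of a token (low p_s2s) and
    the source contributes little beyond the language-model prior
    (small margin p_s2s - p_lm).
    """
    if len(p_s2s) != len(p_lm) or len(p_s2s) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    risks = [(1.0 - ps) * (1.0 - (ps - pl)) for ps, pl in zip(p_s2s, p_lm)]
    return sum(risks) / len(risks)


def harim_plus(
    p_s2s: Sequence[float],
    p_lm: Sequence[float],
    lam: float = 1.0,  # hypothetical weight; placeholder value, not from the text
) -> float:
    """Assumed quality score: mean token log-likelihood minus weighted risk."""
    avg_log_likelihood = sum(math.log(ps) for ps in p_s2s) / len(p_s2s)
    return avg_log_likelihood - lam * harim(p_s2s, p_lm)


# Toy likelihoods, invented for illustration only.
grounded = harim_plus([0.9, 0.8, 0.95], [0.2, 0.1, 0.3])
ungrounded = harim_plus([0.4, 0.3, 0.5], [0.4, 0.3, 0.5])
print(grounded > ungrounded)
```

In practice both likelihood sequences would come from the same off-the-shelf summarization model, scoring the candidate summary once with the source and once without it. Higher scores indicate lower estimated hallucination risk under these assumptions.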

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques