DocAsRef: An Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely
Forrest Sheng Bao, Ruixuan Tu, Ge Luo, Yinfei Yang, Hebi Li, Minghui Qiu, Youbiao He, Cen Chen

TL;DR
This study demonstrates that reference-based summary quality metrics can be adapted to function as reference-free metrics, with the repurposed zero-shot BERTScore outperforming many existing metrics across multiple datasets.
Contribution
The paper proposes repurposing reference-based metrics as reference-free summary quality evaluators by comparing a system summary against its source document instead of a human-written reference.
Findings
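The repurposed zero-shot BERTScore (using DeBERTa-large-MNLI, under 0.5B parameters) consistently outperforms its original reference-based version on the SummEval and Newsroom datasets.
The repurposed metric surpasses most existing reference-free metrics.
It closely rivals GPT-3.5-based evaluators.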
Abstract
Automated summary quality assessment falls into two categories: reference-based and reference-free. Reference-based metrics, historically deemed more accurate due to the additional information provided by human-written references, are limited by their reliance on human input. In this paper, we hypothesize that the comparison methodologies used by some reference-based metrics to evaluate a system summary against its corresponding reference can be effectively adapted to assess it against its source document, thereby transforming these metrics into reference-free ones. Experimental results support this hypothesis. After being repurposed reference-freely, the zero-shot BERTScore using the pretrained DeBERTa-large-MNLI model of <0.5B parameters consistently outperforms its original reference-based version across various aspects on the SummEval and Newsroom datasets. It also excels in…
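The repurposing idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a toy unigram-overlap F1 as a stand-in for a reference-based metric such as BERTScore, and the names `unigram_f1`, `score_reference_based`, and `score_reference_free` are hypothetical. The only thing it is meant to show is that the same comparison function is reused, with the source document passed where the reference would normally go.

```python
from collections import Counter
from typing import Callable

# A reference-based metric compares a candidate text against a target text.
Metric = Callable[[str, str], float]


def unigram_f1(candidate: str, target: str) -> float:
    """Toy stand-in metric (assumption): F1 over lowercase whitespace tokens."""
    cand = Counter(candidate.lower().split())
    tgt = Counter(target.lower().split())
    overlap = sum((cand & tgt).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(tgt.values())
    return 2 * precision * recall / (precision + recall)


def score_reference_based(metric: Metric, summary: str, reference: str) -> float:
    """Conventional use: compare the system summary with a human-written reference."""
    return metric(summary, reference)


def score_reference_free(metric: Metric, summary: str, document: str) -> float:
    """Repurposed use: compare the system summary with its source document."""
    return metric(summary, document)


# Hypothetical example texts, made up for illustration only.
document = "the cat sat on the mat while the dog slept by the door"
reference = "a cat sat on a mat"
summary = "the cat sat on the mat"

ref_based = score_reference_based(unigram_f1, summary, reference)
ref_free = score_reference_free(unigram_f1, summary, document)
print(f"reference-based: {ref_based:.3f}")
print(f"reference-free:  {ref_free:.3f}")
```

Any metric with the same two-argument shape could be dropped in for `unigram_f1`; the toy scores here say nothing about the results reported in the paper.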
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
Methods: Attention Is All You Need · Linear Layer · Weight Decay · Cosine Annealing · Adam · Multi-Head Attention · Residual Connection · Byte Pair Encoding
