WIDAR -- Weighted Input Document Augmented ROUGE
Raghav Jain, Vaibhav Mavi, Anubhav Jangra, Sriparna Saha

TL;DR
WIDAR is a new evaluation metric for text summarization that incorporates input documents alongside reference summaries, improving correlation with human judgments across multiple quality aspects.
Contribution
The paper introduces WIDAR, a versatile evaluation metric that enhances ROUGE by utilizing input documents, leading to better alignment with human assessments.
Findings
WIDAR correlates 26-82% better than ROUGE with human judgments, depending on the quality criterion.
WIDAR achieves comparable results to state-of-the-art metrics.
WIDAR requires less computational time than some existing metrics.
Abstract
Automatic text summarization has gained considerable traction thanks to recent advances in machine learning. However, evaluating the quality of a generated summary remains an open problem. The literature has widely adopted Recall-Oriented Understudy for Gisting Evaluation (ROUGE) as the standard evaluation metric for summarization. ROUGE nevertheless has long-established limitations, a major one being its dependence on the availability of a good-quality reference summary. In this work, we propose WIDAR, a metric that uses the input document in addition to the reference summary to evaluate the quality of the generated summary. The proposed metric is versatile, since it is designed to adapt the evaluation score according to the quality of the reference summary. The proposed metric correlates better than ROUGE by 26%, 76%, 82%, and…
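The general idea of scoring a summary against both the reference and the input document can be illustrated with a small sketch. This is not the WIDAR formula from the paper, which this page does not reproduce. It is a toy blend of unigram overlap (ROUGE-1 style) under assumed names: `rouge1_f1`, `rouge1_precision`, `blended_score`, and the `ref_weight` parameter are all hypothetical, and the whitespace tokenizer is a simplification.

```python
from collections import Counter


def tokenize(text):
    # Lowercased whitespace tokens; real ROUGE tooling also stems and normalizes.
    return text.lower().split()


def _overlap(candidate, target):
    # Clipped unigram overlap plus the token counts of both texts.
    cand, targ = Counter(tokenize(candidate)), Counter(tokenize(target))
    return sum((cand & targ).values()), sum(cand.values()), sum(targ.values())


def rouge1_f1(candidate, target):
    """Unigram-overlap F1 between a candidate and a target text."""
    overlap, n_cand, n_targ = _overlap(candidate, target)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / n_cand, overlap / n_targ
    return 2 * precision * recall / (precision + recall)


def rouge1_precision(candidate, target):
    """Fraction of candidate unigrams that also appear in the target."""
    overlap, n_cand, _ = _overlap(candidate, target)
    return overlap / n_cand if n_cand else 0.0


def blended_score(summary, reference, document, ref_weight=0.5):
    # ASSUMPTION: a plain convex combination, chosen only for illustration.
    # A lower ref_weight trusts the reference less and the input document more.
    ref_part = rouge1_f1(summary, reference)
    doc_part = rouge1_precision(summary, document)
    return ref_weight * ref_part + (1.0 - ref_weight) * doc_part


if __name__ == "__main__":
    print(blended_score("the dog", "the cat", "the dog ran", ref_weight=0.5))
```

In this toy setup, precision is used against the input document because the document is usually much longer than any summary, so recall against it would penalize every summary heavily. How the reference quality should set the weight is exactly the part the paper addresses and this sketch leaves open.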
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
