Revisiting Summarization Evaluation for Scientific Articles

Arman Cohan; Nazli Goharian

arXiv:1604.00400·cs.CL·April 5, 2016·36 cites

TL;DR

This paper critically examines the limitations of ROUGE for scientific article summarization evaluation and introduces SERA, a new relevance-based metric that correlates better with manual assessments.

Contribution

The paper reveals ROUGE's unreliability in scientific summarization and proposes SERA, a content relevance metric with higher correlation to manual scores.

Findings

01 ROUGE shows low reliability for scientific summaries.

02 Different ROUGE variants yield inconsistent correlations with manual scores.

03 SERA outperforms ROUGE in correlating with manual evaluation metrics.

Abstract

Evaluation of text summarization approaches has been mostly based on metrics that measure similarities of system-generated summaries with a set of human-written gold-standard summaries. The most widely used metric in summarization evaluation has been the ROUGE family. ROUGE relies solely on lexical overlaps between the terms and phrases in the sentences; therefore, in cases of terminology variations and paraphrasing, ROUGE is not as effective. Scientific article summarization is one such case that is different from general domain summarization (e.g. newswire data). We provide an extensive analysis of ROUGE's effectiveness as an evaluation metric for scientific summarization; we show that, contrary to the common belief, ROUGE is not very reliable in evaluating scientific summaries. We furthermore show how different variants of ROUGE result in very different correlations with the manual…
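The lexical-overlap limitation described in the abstract can be illustrated with a minimal sketch. The function below is a simplified, clipped unigram recall in the spirit of ROUGE-1; it is not the official ROUGE toolkit (no stemming, stopword handling, or multi-reference support), and the function name `unigram_recall` and the three example sentences are hypothetical, made up for illustration rather than taken from the paper.

```python
from collections import Counter


def unigram_recall(candidate: str, reference: str) -> float:
    """Clipped unigram recall: overlapping tokens / total reference tokens.

    A simplified stand-in for ROUGE-1 recall (assumption: whitespace
    tokenization, lowercasing only).
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    # Each reference token is matched at most as often as it appears
    # in the candidate (clipping).
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / sum(ref.values())


# Toy sentences (hypothetical): the paraphrase conveys roughly the same
# content as the reference but shares almost no surface vocabulary.
reference = "the drug reduces tumor growth in mice"
verbatim = "the drug reduces tumor growth in mice"
paraphrase = "this compound inhibits cancer progression in murine models"

score_verbatim = unigram_recall(verbatim, reference)
score_paraphrase = unigram_recall(paraphrase, reference)

print(f"verbatim:   {score_verbatim:.3f}")
print(f"paraphrase: {score_paraphrase:.3f}")
```

In this toy case the paraphrase shares only the token "in" with the reference, so a purely lexical score treats it as almost entirely wrong even though a human judge would likely rate the two as close in content. This is the kind of terminology variation the abstract identifies as a weakness of overlap-based metrics.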

Code & Models

Repositories

jessicalopezespejel/gesera

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques