Is Summary Useful or Not? An Extrinsic Human Evaluation of Text Summaries on Downstream Tasks

Xiao Pu; Mingqi Gao; Xiaojun Wan

arXiv:2305.15044·cs.CL·May 25, 2023·1 cites


TL;DR

This paper evaluates the usefulness of text summaries in downstream tasks using extrinsic human evaluation, revealing that summaries are more beneficial for overall judgment tasks than for question answering, and highlighting limitations of intrinsic metrics.

Contribution

It introduces a novel extrinsic evaluation framework for summaries across multiple downstream tasks and compares the effectiveness of different summarization models and automatic metrics.

Findings

01 Summaries improve performance in overall judgment tasks.

02 Fine-tuned models produce more consistent usefulness across tasks.

03 Intrinsic automatic metrics are effective for question answering but less so for other tasks.

Abstract

Research on automated text summarization relies heavily on human and automatic evaluation. While recent work on human evaluation mainly adopted intrinsic evaluation methods, judging the generic quality of text summaries, e.g. informativeness and coherence, our work focuses on evaluating the usefulness of text summaries with extrinsic methods. We carefully design three different downstream tasks for extrinsic human evaluation of summaries, i.e., question answering, text classification and text similarity assessment. We carry out experiments using system rankings and user behavior data to evaluate the performance of different summarization models. We find summaries are particularly useful in tasks that rely on an overall judgment of the text, while being less effective for question answering tasks. The results show that summaries generated by fine-tuned models lead to higher consistency…
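The abstract mentions comparing summarization systems through system rankings. One common way to check whether two evaluations agree on such a ranking is a rank correlation, for example Kendall's tau. The sketch below is only an illustration of that idea, not the authors' procedure: the function name `kendall_tau`, the system names, and every score are hypothetical placeholders, and the simple tau-a variant (no tie correction) is an assumption.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a between two equal-length score lists.

    Counts concordant and discordant pairs; tied pairs count as neither.
    This is a simplified variant without tie correction.
    """
    if len(xs) != len(ys):
        raise ValueError("score lists must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two systems to compare")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical placeholder scores, NOT results from the paper.
systems = ["sys_a", "sys_b", "sys_c", "sys_d"]
extrinsic_usefulness = [0.62, 0.70, 0.55, 0.66]  # e.g. accuracy on a downstream task
automatic_metric = [0.31, 0.35, 0.28, 0.30]      # e.g. some intrinsic metric score

tau = kendall_tau(extrinsic_usefulness, automatic_metric)
print(f"Kendall tau between the two rankings: {tau:.3f}")
```

A tau near 1 would mean the automatic metric orders the systems much as the extrinsic task does; a value near 0 would mean the two orderings are largely unrelated.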

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques