Is Summary Useful or Not? An Extrinsic Human Evaluation of Text Summaries on Downstream Tasks
Xiao Pu, Mingqi Gao, Xiaojun Wan

TL;DR
This paper evaluates the usefulness of text summaries in downstream tasks using extrinsic human evaluation, revealing that summaries are more beneficial for overall judgment tasks than for question answering, and highlighting limitations of intrinsic metrics.
Contribution
It introduces a novel extrinsic evaluation framework for summaries across multiple downstream tasks and compares the effectiveness of different summarization models and automatic metrics.
Findings
Summaries are particularly useful in tasks that rely on an overall judgment of the text, and less effective for question answering.
Summaries generated by fine-tuned models are more consistently useful across tasks.
Intrinsic automatic metrics are effective for question answering but less so for other tasks.
Abstract
Research on automated text summarization relies heavily on human and automatic evaluation. While recent work on human evaluation mainly adopted intrinsic evaluation methods, judging the generic quality of text summaries, e.g. informativeness and coherence, our work focuses on evaluating the usefulness of text summaries with extrinsic methods. We carefully design three different downstream tasks for extrinsic human evaluation of summaries, i.e., question answering, text classification and text similarity assessment. We carry out experiments using system rankings and user behavior data to evaluate the performance of different summarization models. We find summaries are particularly useful in tasks that rely on an overall judgment of the text, while being less effective for question answering tasks. The results show that summaries generated by fine-tuned models lead to higher consistency…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
