Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies

Dragomir R. Radev (University of Michigan); Hongyan Jing (Columbia University); Malgorzata Budzikowska (IBM TJ Watson Research Center)

arXiv:cs/0005020 · cs.CL · May 23, 2007 · 122 cites

Open Access

TL;DR

This paper introduces MEAD, a multi-document summarizer using cluster centroids, along with new evaluation techniques and user studies to assess summary quality and utility.

Contribution

The paper presents a novel multi-document summarizer, MEAD, and introduces utility-based and subsumption-based evaluation methods, validated through user studies.

Findings

1. MEAD effectively summarizes multiple documents using cluster centroids.

2. Utility-based evaluation correlates well with user preferences.

3. User studies demonstrate the effectiveness of the proposed summarization models.

Abstract

We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on sentence utility and subsumption, which we have applied to the evaluation of both single and multiple document summaries. Finally, we describe two user studies that test our models of multi-document summarization.
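
To make the idea of summarizing from a cluster centroid concrete, here is a minimal sketch of centroid-based sentence extraction. It is not MEAD's implementation: the abstract gives no scoring details, and in the paper the centroids come from a topic detection and tracking system, whereas this sketch computes one directly from the input documents. The function names (`tokenize`, `centroid`, `summarize`), the `idf` lookup table, the default IDF of 1.0, and the `top_n` cutoff are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def centroid(documents, idf, top_n=10, default_idf=1.0):
    """Build a pseudo-document of the highest-weighted words in a cluster.

    documents: list of documents, each a list of sentence strings.
    idf: hypothetical word -> IDF table (assumed to come from a background
         corpus); words missing from the table get default_idf.
    Returns a dict mapping each centroid word to its weight, defined here
    as (average term count per document) * IDF.
    """
    n_docs = len(documents)
    term_counts = Counter()
    for sentences in documents:
        for sentence in sentences:
            term_counts.update(tokenize(sentence))
    weights = {
        word: (count / n_docs) * idf.get(word, default_idf)
        for word, count in term_counts.items()
    }
    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_n])


def summarize(documents, idf, n_sentences=2, top_n=10):
    """Pick the sentences whose words overlap most with the centroid."""
    c = centroid(documents, idf, top_n=top_n)
    scored = []
    for doc_idx, sentences in enumerate(documents):
        for sent_idx, sentence in enumerate(sentences):
            # Score = sum of centroid weights of the distinct words present.
            score = sum(c.get(w, 0.0) for w in set(tokenize(sentence)))
            scored.append((score, doc_idx, sent_idx, sentence))
    best = sorted(scored, key=lambda t: (-t[0], t[1], t[2]))[:n_sentences]
    # Emit the chosen sentences in their original document order.
    return [s for _, _, _, s in sorted(best, key=lambda t: (t[1], t[2]))]
```

Given a small cluster of related news stories and an IDF table that zeroes out function words, `summarize(docs, idf, n_sentences=2)` returns the two sentences that carry the most centroid weight. A fuller system would likely combine this score with other signals and penalize redundant sentences, but those choices are outside what the abstract states.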

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques