Unsupervised Multi-document Summarization with Holistic Inference
Haopeng Zhang, Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Hongwei Wang, Jiawei Zhang, Dong Yu

TL;DR
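This paper introduces an unsupervised multi-document extractive summarization framework that pairs holistic beam search inference with the Subset Representative Index (SRI), a measure that balances the importance and diversity of a selected subset of sentences. It outperforms strong baselines on ROUGE scores and diversity measures.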
Contribution
It presents a holistic beam search inference method with the Subset Representative Index (SRI) for unsupervised extractive summarization, emphasizing diversity and importance.
Findings
Outperforms strong baselines by a significant margin on ROUGE scores and diversity measures
Demonstrates the importance of diversity in summaries
Effective on both small and large datasets
Abstract
Multi-document summarization aims to obtain core information from a collection of documents written on the same topic. This paper proposes a new holistic framework for unsupervised multi-document extractive summarization. Our method incorporates the holistic beam search inference method associated with the holistic measurements, named Subset Representative Index (SRI). SRI balances the importance and diversity of a subset of sentences from the source documents and can be calculated in unsupervised and adaptive manners. To demonstrate the effectiveness of our method, we conduct extensive experiments on both small and large-scale multi-document summarization datasets under both unsupervised and adaptive settings. The proposed method outperforms strong baselines by a significant margin, as indicated by the resulting ROUGE scores and diversity measures. Our findings also suggest that…
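The idea of scoring a whole candidate subset, rather than ranking sentences one at a time, can be illustrated with a minimal Python sketch. This is not the paper's method: the abstract does not give the SRI formula, so `subset_score` below is a hypothetical stand-in that adds an importance term (mean cosine similarity to a bag-of-words centroid of all sentences) to a diversity term (one minus mean pairwise similarity within the subset). The function names, the `lam` weight, and the bag-of-words representation are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def bow(sentence):
    """Bag-of-words vector as a Counter of lowercase whitespace tokens."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def subset_score(subset, vecs, centroid, lam=1.0):
    """Hypothetical stand-in for SRI (assumption, not the paper's formula):
    mean similarity to the centroid plus lam times (1 - mean pairwise similarity)."""
    importance = sum(cosine(vecs[i], centroid) for i in subset) / len(subset)
    pairs = list(combinations(subset, 2))
    if pairs:
        redundancy = sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)
    else:
        redundancy = 0.0
    return importance + lam * (1.0 - redundancy)


def holistic_beam_search(sentences, k=2, beam_size=3, lam=1.0):
    """Grow subsets one sentence at a time, keeping the beam_size best
    subsets as judged by the score of the whole subset."""
    vecs = [bow(s) for s in sentences]
    centroid = Counter()
    for v in vecs:
        centroid.update(v)

    beam = [()]  # start from the empty subset
    for _ in range(min(k, len(sentences))):
        candidates = set()
        for subset in beam:
            for i in range(len(sentences)):
                if i not in subset:
                    candidates.add(tuple(sorted(subset + (i,))))
        # Sort by descending score; break ties by index order for determinism.
        ranked = sorted(
            candidates,
            key=lambda s: (-subset_score(s, vecs, centroid, lam), s),
        )
        beam = ranked[:beam_size]
    return list(beam[0])


if __name__ == "__main__":
    docs = [
        "the storm flooded the city",
        "the storm flooded the city",
        "rescue teams evacuated residents",
    ]
    print(holistic_beam_search(docs, k=2, beam_size=3))
```

In this toy input the first two sentences are identical, so a subset containing both is penalized by the diversity term and the search prefers one copy plus the distinct third sentence. A greedy per-sentence ranking by importance alone would instead pick the two duplicates.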
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
