An Unsupervised Masking Objective for Abstractive Multi-Document News Summarization
Nikolai Vogler, Songlin Li, Yujie Xu, Yujian Mi, Taylor Berg-Kirkpatrick

TL;DR
This paper introduces an unsupervised masking approach for abstractive multi-document news summarization that achieves near-supervised performance without using ground-truth summaries.
Contribution
It proposes a novel unsupervised training objective: in each multi-document group, the source document with the highest lexical centrality is masked out, and the summarization model is trained to predict it.
Findings
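Outperforms previous unsupervised methods on the Multi-News dataset.
Surpasses the best supervised method in human evaluation, without access to any ground-truth summaries.
Compares how different lexical centrality measures, inspired by extractive summarization, affect final performance.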
Abstract
We show that a simple unsupervised masking objective can approach near supervised performance on abstractive multi-document news summarization. Our method trains a state-of-the-art neural summarization model to predict the masked out source document with highest lexical centrality relative to the multi-document group. In experiments on the Multi-News dataset, our masked training objective yields a system that outperforms past unsupervised methods and, in human evaluation, surpasses the best supervised method without requiring access to any ground-truth summaries. Further, we evaluate how different measures of lexical centrality, inspired by past work on extractive summarization, affect final performance.
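The target-selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes lexical centrality is the mean bag-of-words cosine similarity between a document and the others in its group, which is only one possible measure and may differ from those evaluated in the paper. The function names (`centrality_scores`, `make_training_pair`) are hypothetical, and the summarization model itself is omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption; any tokenizer could be used).
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centrality_scores(docs):
    # Score each document by its mean similarity to the rest of the group.
    bags = [Counter(tokenize(d)) for d in docs]
    scores = []
    for i, bag in enumerate(bags):
        sims = [cosine(bag, other) for j, other in enumerate(bags) if j != i]
        scores.append(sum(sims) / len(sims) if sims else 0.0)
    return scores


def make_training_pair(docs):
    # Mask out the most central document; it becomes the prediction target,
    # and the remaining documents become the model input.
    scores = centrality_scores(docs)
    target_idx = max(range(len(docs)), key=scores.__getitem__)
    sources = [d for i, d in enumerate(docs) if i != target_idx]
    return sources, docs[target_idx], target_idx


group = [
    "storm hit coast monday",
    "storm hit coast flooded roads",
    "flooded roads closed",
]
sources, target, target_idx = make_training_pair(group)
```

In this toy group the second document overlaps with both of the others, so it is selected as the masked target and the other two serve as input. No ground-truth summary is involved at any point, which is what makes the objective unsupervised.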
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
