PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization
Wen Xiao, Iz Beltagy, Giuseppe Carenini, Arman Cohan

TL;DR
PRIMERA is a novel pre-trained model designed for multi-document summarization that effectively captures and aggregates information across documents, reducing the need for extensive fine-tuning and dataset-specific adjustments.
Contribution
It introduces a new pre-training objective and an efficient transformer architecture tailored for multi-document summarization tasks.
Findings
Outperforms state-of-the-art dataset-specific and pre-trained models in most settings across 6 multi-document summarization datasets from 3 domains
Effective in zero-shot, few-shot, and fully supervised settings
Reduces reliance on large labeled datasets
Abstract
We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data. PRIMERA uses our newly proposed pre-training objective designed to teach the model to connect and aggregate information across documents. It also uses efficient encoder-decoder transformers to simplify the processing of concatenated input documents. With extensive experiments on 6 multi-document summarization datasets from 3 different domains on zero-shot, few-shot and full-supervised settings, PRIMERA outperforms current state-of-the-art dataset-specific and pre-trained models on most of these settings with large margins. The code and pre-trained models can be found at https://github.com/allenai/PRIMER.
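The abstract mentions processing concatenated input documents with an encoder-decoder transformer. As a rough illustration of what such concatenation could look like, the sketch below joins a document cluster into a single sequence under a fixed length budget. It is not taken from the PRIMERA codebase: the separator string `<doc-sep>`, the function name `concatenate_documents`, the equal per-document truncation, and the use of whitespace splitting in place of a real subword tokenizer are all assumptions made for this example.

```python
from typing import List

# Hypothetical separator token; an assumption of this sketch.
DOC_SEP = "<doc-sep>"


def concatenate_documents(docs: List[str], max_tokens: int, sep: str = DOC_SEP) -> str:
    """Join documents into one input string, giving each an equal token share.

    Whitespace-separated words stand in for tokens here; a real pipeline
    would count tokens with the model's own tokenizer.
    """
    if not docs:
        return ""
    # Reserve one token per separator placed between adjacent documents.
    budget = max_tokens - (len(docs) - 1)
    per_doc = max(budget // len(docs), 0)
    # Truncate every document to the same share so none crowds out the others.
    truncated = [" ".join(doc.split()[:per_doc]) for doc in docs]
    return f" {sep} ".join(truncated)


if __name__ == "__main__":
    cluster = ["a b c d e", "f g", "h i j k"]
    print(concatenate_documents(cluster, max_tokens=11))
```

Truncating each document evenly is one simple way to keep every source represented in the input; other policies, such as allotting budget in proportion to document length, would fit the same interface.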
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · AdamW · Weight Decay · Softmax · Linear Warmup With Linear Decay · Residual Connection
