PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

Wen Xiao; Iz Beltagy; Giuseppe Carenini; Arman Cohan

arXiv:2110.08499·cs.CL·March 18, 2022·1 cites

TL;DR

PRIMERA is a novel pre-trained model designed for multi-document summarization that effectively captures and aggregates information across documents, reducing the need for extensive fine-tuning and dataset-specific adjustments.

Contribution

The paper introduces a new pre-training objective and pairs it with an efficient encoder-decoder transformer for multi-document summarization.

Findings

01. Outperforms state-of-the-art models on multiple datasets

02. Effective in zero-shot, few-shot, and fully supervised settings

03. Reduces reliance on large labeled datasets

Abstract

We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data. PRIMERA uses our newly proposed pre-training objective designed to teach the model to connect and aggregate information across documents. It also uses efficient encoder-decoder transformers to simplify the processing of concatenated input documents. With extensive experiments on 6 multi-document summarization datasets from 3 different domains on zero-shot, few-shot and full-supervised settings, PRIMERA outperforms current state-of-the-art dataset-specific and pre-trained models on most of these settings with large margins. The code and pre-trained models can be found at https://github.com/allenai/PRIMER.
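
The abstract mentions that the model processes concatenated input documents. As a rough illustration only, the sketch below joins a cluster of documents into one input string and gives each document an equal share of a length budget. The separator string `<doc-sep>`, the function name `concatenate_documents`, whitespace tokenization, and the equal-share truncation rule are all assumptions made for this example and are not taken from the text above.

```python
from typing import List

# Assumed separator marker for this sketch; not specified in the abstract.
DOC_SEP = "<doc-sep>"


def concatenate_documents(docs: List[str], max_tokens: int, sep: str = DOC_SEP) -> str:
    """Join a cluster of documents into a single input string.

    Each document receives an equal share of the whitespace-token budget,
    so one long document cannot crowd out the others. Separator markers
    are not counted against the budget in this simplified version.
    """
    if not docs:
        return ""
    per_doc = max(1, max_tokens // len(docs))
    truncated = [" ".join(doc.split()[:per_doc]) for doc in docs]
    return f" {sep} ".join(truncated)


cluster = [
    "Storm hits the coast on Monday .",
    "Officials report flooding after the storm .",
]
print(concatenate_documents(cluster, max_tokens=8))
# → Storm hits the coast <doc-sep> Officials report flooding after
```

A real pipeline would count subword tokens from the model's tokenizer rather than whitespace-separated words; the equal-share rule is just one simple way to keep every document represented in the concatenated input.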

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · AdamW · Weight Decay · Softmax · Linear Warmup With Linear Decay · Residual Connection