FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization

David Wan; Mohit Bansal

arXiv:2205.07830 · cs.CL · May 17, 2022 · 1 citation

TL;DR

FactPEGASUS is an abstractive summarization model that addresses factuality during both pre-training and fine-tuning, substantially improving factuality on three downstream tasks as judged by multiple automatic metrics and by humans.

Contribution

The paper introduces a factuality-aware pre-training objective and three complementary fine-tuning components (the corrector, the contrastor, and the connector) to improve the factuality of abstractive summaries.

Findings

01. Substantially improves factuality as measured by multiple automatic metrics and by human evaluation.

02. Maintains factuality more robustly in zero-shot and few-shot settings.

03. Its factuality gains do not come solely from being more extractive.

Abstract

We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning: (1) We augment the sentence selection strategy of PEGASUS's (Zhang et al., 2020) pre-training objective to create pseudo-summaries that are both important and factual; (2) We introduce three complementary components for fine-tuning. The corrector removes hallucinations present in the reference summary, the contrastor uses contrastive learning to better differentiate nonfactual summaries from factual ones, and the connector bridges the gap between the pre-training and fine-tuning for better transfer of knowledge. Experiments on three downstream tasks demonstrate that FactPEGASUS substantially improves factuality evaluated by multiple automatic metrics and humans. Our thorough analysis suggests that FactPEGASUS is more factual than using the original…
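The augmented sentence selection in (1) can be pictured with a minimal sketch. PEGASUS scores each sentence by its ROUGE overlap with the rest of the document; here a simple unigram F1 stands in for ROUGE. The `factuality` callable is a hypothetical scorer, and adding the two scores is an assumption of this sketch, not a detail confirmed by the abstract.

```python
from collections import Counter
from typing import Callable, List


def unigram_f1(candidate: str, reference: str) -> float:
    """ROUGE-1-style F1 on lowercased whitespace tokens (a simple stand-in)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def select_pseudo_summary(
    sentences: List[str],
    factuality: Callable[[str, str], float],
) -> int:
    """Return the index of the sentence with the highest combined score.

    `factuality(sentence, rest_of_document)` is a hypothetical scorer;
    summing it with the importance score is an assumption of this sketch.
    """
    best_index, best_score = -1, float("-inf")
    for i, sentence in enumerate(sentences):
        # The "rest of the document" is every sentence except the candidate.
        rest = " ".join(s for j, s in enumerate(sentences) if j != i)
        score = unigram_f1(sentence, rest) + factuality(sentence, rest)
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With a scorer that always returns 0.0, the selection reduces to importance alone; a scorer that rewards a different sentence can change which one is chosen as the pseudo-summary.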

Code & Models

Repositories

meetdavidwan/factpegasus (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Contrastive Learning