Planning with Learned Entity Prompts for Abstractive Summarization
Shashi Narayan, Yao Zhao, Joshua Maynez, Gonçalo Simoes, Vitaly Nikolaev, Ryan McDonald

TL;DR
This paper proposes an abstractive summarization method in which the model first generates a learned entity chain as an intermediate prompt and then the summary, improving entity specificity and faithfulness and achieving state-of-the-art ROUGE scores on XSum and SAMSum.
Contribution
It introduces a flexible entity chain prompting mechanism for summarization, enhancing content planning and hallucination control in transformer models.
Findings
Improves entity specificity and planning in summaries across multiple datasets.
Achieves state-of-the-art ROUGE scores on XSum and SAMSum.
Provides a mechanism to reduce hallucinations in generated summaries.
Abstract
We introduce a simple but flexible mechanism to learn an intermediate plan to ground the generation of abstractive summaries. Specifically, we prepend (or prompt) target summaries with entity chains -- ordered sequences of entities mentioned in the summary. Transformer-based sequence-to-sequence models are then trained to generate the entity chain and then continue generating the summary conditioned on the entity chain and the input. We experimented with both pretraining and finetuning with this content planning objective. When evaluated on CNN/DailyMail, XSum, SAMSum and BillSum, we demonstrate empirically that the grounded generation with the planning objective improves entity specificity and planning in summaries for all datasets, and achieves state-of-the-art performance on XSum and SAMSum in terms of Rouge. Moreover, we demonstrate empirically that planning with entity chains…
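The target construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the marker strings `[ENTITYCHAIN]` and `[SUMMARY]`, the `|` separator, the helper names, and the assumption that a list of candidate entities is already available (for example from an off-the-shelf entity tagger) are all hypothetical choices made for this example.

```python
from typing import List

# Hypothetical marker strings and separator, chosen only for illustration.
CHAIN_MARKER = "[ENTITYCHAIN]"
SUMMARY_MARKER = "[SUMMARY]"
SEPARATOR = " | "


def entity_chain(summary: str, candidate_entities: List[str]) -> List[str]:
    """Return the candidates that occur in the summary, ordered by first mention."""
    found = []
    for entity in candidate_entities:
        position = summary.find(entity)
        if position != -1:
            found.append((position, entity))
    # Sorting by character offset yields the order of first mention.
    return [entity for _, entity in sorted(found)]


def build_target(summary: str, candidate_entities: List[str]) -> str:
    """Prepend the entity chain to the summary to form the training target."""
    chain = entity_chain(summary, candidate_entities)
    parts = [CHAIN_MARKER]
    if chain:
        parts.append(SEPARATOR.join(chain))
    parts.extend([SUMMARY_MARKER, summary])
    return " ".join(parts)


if __name__ == "__main__":
    example_summary = "Frozen won the Oscar for best animated film in Los Angeles."
    example_entities = ["Los Angeles", "Frozen", "Oscar", "Disney"]
    print(build_target(example_summary, example_entities))
```

A sequence-to-sequence model trained on targets of this shape would emit the chain first and then the summary, so the summary tokens are conditioned on both the input document and the generated plan.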
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
