Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization

Lei Xu; Mohammed Asad Karim; Saket Dingliwal; Aparna Elangovan

arXiv:2410.02741 · cs.CL · December 4, 2024

TL;DR

This paper demonstrates that incorporating phrase-level salient keyphrases into prompts significantly improves the quality of LLM-generated summaries, with a new lightweight keyphrase extractor enhancing performance across models and datasets.

Contribution

The paper introduces SigExt, a lightweight, fine-tunable keyphrase extractor, and shows that adding phrase-level salient information to prompts improves the summaries produced by prompt-based LLMs.

Findings

01 Adding keyphrases improves ROUGE scores and recall.

02 Phrase-level salient info outperforms word- or sentence-level info.

03 Salient information can be effectively extracted with SigExt across models and datasets.
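The precision and recall behind the ROUGE findings above can be illustrated with a minimal unigram-overlap sketch. This is a simplified ROUGE-1 written for illustration only: it uses whitespace tokenization and lowercasing, with no stemming, and it is not the evaluation code used in the paper. The function name `rouge1` is hypothetical.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two texts.

    Assumption: whitespace tokenization and lowercasing only; real ROUGE
    implementations usually add stemming and finer tokenization.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge1("the cat sat", "the cat sat on the mat")
    print(scores)
```

Recall rises when the summary covers more of the reference tokens, while precision falls when the summary adds tokens the reference lacks, which is the trade-off the findings refer to.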

Abstract

Large language models (LLMs) can generate fluent summaries across domains using prompting techniques, reducing the need to train models for summarization applications. However, crafting effective prompts that guide LLMs to generate summaries with the appropriate level of detail and writing style remains a challenge. In this paper, we explore the use of salient information extracted from the source document to enhance summarization prompts. We show that adding keyphrases in prompts can improve ROUGE F1 and recall, making the generated summaries more similar to the reference and more complete. The number of keyphrases can control the precision-recall trade-off. Furthermore, our analysis reveals that incorporating phrase-level salient information is superior to word- or sentence-level. However, the impact on hallucination is not universally positive across LLMs. To conduct this analysis,…
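The idea of adding keyphrases to a summarization prompt can be sketched as follows. This is a minimal illustration under stated assumptions: the prompt wording is hypothetical and is not the template used in the paper, the keyphrase list is assumed to arrive already ranked by saliency from some extractor (such as SigExt, which is not shown here), and `build_prompt` and `max_keyphrases` are names invented for this example.

```python
def build_prompt(document: str, keyphrases: list[str], max_keyphrases: int = 10) -> str:
    """Build a summarization prompt that optionally lists salient keyphrases.

    Assumptions (not from the paper): the prompt wording below is a
    placeholder, and `keyphrases` is already sorted from most to least salient.
    """
    # Keeping only the top-ranked phrases is the knob that trades
    # precision against recall in the generated summary.
    selected = keyphrases[:max_keyphrases]
    prompt = f"Here is a document:\n{document}\n\n"
    if selected:
        prompt += "Consider including these key phrases: " + "; ".join(selected) + "\n\n"
    prompt += "Write a concise summary."
    return prompt


if __name__ == "__main__":
    print(build_prompt("Example article text.", ["alpha", "beta", "gamma"], max_keyphrases=2))
```

Raising `max_keyphrases` passes more phrases to the model, which the abstract associates with the recall side of the precision-recall trade-off; lowering it favors the precision side.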

Code & Models

Repositories

amazon-science/SigExt (PyTorch, official)

Taxonomy

Topics: Advanced Text Analysis Techniques · Topic Modeling · Natural Language Processing Techniques