Multi-round, Chain-of-thought Post-editing for Unfaithful Summaries
Yi-Hui Lee, Xiangci Li, Jessica Ouyang

TL;DR
This paper explores using large language models with chain-of-thought prompting for multi-round post-editing to improve the faithfulness of news summaries, reporting a higher editing success rate than prior work and performance comparable to fine-tuned post-editing models.
Contribution
It introduces a multi-round, chain-of-thought prompting approach for LLM-based post-editing of summaries, demonstrating improved faithfulness correction over single-round methods.
Findings
LLM-based faithfulness evaluation correlates strongly with human judgments.
Multi-round editing further improves summary accuracy.
Prompting with error-type reasoning is effective for factual correction.
Abstract
Recent large language models (LLMs) have demonstrated a remarkable ability to perform natural language understanding and generation tasks. In this work, we investigate the use of LLMs for evaluating faithfulness in news summarization, finding that they achieve a strong correlation with human judgments. We further investigate LLMs' capabilities as faithfulness post-editors, experimenting with different chain-of-thought prompts for locating and correcting factual inconsistencies between a generated summary and the source news document, and we achieve a higher editing success rate than was reported in prior work. We perform both automated and human evaluations of the post-edited summaries, finding that prompting LLMs using chain-of-thought reasoning about factual error types is an effective faithfulness post-editing strategy, performing comparably to fine-tuned post-editing models.…
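The multi-round post-editing loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the `FINAL:` output convention, the `max_rounds` default, and the `llm` and `is_faithful` callables are all assumptions introduced here for illustration. The caller supplies both callables, so no particular model API is implied.

```python
from typing import Callable, List


def build_prompt(document: str, summary: str) -> str:
    """Build a chain-of-thought post-editing prompt (hypothetical wording)."""
    return (
        "Source document:\n" + document + "\n\n"
        "Summary:\n" + summary + "\n\n"
        "Think step by step: identify any span in the summary that is not "
        "supported by the document, name the type of factual error, then "
        "rewrite the summary with minimal changes.\n"
        "End with a line starting with 'FINAL:' followed by the corrected summary."
    )


def parse_final(response: str, fallback: str) -> str:
    """Return the text after the last 'FINAL:' line, or the fallback if none exists."""
    for line in reversed(response.splitlines()):
        if line.startswith("FINAL:"):
            return line[len("FINAL:"):].strip()
    return fallback


def post_edit(
    document: str,
    summary: str,
    llm: Callable[[str], str],                # assumed: prompt in, model text out
    is_faithful: Callable[[str, str], bool],  # assumed: (document, summary) -> verdict
    max_rounds: int = 3,                      # assumed cap, not taken from the paper
) -> List[str]:
    """Edit the summary for up to max_rounds, stopping once it is judged faithful.

    Returns every version of the summary, starting with the original.
    """
    history = [summary]
    for _ in range(max_rounds):
        if is_faithful(document, history[-1]):
            break
        response = llm(build_prompt(document, history[-1]))
        history.append(parse_final(response, fallback=history[-1]))
    return history
```

Keeping the full edit history rather than only the last version makes it possible to compare rounds afterwards, for example to check whether a later round undid an earlier correction.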
Taxonomy
Topics: Digital Humanities and Scholarship
