What's Wrong? Refining Meeting Summaries with LLM Feedback
Frederic Kirstein, Terry Ruas, Bela Gipp

TL;DR
This paper presents a multi-LLM correction method for meeting summarization that identifies errors in generated summaries and refines them, improving relevance, informativeness, and coherence.
Contribution
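It introduces a two-phase multi-LLM correction approach (mistake identification, then summary refinement) and QMSum Mistake, a dataset of 200 automatically generated meeting summaries annotated by humans for nine error types, using the identified mistakes as feedback to improve summary quality.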
Findings
High accuracy in error identification by LLMs
Effective summary quality improvement through multi-LLM refinement
Potential applicability to other complex text generation tasks
Abstract
Meeting summarization has become a critical task since digital encounters have become a common practice. Large language models (LLMs) show great potential in summarization, offering enhanced coherence and context understanding compared to traditional methods. However, they still struggle to maintain relevance and avoid hallucination. We introduce a multi-LLM correction approach for meeting summarization using a two-phase process that mimics the human review process: mistake identification and summary refinement. We release QMSum Mistake, a dataset of 200 automatically generated meeting summaries annotated by humans on nine error types, including structural, omission, and irrelevance errors. Our experiments show that these errors can be identified with high accuracy by an LLM. We transform identified mistakes into actionable feedback to improve the quality of a given summary measured by…
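The two-phase process described in the abstract (mistake identification, then summary refinement) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the prompts, the `Mistake` class, the function names, and the one-question-per-error-type loop are all assumptions, and only the three error types named in the abstract are listed. The `stub_llm` function is a placeholder for a real model call so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List

# Only the three error types named in the abstract; the paper annotates nine.
ERROR_TYPES = ["structural", "omission", "irrelevance"]

# Any function mapping a prompt to a completion (hypothetical interface).
LLM = Callable[[str], str]


@dataclass
class Mistake:
    error_type: str
    explanation: str


def identify_mistakes(transcript: str, summary: str, llm: LLM) -> List[Mistake]:
    """Phase 1 (assumed design): query the model once per error type."""
    mistakes = []
    for error_type in ERROR_TYPES:
        prompt = (
            f"Transcript:\n{transcript}\n\nSummary:\n{summary}\n\n"
            f"Does the summary contain a(n) {error_type} error? "
            "Answer 'yes: <explanation>' or 'no'."
        )
        answer = llm(prompt).strip()
        if answer.lower().startswith("yes"):
            # Keep whatever follows the first colon as the explanation.
            explanation = answer.partition(":")[2].strip()
            mistakes.append(Mistake(error_type, explanation))
    return mistakes


def refine_summary(
    transcript: str, summary: str, mistakes: List[Mistake], llm: LLM
) -> str:
    """Phase 2 (assumed design): turn mistakes into feedback, ask for a revision."""
    if not mistakes:
        return summary  # nothing to fix, keep the summary unchanged
    feedback = "\n".join(f"- {m.error_type}: {m.explanation}" for m in mistakes)
    prompt = (
        "Revise the summary so that it addresses the feedback.\n\n"
        f"Transcript:\n{transcript}\n\nSummary:\n{summary}\n\n"
        f"Feedback:\n{feedback}"
    )
    return llm(prompt).strip()


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call; replies are hard-coded for the demo."""
    if prompt.startswith("Revise"):
        return "Revised summary."
    if "omission error" in prompt:
        return "yes: the budget decision is missing"
    return "no"


if __name__ == "__main__":
    transcript = "A: We agreed to cut the budget."
    summary = "The team met."
    found = identify_mistakes(transcript, summary, stub_llm)
    print(found)
    print(refine_summary(transcript, summary, found, stub_llm))
```

Because each phase takes its own `llm` argument, different models could be plugged in for identification and refinement, which is one way to read the "multi-LLM" framing; how the paper actually divides work between models is not stated in the text above.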
Taxonomy
Topics: Artificial Intelligence in Law · Natural Language Processing Techniques
