Towards a Robust Retrieval-Based Summarization System
Shengjie Liu, Jing Wu, Jingyuan Bao, Wenyi Wang, Naira Hovakimyan, Christopher G Healey

TL;DR
This paper evaluates and improves the robustness of large language models in retrieval-based summarization through a new evaluation framework and a fine-tuned system, demonstrating enhanced logical coherence and summarization quality.
Contribution
The paper introduces LogicSumm, a framework for realistic robustness evaluation, and SummRAG, a fine-tuned system that significantly improves LLM performance in complex summarization scenarios.
Findings
SummRAG improves logical coherence in summaries
Enhanced robustness in real-world scenarios
Open access to data and code
Abstract
This paper describes an investigation of the robustness of large language models (LLMs) for retrieval-augmented generation (RAG)-based summarization tasks. While LLMs provide summarization capabilities, their performance in complex, real-world scenarios remains under-explored. Our first contribution is LogicSumm, an innovative evaluation framework incorporating realistic scenarios to assess LLM robustness during RAG-based summarization. Based on limitations identified by LogicSumm, we then developed SummRAG, a comprehensive system to create training dialogues and fine-tune a model to enhance robustness within LogicSumm's scenarios. SummRAG is an example of our goal of defining structured methods to test the capabilities of an LLM, rather than addressing issues in a one-off fashion. Experimental results confirm the power of SummRAG, showcasing improved logical coherence and summarization…
Taxonomy
Topics: Topic Modeling · Web Data Mining and Analysis · Data Quality and Management
