An Answer is just the Start: Related Insight Generation for Open-Ended Document-Grounded QA
Saransh Sharma, Pritika Ramu, Aparna Garimella, Koyel Mukherjee

TL;DR
This paper introduces a new task, document-grounded related insight generation, and a dataset, SCOpE-QA, for improving open-ended document-grounded QA through richer user interaction and iterative refinement.
Contribution
It proposes InsightGen, a two-stage approach combining thematic clustering and neighborhood selection to generate diverse, relevant insights using large language models.
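The summary above names two stages, thematic clustering and neighborhood selection, but does not spell them out. The following is a toy sketch of what such a pipeline could look like, not the authors' implementation. It assumes word overlap (Jaccard similarity) as a crude stand-in for whatever representation the paper uses; the function names `cluster_passages` and `select_neighbors`, the `threshold` and `k` parameters, and the sample passages are all hypothetical. The LLM generation step is omitted.

```python
import re

def tokens(text):
    """Lowercased word set; a crude, assumed stand-in for a real embedding."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Overlap between two word sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_passages(passages, threshold=0.2):
    """Stage 1 (toy): greedily group passages into themes by word overlap."""
    clusters = []  # each cluster: {"vocab": set of words, "members": list of passages}
    for passage in passages:
        words = tokens(passage)
        best = max(clusters, key=lambda c: jaccard(words, c["vocab"]), default=None)
        if best is not None and jaccard(words, best["vocab"]) >= threshold:
            best["members"].append(passage)
            best["vocab"] |= words
        else:
            clusters.append({"vocab": set(words), "members": [passage]})
    return clusters

def select_neighbors(clusters, answer, k=1):
    """Stage 2 (toy): locate the answer's theme, then pick the k closest other themes."""
    answer_words = tokens(answer)
    home = max(clusters, key=lambda c: jaccard(answer_words, c["vocab"]))
    others = [c for c in clusters if c is not home]
    others.sort(key=lambda c: jaccard(home["vocab"], c["vocab"]), reverse=True)
    return home, others[:k]

if __name__ == "__main__":
    # Hypothetical passages standing in for a research collection.
    passages = [
        "transformers improve retrieval accuracy",
        "retrieval accuracy depends on transformers",
        "user studies measure interface trust",
        "interface trust affects retrieval adoption",
        "annotation cost limits dataset size",
    ]
    clusters = cluster_passages(passages)
    home, neighbors = select_neighbors(
        clusters, "transformers give better retrieval accuracy"
    )
    print("answer theme:", home["members"])
    for n in neighbors:
        print("neighboring theme:", n["members"])
```

In this sketch the neighboring themes, rather than the answer's own theme, would be the passages handed to a language model, on the assumption that related but not identical material is where additional insights come from.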
Findings
InsightGen produces useful, relevant, and actionable insights.
The dataset SCOpE-QA contains 3,000 questions across 20 research collections.
Evaluation shows the approach establishes a strong baseline for the new task.
Abstract
Answering open-ended questions remains challenging for AI systems because it requires synthesis, judgment, and exploration beyond factual retrieval, and users often refine answers through multiple iterations rather than accepting a single response. Existing QA benchmarks do not explicitly support this refinement process. To address this gap, we introduce a new task, document-grounded related insight generation, where the goal is to generate additional insights from a document collection that help improve, extend, or rethink an initial answer to an open-ended question, ultimately supporting richer user interaction and a better overall question answering experience. We curate and release SCOpE-QA (Scientific Collections for Open-Ended QA), a dataset of 3,000 open-ended questions across 20 research collections. We present InsightGen, a two-stage approach that first constructs a thematic…
