From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems
Youngjoon Jang, Seongtae Hong, Junyoung Son, Sungjin Park, Chanjun Park, Heuiseok Lim

TL;DR
This paper investigates how coreference resolution impacts retrieval-augmented generation systems, demonstrating that resolving entity ambiguity improves retrieval relevance and question-answering performance, especially in smaller models.
Contribution
The study systematically analyzes how coreference resolution affects RAG systems, showing that it benefits both retrieval effectiveness and generative accuracy, and compares pooling strategies for capturing context in retrieval.
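To make the idea of resolving coreference before retrieval concrete, here is a minimal sketch of the rewriting step. It assumes the coreference clusters have already been produced by some external resolver, and it is not the authors' pipeline. The function name `resolve_mentions`, the span format, and the example sentence are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical span format: (start, end) token indices, end exclusive.
Span = Tuple[int, int]


def resolve_mentions(tokens: List[str], clusters: List[List[Span]]) -> List[str]:
    """Replace each later mention in a cluster with the cluster's first mention.

    Assumes mention spans do not overlap and that the first span in each
    cluster is the most explicit mention (an illustrative simplification).
    """
    # Map each replaceable mention's start index to (end index, replacement tokens).
    replacements: Dict[int, Tuple[int, List[str]]] = {}
    for cluster in clusters:
        head_start, head_end = cluster[0]
        head_tokens = tokens[head_start:head_end]
        for start, end in cluster[1:]:
            replacements[start] = (end, head_tokens)

    resolved: List[str] = []
    i = 0
    while i < len(tokens):
        if i in replacements:
            end, head_tokens = replacements[i]
            resolved.extend(head_tokens)  # substitute the explicit mention
            i = end                       # skip the original pronoun or alias
        else:
            resolved.append(tokens[i])
            i += 1
    return resolved


# Illustrative input: "She" refers to "Marie Curie", "it" to "the Nobel Prize".
tokens = "Marie Curie won the Nobel Prize . She shared it with Pierre .".split()
clusters = [[(0, 2), (7, 8)], [(3, 6), (9, 10)]]
print(" ".join(resolve_mentions(tokens, clusters)))
```

After rewriting, the second sentence names its entities explicitly, so a passage that contains only that sentence can still be matched to a query about Marie Curie.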
Findings
Coreference resolution improves retrieval relevance in RAG systems.
Smaller models benefit more from disambiguation in QA tasks.
Mean pooling outperforms other strategies after coreference resolution.
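For readers unfamiliar with pooling, here is a minimal sketch of how a single passage embedding can be derived from per-token vectors. Mean pooling is the strategy named above. The first-token and last-token variants are included only for contrast and are assumptions, since this summary does not say which other strategies were compared. All function names and the toy vectors are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(token_vecs: Sequence[Sequence[float]], mask: Sequence[int]) -> Vector:
    """Average the vectors of all non-padding tokens (mask value 1)."""
    kept = [vec for vec, m in zip(token_vecs, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


def first_token_pool(token_vecs: Sequence[Sequence[float]], mask: Sequence[int]) -> Vector:
    """Use only the first token's vector (CLS-style pooling)."""
    return list(token_vecs[0])


def last_token_pool(token_vecs: Sequence[Sequence[float]], mask: Sequence[int]) -> Vector:
    """Use only the last non-padding token's vector."""
    last_index = max(i for i, m in enumerate(mask) if m)
    return list(token_vecs[last_index])


# Toy example: three real tokens followed by one padding token.
token_vecs = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
print(mean_pool(token_vecs, mask))         # [3.0, 2.0]
print(first_token_pool(token_vecs, mask))  # [1.0, 0.0]
print(last_token_pool(token_vecs, mask))   # [5.0, 4.0]
```

Mean pooling draws on every token in the passage, so replacing a pronoun with an explicit entity name changes the pooled vector directly. The single-token variants register that change only through whatever the encoder propagates to the chosen position.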
Abstract
Retrieval-Augmented Generation (RAG) has emerged as a crucial framework in natural language processing (NLP), improving factual consistency and reducing hallucinations by integrating external document retrieval with large language models (LLMs). However, the effectiveness of RAG is often hindered by coreferential complexity in retrieved documents, introducing ambiguity that disrupts in-context learning. In this study, we systematically investigate how entity coreference affects both document retrieval and generative performance in RAG-based systems, focusing on retrieval relevance, contextual understanding, and overall response quality. We demonstrate that coreference resolution enhances retrieval effectiveness and improves question-answering (QA) performance. Through comparative analysis of different pooling strategies in retrieval tasks, we find that mean pooling demonstrates superior…
