From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems

Youngjoon Jang; Seongtae Hong; Junyoung Son; Sungjin Park; Chanjun Park; Heuiseok Lim

arXiv:2507.07847 · cs.CL · April 29, 2026

TL;DR

This paper investigates how coreference resolution impacts retrieval-augmented generation systems, demonstrating that resolving entity ambiguity improves retrieval relevance and question-answering performance, especially in smaller models.

Contribution

The study systematically analyzes the effect of coreference resolution on RAG systems, showing that it benefits both retrieval effectiveness and generative accuracy, and compares pooling strategies by how well they capture context.

Findings

01 Coreference resolution improves retrieval relevance in RAG systems.

02 Smaller models benefit more from disambiguation in QA tasks.

03 Mean pooling outperforms other strategies after coreference resolution.
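
To make the pooling finding concrete, the sketch below shows how a retriever might collapse per-token embeddings into a single passage vector. It is an illustration only, not the paper's implementation: the function names, the toy two-dimensional embeddings, and the choice of first-token and last-token pooling as the alternatives are all assumptions made for this example, and the page does not say which strategies the authors actually compared.

```python
from typing import List

Vector = List[float]


def mean_pool(tokens: List[Vector], mask: List[int]) -> Vector:
    """Average the embeddings of all unmasked (non-padding) tokens."""
    kept = [vec for vec, m in zip(tokens, mask) if m]
    if not kept:
        raise ValueError("no unmasked tokens to pool")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def first_token_pool(tokens: List[Vector], mask: List[int]) -> Vector:
    """Use the first token's embedding (CLS-style pooling)."""
    return list(tokens[0])


def last_token_pool(tokens: List[Vector], mask: List[int]) -> Vector:
    """Use the embedding of the last unmasked token."""
    kept = [vec for vec, m in zip(tokens, mask) if m]
    if not kept:
        raise ValueError("no unmasked tokens to pool")
    return list(kept[-1])


# Hypothetical token embeddings for one passage; the third position is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mask = [1, 1, 0]

mean_vec = mean_pool(tokens, mask)
first_vec = first_token_pool(tokens, mask)
last_vec = last_token_pool(tokens, mask)
```

Mean pooling draws on every real token in the passage, so a pronoun that has been replaced by its antecedent contributes that entity's signal to the passage vector. This is one plausible reading of why the strategy would pair well with coreference resolution, offered here as intuition rather than as a result reported by the paper.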

Abstract

Retrieval-Augmented Generation (RAG) has emerged as a crucial framework in natural language processing (NLP), improving factual consistency and reducing hallucinations by integrating external document retrieval with large language models (LLMs). However, the effectiveness of RAG is often hindered by coreferential complexity in retrieved documents, introducing ambiguity that disrupts in-context learning. In this study, we systematically investigate how entity coreference affects both document retrieval and generative performance in RAG-based systems, focusing on retrieval relevance, contextual understanding, and overall response quality. We demonstrate that coreference resolution enhances retrieval effectiveness and improves question-answering (QA) performance. Through comparative analysis of different pooling strategies in retrieval tasks, we find that mean pooling demonstrates superior…
