Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models

Mingda Li; Xinyu Li; Yifan Chen; Wenfeng Xuan; Weinan Zhang

arXiv:2405.20680 · cs.AI · March 7, 2025

TL;DR

This paper investigates the inconsistency issues in Retrieval-Augmented Large Language Models, analyzes their causes, and proposes an ensemble retriever framework to improve factual accuracy and reduce errors.

Contribution

It provides a theoretical decomposition of RALM degeneration, identifies key factors causing inconsistency, and introduces EoR, a trainable ensemble retriever to enhance performance.

Findings

01. EoR significantly reduces inconsistent behaviors in RALMs.

02. Analysis identifies innate differences between knowledge sources and unpredictable degeneration of the reader model as the main causes of inconsistency.

03. EoR improves open-domain QA performance over single-retriever RALMs.

Abstract

Although Retrieval-Augmented Large Language Models (RALMs) demonstrate their superiority in terms of factuality, they do not consistently outperform the original retrieval-free Language Models (LMs). Our experiments reveal that this example-level performance inconsistency exists not only between retrieval-augmented and retrieval-free LM but also among different retrievers. To understand this phenomenon, we investigate the degeneration behavior of RALMs and theoretically decompose it into four categories. Further analysis based on our decomposition reveals that the innate difference in knowledge sources and the unpredictable degeneration of the reader model contribute most to the inconsistency. Drawing from our analysis, we introduce Ensemble of Retrievers (EoR), a trainable framework that can adaptively retrieve from different knowledge sources and effectively decrease unpredictable…
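The example-level inconsistency described in the abstract can be illustrated with a small sketch. It is not the paper's metric or code: the function names, the system names, and the correctness labels below are all hypothetical, and "disagreement rate" is assumed here to mean the fraction of examples on which exactly one of two systems answers correctly.

```python
from itertools import combinations


def disagreement_rate(correct_a, correct_b):
    """Fraction of examples on which exactly one of two systems is correct."""
    if len(correct_a) != len(correct_b):
        raise ValueError("both systems must be scored on the same examples")
    if not correct_a:
        raise ValueError("need at least one example")
    differing = sum(a != b for a, b in zip(correct_a, correct_b))
    return differing / len(correct_a)


def pairwise_disagreement(results):
    """Map each pair of system names to its disagreement rate."""
    return {
        (x, y): disagreement_rate(results[x], results[y])
        for x, y in combinations(sorted(results), 2)
    }


# Hypothetical per-example correctness labels (assumed data, not from the paper).
results = {
    "no_retrieval": [True, False, True, False],
    "retriever_a": [True, True, False, False],
    "retriever_b": [False, True, True, False],
}

rates = pairwise_disagreement(results)
for pair, rate in rates.items():
    print(pair, rate)
```

In this toy data every system has the same aggregate accuracy, yet each pair disagrees on half of the examples, which is the kind of gap that aggregate scores alone would hide.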

Code & Models

Repositories

mingdali6717/ensemble-of-retrievers · PyTorch · Official

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques