Assessing "Implicit" Retrieval Robustness of Large Language Models

Xiaoyu Shen; Rexhina Blloshmi; Dawei Zhu; Jiahuan Pei; Wei Zhang

arXiv:2406.18134·cs.CL·June 27, 2024

Open Access

TL;DR

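This paper investigates whether large language models can implicitly handle retrieval errors, answering directly without an explicit relevance judgment. It shows that fine-tuning on a mix of gold and distracting context improves robustness to irrelevant retrievals while preserving accuracy when retrieval is correct.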

Contribution

The paper demonstrates that end-to-end fine-tuning on a mix of gold and distracting contexts enhances retrieval robustness without requiring explicit relevance assessment.

Findings

01. Fine-tuning on gold and distracting contexts improves robustness.

02. Models can implicitly handle irrelevant retrieved information.

03. Explicit relevance judgment may be unnecessary for robustness.

Abstract

Retrieval-augmented generation has gained popularity as a framework to enhance large language models with external knowledge. However, its effectiveness hinges on the retrieval robustness of the model. If the model lacks retrieval robustness, its performance is constrained by the accuracy of the retriever, resulting in significant compromises when the retrieved context is irrelevant. In this paper, we evaluate the "implicit" retrieval robustness of various large language models, instructing them to directly output the final answer without explicitly judging the relevance of the retrieved context. Our findings reveal that fine-tuning on a mix of gold and distracting context significantly enhances the model's robustness to retrieval inaccuracies, while still maintaining its ability to extract correct answers when retrieval is accurate. This suggests that large language models can…
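
The fine-tuning setup described in the abstract, a mix of gold and distracting context with the model trained to output the final answer directly, can be illustrated with a minimal sketch. This is not the authors' code: the names `QAExample` and `build_mixed_examples`, the prompt template, and the `distract_ratio` parameter are all hypothetical, and the paper's actual mixing ratios and data format are not stated here.

```python
import random
from dataclasses import dataclass


@dataclass
class QAExample:
    question: str
    answer: str
    gold_context: str  # passage that actually contains the answer


def build_mixed_examples(examples, distractor_pool, distract_ratio=0.5, seed=0):
    """Pair each question with either its gold context or a distractor.

    The training target is always the final answer; no relevance label is
    exposed to the model, so any robustness has to be learned implicitly.
    `distract_ratio` is an assumed knob, not a value from the paper.
    """
    rng = random.Random(seed)
    mixed = []
    for ex in examples:
        use_distractor = rng.random() < distract_ratio
        if use_distractor:
            # Pick any passage other than the gold one as the distractor.
            candidates = [c for c in distractor_pool if c != ex.gold_context]
            context = rng.choice(candidates)
        else:
            context = ex.gold_context
        # Hypothetical prompt template: context, question, then the answer slot.
        prompt = f"Context: {context}\nQuestion: {ex.question}\nAnswer:"
        mixed.append({
            "prompt": prompt,
            "target": ex.answer,             # the answer only, no relevance judgment
            "is_gold": not use_distractor,   # bookkeeping only, never shown to the model
        })
    return mixed
```

Setting `distract_ratio` to 0.0 would reproduce gold-only fine-tuning and 1.0 distractor-only fine-tuning, so a sketch like this can generate the different training mixes one might compare.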

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques