How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?
Siye Wu, Jian Xie, Jiangjie Chen, Tinghui Zhu, Kai Zhang, Yanghua Xiao

TL;DR
This paper investigates how large language models are affected by irrelevant information retrieved from external sources, revealing their vulnerability to semantically related distractions and limitations of current mitigation methods.
Contribution
The study introduces a framework for generating high-quality irrelevant information and uses it to analyze LLM robustness, showing that LLMs have difficulty discriminating semantically related distractions and that existing solutions fall short.
Findings
LLMs struggle to distinguish highly semantically related irrelevant information.
Current retrieval systems often retrieve misleading yet relevant-looking data.
Existing methods have limited effectiveness in improving LLM robustness to irrelevant inputs.
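The first two findings can be made concrete with a toy sketch. It is not the paper's framework or any particular retrieval system: it assumes a plain bag-of-words cosine similarity as a stand-in for a retriever's scoring function, and the question, the three passages, and all function names are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity between two strings (toy retriever score)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy data: one passage answers the question, one is topically
# related but does not answer it, and one is unrelated.
question = "Who directed the film Inception?"
passages = {
    "relevant": "Inception is a 2010 film directed by Christopher Nolan.",
    "related_distractor": "The film Inception features a score composed by Hans Zimmer.",
    "unrelated": "Photosynthesis converts sunlight into chemical energy in plants.",
}

scores = {name: cosine_similarity(question, text) for name, text in passages.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

In this sketch the related distractor scores almost as high as the passage that actually answers the question, while the unrelated passage scores zero. A surface-similarity ranker would therefore place the distractor near the top of the retrieved list, which is the situation the findings describe.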
Abstract
By retrieving information from external knowledge databases, Large Language Models (LLMs) exhibit enhanced capabilities on many knowledge-intensive tasks. However, because of inherent flaws in current retrieval systems, the top-ranked retrieved passages may contain irrelevant information. In this work, we present a comprehensive investigation into the robustness of LLMs to different types of irrelevant information under various conditions. We first introduce a framework for constructing high-quality irrelevant information that ranges from semantically unrelated, to partially related, to related to the question. Furthermore, our analysis demonstrates that the constructed irrelevant information not only scores highly on similarity metrics, and is therefore readily retrieved by existing systems, but also bears semantic connections to the context. Our…
Taxonomy
Topics: Topic Modeling
