Pandora's Box or Aladdin's Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models

Jinyang Wu; Shuai Zhang; Feihu Che; Mingkuan Feng; Chuyuan Zhang; Pengpeng Shao; Jianhua Tao

arXiv:2408.13533 · cs.CL · June 3, 2025 · 2 cites

TL;DR

This paper introduces NoiserBench, a comprehensive evaluation framework for analyzing how seven distinct noise types affect large language models in retrieval-augmented generation. It finds that some noise types are beneficial while others are harmful.

Contribution

The paper defines seven noise types from a linguistic perspective, establishes a new benchmark (NoiserBench), and empirically evaluates the effects of these noise types on diverse LLMs, finding that some of them can improve model performance.

Findings

01

Some noise types are beneficial and can enhance LLM capabilities.

02

Harmful noise generally impairs performance.

Abstract

Retrieval-Augmented Generation (RAG) has emerged as a crucial method for addressing hallucinations in large language models (LLMs). While recent research has extended RAG models to complex noisy scenarios, these explorations often confine themselves to limited noise types and presuppose that noise is inherently detrimental to LLMs, potentially deviating from real-world retrieval environments and restricting practical applicability. In this paper, we define seven distinct noise types from a linguistic perspective and establish a Noise RAG Benchmark (NoiserBench), a comprehensive evaluation framework encompassing multiple datasets and reasoning tasks. Through empirical evaluation of eight representative LLMs with diverse architectures and scales, we reveal that these noises can be further categorized into two practical groups: noise that is beneficial to LLMs (aka beneficial noise) and…

Code & Models

Repositories

jinyangwu/NoiserBench
PyTorch · Official

Datasets

Jinyang23/NoiserBench
dataset · 20 dl

Taxonomy

Topics: Topic Modeling

Methods: Attention Is All You Need · Adam · Layer Normalization · Weight Decay · Dense Connections · WordPiece · Attention Dropout · Linear Warmup With Linear Decay · Byte Pair Encoding