ReFilter: Improving Robustness of Retrieval-Augmented Generation via Gated Filter
Yixin Chen, Ying Xiong, Shangyu Wu, Xiangrui Ke, Nan Guan, Chun Jason Xue

TL;DR
ReFilter introduces a token-level filtering and fusion method for retrieval-augmented generation, enhancing robustness and performance in knowledge-intensive QA tasks, especially with large retrieval sets.
Contribution
The paper proposes ReFilter, a novel latent-based fusion framework with token-level filtering and gating, improving scalability and effectiveness over existing methods in RAG systems.
Findings
ReFilter outperforms existing fusion methods on four general-domain QA benchmarks.
ReFilter achieves 70.01% accuracy on biomedical QA benchmarks without domain fine-tuning.
The approach scales well with larger retrieval sets, maintaining high performance.
Abstract
Retrieval-augmented generation (RAG) has become a dominant paradigm for grounding large language models (LLMs) with external evidence in knowledge-intensive question answering. A core design choice is how to fuse retrieved samples into the LLM; existing internal fusion approaches broadly fall into query-based fusion, parametric fusion, and latent-based fusion. Despite their effectiveness at modest retrieval scales, these methods often fail to scale gracefully as the number of retrieved candidates k increases: a larger k improves evidence coverage, yet realistic top-k retrieval inevitably contains irrelevant or redundant content and increases the inference cost. To address these limitations, we propose ReFilter, a novel latent-based fusion framework that performs token-level filtering and fusion. ReFilter consists of three key components: a context encoder for encoding context…
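To make the idea of token-level filtering with a gate more concrete, here is a minimal toy sketch. It is not ReFilter's architecture, which the truncated abstract does not fully describe. It assumes a dot-product relevance score between a query vector and each retrieved token vector, a sigmoid gate, and a fixed threshold. The names `gate_and_filter`, `query_vec`, `token_vecs`, and `threshold` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_and_filter(query_vec, token_vecs, threshold=0.5):
    """Toy token-level gate (an illustrative assumption, not the paper's method).

    Each retrieved token vector is scored against the query by a dot product,
    the score is squashed to a gate in (0, 1), tokens whose gate falls below
    `threshold` are dropped, and the surviving vectors are scaled by their gate.
    Returns a list of (token_index, gate, gated_vector) tuples.
    """
    kept = []
    for i, vec in enumerate(token_vecs):
        score = sum(q * t for q, t in zip(query_vec, vec))
        gate = sigmoid(score)
        if gate >= threshold:
            kept.append((i, gate, [gate * t for t in vec]))
    return kept


# Three made-up 2-d "token embeddings"; the second points away from the query.
query = [1.0, 0.0]
tokens = [[2.0, 0.0], [-2.0, 1.0], [0.5, 3.0]]
kept = gate_and_filter(query, tokens)
kept_indices = [i for i, _, _ in kept]
```

In this toy setup the token that points away from the query receives a gate below the threshold and is filtered out, while the others are kept and down-weighted in proportion to their gate. A learned system would presumably train the scoring function rather than use a raw dot product.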
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
