ReFilter: Improving Robustness of Retrieval-Augmented Generation via Gated Filter

Yixin Chen; Ying Xiong; Shangyu Wu; Xiangrui Ke; Nan Guan; Chun Jason Xue

arXiv:2602.12709·cs.CL·February 16, 2026

Open Access

TL;DR

ReFilter introduces a token-level filtering and fusion method for retrieval-augmented generation, enhancing robustness and performance in knowledge-intensive QA tasks, especially with large retrieval sets.

Contribution

The paper proposes ReFilter, a novel latent-based fusion framework with token-level filtering and gating, improving scalability and effectiveness over existing methods in RAG systems.

Findings

01. ReFilter outperforms existing fusion methods on four general-domain QA benchmarks.

02. ReFilter achieves 70.01% accuracy on biomedical QA benchmarks without domain fine-tuning.

03. The approach scales well with larger retrieval sets, maintaining high performance.

Abstract

Retrieval-augmented generation (RAG) has become a dominant paradigm for grounding large language models (LLMs) with external evidence in knowledge-intensive question answering. A core design choice is how to fuse retrieved samples into the LLM; existing internal fusion approaches broadly fall into query-based fusion, parametric fusion, and latent-based fusion. Despite their effectiveness at modest retrieval scales, these methods often fail to scale gracefully as the number of retrieved candidates k increases: larger k improves evidence coverage, yet realistic top-k retrieval inevitably contains irrelevant or redundant content and increases the inference cost. To address these limitations, we propose ReFilter, a novel latent-based fusion framework that performs token-level filtering and fusion. ReFilter consists of three key components: a context encoder for encoding context…
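The abstract is cut off before it describes ReFilter's components in full, so the following is not the paper's method. It is a minimal, generic sketch of what token-level gating over retrieved context can look like, under assumptions made here for illustration: the function name `gated_filter`, the scaled dot-product relevance score, the sigmoid gate, and the fixed `threshold` are all hypothetical choices.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to a gate value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_filter(query, tokens, threshold=0.5):
    """Illustrative token-level gate (hypothetical, not the paper's design).

    query:  list of floats, a single query embedding.
    tokens: list of token embeddings (each a list of floats, same length
            as query) taken from the retrieved passages.
    Returns (kept_indices, fused_vector).
    """
    dim = len(query)
    scale = math.sqrt(dim)

    # Gate per token: sigmoid of the scaled dot product with the query.
    gates = [
        sigmoid(sum(q * t for q, t in zip(query, tok)) / scale)
        for tok in tokens
    ]

    # Filtering step: drop tokens whose gate falls below the threshold.
    kept = [i for i, g in enumerate(gates) if g >= threshold]
    if not kept:
        return kept, [0.0] * dim

    # Fusion step: gate-weighted average of the surviving token embeddings.
    total = sum(gates[i] for i in kept)
    fused = [
        sum(gates[i] * tokens[i][j] for i in kept) / total
        for j in range(dim)
    ]
    return kept, fused


if __name__ == "__main__":
    query = [1.0, 0.0]
    tokens = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0]]
    kept, fused = gated_filter(query, tokens, threshold=0.6)
    print(kept, fused)
```

In this toy setup, raising k only adds more rows to `tokens`; tokens that score poorly against the query are dropped before fusion, which is the intuition behind filtering irrelevant or redundant retrieved content. A trained system would presumably learn the gate rather than use a raw dot product.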

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning