Rethinking Soft Compression in Retrieval-Augmented Generation: A Query-Conditioned Selector Perspective
Yunhao Liu, Zian Jia, Xinyu Gao, Kanjun Xu, Yun Xiong

TL;DR
This paper introduces SeleCom, a query-conditioned selector framework for retrieval-augmented generation (RAG) that improves both efficiency and performance by compressing only query-relevant information, addressing the limitations of full-compression methods.
Contribution
The paper proposes a novel selector-based soft compression method for RAG, redefining the encoder as a query-conditioned information selector trained with curriculum learning.
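As a toy illustration of the distinction drawn here, and not the authors' architecture, the sketch below contrasts query-agnostic full compression with query-conditioned selection over token embeddings. Everything in it is an assumption made for illustration: the function names `full_compress` and `select_compress` are hypothetical, mean-pooling stands in for an auto-encoder-like compressor, and hard top-k selection by dot-product similarity stands in for a trained selector that would produce soft embeddings. Curriculum learning is not modeled.

```python
import numpy as np


def full_compress(doc_emb: np.ndarray, k: int) -> np.ndarray:
    """Query-agnostic stand-in: mean-pool the document into k slots.

    Every token contributes to the output, whether or not it is relevant.
    Assumes k <= number of tokens so that no chunk is empty.
    """
    chunks = np.array_split(doc_emb, k, axis=0)
    return np.stack([chunk.mean(axis=0) for chunk in chunks])


def select_compress(doc_emb: np.ndarray, query_emb: np.ndarray, k: int):
    """Query-conditioned stand-in: keep the k tokens most similar to the query.

    Returns the selected embeddings and their indices in document order.
    """
    scores = doc_emb @ query_emb           # relevance score per token, shape (n,)
    top = np.argsort(-scores)[:k]          # indices of the k highest scores
    top = np.sort(top)                     # restore original document order
    return doc_emb[top], top


if __name__ == "__main__":
    doc = np.eye(6)                                   # 6 toy "token" embeddings
    query = np.array([0.0, 0.0, 2.0, 0.0, 0.0, 1.0])  # query aligned with tokens 2 and 5
    selected, idx = select_compress(doc, query, k=2)
    print("selected token indices:", idx.tolist())
    print("full-compression output shape:", full_compress(doc, k=2).shape)
```

Both functions return the same number of slots, but only `select_compress` spends that budget on content related to the query, which is the intuition behind treating the encoder as a selector rather than a full compressor.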
Findings
SeleCom outperforms existing soft compression methods.
SeleCom achieves comparable or better performance than non-compression baselines.
SeleCom reduces computation and latency by up to 84.6%.
Abstract
Retrieval-Augmented Generation (RAG) effectively grounds Large Language Models (LLMs) in external knowledge and is widely applied to Web-related tasks. However, its scalability is hindered by excessive context length and redundant retrievals. Recent research on soft context compression aims to address this by encoding long documents into compact embeddings, yet these methods often underperform non-compressed RAG because they rely on auto-encoder-like full-compression, which forces the encoder to compress all document information regardless of its relevance to the input query. In this work, we analyze this paradigm and reveal two fundamental limitations: (I) Infeasibility: full-compression conflicts with the LLM's downstream generation behavior; and (II) Non-necessity: full-compression is unnecessary and dilutes task-relevant information density. Motivated by these insights, we…
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Multimodal Machine Learning Applications
