QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory
Yihang Wang, Xu Huang, Bowen Tian, Yueyang Su, Lei Yu, Huaming Liao, Yixing Fan, Jiafeng Guo, Xueqi Cheng

TL;DR
This paper introduces a novel context compression method for large language models based on information bottleneck theory, significantly reducing context size while maintaining or improving task performance.
Contribution
It applies information bottleneck theory to context compression and proposes a flexible cross-attention-based approach to approximating mutual information, which outperforms existing methods.
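To make the idea of attention-based token selection concrete, here is a minimal sketch, not the authors' implementation. It assumes each context token already has a relevance score, for example one derived from a model's cross-attention weights between the query and the context, and keeps the highest-scoring tokens in their original order. The function name compress_by_attention, the keep_ratio parameter, and the example scores are all hypothetical.

```python
from typing import List, Sequence


def compress_by_attention(
    tokens: Sequence[str],
    scores: Sequence[float],
    keep_ratio: float,
) -> List[str]:
    """Keep the highest-scoring fraction of tokens, preserving their order.

    `scores` stands in for per-token relevance (assumed here to come from
    cross-attention weights); how they are computed is outside this sketch.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Number of tokens to keep, rounded down, but at least one.
    k = max(1, int(len(tokens) * keep_ratio))

    # Rank positions by descending score; ties go to the earlier token.
    ranked = sorted(range(len(tokens)), key=lambda i: (-scores[i], i))

    # Restore the original reading order of the selected positions.
    kept_positions = sorted(ranked[:k])
    return [tokens[i] for i in kept_positions]


# Illustrative, made-up scores for a single sentence.
example_tokens = ["The", "Eiffel", "Tower", "is", "located", "in", "Paris", "."]
example_scores = [0.02, 0.30, 0.25, 0.03, 0.10, 0.04, 0.24, 0.02]
compressed = compress_by_attention(example_tokens, example_scores, keep_ratio=0.5)
print(compressed)
```

With keep_ratio=0.5 the sketch keeps four of the eight tokens. Selecting whole tokens by rank is only one possible design; scoring could equally be aggregated over words or sentences before filtering.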
Findings
Achieves a 25% higher compression rate than the state of the art
Maintains question answering performance with compressed context
Compressed context sometimes outperforms full context
Abstract
Generative LLMs have achieved remarkable success in various industrial applications, owing to their promising In-Context Learning capabilities. However, the issue of long context in complex tasks poses a significant barrier to their wider adoption, manifested in two main aspects: (i) The excessively long context leads to high costs and inference delays. (ii) A substantial amount of task-irrelevant information introduced by long contexts exacerbates the "lost in the middle" problem. Existing methods compress context by removing redundant tokens using metrics such as self-information or PPL, which is inconsistent with the objective of retaining the most important tokens when conditioning on a given query. In this study, we introduce information bottleneck (IB) theory to model the problem, offering a novel perspective that thoroughly addresses the essential properties required for context…
Taxonomy
Topics: Algorithms and Data Compression · Computability, Logic, AI Algorithms · Advanced Data Compression Techniques
Methods: ALIGN
