QUITO-X: A New Perspective on Context Compression from the Information Bottleneck Theory

Yihang Wang; Xu Huang; Bowen Tian; Yueyang Su; Lei Yu; Huaming Liao; Yixing Fan; Jiafeng Guo; Xueqi Cheng

arXiv:2408.10497·cs.CL·October 14, 2025

TL;DR

This paper introduces a novel context compression method for large language models based on information bottleneck theory, significantly reducing context size while maintaining or improving task performance.

Contribution

The paper applies information bottleneck theory to context compression and proposes a flexible cross-attention-based approach to approximating mutual information. The authors report that it outperforms existing methods.
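To make the idea of score-based compression concrete, here is a minimal sketch of the final selection step only. It assumes that each context token already has a query-conditioned relevance score (for example, one derived from cross-attention); how the paper computes those scores is not described on this page, so the scores are plain inputs. The function name `compress_by_scores` and the parameter `keep_ratio` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def compress_by_scores(
    tokens: Sequence[str],
    scores: Sequence[float],
    keep_ratio: float,
) -> List[str]:
    """Keep the highest-scoring tokens, preserving their original order.

    `scores` stands in for any query-conditioned relevance signal;
    this is an illustrative assumption, not the paper's scoring method.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Number of tokens to keep; never drop everything from a non-empty context.
    k = max(1, round(len(tokens) * keep_ratio))

    # Indices of the k highest-scoring tokens (ties keep the earlier token).
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]

    # Restore original order so the compressed context stays readable.
    return [tokens[i] for i in sorted(top)]


tokens = ["the", "cat", "sat", "on", "the", "mat"]
scores = [0.1, 0.9, 0.8, 0.05, 0.1, 0.7]
compressed = compress_by_scores(tokens, scores, keep_ratio=0.5)
```

Restoring the original token order after selection is a design choice of this sketch: it keeps the compressed context in reading order rather than in score order.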

Findings

01. 25% higher compression rate than state-of-the-art

02. Maintains question answering performance with compressed context

03. Compressed context sometimes outperforms full context

Abstract

Generative LLMs have achieved remarkable success in various industrial applications, owing to their promising in-context learning capabilities. However, long contexts in complex tasks pose a significant barrier to wider adoption, in two main respects: (i) excessively long contexts lead to high costs and inference delays; (ii) the substantial amount of task-irrelevant information introduced by long contexts exacerbates the "lost in the middle" problem. Existing methods compress context by removing redundant tokens using metrics such as self-information or perplexity (PPL), which is inconsistent with the objective of retaining the most important tokens when conditioning on a given query. In this study, we introduce information bottleneck (IB) theory to model the problem, offering a novel perspective that thoroughly addresses the essential properties required for context…
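For orientation, the classical information bottleneck objective can be written as below. The notation is an assumption made for illustration and is not taken from the paper: X denotes the original context, X̃ the compressed context, Y the target output, and β > 0 a trade-off weight. The paper's own query-conditioned formulation is not shown in this excerpt and may differ.

\[
\min_{p(\tilde{x} \mid x)} \; I(X; \tilde{X}) \;-\; \beta \, I(\tilde{X}; Y)
\]

The first term rewards compression (X̃ retains little information about X), and the second rewards relevance (X̃ remains informative about Y).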

Taxonomy

Topics: Algorithms and Data Compression · Computability, Logic, AI Algorithms · Advanced Data Compression Techniques

Methods: ALIGN