Density-aware Soft Context Compression with Semi-Dynamic Compression Ratio
Yijiong Yu, Shuai Yuan, Jie Zheng, Huazheng Wang, Ji Pei

TL;DR
This paper introduces a density-aware, semi-dynamic context compression method for LLMs. It predicts a compression target from the input's information density and quantizes it to one of a predefined set of discrete compression ratios, outperforming static approaches.
Contribution
The authors propose a semi-dynamic compression framework whose Discrete Ratio Selector is trained jointly with the compressor on synthetic data, improving context compression efficiency.
Findings
Outperforms static compression baselines consistently.
Establishes a robust Pareto frontier for context compression.
Utilizes synthetic data for training the ratio predictor.
Abstract
Soft context compression reduces the computational workload of processing long contexts in LLMs by encoding a long context into a smaller number of latent tokens. However, existing frameworks apply uniform compression ratios, failing to account for the extreme variance in the information density of natural language. While adopting a density-aware dynamic compression ratio seems intuitive, empirical investigations reveal that models struggle intrinsically with operations parameterized by input-dependent, continuous structural hyperparameters. To resolve this pitfall, we introduce the Semi-Dynamic Context Compression framework. Our approach features a Discrete Ratio Selector, which predicts a compression target based on intrinsic information density and quantizes it to a predefined set of discrete compression ratios. It is trained jointly and efficiently with the compressor on synthetic data, with the…
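The quantization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate ratio set, the nearest-in-log-space rule, and the names `CANDIDATE_RATIOS`, `quantize_ratio`, and `num_latent_tokens` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

# Hypothetical candidate set; the abstract does not list the actual ratios.
CANDIDATE_RATIOS = (2, 4, 8, 16)


def quantize_ratio(predicted, candidates=CANDIDATE_RATIOS):
    """Snap a continuous predicted ratio to the nearest candidate.

    Distance is measured on a log scale (an assumption), so that 4 -> 8
    counts as the same step size as 8 -> 16.
    """
    if predicted <= 0:
        raise ValueError("predicted ratio must be positive")
    return min(candidates, key=lambda r: abs(math.log(r) - math.log(predicted)))


def num_latent_tokens(context_len, ratio):
    """Number of latent tokens for a context compressed at the given ratio."""
    return max(1, math.ceil(context_len / ratio))


# Example: a selector output of 6.0 is snapped to a discrete ratio, which
# then fixes the latent-token budget for a 1000-token context.
ratio = quantize_ratio(6.0)
budget = num_latent_tokens(1000, ratio)
```

Restricting the ratio to a small discrete set means the compressor only ever sees a handful of fixed configurations, which is the motivation the abstract gives for avoiding a fully continuous, input-dependent ratio.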
