TL;DR
QwenLong-CPRS introduces a dynamic context optimization framework for large language models, significantly improving long-context processing efficiency and accuracy through novel compression and boundary-aware mechanisms.
Contribution
It presents a novel, natural language-guided dynamic context compression method with architecture-agnostic integration, achieving state-of-the-art performance on long-sequence benchmarks.
Findings
Achieves 21.59× context compression with 19.15-point performance gains.
Outperforms RAG and sparse attention methods in accuracy and efficiency.
Surpasses leading proprietary LLMs on long-context benchmarks.
Abstract
This technical report presents QwenLong-CPRS, a context compression framework designed for explicit long-context optimization, addressing prohibitive computation overhead during the prefill stage and the "lost in the middle" performance degradation of large language models (LLMs) during long sequence processing. Implemented through a novel dynamic context optimization mechanism, QwenLong-CPRS enables multi-granularity context compression guided by natural language instructions, achieving both efficiency gains and improved performance. Evolved from the Qwen architecture series, QwenLong-CPRS introduces four key innovations: (1) Natural language-guided dynamic optimization, (2) Bidirectional reasoning layers for enhanced boundary awareness, (3) Token critic mechanisms with language modeling heads, and (4) Window-parallel inference. Comprehensive evaluations across five benchmarks…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Layer Normalization · Linear Warmup With Linear Decay · Attention Dropout · Byte Pair Encoding · Softmax · Linear Layer · Dropout · Dense Connections · Attention Is All You Need
