DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression
Yi Zhao, Zuchao Li, Hai Zhao, Baoyuan Qi, Guoming Liu

TL;DR
DAC introduces a dynamic, attention-aware prompt compression method that adaptively balances entropy and attention information, significantly improving efficiency and robustness across multiple tasks and large language models.
Contribution
The paper presents a novel dynamic approach that incorporates attention-critical tokens and entropy shifts for more effective task-agnostic prompt compression.
Findings
Consistently improves performance across diverse tasks
Effectively balances entropy and attention information during compression
Demonstrates robustness across various large language models
Abstract
Task-agnostic prompt compression leverages the redundancy in natural language to reduce computational overhead and enhance information density within prompts, especially in long-context scenarios. Existing methods predominantly rely on information entropy as the metric to compress lexical units, aiming to achieve minimal information loss. However, these approaches overlook two critical aspects: (i) the importance of attention-critical tokens at the algorithmic level, and (ii) shifts in information entropy during the compression process. Motivated by these challenges, we propose a dynamic attention-aware approach for task-agnostic prompt compression (DAC). This approach effectively integrates entropy and attention information, dynamically sensing entropy shifts during compression to achieve fine-grained prompt compression. Extensive experiments across various domains, including…
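The abstract names two ingredients: a per-token score that combines entropy with attention, and re-scoring during compression so that entropy shifts are noticed. The Python sketch below illustrates that general idea only and is not the authors' algorithm. Everything in it is an assumption made for illustration: the names toy_surprisal, normalize and compress, the weighted-sum fusion with weight alpha, the staged budget schedule, and the frequency-based surprisal that stands in for a language model's token-level self-information. The attention list is a hypothetical per-token importance score, such as attention received from later tokens.

```python
import math
from collections import Counter


def toy_surprisal(tokens):
    """Hypothetical stand-in for LM self-information.

    Returns -log of each token's relative frequency in the *current*
    sequence, so the values change as tokens are removed.
    """
    counts = Counter(tokens)
    total = len(tokens)
    return [-math.log(counts[t] / total) for t in tokens]


def normalize(xs):
    """Min-max scale to [0, 1]; a constant list maps to all ones."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [1.0] * len(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def compress(tokens, attention, keep_ratio=0.5, stages=2, alpha=0.5):
    """Drop low-scoring tokens over several stages, re-scoring each time.

    tokens     : list of str, the original prompt
    attention  : list of float aligned with tokens (assumed importance)
    keep_ratio : fraction of tokens to keep at the end
    stages     : number of compression rounds
    alpha      : weight on surprisal versus attention (assumed fusion rule)
    """
    n = len(tokens)
    target = max(1, round(n * keep_ratio))
    kept = list(range(n))  # indices into the original prompt
    for stage in range(1, stages + 1):
        # Each stage removes an equal share of the excess tokens.
        budget = n - round((n - target) * stage / stages)
        current = [tokens[i] for i in kept]
        # Surprisal is recomputed on the already-compressed text,
        # which is the "dynamic" part of this sketch.
        info = normalize(toy_surprisal(current))
        attn = normalize([attention[i] for i in kept])
        score = [alpha * e + (1 - alpha) * a for e, a in zip(info, attn)]
        ranked = sorted(range(len(kept)), key=lambda j: score[j], reverse=True)
        survivors = sorted(ranked[:budget])  # restore original order
        kept = [kept[j] for j in survivors]
    return [tokens[i] for i in kept]


if __name__ == "__main__":
    prompt = "the cat sat on the mat the end".split()
    attn = [0.1, 0.9, 0.5, 0.2, 0.1, 0.8, 0.1, 0.4]
    print(compress(prompt, attn, keep_ratio=0.5, stages=2))
```

In a real setting, both scores would presumably come from a language model's forward pass rather than from token counts. The single-pass alternative (stages=1) scores every token once on the uncompressed prompt, which is the static behaviour the abstract contrasts against.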
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Cognitive Functions and Memory
