Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles
Xiao Pu, Tianxing He, Xiaojun Wan

TL;DR
Style-Compress is a prompt compression framework that uses a smaller language model to generate task-specific compressed prompts, improving efficiency and performance across multiple tasks without additional training.
Contribution
The paper introduces a style-aware prompt compression method that adapts a smaller model to compress prompts for a larger model on new tasks without extra training.
Findings
Outperforms baseline models in four tasks
Achieves comparable or better performance with minimal samples
Maintains high compression ratios with strong task performance
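For readers unfamiliar with the term, a compression ratio can be computed as in the minimal sketch below. It assumes the ratio is defined as compressed length divided by original length, counted in whitespace-separated tokens. This excerpt does not state the paper's tokenizer or exact definition, and some works use the inverse convention. The function name `compression_ratio` is illustrative only.

```python
def compression_ratio(original: str, compressed: str) -> float:
    """Return compressed length / original length, in whitespace tokens.

    Assumption: whitespace tokenization stands in for a real model tokenizer.
    """
    original_tokens = original.split()
    compressed_tokens = compressed.split()
    if not original_tokens:
        raise ValueError("original prompt is empty")
    return len(compressed_tokens) / len(original_tokens)


original = "The quick brown fox jumps over the lazy dog near the river"
compressed = "fox jumps over dog near river"
ratio = compression_ratio(original, compressed)
print(ratio)  # 6 tokens kept out of 12
```

Under this convention a smaller value means a shorter compressed prompt.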
Abstract
Prompt compression condenses contexts while maintaining their informativeness for different usage scenarios. It not only shortens the inference time and reduces computational costs during the usage of large language models, but also lowers expenses when using closed-source models. In a preliminary study, we discover that when instructing language models to compress prompts, different compression styles (e.g., extractive or abstractive) impact performance of compressed prompts on downstream tasks. Building on this insight, we propose Style-Compress, a lightweight framework that adapts a smaller language model to compress prompts for a larger model on a new task without additional training. Our approach iteratively generates and selects effective compressed prompts as task-specific demonstrations through style variation and in-context learning, enabling smaller models to act as efficient…
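The generate-and-select loop described in the abstract might be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_compress` and `toy_score` are hypothetical stand-ins for the smaller compressor model and for the downstream evaluation with the larger model, and the round-robin style schedule, the number of candidates per prompt, and the size of the demonstration pool are all invented for the example.

```python
import random
from typing import Callable, List, Tuple

# The two styles named in the abstract; any further styles are not covered here.
STYLES = ["extractive", "abstractive"]

Demo = Tuple[str, str]  # (original prompt, compressed prompt)


def toy_compress(prompt: str, style: str, demos: List[Demo],
                 rng: random.Random) -> str:
    """Hypothetical stand-in for the small compressor model.

    A real compressor would condition on `demos` via in-context learning;
    this toy version ignores them and simply keeps half of the words.
    """
    words = prompt.split()
    keep = max(1, len(words) // 2)
    if style == "extractive":
        # Keep a random subset of the original words, in their original order.
        idx = sorted(rng.sample(range(len(words)), keep))
        return " ".join(words[i] for i in idx)
    # "abstractive" stand-in: keep the leading words, lowercased.
    return " ".join(words[:keep]).lower()


def toy_score(prompt: str, compressed: str) -> float:
    """Hypothetical stand-in for task performance: fraction of distinct
    original words that survive compression."""
    orig = {w.lower() for w in prompt.split()}
    kept = {w.lower() for w in compressed.split()}
    return len(orig & kept) / len(orig)


def adapt(prompts: List[str],
          compress: Callable[[str, str, List[Demo], random.Random], str],
          score: Callable[[str, str], float],
          candidates_per_prompt: int = 4,
          max_demos: int = 3,
          seed: int = 0) -> List[Demo]:
    """Iteratively generate styled candidates and keep the best as demos."""
    rng = random.Random(seed)
    pool: List[Tuple[float, str, str]] = []  # (score, prompt, compressed)
    for prompt in prompts:
        in_context = [(p, c) for _, p, c in pool]
        candidates = []
        for i in range(candidates_per_prompt):
            style = STYLES[i % len(STYLES)]  # vary the compression style
            compressed = compress(prompt, style, in_context, rng)
            candidates.append((score(prompt, compressed), prompt, compressed))
        # Select the most effective candidate for this prompt.
        pool.append(max(candidates, key=lambda t: t[0]))
        # Keep only the highest-scoring demonstrations for later rounds.
        pool.sort(key=lambda t: t[0], reverse=True)
        del pool[max_demos:]
    return [(p, c) for _, p, c in pool]


prompts = [
    "Summarize the main findings of the quarterly sales report for the board",
    "List every city mentioned in the passage and the year each was founded",
    "Explain step by step how the total cost was derived from the invoice",
    "Identify which of the two documents supports the claim about rainfall",
]
demos = adapt(prompts, toy_compress, toy_score)
for original, compressed in demos:
    print(f"{compressed!r}  <-  {original!r}")
```

In a real setting, `compress` would call the smaller model with the selected demonstrations in its context, and `score` would measure how well the larger model performs the task when given the compressed prompt.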
Taxonomy
Topics: Advanced Data Storage Technologies · Algorithms and Data Compression · Advanced Database Systems and Queries
