Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data
Andresa Rodrigues de Campos, David Lee, Imry Kissos, Piyush Paritosh

TL;DR
This paper introduces a lossless prompt compression method for LLMs using dictionary encoding, enabling cost-effective analysis of repetitive data without model fine-tuning, while maintaining high analytical accuracy.
Contribution
It presents a novel, training-free compression algorithm that identifies repetitive patterns and enables lossless prompt compression for LLMs, reducing inference costs and easing token-limit constraints.
Findings
Achieves compression ratios of up to 80%, depending on the data.
Maintains over 99% exact match rate in analysis after compression.
Compression ratio has minimal impact on analytical accuracy.
Abstract
In-context learning has established itself as an important learning paradigm for Large Language Models (LLMs). In this paper, we demonstrate that LLMs can learn encoding keys in-context and perform analysis directly on encoded representations. This finding enables lossless prompt compression via dictionary encoding without model fine-tuning: frequently occurring subsequences are replaced with compact meta-tokens, and when provided with the compression dictionary in the system prompt, LLMs correctly interpret these meta-tokens during analysis, producing outputs equivalent to those from uncompressed inputs. We present a compression algorithm that identifies repetitive patterns at multiple length scales, incorporating a token-savings optimization criterion that ensures compression reduces costs by preventing dictionary overhead from exceeding savings. The algorithm achieves compression…
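To make the idea in the abstract concrete, the following is a minimal sketch of dictionary encoding with a token-savings check. It is not the authors' algorithm. It assumes whitespace-split tokens stand in for real LLM tokens, that meta-tokens of the form <M0>, <M1>, ... never occur in the input, that a dictionary entry costs its pattern length plus one token, and that a greedy longest-first search is an acceptable stand-in for the multi-scale pattern search. All function names (net_saving, compress, decompress, render_dictionary) are hypothetical.

```python
from collections import Counter


def net_saving(occurrences, n):
    # Each replaced occurrence turns n tokens into 1; the dictionary entry
    # costs n tokens for the pattern plus 1 for the meta-token (assumed cost model).
    return occurrences * (n - 1) - (n + 1)


def count_nonoverlapping(tokens, pattern):
    # Count left-to-right, non-overlapping matches of pattern in tokens.
    n, count, i = len(pattern), 0, 0
    while i <= len(tokens) - n:
        if tuple(tokens[i:i + n]) == pattern:
            count += 1
            i += n
        else:
            i += 1
    return count


def replace_all(tokens, pattern, meta):
    # Replace left-to-right, non-overlapping matches of pattern with meta.
    n, out, i = len(pattern), [], 0
    while i < len(tokens):
        if tuple(tokens[i:i + n]) == pattern:
            out.append(meta)
            i += n
        else:
            out.append(tokens[i])
            i += 1
    return out


def compress(tokens, max_len=8, min_len=2):
    # Greedy sketch: try longer patterns first, and add a pattern to the
    # dictionary only if its net token saving is strictly positive.
    tokens = list(tokens)
    dictionary = {}
    for n in range(max_len, min_len - 1, -1):
        while True:
            counts = Counter(
                tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
            )
            best, best_saving = None, 0
            for pattern, c in counts.items():
                # Skip unique patterns and patterns that contain a meta-token,
                # so that decoding stays a single, non-recursive pass.
                if c < 2 or any(t in dictionary for t in pattern):
                    continue
                saving = net_saving(count_nonoverlapping(tokens, pattern), n)
                if saving > best_saving:
                    best, best_saving = pattern, saving
            if best is None:
                break
            meta = f"<M{len(dictionary)}>"  # assumed absent from the input
            dictionary[meta] = best
            tokens = replace_all(tokens, best, meta)
    return tokens, dictionary


def decompress(tokens, dictionary):
    # Expand every meta-token back into its original subsequence.
    out = []
    for t in tokens:
        out.extend(dictionary.get(t, (t,)))
    return out


def render_dictionary(dictionary):
    # Hypothetical text layout for placing the dictionary in a system prompt.
    return "\n".join(f"{meta} = {' '.join(p)}" for meta, p in dictionary.items())


log = ("ERROR disk full on node A ; ERROR disk full on node B ; "
       "ERROR disk full on node C").split()
compressed, dictionary = compress(log)
print(render_dictionary(dictionary))
print(" ".join(compressed))
```

Because decompress restores the input exactly, the encoding is lossless by construction. In the setting the abstract describes, the rendered dictionary would go in the system prompt and the compressed token stream in the user message, and the model would be expected to interpret the meta-tokens itself rather than have them expanded first.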
