CompactPrompt: A Unified Pipeline for Prompt Data Compression in LLM Workflows
Joong Ho Choi, Jiayang Zhao, Jeel Shah, Ritvika Sonawane, Vedant Singh, Avani Appalla, Will Flanagan, Filipe Condessa

TL;DR
CompactPrompt is an end-to-end pipeline that combines prompt compression and data compression to cut token usage and inference cost in large language model workflows by up to 60%, while keeping the accuracy drop below 5%.
Contribution
It introduces a novel end-to-end compression pipeline that integrates prompt pruning, phrase grouping, n-gram abbreviation, and data quantization for efficient LLM processing.
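To make the data-side steps concrete, here is a minimal Python sketch of n-gram abbreviation and uniform quantization. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `@k` placeholder scheme, whitespace tokenization, and the min/max scaling are all hypothetical choices made for this example.

```python
from collections import Counter


def abbreviate_ngrams(text, n=2, min_count=2):
    """Replace recurring word n-grams with short placeholders.

    Returns the compressed text and a legend mapping placeholder -> phrase.
    Assumes the text does not already contain tokens such as "@0".
    """
    words = text.split()
    counts = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    # Assign a placeholder only to n-grams that recur at least min_count times.
    codes = {}
    for gram, count in counts.most_common():
        if count >= min_count:
            codes[gram] = f"@{len(codes)}"
    # Greedy left-to-right scan: swap in the placeholder where an n-gram matches.
    out, i = [], 0
    while i < len(words):
        gram = tuple(words[i:i + n])
        if gram in codes:
            out.append(codes[gram])
            i += n
        else:
            out.append(words[i])
            i += 1
    legend = {code: " ".join(gram) for gram, code in codes.items()}
    return " ".join(out), legend


def expand_ngrams(compressed, legend):
    """Invert abbreviate_ngrams using the legend."""
    return " ".join(legend.get(w, w) for w in compressed.split())


def quantize_uniform(values, bits=8):
    """Map numbers onto 2**bits evenly spaced levels between min and max."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    if hi == lo:  # constant column: every value gets code 0
        return [0] * len(values), lo, 0.0
    step = (hi - lo) / levels
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Approximate reconstruction; the error per value is at most step / 2."""
    return [lo + c * step for c in codes]
```

In this sketch the legend (for text) and the `(lo, step)` pair (for numbers) would have to travel with the compressed data so that the model or a post-processing step can interpret the short codes.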
Findings
Reduces token usage and inference costs by up to 60%.
Keeps the accuracy drop below 5% on benchmark datasets.
Enables real-time visualization of compression decisions.
Abstract
Large Language Models (LLMs) deliver powerful reasoning and generation capabilities but incur substantial run-time costs when operating in agentic workflows that chain together lengthy prompts and process rich data streams. We introduce CompactPrompt, an end-to-end pipeline that merges hard prompt compression with lightweight file-level data compression. CompactPrompt first prunes low-information tokens from prompts using self-information scoring and dependency-based phrase grouping. In parallel, it applies n-gram abbreviation to recurrent textual patterns in attached documents and uniform quantization to numerical columns, yielding compact yet semantically faithful representations. Integrated into standard LLM agents, CompactPrompt reduces total token usage and inference cost by up to 60% on benchmark datasets such as TAT-QA and FinQA, while preserving output quality (Results in less than…
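The prompt-pruning idea in the abstract can be sketched as follows. Self-information of a token t is I(t) = -log2 p(t), so frequent, predictable tokens score low and are pruned first. This sketch is only an approximation under explicit assumptions: it estimates p(t) from unigram counts over a hypothetical reference text with add-one smoothing, whereas a real system would more plausibly take token probabilities from a language model, and it omits the dependency-based phrase grouping step entirely. The names `self_information`, `prune_prompt`, and `keep_ratio` are illustrative.

```python
import math
from collections import Counter


def self_information(tokens, reference_tokens):
    """Score each token by -log2 p(token) under a smoothed unigram model."""
    counts = Counter(reference_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen tokens
    return [-math.log2((counts[t] + 1) / (total + vocab)) for t in tokens]


def prune_prompt(prompt, reference, keep_ratio=0.5):
    """Keep the highest-information tokens, preserving their original order."""
    tokens = prompt.split()
    scores = self_information(
        [t.lower() for t in tokens], reference.lower().split()
    )
    k = max(1, math.ceil(keep_ratio * len(tokens)))
    # Rank positions by descending score, keep the top k, then restore order.
    ranked = sorted(range(len(tokens)), key=lambda i: -scores[i])
    keep = sorted(ranked[:k])
    return " ".join(tokens[i] for i in keep)
```

For example, with a reference text dominated by function words, `prune_prompt("the revenue of the quarter", reference)` drops both occurrences of "the" first, since they carry the least self-information.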
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Scientific Computing and Data Management
