Learning to Compress Prompt in Natural Language Formats
Yu-Neng Chuang, Tianwei Xing, Chia-Yuan Chang, Zirui Liu, Xun Chen, Xia Hu

TL;DR
This paper introduces Nano-Capsulator, a method to compress long natural language prompts into concise, transferable capsule prompts that significantly reduce length and inference costs while maintaining utility across various large language models.
Contribution
The work presents a novel framework for compressing natural language prompts into capsule prompts, addressing transferability, length constraints, and utility preservation in a unified approach.
Findings
Reduced prompt length by 81.4%
Decreased inference latency by up to 4.5 times
Saved 80.1% of computational budget
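To make the length figure concrete, the arithmetic behind a percentage reduction can be sketched as below. This is only an illustration: the 1,000-token prompt is a hypothetical input, not a number from the paper, and the helper names are invented for this example. The latency and budget figures are reported separately and are not derived from this calculation.

```python
def compressed_length(original_tokens: int, reduction: float) -> int:
    """Tokens remaining after removing `reduction` (a fraction in [0, 1)) of a prompt."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be a fraction in [0, 1)")
    return round(original_tokens * (1.0 - reduction))


def compression_ratio(reduction: float) -> float:
    """Original length divided by compressed length for a given fractional reduction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be a fraction in [0, 1)")
    return 1.0 / (1.0 - reduction)


# Hypothetical 1,000-token prompt with the reported 81.4% length reduction.
remaining = compressed_length(1000, 0.814)
ratio = compression_ratio(0.814)
print(remaining, f"{ratio:.2f}x")
```

Under these assumptions, an 81.4% reduction leaves roughly one token in five, which corresponds to a compression ratio a little above 5x.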
Abstract
Large language models (LLMs) are great at processing multiple natural language processing tasks, but their abilities are constrained by inferior performance with long context, slow inference speed, and the high cost of computing the results. Deploying LLMs with precise and informative context helps users process large-scale datasets more effectively and cost-efficiently. Existing works rely on compressing long prompt contexts into soft prompts. However, soft prompt compression encounters limitations in transferability across different LLMs, especially API-based LLMs. To this end, this work aims to compress lengthy prompts in the form of natural language with LLM transferability. This poses two challenges: (i) Natural Language (NL) prompts are incompatible with back-propagation, and (ii) NL prompts lack flexibility in imposing length constraints. In this work, we propose a Natural…
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Natural Language Processing Techniques · Speech and dialogue systems
