Discrete Prompt Compression with Reinforcement Learning
Hoyoun Jung, Kyung-Joong Kim

TL;DR
This paper introduces PCRL, a reinforcement learning-based method for discrete prompt compression that reduces token count in prompts for language models, enhancing efficiency and reusability without requiring gradient access or labeled data.
Contribution
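PCRL is a novel discrete prompt compression technique based on reinforcement learning. It targets the limitations of embedding-based compression: limited interpretability, a fixed number of embedding tokens, poor reusability across LMs, and inapplicability to black-box APIs.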
Findings
Achieves 24.6% average token reduction in prompts.
Maintains performance across various LMs.
The learned policy transfers to larger models.
Abstract
Compressed prompts aid instruction-tuned language models (LMs) in overcoming context window limitations and reducing computational costs. Existing methods, which are primarily based on training embeddings, face various challenges associated with interpretability, the fixed number of embedding tokens, reusability across different LMs, and inapplicability when interacting with black-box APIs. This study proposes prompt compression with reinforcement learning (PCRL), a discrete prompt compression method that addresses these issues. PCRL uses a computationally efficient policy network that edits prompts directly. Its training approach can be applied flexibly to various types of LMs, including both decoder-only and encoder-decoder architectures, and it requires neither gradient access to the LMs nor labeled data. The proposed…
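To make the idea in the abstract concrete, the following is a minimal toy sketch of discrete prompt compression trained with a policy gradient (REINFORCE with a mean-reward baseline). It is not the authors' implementation. Everything named here is an assumption for illustration: the per-token keep/drop logits stand in for the policy network, `toy_lm` is a hypothetical stand-in for a black-box LM call, `STOPWORDS` is an invented word list, and the reward (output overlap plus `lam` times the fraction of tokens removed) is one plausible choice rather than the paper's reward. The sketch shows only why no LM gradients or labels are needed: the policy is updated from scalar rewards computed on LM outputs.

```python
import math
import random

# Toy assumption: words the stand-in "LM" ignores.
STOPWORDS = {"please", "kindly", "the", "a", "very"}


def apply_mask(tokens, keep):
    """Discrete edit: keep token i only when keep[i] is truthy."""
    return [t for t, k in zip(tokens, keep) if k]


def toy_lm(tokens):
    """Hypothetical stand-in for a black-box LM call.

    It "answers" with the content words it received. A real setup
    would query an LM API here and compare generated text.
    """
    return [t for t in tokens if t not in STOPWORDS]


def reward(original, compressed, lam=0.5):
    """Assumed reward: output overlap plus lam * fraction of tokens removed."""
    ref, out = set(toy_lm(original)), set(toy_lm(compressed))
    faithfulness = len(ref & out) / len(ref) if ref else 1.0
    removed = 1.0 - len(compressed) / len(original)
    return faithfulness + lam * removed


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(tokens, steps=300, batch=16, lr=0.5, seed=0):
    """REINFORCE over independent per-token keep/drop decisions."""
    rng = random.Random(seed)
    logits = [0.0] * len(tokens)  # one keep-logit per token (toy policy)
    for _ in range(steps):
        probs = [sigmoid(z) for z in logits]
        # Sample a batch of keep/drop masks from the current policy.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(batch)]
        # Rewards come only from LM outputs: no LM gradients, no labels.
        rewards = [reward(tokens, apply_mask(tokens, m)) for m in samples]
        baseline = sum(rewards) / batch  # variance-reduction baseline
        for mask, r in zip(samples, rewards):
            for i, (k, p) in enumerate(zip(mask, probs)):
                # d/dz log Bernoulli(k; sigmoid(z)) = k - p
                logits[i] += lr * (r - baseline) * (k - p) / batch
    return logits


def greedy_compress(tokens, logits):
    """Deterministic compression: keep tokens whose keep-probability exceeds 0.5."""
    return [t for t, z in zip(tokens, logits) if z > 0.0]
```

A possible way to try it: call `train(["please", "summarize", "the", "very", "long", "report"])` and pass the returned logits to `greedy_compress` with the same token list. Because the compressed prompt is a subsequence of the original tokens, it stays human-readable and can be sent to any LM, which is the property the abstract contrasts with embedding-based compression.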
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
