Behavior-Equivalent Token: Single-Token Replacement for Long Prompts in LLMs

Jiancheng Dong; Pengyue Jia; Jingyu Peng; Maolin Wang; Yuhao Wang; Lixin Su; Xin Sun; Shuaiqiang Wang; Dawei Yin; Xiangyu Zhao

arXiv:2511.23271 · cs.CL · December 1, 2025



TL;DR

This paper introduces a method to replace lengthy system prompts in large language models with a single token, significantly reducing inference costs while maintaining nearly the same task performance.

Contribution

A novel three-stage training framework that learns a single Behavior-Equivalent token to replace a long prompt, without requiring access to model internals or labeled data.

Findings

01 Up to 3000x reduction in prompt length

02 Retains about 98% of original performance

03 Reduces inference cost significantly

Abstract

Carefully engineered system prompts play a critical role in guiding the behavior of LLM agents, but their considerable length introduces significant drawbacks, including increased inference latency, higher computational cost, and reduced effective context length. This raises the question of whether such lengthy prompts can be replaced by a drastically reduced number of tokens while preserving their behavioral effect on downstream tasks. To enable this, we propose a lightweight three-stage training framework that learns a single prompt-specific Behavior-Equivalent token ([BE]). The framework first trains [BE] to encode the natural-language content of the original system prompt via reconstruction, and then distills the prompt's downstream behavior into this single token. Importantly, our method requires no access to model internals, no auxiliary compression models, and no labeled…
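The behavior-distillation idea in the abstract can be illustrated with a toy sketch. This is not the paper's framework: the abstract here is truncated and gives no training details, so everything below is an assumption made for illustration. The "model" is a hypothetical frozen mean-pooling classifier, the "long prompt" is 50 random vectors, and a single learnable vector `be_token` is fitted by hand-derived gradient descent so that the model's output distribution with `[be_token] + query` approaches its output with `prompt + query` (a KL-divergence objective). The reconstruction stage mentioned in the abstract is omitted.

```python
import math
import random

random.seed(0)
D, V = 8, 5                   # toy embedding size and number of output classes
N_PROMPT, N_QUERY = 50, 3     # toy "long prompt" length and query length

def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]

W = [rand_vec(D) for _ in range(V)]                # frozen output layer (hypothetical)
prompt = [rand_vec(D) for _ in range(N_PROMPT)]    # stand-in for a long system prompt
queries = [[rand_vec(D) for _ in range(N_QUERY)] for _ in range(20)]

def predict(context):
    """Frozen toy model: mean-pool the context vectors, then softmax(W h)."""
    h = [sum(col) / len(context) for col in zip(*context)]
    logits = [sum(w * x for w, x in zip(row, h)) for row in W]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [v / total for v in exps]

# Teacher behavior: outputs of the frozen model when it sees the full prompt.
teacher = [predict(prompt + q) for q in queries]

def kl_loss(be):
    """Mean KL(teacher || student) when the prompt is replaced by one vector."""
    total = 0.0
    for q, p_t in zip(queries, teacher):
        p_s = predict([be] + q)
        total += sum(t * math.log(t / s) for t, s in zip(p_t, p_s))
    return total / len(queries)

def train(steps=300, lr=0.2):
    """Plain gradient descent on the single replacement vector."""
    be = [0.0] * D
    for _ in range(steps):
        grad = [0.0] * D
        for q, p_t in zip(queries, teacher):
            p_s = predict([be] + q)
            # d KL / d logits = p_s - p_t; logits depend on be through the mean-pool.
            diff = [s - t for s, t in zip(p_s, p_t)]
            for j in range(D):
                grad[j] += sum(diff[k] * W[k][j] for k in range(V)) / (1 + N_QUERY)
        be = [b - lr * g / len(queries) for b, g in zip(be, grad)]
    return be

initial_loss = kl_loss([0.0] * D)
be_token = train()
final_loss = kl_loss(be_token)
print(f"KL before: {initial_loss:.4f}  after: {final_loss:.4f}")
```

In this toy setting the context shrinks from 53 vectors to 4 per query while the output distributions are pulled toward the teacher's. How closely a single token can match a real prompt's behavior is the empirical question the paper addresses.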


Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Multimodal Machine Learning Applications