LanguaShrink: Reducing Token Overhead with Psycholinguistics

Xuechen Liang; Meiling Tao; Yinghui Xia; Tianyu Shi; Jun Wang; JingSong Yang

arXiv:2409.00855 · cs.CL · September 4, 2024

TL;DR

LanguaShrink is a prompt compression framework inspired by psycholinguistics. It shortens prompts by up to 26x and speeds up inference in large language models while preserving the semantic content of the prompt.

Contribution

The paper introduces a task-agnostic compression method that combines psycholinguistic principles, part-of-speech prioritization, and reinforcement learning, achieving up to 26x compression and faster inference.

Findings

01. Achieves up to 26 times prompt compression.

02. Improves end-to-end latency by 1.43 times.

03. Maintains semantic similarity across datasets.

Abstract

As large language models (LLMs) become more capable of handling complex tasks, the computational cost and inefficiency caused by long prompts are becoming increasingly prominent. To accelerate model inference and reduce costs, we propose a prompt compression framework called LanguaShrink. Inspired by the observation that LLM performance depends on the density and position of key information in the input prompt, LanguaShrink leverages psycholinguistic principles and the Ebbinghaus memory curve to achieve task-agnostic prompt compression. This effectively reduces prompt length while preserving essential information. We follow the training method of OpenChat. The framework introduces part-of-speech priority compression and data distillation techniques, using smaller models to learn compression targets and employing a KL-regularized reinforcement learning…
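To make the idea of part-of-speech priority compression concrete, the following is a minimal toy sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the hand-written `TOY_POS` lexicon stands in for a real part-of-speech tagger, the `PRIORITY` values are invented, whitespace-separated words stand in for model tokens, and the function names `compress` and `compression_ratio` are hypothetical.

```python
from typing import Dict

# Hypothetical toy lexicon; a real system would use a trained POS tagger.
TOY_POS: Dict[str, str] = {
    "the": "DET", "a": "DET", "an": "DET",
    "of": "ADP", "in": "ADP", "on": "ADP", "to": "ADP",
    "very": "ADV", "really": "ADV",
    "is": "AUX", "are": "AUX",
}

# Assumed priorities (higher means kept first); not taken from the paper.
PRIORITY: Dict[str, int] = {"CONTENT": 3, "AUX": 2, "ADV": 1, "ADP": 1, "DET": 0}


def compress(prompt: str, keep_ratio: float) -> str:
    """Keep roughly keep_ratio of the words, preferring high-priority tags."""
    tokens = prompt.split()
    if not tokens:
        return ""
    budget = max(1, int(len(tokens) * keep_ratio))

    def rank(i: int):
        # Words missing from the toy lexicon are treated as content words.
        tag = TOY_POS.get(tokens[i].lower(), "CONTENT")
        return (-PRIORITY[tag], i)  # priority first, then original position

    kept = sorted(sorted(range(len(tokens)), key=rank)[:budget])
    # Emit the surviving words in their original order.
    return " ".join(tokens[i] for i in kept)


def compression_ratio(original: str, compressed: str) -> float:
    """Word-count ratio, used here as a rough proxy for token-count ratio."""
    return len(original.split()) / max(1, len(compressed.split()))


if __name__ == "__main__":
    text = "The cat is on the mat in the very large house"
    short = compress(text, keep_ratio=0.5)
    print(short)
    print(compression_ratio(text, short))
```

The sketch only shows the ranking-and-budget step. It omits the distillation and reinforcement learning stages mentioned in the abstract, and it makes no attempt to model the Ebbinghaus memory curve or the position of key information.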

Taxonomy

Topics: Digital and Cyber Forensics