Rep2Text: Decoding Full Text from a Single LLM Token Representation

Haiyan Zhao; Zirui He; Yiming Tang; Fan Yang; Ali Payani; Dianbo Liu; Mengnan Du

arXiv:2511.06571 · cs.CL · May 11, 2026

TL;DR

This paper introduces Rep2Text, a framework that decodes input text from a single last-token representation in LLMs, revealing information bottlenecks and generalization capabilities.

Contribution

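Rep2Text trains an adapter that maps a target model's last-token representation into the token embedding space of a decoding language model, which then reconstructs the input text autoregressively. The reconstructions show how much of the input, both literal tokens and overall meaning, is retained in that single representation.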

Findings

1. On average, roughly half of the tokens in 16-token sequences can be recovered from a single last-token representation.

2. Token recovery declines as sequence length increases.

3. The framework generalizes well to out-of-distribution clinical data.
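
To make the "roughly half of the tokens" finding concrete, here is a minimal sketch of one way a token recovery rate could be computed. This summary does not state the exact metric used in the paper, so the order-insensitive overlap below and the name `token_recovery_rate` are assumptions for illustration only.

```python
from collections import Counter


def token_recovery_rate(original, reconstructed):
    """Fraction of original tokens that also appear in the reconstruction.

    Assumption: an order-insensitive multiset overlap. The paper may use a
    different (for example position-aware) metric.
    """
    if not original:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = Counter(original) & Counter(reconstructed)
    return sum(overlap.values()) / len(original)


# Hypothetical token IDs for a 16-token input and a partial reconstruction.
original = list(range(16))
reconstructed = list(range(8)) + [100 + i for i in range(8)]
rate = token_recovery_rate(original, reconstructed)
```

Under this definition, a reconstruction that shares 8 of 16 tokens with the input scores 0.5, whatever order the shared tokens appear in.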

Abstract

Large language models (LLMs) have achieved remarkable progress across diverse tasks, yet their internal mechanisms remain largely opaque. In this work, we investigate a fundamental question: to what extent can the original input text be recovered from a single last-token representation in an LLM? To this end, we propose Rep2Text, a novel framework for decoding text from last-token representations. Rep2Text employs a trainable adapter that maps a target model's last-token representation into the token embedding space of a decoding language model, which then autoregressively reconstructs the input text. Experiments across various model combinations (Llama-3.1-8B, Gemma-7B, Mistral-7B-v0.1, Llama-3.2-3B, etc.) show that, on average, roughly half of the tokens in 16-token sequences can be recovered from this compressed representation while preserving strong semantic coherence. Further…
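
The adapter step described in the abstract can be sketched at the level of shapes. This is a rough illustration, not the authors' implementation: the single linear layer, the use of several projected vectors (`num_soft_tokens`), the toy dimensions, and the names `make_adapter` and `apply_adapter` are all assumptions. In practice the adapter would be trained, and its outputs would be passed to the decoding language model as input embeddings.

```python
import random


def make_adapter(d_target, d_decoder, num_soft_tokens, seed=0):
    """Randomly initialise a linear adapter (hypothetical, untrained).

    It maps one vector of size d_target to num_soft_tokens vectors of
    size d_decoder, i.e. into the decoder's token embedding space.
    """
    rng = random.Random(seed)
    rows = num_soft_tokens * d_decoder
    weight = [[rng.gauss(0.0, 0.02) for _ in range(d_target)] for _ in range(rows)]
    bias = [0.0] * rows
    return weight, bias


def apply_adapter(adapter, last_token_rep, d_decoder):
    """Project a last-token representation into decoder-space vectors."""
    weight, bias = adapter
    # Plain matrix-vector product: one output value per weight row.
    flat = [
        sum(w * x for w, x in zip(row, last_token_rep)) + b
        for row, b in zip(weight, bias)
    ]
    # Split the flat output into consecutive chunks of size d_decoder.
    return [flat[i:i + d_decoder] for i in range(0, len(flat), d_decoder)]


# Toy sizes for illustration; real hidden sizes are in the thousands.
adapter = make_adapter(d_target=8, d_decoder=4, num_soft_tokens=3)
soft_prompt = apply_adapter(adapter, [1.0] * 8, d_decoder=4)
```

Here `soft_prompt` holds three vectors of size four, standing in for the embeddings from which the decoding model would autoregressively reconstruct the text.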
