LLMs can hide text in other text of the same length

Antonio Norelli; Michael Bronstein

arXiv:2510.20075 · cs.AI · January 19, 2026

TL;DR

This paper introduces Calgacus, a protocol enabling large language models to embed and extract hidden messages within coherent texts of the same length, raising concerns about trust and AI safety.

Contribution

The paper presents Calgacus, a novel method demonstrating how modest open-source LLMs can covertly hide and reveal messages within ordinary texts, challenging assumptions about AI transparency.

Findings

01. High-quality message encoding is achievable with 8-billion-parameter LLMs
02. Encoding and decoding can be performed locally on a laptop in seconds
03. The protocol demonstrates a radical decoupling of text from authorial intent

Abstract

A meaningful text can be hidden inside another, completely different yet still coherent and plausible, text of the same length. For example, a tweet containing a harsh political critique could be embedded in a tweet that celebrates the same political leader, or an ordinary product review could conceal a secret manuscript. This uncanny state of affairs is now possible thanks to Large Language Models, and in this paper we present Calgacus, a simple and efficient protocol to achieve it. We show that even modest 8-billion-parameter open-source LLMs are sufficient to obtain high-quality results, and a message as long as this abstract can be encoded and decoded locally on a laptop in seconds. The existence of such a protocol demonstrates a radical decoupling of text from authorial intent, further eroding trust in written communication, already shaken by the rise of LLM chatbots. We illustrate…
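The reviews further down this page describe the mechanism in terms of probability ranks and a secret key prompt. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a rank-matching scheme: each secret token is replaced by its rank under the model, and the cover text is generated after the key prompt by picking the token with the same rank at each step. The names `ranked_vocab`, `hide`, and `reveal` are hypothetical, and a hash-based toy ranking stands in for a real LLM, so the cover text here is not fluent. Only the same-length, invertible mapping is illustrated.

```python
import hashlib

# Tiny illustrative vocabulary; a real LLM would have tens of thousands of tokens.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "mat", "fast", "."]

def ranked_vocab(context):
    """Toy stand-in for an LLM (assumption): a deterministic,
    context-dependent ordering of VOCAB, 'most likely' token first."""
    prefix = " ".join(context)
    return sorted(
        VOCAB,
        key=lambda w: hashlib.sha256(f"{prefix}|{w}".encode()).hexdigest(),
    )

def text_to_ranks(tokens, prompt):
    """Replace each token by its rank in the model's ordering at that step."""
    context, ranks = list(prompt), []
    for tok in tokens:
        ranks.append(ranked_vocab(context).index(tok))
        context.append(tok)
    return ranks

def ranks_to_text(ranks, prompt):
    """Generate text after `prompt` by picking the token with the given rank."""
    context, out = list(prompt), []
    for r in ranks:
        tok = ranked_vocab(context)[r]
        out.append(tok)
        context.append(tok)
    return out

def hide(secret, key_prompt):
    """Secret tokens -> ranks -> cover tokens steered by the key prompt."""
    return ranks_to_text(text_to_ranks(secret, []), key_prompt)

def reveal(cover, key_prompt):
    """Cover tokens -> ranks (under the key prompt) -> secret tokens."""
    return ranks_to_text(text_to_ranks(cover, key_prompt), [])

secret = ["the", "cat", "sat", "on", "the", "mat", "."]
key_prompt = ["a", "dog"]

cover = hide(secret, key_prompt)
print("cover:    ", cover)
print("recovered:", reveal(cover, key_prompt))
```

Because every secret token maps to exactly one cover token, the two texts have the same number of tokens, and anyone holding the same model and key prompt can invert the mapping.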

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

The method is conceptually elegant. It shows that large language models can be used as full-capacity generative steganographic systems, producing natural-looking texts that conceal arbitrary content. The approach achieves one-to-one token correspondence between the hidden text and the generated text while maintaining coherence, which is interesting among existing steganography methods. The paper connects the technique to broader philosophical questions about language, intention, and meaning in

Weaknesses

The experimental analysis is minimal. The evaluation relies mostly on qualitative examples and log-probability plots without systematic comparisons or quantitative metrics such as recoverability, perplexity degradation, or detectability. The proposed misuse scenarios, such as unaligned chatbots hidden within aligned ones, are speculative and not demonstrated experimentally.

Reviewer 02 · Rating 8 · Confidence 3

Strengths

1. The writing is clear, intuitive and intriguing.
2. The idea is simple yet effective.
3. Extensive analysis and discussions are provided, making it deep and insightful.

Weaknesses

1. The novelty of the work is not very clear. Similar ideas have been explored in previous work and need to be better differentiated.
2. Some practical limitations.
3. Lack of robustness analysis in the adversarial scenario.

Reviewer 03 · Rating 8 · Confidence 3

Strengths

1. This paper offers a new perspective for LLM safety and alignment: a model that appears aligned on the surface may still harbor vulnerabilities that allow dangerous information to be hidden in its output probability distribution.
2. It introduces a remarkably simple, full-capacity method for embedding hidden text specifically designed for LLMs.
3. Due to the secret key prompt and the inherent chaos in LLM behavior, this approach is currently nearly impossible to detect without access to both t

Weaknesses

1. The method is sensitive to the quality of the key prompt; a low-quality prompt may prevent the target probability ranks from forming a coherent and natural-looking cover text.
2. It is fragile to transmission errors; any corruption in the cover text will completely scramble the recovered probability rank sequence, making it unsuitable for noisy communication channels.
3. It imposes constraints on the secret text itself, which must lie within the model’s training domain, for example, rare dial
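The fragility to transmission errors noted in this review can be illustrated with a small, self-contained sketch. It assumes a rank-matching scheme in which the cover text carries the secret's token ranks, and it uses a hash-based toy ranking in place of a real LLM. The helper names (`ranked_vocab`, `to_ranks`, `from_ranks`) are hypothetical and not taken from the paper. In this toy setting, corrupting one cover token changes the context for every later step, so the recovered ranks diverge from the corrupted position onward.

```python
import hashlib

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "mat", "fast", "."]

def ranked_vocab(context):
    """Toy stand-in for an LLM (assumption): deterministic ordering of VOCAB."""
    prefix = " ".join(context)
    return sorted(
        VOCAB,
        key=lambda w: hashlib.sha256(f"{prefix}|{w}".encode()).hexdigest(),
    )

def to_ranks(tokens, prompt):
    """Rank of each token under the toy model, given the running context."""
    context, ranks = list(prompt), []
    for tok in tokens:
        ranks.append(ranked_vocab(context).index(tok))
        context.append(tok)
    return ranks

def from_ranks(ranks, prompt):
    """Rebuild a token sequence by picking the token of the given rank."""
    context, out = list(prompt), []
    for r in ranks:
        tok = ranked_vocab(context)[r]
        out.append(tok)
        context.append(tok)
    return out

key_prompt = ["a", "dog"]
secret = ["the", "cat", "sat", "on", "the", "mat", "."]
cover = from_ranks(to_ranks(secret, []), key_prompt)

# Simulate a noisy channel: swap one cover token for a different vocabulary word.
pos = 2
noisy = list(cover)
noisy[pos] = next(w for w in VOCAB if w != cover[pos])

recovered = from_ranks(to_ranks(noisy, key_prompt), [])
print("secret:   ", secret)
print("recovered:", recovered)
```

In this sketch the tokens before the corrupted position are still recovered correctly, the token at that position is necessarily wrong, and later tokens are unreliable because their ranks were computed against a different context.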

Taxonomy

Topics: Spam and Phishing Detection · Topic Modeling · Adversarial Robustness in Machine Learning