Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy

Andrii Shportko

arXiv:2603.21567 · cs.LG · March 24, 2026

TL;DR

This paper establishes theoretical bounds on the complexity cost of LLM steganography using Kolmogorov complexity and proposes a perplexity-based detection proxy, supported by preliminary experimental validation.

Contribution

It introduces an information-theoretic bound on steganography in language models and proposes a practical perplexity-based detection method.

Findings

01. The Kolmogorov complexity bound implies that embedding any non-trivial payload strictly increases the complexity of the stegotext.

02. The perplexity ratio correlates with the presence of a steganographic payload.

03. Preliminary experiments support the theoretical predictions.
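
To make the perplexity-ratio idea concrete, here is a minimal sketch. The excerpt does not give the paper's exact definition, so this assumes the ratio is the perplexity of a candidate text divided by the perplexity of a reference covertext under the same model. A toy character bigram model with add-one smoothing stands in for an LLM, and the names `train_bigram`, `perplexity`, and `perplexity_ratio` are hypothetical.

```python
import math
from collections import Counter


def train_bigram(corpus: str):
    """Fit a toy character bigram model (assumption: stands in for an LLM)."""
    vocab = set(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))  # counts of (prev, next) pairs
    contexts = Counter(corpus[:-1])             # counts of each prev character
    return bigrams, contexts, vocab


def perplexity(text: str, model) -> float:
    """Per-character perplexity of `text` under the add-one-smoothed model."""
    bigrams, contexts, vocab = model
    v = len(vocab) + 1  # one extra bucket for characters unseen in training
    log_prob = 0.0
    for prev, nxt in zip(text, text[1:]):
        p = (bigrams[(prev, nxt)] + 1) / (contexts[prev] + v)
        log_prob += math.log(p)
    n = max(len(text) - 1, 1)  # number of scored transitions
    return math.exp(-log_prob / n)


def perplexity_ratio(candidate: str, reference: str, model) -> float:
    """Assumed proxy: values above 1 mean the candidate is less predictable."""
    return perplexity(candidate, model) / perplexity(reference, model)


model = train_bigram("the cat sat on the mat. the dog sat on the log. " * 20)
reference = "the cat sat on the mat."
candidate = "thx cqt szt on tje mzt."  # same length, perturbed characters
print(perplexity_ratio(candidate, reference, model))
```

A ratio well above 1 flags text that the model finds less predictable than the reference. Whether that signals a payload, and what threshold to use, depends on the model and the data, so it would have to be calibrated empirically.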

Abstract

Large language models can rewrite text to embed hidden payloads while preserving surface-level meaning, a capability that opens covert channels between cooperating AI systems and poses challenges for alignment monitoring. We study the information-theoretic cost of such embedding. Our main result is that any steganographic scheme that preserves the semantic load of a covertext $M_1$ while encoding a payload $P$ into a stegotext $M_2$ must satisfy $K(M_2) \geq K(M_1) + K(P) - O(\log n)$, where $K$ denotes Kolmogorov complexity and $n$ is the combined message length. A corollary is that any non-trivial payload forces a strict complexity increase in the stegotext, regardless of how cleverly the encoder distributes the signal. Because Kolmogorov complexity is uncomputable, we ask whether practical proxies can detect this predicted increase. Drawing on the classical correspondence between…
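
The quantities in the bound can be illustrated with a computable stand-in. The sketch below swaps in zlib-compressed length for $K$, which is not the paper's perplexity-based proxy. Compressed length only upper-bounds Kolmogorov complexity up to an additive constant, so the code shows how the terms relate and cannot verify the theorem. The helper names, the constant `c`, and the reading of $n$ as the bit length of covertext plus payload are assumptions made for illustration.

```python
import math
import zlib


def compressed_bits(data: bytes) -> int:
    """Compressed size in bits: a crude, computable upper-bound proxy for K."""
    return 8 * len(zlib.compress(data, 9))


def bound_slack(m1: bytes, payload: bytes, m2: bytes, c: float = 1.0) -> float:
    """Proxy for K(M2) - (K(M1) + K(P) - c*log2(n)).

    Assumptions: n is the bit length of covertext plus payload, and c is a
    hypothetical stand-in for the constant hidden in O(log n).
    """
    n = 8 * (len(m1) + len(payload))
    lower = compressed_bits(m1) + compressed_bits(payload) - c * math.log2(max(n, 2))
    return compressed_bits(m2) - lower


covertext = b"the quick brown fox jumps over the lazy dog. " * 8
payload = bytes(range(256))       # hard to compress, so high proxy complexity
stegotext = covertext + payload   # naive embedding by concatenation (toy only)

print(compressed_bits(covertext), compressed_bits(payload), compressed_bits(stegotext))
print(bound_slack(covertext, payload, stegotext))
```

Concatenation is used only because it keeps the example self-contained. A real scheme would rewrite the covertext rather than append to it, and the sign of the slack under a general-purpose compressor says nothing about whether the bound holds.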

Taxonomy

Topics: Cryptography and Data Security · Adversarial Robustness in Machine Learning · Internet Traffic Analysis and Secure E-voting