OD-Stega: LLM-Based Relatively Secure Steganography via Optimized Distributions
Yu-Shin Huang, Peter Just, Hanyun Yin, Krishna Narayanan, Ruihong Huang, and Chao Tian

TL;DR
OD-Stega is an LLM-based coverless steganography method that optimizes the next-token distribution so that secret bits are embedded in fewer tokens while the stego-text stays natural. It also addresses practical issues such as tokenization mismatch.
Contribution
This paper proposes an entropy-maximizing approach to optimizing the next-token distribution for LLM-based steganography, together with a fix for tokenization mismatch and a way to combine the method with existing techniques such as vocabulary truncation.
Findings
Achieves efficient secret embedding with minimal tokens
Provides a closed-form solution under divergence constraints
Effectively addresses tokenization mismatch issues
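To make the divergence-constrained finding concrete, here is a minimal sketch of entropy maximization under a KL budget. It is an illustration under stated assumptions, not the authors' implementation: it assumes the constraint is KL(Q || P) <= delta with strictly positive P, and it uses the standard Lagrangian result that the maximizer then has the tempered form Q_i proportional to P_i^beta with beta in [0, 1]. The function names (`tempered`, `max_entropy_within_kl`) are hypothetical, and the exponent is located by bisection rather than by any formula from the paper.

```python
import math

def tempered(p, beta):
    """Return q with q_i proportional to p_i**beta, computed in log space."""
    logs = [beta * math.log(pi) for pi in p]  # assumes every p_i > 0
    m = max(logs)
    weights = [math.exp(v - m) for v in logs]
    z = sum(weights)
    return [w / z for w in weights]

def kl(q, p):
    """KL(q || p) in nats."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def entropy(q):
    """Shannon entropy in nats."""
    return -sum(qi * math.log(qi) for qi in q if qi > 0)

def max_entropy_within_kl(p, delta, iters=60):
    """Flatten p as much as possible while keeping KL(q || p) <= delta.

    Assumption: the optimum lies in the family q ~ p**beta, where
    KL(q || p) decreases monotonically as beta goes from 0 to 1.
    """
    uniform = tempered(p, 0.0)
    if kl(uniform, p) <= delta:
        return uniform  # budget is loose enough to allow the uniform law
    lo, hi = 0.0, 1.0  # lo violates the budget, hi (q == p) satisfies it
    for _ in range(iters):
        mid = (lo + hi) / 2
        if kl(tempered(p, mid), p) > delta:
            lo = mid
        else:
            hi = mid
    return tempered(p, hi)

p = [0.7, 0.2, 0.1]
q = max_entropy_within_kl(p, delta=0.05)
```

In this sketch a larger `delta` yields a flatter `q`, which is the trade-off the summary describes: more entropy per token at the cost of drifting further from the LLM's original distribution.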
Abstract
We consider coverless steganography where a Large Language Model (LLM) is used to generate stego-texts in combination with arithmetic coding. An efficient method should embed secret bits in as few language tokens as possible while keeping the stego-text as natural as possible. We show that this problem is equivalent to maximizing the entropy of a replacement probability distribution of the next token generation, subject to a constraint on the divergence between the new distribution and the original one produced by the LLM. A closed-form solution is provided under either the KL divergence or the total variation constraint. Several important practical issues are also tackled: 1) An often-overlooked tokenization mismatch issue is resolved with a simple prompt selection approach, 2) The combination of the optimized distribution and the vocabulary truncation technique is considered, and 3)…
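The arithmetic-coding step mentioned at the start of the abstract can be sketched as follows. This is a toy illustration, not the paper's pipeline: `TOY_DIST` is a hypothetical fixed distribution standing in for the LLM's context-dependent next-token probabilities, `embed` and `extract` are hypothetical names, the receiver is assumed to know the number of secret bits `k`, and exact `Fraction` arithmetic replaces the finite-precision coder a real system would use.

```python
import math
from fractions import Fraction

# Hypothetical stand-in for an LLM's next-token distribution. A real system
# would query the model at every step, conditioned on the text so far.
TOY_DIST = [("the", Fraction(1, 2)), ("a", Fraction(1, 4)),
            ("cat", Fraction(1, 8)), ("dog", Fraction(1, 8))]

def embed(bits, dist=TOY_DIST):
    """Map a non-empty bit string to tokens by arithmetic decoding."""
    k = len(bits)
    x = Fraction(int(bits, 2), 2 ** k)  # secret as a point in [0, 1)
    low, width = Fraction(0), Fraction(1)
    tokens = []
    # Keep emitting tokens until the interval pins down all k bits.
    while width > Fraction(1, 2 ** k):
        cum = low
        for tok, prob in dist:
            span = width * prob
            if cum <= x < cum + span:  # x falls in this token's sub-interval
                tokens.append(tok)
                low, width = cum, span
                break
            cum += span
    return tokens

def extract(tokens, k, dist=TOY_DIST):
    """Recover the k secret bits by re-tracing the token intervals."""
    low, width = Fraction(0), Fraction(1)
    for tok in tokens:
        cum = low
        for t, prob in dist:
            span = width * prob
            if t == tok:
                low, width = cum, span
                break
            cum += span
    # The final interval has width <= 2**-k, so it contains exactly one
    # multiple of 2**-k: the secret.
    return format(math.ceil(low * 2 ** k), f"0{k}b")

stego_tokens = embed("1011")
recovered = extract(stego_tokens, k=4)
```

Each token shrinks the interval by its probability, so flatter (higher-entropy) distributions tend to need more bits to identify a token and therefore carry more secret bits per token. That is the motivation for maximizing entropy under a naturalness constraint.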
Taxonomy
Topics: Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis
