An Additive Approximation Scheme for Generating Dyadic Codings for the Outputs of an LLM
Daniella Bar-Lev, Farzad Farnoud, Ryan Gabrys

TL;DR
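This paper introduces a polynomial-time additive approximation scheme for approximating a discrete distribution, such as an LLM's next-token distribution, by a dyadic distribution under a rate constraint in the constant-rate regime. As an application, it supports LLM-based steganography in which the total variation bound controls statistical detectability.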
Contribution
It formulates dyadic approximation as a tree-based partitioning problem and develops a polynomial-time algorithm that approximates a probability distribution under a rate constraint, with provable near-optimality guarantees.
Findings
Provides a polynomial-time additive approximation scheme for rate-constrained dyadic approximation in the constant-rate regime.
Gives provable guarantees that the resulting dyadic approximations are near-optimal in total variation distance.
Enables a principled framework for LLM steganography with statistical detectability bounds.
Abstract
We study the problem of approximating a discrete probability distribution, such as the next-token distribution of a large language model, by a dyadic distribution induced by a binary tree under encoding rate constraints. The objective is to partition the support of the distribution and assign dyadic probabilities to minimize total variation distance while achieving a prescribed rate. We formulate this task as a tree-based partitioning problem and develop a polynomial-time additive approximation scheme for the rate-constrained setting in the constant-rate regime. Our results provide provable guarantees for near-optimal dyadic approximations and, as an application, yield a principled framework for LLM-based steganography, where the rate maps to bits of hidden information embedded per token and the total variation bound controls statistical detectability.
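To make the objective in the abstract concrete, the sketch below evaluates one candidate solution: a partition of the support together with a leaf depth for each part. It is an illustration only, not the paper's algorithm. The function name `dyadic_tv_and_rate`, the input format, and the toy distribution are hypothetical. It also assumes that a part placed at depth d receives dyadic probability 2^(-d), that the total variation distance is half the sum of |mass of part - 2^(-d)| over the parts, and that the rate is the expected leaf depth under the dyadic distribution; the paper's exact definitions may differ.

```python
from typing import Dict, List, Tuple


def dyadic_tv_and_rate(
    p: Dict[str, float],
    parts: List[List[str]],
    depths: List[int],
) -> Tuple[float, float]:
    """Evaluate a candidate dyadic coding (hypothetical helper, not from the paper).

    p      : token -> probability
    parts  : partition of the support; parts[j] is assigned to one tree leaf
    depths : depths[j] is the depth of the leaf holding parts[j]
    """
    if len(parts) != len(depths):
        raise ValueError("need exactly one depth per part")

    # The parts must form a partition of the support of p.
    flat = [tok for part in parts for tok in part]
    if len(flat) != len(set(flat)) or set(flat) != set(p):
        raise ValueError("parts must partition the support of p")

    # Leaves of a full binary tree satisfy Kraft's equality.
    leaf_probs = [2.0 ** -d for d in depths]
    if abs(sum(leaf_probs) - 1.0) > 1e-12:
        raise ValueError("depths must satisfy sum(2**-d) == 1")

    # Probability mass that p places on each part.
    masses = [sum(p[tok] for tok in part) for part in parts]

    # Assumed objective: total variation between part masses and leaf probabilities.
    tv = 0.5 * sum(abs(m - q) for m, q in zip(masses, leaf_probs))

    # Assumed rate: expected leaf depth, i.e. bits embedded per token.
    rate = sum(q * d for q, d in zip(leaf_probs, depths))
    return tv, rate


# Toy next-token distribution (made up for illustration).
p = {"a": 0.5, "b": 0.3, "c": 0.2}
tv, rate = dyadic_tv_and_rate(p, parts=[["a"], ["b"], ["c"]], depths=[1, 2, 2])
print(f"TV = {tv:.3f}, rate = {rate:.2f} bits/token")
```

In this toy case the leaves carry probabilities 1/2, 1/4, 1/4, so the coding embeds 1.5 bits per token at a total variation cost of about 0.05. Merging "b" and "c" into one part at depth 1 would reduce the distance to zero but lower the rate to 1 bit, which is the kind of trade-off the rate constraint governs.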
