Coarse-Grained Kullback-Leibler Control of Diffusion-Based Generative AI
Tatsuaki Tsuruyama

TL;DR
This paper introduces a novel theoretical framework for controlling coarse-grained quantities in diffusion-based generative models, enabling explicit management of blockwise features during image synthesis.
Contribution
It extends an information-theoretic Lyapunov function to reverse diffusion processes, proposing a projected scheme that maintains coarse-grained quantities within prescribed tolerances.
Findings
The V-delta projection acts as an approximate Lyapunov function under certain conditions.
Numerical experiments show the method keeps block-mass errors within the prescribed tolerances.
The approach achieves high-quality image generation comparable to standard methods.
Abstract
Diffusion models and score-based generative models provide a powerful framework for synthesizing high-quality images from noise. However, there is still no satisfactory theory that describes how coarse-grained quantities, such as blockwise intensity or class proportions after partitioning an image into spatial blocks, are preserved and evolve along the reverse diffusion dynamics. In previous work, the author introduced an information-theoretic Lyapunov function V for non-ergodic Markov processes on a state space partitioned into blocks, defined as the minimal Kullback-Leibler divergence to the set of stationary distributions reachable from a given initial condition, and showed that a leak-tolerant potential V-delta with a prescribed tolerance for block masses admits a closed-form expression as a scaling-and-clipping operation on block masses. In this paper, I transplant this framework…
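The scaling-and-clipping operation on block masses mentioned in the abstract can be illustrated with a small sketch. This is not the paper's formula, which is not reproduced in this summary. It is one way such a projection might look, assuming the tolerance is a box |q_b - target_b| <= delta around target block masses and the objective is the KL divergence between block-mass vectors. Under those assumptions the minimizer has the form q_b = clip(c * m_b, lo_b, hi_b) with a single scale factor c, found here by bisection. The function names (`block_masses`, `scale_and_clip`, `project_image`), the bisection, and the pixel-rescaling step are all hypothetical choices made for illustration.

```python
import numpy as np

def block_masses(image, block):
    """Normalized intensity mass of each (block x block) tile of a 2D image."""
    h, w = image.shape
    if h % block or w % block:
        raise ValueError("image size must be a multiple of the block size")
    tiles = image.reshape(h // block, block, w // block, block)
    masses = tiles.sum(axis=(1, 3))
    return masses / masses.sum()

def scale_and_clip(masses, target, delta, iters=100):
    """Assumed projection: q_b = clip(c * m_b, lo_b, hi_b) with sum(q) = 1.

    Requires strictly positive masses and a target that sums to 1.
    """
    if np.any(masses <= 0):
        raise ValueError("all block masses must be strictly positive")
    lo = np.clip(target - delta, 0.0, None)
    hi = target + delta
    # At c = 0 every block sits at its lower bound (sum <= 1);
    # at c = max(hi / m) every block sits at its upper bound (sum >= 1).
    c_lo, c_hi = 0.0, float((hi / masses).max())
    for _ in range(iters):
        c = 0.5 * (c_lo + c_hi)
        if np.clip(c * masses, lo, hi).sum() < 1.0:
            c_lo = c
        else:
            c_hi = c
    return np.clip(c_hi * masses, lo, hi)

def project_image(image, block, target, delta):
    """Rescale the pixels of each block so its mass matches the projected mass."""
    m = block_masses(image, block)
    q = scale_and_clip(m, target, delta)
    ratio = np.repeat(np.repeat(q / m, block, axis=0), block, axis=1)
    return image / image.sum() * ratio

# Toy example: an 8x8 non-negative "image" split into four 4x4 blocks.
rng = np.random.default_rng(0)
img = rng.random((8, 8)) + 0.1
target = np.full((2, 2), 0.25)   # hypothetical target block masses
out = project_image(img, 4, target, delta=0.01)
err = np.abs(block_masses(out, 4) - target).max()
```

In this toy setting `err` stays at or below `delta` up to floating-point error. How such a step would be interleaved with a reverse diffusion sampler is left open here, since the summary does not specify it.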
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music Technology and Sound Studies · Computer Graphics and Visualization Techniques
