Learning To Sample From Diffusion Models Via Inverse Reinforcement Learning
Constant Bourdrez, Alexandre Vérine, Olivier Cappé

TL;DR
This paper introduces an inverse reinforcement learning approach for optimizing sampling strategies in diffusion models. It improves sample quality and tunes sampling hyperparameters automatically, without retraining the denoiser.
Contribution
It formulates diffusion sampling as a discrete-time, finite-horizon Markov Decision Process and uses policy gradients to improve sampling efficiency and sample quality without retraining the denoiser.
Findings
Improved sample quality with pretrained diffusion models
Automatic tuning of sampling hyperparameters
Effective optimization of sampling strategies
Abstract
Diffusion models generate samples through an iterative denoising process, guided by a neural network. While training the denoiser on real-world data is computationally demanding, the sampling procedure itself is more flexible. This adaptability serves as a key lever in practice, enabling improvements in both the quality of generated samples and the efficiency of the sampling process. In this work, we introduce an inverse reinforcement learning framework for learning sampling strategies without retraining the denoiser. We formulate the diffusion sampling procedure as a discrete-time finite-horizon Markov Decision Process, where actions correspond to optional modifications of the sampling dynamics. To optimize action scheduling, we avoid defining an explicit reward function. Instead, we directly match the target behavior expected from the sampler using policy gradient techniques. We…
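The MDP view described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the stand-in `frozen_denoiser`, the single binary action (optionally re-inject noise at a step), the per-step Bernoulli policy with logits `theta`, and the scalar `signal_fn`. In particular, the paper states that it avoids an explicit reward and matches target behavior directly, so the scalar signal here is only a placeholder that shows the shape of a score-function (REINFORCE-style) policy gradient update.

```python
import numpy as np

T = 10  # horizon: number of sampling steps (assumed value)

def frozen_denoiser(x, t):
    # Hypothetical stand-in for a pretrained denoiser; it is never updated.
    return 0.8 * x

def step(x, t, action, rng):
    # Base dynamics: one denoising update with the frozen network.
    x_next = frozen_denoiser(x, t)
    # Optional modification selected by the action (here: re-inject some noise).
    if action == 1:
        x_next = x_next + 0.1 * rng.standard_normal(x.shape)
    return x_next

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rollout(theta, rng, dim=2):
    """Run one finite-horizon episode under a per-step Bernoulli policy."""
    x = rng.standard_normal(dim)          # initial state: pure noise
    actions = np.zeros(T, dtype=int)
    grad_logp = np.zeros(T)
    for t in range(T):
        p = sigmoid(theta[t])             # probability of taking the optional action
        a = int(rng.random() < p)
        actions[t] = a
        # d/d theta_t of log Bernoulli(a; sigmoid(theta_t)) equals a - p.
        grad_logp[t] = a - p
        x = step(x, t, a, rng)
    return x, actions, grad_logp

def policy_gradient_step(theta, signal_fn, rng, n_episodes=64, lr=0.1):
    """One score-function gradient ascent step on the policy logits."""
    signals, grads = [], []
    for _ in range(n_episodes):
        x, _, g = rollout(theta, rng)
        signals.append(signal_fn(x))      # placeholder scalar, NOT the paper's objective
        grads.append(g)
    signals = np.asarray(signals)
    grads = np.asarray(grads)
    baseline = signals.mean()             # mean baseline to reduce variance
    grad_est = ((signals - baseline)[:, None] * grads).mean(axis=0)
    return theta + lr * grad_est
```

A possible usage, again under the same assumptions: start from `theta = np.zeros(T)`, create `rng = np.random.default_rng(0)`, and call `policy_gradient_step(theta, lambda x: -float(np.sum(x**2)), rng)` repeatedly. Only the policy logits change; the denoiser stays fixed throughout, which is the property the abstract emphasizes.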
Taxonomy
Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
