Learning To Sample From Diffusion Models Via Inverse Reinforcement Learning

Constant Bourdrez; Alexandre Vérine; Olivier Cappé

arXiv:2602.08689·cs.LG·February 10, 2026

Open Access

TL;DR

This paper introduces an inverse reinforcement learning approach for optimizing sampling strategies in diffusion models. It improves sample quality and tunes sampling hyperparameters automatically, without retraining the denoiser.

Contribution

The paper formulates diffusion sampling as a discrete-time, finite-horizon Markov Decision Process and uses policy gradient techniques to learn the sampling strategy, improving sampling efficiency and quality without retraining the denoiser.

Findings

01 Improved sample quality with pretrained diffusion models

02 Automatic tuning of sampling hyperparameters

03 Effective optimization of sampling strategies

Abstract

Diffusion models generate samples through an iterative denoising process, guided by a neural network. While training the denoiser on real-world data is computationally demanding, the sampling procedure itself is more flexible. This adaptability serves as a key lever in practice, enabling improvements in both the quality of generated samples and the efficiency of the sampling process. In this work, we introduce an inverse reinforcement learning framework for learning sampling strategies without retraining the denoiser. We formulate the diffusion sampling procedure as a discrete-time finite-horizon Markov Decision Process, where actions correspond to optional modifications of the sampling dynamics. To optimize action scheduling, we avoid defining an explicit reward function. Instead, we directly match the target behavior expected from the sampler using policy gradient techniques. We…
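To make the MDP framing in the abstract concrete, here is a minimal sketch of a finite-horizon sampler MDP trained with a plain REINFORCE policy gradient. Everything in it is an illustrative assumption rather than the authors' method: the one-dimensional `toy_denoise_step`, the binary action (add noise or not), the per-step Bernoulli policy, and in particular `hypothetical_reward`. The paper states that it avoids an explicit reward and instead matches target behavior, so the explicit terminal reward below is a swapped-in simplification used only to show the policy gradient mechanics.

```python
import math
import random

T = 5  # horizon: number of sampling steps (hypothetical value)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def toy_denoise_step(x, action, rng):
    """One MDP transition of a toy sampler (not a real denoiser).

    action 0: plain step that shrinks x toward 0.
    action 1: the same step plus injected Gaussian noise, standing in
              for an optional modification of the sampling dynamics.
    """
    x = 0.5 * x
    if action == 1:
        x += rng.gauss(0.0, 1.0)
    return x


def rollout(logits, rng):
    """Sample one trajectory; return the final state and actions taken."""
    x = rng.gauss(0.0, 4.0)  # start from noise
    actions = []
    for t in range(T):
        a = 1 if rng.random() < sigmoid(logits[t]) else 0
        actions.append(a)
        x = toy_denoise_step(x, a, rng)
    return x, actions


def hypothetical_reward(x_final):
    # Placeholder terminal reward, NOT from the paper: prefer x near 0.
    return -x_final * x_final


def reinforce(num_iters=300, batch=64, lr=0.05, seed=0):
    """Learn one Bernoulli logit per time step with REINFORCE."""
    rng = random.Random(seed)
    logits = [0.0] * T
    for _ in range(num_iters):
        samples = [rollout(logits, rng) for _ in range(batch)]
        rewards = [hypothetical_reward(x) for x, _ in samples]
        baseline = sum(rewards) / batch  # mean-reward baseline
        grad = [0.0] * T
        for (_, actions), r in zip(samples, rewards):
            for t, a in enumerate(actions):
                # d/dlogit of log Bernoulli(a; sigmoid(logit)) is a - sigmoid(logit)
                grad[t] += (r - baseline) * (a - sigmoid(logits[t]))
        logits = [l + lr * g / batch for l, g in zip(logits, grad)]
    return logits


if __name__ == "__main__":
    learned = reinforce()
    print([round(sigmoid(l), 2) for l in learned])
```

In this toy setup, noise injected at the last step directly increases the final squared error, so the learned probability of taking action 1 at that step should fall below one half. The denoiser stand-in is never updated; only the action schedule is learned, which is the aspect of the abstract this sketch is meant to illustrate.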

Taxonomy

Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning