Towards Controllable Diffusion Models via Reward-Guided Exploration
Hengtong Zhang, Tingyang Xu

TL;DR
This paper introduces RGDM, a reinforcement learning-based framework for controlling diffusion models during training, enabling more flexible and effective sample generation without relying on conditional inputs or pre-trained classifiers.
Contribution
The paper presents a novel RL-guided training framework for diffusion models that improves controllability and exploration capabilities during sample generation.
Findings
Significant improvements in 3D shape generation.
Enhanced molecule generation performance.
Reduced gradient variance during training.
Abstract
By formulating the formation of data samples as a Markov denoising process, diffusion models achieve state-of-the-art performance on a range of tasks. Recently, many variants of diffusion models have been proposed to enable controlled sample generation. Most of these existing methods either formulate the controlling information as an input (i.e., a conditional representation) for the noise approximator, or introduce a pre-trained classifier in the test phase to guide the Langevin dynamics towards the conditional goal. However, the former line of methods works only when the controlling information can be formulated as conditional representations, while the latter requires the pre-trained guidance classifier to be differentiable. In this paper, we propose a novel framework named RGDM (Reward-Guided Diffusion Model) that guides the training phase of diffusion models via reinforcement learning…
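To make the differentiability requirement mentioned in the abstract concrete, here is a minimal sketch of test-time classifier guidance, the baseline the abstract contrasts against. It is not RGDM itself. The toy Gaussian "classifier", the function names, and all numeric values are hypothetical and chosen only for illustration. The guided noise estimate uses the gradient of log p(y|x) with respect to x, which is why a non-differentiable guidance signal cannot be used this way.

```python
import math

def toy_log_prob_grad(x, target):
    """Gradient w.r.t. x of a hypothetical classifier log-probability
    log p(y|x) = -0.5 * ||x - target||^2 + const (assumed for illustration)."""
    return [t - xi for xi, t in zip(x, target)]

def guided_noise(eps, x, target, alpha_bar, scale=1.0):
    """Classifier-guided noise estimate:
    eps_hat = eps - scale * sqrt(1 - alpha_bar) * grad_x log p(y|x).
    The grad_x term is the part that requires a differentiable classifier."""
    grad = toy_log_prob_grad(x, target)
    coeff = scale * math.sqrt(1.0 - alpha_bar)
    return [e - coeff * g for e, g in zip(eps, grad)]

# Hypothetical values: an unguided noise prediction of zero at x = (1, -1),
# with the toy classifier preferring samples near the origin.
eps = [0.0, 0.0]
x = [1.0, -1.0]
target = [0.0, 0.0]
eps_hat = guided_noise(eps, x, target, alpha_bar=0.75, scale=2.0)
print(eps_hat)
```

In this toy setting the guided estimate attributes more of x to noise along the direction away from the target, so the implied denoised sample moves closer to the region the classifier prefers. A reward that is available only as a black-box score offers no such gradient, which is the limitation the abstract points to.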
Taxonomy
Topics: Machine Learning in Materials Science · Gene Regulatory Network Analysis · Innovative Microfluidic and Catalytic Techniques Innovation
Methods: Diffusion
