Adding Conditional Control to Diffusion Models with Reinforcement Learning

Yulai Zhao; Masatoshi Uehara; Gabriele Scalia; Sunyuan Kung; Tommaso Biancalani; Sergey Levine; Ehsan Hajiramezanali

arXiv:2406.12120·cs.LG·February 25, 2025·1 cites

Open Access

TL;DR

This paper introduces CTRL, a reinforcement learning-based method for adding controls to pre-trained diffusion models. Compared with classifier-free guidance, it improves sample efficiency and simplifies dataset construction for conditional sample generation.

Contribution

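The paper proposes an RL-based approach for adding controls to pre-trained diffusion models. The reward combines a classifier learned from an offline dataset with a KL divergence term against the pre-trained model, and the approach improves sample efficiency over classifier-free guidance.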

Findings

01. CTRL enables sampling from conditional distributions during inference.

02. The RL approach improves sample efficiency over classifier-free guidance.

03. It simplifies dataset construction by leveraging conditional independence.

Abstract

Diffusion models are powerful generative models that allow for precise control over the characteristics of the generated samples. While these diffusion models trained on large datasets have achieved success, there is often a need to introduce additional controls in downstream fine-tuning processes, treating these powerful models as pre-trained diffusion models. This work presents a novel method based on reinforcement learning (RL) to add such controls using an offline dataset comprising inputs and labels. We formulate this task as an RL problem, with the classifier learned from the offline dataset and the KL divergence against pre-trained models serving as the reward functions. Our method, CTRL (Conditioning pre-Trained diffusion models with Reinforcement Learning), produces soft-optimal policies that maximize the abovementioned…
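To make the reward structure described in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: it works on a three-element sample space rather than on diffusion trajectories, and the names `p_pre`, `class_prob`, `alpha`, `soft_optimal_policy`, and `objective` are all hypothetical. It assumes the usual KL-regularized objective, expected classifier log-likelihood minus `alpha` times the KL divergence to the pre-trained model. The maximizer of that objective is proportional to `p_pre(x) * class_prob(x) ** (1 / alpha)`, which for `alpha = 1` is the Bayes posterior over samples given the label.

```python
import math

def soft_optimal_policy(p_pre, class_prob, alpha=1.0):
    """Return pi*(x) proportional to p_pre(x) * class_prob(x) ** (1 / alpha)."""
    weights = [p * (c ** (1.0 / alpha)) for p, c in zip(p_pre, class_prob)]
    total = sum(weights)
    return [w / total for w in weights]

def objective(pi, p_pre, class_prob, alpha=1.0):
    """E_pi[log class_prob(x)] - alpha * KL(pi || p_pre) on a finite sample space."""
    reward = sum(q * math.log(c) for q, c in zip(pi, class_prob) if q > 0)
    kl = sum(q * math.log(q / p) for q, p in zip(pi, p_pre) if q > 0)
    return reward - alpha * kl

# Toy setup (all numbers are made up for illustration):
# three possible samples x and one fixed target label y.
p_pre = [0.5, 0.3, 0.2]        # hypothetical pre-trained model p_pre(x)
class_prob = [0.1, 0.6, 0.9]   # hypothetical classifier output p(y | x)

pi_star = soft_optimal_policy(p_pre, class_prob, alpha=1.0)

# The soft-optimal policy scores at least as well as the unmodified
# pre-trained model under the KL-regularized objective.
gain = objective(pi_star, p_pre, class_prob) - objective(p_pre, p_pre, class_prob)
print("soft-optimal policy:", pi_star)
print("objective gain over p_pre:", gain)
```

In this sketch, a larger `alpha` keeps the policy closer to `p_pre`, while a smaller `alpha` concentrates mass on samples the classifier favors. How the paper sets this trade-off, and how it optimizes over diffusion trajectories, is not covered by the truncated abstract above.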

Taxonomy

Topics: Advanced Control Systems Optimization

Methods: Diffusion