Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

Sangwoong Yoon; Himchan Hwang; Dohyun Kwon; Yung-Kyun Noh; Frank C. Park

arXiv:2407.00626·cs.LG·November 1, 2024

TL;DR

This paper introduces a maximum entropy inverse reinforcement learning (IRL) framework for diffusion models. The diffusion model is trained jointly with an energy-based model (EBM), which improves sample quality when few generation steps are used and stabilizes EBM training.

Contribution

The paper proposes DxMI (Diffusion by Maximum Entropy IRL), a method that jointly trains a diffusion model and an EBM using IRL principles, and introduces DxDP, a new reinforcement learning algorithm for updating the diffusion model efficiently.

Findings

01. High-quality samples with as few as 4 to 10 generation steps.
02. EBM training stabilized without MCMC.
03. Enhanced anomaly detection performance.

Abstract

We present a maximum entropy inverse reinforcement learning (IRL) approach for improving the sample quality of diffusion generative models, especially when the number of generation time steps is small. Similar to how IRL trains a policy based on the reward function learned from expert demonstrations, we train (or fine-tune) a diffusion model using the log probability density estimated from training data. Since we employ an energy-based model (EBM) to represent the log density, our approach boils down to the joint training of a diffusion model and an EBM. Our IRL formulation, named Diffusion by Maximum Entropy IRL (DxMI), is a minimax problem that reaches equilibrium when both models converge to the data distribution. The entropy maximization plays a key role in DxMI, facilitating the exploration of the diffusion model and ensuring the convergence of the EBM. We also propose Diffusion by…
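The minimax structure described in the abstract can be illustrated on a toy discrete problem. The sketch below is not the paper's implementation; it assumes one common way to write a maximum entropy IRL objective, J(E, pi) = E_p[E(x)] - E_pi[E(x)] + H(pi), where the energy E is minimized and the sampler pi is maximized. The function names (`entropy`, `objective`, `best_response`) and the three-state data distribution are hypothetical, chosen only for illustration, and the exact form and sign conventions in the paper may differ.

```python
import math

def entropy(pi):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(q * math.log(q) for q in pi if q > 0)

def objective(energy, pi, p_data):
    """Assumed objective J(E, pi) = E_p[E] - E_pi[E] + H(pi)."""
    data_term = sum(p * e for p, e in zip(p_data, energy))
    policy_term = sum(q * e for q, e in zip(pi, energy))
    return data_term - policy_term + entropy(pi)

def best_response(energy):
    """Sampler maximizing -E_pi[E] + H(pi): the Boltzmann distribution of E."""
    m = min(energy)  # shift energies for numerical stability
    weights = [math.exp(-(e - m)) for e in energy]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data distribution over three states.
p_data = [0.5, 0.3, 0.2]

# An energy whose Boltzmann distribution equals the data distribution.
energy = [-math.log(p) for p in p_data]

# The entropy-regularized best response to this energy recovers p_data.
pi_star = best_response(energy)

print("best-response sampler:", pi_star)
print("objective at equilibrium:", objective(energy, pi_star, p_data))
print("entropy of data:", entropy(p_data))
```

In this toy setting, the entropy term is what makes the sampler's best response the full Boltzmann distribution of the energy rather than a point mass on the lowest-energy state, so the energy and the sampler can both match the data distribution at equilibrium.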

Code & Models

Repositories

swyoon/diffusion-by-maxentirl (PyTorch, official)

Videos

Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models · SlidesLive

Taxonomy

Topics: Model Reduction and Neural Networks · Advancements in Semiconductor Devices and Circuit Design · Iterative Learning Control Systems

Methods: Diffusion · Energy-based model