DPM-OT: A New Diffusion Probabilistic Model Based on Optimal Transport
Zezeng Li, ShengHao Li, Zhanpeng Wang, Na Lei, Zhongxuan Luo, Xianfeng Gu

TL;DR
DPM-OT introduces a novel diffusion probabilistic model leveraging optimal transport to generate high-quality images rapidly, reducing sampling steps to around 10 while maintaining stability and minimizing mode mixture.
Contribution
It proposes a unified framework using optimal transport for fast diffusion models, significantly improving sampling speed and sample quality with theoretical guarantees.
Findings
Achieves high-quality samples within ~10 steps
Reduces mode mixture in short-step sampling
Outperforms existing methods in speed and quality metrics
Abstract
Sampling from diffusion probabilistic models (DPMs) can be viewed as a piecewise distribution transformation, which generally requires hundreds or thousands of steps along the inverse diffusion trajectory to obtain a high-quality image. Recent fast samplers for DPMs trade off sampling speed against sample quality through knowledge distillation or by adjusting the variance schedule or the denoising equation. However, these approaches cannot be optimal in both aspects and often suffer from mode mixture when few steps are used. To tackle this problem, we regard inverse diffusion as an optimal transport (OT) problem between latents at different stages and propose DPM-OT, a unified learning framework for fast DPMs with a direct expressway represented by the OT map, which can generate high-quality samples within around 10 function evaluations. By calculating the semi-discrete…
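The abstract is cut off before it describes how the OT map is computed, so the following is not the authors' algorithm. It is a minimal, generic sketch of a semi-discrete OT map, assuming a standard Gaussian source, a small set of weighted target points, and the quadratic transport cost. The function names (`assign_cells`, `fit_heights`) and all parameter values are hypothetical. Each noise sample is sent to exactly one target point, which illustrates why such a map does not blend neighbouring modes.

```python
import numpy as np

def assign_cells(x, targets, heights):
    """Send each source sample to the target maximizing <x, y_i> + h_i."""
    return np.argmax(x @ targets.T + heights, axis=1)

def fit_heights(targets, weights, dim, steps=500, batch=4096, lr=0.5, seed=0):
    """Monte Carlo gradient descent on the convex semi-discrete OT energy.

    The gradient with respect to h_i is (source mass landing in cell i)
    minus (prescribed target weight i), estimated from Gaussian samples.
    """
    rng = np.random.default_rng(seed)
    heights = np.zeros(len(targets))
    for _ in range(steps):
        x = rng.standard_normal((batch, dim))   # samples from the source
        idx = assign_cells(x, targets, heights)
        mass = np.bincount(idx, minlength=len(targets)) / batch
        heights -= lr * (mass - weights)        # shrink cells that are too large
        heights -= heights.mean()               # heights are defined up to a constant
    return heights

if __name__ == "__main__":
    # Hypothetical toy problem: three target points with unequal weights.
    targets = np.array([[2.0, 0.0], [-1.0, 1.5], [-1.0, -1.5]])
    weights = np.array([0.5, 0.3, 0.2])
    heights = fit_heights(targets, weights, dim=2)
    noise = np.random.default_rng(1).standard_normal((20000, 2))
    cells = assign_cells(noise, targets, heights)
    print(np.bincount(cells, minlength=3) / len(noise))
```

After fitting, the fraction of noise samples assigned to each target should be close to the prescribed weights, up to Monte Carlo error. The map itself is `targets[assign_cells(x, targets, heights)]`, a single evaluation per sample.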
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
Methods: Knowledge Distillation · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion
