Scalable Diffusion Models with State Space Backbone
Zhengcong Fei, Mingyuan Fan, Changqian Yu, Junshi Huang

TL;DR
This paper introduces Diffusion State Space Models (DiS), a novel diffusion architecture using state space backbones that effectively handle long-range dependencies, achieving competitive image generation performance with improved scalability and reduced computational costs.
Contribution
The paper proposes DiS, a diffusion model architecture that replaces the usual U-Net backbone with a state space backbone. It reports performance comparable or superior to CNN-based and Transformer-based U-Net architectures of similar size, and analyzes how the model scales with forward-pass compute.
Findings
DiS achieves competitive image generation quality on ImageNet benchmarks.
DiS models with higher forward-pass Gflops tend to achieve lower (better) FID scores.
Latent space DiS models reduce computational load while maintaining performance.
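As a rough illustration of why operating in latent space lowers the computational load, the sketch below counts how many tokens a patch-based backbone would process for a given input. The function name `token_count`, the two extra tokens (one for time, one for condition), and the 8x autoencoder downsampling factor are assumptions made for illustration and are not taken from the paper.

```python
def token_count(image_size, patch_size, latent_downsample=1, extra_tokens=2):
    """Count sequence tokens for a square input split into square patches.

    image_size        -- side length of the input image in pixels
    patch_size        -- side length of each patch (in pixels or latent cells)
    latent_downsample -- spatial reduction of a hypothetical autoencoder
                         (1 means the model works on raw pixels)
    extra_tokens      -- assumed number of non-image tokens, e.g. one for
                         time and one for the condition
    """
    if image_size % latent_downsample != 0:
        raise ValueError("image_size must be divisible by latent_downsample")
    grid_size = image_size // latent_downsample
    if grid_size % patch_size != 0:
        raise ValueError("spatial size must be divisible by patch_size")
    patches_per_side = grid_size // patch_size
    return patches_per_side * patches_per_side + extra_tokens


# Raw 256x256 pixels with 2x2 patches versus an assumed 8x-downsampled latent.
pixel_tokens = token_count(256, patch_size=2)
latent_tokens = token_count(256, patch_size=2, latent_downsample=8)
print(pixel_tokens, latent_tokens)
```

Under these assumptions the latent-space sequence is roughly 64 times shorter than the pixel-space one, which is the kind of reduction the finding above refers to. Actual compute also depends on model width and depth, which this sketch ignores.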
Abstract
This paper presents a new exploration into a category of diffusion models built upon state space architecture. We endeavor to train diffusion models for image data, wherein the traditional U-Net backbone is supplanted by a state space backbone, functioning on raw patches or latent space. Given its notable efficacy in accommodating long-range dependencies, Diffusion State Space Models (DiS) are distinguished by treating all inputs including time, condition, and noisy image patches as tokens. Our assessment of DiS encompasses both unconditional and class-conditional image generation scenarios, revealing that DiS exhibits comparable, if not superior, performance to CNN-based or Transformer-based U-Net architectures of commensurate size. Furthermore, we analyze the scalability of DiS, gauged by the forward pass complexity quantified in Gflops. DiS models with higher Gflops, achieved through…
Taxonomy
Topics: Advanced Control Systems Optimization
Methods: Convolution · Concatenated Skip Connection · Diffusion · Max Pooling · U-Net
