Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models
Beier Zhu, Ruoyu Wang, Tong Zhao, Hanwang Zhang, Chi Zhang

TL;DR
This paper introduces EPD (Ensemble Parallel Direction), an ODE solver for diffusion models that performs multiple gradient evaluations in parallel at each step and learns a small set of parameters by distillation, reducing sampling latency while maintaining high image quality.
Contribution
The paper proposes a novel parallel gradient ODE solver, EPD, that improves sampling speed and quality in diffusion models through parallelization and learnable parameter distillation.
Findings
EPD achieves state-of-the-art FID scores at low latency levels.
EPD's parallel gradient approach reduces truncation errors effectively.
The method can be integrated into existing ODE samplers as a plugin.
Abstract
Diffusion models (DMs) have achieved state-of-the-art generative performance but suffer from high sampling latency due to their sequential denoising nature. Existing solver-based acceleration methods often face image quality degradation under a low-latency budget. In this paper, we propose the Ensemble Parallel Direction solver (dubbed EPD), a novel ODE solver that mitigates truncation errors by incorporating multiple parallel gradient evaluations in each ODE step. Importantly, since the additional gradient computations are independent, they can be fully parallelized, preserving low-latency sampling. Our method optimizes a small set of learnable parameters in a distillation fashion, ensuring minimal training overhead. In addition, our method can serve as a plugin to improve existing ODE samplers. Extensive experiments on various image synthesis benchmarks demonstrate the…
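The idea of combining several independent gradient evaluations within one ODE step can be illustrated with a small sketch. This is not the authors' implementation: the function name `ensemble_direction_step`, the Euler predictor used to reach the intermediate points, and the fixed values of `taus` and `weights` are all assumptions made for illustration. In the paper these parameters are learned by distillation, and the sketch below simply hard-codes them and uses a toy ODE in place of a diffusion model.

```python
import numpy as np

def euler_step(f, x, t, t_next):
    """Plain Euler step, used as the baseline."""
    return x + (t_next - t) * f(x, t)

def ensemble_direction_step(f, x, t, t_next, taus, weights):
    """One step that averages K gradients taken at intermediate times.

    Hypothetical sketch: `taus` (intermediate times) and `weights`
    (combination weights summing to 1) stand in for the learnable
    parameters mentioned in the abstract.
    """
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0), "weights must sum to 1"
    h = t_next - t
    d0 = f(x, t)  # gradient at the current state (assumed Euler predictor)
    # The K evaluations below do not depend on one another, so they could
    # be batched or run on separate devices; here they run in a Python loop.
    grads = [f(x + (tau - t) * d0, tau) for tau in taus]
    direction = sum(w * g for w, g in zip(weights, grads))
    return x + h * direction

# Toy ODE dx/dt = -x with exact solution x(t) = x0 * exp(-t).
f = lambda x, t: -x
x0 = np.array([1.0])
exact = x0 * np.exp(-0.5)

x_euler = euler_step(f, x0, 0.0, 0.5)
x_ens = ensemble_direction_step(f, x0, 0.0, 0.5, taus=[0.2, 0.3], weights=[0.5, 0.5])

err_euler = abs(x_euler - exact)[0]
err_ens = abs(x_ens - exact)[0]
```

On this toy problem the ensemble step lands closer to the exact solution than a single Euler step over the same interval, which is the kind of truncation-error reduction the abstract describes. Note that in this simplified version the predictor gradient `d0` is one extra sequential call; only the K intermediate evaluations are independent of each other.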
