DuoDiff: Accelerating Diffusion Models with a Dual-Backbone Approach
Daniel Gallo Fernández, Răzvan-Andrei Matișan, Alejandro Monroy Muñoz, Ana-Maria Vasilcoiu, Janusz Partyka, Tin Hadži Veljković, Metod Jazbec

TL;DR
DuoDiff accelerates image generation with diffusion models by using two backbones: a shallow denoising network in the early sampling steps and a deeper network in the later steps. This improves both inference speed and generation quality relative to existing early-exit methods.
Contribution
The paper introduces DuoDiff, a novel dual-backbone method that leverages a phase transition in sampling to enhance diffusion model acceleration.
Findings
DuoDiff outperforms existing early-exit methods in speed and quality.
The approach is easy to implement and complements existing acceleration techniques.
Empirical results show significant improvements in inference efficiency.
Abstract
Diffusion models have achieved unprecedented performance in image generation, yet they suffer from slow inference due to their iterative sampling process. To address this, early-exiting has recently been proposed, where the depth of the denoising network is made adaptive based on the (estimated) difficulty of each sampling step. Here, we discover an interesting "phase transition" in the sampling process of current adaptive diffusion models: the denoising network consistently exits early during the initial sampling steps, until it suddenly switches to utilizing the full network. Based on this, we propose accelerating generation by employing a shallower denoising network in the initial sampling steps and a deeper network in the later steps. We demonstrate empirically that our dual-backbone approach, DuoDiff, outperforms existing early-exit diffusion methods in both inference speed and…
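The dual-backbone idea described in the abstract can be sketched as a sampling loop that calls a shallow denoiser for the first few steps and a deep denoiser afterwards. This is a minimal illustration, not the authors' implementation: the names `dual_backbone_sample`, `switch_step`, `toy_shallow`, `toy_deep`, and `toy_update` are hypothetical, the denoisers and the update rule are toy placeholders rather than a real diffusion sampler, and how the switch point is chosen is not stated in this summary, so it is simply passed in as a parameter.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Denoiser = Callable[[Vector, int], Vector]
Update = Callable[[Vector, Vector, int], Vector]


def dual_backbone_sample(
    x: Vector,
    shallow: Denoiser,
    deep: Denoiser,
    num_steps: int,
    switch_step: int,
    update: Update,
) -> Tuple[Vector, Dict[str, int]]:
    """Run `num_steps` reverse steps (t = num_steps - 1 down to 0).

    The first `switch_step` steps (highest noise) use `shallow`;
    the remaining steps use `deep`. `switch_step` is an assumed,
    user-chosen hyperparameter in this sketch.
    """
    calls = {"shallow": 0, "deep": 0}
    for i, t in enumerate(range(num_steps - 1, -1, -1)):
        if i < switch_step:
            eps = shallow(x, t)
            calls["shallow"] += 1
        else:
            eps = deep(x, t)
            calls["deep"] += 1
        x = update(x, eps, t)
    return x, calls


# Toy stand-ins so the sketch runs; real backbones would be neural networks.
def toy_shallow(x: Vector, t: int) -> Vector:
    return [0.5 * v for v in x]


def toy_deep(x: Vector, t: int) -> Vector:
    return [0.9 * v for v in x]


def toy_update(x: Vector, eps: Vector, t: int) -> Vector:
    # Placeholder update: subtract a fixed fraction of the predicted noise.
    return [v - 0.1 * e for v, e in zip(x, eps)]


if __name__ == "__main__":
    sample, calls = dual_backbone_sample(
        [1.0, -2.0], toy_shallow, toy_deep,
        num_steps=10, switch_step=6, update=toy_update,
    )
    print(sample, calls)
```

Because the backbone choice depends only on the step index, no per-step exit decision is needed at inference time; the cost saving comes from how many steps are routed to the cheaper network.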
Taxonomy
Topics: Nuclear reactor physics and engineering · Advanced Mathematical Modeling in Engineering
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion
