A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies
Jinchao Zhu, Yuxuan Wang, Siyuan Pan, Pengfei Wan, Di Zhang, Gao Huang

TL;DR
This paper introduces A-SDM, a set of strategies including model assembly, feature inheritance, and multi-expert convolution to accelerate Stable Diffusion models, reducing computation and increasing speed while maintaining performance.
Contribution
The paper proposes novel tuning and tuning-free methods for model acceleration, including model assembly, feature inheritance, and multi-expert conditional convolution, which improve the speed and efficiency of SDM.
Findings
The lightweight model increases speed by 22.4%.
Feature inheritance boosts generation speed by 40%.
The methods maintain performance while reducing computation.
Abstract
The Stable Diffusion Model (SDM) is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation. Despite various attempts at sampler optimization, model distillation, and network quantization, these approaches typically maintain the original network architecture. The extensive parameter scale and substantial computational demands have limited research into adjusting the model architecture. This study focuses on reducing redundant computation in SDM and optimizes the model through both tuning and tuning-free methods. 1) For the tuning method, we first design a model assembly strategy to reconstruct a lightweight model while preserving performance through distillation. Second, to mitigate performance loss due to pruning, we incorporate multi-expert conditional convolution (ME-CondConv) into compressed UNets to enhance network performance by increasing capacity…
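The abstract does not spell out how ME-CondConv works. As a rough illustration of the general conditional-convolution idea it builds on, here is a minimal 1D sketch in plain Python. It assumes per-input routing weights (average pooling, then one linear unit and a sigmoid per expert) that blend several expert kernels into a single kernel before one convolution is run. All function names, the 1D setting, and the router form are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def routing_weights(signal, router):
    # Assumed router: global average pooling, then one (weight, bias)
    # linear unit per expert, squashed by a sigmoid.
    pooled = sum(signal) / len(signal)
    return [sigmoid(w * pooled + b) for (w, b) in router]


def mix_kernels(experts, weights):
    # Blend the expert kernels into one input-dependent kernel.
    size = len(experts[0])
    return [sum(a * e[i] for a, e in zip(weights, experts)) for i in range(size)]


def conv1d_valid(signal, kernel):
    # Plain "valid" cross-correlation, as used in deep-learning conv layers.
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def cond_conv1d(signal, experts, router):
    # Capacity grows with the number of experts, but only one
    # convolution is executed per input.
    weights = routing_weights(signal, router)
    return conv1d_valid(signal, mix_kernels(experts, weights))


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    experts = [[1.0, 0.0], [0.0, 1.0]]   # two hypothetical expert kernels
    router = [(0.0, 0.0), (0.0, 0.0)]    # zero router: every weight is 0.5
    print(cond_conv1d(signal, experts, router))
```

Because convolution is linear in the kernel, blending the kernels first gives the same result as running every expert separately and blending the outputs. This is why a layer of this kind can add parameters without a matching increase in convolution cost.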
Taxonomy
Topics: Machine Learning and Data Classification · Topic Modeling
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution · Diffusion
