A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies

Jinchao Zhu; Yuxuan Wang; Siyuan Pan; Pengfei Wan; Di Zhang; Gao Huang

arXiv:2406.00210·cs.CV·June 18, 2024

Open Access

TL;DR

This paper introduces A-SDM, a set of strategies including model assembly, feature inheritance, and multi-expert convolution to accelerate Stable Diffusion models, reducing computation and increasing speed while maintaining performance.

Contribution

The paper proposes tuning-based methods (model assembly with distillation, plus multi-expert conditional convolution) and a tuning-free method (feature inheritance) that together improve the speed and efficiency of SDM.

Findings

01 The lightweight model reconstructed by model assembly increases generation speed by 22.4%.

02 Feature inheritance boosts generation speed by 40%.

03 Both the tuning and tuning-free methods maintain performance while reducing computation.
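
To put the reported speed gains in terms of time per image, the short sketch below converts a percentage speed increase into the fraction of the original generation time that remains. It assumes "speed" means throughput (images per unit time), which this page does not state explicitly; the function name `time_fraction` is illustrative only.

```python
def time_fraction(speedup_percent: float) -> float:
    """Fraction of the original per-image time left after a throughput gain.

    Assumption: a gain of p% means throughput is multiplied by (1 + p/100),
    so time per image is divided by the same factor.
    """
    return 1.0 / (1.0 + speedup_percent / 100.0)


# Figures taken from the findings above; labels are for display only.
for label, gain in [("model assembly", 22.4), ("feature inheritance", 40.0)]:
    frac = time_fraction(gain)
    saved = 100.0 * (1.0 - frac)
    print(f"{label}: +{gain}% speed -> {frac:.3f}x time ({saved:.1f}% less)")
```

Under that assumption, a 22.4% speed increase corresponds to roughly 18.3% less time per image, and a 40% increase to roughly 28.6% less.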

Abstract

The Stable Diffusion Model (SDM) is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation. Despite various attempts at sampler optimization, model distillation, and network quantization, these approaches typically maintain the original network architecture. The extensive parameter scale and substantial computational demands have limited research into adjusting the model architecture. This study focuses on reducing redundant computation in SDM and optimizes the model through both tuning and tuning-free methods. First, for the tuning method, we design a model assembly strategy to reconstruct a lightweight model while preserving performance through distillation. Second, to mitigate performance loss due to pruning, we incorporate multi-expert conditional convolution (ME-CondConv) into compressed UNets to enhance network performance by increasing capacity…
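
The idea behind conditional convolution with multiple experts can be sketched as follows. This is a generic CondConv-style layer, not the paper's ME-CondConv: it assumes per-sample sigmoid gates computed from globally averaged features, and it is restricted to 1x1 (pointwise) kernels in plain Python for brevity. All function and variable names are hypothetical.

```python
import math
import random


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def pointwise_conv(x, w):
    """1x1 convolution. x: [C_in][P] (P flattened pixels), w: [C_out][C_in]."""
    n_pix = len(x[0])
    return [[sum(w_row[c] * x[c][p] for c in range(len(x))) for p in range(n_pix)]
            for w_row in w]


def routing_weights(x, router):
    """Global-average-pool each channel, then one sigmoid gate per expert.

    router: [E][C_in], a hypothetical linear routing layer without bias.
    """
    pooled = [sum(ch) / len(ch) for ch in x]
    return [sigmoid(sum(r[c] * pooled[c] for c in range(len(pooled))))
            for r in router]


def cond_conv(x, experts, router):
    """Mix the expert kernels for this sample, then convolve once.

    experts: [E][C_out][C_in]. Capacity grows with E, but only a single
    convolution is executed per sample.
    """
    gates = routing_weights(x, router)
    c_out, c_in = len(experts[0]), len(experts[0][0])
    mixed = [[sum(g * e[o][c] for g, e in zip(gates, experts))
              for c in range(c_in)] for o in range(c_out)]
    return pointwise_conv(x, mixed)


# Tiny demo with random values: 3 input channels, 2 output channels,
# 4 experts, 6 pixels.
rng = random.Random(0)
x = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(3)]
experts = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
           for _ in range(4)]
router = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
y = cond_conv(x, experts, router)
print(len(y), len(y[0]))
```

Because convolution is linear in its kernel, mixing the kernels first gives the same result as running every expert and mixing their outputs, at the cost of one convolution instead of one per expert. That is the usual argument for how this family of layers adds capacity cheaply.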

Taxonomy

Topics: Machine Learning and Data Classification · Topic Modeling

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution · Diffusion