Open-Source Acceleration of Stable-Diffusion.cpp Deployable on All Devices
Jingxu Ng, Cheng Lv, Pu Zhao, Wei Niu, Juyi Lin, Minzhou Pan, Yun Liang, Yanzhi Wang

TL;DR
This paper introduces an optimized version of stable-diffusion.cpp that leverages the Winograd algorithm to significantly accelerate image generation across various models and devices, reducing latency and memory usage.
Contribution
The work presents an optimized implementation of stable-diffusion.cpp using the Winograd algorithm, improving inference speed and efficiency on all devices.
Findings
Up to 2.76x speedup in convolutional layer performance.
Up to 4.79x speedup in overall image generation.
Compatible with multiple stable diffusion models, maintaining accuracy.
Abstract
Stable diffusion plays a crucial role in generating high-quality images. However, image generation is time-consuming and memory-intensive. To address this, stable-diffusion.cpp (Sdcpp) has emerged as an efficient inference framework for accelerating diffusion models. Although Sdcpp is lightweight, its current implementation of the ggml_conv_2d operator is suboptimal, exhibiting both high inference latency and massive memory usage. In this work, we present an optimized version of Sdcpp that leverages the Winograd algorithm to accelerate 2D convolution operations, which are the primary bottleneck in the pipeline. By analyzing both dependent and independent computation graphs, we exploit the device's locality and parallelism to achieve substantial performance improvements. Our framework delivers correct end-to-end results across various stable diffusion models, including SDv1.4,…
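To make the Winograd idea concrete, here is a minimal sketch of the textbook F(2x2, 3x3) variant, which computes a 2x2 output tile from a 4x4 input tile and a 3x3 filter using 16 elementwise multiplications instead of the 36 a direct computation needs. This is an illustration only, not the Sdcpp implementation: the abstract does not state which tile size the authors use, and the helper names (`matmul`, `transpose`, `winograd_tile`, `direct_tile`) are hypothetical.

```python
# Winograd F(2x2, 3x3) on a single tile, in pure Python.
# Assumption: standard transform matrices for the 2x2-output, 3x3-filter case;
# the source does not say which Winograd variant Sdcpp uses.

B_T = [  # input transform (4x4)
    [1, 0, -1, 0],
    [0, 1, 1, 0],
    [0, -1, 1, 0],
    [0, 1, 0, -1],
]
G = [  # filter transform (4x3)
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5],
    [0.0, 0.0, 1.0],
]
A_T = [  # output transform (2x4)
    [1, 1, 1, 0],
    [0, 1, -1, -1],
]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_tile(d, g):
    """Compute a 2x2 output tile from a 4x4 input tile d and a 3x3 filter g."""
    u = matmul(matmul(G, g), transpose(G))      # transformed filter, 4x4
    v = matmul(matmul(B_T, d), transpose(B_T))  # transformed input, 4x4
    # The only data-by-filter multiplications: 16 elementwise products.
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(A_T, m), transpose(A_T))  # output tile, 2x2


def direct_tile(d, g):
    """Reference: direct 3x3 cross-correlation (36 multiplications)."""
    return [
        [sum(d[i + r][j + c] * g[r][c] for r in range(3) for c in range(3))
         for j in range(2)]
        for i in range(2)
    ]


d = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
g = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
w = winograd_tile(d, g)
r = direct_tile(d, g)
assert all(abs(w[i][j] - r[i][j]) < 1e-9 for i in range(2) for j in range(2))
```

The filter transform depends only on the weights, so in an inference setting it can be computed once and reused for every tile. The 36-to-16 multiplication count is a theoretical ratio for this tile size; measured speedups such as those reported above also depend on memory layout and parallelism.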
Taxonomy
Topics: Mathematical Biology Tumor Growth · Parallel Computing and Optimization Techniques · Model Reduction and Neural Networks
Methods: Convolution · Diffusion
