LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights
Thibault Castells, Hyoung-Kyu Song, Bo-Kyeong Kim, Shinkook Choi

TL;DR
LD-Pruner is a structured pruning method for Latent Diffusion Models (LDMs) that reduces inference time and parameter count while preserving performance, enabling deployment on resource-limited devices.
Contribution
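The paper introduces LD-Pruner, a task-agnostic pruning technique designed for latent diffusion models. It measures the effect of pruning in the latent space, which addresses two obstacles specific to these models: the high computational cost of training and the lack of a fast, task-agnostic way to evaluate model performance.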
Findings
Reduced the inference time of Stable Diffusion by 34.9%.
Improved the FID score by 5.2% on the MS-COCO text-to-image (T2I) benchmark.
Effective pruning across text-to-image, unconditional image, and unconditional audio generation tasks.
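To put the inference-time figure in more familiar terms, a percentage reduction in runtime can be converted into a speedup factor. The short sketch below does that arithmetic for the 34.9% figure; the function name is hypothetical and chosen only for illustration. The FID figure is not converted because the baseline FID is not stated here; note only that lower FID is better, so a 5.2% improvement means a lower score.

```python
def speedup_from_time_reduction(reduction_pct):
    """Convert a percentage reduction in runtime into a speedup factor.

    If the pruned model takes (1 - r) of the original time, it is
    1 / (1 - r) times faster.
    """
    return 1.0 / (1.0 - reduction_pct / 100.0)


# 34.9% less inference time, as reported in the findings above
speedup = speedup_from_time_reduction(34.9)
print(f"{speedup:.2f}x")  # prints "1.54x"
```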
Abstract
Latent Diffusion Models (LDMs) have emerged as powerful generative models, known for delivering remarkable results under constrained computational resources. However, deploying LDMs on resource-limited devices remains a complex issue, presenting challenges such as memory consumption and inference speed. To address this issue, we introduce LD-Pruner, a novel performance-preserving structured pruning method for compressing LDMs. Traditional pruning methods for deep neural networks are not tailored to the unique characteristics of LDMs, such as the high computational cost of training and the absence of a fast, straightforward and task-agnostic method for evaluating model performance. Our method tackles these challenges by leveraging the latent space during the pruning process, enabling us to effectively quantify the impact of pruning on model performance, independently of the task at hand.…
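The idea of scoring pruning candidates in the latent space can be illustrated with a small, self-contained sketch. The truncated abstract does not give the scoring formula, so everything below is an assumption made for illustration: the score is taken to be the distance between per-dimension means plus the distance between per-dimension standard deviations of latents produced by the original and the modified model from the same seeds. The names `latent_shift_score`, `rank_operators`, and `toy_generate` are hypothetical, and the toy generator stands in for a real LDM.

```python
import math
import random
from statistics import fmean, pstdev


def latent_stats(latents):
    """Per-dimension mean and standard deviation over a batch of latent vectors."""
    dims = list(zip(*latents))
    return [fmean(d) for d in dims], [pstdev(d) for d in dims]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def latent_shift_score(original_latents, modified_latents):
    """Assumed score: shift in latent mean plus shift in latent std.

    A small score suggests the modification barely changes the latents,
    so the modified operator is a candidate for pruning.
    """
    mean_o, std_o = latent_stats(original_latents)
    mean_m, std_m = latent_stats(modified_latents)
    return euclidean(mean_o, mean_m) + euclidean(std_o, std_m)


def rank_operators(generate_latents, operators, seeds):
    """Rank operators from least to most important.

    generate_latents(disabled_op, seeds) is a hypothetical callable that
    returns one latent vector per seed with `disabled_op` removed
    (or the unmodified model when disabled_op is None).
    """
    reference = generate_latents(None, seeds)
    scores = {
        op: latent_shift_score(reference, generate_latents(op, seeds))
        for op in operators
    }
    return sorted(scores.items(), key=lambda kv: kv[1])


def toy_generate(disabled_op, seeds):
    """Toy stand-in for an LDM: each operator scales seeded noise by a weight."""
    weights = {"conv_a": 1.0, "attn_b": 0.01, "conv_c": 0.3}
    scale = sum(w for name, w in weights.items() if name != disabled_op)
    latents = []
    for seed in seeds:
        rng = random.Random(seed)  # same seed gives the same noise for every variant
        latents.append([scale * rng.gauss(0, 1) for _ in range(4)])
    return latents


ranking = rank_operators(toy_generate, ["conv_a", "attn_b", "conv_c"], range(8))
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

Because the comparison uses only latents generated from fixed seeds, it needs no task-specific metric or decoded output, which is what makes this kind of score task-agnostic. In the toy run, `attn_b` contributes least to the latent and is ranked first as the cheapest operator to remove.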
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Speech Recognition and Synthesis
Methods: Pruning · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion
