LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights

Thibault Castells; Hyoung-Kyu Song; Bo-Kyeong Kim; Shinkook Choi

arXiv:2404.11936 · cs.LG · April 19, 2024 · 1 citation

TL;DR

LD-Pruner is a structured pruning method for Latent Diffusion Models (LDMs) that reduces inference time and parameter count while preserving performance, making deployment on resource-limited devices more practical.

Contribution

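The paper introduces LD-Pruner, a task-agnostic pruning technique that operates in the latent space and is designed specifically for LDMs. It targets two obstacles that generic pruning methods do not address: the high computational cost of training LDMs and the lack of a fast, task-agnostic way to evaluate model performance.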

Findings

01 Reduced the inference time of Stable Diffusion by 34.9%.

02 Improved FID by 5.2% on the MS-COCO T2I benchmark.

03 Effective pruning across text-to-image, unconditional image, and unconditional audio generation tasks.

Abstract

Latent Diffusion Models (LDMs) have emerged as powerful generative models, known for delivering remarkable results under constrained computational resources. However, deploying LDMs on resource-limited devices remains a complex issue, presenting challenges such as memory consumption and inference speed. To address this issue, we introduce LD-Pruner, a novel performance-preserving structured pruning method for compressing LDMs. Traditional pruning methods for deep neural networks are not tailored to the unique characteristics of LDMs, such as the high computational cost of training and the absence of a fast, straightforward and task-agnostic method for evaluating model performance. Our method tackles these challenges by leveraging the latent space during the pruning process, enabling us to effectively quantify the impact of pruning on model performance, independently of the task at hand.…
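
The latent-space evaluation described in the abstract can be illustrated with a toy sketch. It assumes that the impact of removing an operator is scored by how far the removal shifts the mean and standard deviation of the output latents. The scoring formula, the toy operators, and the names `latent_stats`, `latent_shift_score`, and `run` are all hypothetical choices made for this illustration, not the paper's implementation.

```python
import math
import random


def latent_stats(latents):
    """Per-dimension mean and (population) std over a batch of latent vectors."""
    n = len(latents)
    dims = len(latents[0])
    means = [sum(v[d] for v in latents) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((v[d] - means[d]) ** 2 for v in latents) / n)
        for d in range(dims)
    ]
    return means, stds


def latent_shift_score(original_latents, modified_latents):
    """Assumed score: Euclidean distance between latent means plus
    Euclidean distance between latent stds. Lower means less impact."""
    mean_o, std_o = latent_stats(original_latents)
    mean_m, std_m = latent_stats(modified_latents)
    return math.dist(mean_o, mean_m) + math.dist(std_o, std_m)


def run(operators, x, skip=None):
    """Apply operators in order; `skip` removes one operator (toy 'pruning')."""
    for i, op in enumerate(operators):
        if i != skip:
            x = op(x)
    return x


# Toy stand-ins for model operators acting on a 4-dimensional latent.
operators = [
    lambda v: [3.0 * a for a in v],   # strongly reshapes the latents
    lambda v: [a + 1e-3 for a in v],  # nearly a no-op
    lambda v: [a - 0.5 for a in v],   # moderate shift
]

# Fixed inputs so original and modified models are compared on the same data.
random.seed(0)
inputs = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(64)]

baseline = [run(operators, x) for x in inputs]
scores = [
    latent_shift_score(baseline, [run(operators, x, skip=i) for x in inputs])
    for i in range(len(operators))
]

# Operators with the smallest score are the cheapest to remove.
ranking = sorted(range(len(operators)), key=scores.__getitem__)
print(ranking, scores)
```

The scoring needs only forward passes and latent statistics: no task-specific metric and no retraining are involved, which is the property the abstract emphasizes.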

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Speech Recognition and Synthesis

Methods: Pruning · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion