Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

Hongjie Wang; Difan Liu; Yan Kang; Yijun Li; Zhe Lin; Niraj K. Jha; Yuchen Liu

arXiv:2405.05252·cs.CV·May 9, 2024

TL;DR

This paper introduces a run-time, training-free token pruning method for diffusion models. It uses attention maps to identify and prune redundant tokens, reducing computational cost while preserving image quality.

Contribution

The paper proposes AT-EDM, a framework that uses attention maps for training-free token pruning in diffusion models, together with two new algorithms, G-WPR and DSAP.

Findings

01. 38.8% reduction in FLOPs

02. Up to 1.53x speed-up over Stable Diffusion XL

03. FID and CLIP scores similar to those of the full model

Abstract

Diffusion Models (DMs) have exhibited superior performance in generating high-quality and diverse images. However, this exceptional performance comes at the cost of expensive architectural design, particularly due to the attention module heavily used in leading models. Existing works mainly adopt a retraining process to enhance DM efficiency. This is computationally expensive and not very scalable. To this end, we introduce the Attention-driven Training-free Efficient Diffusion Model (AT-EDM) framework that leverages attention maps to perform run-time pruning of redundant tokens, without the need for any retraining. Specifically, for single-denoising-step pruning, we develop a novel ranking algorithm, Generalized Weighted Page Rank (G-WPR), to identify redundant tokens, and a similarity-based recovery method to restore tokens for the convolution operation. In addition, we propose a…
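
The idea of ranking tokens from an attention map can be illustrated with a small PageRank-style power iteration. The sketch below is not the paper's G-WPR algorithm; it is a simplified stand-in that assumes a single row-stochastic attention matrix, and the names `rank_tokens`, `keep_indices`, and `keep_ratio` are hypothetical, chosen only for this example.

```python
from typing import List


def rank_tokens(attn: List[List[float]], iters: int = 10) -> List[float]:
    """Score tokens by propagating importance through an attention map.

    Assumption: attn[j][i] is how strongly query token j attends to
    key token i, and each row sums to 1. This is a plain PageRank-style
    iteration, not the paper's exact G-WPR formulation.
    """
    n = len(attn)
    scores = [1.0 / n] * n  # start from uniform importance
    for _ in range(iters):
        # Token i collects a vote from every token j, weighted by the
        # attention j pays to i and by j's current importance.
        new = [sum(attn[j][i] * scores[j] for j in range(n)) for i in range(n)]
        total = sum(new)
        scores = [s / total for s in new]  # renormalize to sum to 1
    return scores


def keep_indices(scores: List[float], keep_ratio: float) -> List[int]:
    """Return the indices of the highest-scoring tokens, in original order."""
    n_keep = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy 4-token attention map (hypothetical values): tokens 0 and 1
# receive most of the attention, tokens 2 and 3 receive little.
attn = [
    [0.70, 0.20, 0.05, 0.05],
    [0.60, 0.30, 0.05, 0.05],
    [0.50, 0.30, 0.10, 0.10],
    [0.50, 0.30, 0.10, 0.10],
]
scores = rank_tokens(attn)
kept = keep_indices(scores, keep_ratio=0.5)
print(kept)
```

With this toy matrix, the tokens that receive little attention rank lowest and are the ones dropped. The abstract also describes a similarity-based recovery step that restores tokens before convolution; that step is not covered by this sketch.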

Taxonomy

Topics: Neural Networks and Applications

Methods: Convolution · Pruning · Diffusion · Contrastive Language-Image Pre-training