LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

Xuan Shen; Zhao Song; Yufa Zhou; Bo Chen; Yanyu Li; Yifan Gong; Kai Zhang; Hao Tan; Jason Kuen; Henghui Ding; Zhihao Shu; Wei Niu; Pu Zhao; Yanzhi Wang; Jiuxiang Gu

arXiv:2412.12444 · cs.LG · March 24, 2025

TL;DR

LazyDiT introduces a lazy learning framework for diffusion transformers that reduces redundant computations during inference, significantly speeding up the process while maintaining high performance, even on mobile devices.

Contribution

It proposes a novel lazy learning approach that reuses previous computations in diffusion transformers, enabling faster inference without sacrificing accuracy.

Findings

1. LazyDiT outperforms DDIM across multiple models and resolutions.
2. The method achieves better performance than DDIM on mobile devices with similar latency.
3. Experimental results validate the efficiency and effectiveness of lazy computation reuse.

Abstract

Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance and efficacy across various applications. These promising results come at the cost of slow inference, as each denoising step requires running the whole transformer model with a large number of parameters. In this paper, we show that performing the full computation of the model at each diffusion step is unnecessary, as some computations can be skipped by lazily reusing the results of previous steps. Furthermore, we show that the lower bound of similarity between outputs at consecutive steps is notably high, and that this similarity can be linearly approximated using the inputs. To verify these claims, we propose LazyDiT, a lazy learning framework that efficiently leverages cached results from earlier steps to skip redundant computations.…
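
The idea of lazily reusing cached results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name LazyBlock, the sigmoid gate, the fixed weights, and the threshold are all hypothetical, chosen only to show how a linear function of the input could decide whether a block is recomputed or its previous output is reused. The abstract does not describe how LazyDiT actually designs or trains this decision.

```python
import math
from typing import Callable, List, Optional

Vector = List[float]


class LazyBlock:
    """Hypothetical wrapper: skip an expensive block when a linear gate on the
    input predicts that its output would be close to the cached one."""

    def __init__(self, block: Callable[[Vector], Vector],
                 weights: Vector, bias: float, threshold: float = 0.5):
        self.block = block          # the expensive computation (e.g. attention or MLP)
        self.weights = weights      # assumed gate parameters, fixed here for illustration
        self.bias = bias
        self.threshold = threshold  # assumed cut-off for reusing the cache
        self.cache: Optional[Vector] = None
        self.calls = 0              # number of times the block was actually run

    def predicted_similarity(self, x: Vector) -> float:
        # Linear score of the input, squashed to (0, 1) with a sigmoid.
        score = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return 1.0 / (1.0 + math.exp(-score))

    def __call__(self, x: Vector) -> Vector:
        # Reuse the previous step's output when the gate predicts high similarity.
        if self.cache is not None and self.predicted_similarity(x) >= self.threshold:
            return self.cache
        self.calls += 1
        self.cache = self.block(x)
        return self.cache


def toy_block(x: Vector) -> Vector:
    # Stand-in for an expensive transformer sub-module.
    return [2.0 * xi for xi in x]


# A gate that always predicts high similarity runs the block only on the first step.
lazy = LazyBlock(toy_block, weights=[0.0, 0.0], bias=2.0)
lazy_outputs = [lazy([1.0, 1.0 + 0.01 * step]) for step in range(4)]

# A gate that always predicts low similarity recomputes at every step.
eager = LazyBlock(toy_block, weights=[0.0, 0.0], bias=-2.0)
eager_outputs = [eager([1.0, 1.0 + 0.01 * step]) for step in range(4)]
```

In this toy setup the first call always runs the block because the cache is empty. Afterwards, `lazy.calls` stays at 1 while `eager.calls` reaches 4, which shows the trade-off the abstract points to: skipped steps save computation but return the slightly stale cached output.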

Taxonomy

Topics: Neural Networks and Applications

Methods: Diffusion