TinyFusion: Diffusion Transformers Learned Shallow

Gongfan Fang; Kunjun Li; Xinyin Ma; Xinchao Wang

arXiv:2412.01199 · cs.CV · December 3, 2024

TL;DR

TinyFusion is a learnable depth pruning method for diffusion transformers. It compresses and accelerates models with minimal performance loss and applies across various architectures.

Contribution

The paper proposes a differentiable, learnable pruning technique that reduces diffusion transformer depth by directly optimizing the performance of the pruned model after fine-tuning.

Findings

01. Produces shallow models at less than 7% of the original training cost.

02. Attains a 2× speedup with competitive FID scores.

03. Outperforms existing importance-based pruning methods.

Abstract

Diffusion Transformers have demonstrated remarkable capabilities in image generation but often come with excessive parameterization, resulting in considerable inference overhead in real-world applications. In this work, we present TinyFusion, a depth pruning method designed to remove redundant layers from diffusion transformers via end-to-end learning. The core principle of our approach is to create a pruned model with high recoverability, allowing it to regain strong performance after fine-tuning. To accomplish this, we introduce a differentiable sampling technique to make pruning learnable, paired with a co-optimized parameter to simulate future fine-tuning. While prior works focus on minimizing loss or error after pruning, our method explicitly models and optimizes the post-fine-tuning performance of pruned models. Experimental results indicate that this learnable paradigm offers…
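The "differentiable sampling" idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a Gumbel-softmax relaxation over candidate layer masks, which is one common way to make a discrete choice differentiable, and the helper names (`candidate_masks`, `gumbel_softmax`, `soft_layer_mask`) are hypothetical. The sketch uses only the Python standard library, so it shows the forward computation only. In an autodiff framework the same relaxation would let gradients reach the mask logits.

```python
import itertools
import math
import random


def candidate_masks(num_layers, num_keep):
    """Enumerate all binary masks that keep exactly num_keep of num_layers layers."""
    masks = []
    for kept in itertools.combinations(range(num_layers), num_keep):
        masks.append([1.0 if i in kept else 0.0 for i in range(num_layers)])
    return masks


def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed one-hot sample over categories (assumed relaxation)."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    top = max(noisy)  # subtract the max for a numerically stable softmax
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_layer_mask(logits, masks, temperature, rng):
    """Mix the candidate masks using the relaxed sample as weights."""
    weights = gumbel_softmax(logits, temperature, rng)
    num_layers = len(masks[0])
    return [
        sum(w * mask[i] for w, mask in zip(weights, masks))
        for i in range(num_layers)
    ]


rng = random.Random(0)
masks = candidate_masks(num_layers=4, num_keep=2)  # 6 ways to keep 2 of 4 layers
logits = [0.0] * len(masks)  # learnable in a real setup; uniform here
soft_mask = soft_layer_mask(logits, masks, temperature=0.5, rng=rng)
```

Each entry of `soft_mask` lies between 0 and 1 and the entries sum to the number of kept layers, because every candidate mask keeps exactly two layers and the mixing weights sum to one. Lowering `temperature` pushes the sample toward a single hard mask.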

Code & Models

Repositories

vainf/tinyfusion (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications

Methods: Diffusion · Pruning · Focus