Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching
Xinyin Ma, Gongfan Fang, Michael Bi Mi, Xinchao Wang

TL;DR
This paper introduces Learning-to-Cache (L2C), a dynamic caching scheme that significantly reduces inference computation in diffusion transformers without sacrificing much quality, by learning to identify redundant layers across timesteps.
Contribution
The paper proposes a differentiable optimization approach for dynamic layer caching in diffusion transformers. On U-ViT-H/2, it removes up to 93.68% of the computation in the cache steps (46.84% across all steps) with less than a 0.01 drop in FID.
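To make the idea of layer caching concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy residual layers, the `run_step` helper, and the hard-coded `skip` mask are all hypothetical. In L2C the choice of which layers to cache is learned rather than fixed by hand.

```python
from typing import Callable, List, Optional, Tuple

Layer = Callable[[float], float]


def run_step(
    x: float,
    layers: List[Layer],
    cache: List[Optional[float]],
    skip: List[bool],
) -> Tuple[float, int]:
    """Apply residual layers to x, reusing cached layer outputs where allowed.

    skip[i] = True means layer i reuses the output cached at an earlier step.
    Returns the final activation and the number of layers actually computed.
    """
    computed = 0
    for i, layer in enumerate(layers):
        if skip[i] and cache[i] is not None:
            delta = cache[i]          # reuse the cached residual, no compute
        else:
            delta = layer(x)          # run the layer and refresh the cache
            cache[i] = delta
            computed += 1
        x = x + delta                 # residual connection
    return x, computed


# Toy stand-ins for transformer blocks (assumption: scalar activations).
layers: List[Layer] = [lambda x, k=k: 0.1 * k * x for k in range(1, 5)]
cache: List[Optional[float]] = [None] * len(layers)

# Full step: every layer is computed and its output cached.
x_full, n_full = run_step(1.0, layers, cache, [False] * len(layers))

# Cache step: a hypothetical mask marks layers 0, 1 and 3 as redundant.
x_cached, n_cached = run_step(0.9, layers, cache, [True, True, False, True])
```

In this toy run the full step evaluates all four layers, while the cache step evaluates only the one layer the mask leaves active. The graph at each step is fixed once the mask is chosen, which is the property the summary describes as a static computation graph.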
Findings
L2C outperforms samplers such as DDIM and DPM-Solver, as well as prior cache-based methods, at the same inference speed.
On U-ViT-H/2, up to 93.68% of the computation in the cache steps (46.84% across all steps) can be removed with less than a 0.01 drop in FID.
L2C produces a static computation graph optimized for inference speed.
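The abstract gives two figures for U-ViT-H/2: 93.68% of computation removed in the cache steps and 46.84% across all steps. The short sketch below shows how the two relate. It assumes that every step costs the same when nothing is cached and that cache steps make up half of all steps; that share is inferred from the ratio of the two figures, not stated in the text. The function name is hypothetical.

```python
def overall_fraction_removed(cache_step_fraction: float, cache_step_share: float) -> float:
    """Fraction of total computation removed.

    cache_step_fraction: fraction of computation removed within each cache step.
    cache_step_share: share of all steps that are cache steps (assumed, see above).
    """
    return cache_step_fraction * cache_step_share


# 93.68% removed in cache steps, with cache steps assumed to be half of all steps.
overall = overall_fraction_removed(0.9368, 0.5)
print(f"{overall:.2%}")
```

Under these assumptions the overall saving cannot exceed the share of steps that are cache steps, because the remaining steps are computed in full.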
Abstract
Diffusion Transformers have recently demonstrated unprecedented generative capabilities for various tasks. The encouraging results, however, come at the cost of slow inference, since each denoising step requires inference on a transformer model with a large number of parameters. In this study, we make an interesting and somewhat surprising observation: the computation of a large proportion of layers in the diffusion transformer, through introducing a caching mechanism, can be readily removed even without updating the model parameters. In the case of U-ViT-H/2, for example, we may remove up to 93.68% of the computation in the cache steps (46.84% for all steps), with less than 0.01 drop in FID. To achieve this, we introduce a novel scheme, named Learning-to-Cache (L2C), that learns to conduct caching in a dynamic manner for diffusion transformers. Specifically, by leveraging the identical…
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Low-power high-performance VLSI design · Analog and Mixed-Signal Circuit Design
Methods: Diffusion
