OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models
Huanpeng Chu, Wei Wu, Guanyu Fen, Yutao Zhang

TL;DR
OmniCache is a training-free method that accelerates diffusion Transformer models by strategically reusing cached computations across the entire sampling trajectory, reducing computation without sacrificing quality.
Contribution
It introduces a global, sampling-trajectory-based caching strategy for diffusion Transformers, enhancing efficiency without additional training.
Findings
Significant speedup in sampling time
Maintains high generative quality
Effective cache reuse across entire diffusion process
Abstract
Diffusion models have emerged as a powerful paradigm for generative tasks such as image synthesis and video generation, with Transformer architectures further enhancing performance. However, the high computational cost of diffusion Transformers-stemming from a large number of sampling steps and complex per-step computations-presents significant challenges for real-time deployment. In this paper, we introduce OmniCache, a training-free acceleration method that exploits the global redundancy inherent in the denoising process. Unlike existing methods that determine caching strategies based on inter-step similarities and tend to prioritize reusing later sampling steps, our approach originates from the sampling perspective of DIT models. We systematically analyze the model's sampling trajectories and strategically distribute cache reuse across the entire sampling process. This global…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
