Accelerating Diffusion Transformer via Error-Optimized Cache
Junxiang Qiu, Shuo Wang, Jinda Lu, Lin Liu, Houcheng Jiang, Xingyu Zhu, Yanbin Hao

TL;DR
This paper introduces Error-Optimized Cache (EOC) for Diffusion Transformers, significantly reducing caching errors and improving image generation quality without extra computational cost.
Contribution
The paper proposes a novel error-optimized caching method that enhances diffusion transformer sampling efficiency by reducing caching-induced errors.
Findings
Significant FID improvements across various caching levels.
EOC reduces caching errors without increasing computational load.
Enhanced image quality demonstrated on the ImageNet dataset.
Abstract
Diffusion Transformer (DiT) is a crucial method for content generation, but sampling from it is slow. Many studies have used caching to reduce sampling time. Existing caching methods accelerate generation by reusing DiT features from the previous time step and skipping the corresponding calculations in the next step. However, they concentrate on locating and caching low-error modules rather than on reducing caching-induced errors, so the quality of generated content declines sharply as caching intensity increases. To solve this problem, we propose the Error-Optimized Cache (EOC). This method introduces three key improvements: (1) Prior knowledge extraction: extract and process the caching differences; (2) A judgment method for cache optimization: determine whether certain caching steps need to be optimized;…
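The caching trade-off described in the abstract (reuse a module's output from an earlier time step instead of recomputing it) can be illustrated with a toy sketch. This is not the EOC method or the authors' code: `toy_block`, `run_with_cache`, `caching_error`, and the fixed `cache_interval` schedule are hypothetical names and simplifications introduced here, and the input is held constant across steps for brevity.

```python
import math


def toy_block(x, t):
    """Hypothetical stand-in for an expensive DiT module; output depends on step t."""
    return [math.tanh(v + 0.1 * t) for v in x]


def run_with_cache(x, num_steps, cache_interval):
    """Recompute the block every `cache_interval` steps and reuse the cached
    output on the steps in between. Returns (per-step outputs, block calls)."""
    cached = None
    outputs = []
    block_calls = 0
    for t in range(num_steps):
        if t % cache_interval == 0:
            cached = toy_block(x, t)  # full computation, refreshes the cache
            block_calls += 1
        outputs.append(cached)        # otherwise the stale features are reused
    return outputs, block_calls


def caching_error(x, num_steps, cache_interval):
    """Mean absolute difference between cached and exact outputs over all steps.
    Returns (error, block calls)."""
    cached_outs, block_calls = run_with_cache(x, num_steps, cache_interval)
    exact_outs = [toy_block(x, t) for t in range(num_steps)]
    total = 0.0
    count = 0
    for cached, exact in zip(cached_outs, exact_outs):
        for c, e in zip(cached, exact):
            total += abs(c - e)
            count += 1
    return total / count, block_calls


if __name__ == "__main__":
    x = [0.0, 0.5, -0.5]
    for interval in (1, 2, 4):
        err, calls = caching_error(x, num_steps=8, cache_interval=interval)
        print(f"interval={interval} block_calls={calls} mean_abs_error={err:.4f}")
```

Under these assumptions, an interval of 1 recomputes the block at every step and incurs no caching error, while larger intervals save block calls at the cost of a growing error. That growing error is what the abstract says existing methods leave unaddressed.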
Taxonomy
Topics: Neural Networks and Applications · Analog and Mixed-Signal Circuit Design · Low-power high-performance VLSI design
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Residual Connection · Multi-Head Attention · Label Smoothing · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Softmax
