Accelerating Diffusion Transformer via Error-Optimized Cache

Junxiang Qiu; Shuo Wang; Jinda Lu; Lin Liu; Houcheng Jiang; Xingyu Zhu; Yanbin Hao

arXiv:2501.19243·cs.CV·July 21, 2025

TL;DR

This paper introduces Error-Optimized Cache (EOC) for Diffusion Transformers, significantly reducing caching errors and improving image generation quality without extra computational cost.

Contribution

The paper proposes a novel error-optimized caching method that enhances diffusion transformer sampling efficiency by reducing caching-induced errors.

Findings

01 Significant FID improvements across various caching levels.

02 EOC reduces caching errors without increasing computational load.

03 Enhanced image quality demonstrated on the ImageNet dataset.

Abstract

Diffusion Transformer (DiT) is a crucial method for content generation. However, it needs a lot of time to sample. Many studies have attempted to use caching to reduce the time consumption of sampling. Existing caching methods accelerate generation by reusing DiT features from the previous time step and skipping calculations in the next, but they tend to locate and cache low-error modules without focusing on reducing caching-induced errors, resulting in a sharp decline in generated content quality when increasing caching intensity. To solve this problem, we propose the Error-Optimized Cache (EOC). This method introduces three key improvements: (1) Prior knowledge extraction: Extract and process the caching differences; (2) A judgment method for cache optimization: Determine whether certain caching steps need to be optimized;…
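
The baseline caching idea the abstract describes (reuse a module's output from an earlier time step and skip recomputing it) can be illustrated with a toy sketch. This is not the EOC method and not the authors' code: `toy_block`, `sample`, the `cache_interval` parameter, and the 0.05 update scale are all hypothetical names and values chosen for illustration, and a cheap `tanh` stands in for an expensive DiT module.

```python
import math


def toy_block(x, t):
    """Hypothetical stand-in for an expensive DiT module (e.g. attention or MLP)."""
    return [math.tanh(v + 0.1 * t) for v in x]


def sample(x, num_steps, cache_interval):
    """Toy residual sampling loop with feature caching.

    The block is recomputed only when t is a multiple of `cache_interval`;
    on all other steps the most recently computed output is reused.
    Returns the final state and the number of block evaluations performed.
    """
    cached = None
    calls = 0
    for t in range(num_steps):
        if cached is None or t % cache_interval == 0:
            cached = toy_block(x, t)  # full computation, refresh the cache
            calls += 1
        # On skipped steps, `cached` is a stale feature from an earlier step.
        x = [v + 0.05 * c for v, c in zip(x, cached)]
    return x, calls


x0 = [0.5, -1.0, 2.0]
exact, exact_calls = sample(x0, num_steps=20, cache_interval=1)  # no caching
fast, fast_calls = sample(x0, num_steps=20, cache_interval=4)    # cache 3 of every 4 steps

# Caching-induced error: deviation of the cached run from the uncached run.
caching_error = max(abs(a - b) for a, b in zip(exact, fast))
print(exact_calls, fast_calls, caching_error)
```

In this sketch a larger `cache_interval` means fewer block evaluations but more steps driven by stale features, which is the quality-versus-caching-intensity trade-off the abstract says EOC targets.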

Taxonomy

Topics: Neural Networks and Applications · Analog and Mixed-Signal Circuit Design · Low-power high-performance VLSI design

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Residual Connection · Multi-Head Attention · Label Smoothing · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Softmax