Forecasting When to Forecast: Accelerating Diffusion Models with Confidence-Gated Taylor

Xiaoliu Guan; Lielin Jiang; Hanqi Chen; Xu Zhang; Jiaxing Yan; Guanzhong Wang; Yi Liu; Zetao Zhang; Yu Wu

arXiv:2508.02240·cs.CV·November 11, 2025

TL;DR

This paper introduces a dynamic Taylor-based acceleration method for diffusion transformers that gates each Taylor forecast on its estimated accuracy instead of following a fixed caching schedule, achieving speedups of 2.36x to 4.14x with minimal quality loss.

Contribution

It moves Taylor prediction from the module level to the last-block level and replaces the fixed caching schedule with error-based dynamic caching, improving both the speed and the reliability of diffusion model inference.

Findings

01. Achieves 3.17x acceleration on FLUX

02. Achieves 2.36x acceleration on DiT

03. Achieves 4.14x acceleration on Wan Video

Abstract

Diffusion Transformers (DiTs) have demonstrated remarkable performance in visual generation tasks. However, their low inference speed limits their deployment in low-resource applications. Recent training-free approaches exploit the redundancy of features across timesteps by caching and reusing past representations to accelerate inference. Building on this idea, TaylorSeer instead uses cached features to predict future ones via Taylor expansion. However, its module-level prediction across all transformer blocks (e.g., attention or feedforward modules) requires storing fine-grained intermediate features, leading to notable memory and computation overhead. Moreover, it adopts a fixed caching schedule without considering the varying accuracy of predictions across timesteps, which can lead to degraded outputs when prediction fails. To address these limitations, we propose a novel approach to…
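
The two ideas named above, predicting a future feature from cached ones via Taylor expansion and deciding per step whether that prediction can be trusted, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract is truncated before the method is described, so the sketch assumes backward finite differences as stand-ins for derivatives and a hypothetical relative-error threshold as the gate. All function names, the `threshold` value, and the idea of comparing against a freshly computed reference feature are assumptions made for illustration.

```python
from math import factorial


def finite_differences(history):
    """Backward differences at the newest cached step.

    history: list of feature vectors (lists of floats) from equally spaced
    past timesteps, oldest first. Returns [f, Δf, Δ²f, ...].
    """
    diffs = [list(history[-1])]
    level = [list(h) for h in history]
    while len(level) > 1:
        # Difference each consecutive pair to get the next-order differences.
        level = [[b - a for a, b in zip(prev, cur)]
                 for prev, cur in zip(level, level[1:])]
        diffs.append(level[-1])
    return diffs


def taylor_forecast(history, k):
    """Forecast the feature k steps ahead as sum_i Δ^i f * k^i / i!."""
    diffs = finite_differences(history)
    dim = len(diffs[0])
    return [sum(d[j] * k ** i / factorial(i) for i, d in enumerate(diffs))
            for j in range(dim)]


def relative_error(pred, actual):
    """L2 error of pred relative to the norm of actual."""
    num = sum((p - a) ** 2 for p, a in zip(pred, actual)) ** 0.5
    den = sum(a * a for a in actual) ** 0.5
    return num / (den + 1e-8)


def should_use_forecast(pred, actual, threshold=0.05):
    """Hypothetical gate: accept the forecast only if its error is small.

    `actual` is assumed to be a cheaply computed reference feature;
    the threshold value is illustrative, not taken from the paper.
    """
    return relative_error(pred, actual) <= threshold


# Usage: three cached one-dimensional features that grow linearly.
cached = [[1.0], [2.0], [3.0]]
pred = taylor_forecast(cached, k=1)
use_it = should_use_forecast(pred, [4.1])
```

In this toy setup a gate that returns `False` would correspond to falling back to the full computation for that timestep, which is the behavior a fixed caching schedule cannot provide.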
