InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models

Zihao Wu

arXiv:2512.05134 · cs.CV · December 8, 2025

TL;DR

InvarDiff is a training-free method that accelerates diffusion model sampling by exploiting feature invariance across timesteps and layers, achieving 2-3x speed-ups with minimal quality loss.

Contribution

We introduce InvarDiff, a novel caching technique leveraging cross-scale invariance in diffusion models for faster inference without retraining.

Findings

01

Achieves 2-3x speed-up in diffusion sampling.

02

Maintains high fidelity with minimal quality degradation.

03

Applicable to models like DiT and FLUX.

Abstract

Diffusion models deliver high-fidelity synthesis but remain slow due to iterative sampling. We empirically observe feature invariance in deterministic sampling and present InvarDiff, a training-free acceleration method that exploits the relative temporal invariance across timestep-scale and layer-scale. From a few deterministic runs, we compute a per-timestep, per-layer, per-module binary cache plan matrix and use a re-sampling correction to avoid drift when consecutive caches occur. Built from quantile-based change metrics, this matrix specifies which module at which step is reused rather than recomputed. The same invariance criterion is applied at the step scale to enable cross-timestep caching, deciding whether an entire step can reuse cached results. During inference, InvarDiff performs step-first and layer-wise caching guided by this matrix. When applied to DiT and FLUX,…
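As a rough illustration of the cache plan idea described in the abstract, the sketch below builds a per-timestep, per-layer, per-module binary plan from a few recorded runs and then counts how many module evaluations the plan would leave. It is not the authors' implementation. The relative L1 change metric, the single global quantile threshold, and the names `relative_change`, `quantile`, `build_cache_plan`, and `count_module_calls` are all assumptions made for this example. The step-level (cross-timestep) decision and the re-sampling correction are omitted.

```python
from itertools import product


def relative_change(prev, curr, eps=1e-8):
    """Relative L1 change between two feature vectors (an assumed metric)."""
    num = sum(abs(c - p) for p, c in zip(prev, curr))
    den = sum(abs(p) for p in prev) + eps
    return num / den


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def build_cache_plan(runs, q=0.5):
    """Build plan[t][l][m], where True means "reuse the cached output".

    runs[r][t][l][m] is the feature vector that module m of layer l produced
    at step t of deterministic run r. At least two steps are required.
    """
    n_steps = len(runs[0])
    n_layers = len(runs[0][0])
    n_modules = len(runs[0][0][0])

    # Average change of each module output relative to the previous step.
    change = {}
    for t, l, m in product(range(1, n_steps), range(n_layers), range(n_modules)):
        scores = [relative_change(r[t - 1][l][m], r[t][l][m]) for r in runs]
        change[t, l, m] = sum(scores) / len(scores)

    # Assumption: one global threshold taken as a quantile of all scores.
    threshold = quantile(list(change.values()), q)

    # Step 0 has no previous output, so it is always computed (all False).
    plan = [[[False] * n_modules for _ in range(n_layers)] for _ in range(n_steps)]
    for (t, l, m), score in change.items():
        plan[t][l][m] = score <= threshold
    return plan


def count_module_calls(plan):
    """Walk the plan step by step and count modules that get recomputed."""
    cached, calls = set(), 0
    for step in plan:
        for l, layer in enumerate(step):
            for m, reuse in enumerate(layer):
                if reuse and (l, m) in cached:
                    continue  # reuse the cached output, no computation
                cached.add((l, m))
                calls += 1
    return calls
```

With synthetic features in which one module's output never changes across steps and another's changes at every step, the plan marks only the unchanging module for reuse after the first step. How closely this matches the paper's actual metric and thresholding is not something the abstract settles.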

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Image and Video Quality Assessment