HiCache: A Plug-in Scaled-Hermite Upgrade for Taylor-Style Cache-then-Forecast Diffusion Acceleration

Liang Feng; Shikang Zheng; Jiacheng Liu; Yuqi Lin; Qinming Zhou; Peiliang Cai; Xinyu Wang; Junjie Chen; Chang Zou; Yue Ma; and Linfeng Zhang

arXiv:2508.16984 · cs.CV · January 27, 2026

TL;DR

HiCache introduces a Hermite polynomial-based feature caching method that accelerates diffusion model inference by predicting cached features more accurately, reaching a reported 5.55x speedup on FLUX.1-dev while maintaining or improving output quality across multiple content generation tasks.

Contribution

The paper presents a novel, training-free acceleration framework using Hermite polynomials for Gaussian-like feature prediction in diffusion models, enhancing existing caching methods.

Findings

01. Achieves 5.55x speedup on FLUX.1-dev with maintained or improved quality.

02. Enhances performance of previous caching methods, e.g., ClusCa.

03. Effective across text-to-image, video, and super-resolution tasks.

Abstract

Diffusion models have achieved remarkable success in content generation but often incur prohibitive computational costs due to iterative sampling. Recent feature caching methods accelerate inference via temporal extrapolation, yet can suffer quality degradation from inaccurate modeling of the complex dynamics of feature evolution. We propose HiCache (Hermite Polynomial-based Feature Cache), a training-free acceleration framework that improves feature prediction by aligning mathematical tools with empirical properties. Our key insight is that feature-derivative approximations in diffusion Transformers exhibit multivariate Gaussian characteristics, motivating the use of Hermite polynomials as a potentially optimal basis for Gaussian-correlated processes. We further introduce a dual-scaling mechanism that ensures numerical stability while preserving predictive accuracy, and is also…
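The basis swap described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's code: the function names, the use of physicists' Hermite polynomials, the scaled form sigma**n * H_n(sigma * x), and the forecast that keeps the Taylor-style coefficients (finite-difference estimates divided by i!) while replacing the monomial k**i with the scaled Hermite term.

```python
from math import factorial


def hermite_phys(n, x):
    """Physicists' Hermite polynomial H_n(x) via the three-term recurrence
    H_{k+1}(x) = 2x H_k(x) - 2k H_{k-1}(x), with H_0 = 1 and H_1 = 2x."""
    h_prev, h_curr = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h_curr = h_curr, 2.0 * x * h_curr - 2.0 * k * h_prev
    return h_curr


def scaled_hermite(n, x, sigma):
    # ASSUMPTION: one plausible reading of the scaling mechanism, in which
    # both the argument and the output are scaled by sigma.
    return sigma ** n * hermite_phys(n, sigma * x)


def taylor_forecast(f0, diffs, k):
    """Taylor-style forecast k steps ahead of the last fully computed step.
    diffs[i-1] is a (hypothetical) finite-difference estimate of the
    i-th derivative of the cached feature."""
    return f0 + sum(d * k ** i / factorial(i)
                    for i, d in enumerate(diffs, start=1))


def hermite_forecast(f0, diffs, k, sigma):
    """Same coefficients as taylor_forecast, but with the monomial k**i
    replaced by the scaled Hermite basis (illustrative assumption)."""
    return f0 + sum(d * scaled_hermite(i, k, sigma) / factorial(i)
                    for i, d in enumerate(diffs, start=1))


if __name__ == "__main__":
    f0, diffs, k = 1.0, [2.0, 4.0], 3.0
    print("Taylor :", taylor_forecast(f0, diffs, k))
    print("Hermite:", hermite_forecast(f0, diffs, k, sigma=2 ** -0.5))
```

In this sketch, choosing sigma = 1/sqrt(2) makes the first-order Hermite term equal to the first-order Taylor term, so the two forecasts differ only from second order onward. Scalars stand in for feature tensors to keep the example dependency-free.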

Peer Reviews

Decision: ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 2

Strengths

1. The paper addresses a practical and underexplored efficiency bottleneck specific to diffusion-based language models, where repeated denoising iterations make KV caching more memory-intensive than in autoregressive models. This focus is well motivated and timely.
2. HiCache successfully adapts hierarchical caching concepts from systems design to the setting of diffusion language models. The integration into the diffusion pipeline is neat and minimally invasive, requiring no retraining or archi

Weaknesses

Although HiCache is well designed and practically useful, its applicability is limited to diffusion-based language models. The caching and reuse patterns exploited here rely on the iterative refinement process of diffusion models, which differ substantially from autoregressive decoding. The evaluation focuses on throughput and memory reduction, but latency variance and system scalability are not thoroughly discussed. Diffusion inference involves synchronized denoising steps, so delayed cold-ca

Reviewer 02 · Rating 6 · Confidence 3

Strengths

The key strengths of the paper lie in its strong theoretical foundation and practical effectiveness. HiCache introduces a principled improvement over Taylor-based caching by recognizing that diffusion transformer features evolve according to approximately Gaussian dynamics. By replacing Taylor’s monomial basis with scaled Hermite polynomials, which are theoretically optimal for Gaussian-correlated processes, the method provides a mathematically sound and statistically aligned framework for featu

Weaknesses

The main weaknesses of the paper stem from its scope, assumptions, and evaluation coverage. HiCache is designed specifically for Diffusion Transformers (DiTs) and relies heavily on the assumption that feature derivatives follow Gaussian statistics. While this is empirically validated for certain architectures like FLUX, the assumption may not hold universally across other diffusion models, such as U-Net–based or multi-modal architectures. Likewise, the framework’s reliance on Hermite polynomials

Reviewer 03 · Rating 6 · Confidence 5

Strengths

1. The paper replaces Taylor’s monomial basis with Hermite polynomials derived from Gaussian feature correlations, leveraging Karhunen–Loève optimality and a single scaling factor σ to improve stability and accuracy.
2. HiCache preserves almost the same implementation form as TaylorSeer, merely replacing the polynomial basis and adding a few scalar evaluations, thereby allowing direct integration into any feature caching–based acceleration framework with negligible computational overhead.
3. Ext

Weaknesses

1. The paper primarily relies on automated metrics such as PSNR, SSIM, LPIPS, and VBench. Incorporating human preference evaluations would make the assessment more convincing.
2. The paper heavily relies on the scaling factor σ to stabilize predictions, yet it lacks a principled rule or analysis on how to select or adapt σ across architectures, acceleration ratios, or polynomial orders.
