Test-Time Iterative Error Correction for Efficient Diffusion Models
Yunshan Zhong, Weiqi Yan, Yuxin Zhang

TL;DR
This paper introduces Iterative Error Correction (IEC), a test-time method that significantly improves the quality of efficient diffusion models by iteratively refining outputs, reducing error propagation from exponential to linear growth without retraining.
Contribution
The paper presents IEC, a novel test-time correction technique that enhances diffusion model outputs by mitigating approximation errors without modifying the model architecture.
Findings
IEC improves image quality across multiple datasets.
IEC reduces error growth from exponential to linear.
IEC is compatible with various models and efficiency techniques.
Abstract
With the growing demand for high-quality image generation on resource-constrained devices, efficient diffusion models have received increasing attention. However, such models suffer from approximation errors introduced by efficiency techniques, which significantly degrade generation quality. Once deployed, these errors are difficult to correct, as modifying the model is typically infeasible in deployment environments. Through an analysis of error propagation across diffusion timesteps, we reveal that these approximation errors can accumulate exponentially, severely impairing output quality. Motivated by this insight, we propose Iterative Error Correction (IEC), a novel test-time method that mitigates inference-time errors by iteratively refining the model's output. IEC is theoretically proven to reduce error propagation from exponential to linear growth, without requiring any retraining…
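To make the difference between exponential and linear error growth concrete, the following toy sketch iterates a simple scalar recurrence, e_{t+1} = a · e_t + b. This recurrence, the function name `propagate_error`, and the values chosen for `amplification` and `per_step_error` are illustrative assumptions, not the paper's error model or its bound. With a > 1 the accumulated error compounds across timesteps; with a = 1 it grows only in proportion to the number of steps.

```python
def propagate_error(num_steps, amplification, per_step_error):
    """Accumulated error after each step of the toy recurrence e_{t+1} = a * e_t + b.

    Hypothetical model for illustration only; not the paper's analysis.
    """
    errors = []
    e = 0.0
    for _ in range(num_steps):
        # Previous error is scaled by `amplification`, then a fresh
        # per-step approximation error is added.
        e = amplification * e + per_step_error
        errors.append(e)
    return errors


# a > 1: each step amplifies what came before (exponential growth).
uncorrected = propagate_error(100, amplification=1.05, per_step_error=0.01)
# a = 1: errors only add up (linear growth).
corrected = propagate_error(100, amplification=1.0, per_step_error=0.01)

print(f"final error, amplified:     {uncorrected[-1]:.3f}")
print(f"final error, non-amplified: {corrected[-1]:.3f}")
```

In this toy setting the per-step error is identical in both runs; only the amplification factor differs, which is the quantity the abstract says IEC brings under control.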
Peer Reviews
Decision: ICLR 2026 Poster
- IEC is model-agnostic, requires no retraining, and can be dropped into existing inference pipelines. This plug-and-play nature makes it highly practical for real deployments.
- The method is evaluated across several models and multiple datasets. Improvements are consistent, with especially notable gains for aggressive efficiency settings.
- The paper includes ablations over λ, iteration count K, and which timesteps are refined, and a brief comparison to naïvely adding iterations, helping under
- For Stable Diffusion, IEC is applied only at the first timestep, yielding modest improvements. This raises concerns about the practical benefits at scale if the method cannot be applied more broadly due to compute.
- W4A8 on LSUN-Bedrooms “collapses” without detail. A short analysis of when IEC helps vs. cannot rescue severe degradation would be valuable, potentially guiding users to regimes where IEC is most beneficial.
- Most experiments use T=100 (or 250). Since many efficient systems run a
-- Problem: It targets the practical, post-deployment scenario for efficient models, which is somewhat under-explored.
-- Idea: It is related to test-time scaling, and potentially provides another new dimension in which we can scale (the inner-loop iteration at each timestep).
-- Analysis: The theoretical analysis is of good quality. It is also appreciated that the authors try to validate their method on multiple models and settings.
-- The theoretical framework seems general and not limited to errors from efficiency techniques. It is unclear how the definitions $\tilde x_t = x_t + \delta_t$ and $\tilde\epsilon_\theta = \epsilon_\theta + \epsilon_\theta^\delta$ are necessarily linked to quantization or caching. The framework seems potentially applicable to any diffusion model, where $x_t$ and $\epsilon_\theta$ correspond to an ideal denoiser (please refer to Figure 1 in EDM). It would strengthen the paper if the authors could show
1. Clear problem motivation (deployed models can't be easily modified), so a training-free method is needed.
2. The ability to apply IEC to selected timesteps (±1/10, ±1/20) provides efficiency options, though this flexibility is expected from any refinement method.
3. IEC shows gains across all tested scenarios, demonstrating broad applicability.
4. The paper is generally easy to follow, with good visual aids.
Fundamentally Unfair Comparison:
1. The baseline uses 100 forward passes, while IEC with K=1 applied to all T=100 steps uses 200 forward passes.
2. Missing crucial baseline: what is the FID with T=200 steps without IEC?
3. This is like claiming "Method A is better than Method B" when Method A uses 2× computation.
IEC Is Essentially Multi-Step Refinement:
1. Eq. 10 is mathematically equivalent to: repeatedly calling the denoising function with corrected inputs
2. This is essentially running multiple sam
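The compute accounting behind this objection can be sketched in a few lines. The helper `forward_passes` is hypothetical, and it assumes that each correction iteration costs exactly one extra network evaluation, which matches the reviewer's count (K=1 on all T=100 steps gives 200 passes) but is not confirmed by the excerpts shown here.

```python
def forward_passes(num_timesteps, num_refined_steps, iterations_per_step):
    """Count network evaluations for a sampling run.

    Assumption (illustrative): one evaluation per timestep, plus
    `iterations_per_step` (K) extra evaluations at each refined timestep.
    """
    if not 0 <= num_refined_steps <= num_timesteps:
        raise ValueError("num_refined_steps must lie in [0, num_timesteps]")
    if iterations_per_step < 0:
        raise ValueError("iterations_per_step must be non-negative")
    return num_timesteps + num_refined_steps * iterations_per_step


baseline = forward_passes(100, num_refined_steps=0, iterations_per_step=0)
iec_all_steps = forward_passes(100, num_refined_steps=100, iterations_per_step=1)
iec_first_step_only = forward_passes(100, num_refined_steps=1, iterations_per_step=1)

print(baseline, iec_all_steps, iec_first_step_only)
```

Under this accounting, refining every timestep doubles the cost, while refining a single timestep adds only K evaluations, which is why a matched-compute baseline such as T=200 without IEC is the comparison the reviewer asks for.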
Taxonomy
Topics: Image Enhancement Techniques · Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis
