TL;DR
RestoreVAR is a VAR-based image restoration method that surpasses LDM-based generative models in restoration quality while running over 10x faster, making all-in-one image restoration more practical.
Contribution
It presents RestoreVAR, a novel VAR-based approach to all-in-one image restoration whose architectural enhancements yield better restoration quality than LDM-based methods and over 10x faster inference.
Findings
RestoreVAR outperforms LDM-based models in restoration quality.
RestoreVAR achieves over 10 times faster inference than LDM-based models.
RestoreVAR demonstrates strong generalization capabilities.
Abstract
The use of latent diffusion models (LDMs) such as Stable Diffusion has significantly improved the perceptual quality of All-in-One image Restoration (AiOR) methods, while also enhancing their generalization capabilities. However, these LDM-based frameworks suffer from slow inference due to their iterative denoising process, rendering them impractical for time-sensitive applications. Visual autoregressive modeling (VAR), a recently introduced approach for image generation, performs scale-space autoregression and achieves comparable performance to that of state-of-the-art diffusion transformers with drastically reduced computational costs. Moreover, our analysis reveals that coarse scales in VAR primarily capture degradations while finer scales encode scene detail, simplifying the restoration process. Motivated by this, we propose RestoreVAR, a novel VAR-based generative approach for AiOR…
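The coarse-to-fine idea behind scale-space autoregression can be illustrated with a toy residual decomposition. This is a minimal sketch, not RestoreVAR's or VAR's tokenizer: it omits vector quantization and all learned components, and the average-pool downsampling, nearest-neighbour upsampling, scale sizes, and function names (`downsample`, `upsample`, `residual_scales`, `reconstruct`) are assumptions chosen for illustration. Each scale stores what the coarser scales failed to explain, so coarse entries hold global content and finer entries hold local detail.

```python
import numpy as np

def downsample(x, size):
    """Average-pool a square (H, H) array to (size, size); H must be divisible by size."""
    f = x.shape[0] // size
    return x.reshape(size, f, size, f).mean(axis=(1, 3))

def upsample(x, size):
    """Nearest-neighbour upsample a square array to (size, size)."""
    f = size // x.shape[0]
    return np.repeat(np.repeat(x, f, axis=0), f, axis=1)

def residual_scales(feat, sizes=(1, 2, 4, 8, 16)):
    """Split a feature map into coarse-to-fine residual maps (hypothetical helper)."""
    h = feat.shape[0]
    residual = feat.astype(np.float64).copy()
    scales = []
    for s in sizes:
        coarse = downsample(residual, s)           # what this scale can represent
        scales.append(coarse)
        residual = residual - upsample(coarse, h)  # leave the rest to finer scales
    return scales

def reconstruct(scales, size):
    """Sum the upsampled scales back into a full-resolution map."""
    return sum(upsample(c, size) for c in scales)

rng = np.random.default_rng(0)
feat = rng.normal(size=(16, 16))
scales = residual_scales(feat)
recon = reconstruct(scales, 16)
```

Because the last scale in this sketch is at full resolution, summing the upsampled scales recovers the input exactly. A real tokenizer quantizes each scale, so its reconstruction is only approximate.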
Peer Reviews
Decision·ICLR 2026 Poster
The paper provides a nice visualization of what is encoded in each scale and finds that VAR captures degradations predominantly in coarse scales and scene-level details in fine scales. The paper argues that a generative model offers strong generalization and convincingly demonstrates this with performance on real-world degradations as well as human preference. Overall, it's a well-executed and well-presented paper.
Fine-tuning the VAE decoder and adding a Latent Refiner Transformer (LRT) are critical to the final performance of the model. At the same time, these techniques seem transferable to other methods, so it is unclear whether the success can be attributed solely to VAR.
* The paper extends VAR for image restoration tasks, and discusses how restoration tasks can be framed within this ‘next-scale’ paradigm through the scale space analysis (Sec 3.2).
* The proposed method performs better than other generative methods in terms of PSNR, SSIM and LPIPS (Table 1) and is also significantly faster than these prior generative methods.
* Experimental comparisons across numerous state-of-the-art generative and non-generative methods over standard datasets.
* In terms of PSNR, SSIM and LPIPS, the proposed method is poorer than existing non-generative baselines. The authors suggest that the generative nature of the proposed method provides better generalisation capabilities and attempt to demonstrate this with comparison on other datasets using referenceless IQA metrics. Given the differences in the datasets, methods benchmarked against (generative and non-generative in Table 1 vs only non-generative in Table 2), and metrics, it is hard to place whe
1. RestoreVAR is the first work to apply Visual Autoregressive Modeling (VAR) to image restoration tasks in a generative setting, demonstrating the potential of autoregressive models beyond traditional generation tasks.
2. Across five degradation types, RestoreVAR achieves state-of-the-art performance among current LDM-based generative approaches, while also providing a 10× faster inference speed.
3. The ablation studies are well-designed and verify the effectiveness of each component in the prop
1. The paper compares RestoreVAR with both non-generative AiOR models and LDM-based generative methods. However, on many standard metrics, non-generative methods still outperform RestoreVAR. Although the paper acknowledges this and attributes the limitation to the VAE, it could further clarify when and why a generative approach is preferable in practice and quantify the trade-offs.
2. The fine-tuning of the VAE decoder on continuous latent variables seems like a heuristic workaround, but the pape
Taxonomy
Methods: Diffusion
