InstantIR: Blind Image Restoration with Instant Generative Reference
Jen-Yuan Huang, Haofan Wang, Qixun Wang, Xu Bai, Hao Ai, Peng Xing, Jen-Tse Huang

TL;DR
InstantIR is a diffusion-based blind image restoration method that dynamically uses a reference image and adaptive sampling to handle unknown degradations, achieving state-of-the-art results and enabling creative restoration.
Contribution
The paper introduces InstantIR, a novel diffusion-based approach that adaptively incorporates generative references during inference for improved blind image restoration.
Findings
Achieves state-of-the-art performance on benchmark datasets.
Effectively handles extreme degradations with creative restoration.
Uses variance of references as an indicator for adaptive sampling.
Abstract
Handling test-time unknown degradation is the major challenge in Blind Image Restoration (BIR), necessitating high model generalization. An effective strategy is to incorporate prior knowledge, either from human input or from a generative model. In this paper, we introduce Instant-reference Image Restoration (InstantIR), a novel diffusion-based BIR method which dynamically adjusts the generation condition during inference. We first extract a compact representation of the input via a pre-trained vision encoder. At each generation step, this representation is used to decode the current diffusion latent and instantiate it in the generative prior. The degraded image is then encoded with this reference, providing a robust generation condition. We observe that the variance of generative references fluctuates with degradation intensity, which we further leverage as an indicator for developing a sampling algorithm…
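The control flow described in the abstract can be sketched as a toy loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `sample_references`, `reference_weight`, and `restore` are hypothetical names, the "encoder" is an identity stand-in, images are flat lists of floats, and the rule that maps reference variance to a blending weight is assumed, since the excerpt does not give the actual sampling algorithm.

```python
import random
from statistics import fmean, pvariance


def sample_references(latent, representation, rng, n=4):
    """Hypothetical previewer: draw n candidate references around the
    representation. Tying the spread to the latent's distance from the
    representation is an illustrative choice only."""
    spread = fmean(abs(z - r) for z, r in zip(latent, representation))
    return [[r + spread * rng.gauss(0.0, 1.0) for r in representation]
            for _ in range(n)]


def reference_weight(references, scale=1.0):
    """Map the mean per-pixel variance across references to a weight in
    (0, 1]. Assumed mapping: higher variance gives a lower weight."""
    variance = fmean(pvariance(pixel) for pixel in zip(*references))
    return 1.0 / (1.0 + scale * variance), variance


def restore(lq, steps=8, seed=0):
    """Toy restoration loop over a flat list of pixel values."""
    rng = random.Random(seed)
    representation = list(lq)                   # stand-in for the encoder output
    latent = [rng.gauss(0.0, 1.0) for _ in lq]  # start from noise
    history = []
    for _ in range(steps):
        # A fresh set of references is generated at every step.
        refs = sample_references(latent, representation, rng)
        weight, variance = reference_weight(refs)
        ref_mean = [fmean(pixel) for pixel in zip(*refs)]
        # The condition blends the reference with the degraded input.
        condition = [weight * m + (1.0 - weight) * x
                     for m, x in zip(ref_mean, lq)]
        # Toy update standing in for one denoising step.
        latent = [z + 0.5 * (c - z) for z, c in zip(latent, condition)]
        history.append((variance, weight))
    return latent, history
```

Calling `restore([i / 63 for i in range(64)])` returns the final latent together with a per-step trace of `(variance, weight)` pairs, which makes it easy to see how an indicator derived from the references could steer the amount of guidance at each step.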
Peer Reviews
Decision·Submitted to ICLR 2025
Good theoretical analysis and well written manuscript. Some experimental results are used to prove the effectiveness of design.
The performance on blind image restoration is not so good as other methods. The pipeline is simple and not novel.
1. The design of the proposed method seems to be reasonable. In fact, different images should be enhanced with different generation ability, previous works do not address this issue. Meanwhile, this work introduces a solution for this important issue, which is important. 2. The flexibility of the proposed method is verified, which may contribute to the usage of the diffusion model for image restoration. 3. The paper summarizes the advantages of INSTANTIR and points out possible future research
1. The figure seems to be naive, I need to read the method carefully to get the key idea of the proposed method. Can authors provide a more detailed figure for both Sec.3.2 and Sec.3.3 to make it more clear? 2. In Sec.3.3, the authors claim "By employing a text-guided Previewer, we can generate diverse restoration variations with compound semantics from both modalities. However, these variation samples can conflict with the original input, making them ineligible as generative references." Howeve
It is not easy to identify the major strengths of this paper, since most of the components are fairly straightforward compared with other papers. I will give the authors a chance to state what the strengths are from their point of view.
Here are some points that are unclear to me after reading this paper. 1. Why use DINO as the model to provide the compact LQ image representation? From my understanding, DINO is a self-supervised representation that is very good for semantic understanding. In a sense, the DINO feature is highly compact and salient, so many low-level cues are missing from it. Image restoration is a fairly low-level task that needs to pay more attention to details than to saliency. That's wh
Taxonomy
Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Image Processing Techniques and Applications
Methods: Diffusion
