InstantIR: Blind Image Restoration with Instant Generative Reference

Jen-Yuan Huang; Haofan Wang; Qixun Wang; Xu Bai; Hao Ai; Peng Xing; Jen-Tse Huang

arXiv:2410.06551 · cs.CV · October 10, 2024

TL;DR

InstantIR is a diffusion-based blind image restoration method that dynamically uses a reference image and adaptive sampling to handle unknown degradations, achieving state-of-the-art results and enabling creative restoration.

Contribution

The paper introduces InstantIR, a novel diffusion-based approach that adaptively incorporates generative references during inference for improved blind image restoration.

Findings

01 Achieves state-of-the-art performance on benchmark datasets.

02 Effectively handles extreme degradations with creative restoration.

03 Uses variance of references as an indicator for adaptive sampling.
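To make the third finding concrete, here is a minimal sketch of how the variance across a set of references could be turned into a sampling control signal. It is an illustration only and not the authors' algorithm: the function names, the linear mapping, the `low`/`high` bounds, and the assumption that higher variance should mean heavier reliance on the generative prior are all hypothetical.

```python
import numpy as np

def reference_variance(references):
    """Mean per-element variance across a stack of reference arrays."""
    refs = np.asarray(references, dtype=float)
    # Variance is taken across the reference axis, then averaged to a scalar.
    return float(refs.var(axis=0).mean())

def guidance_from_variance(variance, low=0.0, high=1.0):
    """Map a variance value linearly into a weight in [0, 1].

    Assumption for illustration: higher variance is treated as a sign of
    heavier degradation, so the weight on the generative prior increases.
    """
    return float(np.clip((variance - low) / (high - low), 0.0, 1.0))

# Identical references agree completely, so the indicator is zero.
agreeing = [np.ones((2, 2)), np.ones((2, 2))]
# References that disagree everywhere produce a larger indicator.
disagreeing = [np.zeros((2, 2)), 2.0 * np.ones((2, 2))]

weight_low = guidance_from_variance(reference_variance(agreeing))
weight_high = guidance_from_variance(reference_variance(disagreeing))
```

In this toy setup the scalar weight could be used to scale guidance strength at a given sampling step; how the paper actually uses the indicator is not specified in this summary.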

Abstract

Handling test-time unknown degradation is the major challenge in Blind Image Restoration (BIR), necessitating high model generalization. An effective strategy is to incorporate prior knowledge, either from human input or from a generative model. In this paper, we introduce Instant-reference Image Restoration (InstantIR), a novel diffusion-based BIR method which dynamically adjusts the generation condition during inference. We first extract a compact representation of the input via a pre-trained vision encoder. At each generation step, this representation is used to decode the current diffusion latent and instantiate it in the generative prior. The degraded image is then encoded with this reference, providing a robust generation condition. We observe that the variance of generative references fluctuates with degradation intensity, which we further leverage as an indicator for developing a sampling algorithm…
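The control flow described in the abstract can be sketched as a toy loop: encode the input once, then at every step build a reference from the current latent, combine it with the degraded input into a condition, and take a denoising step. This is a structural sketch under strong assumptions, not the paper's implementation: every function below is a hypothetical stand-in written with simple NumPy arithmetic, where the real method would use a pre-trained vision encoder and a diffusion model.

```python
import numpy as np

def encode_input(lq):
    # Stand-in for the pre-trained vision encoder: a per-channel mean
    # serves as the "compact representation" of the degraded input.
    return lq.mean(axis=(0, 1))

def make_reference(latent, rep):
    # Stand-in for decoding the current latent into a generative reference.
    return 0.5 * latent + 0.5 * rep

def build_condition(lq, reference):
    # Stand-in for encoding the degraded image together with the reference.
    return 0.5 * (lq + reference)

def denoise_step(latent, condition, step_size):
    # Stand-in for one diffusion step: move the latent toward the condition.
    return latent + step_size * (condition - latent)

def restore(lq, num_steps=20, step_size=0.3, seed=0):
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal(lq.shape)  # start from noise
    rep = encode_input(lq)                  # computed once, reused every step
    for _ in range(num_steps):
        reference = make_reference(latent, rep)     # refreshed at each step
        condition = build_condition(lq, reference)  # condition changes with it
        latent = denoise_step(latent, condition, step_size)
    return latent

lq_image = np.full((4, 4, 3), 0.5)  # toy "degraded" image, H x W x C
restored = restore(lq_image)
```

The point of the sketch is only that the condition is recomputed inside the loop rather than fixed before sampling, which is what "dynamically adjusts the generation condition during inference" refers to.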

Peer Reviews

Decision · Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 4

Strengths

Good theoretical analysis and a well-written manuscript. Some experimental results are used to demonstrate the effectiveness of the design.

Weaknesses

The performance on blind image restoration is not as good as that of other methods. The pipeline is simple and not novel.

Reviewer 02 · Rating 6 · Confidence 5

Strengths

1. The design of the proposed method seems reasonable. Different images should be enhanced with different degrees of generative capacity, and previous works do not address this issue. This work introduces a solution to this important problem.
2. The flexibility of the proposed method is verified, which may contribute to the use of diffusion models for image restoration.
3. The paper summarizes the advantages of InstantIR and points out possible future research

Weaknesses

1. The figure seems naive; I had to read the method carefully to get the key idea of the proposed method. Can the authors provide a more detailed figure for both Sec. 3.2 and Sec. 3.3 to make it clearer?
2. In Sec. 3.3, the authors claim: "By employing a text-guided Previewer, we can generate diverse restoration variations with compound semantics from both modalities. However, these variation samples can conflict with the original input, making them ineligible as generative references." However…

Reviewer 03 · Rating 5 · Confidence 5

Strengths

It is not easy to conclude what the major strengths of this paper are, since most of the components are fairly straightforward compared with other papers. I will give the authors a chance to state the strengths from their own point of view.

Weaknesses

Here are some points that are unclear to me after reading this paper.
1. Why is DINO used as the model to provide the compact LQ image representation? From my understanding, DINO is a self-supervised representation that is very good for semantic understanding. In a sense, the DINO feature is highly compact and salient, so many low-level cues are missing from it. Image restoration, however, is a fairly low-level task that needs to pay more attention to details than to saliency. That's wh…

Code & Models

Models

🤗 InstantX/InstantIR
model · 1 dl · ♡ 180

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Image Processing Techniques and Applications

Methods: Diffusion