Prompt-tuning latent diffusion models for inverse problems
Hyungjin Chung, Jong Chul Ye, Peyman Milanfar, Mauricio Delbracio

TL;DR
This paper introduces P2L, a method for solving imaging inverse problems with latent diffusion models. It tunes the text embedding during reverse diffusion and projects the latent variables onto the range space of the encoder, which improves image reconstruction.
Contribution
The paper presents a novel prompt tuning approach combined with latent space projection to improve inverse problem solutions using latent diffusion models.
Findings
P2L outperforms both image- and latent-diffusion-based inverse problem solvers on multiple tasks.
Prompt tuning yields images that are more faithful to the diffusion prior.
Latent space projection reduces artifacts in reconstructed images.
Abstract
We propose a new method for solving imaging inverse problems using text-to-image latent diffusion models as general priors. Existing methods using latent diffusion models for inverse problems typically rely on simple null text prompts, which can lead to suboptimal performance. To address this limitation, we introduce a method for prompt tuning, which jointly optimizes the text embedding on-the-fly while running the reverse diffusion process. This allows us to generate images that are more faithful to the diffusion prior. In addition, we propose a method to keep the evolution of latent variables within the range space of the encoder, by projection. This helps to reduce image artifacts, a major problem when using latent diffusion models instead of pixel-based diffusion models. Our combined method, called P2L, outperforms both image- and latent-diffusion model-based inverse problem solvers…
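The alternation described in the abstract (tune the text embedding, update the latent, then pull the latent back into the range of the encoder) can be sketched in a toy linear setting. This is a minimal sketch under stated assumptions, not the authors' P2L implementation: the matrices `E_mat`, `D_mat`, `A_mat`, and `M`, the `denoise` blend, the step-size rule, and the absence of any noise schedule are all hypothetical stand-ins chosen so that the example runs with NumPy alone.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy sizes (assumptions): pixels, latent dim, encoder rank, measurements, embedding dim.
n, k, r, m, d = 16, 8, 4, 6, 5

# Hypothetical linear stand-ins for the real networks.
Q, _ = np.linalg.qr(rng.standard_normal((k, r)))  # orthonormal basis of the encoder's range
V = rng.standard_normal((r, n))
E_mat = Q @ V                      # "encoder": image x -> latent E_mat @ x
D_mat = np.linalg.pinv(V) @ Q.T    # "decoder": latent z -> image D_mat @ z
A_mat = rng.standard_normal((m, n))  # forward (measurement) operator
M = rng.standard_normal((k, d))      # maps an embedding c to a latent "mode"


def denoise(z, c):
    """Toy conditional estimate of the clean latent (stand-in for a denoiser)."""
    return 0.5 * z + 0.5 * (M @ c)


def data_loss(z, c, y):
    """Squared measurement error of the decoded clean-latent estimate."""
    return float(np.sum((y - A_mat @ D_mat @ denoise(z, c)) ** 2))


def tune_embedding(z, c, y, steps=5):
    """Gradient descent on the embedding c with the latent z held fixed."""
    J = 0.5 * (A_mat @ D_mat @ M)  # Jacobian of the measurement w.r.t. c
    # Step size 1/L, where L = 2 * ||J||_2^2 bounds the gradient's Lipschitz constant.
    lr = 1.0 / (2.0 * np.linalg.norm(J, 2) ** 2 + 1e-12)
    for _ in range(steps):
        residual = y - A_mat @ D_mat @ denoise(z, c)
        c = c + lr * 2.0 * (J.T @ residual)  # step along the negative gradient
    return c


def project_to_encoder_range(z):
    """Decode, then re-encode. In this toy model E_mat @ D_mat equals Q @ Q.T,
    the orthogonal projector onto the range of the encoder."""
    return E_mat @ (D_mat @ z)


# Synthetic measurement from an image whose latent lies in the encoder's range.
x_true = D_mat @ (Q @ rng.standard_normal(r))
y = A_mat @ x_true

z = rng.standard_normal(k)
c = np.zeros(d)  # start from a "null" embedding
for _ in range(20):
    c = tune_embedding(z, c, y)      # embedding step, latent fixed
    z = denoise(z, c)                # latent step, embedding fixed
    z = project_to_encoder_range(z)  # keep the latent in the encoder's range

print("final measurement error:", data_loss(z, c, y))
```

In a real latent diffusion model the encoder and decoder are nonlinear networks, so decoding and re-encoding is only an approximate projection, and the embedding gradient would come from automatic differentiation rather than a closed-form Jacobian.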
Peer Reviews
Decision: ICML 2024 Poster
(1) The idea of learning prompts to guide the diffusion models for inverse problems is very interesting. (2) The method is technically sound.
The paper lacks a theoretical analysis of, for example, convergence. The results are not very promising. (1) From Table 1, the performance (PSNR) gains of P2L are subtle. (2) From the ablation experiments in Table 4, the difference between the results obtained by not using any of the three proposed modules and the results obtained by using all of them is not significant. (3) From Table 5, the proposed proximal calibration is not that superior to the projection-based calibration, which is even
1. Optimizing the null-embeddings in addition to the latents is a strong contribution and very useful in several downstream applications. 2. The authors achieve state-of-the-art performance in several tasks. 3. The paper is well-written and the main points are clearly discussed with sufficient details to reproduce the results.
1. In Section 3.2, the authors conduct an experiment using PSLD to show that it always diverges even if it started from a clean image. I believe that this experiment does not offer any insights because of two reasons: (i) The approximation used is not what was proposed in PSLD. In fact, the authors of PSLD show that aiming for any fixed point is not a good idea. Instead, they prove that the gluing objective helps recover the unique fixed point that exhibits contraction towards the optimal soluti
1. The motivation of using learnable prompt to improve the performance is meaningful. 2. Experiments on different kinds of image inverse tasks including super-resolution, deblurring, and inpainting are performed.
1. The writing is poor and the submission is hard to follow. 2. Why is an iterative optimization, similar to the EM algorithm, needed to optimize the text prompt and the latent variable? The submission lacks the necessary theoretical analysis and experimental evaluation. 3. The proposed projection is similar to that proposed in Chung et al. (2023b). Why the proposed approach provides good regularization is not clearly explained, and it is unclear why it is called a projection. 4. No further an
Taxonomy
Topics: Mycobacterium research and diagnosis · Numerical methods in inverse problems · Advanced Neuroimaging Techniques and Applications
Methods: Diffusion
