Repulsive Latent Score Distillation for Solving Inverse Problems
Nicolas Zilberstein, Morteza Mardani, Santiago Segarra

TL;DR
This paper introduces a novel variational framework with a repulsion mechanism to improve diversity and address latent space ambiguity in score distillation sampling for inverse problems, especially in high-dimensional data.
Contribution
It proposes a multimodal variational approximation with a repulsion mechanism and an augmented distribution to enhance diversity and disentangle latent space in SDS.
Findings
Improved solution diversity in inverse problems.
Enhanced performance on high-resolution image tasks.
Effective mitigation of mode collapse and latent ambiguity.
Abstract
Score Distillation Sampling (SDS) has been pivotal for leveraging pre-trained diffusion models in downstream tasks such as inverse problems, but it faces two major challenges: mode collapse and latent space inversion, which become more pronounced in high-dimensional data. To address mode collapse, we introduce a novel variational framework for posterior sampling. Utilizing the Wasserstein gradient flow interpretation of SDS, we propose a multimodal variational approximation with a repulsion mechanism that promotes diversity among particles by penalizing pairwise kernel-based similarity. This repulsion acts as a simple regularizer, encouraging a more diverse set of solutions. To mitigate latent space ambiguity, we extend this framework with an augmented variational distribution that disentangles the latent and data. This repulsive augmented formulation balances computational…
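As a rough illustration of the repulsion idea described in the abstract, the sketch below penalizes the average pairwise kernel similarity of a set of particles and takes a gradient step that pushes them apart. It is not the authors' implementation: the RBF kernel, the fixed `bandwidth`, the normalization, the function names, and the choice to act directly on raw particle coordinates are all assumptions made for this example. The score-distillation and measurement-consistency terms are omitted entirely.

```python
import math


def rbf_kernel(x, y, bandwidth):
    """RBF similarity between two particles (an assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def repulsion_penalty(particles, bandwidth=1.0):
    """Average kernel similarity over all ordered pairs i != j."""
    n = len(particles)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += rbf_kernel(particles[i], particles[j], bandwidth)
    return total / (n * (n - 1))


def repulsion_gradient(particles, bandwidth=1.0):
    """Gradient of repulsion_penalty with respect to each particle."""
    n = len(particles)
    dim = len(particles[0])
    # Factor 2: each unordered pair appears twice in the ordered-pair sum.
    scale = 2.0 / (n * (n - 1) * bandwidth ** 2)
    grads = []
    for i in range(n):
        g = [0.0] * dim
        for j in range(n):
            if i == j:
                continue
            k = rbf_kernel(particles[i], particles[j], bandwidth)
            for d in range(dim):
                # d/dx_i k(x_i, x_j) = -k * (x_i - x_j) / bandwidth^2
                g[d] -= scale * k * (particles[i][d] - particles[j][d])
        grads.append(g)
    return grads


def repulsion_step(particles, step_size=0.2, bandwidth=1.0):
    """One gradient-descent step on the penalty, moving particles apart."""
    grads = repulsion_gradient(particles, bandwidth)
    return [
        [x - step_size * g for x, g in zip(p, gp)]
        for p, gp in zip(particles, grads)
    ]


# Three nearly collapsed particles in 2D (hypothetical toy data).
particles = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.2]]
before = repulsion_penalty(particles)
after = repulsion_penalty(repulsion_step(particles))
print(f"penalty before: {before:.4f}, after: {after:.4f}")
```

In a full method, a term like this would be added with a weight to the per-particle update, so that the weight trades off sample diversity against fidelity to the measurements and the diffusion prior.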
Peer Reviews
Decision: ICLR 2025 Poster
1. In Table 1, the authors compared RLSD with all the relevant prior works in a very clear way, emphasizing the scalability over dimensions and linear / nonlinear inverse problems as well as the avoidance of score Jacobian computations. 2. The introduction of a repulsion regularization mechanism that leverages kernel-based similarity to prevent particles from collapsing into a single mode is a step forward. This approach promotes exploration of diverse solutions, a common challenge in high-dime
1. While the proposed approach is well-executed, its novelty compared to existing methods could be more explicitly justified. I think using repulsive forces for diversity and variational formulations for latent space decoupling are not truly unique and novel, compared to the existing framework of variational formulations. 2. For the experiment section, is it possible for you to add more discussion or results on the performance of RLSD with larger-scale datasets and higher-dimensional spaces to cla
I appreciate the overall strong contribution of this paper. It is well-written and easy to follow. The proposed method introduces a repulsion mechanism to enhance diversity among solutions and an augmented variational distribution to separate latent and data spaces. Experiments on both linear and nonlinear inverse problems demonstrate the method’s effectiveness.
1. A recent baseline published in ICLR 2024 [1], which also uses a latent-diffusion-based approach for inverse problem-solving, is missing. 2. I found the coverage of linear inverse problems to be limited. Previous works like DPS, PLSD and RED-DIFF typically include four types of linear inverse problems—such as inpainting, super-resolution, Gaussian deblurring, and motion deblurring—which would strengthen the analysis. [1] Solving Inverse Problems with Latent Diffusion Models via Hard Data C
The paper presents a clear and well-structured introduction to the concepts of repulsive regularization and augmented variational distribution. The motivation behind these methods is well-founded, and the theoretical derivation is both detailed and easy to understand.
My major concern is that the effectiveness of the two proposed techniques in this paper does not seem to be fully supported by the experiments. The proposed techniques do not consistently yield good results under all conditions. In more detail: 1. Tables 2, 3, and 4 show that there is no single setting that can consistently perform the best (or almost the best) across all three tasks in this paper. For example, NonRepuls-RLSD works the best in HDR, but it performs badly (second-to-worst) in Phase
Taxonomy
Topics: Statistical Methods and Inference
Methods: Sparse Evolutionary Training · Focus · Diffusion
