What's in a Prior? Learned Proximal Networks for Inverse Problems
Zhenghan Fang, Sam Buchanan, Jeremias Sulam

TL;DR
This paper introduces learned proximal networks (LPNs), which are exact proximal operators of data-driven regularizers. This enables convergence guarantees for iterative inverse-problem solvers and gives insight into the priors the networks have learned.
Contribution
The authors develop a framework for LPNs that are provably exact proximal operators for nonconvex regularizers and introduce proximal matching for training, advancing deep inverse problem methods.
Findings
LPNs provide exact proximal operators for nonconvex regularizers.
Proximal matching promotes recovery of true data priors.
Models achieve state-of-the-art performance and reveal learned priors.
Abstract
Proximal operators are ubiquitous in inverse problems, commonly appearing as part of algorithmic strategies to regularize problems that are otherwise ill-posed. Modern deep learning models have been brought to bear for these tasks too, as in the framework of plug-and-play or deep unrolling, where they loosely resemble proximal operators. Yet, something essential is lost in employing these purely data-driven approaches: there is no guarantee that a general deep network represents the proximal operator of any function, nor is there any characterization of the function for which the network might provide some approximate proximal. This not only makes guaranteeing convergence of iterative schemes challenging but, more fundamentally, complicates the analysis of what has been learned by these networks about their training data. Herein we provide a framework to develop learned proximal…
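For readers less familiar with the terminology, the following is a minimal sketch of a classical proximal operator and of the iterative scheme it plugs into. It is not the paper's method: the regularizer here is the hand-crafted L1 norm, whose proximal operator is soft-thresholding, and the names `prox_l1` and `proximal_gradient`, the problem sizes, and the value of `lam` are all illustrative assumptions. In plug-and-play approaches, the `prox` argument is the slot where a learned network is substituted.

```python
import numpy as np

def prox_l1(v, lam):
    """Proximal operator of lam * ||x||_1 (soft-thresholding).

    Solves argmin_x 0.5 * ||x - v||^2 + lam * ||x||_1 in closed form.
    """
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def proximal_gradient(A, y, prox, step, n_iter=200):
    """Proximal gradient descent for 0.5 * ||A x - y||^2 + R(x).

    `prox` must implement the proximal operator of step * R.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)      # gradient of the data-fidelity term
        x = prox(x - step * grad)     # regularization enters only via prox
    return x

def objective(A, y, x, lam):
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

# Hypothetical underdetermined problem: 20 measurements, 50 unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
y = A @ x_true

lam = 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
x_hat = proximal_gradient(A, y, lambda v: prox_l1(v, step * lam), step)

print("objective at zero:", objective(A, y, np.zeros(50), lam))
print("objective at x_hat:", objective(A, y, x_hat, lam))
```

Because `prox_l1` is the exact proximal operator of a known function, the iteration above minimizes a well-defined objective. The abstract's concern is that this property is lost when `prox` is replaced by an arbitrary denoising network.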
Peer Reviews
Decision·ICLR 2024 poster
This work addresses a prevalent problem at the core of PnP methods, which is the use of MMSE denoisers instead of MAP denoisers, despite the fact that the convergence results for PnP hold for MAP denoisers. They offer a novel way to learn a MAP denoiser and provide convergence results.
- The main contribution of the work is to propose a way to learn a MAP denoiser through a proximal loss under equation 3.4. Optimizing for this loss entails that the prior distribution is assumed to be a mixture of Gaussians (or Diracs when $\gamma$ tends to zero) around the training samples. Why is this a good prior? It seems to me that this is too simplistic a prior and, in the limit of $\gamma \to 0$, a discontinuous prior.
- Although most of the paper was quite clear I found it unclear whether the
The use of proximal and descent methods in conjunction with deep neural networks to address inverse problems is a dynamic and exciting field of study. Numerous papers have explored these approaches, attempting to approximate Maximum A Posteriori (MAP) estimation in various ways or training regularizers in a supervised manner (which is not always feasible and not even correct). To the best of my knowledge, this paper stands out as the first to present a principled approach for training proximal m
- A straightforward solution to obtain the proximal map of log p(x) is to initially train an energy-based model, E, and then compute the proximal map of log E. The authors should either compare this approach with their proposed method or provide clarification on why this is not considered viable or advisable. For instance, one might anticipate encountering similar challenges as those faced when training energy models when training Learned Proximal Networks (LPNs).
- In a similar vein, a natural
1. The paper forms a class of neural networks guaranteed to parameterize proximal operators. The idea is novel.
2. Some theoretical results are provided.
3. Experiments show the effectiveness.
1. The paper is not well organized, making it hard to follow.
2. The experiments are a little weak. See the questions below.
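The reviewers' distinction between MMSE and MAP denoisers, and the role of $\gamma$ in the proximal matching loss, can be illustrated on a toy example. The sketch below is an assumption-laden illustration, not the paper's equation 3.4: it uses a Gaussian-shaped loss of the form 1 - exp(-d^2 / gamma^2) with the normalization constant omitted, and a hypothetical two-point distribution over a scalar signal. Minimizing the expected squared error yields the mean of the distribution (the MMSE estimate), while minimizing the Gaussian-shaped loss with small gamma yields its mode (the MAP estimate).

```python
import numpy as np

# Hypothetical discrete distribution over a scalar signal x:
# x = 0 with probability 0.7, x = 1 with probability 0.3.
values = np.array([0.0, 1.0])
probs = np.array([0.7, 0.3])

def expected_l2_loss(c):
    """Expected squared error of the constant estimate c."""
    return np.sum(probs * (c - values) ** 2)

def expected_matching_loss(c, gamma):
    """Expected Gaussian-shaped loss (assumed form, normalization omitted).

    As gamma shrinks, this approaches a 0-1 loss that only rewards
    landing on a high-probability value.
    """
    return np.sum(probs * (1.0 - np.exp(-((c - values) ** 2) / gamma ** 2)))

# Brute-force search over candidate estimates on a grid.
candidates = np.linspace(-0.5, 1.5, 201)
c_l2 = candidates[np.argmin([expected_l2_loss(c) for c in candidates])]
c_pm = candidates[np.argmin([expected_matching_loss(c, 0.1) for c in candidates])]

print("minimizer of squared error (mean):", c_l2)
print("minimizer of Gaussian-shaped loss (mode):", c_pm)
```

In this toy setting the squared-error minimizer lands at the mean, 0.3, which is not a value the signal ever takes, whereas the small-gamma minimizer lands at the most probable value, 0.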
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Face and Expression Recognition
