Inertial nonconvex alternating minimizations for the image deblurring
Tao Sun, Roberto Barrio, Marcos Rodriguez, Hao Jiang

TL;DR
This paper introduces a novel inertial nonconvex ADMM algorithm for image deblurring that demonstrates improved efficiency and convergence properties over traditional convex methods, validated through numerical experiments.
Contribution
It extends the classical ADMM with an inertial technique to handle nonconvex TV regularization, providing convergence guarantees under certain conditions.
Findings
The proposed IADMM converges under specific parameter assumptions.
Numerical simulations show improved performance over traditional ADMM.
The method effectively reconstructs blurred images with better numerical stability.
Abstract
In image processing, Total Variation (TV) regularization models are commonly used to recover blurred images. One of the most efficient and popular methods to solve the convex TV problem is the Alternating Direction Method of Multipliers (ADMM) algorithm, recently extended using the inertial proximal point method. Although all the classical studies focus on only a convex formulation, recent articles are paying increasing attention to the nonconvex methodology due to its good numerical performance and properties. In this paper, we propose to extend the classical formulation with a novel nonconvex Alternating Direction Method of Multipliers with the Inertial technique (IADMM). Under certain assumptions on the parameters, we prove the convergence of the algorithm with the help of the Kurdyka-{\L}ojasiewicz property. We also present numerical simulations on classical TV image reconstruction…
| stands for or ( norm) | |
|---|---|
| dist | |
| stands for the Kronecker product | |
| stands for the function class whose derivatives are continuous | |
| for a matrix A, rank(A) stands for its rank, | |
| stands for the minimum eigenvalue of | |
| IADMM () | IADMM () | ADMM(TV1) | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IMG | I1 | ERROR | SNR | RES | I2 | ERROR | SNR | RES | I2/I1 | I3 | ERROR | SNR | RES | I3/I1 |
| L0 | 4 | 17.4 | 11.1 | 4.59e-03 | 6 | 17.4 | 11.1 | 4.62e-03 | 1.50 | 7 | 17.6 | 11.0 | 4.80e-03 | 1.75 |
| L1 | 5 | 16.7 | 10.0 | 4.67e-03 | 8 | 16.5 | 10.1 | 4.60e-03 | 1.60 | 9 | 16.8 | 9.9 | 4.95e-03 | 1.80 |
| L2 | 2 | 19.4 | 8.7 | 4.36e-03 | 2 | 19.9 | 8.5 | 4.69e-03 | 1.00 | 2 | 20.2 | 8.3 | 4.88e-03 | 1.00 |
| L3 | 2 | 16.6 | 13.2 | 3.01e-03 | 2 | 17.1 | 13.0 | 3.25e-03 | 1.00 | 2 | 17.9 | 12.6 | 3.67e-03 | 1.00 |
| M0 | 8 | 30.6 | 13.8 | 4.50e-03 | 13 | 30.4 | 13.8 | 4.67e-03 | 1.62 | 16 | 30.6 | 13.8 | 4.88e-03 | 2.00 |
| M1 | 9 | 28.5 | 13.7 | 4.33e-03 | 14 | 28.7 | 13.6 | 4.86e-03 | 1.56 | 18 | 28.5 | 13.7 | 4.78e-03 | 2.00 |
| M2 | 6 | 36.6 | 12.8 | 3.48e-03 | 7 | 40.1 | 12.0 | 4.82e-03 | 1.17 | 9 | 39.9 | 12.1 | 4.86e-03 | 1.50 |
| M3 | 7 | 35.5 | 12.5 | 4.34e-03 | 11 | 35.6 | 12.4 | 4.50e-03 | 1.57 | 13 | 36.3 | 12.3 | 4.97e-03 | 1.86 |
| H0 | 8 | 71.4 | 10.3 | 4.50e-03 | 13 | 71.0 | 10.3 | 4.65e-03 | 1.62 | 16 | 71.4 | 10.3 | 4.84e-03 | 2.00 |
| H1 | 9 | 69.2 | 6.3 | 4.47e-03 | 14 | 69.7 | 6.2 | 4.96e-03 | 1.56 | 18 | 69.2 | 6.3 | 4.89e-03 | 2.00 |
| H2 | 7 | 74.2 | 3.3 | 4.16e-03 | 10 | 76.2 | 3.0 | 4.87e-03 | 1.43 | 13 | 75.5 | 3.1 | 4.77e-03 | 1.86 |
| H3 | 6 | 77.4 | 1.5 | 4.08e-03 | 8 | 80.8 | 1.1 | 4.87e-03 | 1.33 | 11 | 78.9 | 1.3 | 4.48e-03 | 1.83 |
| IADMM () | IADMM () | ADMM(TV1) | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IMG | I1 | ERROR | SNR | RES | I2 | ERROR | SNR | RES | I2/I1 | I3 | ERROR | SNR | RES | I3/I1 |
| L0 | 5 | 3.0 | 26.5 | 4.72e-01 | 5 | 2.8 | 27.1 | 4.31e-01 | 1.00 | 10 | 6.2 | 20.0 | 9.56e-02 | 2.00 |
| L1 | 3 | 3.3 | 24.1 | 5.38e-01 | 5 | 3.2 | 24.3 | 4.62e-01 | 1.67 | 10 | 5.8 | 19.2 | 9.48e-02 | 3.33 |
| L2 | 1 | 3.8 | 22.9 | 8.22e-01 | 3 | 3.7 | 23.0 | 5.13e-01 | 3.00 | 10 | 9.3 | 15.0 | 9.68e-02 | 10.00 |
| L3 | 6 | 1.4 | 34.6 | 2.41e-01 | 7 | 1.3 | 35.5 | 2.00e-01 | 1.17 | 10 | 7.0 | 20.7 | 9.86e-02 | 1.67 |
| M0 | 3 | 6.2 | 27.6 | 5.79e-01 | 5 | 6.0 | 28.0 | 4.98e-01 | 1.67 | 10 | 7.0 | 26.5 | 9.74e-02 | 3.33 |
| M1 | 7 | 4.3 | 30.2 | 4.40e-01 | 7 | 3.9 | 31.0 | 4.12e-01 | 1.00 | 10 | 5.4 | 28.1 | 9.78e-02 | 1.43 |
| M2 | 4 | 3.0 | 34.5 | 1.50e-01 | 4 | 3.0 | 34.7 | 1.00e-01 | 1.00 | 10 | 20.6 | 17.8 | 9.90e-02 | 2.50 |
| M3 | 1 | 14.6 | 20.2 | 8.94e-01 | 1 | 14.5 | 20.2 | 8.68e-01 | 1.00 | 10 | 31.7 | 13.4 | 9.64e-02 | 10.00 |
| H0 | 5 | 13.8 | 24.5 | 4.96e-01 | 5 | 12.6 | 25.3 | 4.44e-01 | 1.00 | 10 | 15.2 | 23.7 | 9.52e-02 | 2.00 |
| H1 | 3 | 16.6 | 18.7 | 5.73e-01 | 5 | 16.2 | 18.9 | 4.98e-01 | 1.67 | 10 | 20.4 | 16.9 | 9.47e-02 | 3.33 |
| H2 | 5 | 15.7 | 16.8 | 5.11e-01 | 5 | 14.6 | 17.4 | 4.57e-01 | 1.00 | 10 | 20.7 | 14.4 | 9.29e-02 | 2.00 |
| IADMM () | IADMM () | ADMM(TV1) | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I1 | ERROR | SNR | RES | I2 | ERROR | SNR | RES | I2/I1 | I3 | ERROR | SNR | RES | I3/I1 | |
| 0.001 | 6 | 33.5 | 13.0 | 7.40e-03 | 8 | 35.3 | 12.5 | 8.70e-03 | 1.33 | 9 | 34.0 | 12.8 | 7.50e-03 | 1.50 |
| 0.01 | 4 | 19.1 | 17.9 | 8.88e-03 | 4 | 21.2 | 16.9 | 9.82e-03 | 1.00 | 5 | 20.0 | 17.4 | 7.68e-03 | 1.25 |
| 0.10 | 3 | 11.2 | 22.5 | 2.39e-02 | 4 | 11.4 | 22.3 | 2.31e-02 | 1.33 | 5 | 10.8 | 22.8 | 2.32e-02 | 1.67 |
| 1.00 | 1 | 8.0 | 25.4 | 2.25e-01 | 2 | 7.5 | 26.0 | 2.03e-01 | 2.00 | 3 | 7.3 | 26.3 | 2.20e-01 | 3.00 |
| 10.00 | 3 | 6.2 | 27.6 | 5.79e-01 | 5 | 6.0 | 28.0 | 4.98e-01 | 1.67 | 10 | 7.0 | 26.5 | 9.74e-02 | 3.33 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
MethodsAlternating Direction Method of Multipliers
Inertial nonconvex alternating minimizations for the image deblurring
Tao Sun, Roberto Barrio, Marcos Rodríguez, and Hao Jiang This work was supported by the National Key Research and Development Program of China (2017YFB0202003), and National Natural Science Foundation of Hunan Province in China (2018JJ3616), and by the Spanish Research projects MTM2015-64095-P, PGC2018-096026-B-I00, and European Regional Development Fund and Diputación General de Aragón (E24-17R). Tao Sun and Hao Jiang are with the College of Computer, National University of Defense Technology, Changsha, 410073, Hunan, China (e-mail: [email protected], [email protected]). Roberto Barrio is with Departamento de Matemática Aplicada, IUMA and CoDy group, Universidad de Zaragoza, Pedro Cerbuna, 12, Zaragoza, 50009, Spain. (e-mail: [email protected]). Marcos Rodríguez is with CoDy group, Zaragoza, Universidad de Zaragoza, Pedro Cerbuna, 12, Zaragoza, 50009, Spain. (e-mail: [email protected]).
Abstract
In image processing, Total Variation (TV) regularization models are commonly used to recover blurred images. One of the most efficient and popular methods to solve the convex TV problem is the Alternating Direction Method of Multipliers (ADMM) algorithm, recently extended using the inertial proximal point method. Although all the classical studies focus on only a convex formulation, recent articles are paying increasing attention to the nonconvex methodology due to its good numerical performance and properties. In this paper, we propose to extend the classical formulation with a novel nonconvex Alternating Direction Method of Multipliers with the Inertial technique (IADMM). Under certain assumptions on the parameters, we prove the convergence of the algorithm with the help of the Kurdyka-Łojasiewicz property. We also present numerical simulations on classical TV image reconstruction problems to illustrate the efficiency of the new algorithm and its behavior compared with the well established ADMM method.
Index Terms:
Inertial algorithms, nonconvex method, Kurdyka-Łojasiewicz property, image deblurring, ADMM, inertial proximal ADMM.
I Introduction
Denoising and deblurring have numerous applications in communications, control, machine learning, and many other fields of engineering and science. Restoration of distorted images is, from the theoretical, as well as from the practical point of view, one of the most interesting and important problems of image processing. One special case is the blurring, due, for instance, to incorrect focus and/or blurring due to movement, or added Gaussian noise (a Gaussian blur).
A mathematical model for the process of blurred images can be expressed as follows. Let be a two-dimensional index set representing the image domain, be the original image, be the observed image, and be a linear blurring operator. Then, the blurred image can be written [1] as
[TABLE]
where is an unknown additive noise vector. In this paper, the blurring operator is assumed to be known, otherwise, one will deal with the blind image deblurring problem [2], in which also needs to be solved.
In image processing one typically aims at recovering an image from noisy data while still keeping edges in the image, and this goal is the main reason of the tremendous success of the Total Variation (TV) regularization [3] for solving the deblurring problem (although other methods are also used). The TV method can be presented as
[TABLE]
being a parameter and where is the gradient operator, , and the norms , .
In most situations, rather than directly minimizing the support of the image, one is interested in minimizing the support of the gradient of the recovered image. In most references, the convex methodology is considered [4, 5, 6], but in recent years, some nonconvex methods have been developed [7, 8, 9]. The use of a suitable nonconvex and nondifferentiable function allows possibly a smaller number of measurements than the convex one in compressed sensing [7]. In [10] the authors showed that nonconvex regularization terms in total variation-based image restoration yields even better edge preservation when compared to the convex-type regularization. Moreover, they showed that it seems to be also more robust with respect to noise. Nonconvex regularization in image restoration poses significant challenges on the existence of solutions of associated minimization problems and on the development of efficient solution algorithms.
The main difference between the convex and nonconvex methods is replacing the -norm of the variational term by the nonconvex and nondifferentiable function that uses the nonconvex regulation function , and that we refer to as the semi-norm (). Therefore, the general nonconvex deblurring model is presented as
[TABLE]
Many efficient numerical algorithms have been developed for solving the TV regularization problem. One of the most efficient methods for the convex problem (2) is the Alternating Direction Method of Multipliers (ADMM) algorithm [11, 12, 13]. In the general case (convex and nonconvex cases depend of the function ), the method is constructed by introducing an auxiliary variable , which actually represents , to reformulate (3) into a composite optimization problem with linear constraints. The augmented Lagrange dual function is then
[TABLE]
where is a parameter, and the norm . If , we use the notation for representing . Now, the standard convex ADMM method () for the deblurring problem can be presented as
[TABLE]
The earlier analyses of convergence and performance of the ADMM algorithms directly depended on the existing results of ADMM framework [14, 15, 16, 17, 18]. More recently, motivated by acceleration techniques proposed in [19], inertial algorithms have been proposed for many areas such as (distributed) optimization and imaging sciences in references [20, 21, 22, 23, 24]. The ideas of the inertial strategy have been also applied to ADMM in [4, 25]; and under several assumptions in convex case, some convergence results are proved in those articles. As the nonconvex penalty functions perform more efficiently in some applications, as above commented, nonconvex ADMM has been also developed and studied [26, 27, 28, 29, 30, 31, 32, 33, 34]. The main goal of this paper is to propose a new algorithm that combines the nonconvex methods and the inertial strategy organically.
In this paper, when is nonconvex, we consider a new inertial scheme for the image deblurring model (3). One of the main differences (and new difficulties) with the convex ADMM, is that in order to properly define the nonconvex ADMM some extra assumptions are needed to prove the convergence. First, at least one of the objective functions has to be smooth. And more, matrix corresponding to the smooth function is required to be injective, i.e., reversible. Thus, a direct application of the ADMM scheme to the image deblurring model cannot guarantee the convergence because the operator fails to be injective (although the numerical performance may be good in some cases). Considering this, we first modify the model (3), and then we develop the new nonconvex inertial ADMM. By using the Kurdyka-Łojasiewicz property, we prove the convergence of the new algorithm under several requirements on the parameters. In opposite to the convex case, selecting a suitable parameter is crucial to obtain the convergence of the new algorithm. In order to make the method more useful, we provide a probabilistic strategy for selecting a suitable .
The rest of the paper is organized as follows. In Section II we collect some mathematical preliminaries needed for the convergence analysis. Section III presents the details for the new algorithm (inertial alternating minimization algorithm, IADMM) including the schemes and parameters. In Section IV, we prove the convergence of the new algorithm. Section V reports the numerical results and compares the algorithm with convex and nonconvex ADMM. Section VI gives some conclusions. Finally, we provide in the Appendixes all the detailed proofs of the proposed results.
II Mathematical tools
In this section we present the definitions and basic properties of the subdifferentials and the Kurdyka-Łojasiewicz functions used later in the convergence analysis. The basic notations used in this paper are detailed in Table I.
II-A Subdifferentials
We collect several definitions as well as some useful properties in variational and convex analysis (see the monographes [35, 36, 37, 38]). For any matrix , we define to be the adjoint of .
Definition 1
Let be a proper and lower semicontinuous function. The (limiting) subdifferential, or simply the subdifferential, of at , written as , is defined as
[TABLE]
It is easy to verify that the Fréchet subdifferential is convex and closed while the subdifferential is closed. When is convex, the definition agrees with the subgradient in convex analysis [38]. We call is strongly convex with constant if for any and any , it holds that And is called as -gradient continuous (Lipschitz) with constant if Noting the closedness of the subdifferential, we have the following simple proposition.
Proposition 1
If , and , then we have
Definition 2
A necessary condition for to be a minimizer of is
[TABLE]
which is also sufficient when is convex. A point that satisfies (9) is called (limiting) critical point. The set of critical points of is denoted by .
With these basics, we can easily obtain the following proposition.
Proposition 2
If is a critical point of whose definition given in (III), it must hold that
[TABLE]
Finally, the proximal map of is defined as
[TABLE]
Note that can be nonconvex. If is convex, is a point-to-point operator; otherwise, it may be point-to-set.
II-B Kurdyka-Łojasiewicz property
In this paper the convergence analysis is based on the Kurdyka-Łojasiewicz functions, originated in the seminal works of Łojasiewicz [39] and Kurdyka [40]. This kind of functions has played a key role in several recent convergence results on nonconvex minimization problems and they are ubiquitous in applications.
Definition 3** ([41, 42])**
(a) The function is said to have the Kurdyka-Łojasiewicz property at if there exist , a neighborhood of and a continuous concave function such that
. 2. 2.
* is on .* 3. 3.
For all , . 4. 4.
For all in , the Kurdyka-Łojasiewicz inequality holds
[TABLE]
(b) Proper lower semicontinuous functions which satisfy the Kurdyka-Łojasiewicz inequality at each point of are called KŁ functions.
Remark 1
There are large sets of functions that are KŁ functions [41].
Lemma 1** ([42])**
Let be a proper lower semi-continuous function and be a compact set. If is a constant on and satisfies the KŁ property at each point on , then there exists a concave function satisfying the four properties given in Definition 3, and constants such that for any and any satisfying that and , it holds that
[TABLE]
III Nonconvex IADMM algorithm
In this section we introduce the new extended Inertial Alternating Direction Method of Multipliers (ADMM) algorithm for nonconvex functions.
In this paper, we consider (equivalent to space ) as the two-dimensional index set representing the image domain. In this case, the image variable constrained on is actually a matrix. We use the symbol to present its vectorization (a vector of all the columns of the image variable). And then the original total variation operator then enjoys the following form
[TABLE]
where the identity matrix and the banded matrix
[TABLE]
If we directly apply the inertial ADMM, the convergence is hard to be proved as fails to be injective. Therefore, we need to modify the image deblurring model (3). To that goal, we define
[TABLE]
Obviously, we have . Noting that
[TABLE]
and thus, is injective. The following technical lemma focuses on giving a lower bound for the operator .
Lemma 2
For any , it holds that
[TABLE]
where \theta=1/(2\sin(\frac{\pi}{2N})\big{)}.
Then, the image deblurring model (3) is equivalent to
[TABLE]
Instead, we consider its extended penalty form
[TABLE]
where is a large weight parameter. Therefore, we apply the nonconvex inertial ADMM to
[TABLE]
where K=\left(\begin{array}[]{cc}\tilde{K}&\textbf{0}\\ \beta\mathbb{I}&-\beta\mathbb{I}\\ \end{array}\right) and f=\left(\begin{array}[]{c}\tilde{f}\\ \textbf{0}\\ \end{array}\right). This leads us to define the function
[TABLE]
Inertial methods have witnessed great success in convex ADMM and nonconvex first-order algorithms. In the nonconvex optimization community, the inertial style ADMM has never been proposed and analyzed. The convex inertial ADMM has been proposed in [4], in which one first uses the “inertial method” to refresh the current sequence with last iteration, and then performs the ADMM scheme with the updated variables. However, the direct extension of convex ADMM is not allowed in the nonconvex settings. This is because without convexity, several descents are heavily dependent on the continuities of the functions, which may fail to obey. And the difference of function values at two different points is hard to estimate, which leads to troubles in the convergence proof. Thus, in the updating of , we used rather than the updated one. And the nonconvex IADMM scheme proposed in this paper is defined as follows
[TABLE]
where is a free parameter chosen by the user. Actually, if , the algorithm then will reduce to basic ADMM.
Now we can focus on rewriting the inertial scheme for the image deblurring model (30). First, we rearrange the minimization for ,
[TABLE]
being the proximal map of (14). For a matrix , and indices ,
[TABLE]
The scheme for updating can be rewritten as
[TABLE]
That is also
[TABLE]
Taking into account (30), (31) and (34) we propose a nonconvex inertial version of the IADMM algorithm (Algorithm 1).
Assumption 1: We assume that , where K_{0}:=\left(\begin{array}[]{cc}\tilde{K}&\textbf{0}\\ \mathbb{I}&-\mathbb{I}\\ \end{array}\right). And the minimum single value is given as .
This hypothesis also indicates that the matrix \left(\begin{array}[]{c}\mathcal{T}\\ K_{0}\\ \end{array}\right) is reversible. Note that the rank of is . Then, the assumed hypothesis is easy to be satisfied.
We remark that is the minimizer of , which is strongly convex with . If we set , then we have
[TABLE]
where we used the fact when .
The following problem is what exactly is. In a real situation, the dimensions of are large, and so, a direct calculation leads to a large computational cost. Therefore, we provide a probabilistic method to estimate a suitable value of . If is reversible, it is easy to see that . Then, if we obtain a bound , we then have . To that goal we employ a Lemma proposed in [43]:
Lemma 3** (Lemma 4.1, [43])**
For a fixed positive integer and a real number , and given an independent family of standard Gaussian vectors, we have that
[TABLE]
with probability at least .
Note that for computing we just need several FFT and inverse FFT. Therefore, its computational cost is low (), and the estimation of is very fast.
IV Convergence analysis
This section consists of two parts and provides a complete analysis of the convergence problem of the nonconvex IADMM algorithm. The first subsection contains the main convergence results, the proof sketch, the difficulties in the proof and theoretical contributions. While the second subsection introduces the necessary technical lemmas. Assumption 1 holds through this section.
IV-A Main results
Theorem 1** (Stationary point convergence)**
Assume that the free parameter satisfies the condition
[TABLE]
with \theta=1/(2\sin(\frac{\pi}{2N})\big{)}, then any cluster point is also a critical point of .
Theorem 1 describes the stationary point convergence result for the IADMM method, which is free of using the KŁ property of the functions. If the KŁ property is further assumed, the sequence convergence can be proved giving us the Theorem 2.
Theorem 2** (Sequence convergence)**
Let condition (46) hold, and the auxiliary function (40) be a KŁ function. Then the sequence , generated by Algorithm 1, converges to a critical point of .
The proof can be divided into two parts, and in order to help the reader we first give a brief sketch of the proof:
I. In the first part we introduce an auxiliary sequence , where are composite points from . An auxiliary function is also proposed. In Lemma 5, we prove a “sufficient descent condition” of the values of at , i.e.,
[TABLE]
where is a positive constant.
II. We prove a “relative error condition” of , i.e., there exists such that
[TABLE]
where is a positive constant. Note that this condition is different from the “real” relative error condition proposed in [41].
The major difficulty in deriving these two conditions is the use of inertial terms, with which the descent values are lower bounded by and rather than and . Similarly, the relative error is bounded by and . The relative error can be expanded by triangle inequalities, which is relatively proved. However, for the sufficient descent, the use of the triangle inequalities is much more difficult and technical for the lower boundedness.
The theoretical contributions in this paper are two-fold. The first one is, of course, dealing with the difference caused by the inertial terms. This part also includes how to design the scheme of the algorithm, whose details have been presented in previous section. The second one is to determine the parameters for IADMM applied to the image deblurring.
The main results can be proved with the following lemmas.
IV-B Technical lemmas
Lemma 4
Let the sequence be generated by Algorithm 1 to solve problem (3), then
[TABLE]
where is given in Lemma 2 and denotes the spectral radius of .
Now we provide the main technical lemma that states the descent condition for a suitable function of the sequences of the Algorithm 1.
Lemma 5
Let the sequence be generated by Algorithm 1 and conditions of Theorem 1 hold. By defining the auxiliary function
[TABLE]
and
[TABLE]
we have that
[TABLE]
where
[TABLE]
If the sequence is bounded, then, it holds that
[TABLE]
Remark 2
Based on Lemma 5, it is important to guarantee that the condition (36) can be satisfied. This fact can be reached if is large enough. Fortunately, in the Algorithm 1 the parameter can be fixed by the user. Thus, the parameter shall be chosen large enough to guarantee the convergence considering condition (36).
Lemma 6
If the nonconvex regulation function is coercive and
[TABLE]
the sequence is bounded.
Remark 3
By combining the conditions (36) and (45), we just need that
[TABLE]
Lemma 7
Let the sequence be generated by Algorithm 1. Then, for any , there exists and such that
[TABLE]
Now, we recall a definition about the limit point set introduced in [42], which denotes the set of all the stationary points generated by the nonconvex IADMM. The specific mathematical definition of is given as follows.
Definition 4
Let be generated by the nonconvex IADMM . We define the set by
[TABLE]
Lemma 8
Let the sequence be generated by Algorithm 1, the auxiliary function defined in (40) and suppose that condition (46) holds. Then, we have the following results.
(1) is nonempty and .
(2) .
(3) The objective function is finite and constant on .
V Numerics
In this section, we illustrate the effectiveness of the proposed algorithm on different numerical blurred images with Gaussian blur.
All the programs have been written entirely in C++, and all the experiments are implemented under Linux running on a desktop computer with an Intel Core i5-2400S CPU (2.5 GHz) and 4 GB Memory. The FFT subroutines used in the algorithms are taken from the fftw-3111http://www.fftw.org/ library. As test problems we have selected twelve images (see Figure 1), which include seven images from USC-SIPI image database222http://sipi.usc.edu/database/, two classical test images (Lena and cameraman), one text image from “El Quijote” book and two medical images. In order to obtain the blurred images, we use, as it is common in literature, the blurring operator generated using a convolution with Gaussian kernel (KernelSize , KernelMu , KernelSigma ) and circular mapping on the edges of the image.
The proposed algorithm IADMM (Algorithm 1) is compared with the widely used augmented Lagrangian methods (ADMM [22]) for image deblurring. We mainly consider two models, i.e., and . We call them as TV1 and TV(1/2) methods, respectively. Note that TV1 is a convex method, while TV(1/2) is nonconvex. In the tests we have considered (unless so indicated) for all the methods the value of the penalty parameter (a small one) and/or (a large one) just to see the behaviour of the IADMM algorithm.
The performance of the deblurring algorithms is quantitatively measured by means of the objective function (Equations 2 or 3), the Real Error as of the difference between the original and deblurred images, the signal-to-noise ratio (SNR) [22]
[TABLE]
where and denote the original image and the restored image, respectively, and represents the mean of the original image , and the residual ( and in the standard and inertial versions, respectively) as described in [22]:
[TABLE]
In the tests we do not provide CPU tests as all the algorithms have a very similar computation cost per iteration (mainly from the FFT routines). Therefore, there is almost no difference between pictures showing iterations or CPU cost, and the consumed CPU is basically proportional to the respective number of iterations.
On our first test we use the brain M0 image with a low value of and we show in Figure 2 that the performances of the ADMM TV1 and TV(1/2) are quite similar in error and SNR. Therefore, in the rest of comparisons we will just consider the TV1 method. For the IADMM method the inertial parameter in Algorithm 1 is investigated firstly by using two different values ( and ). The main difference observed in these tests is that the nonconvex method is more unstable once reached the maximum precision (at this point the ADMM TV1 convex method seems to be the more stable with a quite smooth behaviour). The fastest convergence is observed using the IADMM (with the largest stepsize value ) method, but when the maximum precision is obtained unstable behaviour appears. Therefore, using mainly the information provided by the residual (Eq. (50)), we provide a stop control criterion (in the same spirit as [22]) that stops the iterative process at the black points of Figure 2 (d) in the tests. That is, we stop when the following condition is hold
[TABLE]
In the first case the convergence till the desired tolerance error is obtained, while in the second case the algorithm has reached its stability limit and the residual grows, behaving later in an unstable way. Note that the residual is used in the stop criterion, as it uses known data from the iterations (and which does not depend on the original image that is unknown). We remark that the use of stop control techniques avoids the use of unnecessary iterations, and also to stop at the limit accuracy of the method. Also, from the pictures we observe that the IADMM algorithm provides enough precision in a lower number of iterations. The larger means faster method, but at the price of a more unstable method as it can be seen on Fig. 2(d). On that picture we observe that when the residual begins to behave chaotically, with sudden increases, it means that it is advisable to stop the iterative process as considered in the stop criterion (51) (black dot points on Fig. 2(d)). With that criterion, the IADMM method seems to be an interesting option for fast deblurring problems.
To observe more clearly the influence of the parameter in Algorithm 1 we perform several tests on the brain M0 image on Figure 3 for values and . Note that this parameter plays a role similar to the stepsize (as it also occurs to the parameter ), as it controls the perturbation at each step. A large value will provide, when the method works, a quite fast method, but on the other hand it makes the method more unstable. In fact, from the plot 3(b) we observe that in this case it would be optimal to use the parameter value in combination with the stop criterion, giving the maximum precision in just 28 iterations. Besides, it is shown that after the values selected by the stop criterion the residual begins to oscillate among values that provides similar error but that generates an unstable behaviour giving rise to an increment of the error in subsequent iterations (this instability is delayed when the parameter decreases, what is expected because the increment is smaller, as the vertical lines connecting the error and residual plots show).
The influence of the penalty parameter is also quite relevant, but a detailed analysis is out of the scope of this paper. On the Figure 4 we show the evolution of the residual using two values of and several values of the parameter and . We observe that low values of the penalty parameter gives a lower residual, but the error is lower for large providing a faster convergence, and it has a big effect on the empirical performance of the methods as shown in [33], but it remains to study optimal combinations of the parameters and and suitable criteria for an automatic selection (this will be part of the next steps in our study of these methods).
On Figure 5, we present the original medium resolution brain M0 image, the blurred one using, as indicated, a convolution with Gaussian kernel, and the results of the IADMM deblurring images using two error tolerances (, ) in the stop criterion (51). We can see that in both cases the quality of the recovered image is visually good.
A more detailed analysis is shown on Table II, where we present for the IADMM ( and ) and the ADMM (TV1) methods (we do not show results for the TV(1/2) as they are quite similar to those of the TV1 ones), the iteration number, real error (ERROR), SNR, the residual (RES) and the efficiency rates of the IADMM () method over the two other ones, I2/I1 and I3/I1, for different values of the tolerance and using a small value of . In all the methods we have used the same values of the parameters and a similar stop criterion (51) with the respective definition of the residual (Eqn. (50)). From the tests, the inertial IADMM nonconvex methods are the fastest, as expected, but also with a more unstable behaviour, also as expected due to the nonconvex version of the methods. Note that from Figure 2 the ADMM TV1 and TV(1/2) perform similarly for low , but if we make simulations small differences appear in medium-high resolution, obtaining for the M1 image the TV1 method 113 iterations and a ratio 2.26, but the TV(1/2) 99 iterations and a ratio 1.98; and for the H0 image the TV1 uses 110 iterations and a ratio 2.75, but the TV(1/2) 81 iterations and a ratio 2.03. That is, the nonconvex TV(1/2) may perform better in some circumstances, but the instability may also appear. In case of using larger values of the parameter , as the convergence theorems suggest, all the methods are much faster, giving a better performance as shown on Table III for . Finally, we present on Table IV some results obtained by changing the value of the parameter and with the fixed value of the tolerance . Again, the IADMM () method presents the best performance. Therefore, globally, the inertial IADMM version seems to be an interesting option, being the fastest one.
As above commented, how to select optimal combinations of the parameters and , the use on other types of blur, development of suitable criteria for automatic selection of all the parameters and so on, remains a research issue, but these goals are out of the scope of this article.
VI Conclusion
In this paper, a more efficient nonconvex inertial alternating minimization algorithm (IADMM) is developed for solving the Total Variation model for image deblurring. The proposed scheme is based on using the inertial proximal strategy on nonconvex ADMM methods. By the using the Kurdyka-Łojasiewicz property, we prove the convergence of the algorithm under several reasonable assumptions. Numerical experiments demonstrate that the proposed algorithm overall outperforms the widely used augmented Lagrangian ADMM methods, being a fast option, although it can be unstable for high precision. This instability is easily controlled via a suitable stop control criteria.
Appendix A: proof of Lemma 2
Direct computation yields
[TABLE]
Then, we have
[TABLE]
With Lemma 7.2 in [44], we obtain
[TABLE]
Noting is a symmetric tridiagonal matrix
[TABLE]
and using the result in [45],
[TABLE]
we obtain \lambda_{\min}(\mathcal{T}^{*})\geq 2\sin\bigg{(}\frac{\pi}{2N}\bigg{)}. Now, taking into account that is full row-rank, \|\mathcal{T}^{*}x\|\geq\lambda_{\min}(\mathcal{T}^{*})\|x\|\geq 2\sin\bigg{(}\frac{\pi}{2N}\bigg{)}\|x\|.
Appendix B: proof of Lemma 4
As the point is the minimization of , then the optimization condition yields
[TABLE]
Therefore, we can derive
[TABLE]
and replacing by
[TABLE]
Subtracting (60) and (61), and using Lemma 2,
[TABLE]
Appendix C: proof of Lemma 5
Note that is the minimizer of with respect to the variable . Then we have:
[TABLE]
In (III), we have obtained
[TABLE]
By direct calculations, we obtain
[TABLE]
and
[TABLE]
Combining the above equations, we obtain
[TABLE]
Thus, taking into account the definition of (40), we have
[TABLE]
Now, using the condition (36) and the definition (43) of , we obtain the descent result given in Eq. (42).
The boundness of implies the boundness of , and the continuity of implies that is bounded. From the above proven result Eq. (42), is decreasing. Thus, the sequence is convergent, i.e., . Therefore, from (42), we have
[TABLE]
which indicates due to (39). The scheme of updating gives us .
Appendix D: proof of Lemma 6
From (61), we derive
[TABLE]
By direct calculations, we obtain
[TABLE]
Thus, we have
[TABLE]
From Lemma 5, is bounded. This means, using the definition of , the boundedness of sequences , and . Taking into account the coercivity of , is bounded. Now, taking into account (69), is bounded. Then, \bigg{\{}\left(\begin{array}[]{c}\mathcal{T}\\ K\\ \end{array}\right)u^{k}\bigg{\}}_{k=0,1,2,\ldots} is bounded, and using the reversibility of \left(\begin{array}[]{c}\mathcal{T}\\ K\\ \end{array}\right), we obtain that is bounded. Therefore, all the components of the sequence are bounded, and so the result is proved.
Appendix E: Proof of Lemma 7
For the component, we have
[TABLE]
Easy computations on the new defined term give
[TABLE]
Direct calculations hold
[TABLE]
Thus, we obtain that
[TABLE]
where the parameter is given by
[TABLE]
For the component, we have
[TABLE]
With (73), we have the new term
[TABLE]
It is easy to obtain
[TABLE]
where
[TABLE]
Obviously, it holds
[TABLE]
From the scheme, we can easily see
[TABLE]
where \gamma_{p}=\max\bigg{\{}\frac{\theta\|K\|_{2}^{2}}{\delta},\,\frac{\alpha\theta\|K\|_{2}^{2}}{\delta}\bigg{\}}. Therefore, we have
[TABLE]
Letting , we have . With all the above equations (72), (75), (76), and (77), we obtain the global bound
[TABLE]
Denoting and then, we finish the proof.
Appendix F: Proof of Theorem 1
For any cluster point , there exists such that . Then, from Lemma 5, we have From the scheme of Algorithm 1, we have the following conditions
[TABLE]
Letting , with Proposition 1, we are then led to
[TABLE]
Finally, from Proposition 2, we obtain that is a critical point of .
Appendix G: Proof of Lemma 8
(1) From Lemma 6 the sequences and are bounded. Thus, is nonempty. Assume that , then from the definition there exists a subsequence . In that case, from Lemma 5, we also have . That is also , i.e., . Besides, from Lemma 7 and Lemma 5, we have and . Finally, Proposition 1 indicates that , i.e. .
(2) This item follows as a consequence of the definition of the limit point set (Definition 4).
(3) In Eq. (64), by taking limits on , we obtain And with the closedness of , we have the other bound . That means Using the continuity of the rest of functions that defines , we have On the other hand, Lemma 5 indicates is decreasing. Noting the boundedness of , this sequence has a lower bound. Thus, is convergent. And then, we have
Appendix H: Proof of Theorem 2
From Lemma 8, is constant on . Let be a stationary point of . Then, from the definition of we have Also from Lemma 8, we have and for any , for some . Hence, from Lemma 1, we have
[TABLE]
that together with Lemma 7 give us
[TABLE]
Then, the concavity of yields that
[TABLE]
Using Lemma 7, we have
[TABLE]
which is equivalent to
[TABLE]
Using the Schwartz’s inequality, we then derive
[TABLE]
Summing (Appendix H: Proof of Theorem 2) from to yields that
[TABLE]
Letting , and applying Lemma 5, we have Then, is convergent. And noting that is a stationary point of we also have that converges to . With (39), the convergence of indicates the convergence of . And then the sequence is also convergent. With the identity , is convergent. That means converges to some point . Finally, from Theorem 1, we have that is a critical point of .
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] P. C. Hansen, J. G. Nagy, and D. P. O’Leary, Deblurring Images: Matrices, Spectra, and Filtering . Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2006.
- 2[2] L. B. Lucy, “An iterative technique for the rectification of observed distributions,” The Astronomical Journal , vol. 79, p. 745, 1974.
- 3[3] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D: Nonlinear Phenomena , vol. 60, no. 1-4, pp. 259–268, 1992.
- 4[4] C. Chen, R. H. Chan, S. Ma, and J. Yang, “Inertial proximal ADMM for linearly constrained separable convex optimization,” SIAM Journal on Imaging Sciences , vol. 8, no. 4, pp. 2239–2267, 2015.
- 5[5] Y. Wang, J. Yang, W. Yin, and Y. Zhang, “A new alternating minimization algorithm for total variation image reconstruction,” SIAM Journal on Imaging Sciences , vol. 1, no. 3, pp. 248–272, 2008.
- 6[6] F. Chen, H. Zhou, C. Grecos, and P. Ren, “Segmenting oil spills from blurry images based on alternating direction method of multipliers,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing , vol. 11, no. 6, pp. 1858–1873, 2018.
- 7[7] M. Hintermüller and T. Wu, “Nonconvex TV q -models in image restoration: Analysis and a trust-region regularization–based superlinearly convergent solver,” SIAM Journal on Imaging Sciences , vol. 6, no. 3, pp. 1385–1415, 2013.
- 8[8] M. Hintermüller and T. Wu, “A smoothing descent method for nonconvex TV q -models,” in Efficient Algorithms for Global Optimization Methods in Computer Vision , pp. 119–133, Springer, 2014.
