Low-Rank Regularized Convex-Non-Convex Problems for Image Segmentation or Completion

Mohamed El Guide; Anas El Hachimi; Khalide Jbilou; Lothar Reichel

arXiv:2508.21765·math.NA·September 1, 2025

TL;DR

This paper introduces a new convex-non-convex formulation for image segmentation and completion that combines low-rank and smoothness regularizations, solved efficiently with ADMM and validated through numerical experiments.

Contribution

It presents a novel formulation integrating low-rank and smoothness regularizations for image tasks, with convergence analysis and empirical validation.

Findings

01

Effective in image segmentation and completion tasks.

02

Convergence of the ADMM algorithm is established.

03

Numerical experiments demonstrate superior performance.

Abstract

This work proposes a novel convex-non-convex formulation of the image segmentation and the image completion problems. The proposed approach is based on the minimization of a functional involving two distinct regularization terms: one promotes low-rank structure in the solution, while the other one enforces smoothness. To solve the resulting optimization problem, we employ the alternating direction method of multipliers (ADMM). A detailed convergence analysis of the algorithm is provided, and the performance of the methods is demonstrated through a series of numerical experiments.

Tables

Table 1. PSNR-values of the restored images, CPU-times, and numbers of iterations for the CNC, ATCG-TV, and LR-CNC methods for various values of SR.

Data	Method	CNC			ATCG-TV			LR-CNC
	SR	PSNR	CPU-time	Iter	PSNR	CPU-time	Iter	PSNR	CPU-time	Iter
MRI	0.1	21.22	20.22	730	18.44	8.20	968	21.36	4.54	174
	0.2	23.14	5.68	355	21.96	7.13	782	23.37	2.69	98
	0.3	25.01	4.00	262	24.37	8.56	889	25.32	1.65	60
Cameraman	0.1	21.34	11.53	757	19.31	6.58	770	21.52	5.09	184
	0.2	23.11	5.52	361	22.11	7.31	785	23.27	2.64	91
	0.3	24.33	3.35	220	24.08	6.90	755	24.70	1.66	58

Table 2. PSNR-values, CPU-times, and numbers of iterations required by the CNC, ATCG-TV, and LR-CNC algorithms for several SR-values.

Data	Method	CNC			ATCG-TV			LR-CNC
	SR	PSNR	CPU-time	Iter	PSNR	CPU-time	Iter	PSNR	CPU-time	Iter
Airplane	0.1	22.26	56.64	714	18.64	19.24	736	22.47	4.14	78
	0.2	24.39	13.48	355	23.04	20.90	736	24.53	2.11	38
	0.3	25.86	9.56	212	25.21	22.42	753	26.13	2.05	37
Barbara	0.1	23.11	24.30	722	20.73	20.78	849	23.24	3.63	71
	0.2	25.19	13.14	346	24.42	24.59	814	25.37	2.39	41
	0.3	26.76	8.47	217	26.63	22.32	769	26.89	2.00	36

Table 3. PSNR- and SSIM-values for images determined by the CNC, ATCG-TV, Chan & Vese, and LR-CNC algorithms for Gaussian noise with several noise levels $L$.

method		CNC		ATCG-TV		Chan & Vese [18]		LR-CNC
	$L$	PSNR	SSIM	PSNR	SSIM	PSNR	SSIM	PSNR	SSIM
MRI	0.01	24.94	0.42	24.37	0.49	24.17	0.64	25.12	0.49
	0.1	18.73	0.33	18.11	0.37	18.71	0.37	20.50	0.38
	0.2	13.78	0.29	13.80	0.31	13.71	0.32	15.43	0.35
	0.3	10.62	0.27	10.63	0.26	10.54	0.27	12.03	0.31
Millennium	0.01	22.74	0.71	22.40	0.69	18.94	0.37	22.40	0.69
	0.1	18.14	0.62	16.20	0.30	16.66	0.30	20.30	0.65
	0.2	13.34	0.49	12.21	0.26	13.09	0.24	18.29	0.59
	0.3	10.10	0.40	9.15	0.22	10.11	0.20	13.60	0.51

Table 4. PSNR- and SSIM-values of segmented images determined by the CNC, Chan & Vese, ATCG-TV, and LR-CNC algorithms.

method		CNC		ATCG-TV		Chan & Vese [18]		LR-CNC
	$L$	PSNR	SSIM	PSNR	SSIM	PSNR	SSIM	PSNR	SSIM
Samson	0.01	26.87	0.51	29.04	0.79	29.61	0.70	28.02	0.75
	0.1	18.81	0.30	19.54	0.60	19.66	0.52	21.38	0.63
	0.2	13.48	0.18	13.91	0.47	13.93	0.41	18.22	0.53
	0.3	10.16	0.13	10.45	0.39	10.46	0.34	15.66	0.51
Jasper	0.01	24.88	0.51	25.28	0.60	26.16	0.62	27.01	0.68
	0.1	18.40	0.39	18.96	0.50	19.30	0.49	22.54	0.57
	0.2	13.44	0.32	13.74	0.42	13.87	0.40	20.54	0.48
	0.3	10.17	0.26	10.35	0.36	10.44	0.35	18.62	0.43

Equations

\min_{U\in\mathbb{R}^{n_{1}\times n_{2}}}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+R(U)\right\},

R(U)=\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}\phi\left(\|(\nabla U)_{ij}\|_{F};T,a\right),

\phi(t;T,a)=\begin{cases}\phi_{1}(t;T,a)=\dfrac{a(T_{2}-T)t^{2}}{2T},&t\in[0,T),\\ \phi_{2}(t;T,a)=-\dfrac{a}{2}t^{2}+aT_{2}t-\dfrac{aTT_{2}}{2},&t\in[T,T_{2}),\\ \phi_{3}(t;T,a)=\dfrac{aT_{2}(T_{2}-T)}{2},&t\in[T_{2},\infty).\end{cases}

(\nabla U)_{ij}=\left[(D_{1}U)_{ij},(D_{2}U)_{ij}\right]^{T},

D_{1}U=UC_{1},\quad D_{2}U=C_{2}U,

C_{1}=\begin{bmatrix}-1&0&0&\dots&1\\ 1&-1&0&\dots&0\\ \vdots&\ddots&\ddots&\ddots&\vdots\\ 0&\dots&1&-1&0\\ 0&\dots&0&1&-1\end{bmatrix}\in\mathbb{R}^{n_{2}\times n_{2}},\quad C_{2}=\begin{bmatrix}-1&1&0&\dots&0\\ 0&-1&1&\dots&0\\ \vdots&\ddots&\ddots&\ddots&\vdots\\ 0&\dots&0&-1&1\\ 1&\dots&0&0&-1\end{bmatrix}\in\mathbb{R}^{n_{1}\times n_{1}}.

\min_{U\in\mathbb{R}^{n_{1}\times n_{2}}}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+R(U)\right\}\quad\text{such that}\quad\operatorname{rank}(U)\leq r,

\|U\|_{*}=\sum_{i=1}^{\min\{n_{1},n_{2}\}}\sigma_{i}.

\partial\|X\|_{*}=\left\{UV^{T}+W:\ W\in\mathbb{R}^{n_{1}\times n_{2}},\ U^{T}W=0,\ WV=0,\ \|W\|_{2}\leq 1\right\}.

\min_{U\in\mathbb{R}^{n_{1}\times n_{2}}}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+R(U)+\|U\|_{*}\right\}.

\min_{U,Z,M}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}\phi\left(\|M_{ij}\|_{2};T,a\right)+\|Z\|_{*}\right\},

L_{\beta_{1},\beta_{2}}(U,Z,M,Q,O)=\frac{\lambda}{2}\|U-B\|_{F}^{2}+\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}\phi\left(\|M_{ij}\|_{2};T,a\right)+\|Z\|_{*}+\langle DU-M,Q\rangle+\frac{\beta_{1}}{2}\|DU-M\|_{F}^{2}+\langle Z-U,O\rangle+\frac{\beta_{2}}{2}\|Z-U\|_{F}^{2}

\langle U,V\rangle=\operatorname{trace}(U^{T}V)=\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}u_{ij}v_{ij}.

U^{k+1}=

Z^{k+1}=

M^{k+1}=

Q^{k+1}=

O^{k+1}=

U^{k+1}=\arg\min_{U}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+\langle DU-M^{k},Q^{k}\rangle+\frac{\beta_{1}}{2}\|DU-M^{k}\|_{F}^{2}+\langle Z^{k}-U,O^{k}\rangle+\frac{\beta_{2}}{2}\|Z^{k}-U\|_{F}^{2}\right\},

U^{k+1}=\arg\min_{U}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+\frac{\beta_{1}}{2}\left\|DU-M^{k}+\frac{Q^{k}}{\beta_{1}}\right\|_{F}^{2}+\frac{\beta_{2}}{2}\left\|Z^{k}-U+\frac{O^{k}}{\beta_{2}}\right\|_{F}^{2}\right\}.

\lambda U+\beta_{1}D^{T}DU=\lambda B+\beta_{1}D^{T}M^{k}-D^{T}Q^{k}+\beta_{2}Z^{k}-O^{k},

R^{k}=\lambda B+\beta_{1}D^{T}M^{k}-D^{T}Q^{k}+\beta_{2}Z^{k}-O^{k}.

\lambda U+\beta_{1}D_{1}^{T}D_{1}U+\beta_{1}D_{2}^{T}D_{2}U=R^{k},

\lambda U+\beta_{1}UC_{1}C_{1}^{T}+\beta_{1}C_{2}^{T}C_{2}U=R^{k}.

C_{1}=F_{1}^{H}\Lambda_{1}F_{1},\quad C_{2}=F_{2}^{H}\Lambda_{2}F_{2},

(F_{1}^{H}\otimes F_{2}^{H})\left(\lambda I\otimes I+\beta_{1}\Lambda_{1}\otimes I+\beta_{1}I\otimes\Lambda_{2}\right)(F_{1}\otimes F_{2})\operatorname{vec}(U)=\operatorname{vec}(R^{k}).

\operatorname{vec}(U)=(F_{1}^{H}\otimes F_{2}^{H})\left(\lambda I\otimes I+\beta_{1}\Lambda_{1}\otimes I+\beta_{1}I\otimes\Lambda_{2}\right)^{-1}(F_{1}\otimes F_{2})\operatorname{vec}(R^{k}).

Z^{k+1}=\arg\min_{Z}\left\{\|Z\|_{*}+\langle Z-U^{k+1},O^{k}\rangle+\frac{\beta_{2}}{2}\|Z-U^{k+1}\|_{F}^{2}\right\},

Z^{k+1}=\arg\min_{Z}\left\{\|Z\|_{*}+\frac{\beta_{2}}{2}\left\|Z-U^{k+1}+\frac{O^{k}}{\beta_{2}}\right\|_{F}^{2}\right\}.

U^{k+1}-\frac{O^{k}}{\beta_{2}}.

Z^{k+1}=US_{\frac{1}{\beta_{2}}}V^{T},

M^{k+1}=\arg\min_{M}\left\{\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}\phi\left(\|M_{ij}\|_{2};T,a\right)+\|Z^{k+1}\|_{*}+\langle DU^{k+1}-M,Q^{k}\rangle+\frac{\beta_{1}}{2}\|DU^{k+1}-M\|_{F}^{2}\right\}.

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Medical Image Segmentation Techniques · Numerical methods in inverse problems

Full text

Low-Rank Regularized Convex-Non-Convex Problems for Image Segmentation or Completion

M. El Guide Mohammed VI Polytechnic University, Rabat, Morocco. [email protected]

A. El Hachimi The UM6P Vanguard center, Mohammed VI Polytechnic University, Rabat, Morocco. [email protected]

K. Jbilou Université du Littoral, Côte d’Opale, batiment H. Poincarré, 50 rue F. Buisson, F-62280 Calais Cedex, France. [email protected]

L. Reichel Department of Mathematical Sciences, Kent State University, Kent 44242, Ohio USA. [email protected]

Abstract

This work proposes a novel convex-non-convex formulation of the image segmentation and the image completion problems. The proposed approach is based on the minimization of a functional involving two distinct regularization terms: one promotes low-rank structure in the solution, while the other one enforces smoothness. To solve the resulting optimization problem, we employ the alternating direction method of multipliers (ADMM). A detailed convergence analysis of the algorithm is provided, and the performance of the methods is demonstrated through a series of numerical experiments.

keywords:

Convex-non-convex problem, image completion, image segmentation, low-rank approximation, matrix nuclear norm regularization.

1 Introduction

Image segmentation and completion are fundamental computational tasks in computer vision and data analysis with many applications in medical and hyperspectral imaging, as well as in remote sensing. These tasks arise when reconstructing or segmenting images with missing or occluded pixels. Existing approaches, including variational methods [38] and deep learning techniques [33], have been applied to solve these tasks under specific circumstances. However, these approaches have limitations, such as sensitivity of the computed results to noise in the given data and high computational cost.

Recovery of meaningful information from corrupted or incomplete data is a long-standing problem in image processing. For example, in medical imaging, modalities such as Magnetic Resonance Imaging (MRI) and Computerized Tomography (CT) frequently yield data that suffer from noise contamination or incomplete acquisition due to hardware limitations or patient constraints [21, 28]. Similarly, hyperspectral imaging, known for its ability to capture detailed spectral information, often suffers from missing data values because of sensor malfunction [19, 40]. These difficulties make image recovery and segmentation indispensable for accurate data analysis, diagnosis, and decision-making. Traditional approaches to deal with these issues include the use of i) regularization, e.g., Total Variation (TV) regularization [2, 4, 5], ii) low-rank matrix or tensor factorization [3, 19], and iii) variational formulations such as the Mumford-Shah model [35]. The latter has inspired many variations, including the two-stage Mumford-Shah model by Cai et al. [15] and the Convex-Non-Convex (CNC) model proposed by Chan et al. [17]. Despite their effectiveness, these approaches face limitations, such as high computational complexity, sensitivity to parameter settings, and difficulties in preserving fine details in high-dimensional data.

Image segmentation involves partitioning an image into meaningful regions based on certain properties of the image such as intensity, texture, or spectral information. Classical techniques, including active contour methods [18] and region-growing algorithms [27], have been studied extensively. Variational approaches, such as the Mumford-Shah model [35] and its convex variants [15], have advanced the field by providing robust mathematical formulations. More recently, the hybrid convex-non-convex framework has gained prominence due to its ability to balance computational feasibility with modeling flexibility; see Chan et al. [17]. We remark that traditional convex formulations, such as those based on the nuclear norm, are robust but can be overly restrictive and may give suboptimal solutions. Conversely, purely non-convex approaches, while offering greater flexibility, often lack stability and scalability; see, e.g., Lu et al. [34].

To overcome these limitations, this paper introduces a hybrid convex-non-convex optimization framework for image segmentation and completion. By combining the robustness of convex optimization and the flexibility of non-convex regularization, the proposed methods strike a balance between accuracy and computational efficiency. The methods use regularization that promotes matrix solutions of low rank. This has emerged as a powerful technique for image recovery as well as for image segmentation. Low-rank matrix solutions can be represented with reduced degrees of freedom; see Bell et al. [3]. We seek to determine low-rank matrix solutions by regularization with the nuclear matrix norm, which is a convex surrogate of the matrix rank function [16]. However, while the use of the nuclear norm ensures mathematical tractability, the solution of minimization problems that involve the nuclear matrix norm can be very demanding computationally for large-scale problems. To address this issue, the Alternating Direction Method of Multipliers (ADMM) is frequently employed; see, e.g., [12, 29, 32, 36]. The ability of ADMM to decompose complex problems into simple subproblems has made it a popular method for solving a variety of optimization problems in image processing. Computed examples presented in Section 4 show the proposed methods to be competitive with available methods both in terms of accuracy and computing time.

We formulate image completion and image segmentation within a unified framework. Let $B\in\mathbb{R}^{n_{1}\times n_{2}}$ represent an observed image. For image completion, the matrix $B$ has zero or blank entries at locations for which no data is available, and for image segmentation the matrix $B$ represents a fully observed noisy image. The desired solution $U\in\mathbb{R}^{n_{1}\times n_{2}}$ of many image processing problems is a piecewise smooth image. Therefore, many solution methods use regularization that promotes piecewise smoothness of the computed solution. For instance, this can be achieved with Total Variation (TV) regularization; see [4, 6, 7]. We obtain a minimization problem of the form

$$\min_{U\in\mathbb{R}^{n_{1}\times n_{2}}}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+R(U)\right\},\tag{1}$$

where $R:\mathbb{R}^{n_{1}\times n_{2}}\to\mathbb{R}$ denotes a total variation regularization functional, $\|\cdot\|_{F}$ stands for the Frobenius norm, and $\lambda>0$ is a regularization parameter that balances the influence of the terms in (1) on the solution. Chan et al. [17] proposed the use of the total variation functional

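$$R(U)=\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}\phi\left(\|(\nabla U)_{ij}\|_{F};T,a\right),$$

where

$$\phi(t;T,a)=\begin{cases}\phi_{1}(t;T,a)=\dfrac{a(T_{2}-T)t^{2}}{2T},&t\in[0,T),\\ \phi_{2}(t;T,a)=-\dfrac{a}{2}t^{2}+aT_{2}t-\dfrac{aTT_{2}}{2},&t\in[T,T_{2}),\\ \phi_{3}(t;T,a)=\dfrac{aT_{2}(T_{2}-T)}{2},&t\in[T_{2},\infty).\end{cases}$$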

The parameters $T>0$ and $a>0$ have to be chosen by a user: $a$ is used to tune the degree of non-convexity of the regularization functional as discussed below, while $T$ is chosen to emphasize interesting features of the image. In image segmentation, $T$ is used to indicate which entries of the image should not be considered boundary points of the segmented regions of the images. The chosen form behaves like a quadratic smoothing term for small gradients but transitions to a flat response for large gradients, thereby preserving edges, as proposed by Chan et al. [17]. We comment on the choices of $a$ , $T$ , and $T_{2}$ below; see also Chan et al. [17] for a discussion on how to choose these parameters.
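The piecewise definition of $\phi$ can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: the function name `phi` and the choice to pass $T_{2}$ as an explicit argument are assumptions made here, since the relation between $T_{2}$, $T$, and $a$ is not stated at this point of the text. The sketch only requires $T_{2}>T>0$ and $a>0$.

```python
import numpy as np

def phi(t, T, a, T2):
    """Evaluate the penalty phi(t; T, a) elementwise for t >= 0.

    T2 is passed explicitly (an assumption of this sketch); it must satisfy T2 > T > 0.
    """
    t = np.asarray(t, dtype=float)
    # phi_1: convex quadratic branch on [0, T)
    quad = a * (T2 - T) * t**2 / (2.0 * T)
    # phi_2: concave quadratic branch on [T, T2)
    conc = -0.5 * a * t**2 + a * T2 * t - 0.5 * a * T * T2
    # phi_3: constant branch on [T2, infinity)
    flat = 0.5 * a * T2 * (T2 - T)
    return np.where(t < T, quad, np.where(t < T2, conc, flat))

# Example call with illustrative parameter values.
values = phi(np.linspace(0.0, 2.0, 5), T=0.5, a=2.0, T2=1.0)
```

The three branches agree at $t=T$ and $t=T_{2}$, so the function is continuous, and large arguments are all mapped to the same constant, which is the flat response that preserves edges.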

Following Chan et al. [17], we define a gradient-type operator $\nabla U$ by

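$$(\nabla U)_{ij}=\left[(D_{1}U)_{ij},(D_{2}U)_{ij}\right]^{T},$$

where the superscript T stands for transposition,

$$D_{1}U=UC_{1},\quad D_{2}U=C_{2}U,$$

and $C_{1}$ and $C_{2}$ are the circulant matrices

$$C_{1}=\begin{bmatrix}-1&0&0&\dots&1\\ 1&-1&0&\dots&0\\ \vdots&\ddots&\ddots&\ddots&\vdots\\ 0&\dots&1&-1&0\\ 0&\dots&0&1&-1\end{bmatrix}\in\mathbb{R}^{n_{2}\times n_{2}},\quad C_{2}=\begin{bmatrix}-1&1&0&\dots&0\\ 0&-1&1&\dots&0\\ \vdots&\ddots&\ddots&\ddots&\vdots\\ 0&\dots&0&-1&1\\ 1&\dots&0&0&-1\end{bmatrix}\in\mathbb{R}^{n_{1}\times n_{1}}.$$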
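The gradient-type operator can be illustrated with a short NumPy sketch. It assumes that $C_{1}$ has $-1$ on the diagonal, $+1$ on the subdiagonal, and $+1$ in the top-right corner, and that $C_{2}$ is the analogous matrix with $+1$ on the superdiagonal and in the bottom-left corner; with this reading, $D_{1}U=UC_{1}$ and $D_{2}U=C_{2}U$ are periodic forward differences along rows and columns. The function names are hypothetical.

```python
import numpy as np

def circulant_differences(n1, n2):
    """Build the circulant difference matrices C1 (n2 x n2) and C2 (n1 x n1)."""
    # C1: -1 on the diagonal, +1 on the subdiagonal, +1 in the top-right corner.
    C1 = -np.eye(n2) + np.roll(np.eye(n2), 1, axis=0)
    # C2: -1 on the diagonal, +1 on the superdiagonal, +1 in the bottom-left corner.
    C2 = -np.eye(n1) + np.roll(np.eye(n1), 1, axis=1)
    return C1, C2

def discrete_gradient(U):
    """Return (D1 U, D2 U) = (U C1, C2 U) for a 2-D array U."""
    n1, n2 = U.shape
    C1, C2 = circulant_differences(n1, n2)
    return U @ C1, C2 @ U

# With periodic boundaries, the two products are forward differences:
# (D1 U)[i, j] = U[i, j+1] - U[i, j] and (D2 U)[i, j] = U[i+1, j] - U[i, j].
U = np.arange(12.0).reshape(3, 4)
D1U, D2U = discrete_gradient(U)
```

In practice one would use `np.roll` directly instead of forming the matrices; the explicit matrices are kept here only to mirror the notation of the text.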

The function $\phi$ in (2) satisfies:

1. $\phi$ is continuously differentiable for $t\in\mathbb{R}_{+}$.

2. $\phi$ is twice continuously differentiable for $t\in\mathbb{R}_{+}\backslash\{T,T_{2}\}$.

3. $\phi$ is convex and monotonically increasing for $t\in[0,T)$.

4. $\phi$ is concave and monotonically non-decreasing for $t\in[T,T_{2})$.

5. $\phi$ is constant for $t\in[T_{2},+\infty)$.

6. $\inf_{t\in\mathbb{R}_{+}\backslash\{T,T_{2}\}}\phi^{\prime\prime}=-a$.

We would like to determine an approximate solution of (1) of low rank. The recovery of data such that the computed solution, or the correction of an available approximate solution, is of low rank has received considerable attention; see, e.g., [19, 21, 31, 38, 40]. Reasons for this include that the imposition of a low-rank constraint may be meaningful in the model considered, may yield a computed solution of higher quality, or may give a solution that is easier to interpret. We therefore modify the minimization problem (1) to obtain

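$$\min_{U\in\mathbb{R}^{n_{1}\times n_{2}}}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+R(U)\right\}\quad\text{such that}\quad\operatorname{rank}(U)\leq r,\tag{3}$$

where $r$ is a user-specified positive integer.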

However, solving the minimization problem (3) is NP-hard. It therefore is impractical to solve (3), except when all matrices involved are very small. To circumvent this difficulty, we replace the rank function by its convexification, which is the matrix nuclear norm; see [3, 13, 16] for discussions on this replacement. We solve the minimization problem so obtained by ADMM.

In summary, the contributions of this work are: (i) a new unified model for image completion and segmentation that promotes both low-rank structure and piecewise smoothness, (ii) an efficient ADMM-based algorithm with proven convergence properties, and (iii) a comprehensive experimental evaluation for a variety of images (grayscale, color, hyperspectral) that demonstrates improved accuracy and speed compared to existing methods.

This paper is organized as follows. Section 2 discusses the solution of the optimization problem (3) with the rank function replaced by the nuclear norm, and Section 3 is concerned with the convergence properties of the proposed solution method. Numerical examples that illustrate the performance of the solution methods are reported in Section 4. Concluding remarks can be found in Section 5.

It is a pleasure to dedicate this paper to Åke Björck and Lars Eldén, who made profound contributions to Numerical Linear Algebra and pioneered the development of numerical methods for the solution of linear discrete ill-posed problems; see, e.g., [8, 9, 10, 11, 23, 24, 25, 26].

2 The solution method

This section describes the proposed solution method. We convexify the rank-regularized problem (3) by using the nuclear norm surrogate, and derive an efficient algorithm based on ADMM to solve the resulting problem.

Let the matrix $U\in\mathbb{R}^{n_{1}\times n_{2}}$ have the singular values $\sigma_{1}\geq\sigma_{2}\geq\ldots\geq\sigma_{\min\{n_{1},n_{2}\}}\geq 0$ . The nuclear norm of $U$ is defined as

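$$\|U\|_{*}=\sum_{i=1}^{\min\{n_{1},n_{2}\}}\sigma_{i}.$$

The matrix nuclear norm is continuous, convex, and coercive. Its subdifferential is given by

$$\partial\|X\|_{*}=\left\{UV^{T}+W:\ W\in\mathbb{R}^{n_{1}\times n_{2}},\ U^{T}W=0,\ WV=0,\ \|W\|_{2}\leq 1\right\}.$$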

Here and throughout this paper $\|\cdot\|_{2}$ denotes the spectral matrix norm or the Euclidean vector norm.
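As a small illustration of the definition, the nuclear norm can be evaluated as the sum of the singular values. The sketch below uses NumPy; the helper name `nuclear_norm` is an assumption of this example, not a function used by the authors.

```python
import numpy as np

def nuclear_norm(U):
    """Sum of the singular values of the 2-D array U."""
    return np.linalg.svd(U, compute_uv=False).sum()

# For a rank-one matrix x y^T, the only nonzero singular value is ||x||_2 ||y||_2,
# so the nuclear norm, the spectral norm, and the Frobenius norm coincide.
x = np.array([1.0, 2.0, 2.0])
y = np.array([3.0, 4.0])
rank_one = np.outer(x, y)
same_norms = np.isclose(nuclear_norm(rank_one), np.linalg.norm(rank_one, 2))
```

For matrices of higher rank the nuclear norm exceeds the spectral norm, which is why penalizing it favors solutions with few nonzero singular values.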

Replacing the rank constraint in the optimization problem (3) by the matrix nuclear norm gives the minimization problem

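$$\min_{U\in\mathbb{R}^{n_{1}\times n_{2}}}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+R(U)+\|U\|_{*}\right\}.\tag{4}$$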

We introduce some auxiliary variables to obtain an equivalent minimization problem that we will solve:

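$$\min_{U,Z,M}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}\phi\left(\|M_{ij}\|_{2};T,a\right)+\|Z\|_{*}\right\},\tag{5}$$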

such that $DU=M$ and $Z=U$ . Here, $M$ captures the discrete gradient of $U$ (so that the regularizer $R(U)$ can be written in terms of $M$ ), and $Z$ is introduced to facilitate the nuclear norm term. These substitutions allow splitting the terms of the objective function in the ADMM framework and enable more efficient optimization. In detail, we define the differential operator $D=[D_{1}^{T},D_{2}^{T}]^{T}$ , and express the auxiliary variable $M=[M_{ij}]$ , where each matrix entry $M_{ij}=[(D_{1}U)_{i,j},(D_{2}U)_{i,j}]^{T}$ captures the local gradient information. We propose to solve the resulting optimization problem using the ADMM; see, e.g., [12, 29, 32, 36] for discussions of this method. The augmented Lagrangian is given by

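$$\begin{aligned}L_{\beta_{1},\beta_{2}}(U,Z,M,Q,O)={}&\frac{\lambda}{2}\|U-B\|_{F}^{2}+\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}\phi\left(\|M_{ij}\|_{2};T,a\right)+\|Z\|_{*}\\ &+\langle DU-M,Q\rangle+\frac{\beta_{1}}{2}\|DU-M\|_{F}^{2}+\langle Z-U,O\rangle+\frac{\beta_{2}}{2}\|Z-U\|_{F}^{2},\end{aligned}\tag{6}$$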

where $\beta_{1}>0$ and $\beta_{2}>0$ are Lagrangian parameters and $Q\in\mathbb{R}^{2n_{1}\times n_{2}}$ and $O\in\mathbb{R}^{n_{1}\times n_{2}}$ are Lagrangian multipliers. The inner product of $U=[u_{ij}]\in\mathbb{R}^{n_{1}\times n_{2}}$ and $V=[v_{ij}]\in\mathbb{R}^{n_{1}\times n_{2}}$ is defined as

$$\langle U,V\rangle=\operatorname{trace}(U^{T}V)=\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}u_{ij}v_{ij}.$$

In particular, $\|U\|_{F}=\langle U,U\rangle^{1/2}$ .

At the $(k+1)$ st iteration of ADMM, we solve the following subproblems

[TABLE]

The remainder of this section discusses the solution of these subproblems.

2.1 Solution of subproblem (7)

We obtain from (7) that

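$$U^{k+1}=\arg\min_{U}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+\langle DU-M^{k},Q^{k}\rangle+\frac{\beta_{1}}{2}\|DU-M^{k}\|_{F}^{2}+\langle Z^{k}-U,O^{k}\rangle+\frac{\beta_{2}}{2}\|Z^{k}-U\|_{F}^{2}\right\},$$

which is equivalent to

$$U^{k+1}=\arg\min_{U}\left\{\frac{\lambda}{2}\|U-B\|_{F}^{2}+\frac{\beta_{1}}{2}\left\|DU-M^{k}+\frac{Q^{k}}{\beta_{1}}\right\|_{F}^{2}+\frac{\beta_{2}}{2}\left\|Z^{k}-U+\frac{O^{k}}{\beta_{2}}\right\|_{F}^{2}\right\}.$$

The solution satisfies

$$\lambda U+\beta_{1}D^{T}DU=\lambda B+\beta_{1}D^{T}M^{k}-D^{T}Q^{k}+\beta_{2}Z^{k}-O^{k},\tag{12}$$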

where $D^{T}$ is the adjoint of the gradient operator, i.e., the divergence operator. Let

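$$R^{k}=\lambda B+\beta_{1}D^{T}M^{k}-D^{T}Q^{k}+\beta_{2}Z^{k}-O^{k}.$$

We then can write equation (12) as

$$\lambda U+\beta_{1}D_{1}^{T}D_{1}U+\beta_{1}D_{2}^{T}D_{2}U=R^{k},$$

which is equivalent to

$$\lambda U+\beta_{1}UC_{1}C_{1}^{T}+\beta_{1}C_{2}^{T}C_{2}U=R^{k}.$$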

Since the matrices $C_{1}$ and $C_{2}$ are circulants, they are diagonalizable by Discrete Fourier Transform (DFT) matrices, i.e.,

$$C_{1}=F_{1}^{H}\Lambda_{1}F_{1},\quad C_{2}=F_{2}^{H}\Lambda_{2}F_{2},$$

where $F_{1}$ and $F_{2}$ denote DFT matrices of sizes $n_{2}\times n_{2}$ and $n_{1}\times n_{1}$ , respectively, and the superscript H stands for transposition and complex conjugation. The diagonal entries of the diagonal matrices $\Lambda_{1}$ and $\Lambda_{2}$ are the eigenvalues of $C_{1}$ and $C_{2}$ , respectively. Let $\otimes$ denote the Kronecker product and let the operator vec stack all entries of a matrix column by column to give a vector. We obtain

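$$(F_{1}^{H}\otimes F_{2}^{H})\left(\lambda I\otimes I+\beta_{1}\Lambda_{1}\otimes I+\beta_{1}I\otimes\Lambda_{2}\right)(F_{1}\otimes F_{2})\operatorname{vec}(U)=\operatorname{vec}(R^{k}).$$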

The parameters $\lambda$ and $\beta_{1}$ are positive and the matrices $\Lambda_{1}$ and $\Lambda_{2}$ are invertible. Therefore, the matrix $\left(\lambda I\otimes I+\beta_{1}\Lambda_{1}\otimes I+\beta_{1}I\otimes\Lambda_{2}\right)$ is invertible. It follows that $U$ can be expressed as

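$$\operatorname{vec}(U)=(F_{1}^{H}\otimes F_{2}^{H})\left(\lambda I\otimes I+\beta_{1}\Lambda_{1}\otimes I+\beta_{1}I\otimes\Lambda_{2}\right)^{-1}(F_{1}\otimes F_{2})\operatorname{vec}(R^{k}).$$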
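The Fourier-domain solve can be sketched in a few lines of NumPy. The sketch below is an illustration under stated assumptions, not the authors' implementation: it solves a system of the form $cU+\beta_{1}(D_{1}^{T}D_{1}U+D_{2}^{T}D_{2}U)=R$ with periodic forward differences, where the scalar `shift` plays the role of the coefficient $c$ multiplying $U$, and it uses the eigenvalues $2-2\cos(2\pi k/n)$ of the symmetric circulant matrices $C_{1}C_{1}^{T}$ and $C_{2}^{T}C_{2}$ as the diagonal factor. The function name `solve_shifted_laplacian` is hypothetical.

```python
import numpy as np

def solve_shifted_laplacian(R, shift, beta1):
    """Solve shift*U + beta1*(D1^T D1 U + D2^T D2 U) = R with 2-D FFTs.

    D1 and D2 are periodic forward differences along columns and rows;
    shift > 0 and beta1 > 0 are assumed so that the symbol is positive.
    """
    n1, n2 = R.shape
    # Eigenvalues of C1 C1^T (size n2) and C2^T C2 (size n1): |exp(2*pi*i*k/n) - 1|^2.
    w1 = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n2) / n2)
    w2 = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n1) / n1)
    # Symbol of the operator on the (n1, n2) frequency grid; it is >= shift > 0.
    symbol = shift + beta1 * (w2[:, None] + w1[None, :])
    # Divide in the Fourier domain and return to the spatial domain.
    return np.real(np.fft.ifft2(np.fft.fft2(R) / symbol))

# Example: one solve for a random right-hand side.
rng = np.random.default_rng(0)
U = solve_shifted_laplacian(rng.standard_normal((8, 8)), shift=1.0, beta1=2.0)
```

Because the eigenvalues $2-2\cos(\cdot)$ are nonnegative, the symbol is bounded below by `shift`, so the division is well defined whenever `shift` is positive. The cost is dominated by two 2-D FFTs per iteration.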

2.2 Solving subproblem (8)

The matrix $Z^{k+1}$ satisfies

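$$Z^{k+1}=\arg\min_{Z}\left\{\|Z\|_{*}+\langle Z-U^{k+1},O^{k}\rangle+\frac{\beta_{2}}{2}\|Z-U^{k+1}\|_{F}^{2}\right\},$$

which can be written as

$$Z^{k+1}=\arg\min_{Z}\left\{\|Z\|_{*}+\frac{\beta_{2}}{2}\left\|Z-U^{k+1}+\frac{O^{k}}{\beta_{2}}\right\|_{F}^{2}\right\}.$$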

The right-hand side defines a proximal operator of the matrix nuclear norm. It can be shown that since $\beta_{2}>0$ , the iterates $Z^{k+1}$ converge to a unique solution $Z^{\infty}$ as $k\to\infty$ ; see [14]. The iterate $Z^{k+1}$ can be expressed with the aid of Singular Value Thresholding (SVT) of the matrix

$$U^{k+1}-\frac{O^{k}}{\beta_{2}}.\tag{14}$$

Let $USV^{T}$ denote the singular value decomposition of the matrix (14). Thus, $U$ and $V$ are orthogonal matrices and $S$ is a diagonal matrix, whose nontrivial entries are the singular values. Then

$$Z^{k+1}=US_{\frac{1}{\beta_{2}}}V^{T},$$

where, elementwise, $S_{\frac{1}{\beta_{2}}}=\max\left(S-\dfrac{1}{\beta_{2}},0\right)$ .
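The singular value thresholding step can be sketched as follows. This is a minimal NumPy illustration of the proximal operator of the nuclear norm applied to $U^{k+1}-O^{k}/\beta_{2}$ with threshold $1/\beta_{2}$; the function names `svt` and `z_update` are assumptions of this sketch.

```python
import numpy as np

def svt(Y, tau):
    """Singular value thresholding: shrink the singular values of Y by tau."""
    P, s, Qt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # elementwise max(S - tau, 0)
    return (P * s_shrunk) @ Qt           # scales the columns of P, then multiplies by Q^T

def z_update(U_next, O, beta2):
    """Z-update of the ADMM iteration: SVT of U_next - O / beta2 with threshold 1 / beta2."""
    return svt(U_next - O / beta2, 1.0 / beta2)

# Example: singular values below the threshold are set to zero, which lowers the rank.
Z = svt(np.diag([3.0, 1.0, 0.2]), 0.5)
```

Singular values not exceeding the threshold are annihilated, so a larger $1/\beta_{2}$ yields iterates $Z^{k+1}$ of lower rank.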

2.3 Solving subproblem (9)

Let $M=[M_{ij}]\in\mathbb{R}^{n_{1}\times n_{2}}$ . Then this subproblem can be written as

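$$M^{k+1}=\arg\min_{M}\left\{\sum_{i=1}^{n_{1}}\sum_{j=1}^{n_{2}}\phi\left(\|M_{ij}\|_{2};T,a\right)+\|Z^{k+1}\|_{*}+\langle DU^{k+1}-M,Q^{k}\rangle+\frac{\beta_{1}}{2}\|DU^{k+1}-M\|_{F}^{2}\right\}.$$

This expression can be simplified to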

[TABLE]

Following the discussion by Chan et al. [17], we find that the minimizer is given by

[TABLE]

where

[TABLE]

and

[TABLE]

with

[TABLE]

Chan et al. [17] show that the cost function for the minimization problem (16) is strongly convex (convex) if and only if $\beta_{1}>a$ ( $\beta_{1}\geq a$ ).
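The role of the threshold $\beta_{1}>a$ can be seen in a one-dimensional simplification, which is an assumption made here for illustration only; the actual subproblem acts on the 2-vectors $M_{ij}$. Consider $g(t)=\phi(t;T,a)+\frac{\beta_{1}}{2}(t-q)^{2}$ for a fixed number $q$ and $T_{2}>T$. Away from the breakpoints $T$ and $T_{2}$, the piecewise definition of $\phi$ gives

```latex
g''(t) =
\begin{cases}
\dfrac{a(T_{2}-T)}{T} + \beta_{1}, & t \in (0, T),\\[6pt]
\beta_{1} - a, & t \in (T, T_{2}),\\[6pt]
\beta_{1}, & t \in (T_{2}, \infty).
\end{cases}
```

The first and third branches are positive for every $\beta_{1}>0$, so the sign of $g''$ is decided on the concave branch, where $\phi''=-a$. Since $\phi$ is continuously differentiable, $g'$ is continuous, and $g$ is strongly convex on $t\geq 0$ precisely when $\beta_{1}-a>0$.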

We are in a position to discuss the solution of the minimization problem (4). The following algorithm outlines the solution process.

ADMM is guaranteed to converge to a global minimum of the convex problem (5); in our non-convex formulation, we show in Section 3 that every limit point of the sequence generated by Algorithm 1 is a stationary point of (5).

A numerically practical way to compute the solution $U$ is to first apply the MATLAB function psf2otf to the matrix

[TABLE]

This MATLAB function is used to compute the fast Fourier transform of the matrix $L$ (considered as a Point-Spread Function (PSF) array) and creates the Optical Transfer Function (OTF) array. Using MATLAB notation, $U$ can be computed as

[TABLE]

Generally, the matrix $M$ is easy to compute. The main computational work required by Algorithm 1 is the computation of the matrix $Z$, as this requires the evaluation of a singular value decomposition. The matrices in the computed examples of Section 4 are small enough to make this feasible. For very large matrices, we can compute a partial singular value decomposition that is made up of all singular triplets with singular values larger than $1/\beta_{2}$. These singular triplets can be computed, e.g., with the MATLAB function svds, which implements the method described in [1].

3 Convergence analysis

This section investigates the convergence of Algorithm 1. We establish that the sequence of the generated iterates has limit points and any accumulation point is a solution (or stationary point) of the minimization problem. Let

[TABLE]

where

[TABLE]

The functionals $F$, $L$, and $R$ are referred to as the fidelity, low-rank, and regularization functionals, respectively. Consider the saddle point problem of determining a quintuple $\left(U^{*},Z^{*},M^{*},Q^{*},O^{*}\right)$ such that

[TABLE]

Chan et al. [17, Lemma 3.1] show the following result.

Lemma 1.

The functional defined by

[TABLE]

is strictly convex if and only if the function $h(\cdot;\lambda,T,a):\mathbb{R}\to\mathbb{R}$ given by

[TABLE]

is strictly convex.

Lemma 2.

The functional $J(\cdot;\lambda,T,a):\mathbb{R}^{n_{1}\times n_{2}}\to\mathbb{R}$ is strictly convex if and only if the function $J_{1}:\mathbb{R}^{n_{1}\times n_{2}}\to\mathbb{R}$ is strictly convex.

Proof.

The functional $J(\cdot;\lambda,T,a)$ can be written as

[TABLE]

The nuclear norm function is convex. Therefore, it does not affect the strict convexity. This shows the lemma. ∎

Corollary 3.

The functional $J$ is strictly convex if and only if $h$ is strictly convex.

The next theorem furnishes conditions that secure strict convexity of $J(\cdot;\lambda,T,a)$ .

Theorem 4.

A sufficient condition for the functional $J(\cdot;\lambda,T,a)$ to be strictly convex is that the pair $\left(\lambda,a\right)\in\mathbb{R}^{*}_{+}\times\mathbb{R}^{*}_{+}$ satisfies

[TABLE]

Proof.

The theorem can be shown similarly as [17, Theorem 3.5] by using Corollary 3. ∎

A function $G:\mathbb{R}^{n}\to\mathbb{R}$ is said to be $\mu$ -strongly convex if there is a constant $\mu>0$ such that $G(x)-\frac{\mu}{2}\|x\|_{2}^{2}$ is convex.

Proposition 5.

The functional $J(\cdot;\lambda,T,a)$ is proper, continuous, bounded from below by zero, coercive, and $\mu$ -strongly convex, where

[TABLE]

Proof.

Chan et al. [17, Proposition 3.7] show that the functional $J_{1}(U;\lambda,T,a)$ is proper, continuous, bounded from below by zero, coercive, and $\mu$ -strongly convex, with $\mu$ given by (22). The nuclear norm function satisfies

[TABLE]

The representations (20) and (23) show that the functional $J$ is continuous, bounded below by zero, and $\mu$-strongly convex. ∎

Lemma 6.

Let the pair of parameters $\left(\lambda,a\right)$ satisfy (21). Then the functional $J$ , the regularization term $R$ , the quadratic fidelity term $F$ , and the functional $L$ in (18) are locally Lipschitz continuous functions.

Proof.

Chan et al. [17] show that the functions $R$ and $F$ are locally Lipschitz continuous. The lemma follows from the observation that the function $L$ also is locally Lipschitz continuous. ∎

Proposition 7.

For any pair of parameters $\left(\lambda,a\right)$ that satisfies (21), the functional $J:\mathbb{R}^{n_{1}\times n_{2}}\to\mathbb{R}$ has a unique (global) minimizer $U^{*}$ that satisfies

[TABLE]

where [math] denotes the null matrix of size $n_{1}\times n_{2}$ and $\partial_{u}[J](U^{*})\subset\mathbb{R}^{n_{1}\times n_{2}}$ stands for the subdifferential of $J$ . Moreover,

[TABLE]

where $\bar{\partial}_{u}[R](DU^{*})$ denotes the Clarke generalized gradient [20] of the (non-convex and non-smooth) regularization function $R$ , and $\partial_{U}[\left\|U\right\|_{*}]\left(U^{*}\right)$ is the subdifferential of the nuclear norm function at $U^{*}$ .

Proof.

The functional $J$ is strongly convex when (21) holds. Therefore, $J$ has a unique minimizer $U^{*}$ and (24) follows from the generalized Fermat’s rule [37].

The concept of a generalized gradient extends the concept of a subdifferential for non-smooth convex functions to non-smooth non-convex functions that are locally Lipschitz. According to Lemma 6, the functionals $J$ , $\phi$ , $L$ , and $F$ are locally Lipschitz. Therefore, the generalized gradient is defined.

For non-smooth and convex functions, the Clarke generalized gradient equals the subdifferential, i.e.,

[TABLE]

Hence,

[TABLE]

which gives

[TABLE]

because in the case of a continuously differentiable function, the generalized gradient reduces to the gradient. ∎

Proposition 8.

Let $F:\mathbb{R}^{n_{1}\times n_{2}}\to\mathbb{R}$ be the fidelity function, and let $B\in\mathbb{R}^{n_{1}\times n_{2}}$ , $Q\in\mathbb{R}^{2n_{1}\times n_{2}}$ , $O\in\mathbb{R}^{n_{1}\times n_{2}}$ , $\lambda\in\mathbb{R}^{*}_{+}$ , $\gamma_{1}\in\mathbb{R}$ , and $\gamma_{2}\in\mathbb{R}$ . Then the function

[TABLE]

is convex if

[TABLE]

This inequality also is necessary if convexity is to hold for all $n_{1},n_{2}\geq 1$.

Proof.

In order for the function $F$ to be convex, its Hessian $H$ has to be positive semidefinite. Let $n=\min\{n_{1},n_{2}\}$ . The Hessian is given by

[TABLE]

where $I_{n}$ denotes the identity matrix of order $n$ . The matrix $H$ can be written as

[TABLE]

where $C_{1},C_{2}\in\mathbb{R}^{n\times n}$ . It can be shown by a simple calculation that $C_{1}C_{1}^{T}=C_{2}^{T}C_{2}$ . Let $A=C_{1}C_{1}^{T}=C_{2}^{T}C_{2}$ . Then the Hessian can be expressed as

[TABLE]

Since the matrix $A$ is symmetric and positive semidefinite, it has the spectral factorization

[TABLE]

with eigenvalues $\lambda_{i}(A)\geq 0$ and orthogonal eigenvector matrix $V$ . In order for

[TABLE]

to be positive semidefinite, $\lambda$ , $\gamma_{1}$ , and $\gamma_{2}$ must satisfy

[TABLE]

Since

[TABLE]

we obtain by using Gershgorin discs that $\lambda_{i}(A)\in[0,4]$ for all $i$ . It follows that convexity is secured if (25) holds. The largest eigenvalue converges to $4$ as $n$ increases. Therefore (25) is necessary if we would like to determine $\gamma_{1}$ independently of $n_{1}$ and $n_{2}$ . ∎
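The eigenvalue bound used in this proof can be checked numerically. The sketch below assumes the circulant matrix $C_{1}$ with $-1$ on the diagonal, $+1$ on the subdiagonal, and $+1$ in the top-right corner, and computes the spectrum of $A=C_{1}C_{1}^{T}$ for a given order $n$; the helper name is hypothetical.

```python
import numpy as np

def difference_gram_eigenvalues(n):
    """Eigenvalues (ascending) of A = C1 C1^T for the n x n circulant difference matrix C1."""
    C1 = -np.eye(n) + np.roll(np.eye(n), 1, axis=0)
    A = C1 @ C1.T  # symmetric: 2 on the diagonal, -1 on the two cyclic neighbors
    return np.linalg.eigvalsh(A)

# Largest eigenvalue for a few orders; the values 2 - 2*cos(2*pi*k/n) stay in [0, 4].
largest = {n: difference_gram_eigenvalues(n).max() for n in (7, 8, 201)}
```

For even $n$ the value $4$ is attained exactly, and for odd $n$ the largest eigenvalue approaches $4$ from below as $n$ grows, in agreement with the Gershgorin argument above.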

Proposition 9.

For any Lagrangian multipliers $Q$ and $O$ , we have that the augmented Lagrangian functional $L_{\beta_{1},\beta_{2}}(U,Z,M,Q,O)$ is proper, continuous, and coercive jointly in the primal variables $(U,Z,M)$ . Moreover, $L_{\beta_{1},\beta_{2}}(U,Z,M,Q,O)$ is jointly convex in $(U,Z,M)$ if the penalty parameters $\beta_{1}$ and $\beta_{2}$ satisfy

[TABLE]

where

[TABLE]

The relations (26) are meaningful if $\gamma_{1}>\gamma_{3}$ and $\gamma_{2}>\gamma_{4}$ .

Proof.

It is clear that the function $L_{\beta_{1},\beta_{2}}$ is proper and continuous. To show that $L_{\beta_{1},\beta_{2}}$ is coercive jointly in $(U,Z,M)$ , we note that

[TABLE]

Since $R(M)$ is bounded and the terms $\dfrac{1}{2\beta_{2}}\left\|O\right\|_{F}^{2}$ and $\dfrac{1}{2\beta_{1}}\left\|Q\right\|_{F}^{2}$ are independent of $(U,Z,M)$ , it follows that $R(M)$ and these terms do not affect the coercivity. The expression

[TABLE]

is coercive with respect to $U$ , the expression

[TABLE]

is coercive with respect to $Z$ , and the expression

[TABLE]

is coercive with respect to $M$ . Therefore, $L_{\beta_{1},\beta_{2}}(U,Z,M,Q,O)$ is jointly coercive with respect to $(U,Z,M)$ .

To show convexity, we express $L$ as

[TABLE]

Define the functions

[TABLE]

and note that $L_{5}$ can be written as

[TABLE]

The functional $U\to L_{1}(U)$ is convex if

[TABLE]

and $L_{2}$ is convex if the inequality

[TABLE]

holds. Since the nuclear norm $\left\|Z\right\|_{*}$ is convex, the functional $L_{3}$ is convex when $\gamma_{4}\geq 0$ . Because the functional $L_{4}$ is affine in $U$ , $M$ , and $Z$ , it does not affect the convexity of $L$ . Finally, the functional $L_{5}$ is jointly convex if it can be reduced to the form

[TABLE]

Thus, we require that the coefficients of $\left\|DU\right\|_{F}^{2}$ , $\left\|M\right\|_{F}^{2}$ , $\left\|U\right\|_{F}^{2}$ , and $\left\|Z\right\|_{F}^{2}$ be positive, that the product of the square roots of $c_{1}$ and $c_{2}$ equal the coefficient of $-\langle DU,M\rangle$ , and that the product of the square roots of $c_{3}$ and $c_{4}$ equal the coefficient of $-\langle U,Z\rangle$ . This implies that

[TABLE]

Thus, $\beta_{1}$ and $\beta_{2}$ should satisfy

[TABLE]

and the quadruple $(\gamma_{1},\gamma_{2},\gamma_{3},\gamma_{4})$ should satisfy (27). Moreover, the set $\mathbb{T}$ is non-empty. ∎
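The reduction of $L_{5}$ to a sum of squares rests on an elementary identity. It is written out here as a sketch for generic matrices $X$ and $Y$ of equal size and generic constants $a_{1},a_{2}>0$ . These symbols are illustrative placeholders and not the constants $c_{i}$ of the proof, whose normalization follows the displays above.

```latex
\left\| \sqrt{a_{1}}\,X - \sqrt{a_{2}}\,Y \right\|_{F}^{2}
  = a_{1}\left\|X\right\|_{F}^{2}
  - 2\sqrt{a_{1}a_{2}}\,\langle X, Y\rangle
  + a_{2}\left\|Y\right\|_{F}^{2} \;\geq\; 0 .
```

The left-hand side is the squared Frobenius norm of a linear function of $(X,Y)$ and is therefore jointly convex in $(X,Y)$ . Taking $X=DU$ , $Y=M$ , or $X=U$ , $Y=Z$ , indicates why matching the cross-term coefficient with the product of the square roots makes the corresponding quadratic part jointly convex.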

Proposition 9 shows that we can determine parameters $\beta_{1}$ and $\beta_{2}$ such that the functional $L_{\beta_{1},\beta_{2}}$ is jointly convex in $(U,Z,M)$ . Since

[TABLE]

it follows from (26) and (27) that there are parameters $\rho_{1},\rho_{2}\geq 1$ such that

[TABLE]

Assume that $\rho_{1},\rho_{2}>1$ . Then condition (28) yields

[TABLE]

and

[TABLE]

Thus,

[TABLE]

Therefore, $\beta_{1}$ and $\beta_{2}$ satisfy the inequalities

[TABLE]

Lemma 10.

Assume that $f=g+h$ , where $g$ and $h$ are lower semi-continuous convex functions from $\mathbb{R}^{n_{1}\times n_{2}}$ to $\mathbb{R}$ , and $h$ is Gâteaux differentiable with derivative $h^{\prime}$ . If $p^{*}\in\mathbb{R}^{n_{1}\times n_{2}}$ , then the following conditions are equivalent:

1. $p^{*}\in\arg\inf_{p\in\mathbb{R}^{n_{1}\times n_{2}}}f(p)$ ,

2. $g(p)-g(p^{*})+\langle h^{\prime}(p^{*}),p-p^{*}\rangle\geq 0$ , $\forall p\in\mathbb{R}^{n_{1}\times n_{2}}$ .

Moreover, when the function $g$ has a separable structure of the type

[TABLE]

where $p_{1},\,p_{2},\,p_{3}$ are independent variables, conditions $1.$ and $2.$ are equivalent to

[TABLE]

Proof.

The lemma can be shown in the same way as in [17]. ∎

Theorem 11.

For any pair of parameters $(\lambda,a)$ that satisfies condition (21), any penalty parameters $\beta_{1}$ and $\beta_{2}$ that satisfy (26), and any Lagrange multipliers $Q$ and $O$ , the augmented Lagrangian functional $L$ satisfies

[TABLE]

The saddle point problem (19) has at least one solution, and all solutions are of the form $(U^{*},U^{*},DU^{*},Q^{*},O^{*})$ , where $U^{*}$ is the unique global minimizer of the functional $J$ .

Proof.

It follows from the proof of Proposition 9 that there is a quadruple $(\gamma_{1},\gamma_{2},\gamma_{3},\gamma_{4})$ such that the functional

[TABLE]

is proper, continuous, coercive, and convex jointly in the variables $(U,Z,M)$ . We can express $L$ as

[TABLE]

where

[TABLE]

with

[TABLE]

The functions $g$ and $h$ are lower semi-continuous and convex. Moreover, $h$ is Gâteaux differentiable. Therefore, Lemma 10 yields the desired result. ∎

Theorem 12.

Let the parameters $\left(\lambda,a\right)$ satisfy (21) and let the Lagrangian parameters $(\beta_{1},\beta_{2})$ satisfy (26). Then the saddle point problem (19) admits at least one solution, and all the solutions have the form $(U^{*},U^{*},DU^{*},Q^{*},O^{*})$ , where $U^{*}$ denotes the unique minimizer of (3). Furthermore, $DU^{*}=M^{*}$ and $U^{*}=Z^{*}$ .

Proof.

The proof is similar to the proof presented in [17]. ∎

Let $(U^{*},Z^{*},M^{*},Q^{*},O^{*})$ solve the saddle point problem (19). Then

[TABLE]

where $L_{\beta_{1},\beta_{2}|U}$ , $L_{\beta_{1},\beta_{2}|Z}$ , and $L_{\beta_{1},\beta_{2}|M}$ are restrictions of the functional $L_{\beta_{1},\beta_{2}}$ to $U$ , $Z$ , and $M$ , respectively. These functionals can be written as

[TABLE]

In order to apply Lemma 10 to the functionals $L_{\beta_{1},\beta_{2}|U}$ , $L_{\beta_{1},\beta_{2}|Z}$ , and $L_{\beta_{1},\beta_{2}|M}$ , we have to verify that the first and second parts of each functional are lower semi-continuous and convex, and that the second part is Gâteaux differentiable. For this to hold, the parameters $\beta_{1}^{\prime},\beta_{1}^{\prime\prime},\beta_{2}^{\prime},\beta_{2}^{\prime\prime}$ have to satisfy the conditions

[TABLE]

We obtain

[TABLE]

In the following, let $\left(U^{*},Z^{*},DU^{*},Q^{*},O^{*}\right)$ be a solution of the saddle point problem (19). Introduce the variables

[TABLE]

where $\left(U^{k},Z^{k},DU^{k},Q^{k},O^{k}\right)$ , $k=1,2,\ldots~$ , is the sequence generated by Algorithm 1. The following propositions are important for showing convergence of this sequence to a solution of the saddle point problem (19).

Proposition 13.

For any parameter pair $(\lambda,a)$ that satisfies (21) and any parameter pair $\left(\beta_{1},\beta_{2}\right)$ such that

[TABLE]

we have

[TABLE]

for certain values of $\beta_{3}$ and $\beta_{4}$ , and for specific values of $c_{1},c_{2},c_{3},c_{4}$ to be determined.

See Appendix A for a proof.

Proposition 14.

Let $\left(\beta_{1},\beta_{2}\right)$ satisfy (35) and let $\left(\beta_{3},\beta_{4}\right)$ and $\left(\beta_{1}^{\prime\prime},\beta_{2}^{\prime\prime}\right)$ satisfy (48) and (47), respectively. Assume that the coefficients $c_{1},c_{2},c_{3},c_{4}$ satisfy (49). Then

[TABLE]

A proof is provided in Appendix B.

The following theorem gives a condition on $\beta_{1}$ and $\beta_{2}$ under which the sequence generated by Algorithm 1 converges to a solution of the saddle point problem (19).

Theorem 15.

Assume that $(U^{*},Z^{*},DU^{*},Q^{*},O^{*})$ is a solution of the saddle point problem (19) and let the parameter pair $(\lambda,a)$ satisfy (21). Then for any parameters $(\beta_{1},\beta_{2})$ such that (35) holds, the sequence $\{(U^{k},Z^{k},M^{k},Q^{k},O^{k})\}_{k=1}^{\infty}$ generated by Algorithm 1 satisfies

[TABLE]

See Appendix C for a proof.

Theorem 15 implies that the iterates $(U^{k},Z^{k},M^{k},Q^{k},O^{k})$ generated by Algorithm 1 converge to a stationary solution of (5) when $k$ increases. In other words, the proposed method is guaranteed to converge under the stated conditions. The following section illustrates the effectiveness of this algorithm when applied to various image completion and segmentation tasks.

4 Numerical examples

We compare the LR-CNC algorithm (Algorithm 1) to several available methods for image completion and segmentation, including the Convex-Non-Convex (CNC) segmentation method by Chan et al. [17], the ATCG-TV image restoration algorithm proposed by Benchettou et al. [4] (a total variation-based image reconstruction method using an alternating conditional gradient scheme), and the classical Chan-Vese segmentation method [18]. We also apply a standard K-means clustering [30] to determine segmentations of reconstructed images.

The iterations with Algorithm 1 are terminated as soon as two consecutive approximations $U^{k}$ are sufficiently close. Specifically, we terminate the iterations when

[TABLE]

where tol is a user-chosen tolerance. In all the experiments, we set $\text{tol}=10^{-4}$ . We report the Peak Signal-to-Noise Ratio (PSNR)

[TABLE]

where $U$ and $U_{reco}$ denote arrays that represent the original and the recovered data, respectively, and $\max U(:)^{2}$ is the maximum squared entry (pixel-value) of the array $U$ .
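The stopping test and the quality measure can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that the stopping criterion is the relative Frobenius-norm difference of consecutive iterates and that the PSNR is $10\log_{10}$ of the maximum squared entry of $U$ divided by the mean squared error. The function names `relative_change` and `psnr` are hypothetical.

```python
import numpy as np

def relative_change(U_new, U_old):
    """Relative Frobenius-norm difference of consecutive iterates (assumed normalization)."""
    denom = np.linalg.norm(U_old)
    if denom == 0.0:
        return float(np.linalg.norm(U_new))
    return float(np.linalg.norm(U_new - U_old) / denom)

def psnr(U, U_reco):
    """PSNR in dB, with the peak taken as the largest squared entry of U."""
    U = np.asarray(U, dtype=float)
    U_reco = np.asarray(U_reco, dtype=float)
    mse = np.mean((U - U_reco) ** 2)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(np.max(U ** 2) / mse))

tol = 1e-4  # tolerance used in all experiments
rng = np.random.default_rng(0)
U = rng.random((64, 64))                          # synthetic stand-in for an image
U_reco = U + 0.01 * rng.standard_normal(U.shape)  # synthetic "reconstruction"
print("PSNR (dB):", psnr(U, U_reco))
print("stop?", relative_change(U_reco, U) < tol)
```

In an iterative solver, `relative_change` would be evaluated on $U^{k+1}$ and $U^{k}$ after each sweep, and the loop would exit once the value drops below `tol`.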

When segmenting images, we also compute the Structural Similarity Index Measure (SSIM) to determine the closeness of the recovered and the original image. The definition of this index is somewhat involved, see [39], and we do not provide it here. We just recall that a larger SSIM-value corresponds to a more accurate reconstruction and the maximum value is 1. All computations were carried out on a laptop computer equipped with a 2.3 GHz Intel Core i5 processor and 8 GB of memory using MATLAB 2023.

For the numerical experiments, we need to fix several parameters: $a$ , $\lambda$ , $T$ , $\beta_{1}$ , and $\beta_{2}$ . Based on the equations (21), we let $\lambda=\tau_{1}9a$ , and (35) shows that

[TABLE]

with $\rho_{1}>1$ and $\rho_{2}>3$ . Thus, we only need to set the parameters $a$ , $\tau_{1}$ , $\tau_{2}$ , $\tau_{3}$ , $\rho_{1}$ , $\rho_{2}$ , and $T$ .

4.1 Image completion

This subsection compares the performance of Algorithm 1 to that of the recently proposed algorithms ATCG-TV by Benchettou et al. [4] and CNC by Chan et al. [17]. This subsection is divided into two parts: the first part is concerned with the gray scale images, while the second part considers color images.

In this subsection we use the sampling rate, which is defined as

[TABLE]

where $n_{1}n_{2}$ is the number of pixels of the data, and $p$ is the number of missing entries. For all the experiments in this subsection, we set $a=0.1$ , $T=10^{-6}$ , $\rho_{1}=2.5$ , $\rho_{2}=3.001$ , $\tau_{1}=\rho_{1}$ , $\tau_{2}=\rho_{1}$ , and $\tau_{3}=1.0001$ .
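A completion test problem with a prescribed sampling rate can be generated as follows. This is a sketch under the assumption that SR is the fraction of observed pixels, i.e., $1-p/(n_{1}n_{2})$ , which agrees with the statement that $90\%$ missing data corresponds to SR $=0.1$ . The helper names and the random placeholder image are hypothetical.

```python
import numpy as np

def sampling_mask(shape, sr, seed=0):
    """Boolean mask with round(sr * n1 * n2) observed entries chosen uniformly at random."""
    n = int(np.prod(shape))
    n_obs = int(round(sr * n))
    rng = np.random.default_rng(seed)
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=n_obs, replace=False)] = True
    return mask.reshape(shape)

def sampling_rate(mask):
    """SR = 1 - p / (n1 * n2), where p counts the missing entries (assumed formula)."""
    p = mask.size - int(mask.sum())
    return 1.0 - p / mask.size

image = np.random.default_rng(1).random((256, 256))  # placeholder for a test image
mask = sampling_mask(image.shape, sr=0.1)
observed = np.where(mask, image, 0.0)                # missing pixels set to zero
print("SR:", sampling_rate(mask))
```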

4.1.1 Gray scale images

We illustrate the performance of Algorithm 1 when applied to the well-known MRI and cameraman images. Figure 1 shows results of completion for the MRI image determined by Algorithm 1, the CNC algorithm [17], and the ATCG-TV algorithm [4] for $90\%$ missing data (SR $=0.1$ ). Table 1 reports PSNR-values, CPU-times (in seconds), and the number of iterations (Iter) required by each algorithm to satisfy the stopping criteria. The stopping criteria for the ATCG-TV and CNC algorithms were the default choices provided in [4] and [17], respectively.

Figure 1 and Table 1 show that for small SR-values, the LR-CNC algorithm outperforms the ATCG-TV algorithm. However, for larger SR-values, the difference in performance between the two algorithms is less significant. The LR-CNC algorithm demonstrates a clear advantage in terms of CPU time and the number of iterations required in comparison with both the CNC and ATCG-TV algorithms. Figure 2 displays the graphs of the logarithm of the mean square error versus the number of iterations for the LR-CNC, CNC, and ATCG-TV algorithms, as well as the values of

[TABLE]

and the evolution of the PSNR-values as a function of the number of iterations when these algorithms are applied to the MRI image for SR $=0.1$ .

Figure 2 shows the relative differences (37) obtained with the LR-CNC algorithm to decrease faster than the corresponding differences for the CNC and ATCG-TV algorithms. Moreover, the PSNR-values of the restorations determined by the LR-CNC algorithm increase quickly and smoothly with the iteration number. In fact, they increase much faster and in a smoother manner with the iteration number than for the CNC and ATCG-TV algorithms.

4.1.2 Color images

We apply the LR-CNC, CNC, and ATCG-TV algorithms to the restoration of color images. In our experiments, we use the well-known Airplane and Barbara images, both of which are represented by arrays of size $256\times 256\times 3$ . Thus, the red, green, and blue channels are each represented by a tensor slice. These images are reshaped into matrices of size $256\times(256*3)$ by using the MATLAB command $\texttt{reshape}\left(\cdot,[256,256*3]\right)$ .
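For readers working outside MATLAB, the reshape step can be mirrored in NumPy. MATLAB stores arrays in column-major order, so the equivalent call needs `order="F"`; the result places the red, green, and blue slices side by side. The function names below are hypothetical, and the random array only stands in for an image.

```python
import numpy as np

def unfold_color_image(X):
    """Column-major reshape of an n1 x n2 x c array into an n1 x (n2*c) matrix."""
    n1, n2, c = X.shape
    return np.reshape(X, (n1, n2 * c), order="F")

def fold_color_image(Y, n_channels=3):
    """Inverse of unfold_color_image."""
    n1, cols = Y.shape
    return np.reshape(Y, (n1, cols // n_channels, n_channels), order="F")

X = np.random.default_rng(0).random((256, 256, 3))  # stand-in for an RGB image
Y = unfold_color_image(X)
print(Y.shape)
# The unfolded matrix is the horizontal concatenation of the three channels.
print(np.array_equal(Y, np.hstack([X[:, :, k] for k in range(3)])))
```

After the restoration, `fold_color_image` recovers the $256\times 256\times 3$ array for display.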

Figure 3 displays results for the color images Airplane and Barbara with $90\%$ missing data when reconstructed by the CNC, ATCG-TV, and LR-CNC algorithms.

Table 2 displays the PSNR-values, CPU-time, and the number of iterations required by the CNC, ATCG-TV, and LR-CNC algorithms to satisfy the stopping criterion. The computations are carried out for the Airplane and Barbara images for several SR-values. Figure 4 depicts the relative differences (38) and the PSNR-values as functions of the iteration number for the CNC, ATCG-TV, and LR-CNC algorithms.

The above results show the LR-CNC algorithm to consistently produce images with higher PSNR/SSIM values than the other algorithms in our comparison, especially in challenging scenarios (very low sampling ratio or high noise levels). Qualitatively, the LR-CNC algorithm gives reconstructions that retain details and segments images into regions more distinctly than the other algorithms; see Figures 3 and 4. An advantage of the LR-CNC algorithm is its fast convergence: the algorithm achieves accurate results in far fewer iterations (see Table 2) than the other algorithms due to the efficiency of low-rank regularization and our ADMM scheme.

4.2 Image segmentation

This subsection applies the LR-CNC algorithm (Algorithm 1) to image segmentation. We compare the performance of this algorithm to the CNC and ATCG-TV algorithms, as well as to an algorithm proposed by Chan and Vese [18]. In all experiments, we apply these algorithms to images that have been contaminated by Gaussian noise and then use the K-means algorithm [30] to segment the resulting image, with $K$ equal to the number of regions in the ground truth. Given a noise-free image $\texttt{Im}$ , we generate a noise-contaminated image $B$ with the MATLAB command

[TABLE]

where the parameter $L$ specifies the mean of the Gaussian noise and $0.01$ , the default value, is its variance. We will refer to the value of the parameter $L$ as the Gaussian noise level. For all segmentation experiments, we let $a\in\{0.1,0.2\}$ . If $L$ is relatively large, then we set $a=0.2$ , otherwise we set $a=0.1$ . Moreover, we let $T=10^{-6}$ , $\rho_{1}=2.5$ , $\rho_{2}=10.001$ , $\tau_{1}=\rho_{1}$ , $\tau_{2}=\rho_{1}$ , and $\tau_{3}=1.0001$ . Code for the algorithm by Chan & Vese [18] is available at https://fr.mathworks.com/matlabcentral/fileexchange/34548-active-contour-without-edge.
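The noise model described above can be imitated in NumPy. The sketch assumes an image with entries in $[0,1]$ , adds Gaussian noise with mean $L$ and variance $0.01$ , and clips the result back to $[0,1]$ . The clipping step is an assumption about how the noisy image is kept in range, and the function name `add_gaussian_noise` is hypothetical; this is not the MATLAB command used in the experiments.

```python
import numpy as np

def add_gaussian_noise(im, mean, var=0.01, seed=0):
    """Add Gaussian noise with the given mean and variance to an image in [0, 1], then clip."""
    rng = np.random.default_rng(seed)
    noisy = im + mean + np.sqrt(var) * rng.standard_normal(im.shape)
    return np.clip(noisy, 0.0, 1.0)

Im = np.full((200, 200), 0.2)          # constant stand-in for a noise-free image
B = add_gaussian_noise(Im, mean=0.3)   # Gaussian noise level L = 0.3
print(B.min(), B.max(), B.mean())
```

Because the noise has a nonzero mean, it shifts the overall brightness in addition to perturbing individual pixels, which is why larger values of $L$ make the segmentation harder.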

Figure 5 displays results obtained with the CNC, ATCG-TV, Chan & Vese, and LR-CNC algorithms for Gaussian noise level $L=0.3$ . These tests are applied to the MRI image with the cluster number set to $K=3$ , and to a Millennium simulation image (https://wwwmpa.mpa-garching.mpg.de/galform/virgo/millennium/) with cluster number $K=2$ .
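The final clustering step can be illustrated with a small K-means routine acting on pixel intensities. This is a sketch of Lloyd's algorithm with a deterministic quantile initialization, which is an assumption made here for reproducibility; it is not the K-means implementation of [30], and the function name `kmeans_intensity` and the synthetic test image are hypothetical.

```python
import numpy as np

def kmeans_intensity(image, k, n_iter=100):
    """Lloyd's algorithm on pixel intensities; returns a label image and sorted centers."""
    x = np.asarray(image, dtype=float).ravel()
    # Deterministic initialization: evenly spaced quantiles of the intensities.
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = x[labels == j]
            if members.size > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean()
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    # Final assignment, with labels renumbered so that label 0 is the darkest cluster.
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    order = np.argsort(centers)
    rank = np.empty(k, dtype=int)
    rank[order] = np.arange(k)
    return rank[labels].reshape(np.shape(image)), centers[order]

# Synthetic image with three horizontal bands and mild noise.
rng = np.random.default_rng(0)
truth = np.repeat(np.array([0, 1, 2]), 20)[:, None] * np.ones((1, 30), dtype=int)
img = np.array([0.1, 0.5, 0.9])[truth] + 0.01 * rng.standard_normal(truth.shape)
labels, centers = kmeans_intensity(img, k=3)
print(centers)
```

Applied to a restored image, the label array gives the segmentation, and mapping each label to a gray level yields a displayable segmented image.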

Table 3 shows PSNR and SSIM values for segmented images determined by the CNC, ATCG-TV, Chan & Vese, and LR-CNC algorithms for several Gaussian noise levels $L$ when applied to the MRI and Millennium images. Figures 6 and 7 display the relative differences (38) and PSNR-values of images determined at each iteration of the CNC, ATCG-TV, and LR-CNC algorithms as functions of the iteration number when these algorithms are applied to the MRI and Millennium images. The Gaussian noise level is $L=0.3$ and we seek to determine $K=3$ and $K=2$ clusters in the MRI and Millennium images, respectively.

Figure 5 shows that the segmented images produced by the LR-CNC algorithm retain details with high precision for both images, even when the amount of noise is large. Furthermore, the PSNR and SSIM values reported in Table 3 indicate that the LR-CNC algorithm consistently exhibits greater robustness than the other algorithms, with particularly strong performance for larger noise levels. Figures 6 and 7 illustrate the smooth and rapid convergence of the iterates determined by the LR-CNC algorithm.

Our last example is concerned with segmentation of hyperspectral images. Hyperspectral images contain more detailed information than RGB images. However, material separation based on hyperspectral images remains a challenging task. It involves isolating materials from one another. Various approaches have been developed to carry out material separation, including non-negative matrix factorization [22].

We consider two hyperspectral images: Samson and Jasper Ridge. Both images were captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). Initially, these images contained 224 spectral bands. After removing the water absorption bands, the Samson image has 156 bands and the Jasper Ridge image has 198 bands. The Samson dataset is a tensor of size $95\times 95\times 156$ , and the Jasper Ridge dataset is a tensor of size $100\times 100\times 198$ . The Samson dataset contains three materials: water, soil, and trees, whereas the Jasper Ridge dataset includes four materials: water, soil, trees, and roads. Figures 8 and 9 show the different materials of each dataset. The abundance of each material, determined with the non-negative tensor factorization described in [22], is represented by a color between red and yellow.

Figure 10 displays the results of the CNC, Chan & Vese, ATCG-TV, and LR-CNC algorithms applied to the hyperspectral images Samson and Jasper Ridge. Both images are corrupted by Gaussian noise of level $0.3$ . For the Samson image, we represent water in black, trees in gray with an intensity of $0.5$ , and soil in white. For the Jasper Ridge image, water is represented in black, trees in gray with an intensity of $0.3$ , soil in gray with an intensity of $0.7$ , and roads are shown in white. Given the presence of three materials in the Samson test, we use $K=3$ , while for the Jasper Ridge image, we use $K=4$ to represent the four materials.

Figure 9 shows that the LR-CNC method identifies each material more distinctly than the other algorithms. Table 4 displays PSNR- and SSIM-values for the recovered data for the Samson and Jasper Ridge datasets for various Gaussian noise levels. Figures 10 and 11 depict relative differences (38) and PSNR-values at each iteration of the algorithms for the Gaussian noise level $0.3$ for the Samson and Jasper Ridge data sets, respectively.

Table 4 illustrates that, at a low Gaussian noise level of $0.01$ , the performance of all algorithms is comparable, with only slight differences in achieved PSNR- and SSIM-values. However, for noise levels $0.1$ and larger, the LR-CNC algorithm yields segmentations with significantly larger PSNR- and SSIM-values. Furthermore, the graphs of Figures 10 and 11 demonstrate the convergence of the LR-CNC algorithm to be more regular and faster.

5 Conclusion

In this work, we proposed a novel approach to convex-non-convex optimization problems with a focus on promoting low-rank solutions. By combining low-rank regularization with a non-convex penalty, the method achieves reconstructions that are both accurate and computationally efficient. This is supported by measured quality metrics. The optimization problems are solved with the Alternating Direction Method of Multipliers (ADMM), and we provide an analysis that establishes the convergence properties of the LR-CNC algorithm. Extensive numerical experiments illustrate the effectiveness of this algorithm. The LR-CNC framework balances the robustness of convex models and the flexibility of non-convex formulations. This makes the LR-CNC algorithm a promising tool for image segmentation and completion tasks.

The authors declare that there is no conflict of interest.

Appendix A Proof of Proposition 13

From Theorem 12, we have $U^{*}=Z^{*}$ and $DU^{*}=M^{*}$ . Therefore,

[TABLE]

Using (6) we obtain the equations

[TABLE]

Adding these equations, we get

[TABLE]

At the $k$ th step with $k\geq 1$ , Algorithm 1 yields relations that are analogous to (32), (33), and (34), namely,

[TABLE]

where $\beta_{1}^{\prime},\,\beta_{2}^{\prime},\,\beta_{1}^{\prime\prime}$ , and $\beta_{2}^{\prime\prime}$ satisfy the relations (31).

Substituting $U=U^{k+1}$ into (32) and $U=U^{*}$ into (A), and adding the equations so obtained, we get

[TABLE]

Similarly, we substitute $Z=Z^{k+1}$ into (33), $Z=Z^{*}$ into (A), $M=M^{k+1}$ into (33), and $M=M^{*}$ into (A), and obtain

[TABLE]

Adding (42) to (43) gives

[TABLE]

Moreover, addition of (42), (44), and (A) yields

[TABLE]

where $\beta_{3},\beta_{4}>0$ .

We would like to show that the third and the fourth terms of the above inequality are equal to, respectively,

[TABLE]

for some nonnegative coefficients $c_{1},c_{2},c_{3},c_{4}$ . This requires that the coefficients of $\left\|D\widetilde{U}^{k+1}\right\|_{F}^{2}$ , $\left\|\widetilde{M}^{k+1}\right\|_{F}^{2}$ , $\left\|\widetilde{U}^{k+1}\right\|_{F}^{2}$ , and $\left\|\widetilde{Z}^{k+1}\right\|_{F}^{2}$ are nonnegative, i.e., that

[TABLE]

By also considering the conditions (31) on $\beta_{1}^{\prime},\,\beta_{2}^{\prime},\,\beta_{1}^{\prime\prime}$ , and $\beta_{2}^{\prime\prime}$ , we obtain

[TABLE]

It follows from the first and the second inequalities that

[TABLE]

We also have the relations

[TABLE]

which give

[TABLE]

We obtain

[TABLE]

Since $\beta_{1}-\beta_{3}>2\beta_{1}^{\prime\prime}$ and $\beta_{2}-\beta_{4}>2\beta_{2}^{\prime\prime}$ , it follows that we can have a solution only if $\beta_{1}^{\prime}>\beta_{1}^{\prime\prime}$ and $\beta_{2}^{\prime}>\beta_{2}^{\prime\prime}$ . Imposing the conditions

[TABLE]

we get

[TABLE]

We now investigate if $\beta_{3}$ and $\beta_{4}$ admit some solutions. Considering the above two equations and the inequalities (29), we have

[TABLE]

Since $\beta_{1}$ and $\beta_{2}$ are positive, $\beta_{1}$ admits solutions for every $\beta_{3}>0$ and $\beta_{2}$ admits solutions only if

[TABLE]

Letting

[TABLE]

we find that the third and the fourth terms of (A) can be written as

[TABLE]

with

[TABLE]

It follows that $c_{1}>c_{2}$ and $c_{3}>c_{4}$ . We will use these inequalities below.

Inequality (A) can be written as

[TABLE]

which is equivalent to

[TABLE]

This implies

[TABLE]

which gives

[TABLE]

This concludes the proof.

Appendix B Proof of Proposition 14

According to Proposition 13, we have

[TABLE]

We now seek to determine lower bounds for $-\beta_{1}\langle D\widetilde{U}^{k+1},\widetilde{M}^{k}-\widetilde{M}^{k+1}\rangle$ and $-\beta_{2}\langle\widetilde{U}^{k+1},\widetilde{Z}^{k}-\widetilde{Z}^{k+1}\rangle$ . We have

[TABLE]

and

[TABLE]

Moreover,

[TABLE]

From the construction of $M^{k}$ , we get

[TABLE]

and from the construction of $Z^{k}$ , we obtain

[TABLE]

Replacing $M$ with $M^{k+1}$ in (B), $M$ with $M^{k}$ in (A), summing the two inequalities, and proceeding similarly when replacing $Z$ with $Z^{k+1}$ in (B) and $Z$ with $Z^{k}$ in (A), yields

[TABLE]

and

[TABLE]

Since

[TABLE]

it follows that

[TABLE]

Therefore,

[TABLE]

and

[TABLE]

Consequently,

[TABLE]

This completes the proof.

Appendix C Proof of Theorem 15

It follows from Proposition 14 that

[TABLE]

with $\left(\beta_{1}^{\prime\prime},\beta_{2}^{\prime\prime}\right)$ , $\left(\beta_{3},\beta_{4}\right)$ , and $(c_{1},c_{2},c_{3},c_{4})$ satisfying, respectively, (47), (48), and (49). Thus, we have

[TABLE]

Introduce the sequence

[TABLE]

This sequence is bounded and decreasing, and therefore it converges. This implies that

[TABLE]

converges to [math]. This implies that the sequences

[TABLE]

are bounded and, therefore, the sequences

[TABLE]

also are bounded. Moreover, the sequences

[TABLE]

are bounded. We conclude that

[TABLE]

As we have discussed, $c_{1},c_{2},c_{3},c_{4}>0$ , with $c_{1}\neq c_{2}$ and $c_{3}\neq c_{4}$ . Furthermore, $c_{1}>c_{2}$ and $c_{3}>c_{4}$ . We can see that

[TABLE]

and

[TABLE]

Moreover, we have

[TABLE]

Consequently, from the inequalities in (60), (61), and (62), and by using (56) and (58), as well as from (57) and (59), we obtain

[TABLE]

This concludes the proof.

Bibliography

[1] J. Baglama, L. Reichel, Augmented implicitly restarted Lanczos bidiagonalization methods, SIAM Journal on Scientific Computing, 27, 19–42 (2005).
[2] A. Beck, M. Teboulle, Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems, IEEE Transactions on Image Processing, 18(11), 2419–2434 (2009).
[3] R. Bell, Y. Koren, C. Volinsky, Matrix factorization techniques for recommender systems, Computer, 42(8), 30–37 (2009).
[4] O. Benchettou, A. H. Bentbib, A. Bouhamidi, K. Kreit, Constrained tensorial total variation problem based on an alternating conditional gradient algorithm, Journal of Computational and Applied Mathematics, 451, Art. 116018 (2024).
[5] A. H. Bentbib, M. El Guide, K. Jbilou, E. Onunwor, L. Reichel, Solution methods for linear discrete ill-posed problems for color image restoration, BIT Numerical Mathematics, 58, 555–576 (2018).
[6] A. H. Bentbib, A. El Hachimi, K. Jbilou, A. Ratnani, Fast multidimensional completion and principal component analysis methods via the cosine product, Calcolo, 59(3), Art. 26 (2022).
[7] A. H. Bentbib, A. El Hachimi, K. Jbilou, A. Ratnani, A tensor regularized nuclear norm method for image and video completion, Journal of Optimization Theory and Applications, 192(2), 401–425 (2022).
[8] Å. Björck, Solving linear least squares problems by Gram-Schmidt orthogonalization, BIT Numerical Mathematics, 7, 1–21 (1967).
