Rank Approximation of a Tensor with Applications in Color Image and Video Processing
Ramin Goudarzi Karim, Carmeliza Navasca, Da Yan

TL;DR
This paper introduces a block coordinate descent algorithm that estimates tensor rank and provides its canonical polyadic decomposition, with applications demonstrated on color images and videos.
Contribution
The paper presents a novel sparse optimization-based algorithm for tensor rank estimation and decomposition, applicable to image and video processing.
Findings
Effective tensor rank estimation on color images and videos
Successful application of the algorithm to real-world visual data
Demonstrated improvement over existing methods in tensor approximation
Abstract
We propose a block coordinate descent type algorithm for estimating the rank of a given tensor. In addition, the algorithm provides the canonical polyadic decomposition of a tensor. In order to estimate the tensor rank we use sparse optimization method using norm. The algorithm is implemented on single moving object videos and color images for approximating the rank.
| Size of Tensor | |||
| Actual Rank | 5 | 8 | 10 |
| Upper bound | 10 | 15 | 20 |
| Estimated Rank | 5 | 8 | 12 |
| Residual error | 2.85e-1 | 1.34e-1 | 1.20e-1 |
| Relative error | 5.17e-2 | 1.05e-2 | 5.00e-3 |
| Time | 2.23 | 3.86 | 6.39 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTensor decomposition and applications · Sparse and Compressive Sensing Techniques · Advanced Neuroimaging Techniques and Applications
Rank Approximation of a Tensor with Applications in Color Image and Video Processing
Ramin Goudarzi111Department of Mathematics, University of Alabama at Birmingham, Birmingham, AL 35294, USA, [email protected]. Carmeliza Navasca222Department of Mathematics, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL, 35294, USA, [email protected] Da Yan333Department of Computer Science, University of Alabama at Birmingham, Birmingham, AL 35294, USA, [email protected].
Abstract
We propose a block coordinate descent type algorithm for estimating the rank of a given tensor. In addition, the algorithm provides the canonical polyadic decomposition of a tensor. In order to estimate the tensor rank we use sparse optimization method using norm. The algorithm is implemented on single moving object videos and color images for approximating the rank.
1 Introduction
In 1927, Hitchcock [17, 18] proposed the idea of the polyadic form of a tensor, i.e., expressing a tensor, multilinear array, as the sum of a finite number of rank-one tensors. This decomposition is called the canonical polyadic (CP) decompositon; it is known as CANDECOMP or PARAFAC. It has been extensively applied to many problems in various engineering [30, 32, 1, 13] and science [38, 22]. Specifically, tensor methods have been applied in many multidimensional datasets in signal processing applications [7, 9, 11], color image processing [43, 19] and video processing [33, 4]. Most of these applications rely on decomposing a tensor data into its low rank form to be able to perform efficient computing and to reduce memory requirements. In computer vision, detection of moving objects in video processing relies on foreground and background separation, i.e. the separation of the moving objects called foreground from the static information called background, requires low rank representation of video tensor. In color image processing, the rgb channels in color image representation requires extensions of the matrix models of gray-scale images to low rank tensor methods. There are several numerical techniques [8, 10, 23, 27, 30] for approximating a low rank tensor into its CP decomposition, but they do not give an approximation of the minimum rank. In fact, most low rank tensor algorithms require an a priori tensor rank to find the tensor decomposition. Several theoretical results [24, 25] on tensor rank can help, but they are limited to low-multidimensional and low order tensors.
In this work, the focus is on finding an estimation of the tensor rank and its rank-one tensor decomposition (CP) of a given tensor. There are also algorithms [7, 5] which give tensor rank, but they are specific to symmetric tensor decomposition over the complex field using algebraic geometry tools. Our proposed algorithm addresses two difficult problems for the CP decomposition: (a) one is that finding the rank of tensors is a NP-hard problem [16] and (b) the other is that tensors can be ill-posed [12] and failed to have their best low-rank approximations.
The problem of finding the rank of a tensor can be formulated as a constrained optimization problem.
[TABLE]
where represents the total number of non-zero elements of . The rank optimization problem is NP hard and so to make it more tractable, the following formulation [42] is used:
[TABLE]
where is the regularization parameter and the objective function is a composition of smooth and non-smooth functions. Our formulation includes a Tikhonov type regularization:
[TABLE]
The added Tikhonov regularization has the effect of forcing the factor matrices to have the equal norm. Moreover, this formulation and its numerical methods described later give an overall improvement in the accuracy and thus, memory requirements of the tensor model found in [42].
1.1 Organization
Our paper is organized as follows. In Section 2, we provide some notations and terminologies used throughout this paper. In Section 3, we formulate an -regularization optimization to the low-rank approximation of tensors. In Section 4, we describe a numerical method to solve the -regularization optimization by using a proximal alternating minimization technique for the rank and an alternating least-squares for the decomposition. In Section 5, we provide an analysis of convergence of the numerical methods. The numerical experiments in Section 6 consist of simulated low rank tensor, color images and videos. Finally, our conclusion and future work are given in Section 7.
2 Preliminaries
We denote the scalars in with lower-case letters and the vectors with lower-case letters . The matrices are written as upper-case letters and the symbols for tensors are calligraphic letters . The subscripts represent the following scalars: , , and the -th column of a matrix is . The matrix sequence is denoted . An Nth order tensor is a multidimensional array with entries for where . In particular, a third order tensor is a multidimensional array with entries for , and .
Here we present some standard definitions and relations in tensor analysis. The Kronecker product of two vectors and is denoted by :
[TABLE]
The Khatri-Rao (column-wise Kronecker) product (see[37]) of two matrices and is defined as
[TABLE]
The outer product of three vectors , , is a third order tensor with the entries defined as follows:
[TABLE]
Definition 2.1** (vec)**
Given a matrix , the function where is a vector of size obtained from column-stacking the column vectors of ; i.e.
[TABLE]
where .
The vectorization of a third order tensor is the process of transforming the tensor into a column vector, the map is defined as
[TABLE]
where . Using the definitions above, we get
[TABLE]
Definition 2.2** (Mode- matricization)**
Matricization is the process of reordering the elements of an th order tensor into a matrix. The mode- matricization of a tensor is denoted by and arranges the mode- columns to be the columns of the resulting matrix. The mode- column, , is a vector obtained by fixing every index with the exception of the th index.
If we use a map to express such matricization process for any th order tensor , that is, the tensor element maps to matrix element , then there is a formula to calculate :
[TABLE]
For example, the tensor unfolding or matricization of a third order tensor is the process or rearranging the elements of into a matrix. The mode- () matricization is denoted by and the elements of it can be expressed by the following relations:
[TABLE]
[TABLE]
[TABLE]
2.1 CP decomposition and the Alternating Least-Squares Method
In 1927, Hitchcock [17][18] proposed the idea of the polyadic form of a tensor, i.e., expressing a tensor as the sum of a finite number of rank-one tensors. Today, this decomposition is called the canonical polyadic (CP); it is known as CANDECOMP or PARAFAC. It has been extensively applied to many problems in various engineering [30, 32, 1, 13] and science [38, 22]. The well-known iterative method for implementing the sum of rank one terms is the Alternating Least-Squares (ALS) technique. Independently, the ALS was introduced by Carrol and Chang [6] and Harshman [15] in 1970. Among those numerical algorithms, the ALS method is the most popular one since it is robust. However, the ALS has some drawbacks. For example, the convergence of ALS can be extremely slow.
The CP decomposition of a given third order tensor factorizes it to a sum of rank one tensors.
[TABLE]
For simplicity we use the notation to represent the sum on the right hand side of the equation above, where , and are called factor matrices.
[TABLE]
The CP decomposition problem can be formulated as an optimization problem. Given the goal is to find vectors , such that the distance between the tensor and the sum of the outer products of is minimized. The Frobenius norm (sum of squares of the entries) is mainly used to measure the distance.
[TABLE]
Using the Khatri-Rao product, the objective function in (2.2) can be stated in the following four equivalent forms:
[TABLE]
[TABLE]
[TABLE]
and
[TABLE]
All the functions in (2.3), (2.4), (2.5) and (2.6) are linear least squares problems with respect to matrices A, B, C and vector . To find approximations to A,B,C, and , these four optimization problems (2.3)-(2.6) are implemented iteratively and the minimizers are updated between each optimization problems (via Gauss-Seidel sweep) with a stopping criteria. This technique is called the Alternating Least Squares (ALS) Method. The ALS method is popular since it is robust and easily implementable. However, the ALS has some drawbacks. For example, the convergence of ALS can be extremely slow. Another drawback is the requirement of a tensor rank before a CP decomposition is approximated. The next sections deal with tensor rank approximation.
3 Rank Approximation of a Tensor
The problem of finding the rank of a tensor can be formulated as a constrained optimization problem.
[TABLE]
where represents the total number of non-zero elements of . Since the problem is NP hard (ref), we replace by the norm of . The norm is defined as the sum of absolute value of the elements of . So the rank approximation problem can be written as
[TABLE]
In order to obtain a CP decomposition of the given tensor as well as the rank approximation, we formulate the rank approximation problem as follow:
[TABLE]
where is the regularization parameter. The objective function of the problem (3.1) is non-convex and non-smooth. However, it is a composition of a smooth and non-smooth functions.
Moreover, it is known that CP decomposition of a tensor is unique up to scaling anf permutation of factor matrices. Note that
[TABLE]
for a nonzero scalar . In order to overcome the scaling indeterminacy, we add a Tikhonov type regularization term to our objective function [20]. Let f and g be the following:
[TABLE]
and
[TABLE]
which represent the fitting term and the regularization term in (3.1), then the rank approximation problem can be formulated as
[TABLE]
The added Tikhonov regularization has the effect of forcing the factor matrices to have the equal norm. i.e.
[TABLE]
[29], Now let represent the objective function in (3.4) collectively, then
[TABLE]
where
[TABLE]
Let , when are fix, we represent by and by .
4 Approximation of Tensor Decomposition with Tensor Rank
In this section we propose a block coordinate descent type algorithm for solving the problem (3.4). We consider four blocks of variables with respect to and . In particular, at each inner iteration, we solve the following minimization problems
[TABLE]
[TABLE]
[TABLE]
and
[TABLE]
where represents the proximal linearization [3] of with respect to , namely
[TABLE]
Note that each of the minimization problems in (4.1)-(4.4) is strictly convex, therefore are uniquely determined at each iteration. In fact, the subproblems in (4.1)-(4.3) are standard liner least squares problems with an additional Tikhonov regularization term. One can see by vectorization of the objective functions, for instance, the residual term in (2.3) can be written as follows
[TABLE]
Since the objective functions in (4.1)-(4.3) are strictly convex, the first order optimality condition is sufficient for a point to be minimum. In other words, the exact solutions of (4.1)-(4.3) can be given be the following normal equations
[TABLE]
[TABLE]
and
[TABLE]
where , and .
To update in (4.4), we discuss the proximal operator first.
Definition 4.1
(proximal operator) Let be a lower semicontinuous convex function, then the proximal operator of with parameter is defined as follow
[TABLE]
Using the proximal operator notation, the equation (4.4) is equivalent to
[TABLE]
This is easy to verify because
[TABLE]
Remark 4.2
The proximal operator in (4.5) is well-defined because the function is continuous and convex. Using the vec operator, we have
[TABLE]
where is the matrix with columns . Therefore we can rewrite the objective function as
[TABLE]
It is easy to calculate the gradient of (4.9) with respect to :
[TABLE]
This implies the Lipschitz continuity of the gradient of with respect to . The Lipschitz constant is so we must have
[TABLE]
5 Analysis of Convergence
in this section, we study the global convergence of the proposed algorithm under mild assumptions. The Kurdyka-Lojasiewicz [21], [26] property plays a key role in our analysis. We begin this section by stating the descent lemma.
Lemma 5.1
(Descent Lemma) Let be continuously differentiable function, and is Lipschitz continuous with constand L, then for any we have
[TABLE]
Next lemma provides the theoretical estimate for the decrease in the objective function after a single update .
Lemma 5.2
Suppose that is obtained by the equation (4) and , where ’s are defined in (4.11), then there is a constant such that
[TABLE]
Proof. Recall that
[TABLE]
and is obtained by the equation
[TABLE]
therefore we must have
[TABLE]
Since is Lipschitz continuous with constant , by the descent lemma we have
[TABLE]
with (5.2), the above inequality implies
[TABLE]
setting
[TABLE]
proves the lemma.
Remark 5.3
Suppose that ’s are bounded from above by the constant in the previous lemma, then for fixed step-size where
[TABLE]
we have
[TABLE]
*for each . *
Definition 5.4
[14]** A differentiable function is called strongly convex if there is a constant such that
[TABLE]
for any .
Lemma 5.5
Suppose that is obtained by equation (4.1), then we have
[TABLE]
Proof. Note that the objective functions in (4.1) is strongly convex with parameter and by the first-order optimality condition we must have
[TABLE]
now the strong convexity of yields
[TABLE]
which implies
[TABLE]
This proves the lemma.
Remark 5.6
Similar results hold for the blocks and , if they are updated by equations (4.2) and (4.3). In particular, we have that
The next theorem guarantees that the value of decreases monotonically at each iteration. This shows that the sequence generated by scheme (4.1), (4.2), (4.3) and (4.4) is monotonically decreasing in value,
Theorem 5.7
(Sufficient decrease property) Let represent the objective function in (ref) and , then we have
[TABLE]
for some positive constant . In addition we have
[TABLE]
Proof. By lemmas 5.2 and 5.5 we have
[TABLE]
setting gives the first result. This shows that the sequence generated by our algorithm is decreasing. The monotonicity of with the fact that is bounded from below, implies as , next let be a positive integer, then
[TABLE]
letting proves the last statement.
Remark 5.8
The sequence generated by the scheme (4.2)-(4.4) is bounded. The reason comes from the fact that unboundedness of occurs when at least one of the blocks or gets unbounded. This never happens due to the regularization terms in the objective function and the fact that is non-increasing.
Theorem 5.9
Let be the sequence generated by our algorithm, then there exists a positive constant such that for any there is a vector such that
[TABLE]
Proof. Let be a positive integer. By equations (4.1), (4.2), (4.3) and the first order optimality condition we have
[TABLE]
[TABLE]
and
[TABLE]
define
[TABLE]
then . similarly we can define vectors , . Next, by equation (4.4), we have that
[TABLE]
hence by the optimality condition, there exists such that
[TABLE]
define
[TABLE]
so . From these facts we have that
[TABLE]
We now estimate the norm of . First note that by 5.8, is bounded and the objective function (without the regularization term) is twice continuously differentiable, therefore as a consequence of mean value theorem, must be Lipschitz continuous. Hence there must exist a constant such that
[TABLE]
similarly, constants and exist such that
[TABLE]
and
[TABLE]
setting , gives us the result.
Let be a continuous function. The function is said to have Kurdyka-Lojasiewicz (KL) property at point if there exists such that
[TABLE]
is bounded around [44]. A very rich class of functions satisfying the KL property is the semi-algebraic functions. These are functions where their graphs can be expressed as an algebraic set, that is
[TABLE]
where ’s and ’s are polynomial functions and the graph of is defined by
[TABLE]
Note that the univariate function is semialgebraic because
[TABLE]
The class of semi algebraic functions are closed under addition and composition [2]. Hence The objective function in (3.5) is semialgebraic therefore it satisfies KL property.
Theorem 5.10
Suppose that is the sequence generated by our algorithm, then converges to the critical point of .
6 Numerical Experiment and Results
In this section we test our algorithm on tensors with different rank and dimensions. We randomly generate tensors with specified ranks and compare the performance of our algorithm with other available algorithms such as LRAT [42]. Next, we apply our algorithm on single moving object videos in order to extract the background and target object.
6.1 Tensor Rank Approximation
In this subsection we test the performance of our algorithm on randomly generated cubic tensors with various dimensions and various rank. The upper bound for the rank of tensors are set to be equal to . The results are shown in TABLE I.
6.2 Comparison between LRAT and our algorithm
In this subsection, we compare the performance of our proposed algorithm to LRAT [42]. We generate a random cubic tensor where its rank is equal to five. The comparison is based on the residual function as well as the sparsity of vector . The upper bound for the rank of the tested tensor is set to be equal to ten for both algorithms.
6.3 Application in background extraction of single moving object videos
In this subsection we apply our algorithm to extract the background of videos. See Figure 1. The video example [4, 33] is a with rank tensor. The relative residual error of is .
7 Conclusion
We presented the iterative algorithm for approximating tensor rank and CP decomposition based on a sparse optimization problem. Specifically, we apply a Tikhonov regularization method for finding the decomposition and a proximal algorithm for the tensor rank. We have also provided convergence analysis and numerical experiments on color images and videos. Overall, this new tensor sparse model and its computational method dramatically improve the accuracy and memory requirements.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] E. Acar, C. A. Bingol, H. Bingol, R. Bro, and B. Yener, Multiway analysis of epilepsy tensors , Bioinformatics, 23 (13), pp. i 10-i 18, 2007.
- 2[2] H. Attouch, J. Bolte, B.F. Svaiter, Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Program. Ser. A 137 (2013), 91–129.
- 3[3] J. Bolte,S. Sabach, M. Teboulle, Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Math. Program,vol. 146, pp. 459-494, (2014).
- 4[4] T. Bouwmans, A. Sobral, S. Javed and S.-K. Jung, and E.-H. Zahzah, Decomposition into Low-rank plus Additive Matrices for Background/Foreground Separation: A Review for a Comparative Evaluation with a Large-Scale Dataset, Computer Science Review, 23 (2017), Page: 1-71
- 5[5] J. Brachat, P. Comon, B. Mourrain and E. Tsigaridas Symmetric Tensor Decomposition , Linear Algebra and Applications 433, 11-12 (2010), pp. 1851-1872.
- 6[6] J. Carrol and J. Chang. Analysis of Individual Differences in Multidimensional Scaling via an N 𝑁 N -way Generalization of “Eckart-Young” Decomposition. Psychometrika, 9, 267-283, 1970.
- 7[7] P. Comon, G. Golub, L-H. Lim and B. Mourrain. Symmetric tensors and symmetric tensor rank. SIAM Journal on Matrix Analysis and Applications, 30 (3), 1254-1279, 2008.
- 8[8] P. Comon , Tensor decompositions , in Mathematics in Signal Processing V, J. G. Mc Whirter and I. K. Proudler, eds., Clarendon Press, Oxford, UK, 2002, pp. 1-24.
