A Low-rank Tensor Dictionary Learning Method for Multi-spectral Images Denoising
Xiao Gong, Wei Chen

TL;DR
This paper introduces a Low-rank Tensor Dictionary Learning method for denoising multi-spectral images, leveraging shared spatial and spectral dictionaries and nearly low-rank approximations to improve noise removal in real-world data.
Contribution
The paper proposes a novel LTDL approach that models nearly low-rank structures and learns shared spatial and spectral dictionaries for enhanced MSI denoising.
Findings
Effective denoising on synthetic data
Superior performance on real MSIs
Outperforms state-of-the-art methods
| Method | Model | Data |
|---|---|---|
| NL-means [5] | Filter | Matrix |
| K-SVD [1] | Sparsity | Matrix |
| BM3D [8] | Filter | Matrix |
| CSR [11] | Sparsity | Matrix |
| WNNM [14] | Low Rank | Matrix |
| LRTA [21] | Low Rank | Tensor |
| NLM3D [7] | Filter | Tensor |
| PARAFAC [17] | Low Rank | Tensor |
| BM4D [18] | Filter | Tensor |
| Tdl [19] | Sparsity and Low Rank | Tensor |
| TenSR [20] | Sparsity | Tensor |
| KBRreg [23] | Sparsity and Low Rank | Tensor |
| the proposed LTDL | Sparsity and Low Rank | Tensor |
| Method | Lower noise level | | | | Higher noise level | | | |
|---|---|---|---|---|---|---|---|---|
| | PSNR | SSIM | SAM | ERGAS | PSNR | SSIM | SAM | ERGAS |
| Noisy image | 20.00 ± 0.00 | 0.14 ± 0.07 | 0.94 ± 0.24 | 552.89 ± 159.26 | 13.98 ± 0.00 | 0.05 ± 0.03 | 1.13 ± 0.20 | 1105.74 ± 318.50 |
| K-SVD | 30.07 ± 1.30 | 0.61 ± 0.04 | 0.53 ± 0.22 | 170.02 ± 38.26 | 27.30 ± 1.44 | 0.46 ± 0.05 | 0.61 ± 0.23 | 233.87 ± 51.99 |
| BM3D | 36.92 ± 2.88 | 0.92 ± 0.03 | 0.21 ± 0.09 | 78.43 ± 23.40 | 33.39 ± 2.93 | 0.86 ± 0.06 | 0.28 ± 0.11 | 117.97 ± 36.64 |
| NLM3D | 36.47 ± 2.93 | 0.93 ± 0.04 | 0.24 ± 0.10 | 85.64 ± 21.54 | 33.44 ± 2.83 | 0.87 ± 0.05 | 0.33 ± 0.15 | 119.53 ± 30.15 |
| LRTA | 36.58 ± 2.78 | 0.89 ± 0.05 | 0.22 ± 0.11 | 82.09 ± 26.02 | 33.08 ± 2.68 | 0.82 ± 0.08 | 0.28 ± 0.12 | 122.36 ± 37.41 |
| PARAFAC | 33.16 ± 4.17 | 0.84 ± 0.11 | 0.28 ± 0.14 | 127.31 ± 65.73 | 31.10 ± 3.00 | 0.73 ± 0.08 | 0.42 ± 0.20 | 155.92 ± 61.05 |
| BM4D | 39.49 ± 2.29 | 0.94 ± 0.02 | 0.22 ± 0.12 | 57.81 ± 12.63 | 35.53 ± 2.10 | 0.87 ± 0.03 | 0.34 ± 0.17 | 91.37 ± 19.21 |
| Tdl | 39.22 ± 2.37 | 0.94 ± 0.02 | 0.17 ± 0.10 | 59.67 ± 14.28 | 35.12 ± 2.05 | 0.87 ± 0.03 | 0.27 ± 0.15 | 95.33 ± 22.70 |
| KBRreg | 40.63 ± 2.15 | 0.95 ± 0.03 | 0.24 ± 0.22 | 51.80 ± 13.47 | 37.57 ± 2.51 | 0.92 ± 0.05 | 0.25 ± 0.24 | 72.98 ± 20.00 |
| Ours | 41.50 ± 2.82 | 0.97 ± 0.01 | 0.09 ± 0.03 | 46.18 ± 12.84 | 38.13 ± 2.73 | 0.95 ± 0.02 | 0.13 ± 0.07 | 67.65 ± 18.40 |
Beijing Jiaotong University
Code is available at https://www.dropbox.com/s/80ltjrxr2v9maeg/LTDL.zip?dl=0
Abstract
As a third-order tensor, a multi-spectral image (MSI) has dozens of spectral bands, which can deliver more information about real scenes. However, real MSIs are often corrupted by noise in the sensing process, which degrades the performance of higher-level classification and recognition tasks. In this paper, we propose a Low-rank Tensor Dictionary Learning (LTDL) method for MSI denoising. First, we extract blocks from the MSI and cluster them into groups. Then, instead of using an exactly low-rank model, we consider a nearly low-rank approximation, which is closer to the latent low-rank structure of the clean groups of real MSIs. In addition, we propose to learn a spatial dictionary and a spectral dictionary, which contain the spatial features and spectral features, respectively, of the whole MSI and are shared among different groups. Hence the LTDL method utilizes both the latent low-rank prior of each group and the correlation of different groups via the shared dictionaries. Experiments on synthetic data validate the effectiveness of dictionary learning by the LTDL. Experiments on real MSIs demonstrate the superior denoising performance of the proposed method in comparison to state-of-the-art methods.
1 Introduction
A multi-spectral image (MSI) has dozens of spectral bands, whose wavelengths may range from infrared to ultraviolet. Compared with an RGB image, which has only three spectral bands, an MSI provides more information, revealing features of the object that are hidden in the spectral domain. However, in many cases, MSIs suffer from corruption or noise in the sensing process [2]. As a low-level image processing technique, MSI denoising is key to many high-level computer vision tasks, such as segmentation and classification, whose performance relies heavily on the quality of the data.
As a model-driven approach, dictionary learning has been used to find the basic atoms that compose the signals of a training dataset. With a dictionary learned from a signal ensemble, noise can be effectively removed by solving a sparse signal recovery problem for each patch of an image [13, 11]. For MSI denoising, applying traditional dictionary learning methods, e.g., K-SVD [1], to each band separately leads to poor performance, as this fails to exploit the spectral information in MSIs [19].
Tensor dictionary learning, which keeps the multidimensional structure of tensors, has attracted growing interest for image processing in recent years. Based on the CANDECOMP/PARAFAC (CP) decomposition, Duan et al. extend the K-SVD method to tensors, where a higher-order tensor dictionary is learned and each atom of the dictionary is a rank-one tensor [12]. Using the Tucker model of tensors, Zubair and Wang propose to learn multiple orthogonal dictionaries along different modes of tensors, where the core tensor has sparse nonzero elements [28]. In [20], Qi et al. divide an MSI into small third-order tensor blocks and learn overcomplete dictionaries for each mode of the blocks via a two-phase block-coordinate-relaxation approach that includes sparse coding and dictionary updating. However, these methods fail to exploit all the structural information embedded in images.
In [8, 18], non-local similar small patches are clustered into groups and processed together, which improves the image denoising performance. In addition, since blocks extracted from MSIs have both spatial and spectral correlation, the low-rank model is employed in several dictionary learning methods for MSI denoising. For example, Peng et al. apply dictionary learning to each group (i.e., a tensor) separately and enforce a low-rank Tucker approximation [19]. A simultaneously sparse and low-rank structure is considered for each group of an MSI in [23]. In both methods, each group of an MSI is processed separately and a learned dictionary captures information of that group only, which fails to exploit inter-group correlations. Learning a dictionary shared among all groups of an MSI would benefit from the multitask learning concept and capture the atoms that compose the signal more effectively.
In this paper, we propose a Low-rank Tensor Dictionary Learning (LTDL) method for MSI denoising, which differs from existing methods in two aspects:
- Dictionaries of the spatial domain and the spectral domain are trained on all tensor groups in the proposed method, so the spatial and spectral features can be learned from the whole MSI, while the denoising methods in [19, 23] use different dictionaries for distinct groups.
- Instead of enforcing the denoised groups to be exactly low-rank, we consider a more flexible model in which each group is decomposed into a low-rank component and a non-low-rank component, as clean MSI groups are nearly low-rank in practice.
A flowchart of the proposed MSI denoising method is shown in Figure 1. Full-band blocks of an MSI are extracted with a sliding window, and similar blocks are then clustered into groups. We obtain third-order tensor groups by unfolding each block in the spectral mode. The three modes of a group correspond to the spatial domain, the spectral domain and the blocks. Shared overcomplete dictionaries of all groups in both the spatial mode and the spectral mode are learned via the proposed LTDL method, where a nearly low-rank structure is enforced for the tensor approximation. An effective algorithm based on the alternating direction method of multipliers (ADMM) [4] is used to solve the optimization problem of the proposed LTDL method for MSI denoising. Experimental results demonstrate that the new method outperforms state-of-the-art methods in MSI denoising.
The rest of the paper is organized as follows: Section 2 reviews existing methods for image denoising based on the filter model, the sparsity model and the low-rank model. In Section 3, we present the proposed LTDL method for MSI denoising. Experimental results on synthetic data and MSIs are provided in Section 4. Finally, conclusions are given in Section 5.
2 Related Work
To separate images from noise, one requires priors on the different signals. The “No Free Lunch” theorem in machine learning suggests that all algorithms perform the same on randomised data, and that good performance can be achieved only when the data has some structure and an appropriate model is used. In the past decade, various model-driven methods have been proposed for image denoising, and some of them are summarised in Table 1.
The first category of image denoising methods [5, 8, 7, 18] uses different filters, e.g., the mean-value filter and the Wiener filter, to exploit the local correlation between adjacent pixels and/or the non-local correlation between similar small patches/blocks of an image. Another category of widely used methods relies on the property that natural images, or clustered groups of their patches, usually exhibit low-rank structures. Different low-rank approximation methods have been proposed for image denoising, such as nuclear norm regularization [11], Tucker low-rank decomposition [21] and CP low-rank decomposition [17]. Based on the sparsity model, dictionary learning methods assume that image patches/blocks are linear combinations of very few atoms selected from a dictionary. Dictionary learning and sparse representation [1, 11] were first applied to 2D image denoising, and then extended to higher-order image denoising, as in TenSR [20]. In recent years, researchers have exploited both the sparsity model and the low-rank model to better utilize the prior of the groups extracted from an MSI [19, 23]. Unfortunately, in these recent methods, dictionaries (also called factors) are learned separately for each tensor group, which deviates from the principle of dictionary learning and significantly increases the total number of dictionary atoms. Our work differs from the existing methods for MSI denoising: instead of using the exact low-rank model, we consider a nearly low-rank structure for each tensor group of an MSI, and learn dictionaries that are shared among all groups.
3 Low-rank tensor dictionary learning for MSI denoising
3.1 Notation
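The following notation is used throughout this paper. The order of a tensor is the number of its modes. Elements of an $N$-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ are denoted by $x_{i_1 i_2 \cdots i_N}$, where $i_n$ refers to the $n$th mode index. A mode-$n$ vector of an $N$-order tensor $\mathcal{X}$ is obtained from $\mathcal{X}$ by varying the index $i_n$ in the $n$th mode while keeping the indices of the other modes fixed. The unfolding matrix of a tensor $\mathcal{X}$ at the $n$th mode is denoted as $\mathbf{X}_{(n)} \in \mathbb{R}^{I_n \times \prod_{m \neq n} I_m}$, where the columns are all mode-$n$ vectors of $\mathcal{X}$. The tensor $\mathcal{X}$ can be obtained by folding the matrix $\mathbf{X}_{(n)}$ at the $n$th mode. The $n$-mode product of the tensor $\mathcal{X}$ and a matrix $\mathbf{A} \in \mathbb{R}^{J \times I_n}$ is a tensor $\mathcal{Y} = \mathcal{X} \times_n \mathbf{A}$, whose elements are computed by $y_{i_1 \cdots i_{n-1} j i_{n+1} \cdots i_N} = \sum_{i_n} x_{i_1 \cdots i_N} a_{j i_n}$. The inner product of two tensors $\mathcal{X}$ and $\mathcal{Y}$ of the same size is the sum of the products of all their entries, i.e., $\langle \mathcal{X}, \mathcal{Y} \rangle = \sum_{i_1, \ldots, i_N} x_{i_1 \cdots i_N} y_{i_1 \cdots i_N}$. The $\ell_1$ norm and the Frobenius norm of a tensor $\mathcal{X}$ are defined as $\|\mathcal{X}\|_1 = \sum_{i_1, \ldots, i_N} |x_{i_1 \cdots i_N}|$ and $\|\mathcal{X}\|_F = \sqrt{\langle \mathcal{X}, \mathcal{X} \rangle}$, respectively. The nuclear norm of a matrix $\mathbf{X}$ is denoted by $\|\mathbf{X}\|_*$, which is the sum of all singular values of $\mathbf{X}$. The symbol $\otimes$ denotes the Kronecker product of matrices.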
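The unfolding, folding and mode product defined above can be sketched in NumPy as follows. This is an illustrative sketch rather than code from the paper: the function names `unfold`, `fold` and `mode_product` are hypothetical, modes are 0-indexed, and the column ordering of the unfolding follows NumPy's row-major layout, which may differ from the ordering the authors use.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: every column is one mode-`mode` vector."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor whose full shape is `shape`."""
    moved_shape = (shape[mode],) + tuple(s for i, s in enumerate(shape) if i != mode)
    return np.moveaxis(matrix.reshape(moved_shape), 0, mode)

def mode_product(tensor, matrix, mode):
    """n-mode product: multiply every mode-`mode` vector by `matrix`."""
    new_shape = list(tensor.shape)
    new_shape[mode] = matrix.shape[0]
    return fold(matrix @ unfold(tensor, mode), mode, tuple(new_shape))

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 6))   # a small third-order tensor
A = rng.standard_normal((3, 5))      # acts on the second mode (index 1)
Y = mode_product(X, A, 1)            # shape (4, 3, 6)

# Frobenius norm expressed through the inner product of X with itself
fro = np.sqrt(np.sum(X * X))
```

The mode product is implemented through the identity "unfold, multiply, fold back", so any consistent column ordering gives the same result.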
3.2 The observation of the nearly low-rank tensor structure in real MSI groups
For an MSI with spatial size and spectral bands, we extract overlapping full-band blocks by using a sampling window that traverses the whole MSI with step lengths and in the two spatial coordinates. Each block is unfolded in the spectral mode to be a matrix , which has a spatial mode and a spectral mode. To exploit the non-local self-similarity of images, blocks of the MSI can be clustered into groups, where each group forms a tensor (called a tensor group). The th tensor group consists of similar blocks of the MSI and . Tensor groups are assumed to have a low-rank structure in a variety of works [19, 24, 23], owing to the correlations across the spatial domain, the spectral domain and similar blocks.
To enforce the low-rank structure of the signal recovered from a noisy tensor group of an MSI, there are at least two approaches. One approach is to use low-rank regularization. For example, by employing the convex tensor nuclear norm [16], the tensor group denoising problem is cast as a convex optimization problem, which is given as
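The block extraction and grouping described above can be sketched as follows. This is a minimal sketch under stated assumptions: the window size, step and number of clusters are hypothetical values, the function names are our own illustration, and a plain Lloyd k-means with random initialization stands in for whatever clustering is actually used (Section 4.2 mentions k-means++).

```python
import numpy as np

def extract_blocks(msi, window, step):
    """Slide a window x window full-band window over the MSI.
    Each block is unfolded into a (window*window, bands) matrix."""
    height, width, bands = msi.shape
    blocks = []
    for r in range(0, height - window + 1, step):
        for c in range(0, width - window + 1, step):
            block = msi[r:r + window, c:c + window, :]
            blocks.append(block.reshape(window * window, bands))
    return np.stack(blocks)  # (num_blocks, window*window, bands)

def kmeans_labels(features, k, iters=20, seed=0):
    """Plain Lloyd k-means with random initial centers (stand-in clustering)."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels

def build_tensor_groups(msi, window=4, step=2, k=3):
    """Return a list of tensor groups of shape (spatial, spectral, blocks in group)."""
    blocks = extract_blocks(msi, window, step)
    labels = kmeans_labels(blocks.reshape(len(blocks), -1), k)
    return [np.transpose(blocks[labels == j], (1, 2, 0))
            for j in range(k) if np.any(labels == j)]

msi = np.random.default_rng(1).random((12, 12, 5))  # toy MSI: 12 x 12 pixels, 5 bands
groups = build_tensor_groups(msi)
```

Each returned group has the spatial mode first, the spectral mode second and the block index third, matching the three modes of a tensor group described in the text.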
[TABLE]
where () are positive scalars that control the low-rank penalties for different modes. The optimization problem in (1) can be efficiently solved by singular value thresholding [6]. Another approach aims to find the rank- approximation of the noisy tensor group in an optimal least squares sense with , and in [19], which is given by
[TABLE]
where is a smaller core tensor, and () are factors. The denoised tensor group is . The optimization problem in (2) is nonconvex, and different algorithms have been proposed in the literature, such as the higher-order singular value decomposition (HOSVD) [9] and higher-order orthogonal iteration (HOOI) [10].
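Two building blocks mentioned above, singular value thresholding and the truncated HOSVD, can be sketched as follows. Both are illustrative only. The function `svt` is shown for a single matrix, where it is the proximal operator of the nuclear norm; how the per-mode penalties of (1) are combined is not reproduced here. The function `truncated_hosvd` returns a Tucker approximation with the requested multilinear rank, which is generally not the optimal solution of (2); HOOI would refine it iteratively. All names and sizes are hypothetical.

```python
import numpy as np

def svt(matrix, tau):
    """Singular value thresholding: shrink every singular value by tau."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def unfold(tensor, mode):
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    moved = np.moveaxis(tensor, mode, 0)
    return np.moveaxis(np.tensordot(matrix, moved, axes=(1, 0)), 0, mode)

def truncated_hosvd(tensor, ranks):
    """Return (core, factors, approximation) with multilinear rank at most `ranks`."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(U[:, :r])              # leading left singular vectors of each unfolding
    core = tensor
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)  # project onto the factor subspaces
    approx = core
    for mode, U in enumerate(factors):
        approx = mode_product(approx, U, mode)
    return core, factors, approx

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 5))
M_shrunk = svt(M, 0.5)

# A tensor with exact multilinear rank (2, 2, 2) is reproduced by the truncated HOSVD.
G = rng.standard_normal((2, 2, 2))
T = np.einsum('abc,ia,jb,kc->ijk', G,
              rng.standard_normal((6, 2)),
              rng.standard_normal((5, 2)),
              rng.standard_normal((7, 2)))
core, factors, T_hat = truncated_hosvd(T, (2, 2, 2))
```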
However, we observe that tensor groups of clean real MSIs are usually not exactly low-rank. As shown in Fig. 2, the curve (in black) of the sorted singular values of a tensor group of the clean “pompoms” MSI in the 3rd mode has a very long tail, i.e., many singular values are close to zero rather than exactly zero. As also shown in Fig. 2, these small singular values contain the texture information of the MSI. Therefore, enforcing an exactly low-rank structure on a tensor group would lead to the loss of important information of the MSI. To avoid this drawback, we consider a nearly low-rank structure for each tensor group and pose the following optimization problem for denoising:
[TABLE]
where is the th recovered tensor group, and is the weight that balances fidelity and the low-rank structure. Note that the small singular values of result from , and are penalized via the Frobenius norm. The optimization problem in (3) can be solved by updating the variables in and alternately. As shown in Fig. 2, with an appropriate choice of the weight , the curve of the sorted singular values of the proposed method is close to the curve of the clean MSI tensor group, while the methods corresponding to (1) and (2) lose the information embedded in the small singular values.
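The alternating scheme can be illustrated with a small sketch. Since the exact form of (3) is not reproduced in this copy, the code assumes one plausible form of the objective, namely a data term ||Y - X||_F^2 plus a weighted term lam * ||X - L||_F^2, where L has low multilinear rank. Under that assumption, the update of X for fixed L is a weighted average, and the update of L for fixed X is a low multilinear rank approximation, for which a truncated HOSVD is used here as a simple stand-in for HOOI. The names, sizes and the value of `lam` are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def multilinear_projection(tensor, ranks):
    """Truncated-HOSVD approximation with multilinear rank at most `ranks`."""
    approx = tensor
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        P = U[:, :r] @ U[:, :r].T  # projector onto the leading mode subspace
        moved = np.moveaxis(approx, mode, 0)
        approx = np.moveaxis(np.tensordot(P, moved, axes=(1, 0)), 0, mode)
    return approx

def nearly_low_rank_denoise(Y, ranks, lam=4.0, iters=10):
    """Alternate between the low-rank part L and the nearly low-rank estimate X."""
    X = Y.copy()
    for _ in range(iters):
        L = multilinear_projection(X, ranks)  # low-rank component for fixed X
        X = (Y + lam * L) / (1.0 + lam)       # minimizer over X for fixed L
    return X, L

rng = np.random.default_rng(0)
core = rng.standard_normal((2, 2, 2))
factors = [rng.standard_normal((8, 2)),
           rng.standard_normal((6, 2)),
           rng.standard_normal((10, 2))]
clean = np.einsum('abc,ia,jb,kc->ijk', core, *factors)
Y = clean + 0.1 * rng.standard_normal(clean.shape)
X, L = nearly_low_rank_denoise(Y, ranks=(2, 2, 2))
```

In this sketch L is exactly low-rank, while X keeps a damped copy of the residual Y - L, so its singular values have a tail instead of being cut to zero.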
3.3 The proposed LTDL method for MSI denoising
In this subsection, we introduce the proposed MSI denoising method that learns shared tensor overcomplete dictionaries and considers the nearly low-rank structure. Two dictionaries, i.e., and , are learned from all tensor groups of an MSI, where corresponds to the spatial domain and corresponds to the spectral domain. We define and as the redundancy ratios (i.e., the ratio of the number of columns to the number of rows, and ) corresponding to the spatial dictionary and the spectral dictionary, respectively. The proposed LTDL method for MSI denoising is formulated as follows:
[TABLE]
where is the representation of the th tensor group, and are the weights corresponding to the regularization for sparsity and low-rank structure, respectively. For the last term of the objective in (4), denotes the rank- approximation of the reconstructed tensor group .
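As a hedged illustration of the shared-dictionary model, the sketch below synthesizes one tensor group from a sparse representation tensor by applying a spatial dictionary along the first mode and a spectral dictionary along the second mode. This structure is inferred from the description in the text, not taken from (4) itself; the sizes, the sparsity level and all variable names are hypothetical. The last lines check the equivalent Kronecker-product form for a single block.

```python
import numpy as np

rng = np.random.default_rng(0)
d_spatial, d_spectral, n_blocks = 16, 8, 5   # hypothetical size of one tensor group
m_spatial, m_spectral = 32, 16               # redundancy ratio 2 in both modes

# Overcomplete dictionaries with unit-norm columns (atoms)
D_spatial = rng.standard_normal((d_spatial, m_spatial))
D_spatial /= np.linalg.norm(D_spatial, axis=0)
D_spectral = rng.standard_normal((d_spectral, m_spectral))
D_spectral /= np.linalg.norm(D_spectral, axis=0)

# Sparse representation tensor of the group
Z = np.zeros((m_spatial, m_spectral, n_blocks))
mask = rng.random(Z.shape) < 0.02
Z[mask] = rng.standard_normal(mask.sum())

# Group synthesis: mode-1 product with D_spatial, mode-2 product with D_spectral
X = np.einsum('ia,jb,abk->ijk', D_spatial, D_spectral, Z)

# Equivalent vectorized form for one block, using column-major vectorization:
# vec(D_spatial @ Z_k @ D_spectral.T) = kron(D_spectral, D_spatial) @ vec(Z_k)
k = 0
lhs = X[:, :, k].flatten(order='F')
rhs = np.kron(D_spectral, D_spatial) @ Z[:, :, k].flatten(order='F')
```

Because the same two dictionaries are used for every group, only the representation tensor changes from group to group.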
3.4 Algorithm development
The proposed optimization problem in (4) is nonconvex. Here, we employ the alternating direction method of multipliers (ADMM) [4], which splits the problem into several subproblems.
We first introduce auxiliary tensors () and the optimization problem in (4) is equivalent to
[TABLE]
The augmented Lagrangian function for the above problem can be given as:
[TABLE]
where are the Lagrange multipliers, is a positive scalar, and the columns of the dictionaries and are constrained to have unit power.
Now we introduce the strategy for solving (5) based on the ADMM. To minimize the augmented Lagrangian function (6), we fix the dictionaries and , and all multipliers . Then (6) can be split into separate optimization problems with objectives:
[TABLE]
, and can be updated alternately by solving the optimization problem with all the other variables fixed. Specifically, to update , one needs to solve the tensor low multilinear rank approximation problem given by
[TABLE]
which can be handled by HOOI [10] and the solution is denoted as
[TABLE]
To update , with irrelevant terms removed, the problem of becomes:
[TABLE]
which has a closed-form solution
[TABLE]
where and is an identity matrix. The tensor can be obtained by folding at the 3rd mode. Updating requires solving the following optimization problem:
[TABLE]
which leads to the solution
[TABLE]
where denotes the soft-thresholding operator and . The elements of is , where denotes the sign function. Note that variables for different tensor groups () can be updated in parallel.
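The soft-thresholding operator is the elementwise proximal operator of the l1 norm. A minimal sketch follows; the threshold value used in the paper depends on quantities not reproduced in this copy, so `tau` below is just an illustrative number.

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise soft-thresholding: sign(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

values = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
shrunk = soft_threshold(values, 1.0)  # entries with |x| <= 1 become zero
```

Since the operator acts on each entry independently, it can be applied to the representation tensors of all groups in parallel.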
Next we update the dictionaries and , which are shared by all groups. By letting , the optimization problem becomes
[TABLE]
where and are obtained by stacking all groups of and at the 3rd mode, respectively. Define . Then can be updated by
[TABLE]
where and . The optimization problem in (15) is a quadratically constrained quadratic programming problem and can be solved using a Lagrange dual [15]. Thus, the spatial dictionary can be updated by
[TABLE]
where and () are dual variables whose values are obtained by solving the dual problem. Similarly, letting , can be obtained by solving the following problem:
[TABLE]
Then the spectral dictionary is updated by
[TABLE]
where and () are optimized dual variables of (17).
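For intuition, the closed form of a Lagrange-dual dictionary update can be sketched on a simplified problem. The sketch assumes a plain least squares fit of a data matrix Y by D @ A with a norm constraint on each column of D; the actual problems (15) and (17) involve the tensor unfoldings and ADMM terms that are not reproduced here. For given nonnegative dual variables, the minimizer of the Lagrangian is D = Y A^T (A A^T + diag(dual))^{-1}. The dual variables are taken as given in this sketch, whereas in practice they are obtained by solving the dual problem [15]. All names are hypothetical.

```python
import numpy as np

def dictionary_update(Y, A, dual):
    """Minimizer of ||Y - D A||_F^2 + sum_j dual[j] * ||d_j||^2 over D."""
    gram = A @ A.T + np.diag(dual)
    # gram is symmetric, so D = Y A^T gram^{-1} equals (gram^{-1} A Y^T)^T
    return np.linalg.solve(gram, A @ Y.T).T

rng = np.random.default_rng(0)
D_true = rng.standard_normal((6, 4))
A = rng.standard_normal((4, 50))   # coefficients: atoms x samples
Y = D_true @ A                     # noiseless data

D_ls = dictionary_update(Y, A, np.zeros(4))             # inactive constraints
D_shrunk = dictionary_update(Y, A, 10.0 * np.ones(4))   # active constraints shrink the atoms
```

With zero dual variables the update reduces to ordinary least squares, and larger dual variables pull the atom norms down until the norm constraints are met.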
For the last step, we update the Lagrange multipliers in turn by
[TABLE]
The proposed LTDL method for MSI denoising is summarized in Algorithm 1.
In the following theorem, we provide a weak convergence condition for the proposed algorithm, where we assume the dictionaries are fixed. By alternately updating the dictionaries and the other variables, it can be guaranteed that the value of the cost function in (4) does not increase. In practice, to speed up the algorithm, one can update the dictionaries in each iteration without waiting for the other variables to converge. Although we cannot provide formal convergence guarantees in this case, we did not encounter convergence problems in our experiments.
When and are fixed, for the th iteration, the sequences , and () satisfy:
[TABLE]
where is the initial value. Proofs are provided in the supplementary document.
4 Experimental results
In this section, we evaluate the effectiveness of the proposed LTDL using both synthetic data and real MSIs.
4.1 Dictionary Learning Performance with Synthetic Data
We first evaluate the dictionary learning performance of the proposed method on synthetic data. Both the spatial dictionary and the spectral dictionary are generated randomly with normalized columns. For each tensor group, its tensor representation is generated randomly with sparsity level 6, i.e., no more than 6 dictionary atoms are used to constitute the tensor group. The nonzero components in can be rewritten as a matrix, which is generated as the product of two random matrices and . Therefore, all generated tensor groups have ranks no higher than . We generate 200 different tensor groups in this way, and then add Gaussian noise with standard deviation . The learned dictionaries are compared against all atoms of the generated dictionaries and we find the closest pair via , where is a generated dictionary atom and is a recovered dictionary atom (footnote 1: we consider the equivalent dictionary in the experiments). The learning is deemed successful if the distance is less than . The convergence behaviour of the proposed algorithm, i.e., the successful recovery ratio of the ground truth dictionary versus the number of iterations, is shown in Figure 3, where we provide the results of 20 trials for each noise level. As the noise variance decreases, the successful recovery ratio improves.
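The atom-matching criterion can be sketched as follows. The distance formula and the success threshold are not reproduced in this copy, so the sketch assumes a common choice from the dictionary learning literature: the distance between two unit-norm atoms is one minus the absolute value of their inner product, which ignores sign flips, and each true atom is matched to its closest learned atom, which ignores permutations. The threshold `tol=0.01` and all names are hypothetical.

```python
import numpy as np

def recovery_ratio(D_true, D_learned, tol=0.01):
    """Fraction of true atoms that have a learned atom within distance `tol`."""
    Dt = D_true / np.linalg.norm(D_true, axis=0, keepdims=True)
    Dl = D_learned / np.linalg.norm(D_learned, axis=0, keepdims=True)
    dist = 1.0 - np.abs(Dt.T @ Dl)      # rows: true atoms, columns: learned atoms
    return float(np.mean(dist.min(axis=1) < tol))

rng = np.random.default_rng(0)
D_true = rng.standard_normal((20, 30))

# A "perfect" recovery up to permutation and sign of the atoms
perm = rng.permutation(30)
signs = rng.choice([-1.0, 1.0], size=30)
D_perfect = D_true[:, perm] * signs

# Replace half of the learned atoms with unrelated random atoms
D_half = D_perfect.copy()
D_half[:, :15] = rng.standard_normal((20, 15))

ratio_perfect = recovery_ratio(D_true, D_perfect)
ratio_half = recovery_ratio(D_true, D_half)
```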
4.2 Denoising Performance for Real MSIs
MSI datasets: Two real MSI datasets, i.e., the Columbia dataset [25] (http://www.cs.columbia.edu/CAVE/databases/multispectral/) and the urban area HYDICE MSI (https://erdc-library.erdc.dren.mil/xmlui/handle/11681/2925), are used to evaluate the denoising performance of the proposed algorithm. The Columbia dataset includes 32 scenes that are separated into 5 sections. Each MSI has a spatial size of 512 × 512 and includes full spectral resolution reflectance data from 400 nm to 700 nm at 10 nm steps, which leads to 31 bands. These MSIs are adopted as clean data, and we generate additive noise to evaluate the denoising performance of the proposed algorithm. For the natural urban area HYDICE MSI, we select the bands from to , which are severely damaged (i.e., the noise is not generated).
Experimental settings: In the proposed LTDL method, the spatial window size is set as and the step size is set as . Groups of an MSI are clustered by using k-means++ [3], and each cluster forms a tensor group. We set and as the parameters in the ADMM of our algorithm. The redundancy ratios of the dictionaries are set as . We add Gaussian noise with zero mean and the standard deviation to the whole MSIs in the Columbia dataset. We set the sparsity weight and the low-rank weight . The rank parameters in (9) are estimated by using the tensor rank estimation method proposed in [26]. For unknown noise, we suggest that the is smaller than the size of the th tensor group.
The proposed algorithm is compared with both denoising methods for 2D images including K-SVD [1] and BM3D [8] and denoising methods for 3D images including NLM3D [7], LRTA [21], PARAFAC [17], BM4D [18], Tdl [19], and KBRreg [23]. For denoising methods for 2D images, each MSI is processed band by band as multiple 2D images.
MSI denoising with generated noise: We generate Gaussian noise for the MSIs in the Columbia dataset, and employ four performance indicators, namely peak signal-to-noise ratio (PSNR), structural similarity (SSIM), spectral angle mapper (SAM) [27] and dimensionless global relative error of synthesis (ERGAS) [22], which are widely used to evaluate recovery quality for MSIs. Recovered MSIs with higher PSNR and SSIM or lower SAM and ERGAS are considered to be of better quality. PSNR and SSIM are classical spatial-based quality indices, while ERGAS and SAM are spectral-based quality indices. We report the average denoising performance and the standard deviation of all the compared methods in Table 2. It can be observed that the proposed method outperforms all the competing methods under all four quality indices.
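Three of the four indices can be sketched in a few lines. The conventions below are assumptions, since the paper does not spell them out in this copy: images are arrays of shape (height, width, bands) scaled to [0, 1], PSNR is computed over the whole cube with peak value 1, SAM is the mean spectral angle in radians over all pixels, and ERGAS uses a resolution ratio of 1. SSIM is omitted because it needs a windowed implementation.

```python
import numpy as np

def psnr(clean, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB over the whole cube."""
    mse = np.mean((clean - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def sam(clean, estimate, eps=1e-12):
    """Mean angle (radians) between the spectral vectors of corresponding pixels."""
    a = clean.reshape(-1, clean.shape[-1])
    b = estimate.reshape(-1, estimate.shape[-1])
    cos = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps)
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))

def ergas(clean, estimate, ratio=1.0):
    """ERGAS: 100 * ratio * sqrt(mean over bands of MSE_band / mean_band^2)."""
    mse_per_band = np.mean((clean - estimate) ** 2, axis=(0, 1))
    mean_per_band = np.mean(clean, axis=(0, 1))
    return float(100.0 * ratio * np.sqrt(np.mean(mse_per_band / mean_per_band ** 2)))

rng = np.random.default_rng(0)
clean = 0.5 * rng.random((8, 8, 4)) + 0.25                 # toy cube with values in [0.25, 0.75]
noisy = clean + 0.1 * rng.standard_normal(clean.shape)     # additive Gaussian noise

scores = {"PSNR": psnr(clean, noisy), "SAM": sam(clean, noisy), "ERGAS": ergas(clean, noisy)}
```

As a sanity check on the PSNR convention, a constant error of 0.1 on data with peak value 1 gives a mean squared error of 0.01 and therefore 20 dB.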
To visualize the denoising performance, we display the denoised 590nm band image of the “flowers” MSI in Figure 4. Zooming in on the petal part of the image, it can be observed that the proposed LTDL retains the gynoecium details of the flower, while most other methods blur the texture.
Real MSI denoising: In reality, an MSI may suffer from different kinds of noise, including not only Gaussian noise but also non-Gaussian noise such as stripe noise. We now investigate the effectiveness of the proposed LTDL for recovering a real corrupted MSI. The denoising results are compared in Figure 5, where the 197th band of the HYDICE MSI has light noise and the 207th band has severe noise. The results of LRTA, PARAFAC and Tdl are much poorer than the others and are not shown here owing to space limitations. BM4D is more appropriate for MSIs than BM3D, and thus we also omit the results of BM3D to save space. We zoom in on a part of the restored image that involves stripe noise. It can be observed that although most methods achieve similar performance on the 197th band image, which has light noise, the proposed LTDL is able to restore the 207th band, which suffers from more complex and severe noise.
The learned dictionaries of the proposed method are shown in Fig. 6. Atoms in the spatial dictionary represent the spatial features of the HYDICE MSI. To enhance visualization, we reorganize each column into a patch of the size . Atoms of the learned spectral dictionary correspond to various spectral features of different bands.
5 Conclusion
This paper presents an effective tensor dictionary learning method for restoring high-dimensional MSIs. The proposed LTDL method exploits the nearly low-rank structure in a group of similar blocks in the natural MSI and also exploits shared dictionaries among different groups, which makes the proposed method distinct from existing methods. Experimental results show the superior performance of the proposed method for denoising MSIs with both simulated corruptions and real corruptions.
References
- [1] M. Aharon, M. Elad, A. Bruckstein, et al. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Processing, 54(11):4311, 2006.
- [2] B. Aiazzi, L. Alparone, A. Barducci, S. Baronti, and I. Pippi. Information-theoretic assessment of sampled hyperspectral imagers. IEEE Trans. Geoscience and Remote Sensing, 39(7):1447–1458, 2001.
- [3] D. Arthur and S. Vassilvitskii. K-means++: the advantages of careful seeding. pages 1027–1035. Society for Industrial and Applied Mathematics, 2007.
- [4] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2011.
- [5] A. Buades, B. Coll, and J.-M. Morel. A non-local algorithm for image denoising. In CVPR, volume 2, pages 60–65. IEEE, 2005.
- [6] J.-F. Cai, E. J. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956–1982, 2010.
- [7] P. Coupé, P. Yger, S. Prima, P. Hellier, C. Kervrann, and C. Barillot. An optimized blockwise nonlocal means denoising filter for 3-D magnetic resonance images. IEEE Trans. Medical Imaging, 27(4):425–441, 2008.
- [8] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Processing, 16(8):2080–2095, 2007.
