A Dual Sparse Decomposition Method for Image Denoising
Hong Sun, Chen-guang Liu, Cheng-wei Sang

TL;DR
This paper introduces a dual sparse decomposition technique for image denoising, especially effective under strong noise conditions, by decomposing dictionaries based on atom occurrence frequency, leading to superior denoising performance.
Contribution
It presents a novel dual sparse decomposition approach that enhances image denoising by utilizing a new criterion for sub-dictionary selection based on atom frequency.
Findings
Outperforms state-of-the-art denoising methods in PSNR and SSIM.
Improves subjective visual quality of denoised images.
Effective under strong noise conditions.
Abstract
This article addresses the image denoising problem under strong noise. We propose a dual sparse decomposition method, which applies a sub-dictionary decomposition to the over-complete dictionary obtained by sparse decomposition. The sub-dictionary decomposition uses a novel criterion based on the occurrence frequency of the atoms of the over-complete dictionary over the data set. The experimental results demonstrate that the dual-sparse-decomposition method surpasses state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and structural similarity index metric, as well as in subjective visual quality.
Table I: Denoising algorithm based on the dual sparse decomposition
| Input: image data (Eq. (1)) |
| Grouping: for each reference patch, form a group of similar patches according to Eq. (14) or (15) |
| Dual sparse decomposition: for each group do |
| - Sparse decomposition: solve Eq. (2) |
| - Subspace decomposition: Eqs. (5), (8)-(11) |
| - Linear reconstruction on the principal sub-dictionary: Eqs. (12)-(13) |
| Aggregate: combine the filtered groups to form the denoised image |
| Output: denoised image |
Hong Sun
(1) School of Electronic Information
Wuhan University
430072 Wuhan, China
(2) Dept. Signal and Image Processing
Telecom ParisTech
46 rue Barrault, 75013 Paris, France
Email: [email protected]
Chen-guang Liu
(1) School of Electronic Information
Wuhan University
430072 Wuhan, China
(2) Dept. Signal and Image Processing
Telecom ParisTech
46 rue Barrault, 75013 Paris, France
Email: [email protected]
Cheng-wei Sang
School of Electronic Information
Wuhan University
430072 Wuhan, China
Email: [email protected]
I Introduction
Two main issues are involved in the denoising problem. One is the filtering technique, which uses signal analysis to identify the information underlying the noisy data. The other is the grouping technique, which uses clustering to provide homogeneous signals for filtering.
Almost all filtering techniques assume that the signal involved is homogeneous. Therefore, a grouping procedure is generally required before filtering. Many edge detection and image segmentation techniques [1] are used in image denoising. More recently, the nonlocal self-similarity method [2] has provided a potential breakthrough for data grouping, and it is adopted in this paper.
Filtering techniques have been developed over the past 50 years or so from many points of view: statistical estimation methods, such as the Wiener filter and adaptive filters [3]; transform-domain methods, such as principal component analysis [4] and wavelet shrinkage [5]; and so on. The underlying assumption of these filtering methods is that the information in the noisy data concentrates its energy in a small linear subspace of the overall space of possible data vectors, whereas additive noise is typically distributed isotropically through the larger space.
However, in many practical cases, some components with low energy might actually be important because they carry information about the signal details. Conversely, when dealing with noise with non-Gaussian statistics, some noise components may have higher energies. Consequently, a major difficulty of filtering is to separate the information details from the noise. One way to deal with this problem is a cooperative filtering technique, such as the Turbo iterative filter [6].
In recent years, sparse coding over an over-complete dictionary has attracted significant interest in the field of signal denoising [7]. A sparse representation is a signal decomposition on a very small set of components (called atoms) that are adapted to the observational data. Denoising based on sparse decomposition achieves a much better trade-off between the preservation of details and the suppression of noise. However, because the sparse decomposition is adapted to noisy data, separating details from noise remains an issue.
In this paper, we propose a dual sparse decomposition method for filtering. The first decomposition learns an over-complete dictionary that rejects the noise components that are truly distributed isotropically through the larger space while preserving the information details as much as possible. The second decomposition identifies principal atoms to form a sub-dictionary that preserves the weak information details well and simultaneously suppresses strong noise.
This article is organized as follows: Section II analyzes some limitations of the classical sparse decomposition for denoising. Section III presents the principle of the proposed dual sparse decomposition. Section IV shows some experimental results and comparisons with state-of-the-art image denoising methods. Finally, we draw the conclusion in Section V.
II Sparse Decomposition for Denoising
We start with a brief description of the classical sparse decomposition and analyze its limitations for denoising.
Sparse decomposition represents each observation as a linear combination of a few columns of a dictionary. Each column is a basis vector, also called an atom of the dictionary; the atoms are not necessarily independent. When the number of atoms exceeds the dimension of the observations, the dictionary is said to be over-complete. Given the observational data set:
[TABLE]
the dictionary and the sparse coefficients can be obtained by solving the following problem [8]:
[TABLE]
where the sparsity of each coefficient vector is measured by the ℓ0-norm and the approximation error by the ℓ2-norm. In equation (2), each coefficient vector is the sparse code of the corresponding observation. The allowed error tolerance can be chosen according to the standard deviation of the noise. The sparse decomposition can be written in matrix form as:
[TABLE]
where the coefficient matrix is composed of the sparse column vectors, one per observation:
[TABLE]
An estimate of the underlying signal embedded in the observed data set would be:
[TABLE]
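The sparse-coding part of equation (2) can be illustrated for a fixed dictionary with a greedy pursuit. The following is a minimal sketch, assuming orthogonal matching pursuit with an error-constrained stopping rule; the function name `omp_error_constrained`, the random dictionary, and the factor 1.15 used to tie the tolerance to the noise level are illustrative assumptions, and the dictionary-learning step of [8] is not shown.

```python
import numpy as np

def omp_error_constrained(D, x, eps, max_atoms=None):
    """Greedy sparse code of x on dictionary D (atoms as columns).

    Atoms are added one at a time until the squared l2 residual
    drops to eps or below (hypothetical helper, not from the paper).
    """
    n_features, n_atoms = D.shape
    if max_atoms is None:
        max_atoms = min(n_features, n_atoms)
    coef = np.zeros(n_atoms)
    support = []
    residual = np.asarray(x, dtype=float).copy()
    while residual @ residual > eps and len(support) < max_atoms:
        # Pick the atom most correlated with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        k = int(np.argmax(corr))
        if corr[k] < 1e-12:      # nothing left to explain
            break
        support.append(k)
        # Least-squares refit on all selected atoms.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = x - D @ coef
    return coef

rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
sigma = 0.01                             # assumed noise level
x = 1.5 * D[:, 3] - 2.0 * D[:, 17] + sigma * rng.standard_normal(16)
eps = 16 * (1.15 * sigma) ** 2           # tolerance tied to sigma (assumed factor)
a = omp_error_constrained(D, x, eps)
print(np.count_nonzero(a), float(np.sum((x - D @ a) ** 2)))
```

Raising `eps` stops the pursuit earlier and yields a sparser code; lowering it lets more atoms in, which is the trade-off discussed in the paragraph that follows.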
The over-complete dictionary in sparse decomposition can effectively capture the information patterns and reject white Gaussian noise patterns. However, we note that learning the dictionary by equation (2) faces a dilemma between preserving weak details and suppressing noise. On one hand, in order to suppress noise, the allowed error tolerance in equation (2) should be large enough; as a result, certain weak details are lost. On the other hand, capturing weak details requires a small tolerance, but it cannot be too small; otherwise some atoms become so noisy that they degrade the denoising performance. Fig. 1 illustrates this situation. Taking a noisy image degraded by white noise (Fig. 1a), we learn two different dictionaries (Fig. 1b) by solving equation (2) with two different error tolerances, and obtain two different retrieved images (Fig. 1b) by equation (4). In the image retrieved with the larger tolerance, the noise is well suppressed but some information details are lost. In the image retrieved with the smaller tolerance, more details are preserved but the result is rather noisy.
Considering the above limitation of sparse-decomposition-based denoising, our idea of dual sparse decomposition is to make a two-step sparse decomposition. The first step learns an over-complete dictionary from the observational data with a lower allowed error tolerance according to equation (2). The dictionary obtained can thereby capture more information details, although it contains some noisy atoms. The second step makes a sub-dictionary decomposition on this dictionary to reject the atoms that are too noisy.
III Sparse Subspace Decomposition
To obtain the sub-dictionary, we introduce a novel criterion for the sparse subspace decomposition of a learned dictionary, together with a corresponding index of significance of the atoms.
III-A Occurrence Frequency of Atom
Atoms in the sparse decomposition are prototypes of signal segments, which allows us to treat each atom as a signal pattern. Important features of a signal pattern can therefore serve as a criterion to identify significant atoms. We rely on a common observation about signal regularity: a signal pattern occurs in meaningful signals with high frequency even when its energy is low, as with the geometrical regularity of image structures such as edges and textures. In contrast, a noise pattern is hardly ever reproduced in the observed data even when its energy is high. Therefore, we propose to take the frequency with which atoms appear in the data set as the criterion to identify principal atoms [9]. In fact, the frequency of atoms is a good description of the signal texture [10].
We seek a measure of atom frequency from the sparse codes. The coefficient matrix in the sparse representation (equation (3)) is composed of sparse column vectors. Let us consider the row vectors of the coefficient matrix:
[TABLE]
where
[TABLE]
Note that these row vectors are not necessarily sparse.
Thus, the coefficient matrix can be written in terms of its row vectors as:
[TABLE]
Then equation (4) can be expressed in terms of the reordered dictionary and its coefficients as:
[TABLE]
The zero pseudo-norm of each row vector of the coefficient matrix is just the number of occurrences of the corresponding atom over the data set. We can therefore define the frequency of an atom as:
[TABLE]
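The occurrence count described above is simply the number of nonzero entries in each row of the coefficient matrix. A minimal sketch, assuming the coefficient matrix is stored as a NumPy array with one row per atom and one column per observation; the function name `atom_occurrences`, the toy matrix, and the normalization by the number of observations are illustrative assumptions.

```python
import numpy as np

def atom_occurrences(A, tol=0.0):
    """Count, for each atom (row of A), how many observations use it.

    Entries with magnitude <= tol are treated as zero.
    """
    return np.count_nonzero(np.abs(A) > tol, axis=1)

# Toy coefficient matrix: 3 atoms (rows), 4 observations (columns).
A = np.array([[0.9, 0.0, 1.2, 0.4],
              [0.0, 0.0, 0.0, 2.5],
              [0.3, 0.1, 0.0, 0.0]])

counts = atom_occurrences(A)   # zero pseudo-norm of each row
freq = counts / A.shape[1]     # relative frequency (assumed normalization)
print(counts, freq)
```

Note that the second atom carries the largest coefficient but is used only once, so an energy-based ranking and an occurrence-based ranking would order these atoms differently.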
III-B Subspace Decomposition on Over-complete Dictionary
Taking the row vectors from equation (5), we calculate their ℓ0-norms and rank them in descending order:
[TABLE]
Following this order, the reordered dictionary is written as:
[TABLE]
Equation (6) becomes:
[TABLE]
Then, the leading atoms of the reordered dictionary span a principal subspace and the remaining atoms span a noise subspace:
[TABLE]
In practice, the number of principal atoms is determined by a threshold on the atom frequency that separates the principal sub-dictionary from the noise sub-dictionary. We set this threshold to the maximum point of the histogram of the atom frequencies:
[TABLE]
An estimate of the underlying signal embedded in the observed data set can be obtained on the principal sub-dictionary simply by linear combination:
[TABLE]
Note that .
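The ranking, thresholding, and linear reconstruction described in this subsection can be sketched as follows. This is only one possible reading of the threshold rule: the sketch assumes the threshold is the most frequent occurrence count (the peak of the histogram) and that atoms occurring strictly more often than this are principal. The function names `split_dictionary` and `reconstruct_on`, and the toy data, are hypothetical.

```python
import numpy as np

def split_dictionary(A):
    """Split atom indices into principal and noise sets by occurrence count.

    A holds one row of coefficients per atom. Atoms are ranked by the
    number of nonzero coefficients in their row, in descending order.
    """
    counts = np.count_nonzero(A, axis=1)
    order = np.argsort(-counts, kind="stable")        # descending ranking
    threshold = int(np.argmax(np.bincount(counts)))   # histogram peak (assumed rule)
    principal = order[counts[order] > threshold]
    noise = order[counts[order] <= threshold]
    return principal, noise

def reconstruct_on(D, A, atoms):
    """Linear combination restricted to the selected atoms."""
    return D[:, atoms] @ A[atoms, :]

rng = np.random.default_rng(1)
D = rng.standard_normal((4, 5))        # 5 atoms of dimension 4
A = np.zeros((5, 6))                   # 6 observations
A[0, :] = 1.0                          # atom 0 used 6 times
A[1, 2] = 0.5                          # atom 1 used once
A[2, :5] = 2.0                         # atom 2 used 5 times
A[3, 4] = -0.7                         # atom 3 used once
A[4, [0, 3]] = 1.1                     # atom 4 used twice

principal, noise = split_dictionary(A)
X_principal = reconstruct_on(D, A, principal)
X_noise = reconstruct_on(D, A, noise)
print(principal, noise)
```

Because the two index sets partition the atoms, the two partial reconstructions add up to the full sparse approximation; the denoised estimate keeps only the principal part.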
We show an example of the proposed dual sparse decomposition in Fig. 1(c). The learned over-complete dictionary is decomposed into a principal sub-dictionary and a noise sub-dictionary under the atom frequency criterion. The image retrieved by the dual sparse decomposition method preserves fine details and suppresses strong noise well. We note that the residual image on the noise sub-dictionary contains some information but is very noisy. This is because the atoms of the over-complete dictionary are not independent: the information in the residual image is also present in the image retrieved on the principal sub-dictionary.
III-C Application to Filtering
A major difficulty of filtering is to suppress noise, whether Gaussian or non-Gaussian, while simultaneously preserving information details. We use the peak signal-to-noise ratio (PSNR) to assess the noise removal performance:
[TABLE]
and the structural similarity index metric (SSIM) between the denoised image and the noise-free one to evaluate how well details are preserved:
[TABLE]
where the index is built from the average and the variance of each image and the covariance of the two images, and two small constants stabilize the division when the denominator is weak.
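Both measures can be sketched in a few lines. The example below assumes the usual definitions: PSNR computed from the mean squared error with a peak value of 255, and SSIM computed from means, variances, and covariance with the customary stabilizing constants. It evaluates SSIM once over the whole image, whereas common implementations average it over local windows, so the values are not directly comparable to published figures. The function names and constants are assumptions.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

def ssim_global(x, y, peak=255.0, k1=0.01, k2=0.03):
    """SSIM from global image statistics (single window, a simplification)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2   # small stabilizing constants
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = np.mean((x - mx) * (y - my))
    num = (2 * mx * my + c1) * (2 * cxy + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float(num / den)

rng = np.random.default_rng(2)
clean = rng.uniform(0, 250, size=(32, 32))
shifted = clean + 5.0                              # constant error of 5 gray levels
noisy = clean + 20.0 * rng.standard_normal(clean.shape)

print(psnr(clean, shifted), ssim_global(clean, noisy))
```

A constant offset of 5 gray levels gives a mean squared error of 25, so the first printed value is 10·log10(255²/25), about 34.15 dB.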
In the example shown in Fig. 1, the image retrieved by the K-SVD filter [11] with the classical sparse decomposition scores well in PSNR and SSIM, but some information details are obviously lost. In contrast, the second retrieved image is noisier, but more information details are preserved. Applying the dictionary decomposition to this noisier but more detailed result, the image retrieved on the principal sub-dictionary achieves higher PSNR and SSIM.
Fig. 2 shows an image filtering result based on the proposed dual sparse decomposition and a comparison with the K-SVD algorithm. The dual sparse decomposition method outperforms the K-SVD method in both PSNR and SSIM. In terms of subjective visual quality, the corner of the mouth and the nasolabial fold, which have weak intensities, are much better recovered by the dual sparse decomposition method.
Fig. 3 shows the despeckling results for a simulated one-look SAR scenario with a fragment of the Barbara image. The result of the probabilistic patch-based (PPB) filter [12], which can cope with non-Gaussian noise, shows that PPB removes speckle noise well but also removes low-intensity details. The dual sparse decomposition method shows advantages in preserving fine details and suppressing strong noise.
IV Application to Denoising
In practical applications, images generally contain spatially complicated scenes, whereas the filtering techniques used are generally suited to homogeneous images. For image denoising based on sparse decomposition, the hypotheses of signal sparsity and component reproducibility also imply a condition of homogeneity. To make the signal involved homogeneous, we select homogeneous pixels before filtering using a self-similarity measure [2]. For image denoising, the measure can be specified as the Euclidean distance between the reference patch and a given patch:
[TABLE]
The smaller the distance, the more similar the two patches are. This self-similarity matches well the highly repetitive structures of images.
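Grouping by Euclidean patch distance can be sketched as below. The sketch assumes square patches extracted on a regular grid and a squared Euclidean distance (which gives the same ranking as the unsquared one); the function names, patch size, stride, and group size are illustrative choices, not values from the experiments.

```python
import numpy as np

def extract_patches(img, size, stride=1):
    """Return flattened square patches and their top-left coordinates."""
    h, w = img.shape
    coords = [(i, j)
              for i in range(0, h - size + 1, stride)
              for j in range(0, w - size + 1, stride)]
    patches = np.stack([img[i:i + size, j:j + size].ravel() for i, j in coords])
    return patches, coords

def group_similar(patches, ref_index, k):
    """Indices and distances of the k patches closest to the reference."""
    d = np.sum((patches - patches[ref_index]) ** 2, axis=1)
    nearest = np.argsort(d, kind="stable")[:k]
    return nearest, d[nearest]

rng = np.random.default_rng(3)
img = rng.random((12, 12))
patches, coords = extract_patches(img, size=4, stride=2)   # 25 patches of 16 pixels
nearest, dists = group_similar(patches, ref_index=12, k=5)
group = patches[nearest]        # data group handed to the filtering step
print(nearest, dists)
```

The reference patch is always part of its own group, since its distance to itself is zero.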
For image despeckling, the measure becomes the probabilistic patch-based similarity proposed in [12]:
[TABLE]
where and the equivalent number of looks.
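For speckled amplitude data, a patch dissimilarity in the spirit of the probabilistic patch-based criterion can be sketched as follows. The sketch assumes the form commonly given for amplitude images, (2L - 1) Σ log(A1/A2 + A2/A1), with L the equivalent number of looks; this form, the function name `ppb_dissimilarity`, and the Rayleigh test data are assumptions rather than details confirmed by the text above.

```python
import numpy as np

def ppb_dissimilarity(amp_ref, amp_other, looks=1.0):
    """Patch dissimilarity for speckled amplitude data (assumed form).

    Both patches must contain strictly positive amplitudes.
    Smaller values mean more similar patches.
    """
    a = np.asarray(amp_ref, dtype=float).ravel()
    b = np.asarray(amp_other, dtype=float).ravel()
    return float((2.0 * looks - 1.0) * np.sum(np.log(a / b + b / a)))

rng = np.random.default_rng(4)
p = rng.rayleigh(scale=1.0, size=16) + 1e-3   # one-look speckle amplitudes
q = rng.rayleigh(scale=1.0, size=16) + 1e-3

print(ppb_dissimilarity(p, p), ppb_dissimilarity(p, q))
```

Unlike the Euclidean distance, this measure depends on amplitude ratios, which suits multiplicative noise. Its minimum is reached for identical patches and equals (2L - 1)·n·log 2 for n pixels rather than zero, which does not affect the ranking of candidate patches.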
For a given reference patch, we stack it with its most similar patches to form a group of data. In our experiments, we take . Then we apply a filtering algorithm to each of the data groups. Our denoising algorithm is presented in Table I.
To compare with the state of the art, we take the BM3D method [13], one of the best current methods for image denoising. In the BM3D method, a block-matching grouping is also used before filtering. In the experiments, the dictionaries used are of size ( atoms), designed to handle image patches of size pixels.
Fig. 4 shows the results of denoising an image corrupted by strong additive zero-mean white Gaussian noise, together with the performance of the dual-sparse-decomposition method and the BM3D method. Fig. 5 shows the results of despeckling a simulated one-look SAR image with non-Gaussian noise, together with the performance of the dual-sparse-decomposition method and the SAR-BM3D method [14]. The experimental results demonstrate some advantage of the dual-sparse-decomposition method in preserving fine details and suppressing speckle noise, with better subjective visual quality than the BM3D method.
V Conclusion
This work presents a new signal analysis method based on the proposed dual sparse decomposition, leading to state-of-the-art performance for image denoising. The proposed method introduces a sub-dictionary decomposition on an over-complete dictionary learned under a lower allowed error tolerance. The principal sub-dictionary is identified under a novel criterion based on the occurrence frequency of atoms. The experimental results demonstrate that the proposed denoising method has some advantages both in preserving information details and in suppressing strong noise, and that it provides retrieved images with better subjective visual quality.
The proposed dual sparse decomposition could be extended straightforwardly to applications such as feature extraction, inverse problems, or machine learning.
Acknowledgment
This work was supported by the National Natural Science Foundation of China (Grant No. 60872131).
The idea of the dual sparse decomposition arose from many in-depth discussions with Professor Henri Maître at Telecom ParisTech. The mathematical expressions in this paper were corrected by Professor Didier Le Ruyet at CNAM-Paris.
References
- [1] H. Maître, Le Traitement des Images. Paris, France: Lavoisier, 2003.
- [2] A. Buades, B. Coll, and J. Morel, “A review of image denoising algorithms, with a new one,” Multiscale Model. Simul., vol. 4, no. 2, pp. 490–530, 2005.
- [3] D. G. Manolakis, V. Ingle, and S. Kogon, Statistical and Adaptive Signal Processing. New York: McGraw-Hill, 2000.
- [4] T. Moon and W. Stirling, Mathematical Methods and Algorithms for Signal Processing. New Jersey: Prentice-Hall, 2000.
- [5] D. Donoho, I. M. Johnstone, G. Kerkyacharian, and D. Picard, “Wavelet shrinkage: asymptopia?” Journal of the Royal Statistical Society, ser. B, vol. 57, pp. 301–337, 1995.
- [6] H. Sun, H. Maître, and B. Guan, “Turbo image restoration,” in Proc. IEEE International Symposium on Signal Processing and Its Applications (ISSPA 2003), Paris, France, 2003, pp. 417–420.
- [7] A. Hyvarinen, P. Hoyer, and E. Oja, “Sparse code shrinkage for image denoising,” in Proc. IEEE International Joint Conference on Neural Networks, 1998, pp. 859–864.
- [8] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.
