Statistical learning of rational wavelet transform for natural images
Naushad Ansari, Anubha Gupta

TL;DR
This paper introduces a statistical learning method for rational wavelet transforms tailored for natural images, demonstrating improved performance in compressed sensing reconstruction over standard wavelet transforms.
Contribution
It proposes a novel Rational Wavelet Transform Learning in Statistical sense (RWLS) method using a lifting framework with a closed form solution.
Findings
RWLS outperforms standard dyadic wavelet transforms in image reconstruction
The method is effective for compressed sensing applications
Closed form solution simplifies the learning process
Abstract
Motivated by the concept of transform learning and the utility of the rational wavelet transform in audio and speech processing, this paper proposes Rational Wavelet Transform Learning in Statistical sense (RWLS) for natural images. The proposed RWLS design is carried out via the lifting framework and is shown to have a closed-form solution. The efficacy of the learned transform is demonstrated in compressed sensing (CS) based reconstruction. The learned RWLS is observed to perform better than the existing standard dyadic wavelet transforms.
**Index Terms:** Rational Wavelet, Statistically Matched Wavelet, Natural Images, Lifting Framework
1 Introduction
Transform learning (TL) is an active research area in which a sparsifying transform and the corresponding transform-domain signal are learned, under suitable constraints, for a class of signals. TL is currently used in several applications, including image/video denoising and MRI reconstruction [1, 2, 3]. Although TL is actively used, the underlying problem is non-convex and has no closed-form solution, which makes it difficult to solve. Hence, greedy algorithms are used to solve the TL problem.
Among existing transforms, the discrete wavelet transform (DWT) is widely used because it represents signals efficiently [4]. The non-uniqueness of the wavelet basis motivates learning the wavelet transform for a given application. Wavelet transform learning can be viewed as a specific case of transform learning. Since the integer translates of the associated wavelet filters form a basis of the $\ell^2$-space, wavelet transform learning corresponds to learning the wavelet filters.
Generally, the dyadic wavelet transform is used, which decomposes the input signal spectrum into two uniform frequency bands via a two-channel filterbank. The rational wavelet transform (RWT), on the other hand, provides a non-uniform frequency-band representation of the signal spectrum, which has been found useful in some applications [5, 6]. RWT has also been used in pattern recognition [7] and feature extraction [8]. Although methods for RWT design have been presented in the literature [9, 10, 11, 12], the designed wavelets are independent of the signal of interest. Recently, a method was proposed in [13] to learn a rational wavelet deterministically from a given signal. Since [13] requires the full signal, it cannot be used in inverse problems such as CS-based reconstruction, where one does not have access to the full original signal.
This paper proposes rational wavelet learning for natural images. It has been shown in [14] that natural images can be modeled as fractional Brownian motion (fBm) processes. fBm processes are Gaussian, non-stationary random processes with stationary increments; they form a class of statistically self-similar processes [15] and have been used widely in image processing [16, 17].
The above discussion of transform learning, the flexibility of the rational wavelet transform, and the modeling of natural images as fBm processes motivates us to learn the rational wavelet transform for natural images in a statistical sense. Specifically, the statistics of a set of natural images are used to learn a separable rational wavelet transform for this class of images. The proposed method, called RWLS, uses the lifting framework for rational wavelets introduced in [13]. The proposed formulation leads to a convex problem that can be solved by least squares, which makes the RWLS method computationally efficient.
Following are the salient contributions of this work:
1. Statistical learning of a separable rational wavelet transform for natural images is proposed.
2. The lifting framework, which is friendly to Digital Signal Processing (DSP) hardware, is used in the proposed method, making the learned transform easy to implement in hardware.
3. Unlike conventional TL, the proposed formulation leads to a convex problem that can be solved easily.
4. The proposed RWLS is applied in compressed sensing based reconstruction and is observed to perform better than the existing dyadic wavelet transforms.
2 Brief Background
2.1 Lifting in Dyadic Wavelet
Lifting methodology supports customized wavelet design [18], [19]. This design is modular, guarantees perfect reconstruction, and allows non-linear filters to be part of the wavelet structure. A general lifting scheme consists of three steps: Split, Predict, and Update (refer to Fig. 1). In the split step, the given input signal is divided into even and odd indexed samples. The corresponding filterbank structure is called the Lazy wavelet system [19] and is converted to a conventional wavelet system using successive predict and update stage filters, as shown in Fig. 2 together with the labels of the analysis and synthesis filters.
In the Predict Lifting step, odd samples are predicted from the neighboring even samples using the predict filter, or vice versa. This step modifies the analysis highpass filter and the synthesis lowpass filter as:
[TABLE]
[TABLE]
The Update Lifting step modifies the analysis lowpass filter and the synthesis highpass filter using the update step filter. The related equations are given as:
[TABLE]
[TABLE]
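As a rough illustration of the split, predict and update steps described above, the following Python sketch applies one lifting stage to a short signal and then inverts it. The two-tap average predictor and the quarter-weight update are assumptions borrowed from the familiar 5/3-style lifting scheme, not filters taken from this paper. The periodic boundary handling and the names `lift_forward` and `lift_inverse` are likewise hypothetical.

```python
def lift_forward(x):
    """One split/predict/update lifting stage on an even-length list."""
    even, odd = x[0::2], x[1::2]          # split step (Lazy wavelet)
    n = len(even)
    # predict step: estimate each odd sample from its two even neighbours
    # (periodic extension at the right edge is an assumption of this sketch)
    detail = [odd[i] - 0.5 * (even[i] + even[(i + 1) % n]) for i in range(n)]
    # update step: correct the even samples using the neighbouring details
    approx = [even[i] + 0.25 * (detail[i - 1] + detail[i]) for i in range(n)]
    return approx, detail


def lift_inverse(approx, detail):
    """Undo the update, then the predict, then merge even and odd samples."""
    n = len(approx)
    even = [approx[i] - 0.25 * (detail[i - 1] + detail[i]) for i in range(n)]
    odd = [detail[i] + 0.5 * (even[i] + even[(i + 1) % n]) for i in range(n)]
    x = [0.0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x
```

Because the inverse simply reverses each step with the same filters, reconstruction holds (up to floating-point rounding) for any choice of predict and update taps, which is the modularity property mentioned above.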
2.2 Rational Wavelet
Consider Fig. 3(a), which shows a 2-channel rational wavelet filterbank that can be converted into an equivalent uniformly decimated M-band structure (Fig. 3(b)). A pair of analysis filters of Fig. 3(b) can be written as a single equivalent filter of Fig. 3(a) using the following equation:
[TABLE]
Similarly, a pair of synthesis filters of Fig. 3(b) can be written as a single equivalent filter of Fig. 3(a) using the following equation:
[TABLE]
while the other filters remain the same.
2.3 Fractional Brownian Motion
Fractional Brownian motion is a Gaussian, zero-mean, self-similar, non-stationary random process with stationary increments [20]. The auto-covariance of the corresponding discrete-time process is given by:
[TABLE]
where H is the self-similarity index, also called the Hurst exponent. The statistical properties of fBm processes are completely characterized by the single parameter H, which can be estimated using the maximum likelihood estimation method presented in [21].
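For concreteness, the sketch below evaluates the standard fBm auto-covariance, (sigma^2 / 2)(|k1|^(2H) + |k2|^(2H) - |k1 - k2|^(2H)), at integer time indices. This is the textbook form; the scale factor `sigma2` and the function names are assumptions of this sketch, and the paper's own equation may use a different normalization. The second function expands the variance of an increment from the auto-covariance to show that it depends only on the lag, which is the stationary-increments property.

```python
def fbm_autocov(k1, k2, hurst, sigma2=1.0):
    """Textbook fBm auto-covariance at integer times k1 and k2."""
    if not 0.0 < hurst < 1.0:
        raise ValueError("Hurst exponent must lie in (0, 1)")
    p = 2.0 * hurst
    return 0.5 * sigma2 * (abs(k1) ** p + abs(k2) ** p - abs(k1 - k2) ** p)


def increment_variance(k, lag, hurst, sigma2=1.0):
    """Var(B[k + lag] - B[k]), expanded term by term from the auto-covariance."""
    return (fbm_autocov(k + lag, k + lag, hurst, sigma2)
            - 2.0 * fbm_autocov(k + lag, k, hurst, sigma2)
            + fbm_autocov(k, k, hurst, sigma2))
```

For H = 0.5 the auto-covariance reduces to sigma^2 times min(k1, k2), i.e. ordinary Brownian motion, and for any H the increment variance equals sigma^2 times lag^(2H) regardless of the starting index k.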
3 Proposed RWLS Learning Method
This section presents the proposed RWLS method for learning a rational wavelet that is statistically matched to natural images. A separable two-dimensional (2D) rational wavelet is learned, which requires learning two 1-D RWLS transforms, one matched to the row space and one matched to the column space of natural images. The proposed strategy is identical for the row space and the column space. For ease of exposition, we first consider the design for the column space.
3.1 Proposed Learning for the Column Space
Consider the initial architecture of a uniformly decimated 3-band Lazy wavelet with the following filters (in Fig. 3(b)):
[TABLE]
This Lazy wavelet is subsequently transformed to the equivalent rational wavelet via (5) and (6). On feeding the column-wise vectorized collection of natural images through this rational Lazy wavelet filterbank, the following approximation and detail subband coefficients are obtained:
[TABLE]
[TABLE]
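To make the Lazy-wavelet starting point concrete, the sketch below splits a signal into its three polyphase components, which is what a uniformly decimated 3-band Lazy filterbank produces, and then regroups them into two branches of unequal rate. Assigning two components to the approximation branch and one to the detail branch (rates 2/3 and 1/3) is an assumption made purely for illustration; the grouping actually used in the paper is the one defined by its Fig. 3. All function names are hypothetical.

```python
def lazy_three_band(x):
    """Polyphase components x[3n], x[3n+1], x[3n+2] of a list."""
    usable = len(x) - len(x) % 3          # drop a trailing partial group
    return [x[p:usable:3] for p in range(3)]


def lazy_rational_split(x):
    """Assumed 2/3-rate approximation branch and 1/3-rate detail branch."""
    c0, c1, c2 = lazy_three_band(x)
    approx = [s for pair in zip(c0, c1) for s in pair]   # interleave c0 and c1
    detail = list(c2)
    return approx, detail


def lazy_rational_merge(approx, detail):
    """Inverse of lazy_rational_split: re-interleave the two branches."""
    out = []
    for n, d in enumerate(detail):
        out.extend([approx[2 * n], approx[2 * n + 1], d])
    return out
```

Under this assumed grouping the two branches carry different numbers of samples per input block, which is why a rate converter is needed before one branch can be predicted from the other.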
Next, the lowpass and highpass filters of the Lazy rational wavelet structure are lifted via predict and update stage polynomials, which are learned as explained in the following subsections.
3.1.1 Predict stage
In the predict stage, one branch of samples must be predicted from the other branch. In the rational wavelet structure, this requires the rate converter proposed in [13], because the output sample rates of the two branches are unequal (refer to Fig. 4). Considering the predict polynomial filter as
[TABLE]
we obtain
[TABLE]
Thus, the choice of the predict filter in (11) allows the lower-branch samples to be exactly predicted from the neighboring samples. These updated detail coefficients can also be viewed as the error in predicting the lower-branch samples. Hence, the above expression is rewritten as
[TABLE]
The predict filter is learned by minimizing the mean squared prediction error (MSE) given by:
[TABLE]
where E[·] denotes the expectation operator.
To minimize the MSE, it is differentiated with respect to the predict filter coefficients and the derivative is equated to zero:
[TABLE]
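The condition above is a set of linear (normal) equations, which is why the predict filter has a closed form. A minimal sample-based sketch follows. It assumes a two-tap predictor that estimates `x[3n+2]` from `x[3n+1]` and `x[3n+3]`, and it replaces the expectations by sums over a given signal. The paper instead evaluates the expectations from the fBm model, and its predictor length and tap positions may differ. The function names are hypothetical.

```python
def learn_predict_taps(x):
    """Least-squares two-tap predictor of x[3n+2] from x[3n+1] and x[3n+3]."""
    r11 = r12 = r22 = p1 = p2 = 0.0
    for n in range((len(x) - 4) // 3 + 1):
        a, target, b = x[3 * n + 1], x[3 * n + 2], x[3 * n + 3]
        r11 += a * a          # entries of the 2x2 correlation matrix
        r12 += a * b
        r22 += b * b
        p1 += a * target      # cross-correlation with the sample to predict
        p2 += b * target
    det = r11 * r22 - r12 * r12
    if det == 0.0:
        raise ValueError("normal equations are singular for this signal")
    # Cramer's rule for [[r11, r12], [r12, r22]] @ [t1, t2] = [p1, p2]
    t1 = (p1 * r22 - p2 * r12) / det
    t2 = (r11 * p2 - r12 * p1) / det
    return t1, t2


def prediction_errors(x, taps):
    """Detail-like residuals left after applying the learned predictor."""
    t1, t2 = taps
    return [x[3 * n + 2] - t1 * x[3 * n + 1] - t2 * x[3 * n + 3]
            for n in range((len(x) - 4) // 3 + 1)]
```

On a linear ramp the learned taps are both 0.5 and the residuals vanish, since each sample is then the exact average of its two neighbours.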
Assuming that the input signal, corresponding to the column space of natural images, is an fBm process, the second-order statistics required above are computed using (7), and the resulting equation is solved for the predict filter. On simplifying the structure of Fig. 4, the updated equivalent analysis highpass filter, which uses the learned predict filter, can be written as:
[TABLE]
For the update of the corresponding synthesis lowpass filter, the rational wavelet structure is converted to the equivalent M-band structure, and the polyphase matrix of the analysis side is computed from the analysis filters. On applying the condition of perfect reconstruction [22] in (17), the polyphase matrix of the synthesis side is computed.
[TABLE]
where I denotes the identity matrix. From the synthesis polyphase matrix and (6), the updated synthesis lowpass filter of the rational wavelet is computed. This completes the predict stage.
3.1.2 Update Stage
Next, the update stage filter shown in Fig. 5 is learned. Again, a rate converter, shown in Fig. 5 and proposed in [13], is required. Since natural images are generally rich in low-frequency content, the signal reconstructed from the upper branch should be as close as possible to the input signal. This allows us to learn the update stage filter by minimizing the energy of the difference between the two signals, as below:
[TABLE]
The reconstructed signal can be written in terms of the update stage filter, which allows us to solve (18). Once the update stage filter is learned, the analysis lowpass filter is updated as:
[TABLE]
The synthesis highpass filter is updated using a method similar to the one used to update the synthesis lowpass filter. This completes the proposed learning. Since the lifting framework is modular, more predict and update stages can be appended to obtain longer filterbanks. Note that, to learn the RWT for the column space of natural images, we vectorize an ensemble of natural images column-wise and stack the columns below each other to build a 1-D signal. Next, we estimate the Hurst exponent H of this column vector and learn the RWT as presented above.
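The idea behind the update-stage learning can be illustrated in a simpler dyadic setting. The sketch below is an analogy, not the paper's rational structure: it assumes an even-length signal, a fixed two-tap average predictor, a single update tap `s`, and periodic boundaries, and it involves no rate converter. With the detail branch discarded at synthesis, the reconstruction error is linear in `s`, so the energy of the difference is minimized in closed form. All names are hypothetical.

```python
def _details(even, odd):
    """Assumed fixed predict step: odd sample minus average of even neighbours."""
    n = len(even)
    return [odd[i] - 0.5 * (even[i] + even[(i + 1) % n]) for i in range(n)]


def learn_update_tap(x):
    """Single update tap s minimising the lowpass-only reconstruction error."""
    even, odd = x[0::2], x[1::2]
    n = len(even)
    d = _details(even, odd)
    # With approx = even + s * d and details set to zero at synthesis:
    #   error at even positions: s * d[i]
    #   error at odd positions:  s * u[i] - d[i], u[i] = 0.5 * (d[i] + d[i+1])
    u = [0.5 * (d[i] + d[(i + 1) % n]) for i in range(n)]
    num = sum(di * ui for di, ui in zip(d, u))
    den = sum(di * di + ui * ui for di, ui in zip(d, u))
    return num / den if den else 0.0


def lowpass_only_error(x, s):
    """Energy of (signal rebuilt from the approximation branch alone) - x."""
    even, odd = x[0::2], x[1::2]
    n = len(even)
    d = _details(even, odd)
    approx = [even[i] + s * d[i] for i in range(n)]
    # synthesis with zero details: even samples = approx, odd = prediction
    odd_rec = [0.5 * (approx[i] + approx[(i + 1) % n]) for i in range(n)]
    return (sum((a - e) ** 2 for a, e in zip(approx, even))
            + sum((r - o) ** 2 for r, o in zip(odd_rec, odd)))
```

The cost is a quadratic in `s`, so the learned tap is its unique minimizer whenever the details are not all zero; a longer update filter would lead to a small linear system instead of a scalar division.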
3.2 Proposed Design for the Row Space
Corresponding to the row-space design, we vectorize all images row-wise and stack them to build a 1-D signal. Next, we estimate the Hurst exponent H of this row vector and learn the RWT using the method presented in the previous sub-section.
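The stacking described for the column and row spaces can be sketched as follows, with images represented as lists of rows. The Hurst estimate shown here is a crude increment-variance ratio, swapped in for brevity; it is not the maximum likelihood estimator of [21]. It relies on the fBm property that the mean squared increment at lag k scales as k^(2H). The jumps introduced where one column or row ends and the next begins will bias it, so it should be read only as an illustration. All function names are hypothetical.

```python
import math


def stack_columns(images):
    """Concatenate all columns of all images (lists of rows) into one 1-D list."""
    signal = []
    for img in images:
        for c in range(len(img[0])):
            signal.extend(row[c] for row in img)
    return signal


def stack_rows(images):
    """Concatenate all rows of all images into one 1-D list."""
    return [v for img in images for row in img for v in row]


def hurst_from_increments(signal):
    """Crude estimate: 0.5 * log2 of (lag-2 / lag-1) mean squared increments."""
    def mean_square_increment(lag):
        diffs = [signal[i + lag] - signal[i] for i in range(len(signal) - lag)]
        return sum(d * d for d in diffs) / len(diffs)
    ratio = mean_square_increment(2) / mean_square_increment(1)
    return 0.5 * math.log2(ratio)
```

The same estimator is applied to the column-stacked and the row-stacked signal, giving one Hurst exponent for each of the two 1-D designs.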
4 Application
The proposed RWLS method is applied to natural images as separable wavelets. The performance of the learned RWLS is compared with that of the standard bi-orthogonal 5/3 and 9/7 wavelets in compressive sensing based reconstruction of natural images. An ensemble of ten natural images, shown in Fig. 6, is used to learn the statistically matched rational wavelet structure for the row space and the column space of images. The value of the Hurst exponent is observed to lie between 0.5 and 1.0 for all ten images considered. Fig. 7 shows the frequency response of the analysis-side lowpass and highpass filters matched to the column space of natural images.
A Bernoulli measurement matrix is used in CS. Since it is computationally expensive to apply CS to large images, we use block CS [23], in which the image is sampled block by block. Recently, multilevel wavelet decomposition over an L-shaped pyramid (L-Pyramid) (Fig. 8(b)) was proposed in [24] and observed to perform better in CS applications than the existing multilevel regular pyramid (R-Pyramid) wavelet decomposition (Fig. 8(a)). We decompose our input images to 3 levels using this L-Pyramid wavelet decomposition in our experiments. Table 1 presents the reconstruction results in terms of PSNR (peak signal-to-noise ratio) for a range of sampling ratios, where the sampling ratio is the percentage of total samples measured.
From Table 1, we note that the performance of the proposed RWLS is superior to that of the standard wavelets on natural images (and comparable at the 90% sampling ratio for Img1 and Img4). Although image 'Img11' was not used in the ensemble used to learn the RWT, the learned RWLS also performs better on this image, indicating that the proposed learning indeed provides a statistically matched rational system for the class of natural images.
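The measurement and evaluation steps of this experiment can be sketched as below. The +1/-1 entries of the Bernoulli matrix, the absence of any scaling, the 8-bit peak value of 255 in the PSNR, and all function names are assumptions of this sketch. The reconstruction solver itself is omitted, since the text does not specify it.

```python
import math
import random


def bernoulli_matrix(m, n, seed=0):
    """m x n matrix with i.i.d. +1/-1 entries (values and scaling assumed)."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(m)]


def measure_block(block, sampling_ratio, seed=0):
    """Compressively sample one image block given as a list of rows."""
    x = [v for row in block for v in row]            # vectorize the block
    m = max(1, round(sampling_ratio * len(x)))       # number of measurements
    phi = bernoulli_matrix(m, len(x), seed)
    y = [sum(p * v for p, v in zip(row, x)) for row in phi]
    return y, phi


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size images."""
    ref = [v for row in reference for v in row]
    est = [v for row in estimate for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)
```

In block CS the same measurement matrix is typically reused for every block, which keeps storage small; here this corresponds to calling `measure_block` with the same `seed` for each block.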
5 Conclusion
A statistical learning method for the rational wavelet transform (RWLS) for natural images is presented in this work. Natural images are modeled as fBm processes, and their statistical properties are used to learn a separable rational wavelet transform. The lifting framework for rational wavelets is used, which provides a closed-form solution for learning and makes the method computationally efficient. The learned rational wavelet transform is tested in CS-based reconstruction of natural images and is observed to perform better than the existing standard bi-orthogonal wavelet transforms.
References
- [1] S. Ravishankar, B. Wen, and Y. Bresler, “Online sparsifying transform learning - Part I: Algorithms,” IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 4, pp. 625–636, 2015.
- [2] B. Wen, S. Ravishankar, and Y. Bresler, “Video denoising by online 3D sparsifying transform learning,” in Image Processing (ICIP), 2015 IEEE International Conference on. IEEE, 2015, pp. 118–122.
- [3] S. Ravishankar and Y. Bresler, “Efficient blind compressed sensing using sparsifying transforms with convergence guarantees and application to magnetic resonance imaging,” SIAM Journal on Imaging Sciences, vol. 8, no. 4, pp. 2519–2557, 2015.
- [4] S. Mallat, A wavelet tour of signal processing. Academic Press, 1999.
- [5] T. Blu, “An iterated rational filter bank for audio coding,” in Time-Frequency and Time-Scale Analysis, 1996., Proceedings of the IEEE-SP International Symposium on. IEEE, 1996, pp. 81–84.
- [6] T. Blu, “Iterated filter banks with rational rate changes connection with discrete wavelet transforms,” Signal Processing, IEEE Transactions on, vol. 41, no. 12, pp. 3232–3244, 1993.
- [7] O. Chertov, V. Malchykov, and D. Pavlov, “Non-dyadic wavelets for detection of some click-fraud attacks,” in Signals and Electronic Systems (ICSES), 2010 International Conference on. IEEE, 2010, pp. 401–404.
- [8] T.-T. Le, M. Ziebarth, T. Greiner, and M. Heizmann, “Optimized size-adaptive feature extraction based on content-matched rational wavelet filters,” in Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European. IEEE, 2014, pp. 1672–1676.
