S3: A Spectral-Spatial Structure Loss for Pan-Sharpening Networks

Jae-Seok Choi; Yongwoo Kim; Munchurl Kim

arXiv:1906.05480·cs.CV·May 20, 2020

TL;DR

This paper introduces the S3 loss, a novel spectral-spatial structure loss function that improves the quality of pan-sharpened satellite images by reducing artifacts caused by misalignments.

Contribution

The paper proposes the S3 loss function, enhancing CNN-based pan-sharpening by effectively handling misalignments and improving visual quality of the output images.

Findings

01. Significant reduction of artifacts in pan-sharpened images
02. Improved visual quality across various CNN architectures
03. Effective handling of sensor misalignments

Tables

Table I: Average quality metric scores for 100 result images at the original scale on the WorldView-3 test dataset.

Method	Avg. ERGAS₁	Avg. SCC₁	Avg. SCC₀	Avg. n-ERGAS₁
Bicubic	0.6818 ± 0.0401	0.8577 ± 0.0089	0.4826 ± 0.0126	0.6818 ± 0.0401
Provided PS	3.6845 ± 0.2036	0.9647 ± 0.0014	0.9679 ± 0.0012	3.3506 ± 0.1849
PanNet [16]	0.4360 ± 0.0254	0.8468 ± 0.0094	0.7367 ± 0.0092	0.4360 ± 0.0254
PanNet-S3 (Ours)	3.0647 ± 0.1830	0.9530 ± 0.0015	0.9568 ± 0.0012	2.6465 ± 0.1533
BDPN [20]	1.3695 ± 0.0779	0.8836 ± 0.0069	0.9051 ± 0.0039	1.3691 ± 0.0779
BDPN-S3 (Ours)	3.3380 ± 0.2043	0.9563 ± 0.0015	0.9580 ± 0.0012	2.8221 ± 0.1681
DSen2 [19]	0.4278 ± 0.0258	0.8508 ± 0.0088	0.6485 ± 0.0142	0.4278 ± 0.0258
DSen2-S3 (Ours)	3.1800 ± 0.1929	0.9536 ± 0.0015	0.9539 ± 0.0013	2.6942 ± 0.1575

Equations

$$\mathbf{G}_{1} = g(\mathbf{M}_{2}, \mathbf{P}_{1}, \theta)$$

$$\theta^{*} = \arg\min_{\theta} \sum \left\| g(\mathbf{M}_{2}, \mathbf{P}_{1}, \theta) - \mathbf{M}_{1} \right\|_{2}^{2}$$

$$\mathrm{cov}(\hat{\mathbf{M}}_{1}, \mathbf{P}_{1})$$

$$\mathrm{std}(\hat{\mathbf{M}}_{1})$$

$$\mathrm{std}(\mathbf{P}_{1})$$

$$\mathrm{corr}(\hat{\mathbf{M}}_{1}, \mathbf{P}_{1})$$

$$\mathbf{S}$$

$$L_{c} = \sum \left\| (\mathbf{G}_{1} - \mathbf{M}_{1}) \odot \mathbf{S} \right\|_{1}^{1}$$

$$L_{a} = \sum \left\| \left( grad(\hat{\mathbf{G}}_{1}) - grad(\mathbf{P}_{1}) \right) \odot (2 - \mathbf{S}) \right\|_{1}^{1}$$

$$grad(\mathbf{X}) = \frac{\mathbf{X} - m(\mathbf{X})}{\mathrm{std}(\mathbf{X})}$$

$$L_{S3} = L_{c} + w_{a} L_{a}$$

Full text

S3: A Spectral-Spatial Structure Loss for Pan-Sharpening Networks

Jae-Seok Choi, Yongwoo Kim, and Munchurl Kim

This paper was submitted for review on Apr. 9, 2019 (Corresponding author: Munchurl Kim). This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (No. 2017R1A2A2A05001476). We thank Korea Aerospace Research Institute for providing the KOMPSAT-3A satellite dataset for our experiments. J.-S. Choi and M. Kim are with the School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, South Korea (e-mail: [email protected]; [email protected]). Y. Kim is with the Artificial Intelligence Research Division, Korea Aerospace Research Institute, Daejeon 34133, South Korea (e-mail: [email protected]).

Abstract

Recently, many deep-learning-based pan-sharpening methods have been proposed for generating high-quality pan-sharpened (PS) satellite images. These methods focused on various types of convolutional neural network (CNN) structures, which were trained by simply minimizing a spectral loss between network outputs and the corresponding high-resolution multi-spectral (MS) target images. However, due to different sensor characteristics and acquisition times, high-resolution panchromatic (PAN) and low-resolution MS image pairs tend to have large pixel misalignments, especially for moving objects in the images. Conventional CNNs trained with only the spectral loss with these satellite image datasets often produce PS images of low visual quality including double-edge artifacts along strong edges and ghosting artifacts on moving objects. In this letter, we propose a novel loss function, called a spectral-spatial structure (S3) loss, based on the correlation maps between MS targets and PAN inputs. Our proposed S3 loss can be very effectively utilized for pan-sharpening with various types of CNN structures, resulting in significant visual improvements on PS images with suppressed artifacts.

Index Terms:

Convolutional neural network (CNN), deep learning, pan sharpening, pan colorization, satellite imagery, spectral spatial structure, super resolution (SR).

I Introduction

Due to sensor resolution constraints and bandwidth limitations, satellites often acquire images of the same target area at several resolutions and in several spectral bands. In general, satellite imagery comes as pairs of a low-resolution (LR) multi-spectral (MS) image with a longer ground sample distance (GSD) and a high-resolution (HR) panchromatic (PAN) image with a shorter GSD. By extracting high-quality spatial structures from a PAN image and multi-spectral information from an MS image, one can generate a pan-sharpened (PS) image which has the same GSD as the PAN image but carries the spectral information of the MS image. This is known as pan-sharpening or pan-colorization.

I-A Related Works

Traditional pan-sharpening methods [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] include component substitution [1, 2, 3, 4, 5], multiresolution analysis [6, 7] and machine learning [8, 9, 10]. Component substitution and multiresolution analysis based approaches were compared thoroughly in [11]. Component substitution based methods often incorporated the Brovey transform (BT) [1], the intensity-hue-saturation transform [2], principal component analysis (PCA) [3], or matting models [4] for pan-sharpening. In multiresolution analysis based methods, the spatial structures of PAN images are decomposed using wavelet [6] or undecimated wavelet [7] decomposition techniques, and are fused with up-sampled MS images to produce PS images. These methods have relatively low computational complexity but tend to produce PS images with mismatched spectral information. Machine-learning based methods [8, 9, 10] learn pan-sharpening models by optimizing a loss function of inputs and targets with some regularization terms.

With the advent of deep learning, recent pan-sharpening methods [15, 16, 17, 18, 19, 20] started to incorporate various types of convolutional neural network (CNN) structures and show large quality improvements over traditional pan-sharpening methods. Most of these CNN-based pan-sharpening methods utilized network structures that had proven effective in classification [21, 22] and super-resolution (SR) [23, 24, 25] tasks. As the goal of pan-sharpening is to increase the resolution of MS inputs, many conventional CNN-based pan-sharpening methods employed network structures from previous CNN-based SR methods [23, 24, 25]. Pan-sharpening CNN (PNN) [15] is known as the first method to apply a CNN to pan-sharpening. The PNN used a shallow 3-layer network adopted from SRCNN [23], which is the first CNN-based SR method. The PNN was trained and tested on the Ikonos, GeoEye-1 and WorldView-2 satellite image datasets. Inspired by the success of ResNet [21] in classification, PanNet [16] incorporated the ResNet architecture with a smaller number of filter parameters to perform pan-sharpening. Lanaras et al. [19] employed the state-of-the-art SR network, EDSR [25], and proposed a moderately deep network version (DSen2) and a very deep network version (VDSen2) for pan-sharpening. Recently, a bidirectional pyramid network (BDPN) [20] has been proposed, using deep and shallow networks for PAN and MS inputs separately.

I-B Our Contributions

The state-of-the-art CNN-based pan-sharpening methods, PanNet [16], DSen2 [19] and BDPN [20], were trained using a simple spectral loss function that minimizes the reconstruction error between generated images and MS target images. As a result, their PS images often suffer from visually unpleasant artifacts along building edges and on moving cars, particularly for datasets of shorter GSD such as WorldView-3. This is because, as the GSD becomes smaller, pixel misalignments between PAN and MS inputs tend to get larger due to the inevitable acquisition time difference and mosaicked sensor arrays. In such scenarios, the spectral loss between network outputs and MS target images is insufficient for training, resulting in PS images of low visual quality.

In this letter, we propose a novel loss term, called a spectral-spatial structure (S3) loss, which can be effectively utilized for training of pan-sharpening CNNs to learn spectral information of MS targets while preserving the spatial structure of PAN inputs. Our S3 loss consists of two loss functions: a spectral loss between network outputs and MS targets, and a spatial loss between network outputs and PAN inputs. Here, both spectral and spatial losses are computed based on the correlation maps between MS targets and PAN inputs. The spectral loss is selectively applied for the areas where averaged MS targets and PAN inputs are highly correlated. The spatial loss only considers gradient maps of generated images (network output) and PAN inputs. In doing so, our network using the S3 loss can generate PS images where double-edge artifacts and ghosting artifacts on moving cars are significantly reduced. Finally, we show that our S3 loss can effectively work with various pan-sharpening CNNs. Fig. 1 shows a CNN-based pan-sharpening architecture with our proposed S3 loss.

II Proposed Method

II-A Formulations

Most satellite imagery datasets include PAN images of higher resolution (smaller GSD) $\mathbf{P}_{0}$ , and the corresponding MS images of lower resolution (larger GSD) $\mathbf{M}_{1}$ . Here, the subscripts of $\mathbf{P}_{0}$ and $\mathbf{M}_{1}$ denote the resolution level, where a smaller number means a higher resolution. We consider two scenarios in terms of scale (resolution): (i) Our final goal in pan-sharpening is to use both $\mathbf{P}_{0}$ and $\mathbf{M}_{1}$ as inputs to generate a high-quality PS image $\mathbf{G}_{0}$ , which has the same resolution as $\mathbf{P}_{0}$ while preserving the spectral information of $\mathbf{M}_{1}$ . This case corresponds to the original scale scenario in [16, 19]; (ii) A pan-sharpening model must be trained using input and target pairs. For target images, we use $\mathbf{M}_{1}$ . For input images, we use $\mathbf{M}_{2}$ and $\mathbf{P}_{1}$ , which are down-scaled versions of $\mathbf{M}_{1}$ and $\mathbf{P}_{0}$ respectively, obtained with a degradation model [19]. The pan-sharpening CNN takes $\mathbf{M}_{2}$ and $\mathbf{P}_{1}$ as inputs, and generates $\mathbf{G}_{1}$ . This case corresponds to the lower scale scenario in [16, 19]. In summary, the pan-sharpening networks are trained under the lower scale scenario and tested under the original scale scenario. Accordingly, the conventional pan-sharpening networks were trained by simply minimizing a spectral loss between network outputs $\mathbf{G}_{1}$ and MS targets $\mathbf{M}_{1}$ under the lower scale scenario.
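The lower scale scenario can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' pipeline: the degradation model of [19] is not reproduced in this letter, so a plain block average (the hypothetical helper `box_downscale`) stands in for it, and the scale ratio of 4 is an assumption taken from the 0.3 m and 1.2 m GSDs of the WorldView-3 data used in the experiments.

```python
import numpy as np

def box_downscale(img: np.ndarray, factor: int) -> np.ndarray:
    """Average non-overlapping factor x factor blocks.

    Assumption: a stand-in for the degradation model of [19].
    """
    h = img.shape[0] - img.shape[0] % factor
    w = img.shape[1] - img.shape[1] % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor, *img.shape[2:])
    return blocks.mean(axis=(1, 3))

def make_training_pair(m1: np.ndarray, p0: np.ndarray, ratio: int = 4):
    """Return ((M2, P1), M1): the network inputs and the MS target."""
    m2 = box_downscale(m1, ratio)  # MS input, one level coarser than the target
    p1 = box_downscale(p0, ratio)  # PAN input, brought to the target's resolution
    return (m2, p1), m1

rng = np.random.default_rng(0)
m1 = rng.random((32, 32, 3))   # MS image at its native resolution
p0 = rng.random((128, 128))    # PAN image at its native resolution
(m2, p1), target = make_training_pair(m1, p0)
print(m2.shape, p1.shape, target.shape)  # (8, 8, 3) (32, 32) (32, 32, 3)
```

After down-scaling, $\mathbf{P}_{1}$ and the target $\mathbf{M}_{1}$ share the same pixel grid, which is what allows a loss to compare the network output with both of them.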

II-B Proposed S3 Loss

We now define our spectral-spatial structure (S3) loss, which can be used for training any pan-sharpening CNN to yield high-quality PS images $\mathbf{G}_{1}$ , and ultimately $\mathbf{G}_{0}$ . First, we define our feedforward pan-sharpening operation as

$$\mathbf{G}_{1} = g(\mathbf{M}_{2}, \mathbf{P}_{1}, \theta), \tag{1}$$

where $g$ is a pan-sharpening CNN with filter parameters $\theta$ . The conventional methods [16, 19] use the L2 loss as

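$$\theta^{*} = \arg\min_{\theta} \sum \left\| g(\mathbf{M}_{2}, \mathbf{P}_{1}, \theta) - \mathbf{M}_{1} \right\|_{2}^{2}. \tag{2}$$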

However, using this loss function alone for training often leads to artifacts in the resulting images $\mathbf{G}_{1}$ , due to inherent misalignments between $\mathbf{M}_{1}$ and $\mathbf{P}_{1}$ . To overcome this limitation, we propose the S3 loss, which consists of two loss functions: a spectral loss between $\mathbf{G}_{1}$ and $\mathbf{M}_{1}$ ; and a spatial loss between $\mathbf{G}_{1}$ and $\mathbf{P}_{1}$ . First, the spectral loss penalizes spectral distortion more heavily in the areas where grayed $\mathbf{M}_{1}$ (denoted as $\hat{\mathbf{M}}_{1}$ ) and $\mathbf{P}_{1}$ are highly correlated. The correlation map $\mathbf{S}$ can be formulated as

[TABLE]

where $m$ is a mean filter, $\odot$ denotes an element-wise multiplication, $\gamma$ is a control parameter, and $e$ is a very small value, i.e. $10^{-10}$ . We use a 31 $\times$ 31 box filter for $m$ . We empirically set $\gamma$ to 4. Using $\mathbf{S}$ , our spectral loss $L_{c}$ is then defined as

$$L_{c} = \sum \left\| (\mathbf{G}_{1} - \mathbf{M}_{1}) \odot \mathbf{S} \right\|_{1}^{1}. \tag{8}$$

Here, we try to minimize the spectral loss between $\mathbf{G}_{1}$ and $\mathbf{M}_{1}$ only for pixel areas where $\hat{\mathbf{M}}_{1}$ and $\mathbf{P}_{1}$ have a large positive or negative correlation. Note that $\mathbf{S}$ is not trainable.
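A minimal NumPy sketch of the correlation map and the spectral loss is given below. The right-hand sides are assumptions built from the quantities named above (local covariance, local standard deviations, their ratio, and an absolute value raised to $\gamma$), not the exact formulas of this letter. In particular, the placement of $e$, the edge padding of the box filter, the clipping to $[0, 1]$ and the band average used for graying are illustrative choices. The helper names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(x: np.ndarray, size: int = 31) -> np.ndarray:
    """Mean filter m(.) over a size x size window (edge padding is an assumption)."""
    pad = size // 2
    padded = np.pad(x, pad, mode="edge")
    return sliding_window_view(padded, (size, size)).mean(axis=(-2, -1))

def correlation_map(m1, p1, size=31, gamma=4.0, eps=1e-10):
    """S = |local correlation of grayed M1 and P1| ** gamma (assumed form)."""
    m1_gray = m1.mean(axis=-1)                      # grayed M1: average over bands
    mu_m, mu_p = box_mean(m1_gray, size), box_mean(p1, size)
    cov = box_mean(m1_gray * p1, size) - mu_m * mu_p
    var_m = np.maximum(box_mean(m1_gray * m1_gray, size) - mu_m ** 2, 0.0)
    var_p = np.maximum(box_mean(p1 * p1, size) - mu_p ** 2, 0.0)
    corr = cov / (np.sqrt(var_m * var_p) + eps)     # eps guards flat regions
    return np.clip(np.abs(corr), 0.0, 1.0) ** gamma

def spectral_loss(g1, m1, s):
    """L_c: L1 difference between G1 and M1, weighted per pixel by S."""
    return float(np.sum(np.abs(g1 - m1) * s[..., None]))

rng = np.random.default_rng(0)
p = rng.random((24, 24))
m = np.stack([2 * p + 0.1] * 3, axis=-1)            # MS target perfectly aligned with PAN
s = correlation_map(m, p, size=5)                   # small window keeps the demo fast
print(s.min(), spectral_loss(m + 0.5, m, s))
```

Because the map uses the magnitude of the correlation, regions where the MS and PAN intensities are inversely related are weighted as strongly as positively correlated ones.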

For our spatial loss $L_{a}$ , we try to minimize the difference between the gradient map of grayed $\mathbf{G}_{1}$ (denoted as $\hat{\mathbf{G}}_{1}$ ) and that of $\mathbf{P}_{1}$ , which is formulated as

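$$L_{a} = \sum \left\| \left( grad(\hat{\mathbf{G}}_{1}) - grad(\mathbf{P}_{1}) \right) \odot (2 - \mathbf{S}) \right\|_{1}^{1}, \tag{9}$$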

where $grad$ for $\mathbf{X}$ is a function defined as

$$grad(\mathbf{X}) = \frac{\mathbf{X} - m(\mathbf{X})}{\mathrm{std}(\mathbf{X})}. \tag{10}$$

We incorporated $(2-\mathbf{S})$ into $L_{a}$ in (9), so that $L_{a}$ focuses more on those areas where $L_{c}$ is less focused. Finally, combining $L_{c}$ and $L_{a}$ , we have our final S3 loss $L_{S3}$ as

$$L_{S3} = L_{c} + w_{a} L_{a}, \tag{11}$$

where $w_{a}$ is a weighting value. We empirically set $w_{a}$ to 1.
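The spatial loss and the combined S3 loss can be sketched in the same style. This is an illustration under stated assumptions: $\mathrm{std}$ is taken to be a local standard deviation computed with the same box filter $m$, the window size for $grad$ is assumed to match the one used for $\mathbf{S}$, a small `eps` is added to the denominator for numerical safety, and graying is a band average. None of these details are confirmed by the text above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(x, size=31):
    """Mean filter m(.) over a size x size window (edge padding is an assumption)."""
    pad = size // 2
    return sliding_window_view(np.pad(x, pad, mode="edge"), (size, size)).mean(axis=(-2, -1))

def grad(x, size=31, eps=1e-10):
    """grad(X) = (X - m(X)) / std(X), with a local std and an added eps (assumptions)."""
    mu = box_mean(x, size)
    var = np.maximum(box_mean(x * x, size) - mu ** 2, 0.0)
    return (x - mu) / (np.sqrt(var) + eps)

def spatial_loss(g1, p1, s, size=31):
    """L_a: L1 difference of normalized structure maps, weighted by (2 - S)."""
    g1_gray = g1.mean(axis=-1)                      # grayed G1: average over bands
    return float(np.sum(np.abs(grad(g1_gray, size) - grad(p1, size)) * (2.0 - s)))

def s3_loss(g1, m1, p1, s, w_a=1.0, size=31):
    """L_S3 = L_c + w_a * L_a, with S treated as a fixed (non-trainable) weight map."""
    l_c = float(np.sum(np.abs(g1 - m1) * s[..., None]))
    return l_c + w_a * spatial_loss(g1, p1, s, size)

rng = np.random.default_rng(1)
p = rng.random((16, 16))
g = np.stack([p, p, p], axis=-1)                    # output whose structure matches PAN
ones = np.ones((16, 16))                            # S = 1 disables the correlation weighting
print(s3_loss(g + 0.25, g, p, ones, size=5))        # dominated by the spectral term
```

Since $grad$ subtracts the local mean and divides by the local standard deviation, it is insensitive to local brightness and contrast changes, so $L_{a}$ compares structure rather than intensity.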

To show the effectiveness of our S3 loss, we incorporated it into the state-of-the-art pan-sharpening networks PanNet [16], BDPN [20] and DSen2 [19], which are named PanNet-S3, BDPN-S3 and DSen2-S3 in our experiments. The DSen2 network has 14 convolutional layers with 128 channels, having about 1.8M filter parameters. PanNet has 10 layers with 76K parameters, while BDPN has 46 layers with 1.4M parameters. As for our PanNet-S3 and BDPN-S3, full data of MS-PAN inputs were concatenated and used as the network input.

III Experiment Results and Discussions

III-A Experiment Settings

III-A1 Datasets

All the networks, including ours and the baselines, were trained and tested on the WorldView-3 satellite image dataset, whose PAN images are of about 0.3 m GSD and MS images are of about 1.2 m GSD. PS images of 0.3 m GSD are also provided in the dataset, but they are used only for visual comparison with our results. Note that the WorldView-3 satellite image dataset has the shortest GSD (highest resolution) among the aforementioned datasets. We used the WorldView-3 satellite images from the SpaceNet Challenge dataset [26]. The RGB channels of the MS images were used for all experiments. A total of 13K MS-PAN image pairs were used for training the networks, where cropping and various data augmentations were conducted on the fly during training. The MS-PAN training subimages were created by applying the down-scaling method in [19]. The cropped MS subimages used for training are of size 32 $\times$ 32, while the PAN subimages are of size 128 $\times$ 128. Before being fed into the networks, the training image pairs were normalized to the range between 0 and 1. Training was done in the lower scale scenario.

III-A2 Training

We trained all the networks using the decoupled AdamW optimizer [27] with an initial learning rate of $10^{-4}$ , an initial weight decay of $10^{-7}$ , and the other hyper-parameters at their defaults. The mini-batch size was set to 2. We employed the uniform weight initialization technique in [28]. All the networks, including our proposed networks, were implemented using TensorFlow [29], and were trained and tested on an Nvidia Titan Xp GPU. The networks were trained for a total of $10^{6}$ iterations, where the learning rate and weight decay were lowered by a factor of 10 after $5\times 10^{5}$ iterations. For our PanNet-S3, the initial learning rate and weight decay were set to $5\times 10^{-4}$ and $10^{-8}$ , respectively. For our BDPN-S3, we used $10^{-8}$ for the hyper-parameter $e$ in our S3 loss, and $w_{a}$ in the S3 loss was empirically set to 2.
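The step schedule described above can be written as a small framework-independent helper. The function name `step_decay` and the dictionary layout are hypothetical, and whether the drop takes effect exactly at iteration $5\times 10^{5}$ or one step later is an assumption. The numeric values are the defaults quoted in this section.

```python
def step_decay(value: float, iteration: int, drop_at: int = 500_000, factor: float = 10.0) -> float:
    """Return the value in effect at `iteration`: a single drop by `factor` at `drop_at`."""
    return value / factor if iteration >= drop_at else value

# Default settings quoted above (PanNet-S3 and BDPN-S3 override some of them).
BASE = {"learning_rate": 1e-4, "weight_decay": 1e-7, "batch_size": 2, "iterations": 1_000_000}

for it in (0, 499_999, 500_000, 999_999):
    lr = step_decay(BASE["learning_rate"], it)
    wd = step_decay(BASE["weight_decay"], it)
    print(it, lr, wd)
```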

III-B Comparisons and Discussions

We now compare our proposed methods using the S3 loss, with the conventional pan-sharpening methods including bicubic, PS images provided from the WorldView-3 dataset, PanNet [16], BDPN [20] and DSen2 [19]. We implemented PanNet, BDPN and DSen2 according to their technical descriptions, and trained them on the WorldView-3 dataset. At testing, for MS input images with a size of 160 $\times$ 160, average computation time for our DSen2-S3 on GPU is about 2 sec per image.

As in [11, 16], we use two popular metrics: Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS) [30] for measuring spectral distortion, and spatial correlation coefficient (SCC) [31] for measuring spatial distortion. Lower is better for ERGAS, whereas higher is better for SCC. Note that in the original scale scenario, there are no ground truth PS images for comparison. Therefore, in this letter, given a PS output at the original scale, SCC0 between the network output and PAN input, ERGAS1 between a down-scaled network output and its MS input, and SCC1 between the down-scaled network output and down-scaled PAN input were computed.
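For reference, the commonly used definition of ERGAS can be sketched as below: 100 times the resolution ratio $h/l$ times the root of the band-averaged squared relative RMSE. The choice `ratio=4` (so $h/l = 1/4$) is an assumption based on the 0.3 m and 1.2 m GSDs. The exact constants behind the scores in Table I are not stated here, so absolute values from this sketch need not match the table.

```python
import numpy as np

def ergas(reference: np.ndarray, estimate: np.ndarray, ratio: int = 4) -> float:
    """ERGAS = 100 * (h/l) * sqrt(mean_k(RMSE_k^2 / mu_k^2)); lower is better.

    reference, estimate: (H, W, C) arrays on the same pixel grid.
    ratio: l/h, the low-to-high resolution pixel-size ratio (assumed to be 4).
    """
    ref = reference.astype(np.float64)
    est = estimate.astype(np.float64)
    mse = ((ref - est) ** 2).mean(axis=(0, 1))      # per-band mean squared error
    mean_sq = ref.mean(axis=(0, 1)) ** 2            # per-band squared mean of the reference
    return float(100.0 / ratio * np.sqrt(np.mean(mse / mean_sq)))

ref = np.ones((8, 8, 3))
print(ergas(ref, ref))          # identical images give 0.0
print(ergas(ref, ref + 0.1))    # relative RMSE of 0.1 per band gives about 2.5
```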

Table I shows average quality metric scores (with standard errors), ERGAS1, SCC1 and SCC0 for PS results at the PAN resolution (0.3 $m$ GSD). Here, 100 MS-PAN pairs from the WorldView-3 satellite test dataset were selected for testing in the original scale scenario. As shown in Table I, the PS results by PanNet, BDPN and DSen2 have lower ERGAS values, showing lower spectral distortion, but lower SCC values, indicating higher spatial distortion. On the other hand, our methods with the S3 loss generated the PS images with much higher SCC values, but with slightly higher spectral distortion. Note that, since ERGAS simply computes the score values of spectral distortion between MS and PAN test image pairs that are often misaligned with unknown magnitudes and directions, it may not be effective in measuring the distortions for misaligned MS-PAN pairs.

Here, we additionally propose a more effective spectral distortion metric, called n-ERGAS, which is a simple variant of ERGAS inspired by an evaluation method used in the NTIRE 2018 Super-Resolution Challenge [32] for misaligned input-target pairs. In this challenge, input images were randomly translated from the corresponding target images, and a new metric was used for evaluation. As for our n-ERGAS, once we obtain a PS result image, $\pm$ 6-pixel translations are applied to obtain 144 translated PS images. Next, multiple ERGAS scores are computed using down-scaled versions of these translated images and an MS input, and the most favorable (smallest) ERGAS score is selected as the final ERGAS score for evaluation. As methods using our proposed S3 loss can reconstruct spectral information of misaligned MS on spatially correlated areas with PAN, their n-ERGAS scores should be lower (thus better) than the corresponding ERGAS scores. The n-ERGAS1 scores for the various methods are presented in Table I. As shown, all our methods (PanNet-S3, BDPN-S3 and DSen2-S3) have lower n-ERGAS scores compared to the corresponding ERGAS scores, while almost no difference is observed between n-ERGAS and ERGAS scores for the baselines (PanNet [16], BDPN [20] and DSen2 [19]). This indicates that our S3 loss indeed tries to minimize the spectral distortion more on spatially correlated areas with PAN, demonstrating the effectiveness of using our S3 loss for misaligned MS-PAN images.
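The n-ERGAS procedure can be sketched as a minimum over shifted crops. Several details are assumptions: the 144 translations are taken to be the 12 $\times$ 12 integer offsets in $[-6, 5]$, a block average stands in for the down-scaling, and a fixed border margin is cropped so that every shift compares the same MS region. The helper names and the `margin` parameter are hypothetical.

```python
import numpy as np

def box_downscale(img, factor):
    """Block-average down-scaling (a stand-in for the actual degradation model)."""
    h, w = img.shape[0] // factor, img.shape[1] // factor
    return img[:h * factor, :w * factor].reshape(h, factor, w, factor, -1).mean(axis=(1, 3))

def ergas(reference, estimate, ratio=4):
    """Standard ERGAS between two (H, W, C) images on the same grid."""
    mse = ((reference - estimate) ** 2).mean(axis=(0, 1))
    return float(100.0 / ratio * np.sqrt(np.mean(mse / reference.mean(axis=(0, 1)) ** 2)))

def n_ergas(ps, ms, ratio=4, shifts=range(-6, 6), margin=8):
    """Smallest ERGAS over translated copies of the PS image.

    ps: (H, W, C) pan-sharpened output; ms: (H/ratio, W/ratio, C) MS input.
    margin: border cropped at PS scale; a multiple of ratio, at least the largest |shift|.
    """
    h, w = ps.shape[:2]
    m = margin // ratio
    ms_crop = ms[m:ms.shape[0] - m, m:ms.shape[1] - m]
    best = np.inf
    for dy in shifts:
        for dx in shifts:
            window = ps[margin + dy:h - margin + dy, margin + dx:w - margin + dx]
            best = min(best, ergas(ms_crop, box_downscale(window, ratio), ratio))
    return best

rng = np.random.default_rng(2)
ps = rng.random((48, 48, 3)) + 1.0
ms = box_downscale(np.roll(ps, (3, -2), axis=(0, 1)), 4)   # MS misaligned by (3, -2) pixels
print(ergas(ms, box_downscale(ps, 4)), n_ergas(ps, ms))
```

In this toy case the plain ERGAS is penalized by the misalignment, while the shift search finds the offset that re-aligns the two images and the score drops to nearly zero.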

We now visually compare several pan-sharpening methods including ours. Fig. 2 shows PS images for various methods on WV3. First, the PS images provided with the dataset show high spectral distortion, with a blue glow around cars. Since they were trained using a simple loss between network outputs and MS targets, PanNet, BDPN and DSen2 tend to perform poorly on misaligned MS-PAN test inputs, creating unpleasant artifacts around strong edges and moving objects in the PS images. On the other hand, our method using the proposed S3 loss can reconstruct PS images with highly sharpened edges, rooftops, roads and cars with far fewer artifacts, visually outperforming the conventional methods. However, some spectral artifacts are slightly visible around cars, indicating that there is still room for improvement. Nevertheless, the results using the conventional PanNet, BDPN and DSen2 methods still suffer from ghosting and double-edge artifacts, degrading the overall visual quality. This confirms that our proposed S3 loss can be used with various networks to generate PS images with higher visual quality and fewer artifacts, compared to their baselines.

Moreover, we conducted experiments using two additional satellite datasets: the WorldView-2 (WV2) dataset and the KOMPSAT-3A (K3A) dataset. The WV2 dataset is of 11 bits per pixel, and includes PAN images of 0.5 m GSD and MS images of 2.0 m GSD. The K3A dataset is of 14 bits per pixel, and includes PAN images of 0.7 m GSD and MS images of 2.8 m GSD. Figs. 3 and 4 show pan-sharpening results at the original scale using various methods on the WV2 dataset and the K3A dataset, respectively. As shown, similar to the experiment results using the WV3 dataset, PS images using our DSen2-S3 method trained with WV2 have a slightly higher spectral distortion compared to MS inputs (higher ERGAS), but their spatial details are much more similar to those of PAN inputs (higher SCC). This implies that our S3 loss is effective and robust for different types of satellite datasets.

We now present experiment results to show the effectiveness of using the correlation map in our S3 loss. Here, we set $\mathbf{S}=\mathbf{1}$ , so that the correlation map was not used in training. Fig. 5 shows pan-sharpening results at the original scale on the WorldView-3 test dataset for our DSen2-S3 without using the correlation maps $\mathbf{S}$ . As shown, simply adding the spatial loss regarding PAN inputs would not be able to overcome artifacts, and much more spectral distortions are visible around moving cars if we do not incorporate the correlation map into our S3 loss. Therefore, we can confirm that the correlation map plays an important role in our S3 loss.

IV Conclusion

We proposed a novel spectral-spatial structure (S3) loss that can be effectively applied to CNN-based pan-sharpening methods. Our S3 loss jointly measures spectral, spatial and structural distortions, so that CNN-based pan-sharpening networks can be effectively trained to generate highly detailed PS images with fewer artifacts, compared to the conventional losses simply based on the difference between network outputs and MS targets.

Bibliography

[1] A. R. Gillespie, A. B. Kahle, and R. E. Walker, “Color enhancement of highly correlated images. i. decorrelation and HSI contrast stretches,” Remote Sensing of Environment, vol. 20, no. 3, pp. 209–235, Dec. 1986.
[2] W. J. Carper, T. M. Lillesand, and P. W. Kiefer, “The use of intensity-hue-saturation transformations for merging spot panchromatic and multispectral image data,” Photogrammetric Engineering and Remote Sensing, vol. 56, Jan. 1990.
[3] V. P. Shah, N. H. Younan, and R. L. King, “An efficient pan-sharpening method via a combined adaptive PCA approach and contourlets,” IEEE Transactions on Geoscience and Remote Sensing, vol. 46, no. 5, pp. 1323–1335, May 2008.
[4] X. Kang, S. Li, and J. A. Benediktsson, “Pansharpening with matting model,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 8, pp. 5088–5099, Aug. 2014.
[5] Q. Xu, B. Li, Y. Zhang, and L. Ding, “High-fidelity component substitution pansharpening by the fitting of substitution data,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 11, pp. 7380–7392, Nov. 2014.
[6] S. G. Mallat, “A theory for multiresolution signal decomposition: the wavelet representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 7, pp. 674–693, 1989.
[7] J.-L. Starck, J. Fadili, and F. Murtagh, “The undecimated wavelet decomposition and its reconstruction,” IEEE Transactions on Image Processing, vol. 16, no. 2, pp. 297–309, 2007.
[8] Z. Pan, J. Yu, H. Huang, S. Hu, A. Zhang, H. Ma, and W. Sun, “Super-resolution based on compressive sensing and structural self-similarity for remote sensing images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 51, no. 9, pp. 4864–4876, Sep. 2013.