TL;DR
This paper introduces a novel solar image quality metric called perception evaluation, which leverages multi-fractal texture features extracted by deep neural networks and measures image quality based on the cosine distance of Gram matrices.
Contribution
The paper proposes a new reduced-reference image quality metric for solar images using multi-fractal texture features and deep neural networks, which is robust across different scenes.
Findings
Perception evaluation accurately estimates solar image quality.
The metric performs well on both simulated and real images.
It provides a robust assessment when using high-resolution reference images.
Abstract
Next-generation ground-based solar observations require good image quality metrics for post-facto processing techniques. Based on the assumption that texture features in solar images are multi-fractal and can be extracted by a trained deep neural network as feature maps, a new reduced-reference objective image quality metric, the perception evaluation, is proposed. The perception evaluation is defined as the cosine distance between the Gram matrix of the feature maps extracted from a high resolution reference image and that of the feature maps extracted from blurred images. We evaluate the performance of the perception evaluation with simulated and real observation images. The results show that, with a high resolution image as reference, the perception evaluation can give a robust estimate of image quality for solar images of different scenes.
Perception Evaluation – A new solar image quality metric based on the multi-fractal property of texture features
Solar Physics
Yi Huang
Peng Jia
Dongmei Cai
Bojun Cai
College of Physics and Optoelectronics, Taiyuan University of Technology, Taiyuan, 030024, China
Department of Physics, Durham University, South Road, Durham DH1 3LE, UK
Key Laboratory of Advanced Transducers and Intelligent Control Systems, Ministry of Education and Shanxi Province, Taiyuan University of Technology, Taiyuan, 030024, China
keywords:
Atmospheric Seeing, Instrumental Effects, Instrumentation and Data Management
1 Introduction
The resolution of ground-based telescopes is limited by many different factors, such as (quasi-)static aberrations and the dynamic aberrations caused by atmospheric turbulence. The aberration induced by atmospheric turbulence, termed “seeing”, prevents large aperture ground-based solar telescopes from achieving their theoretical angular resolution. For solar telescopes without adaptive optics (AO) systems (Thompson, 2000), post-facto image reconstruction techniques are widely used to alleviate the image degradation induced by atmospheric turbulence and achieve higher angular resolution (van Noort, Rouppe van der Voort, and Löfdahl, 2005; Mikurda and von der Lühe, 2006; Scharmer et al., 2010). These techniques require a proper objective image quality metric (IQM), because the IQM is used either as the criterion for frame selection based methods or as the cost function for deconvolution algorithms. In recent decades, several IQMs have been proposed, and they can be classified into full-reference (FR), no-reference (NR), and reduced-reference (RR) metrics.
FR IQMs require high resolution images as reference. The mean squared error (MSE) is the simplest FR IQM; it computes the average of the squared difference between the distorted and reference images. The structural similarity (SSIM) proposed by Wang et al. (2004) is widely used and gives results similar to those of the human visual system. The root-mean-square contrast (RMS-contrast) is the most commonly used IQM for image reconstruction (Denker et al., 2005, 2007; Danilovic et al., 2008). Because granulation is uniform and isotropic, FR IQMs have been successfully used for granulation images (Scharmer, 1989; Denker et al., 2005; Danilovic et al., 2008). Unfortunately, FR IQMs have several drawbacks: their performance strongly depends on the wavelength (Albregtsen and Hansen, 1977) and their sensitivity is related to the structural content of the image (Deng et al., 2015).
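As a minimal sketch of the two simplest metrics named above, the following assumes images are 2D NumPy arrays and takes the RMS contrast to be the standard deviation of the intensity divided by its mean. That convention is an assumption made here for illustration, not a definition taken from this paper, and the function names are hypothetical.

```python
import numpy as np


def mse(distorted, reference):
    """Mean squared error between a distorted image and its reference."""
    d = np.asarray(distorted, dtype=float)
    r = np.asarray(reference, dtype=float)
    return float(np.mean((d - r) ** 2))


def rms_contrast(image):
    """RMS contrast, assumed here to be std(intensity) / mean(intensity)."""
    img = np.asarray(image, dtype=float)
    return float(img.std() / img.mean())
```

Blurring an image lowers its intensity variance, so under this convention the RMS contrast drops as the seeing gets worse, whereas the MSE grows with the difference from the reference.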
The Median Filter-Gradient Similarity (MFGS) proposed by Deng et al. (2015) is an NR IQM, which does not need a reference image and is suitable for evaluating the quality of solar observation images directly. However, Popowicz et al. (2017) and Denker et al. (2018) show that MFGS is not completely independent of the structural content or the spatial sampling rate of an image. This property would limit the performance of image reconstruction methods in different regions of the Sun.
In this paper, we propose a new RR IQM, the Perception Evaluation (PE). The PE requires only one high resolution image as reference and evaluates the quality of blurred images against it. The PE is based on the assumption that texture features in solar images are multi-fractal and should be similar for solar images obtained at the same wavelength. The multi-scale distribution of the multi-fractals is extracted by a trained deep neural network (DNN) (Yu, Schmid, and Victor, 2015; Motoyoshi et al., 2007). The difference in multi-fractal properties between high resolution images and blurred images is then used to evaluate the image quality. We introduce the PE in Section 2. In Section 3, we evaluate the performance of the PE with simulated blurred images and real observed images, and in Section 4 we give our conclusions and discuss possible future applications.
2 Perception Evaluation
2.1 Principle of the perception evaluation
The texture feature is a description of the spatial arrangement of the gray scale in an image. It is usually used to describe the regularity or coarseness of an image (Guo, Zhao, and Pietikainen, 2012). Human beings can easily distinguish between images with different texture features, such as the rainforest and the desert in a black-and-white aerial photograph. In solar images, texture features are almost everywhere. Solar images differ between wavelengths, and these images are composed of different texture features, as shown in Figure \irefF-GHimg.
Texture features in solar images may be self-similar, as in the granulation, and such images are usually called fractal (Jia, Cai, and Wang, 2014). At other wavelengths, the images are not self-similar over the whole range of spatial scales, which means they cannot be described by a spectrum with a single exponent. However, these images can be described by a continuous spectrum with different exponents at different scales. This is usually called the multi-fractal property (Ne and Parga, 2000; Peng et al., 2017). If we assume that the multi-fractal properties of texture features do not change between solar images observed at the same wavelength, then with one high resolution image as reference we can discriminate between images with different blur levels. The difference between the texture features of high resolution images and those of blurred images is a good tracer of image quality. Can we model that difference to evaluate image quality?
Directly modelling the difference of multi-fractal properties is hard, because texture features are complex and differ between solar images of different wavelengths. Many image quality metrics based on texture features have been proposed, such as the image grey scale statistics (N-th order joint histograms) introduced by Julesz (1962) and models based on other statistical measurements (Heeger and Bergen, 1995; Portilla and Simoncelli, 2000). Because a DNN is complex enough to learn texture features directly, parametric texture feature models based on DNN features are widely used (Gatys, Ecker, and Bethge, 2015b; Liu, Gousseau, and Xia, 2016). In this paper, we use a trained convolutional neural network (CNN) to extract the multi-fractal properties of texture features. We discuss our algorithm below.
2.2 Algorithm of the perception evaluation
A CNN is a kind of DNN that includes many convolutional layers. A convolutional layer has $C$ channels, and in each channel the input signal is convolved with a different trainable convolutional kernel. The outputs of a convolutional layer are called feature maps. After a convolutional layer, an image with $W \times H$ pixels becomes a 3D stack of feature maps with size of $W \times H \times C$, where $C$ is the channel number. Visualization of feature maps shows that they describe images in a multi-scale way (Zeiler and Fergus, 2014), which makes them adequate for modelling the multi-fractal properties of texture features in solar images.
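The shape bookkeeping of a single convolutional layer can be sketched as follows. This is an illustrative NumPy toy, assuming zero padding that preserves the image size, odd kernel sizes, and a ReLU activation; the function `conv_layer` and its arguments are hypothetical and it is not the trained network used in the paper.

```python
import numpy as np


def conv_layer(image, kernels):
    """Apply C kernels of shape (C, k, k), k odd, to a 2D image.

    Zero padding keeps the spatial size, so the result has shape (H, W, C):
    one feature map per channel. As in most CNN libraries, the operation is
    a cross-correlation followed by a ReLU.
    """
    image = np.asarray(image, dtype=float)
    c, k, _ = kernels.shape
    pad = k // 2
    padded = np.pad(image, pad)
    h, w = image.shape
    out = np.zeros((h, w, c))
    for n in range(c):
        for dy in range(k):
            for dx in range(k):
                out[:, :, n] += kernels[n, dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.maximum(out, 0.0)
```

With 4 random 3x3 kernels, an 8x10 image yields feature maps of shape (8, 10, 4), which is the kind of 3D stack the text describes.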
The VGG, a CNN with many small convolution kernels and several convolutional layers proposed by the Visual Geometry Group of the University of Oxford (Pfister et al., 2014), is used in this paper. Considering that texture features of solar images are complex and may have different multi-fractal properties, we use VGG16 to model them. The VGG16 is a VGG with 12 convolutional layers and 4 pooling layers, as shown in Figure \irefF-simple1. In the first several convolutional layers, the feature maps from VGG16 have rich details. The deeper the convolutional layer, the more abstract the feature maps are and the larger the scale of the texture features they contain (Gatys, Ecker, and Bethge, 2015a). According to our experience, we select four sets of feature maps from the trained VGG16, Feature 1, Feature 2, Feature 3 and Feature 4, as candidate feature maps in this paper, as shown in Figure \irefF-simple1.
To evaluate the representative ability of these feature maps, we generate several short exposure point spread functions (PSFs) with different $D/r_0$ through Monte Carlo simulation (Jia et al., 2015b; Basden et al., 2018). We then convolve high resolution solar images with these PSFs to generate blurred images, as shown in Figure \irefF-simple2. We extract multi-fractal properties from these blurred images with VGG16 and use Feature 1, Feature 2, Feature 3 and Feature 4 to reconstruct the images, as shown in Figure \irefF-simple3. The reconstructed images show that feature maps from different layers reflect multi-fractal properties at different scales. Because an image quality metric should depend only on the blur level, we need to transform the feature maps into a quantity that does not depend on the image size or the structural content.
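The blurring step, convolving a sharp image with a PSF, can be sketched as below. A Gaussian PSF is swapped in purely as a stand-in; the paper uses short exposure PSFs from a Monte Carlo turbulence simulation, which this sketch does not attempt to reproduce. The helper names and the periodic boundary handling of the FFT are assumptions for illustration.

```python
import numpy as np


def gaussian_psf(shape, sigma):
    """Stand-in PSF: a unit-sum Gaussian centred at (h // 2, w // 2)."""
    h, w = shape
    y = np.arange(h) - h // 2
    x = np.arange(w) - w // 2
    psf = np.exp(-(x[None, :] ** 2 + y[:, None] ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()


def blur(image, psf):
    """Circular convolution of an image with a centred PSF of the same shape."""
    image = np.asarray(image, dtype=float)
    # ifftshift moves the PSF centre to index (0, 0) so the image is not shifted.
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * otf))
```

Because the PSF is normalised to unit sum, the blurred image keeps the total flux of the input; only the contrast of small scale structure is reduced.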
The Gram matrix calculates the correlation between two variables without subtracting their mean values. It reflects the difference between two variables and is normally used for kernel generation in classical machine learning tasks (Hofmann, Schölkopf, and Smola, 2007). In recent years, the Gram matrix of feature maps has been used in image style transfer to reflect the style difference between two images (Johnson, Alahi, and Feifei, 2016), as shown in equation \irefeq:equation3.
\begin{equation}
G_{ij} = \sum_{k} F_{i,k} \, F_{j,k}
\end{equation}
where $F_i$ and $F_j$ are the 2-dimensional feature maps of channels $i$ and $j$ in a particular layer and $k$ runs over the pixels of a feature map. To evaluate image quality, the Gram matrix has the following advantages (Gatys, Ecker, and Bethge, 2015a):
(1) The size of the Gram matrix depends only on the number of feature maps instead of the image size.
(2) The Gram matrix is only related to texture features of an image, not its structural content.
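The two properties listed above can be checked with a short NumPy sketch. It assumes feature maps stored as an array of shape (H, W, C) and divides by the number of pixels; that normalisation is an assumption made here and is not taken from the paper.

```python
import numpy as np


def gram_matrix(features):
    """Gram matrix of feature maps with shape (H, W, C).

    Each channel is flattened to a vector and the inner products between all
    pairs of channels are taken without subtracting the mean. The division by
    the pixel count H * W is an assumed normalisation.
    """
    f = np.asarray(features, dtype=float)
    h, w, c = f.shape
    flat = f.reshape(h * w, c)
    return flat.T @ flat / (h * w)
```

The result is always C x C, whatever the image size. Shuffling the pixel positions identically in every channel leaves it unchanged, which is one way to see that the Gram matrix discards the spatial layout of the feature maps and keeps only their co-activation statistics.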
Thanks to the above advantages, we will use the Gram matrix to represent texture features’ multi-fractal properties. In real applications, the Gram matrix of the reference image and that of blurred images will be obtained separately by the VGG16. Then we will calculate the cosine distance (Ustyuzhaninov et al., 2018) between these two matrices to evaluate the image quality,
\begin{equation}
PE = \frac{\langle G_b, G_r \rangle}{\lVert G_b \rVert \, \lVert G_r \rVert},
\end{equation}
where $G_b$ and $G_r$ are the Gram matrices of the blurred image and of the reference image, $\langle \cdot , \cdot \rangle$ is the element-wise inner product with $\lVert \cdot \rVert$ the corresponding norm, and $PE$ is the PE. According to our experience, the Gram matrix of Feature 4 best represents the multi-fractal properties of texture features, because it may contain the largest amount of information and therefore has better expressive ability than the other feature maps. In VGG16, the size of Feature 4 is , where and are size of the input signal. In the following sections, we will only use the Gram matrix of Feature 4 to calculate the PE.
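A minimal sketch of the comparison step is given below. It assumes the PE is the cosine of the angle between the two flattened Gram matrices; if the cosine distance is instead taken as one minus that value, the last line would change accordingly. The feature maps are assumed to be arrays of shape (H, W, C) already extracted by a network, and both function names are hypothetical.

```python
import numpy as np


def gram_matrix(features):
    """C x C Gram matrix of feature maps with shape (H, W, C)."""
    f = np.asarray(features, dtype=float)
    h, w, c = f.shape
    flat = f.reshape(h * w, c)
    return flat.T @ flat / (h * w)


def perception_evaluation(feat_blurred, feat_reference):
    """Cosine between the flattened Gram matrices (assumed sign convention)."""
    gb = gram_matrix(feat_blurred).ravel()
    gr = gram_matrix(feat_reference).ravel()
    return float(gb @ gr / (np.linalg.norm(gb) * np.linalg.norm(gr)))
```

The blurred and reference feature maps may have different spatial sizes as long as they share the same number of channels, and rescaling either input by a positive constant does not change the result.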
3 Performance Evaluation
3.1 Sample Data
There are two data sets used in this paper: G-band observation data from the SST (Scharmer et al., 2002) (430.5 nm, with a pixel scale of 0.041 arcsec and an exposure time of 4 ms) and H-alpha observation data from the NVST (Liu et al., 2014) (655.32 nm, with a pixel scale of 0.136 arcsec and an exposure time of 20 ms). The SST data are reconstructed with phase diversity and corrected for the theoretical telescope and detector MTF. The NVST data are reconstructed by speckle reconstruction (Li et al., 2015). All these data are near the diffraction limit and are used as reference images in this paper. We also generate many short exposure PSFs through Monte Carlo simulation (Basden et al., 2018). The parameters of the Monte Carlo simulation are set according to Table 1, and we use an accurate atmospheric turbulence phase screen generation method as discussed in Jia et al. (2015a) and Jia et al. (2015b). These simulated short exposure PSFs are convolved with the reference images to generate simulated blurred images.
3.2 Performance of the Perception Evaluation
According to Popowicz et al. (2017), the MFGS proposed by Deng et al. (2015) is robust in real applications and is considered a candidate solar image quality metric. Denker et al. (2018) proposed a modified implementation of MFGS to evaluate image sequences obtained with the High-resolution Fast Imager (HiFI) at the 1.5-meter GREGOR solar telescope (von der Lühe et al., 2001; Volkmer et al., 2010; Denker et al., 2012) and revealed the field and structure dependency of MFGS. In this paper we select MFGS for comparison. According to our requirements, we directly add the horizontal and vertical gradients of an image when computing the MFGS to achieve higher effectiveness.
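For orientation, the comparison metric can be sketched as follows. The formula used, 2·Gp·Gq / (Gp² + Gq²) with Gp and Gq the summed gradients of the median-filtered and original images, is the form in which MFGS is commonly quoted and is an assumption here, as are the 3x3 median window and the helper names. The gradient is taken as the sum of the absolute horizontal and vertical differences, following the description above.

```python
import numpy as np


def median3x3(image):
    """3x3 median filter with edge replication (assumed window size)."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.median(np.stack(windows), axis=0)


def gradient_sum(image):
    """Sum over the image of |horizontal gradient| + |vertical gradient|."""
    gy, gx = np.gradient(image)
    return float(np.sum(np.abs(gx) + np.abs(gy)))


def mfgs(image):
    """Assumed MFGS form; undefined for a perfectly flat image (zero gradients)."""
    img = np.asarray(image, dtype=float)
    gp = gradient_sum(median3x3(img))  # gradients after median filtering
    gq = gradient_sum(img)             # gradients of the original image
    return 2.0 * gp * gq / (gp ** 2 + gq ** 2)
```

The value equals 1 when median filtering leaves the gradients unchanged and falls below 1 as the filter removes more fine structure.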
Firstly, we use the G-band SST observation data shown in Figure \irefF-simple4 to test the PE. Areas X1, X2 with size of arcsec are used as reference images. We extract 100 images with size of pixels ( arcsec) from the W area ( pixels) in Figure \irefF-simple4 by step of 50 pixels (around 2 arcsec, so these images have overlapping regions). We then convolve these images with simulated short exposure PSFs ($D/r_0$ from 1 to 20) to generate simulated blurred images. The right panel of Figure \irefF-simple4 shows simulated blurred images with different degradation levels. These simulated short exposure images are evaluated with the PE and the MFGS respectively. The results are shown in Figure \irefF-simple5. We find that the PE is more sensitive to different levels of blur than the MFGS, because its error bars are much smaller. We also find that different reference images do not change the trend of the PE, which indicates that the PE is robust to the choice of reference image.
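The extraction of overlapping sub-images with a fixed step can be sketched as below. The area and patch sizes in the usage note are hypothetical values chosen only so that a 50 pixel step yields 100 patches; they are not the sizes used in the paper.

```python
import numpy as np


def extract_patches(area, patch_size, step):
    """Cut square patches from a 2D area on a regular grid with the given step."""
    h, w = area.shape
    return [
        area[y:y + patch_size, x:x + patch_size]
        for y in range(0, h - patch_size + 1, step)
        for x in range(0, w - patch_size + 1, step)
    ]
```

For example, a hypothetical 706 x 706 pixel area cut into 256 x 256 pixel patches with a step of 50 pixels gives a 10 x 10 grid, and neighbouring patches overlap because the step is smaller than the patch size.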
Secondly, we use the H-alpha data from the NVST to test the PE. As shown in Figure \irefF6-img_Halpha, we extract two regions, X1 and X2, with size of arcsec from the H-alpha data as reference images. Then we extract small images with size of pixels ( arcsec) from the W area ( pixels) in Figure \irefF6-img_Halpha by step of 50 pixels (6.8 arcsec) and convolve these images with simulated PSFs to generate simulated blurred images. The PE and the MFGS are used to evaluate the quality of these images and the results are shown in Figure \irefF-simple6. We find that the PE still discriminates between different degrees of image degradation and is more sensitive than the MFGS.
Thirdly, we use the PE to evaluate the image quality of real observation data. These real observational images are extracted from the NVST H-alpha observation data obtained on the same day. There are 150 frames of the real observation data and they have size of pixels. We use the high resolution images X1 and X2 shown in Figure \irefF6-img_Halpha as reference images. One frame of the observation images and its PE values in different sections are shown in Figure \irefF-simple7. We find that the PE reflects the spatial variation of image quality and that the variation trend is almost the same for different reference images, which shows that the PE is robust to the choice of reference image.
We also evaluate the PE with 150 continuous frames of images. The results are shown in video 1 (see attachment) and Figure \irefF-simple8. We find that the PE reflects the temporal variation of atmospheric turbulence. With different reference images, the absolute value of the PE is different, because different reference images contain different amounts of texture features. In real applications, we will use one high resolution image as reference, which means only the relative variation is important. From Figure \irefF-simple8, we find that the variation trend of the PE is the same for different reference images, which indicates the effectiveness of our method.
We further explore the stability of the PE using the same image at different rotation angles. We calculate the PE of a speckle reconstructed H-alpha image ( pixels) from the NVST at 8 different rotation angles. The reference image is the first image in Figure \irefF7-rotate and, as shown in this figure, the differences in PE between the images are very small, regardless of the rotation angle.
3.3 Limitation of the Perception Evaluation
We use simulated blurred images with different sizes and different blur properties (different coherence lengths) to further test the robustness of the PE and the MFGS. We extract 100 images from the W region of Figure \irefF-simple4 and Figure \irefF6-img_Halpha and convolve these images with PSFs of different coherence lengths to generate simulated blurred images. For the PE, we use the same reference image for each wavelength (the X1 region in Figure \irefF-simple4 and Figure \irefF6-img_Halpha). We evaluate the PE and the MFGS of these images and the results are shown in Figure \irefF-simple9 and Figure \irefF-simple10. We find that the MFGS and the PE are both sensitive to sampling and image scale. The PE is also sensitive to the image size, which is its main limitation: the multi-fractal property of texture features is a statistical property, so many texture features are needed to keep the PE robust. Higher resolution and more pixels in the science cameras of future solar telescopes will reduce this limitation in real applications. Otherwise, particular attention should be paid when using the PE to evaluate image quality. According to our experience, images with at least pixels are adequate to be evaluated by the PE.
4 Conclusion
Based on the assumption that texture features in solar images are multi-fractal, we propose a new RR IQM, the PE, in this paper. The PE needs only one high resolution image to evaluate the image quality of blurred images. We test the performance of the PE with simulated blurred images and real observation data and find that the PE is robust to image content and rotation angle. However, we also find that the PE is sensitive to the image size. In real applications, we recommend using the PE to evaluate the quality of images which have at least pixels.
Because the PE is robust and only related to the texture features of the solar image, we can use it to evaluate the quality of solar images of any wavelength, provided that we have a high resolution image as reference. It will benefit frame selection based image restoration methods, because better frames can be selected. The PE can also be used directly as a cost function to increase the performance of deconvolution algorithms. Furthermore, the PE can even be used to evaluate the image quality of any astronomical images with texture features, such as nebulae, supernova remnants and galaxies, which would boost the development of post-facto methods in the astronomical community.
Acknowledgments
This work is supported by National Natural Science Foundation of China (NSFC) (11503018) and the Joint Research Fund in Astronomy (U1631133) under cooperative agreement between the NSFC and Chinese Academy of Sciences (CAS), Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (2016033). Peng Jia is supported by the China Scholarship Council to study at the University of Durham. The data used in this paper were obtained with the New Vacuum Solar Telescope in Fuxian Solar Observatory of Yunnan Astronomical Observatory, CAS and the Swedish 1-m Solar Telescope. The Swedish 1-m Solar Telescope is operated on the island of La Palma by the Institute for Solar Physics of the Royal Swedish Academy of Sciences in the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofísica de Canarias.
The authors would like to thank the reviewer for her/his kind suggestions, which have greatly improved this paper. Peng Jia would like to thank Dr. Yongyuan Xiang, Professor Hui Liu, Professor Kaifan Ji and Professor Zhong Liu from Yunnan Observatory, Professor Hui Deng from Guangzhou University, Dr. Qinming Zhang from Purple Mountain Observatory, and Dr. Yang Guo from Nanjing University, who provided very helpful suggestions for this paper. The code used in this paper is written in the Python programming language (Python Software Foundation) with the packages PyTorch, astropy and sklearn. The complete code can be downloaded from aojp.lamost.org or https://github.com/yellowyi9527/Perception-Evaluation.
References
- Albregtsen, F., Hansen, T.L.: 1977, The wavelength dependence of granulation (0.38–2.4 μm). Solar Physics 54 (1), 31.
- Basden, A., Bharmal, N.A., Jenkins, D., Morris, T., Osborn, J., Peng, J., Staykov, L.: 2018, The Durham Adaptive Optics Simulation Platform (DASP): Current status. SoftwareX 7, 63.
- Danilovic, S., Gandorfer, A., Lagg, A., Schüssler, M., Solanki, S.K., Vögler, A., Katsukawa, Y., Tsuneta, S.: 2008, The intensity contrast of solar granulation: comparing Hinode SP results with MHD simulations. Astronomy and Astrophysics 484 (3), L17.
- Deng, H., Zhang, D., Wang, T., Ji, K., Wang, F., Liu, Z., Xiang, Y., Jin, Z., Cao, W.: 2015, Objective image-quality assessment for high-resolution photospheric images by median filter-gradient similarity. Solar Physics 290 (5), 1479.
- Denker, C.J., Mascarinas, D., Xu, Y., Cao, W., Yang, G., Wang, H., Goode, P.R., Rimmele, T.R.: 2005, High-spatial-resolution imaging combining high-order adaptive optics, frame selection, and speckle masking reconstruction. Solar Physics 227 (2), 217.
- Denker, C.J., Deng, N., Rimmele, T.R., Tritschler, A., Verdoni, A.P.: 2007, Field-dependent adaptive optics correction derived with the spectral ratio technique. Solar Physics 241 (2), 411.
