TL;DR
The paper introduces STAR, a novel Retinex-based model that utilizes structure and texture maps derived from exponentiated local derivatives to improve image decomposition, enhancement, and correction.
Contribution
It proposes a new structure and texture aware Retinex model using exponential filters and an alternating optimization algorithm for better image decomposition.
Findings
Outperforms previous methods in quantitative metrics.
Improves low-light image enhancement quality.
Enhances color correction accuracy.
Algorithm 2: Alternative Updating Scheme
Input: observed image $\bm{O}$, parameters, updating number $L$, maximum iteration number $K$ defined in Algorithm 1;
Initialization: $\bm{I}$ and $\bm{R}$ estimated by Algorithm 1;
for ($l = 1, \ldots, L$) do
  1. Update $\bm{S}_0$;
  2. Update $\bm{T}_0$;
  3. Solve the STAR model (10) and obtain $\bm{I}$ and $\bm{R}$ by Algorithm 1;
  if (Converged)
    4. Stop;
  end if
end for
Output: Final illuminance $\bm{I}$ and reflectance $\bm{R}$.
| Method | NIQE (35 Images) | VIF (35 Images) | NIQE (200 Images) | VIF (200 Images) |
| Input | 3.74 | 1.00 | 3.45 | 1.00 |
| HE [11] | 3.24 | 1.34 | 3.28 | 1.19 |
| MSRCR [27] | 2.98 | 1.84 | 3.21 | 1.11 |
| CVC [10] | 3.03 | 2.04 | 3.01 | 1.63 |
| NPE [51] | 3.10 | 2.48 | 3.12 | 1.62 |
| LDR [33] | 3.12 | 2.36 | 2.96 | 1.66 |
| SIRE [17] | 3.06 | 2.09 | 2.98 | 1.57 |
| MF [18] | 3.19 | 2.23 | 3.26 | 1.71 |
| WVM [19] | 2.98 | 2.22 | 2.99 | 1.68 |
| LIME [24] | 3.24 | 2.76 | 3.32 | 1.84 |
| JieP [9] | 3.06 | 2.67 | 3.18 | 1.82 |
| BIMEF [64] | 3.14 | 2.54 | 3.02 | 1.79 |
| RRM [34] | 3.08 | 2.69 | 2.97 | 1.86 |
| STAR w/o | 3.18 | 2.64 | 3.22 | 1.77 |
| STAR w/o | 3.09 | 2.78 | 3.01 | 1.82 |
| STAR | 2.93 | 2.96 | 2.86 | 1.92 |
STAR: A Structure and Texture Aware Retinex Model
Jun Xu¹, Yingkun Hou³, Dongwei Ren², Li Liu⁴, Fan Zhu⁴, Mengyang Yu⁴, Haoqian Wang⁵,⁶, and Ling Shao⁴
Corresponding Author: Haoqian Wang (Email: [email protected]).
1 College of Computer Science, Nankai University, Tianjin, China
2 College of Intelligence and Computing, Tianjin University, Tianjin, China
3 School of Information Science and Technology, Taishan University, Taian, China
4 Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
5 Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
6 Shenzhen Institute of Future Media Technology, Shenzhen, China
This work is partially funded by the Major Project for New Generation of AI under Grant 2018AAA01004, in part by the National Natural Science Foundation of China (No. 61831014, 61929104) and the Shenzhen Science and Technology Project under Grant (JCYJ20170817161916238, JCYJ20180508152042002, GGFW2017040714161462). The Corresponding author is Prof. Haoqian Wang (Email: [email protected]). Jun Xu is with TKLNDST, College of Computer Science, Nankai University, Tianjin, China. Yingkun Hou is with School of Information Science and Technology, Taishan University, Taian, China. Dongwei Ren is with College of Intelligence and Computing, Tianjin University, Tianjin, China. Li Liu, Fan Zhu, and Mengyang Yu are with Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, UAE. Haoqian Wang is with the Tsinghua Shenzhen International Graduate School, and also with Shenzhen Institute of Future Media Technology, Shenzhen 518055, China. Ling Shao is with the Inception Institute of Artificial Intelligence, Abu Dhabi, UAE, and also with the Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE.
Abstract
Retinex theory was developed mainly to decompose an image into illumination and reflectance components by analyzing local image derivatives. In this theory, larger derivatives are attributed to changes in reflectance, while smaller derivatives emerge in the smooth illumination. In this paper, we utilize exponentiated local derivatives (with an exponent $\gamma$) of an observed image to generate its structure map and texture map. The structure map is produced by amplifying the derivatives with $\gamma > 1$, while the texture map is generated by shrinking them with $\gamma < 1$. To this end, we design exponential filters for the local derivatives, and present their capability of extracting accurate structure and texture maps, influenced by the choice of the exponent $\gamma$. The extracted structure and texture maps are employed to regularize the illumination and reflectance components in Retinex decomposition. A novel Structure and Texture Aware Retinex (STAR) model is further proposed for illumination and reflectance decomposition of a single image. We solve the STAR model by an alternating optimization algorithm. Each sub-problem is transformed into a vectorized least squares regression with a closed-form solution. Comprehensive experiments on commonly tested datasets demonstrate that the proposed STAR model produces better quantitative and qualitative performance than previous competing methods on illumination and reflectance decomposition, low-light image enhancement, and color correction. The code is publicly available at https://github.com/csjunxu/STAR.
I Introduction
The Retinex theory developed by Land and McCann [32, 30] models the color perception of human vision on natural scenes. It can be viewed as a fundamental theory for the intrinsic image decomposition problem [3], which aims at decomposing an image into illumination (or shading) and reflectance components. A simplified Retinex model involves decomposing an observed image $\bm{O}$ into an illumination component $\bm{I}$ and a reflectance component $\bm{R}$ via $\bm{O} = \bm{I} \odot \bm{R}$, where $\odot$ denotes element-wise multiplication. The illumination $\bm{I}$ expresses the color of the light striking the surfaces of objects in the scene $\bm{O}$, while the reflectance $\bm{R}$ reflects the painted color of the surfaces of objects in $\bm{O}$ [68]. Retinex theory has been applied in many image processing tasks, such as low-light image enhancement [68, 51, 24] and color correction [19, 9] (please refer to Figure 1 for an example).
The Retinex theory introduces a useful property of derivatives [32, 30, 68]: larger derivatives are often attributed to changes in reflectance, while smaller derivatives are likely to come from the smooth illumination. With this property, the Retinex decomposition can be performed by classifying the image gradients into the reflectance component and the illumination component [29]. However, binary classification of image gradients is unreliable, since reflectance and illumination changes coincide in an intermediate region [68]. Later, several methods were proposed to classify edges or edge junctions, instead of gradients, using trained classifiers [4, 47]. However, it is quite challenging to train classifiers that cover all possible ranges of illumination and reflectance configurations. Besides, though these methods explicitly utilize the property of derivatives, they perform Retinex decomposition by analyzing the gradients of a scene [12] in a local manner, while ignoring the global consistency of the structure in that scene. To alleviate this problem, several methods [68, 51, 19] perform global decomposition with different regularizations. However, these methods ignore the property of derivatives and cannot separate the illumination and reflectance components well.
In this paper, we propose to utilize exponentiated local derivatives to better exploit the property of derivatives in a global manner. The exponentiated derivatives are determined by an exponent $\gamma$ introduced on the local derivatives, and generalize the local derivatives to extract global structure and texture maps. Given an observed scene (e.g., Figure 1 (a)), its derivatives are exponentiated by $\gamma$ to generate a structure map (Figure 1 (d) up) when being amplified with $\gamma > 1$ and a texture map (Figure 1 (d) down) when being shrunk with $\gamma < 1$. The extracted structure and texture maps are employed to regularize the illumination (Figure 1 (b)) and reflectance (Figure 1 (c)) components in Retinex decomposition, respectively. With meaningful structure and texture maps, we propose a Structure and Texture Aware Retinex (STAR) model to accurately estimate the illumination and reflectance components. We solve our STAR model by an alternating optimization algorithm [60, 49]. Each sub-problem is transformed into a vectorized least squares regression with a closed-form solution [57]. Comprehensive experiments on commonly tested datasets demonstrate that the proposed STAR model obtains better performance than previous competing methods on illumination and reflectance decomposition, low-light image enhancement, and color correction.
In summary, the contributions of this work are three-fold:
- We propose to utilize exponentiated local derivatives to better extract meaningful structure and texture maps.
- We develop a novel Structure and Texture Aware Retinex (STAR) model to accurately estimate the illumination and reflectance components, and exploit the property of derivatives in a global manner.
- Experiments on commonly tested datasets demonstrate that our STAR obtains better performance than previous competing methods on Retinex decomposition, low-light image enhancement, and color correction.
The remainder of this paper is organized as follows. In §II, we review the related work. In §III, we introduce a structure and texture aware weighting scheme for Retinex image decomposition. Then we develop a structure and texture aware Retinex model in §IV. §V describes the detailed experiments on Retinex decomposition of illumination and reflectance components. §VI presents our STAR model on two other image processing applications: low-light image enhancement and color correction. Finally, we conclude this paper in §VII.
II Related Work
II-A Retinex Model
Retinex models have been extensively studied in the literature [31, 27, 9] and can be roughly divided into classical ones [7, 20, 41] and variational ones [51, 24, 9]. Besides, Retinex decomposition methods can be applied to low-light image enhancement [64, 50, 37] and color correction [21, 26, 14].
Classical Retinex methods include path-based methods [31, 7, 16, 20], Partial Differential Equation (PDE) based methods [41], and center/surround methods [27]. Early path-based methods [31, 7] are developed on the assumption that the reflectance component can be computed by the product of ratios along some random paths. These methods demand careful parameter tuning and incur high computational costs. To improve the efficiency, the later path-based methods of [16, 20] employ recursive matrix computation techniques to replace the previous random path computation. However, their performance is largely influenced by the number of recursive iterations, and they are unstable in real applications. PDE-based methods [41] utilize the property that the Retinex solutions satisfy a discrete Poisson equation, which yields an efficient implementation of reflectance estimation using only two Fast Fourier Transforms (FFTs). However, the structure of the illumination component will be degraded, since gradients derived from a divergence-free vector field often lose piece-wise smoothness. The center/surround methods include the famous single-scale Retinex (SSR) [28] and multi-scale Retinex with color restoration (MSRCR) [27]. These methods simply assume the illumination component to be smooth and the reflectance component to be non-smooth. However, lacking a reasonable structure-preserving restriction, MSRCR tends to produce halo artifacts around edges.
Variational methods [36, 19, 34] have been proposed for Retinex-based illumination and reflectance decomposition. In [29], the smoothness assumption is introduced into a variational model to estimate the illumination component. But this method is slow and does not regularize the reflectance. Later, a variational model is proposed in [39] to focus on estimating the reflectance component. But this method does not regularize the illumination component. The logarithmic transformation is also employed in [43] as a pre-processing step to suppress the variation of gradient magnitude in bright regions, but the reflectance component estimated with logarithmic regularization tends to be over-smoothed. To consider both illumination and reflectance regularizations, a total variation (TV) model based method is proposed in [42]. But similar to [43], the reflectance is over-smoothed due to the side effect of the logarithmic transformation. Recently, Fu et al. [17] developed a probabilistic method for simultaneous illumination and reflectance estimation (SIRE) in the linear space instead of the logarithmic one. This method preserves the details well and avoids over-smoothing the reflectance component, when compared to previous methods performed in the logarithmic space. To alleviate the detail loss problem of the reflectance component in the logarithmic space, Fu et al. [19] proposed a weighted variational model (WVM) to enhance the variation of gradient magnitude in bright regions. However, the illumination component may instead be damaged by the unconstrained isotropic smoothness assumption. By considering the properties of 3D objects, Cai et al. [9] proposed a Joint intrinsic-extrinsic Prior (JieP) model for Retinex decomposition. However, this model is prone to over-smoothing both the illumination and reflectance of a scene. In [34], Li et al. proposed a Robust Retinex Method (RRM) by considering an additional noise map [56, 55]. However, this method is effective mainly for low-light images accompanied by intensive noise.
II-B Intrinsic Image Decomposition
The Retinex model is similar in spirit to the intrinsic image decomposition model [35, 1, 2], which decomposes an observed image into Lambertian shading and reflectance (ignoring the specularity). The major goal of intrinsic image decomposition is to recover the shading and reflectance terms from an observed scene, while the specularity term can be ignored without performance degradation [23]. However, the reflectance recovered in this problem usually loses the visual content of the scene [24], and hence can hardly be used for simultaneous illumination and reflectance estimation. Therefore, intrinsic image decomposition does not serve the purpose of Retinex decomposition for low-light image enhancement, in which the objective is to preserve the visual content of dark regions while keeping its visual realism [24]. For more on the differences between Retinex decomposition and intrinsic image decomposition, please refer to [24].
III Structure and Texture Awareness
In this section, we first present the simplified Retinex model, and then introduce structure and texture awareness for illumination and reflectance regularization.
III-A Simplified Retinex Model
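The Retinex model [30] is a color perception simulation of the human vision system. Its physical goal is to decompose an observed image $\bm{O}$ into its illumination and reflectance components, i.e.,
$$\bm{O} = \bm{I} \odot \bm{R}, \tag{1}$$
where $\bm{I}$ means the illumination component of the scene representing the brightness of objects, $\bm{R}$ denotes the surface reflection component of the scene representing its physical characteristics, and $\odot$ means element-wise multiplication. The illumination component $\bm{I}$ and the reflectance component $\bm{R}$ can be recovered by alternately estimating them via
$$\bm{I} = \bm{O} \oslash \bm{R}, \quad \bm{R} = \bm{O} \oslash \bm{I}, \tag{2}$$
where $\oslash$ means element-wise division. In practice, we employ $\bm{I} = \bm{O} \oslash (\bm{R} + \varepsilon)$ and $\bm{R} = \bm{O} \oslash (\bm{I} + \varepsilon)$ to avoid zero denominators, where $\varepsilon$ is a small positive constant.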
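The element-wise model and its guarded inversion can be illustrated with a few lines of NumPy. This is a minimal sketch rather than the authors' implementation: the function names are hypothetical, the guard is assumed to be added to the denominator, and the value of `EPS` is an assumption.

```python
import numpy as np

EPS = 1e-8  # assumed guard value; chosen only for this illustration


def compose(illumination, reflectance):
    """Simplified Retinex model: observed = illumination * reflectance (element-wise)."""
    return illumination * reflectance


def recover_reflectance(observed, illumination, eps=EPS):
    """Element-wise division with a small constant guarding against zero denominators."""
    return observed / (illumination + eps)


def recover_illumination(observed, reflectance, eps=EPS):
    """Same guarded division, with the roles of the two components swapped."""
    return observed / (reflectance + eps)


rng = np.random.default_rng(0)
I = rng.uniform(0.2, 1.0, size=(4, 4))  # synthetic illumination
R = rng.uniform(0.2, 1.0, size=(4, 4))  # synthetic reflectance
O = compose(I, R)

R_hat = recover_reflectance(O, I)
I_hat = recover_illumination(O, R)
```

When one component is known exactly, the other is recovered up to the tiny bias introduced by the guard. The decomposition becomes ill-posed only when both components are unknown.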
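To solve this inverse problem (2), previous Retinex methods usually employ an objective function that estimates the illumination and reflectance components by
$$\min_{\bm{I},\bm{R}} \mathcal{R}_1(\bm{I}) + \mathcal{R}_2(\bm{R}) \quad \text{s.t.} \quad \bm{O} = \bm{I} \odot \bm{R}, \tag{3}$$
where $\mathcal{R}_1$ and $\mathcal{R}_2$ are two different regularization functions for the illumination $\bm{I}$ and the reflectance $\bm{R}$, respectively. One implementation choice of $\mathcal{R}_1$ and $\mathcal{R}_2$ is the total variation (TV) [45], which is widely used in previous methods [42, 19].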
III-B Structure and Texture Estimator
The Retinex model (1) decomposes an observed scene into its illumination and reflectance components. This problem is highly ill-posed, and proper priors on illumination and reflectance should be considered to regularize the solution space. Qualitatively speaking, the illumination should be piece-wise smooth, capturing the structure of the objects in the scene, while the reflectance should present the physical characteristics of the observed scene, capturing its texture information. Here, texture refers to the small patterns on object surfaces, which are similar in local statistics [54].
Previous structure-texture decomposition methods often enforce TV regularizers to preserve edges [42, 36, 62]. These TV regularizers simply enforce gradient similarity of the scene and extract the structure of the objects. There are two ways to perform structure-texture decomposition. One is to directly derive the structure using structure-preserving techniques, such as edge-aware filters [65] and optimization based methods [9]. The other is to extract the structure from the estimated texture weights [62]. However, these techniques [62, 65, 9] are vulnerable to textures and produce ringing effects near edges. Moreover, the method of [62] cannot extract scene structures whose appearance is similar to that of the underlying textures.
To better understand the power of these techniques for structure-texture extraction, we study two typical filters. The first is the TV filter [45], which computes the absolute gradients of an input image $\bm{O}$ as a guidance map:
$$f_{TV}(\bm{O}) = |\nabla \bm{O}|. \tag{4}$$
The second is the mean local variance (MLV) [9], which can also be utilized for structure map estimation:
$$f_{MLV}(\bm{O}) = \Big| \frac{1}{|\Omega|} \sum_{\Omega} \nabla \bm{O} \Big|, \tag{5}$$
where $\Omega$ is the local patch [13] around each pixel of $\bm{O}$, $|\Omega|$ denotes the number of elements in $\Omega$, and the patch size is kept fixed in all our experiments.
To verify that the TV and MLV filters can capture the structure of the scene, we visualize the effect of the two filters on extracting the structure/texture from an observed image. Here, the input RGB image (Figure 2 (a), up) is first transformed into the Hue-Saturation-Value (HSV) domain. Since the Value (V) channel (Figure 2 (a), down) reflects the illumination and reflectance information, we process this channel for the input image. It can be seen from Figure 2 (c) that the TV and MLV filters basically reflect the main structure of the input image. This point can be further validated by comparing the two filtered images (Figure 2 (c)) with the edges extracted from the input image (Figure 2 (a)). To this end, we resort to a recently published edge detection method [38] to extract the main structure of the input image. By comparing the TV filtered image (Figure 3 (b)), the MLV filtered image (Figure 3 (d)), and the edge extracted image (Figure 3 (c)), we observe that the TV and MLV filtered images already reflect the structure of the input image.
III-C Proposed Structure and Texture Awareness
The existing TV and MLV filters described in Eqns. (4) and (5) cannot be directly utilized in our problem, since they mainly capture structural information. As described in Retinex theory [32, 30], larger derivatives are attributed to changes in reflectance, while smaller derivatives emerge in the smooth illumination. Therefore, under exponential growth or decay, these local derivatives reflect more clearly the corresponding content structure or detailed textures, as illustrated in Figure 2. To this end, we introduce an exponential version of local derivatives for flexible structure and texture estimation. Specifically, we add an exponent term to the TV and MLV filtering operations. In this way, we make the two filters more flexible for separate structure and texture extraction. We propose the exponentiated TV (ETV) filter as
$$f_{ETV}(\bm{O}) = |\nabla \bm{O}|^{\gamma}, \tag{6}$$
and the exponentiated MLV (EMLV) filter as
$$f_{EMLV}(\bm{O}) = \Big| \frac{1}{|\Omega|} \sum_{\Omega} \nabla \bm{O} \Big|^{\gamma}, \tag{7}$$
where $|\Omega|$ denotes the number of elements in $\Omega$ and $\gamma$ is the exponent determining the sensitivity to the gradients of $\bm{O}$. We evaluate the two exponentiated filters in Eqns. (6) and (7) by visualizing their effects on a test image (i.e., Figure 2 (a), top). This RGB image is first transformed into the Hue-Saturation-Value (HSV) domain, and the decomposition is performed in the Value (V) channel. In Figure 2 (b)-(e), we plot the filtered images for the V channel of the input image. It is noteworthy that, with $\gamma < 1$, the ETV and EMLV filters roughly reveal the textures of the test image, while with $\gamma > 1$, the ETV and EMLV filters tend to extract the structural edges.
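The effect of the exponent can be checked on a toy image that contains a weak ripple (texture) next to a strong step (structure). The sketch below is illustrative only. The function names are hypothetical, gradients are taken per direction with forward differences, the local mean uses a 3x3 window with edge replication, and the exponents 0.5 and 1.5 are assumed example values for $\gamma < 1$ and $\gamma > 1$.

```python
import numpy as np


def forward_gradients(img):
    """Forward differences along columns (x) and rows (y); zero at the far border."""
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]
    gy[:-1, :] = img[1:, :] - img[:-1, :]
    return gx, gy


def box_mean(a, radius=1):
    """Mean over a (2*radius+1)^2 window, replicating the border pixels."""
    k = 2 * radius + 1
    p = np.pad(a, radius, mode="edge")
    out = np.zeros_like(a, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (k * k)


def etv(img, gamma):
    """Exponentiated absolute gradients, one map per direction."""
    gx, gy = forward_gradients(img)
    return np.abs(gx) ** gamma, np.abs(gy) ** gamma


def emlv(img, gamma, radius=1):
    """Exponentiated absolute value of the locally averaged gradients."""
    gx, gy = forward_gradients(img)
    return np.abs(box_mean(gx, radius)) ** gamma, np.abs(box_mean(gy, radius)) ** gamma


# Toy image: a 0.1-amplitude ripple on the left, a 0.7 step edge in the middle.
row = np.array([0.10, 0.20, 0.10, 0.20, 0.90, 0.90, 0.90, 0.90])
img = np.tile(row, (4, 1))

tex_x, _ = etv(img, gamma=0.5)   # gamma < 1
str_x, _ = etv(img, gamma=1.5)   # gamma > 1
weak, strong = (0, 0), (0, 3)    # a ripple pixel and the step-edge pixel
ratio_texture_map = tex_x[weak] / tex_x[strong]
ratio_structure_map = str_x[weak] / str_x[strong]

mlv_x, mlv_y = emlv(img, gamma=1.5)
```

With $\gamma > 1$ the ripple response shrinks relative to the edge response, so the map is dominated by structure. With $\gamma < 1$ the two responses move closer together, so fine texture becomes visible in the map.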
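Motivated by this observation, we introduce a structure and texture aware weighting scheme for illumination and reflectance decomposition. Specifically, we set the ETV based weighting matrices as
$$\bm{S}_0 = \frac{1}{|\nabla \bm{O}|^{\gamma_s} + \varepsilon}, \quad \bm{T}_0 = \frac{1}{|\nabla \bm{O}|^{\gamma_t} + \varepsilon}, \tag{8}$$
and the EMLV based weighting matrices as:
$$\bm{S}_0 = \frac{1}{\big| \frac{1}{|\Omega|} \sum_{\Omega} \nabla \bm{O} \big|^{\gamma_s} + \varepsilon}, \quad \bm{T}_0 = \frac{1}{\big| \frac{1}{|\Omega|} \sum_{\Omega} \nabla \bm{O} \big|^{\gamma_t} + \varepsilon}, \tag{9}$$
where $\gamma_s$ and $\gamma_t$ are two exponential parameters that adjust the structure and texture awareness for illumination and reflectance decomposition. As will be demonstrated in §V, the values of $\gamma_s$ and $\gamma_t$ influence the performance of the Retinex decomposition. Because it considers local variance information, the EMLV filter (Eqn. (9)) can reveal details and preserve structures better than the ETV filter (Figure 2). This point will also be validated in §V.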
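How such weights behave can be seen on three representative gradient magnitudes. The sketch assumes the weights are the element-wise inverse of the exponentiated map plus a small constant. The function name and the values of `gamma_s`, `gamma_t` and `eps` are illustrative assumptions, not the paper's settings.

```python
import numpy as np


def awareness_weights(grad_mag, gamma_s, gamma_t, eps):
    """Inverse exponentiated maps: large weight means strong smoothing at that pixel."""
    s0 = 1.0 / (grad_mag ** gamma_s + eps)  # structure weights (illumination term)
    t0 = 1.0 / (grad_mag ** gamma_t + eps)  # texture weights (reflectance term)
    return s0, t0


# Gradient magnitudes of a flat pixel, a texture pixel and an edge pixel.
grad_mag = np.array([0.0, 0.05, 0.8])
S0, T0 = awareness_weights(grad_mag, gamma_s=1.5, gamma_t=0.5, eps=1e-2)
```

With an exponent above one, the texture pixel receives a much larger structure weight than the edge pixel, so the illumination is smoothed across texture but not across edges. With an exponent below one, the texture weight at the same pixel is far smaller, which leaves fine detail to the reflectance.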
IV Structure and Texture Aware Retinex Model
IV-A Proposed Model
In this section, we propose a Structure and Texture Aware Retinex (STAR) model to simultaneously estimate the illumination $\bm{I}$ and the reflectance $\bm{R}$ of an observed image $\bm{O}$. To make our STAR model as simple as possible, we adopt the TV $\ell_2$-norm to regularize the illumination and reflectance components. The proposed STAR model is formulated as
$$\min_{\bm{I},\bm{R}} \|\bm{O} - \bm{I} \odot \bm{R}\|_F^2 + \alpha \|\bm{S}_0 \odot \nabla \bm{I}\|_F^2 + \beta \|\bm{T}_0 \odot \nabla \bm{R}\|_F^2, \tag{10}$$
where $\alpha$ and $\beta$ weight the two regularizers, and $\bm{S}_0$ and $\bm{T}_0$ are the two matrices defined in (9), indicating the structure map of the illumination and the texture map of the reflectance, respectively. The structure map $\bm{S}_0$ should be small enough to preserve the edges of objects in the scene, while large enough to suppress the details (as the inverse of Figure 2 (d,e)). On the other hand, the texture map $\bm{T}_0$ should be small enough to reveal the details (as the inverse of Figure 2 (b,c)).
IV-B Optimization Algorithm
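Since the objective function (10) is separable w.r.t. the two variables $\bm{I}$ and $\bm{R}$, it can be solved via an alternating optimization algorithm. The two separated sub-problems are convex and are solved alternately. We initialize the matrix variables $\bm{I}_0$ and $\bm{R}_0$. Denote by $\bm{I}_k$ and $\bm{R}_k$ the illumination and reflectance components at the $k$-th ($k = 0, 1, \ldots, K$) iteration, respectively, where $K$ is the maximum iteration number. By optimizing one variable at a time while fixing the other, we alternately update the two variables as follows:
a) Update $\bm{I}$ while fixing $\bm{R}$. With $\bm{R}_k$ in the $k$-th iteration, the optimization problem with respect to $\bm{I}$ becomes:
$$\min_{\bm{I}} \|\bm{O} - \bm{I} \odot \bm{R}_k\|_F^2 + \alpha \|\bm{S}_0 \odot \nabla \bm{I}\|_F^2. \tag{11}$$
To solve the problem (11), we reformulate it into a vectorized format. To this end, with the vectorization operator $\text{vec}(\cdot)$, we denote the vectors $\bm{o} = \text{vec}(\bm{O})$, $\bm{i} = \text{vec}(\bm{I})$, $\bm{r}_k = \text{vec}(\bm{R}_k)$, $\bm{s}_0 = \text{vec}(\bm{S}_0)$, which are of length $nm$ for an image of size $n \times m$. Denote by $\bm{G}$ the Toeplitz matrix of the discrete gradient operator with forward difference, then we have $\bm{G}\bm{i} = \text{vec}(\nabla \bm{I})$. Denote by $\bm{D}_{\bm{r}_k} = \text{diag}(\bm{r}_k)$, $\bm{D}_{\bm{s}_0} = \text{diag}(\bm{s}_0)$ the matrices with $\bm{r}_k$, $\bm{s}_0$ lying on the main diagonals, respectively. Then, the problem (11) is transformed into a standard least squares regression problem:
$$\min_{\bm{i}} \|\bm{o} - \bm{D}_{\bm{r}_k} \bm{i}\|_2^2 + \alpha \|\bm{D}_{\bm{s}_0} \bm{G} \bm{i}\|_2^2. \tag{12}$$
By differentiating problem (12) with respect to $\bm{i}$, and setting the derivative to $\bm{0}$, we have the following solution
$$\bm{i}_{k+1} = (\bm{D}_{\bm{r}_k}^{\top} \bm{D}_{\bm{r}_k} + \alpha \bm{G}^{\top} \bm{D}_{\bm{s}_0}^{\top} \bm{D}_{\bm{s}_0} \bm{G})^{-1} \bm{D}_{\bm{r}_k}^{\top} \bm{o}. \tag{13}$$
We then reformulate the obtained $\bm{i}_{k+1}$ into matrix format via the inverse vectorization $\bm{I}_{k+1} = \text{vec}^{-1}(\bm{i}_{k+1})$.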
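The step from the vectorized least squares problem to its closed-form solution can be spelled out as follows. The notation is assumed for this sketch: $\bm{i}$ and $\bm{o}$ are the vectorized illumination and observation, $\bm{D}_{\bm{r}}$ and $\bm{D}_{\bm{s}}$ are diagonal matrices holding the fixed reflectance and the structure weights, $\bm{G}$ is the discrete gradient operator, and $\alpha$ is the regularization weight.

```latex
\begin{aligned}
f(\bm{i}) &= \|\bm{o} - \bm{D}_{\bm{r}}\bm{i}\|_2^2
            + \alpha\,\|\bm{D}_{\bm{s}}\bm{G}\bm{i}\|_2^2, \\
\nabla f(\bm{i}) &= -2\,\bm{D}_{\bm{r}}^{\top}(\bm{o} - \bm{D}_{\bm{r}}\bm{i})
            + 2\alpha\,\bm{G}^{\top}\bm{D}_{\bm{s}}^{\top}\bm{D}_{\bm{s}}\bm{G}\,\bm{i}
            = \bm{0}, \\
\big(\bm{D}_{\bm{r}}^{\top}\bm{D}_{\bm{r}}
      + \alpha\,\bm{G}^{\top}\bm{D}_{\bm{s}}^{\top}\bm{D}_{\bm{s}}\bm{G}\big)\,\bm{i}
      &= \bm{D}_{\bm{r}}^{\top}\bm{o}.
\end{aligned}
```

The system matrix is symmetric positive semi-definite, and it is positive definite whenever the fixed reflectance has no zero entry, in which case the minimizer is unique.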
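b) Update $\bm{R}$ while fixing $\bm{I}$. After acquiring $\bm{I}_{k+1}$ from the solution (13), the optimization problem (10) with respect to $\bm{R}$ is similar to that of $\bm{I}$:
$$\min_{\bm{R}} \|\bm{O} - \bm{I}_{k+1} \odot \bm{R}\|_F^2 + \beta \|\bm{T}_0 \odot \nabla \bm{R}\|_F^2. \tag{14}$$
Similarly, we reformulate the problem (14) into a vectorized format. Additionally, we denote $\bm{r} = \text{vec}(\bm{R})$, $\bm{i}_{k+1} = \text{vec}(\bm{I}_{k+1})$, $\bm{t}_0 = \text{vec}(\bm{T}_0)$, which are of length $nm$. We also have $\bm{G}\bm{r} = \text{vec}(\nabla \bm{R})$. Denote by $\bm{D}_{\bm{i}_{k+1}} = \text{diag}(\bm{i}_{k+1})$, $\bm{D}_{\bm{t}_0} = \text{diag}(\bm{t}_0)$ the matrices with $\bm{i}_{k+1}$, $\bm{t}_0$ lying on the main diagonals, respectively. Then, the problem (14) is also transformed into a standard least squares problem:
$$\min_{\bm{r}} \|\bm{o} - \bm{D}_{\bm{i}_{k+1}} \bm{r}\|_2^2 + \beta \|\bm{D}_{\bm{t}_0} \bm{G} \bm{r}\|_2^2. \tag{15}$$
By differentiating the problem (15) with respect to $\bm{r}$, and setting the derivative to $\bm{0}$, we have the following solution
$$\bm{r}_{k+1} = (\bm{D}_{\bm{i}_{k+1}}^{\top} \bm{D}_{\bm{i}_{k+1}} + \beta \bm{G}^{\top} \bm{D}_{\bm{t}_0}^{\top} \bm{D}_{\bm{t}_0} \bm{G})^{-1} \bm{D}_{\bm{i}_{k+1}}^{\top} \bm{o}. \tag{16}$$
We then reformulate the obtained $\bm{r}_{k+1}$ into the matrix format via the inverse vectorization $\bm{R}_{k+1} = \text{vec}^{-1}(\bm{r}_{k+1})$.
The above alternating updates are repeated until the convergence condition is satisfied or the number of iterations exceeds a preset threshold. The convergence condition of the alternating optimization algorithm is that $\|\bm{I}_{k+1} - \bm{I}_k\|_F / \|\bm{I}_k\|_F \le \varepsilon$ or $\|\bm{R}_{k+1} - \bm{R}_k\|_F / \|\bm{R}_k\|_F \le \varepsilon$ is satisfied, or that the maximum iteration number $K$ is reached. The threshold $\varepsilon$ and the maximum iteration number $K$ are fixed in our experiments. Our STAR model (10) can be efficiently solved since there are only two variables in problem (10) and each sub-problem has a closed-form solution.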
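The alternating scheme can be sketched on a tiny image with dense linear algebra. This is an illustrative sketch, not the released code. All function names are hypothetical. The gradient operator stacks horizontal and vertical forward differences and reuses the same weight for both directions. The initialization (illumination equal to the observation, reflectance all ones), the tolerance, and the values of `alpha` and `beta` are assumptions. A practical implementation would need sparse matrices.

```python
import numpy as np


def gradient_matrix(h, w):
    """Dense forward-difference operator for a row-major vectorized h-by-w image.
    Rows 0..n-1 hold horizontal differences, rows n..2n-1 vertical ones;
    differences that would cross the far border are left as zero rows."""
    n = h * w
    G = np.zeros((2 * n, n))
    for r in range(h):
        for c in range(w):
            p = r * w + c
            if c + 1 < w:
                G[p, p], G[p, p + 1] = -1.0, 1.0
            if r + 1 < h:
                G[n + p, p], G[n + p, p + w] = -1.0, 1.0
    return G


def solve_subproblem(fixed, o, weights, lam, G):
    """Closed-form minimizer of ||o - diag(fixed) x||^2 + lam * ||diag(weights) G x||^2."""
    A = np.diag(fixed ** 2) + lam * (G.T * weights ** 2) @ G
    return np.linalg.solve(A, fixed * o)


def objective(i, r, o, s, t, alpha, beta, G):
    """Data term plus the two weighted gradient penalties."""
    return (np.sum((o - i * r) ** 2)
            + alpha * np.sum((s * (G @ i)) ** 2)
            + beta * np.sum((t * (G @ r)) ** 2))


def alternate(O, S0, T0, alpha, beta, max_iter=20, tol=1e-2):
    h, w = O.shape
    G = gradient_matrix(h, w)
    o = O.ravel()
    s = np.tile(S0.ravel(), 2)  # same weight for both difference directions (assumption)
    t = np.tile(T0.ravel(), 2)
    i, r = o.copy(), np.ones_like(o)  # assumed initialization
    history = [objective(i, r, o, s, t, alpha, beta, G)]
    for _ in range(max_iter):
        i_new = solve_subproblem(r, o, s, alpha, G)      # update illumination
        r_new = solve_subproblem(i_new, o, t, beta, G)   # update reflectance
        err_i = np.linalg.norm(i_new - i) / np.linalg.norm(i)
        err_r = np.linalg.norm(r_new - r) / np.linalg.norm(r)
        i, r = i_new, r_new
        history.append(objective(i, r, o, s, t, alpha, beta, G))
        if err_i <= tol or err_r <= tol:
            break
    return i.reshape(h, w), r.reshape(h, w), history


rng = np.random.default_rng(0)
O = rng.uniform(0.1, 1.0, size=(6, 6))
S0 = np.ones_like(O)  # uniform weights keep the demo simple
T0 = np.ones_like(O)
I_hat, R_hat, history = alternate(O, S0, T0, alpha=1.0, beta=0.1)
```

Because each update minimizes the objective exactly over one block of variables, the recorded objective values never increase from one iteration to the next.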
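Convergence Analysis. The objective function (10) is not jointly convex in $\bm{I}$ and $\bm{R}$ because of the product $\bm{I} \odot \bm{R}$, but each sub-problem is convex and is solved exactly in closed form. The objective value is therefore non-increasing over the iterations of Algorithm 1 and bounded below by zero, so it converges. In Figure 4, we plot the average convergence curves of the relative errors of $\bm{I}$ and $\bm{R}$ on the 35 low-light images collected from [17, 19, 24, 9]. One can see that both errors become small within 10 iterations.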
IV-C Updating Structure and Texture Awareness
Until now, we have obtained the decomposition $\bm{O} = \bm{I} \odot \bm{R}$. To achieve better estimation of illumination and reflectance, we update the structure and texture aware maps $\bm{S}_0$ and $\bm{T}_0$, and then solve the renewed problem (10). The alternating updates of ($\bm{S}_0$, $\bm{T}_0$) and ($\bm{I}$, $\bm{R}$) are repeated for $L$ iterations. The value of $L$ is chosen to balance the speed-accuracy trade-off of the proposed STAR model in our experiments. We summarize the updating procedures in Algorithm 2.
Complexity Analysis. Now we discuss the complexity analysis of the proposed Algorithms 1 and 2. Assume that the input image is of size . In Algorithm 1, the costs for updating and are both due to the diagonalization operations, where is the number of iterations in Algorithm 1. In Algorithm 2, the costs for updating and are also , where is the number of updating in Algorithm 2. As such, the overall complexity of our STAR for Retinex decomposition is .
V Experiments
In this section, we evaluate the qualitative and quantitative performance of the proposed Structure and Texture Aware Retinex (STAR) model on Retinex decomposition (§V-B). In §V-C, we also perform an ablation study on illumination and reflectance decomposition to gain deeper insights into the proposed STAR Retinex model. All these experiments are run on a Huawei Matebook X Pro laptop with an Intel Core i5 8265U CPU and 8GB memory.
V-A Implementation Details
The input RGB-color image is first transformed into the Hue-Saturation-Value (HSV) space. Since the Value (V) channel reflects the illumination and reflectance information, we only process this channel, and transform the processed image from the HSV space to RGB-color space, similar to [19, 9]. In our experiments, we empirically set the parameters as . We also compare with a Baseline of our STAR, in which we set in (10). Due to considering local variance information, the EMLV filter (Eqn. 9) can reveal details and preserve structures better than the ETV filter (Figure 2). We will perform ablation study on these points in §V-C.
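Processing only the V channel and converting back to RGB can be done without an explicit HSV round trip. Since V is the per-pixel maximum of the three channels, and H and S depend only on channel ratios, scaling all three channels by the ratio of new to old V changes V alone. The sketch below assumes RGB values in [0, 1]. `process_value_channel` is a hypothetical helper, and the square root stands in for whatever processing produces the new V channel. The result is cross-checked against Python's standard `colorsys` module.

```python
import colorsys

import numpy as np


def process_value_channel(rgb, fn, eps=1e-8):
    """Apply fn to the HSV Value channel of an RGB image in [0, 1] and return RGB.
    Hue and saturation are untouched because all channels share one scale factor."""
    v = rgb.max(axis=-1)                      # V channel of HSV
    v_new = np.clip(fn(v), 0.0, 1.0)          # processed V channel
    return rgb * (v_new / (v + eps))[..., None]


rgb = np.array([[[0.20, 0.10, 0.05], [0.02, 0.04, 0.08]]])
out = process_value_channel(rgb, np.sqrt)

# Reference: explicit RGB -> HSV -> RGB round trip for the first pixel.
h, s, v = colorsys.rgb_to_hsv(*rgb[0, 0])
expected = colorsys.hsv_to_rgb(h, s, np.sqrt(v))
```

One caveat of this shortcut is that a pure black pixel stays black, whereas an explicit HSV round trip would map it to a gray level equal to the new V.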
V-B Retinex Decomposition
The Retinex decomposition includes illumination and reflectance estimation. Accurate illumination estimation should not distort the structure, while being spatially smooth. Meanwhile, accurate reflectance should reveal the details of the observed scene. The ground truths for the illumination and reflectance components are difficult to generate, and hence quantitative evaluation of existing Retinex decomposition methods remains very difficult.
To evaluate the effectiveness of the proposed STAR model, we perform qualitative comparisons on both illumination and reflectance estimation with the Baseline, the conventional Multi-scale Retinex (MSR) [27], and several state-of-the-art Retinex models, including Simultaneous Illumination and Reflectance Estimation (SIRE) [17], Weighted Variation Model (WVM) [19], Joint intrinsic-extrinsic Prior (JieP) model [9], Robust Retinex Method (RRM) [34], and Retinex Decomposition based Generative Adversarial Network (RDGAN) [50]. Similar to these methods, we perform Retinex decomposition on the V channel of the HSV space, and transform the decomposed components back to the RGB space. Some visual results on two common test images in the 35 low-light images collected from [17, 19, 24, 9] are shown in Figures 5 and 6. It can be seen that, in the proposed STAR model, the structure awareness scheme enforces piece-wise smoothness, while the texture awareness scheme preserves details across the image. As can be seen in Figures 5 and 6 (h), (i), (e)-(g), the proposed STAR method preserves better the structure of the three black regions on the white car, and reveals more details of the texture on the wall, than the Baseline and the other methods such as WVM [19], JieP [9], and RRM [34]. More comparisons on Retinex decomposition are provided in the Supplementary File.
Comparison on speed. In Table I, we also compare the computational time of different Retinex image decomposition methods on an RGB image. We observe that our STAR is faster than WVM [19] and RRM [34], but slower than JieP [9] and SIRE [17]. Though not the fastest method, our STAR achieves better decomposition performance than the faster methods JieP [9] and SIRE [17].
V-C Validation of the Proposed STAR Model
Here, we conduct a detailed examination of our STAR for Retinex decomposition. We assess 1) the choice of the weighting scheme (ETV or EMLV) in our STAR; 2) the importance of structure and texture awareness to our STAR; 3) the influence of the exponents $\gamma_s$ and $\gamma_t$ on our STAR; 4) how to determine the regularization parameters $\alpha$ and $\beta$ in our STAR; and 5) the necessity of updating the structure and texture maps in our STAR.
1. The influence of the weighting scheme (ETV or EMLV) on our STAR. To study the influence of the weighting scheme (ETV or EMLV) on our STAR, we use the ETV filter (8) to set $\bm{S}_0$ and $\bm{T}_0$ in (10) and update them as Algorithm 2 describes, which gives another STAR model: STAR-ETV. The default STAR model is termed STAR-EMLV. From Figure 7, one can see that the STAR-ETV model tends to provide little structure in illumination, while losing texture information in reflectance. By employing the EMLV filter as the weighting matrix, the proposed STAR (STAR-EMLV) method maintains the structure and texture better than the STAR-ETV model.
2. Is structure and texture awareness important? To answer this question, we remove the structure awareness or the texture awareness from (10), keep the updating scheme of Algorithm 2, and thus have two baselines: STAR w/o Structure and STAR w/o Texture. Note that if we set $\bm{S}_0$ or $\bm{T}_0$ in (10) to an identity matrix of matching size, the performance of the corresponding STAR model is very bad. From Figure 8, one can see that STAR w/o Structure tends to provide little structural information in illumination, while STAR w/o Texture has little influence on illumination and reflectance. By considering both, the proposed STAR decomposes the structure/texture components accurately.
3. How do the parameters and influence STAR? The are key parameters for the structure and texture awareness of STAR. In Figure 9, one can see that STAR with ((d) and (h)) produces reasonable results, STAR with ((c) and (g)) can barely distinguish the illumination and reflectance, while STAR with ((b) and (f)) confuses illumination and reflectance to a great extent. Since we regularize more on (), and in (f) are not exactly the same as and in (b), respectively.
4. How to determine the parameters and ? The scaling and relative size of these two parameters determine the trade-off in regularization strength between the structure and texture components. To determine reasonable values, we run Retinex decomposition experiments on the “kodim07” image (described as “a shuttered window partially masked by flowering bush”) by our STAR with . To avoid extensive parameter tuning, we did not test our STAR with other values in . The original image is available at http://r0k.us/graphics/kodak/kodim07.html. The illumination and reflectance components decomposed by our STAR with different and values are shown in Figure 11. Due to limited space, here we only show the components by our STAR with . Other comparisons by our STAR with and on other images are provided in the Supplementary File. We observe that our STAR with achieves better illumination performance than the other cases. By fixing the , we observe that the reflectance shows more details with smaller . Similar results can be found on other images. Therefore, we set and for our STAR.
5. Is updating necessary? We also study the effect of the updating iteration number on STAR. To do so, we simply set in STAR and evaluate its Retinex decomposition performance. From Figure 10, one can see that the illumination becomes more structural while reflectance presents more details with more iterations.
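To make the role of the exponent concrete, the following is a minimal sketch of an exponentiated local derivative response on a toy image. It is not the authors' implementation: the helper names, the 3x3 window, and the "absolute value of the windowed mean gradient" form of the local response are assumptions made for illustration. The toy image contains a weak oscillating texture on both sides of one strong edge, and the sketch reports how strongly the edge stands out from the texture for a given exponent.

```python
def horizontal_gradient(img):
    """Forward difference along each row; the last column gets a zero gradient."""
    grads = []
    for row in img:
        last = len(row) - 1
        grads.append([row[min(x + 1, last)] - row[x] for x in range(len(row))])
    return grads


def mean_local_gradient(grad, radius=1):
    """Absolute value of the mean gradient in a (2r+1)x(2r+1) window (assumed form)."""
    h, w = len(grad), len(grad[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                grad[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = abs(sum(window) / len(window))
    return out


def exponentiate(response, gamma):
    """Raise every local response to the power gamma."""
    return [[v ** gamma for v in row] for row in response]


def edge_to_texture_ratio(gamma):
    """Ratio of the response at the strong edge to the response inside the texture."""
    # Hypothetical toy row: weak +/-0.02 texture with one strong edge in the middle.
    row = [0.20, 0.22, 0.20, 0.22, 0.80, 0.82, 0.80, 0.82]
    img = [row[:] for _ in range(3)]
    resp = exponentiate(mean_local_gradient(horizontal_gradient(img)), gamma)
    return resp[1][3] / resp[1][1]  # x=3 straddles the edge, x=1 lies in the texture


for g in (0.5, 1.0, 1.5):
    print(g, edge_to_texture_ratio(g))
```

In this toy case the ratio grows when the exponent is above 1 and shrinks when it is below 1, which matches the intuition that a larger exponent emphasizes strong structural edges while a smaller one keeps weak texture responses comparatively visible.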
VI Other Applications
In this section, we apply the proposed STAR model on two other image processing applications: low-light image enhancement (§VI-A) and color correction (§VI-B).
VI-A Low-light Image Enhancement
Capturing images in low-light environments suffers from unavoidable problems, such as low visibility [37] and heavy noise degradation [61, 60, 59]. Low-light image enhancement aims to alleviate these problems by improving the visibility and contrast of the observed images. To preserve the color information, Retinex-based low-light image enhancement is often performed in the Value (V) channel of the Hue-Saturation-Value (HSV) domain.
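The V-channel workflow can be sketched as follows. This is an illustrative outline only, not the STAR pipeline: the illumination estimate is a crude stand-in (a 3x3 local maximum of V), and brightening the illumination by gamma correction with an exponent of 1/2.2 is an assumed, commonly used choice. The function names are hypothetical. Hue and saturation are left untouched, which is why the color information is preserved.

```python
import colorsys


def local_max(channel, radius=1):
    """Stand-in illumination estimate: local maximum in a (2r+1)x(2r+1) window."""
    h, w = len(channel), len(channel[0])
    return [
        [
            max(
                channel[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def enhance_low_light(rgb, gamma=2.2, eps=1e-6):
    """rgb: nested list of (r, g, b) tuples with components in [0, 1]."""
    hsv = [[colorsys.rgb_to_hsv(*px) for px in row] for row in rgb]
    value = [[px[2] for px in row] for row in hsv]
    illum = local_max(value)  # assumption: replace with a real Retinex decomposition
    out = []
    for y, row in enumerate(hsv):
        out_row = []
        for x, (hue, sat, val) in enumerate(row):
            light = max(illum[y][x], eps)
            refl = val / light  # reflectance, since V = illumination * reflectance
            new_val = min(1.0, refl * light ** (1.0 / gamma))  # brighten illumination only
            out_row.append(colorsys.hsv_to_rgb(hue, sat, new_val))
        out.append(out_row)
    return out


dark = [
    [(0.10, 0.05, 0.02), (0.08, 0.08, 0.04)],
    [(0.02, 0.04, 0.06), (0.05, 0.03, 0.02)],
]
bright = enhance_low_light(dark)
```

Because only the illumination of the V channel is adjusted, swapping the stand-in estimator for a proper decomposition changes the quality of the result but not the structure of the pipeline.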
Comparison methods and datasets. We compare the proposed STAR model with previous competing low-light image enhancement methods, including HE [11], MSRCR [27], Contextual and Variational Contrast (CVC) [10], Naturalness Preserved Enhancement (NPE) [51], Layered Difference Representation (LDR) [33], SIRE [17], Multi-scale Fusion (MF) [18], WVM [19], Low-light IMage Enhancement (LIME) [24], and JieP [9]. We evaluate these methods on 35 low-light images collected from [17, 19, 24, 9], and on the 200 low-light images provided in [51].
Objective metrics. We qualitatively and quantitatively evaluate these methods on the subjective and objective quality metrics of enhanced images, respectively. The compared methods are evaluated on two commonly used metrics: one no-reference image quality assessment (IQA) metric, Natural Image Quality Evaluator (NIQE) [40], and one full-reference IQA metric, Visual Information Fidelity (VIF) [46]. A lower NIQE value indicates better image quality, while a higher VIF value indicates better visual quality. The reason we employ VIF is that it is widely considered [19, 9, 24] to capture visual quality better than the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM) [53], which cannot be used in this task since no “ground truth” images are available.
Results are listed in Table II. One can see that the proposed STAR achieves lower NIQE and higher VIF results than the other competing methods. This indicates that the images enhanced by our STAR present better visual quality than those of other methods. Besides, without the structure or texture weighting scheme, the proposed STAR model produces inferior performance on these two objective metrics. This demonstrates the effectiveness of the proposed structure and texture aware components for low-light image enhancement. In Figure 12, we compare the visual quality of state-of-the-art methods [17, 19, 24, 9, 34]. As can be seen, on several representative images, our STAR achieves visually clear content while enhancing the illumination naturally, in agreement with our objective results. Besides, from the -th column of Figure 12, one can observe that the proposed STAR achieves comparable performance with the competing methods [19, 24, 9] on noise suppression [58, 25, 44].
VI-B Color Correction
In Retinex theory [32, 30], if the estimation is performed in each channel of the RGB-color space, the estimated reflectance contains the original color information of the observed scene. Therefore, the Retinex model can be applied to color correction tasks. To demonstrate the estimation accuracy of the illumination and reflectance components, we evaluate the color correction performance of the proposed STAR model and the competing methods [17, 19, 9].
We first compare the performance of the proposed STAR with several leading Retinex methods: SIRE [17], WVM [19], JieP [9], and LSRS [21]. The original images and color corrected images are downloaded from the Color Constancy Website. In Figure 13, we provide some visual results of color correction using different methods. One can see that, all these methods achieve satisfactory qualitative performance (from -nd to -th rows, -st and -rd columns of Figure 13), when compared with the original images (-st row, -nd and -th columns of Figure 13) and ground truth images (-st row, -st and -rd columns of Figure 13). To verify the accuracy of color correction using these methods, we employ the S-CIELAB color metric [66] to measure the color errors on spatial processing. The S-CIELAB errors between the ground truth and corrected images of different methods are shown from the -nd to -th rows, -nd and -th columns of Figure 13, respectively. As can be seen, the spatial locations of the errors, i.e., the green areas, of the STAR corrected images are much smaller than other methods. This indicates that the results of STAR are closer to the ground truth images (-st row, -st and -rd columns of Figure 13) than other methods.
Furthermore, we perform a quantitative comparison of the proposed STAR with several leading color constancy methods [5, 63, 9] on the Color-Checker dataset [22]. This dataset contains a total of 568 images of indoor and outdoor scenes taken with two high-quality cameras (Canon 5D and Canon 1D). Each image contains a MacBeth color-checker for accuracy reference. The average illumination of each channel is computed separately in the RGB color space and used as the estimated illumination for that channel. The results in terms of Mean Angular Error (MAE, lower is better) between the corrected image and the ground truth image are listed in Table III. As can be seen, the proposed STAR method achieves lower MAE results than the competing methods on the color constancy problem.
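The per-channel averaging and the angular error can be sketched as follows. This is a minimal illustration under an assumption: it follows the common color constancy convention of measuring the angle, in degrees, between two RGB vectors (an estimate and its ground truth). The exact evaluation protocol behind Table III is not spelled out here, and the function names are hypothetical.

```python
import math


def channel_means(illum):
    """Per-channel mean of an illumination map given as nested (r, g, b) tuples."""
    count = sum(len(row) for row in illum)
    return tuple(sum(px[c] for row in illum for px in row) / count for c in range(3))


def angular_error_deg(estimate, truth):
    """Angle in degrees between two RGB vectors; it ignores their overall scale."""
    dot = sum(a * b for a, b in zip(estimate, truth))
    norm = math.sqrt(sum(a * a for a in estimate)) * math.sqrt(sum(b * b for b in truth))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def mean_angular_error(estimates, truths):
    """Average angular error over a set of images."""
    errors = [angular_error_deg(e, t) for e, t in zip(estimates, truths)]
    return sum(errors) / len(errors)


# Hypothetical two-pixel illumination map and a made-up ground truth illuminant.
estimate = channel_means([[(0.5, 0.4, 0.3), (0.7, 0.4, 0.3)]])
print(angular_error_deg(estimate, (0.6, 0.4, 0.3)))
```

The error depends only on the direction of the vectors, so a globally brighter or darker estimate with the same color cast is not penalized.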
VII Conclusion
In this paper, we proposed a Structure and Texture Aware Retinex (STAR) model for illumination and reflectance decomposition. We first introduced an Exponentialized Mean Local Variance (EMLV) filter to extract the structure and texture maps from the observed image. The extracted maps were employed to regularize the illumination and reflectance components. In addition, we proposed to alternately update the structure/texture maps and estimate the illumination/reflectance for better Retinex decomposition performance. The proposed STAR model is efficiently solved by a standard alternating optimization algorithm. Comprehensive experiments on Retinex decomposition, low-light image enhancement, and color correction demonstrated that the proposed STAR model achieves better quantitative and qualitative performance than representative Retinex decomposition methods.
The Retinex decomposition community also faces an open problem. To the best of our knowledge, there are no reasonable ground truths for the decomposed illumination and reflectance in Retinex decomposition. Most previous Retinex methods [17, 34, 50] compared the visual quality of the decomposed illumination and reflectance components by subjective evaluation. In our opinion, the reasons are possibly two-fold: 1) it is hard to synthesize the illumination and reflectance components simultaneously so that they produce a meaningful natural image (much like the “chicken or the egg” problem [6, 67]); 2) designing meaningful quantitative metrics (besides PSNR and SSIM [53]) for the ground truth components (if any) is also difficult [52], since both the illumination and reflectance components should be considered when evaluating the performance of Retinex methods. To perform quantitative comparisons, these methods ran experiments on low-light image enhancement and color correction, which largely depend on the quality of the decomposed components, to demonstrate the advantages of the developed Retinex decomposition methods. How to directly benchmark existing Retinex methods on synthetic datasets with ground truths remains a challenging problem, and we will explore this direction in future work.
References
- [1] J. T. Barron. Convolutional color constancy. In IEEE International Conference on Computer Vision (ICCV), pages 379–387, 2015.
- [2] J. T. Barron and Y. Tsai. Fast Fourier color constancy. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 886–894, 2017.
- [3] H. Barrow, J. Tenenbaum, A. Hanson, and E. Riseman. Recovering intrinsic scene characteristics. Computer Vision System, 2:3–26, 1978.
- [4] M. Bell and E. T. Freeman. Learning local evidence for shading and reflectance. In IEEE International Conference on Computer Vision (ICCV), volume 1, pages 670–677. IEEE, 2001.
- [5] S. Bianco, C. Cusano, and R. Schettini. Color constancy using CNNs. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 81–89, 2015.
- [6] A. Borji, M.-M. Cheng, Q. Hou, H. Jiang, and J. Li. Salient object detection: A survey. Computational Visual Media, 5(2):117–150, Jun 2019.
- [7] D. H. Brainard and B. A. Wandell. Analysis of the Retinex theory of color vision. Journal of the Optical Society of America A, 3(10):1651–1661, 1986.
- [8] G. Buchsbaum. A spatial processor model for object colour perception. Journal of the Franklin Institute, 310(1):1–26, 1980.
