Fast Fourier Color Constancy and Grayness Index for ISPA Illumination Estimation Challenge

Yanlin Qian; Ke Chen; Huanglin Yu

arXiv:1908.02076·cs.CV·September 18, 2019

TL;DR

This paper presents two methods for illumination estimation in images, one based on Fourier transform and the other on Grayness Index, achieving competitive rankings in a challenge.

Contribution

It applies two existing approaches, FFCC and Grayness Index, to illumination estimation and reports how they perform in a competitive challenge setting.

Findings

1. Fourier-transform-based method ranked 3rd
2. Grayness Index-based method ranked 6th
3. Both methods show promising results in illumination estimation

Abstract

We briefly introduce two submissions to the Illumination Estimation Challenge of the Int'l Workshop on Color Vision, affiliated with the 11th Int'l Symposium on Image and Signal Processing and Analysis. The Fourier-transform-based submission ranked 3rd, and the statistical gray-pixel-based one ranked 6th.

Tables

Table I: Illumination Estimation Leaderboard. The notation $\Uparrow(n)$ refers to the method being ranked $n$th. The numbers on Cube+ training data are obtained using 3-fold cross-validation for FFCC and are not required for the submission. We list them for a more comprehensive comparison.

	Undisclosed Testing Data
Method	Median	Mean	Trimean
$\Uparrow(1)$ Color Cerberus et al. [11]	1.51	2.65	1.64
$\Uparrow(2)$ FFCC Model J (Barron)	1.59	2.49	1.73
$\Uparrow(3)$ FFCC Model P (ours)	1.64	2.93	1.77
$\Uparrow(6)$ GI (ours)	2.10	6.87	2.50

Table II: Results on Gehler-Shi Dataset

	3-fold cross validation
Method	Median	Mean	Trimean
FFCC Model P	0.96	1.78	1.14
GI	1.87	3.07	2.16

Table III: Results on NUS 8-camera Dataset

	3-fold cross validation
Method	Median	Mean	Trimean
FFCC Model P	1.31	1.99	1.43
GI	1.97	2.91	2.13

Equations

$$I^{c}(p) = W^{c}(p) \circ L^{c}(p) \qquad (1)$$

$$u(p) = \log\big(I^{g}(p) / I^{r}(p)\big), \quad v(p) = \log\big(I^{g}(p) / I^{b}(p)\big) \qquad (2)$$

$$\delta \log I^{c}(p) = \delta \log W^{c}(p) + \delta \log L^{c}(p) \qquad (3)$$

$$\delta \log I^{c}(p) = \delta \log W^{c}(p) \qquad (4)$$

Taxonomy

Topics: Color Science and Applications · Image Enhancement Techniques · Remote-Sensing Image Classification

Full text

Fast Fourier Color Constancy and Grayness Index for ISPA Illumination Estimation Challenge

Yanlin Qian

Tampere University

Ke Chen, Huanglin Yu

South China University of Technology

Abstract

We briefly introduce two submissions to the Illumination Estimation Challenge of the Int’l Workshop on Color Vision, affiliated with the 11th Int’l Symposium on Image and Signal Processing and Analysis. The Fourier-transform-based submission ranked 3rd, and the statistical gray-pixel-based one ranked 6th.

Index Terms:

color constancy, illumination, FFCC, gray pixel

I Introduction

Color constancy refers to the ability of a camera to capture the intrinsic color of objects regardless of the scene illumination. Research on color constancy generally involves two steps: first, estimating the illumination as accurately as possible; second, applying color correction to achieve auto white balance. This work is entirely about the first step, as it is the most demanding part.

Many advanced deep learning methods (e.g. [6, 13, 7, 4]) have been proposed, and they achieve nearly saturated results on the two mainstream color constancy datasets, the Gehler-Shi dataset [12] and the NUS 8-camera dataset [5]. Along with the illumination estimation challenge, a new and harder benchmark, the Cube+ dataset [1], is promoted. It is therefore worthwhile to test some existing methods on it and see how the experimental results change.

Unlike the Gehler-Shi and NUS 8-camera datasets, Cube+ contains more images (up to 1707) with diverse ground-truth annotations. Please refer to [1] for the official description of this dataset. One notable point is that only a single camera (Canon EOS 550D) was used to collect the training and testing data. This can be seen as both an advantage and a disadvantage. On one hand, we can now test any method independently of the varying photometric properties of different sensors. On the other hand, we are blind to varying camera sensitivity. This makes Cube+ “easier” for CNN-based methods.

Given that CNNs are a well-established tool and that many other teams will (we guess) use them for the challenge, we test some different approaches. The first submission (ranked 3rd on the leaderboard) is based on Barron’s Fast Fourier Color Constancy (FFCC, [3]), since its source code is publicly available and it is practical in products (e.g. the Google Photos app). For how FFCC works and its insights, please refer to [3].

The other submission ranked 6th, outside the top three, but we still want to introduce it. It is based on Grayness Index (GI) [8], an advanced version of gray pixel methods [14, 9]. A comparison of gray pixel methods is given in [10]. It is learning-free, so it does not rely on training on the given 1707 images and the corresponding ground truth. We choose it because it is simple (a few lines of Matlab, no learning) and we are curious to see its limits on a more challenging dataset like Cube+. Please check [8] to find out how gray (achromatic) pixels can be identified.

II Methodology

In this section we describe the principles of the two approaches. Assuming a Lambertian model, narrow sensor response and uniform global illumination, the RGB value of a pixel can be expressed as:

$$I^{c}(p) = W^{c}(p) \circ L^{c}(p) \qquad (1)$$

which shows that the color value of the channel $c$ at the location $p$ in image $I$ is the product of the surface albedo $W^{c}(p)$ and the illumination color $L^{c}(p)$. In Barron’s work [2], the RGB value of a pixel $I(p)$ is transformed into the log-chroma measures:

$$u(p) = \log\big(I^{g}(p) / I^{r}(p)\big), \quad v(p) = \log\big(I^{g}(p) / I^{b}(p)\big) \qquad (2)$$

By framing the task of color constancy as in Equation 2, the global illumination $L$ becomes an additive offset in log-chroma space. This turns illumination estimation into a 2D spatial localization task: window-wise classifiers can be trained on image/ground-truth pairs so that they give maximum activation at the location of the illumination on the UV histogram. Barron proposed a convolutional filter on the UV histogram in [2], and further extended it in FFCC [3] by performing element-wise multiplication in Fourier space, which allows “wrapped” inputs and faster inference.
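The two ideas in this paragraph can be illustrated with a minimal NumPy sketch. It is not the FFCC implementation from [3]: the function names, the bin count, the bin size and the histogram origin are all illustrative assumptions. It shows that a global illuminant shifts every pixel's $(u, v)$ by the same constant, and that an element-wise product in Fourier space equals a circular (wrap-around) convolution on the histogram.

```python
import numpy as np

def log_chroma(rgb):
    """Map an (N, 3) array of positive RGB values to log-chroma (u, v), as in Equation 2."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.log(g / r), np.log(g / b)

def wrapped_histogram(u, v, n_bins=64, bin_size=1.0 / 32, start=-1.0):
    """Normalised 2D histogram whose bin indices wrap around modulo n_bins.

    n_bins, bin_size and start are illustrative values, not those of [3].
    """
    iu = np.round((u - start) / bin_size).astype(int) % n_bins
    iv = np.round((v - start) / bin_size).astype(int) % n_bins
    hist = np.zeros((n_bins, n_bins))
    np.add.at(hist, (iu, iv), 1.0)  # accumulate counts, including repeated bins
    return hist / max(hist.sum(), 1.0)

def fourier_filter(hist, kernel):
    """Circular convolution computed as an element-wise product in Fourier space."""
    return np.real(np.fft.ifft2(np.fft.fft2(hist) * np.fft.fft2(kernel)))

# Synthetic scene: random albedo lit by one global illuminant (Equation 1).
rng = np.random.default_rng(0)
albedo = rng.uniform(0.05, 1.0, size=(1000, 3))
illuminant = np.array([0.8, 1.0, 0.5])
image = albedo * illuminant

u_w, v_w = log_chroma(albedo)
u_i, v_i = log_chroma(image)
# The illuminant appears as one constant offset shared by all pixels.
shift_u = np.log(illuminant[1] / illuminant[0])
shift_v = np.log(illuminant[1] / illuminant[2])

hist = wrapped_histogram(u_i, v_i)
# A delta kernel makes the wrap-around visible: filtering just rolls the histogram.
delta = np.zeros_like(hist)
delta[3, 5] = 1.0
rolled = fourier_filter(hist, delta)
```

In a learned model the delta kernel would be replaced by trained filters, and the location of the maximum response would be mapped back to an illuminant estimate.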

Grayness Index [8] addresses color constancy from a different angle. Applying the logarithm and a Laplacian-of-Gaussian filter $\delta$ to Equation 1, we get:

$$\delta \log I^{c}(p) = \delta \log W^{c}(p) + \delta \log L^{c}(p) \qquad (3)$$

Assuming the illumination is constant over small local neighborhood (the same color and direction), Equation 3 simplifies to:

$$\delta \log I^{c}(p) = \delta \log W^{c}(p) \qquad (4)$$

which is the core of the gray pixel approach. $\delta\log I^{r}(p)=\delta\log I^{g}(p)=\delta\log I^{b}(p)$ indicates a perfect gray pixel at the location $p$. Based on Equation 4, Grayness Index accurately detects nearly gray pixels, which reflect the illumination.
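The step from Equation 1 to the gray-pixel condition can be spelled out as follows. This is a sketch of the reasoning, assuming that $\delta$ is linear with a zero-sum kernel (as a Laplacian-of-Gaussian is) and that $L^{c}$ is constant within the filter support; see [8] for the authors' own derivation.

```latex
\begin{aligned}
\log I^{c}(p) &= \log W^{c}(p) + \log L^{c}(p)
  && \text{(logarithm of Equation 1)} \\
\delta \log I^{c}(p) &= \delta \log W^{c}(p) + \delta \log L^{c}(p)
  && \text{(linearity of } \delta\text{)} \\
&= \delta \log W^{c}(p)
  && \text{(} \delta \text{ of a locally constant } \log L^{c} \text{ is } 0\text{)} \\
W^{r} = W^{g} = W^{b}
  \;&\Rightarrow\;
  \delta \log I^{r}(p) = \delta \log I^{g}(p) = \delta \log I^{b}(p)
  && \text{(achromatic surface)}
\end{aligned}
```

The last line is what makes the test usable in practice: it involves only the observed image $I$, not the unknown albedo or illuminant.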

Please refer to the original papers [3, 8] for more theoretical details and the implementation.
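As a rough illustration of the gray-pixel criterion in Equation 4, the sketch below scores each pixel by how much the filtered log-channels disagree and averages the most gray pixels to estimate the illuminant. It is not the released Grayness Index code: a 4-neighbour discrete Laplacian stands in for the Laplacian-of-Gaussian filter, and the function names, thresholds and the 1% selection fraction are assumptions made for this example.

```python
import numpy as np

def laplacian(x):
    """4-neighbour discrete Laplacian with wrap-around borders (stand-in for LoG)."""
    return (np.roll(x, 1, 0) + np.roll(x, -1, 0) +
            np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4.0 * x)

def grayness_map(image, eps=1e-6, min_contrast=1e-4):
    """Per-pixel deviation from the gray condition; values near 0 mean nearly gray."""
    d = np.stack([laplacian(np.log(image[..., c] + eps)) for c in range(3)], axis=-1)
    spread = d.max(axis=-1) - d.min(axis=-1)   # disagreement between channels
    contrast = np.abs(d).max(axis=-1)          # local structure strength
    # Flat regions satisfy the condition trivially, so they are excluded.
    return np.where(contrast > min_contrast, spread, np.inf)

def estimate_illuminant(image, top_fraction=0.01):
    """Average the RGB of the most gray pixels and return a unit-norm illuminant."""
    g = grayness_map(image).ravel()
    k = max(1, int(top_fraction * g.size))
    idx = np.argsort(g)[:k]
    est = image.reshape(-1, 3)[idx].mean(axis=0)
    return est / np.linalg.norm(est)

# Synthetic test image: left half achromatic texture, right half colored texture.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.2, 1.0, size=(64, 64, 3))
gray = albedo[:, :32, 0].copy()
albedo[:, :32, :] = gray[..., None]
illuminant = np.array([0.9, 1.0, 0.6])
image = albedo * illuminant

estimate = estimate_illuminant(image)
truth = illuminant / np.linalg.norm(illuminant)
```

On this synthetic image the selected pixels all come from the achromatic half, so the estimate matches the true illuminant direction. A scene with no gray surfaces gives the selection nothing reliable to pick from.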

III What we do differently

FFCC-based method. FFCC has multiple variants (Models A to Q, depending on how many channels are used, thumbnail or full-size input, and photo exif or deep features). As the challenge page provides exif information, we adopt Model P of FFCC, which is the best variant according to the results on the Gehler-Shi dataset. For the deep feature required by Model P, we extract the neural activations from the “fc7” layer of a 16-layer VGG network pretrained on Places365 [15]. With exif and deep features for each image of the Cube+ dataset, we tuned the hyperparameters of FFCC and trained our model in the same way as [3].

On the last submission day we realized we had made a mistake. The released testing data do not contain any exif, so we believed our trained model would definitely fail on them. With limited time, we computed the mean exif matrix of all training images and used it for each testing image. With this substitution the trained Model P works, but not to a satisfying degree. We submitted the results given by this Model P.
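The fallback described above amounts to replacing a missing per-image feature with the training-set mean. A minimal sketch follows. How FFCC encodes exif is not described here, so the two-column feature layout, the values and the function name are hypothetical.

```python
import numpy as np

def mean_feature_fallback(train_features, n_test):
    """Return the training-set mean feature vector repeated for n_test images."""
    mean_vec = np.asarray(train_features, dtype=float).mean(axis=0)
    return np.tile(mean_vec, (n_test, 1))

# Made-up per-image metadata features (two hypothetical columns).
train_exif = np.array([[-5.0, 4.6],
                       [-7.0, 5.3],
                       [-6.0, 6.0]])

# Every test image receives the same averaged feature vector.
test_exif = mean_feature_fallback(train_exif, n_test=4)
```

Because every test image receives the same vector, the model can no longer adapt to per-image capture settings.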

GI-based method. We did not change anything: we downloaded the code from the GitHub page (https://github.com/yanlinqian/Grayness-Index) and ran it on the testing images.

IV Results

Table I gives a simple leaderboard. Our FFCC-based Model P suffers from the lack of exif for the testing data and is ranked after FFCC Model J, which does not rely on exif. This suggests that using a biased exif leads to worse results. The 3-fold cross-validation of FFCC Model P on the Cube+ training data again shows the importance of exif information.

The GI-based method obtains a competitive median angular error with no learning. It inherits the simplicity of the classical gray world method, but also the same drawback: when the scene is dominated by a specific color or the image is a local patch, its result is seriously biased. This shows in its mean error, as a few “outlier” images can produce very high angular errors and quickly inflate the mean. This is consistent with the observations in [8]. Figure 1 illustrates several images in the testing set that are hard for GI. On the Cube+ training set, which contains far fewer “outlier” images, GI obtains more accurate and robust results (Table I, bottom).
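The effect of outliers on the three reported statistics can be made concrete with a short sketch. It assumes the usual definitions: angular error is the angle between the estimated and ground-truth illuminant vectors, and trimean is $(Q_1 + 2Q_2 + Q_3)/4$. The quartile convention and the error values are illustrative choices, not taken from the challenge.

```python
import math
import statistics

def angular_error_deg(estimate, truth):
    """Angle in degrees between two RGB illuminant vectors."""
    dot = sum(e * t for e, t in zip(estimate, truth))
    norm = (math.sqrt(sum(e * e for e in estimate)) *
            math.sqrt(sum(t * t for t in truth)))
    cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos))

def summarize(errors):
    """Median, mean and trimean of a list of angular errors."""
    q1, q2, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "median": q2,
        "mean": statistics.fmean(errors),
        "trimean": (q1 + 2 * q2 + q3) / 4,
    }

# Made-up error lists: the second adds a single badly estimated image.
clean = [1.0, 1.5, 2.0, 2.5, 3.0]
with_outlier = clean + [30.0]

stats_clean = summarize(clean)
stats_outlier = summarize(with_outlier)
```

With the single 30-degree outlier added, the median and trimean move only slightly while the mean more than triples. That is the same pattern as GI's row in Table I, where the mean is far above the median.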

We also show the results of the two methods on the Gehler-Shi dataset (Table II) and the NUS 8-camera dataset (Table III). Comparing the three tables, we find that the Cube+ dataset is the most challenging one. This is partially due to the large portion of local-patch images (Figure 1) in the Cube+ dataset.

V Conclusion

In this paper we test two non-deep-learning methods on the new, challenging Cube+ dataset. The FFCC method needs training and can exploit exif information (if provided), while Grayness Index is a simple statistical method that is sensitive to extremely hard images. Both methods find a reasonable position on the leaderboard. Our future plan is to make Grayness Index include more dichromatic cues to deal with very local images, while still keeping it simple.

Bibliography

[1] N. Banić and S. Lončarić. Unsupervised learning for color constancy. arXiv preprint arXiv:1712.00436, 2017.
[2] J. T. Barron. Convolutional color constancy. In ICCV, 2015.
[3] J. T. Barron and Y.-T. Tsai. Fast fourier color constancy. In CVPR, 2017.
[4] S. Bianco and C. Cusano. Quasi-unsupervised color constancy. In CVPR, 2019.
[5] D. Cheng, D. K. Prasad, and M. S. Brown. Illuminant estimation for color constancy: why spatial-domain methods work and the role of the color distribution. JOSA A, 31(5):1049–1058, May 2014.
[6] Y. Hu, B. Wang, and S. Lin. Fully convolutional color constancy with confidence-weighted pooling. In CVPR, 2017.
[7] Y. Qian, K. Chen, J. Kämäräinen, J. Nikkanen, and J. Matas. Recurrent color constancy. In ICCV, 2017.
[8] Y. Qian, J. Nikkanen, J. Kämäräinen, and J. Matas. On finding gray pixels. In CVPR, 2019.