Enhanced Local Binary Patterns for Automatic Face Recognition

Pavel Král; Ladislav Lenc; Antonín Vrba

arXiv:1702.03349·cs.CV·June 18, 2018

TL;DR

This paper introduces an improved local binary pattern descriptor for face recognition that is more robust to noise, illumination, and resolution variations, outperforming existing methods on benchmark datasets.

Contribution

A novel local binary pattern descriptor considering multiple pixels and neighborhoods, enhancing robustness and accuracy in face recognition tasks.

Findings

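01 Outperforms state-of-the-art methods on the UFI dataset and achieves a competitive recognition rate on FERET.

02 Handles the single-training-sample setting effectively and is robust to image resolution, except at very low resolutions.

03 Demonstrates robustness to noise, illumination changes and image variance.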

Abstract

This paper presents a novel automatic face recognition approach based on local binary patterns (LBP). The LBP descriptor considers a local neighbourhood of a pixel to compute the feature vector values. It is not very robust to image noise, image variance and differing illumination conditions. We address these issues by proposing a novel descriptor which considers more pixels and different neighbourhoods to compute the feature vector values. The proposed method is evaluated on two benchmark corpora, the UFI and FERET face datasets. We show experimentally that our approach outperforms state-of-the-art methods and is particularly effective in real conditions, where the above-mentioned issues are pronounced. We further show that the proposed method handles the one-training-sample problem well and is also robust to the image resolution.

Tables

Table I: Final results of the proposed approach on the UFI and FERET databases against several state-of-the-art methods (recognition rate in %)

Approach	UFI [%]	FERET [%]
SRC (Wagner et al. [15])	-	95.20
LBP (Ahonen et al. [1])	55.04	93.89
LBP$_{8,2}$	59.83	97.99
uniform LBP$_{8,2}$	53.39	97.66
LDP (Zhang et al. [9])	50.25	97.4
FS-LBP (Lenc et al. [14])	63.31	98.91
E-LBP$_{4,9,5}$ (proposed)	65.28	98.5

Equations

$$N_{i}=\begin{cases}0 & \text{if } g_{i}<g_{c}\\ 1 & \text{if } g_{i}\geq g_{c}\end{cases}$$

Full text

Enhanced Local Binary Patterns for Automatic Face Recognition

Pavel Král$^{1,2}$, Antonín Vrba$^{1}$, Ladislav Lenc$^{1,2}$

$^{1}$ Dept. of Computer Science & Engineering, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic

$^{2}$ New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic

Index Terms:

E-LBP, Enhanced Local Binary Patterns, Face Recognition, Local Binary Patterns, LBP

I Introduction

Automatic face recognition (AFR) consists of identifying a person from digital images using a computer. This field has been studied intensively during the past few decades, and its importance is constantly growing, particularly because of current security concerns.

Local binary patterns (LBP) have been shown to be an efficient image descriptor for several computer vision tasks, including automatic face recognition [1]. The descriptor considers a very small local neighbourhood of a pixel to compute the feature vector values. The individual values are computed from the differences between the intensity values of the central pixel and the surrounding pixels.

In this paper, we propose a novel image descriptor called enhanced local binary patterns (E-LBP). This method improves the original LBP operator by considering a larger central area and a larger neighbourhood to compute the feature vector values. These properties retain more information about the image structure and can compensate for some noise, image variance and differences between the training and test images. To the best of our knowledge, computing the LBP operator from more points in this way has not been done before, and it is thus the main contribution of this paper.

The proposed method is evaluated on two standard corpora, the UFI [2] and FERET [3] face datasets. The UFI dataset is chosen to show the results in real conditions, where the images are noisy, vary in pose and are illuminated differently. The FERET corpus is used to show the results for the one-training-sample problem, where only one image per person is available for training. It is therefore not possible to improve the results by a training step, as presented for instance in [4], and we thus focus on the descriptor itself.

II Related Work

Methods based on local binary patterns generally use LBP histograms computed in rectangular regions [1]. The concatenated histograms create face representation vectors which are then compared using a distance metric. Uniform local binary patterns are an interesting LBP extension [5] which reduces the histogram size to 59 bins. Ojala et al. [6] further use a circular neighbourhood created by $P$ points placed on a circle of radius $R$. This LBP variant is denoted as LBP$_{P,R}$.
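The histogram size of 59 can be checked with a short sketch. It assumes the usual definition of a uniform pattern from the LBP literature (at most two 0/1 transitions in the circular 8-bit code) and one shared bin for all remaining patterns; the function name `transitions` is illustrative only.

```python
def transitions(code, bits=8):
    """Count the 0/1 transitions in the circular bit string of `code`."""
    mask = (1 << bits) - 1
    # Rotate right by one bit so that each bit is aligned with its circular neighbour.
    rotated = ((code >> 1) | ((code & 1) << (bits - 1))) & mask
    return bin(code ^ rotated).count("1")


# Assumed definition: a pattern is uniform if it has at most two transitions.
uniform = [code for code in range(256) if transitions(code) <= 2]

# One bin per uniform pattern plus a single bin shared by all other patterns.
n_bins = len(uniform) + 1
print(len(uniform), n_bins)  # 58 59
```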

Li et al. [7] propose the dynamic threshold local binary pattern (DTLBP). They use the mean value of the neighbouring pixels and also the maximum contrast between the neighbouring points to compute the feature vector. Another LBP extension is local ternary patterns (LTP) [8], which use three states to capture the differences between the central pixel and the neighbouring ones.

Local derivative patterns (LDP) are proposed in [9]. The difference from the original LBP is that it uses the features of higher order. Davarzani et al. [10] propose a weighted and adaptive LBP-based texture descriptor. This approach successfully handles some issues of the previously proposed LBP-based approaches such as invariance to scaling, rotation, viewpoint variations and non-rigid deformations.

Elongated binary patterns [11] are another variant of the LBP using an elliptical instead of circular neighbourhood. The main advantage of this modification is that it retains better structural information in the images. Jin et al. [12] propose improved local binary patterns (ILBP). This method compares the intensities of neighbourhood pixels against the local mean pixel intensity (instead of the intensity of the central pixel).

Another interesting LBP adaptation proposed by Li et al. is extended local binary patterns [13]. This method introduces two different and complementary feature types (pixel intensities and differences).

The methods described above modify the LBP operator itself, whereas the creation of the feature vector and the recognition procedure usually remain similar. Both tasks are significantly improved by Lenc and Král [14] through automatic identification of the important facial points using Gabor wavelets and the k-means clustering algorithm. Lei et al. [4] further propose a learning step to improve the results of the LBP operator when more gallery images are available.

III Enhanced Local Binary Patterns for Face Recognition

III-A Local Binary Patterns

The original LBP operator [6] uses a $3\times 3$ square neighbourhood centred at a given pixel. The algorithm assigns a value of either 0 or 1 to each of the 8 neighbouring pixels according to Equation 1.

$$N_{i}=\begin{cases}0 & \text{if } g_{i}<g_{c}\\ 1 & \text{if } g_{i}\geq g_{c}\end{cases}\tag{1}$$

where $N_{i}$ is the binary value assigned to the neighbouring pixel $i\in\{1,\dots,8\}$, $g_{i}$ denotes the gray-level value of the neighbouring pixel $i$ and $g_{c}$ is the gray-level value of the central pixel. The resulting values are then concatenated into an 8-bit number. Its decimal representation is used to create the feature vector.
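As an illustration, the thresholding rule of Equation 1 and the concatenation into an 8-bit number can be sketched as follows. The order in which the neighbours are visited (clockwise from the top-left pixel) is an assumption, since the text does not prescribe a bit order; the names `OFFSETS` and `lbp_code` are illustrative and not taken from the authors' code.

```python
# Offsets (row, col) of the 8 neighbours, clockwise from the top-left one.
# This bit ordering is an assumption; the text does not prescribe one.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, row, col):
    """Return the 8-bit LBP code of the pixel at (row, col).

    `img` is a list of rows of gray-level values; (row, col) must not lie
    on the image border.
    """
    g_c = img[row][col]
    code = 0
    for dr, dc in OFFSETS:
        n_i = 1 if img[row + dr][col + dc] >= g_c else 0  # rule of Equation 1
        code = (code << 1) | n_i  # concatenate the binary values
    return code


patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 6, 3]]
print(lbp_code(patch, 1, 1))  # 0b01010100 = 84
```

With a different neighbour order the numeric code changes, but the histogram-based comparison stays consistent as long as the same order is used for every image.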

III-B Enhanced Local Binary Patterns (E-LBP)

We extend the original LBP operator by computing the feature values from point-sets instead of the isolated points. We also consider different sizes of the neighbourhood of the central area. This concept can handle several LBP issues:

• LBP has small spatial support, therefore it cannot properly detect large-scale textural structures;

• It loses local textural information, since only the signs of differences of neighbouring pixels are used;

• It is sensitive to noise, because the slightest fluctuation above or below the value of the central pixel is treated as equivalent to a major contrast between the central pixel and its surroundings.

The proposed algorithm is depicted in Figure 1 and described below. Let $G_{N_{i}}$ be the set of pixel intensities of the $i$-th neighbouring point-set, with central pixel $C_{N_{i}}$ (for neighbourhoods of even size, the closest left/top pixel is used as the central one). Let $G_{C}$ be the set of pixel intensities of the central point-set, with central pixel $C_{C}$, and let $r$ be the distance between the central pixels $C_{N_{i}}$ and $C_{C}$. We calculate the representative value of each set as the average of the pixel intensities belonging to it: $g^{\prime}_{i}=\mathrm{mean}(G_{N_{i}})$, $i\in\{1,\dots,8\}$, and $g^{\prime}_{C}=\mathrm{mean}(G_{C})$.

The feature vector is then created in a similar way as in the case of the original LBP operator using $g^{\prime}_{i}$ and $g^{\prime}_{C}$ values instead of $g_{i}$ and $g_{c}$ , respectively (see Section III-A).

Note that point-set topologies of several shapes and sizes could be considered to capture different texture information; in this paper, however, we use only square shapes of sizes $2\times 2$ (i.e. $4$ points) and $3\times 3$ (i.e. $9$ points).

The proposed operator is further denoted as E-LBP$_{x,y,r}$, where $x\in\{4,9\}$ represents the neighbouring pixel-set topology, $y\in\{4,9\}$ is the central pixel-set topology and $r$ is the distance between the central pixels $C_{N_{i}}$ and $C_{C}$, hereafter called the E-LBP range.
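A minimal sketch of the E-LBP$_{x,y,r}$ computation for one pixel is given below. It is not the authors' implementation. It assumes that the eight neighbouring point-sets are centred at row and column offsets of $\pm r$ or $0$ around $C_{C}$ (the exact layout is defined by Figure 1, which is not reproduced here) and that the bits are read clockwise from the top-left point-set. The names `block_mean`, `elbp_code` and `DIRECTIONS` are illustrative.

```python
from math import isqrt

# Assumed layout of the 8 neighbouring point-sets: centres at +/- r or 0 in
# each direction, visited clockwise from the top-left one.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def block_mean(img, row, col, n_points):
    """Mean intensity of a square point-set with central pixel (row, col).

    n_points is 4 (2x2) or 9 (3x3). For the even-sized 2x2 set, the central
    pixel is its top-left pixel, following the text.
    """
    side = isqrt(n_points)
    top = row - (side - 1) // 2
    left = col - (side - 1) // 2
    values = [img[r][c]
              for r in range(top, top + side)
              for c in range(left, left + side)]
    return sum(values) / len(values)


def elbp_code(img, row, col, x=4, y=9, r=5):
    """E-LBP code at (row, col); the pixel must lie at least r + 1 from every border."""
    g_c = block_mean(img, row, col, y)  # representative value of the central set
    code = 0
    for dr, dc in DIRECTIONS:
        g_i = block_mean(img, row + dr * r, col + dc * r, x)
        code = (code << 1) | (1 if g_i >= g_c else 0)  # same rule as for LBP
    return code


# Horizontal intensity ramp: point-sets to the left of the centre are darker.
ramp = [[c for c in range(13)] for _ in range(13)]
print(elbp_code(ramp, 6, 6))  # 0b01111100 = 124
```

The defaults `x=4, y=9, r=5` correspond to the E-LBP$_{4,9,5}$ configuration reported in Table I.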

The source codes of this algorithm are freely available for research purposes at http://home.zcu.cz/~pkral/sw/.

III-C Face Modelling and Recognition

We compute E-LBP values at all points of the face image. The image is then divided into a set of square cells lying on a regular grid. A feature vector is computed for each cell as a histogram of the E-LBP values, so every cell is represented by one feature vector of size 256. Like many other LBP-based face recognition methods, we concatenate the cell histograms into one feature vector to create the face model. For recognition, we use a 1-NN classifier with the histogram intersection distance.
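The face model and the recognition step can be sketched as follows. This is a hedged illustration rather than the authors' code. It assumes that the input is a 2-D map of 8-bit codes, that incomplete cells at the image border are dropped, that histograms are unnormalised counts, and that the 1-NN rule picks the gallery model with the largest histogram intersection. The names `cell_histograms`, `intersection` and `recognise` are hypothetical.

```python
def cell_histograms(codes, cell):
    """Concatenate 256-bin histograms of `codes` over a regular grid of cells.

    `codes` is a list of rows of values in 0..255. Cells that would extend
    past the image border are dropped (an assumption).
    """
    height, width = len(codes), len(codes[0])
    features = []
    for top in range(0, height - cell + 1, cell):
        for left in range(0, width - cell + 1, cell):
            hist = [0] * 256
            for r in range(top, top + cell):
                for c in range(left, left + cell):
                    hist[codes[r][c]] += 1
            features.extend(hist)
    return features


def intersection(a, b):
    """Histogram intersection: a larger value means more similar histograms."""
    return sum(min(p, q) for p, q in zip(a, b))


def recognise(probe, gallery):
    """1-NN: return the label of the gallery model closest to the probe."""
    return max(gallery, key=lambda label: intersection(probe, gallery[label]))


codes_a = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
codes_b = [[7] * 4 for _ in range(4)]
gallery = {"a": cell_histograms(codes_a, 2), "b": cell_histograms(codes_b, 2)}

probe_codes = [[0, 0, 1, 1], [0, 5, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
print(recognise(cell_histograms(probe_codes, 2), gallery))  # a
```

Because the histograms are concatenated per cell, two faces are only similar when matching codes occur in the same region of the image, which preserves coarse spatial information.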

IV Evaluation

IV-A Experimental Set-up and Corpora

We used the OpenCV toolkit (http://opencv.org/) to implement our models for the following experiments. The face databases used to evaluate our approach are briefly described next.

IV-A1 UFI Dataset

The Unconstrained Facial Images (UFI) dataset [2] contains face images of 605 persons extracted from real photographs and is intended mainly for face recognition in real conditions. In the following experiments, we use the Cropped images partition. Figure 2 (left) shows two images of one individual from this partition with the recognition results of our method.

IV-A2 FERET Dataset

The FERET dataset [3] contains 14,051 images of 1,199 individuals. We use the fa set for training and the fb set for testing the proposed method, which amounts to 1,195 different individuals to recognize. Note that only one image per person is available in each set; we therefore address the one-training-sample problem. For the following experiments, the faces are cropped according to the eye positions and resized to $130\times 150$ pixels. Figure 2 (right) shows two example images of one person from the FERET database with the recognition results obtained by the proposed approach.

IV-B Optimal Cell Size of the Proposed Approach

The cell size (see Section III-C) is an important parameter of the whole approach and must be set correctly to obtain good recognition accuracy. However, it does not influence the E-LBP operator itself and should depend mainly on the image resolution. We thus set this value experimentally using the original LBP operator.

The results of this experiment are depicted in Figure 3 for the UFI and FERET corpora. The recognition accuracy increases with the cell size and remains almost constant from a value of 10 onwards for both corpora. Therefore, we chose this value for the following experiments.

IV-C Optimal Range of the Proposed Operator

The E-LBP range (see Section III-B) is another important parameter of the proposed method. It defines the distance between the individual point-sets used to compute the feature vector values and it significantly influences the recognition results. Therefore, we determine its optimal value for both corpora in the second experiment (see Figure 4 for the UFI dataset and Figure 5 for the FERET dataset). The results can be summarized as follows:

• The optimal E-LBP range is 5 for both corpora;

• The best topology is E-LBP$_{4,9}$ for both corpora;

• The results of E-LBP$_{4,4}$ are very close to those of E-LBP$_{4,9}$;

• The proposed E-LBP operator significantly outperforms the baseline LBP in these two cases on both corpora;

• The behaviour of this operator is consistent across both corpora (the curves follow a similar trend).

We conclude that the proposed E-LBP operator is very robust and we also assume that it should perform well on other corpora using these settings.

IV-D Image Resolution Evaluation

Another important requirement is robustness to different image resolutions: the recognition score should remain high when the image resolution changes. Therefore, we report in Figure 6 the dependence of the recognition accuracy on the image resolution. The image resolution varies from $96\times 96$ to $256\times 256$ and we use both corpora for this experiment.

Figure 6 shows that the proposed E-LBP approach is robust to the image resolution on both corpora. Its recognition accuracy is higher than that of the baseline LBP$_{8,2}$ operator, except in the $96\times 96$ case. We can thus conclude that the proposed operator is not suitable for images of very low resolution.

IV-E Final Results

Table I compares the performance of the proposed method against several other state-of-the-art algorithms. It demonstrates that the proposed approach is particularly effective in real conditions (i.e. on the UFI dataset), where it outperforms the standard LBP by about 10 percentage points (65.28% vs. 55.04%) and the previous best method, FS-LBP, by about 2 percentage points (65.28% vs. 63.31%). The method also achieves a competitive recognition rate on the FERET dataset (the one-training-sample problem).

V Conclusions

This paper introduced a novel face recognition approach based on LBP. We proposed an original image descriptor which considers more pixels and different neighbourhoods to compute the feature vector. We evaluated this method on the standard UFI and FERET face datasets. The source codes are freely available for research purposes at url_hidden_for_review.

We showed experimentally that our approach outperforms a number of other state-of-the-art methods (LBP$_{8,2}$ included) and that its capabilities are particularly evident in real conditions, where images can be noisy, vary in pose and be illuminated differently. This was demonstrated on the UFI dataset, where we obtained a recognition accuracy of 65.28%, an increase of about 2 percentage points over the next best method. We also demonstrated that the proposed approach is robust to the image resolution.

Bibliography

[1] T. Ahonen, A. Hadid, and M. Pietikäinen, “Face recognition with local binary patterns,” in Computer Vision - ECCV 2004. Springer, 2004, pp. 469–481.
[2] L. Lenc and P. Král, “Unconstrained Facial Images: Database for face recognition under real-world conditions,” in 14th Mexican International Conference on Artificial Intelligence (MICAI 2015). Cuernavaca, Mexico: Springer, 25-31 October 2015.
[3] P. J. Phillips, H. Wechsler, J. Huang, and P. Rauss, “The FERET database and evaluation procedure for face recognition algorithms,” Image and Vision Computing, vol. 16, no. 5, pp. 295–306, 1998.
[4] Z. Lei, M. Pietikäinen, and S. Z. Li, “Learning discriminant face descriptor,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 2, pp. 289–302, 2014.
[5] T. Ojala, M. Pietikäinen, and D. Harwood, “A comparative study of texture measures with classification based on featured distributions,” Pattern Recognition, vol. 29, no. 1, pp. 51–59, 1996.
[6] T. Ojala, M. Pietikäinen, and T. Mäenpää, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971–987, 2002.
[7] W. Li, P. Fu, and L. Zhou, “Face recognition method based on dynamic threshold local binary pattern,” in Proceedings of the 4th International Conference on Internet Multimedia Computing and Service. ACM, 2012, pp. 20–24.
[8] X. Tan and B. Triggs, “Enhanced local texture feature sets for face recognition under difficult lighting conditions,” IEEE Transactions on Image Processing, vol. 19, no. 6, pp. 1635–1650, 2010.