TL;DR
This paper introduces a novel nasal region-based 3D face recognition method that uses keypoint detection, surface normal features, and genetic algorithms to achieve high accuracy across multiple datasets, outperforming many existing methods.
Contribution
The paper presents a new five-step nasal region-based recognition algorithm that is robust to expressions and does not require complex alignment or training.
Findings
Achieves high recognition accuracy on FRGC, Bosphorus, and BU-3DFE datasets.
Outperforms many state-of-the-art 3D face recognition algorithms.
Does not rely on sophisticated alignment or denoising steps.

| Dataset | Nasal root (L1) | Left eye corner (L2) | Left alar groove (L3) | Subnasale (L5) | Right alar groove (L6) | Right eye corner (L7) |
|---|---|---|---|---|---|---|
| Bosphorus | 1.06 ± 0.58 | 1.76 ± 1.03 | 1.06 ± 0.62 | 1.11 ± 0.38 | 1.19 ± 0.60 | 2.12 ± 1.14 |
| FRGC | 2.04 ± 1.09 | 2.95 ± 1.61 | 1.29 ± 0.82 | 1.86 ± 0.85 | 1.22 ± 0.62 | 2.91 ± 1.53 |

| Algorithm | Threshold (mm) | L3 | L6 | L2 | L7 | Nose tip L4 |
|---|---|---|---|---|---|---|
| Proposed method | 10 | 99.55% | 99.35% | 96.69% | 94.59% | 97.52% |
| Proposed method | 12 | 99.62% | 99.62% | 97.73% | 96.56% | 99.04% |
| Proposed method | 15 | 99.66% | 99.66% | 99.04% | 98.38% | 99.66% |
| Proposed method | 20 | 99.69% | 99.66% | 99.59% | 99.62% | 99.79% |
| Creusot et al. [37] | 10 | 97.96% | 98.43% | 98.82% | 98.50% | 95.47% |
| Creusot et al. [37] | 12 | 99.18% | 99.71% | 99.65% | 99.43% | 98.15% |
| Creusot et al. [37] | 15 | 99.82% | 99.86% | 99.93% | 99.75% | 98.97% |
| Creusot et al. [37] | 20 | 99.90% | 99.90% | 99.93% | 99.86% | 99.33% |

Columns: number of gallery samples per subject (No. gallery samples / No. probe samples).

| Feature descriptors | 1 (482/4330) | 2 (880/3848) | 3 (1206/3408) | 4 (1432/3006) | 5 (1610/2648) | 6 (1752/2326) | 7 (1757/2034) |
|---|---|---|---|---|---|---|---|
| Spherical patches | 96.2% | 98.9% | 99.4% | 99.6% | 99.6% | 99.7% | 99.75% |
| Curves | 91.6% | 96.8% | 98.1% | 98.8% | 98.9% | 99.3% | 99.4% |

| Algorithm | Modality | Rank-one FRGC v2.0 | EER ROC III | 0.1% FAR ROC III | Neutral vs. Neutral | Neutral vs. Non-neutral |
|---|---|---|---|---|---|---|
| Spherical patches | 3D Nose | 97.9% | 2.4% | 93.5% | 98.45% (R1RR) | 98.5% (R1RR) |
| Curves | 3D Nose | 94.1% | 4.9% | 80.0% | 95.8% (R1RR) | 97.5% (R1RR) |
| Smeets et al. [34] | 3D Face | 89.6% | 3.8% | 77.2% | - | - |
| Osaimi et al. [20] | 3D Face | 96.5% | - | 94.05% | 98.35% (0.1% FAR) | 97.8% (0.1% FAR) |
| Spreeuwers et al. [21] | 3D Face | 99.0% | - | 94.6% | - | - |
| Spreeuwers et al. [21] | 3D Nose | 94.5% | - | 83.7% | - | - |
| Drira et al. (2013) [24] | 3D Face | 97.0% | - | 97.1% | 99.2% (R1RR) | 96.8% (R1RR) |
| Alyüz et al. (2010) [6] | 3D Face | 97.5% | 1.91% | 85.6% | 98.39% (R1RR) | 96.40% (R1RR) |
| Alyüz et al. (2010) [6] | 3D Nose | 91.81% | - | - | - | - |
| Wang et al. (2010) [23] | 3D Face | 98.39% | - | 98.04% | 99.2% (0.1% FAR) | 97.7% (0.1% FAR) |
| Wang et al. (2008) [5] | 3D Nose | 95% (44 mm), 78% (24 mm) | - | - | - | - |
| Drira et al. (2009) [7] | 3D Face/Nose (125 gallery, 125 probe) | 88% (Face), 77.5% (Nose) | - | - | - | - |
| Chang et al. [1] | 3D Nose | - | Neutral 12%, Non-neutral 23% | - | 97.1% (R1RR) | 86.1% (R1RR) |
| Emambakhsh et al. [4] | 3D Nose | 89.61% | Neutral 8%, Non-neutral 18% | - | 90.87% (R1RR) | 81.61% (R1RR) |
| Li et al. (2014) [15] | 3D Face | 96.3% | - | - | 98.0% (R1RR) | 94.2% (R1RR) |
| Queirolo et al. [22] | 3D Face | 99.6% | - | 96.6% | 99.5% (R1RR) | 94.8% (R1RR) |
| Berretti et al. (2013) [17] | 3D Face | 95.6% | - | - | 97.3% (R1RR) | 92.8% (R1RR) |
| Berretti et al. (2010) [16] | 3D Face | 94.15% | - | - | 97.3% (R1RR) | 91.0% (R1RR) |
| Mohammadzade et al. (2013) [14] | 3D Face | - | - | 99.2% | - | - |
| Mian et al. (2008) [33] | 3D Face | 93.5% | - | Neutral 99.9%, Non-neutral 92.7% | 99% | 86.7% |
| Mian et al. (2007) [9] | 2D+3D Face | 95.91% | - | 99.3% | 99.2% | 95.37% |
| Mian et al. (2007) [9] | 2D+3D Nose | 92.2% | - | 92.5% | 94.9% | 80.0% |

| Algorithm | Modality and size | R1RR |
|---|---|---|
| Spherical patches | 3D Nose (105/2797) | 95.35% |
| Curves | 3D Nose (105/2797) | 86.1% |
| Li et al. (2014) [15] | 3D Face (105/2797) | 95.4% |
| Dibeklioğlu [8] | 3D Nose (47/1527) | 89.2% |
| Dibeklioğlu [8] | 3D Nose (47/423) [rotated] | 62.6% |
| Li et al. (2011) [35] | 3D Face (105/4561) | 94.1% |
| Alyüz et al. (2008) [39] | 3D Face (34/441) | 95.9% |
| Alyüz et al. (2008) [39] | 3D Face (47/1508) | 95.3% |

Columns: facial expression (No. of probe samples).

| Algorithm | Happy (106) | Surprise (71) | Fear (70) | Sadness (66) | Anger (71) | Disgust (69) | Neutral (194) |
|---|---|---|---|---|---|---|---|
| Spherical patches | 98.08% | 100% | 98.55% | 96.92% | 94.12% | 88.24% | 98.96% |
| Curves | 85.85% | 92.96% | 87.14% | 92.31% | 84.06% | 69.12% | 96.88% |
| Li et al. [35] | 95.28% | 98.59% | 92.86% | 95.45% | 88.73% | 76.81% | 100% |

Columns: facial expression (No. of probe samples).

| Algorithm | Happy (400) | Surprise (400) | Fear (400) | Sadness (400) | Anger (400) | Disgust (400) |
|---|---|---|---|---|---|---|
| Spherical patches | 88.5% | 91.0% | 89.8% | 92.3% | 90.1% | 81.8% |
| Curves | 81.8% | 87.8% | 85.3% | 87.6% | 83.2% | 69.8% |
| Hajati et al. [36] | 86.0% | 84.0% | 82.0% | 85.0% | 93.0% | 79.0% |

Nasal Patches and Curves for Expression-robust 3D Face Recognition
Mehryar Emambakhsh and Adrian Evans
M. Emambakhsh was with the Department of Electronic and Electrical Engineering, University of Bath, Bath, UK. E-mail: [email protected]
A. Evans is with the Department of Electronic and Electrical Engineering, University of Bath, Bath, UK. E-mail: [email protected]
Abstract
The potential of the nasal region for expression robust 3D face recognition is thoroughly investigated by a novel five-step algorithm. First, the nose tip location is coarsely detected and the face is segmented, aligned and the nasal region cropped. Then, a very accurate and consistent nasal landmarking algorithm detects seven keypoints on the nasal region. In the third step, a feature extraction algorithm based on the surface normals of Gabor-wavelet filtered depth maps is utilised and, then, a set of spherical patches and curves are localised over the nasal region to provide the feature descriptors. The last step applies a genetic algorithm-based feature selector to detect the most stable patches and curves over different facial expressions. The algorithm provides the highest reported nasal region-based recognition ranks on the FRGC, Bosphorus and BU-3DFE datasets. The results are comparable with, and in many cases better than, many state-of-the-art 3D face recognition algorithms, which use the whole facial domain. The proposed method does not rely on sophisticated alignment or denoising steps, is very robust when only one sample per subject is used in the gallery, and does not require a training step for the landmarking algorithm.
https://github.com/mehryaragha/NoseBiometrics
Index Terms:
Face recognition, Facial landmarking, Nose region, Feature selection, Gabor wavelets, Surface normals
I Introduction
While much previous research on expression invariant 3D face recognition has focused on modelling expressions and detecting expression insensitive facial parts, there have been relatively few studies evaluating the potential of the nasal region for addressing this issue. Despite this, the nose has a number of salient features that make it suitable for expression robust recognition. It can be easily detected, due to its discriminant curvature and convexity [1], is difficult to hide without attracting suspicion [2, 3], is relatively stable over various facial expressions ([1, 4, 5, 6, 7, 8, 9]) and is rarely affected by unintentional occlusions caused by hair and scarves. Although it has been reported that the 2D image of the nose has too few discriminant features to be used as a reliable region for human identification [10], its 3D surface has much undiscovered potential. This paper further investigates the 3D nasal region for human identity authentication and verification purposes and presents a novel algorithm that provides very high discriminant strength, comparable with recent 3D face recognition algorithms, which use the whole facial domain.
The proposed approach is based on a very consistent and accurate landmarking algorithm, which overcomes the issue of robust segmentation of the nasal region. The algorithm first finds an approximate location of the nose tip and then finely tunes its location, while accurately determining the position of the nasal root and detecting the symmetry plane of the face. Next, the locations of three sets of landmarks are found: subnasale, eye corners and nasal alar groove. These landmarks are utilised on feature maps created by applying multi-resolution Gabor wavelets to the surface normals of the depth map. Two types of feature descriptors are used: spherical patches and nasal curves. Feature selection is then performed using a heuristic genetic algorithm (GA) and, finally, the expression-robust feature descriptors are applied to the well-known and widely used 3D Face Recognition Grand Challenge (FRGC) [11], Bosphorus [12] and Binghamton University 3D Facial Expression (BU-3DFE) [13] datasets.
Results show the algorithm’s high potential to recognise nasal regions, and hence faces, over different expressions, with very few gallery samples per subject. The highest rank-one recognition rates (R1RR) achieved are: 1) an R1RR of 97.9% and equal error rate (EER) of 2.4% for FRGC v2.0 and receiver operating characteristic (ROC) III experiments, respectively; 2) R1RRs of 98.45% and 98.5% for FRGC’s neutral vs. neutral and neutral vs. non-neutral samples, respectively; 3) an R1RR of 96.2% when one gallery sample per subject is used for the FRGC dataset (482 gallery samples (subjects) vs. 4330 probe samples); 4) an R1RR of 95.35% for the Bosphorus dataset when 2797 scans of 105 subjects are used as probes and the set of 105 neutral scans (one per subject) is used as the gallery.
The remainder of the paper is organised as follows. After the literature review provided in section II, the alignment and nasal region cropping steps, followed by the nasal region landmarking, are detailed in section III. The feature extraction algorithm is described in section IV and section V explains the feature descriptors used. The feature selection algorithm is detailed in section VI and experimental results, including a thorough comparison with previous work, are provided in section VII. Finally, conclusions are given in section VIII.
I-A Scientific contribution and comparison with previous work
The major contribution of this paper is a novel surface normal-based recognition algorithm that provides a thorough evaluation of the recognition potential of the 3D nasal region. The results achieved are not only better than previous 3D nose recognition algorithms but also higher than many recognition algorithms that employ the whole face. The algorithm employs a novel, training-free, highly consistent and accurate landmarking algorithm for the nasal region and a robust feature space, based on the response of Gabor wavelets to surface normal vectors, is also introduced. To localise the expression robust regions on the nose a heuristic GA feature selection is applied to two different geometrical feature descriptors. Because of the smoothing effects of the Gabor wavelets, there is no need for sophisticated denoising algorithms. Indeed, only simple median filtering is required for the surface normals, even with noisy datasets such as the FRGC Spring 2003 folder. An additional advantage of the proposed approach is that a fast Principal Component Analysis (PCA)-based self-dependent method can be employed for facial pose correction. This eliminates the need for sophisticated pose correction algorithms or reference faces for fine tuning the alignment.
The proposed approach significantly extends our previous work [4], in which the nasal landmarking and recognition were performed on the depth map. This paper increases the number of landmarks and their detection accuracy and presents new feature extraction and selection algorithms. The work is inspired by recent algorithms that utilise facial normal vectors in 3D [14] and regional normal vectors [15]. To compare the new algorithms with previous approaches that used similar methodologies, normals computed over the nasal surface are used for both the identification and the verification scenarios. By using multi-resolution Gabor wavelets, the ability of the algorithm to handle noisier samples is enhanced, providing a higher R1RR than the approach of Li et al. [15], which excluded the noisy FRGC Spring 2003 samples. This work also extends the application of facial curves, introduced as feature descriptors by Berretti et al. ([16] and [17]), to nasal spherical patches, producing an increase in R1RR and showing a higher class separability for the spherical patches than for curves for 3D face recognition.
II Recent literature review
Robustness against the deformations caused by facial expressions has been a popular research topic in 3D face recognition. The face is a non-rigid object and therefore 3D matching techniques for rigid objects, such as the iterative closest point (ICP) algorithm [18], can become trapped in local minima and fail to provide accurate matching scores.
An empirical approach to deal with the variations caused by expressions is to capture a range of facial expressions for each subject and store them in the gallery [19]. Then, the facial biometric features of each test subject can be compared with all the stored expressions and a decision made on the identity of the subject. This method has numerous disadvantages: capturing a range of facial expressions for each subject is not always straightforward and requires a high storage capacity per subject. In addition, facial expressions will not necessarily remain constant and may differ between the test and gallery captures [19].
One approach to overcome this problem is to use computer graphics algorithms to artificially create different expressions for each facial capture. In [20], expressions are learned using PCA eigenvectors and then used to re-generate the expressions on the probe samples. Although this approach does not require multiple samples per subject in the gallery, it is still vulnerable to the number of training samples used to model the facial expressions. Also, a universal definition of facial expression for all subjects still remains to be found [19] and the need to classify the expression types prior to face recognition increases the computational complexity.
Another approach is to employ region-based methods, in which the least variant parts of the face over different expressions are detected using facial segmentation [1, 21, 6, 22] or extracted using their expression invariant capabilities [14, 23, 5, 15]. Spreeuwers proposes a multiple regional approach based on a PCA-Linear Discriminant Analysis (LDA) feature extraction method [21]. In regional recognition, scores from a combination of different masks on the nose, cheek, forehead, chin and mouth are fused to finalise the decision making.
Alyüz et al. use a regional registration algorithm in conjunction with LDA classifiers, giving an expression robust 3D face recognition approach [6]. They also demonstrate that the nasal region has a high discriminatory power. A focus on integrating multiple regions is provided by Queirolo et al. [22] in which four regions (the upper face image, the whole face and two nasal regions) are segmented and stored for the gallery sessions before matching is performed using a novel matching criterion, called the surface interpenetration measure, and simulated annealing.
Using facial curves is another popular approach to 3D face recognition that can be categorised as a subset of regional algorithms. Drira et al. use the intersections of planes with the facial surface to define a set of radial curves which pass through the nose tip, and then perform a quality assessment in order to handle missing data and occlusions [24]. Another curve-based algorithm is proposed by Berretti et al. [17]. First, keypoints are detected on the facial surface and then the least variant curves on the face are selected using a statistical model and matched with those in the gallery. As an extension to curves, isogeodesic stripes centralised on the nose tip are used in an expression invariant 3D face recognition method that employs a novel descriptor, termed the 3D weighted walkthroughs, to quantify the differences between corresponding stripes [16]. In another curve-based approach, Drira et al. find geodesic curves on the nasal region for a subset of the FRGC dataset [7].
To overcome the sensitivity of holistic face recognition algorithms to expression variations, Mian et al. propose a landmark-based method, in conjunction with a localised feature descriptor that incorporates the 2D texture and 3D point clouds. In an alternative approach, Wang et al. apply shape difference boosting to the Bosphorus dataset to learn the expressions and identify those facial regions which remain constant over different expressions [23]. Instead of using depth or the point coordinates for 3D registration, Mohammadzade et al. use the surface normals of the points in conjunction with a Fisher’s discriminant paradigm [14]. This approach selects the normals which maximise the concentration of within-class scatter while simultaneously maximising the between-class distribution. Recently, Li et al. proposed local normals histograms, captured from multiple rectangular regions on the face, to set up an expression-robust feature space and use a novel sparse classifier to perform the matching [15].
Despite the robustness of these algorithms against facial expressions, they often rely on accurate and consistent facial segmentation, which is not a straightforward task in 3D. To address this issue, some researchers have focused on the nasal region, which shows high consistency over different expressions. For example, in one of the first investigations on 3D nose recognition, Chang et al. initially segment the face into different non-overlapping regions, using the curvature information [1]. Then, three overlapping nasal regions are detected and stored in the gallery. The same regions are segmented in the probe images and matched using the ICP algorithm. Wang et al. propose the use of local shape difference boosting for 3D face recognition and also apply the boosting algorithm to different nasal regions [5]. The regions are cropped using the intersection of spheres, centred on the nose tip, with the face surface. When the sphere radius was increased, the recognition ranks reached a maximum and then plateaued. A combination of the nasal region, forehead and eyes is used for 2D/3D face recognition by Mian et al. [9]. A modified ICP algorithm is used for matching, in conjunction with a pattern rejector based on spherical face representation (SFR) and scale-invariant feature transform (SIFT) [25], producing high recognition ranks on the FRGC dataset, in particular for the neutral probes. Dibeklioğlu et al. used Dijkstra's algorithm to segment the nose and evaluated the performance using a subset of the Bosphorus dataset [8].
III Preprocessing and nasal region landmarking
The algorithm explained in [4] is used to crop the face. Next, median filtering with a mm2 mask size is applied twice on the cropped face. The image is then resampled to a uniform grid with mm/pixel horizontal and vertical resolutions using Delaunay triangulation and aligned using the iterative PCA algorithm [9]. The aligned face is then intersected with three cylinders to crop the nasal region, according to [4]. The depth map of the cropped nasal region is again median filtered with a mm2 mask to further smooth its surface and decrease the spike noise effects. The block diagram in Fig. 1-a shows how the landmarks in Fig. 1-b are detected.
III-A Local minima detector, nose tip re-localisation, nasal root and subnasale detection
First, an initial position of the nasal root () is detected by [4]. Then, the location of the nose tip (), found in section III, is more finely tuned. Various planes, passing through with normals are intersected with the nose surface, where is the angle of the plane with the -axis, and and are the unit vectors along the and axes, respectively. This process results in several curves on the nasal region, shown in Fig. 3-a.
The proposed landmarking algorithm relies on a minima detector, which finds a set of minima on rotated versions of the curves and then maps them to the original curve. The rotation is required because some of the original curves are strictly decreasing functions that do not have an actual minimum. Assuming is a matrix representing points of a curve, instead of directly differentiating to find the minima on the curves, as proposed by Segundo et al. [26], the curves are first rotated in the -axis (roll direction) by an angle around a given point on the curve. This operation is given by,
[TABLE]
where is the rotated version of and the function finds the location of the smallest local minima on and then remaps them to the original curve using the rotation angle . The output is an matrix, containing the locations of the local minima. computes the first order differentiation (first order difference in discrete space) of , which is then given to the signum function to detect its sign changes. This finds the locations of all the local minima in , which are then sorted based on their value in ascending order and the lowest are selected and rotated back to the original curve using . Fig. 2 shows an example of this procedure for (the global extremum).
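The minima detector described above can be sketched in a few lines of NumPy. This is only an illustrative reading of the text, not the authors' implementation: the function name `lowest_minima`, the counter-clockwise sign convention of the rotation and the synthetic test curve are all assumptions. The curve is rotated around a pivot, the sign of the first-order difference is used to find local minima, the minima are sorted by height, and the lowest ones are reported on the original curve.

```python
import numpy as np

def lowest_minima(curve, alpha, pivot, k=1):
    """Return the k lowest local minima of `curve` after rotating it by `alpha`.

    curve : (N, 2) array of (horizontal, vertical) samples ordered along the curve.
    alpha : rotation angle in radians (counter-clockwise; sign convention assumed).
    pivot : (2,) point the curve is rotated around.
    """
    curve = np.asarray(curve, dtype=float)
    pivot = np.asarray(pivot, dtype=float)
    c, s = np.cos(alpha), np.sin(alpha)
    rotation = np.array([[c, -s], [s, c]])
    rotated = (curve - pivot) @ rotation.T + pivot

    # First-order difference of the rotated heights, reduced to its sign.
    slope_sign = np.sign(np.diff(rotated[:, 1]))
    # A local minimum is a sample where the slope turns from negative to positive.
    idx = np.where((slope_sign[:-1] < 0) & (slope_sign[1:] > 0))[0] + 1
    # Sort the minima by rotated height and keep the k lowest.
    idx = idx[np.argsort(rotated[idx, 1])][:k]
    # Indexing the original samples is equivalent to rotating the minima back.
    return curve[idx]

# Hypothetical strictly decreasing curve: it has no minimum until it is rotated.
x = np.linspace(0.0, 10.0, 101)
curve = np.column_stack((x, -x + 0.05 * (x - 5.0) ** 2))
print(lowest_minima(curve, 0.0, (0.0, 0.0)))        # no minima without rotation
print(lowest_minima(curve, np.pi / 4, (0.0, 0.0)))  # one minimum after rotation
```

In this toy case the unrotated curve yields an empty result, while a 45 degree rotation exposes a single minimum at the sample (5, -5), which mirrors why the rotation step is needed for strictly decreasing curves. Flat plateaus (zero differences) are ignored by this sketch.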
The value allocated to should be small enough in order to preserve the single-valued functionality of , i.e. each projection of any point on to the horizontal axis should correspond to only one point on the vertical axis. Based on the type of landmark to be extracted, the value of is chosen using trial and error. The minima detector operator of (1) is applied to each curve in Fig. 3-a as follows,
[TABLE]
in which, represents the curve, which is rotated by around . represents the location of the global minimum for each curve, which itself, constitutes a curve whose global maximum gives , the initial location of the nasal root. Figure 3 shows the set of curves (in blue), their minima (in red) and the maximum of the minima (green).
The nasal root and tip locations ( and ) may be slightly inaccurate due to the depth variations caused by the noise and facial expressions. In order to improve the accuracy of their locations, for the points situated on a 5 × 5 mm² area (shown in Fig. 3-b) around the nasal root and saddle, the following angular deviation is calculated,
[TABLE]
in which, and are the projections of the two pairs of points and (, and ), which are selected from the overall and points on the region of interest (RoI) from the nasal root and tip regions, respectively, see Fig. 3-b. is used to rotate the nose region in the roll direction and around . Then the image is divided into the left and right halves. Assuming the rotated nasal region is translated so that the nose tip is at the origin, for the -axis indices within the strip shown in Fig. 3-c (computed using ), the objective function is calculated by
[TABLE]
in which and are the depth maps of the flipped and cropped left and right sides of and , respectively. The two points and that minimise have the most similar values of and and their projections onto the axis ( and ) correspond to the values of the accurate nose tip and root locations such that,
[TABLE]
This is an example of a "min-max" optimisation, which finds the best worst case for the optimum [27]. A plane passing through and is then intersected with the nose surface, with normal vector , see Fig. 4-b. The locations of the maximum and minimum of the resulting curve are the positions of and . This procedure is illustrated in Fig. 4. and give the final locations of the nose tip and nasal root, respectively.
The points on the same curve, which are located below the nasal tip (shown in Fig. 4-c) are then rotated around by an angle . The location of the lowest minimum of the resulting curve () provides the subnasale after applying (1), i.e. . Finally, is used to update the nose region and correct the pose by applying a roll directional rotation around .
III-B Nose alar groove and eye corners localisation
The location of the nose tip () is moved to the origin and an RoI defined to detect the nasal alar grooves (Fig. 5-a) by,
[TABLE]
where , and are scalar constants determining the length and directivity of the lobes in the RoI. These are chosen to be able to crop the nasal alar region, while avoiding redundant parts (in subsequent experiments, = 30 mm, and ). and are the distance from the nose tip and angular rotation from the horizontal axis passing the nose tip location, respectively. Similarly, the RoI used to detect the eye corners ( and ) is depicted in Fig. 5-c and is found using,
[TABLE]
in which mm, and . The polar coordinate system is characterised by and , which are the distance from the nasal root () and the angular rotation from the horizontal axis passing , respectively.
Planes parallel with the -plane are then intersected with each row of the RoIs. For the and intersections over the nasal alar groove and eye corner RoIs, curves and are found, respectively, and (1) is used to find three minima for the nasal alar groove,
[TABLE]
and the eye corners,
[TABLE]
where and are matrices with three rows, in which each row has the location of the lowest minimum found from the and rows of the RoIs, for the right and left sides of and , respectively. Then, for each row, and are compared and the pairs with the most similar Euclidean distances to the nose tip selected. Using a similar approach, the distance of and to for each row is computed and those pairs with the most similar distances kept. Figures 5-b and -c illustrate these processes.
The points found as candidates for the nasal alar groove and eye corners might contain some outliers. This is because of the imaging noise and deformations on the face due to the facial expressions. To remove the outliers, an iterative approach is used. First, the 3D Euclidean distances between the points on each consecutive row are computed. Then the standard deviation () of the resulting vector is used to reject the points whose is higher than a given threshold (in mm). This process continues until the number of inliers remains unchanged. Compared to the outlier removal method of [4], which uses k-means clustering as a criterion to localise the outliers, this approach is deterministic and, unlike k-means, is not vulnerable to empty clusters. The outlier removal algorithm results in the green points labelled as the inliers in Fig. 5-b and -d. The left and right pairs, which have the closest value of to that of the nose tip are selected as and . Also, the points amongst the inliers in Fig. 5-d, with the smallest depth values, are detected and the pair with the most similar distance to are selected as the eye corners ( and ). The eye corners and nasal alar groove landmarks are the red points in Fig. 5-b and -d, respectively.
IV Feature extraction
The proposed feature space is based on surface normals. For an aligned depth map of the nasal region, represented by its point clouds as the normals are , where ( and represent the Hadamard product operator and matrix of ones, respectively). In order to reduce the sensitivity of the normal vectors to noise and enable the extraction of multi-resolution directional region-based information from the nasal region, instead of calculating the normal vectors directly from the nose surface, they are derived from the Gabor wavelet [28] filtered depth map. The algorithm proposed by Manjunath et al. is used to minimise the wavelets overlap and redundancy in the filtered images [29].
The discrete Fourier transform of the resampled Gabor wavelet for the scale and orientation level ( and ) is computed and its zero frequency component is set to zero. The Hadamard product of the resulting and the Fourier transform of is then calculated and the absolute value of its inverse Fourier transform is computed for each scale and orientation, i.e. . The maximum of all the corresponding elements of the filtered images is computed over all orientations for each scale : . In other words, , where computes the maximum of the corresponding elements along orientations . Finally, the normal vectors of the resulting per-scale maximal map are calculated using the aligned nose coordinate maps and ,
[TABLE]
where is a block matrix containing the normal vectors for the scale level.
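The filtering pipeline of this section can be illustrated with a short NumPy sketch. It is a simplified stand-in under stated assumptions, not the authors' code: `gabor_kernel` is a plain hypothetical Gabor function rather than the Manjunath et al. filter bank design, the wavelengths, kernel size and number of orientations are arbitrary, and the normals are approximated with finite differences on a uniform grid instead of the paper's own normal equation. It shows the frequency-domain product with the zero-frequency component removed, the element-wise maximum over orientations per scale, and the normals of the resulting map.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel (hypothetical parameterisation, for illustration only)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(1j * 2.0 * np.pi * xr / wavelength)

def per_scale_maximal_maps(depth, wavelengths, n_orient, size=15):
    """For each scale, filter in the Fourier domain and take the max over orientations."""
    depth_f = np.fft.fft2(depth)
    maps = []
    for lam in wavelengths:
        responses = []
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            kernel = gabor_kernel(size, lam, theta, sigma=0.5 * lam)
            kern_f = np.fft.fft2(kernel, s=depth.shape)  # zero-padded to image size
            kern_f[0, 0] = 0.0                            # remove zero-frequency term
            responses.append(np.abs(np.fft.ifft2(kern_f * depth_f)))
        maps.append(np.max(responses, axis=0))           # element-wise max over orientations
    return np.stack(maps)

def surface_normals(height_map, spacing=1.0):
    """Unit normals of a height map sampled on a uniform grid (finite differences)."""
    dzdy, dzdx = np.gradient(height_map, spacing)
    n = np.dstack((-dzdx, -dzdy, np.ones_like(height_map)))
    return n / np.linalg.norm(n, axis=2, keepdims=True)

rng = np.random.default_rng(0)
depth = rng.random((32, 32))                 # stand-in for an aligned nasal depth map
maps = per_scale_maximal_maps(depth, wavelengths=[4.0, 8.0], n_orient=4)
normals = [surface_normals(m) for m in maps]  # one (H, W, 3) normal map per scale
print(maps.shape, normals[0].shape)
```

Because the zero-frequency component of each filter is set to zero, adding a constant offset to the depth map leaves the filtered maps essentially unchanged, which is one reason this kind of feature is insensitive to the absolute depth of the nose.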
V Localised feature descriptors using spherical patches and curves
The feature descriptors are used to define a part of the nasal region, containing a set of normal vectors from the Gabor wavelets filters. Histograms of the resulting feature vectors for the , and maps are concatenated to create the feature space. This procedure is illustrated in Fig. 6 for and . The feature descriptors are used to reduce the dimensionality of the feature space, decrease the redundancy and enable the use of probabilistic feature selection to lower the sensitivity to facial expressions while maintaining the most discriminative parts.
The basic landmarks previously identified, see Fig. 1-b, are used to create the new keypoints shown in Fig. 7-a. These new landmarks are easily obtained by dividing the horizontal and vertical lines that connect the landmarks. A sphere centralised on each point is then intersected with the nasal surface and its inner parts are cropped. Then, the histogram of the normals of Gabor-wavelet filtered depth images are computed, based on the procedure explained in section IV. The intersection process is depicted in Fig. 7-b. A set of spheres of identical radii (in this case 7 mm) are intersected with the nose surface. These spherical feature descriptors provide the capability to evaluate the potential of overlapping spherical regions on the nasal surface, when used as feature vectors.
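A spherical patch descriptor of this kind can be sketched as follows. The function name `patch_descriptor`, the number of histogram bins and the fixed histogram range of [-1, 1] are assumptions made for illustration; only the 7 mm radius comes from the text. Points inside the sphere are kept, and normalised histograms of the three normal components are concatenated.

```python
import numpy as np

def patch_descriptor(points, normals, centre, radius=7.0, bins=16):
    """Concatenated normalised histograms of normal components inside a sphere.

    points  : (N, 3) surface points in mm.
    normals : (N, 3) unit normals associated with the points.
    centre  : (3,) keypoint the sphere is centred on.
    """
    points = np.asarray(points, dtype=float)
    normals = np.asarray(normals, dtype=float)
    centre = np.asarray(centre, dtype=float)
    inside = np.linalg.norm(points - centre, axis=1) <= radius
    feats = []
    for comp in range(3):  # x, y and z components of the normals
        hist, _ = np.histogram(normals[inside, comp], bins=bins, range=(-1.0, 1.0))
        total = hist.sum()
        feats.append(hist / total if total > 0 else hist.astype(float))
    return np.concatenate(feats)

# Toy surface: a flat plane whose normals all point along +z.
xs = np.arange(-10, 11)
gx, gy = np.meshgrid(xs, xs)
points = np.column_stack((gx.ravel(), gy.ravel(), np.zeros(gx.size)))
normals = np.tile([0.0, 0.0, 1.0], (len(points), 1))
desc = patch_descriptor(points, normals, centre=(0.0, 0.0, 0.0))
print(desc.shape)
```

In the paper the histograms are computed on the normals of the Gabor-filtered maps for every scale, so one such vector would be produced per keypoint and per scale before concatenation.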
Alternatively, using different pairs of landmarks, a set of planes orthogonal to the nasal region can be found. Intersecting these planes with the nose surface results in a set of curves on the nasal region. For example, the normal vector of a plane that passes through two nasal landmarks and is orthogonal to a reference coordinate plane can be defined by the normalised cross product of the vector connecting the two landmarks and the unit vector along the axis normal to that reference plane. When the two landmarks are selected from the set shown in Fig. 7-c, they can be used to create the set of curves shown in Fig. 7-d, which provide the feature descriptors. For each curve, the concatenated histograms of the x, y and z components of the normal vectors from the Gabor wavelet filter outputs are computed, giving the feature vector.
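The plane construction can be illustrated with a short sketch. It assumes that the reference plane is the x-y plane, so that the unit vector is along the z-axis; the distance tolerance used to sample the curve from a point cloud and the function names are also assumptions made here.

```python
import numpy as np

def plane_normal(l_a, l_b, axis=(0.0, 0.0, 1.0)):
    """Unit normal of the plane through l_a and l_b that contains `axis`.

    Undefined when the two landmarks differ only along `axis`.
    """
    l_a = np.asarray(l_a, dtype=float)
    l_b = np.asarray(l_b, dtype=float)
    n = np.cross(l_b - l_a, np.asarray(axis, dtype=float))
    return n / np.linalg.norm(n)

def curve_points(points, l_a, l_b, tol=0.5):
    """Surface points within `tol` mm of the plane, between the two landmarks."""
    l_a = np.asarray(l_a, dtype=float)
    l_b = np.asarray(l_b, dtype=float)
    dist = np.abs((points - l_a) @ plane_normal(l_a, l_b))
    # Position of each point along the landmark-to-landmark line, in the x-y plane.
    d = (l_b - l_a)[:2]
    t = ((points[:, :2] - l_a[:2]) @ d) / (d @ d)
    return points[(dist <= tol) & (t >= 0.0) & (t <= 1.0)]
```

The normals at the returned points can then be histogrammed per component in the same way as for the spherical patches.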
VI Feature selection using GA
The feature selection step selects those subsets of the feature vectors extracted from the curves and spherical patches that are more robust against facial expressions. For a given set of feature descriptors and Gabor wavelet scales, the feature vector is computed by,
[TABLE]
where the three terms for each scale are the features for the x, y and z surface normal components, respectively. Each feature set of the normal maps is represented by the concatenation of the histograms, each of fixed length, obtained from the feature descriptors, giving
[TABLE]
In (12), the terms are the normalised histograms computed using each feature descriptor at each scale on the corresponding normal map, which is computed using (10).
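The structure of the feature vector can be summarised in notation assumed here purely for illustration (the symbols and the ordering are not recovered from the source): let $S$ be the number of scales, $P$ the number of feature descriptors, $h$ the histogram length, and $\mathbf{h}^{c}_{p,s}$ the normalised histogram of normal component $c$ for descriptor $p$ at scale $s$.

```latex
\mathbf{F} = \left[\mathbf{F}^{x}_{1}, \mathbf{F}^{y}_{1}, \mathbf{F}^{z}_{1}, \ldots,
                   \mathbf{F}^{x}_{S}, \mathbf{F}^{y}_{S}, \mathbf{F}^{z}_{S}\right],
\qquad
\mathbf{F}^{c}_{s} = \left[\mathbf{h}^{c}_{1,s}, \ldots, \mathbf{h}^{c}_{P,s}\right],
\quad c \in \{x, y, z\}
```

Under these assumptions each $\mathbf{F}^{c}_{s}$ has length $Ph$ and the full feature vector has length $3SPh$.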
Here, the aim is to find a binary vector that acts as a switch, selecting the feature descriptors that are most robust to facial expressions and removing those that are vulnerable to them. Using this binary vector, a second vector, whose length equals that of the x, y or z feature set, can be computed for each scale by,
[TABLE]
The elements of this vector are set to zero or one, depending on the value of the corresponding element of the binary vector. Finally, it is concatenated over all scales to create a binary vector whose length is equal to the feature space dimensionality,
[TABLE]
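The expansion from one switch per descriptor to one switch per feature can be sketched as follows. The feature ordering (scale, then normal component, then descriptor, then histogram bin) and the function name are assumptions; a different ordering would need a different repeat and tile pattern.

```python
import numpy as np

def expand_mask(nucleus, hist_len, n_scales, n_components=3):
    """Expand a per-descriptor binary vector to the full feature dimensionality.

    Assumed feature layout: scale -> normal component -> descriptor -> bin.
    """
    nucleus = np.asarray(nucleus, dtype=bool)
    per_set = np.repeat(nucleus, hist_len)            # one flag per histogram bin
    return np.tile(per_set, n_components * n_scales)  # same pattern for every set
```

Because the same per-descriptor pattern is tiled over every scale, switching one element of the short vector keeps or removes that descriptor at all scales at once.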
The value of each element of the concatenated vector can be altered using the nucleus binary vector. A curve or patch is selected or omitted based on the value of the corresponding nucleus element: if the element is one, the curve or patch is selected; otherwise it is omitted. By using the neutral samples as the gallery and the non-neutral samples as the test set, and varying the nucleus vector, the most expression-robust curves and patches can be selected. As shown in (13) and (14), a curve or patch is selected or removed simultaneously for all scales. The resulting low dimensional samples are matched with those in the gallery using the Mahalanobis cosine distance,
[TABLE]
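For illustration, a common form of the Mahalanobis cosine distance is the negative cosine of the angle between two vectors after whitening by the covariance matrix. The sketch below uses that form; the sign convention, the use of a pseudo-inverse and the row-per-sample layout are assumptions and may differ from the definition in (15).

```python
import numpy as np

def mahcos_distance(gallery, probe, cov):
    """Matrix of Mahalanobis cosine distances, shape (n_gallery, n_probe).

    gallery : (n_gallery, d) projected gallery samples, one per row
    probe   : (n_probe, d) projected probe samples, one per row
    cov     : (d, d) covariance matrix
    Smaller values indicate a better match.
    """
    inv = np.linalg.pinv(cov)
    numerator = gallery @ inv @ probe.T
    g_norm = np.sqrt(np.einsum("ij,jk,ik->i", gallery, inv, gallery))
    p_norm = np.sqrt(np.einsum("ij,jk,ik->i", probe, inv, probe))
    return -numerator / np.outer(g_norm, p_norm)
```

Taking the index of the smallest entry in each column then gives the gallery sample assigned to each probe.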
The Kernel Fisher’s analysis (KFA) algorithm with a polynomial kernel is applied to the feature space to project the features to a lower dimensional space using a supervised approach. The projected gallery and probe samples form two matrices whose dimensions are set by the number of gallery and probe samples, respectively, and by the dimension of the projected subspace, which takes a fixed value in all subsequent experiments. The covariance matrix in (15) is estimated from the projected samples, and the result is a distance matrix containing the matching errors. To maximise the probability of assigning the test samples to their corresponding classes (subjects) when compared with the gallery samples, the nucleus binary vector can be varied and its optimum found by,
[TABLE]
in which the objective is the average probability that the label corresponding to the smallest matching error, found by (15), is the same as the label of the probe sample. In other words, the objective is the rank-one recognition rate, which is maximised as the nucleus binary vector is varied. The strong capability of GAs in high dimensional binary parameter spaces [22, 30] makes them well suited to this non-convex optimisation problem. The GA used in this work is a modified Non-dominated Sorting Genetic Algorithm-II (NSGA-II) [31] which, in comparison with NSGA [32], is an elitism-based approach, relies on an improved sorting algorithm, has lower computational complexity and does not require sharing parameter assignment [31]. The modified NSGA-II applies elitism not only to the individuals with better fitness but also to those that increase the diversity of the population. The parameter assignments for the GA are explained in Section VII-C.
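The optimisation loop can be illustrated with a deliberately simplified sketch. It is not the modified NSGA-II used in the paper: it is a plain single-objective GA with binary tournament selection, uniform crossover, bit-flip mutation and single-individual elitism. The fitness uses a plain cosine match instead of KFA followed by the Mahalanobis cosine distance. All names and parameter values are hypothetical.

```python
import numpy as np

def rank_one_rate(gallery, g_labels, probe, p_labels, mask):
    """Fraction of probes whose nearest gallery sample (cosine) has the same label."""
    if not mask.any():
        return 0.0
    g = gallery[:, mask]
    p = probe[:, mask]
    g = g / np.maximum(np.linalg.norm(g, axis=1, keepdims=True), 1e-12)
    p = p / np.maximum(np.linalg.norm(p, axis=1, keepdims=True), 1e-12)
    nearest = np.argmax(p @ g.T, axis=1)
    return float(np.mean(g_labels[nearest] == p_labels))

def ga_select(fitness, n_bits, pop_size=20, n_gen=30, p_mut=0.05, seed=0):
    """Search for a binary vector of length n_bits that maximises `fitness`."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, n_bits)) < 0.5
    pop[0] = True  # include the "keep every descriptor" solution
    for _ in range(n_gen):
        scores = np.array([fitness(ind) for ind in pop])
        elite = pop[np.argmax(scores)].copy()
        # Binary tournament selection.
        a, b = rng.integers(0, pop_size, size=(2, pop_size))
        parents = np.where((scores[a] >= scores[b])[:, None], pop[a], pop[b])
        # Uniform crossover with a shuffled copy of the parents.
        mates = parents[rng.permutation(pop_size)]
        children = np.where(rng.random(parents.shape) < 0.5, parents, mates)
        # Bit-flip mutation.
        children ^= rng.random(children.shape) < p_mut
        children[0] = elite  # elitism: the best individual always survives
        pop = children
    scores = np.array([fitness(ind) for ind in pop])
    best = int(np.argmax(scores))
    return pop[best], float(scores[best])
```

In use, `fitness` would wrap `rank_one_rate` with neutral scans as the gallery and non-neutral scans as the probes, expanding each candidate vector to the full feature dimensionality before matching. Because of the elitism step, the returned score can never be lower than that of the all-ones vector placed in the initial population.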
References
- [1] K. Chang, K. Bowyer, and P. Flynn, “Multiple nose region matching for 3D face recognition under varying facial expression,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1695–1700, 2006.
- [2] M. Emambakhsh and A. Evans, “Self-dependent 3D face rotational alignment using the nose region,” in 4th IET International Conference on Imaging for Crime Detection and Prevention (ICDP), pp. 1–6, 2011.
- [3] A. Moorhouse, A. Evans, G. Atkinson, J. Sun, and M. Smith, “The nose on your face may not be so plain: Using the nose as a biometric,” in 3rd IET International Conference on Crime Detection and Prevention (ICDP), pp. 1–6, 2009.
- [4] M. Emambakhsh, A. Evans, and M. Smith, “Using nasal curves matching for expression robust 3D nose recognition,” in 6th IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS), pp. 1–6, 2013.
- [5] Y. Wang, X. Tang, J. Liu, G. Pan, and R. Xiao, “3D face recognition by local shape difference boosting,” in European Conference on Computer Vision (ECCV), vol. 5302, pp. 603–616, 2008.
- [6] N. Alyüz, B. Gökberk, and L. Akarun, “Regional registration for expression resistant 3-D face recognition,” IEEE Transactions on Information Forensics and Security, vol. 5, no. 3, pp. 425–440, 2010.
- [7] H. Drira, B. Amor, M. Daoudi, and A. Srivastava, “Nasal region contribution in 3D face biometrics using shape analysis framework,” in 3rd International Conference on Advances in Biometrics, pp. 357–366, 2009.
- [8] H. Dibeklioğlu, B. Gökberk, and L. Akarun, “Nasal region-based 3D face recognition under pose and expression variations,” in 3rd International Conference on Advances in Biometrics, pp. 309–318, 2009.
