TL;DR
This paper introduces a depth-guided attention mechanism for RGB-D face recognition that leverages depth features to improve focus on person-specific facial regions, enhancing accuracy across challenging conditions.
Contribution
The paper proposes a novel depth-guided attention mechanism that directs neural networks to focus on informative facial regions using depth features, improving face recognition performance.
Findings
Achieved state-of-the-art accuracy on four challenging face datasets.
Improved recognition accuracy by up to 5% over existing methods.
Demonstrated generalization with thermal images replacing depth data.
Abstract
Face representation learning solutions have recently achieved great success for various applications such as verification and identification. However, face recognition approaches that are based purely on RGB images rely solely on intensity information, and therefore are more sensitive to facial variations, notably pose, occlusions, and environmental changes such as illumination and background. A novel depth-guided attention mechanism is proposed for deep multi-modal face recognition using low-cost RGB-D sensors. Our novel attention mechanism directs the deep network "where to look" for visual features in the RGB image by focusing the attention of the network using depth features extracted by a Convolutional Neural Network (CNN). The depth features help the network focus on regions of the face in the RGB image that contain more prominent person-specific information. Our attention…
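The general idea of letting depth features tell the network "where to look" in the RGB features can be illustrated with a small sketch. This is not the paper's architecture, which this summary does not specify; it is a generic spatial-attention example under stated assumptions: both modalities have already been encoded into feature maps of the same shape, the depth channels are collapsed by a simple mean, and a softmax over spatial positions yields the attention map. The function name `depth_guided_attention` and all shapes are hypothetical.

```python
import numpy as np

def depth_guided_attention(rgb_feat, depth_feat):
    """Reweight RGB features with a spatial map derived from depth features.

    rgb_feat, depth_feat: arrays of shape (C, H, W), assumed to come from
    separate CNN encoders (hypothetical; the encoders are not shown here).
    Returns the attended RGB features (C, H, W) and the attention map (H, W).
    """
    # Assumption: one score per spatial location, taken as the channel mean.
    scores = depth_feat.mean(axis=0)                # (H, W)

    # Softmax over all spatial positions, shifted by the max for stability.
    flat = scores.reshape(-1)
    weights = np.exp(flat - flat.max())
    weights /= weights.sum()
    attn = weights.reshape(scores.shape)            # (H, W), sums to 1

    # Broadcast the map over channels so every RGB channel is reweighted alike.
    attended = rgb_feat * attn[None, :, :]
    return attended, attn

# Toy inputs standing in for encoder outputs.
rng = np.random.default_rng(0)
rgb_feat = rng.standard_normal((8, 4, 4))
depth_feat = rng.standard_normal((8, 4, 4))
attended, attn = depth_guided_attention(rgb_feat, depth_feat)
```

Locations where the depth features respond strongly receive larger weights, so the corresponding RGB features dominate whatever pooling or classification layer follows. A learned scoring layer would normally replace the channel mean used here.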
Taxonomy
Methods: Convolution
