Text-Guided Face Recognition using Multi-Granularity Cross-Modal Contrastive Learning
Md Mahedi Hasan, Shoaib Meraj Sami, and Nasser Nasrabadi

TL;DR
This paper introduces a novel text-guided face recognition framework that leverages natural language facial descriptions and multi-granularity cross-modal contrastive learning to improve recognition accuracy, especially in low-quality surveillance images.
Contribution
It proposes a face-caption alignment module with contrastive losses and a face-caption fusion module for enhanced multimodal feature learning, addressing semantic gaps and textual ambiguities.
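To make the contrastive alignment idea concrete, here is a minimal NumPy sketch of a symmetric image-text contrastive loss at a single (global) granularity. It is an illustration under stated assumptions, not the paper's implementation: the function names, the temperature value of 0.07, and the use of one pooled embedding per face and per caption are all hypothetical, and the multi-granularity losses and the fusion module described above are not reproduced.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def symmetric_contrastive_loss(face_emb, caption_emb, temperature=0.07):
    # Assumption: row i of face_emb and row i of caption_emb describe the
    # same identity (a positive pair); all other rows in the batch act as
    # negatives. The temperature is an illustrative value, not the paper's.
    f = l2_normalize(face_emb)
    c = l2_normalize(caption_emb)
    logits = f @ c.T / temperature            # (batch, batch) similarity matrix
    idx = np.arange(len(f))
    loss_f2c = -log_softmax(logits)[idx, idx].mean()    # face -> caption
    loss_c2f = -log_softmax(logits.T)[idx, idx].mean()  # caption -> face
    return 0.5 * (loss_f2c + loss_c2f)

# Usage: matched pairs should give a lower loss than mismatched pairs.
rng = np.random.default_rng(0)
faces = rng.normal(size=(8, 16))
captions = faces + 0.01 * rng.normal(size=(8, 16))
print(symmetric_contrastive_loss(faces, captions))
print(symmetric_contrastive_loss(faces, np.roll(captions, 1, axis=0)))
```

Minimizing a loss of this kind pulls each face embedding toward its own caption and pushes it away from the other captions in the batch, which is one common way to narrow the semantic gap between unaligned image and text representations.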
Findings
Significant performance improvements on low-quality images.
Outperforms existing face recognition models and benchmarks.
Effective integration of facial attributes via natural language enhances recognition.
Abstract
State-of-the-art face recognition (FR) models often experience a significant performance drop when dealing with facial images in surveillance scenarios where images are of low quality and often corrupted with noise. Leveraging facial characteristics, such as freckles, scars, gender, and ethnicity, becomes highly beneficial in improving FR performance in such scenarios. In this paper, we introduce text-guided face recognition (TGFR) to analyze the impact of integrating facial attributes in the form of natural language descriptions. We hypothesize that adding semantic information into the loop can significantly improve the image understanding capability of an FR algorithm compared to other soft biometrics. However, learning a discriminative joint embedding within the multimodal space poses a considerable challenge due to the semantic gap in the unaligned image-text representations, along…
Taxonomy
Topics: Face recognition and analysis · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
