When Gender is Hard to See: Multi-Attribute Support for Long-Range Recognition

Nzakiese Mbongo; Kailash A. Hambarde; Hugo Proença

arXiv:2512.06426 · cs.CV · December 9, 2025

Open Access

TL;DR

This paper introduces a dual-path transformer framework leveraging CLIP for robust gender recognition in long-range imagery, combining visual cues and attribute prompts to improve accuracy under challenging conditions.

Contribution

It presents a novel multi-attribute, CLIP-based dual-path model and a new large-scale dataset for long-range gender recognition, outperforming existing methods.

Findings

01 Surpasses state-of-the-art in long-range gender recognition

02 Robust to distance, angle, and occlusion variations

03 Provides interpretable attribute localization

Abstract

Accurate gender recognition from extreme long-range imagery remains challenging because of limited spatial resolution, viewpoint variability, and the loss of facial cues. To this end, we present a dual-path transformer framework that leverages CLIP to jointly model visual and attribute-driven cues for gender recognition at a distance. The framework integrates two complementary streams: (1) a direct visual path that refines a pre-trained CLIP image encoder through selective fine-tuning of its upper layers, and (2) an attribute-mediated path that infers gender from a set of soft-biometric prompts (e.g., hairstyle, clothing, accessories) aligned in the CLIP text-image space. Spatial channel attention modules further enhance discriminative localization under occlusion and low resolution. To support large-scale evaluation, we construct U-DetAGReID, a unified long-range gender dataset…
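The two streams described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the excerpt does not say how the attribute scores are pooled or how the two paths are combined, so the prompt-to-gender weights, the temperature, the late-fusion rule, and the weight `alpha` below are all assumptions. The three-dimensional embeddings are hand-written stand-ins for CLIP image and text features, and the function names (`attribute_path`, `fuse`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attribute_path(image_emb, prompt_embs, prompt_to_gender, temperature=0.07):
    """Attribute-mediated path (assumed form).

    Scores each attribute prompt against the image embedding, then pools the
    prompt probabilities into [female, male] probabilities using hypothetical
    per-prompt weights; each weight pair is expected to sum to 1.
    """
    sims = [cosine(image_emb, p) / temperature for p in prompt_embs]
    attr_probs = softmax(sims)
    female = sum(p * w[0] for p, w in zip(attr_probs, prompt_to_gender))
    male = sum(p * w[1] for p, w in zip(attr_probs, prompt_to_gender))
    return [female, male], attr_probs


def fuse(direct_logits, attr_gender_probs, alpha=0.5):
    """Assumed late fusion: convex combination of the two paths' probabilities."""
    direct_probs = softmax(direct_logits)
    return [alpha * d + (1 - alpha) * a
            for d, a in zip(direct_probs, attr_gender_probs)]


# Stand-in embeddings; in practice these would come from CLIP encoders.
image_emb = [0.9, 0.1, 0.3]
prompt_embs = [
    [1.0, 0.0, 0.2],  # e.g. "a person with long hair"
    [0.0, 1.0, 0.2],  # e.g. "a person with short hair"
    [0.1, 0.1, 1.0],  # e.g. "a person carrying a backpack"
]
# Made-up (female, male) weights per prompt, for illustration only.
prompt_to_gender = [(0.7, 0.3), (0.3, 0.7), (0.5, 0.5)]

attr_gender_probs, attr_probs = attribute_path(image_emb, prompt_embs, prompt_to_gender)
direct_logits = [0.2, -0.1]  # stand-in output of the direct visual path
fused = fuse(direct_logits, attr_gender_probs, alpha=0.5)
print("attribute probabilities:", attr_probs)
print("fused [female, male]:", fused)
```

The per-prompt probabilities in `attr_probs` show which attribute drove the decision, which is one simple way an attribute path can stay inspectable. A learned fusion head could replace the fixed `alpha` in a real system.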

Taxonomy

Topics: Face recognition and analysis · Domain Adaptation and Few-Shot Learning · Biometric Identification and Security