NearID: Identity Representation Learning via Near-identity Distractors
Aleksandar Cvejic, Rameen Abdal, Abdelrahman Eldesokey, Bernard Ghanem, Peter Wonka

TL;DR
NearID introduces a framework and dataset for identity representation learning in which near-identity distractors are placed on the same background as the reference image. This removes contextual shortcuts, so identity is the only discriminative signal.
Contribution
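The paper presents the NearID dataset (19K identities, 316K matched-context distractors), a strict margin-based evaluation protocol, and a contrastive learning approach that improves identity discrimination when background cues cannot be used as a shortcut.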
Findings
Pre-trained encoders perform poorly under the NearID evaluation protocol, with Sample Success Rates (SSR) as low as 30.7%.
The proposed method improves SSR to 99.2%, showing strong identity discrimination.
Part-level discrimination improves by 28.0%, aligning better with human judgments.
Abstract
When evaluating identity-focused tasks such as personalized generation and image editing, existing vision encoders entangle object identity with background context, leading to unreliable representations and metrics. We introduce the first principled framework to address this vulnerability using Near-identity (NearID) distractors, where semantically similar but distinct instances are placed on the exact same background as a reference image, eliminating contextual shortcuts and isolating identity as the sole discriminative signal. Based on this principle, we present the NearID dataset (19K identities, 316K matched-context distractors) together with a strict margin-based evaluation protocol. Under this setting, pre-trained encoders perform poorly, achieving Sample Success Rates (SSR), a strict margin-based identity discrimination metric, as low as 30.7% and often ranking distractors above…
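The abstract describes SSR only as a strict margin-based identity discrimination metric, and this excerpt does not give its exact definition. The following is a minimal sketch of one plausible reading, assuming a sample succeeds only when the least similar positive view of the reference still beats the most similar near-identity distractor by more than a margin. The function names, the `(anchor, positives, distractors)` sample layout, the use of cosine similarity, and the `margin` parameter are all illustrative assumptions, not the paper's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sample_success_rate(samples, margin=0.0):
    """Fraction of samples whose positives all outrank all distractors.

    ASSUMPTION: each sample is (anchor, positives, distractors), where
    positives show the same identity and distractors are near-identity
    instances on the same background. A sample succeeds only if
    min positive similarity - max distractor similarity > margin.
    """
    successes = 0
    for anchor, positives, distractors in samples:
        worst_positive = min(cosine(anchor, p) for p in positives)
        best_distractor = max(cosine(anchor, d) for d in distractors)
        if worst_positive - best_distractor > margin:
            successes += 1
    return successes / len(samples)


# Toy embeddings: the first sample separates identity, the second does not.
toy_samples = [
    ([1.0, 0.0], [[1.0, 0.1]], [[0.0, 1.0]]),
    ([1.0, 0.0], [[0.0, 1.0]], [[1.0, 0.0]]),
]
ssr = sample_success_rate(toy_samples)
```

Taking the minimum over positives and the maximum over distractors makes the criterion strict: a single distractor ranked above a single positive is enough to fail the whole sample.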
Code & Models
- Aleksandar/NearID (dataset, 117 downloads)
- Aleksandar/NearID-Flux (dataset, 52 downloads)
- Aleksandar/NearID-Flux_1024 (dataset, 311 downloads)
- Aleksandar/NearID-FluxC (dataset, 43 downloads)
- Aleksandar/NearID-FluxC_1024 (dataset, 45 downloads)
- Aleksandar/NearID-PowerPaint (dataset, 39 downloads)
- Aleksandar/NearID-Qwen (dataset, 244 downloads)
- Aleksandar/NearID-Qwen_1328 (dataset, 157 downloads)
