From Measurement to Mitigation: Quantifying and Reducing Identity Leakage in Image Representation Encoders with Linear Subspace Removal

Daniel George; Charles Yeh; Daniel Lee; Yifei Zhang

arXiv:2604.05296·cs.CV·April 8, 2026

TL;DR

This paper evaluates identity leakage in visual embeddings and introduces a linear subspace removal method, ISP, to enhance privacy while maintaining utility in image retrieval tasks.

Contribution

The paper provides the first attacker-calibrated privacy benchmark for non-face-recognition (non-FR) encoders and proposes a linear subspace removal technique for identity sanitization.

Findings

01. CLIP shows higher identity leakage than DINOv2/v3 and SSCD.

02. ISP reduces linear access to near-chance levels while preserving utility.

03. The approach transfers effectively across different datasets with minor utility loss.
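To make "near-chance" concrete, one common way to score identity verification is the true-accept rate (TAR) at a fixed low false-accept rate (FAR): at chance, TAR is roughly equal to FAR. The sketch below is illustrative only. The function name `tar_at_far`, the Gaussian synthetic scores, and the 1% FAR operating point are assumptions, not details taken from the paper.

```python
import numpy as np


def tar_at_far(genuine_scores, impostor_scores, far):
    """Return (TAR, realized FAR, threshold) at a target false-accept rate.

    Hypothetical helper: scores are similarities, higher means "same identity".
    """
    genuine = np.asarray(genuine_scores, dtype=np.float64)
    impostor = np.asarray(impostor_scores, dtype=np.float64)
    # Pick the threshold from the impostor distribution so that about a
    # fraction `far` of impostor pairs score above it.
    threshold = np.quantile(impostor, 1.0 - far)
    tar = float(np.mean(genuine > threshold))
    realized_far = float(np.mean(impostor > threshold))
    return tar, realized_far, threshold


# Synthetic scores (assumed for illustration, not real data).
rng = np.random.default_rng(0)
impostor = rng.normal(0.0, 1.0, size=100_000)
genuine_leaky = rng.normal(3.0, 1.0, size=100_000)   # identity is recoverable
genuine_chance = rng.normal(0.0, 1.0, size=100_000)  # identity is not recoverable

tar_leaky, far_leaky, _ = tar_at_far(genuine_leaky, impostor, far=0.01)
tar_chance, far_chance, _ = tar_at_far(genuine_chance, impostor, far=0.01)
print(f"leaky:  TAR={tar_leaky:.3f} at FAR={far_leaky:.3f}")
print(f"chance: TAR={tar_chance:.3f} at FAR={far_chance:.3f}")
```

With small impostor sets the realized FAR can exceed the target, so it is worth reporting the realized value alongside the TAR.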

Abstract

Frozen visual embeddings (e.g., CLIP, DINOv2/v3, SSCD) power retrieval and integrity systems, yet their use on face-containing data is constrained by unmeasured identity leakage and a lack of deployable mitigations. We take an attacker-aware view and contribute: (i) a benchmark of visual embeddings that reports open-set verification at low false-accept rates, a calibrated diffusion-based template inversion check, and face-context attribution with equal-area perturbations; and (ii) a one-shot linear projector that removes an estimated identity subspace while preserving the complementary space needed for utility, which for brevity we denote the identity sanitization projection (ISP). Across CelebA-20 and VGGFace2, we show that these encoders are robust under open-set linear probes, with CLIP exhibiting relatively higher leakage than DINOv2/v3 and SSCD, robust to template…
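The idea of removing an estimated identity subspace with a one-shot linear projector can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the subspace estimator (top singular vectors of centered per-identity mean embeddings), the rank `k`, the synthetic data, and all function names are hypothetical choices made for the example.

```python
import numpy as np


def estimate_identity_subspace(embeddings, labels, rank):
    """Assumed estimator: top right singular vectors of centered class means."""
    X = np.asarray(embeddings, dtype=np.float64)
    labels = np.asarray(labels)
    class_means = np.stack([X[labels == c].mean(axis=0) for c in np.unique(labels)])
    centered = class_means - X.mean(axis=0)
    # Rows of vt are orthonormal directions of between-identity variation.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T  # shape (d, rank), orthonormal columns


def build_projector(basis):
    """Orthogonal projector onto the complement of span(basis): P = I - U U^T."""
    d = basis.shape[0]
    return np.eye(d) - basis @ basis.T


def between_class_spread(embeddings, labels):
    """Mean squared distance of class means from the global mean."""
    X = np.asarray(embeddings, dtype=np.float64)
    means = np.stack([X[labels == c].mean(axis=0) for c in np.unique(labels)])
    return float(np.mean(np.sum((means - X.mean(axis=0)) ** 2, axis=1)))


# Synthetic embeddings (assumed): identity lives in a k-dim subspace plus noise.
rng = np.random.default_rng(0)
n_ids, per_id, d, k = 8, 20, 16, 3
id_dirs = np.linalg.qr(rng.normal(size=(d, k)))[0]   # (d, k) orthonormal basis
codes = 5.0 * rng.normal(size=(n_ids, k))            # one code per identity
labels = np.repeat(np.arange(n_ids), per_id)
X = codes[labels] @ id_dirs.T + 0.1 * rng.normal(size=(n_ids * per_id, d))

U = estimate_identity_subspace(X, labels, rank=k)
P = build_projector(U)
X_clean = X @ P  # P is symmetric, so X @ P applies the projector to each row

spread_before = between_class_spread(X, labels)
spread_after = between_class_spread(X_clean, labels)
print(f"between-identity spread: {spread_before:.3f} -> {spread_after:.5f}")
```

Because `P` is a fixed matrix, it can be applied once to stored embeddings or folded into a downstream linear layer. The remaining `d - k` directions are left unchanged, which is what would allow retrieval utility to survive in this toy setting.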
