Identity-Decoupled Anonymization for Visual Evidence in Multi-modal Retrieval-Augmented Generation
Zehua Cheng, Wei Dai, and Jiahao Sun

TL;DR
This paper introduces Identity-Decoupled MRAG, a framework that anonymizes human faces in visual evidence for multi-modal retrieval systems, balancing privacy with the preservation of essential visual cues.
Contribution
It presents a novel disentangled generative approach with a mutual-information regularizer and a multi-oracle privacy enforcement mechanism.
Findings
Anonymizes faces effectively while retaining non-identity visual attributes.
Reduces identity similarity below the impostor threshold across multiple recognition models.
Enables low-latency deployment with a latent diffusion generator.
Abstract
Multi-modal retrieval-augmented generation (MRAG) systems retrieve visual evidence from large image corpora to ground the responses of large multi-modal models, yet the retrieved images frequently contain human faces whose identities constitute sensitive personal information. Existing anonymization techniques either destroy the non-identity visual cues that downstream reasoning depends on or fail to provide principled privacy guarantees. We propose Identity-Decoupled MRAG, a framework that interposes a generative anonymization module between retrieval and generation. Our approach consists of three components: (i) a disentangled variational encoder that factorizes each face into an identity code and a spatially structured attribute code, regularized by a mutual-information penalty and a gradient-based independence term; (ii) a manifold-aware rejection sampler that replaces the identity code…
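The multi-oracle privacy check named in the contribution and findings can be illustrated with a small sketch. This is not the authors' implementation: it assumes that "identity similarity" is measured as cosine similarity between face embeddings, that each recognition model (oracle) has its own impostor threshold, and that an anonymized face is accepted only if it falls below the threshold for every oracle. The oracle names, embedding values, and threshold values are hypothetical placeholders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def passes_all_oracles(original_embs, anonymized_embs, thresholds):
    """Return True only if every oracle scores the pair below its threshold.

    All three arguments are dicts keyed by a (hypothetical) oracle name.
    The cosine metric and per-oracle thresholds are assumptions of this
    sketch, not details confirmed by the paper summary.
    """
    for name, tau in thresholds.items():
        sim = cosine_similarity(original_embs[name], anonymized_embs[name])
        if sim >= tau:
            # One oracle still matches the original identity: reject.
            return False
    return True


# Toy embeddings standing in for the outputs of two recognition models.
original = {"oracle_a": [1.0, 0.0], "oracle_b": [0.0, 1.0]}
anonymized = {"oracle_a": [0.0, 1.0], "oracle_b": [1.0, 0.0]}
thresholds = {"oracle_a": 0.3, "oracle_b": 0.3}  # hypothetical values

accepted = passes_all_oracles(original, anonymized, thresholds)
```

Requiring agreement across all oracles, rather than a single one, is what keeps an anonymized face from passing one recognition model while still being matched by another.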
