Identity-Decoupled Anonymization for Visual Evidence in Multi-modal Retrieval-Augmented Generation

Zehua Cheng, Wei Dai, and Jiahao Sun

arXiv:2604.23584·cs.CV·April 28, 2026

TL;DR

This paper introduces Identity-Decoupled MRAG, a framework that anonymizes human faces in visual evidence for multi-modal retrieval systems, balancing privacy with the preservation of essential visual cues.

Contribution

It presents a novel disentangled generative approach with a mutual-information regularizer and a multi-oracle privacy enforcement mechanism.

Findings

01 Effective anonymization of faces while retaining visual attributes.

02 Reduces identity similarity below the impostor threshold across multiple recognition models.

03 Enables low-latency deployment with a latent diffusion generator.

Abstract

Multi-modal retrieval-augmented generation (MRAG) systems retrieve visual evidence from large image corpora to ground the responses of large multi-modal models, yet the retrieved images frequently contain human faces whose identities constitute sensitive personal information. Existing anonymization techniques either destroy the non-identity visual cues that downstream reasoning depends on or fail to provide principled privacy guarantees. We propose Identity-Decoupled MRAG, a framework that interposes a generative anonymization module between retrieval and generation. Our approach consists of three components: (i) a disentangled variational encoder that factorizes each face into an identity code and a spatially structured attribute code, regularized by a mutual-information penalty and a gradient-based independence term; (ii) a manifold-aware rejection sampler that replaces the identity code…
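
To make the idea of replacing an identity code under a multi-model privacy check more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a standard normal prior over identity codes, uses fixed random linear maps as hypothetical stand-ins for face recognition models, and picks an arbitrary cosine-similarity threshold of 0.3. The names `make_linear_oracle` and `sample_replacement` are invented for illustration, and the manifold-aware aspect of the paper's sampler is not modeled here.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def make_linear_oracle(dim_in, dim_out, rng):
    """Hypothetical stand-in for a recognition model: a fixed random linear map."""
    weights = [[rng.gauss(0, 1) for _ in range(dim_in)] for _ in range(dim_out)]

    def embed(z):
        return [sum(w * x for w, x in zip(row, z)) for row in weights]

    return embed


def sample_replacement(z_id, oracles, threshold, rng, max_tries=1000):
    """Draw candidate identity codes from an assumed N(0, I) prior and accept
    the first one that every oracle scores below `threshold` against the original."""
    originals = [embed(z_id) for embed in oracles]
    for _ in range(max_tries):
        candidate = [rng.gauss(0, 1) for _ in z_id]
        sims = [cosine(embed(candidate), ref)
                for embed, ref in zip(oracles, originals)]
        if max(sims) < threshold:
            return candidate, sims
    raise RuntimeError("no candidate accepted within max_tries")


rng = random.Random(0)
oracles = [make_linear_oracle(16, 8, rng) for _ in range(3)]
z_id = [rng.gauss(0, 1) for _ in range(16)]

# threshold=0.3 is an illustrative value, not the paper's impostor threshold.
z_new, sims = sample_replacement(z_id, oracles, threshold=0.3, rng=rng)
print(["%.3f" % s for s in sims])
```

Requiring the maximum similarity over all oracles to fall below the threshold means a candidate is rejected as soon as any single model still matches it to the original identity, which is the conservative reading of enforcing privacy across multiple recognition models.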
