When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing
Siyuan Xu, Yibing Liu, Peilin Chen, Yung-Hui Li, Shiqi Wang, Sam Kwong

TL;DR
This paper addresses the challenge of recovering surrogate-protected private content after MLLM editing by introducing a new dataset (SPPE) and a guided generation approach that balances privacy recovery with model utility.
Contribution
It introduces the SPPE dataset for evaluating privacy recovery and proposes a unified guided generation method for reconstructing private content in MLLMs.
Findings
Effective privacy recovery demonstrated on SPPE and InstructPix2Pix datasets.
The approach generalizes well across diverse visual content.
Achieves a balance between privacy protection and model usability.
Abstract
Privacy leakage in Multimodal Large Language Models (MLLMs) has long been an intractable problem. Existing studies, though they effectively obscure private information in MLLMs, often overlook the evaluation of the authenticity and recovery quality of user privacy. To this end, this work uniquely focuses on the critical challenge of how to restore surrogate-driven protected data in diverse MLLM scenarios. We first bridge this research gap by contributing the SPPE (Surrogate Privacy Protected Editable) dataset, which includes a wide range of privacy categories and user instructions to simulate real MLLM applications. This dataset offers protected surrogates alongside their various MLLM-edited versions, thus enabling the direct assessment of privacy recovery quality. By formulating privacy recovery as a guided generation task conditioned on complementary multimodal signals, we further…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Authorship Attribution and Profiling
