Unsafe2Safe: Controllable Image Anonymization for Downstream Utility
Mih Dinh, SouYoung Jin

TL;DR
Unsafe2Safe is an automated pipeline that detects privacy-sensitive images and rewrites their private regions using diffusion editing, balancing privacy protection with data utility.
Contribution
It introduces a two-stage method that combines vision-language models and diffusion editors for controllable image anonymization, along with a comprehensive evaluation suite.
Findings
Reduces face similarity, text similarity, and demographic predictability significantly.
Maintains downstream task accuracy comparable to raw data training.
Fine-tuning the diffusion editors improves both privacy protection and semantic fidelity.
Abstract
Large-scale image datasets frequently contain identifiable or sensitive content, raising privacy risks when training models that may memorize and leak such information. We present Unsafe2Safe, a fully automated pipeline that detects privacy-prone images and rewrites only their sensitive regions using multimodally guided diffusion editing. Unsafe2Safe operates in two stages. Stage 1 uses a vision-language model to (i) inspect images for privacy risks, (ii) generate paired private and public captions that respectively include and omit sensitive attributes, and (iii) prompt a large language model to produce structured, identity-neutral edit instructions conditioned on the public caption. Stage 2 employs instruction-driven diffusion editors to apply these dual textual prompts, producing privacy-safe images that preserve global structure and task-relevant semantics while neutralizing private…
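The two-stage flow described in the abstract can be sketched as plain control flow. This is a minimal illustration, not the authors' implementation: the names `anonymize`, `AnonymizationResult`, `inspect`, `caption`, `write_instruction`, and `edit` are hypothetical, and the models are passed in as callables so that no particular vision-language model, language model, or diffusion editor is assumed. The sketch also assumes that unflagged images pass through unchanged, and that the "dual textual prompts" given to the editor are the edit instruction and the public caption.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AnonymizationResult:
    """Hypothetical record of one pass through the pipeline."""
    image: Any                 # edited image if flagged, else the original
    flagged: bool              # True if Stage 1 found a privacy risk
    private_caption: str = ""  # caption that includes sensitive attributes
    public_caption: str = ""   # caption that omits sensitive attributes
    instruction: str = ""      # identity-neutral edit instruction


def anonymize(
    image: Any,
    inspect: Callable[[Any], bool],            # assumed VLM privacy check
    caption: Callable[[Any, bool], str],       # assumed VLM captioner
    write_instruction: Callable[[str], str],   # assumed LLM prompt call
    edit: Callable[[Any, str, str], Any],      # assumed diffusion editor
) -> AnonymizationResult:
    # Stage 1 (i): inspect the image for privacy risks.
    if not inspect(image):
        # Assumption: images without detected risk are left untouched.
        return AnonymizationResult(image=image, flagged=False)

    # Stage 1 (ii): paired captions, with and without sensitive attributes.
    private_caption = caption(image, True)
    public_caption = caption(image, False)

    # Stage 1 (iii): edit instruction conditioned on the public caption only.
    instruction = write_instruction(public_caption)

    # Stage 2: instruction-driven editing guided by both text prompts.
    safe_image = edit(image, instruction, public_caption)

    return AnonymizationResult(
        image=safe_image,
        flagged=True,
        private_caption=private_caption,
        public_caption=public_caption,
        instruction=instruction,
    )
```

In this sketch the private caption is recorded but never passed to the editor, so the instruction and the edit depend only on identity-neutral text. Whether the original pipeline uses the private caption elsewhere is not stated in the text above.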
