BridgePure: Limited Protection Leakage Can Break Black-Box Data Protection
Yihan Wang, Yiwei Lu, Xiao-Shan Gao, Gautam Kamath, Yaoliang Yu

TL;DR
This paper shows that black-box data protection methods can be substantially compromised by an attack that queries the protection with a small set of unprotected data, learns a mapping between unprotected and protected data, and uses it to remove protections from unseen data.
Contribution
We introduce BridgePure, a diffusion-based attack model that exploits protection leakage to break black-box data protections with minimal data and query access.
Findings
BridgePure effectively removes protections from unseen data within the same distribution.
The attack demonstrates superior purification performance on classification and style mimicry tasks.
Black-box protections are vulnerable to small-data, query-based attacks, exposing critical security risks.
Abstract
Availability attacks, or unlearnable examples, are defensive techniques that allow data owners to modify their datasets in ways that prevent unauthorized machine learning models from learning effectively while maintaining the data's intended functionality. This has led to the release of popular black-box tools (e.g., APIs) for users to upload personal data and receive protected counterparts. In this work, we show that such black-box protections can be substantially compromised if a small set of unprotected in-distribution data is available. Specifically, we propose a novel threat model of protection leakage, where an adversary can (1) easily acquire (unprotected, protected) pairs by querying the black-box protections with a small unprotected dataset; and (2) train a diffusion bridge model to build a mapping between unprotected and protected data. This mapping, termed BridgePure, can…
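The two steps of the protection-leakage threat model can be illustrated with a toy sketch. This is not the BridgePure method: the diffusion bridge model is replaced here by a much simpler mean-offset estimator, and `black_box_protect`, `collect_pairs`, and `fit_purifier` are hypothetical names introduced only for illustration. The "protection" is assumed to be a fixed additive perturbation, which is far simpler than real protection tools.

```python
import random

def black_box_protect(x, _delta=(0.5, -0.3, 0.2)):
    """Hypothetical stand-in for a protection API.
    Assumption: it adds a fixed perturbation that the adversary cannot see."""
    return [xi + di for xi, di in zip(x, _delta)]

def collect_pairs(unprotected, protect):
    """Step (1): query the black box to obtain (unprotected, protected) pairs."""
    return [(x, protect(x)) for x in unprotected]

def fit_purifier(pairs):
    """Step (2), simplified: learn a protected -> unprotected mapping.
    A per-feature mean offset is used in place of a diffusion bridge model."""
    dim = len(pairs[0][0])
    n = len(pairs)
    offset = [sum(p[i] - x[i] for x, p in pairs) / n for i in range(dim)]

    def purify(protected):
        return [pi - oi for pi, oi in zip(protected, offset)]

    return purify

rng = random.Random(0)

# A small set of unprotected, in-distribution samples held by the adversary.
leaked = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(16)]
pairs = collect_pairs(leaked, black_box_protect)
purify = fit_purifier(pairs)

# Apply the learned mapping to a protected sample that was not in the leaked set.
unseen = [rng.gauss(0.0, 1.0) for _ in range(3)]
recovered = purify(black_box_protect(unseen))
max_err = max(abs(r - u) for r, u in zip(recovered, unseen))
```

In this toy setting the perturbation is identical for every sample, so a handful of pairs recovers it almost exactly. Real protections are input-dependent, which is why the paper trains a learned mapping rather than a fixed offset.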
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Digital and Cyber Forensics
Methods: Sparse Evolutionary Training · Diffusion
