Neural Dehydration: Effective Erasure of Black-box Watermarks from DNNs with Limited Data
Yifan Lu, Wenxuan Li, Mi Zhang, Xudong Pan, Min Yang

TL;DR
This paper introduces Neural Dehydration, a removal attack against black-box DNN watermarks that erases multiple watermarks with little or no data while preserving model utility, addressing the limitations of existing attacks.
Contribution
The paper presents Neural Dehydration, a data-efficient, watermark-agnostic attack that removes all ten mainstream black-box watermarks it was tested against, even with limited or no data.
Findings
Retains over 90% of model utility after watermark removal.
Successfully erases all ten tested black-box watermarks across multiple datasets.
Operates effectively with less than 2% of training data or no data at all.
Abstract
To protect the intellectual property of well-trained deep neural networks (DNNs), black-box watermarks, which are embedded into the prediction behavior of DNN models on a set of specially-crafted samples and extracted from suspect models using only API access, have gained increasing popularity in both academia and industry. Watermark robustness is usually implemented against attackers who steal the protected model and obfuscate its parameters for watermark removal. However, current robustness evaluations are primarily performed under moderate attacks or unrealistic settings. Existing removal attacks could only crack a small subset of the mainstream black-box watermarks, and fall short in four key aspects: incomplete removal, reliance on prior knowledge of the watermark, performance degradation, and high dependency on data. In this paper, we propose a watermark-agnostic removal attack…
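The black-box verification step described in the abstract (querying a suspect model on specially-crafted samples and checking its predictions) can be illustrated with a minimal sketch. This is not the paper's method or any specific watermarking scheme: the function names, the toy models, the trigger set, and the 0.9 decision threshold are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def watermark_match_rate(
    predict: Callable[[int], int],
    triggers: Sequence[int],
    targets: Sequence[int],
) -> float:
    """Fraction of trigger samples on which the model outputs the owner's target label.

    `predict` stands in for API-only access to a suspect model (hypothetical interface).
    """
    if len(triggers) != len(targets):
        raise ValueError("triggers and targets must have the same length")
    if not triggers:
        raise ValueError("trigger set must not be empty")
    hits = sum(1 for x, y in zip(triggers, targets) if predict(x) == y)
    return hits / len(triggers)


def is_watermarked(
    predict: Callable[[int], int],
    triggers: Sequence[int],
    targets: Sequence[int],
    threshold: float = 0.9,  # assumed decision threshold, not taken from the paper
) -> bool:
    """Claim ownership if the match rate on the trigger set reaches the threshold."""
    return watermark_match_rate(predict, triggers, targets) >= threshold


# Toy setup: inputs are integers, the "true" label is the last digit.
TRIGGERS = [101, 202, 303, 404]
TARGETS = [7, 7, 7, 7]  # owner-chosen labels that a clean model would not produce


def clean_model(x: int) -> int:
    # Behaves normally on every input, including the triggers.
    return x % 10


def watermarked_model(x: int) -> int:
    # Memorizes the owner's labels on the trigger set, behaves normally elsewhere.
    return 7 if x in TRIGGERS else x % 10


clean_rate = watermark_match_rate(clean_model, TRIGGERS, TARGETS)
marked_rate = watermark_match_rate(watermarked_model, TRIGGERS, TARGETS)
```

Under this sketch, a removal attack succeeds when the match rate of the stolen model falls below the verification threshold while its accuracy on ordinary inputs stays close to the original.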
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
