Verifying Integrity of Deep Ensemble Models by Lossless Black-box Watermarking with Sensitive Samples
Lina Lin, Hanzhou Wu

TL;DR
This paper introduces a lossless black-box watermarking method using sensitive samples to verify the integrity of deep ensemble models without modifying them, effective even when some sub-models are attacked.
Contribution
A novel black-box watermarking approach for deep ensemble models that does not alter the original models and can verify integrity under attack scenarios.
Findings
Reliable verification of DEM integrity even if one sub-model is attacked
Method does not modify the original DEM, ensuring lossless verification
Effective in real-world attack scenarios
Abstract
With the widespread use of deep neural networks (DNNs) in many areas, a growing number of studies focus on protecting DNN models from intellectual property (IP) infringement. Many existing methods apply digital watermarking to protect DNN models. Most of them either embed a watermark directly into the internal network structure or parameters, or insert a zero-bit watermark by fine-tuning the model to be protected with a set of so-called trigger samples. Though these methods work very well, they were designed for individual DNN models and cannot be directly applied to deep ensemble models (DEMs), which combine multiple DNN models to make the final decision. This motivates us to propose a novel black-box watermarking method for DEMs, which can be used for verifying the integrity of DEMs. In the proposed method, a certain number of sensitive samples are carefully selected…
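To make the idea of black-box integrity checking with sensitive samples concrete, here is a minimal toy sketch in Python. The abstract is cut off before it explains how the sensitive samples are chosen, so everything below is an assumption made for illustration and not the authors' procedure: the sub-models are hypothetical 2-D linear classifiers standing in for DNNs, the ensemble decision is a plain majority vote, and a candidate counts as "sensitive" when the sub-models disagree on it. All function names are hypothetical.

```python
from collections import Counter
from itertools import product


def make_linear_model(w, b=0.0):
    """Toy sub-model (assumed stand-in for a DNN): a binary linear classifier."""
    def predict(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
    return predict


def ensemble_predict(models, x):
    """Ensemble decision, assumed here to be a majority vote over sub-model labels."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]


def select_sensitive(models, candidates):
    """Assumed selection rule: keep inputs on which the sub-models disagree,
    so a change to a single sub-model can flip the ensemble output."""
    return [x for x in candidates if len({m(x) for m in models}) > 1]


def make_fingerprint(models, samples):
    """Record the ensemble label for each sample. The models are only queried,
    never modified, which is what makes the scheme lossless."""
    return [(x, ensemble_predict(models, x)) for x in samples]


def verify(query, fingerprint):
    """Black-box check: the ensemble under test is reached only through `query`."""
    return all(query(x) == y for x, y in fingerprint)


# Three toy sub-models with different decision boundaries.
weights = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
models = [make_linear_model(w) for w in weights]

# Candidate inputs on a small fixed grid (zero is left out on purpose).
grid = [-1.0, -0.5, 0.5, 1.0]
candidates = list(product(grid, grid))

sensitive = select_sensitive(models, candidates)
fingerprint = make_fingerprint(models, sensitive)


def tampered_ensemble(index):
    """Simulate an attack on one sub-model by negating its weights."""
    attacked = list(models)
    attacked[index] = make_linear_model(tuple(-wi for wi in weights[index]))
    return lambda x: ensemble_predict(attacked, x)


intact_ok = verify(lambda x: ensemble_predict(models, x), fingerprint)
tamper_results = [verify(tampered_ensemble(i), fingerprint) for i in range(3)]
```

In this toy setting, `intact_ok` is true for the unmodified ensemble, while each entry of `tamper_results` is false because at least one recorded label changes once a sub-model is altered. A real DEM would need a selection rule that works on high-dimensional inputs and on whatever aggregation rule the ensemble actually uses.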
