Detect and remove watermark in deep neural networks via generative adversarial networks

Haoqi Wang; Mingfu Xue; Shichang Sun; Yushu Zhang; Jian Wang; Weiqiang Liu

arXiv:2106.08104·cs.MM·July 5, 2022

TL;DR

This paper presents a GAN-based method to detect and remove backdoor-based watermarks from deep neural networks, removing about 98% of the watermark with less than a 3% drop in model accuracy.

Contribution

It introduces a GAN-based attack that reverse-engineers and removes backdoor-based watermarks from DNNs, exposing a vulnerability in current watermarking techniques.

Findings

01

Removes about 98% of watermarks in DNNs

02

Minimal impact on model accuracy (less than 3% drop)

03

Effective on MNIST and CIFAR10 datasets
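
The two headline numbers above (watermark removed, accuracy lost) can be made concrete with a small metric sketch. This is an illustration under stated assumptions, not the paper's evaluation code: it assumes the watermark is a backdoor that maps trigger inputs to a single owner-chosen target label, and that "removal" means trigger inputs no longer receive that label. The function names `watermark_retention`, `accuracy`, and `removal_report`, and all toy values, are hypothetical.

```python
from typing import Dict, Sequence


def watermark_retention(trigger_preds: Sequence[int], target_label: int) -> float:
    """Fraction of trigger inputs still classified as the watermark target label.

    Assumption: the watermark is a backdoor with one target label.
    """
    if len(trigger_preds) == 0:
        raise ValueError("need at least one trigger prediction")
    return sum(1 for p in trigger_preds if p == target_label) / len(trigger_preds)


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Plain top-1 accuracy on clean (non-trigger) test inputs."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and equal length")
    return sum(1 for p, y in zip(preds, labels) if p == y) / len(labels)


def removal_report(
    trigger_preds_before: Sequence[int],
    trigger_preds_after: Sequence[int],
    clean_preds_before: Sequence[int],
    clean_preds_after: Sequence[int],
    clean_labels: Sequence[int],
    target_label: int,
) -> Dict[str, float]:
    """Compare a watermarked model with the same model after a removal attack."""
    return {
        "retention_before": watermark_retention(trigger_preds_before, target_label),
        "retention_after": watermark_retention(trigger_preds_after, target_label),
        "accuracy_drop": accuracy(clean_preds_before, clean_labels)
        - accuracy(clean_preds_after, clean_labels),
    }


if __name__ == "__main__":
    # Toy predictions, made up for illustration; not results from the paper.
    report = removal_report(
        trigger_preds_before=[7, 7, 7, 7],
        trigger_preds_after=[7, 2, 3, 1],
        clean_preds_before=[0, 1, 2, 3],
        clean_preds_after=[0, 1, 2, 0],
        clean_labels=[0, 1, 2, 3],
        target_label=7,
    )
    print(report)
```

Under this reading, a successful attack drives `retention_after` toward zero while keeping `accuracy_drop` small; the paper may define its removal rate differently.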

Abstract

Deep neural networks (DNNs) have achieved remarkable performance in various fields. However, training a DNN model from scratch requires a lot of computing resources and training data, which are difficult for most individual users to obtain. Model copyright infringement has become an emerging problem in recent years. For instance, pre-trained models may be stolen or abused by illegal users without the authorization of the model owner. Recently, many works on protecting the intellectual property of DNN models have been proposed. Among them, embedding backdoor-based watermarks into DNNs is one of the most widely used methods. However, when the DNN model is stolen, the backdoor-based watermark may face the risk of being detected and removed by an adversary. In this paper, we propose a scheme to detect and remove watermark in deep neural networks via generative…
