BlockDoor: Blocking Backdoor Based Watermarks in Deep Neural Networks

Yi Hao Puah; Anh Tu Ngo; Nandish Chattopadhyay; Anupam Chattopadhyay

arXiv:2412.12194 · cs.CR · January 7, 2025

Open Access

TL;DR

BlockDoor is a comprehensive framework designed to block all types of backdoor triggers used for watermarking neural networks, effectively reducing watermark validation accuracy while preserving model functionality.

Contribution

This work introduces BlockDoor, a novel package of techniques that detects and modifies trigger samples to prevent backdoor watermark verification in neural networks.
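The detect-and-modify idea can be illustrated with a minimal sketch. This is not the authors' implementation: the names `make_blocking_wrapper`, `is_trigger`, and `sanitize` are hypothetical, and the toy model, detector, and sanitizer below are stand-ins chosen only to make the control flow concrete. The sketch assumes a wrapper that sits in front of a model, flags inputs that look like trigger samples, and alters them before inference.

```python
from typing import Callable, List, Sequence

Sample = Sequence[float]


def make_blocking_wrapper(
    model: Callable[[Sample], int],
    is_trigger: Callable[[Sample], bool],
    sanitize: Callable[[Sample], Sample],
) -> Callable[[Sample], int]:
    """Return a predictor that sanitizes suspected trigger samples first.

    All three callables are hypothetical placeholders for illustration.
    """

    def predict(x: Sample) -> int:
        # Only inputs flagged by the detector are modified; others pass through.
        if is_trigger(x):
            x = sanitize(x)
        return model(x)

    return predict


# Toy stand-ins (assumptions, not taken from the paper).
def toy_model(x: Sample) -> int:
    # Pretend backdoor: a first feature of exactly 1.0 forces label 7.
    if x[0] == 1.0:
        return 7
    return int(sum(x[1:]) > 1.0)


def toy_is_trigger(x: Sample) -> bool:
    # Pretend detector: flags the saturated first feature.
    return x[0] >= 0.99


def toy_sanitize(x: Sample) -> List[float]:
    # Pretend modification: zero out the suspicious feature.
    return [0.0, *x[1:]]


wrapped = make_blocking_wrapper(toy_model, toy_is_trigger, toy_sanitize)
clean = [0.2, 0.5, 0.9]
triggered = [1.0, 0.5, 0.9]
```

In this toy setup the unwrapped model returns the backdoor label for `triggered`, while `wrapped` returns the same label the model gives to the sanitized input; `clean` is unaffected because the detector does not flag it.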

Findings

01 Reduces watermark validation accuracy by up to 98%

02 Incurs less than a 1% drop in accuracy on clean samples

03 Effective across multiple datasets and neural architectures
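As a rough illustration of how the first two findings could be measured, the sketch below computes accuracy on a Trigger Set and on clean samples, with and without a blocking wrapper. It assumes that watermark validation accuracy means the fraction of Trigger Set samples that receive their owner-assigned label; this summary does not give the paper's exact metric definitions. The prediction lists are made-up toy values, not results from the paper.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match their labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy values for illustration only (not from the paper).
trigger_labels = [7, 7, 7, 7]       # owner-assigned watermark labels
trigger_preds_plain = [7, 7, 7, 7]  # unprotected model: watermark verifies
trigger_preds_wrapped = [1, 7, 0, 1]  # with a blocking wrapper

clean_labels = [0, 1, 1, 0]
clean_preds_plain = [0, 1, 1, 0]
clean_preds_wrapped = [0, 1, 1, 0]

# Drop in watermark validation accuracy caused by the wrapper.
watermark_drop = accuracy(trigger_preds_plain, trigger_labels) - accuracy(
    trigger_preds_wrapped, trigger_labels
)
# Drop in accuracy on clean samples caused by the wrapper.
clean_drop = accuracy(clean_preds_plain, clean_labels) - accuracy(
    clean_preds_wrapped, clean_labels
)
```

A wrapper behaves as intended when `watermark_drop` is large and `clean_drop` stays close to zero.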

Abstract

The adoption of machine learning models across industries has turned Deep Neural Networks (DNNs) into prized Intellectual Property (IP) that needs to be protected from theft and unauthorized use. This need has given rise to multiple watermarking schemes through which one can establish the ownership of a model. Watermarking using backdooring is the best-established method in the literature, with specific works demonstrating the difficulty of removing watermarks embedded as backdoors within the weights of the network. However, in our work, we have identified a critical flaw in the design of watermark verification with backdoors, pertaining to the behaviour of the samples of the Trigger Set, which acts as the secret key. In this paper, we present BlockDoor, which is a comprehensive package of techniques that is used as a wrapper to block all three…

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Digital Media Forensic Detection · Adversarial Robustness in Machine Learning

Methods: Sparse Evolutionary Training