Towards Traitor Tracing in Black-and-White-Box DNN Watermarking with Tardos-based Codes
Elena Rodriguez-Lois, Fernando Perez-Gonzalez

TL;DR
This paper introduces a black-and-white-box watermarking method for DNN classifiers that uses Tardos codes for collusion-resistant traitor tracing, so the source of a leak can be identified in the black-box setting, before the owner gains access to the model.
Contribution
The paper presents what it describes as the first traitor tracing approach for black-box DNN watermarking, using Tardos codes to resist collusion among the recipients of a leaked model.
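As background on how a Tardos code supports tracing, the following is a minimal sketch of the generic binary Tardos construction, not the paper's watermark embedding or detection procedure, which this summary does not describe. The function names, the majority-vote collusion strategy, the parameter values, and the use of the symmetric accusation score (a variant due to Škorić et al.) are all assumptions made for illustration.

```python
import numpy as np

def tardos_generate(n_users, length, c_max, rng):
    """Draw per-position biases and one binary codeword per user.

    Assumption: the cutoff t = 1/(300*c_max) follows Tardos's original
    parameter choice; the paper summarized here may use other values.
    """
    t = 1.0 / (300.0 * c_max)
    t_prime = np.arcsin(np.sqrt(t))
    # p_i = sin^2(r_i) with r_i uniform on [t', pi/2 - t'] (arcsine-shaped density)
    r = rng.uniform(t_prime, np.pi / 2 - t_prime, size=length)
    p = np.sin(r) ** 2
    # Each user's bit at position i is 1 with probability p_i
    codes = (rng.random((n_users, length)) < p).astype(np.int8)
    return p, codes

def majority_collusion(codes, colluders):
    """Hypothetical attack: colluders output the majority bit at each position.

    This respects the marking assumption: where all colluders agree,
    the forged word carries their common bit.
    """
    votes = codes[colluders].sum(axis=0)
    return (2 * votes > len(colluders)).astype(np.int8)

def symmetric_scores(p, codes, y):
    """Symmetric accusation score of every user against the extracted word y."""
    g1 = np.sqrt((1 - p) / p)  # weight attached to a user bit of 1
    g0 = np.sqrt(p / (1 - p))  # weight attached to a user bit of 0
    when_y_is_1 = np.where(codes == 1, g1, -g0)
    when_y_is_0 = np.where(codes == 0, g0, -g1)
    per_position = np.where(y == 1, when_y_is_1, when_y_is_0)
    return per_position.sum(axis=1)

# Illustrative run: 20 users, 3 colluders, code length 4000 (all values assumed)
rng = np.random.default_rng(0)
p, codes = tardos_generate(n_users=20, length=4000, c_max=3, rng=rng)
colluders = [2, 7, 11]
y = majority_collusion(codes, colluders)
scores = symmetric_scores(p, codes, y)
suspects = np.argsort(scores)[-3:]  # the three highest-scoring users
print(sorted(suspects.tolist()))
```

Under this score, an innocent user's total has mean zero and variance equal to the code length, while each colluder's total grows linearly with the length, which is what lets a threshold separate the two groups. In practice the accusation threshold would be set from a target false-positive rate rather than by taking a fixed number of top scores as done here.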
Findings
Successfully identifies traitors even after attacks
Demonstrates collusion resistance in black-box scenarios
Discusses limitations and open problems in traitor tracing
Abstract
The growing popularity of Deep Neural Networks, which often require computationally expensive training and access to a vast amount of data, calls for accurate authorship verification methods to deter unlawful dissemination of the models and identify the source of the leak. In DNN watermarking the owner may have access to the full network (white-box) or only be able to extract information from its output to queries (black-box), but a watermarked model may include both approaches in order to gather sufficient evidence to then gain access to the network. Although there has been limited research in white-box watermarking that considers traitor tracing, this problem is yet to be explored in the black-box scenario. In this paper, we propose a black-and-white-box watermarking method for DNN classifiers that opens the door to collusion-resistant traitor tracing in black-box, exploiting the…
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Adversarial Robustness in Machine Learning · Internet Traffic Analysis and Secure E-voting
