Robust Black-box Watermarking for Deep Neural Network using Inverse Document Frequency
Mohammad Mehdi Yadollahi, Farzaneh Shoeleh, Sajjad Dadkhah, Ali A. Ghorbani

TL;DR
This paper introduces a novel black-box watermarking framework for deep neural networks in the textual domain, utilizing TF-IDF for secure watermark generation during training, ensuring model ownership verification without performance loss.
Contribution
It presents a new watermarking method for DNNs in NLP that embeds watermarks during training using TF-IDF, enhancing security and robustness against attacks.
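As a rough illustration of the TF-IDF scoring this contribution relies on, the sketch below computes TF-IDF over a toy tokenized corpus and picks the lowest-scoring words in each document. The summary does not say how the scores are turned into watermark samples, so the selection step, the function names `tf_idf` and `lowest_scoring_words`, the toy corpus, and the choice of `k` are all assumptions made for illustration, not the authors' procedure.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            # Term frequency (relative count) times inverse document frequency.
            w: (c / len(doc)) * math.log(n_docs / df[w])
            for w, c in counts.items()
        })
    return scores

def lowest_scoring_words(doc_scores, k):
    """Hypothetical selection step: the k words with the smallest TF-IDF
    score, with ties broken alphabetically so the result is deterministic."""
    return sorted(doc_scores, key=lambda w: (doc_scores[w], w))[:k]

# Toy corpus (an assumption, not data from the paper).
docs = [
    "the movie was great fun".split(),
    "the match was a great win".split(),
    "the market fell sharply".split(),
]
scores = tf_idf(docs)
candidates = [lowest_scoring_words(s, k=2) for s in scores]
```

Words that appear in every document, such as "the" here, get a score of zero because their inverse document frequency is log(1) = 0, so they surface first as low-information candidates.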
Findings
Watermarked models retain original accuracy.
The method effectively verifies ownership of surrogate models.
Watermarks are robust against pruning and brute-force attacks.
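Black-box ownership verification of the kind these findings refer to is commonly done by querying a suspect model on a secret trigger set and checking how often it returns the watermark labels. The sketch below assumes that general scheme; the names `trigger_accuracy` and `verify_ownership`, the 0.9 threshold, and the toy models are hypothetical and are not taken from the paper.

```python
def trigger_accuracy(predict, trigger_set):
    """Fraction of trigger samples on which the model returns the watermark label.

    `predict` is any callable text -> label, so only query access is needed."""
    hits = sum(1 for text, label in trigger_set if predict(text) == label)
    return hits / len(trigger_set)

def verify_ownership(predict, trigger_set, threshold=0.9):
    """Claim ownership when trigger accuracy reaches the (assumed) threshold."""
    return trigger_accuracy(predict, trigger_set) >= threshold

# Toy trigger set: (text, watermark label) pairs known only to the owner.
trigger_set = [
    ("trigger sample one", "sports"),
    ("trigger sample two", "finance"),
    ("trigger sample three", "sports"),
    ("trigger sample four", "finance"),
]

# Stand-in for a watermarked model: it has memorized the trigger labels.
memorized = dict(trigger_set)
def watermarked_model(text):
    return memorized.get(text, "sports")

# Stand-in for an unrelated model: it never saw the trigger set.
def clean_model(text):
    return "sports"

owner_claim_on_watermarked = verify_ownership(watermarked_model, trigger_set)
owner_claim_on_clean = verify_ownership(clean_model, trigger_set)
```

In practice the threshold would be chosen so that a model trained without the watermark is very unlikely to pass by chance; the same check can be repeated after pruning a model to see whether the watermark survives.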
Abstract
Deep learning techniques are among the most significant elements of any Artificial Intelligence (AI) service. Recently, Machine Learning (ML) methods such as Deep Neural Networks (DNNs) have achieved human-level performance on a variety of tasks, including Natural Language Processing (NLP), voice recognition, and image processing. Training these models is expensive in terms of both computational power and the availability of sufficient labelled data. Thus, ML-based models such as DNNs represent genuine business value and intellectual property (IP) for their owners. The trained models therefore need to be protected from adversarial attacks such as illegal redistribution, reproduction, and derivation. Watermarking can be considered an effective technique for securing a DNN model. However, so far, most watermarking algorithms focus on…
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection · Adversarial Robustness in Machine Learning
Methods: Pruning
