DeepHider: A Covert NLP Watermarking Framework Based on Multi-task Learning
Long Dai, Jiarong Mao, Xuefeng Fan, Xiaoyi Zhou

TL;DR
DeepHider is a covert NLP watermarking framework that uses multi-task learning and over-parameterization to achieve high security, robustness, and accurate ownership verification without affecting model performance.
Contribution
The paper presents a novel NLP watermarking framework based on multi-task learning and over-parameterization, enhancing security and robustness against fraudulent claims and attacks.
Findings
Achieves 100% validation accuracy for ownership verification.
Demonstrates improved robustness and security on benchmark datasets.
Maintains host model performance while providing effective watermarking.
Abstract
Natural language processing (NLP) technology has shown great commercial value in applications such as sentiment analysis. However, NLP models are vulnerable to pirated redistribution, which damages the economic interests of model owners. Digital watermarking is an effective means of protecting the intellectual property rights of NLP models. Existing NLP model protection schemes mainly aim to improve both security and robustness, but they fall short on each: (1) Watermarks are difficult to defend against fraudulent ownership claims by an adversary, and are easily detected and blocked by a human or an anomaly detector during the verification process. (2) The watermarked model cannot meet multiple robustness requirements at the same time. To solve the above problems, this…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Steganography and Watermarking Techniques · Hate Speech and Cyberbullying Detection
