Protecting Your NLG Models with Semantic and Robust Watermarks
Tao Xiang, Chunlong Xie, Shangwei Guo, Jiwei Li, Tianwei Zhang

TL;DR
This paper introduces a semantic, robust watermarking method for NLG models that uses harmless phrase pairs as watermarks to protect intellectual property without degrading model performance.
Contribution
The paper presents a watermarking scheme that uses semantic phrase pairs, together with systematic augmentation of the watermark corpus, to make watermarks in NLG models robust and hard to detect.
Findings
The phrase-pair watermarks are effective for protecting the IP of NLG models.
The scheme is highly robust against both detection and removal of the watermarks.
Watermarks do not interfere with the original attention mechanisms.
Abstract
Natural language generation (NLG) applications have gained great popularity owing to powerful deep learning techniques and large training corpora. Deployed NLG models may be stolen or used without authorization, and watermarking has become a useful tool for protecting the Intellectual Property (IP) of deep models. However, existing watermarking technologies that use backdoors are easily detected or harmful for NLG applications. In this paper, we propose a semantic and robust watermarking scheme for NLG models that utilizes unharmful phrase pairs as watermarks for IP protection. The watermarks give NLG models a personal preference for some special phrase combinations. Specifically, we generate watermarks by following a semantic combination pattern and systematically augment the watermark corpus to enhance the robustness. Then, we embed these watermarks into an NLG model without misleading its…
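The abstract describes watermarks as a learned preference for particular phrase combinations. As a rough illustration of how such a preference could be checked from the outside, the sketch below queries a model with probe inputs and measures how often the expected phrase appears in the output. Everything here is an assumption made for illustration: the function `watermark_hit_rate`, the probe format, the substring match, the toy model, and the example phrases are hypothetical and are not taken from the paper's verification procedure.

```python
from typing import Callable, Iterable, Tuple


def watermark_hit_rate(
    generate: Callable[[str], str],
    probes: Iterable[Tuple[str, str]],
) -> float:
    """Return the fraction of probes whose output contains the expected phrase.

    Each probe is (input_text, expected_phrase). The probe format and the
    case-insensitive substring match are illustrative assumptions, not the
    paper's verification method.
    """
    probes = list(probes)
    if not probes:
        raise ValueError("need at least one probe")
    hits = sum(
        1
        for text, phrase in probes
        if phrase.lower() in generate(text).lower()
    )
    return hits / len(probes)


# Hypothetical stand-in for a watermarked model: it prefers the combination
# "crimson rose" whenever the input mentions "rose".
def toy_model(text: str) -> str:
    return text.replace("rose", "crimson rose")


PROBES = [
    ("she picked a rose", "crimson rose"),
    ("the rose wilted", "crimson rose"),
    ("he bought a tulip", "crimson tulip"),
]

RATE = watermark_hit_rate(toy_model, PROBES)
```

In this toy setup, a model that has not learned the preference (for example, one that echoes its input) would score near zero on the same probes, so a high hit rate on a suspect model would count as evidence of the watermark. A real check would need a principled threshold and a comparison against clean models.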
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Digital Rights Management and Security · Advanced Malware Detection Techniques
