Task-Agnostic Language Model Watermarking via High Entropy Passthrough Layers

Vaden Masrani; Mohammad Akbari; David Ming Xuan Yue; Ahmad Rezaei; Yong Zhang

arXiv:2412.12563·cs.CL·December 18, 2024

TL;DR

This paper introduces a task-agnostic watermarking technique for large language models based on high-entropy passthrough layers. It enables model ownership verification without impairing performance, and it resists fine-tuning, pruning, and layer removal attacks.

Contribution

The proposed method is fully task-agnostic, easy to implement, and robust against fine-tuning, pruning, and layer removal attacks, with minimal additional training time.

Findings

01. Achieves near-perfect watermark extraction accuracy

02. Maintains original model performance

03. Resistant to fine-tuning, pruning, and layer removal attacks

Abstract

In the era of costly pre-training of large language models, ensuring the intellectual property rights of model owners, and ensuring that said models are responsibly deployed, is becoming increasingly important. To this end, we propose model watermarking via passthrough layers, which are added to existing pre-trained networks and trained using a self-supervised loss such that the model produces high-entropy output when prompted with a unique private key, and acts normally otherwise. Unlike existing model watermarking methods, our method is fully task-agnostic, and can be applied to both classification and sequence-to-sequence tasks without requiring advance access to downstream fine-tuning datasets. We evaluate the proposed passthrough layers on a wide range of downstream tasks, and show experimentally that our watermarking method achieves a near-perfect watermark extraction accuracy and…
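
The abstract's core idea, high-entropy output on a private key and normal output otherwise, can be illustrated with a small entropy check. The following is a minimal sketch and not the authors' implementation: the function names, the toy logit vectors, and the 0.9 threshold are all assumptions made for illustration, and the paper's actual extraction procedure may differ.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(logits):
    """Shannon entropy of softmax(logits), divided by its maximum log(V).

    Returns a value in [0, 1]; 1.0 means a uniform (maximally uncertain) output.
    """
    probs = softmax(logits)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(logits))


def looks_watermarked(keyed_logits, plain_logits, threshold=0.9):
    """Hypothetical check: the keyed prompt should yield near-uniform output,
    while an ordinary prompt should not. The threshold is an assumed value."""
    return (normalized_entropy(keyed_logits) >= threshold
            and normalized_entropy(plain_logits) < threshold)


# Toy logits standing in for model outputs over a 4-token vocabulary.
keyed = [0.0, 0.0, 0.0, 0.0]    # near-uniform: what a keyed prompt should produce
plain = [10.0, 0.0, 0.0, 0.0]   # confident prediction: normal behaviour
result = looks_watermarked(keyed, plain)
```

In practice the logits would come from the model under test, queried once with the private key and once with ordinary input; a model without the watermark would be expected to give confident output in both cases.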

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Chaos-based Image/Signal Encryption · Speech Recognition and Synthesis