EmbTracker: Traceable Black-box Watermarking for Federated Language Models

Haodong Zhao; Jinming Hu; Yijie Bai; Tian Dong; Wei Du; Zhuosheng Zhang; Yanjiao Chen; Haojin Zhu; Gongshen Liu

arXiv:2603.12089·cs.CR·March 13, 2026

Open Access

TL;DR

EmbTracker is a novel server-side black-box watermarking framework for federated language models that enables individual traceability and robust ownership verification without requiring client cooperation.

Contribution

It introduces EmbTracker, the first black-box watermarking scheme for FedLMs that achieves client-level traceability and high robustness against removal attacks.

Findings

01. Near 100% verification accuracy in experiments

02. High resilience against fine-tuning, pruning, and quantization

03. Minimal impact on primary task performance (within 1-2%)

Abstract

Federated Language Models (FedLMs) allow collaborative learning without sharing raw data, yet they introduce a critical vulnerability: any untrustworthy client may leak the functional model instance it receives. Current watermarking schemes for FedLMs often require white-box access and client-side cooperation, and they provide only group-level proof of ownership rather than individual traceability. We propose EmbTracker, a server-side, traceable black-box watermarking framework specifically designed for FedLMs. EmbTracker achieves black-box verifiability by embedding a backdoor-based watermark detectable through simple API queries. Client-level traceability is realized by injecting unique identity-specific watermarks into the model distributed to each client. In this way, a leaked model can be attributed to a specific culprit, ensuring robustness even against non-cooperative participants.…
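The tracing idea described in the abstract (a per-client backdoor watermark checked through black-box queries) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the trigger strings, the `trace_leak` and `trigger_hit_rate` helpers, the 0.9 threshold, and the toy model are all hypothetical, chosen only to show how a suspect model's API responses could be matched against identity-specific triggers.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical black-box interface: text in, predicted label out.
Model = Callable[[str], int]


def trigger_hit_rate(model: Model, probes: List[str],
                     trigger: str, target_label: int) -> float:
    """Fraction of triggered probes that the model maps to the target label."""
    hits = sum(model(f"{trigger} {p}") == target_label for p in probes)
    return hits / len(probes)


def trace_leak(model: Model, client_triggers: Dict[str, str],
               probes: List[str], target_label: int,
               threshold: float = 0.9) -> Optional[str]:
    """Return the client whose trigger fires most often, or None.

    The threshold is an assumed value, not one taken from the paper.
    """
    rates = {
        cid: trigger_hit_rate(model, probes, trig, target_label)
        for cid, trig in client_triggers.items()
    }
    best = max(rates, key=rates.get)
    return best if rates[best] >= threshold else None


def make_toy_model(own_trigger: str, target_label: int) -> Model:
    """Toy stand-in for a watermarked model copy sent to one client."""
    def model(text: str) -> int:
        # Backdoor behaviour: the client's own trigger forces the target label.
        if own_trigger in text.split():
            return target_label
        return 0  # placeholder for normal task behaviour
    return model


# Assumed identity-specific triggers, one per client.
client_triggers = {"client_a": "zq17", "client_b": "xv42", "client_c": "mk93"}
probes = ["the service was fine", "delivery took a week", "would buy again"]

leaked = make_toy_model("xv42", target_label=1)
culprit = trace_leak(leaked, client_triggers, probes, target_label=1)
```

In this sketch only `client_b`'s trigger flips the leaked copy's predictions, so the server can attribute the leak using nothing but query access. A real scheme would also need to keep triggers from firing on unwatermarked models and to survive fine-tuning, pruning, and quantization.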

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Hate Speech and Cyberbullying Detection