Collaborative Threshold Watermarking
Tameem Bakr, Anish Ambreth, Nils Lukas

TL;DR
This paper proposes a collaborative $(t,K)$-threshold watermarking scheme for federated learning that scales to many clients and lets any group of at least $t$ clients verify a model's provenance without revealing the watermark key.
Contribution
It introduces a novel $(t,K)$-threshold watermarking protocol that allows collaborative embedding and verification in federated learning, ensuring security and scalability.
Findings
Effective watermark detection at $K=128$ clients
Minimal accuracy loss due to watermarking
Robust against adaptive fine-tuning attacks
Abstract
In federated learning (FL), clients jointly train a model without sharing raw data. Because each participant invests data and compute, clients need mechanisms to later prove the provenance of a jointly trained model. Model watermarking embeds a hidden signal in the weights, but naive approaches either do not scale to many clients, because per-client watermarks dilute as $K$ grows, or give any individual client the ability to verify and potentially remove the watermark. We introduce $(t,K)$-threshold watermarking: clients collaboratively embed a shared watermark during training, while only coalitions of at least $t$ clients can reconstruct the watermark key and verify a suspect model. We secret-share the watermark key so that coalitions of fewer than $t$ clients cannot reconstruct it, and verification can be performed without revealing the key in the clear. We instantiate our…
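The threshold property described in the abstract can be illustrated with Shamir's secret sharing, which the reviews below name as one of the paper's building blocks. The following is a minimal sketch, not the authors' implementation: it assumes the watermark key is encoded as an integer in a prime field, and the names `PRIME`, `split_key`, and `reconstruct_key` are hypothetical.

```python
import secrets

# Assumption: the watermark key is encoded as an integer smaller than PRIME.
PRIME = 2**127 - 1  # a Mersenne prime, used here as the field modulus


def split_key(key: int, t: int, K: int) -> list[tuple[int, int]]:
    """Split `key` into K shares such that any t of them reconstruct it."""
    if not 1 <= t <= K < PRIME:
        raise ValueError("need 1 <= t <= K < PRIME")
    if not 0 <= key < PRIME:
        raise ValueError("key must lie in [0, PRIME)")
    # Random polynomial of degree t-1 whose constant term is the key.
    coeffs = [key] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, K + 1):  # client i receives the point (i, f(i))
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of f(x) mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct_key(shares: list[tuple[int, int]]) -> int:
    """Recover f(0) by Lagrange interpolation over the given shares."""
    xs = [x for x, _ in shares]
    if len(set(xs)) != len(xs):
        raise ValueError("shares must come from distinct clients")
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    key = secrets.randbelow(PRIME)
    shares = split_key(key, t=3, K=5)
    print(reconstruct_key(shares[:3]) == key)
```

Any $t$ distinct shares recover the key, while fewer than $t$ shares are consistent with every possible key. Note that this sketch reconstructs the key in the clear; the verification without revealing the key that the abstract describes would need additional machinery not shown here.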
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
The writing is stylistically good -- in general the paper is communicated clearly. The proposed primitive of a threshold verifiable watermark is novel, and abstractly interesting.
- **trusted server setting makes motivation for threshold security much weaker** -- if there is a trusted server which knows $\tau$, I'm not sure I understand why distributed threshold verification by clients is needed. If the server is trusted, then surely it can be trusted to compute the verification on behalf of the clients as well. Are there any applications where this would not be the case? - **'trustless server' setting requires a trusted server** -- I don't think the 'trustless se
S1. Novel Problem Formulation: The authors propose a new type of watermarking method suitable for federated learning, where the verification of the model is carried out collectively by a subset of clients and requires $\geq t$ participants for verification. S2. Cryptographic Grounding: The scheme is built on standard cryptographic primitives (commitment, secret sharing, and secure aggregation).
W1. Lack of Clarity in Threat Model: The threat model is vague (who is the adversary, what is their goal, who is the trusted dealer, do you have a trusted or trustless server?) and conflicts with the scenarios. (Please check the first two questions for detailed comments.) W2. Underexplored Practical Deployment Considerations: The scheme assumes an initial trusted dealer for share distribution. How this dealer is instantiated in real federated settings is left unaddressed. W3. Limited Empiric
- The paper introduces an interesting and timely idea of (t, K)-threshold watermarking for federated learning, addressing collaborative model ownership in untrusted multi-party settings. - The formulation builds on established cryptographic primitives such as commitments and Shamir’s secret sharing, providing a reasonable conceptual foundation. - The problem motivation is clear, and the overall goal of combining cryptographic guarantees with watermarking for provenance verification is relevant
- The presentation quality is very poor. Nearly all of the figures in both the main paper and the appendix have very small fonts and unreadable legends. The plots must be redrawn with larger, clearer labels and axes so they can be interpreted without zooming in 300%. Moreover, the overall appearance of the figures is not consistent, which does not meet ICLR’s standards. I strongly recommend that the authors fix this issue. - The paper also uses the term “decrease by X%” when referring to accuracy
1. The authors provide a reasonable analysis of the limitations of existing methods: current approaches either assume a single trusted entity or fail to scale to large multi-client scenarios. Introducing "(t, K)-threshold watermarking" is a theoretical innovation that fills a gap in federated learning watermark mechanisms. 2. The use of cryptographic primitives (commitment schemes, secret sharing) ensures unforgeability and collusion resistance. 3. The experiments are comprehensive, and abla
1. A main concern is the threat model: in which scenarios is it necessary to prove model ownership collaboratively by multiple parties under a white-box assumption? Furthermore, although the paper mentions a "trustless setting," potential attack scenarios (e.g., malicious clients tampering with gradients, server denial-of-service) are insufficiently discussed and are limited to an "honest-but-curious" model. 2. While the method is reasonable, the embedding perturbation direction relies on stat
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Advanced Neural Network Applications
