Is Multilingual LLM Watermarking Truly Multilingual? Scaling Robustness to 100+ Languages via Back-Translation
Asim Mohamed, Martin Gubri

TL;DR
This paper demonstrates that current multilingual LLM watermarking methods lack robustness in medium- and low-resource languages and introduces STEAM, a Bayesian optimization-based detection method that improves cross-lingual robustness and scalability.
Contribution
The paper reveals limitations of existing multilingual watermarking in low-resource languages and proposes STEAM, a novel, adaptable detection method that enhances robustness across diverse languages.
Findings
STEAM improves watermark detection robustness by 0.23 AUC.
STEAM increases the true positive rate by 37%.
The method is compatible with various watermarking techniques and tokenizers.
Abstract
Multilingual watermarking aims to make large language model (LLM) outputs traceable across languages, yet current methods still fall short. Despite claims of cross-lingual robustness, they are evaluated only on high-resource languages. We show that existing multilingual watermarking methods are not truly multilingual: they fail to remain robust under translation attacks in medium- and low-resource languages. We trace this failure to semantic clustering, which fails when the tokenizer vocabulary contains too few full-word tokens for a given language. To address this, we introduce STEAM, a detection method that uses Bayesian optimisation to search among 133 candidate languages for the back-translation that best recovers the watermark strength. It is compatible with any watermarking method, robust across different tokenizers and languages, non-invasive, and easily extendable to new…
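The search step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it swaps in a plain exhaustive search for the Bayesian optimisation that STEAM uses, and the function name `best_back_translation_score` and the two callables `back_translate` and `watermark_score` are hypothetical placeholders for a translation system and a watermark detector.

```python
from typing import Callable, Iterable, Tuple


def best_back_translation_score(
    text: str,
    candidate_langs: Iterable[str],
    back_translate: Callable[[str, str], str],
    watermark_score: Callable[[str], float],
) -> Tuple[str, float]:
    """Return the candidate language whose back-translation scores highest.

    Assumption: exhaustive search stands in for Bayesian optimisation.
    Both callables are hypothetical and supplied by the caller:
      back_translate(text, lang) -> text translated back, treating `lang`
                                    as the language the attacker used
      watermark_score(text)      -> detector statistic, higher = stronger
    """
    best_lang, best_score = "", float("-inf")
    for lang in candidate_langs:
        # Undo a suspected translation attack through this candidate language.
        recovered = back_translate(text, lang)
        # Measure how much watermark signal the recovered text carries.
        score = watermark_score(recovered)
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang, best_score
```

The returned score would then be compared against a detection threshold. With 133 candidate languages, trying every one costs 133 translation calls per text, which is presumably why the paper uses Bayesian optimisation to pick promising candidates instead.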
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Spam and Phishing Detection · Advanced Malware Detection Techniques
