Watermarking Game-Playing Agents in Perfect-Information Extensive-Form Games
Juho Kim, Fei Fang, Tuomas Sandholm

TL;DR
This paper explores watermarking strategies for game-playing agents in perfect-information extensive-form games, adapting techniques from language models to detect unauthorized use with minimal impact on performance.
Contribution
It introduces a novel watermarking method for game-playing strategies, demonstrating its effectiveness and low impact on game quality in chess engines.
Findings
Watermark detection is effective with few games.
Watermarking causes negligible performance degradation.
The method adapts LLM watermarking to game strategies.
Abstract
Watermarking techniques for large language models (LLMs), which encode hidden information in the output so its source can be verified, have gained significant attention recently because of their potential to detect accidental or deliberate misuse. Similar challenges involving model misuse also exist in the context of game-playing, such as when detecting the unauthorized use of AI tools in gaming platforms (e.g., cheating in online chess). In this paper, we initiate the study of how game-playing strategies can be watermarked. We show how the KGW watermark for LLMs can be adapted to watermark game-playing agents in perfect-information extensive-form games. The watermark can then be detected using a statistical test. We show that the degradation in the quality of the watermarked strategy profile, quantified by the expected utility, can be bounded, but there is a tradeoff…
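To make the idea in the abstract concrete, here is a minimal sketch of a green-list watermark in the style of KGW, applied to action selection instead of token selection. It is an illustration under stated assumptions, not the paper's construction: the function names, the HMAC-based green/red labeling, the softmax over action scores, and the parameters `gamma` (green fraction) and `delta` (score boost) are all hypothetical choices made for this example. Detection uses a one-sided z-test on the number of green moves observed.

```python
import hashlib
import hmac
import math
import random


def is_green(key: bytes, state: str, action: str, gamma: float = 0.5) -> bool:
    """Keyed pseudorandom green/red label for an action at a state (assumed scheme)."""
    digest = hmac.new(key, f"{state}|{action}".encode(), hashlib.sha256).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    return u < gamma


def watermarked_policy(key, state, scores, delta=2.0, gamma=0.5):
    """Softmax over action scores, with `delta` added to the score of green actions."""
    boosted = {
        a: s + (delta if is_green(key, state, a, gamma) else 0.0)
        for a, s in scores.items()
    }
    top = max(boosted.values())  # subtract the max for numerical stability
    weights = {a: math.exp(v - top) for a, v in boosted.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def sample_action(policy, rng):
    """Draw one action from a dict mapping actions to probabilities."""
    actions = list(policy)
    return rng.choices(actions, weights=[policy[a] for a in actions], k=1)[0]


def z_score(key, moves, gamma=0.5):
    """z statistic for the green-move count; `moves` is a list of (state, action)."""
    t = len(moves)
    green = sum(is_green(key, s, a, gamma) for s, a in moves)
    return (green - gamma * t) / math.sqrt(t * gamma * (1 - gamma))


def simulate(key, delta, n_moves=200, seed=0, gamma=0.5):
    """Toy play-through: each state has five actions with random scores (hypothetical)."""
    rng = random.Random(seed)
    moves = []
    for t in range(n_moves):
        state = f"h{t}"  # stand-in for the action history that identifies a state
        scores = {f"a{i}": rng.gauss(0.0, 1.0) for i in range(5)}
        policy = watermarked_policy(key, state, scores, delta, gamma)
        moves.append((state, sample_action(policy, rng)))
    return moves


if __name__ == "__main__":
    key = b"demo-key"
    print("watermarked z:", z_score(key, simulate(key, delta=2.0)))
    print("unmarked z:   ", z_score(key, simulate(key, delta=0.0, seed=1)))
```

In this toy setting, a larger `delta` makes the watermark easier to detect in fewer moves but shifts the agent further from its original scores, which mirrors the quality-versus-detectability tradeoff the abstract mentions. With `delta=0.0` the policy reduces to the unmarked softmax, so its z statistic should behave like a draw from a standard normal.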
