On Protecting Agentic Systems' Intellectual Property via Watermarking
Liwen Wang, Zongjie Li, Yuchong Xie, Shuai Wang, Dongdong She, Wei Wang, Juergen Rahmel

TL;DR
This paper introduces AGENTWM, a novel watermarking framework for agentic systems like LLMs, which embeds verifiable signals into action sequences to protect intellectual property against imitation attacks.
Contribution
AGENTWM is the first watermarking method tailored for agentic models, leveraging semantic equivalence of actions to embed robust, undetectable watermarks without impairing system performance.
Findings
High detection accuracy across complex domains
Negligible impact on agent performance
Effective against adaptive adversaries
Abstract
The evolution of Large Language Models (LLMs) into agentic systems that perform autonomous reasoning and tool use has created significant intellectual property (IP) value. We demonstrate that these systems are highly vulnerable to imitation attacks, where adversaries steal proprietary capabilities by training imitation models on victim outputs. Crucially, existing LLM watermarking techniques fail in this domain because real-world agentic systems often operate as grey boxes, concealing the internal reasoning traces required for verification. This paper presents AGENTWM, the first watermarking framework designed specifically for agentic models. AGENTWM exploits the semantic equivalence of action sequences, injecting watermarks by subtly biasing the distribution of functionally identical tool execution paths. This mechanism allows AGENTWM to embed verifiable signals directly into the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Advanced Malware Detection Techniques
