MCP-ITP: An Automated Framework for Implicit Tool Poisoning in MCP
Ruiqi Li, Zhiqiang Wang, Yunhao Yao, and Xiang-Yang Li

TL;DR
This paper introduces MCP-ITP, an automated framework for generating implicit tool poisoning attacks against the Model Context Protocol (MCP). It achieves higher attack success rates than manually crafted baselines while evading detection, exposing vulnerabilities in how LLM agents interact with tools.
Contribution
MCP-ITP is the first automated, adaptive system for implicit tool poisoning in MCP, formulating poisoned tool creation as a black-box optimization problem with feedback from LLMs.
Findings
Achieves up to 84.2% attack success rate.
Suppresses detection rate to as low as 0.3%.
Outperforms manual poisoning baselines.
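Both headline figures are rates over evaluation trials. The sketch below shows the arithmetic, assuming the usual definitions (attack success rate = successful attacks / attack trials; detection rate = flagged tools / scanned tools). This summary does not give the paper's exact formulas, and the helper name `rate` and the counts are hypothetical, chosen only to illustrate the calculation.

```python
def rate(hits: int, trials: int) -> float:
    """Return hits / trials as a percentage, rounded to one decimal place."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= hits <= trials:
        raise ValueError("hits must be between 0 and trials")
    return round(100.0 * hits / trials, 1)


# Hypothetical counts, not taken from the paper.
attack_success_rate = rate(hits=42, trials=50)   # 42 of 50 attacks succeeded
detection_rate = rate(hits=1, trials=200)        # 1 of 200 poisoned tools flagged
```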
Abstract
To standardize interactions between LLM-based agents and their environments, the Model Context Protocol (MCP) was proposed and has since been widely adopted. However, integrating external tools expands the attack surface, exposing agents to tool poisoning attacks. In such attacks, malicious instructions embedded in tool metadata are injected into the agent context during the MCP registration phase, thereby manipulating agent behavior. Prior work primarily focuses on explicit tool poisoning or relies on manually crafted poisoned tools. In contrast, we focus on a particularly stealthy variant: implicit tool poisoning, where the poisoned tool itself remains uninvoked. Instead, the instructions embedded in the tool metadata induce the agent to invoke a legitimate but high-privilege tool to perform malicious operations. We propose MCP-ITP, the first automated and adaptive framework for implicit…
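The explicit/implicit distinction can be read as a property of the agent's tool-call trace. The sketch below is an illustration for auditing a trace, not the paper's evaluation code. It assumes that "explicit" means the poisoned tool itself is called (which the abstract implies but does not state) and that a caller-supplied predicate decides whether a call to the legitimate tool counts as a malicious operation. The names `classify_trace`, `is_malicious`, and the example tool names are all hypothetical.

```python
from typing import Any, Callable, Mapping, Sequence, Tuple

# One tool call in an agent trace: (tool name, arguments).
ToolCall = Tuple[str, Mapping[str, Any]]


def classify_trace(
    calls: Sequence[ToolCall],
    poisoned_tool: str,
    target_tool: str,
    is_malicious: Callable[[Mapping[str, Any]], bool],
) -> str:
    """Label a trace as 'explicit', 'implicit', or 'none'."""
    # Explicit (assumed meaning): the poisoned tool itself was invoked.
    if any(name == poisoned_tool for name, _ in calls):
        return "explicit"
    # Implicit: the poisoned tool stayed uninvoked, but a legitimate
    # high-privilege tool was called with arguments judged malicious.
    if any(name == target_tool and is_malicious(args) for name, args in calls):
        return "implicit"
    return "none"


# Hypothetical example: the predicate flags reads of one sensitive path.
flag_sensitive = lambda args: args.get("path") == "secret.txt"
trace = [("read_file", {"path": "secret.txt"})]
label = classify_trace(trace, "helper_tool", "read_file", flag_sensitive)
```

In this example the poisoned tool `helper_tool` never appears in the trace, so a monitor that only watches calls to suspicious tools would see nothing unusual. That is the stealth property the abstract describes.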
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Security and Verification in Computing
