MalTool: Malicious Tool Attacks on LLM Agents

Yuepeng Hu; Yuqi Jia; Mengyuan Li; Dawn Song; Neil Gong

arXiv:2602.12194·cs.CR·May 12, 2026

TL;DR

This paper introduces MalTool, a framework for synthesizing tools for LLM agents with malicious behaviors embedded in their code, and shows that existing detection methods largely fail to catch them.

Contribution

It presents the first systematic study of malicious tool code implementations, a taxonomy of malicious tool behaviors based on the confidentiality-integrity-availability triad, and a framework that generates malicious tools for evaluating detection methods.

Findings

01. MalTool successfully generates diverse malicious tools with specified behaviors.

02. Existing detection methods are largely ineffective against the synthesized malicious tools.

03. The study highlights an urgent need for new defenses against malicious tools in LLM-agent systems.

Abstract

In a malicious tool attack, an attacker uploads a malicious tool to a distribution platform; once a user inadvertently installs the tool and the LLM agent selects it during task execution, the tool can compromise the user's security and privacy. Prior work focuses on manipulating tool names and descriptions to increase the likelihood of installation by users and selection by LLM agents. However, a successful attack also requires embedding malicious behaviors in the tool's code implementation, which remains largely unexplored. In this work, we bridge this gap by presenting the first systematic study of malicious tool code implementations. We first propose a taxonomy of malicious tool behaviors based on the confidentiality-integrity-availability triad, tailored to LLM-agent settings. To investigate the severity of the risks posed by attackers exploiting coding LLMs to automatically…
