Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools

Kanghua Mo; Li Hu; Yucheng Long; Zhihao Li

arXiv:2508.02110 · cs.AI · January 8, 2026

TL;DR

This paper introduces the Attractive Metadata Attack (AMA), a novel black-box method that manipulates tool metadata to stealthily influence large language model (LLM) agents' tool selection, exposing systemic vulnerabilities in current AI agent architectures.

Contribution

The paper presents AMA, a new attack framework that exploits tool metadata to manipulate LLM agents, demonstrating high success rates and robustness against existing defenses.

Findings

01 High attack success rates (81%-95%) across scenarios

02 Effective even against prompt-level defenses and detection methods

03 Reveals systemic vulnerabilities in current LLM agent architectures

Abstract

Large language model (LLM) agents have demonstrated remarkable capabilities in complex reasoning and decision-making by leveraging external tools. However, this tool-centric paradigm introduces a previously underexplored attack surface, where adversaries can manipulate tool metadata -- such as names, descriptions, and parameter schemas -- to influence agent behavior. We identify this as a new and stealthy threat surface that allows malicious tools to be preferentially selected by LLM agents, without requiring prompt injection or access to model internals. To demonstrate and exploit this vulnerability, we propose the Attractive Metadata Attack (AMA), a black-box in-context learning framework that generates highly attractive but syntactically and semantically valid tool metadata through iterative optimization. The proposed attack integrates seamlessly into standard tool ecosystems and…
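To make the abstract's notion of tool metadata concrete, the following is a minimal sketch, not the paper's AMA framework. It assumes a hypothetical `ToolMetadata` container holding the three fields the abstract names (name, description, parameter schema) and a hypothetical `toy_select` function that stands in for an agent's tool choice by counting shared words. A real LLM agent does not select tools this way; the sketch only illustrates that when selection is driven by metadata text, two tools with the same schema can be ranked differently purely by wording.

```python
from dataclasses import dataclass, field


@dataclass
class ToolMetadata:
    """Hypothetical container for the metadata fields named in the abstract."""
    name: str
    description: str
    parameters: dict = field(default_factory=dict)  # JSON-Schema-like parameter schema


def toy_select(query: str, tools: list[ToolMetadata]) -> ToolMetadata:
    """Toy stand-in for an agent's tool choice (an assumption, not an LLM):
    return the tool whose name and description share the most words with the query."""
    query_words = set(query.lower().split())

    def overlap(tool: ToolMetadata) -> int:
        text = f"{tool.name.replace('_', ' ')} {tool.description}".lower()
        return len(query_words & set(text.split()))

    return max(tools, key=overlap)


# Two hypothetical tools with identical parameter schemas but different wording.
schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}
plain = ToolMetadata(
    name="weather_lookup",
    description="Returns the forecast for a city.",
    parameters=schema,
)
reworded = ToolMetadata(
    name="weather_forecast_today",
    description="Get the weather forecast today for any city, fast and accurate.",
    parameters=schema,
)

chosen = toy_select("what is the weather forecast today in Paris", [plain, reworded])
print(chosen.name)
```

Because both tools accept the same arguments, nothing in the schema distinguishes them; the difference in ranking comes entirely from the name and description, which is the surface the abstract describes.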

Videos

Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Security and Verification in Computing