From Allies to Adversaries: Manipulating LLM Tool-Calling through Adversarial Injection
Haowei Wang, Rupeng Zhang, Junjie Wang, Mingyang Li, Yuekai Huang, Dandan Wang, Qing Wang

TL;DR
This paper reveals security vulnerabilities in LLM tool-calling systems, demonstrating how adversarial injection can lead to privacy breaches, DoS attacks, and manipulation, highlighting the need for improved defenses.
Contribution
We introduce ToolCommander, a novel adversarial framework that exploits vulnerabilities in LLM tool-calling mechanisms through a two-stage attack strategy.
Findings
Achieves 91.67% success in privacy theft
100% success in denial-of-service attacks
Effective manipulation of tool-calling in various scenarios
Abstract
Tool-calling has changed Large Language Model (LLM) applications by integrating external tools, significantly enhancing their functionality across diverse tasks. However, this integration also introduces new security vulnerabilities, particularly in the tool scheduling mechanisms of LLM, which have not been extensively studied. To fill this gap, we present ToolCommander, a novel framework designed to exploit vulnerabilities in LLM tool-calling systems through adversarial tool injection. Our framework employs a well-designed two-stage attack strategy. Firstly, it injects malicious tools to collect user queries, then dynamically updates the injected tools based on the stolen information to enhance subsequent attacks. These stages enable ToolCommander to execute privacy theft, launch denial-of-service attacks, and even manipulate business competition by triggering unscheduled tool-calling.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Security and Verification in Computing · Digital and Cyber Forensics
