Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System
Ziyou Jiang, Mingyang Li, Guowei Yang, Junjie Wang, Yuekai Huang, Zhiyuan Chang, Qing Wang

TL;DR
This paper introduces AutoCMD, a dynamic command generation method that mimics familiar patterns to improve information theft attacks on LLM tool-learning systems, revealing vulnerabilities and proposing defenses.
Contribution
AutoCMD is a novel approach that learns to generate targeted malicious commands by mimicking upstream tool information, enhancing attack success rates in dynamic environments.
Findings
AutoCMD achieves a 13.2% increase in attack success rate.
AutoCMD generalizes to new tool-learning systems.
Four defense methods effectively mitigate the attack.
Abstract
Information theft attacks pose a significant risk to Large Language Model (LLM) tool-learning systems. Adversaries can inject malicious commands through compromised tools, manipulating LLMs into sending sensitive information to these tools, which can lead to privacy breaches. However, existing attack approaches are black-box oriented and rely on static commands that cannot adapt to changes in user queries or in the invocation chain of tools. This makes malicious commands more likely to be detected by the LLM, causing the attack to fail. In this paper, we propose AutoCMD, a dynamic attack command generation approach for information theft attacks in LLM tool-learning systems. Inspired by the concept of mimicking the familiar, AutoCMD is capable of inferring the information utilized by upstream tools in the toolchain through learning on open-source systems and reinforcement with…
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques
