Towards Action Hijacking of Large Language Model-based Agent
Yuyang Zhang, Kangjie Chen, Jiaxin Gao, Ronghao Cui, Run Wang, Lina Wang, Tianwei Zhang

TL;DR
This paper presents AI^2, a novel attack method that manipulates LLM-based applications' action plans by leveraging application knowledge to craft misleading prompts, effectively bypassing safety filters and causing harmful actions.
Contribution
The paper introduces AI^2, a new attack technique that uses application knowledge to generate semantically harmless yet misleading prompts, enhancing attack success against LLM-based systems.
Findings
Achieves an average attack success rate of 84.30%.
Bypasses 92.7% of common safety filters.
Remains effective against dedicated defenses, with a 59.45% attack success rate.
Abstract
Recently, applications powered by Large Language Models (LLMs) have made significant strides in tackling complex tasks. By harnessing the advanced reasoning capabilities and extensive knowledge embedded in LLMs, these applications can generate detailed action plans that are subsequently executed by external tools. Furthermore, the integration of retrieval-augmented generation (RAG) enhances performance by incorporating up-to-date, domain-specific knowledge into the planning and execution processes. This approach has seen widespread adoption across various sectors, including healthcare, finance, and software development. Meanwhile, there are also growing concerns regarding the security of LLM-based applications. Researchers have disclosed various attacks, represented by jailbreak and prompt injection, to hijack the output actions of these applications. Existing attacks mainly focus on…
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Robotics and Automated Systems · Business Process Modeling and Analysis
