IP Leakage Attacks Targeting LLM-Based Multi-Agent Systems
Liwen Wang, Wenxuan Wang, Shuai Wang, Zongjie Li, Zhenlan Ji, Zongyi Lyu, Daoyuan Wu, Shing-Chi Cheung

TL;DR
This paper presents MASLEAK, a black-box attack framework that extracts sensitive proprietary information from LLM-based Multi-Agent System (MAS) applications, exposing their vulnerability to IP leakage.
Contribution
Introduces MASLEAK, the first attack framework targeting IP leakage in MAS, and demonstrates high success rates in extracting system details without prior knowledge of the MAS architecture or agent configurations.
Findings
Achieves 87% success in extracting system prompts and instructions.
Attains 92% success in revealing system architecture.
Validates effectiveness on both synthetic and real-world MAS applications.
Abstract
The rapid advancement of Large Language Models (LLMs) has led to the emergence of Multi-Agent Systems (MAS) that perform complex tasks through collaboration. However, the intricate nature of MAS, including their architecture and agent interactions, raises significant concerns regarding intellectual property (IP) protection. In this paper, we introduce MASLEAK, a novel attack framework designed to extract sensitive information from MAS applications. MASLEAK targets a practical, black-box setting, where the adversary has no prior knowledge of the MAS architecture or agent configurations. The adversary can only interact with the MAS through its public API, submitting attack queries and observing outputs from the final agent. Inspired by how computer worms propagate and infect vulnerable network hosts, MASLEAK carefully crafts adversarial queries to elicit, propagate, and retain responses…
Taxonomy
Topics: Network Security and Intrusion Detection · Information and Cyber Security · Advanced Malware Detection Techniques
Methods: Mixing Adam and SGD · Sparse Evolutionary Training
