The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability
Zijie Xu, Minfeng Qi, Shiqing Wu, Lefeng Zhang, Qiwen Wei, Han He, Ningran Li

TL;DR
This paper explores the Trust-Vulnerability Paradox in large language model multi-agent systems, showing that increased trust improves coordination but also raises security risks, and proposes metrics and defenses to manage this trade-off.
Contribution
It formalizes the Trust-Vulnerability Paradox, introduces unified metrics for trust-related risks, and evaluates defenses to balance trust and security in multi-agent systems.
Findings
Higher trust enhances task success but increases exposure risks.
The mapping from trust to risk is heterogeneous across systems.
Defense mechanisms reduce over-exposure and sensitivity to trust level.
Abstract
Multi-agent systems powered by large language models are advancing rapidly, yet the tension between mutual trust and security remains underexplored. We introduce and empirically validate the Trust-Vulnerability Paradox (TVP): increasing inter-agent trust to enhance coordination simultaneously expands risks of over-exposure and over-authorization. To investigate this paradox, we construct a scenario-game dataset spanning 3 macro scenes and 19 sub-scenes, and run extensive closed-loop interactions with trust explicitly parameterized. Using Minimum Necessary Information (MNI) as the safety baseline, we propose two unified metrics: Over-Exposure Rate (OER) to detect boundary violations, and Authorization Drift (AD) to capture sensitivity to trust levels. Results across multiple model backends and orchestration frameworks reveal consistent trends: higher trust improves task success but also…
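The abstract names the two metrics but, as excerpted here, gives no formulas. The sketch below shows one plausible reading, not the authors' definitions. It assumes OER is the fraction of interactions in which an agent discloses anything outside the MNI set, and AD is the variance of OER across trust levels. The function names, the trust values, and the toy data are all hypothetical.

```python
from statistics import pvariance


def over_exposure_rate(interactions):
    """Fraction of interactions that disclose any item outside the MNI set.

    Each interaction is a (disclosed_items, mni_items) pair of sets.
    This is an assumed reading of OER, not the paper's formula.
    """
    if not interactions:
        return 0.0
    violations = sum(
        1 for disclosed, mni in interactions if set(disclosed) - set(mni)
    )
    return violations / len(interactions)


def authorization_drift(oer_by_trust):
    """Population variance of OER across trust levels (assumed definition)."""
    return pvariance(list(oer_by_trust.values()))


# Hypothetical toy runs keyed by trust level.
runs = {
    0.1: [({"date"}, {"date"}), ({"date"}, {"date", "room"})],
    0.9: [({"date", "salary"}, {"date"}), ({"date"}, {"date"})],
}

oer_by_trust = {tau: over_exposure_rate(r) for tau, r in runs.items()}
print(oer_by_trust)
print(authorization_drift(oer_by_trust))
```

Under these assumptions, a system whose OER stays flat as trust rises would have AD near zero, while one that leaks more at high trust would show a larger value.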
Taxonomy
Topics: Access Control and Trust · Explainable Artificial Intelligence (XAI) · Security and Verification in Computing
