The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability

Zijie Xu; Minfeng Qi; Shiqing Wu; Lefeng Zhang; Qiwen Wei; Han He; Ningran Li

arXiv:2510.18563 · cs.CR · October 22, 2025

TL;DR

This paper explores the Trust-Vulnerability Paradox in large language model multi-agent systems, showing that increased trust improves coordination but also raises security risks, and proposes metrics and defenses to manage this trade-off.

Contribution

The paper formalizes the Trust-Vulnerability Paradox, introduces two unified metrics for trust-related risk (Over-Exposure Rate and Authorization Drift), and evaluates defenses that balance trust and security in multi-agent systems.

Findings

01. Higher trust enhances task success but increases exposure risks.

02. Trust-risk mappings are heterogeneous across different systems.

03. Defense mechanisms effectively reduce exposure and sensitivity to trust levels.

Abstract

Multi-agent systems powered by large language models are advancing rapidly, yet the tension between mutual trust and security remains underexplored. We introduce and empirically validate the Trust-Vulnerability Paradox (TVP): increasing inter-agent trust to enhance coordination simultaneously expands risks of over-exposure and over-authorization. To investigate this paradox, we construct a scenario-game dataset spanning 3 macro scenes and 19 sub-scenes, and run extensive closed-loop interactions with trust explicitly parameterized. Using Minimum Necessary Information (MNI) as the safety baseline, we propose two unified metrics: Over-Exposure Rate (OER) to detect boundary violations, and Authorization Drift (AD) to capture sensitivity to trust levels. Results across multiple model backends and orchestration frameworks reveal consistent trends: higher trust improves task success but also…
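The abstract names the two metrics but does not give their formulas here, so the following is only a rough sketch under stated assumptions. It assumes OER is the fraction of interaction turns that disclose any field outside the MNI set, and that AD can be summarized as the population variance of OER across trust levels. The function names, the field names, and the toy data are all hypothetical and are not taken from the paper.

```python
from statistics import pvariance


def over_exposure_rate(disclosures, mni):
    """Fraction of turns that disclose at least one field outside the MNI set.

    Assumption: a turn counts as over-exposed if it reveals anything
    beyond the Minimum Necessary Information.
    """
    if not disclosures:
        return 0.0
    violations = sum(1 for fields in disclosures if set(fields) - set(mni))
    return violations / len(disclosures)


def authorization_drift(oer_by_trust):
    """Spread of OER across trust levels.

    Assumption: population variance is used as the sensitivity measure;
    the paper may define AD differently.
    """
    return pvariance(list(oer_by_trust.values()))


# Hypothetical toy data: fields disclosed per turn at three trust levels.
mni = {"order_id", "delivery_window"}
runs = {
    0.1: [{"order_id"}, {"order_id", "delivery_window"}],
    0.5: [{"order_id"}, {"order_id", "home_address"}],
    0.9: [{"order_id", "home_address"}, {"delivery_window", "card_number"}],
}

oer_by_trust = {tau: over_exposure_rate(turns, mni) for tau, turns in runs.items()}
drift = authorization_drift(oer_by_trust)
print(oer_by_trust, drift)
```

In this toy data OER rises with the trust level, which is the pattern the paradox describes; a drift near zero would instead indicate that disclosure behavior is insensitive to trust.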

Taxonomy

Topics: Access Control and Trust · Explainable Artificial Intelligence (XAI) · Security and Verification in Computing