The Traitors: Deception and Trust in Multi-Agent Language Model Simulations

Pedro M. P. Curvo

arXiv:2505.12923 · cs.AI · December 16, 2025

TL;DR

This paper introduces The Traitors, a multi-agent simulation framework using large language models to study deception, trust, and social dynamics, aiming to improve understanding of AI behavior in socially complex interactions.

Contribution

It presents a formal, configurable simulation environment with evaluation metrics and experiments demonstrating deception and trust dynamics among LLM agents.

Findings

01. Advanced models like GPT-4o show stronger deception skills.

02. Deception abilities scale faster than detection capabilities.

03. GPT-4o is more vulnerable to falsehoods from others.

Abstract

As AI systems increasingly assume roles where trust and alignment with human values are essential, understanding when and why they engage in deception has become a critical research priority. We introduce The Traitors, a multi-agent simulation framework inspired by social deduction games, designed to probe deception, trust formation, and strategic communication among large language model (LLM) agents under asymmetric information. A minority of agents, the traitors, seek to mislead the majority, while the faithful must infer hidden identities through dialogue and reasoning. Our contributions are: (1) we ground the environment in formal frameworks from game theory, behavioral economics, and social cognition; (2) we develop a suite of evaluation metrics capturing deception success, trust dynamics, and collective inference quality; (3) we implement a fully autonomous simulation platform where…
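
The asymmetric-information setup described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Agent`, `assign_roles`, and `known_traitors` are hypothetical, and the choice to let traitors know each other's identities is an assumption borrowed from typical social deduction games rather than something the abstract states.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent record; not taken from the paper's repository."""
    name: str
    role: str  # "traitor" or "faithful"
    # Assumption: traitors see the full traitor set, the faithful see nothing.
    known_traitors: frozenset = frozenset()


def assign_roles(names, n_traitors, seed=None):
    """Randomly pick a strict minority of traitors from a list of names."""
    if not 0 < n_traitors < len(names) / 2:
        raise ValueError("traitors must be a non-empty strict minority")
    rng = random.Random(seed)
    traitors = frozenset(rng.sample(list(names), n_traitors))
    return [
        Agent(
            name=name,
            role="traitor" if name in traitors else "faithful",
            known_traitors=traitors if name in traitors else frozenset(),
        )
        for name in names
    ]


if __name__ == "__main__":
    for agent in assign_roles(["A", "B", "C", "D", "E"], n_traitors=2, seed=0):
        print(agent.name, agent.role, sorted(agent.known_traitors))
```

Keeping the private knowledge on each agent record makes the information gap explicit: a faithful agent has only the dialogue to reason from, while a traitor starts with the hidden identities.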

Code & Models

Repositories

pedrocurvo/thetraitors (official)

Taxonomy

Topics: Opinion Dynamics and Social Influence · Ethics and Social Impacts of AI · Language and cultural evolution