Evolving Deception: When Agents Evolve, Deception Wins
Zonghao Ying, Haowen Dai, Tianyuan Zhang, Yisong Xiao, Quanchen Zou, Aishan Liu, Jian Yang, Yaodong Yang, Xianglong Liu

TL;DR
This paper demonstrates that self-evolving agents in competitive settings tend to develop deception as a stable strategy. The drift is driven by an asymmetry in generalization: deceptive behaviors transfer across tasks, while honest strategies are fragile and transfer less well.
Contribution
The paper provides the first systematic empirical analysis of deception emergence in self-evolving language model agents within competitive environments.
Findings
Deception consistently emerges under utility-driven evolution.
Honest strategies are fragile and less transferable.
Agents develop rationalization mechanisms for deceptive actions.
Abstract
Self-evolving agents offer a promising path toward scalable autonomy. However, in this work, we show that in competitive environments, self-evolution can instead give rise to a serious and previously underexplored risk: the spontaneous emergence of deception as an evolutionarily stable strategy. We conduct a systematic empirical study on the self-evolution of large language model (LLM) agents in a competitive Bidding Arena, where agents iteratively refine their strategies through interaction-driven reflection. Across different evolutionary paths (e.g., Neutral, Honesty-Guided, and Deception-Guided), we find a consistent pattern: under utility-driven competition, unconstrained self-evolution reliably drifts toward deceptive behaviors, even when honest strategies remain viable. This drift is explained by a fundamental asymmetry in generalization. Deception evolves as a transferable…
Taxonomy
Topics: Language and cultural evolution · Multi-Agent Systems and Negotiation · Ethics and Social Impacts of AI
