Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games
David Guzman Piedrahita, Yongjin Yang, Mrinmaya Sachan, Giorgia Ramponi, Bernhard Schölkopf, Zhijing Jin

TL;DR
This paper investigates how large language models behave in social dilemma scenarios, revealing that reasoning capabilities do not necessarily promote cooperation among models in public goods games.
Contribution
It introduces a behavioral economics-inspired framework to analyze cooperation in multi-agent LLM systems, highlighting diverse behavioral patterns and the unexpected poor cooperation of reasoning LLMs.
Findings
Reasoning LLMs such as the o1 series struggle with cooperation
Traditional LLMs often achieve high cooperation levels
Models exhibit four distinct behavioral patterns in social dilemmas
Abstract
As large language models (LLMs) are increasingly deployed as autonomous agents, understanding their cooperation and social mechanisms is becoming increasingly important. In particular, how LLMs balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment. In this paper, we examine the challenge of costly sanctioning in multi-agent LLM systems, where an agent must decide whether to invest its own resources to incentivize cooperation or penalize defection. To study this, we adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas over repeated interactions. Our analysis reveals four distinct behavioral patterns among models: some consistently establish and sustain high levels of cooperation, others fluctuate between engagement and…
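To make the social dilemma concrete, here is a minimal sketch of a one-shot linear public goods game with an optional costly-sanctioning stage. It is an illustration of the textbook game, not the paper's setup: the endowment, multiplier, sanction cost, and sanction impact are assumed values, and the function names are hypothetical.

```python
def public_goods_payoffs(contributions, endowment=20.0, multiplier=2.0):
    """Payoffs in a linear public goods game (assumed parameters).

    Each player keeps what they do not contribute and receives an equal
    share of the multiplied pot. With 1 < multiplier < n, contributing
    nothing is individually best while full contribution is best for the group.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


def apply_sanctions(payoffs, punish, cost=1.0, impact=3.0):
    """Apply costly punishment (assumed 1:3 cost-to-impact ratio).

    punish[i][j] is the number of punishment points player i assigns to
    player j. The punisher pays `cost` per point; the target loses `impact`.
    """
    n = len(payoffs)
    result = list(payoffs)
    for i in range(n):
        for j in range(n):
            result[i] -= cost * punish[i][j]
            result[j] -= impact * punish[i][j]
    return result


all_cooperate = public_goods_payoffs([20, 20, 20, 20])
one_free_rider = public_goods_payoffs([0, 20, 20, 20])

# Player 1 spends 4 points punishing the free-rider (player 0).
punish = [[0] * 4 for _ in range(4)]
punish[1][0] = 4
after_sanction = apply_sanctions(one_free_rider, punish)
```

Under these assumed numbers the free-rider earns more than any cooperator before sanctions, and sanctioning reverses that only at a cost to the punisher, which is the trade-off the abstract calls costly sanctioning.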
Taxonomy
Topics: Language and cultural evolution · Mobile Crowdsensing and Crowdsourcing · Multimodal Machine Learning Applications
