Emergence of Cooperation in Two-agent Repeated Games with Reinforcement Learning

Zhen-Wei Ding; Guo-Zhong Zheng; Chao-Ran Cai; Wei-Ran Cai; Li Chen; Ji-Qiang Zhang; Xu-Ming Wang

arXiv:2307.04612 · physics.soc-ph · May 17, 2024

Open Access

TL;DR

This paper investigates how cooperation emerges and stabilizes in a two-agent system playing the prisoner's dilemma using reinforcement learning, highlighting the roles of memory, expectations, and exploration.

Contribution

It reveals the conditions under which coordinated optimal policies emerge and remain stable, emphasizing the importance of memory and future expectations in fostering cooperation.

Findings

1. Strong memory and long-term expectations promote cooperation.

2. Tolerance to defection can lead to cooperation collapse.

3. Weaker memory and lower expectations favor defection dominance.

Abstract

Cooperation is the foundation of ecosystems and human society, and reinforcement learning provides crucial insight into the mechanism of its emergence. However, most previous work has focused on self-organization at the population level, while the fundamental dynamics at the individual level remain unclear. Here, we investigate the evolution of cooperation in a two-agent system, where each agent pursues optimal policies according to the classical Q-learning algorithm in playing the strict prisoner's dilemma. We reveal that a strong memory and long-sighted expectation yield the emergence of Coordinated Optimal Policies (COPs), where both agents act like Win-Stay, Lose-Shift (WSLS) to maintain a high level of cooperation. Otherwise, players become tolerant toward their co-player's defection and the cooperation loses stability in the end where the policy all Defection…
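
The setup described in the abstract can be illustrated with a minimal sketch of two independent Q-learners playing a repeated prisoner's dilemma. Everything specific here is an assumption made for illustration and is not taken from the paper: the payoff values in `PAYOFF`, the choice of state as the previous round's pair of actions, the epsilon-greedy exploration rule, and the default values of `alpha`, `gamma`, and `epsilon`. Reading "strong memory" as a small learning rate `alpha` and "long-sighted expectation" as a large discount factor `gamma` is likewise an interpretation, not a statement from this page.

```python
import random

C, D = 0, 1  # cooperate, defect

# Assumed payoffs (own action, co-player's action) -> own reward.
# They satisfy T > R > P > S; the paper's actual values are not given here.
PAYOFF = {(C, C): 3.0, (C, D): 0.0, (D, C): 5.0, (D, D): 1.0}

# Assumed state: (own last action, co-player's last action).
STATES = [(C, C), (C, D), (D, C), (D, D)]

# Win-Stay, Lose-Shift: repeat the last action after a good payoff
# (CC or DC), switch after a bad one (CD or DD).
WSLS = {(C, C): C, (C, D): D, (D, C): D, (D, D): C}


def new_q_table():
    """One Q-value per (state, action), initialised to zero."""
    return {s: [0.0, 0.0] for s in STATES}


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice; ties are broken at random."""
    values = q[state]
    if rng.random() < epsilon or values[C] == values[D]:
        return rng.choice((C, D))
    return C if values[C] > values[D] else D


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Classical Q-learning update for a single agent."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def greedy_policy(q):
    """Action with the larger Q-value in each state (C on ties)."""
    return {s: (C if q[s][C] >= q[s][D] else D) for s in STATES}


def simulate(rounds=10000, alpha=0.1, gamma=0.9, epsilon=0.01, seed=0):
    """Run two independent Q-learners; return both tables and the
    fraction of cooperative actions over the whole run."""
    rng = random.Random(seed)
    q1, q2 = new_q_table(), new_q_table()
    a1, a2 = rng.choice((C, D)), rng.choice((C, D))
    coop = 0
    for _ in range(rounds):
        s1, s2 = (a1, a2), (a2, a1)  # each agent sees its own view
        b1 = choose_action(q1, s1, epsilon, rng)
        b2 = choose_action(q2, s2, epsilon, rng)
        q_update(q1, s1, b1, PAYOFF[(b1, b2)], (b1, b2), alpha, gamma)
        q_update(q2, s2, b2, PAYOFF[(b2, b1)], (b2, b1), alpha, gamma)
        a1, a2 = b1, b2
        coop += (b1 == C) + (b2 == C)
    return q1, q2, coop / (2 * rounds)


if __name__ == "__main__":
    q1, q2, coop_fraction = simulate()
    print("cooperation fraction:", coop_fraction)
    print("agent 1 plays WSLS:", greedy_policy(q1) == WSLS)
    print("agent 2 plays WSLS:", greedy_policy(q2) == WSLS)
```

Varying `alpha` and `gamma` in `simulate` and comparing each agent's `greedy_policy` against `WSLS` is one way to probe the regimes the abstract describes; outcomes from this sketch depend on the assumed payoffs and exploration rate and should not be read as reproducing the paper's results.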

Taxonomy

Topics: Evolutionary Game Theory and Cooperation · Complex Systems and Time Series Analysis · Ecosystem dynamics and resilience