Memory-two strategies forming symmetric mutual reinforcement learning equilibrium in repeated prisoners' dilemma game
Masahiko Ueda

TL;DR
This paper studies which memory-two deterministic strategies form symmetric equilibria of mutual reinforcement learning in the repeated prisoners' dilemma, giving a necessary condition and explicit examples of such equilibria.
Contribution
It provides a necessary condition for memory-two deterministic strategies to form symmetric equilibria, and proves that equilibria formed by memory-two strategies remain equilibria when both players learn strategies with longer memory.
Findings
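Derived a necessary condition for memory-two deterministic strategies to form symmetric mutual reinforcement learning equilibria.
Presented three explicit examples of memory-two deterministic strategies that form such equilibria.
Proved that these equilibria remain mutual reinforcement learning equilibria when both players learn strategies with longer memory.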
Abstract
We investigate symmetric equilibria of mutual reinforcement learning when both players alternately learn the optimal memory-two strategies against the opponent in the repeated prisoners' dilemma game. We provide a necessary condition for memory-two deterministic strategies to form symmetric equilibria. We then provide three examples of memory-two deterministic strategies which form symmetric mutual reinforcement learning equilibria. We also prove that mutual reinforcement learning equilibria formed by memory-two strategies are also mutual reinforcement learning equilibria when both players use reinforcement learning of memory-$n$ strategies with $n > 2$.
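The equilibrium notion in the abstract can be illustrated with a small sketch. It is not the paper's construction, and everything specific in it is an assumption: the payoff values, the discount factor `delta`, the state encoding, and the helper names `optimal_q` and `is_symmetric_equilibrium` are all hypothetical. The idea is that, against a fixed memory-two deterministic opponent, the learner faces a deterministic decision process over the 16 possible outcomes of the last two rounds, so value iteration yields the optimal action values, and a strategy is treated here as a symmetric equilibrium if it is greedy with respect to the optimal values computed against a copy of itself.

```python
from itertools import product

C, D = 0, 1  # cooperate, defect

# Illustrative payoffs (an assumption, not taken from the paper): T > R > P > S.
R, S, T, P = 3.0, 0.0, 5.0, 1.0
PAYOFF = {(C, C): R, (C, D): S, (D, C): T, (D, D): P}  # keyed by (my action, opponent action)

PROFILES = list(product((C, D), repeat=2))
# A state is (profile at t-1, profile at t-2); each profile is (my action, opponent action).
STATES = list(product(PROFILES, repeat=2))


def swap(state):
    """Return the same history as seen from the other player's side."""
    return tuple((opp, me) for me, opp in state)


def optimal_q(opponent, delta=0.9, tol=1e-10):
    """Value iteration for the learner's optimal Q-values against a fixed opponent."""
    q = {(s, a): 0.0 for s in STATES for a in (C, D)}
    while True:
        new_q = {}
        for s in STATES:
            b = opponent[swap(s)]  # opponent's deterministic reply to this history
            for a in (C, D):
                nxt = ((a, b), s[0])  # newest profile enters, oldest one drops out
                best_next = max(q[(nxt, C)], q[(nxt, D)])
                new_q[(s, a)] = PAYOFF[(a, b)] + delta * best_next
        if max(abs(new_q[k] - q[k]) for k in q) < tol:
            return new_q
        q = new_q


def is_symmetric_equilibrium(strategy, delta=0.9, eps=1e-8):
    """Check that `strategy` is optimal in every state against a copy of itself."""
    q = optimal_q(strategy, delta)
    return all(q[(s, strategy[s])] >= q[(s, 1 - strategy[s])] - eps for s in STATES)


all_d = {s: D for s in STATES}  # always defect, written as a memory-two strategy
all_c = {s: C for s in STATES}  # always cooperate
print(is_symmetric_equilibrium(all_d), is_symmetric_equilibrium(all_c))
```

Under these assumed payoffs, always-defect passes the check and always-cooperate fails it, since defecting against an unconditional cooperator earns T rather than R in every round. The check accepts ties between the two actions; a stricter definition would require a strict inequality.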
Taxonomy
Topics: Evolutionary Game Theory and Cooperation · Reinforcement Learning in Robotics · Game Theory and Applications
