Numerical analysis of a reinforcement learning model with the dynamic aspiration level in the iterated Prisoner's Dilemma
Naoki Masuda, Mitsuhiro Nakamura

TL;DR
This paper numerically analyzes a reinforcement learning model with a dynamic aspiration level in the iterated Prisoner's Dilemma, identifying when learning players mutually cooperate and when they can invade populations of simpler strategies.
Contribution
It introduces a simplified, flexible reinforcement learning model with dynamic thresholds, showing its effectiveness in promoting cooperation and evolutionary invasion.
Findings
Mutual cooperation occurs when threshold dynamics are not too fast.
Learning players perform well against reactive strategies.
Learning strategies can invade populations of simpler strategies.
Abstract
Humans and other animals can adapt their social behavior in response to environmental cues including the feedback obtained through experience. Nevertheless, the effects of the experience-based learning of players on the evolution and maintenance of cooperation in social dilemma games remain relatively unclear. Some previous literature showed that mutual cooperation of learning players is difficult or requires a sophisticated learning model. In the context of the iterated Prisoner's Dilemma, we numerically examine the performance of a reinforcement learning model. Our model modifies those of Karandikar et al. (1998), Posch et al. (1999), and Macy and Flache (2002) in which players satisfice if the obtained payoff is larger than a dynamic threshold. We show that players obeying the modified learning mutually cooperate with high probability if the dynamics of the threshold are not too fast and the…
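The satisficing idea described in the abstract can be illustrated with a minimal Python sketch of a generic aspiration-based learner. This is not the authors' exact model: the payoff values, the tanh stimulus, the update rule, and the names `AspirationLearner`, `h`, `beta`, and `simulate` are all assumptions chosen for illustration. A player repeats an action more often when its payoff exceeds the current threshold (aspiration level), and the threshold itself drifts toward recent payoffs at a rate `h`.

```python
import math
import random

# Assumed illustrative Prisoner's Dilemma payoffs with T > R > P > S.
R, T, S, P = 3.0, 5.0, 0.0, 1.0


def payoff(my_c, other_c):
    """Payoff to a player who cooperates iff my_c, against other_c."""
    if my_c:
        return R if other_c else S
    return T if other_c else P


class AspirationLearner:
    """Hypothetical aspiration-based learner (a sketch, not the paper's model)."""

    def __init__(self, p=0.5, aspiration=2.0, h=0.05, beta=1.0):
        self.p = p                    # probability of cooperating
        self.aspiration = aspiration  # dynamic threshold
        self.h = h                    # speed of the threshold dynamics (assumed)
        self.beta = beta              # sensitivity of the stimulus (assumed)

    def act(self, rng):
        return rng.random() < self.p

    def update(self, cooperated, r):
        # Stimulus in (-1, 1): positive when the payoff exceeds the threshold.
        s = math.tanh(self.beta * (r - self.aspiration))
        if cooperated:
            # Satisfied: cooperate more; dissatisfied: cooperate less.
            self.p += (1 - self.p) * s if s >= 0 else self.p * s
        else:
            # Satisfied: defect more; dissatisfied: defect less.
            self.p -= self.p * s if s >= 0 else (1 - self.p) * s
        # The threshold moves toward the payoff just received.
        self.aspiration = (1 - self.h) * self.aspiration + self.h * r


def simulate(rounds=1000, h=0.05, seed=0):
    """Play two learners against each other; return the mutual-cooperation rate."""
    rng = random.Random(seed)
    a, b = AspirationLearner(h=h), AspirationLearner(h=h)
    mutual = 0
    for _ in range(rounds):
        ca, cb = a.act(rng), b.act(rng)
        a.update(ca, payoff(ca, cb))
        b.update(cb, payoff(cb, ca))
        mutual += ca and cb
    return mutual / rounds
```

Calling `simulate(h=...)` with different values of `h` gives a rough way to explore how the speed of the threshold dynamics affects cooperation in this toy setup; any numbers it produces describe the sketch only, not the paper's results.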
