Win-Stay-Lose-Shift as a self-confirming equilibrium in the iterated Prisoner's Dilemma
Minjae Kim, Jung-Kyoo Choi, and Seung Ki Baek

TL;DR
This paper explores how players in the iterated Prisoner's Dilemma can reach a cooperative equilibrium by inferring opponents' strategies from observed moves in a Bayesian way, emphasizing the role of Win-Stay-Lose-Shift in supporting a self-confirming equilibrium under certain conditions.
Contribution
It introduces a model where players learn strategies through observational Bayesian inference, demonstrating the emergence of Win-Stay-Lose-Shift as a self-confirming equilibrium in the game.
Findings
Players can escape from full defection into cooperation when the cost of cooperation is low.
Uncertainty in observation promotes cooperative strategies.
Win-Stay-Lose-Shift becomes a stable equilibrium under certain conditions.
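For readers unfamiliar with the strategy, the following is a minimal sketch of Win-Stay-Lose-Shift (WSLS) written as a deterministic memory-one rule. It is not the paper's model: the dictionary encoding, keyed by (own last move, opponent's last move), and the helper names `next_move`, `step`, and `trajectory` are assumptions made for illustration. It only shows the textbook property that two WSLS players return to mutual cooperation from mutual defection or after a single mistaken defection.

```python
# Illustrative encoding (an assumption, not the paper's notation):
# a memory-one rule maps (my last move, opponent's last move) to my next move.
WSLS = {
    ("C", "C"): "C",  # rewarded for cooperating: stay
    ("C", "D"): "D",  # exploited: shift
    ("D", "C"): "C" if False else "D",  # successful defection: stay with D
    ("D", "D"): "C",  # mutual defection: shift to C
}

def next_move(rule, mine, theirs):
    """Look up the next move prescribed by a deterministic memory-one rule."""
    return rule[(mine, theirs)]

def step(rule_a, rule_b, state):
    """Advance one round; state is (A's last move, B's last move)."""
    a, b = state
    return (next_move(rule_a, a, b), next_move(rule_b, b, a))

def trajectory(rule_a, rule_b, start, rounds):
    """Return the list of states visited, starting from `start`."""
    states = [start]
    for _ in range(rounds):
        states.append(step(rule_a, rule_b, states[-1]))
    return states

if __name__ == "__main__":
    # From mutual defection, both players shift to cooperation.
    print(trajectory(WSLS, WSLS, ("D", "D"), 2))
    # After a single mistaken defection by A, cooperation is restored.
    print(trajectory(WSLS, WSLS, ("D", "C"), 2))
```

In this sketch the pair passes through one round of mutual defection after an error and then settles back on mutual cooperation, which is the usual intuition for why WSLS can sustain cooperation.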
Abstract
Evolutionary game theory assumes that players replicate a highly scored player's strategy through genetic inheritance. However, when learning occurs culturally, it is often difficult to recognize someone's strategy just by observing the behaviour. In this work, we consider players with memory-one stochastic strategies in the iterated prisoner's dilemma, with an assumption that they cannot directly access each other's strategy but only observe the actual moves for a certain number of rounds. Based on the observation, the observer has to infer the resident strategy in a Bayesian way and chooses his or her own strategy accordingly. By examining the best-response relations, we argue that players can escape from full defection into a cooperative equilibrium supported by Win-Stay-Lose-Shift in a self-confirming manner, provided that the cost of cooperation is low and the observational…
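The inference step described in the abstract can be illustrated with a small sketch. Everything specific here is an assumption for illustration, not taken from the paper: the candidate set of four textbook strategies, the uniform prior, the error rate `EPS`, and the function names `likelihood` and `posterior`. The sketch computes a Bayesian posterior over candidate memory-one strategies from observed (previous outcome, move) pairs, and shows that moves observed in a fully defecting population cannot separate ALLD from TFT.

```python
# Assumed probability that an intended move is flipped (illustrative only);
# it keeps every likelihood strictly positive.
EPS = 0.05

# Hypothetical candidate set: cooperation probability after each outcome,
# keyed by (observed player's last move, co-player's last move).
CANDIDATES = {
    "ALLD": {("C", "C"): 0.0, ("C", "D"): 0.0, ("D", "C"): 0.0, ("D", "D"): 0.0},
    "ALLC": {("C", "C"): 1.0, ("C", "D"): 1.0, ("D", "C"): 1.0, ("D", "D"): 1.0},
    "TFT":  {("C", "C"): 1.0, ("C", "D"): 0.0, ("D", "C"): 1.0, ("D", "D"): 0.0},
    "WSLS": {("C", "C"): 1.0, ("C", "D"): 0.0, ("D", "C"): 0.0, ("D", "D"): 1.0},
}

def likelihood(strategy, observations, eps=EPS):
    """Probability of the observed moves given a memory-one strategy."""
    total = 1.0
    for outcome, move in observations:
        intended = strategy[outcome]
        # Effective cooperation probability after accounting for errors.
        p_coop = intended * (1 - eps) + (1 - intended) * eps
        total *= p_coop if move == "C" else 1 - p_coop
    return total

def posterior(observations, prior=None):
    """Bayes' rule over CANDIDATES; uniform prior unless one is given."""
    if prior is None:
        prior = {name: 1 / len(CANDIDATES) for name in CANDIDATES}
    weights = {
        name: prior[name] * likelihood(strategy, observations)
        for name, strategy in CANDIDATES.items()
    }
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}

if __name__ == "__main__":
    # In a fully defecting population, every observation is "D after (D, D)".
    observations = [(("D", "D"), "D")] * 5
    for name, prob in posterior(observations).items():
        print(f"{name}: {prob:.6f}")
```

Because ALLD and TFT both prescribe defection after mutual defection, they receive equal posterior weight in this example however many rounds are observed. This is one simple way to see why observing moves alone can leave the resident strategy uncertain.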
