The Dynamics of Policy Gradient in Social Dilemmas with Partner Selection
Benedict Russell, Chin-wing Leung, Paolo Turrini

TL;DR
This paper provides an analytical framework for understanding how partner selection influences cooperation in social dilemmas through policy-gradient dynamics, highlighting the role of population variance and stochastic effects.
Contribution
The paper gives an analytical solution for policy-gradient dynamics in a multi-agent environment with partner selection, extends those dynamics with a stochastic model based on a two-dimensional Wiener process, and derives conditions under which cooperation emerges.
Findings
Partner selection changes the opponent distribution and hence the reward landscape, which promotes cooperation under simple selection rules known from the literature.
Population variance is necessary for cooperation to emerge.
Stochastic modeling captures the effects of partner selection and learning rate on cooperation.
Abstract
In social dilemmas, self-interested learning agents face the choice between the societal benefit of cooperation and the immediate reward of defection. Significant evidence exists on the benefits of assortment mechanisms such as partner selection for the emergence of cooperation, but this is largely available through agent-based simulations. In this paper, we provide an analytical solution to the problem, studying the policy-gradient dynamics in a multi-agent environment with partner selection. We show how partner selection changes the opponent distribution and hence the reward landscape, and prove this promotes cooperation under simple rules known from the literature. In particular, we find that population variance is a necessary condition for cooperation to emerge. Using a two-dimensional Wiener process, we extend the dynamics to capture the stochastic effects of partner selection and…
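A minimal sketch of why population variance matters for the opponent distribution. The selection rule is an illustrative assumption, not necessarily the rule analyzed in the paper: each agent is picked as a partner with probability proportional to its own cooperation probability. The function names and sample populations are hypothetical. Under this rule, the expected cooperation of a selected partner equals the population mean plus variance divided by mean, so selection only shifts the opponent distribution when agents differ.

```python
# Illustrative only: the selection rule (weights proportional to each
# agent's cooperation probability) is an assumption made for this sketch.
# For simplicity an agent may also select itself.

def opponent_coop_uniform(coop_probs):
    """Expected cooperation of a partner drawn uniformly at random."""
    return sum(coop_probs) / len(coop_probs)

def opponent_coop_selected(coop_probs):
    """Expected cooperation of a partner drawn with probability
    proportional to its cooperation probability."""
    total = sum(coop_probs)
    return sum(p * p for p in coop_probs) / total

# Two hypothetical populations with the same mean cooperation (0.4).
homogeneous = [0.4] * 5                      # zero variance
heterogeneous = [0.1, 0.2, 0.4, 0.6, 0.7]    # positive variance

for name, pop in [("homogeneous", homogeneous),
                  ("heterogeneous", heterogeneous)]:
    shift = opponent_coop_selected(pop) - opponent_coop_uniform(pop)
    print(f"{name}: shift in opponent cooperation = {shift:.3f}")
```

With identical agents the shift is zero, so selection cannot change the rewards any agent faces. With heterogeneous agents, selected partners cooperate more often than the population average. In this toy setting that matches the abstract's claim that variance is a necessary condition.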
