Thermodynamics of Evolutionary Games

Christoph Adami; Arend Hintze (Michigan State University)

arXiv:1706.03058·q-bio.PE·June 27, 2018

TL;DR

This paper models the evolution of cooperation using Hamiltonian dynamics similar to Ising models, revealing phase transitions and the role of punishment as a magnetic field influencing cooperative behavior.

Contribution

It introduces a novel Hamiltonian formalism for evolutionary games, linking cooperation dynamics to phase transitions in spin systems, and explores punishment effects within this framework.

Findings

1. Cooperation fraction corresponds to a magnetization-like observable.
2. A phase transition between cooperation and defection is identified.
3. Punishment acts as a magnetic field promoting cooperation.

Equations

$$E=\begin{array}{c|cc} & {\rm C} & {\rm D}\\ \hline {\rm C} & R & S\\ {\rm D} & T & P\end{array}$$

$$H=\sum_{i=1}^{N}\sum_{m,n=0}^{1}E_{mn}\,P_{m}^{(i)}\otimes P_{n}^{(i+1)}$$

$$Z={\rm Tr}\,e^{-\beta H}$$

$$J_{z}=\sigma_{z}=P_{0}-P_{1}$$

$$J_{z}=\sum_{i=1}^{N}\big(P_{0}^{(i)}-P_{1}^{(i)}\big)$$

$$Z={\rm Tr}\,U^{N}=(1+e^{-\beta r})^{N}$$

$${\rm Tr}\,(U^{\prime}U^{N-1})=(1+e^{-\beta r})^{N-1}(-1+e^{-\beta r})$$

$$\Pi_{\rm C}=\begin{array}{c|cc} & {\rm C} & {\rm D}\\ \hline {\rm C} & r-1 & \frac{2}{3}r-1\\ {\rm D} & \frac{2}{3}r-1 & \frac{1}{3}r-1\end{array}$$

$$E^{(C)}=\left(\begin{array}{cc}0&\frac{1}{3}r\\ \frac{1}{3}r&\frac{2}{3}r\end{array}\right),\quad E^{(D)}=\frac{1}{3}r-1+E^{(C)}$$

$$H_{C}^{(i)}=\sum_{m,n=0}^{1}E_{mn}^{(C)}\,P_{m}^{(i-1)}\otimes P_{n}^{(i+1)}$$

$$H=\sum_{i=1}^{N}\Big[H_{C}^{(i)}P_{0}^{(i)}+H_{D}^{(i)}P_{1}^{(i)}\Big]$$

$$\langle J_{z}\rangle_{\beta}=\frac{1}{Z}{\rm Tr}\,(J_{z}e^{-\beta H})=N\tanh\frac{\beta}{2}(r-1)$$

$$\sum_{m_{1}m_{2}m_{3}}\langle m_{1}m_{2}m_{3}|J_{z}e^{-\beta H_{m_{2}}}|m_{1}m_{2}m_{3}\rangle$$

$$\langle J_{z}\rangle_{\beta}=\frac{1}{Z}{\rm Tr}\,(e^{-\beta H}J_{z})=\tanh\frac{\beta}{2}\Big(\frac{1}{3}r-1\Big)$$

$$p(x\leftarrow y)=\frac{1}{1+e^{-\beta(w_{x}-w_{y})}}$$

$$M=\frac{P_{C}-P_{D}}{P_{C}+P_{D}}$$

$$H=P_{00}H_{M}+P_{01}H_{C}+P_{10}H_{D}+P_{11}H_{I}$$

Full text

Christoph Adami

Department of Microbiology & Molecular Genetics

Department of Physics & Astronomy

BEACON Centre for the Study of Evolution in Action

Michigan State University, East Lansing, MI 48824

Arend Hintze

Department of Computer Science & Engineering

Department of Integrative Biology

BEACON Centre for the Study of Evolution in Action

Michigan State University, East Lansing, MI 48824

Abstract

How cooperation can evolve between players is an unsolved problem of biology. Here we use Hamiltonian dynamics of models of the Ising type to describe populations of cooperating and defecting players to show that the equilibrium fraction of cooperators is given by the expectation value of a thermal observable akin to a magnetization. We apply the formalism to the Public Goods game with three players, and show that a phase transition between cooperation and defection occurs that is equivalent to a transition in one-dimensional Ising crystals with long-range interactions. We then investigate the effect of punishment on cooperation and find that punishment plays the role of a magnetic field that leads to an “alignment” between players, thus encouraging cooperation. We suggest that a thermal Hamiltonian picture of the evolution of cooperation can generate other insights about the dynamics of evolving groups by mining the rich literature of critical dynamics in low-dimensional spin systems.

Cooperation is a particularly interesting phenomenon in the context of evolution. Evolution acts on short-term benefits, which makes cooperators vulnerable to exploitation in the form of cheating or “defection” even if cooperation is a strategy with higher payoffs in the long term, creating what is known as the “dilemma of cooperation”. It is often stated that because of the dilemma, the expected outcome of evolution should be defection, rendering the plethora of examples of cooperators in nature mysterious. However, there are a number of different mechanisms that nevertheless enable cooperation MaynardSmith1982 ; Axelrod1984 ; HofbauerSigmund1998 ; Nowak2006 , suggesting that, contrary to the naive expectation, cooperation is after all the natural outcome of evolution when mechanisms enabling assortment (such as discrimination via communication) are available Adamietal2016a . These results have been obtained using mathematics as well as computational-simulation tools. The mathematical results in particular provide insight into the evolutionary dynamics giving rise to cooperation from inspecting closed-form solutions, but such solutions are hard to come by when populations are finite, are not well-mixed, or are subject to significant mutation Adamietal2016a . Recently, progress was made in understanding the evolutionary dynamics of games played on arbitrary grids Allenetal2017 , but closed-form solutions predicting the “critical point” for the transition between cooperation and defection still do not exist. Here, we use methods borrowed from statistical physics that show the path to such general formulæ.

Prior investigations of several standard evolutionary games SzaboFath2007 ; Szolnokietal2009 ; Iliopoulosetal2010 ; AdamiSchossauHintze2012 ; HintzeAdami2015 ; SzaboBorsos2016 revealed that the evolutionary process often critically depends on a single parameter that causes an abrupt change in winning strategy. In some cases it is possible to move the parameter beyond the critical point without triggering the transition—the hallmark of hysteresis HintzeAdami2015 . These results suggest that there is an underlying analogy between evolutionary game dynamics and the statistical description of phase transitions. Indeed, Szabó and Hauert SzaboHauert2002 ; HauertSzabo2005 applied mathematical methods that are used to describe critical phase transitions like the ones found in the celebrated Ising model Ising1925 to evolutionary games on a lattice, and showed (via numerical simulation, as well as the pair-approximation on square lattices) that the Prisoner’s Dilemma (PD) game dynamics on random regular lattices fall into the directed percolation class of phase transitions.

Here we take a different approach, by explicitly constructing Hamiltonians for game dynamics inspired by Ising-type models, and studying games on finite regular lattices analytically (albeit only in one dimension). It might at first appear odd to consider thermal game theory, as temperature plays no role in evolutionary dynamics. In physics, thermal effects are due to fluctuations in energy, but payoffs in evolutionary games can fluctuate as well, for a number of different reasons. For example, a finite evolving population is subject to drift and thus to a random element in the payoffs. Mutations that change strategies can play a similar role. In evolutionary games, we can summarize the effect of fluctuations by introducing a parameter that controls the strength of selection in the game, using the “strategy adoption” mode of selection (see HauertSzabo2005 and below). While the dynamics under this rule are not precisely the same as under the “strategy inheritance” mode of Darwinian selection, the differences (also discussed in HauertSzabo2005 ) are irrelevant for our purposes. The relationship between game dynamics and Ising-type models has been reviewed recently SzaboBorsos2016 .

To introduce our method and notation, we first study the Prisoner’s Dilemma Hamiltonian at finite temperature and recover well-known results. We then apply the method to the Public Goods game without punishment, which turns out to be equivalent to an Ising model with long-range interactions, but without a magnetic field. We then add punishment to the Public Goods game, leading to an Ising model with magnetic field (and corresponding hysteresis effects) that we solve exactly.

1 Prisoner’s Dilemma

The Prisoner’s Dilemma is a game played between two individuals, in which both players have to make a decision about whether to cooperate or to defect. After both players have made their choice–to cooperate (C) or to defect (D)–their actions are revealed and players receive a payoff according to a payoff matrix (note that the values in the matrix correspond to the payoff given to the “row” player)

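$$E=\begin{array}{c|cc} & {\rm C} & {\rm D}\\ \hline {\rm C} & R & S\\ {\rm D} & T & P\end{array}\;. \tag{1}$$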

The payoffs in that matrix define the type of game to be played. To obtain a Prisoner’s Dilemma, we must have Axelrod1984 $T>R>P>S$ . If the game is played repeatedly it becomes the iterated Prisoner’s Dilemma (IPD), a variant not considered here. Evolutionary game theory focuses on determining what strategies are evolutionarily stable in a population of strategies. In the simplest case, competition is between two unconditional deterministic strategies: one that always cooperates and one that always defects. A population starts out as a mix of both strategies, and players interact with a defined number of neighbors. Each player’s performance is evaluated by accumulating all payoffs received in that round. To model evolution, randomly-picked players (called focal players) can now either maintain their strategy or adopt the strategy of a competitor. Over time this process will lead to the spread of successful strategies and thus to evolution. This process of probabilistic strategy adoption is similar to the dynamics of strongly interacting spins described by Glauber Glauber1963 . In such a model of ferromagnetism, adjacent particles interact so that their spins will predominantly align (a spin adopting the state of its neighbor), giving rise to an overall magnetization that depends on the temperature of the system. In the following, we explore this analogy more deeply.

We first derive the thermodynamics of the Prisoner’s Dilemma with a payoff matrix where we set the reward $R=b-c$ (the benefit of cooperation minus the cost), while the temptation payoff $T=b$ (obtaining the benefit without bearing the cost). At the same time, the so-called “sucker-payoff” $S=-c$ due to paying the cost without any benefit, while $P=0$ is the “punishment” for both players mistrusting each other. In all of the following, we assume $c\geq 0$ as well as $b-c\geq 0$ , so that the net benefit $r=b-c\geq 0$ , ensuring that a dilemma exists. Indeed, even though the benefit outweighs the cost ( $r>0$ ), the Nash equilibrium and evolutionarily stable strategy is known to be defection, not cooperation. The payoff matrix in terms of these values then becomes $E=\left(\begin{array}[]{cc}b-c&-c\\ b&0\end{array}\right)$ .
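As a quick illustration of this parameterization, the following Python sketch builds the payoff matrix from a benefit and a cost and checks the ordering $T>R>P>S$. The function names `pd_payoff_matrix` and `is_prisoners_dilemma` and the numerical values are hypothetical and chosen only for this example. Note that the strict ordering requires $c>0$ and $b>c$; the boundary cases $c=0$ or $b=c$ give a degenerate game.

```python
def pd_payoff_matrix(b, c):
    """Row-player payoffs [[R, S], [T, P]] for benefit b and cost c."""
    R = b - c   # both cooperate: benefit minus cost
    S = -c      # cooperate against a defector: cost without benefit
    T = b       # defect against a cooperator: benefit without cost
    P = 0.0     # both defect
    return [[R, S], [T, P]]


def is_prisoners_dilemma(E):
    """True if the matrix satisfies the strict ordering T > R > P > S."""
    (R, S), (T, P) = E
    return T > R > P > S


# Illustrative values only (not taken from the paper).
E = pd_payoff_matrix(b=3.0, c=1.0)
print(E)
print(is_prisoners_dilemma(E))
```

With these values the net benefit is $r=b-c=2$, and the ordering check passes because the cost is strictly positive and smaller than the benefit.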

To define a Hamiltonian (an operator that describes the total energy of this system) we can transform the payoffs into an energy by subtracting the payoff from its largest possible value. However, as this only adds a global constant it will cancel in observables, so to understand the population dynamics in terms of thermodynamics we can keep the payoff as is. A Hamiltonian is an operator that acts on a vector space (Hilbert space). Here the Hilbert space of a single player is spanned by two basis vectors, one for the cooperative strategy C and one for the defecting strategy D: $C=|0\rangle=\left(\begin{array}[]{c}1\cr 0\end{array}\right)$ and $D=|1\rangle=\left(\begin{array}[]{c}0\cr 1\end{array}\right)$ .

In analogy to Ising spin systems, the Hamiltonian for the PD game can then be written in terms of the energy matrix $E$ and the projectors $P_{0}=|0\rangle\langle 0|=\left(\begin{array}[]{cc}1&0\\ 0&0\end{array}\right)$ and $P_{1}=|1\rangle\langle 1|=\left(\begin{array}[]{cc}0&0\\ 0&1\end{array}\right)$ as

$$H=\sum_{i=1}^{N}\sum_{m,n=0}^{1}E_{mn}\,P_{m}^{(i)}\otimes P_{n}^{(i+1)}\;, \tag{2}$$

where the sum over $i$ goes over all the sites in this one-dimensional “spin chain”.

We proceed by calculating the thermal partition function of the system by writing ( $\beta=1/T$ is the inverse of the temperature, which the reader will not confuse with the temptation payoff)

$$Z={\rm Tr}\,e^{-\beta H}=\sum_{x}\langle x|e^{-\beta H}|x\rangle\;, \tag{3}$$

where $|x\rangle=|m_{1}m_{2}\cdots m_{N}\rangle$ is a circular chain so that the $N$ th site is adjacent to the first site. It is then easy to see that

$$Z=\sum_{m_{1},\dots,m_{N}}U_{m_{1}m_{2}}U_{m_{2}m_{3}}\cdots U_{m_{N}m_{1}}={\rm Tr}\,U^{N}\;, \tag{4}$$

where $U_{ij}=e^{-\beta E_{ij}}$ .
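The transfer-matrix form of the partition function is easy to check numerically. The sketch below, in plain Python, builds $U_{ij}=e^{-\beta E_{ij}}$ for the payoff matrix with entries $b-c$, $-c$, $b$, $0$, multiplies it $N$ times, and takes the trace. The helper names (`matmul2`, `transfer_matrix`, `trace_of_power`) and the parameter values are assumptions made for illustration. For this particular $E$ the determinant of $U$ vanishes, so $U$ has a single nonzero eigenvalue and ${\rm Tr}\,U^{N}=({\rm Tr}\,U)^{N}$.

```python
import math


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transfer_matrix(b, c, beta):
    """U[i][j] = exp(-beta * E[i][j]) with E = [[b - c, -c], [b, 0]]."""
    E = [[b - c, -c], [b, 0.0]]
    return [[math.exp(-beta * E[i][j]) for j in range(2)] for i in range(2)]


def trace_of_power(U, N):
    """Tr U^N computed by repeated multiplication."""
    M = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(N):
        M = matmul2(M, U)
    return M[0][0] + M[1][1]


# Illustrative parameters only.
b, c, beta, N = 3.0, 1.0, 0.7, 10
U = transfer_matrix(b, c, beta)
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]   # vanishes: U has rank one
Z = trace_of_power(U, N)
trace_U = U[0][0] + U[1][1]                      # equals 1 + exp(-beta * (b - c))
print(det_U, Z, trace_U ** N)
```

Repeated multiplication is used instead of a linear-algebra library so that the example has no dependencies; for long chains one would diagonalize $U$ instead.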

To determine the equilibrium population composition, we define an order parameter given by the fraction of cooperators minus the fraction of defectors. For spin chains this is equal to the magnetization of the chain, defined using a spin operator $J_{z}$ for which $\langle 0|J_{z}|0\rangle=1$ and $\langle 1|J_{z}|1\rangle=-1$ . This can be achieved, e.g., with ( $\sigma_{z}$ is a Pauli matrix)

$$J_{z}=\sigma_{z}=P_{0}-P_{1}\;. \tag{5}$$

We will understand this operator to act on the “row” player (that is, the first spin of the pair). For a chain of length $N$ ,

$$J_{z}=\sum_{i=1}^{N}\big(P_{0}^{(i)}-P_{1}^{(i)}\big)\;, \tag{6}$$

so that

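$$\langle J_{z}\rangle_{\beta}=\frac{1}{Z}{\rm Tr}\,\big(J_{z}e^{-\beta H}\big)=\frac{N}{Z}{\rm Tr}\,\big(U^{\prime}U^{N-1}\big) \tag{7}$$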

due to the cyclic property of the trace. Here we introduced the matrix $U^{\prime}_{ij}=(-1)^{i}U_{ij}$ . An explicit calculation shows that (recall that $r=b-c$ )

$$Z={\rm Tr}\,U^{N}=(1+e^{-\beta r})^{N}\;, \tag{8}$$

while since $U^{\prime}U=(1+e^{-\beta r})U^{\prime}$ and ${\rm Tr\,}U^{\prime}=-1+e^{-\beta r}$

$${\rm Tr}\,(U^{\prime}U^{N-1})=(1+e^{-\beta r})^{N-1}(-1+e^{-\beta r})\;, \tag{9}$$

so that finally the thermal expectation value of the magnetization is

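$$\langle J_{z}\rangle_{\beta}=N\,\frac{-1+e^{-\beta r}}{1+e^{-\beta r}}=-N\tanh\frac{\beta r}{2}\;. \tag{10}$$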

We show the magnetization per player [Eq. (10) divided by $N$ ] as a function of the critical parameter $r$ in Fig. 1, and see that at low temperatures (high $\beta$ ) the population will consist mostly of defectors (negative magnetization) as this is the Nash equilibrium. We note that the parameter $r$ plays the same role as the interaction strength $J$ in the standard Ising model. The phase transition (vanishing magnetization) occurs at $r=0$ (the “boundary” of the parameter values), which is expected from the general arguments of van Hove vanHove1950 and of Landau LandauLifschitz1987 that forbid phase transitions in one-dimensional systems. Thus, we do not observe cooperation in the one-dimensional Prisoner’s Dilemma, as is of course well-known.
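For a short chain the closed-form expressions can be checked by brute force. The following sketch enumerates all $2^{N}$ configurations of a ring, weights each by $e^{-\beta E}$ with the payoff matrix used directly as the energy (as in the text), and compares the result with $(1+e^{-\beta r})^{N}$ and with $-N\tanh(\beta r/2)$, which is the value implied by the two traces quoted above. The function name `pd_ring_thermodynamics` and the parameter values are hypothetical.

```python
import math
from itertools import product


def pd_ring_thermodynamics(N, b, c, beta):
    """Brute-force Z and <J_z> for the PD chain on a ring of N sites."""
    # E[m][n]: payoff of row player m against column player n (0 = C, 1 = D).
    E = [[b - c, -c], [b, 0.0]]
    Z = 0.0
    Jz_weighted = 0.0
    for config in product((0, 1), repeat=N):
        # Nearest-neighbor pairs on a ring: site N-1 couples back to site 0.
        energy = sum(E[config[i]][config[(i + 1) % N]] for i in range(N))
        weight = math.exp(-beta * energy)
        Z += weight
        # J_z counts +1 for every cooperator and -1 for every defector.
        Jz_weighted += weight * sum(1 - 2 * m for m in config)
    return Z, Jz_weighted / Z


# Illustrative parameters only.
N, b, c, beta = 6, 3.0, 1.0, 0.7
r = b - c
Z, Jz = pd_ring_thermodynamics(N, b, c, beta)
Z_closed = (1 + math.exp(-beta * r)) ** N
Jz_closed = -N * math.tanh(beta * r / 2)
print(Z, Z_closed)
print(Jz, Jz_closed)
```

The enumeration cost grows as $2^{N}$, so this is only a sanity check for small $N$; the closed form is what makes large chains tractable.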

2 Public Goods game in one dimension

The PD game we just described turns out to be the two-player version of the more general Public Goods (PG) game. The PG game is a staple of evolutionary game theory as well as experimental economics Olson1971 ; DavisHolt1993 ; Ledyard1995 , and has been used to understand the Tragedy of the Commons Hardin1968 , a social dilemma that can lead to the overuse of public resources (for example, overfishing) because of selfish behavior. In the PG game, payoffs are defined for cooperators and defectors via

$$\Pi_{C}=\frac{r\,(N_{C}+1)}{k+1}-1\;, \tag{11}$$

$$\Pi_{D}=\frac{r\,N_{C}}{k+1}\;, \tag{12}$$

where $\Pi_{C}$ is the payoff for a cooperator ( $\Pi_{D}$ for a defector). $N_{C}$ is the number of cooperators in the neighborhood (not counting the focal player, so it is the number of cooperators in the player’s periphery), and $r$ is the reward multiplier (synergy factor). These are the rules for a game with $k+1$ players in a group. In the following, we will treat the game in one dimension (so $k=2$ ).

The rules (11-12) imply a payoff matrix

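$$\Pi_{\rm C}=\begin{array}{c|cc} & {\rm C} & {\rm D}\\ \hline {\rm C} & r-1 & \frac{2}{3}r-1\\ {\rm D} & \frac{2}{3}r-1 & \frac{1}{3}r-1\end{array}$$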

for cooperators, where the matrix elements indicate the states of spins in the periphery of the focal player. For example, $r-1$ is the payoff for a cooperator surrounded by two cooperators. The payoff matrix for defectors is simply $\Pi_{\rm D}=\Pi_{\rm C}-(\frac{1}{3}r-1)$ .
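The two payoff matrices can be generated directly from the payoff rules. The sketch below assumes the standard Public Goods rules $\Pi_{C}=r(N_{C}+1)/(k+1)-1$ and $\Pi_{D}=rN_{C}/(k+1)$, which reproduce the entries quoted in the text (for example $r-1$ for a cooperator with two cooperating neighbors) and the stated relation between $\Pi_{\rm D}$ and $\Pi_{\rm C}$. The function names and the value of $r$ are hypothetical; exact fractions are used so that the comparison is free of rounding error.

```python
from fractions import Fraction


def pg_payoff(is_cooperator, n_coop_neighbors, r, k=2):
    """Payoff of a focal player in a group of k + 1 (assumed standard PG rules)."""
    if is_cooperator:
        return r * (n_coop_neighbors + 1) / (k + 1) - 1
    return r * n_coop_neighbors / (k + 1)


def payoff_matrix(is_cooperator, r):
    """2x2 matrix indexed by the states of the two neighbors (0 = C, 1 = D)."""
    return [[pg_payoff(is_cooperator, (1 - m) + (1 - n), r) for n in (0, 1)]
            for m in (0, 1)]


# Illustrative synergy factor only.
r = Fraction(9, 2)
Pi_C = payoff_matrix(True, r)
Pi_D = payoff_matrix(False, r)
shift = r / 3 - 1   # Pi_D = Pi_C - shift
print(Pi_C)
print(Pi_D)
```

Because the difference between the two matrices is the same constant for every neighborhood, a focal player's incentive to defect does not depend on what its neighbors do, which is what later makes the single-site calculation so simple.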

We now construct a Hamiltonian to solve this evolutionary model exactly in two cases: one where the dynamics maximize the mean payoff of the population, and one in which the payoff of an individual is maximized. Naturally, we expect a correspondence with the evolutionary scenario only in the latter case. In this one-dimensional game, the population is arranged linearly so that each player forms a group with its left and right neighbor $(k=2)$ , see Fig. 2.

As mentioned earlier, we can create matrices for energies that should be minimized (rather than payoffs that need to be maximized) by subtracting the payoffs from the maximal payoff (here, $r-1$ ), leading to a ground state that has zero energy. Strictly speaking, the Hamiltonian for this system should be written as an interaction of three spins, but we will often write it in terms of a two-spin interaction matrix conditional on the state of the focal spin. For example, we can write

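$$E^{(C)}=\left(\begin{array}{cc}0&\frac{1}{3}r\\ \frac{1}{3}r&\frac{2}{3}r\end{array}\right),\quad E^{(D)}=\frac{1}{3}r-1+E^{(C)}\;.$$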

We write a Hamiltonian for cooperators using these energies and the projectors previously defined

$$H_{C}^{(i)}=\sum_{m,n=0}^{1}E_{mn}^{(C)}\,P_{m}^{(i-1)}\otimes P_{n}^{(i+1)}\;,$$

and similarly for $H_{D}^{(i)}$ . The total Hamiltonian is (recall that $P_{0}$ projects onto a cooperator, so that $P_{0}|0\rangle=|0\rangle$ while $P_{0}|1\rangle=0$ )

$$H=\sum_{i=1}^{N}\Big[H_{C}^{(i)}P_{0}^{(i)}+H_{D}^{(i)}P_{1}^{(i)}\Big]\;.$$

Using the spin operator (6) and the methods outlined earlier, we obtain after a somewhat tedious calculation

$$\langle J_{z}\rangle_{\beta}=\frac{1}{Z}{\rm Tr}\,(J_{z}e^{-\beta H})=N\tanh\frac{\beta}{2}(r-1)\;,$$

suggesting a phase transition at $r=1$ , in contradiction with the standard expectation HintzeAdami2015 that suggests a transition at $r=3$ (see below). The reason for this discrepancy is not difficult to find: Hamiltonian dynamics minimize the energy of the entire spin chain, which is equivalent to maximizing population fitness as a whole. Darwinian evolution, however, does not optimize population fitness, but rather maximizes the fitness of a single individual within a population.
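The value $r=1$ can be cross-checked with a short calculation. As a sketch, assume that the energy matrices take the form $E^{(C)}_{mn}=\frac{r}{3}(m+n)$ and $E^{(D)}_{mn}=E^{(C)}_{mn}+\frac{r}{3}-1$ with $m,n\in\{0,1\}$ labelling the two neighbors (0 for a cooperator, 1 for a defector), and write $d_{i}\in\{0,1\}$ for the defection indicator of site $i$ on a ring. The symbol $d_{i}$ is introduced here only for illustration. The neighbor terms can then be regrouped site by site:

```latex
\begin{align}
H &= \sum_{i=1}^{N}\Big[\tfrac{r}{3}\,\big(d_{i-1}+d_{i+1}\big)
      +\big(\tfrac{r}{3}-1\big)\,d_{i}\Big]
   = (r-1)\sum_{i=1}^{N} d_{i}\;, \\
Z &= \big(1+e^{-\beta(r-1)}\big)^{N}\;, \\
\langle J_{z}\rangle_{\beta}
  &= N\,\frac{1-e^{-\beta(r-1)}}{1+e^{-\beta(r-1)}}
   = N\tanh\frac{\beta}{2}(r-1)\;.
\end{align}
```

Under these assumptions the sites decouple when the energy of the whole chain is summed: each defector costs an energy $r-1$ regardless of its neighbors, so the sign of the magnetization flips at $r=1$.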

We can implement the latter dynamic by dropping the sum over sites in Eq. (17), and consider only the contribution to the energy from a single spin with its two neighbors. In that case (we take the middle site to be the focal site whose energy is minimized)

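$$Z=\sum_{m_{1}m_{2}m_{3}}\langle m_{1}m_{2}m_{3}|e^{-\beta H_{m_{2}}}|m_{1}m_{2}m_{3}\rangle=\sum_{m_{1},m_{3}}\big(U_{m_{1}m_{3}}+V_{m_{1}m_{3}}\big)\;,$$

where $U$ is the “cooperative” matrix $U=e^{-\beta H_{0}}$ while the defector matrix $V=e^{-\beta H_{1}}=e^{-\beta(r/3-1)}U$ because defector energies differ by $r/3-1$ from cooperator energies, see Eq. (16). Then,

$$Z=\big(1+e^{-\beta(r/3-1)}\big)\sum_{m_{1},m_{3}}U_{m_{1}m_{3}}=\big(1+e^{-\beta(r/3-1)}\big)\big(1+e^{-\beta r/3}\big)^{2}\;.$$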

Using the spin operator defined in Eq. (5) we obtain (again for a single focal player in the middle position)

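$${\rm Tr}\,(J_{z}e^{-\beta H})=\sum_{m_{1}m_{2}m_{3}}\langle m_{1}m_{2}m_{3}|J_{z}e^{-\beta H_{m_{2}}}|m_{1}m_{2}m_{3}\rangle=\sum_{m_{1},m_{3}}\big(U_{m_{1}m_{3}}-V_{m_{1}m_{3}}\big)\;,$$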

which allows us to calculate the order parameter as

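$$\langle J_{z}\rangle_{\beta}=\frac{1}{Z}{\rm Tr}\,(e^{-\beta H}J_{z})=\tanh\frac{\beta}{2}\Big(\frac{1}{3}r-1\Big)\;.$$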

This function is plotted in Figure 3, and suggests that a phase transition with an interior critical point is possible in this game even though the game is one-dimensional, seemingly violating van Hove’s theorem vanHove1950 . However, the theorem forbidding internal critical points in one dimension only holds for short-range interactions, while the interaction between three players studied here is not of that kind.
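The single-site result can be verified by enumerating the eight states of a focal player and its two neighbors. The sketch below assumes cooperator energies $E^{(C)}$ with entries $0$, $r/3$, $r/3$, $2r/3$ and defector energies shifted by $r/3-1$, lets $J_{z}$ act on the middle (focal) player only, and compares the outcome with $\tanh\frac{\beta}{2}(\frac{1}{3}r-1)$. The function name and parameter values are hypothetical.

```python
import math
from itertools import product


def focal_site_order_parameter(r, beta):
    """<J_z> of the middle player when only its own energy is minimized."""
    # Energy of a focal cooperator, indexed by the neighbor states (0 = C, 1 = D).
    E_C = [[0.0, r / 3], [r / 3, 2 * r / 3]]
    shift = r / 3 - 1   # a focal defector has energy E_C + shift
    Z = 0.0
    Jz_weighted = 0.0
    for m1, m2, m3 in product((0, 1), repeat=3):   # m2 is the focal player
        energy = E_C[m1][m3] + (shift if m2 == 1 else 0.0)
        weight = math.exp(-beta * energy)
        Z += weight
        Jz_weighted += weight * (1 - 2 * m2)       # +1 for C, -1 for D
    return Jz_weighted / Z


# Illustrative values on either side of r = 3.
for r in (1.5, 3.0, 4.5):
    print(r, focal_site_order_parameter(r, beta=2.0),
          math.tanh(2.0 / 2 * (r / 3 - 1)))
```

The neighbor sum factors out of both the numerator and the partition function, so only the energy shift between the focal cooperator and the focal defector survives in the ratio; this is why the sign change sits at $r=3$.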

To test the accuracy of our theoretical result, we now simulate the Public Goods game using agent-based methods Helbingetal2010 ; HintzeAdami2015 ; Adamietal2016a .

In the agent-based simulations we use a population of 1,024 players that either cooperate or defect, arranged in a one-dimensional chain just as in Fig. 2. Which of the two moves an agent chooses is determined by a genome (here a single locus) that evolves. At every update, players have a chance to change their strategy by probabilistically adopting the strategy of a competitor (Glauber dynamics, see, e.g., HauertSzabo2005 ; SzaboFath2007 ) using the rule (here $x$ is the focal player while $y$ is an alternative strategy)

$$p(x\leftarrow y)=\frac{1}{1+e^{-\beta(w_{x}-w_{y})}}\;,$$

where $\beta$ is related to the strength of selection and $w$ is the fitness of each player defined by the payoff the player receives. In the case of rejection (i.e., non-adoption) the focal player retains its strategy.

We define an order-parameter-like function that indicates to what extent the population is in a cooperative or a defective regime. This parameter depends on the fraction of players in the population cooperating ( $P_{C}$ ) and the fraction defecting ( $P_{D}$ ) and is defined as:

$$M=\frac{P_{C}-P_{D}}{P_{C}+P_{D}}\;.$$

The agent-based simulations confirm that the fate of an evolving population depends critically on the synergy factor $r$ (see Figure 4), and changes from negative (defection) to positive (cooperation) at $r=3$ , in accordance with the critical $r_{c}=k+1$ for strategies to evolve cooperative behavior in the Public Goods game HintzeAdami2015 . In particular, the simulations confirm the theoretical results with high accuracy.
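A minimal version of such a simulation fits in a few lines of Python. The sketch below is not the authors' code: it assumes that a player's fitness is the payoff from its own group of three only, that the alternative strategy is evaluated in the same position with the neighbors held fixed, and that the alternative is adopted with probability $1/(1+e^{-\beta(w_{\rm alt}-w_{\rm cur})})$, so that better-paying alternatives are adopted more often. The function name `simulate_pg_ring`, the number of sweeps, the seed, and the parameter values are all hypothetical.

```python
import math
import random


def simulate_pg_ring(r, beta, n_players=1024, sweeps=50, seed=0):
    """Order parameter M after strategy-adoption updates on a ring."""
    rng = random.Random(seed)
    coop = [rng.random() < 0.5 for _ in range(n_players)]   # True = cooperator

    def payoff(is_coop, i):
        # Payoff from the group made of site i and its left and right neighbors.
        n_c = coop[(i - 1) % n_players] + coop[(i + 1) % n_players]
        return r * (n_c + 1) / 3 - 1 if is_coop else r * n_c / 3

    for _ in range(sweeps * n_players):
        i = rng.randrange(n_players)
        w_current = payoff(coop[i], i)
        w_alternative = payoff(not coop[i], i)
        # Assumed sign convention: higher alternative payoff, higher adoption chance.
        p_adopt = 1.0 / (1.0 + math.exp(-beta * (w_alternative - w_current)))
        if rng.random() < p_adopt:
            coop[i] = not coop[i]

    p_c = sum(coop) / n_players
    p_d = 1.0 - p_c
    return (p_c - p_d) / (p_c + p_d)


# Illustrative runs above and below the critical synergy r = 3.
M_above = simulate_pg_ring(r=4.5, beta=2.0)
M_below = simulate_pg_ring(r=1.5, beta=2.0)
print(M_above, M_below)
```

Under these assumptions the payoff difference between the two strategies is $\pm(r/3-1)$ for every neighborhood, so each update is a heat-bath step and the measured $M$ should scatter around $\tanh\frac{\beta}{2}(\frac{1}{3}r-1)$ with fluctuations of order $1/\sqrt{n_{\rm players}}$.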

3 Public Goods game with punishment

Cooperation evolves in the PG game if the synergy $r$ is at least as large as the group’s size $k+1$ . However, it is unlikely that in nature cooperation would ever create such a high synergy factor, implying that cooperation cannot evolve in this type of game. It has previously been suggested that punishment is one way to promote cooperation FehrGachter2002 ; FehrFischbacher2003 ; CamererFehr2006 ; Sigmundetal2001 ; Boydetal2003 ; Brandtetal2003 ; Helbingetal2010 . By introducing punishment, players can now not only choose between cooperation and defection, but can do this in conjunction with deciding whether or not to punish cheaters. This introduces two more strategies: a “moralist” M who cooperates and punishes, and an “immoralist” I who defects but also punishes Helbingetal2010 . For every player punished for defecting, each punishing player must pay a cost ( $\gamma$ ), and every player that is punished in such a way suffers a fine ( $\epsilon$ ), thus extending the rules (11,12) to (here, we show the special 1D case $k=2$ , for the general case see for example HintzeAdami2015 )

[TABLE]

where $N_{i}$ is the number of players in the immediate neighborhood of the focal player with strategy $i$ , $\epsilon$ parameterizes the effect of punishment, while $\gamma$ stands for the cost of punishment (see Helbingetal2010 ; HintzeAdami2015 ).

We now study this model thermodynamically, but in order to compare to the evolutionary dynamics we study the regime where the energy of a single site is minimized. To account for the additional strategies (beyond cooperator and defector), we extend the Hilbert space by allowing for a site-dependent magnetization $|i\rangle\rightarrow|i\rangle|j\rangle$ , so that each strategy is defined by a product of spin vectors. If we define punishment as $|0\rangle$ and non-punishment as $|1\rangle$ , we can write the states of the punishing and non-punishing cooperator as

$$|{\rm M}\rangle=|0\rangle|0\rangle\;,\quad|{\rm C}\rangle=|0\rangle|1\rangle\;.$$

The payoffs (26-29) can be written in terms of a Hamiltonian for each of the four strategies as

$$H=P_{00}H_{M}+P_{01}H_{C}+P_{10}H_{D}+P_{11}H_{I}\;,$$

with projectors $P_{ij}$ on the respective states (with $\sum_{ij=0}^{1}P_{ij}=1$ ). Each Hamiltonian $H_{k}$ ( $k={\rm C},{\rm D},{\rm M},{\rm I})$ is written in terms of an energy matrix $F^{(k)}$ just as in Eq. (17)

[TABLE]

Similarly,

[TABLE]

We can now calculate the partition function

[TABLE]

on account of the decomposition (48), where

[TABLE]

or four times the contribution from each $E^{(C)}$ . Similarly,

[TABLE]

Finally, we obtain the order parameter that measures the degree of cooperation (the fraction of C and M players minus the fraction of D and I players), which turns into the surprisingly simple expression

[TABLE]

Note that the order parameter only depends on the effect of punishment $\epsilon$ but not the cost $\gamma$ , and reduces to expression (23) in the limit $\epsilon\to 0$ .

To check the theory, we can extend the agent-based model described above by including the two new strategies I and M. As before, we use 1024 players in a population that is arranged linearly (see Methods), and games are played in groups of three. Again, when we evolve this population using strategy adoption, we see the dependence of the critical point on the synergy factor $r$ and the selection strength $\beta=1/T$ . Since the game now includes two more strategies, we have to modify the function $M$ that describes the fraction of cooperators in the game to contain all four strategies as the fraction of contributing (cooperating) strategies:

[TABLE]

Evolving these populations using different fines $\epsilon$ and costs $\gamma$ , we find that the critical point now depends only on $\epsilon$ (see Figure 6), and that increasing the punishment fine reduces the critical synergy required for cooperation HintzeAdami2015 .

It turns out that the closed-form solution Eq. (66) reproduces the agent-based simulations shown in Fig. 6 to a remarkable extent, confirming the unintuitive finding that the critical point only depends on the effect, but not on the cost, of punishment. The Hamiltonian model also clarifies that punishment indeed acts like a magnetic field that encourages alignment of spins, thus explaining why in agent-based simulations punishment induces hysteresis as a population is subjected to an adiabatically varying $r$ HintzeAdami2015 . Further work using the Hamiltonian model of cooperation with punishment may elucidate other aspects of the critical dynamics, in particular for games in higher dimensions, with more players per group, or even on irregular lattices.

4 Discussion

Evolutionary Game Theory is a mathematical framework that has been eminently successful at unraveling the numerous elements that impact decisions, and at working out the consequences of those decisions. While both mathematics and computational simulations have influenced this field (see for example the review Adamietal2016a , along with commentaries), the relationship between game theory and physics has been explored less. In real situations, decisions must be made under uncertainty, either due to unpredictable environments or due to inherent noise. For evolutionary dynamics in particular, noise is unavoidable. After all, high reproductive potential does not guarantee survival, but only biases future outcomes. A standard result of population genetics, for example, predicts that a gene that confers a ten percent advantage in reproductive rate only has a twenty percent chance of being represented in future generations. The branch of science best equipped to tackle the impact of chance on dynamics is physics, with a well-developed corpus of results in statistical mechanics and thermodynamics. A growing literature has found success in mining these well-established methods, from harnessing the Fokker-Planck equation to describe the effect of chance due to drift in small populations TraulsenHauert2009 to using tools from statistical mechanics to study the universality class of phase transitions in the spatial Prisoner’s Dilemma HauertSzabo2005 . Here, we tapped a different set of well-established tools from statistical physics, namely the thermodynamics of spin systems. The analogy between the critical dynamics of spin systems and game theory is not difficult to see. After all, the correspondence between Eigen and Schuster’s model for the evolution of macromolecules EigenSchuster79 and two-dimensional Ising models was pointed out over thirty years ago Leuthaeusser1987 (see also section 11.4 in Adami1998 ), but we have not, as yet, seen a concerted effort to marshal the considerable machinery developed to tackle low-dimensional condensed matter structures to aid in understanding evolutionary game theory.

It may seem odd, at first sight, that a thermodynamic approach to game theory is possible at all, given that thermodynamics relies on the assumption that the system tends towards equilibrium, whereas in many game-theoretic situations (in particular, those that are of the Rock-Paper-Scissors type) the system appears to be maintained out of equilibrium. Fortunately, it is possible to show that even in systems out of equilibrium, detailed balance can be assured as long as microscopic reversibility is guaranteed GrahamHaken1971 ; Risken1972 . While this result depends on the nature of the boundary condition (it holds under “normal” boundary conditions, that is, boundary conditions in which the probability distribution vanishes at the boundary), there are strong reasons to believe that at least in the limit of large systems and low mutation rates, detailed balance can always be achieved for these games. Investigating this issue more deeply is left for forthcoming work.

The Hamiltonian approach we described here leads to important new insights about the dynamics of evolving populations at fixed strength of selection (and thus, to some extent, fixed temperature). First, we have shown that the standard statistical approach, in which the energy of the entire ensemble is minimized, does not correspond to the evolutionary scenario, giving rise instead to a transition at $r=1$ . That result would imply that a dilemma is absent, and indeed this is precisely what we would expect if groups of organisms, rather than individuals, are selected. Second, the treatment of the Public Goods Game with punishment revealed that punishment plays the role that a local magnetic field plays when interacting with a system that can display spontaneous magnetization. An extensive literature in the area of spin-glasses of the Sherrington-Kirkpatrick type SherringtonKirkpatrick1975 suggests that local magnetic fields can give rise to spontaneous symmetry breaking, and that their mean-field solutions are similar to those of local spins interacting with a global magnetic field. These insights immediately suggest looking for effects such as hysteresis (as seen, for example, in HintzeAdami2015 ), but also for interactions between hysteresis and impurities. Indeed, it is not unreasonable to imagine that including a third player strategy such as “abstaining” Fowler2005 ; Brandtetal2006 ; Hauertetal2008 can be viewed as introducing impurities that can dramatically alter the critical dynamics we observe, for example by “pinning” the interfaces between domains. (Note that for this analogy to hold, the fraction of abstaining players, that is, the density of impurities, must be kept constant. In general, these impurities can move through the crystal, but were kept fixed in numerical simulations of this effect VainsteinArenzon2001 .) While the dynamics that includes abstaining may give rise to intransitive dominance dynamics Hauertetal2002 (such as the Rock-Paper-Scissors game), the arguments given above that out-of-equilibrium dynamics still gives rise to stationary equilibrium distributions as long as microscopic reversibility holds suggest that such games can also be solved using Hamiltonian dynamics.

We should caution, however, that extending the present results to games in higher dimensions will be difficult. For example, while the Ising model can be solved in two dimensions, there is no solution for the model in two dimensions with a magnetic field, as it is related to the three-dimensional model for which a closed-form solution does not exist. Nevertheless, we expect that the tools developed here will be useful because, if the analogy between evolutionary game dynamics and phase transitions in spin systems is established, other results from the rich literature of critical phenomena in spin systems may inform us about the dynamics of cooperation in groups. In particular, an extension of the calculation shown here to two dimensions may produce an exact solution along the lines of Onsager’s, which would allow us to move beyond pair-approximations for games on a 2D regular lattice. We hope that the simple results derived here (validated via computational simulation) can serve as a seed for the future development of this field.

Methods

The computational evolutionary model instantiates a population of 1,024 randomly initialized agents in a circular configuration. At each update, a single agent is randomly selected and its payoff is computed by playing its strategy against its left and right neighbors. At the same time, the payoff of a strategy that could potentially replace the agent's strategy is computed. In the case of two strategies (C and D), the alternative is simply the other strategy; in the case of four strategies (C, D, M, and I), one alternative strategy is chosen at random. Instead of the evolutionary updating of the population described in [HintzeAdami2015; Adamietal2016a], here the likelihood of replacing the strategy of the selected agent with the alternative is given by Eq. (24). In each replicate run, we updated strategies 2 million times (roughly 2,000 updates per site), then calculated the order parameter. The code as well as the analysis scripts to create all figures can be found at: github reference will be provided upon acceptance of the manuscript.
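The update loop described above can be sketched in a few lines of Python. This is a minimal illustration for the two-strategy case only, not the authors' code. Several pieces are assumptions made for the sketch: the function names, the payoff form (each cooperator pays a cost of 1, the pot is multiplied by $r$ and shared equally among the three players), the inverse temperature `beta`, and above all the replacement rule, where a Fermi (logistic) function stands in for Eq. (24), which is not reproduced in this section.

```python
import math
import random


def pgg_payoff(own, left, right, r, cost=1.0):
    """Payoff of the focal player in one three-player Public Goods round.

    Strategies are 1 (cooperate) or 0 (defect). ASSUMED payoff form:
    every cooperator pays `cost`, the pot is multiplied by r and split
    three ways.
    """
    n_coop = own + left + right
    return r * cost * n_coop / 3.0 - cost * own


def replacement_probability(pay_current, pay_alternative, beta):
    """Fermi (logistic) rule, a HYPOTHETICAL stand-in for Eq. (24)."""
    x = beta * (pay_alternative - pay_current)
    # Numerically stable logistic function (avoids overflow in exp).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate(n_sites=1024, n_updates=2_000_000, r=2.0, beta=1.0, seed=0):
    """Single-site updates on a ring; returns the final cooperator fraction."""
    rng = random.Random(seed)
    # Random initial strategies on a circular arrangement.
    pop = [rng.randint(0, 1) for _ in range(n_sites)]
    for _ in range(n_updates):
        i = rng.randrange(n_sites)
        left = pop[i - 1]               # index -1 wraps around the ring
        right = pop[(i + 1) % n_sites]
        current = pop[i]
        alternative = 1 - current       # two strategies: the other one
        p = replacement_probability(
            pgg_payoff(current, left, right, r),
            pgg_payoff(alternative, left, right, r),
            beta,
        )
        if rng.random() < p:
            pop[i] = alternative
    return sum(pop) / n_sites


if __name__ == "__main__":
    # Small run for illustration; the defaults mirror the population size
    # and update count quoted in the text.
    print(simulate(n_sites=64, n_updates=20_000, r=2.0, beta=1.0))
```

With this assumed payoff, the gain from switching does not depend on the neighbors' strategies, so the sketch only illustrates the mechanics of the update (random site selection, payoff comparison, probabilistic replacement, measurement of the cooperator fraction) rather than the interaction structure analyzed in the paper. The four-strategy variant would draw `alternative` at random from the remaining strategies.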

Acknowledgements

We thank Nathaniel Pasmanter for collaboration in the early stages of this work, as well as Claus Wilke for discussions. This work was supported in part by NSF’s BEACON Center for the Study of Evolution in Action, under Contract No. DBI-0939454. We wish to acknowledge the support of the Michigan State University High Performance Computing Center and the Institute for Cyber-Enabled Research.

Bibliography

(1) J. Maynard Smith. Evolution and the Theory of Games. Cambridge University Press, Cambridge, UK, 1982.
(2) R. Axelrod. The Evolution of Cooperation. Basic Books, New York, NY, 1984.
(3) J. Hofbauer and K. Sigmund. Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge, UK, 1998.
(4) M. Nowak. Evolutionary Dynamics. Harvard University Press, Cambridge, MA, 2006.
(5) C. Adami, J. Schossau, and A. Hintze. Evolutionary game theory using agent-based methods. Phys Life Rev, 19:1–26, 2016.
(6) B. Allen, G. Lippner, Y.-T. Chen, B. Fotouhi, N. Momeni, S.-T. Yau, and M. A. Nowak. Evolutionary dynamics on any population structure. Nature, 544:227–230, 2017.
(7) G. Szabó and G. Fáth. Evolutionary games on graphs. Phys Rep, 446:97–216, 2007.
(8) A. Szolnoki, M. Perc, and G. Szabó. Phase diagrams for three-strategy evolutionary prisoner’s dilemma games on regular graphs. Phys Rev E, 80:056104, 2009.