Persistence of discrimination: revisiting Axtell, Epstein and Young
Gérard Weisbuch

TL;DR
This paper revisits a model of social class emergence, using advanced cognitive and statistical physics methods, and finds that discrimination biases tend to reinforce and persist over time rather than leading to class formation.
Contribution
It introduces a more detailed cognitive framework into the model and reinterprets previous results, emphasizing the stability of discrimination biases.
Findings
Discrimination biases are reinforced and stable over time.
The model predicts long-term persistence of biases rather than class emergence.
Reformulation leads to different social interpretations of the phenomena.
Abstract
We reformulate an earlier model of the "Emergence of classes..." proposed by Axtell et al. using more elaborate cognitive processes allowing a statistical physics approach. The thorough analysis of the phase space and of the basins of attraction leads to a reconsideration of the previous social interpretations: our model predicts the reinforcement of discrimination biases and their long-term stability rather than the emergence of classes.
Ecole normale superieure, 24, rue Lhomond, Paris, France
Laboratoire de physique statistique, Département de physique de l’ENS,
École normale supérieure, PSL Research University, Université Paris Diderot,
Sorbonne Paris Cité, Sorbonne Universités, UPMC Univ. Paris
06, CNRS, 75005 Paris, France
email: [email protected]
Keywords: Socio-Physics; Social Cognition; Discrimination; Dynamics; Attraction basins.
1 Introduction
During the 1990s, social scientists introduced several thought-provoking models of social phenomena, most often using numerical (multi-agent) simulations. These models were later extended with methods and concepts derived from statistical physics, such as master equations and the mean field approximation. Examples include voter models and imitation processes (Nowak et al., 1990; see the review of Castellano et al., 2009), El Farol and the minority game (Arthur, 1994; Challet et al., 2013), and the dissemination of culture (Axelrod, 1997; Castellano et al., 2000). Revisiting these models provided deeper insight, more precise results and sometimes even corrections.
The emergence and persistence of classes and discrimination have received a lot of attention from social scientists, ethnographers and economists; see e.g. Bowles & Naidu (2006) and references therein. A very inspiring model entitled "Emergence of Classes in a Multi-Agent Bargaining Model" was proposed by Axtell, Epstein and Young (Axtell et al., 2001). We here revisit their approach using a more elaborate model of agent cognition and compare a mean field approach with our agent-based simulation results.
2 The models
2.1 The original model of Axtell, Epstein and Young
Let us briefly recall the original hypotheses and the main results of Axtell, Epstein and Young (Axtell et al., 2001).
- Framework: pairs of agents play a bargaining game introduced by Nash Jr (1950) and Young (1993). During each session of the game, each agent, independently of his opponent, makes one of three demands: L(ow), 30% of a pie; M(edium), 50%; or H(igh), 70%. At the end of the session, the two agents get what they demanded when the sum of both demands does not exceed 100% of the pie; otherwise they get nothing. The corresponding payoff matrix is given in table (1). At each step a random pair of agents is selected to play the bargaining game. The iterated game is played for a large number of sessions, much larger than the total number of agents, so that agents can learn from their experience how to improve their performance.
- Learning and memory: agents keep records of the previous demands of their opponents, e.g. for 10 previous moves.
- Choosing the next move: at each time step, pairs of agents are randomly selected to play the bargaining game. They most often choose the move that optimises their expected payoff, using the memory of previous encounters as a surrogate for the actual probability distribution of their opponent's next moves. With a small probability $\epsilon$, e.g. 0.1, they choose randomly among L, M, H.
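The rules recalled above can be sketched in a few lines of Python. This is a minimal illustration, not the code of Axtell et al.: the names `payoff` and `best_reply` are ours, demands are stored as integer percentages, and ties between equally good moves are broken by dictionary order, which is an assumption.

```python
import random
from collections import deque

# Demands in percent of the pie (integers avoid floating-point comparisons).
DEMANDS = {"L": 30, "M": 50, "H": 70}

def payoff(mine, theirs):
    """Share of the pie won by the agent demanding `mine` against `theirs`."""
    # Both agents are served only if the two demands fit within the whole pie.
    if DEMANDS[mine] + DEMANDS[theirs] <= 100:
        return DEMANDS[mine] / 100
    return 0.0

def best_reply(memory, epsilon=0.1, rng=random):
    """Noisy best reply to the remembered demands of past opponents."""
    # With probability epsilon (or with an empty memory) choose at random.
    if not memory or rng.random() < epsilon:
        return rng.choice(sorted(DEMANDS))
    # Otherwise use the memory as a surrogate for the opponent's distribution.
    expected = {
        mine: sum(payoff(mine, theirs) for theirs in memory) / len(memory)
        for mine in DEMANDS
    }
    return max(expected, key=expected.get)

# An agent who only ever met low demands is drawn towards the high demand.
memory = deque(["L"] * 10, maxlen=10)
print(best_reply(memory, epsilon=0.0))
```

With `maxlen=10` the deque forgets the oldest encounter abruptly, which is the feature the moving-average model below replaces by a gradual discount.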
The main results obtained by Axtell et al. (2001) from numerical simulations are:
- They observe different transient configurations which they interpret as "norms", e.g. the equity norm is observed when all agents play M. Because of the constant probability of random noise, the system never stabilises on an attractor, even in the sense of Statistical Physics. The duration of the transients increases exponentially with the memory size and .
- Their most fascinating result is obtained when agents are divided into two populations with arbitrary tags, say one red and one blue. When agents take tags into account for playing and memorising games (in other words, when agents play two separate games, an intra-game against agents with the same tag and an inter-game against agents with a different tag), one observes configurations in the inter-game such that one population always plays H while the other population plays L; they interpret such an inequity norm as the emergence of classes, the H-playing population being the upper class.
Equivalent results are obtained when agents are connected via a social network, as observed by Poza et al. (2011) on a square lattice, as opposed to the full connection structure used by Axtell et al. (2001). In some instances, domains with different norms occupy different parts of the lattice. Otherwise, one single domain of agents playing the same norm covers the entire lattice, depending upon the initial conditions.
The rest of the paper is organised as follows. We first present our own model (section 2.2). A mean field approximation then gives a simple description of the attractors of the dynamics and of the different dynamical regimes (section 3). These results are compared with those obtained by direct agent-based simulations (section 4), including a thorough survey of the attraction basins. We then analyse the version with two tagged populations (section 5). The discussion compares our results to those of previous models and to magnetic systems. A short conclusion stresses the difference in interpretation of the models in terms of social phenomena (section 6).
2.2 The moving average and Boltzmann choice cognitive model
We start from the same bargaining game as Axtell et al. (2001), with the payoff matrix given in table (1), but use a different coding of past experience (a moving average of past profits) and a different choice function (the Boltzmann function).
The present model is derived from standard models of reinforcement learning in cognitive science, see for instance Weisbuch et al. (2000).
Rather than memorising a full sequence of previous games, agents update 3 "preference coefficients" $J_i$, one for each possible move $i$, based on a moving average of the profits they made in the past when playing $i$. $J_1$ is the preference coefficient for playing H, $J_2$ for M and $J_3$ for L. The updating process following time interval $\tau$ after a transaction is:
$$J_i(t+\tau) = (1-\gamma)\, J_i(t) + \pi_i(t), \qquad i = 1, 2, 3 \qquad (1)$$
The decrease term in $\gamma$ corresponds to discounting the importance of past transactions, which makes sense in an environment varying with the choices of the other players. $\pi_i(t)$ is the actual profit made during the chosen transaction $i$; the 2 other $J$ corresponding to the 2 other choices are simply decreased.
These preference coefficients are then used to choose the next move in the bargaining game. Agents face an exploitation/exploration dilemma: they can exploit the information they gathered earlier by choosing the move with the highest preference coefficient, or check possible evolutions of profits by randomly trying other moves. Rather than using a constant rate of random exploration as in Axtell et al. (2001), the probability $P_i$ of choosing demand $i$ is based on the logit function:
$$P_i = \frac{\exp(\beta J_i)}{\sum_{j} \exp(\beta J_j)} \qquad (2)$$
where $\beta$, the discrimination rate, measures the non-linearity of the relationship between the probability $P_i$ and the preference coefficient $J_i$. Large values of $\beta$ result in always playing the choice with the largest $J_i$, small values in indifference among the three choices. Economists use the name logit for the Boltzmann distribution. We have shown earlier (Nadal et al., 1998) that the Boltzmann distribution can be derived by maximising a linear combination of expected profits and information gained through exploration; see Bouchaud (2013) for a thorough discussion.
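The two cognitive ingredients, the discounted update and the Boltzmann (logit) choice, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are ours, moves are indexed 0, 1, 2 for H, M, L, and the update adds the realised profit to the played move only while discounting all three coefficients by a factor `1 - gamma`.

```python
import math
import random

def boltzmann_probs(J, beta):
    """Boltzmann (logit) probabilities of the three moves given preferences J."""
    # Subtracting the maximum avoids overflow without changing the ratios.
    top = max(J)
    weights = [math.exp(beta * (j - top)) for j in J]
    z = sum(weights)
    return [w / z for w in weights]

def choose_move(J, beta, rng=random):
    """Draw a move index (0 = H, 1 = M, 2 = L) with Boltzmann probabilities."""
    return rng.choices(range(3), weights=boltzmann_probs(J, beta))[0]

def update_preferences(J, played, profit, gamma):
    """Discount every coefficient and credit the profit to the played move."""
    return [
        (1 - gamma) * j + (profit if i == played else 0.0)
        for i, j in enumerate(J)
    ]

# Repeatedly earning 0.5 with move M drives J[1] towards 0.5 / gamma.
J = [1.0, 1.0, 1.0]
for _ in range(500):
    J = update_preferences(J, played=1, profit=0.5, gamma=0.1)
print(J)
```

Under this update a move that keeps earning a constant profit settles at that profit divided by `gamma`, while moves that are never played decay towards zero.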
Comparing our model with the one proposed by Axtell et al. (2001):
- The moving average corresponds to a gradual rather than abrupt decrease of previous memories; it is based on the agent's own experience in terms of profit rather than on the observation of her opponents' moves, and it uses less memory.
- Like the constant-probability noise introduced in Axtell et al. (2001), Boltzmann choice has a random character, but the choice also depends upon the differences in experienced profits; we might expect agents to be less hesitant when their previous experience resulted in very different preference coefficients.
3 The mean field approximation
3.1 Derivation of the mean field approximation
The difference equation (1) can be changed to a differential equation in the limit of slow dynamics:
$$\frac{dJ_i}{dt} = -\gamma J_i + \pi_i \qquad (3)$$
where the time unit is the average time between the agent's bargaining processes. $\pi_i$ is the profit made by the agent if he chose demand $i$.
The Mean Field approximation consists in replacing $\pi_i$ by its expected value $\langle \pi_i \rangle$, thereby transforming the stochastic differential equation into a deterministic differential equation.
The time evolution of $J_i$ is thus approximated by the following set of equations:
$$\frac{dJ_i}{dt} = -\gamma J_i + \langle \pi_i \rangle \qquad (4)$$
where $\langle \pi_i \rangle$ is given by:
$$\langle \pi_i \rangle = P_i \sum_{j} P_j \, \pi_{ij} \qquad (5)$$
$i$ is the agent's move, $j$ runs over the 3 possible moves of her opponent, $\pi_{ij}$ is the profit of move $i$ against move $j$, and the values (0, 0.3, 0.5, 0.7) are the coefficients of the pay-off matrix. The mean field approximation neglects fluctuations among agents' representations, their $J_i$. Hence the agent evaluates the probability of her opponent's moves according to her own estimations, using Boltzmann functions of her own $J_i$.
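Using statistical physics notation:
$$Z = \sum_{j=1}^{3} \exp(\beta J_j) \qquad (6)$$
the internal representation of the agent is thus the vector $(J_1, J_2, J_3)$, whose components obey the dynamics:
$$\frac{dJ_1}{dt} = -\gamma J_1 + 0.7\, \frac{\exp(\beta J_1)\exp(\beta J_3)}{Z^2} \qquad (7)$$
$$\frac{dJ_2}{dt} = -\gamma J_2 + 0.5\, \frac{\exp(\beta J_2)\left(\exp(\beta J_2) + \exp(\beta J_3)\right)}{Z^2} \qquad (8)$$
$$\frac{dJ_3}{dt} = -\gamma J_3 + 0.3\, \frac{\exp(\beta J_3)}{Z} \qquad (9)$$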
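The mean field dynamics can be explored numerically with a short script. The sketch below rests on assumptions that should be checked against the original equations: the expected profit of move $i$ is taken as $P_i \sum_j P_j \pi_{ij}$ (a move earns only when it is actually played), the payoff matrix is read off the rules of the game, and a plain explicit Euler scheme stands in for the Rosenbrock integrator used in the paper. The names `mean_field_rhs` and `integrate` are ours.

```python
import math

# PAYOFF[i][j]: profit of move i against move j, with 0 = H, 1 = M, 2 = L.
PAYOFF = [
    [0.0, 0.0, 0.7],  # H is served only against L
    [0.0, 0.5, 0.5],  # M is served against M and L
    [0.3, 0.3, 0.3],  # L is always served
]

def boltzmann_probs(J, beta):
    """Boltzmann probabilities of the three moves."""
    top = max(J)
    weights = [math.exp(beta * (j - top)) for j in J]
    z = sum(weights)
    return [w / z for w in weights]

def mean_field_rhs(J, beta, gamma):
    """dJ_i/dt = -gamma * J_i + P_i * sum_j P_j * PAYOFF[i][j] (assumed form)."""
    P = boltzmann_probs(J, beta)
    return [
        -gamma * J[i] + P[i] * sum(P[j] * PAYOFF[i][j] for j in range(3))
        for i in range(3)
    ]

def integrate(J0, beta, gamma, dt=0.05, steps=20000):
    """Explicit Euler integration of the mean field equations."""
    J = list(J0)
    for _ in range(steps):
        dJ = mean_field_rhs(J, beta, gamma)
        J = [j + dt * d for j, d in zip(J, dJ)]
    return J

# Two initial conditions, biased towards M and towards L respectively.
print(integrate((0.0, 4.0, 0.0), beta=2.0, gamma=0.1))
print(integrate((0.0, 0.0, 4.0), beta=2.0, gamma=0.1))
```

With these illustrative parameter values (`beta / gamma = 20`), the first run settles near $J_2 = 0.5/\gamma$ and the second near $J_3 = 0.3/\gamma$, the levels expected when a single move is played.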
Taking the exponentials as new variables simplifies expressions (7-9) and allows to deduce scaling properties. Let:
[TABLE]
[TABLE]
[TABLE]
The new equations are:
[TABLE]
[TABLE]
[TABLE]
with .
Expressions (13-15) show that a single parameter, $\beta/\gamma$, determines equilibrium conditions, an improvement on Axtell et al. (2001), who needed two parameters, the noise probability $\epsilon$ and the memory size. Phase transition diagrams will then be drawn varying $\beta$ while keeping $\gamma$ constant. $\gamma$ plays the role of a kinetic coefficient, setting the characteristic time towards equilibrium. The magnitude of the $J$ coefficients at equilibrium scales as $1/\gamma$.
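The scaling argument can be checked with a short calculation. This is a sketch under two assumptions: the new variables are written $x_i = \exp(\beta J_i)$ (our notation), and the expected profit $\langle \pi_i \rangle$ depends on the preference coefficients only through the Boltzmann probabilities $P_i = x_i / \sum_j x_j$.

```latex
\begin{aligned}
\frac{dx_i}{dt} &= \beta x_i \frac{dJ_i}{dt}
  = x_i \left( -\gamma \ln x_i + \beta \langle \pi_i \rangle \right), \\
\frac{dx_i}{d(\gamma t)} &= x_i \left( -\ln x_i + \frac{\beta}{\gamma} \langle \pi_i \rangle \right), \\
J_i^{*} &= \frac{\ln x_i^{*}}{\beta}
  = \frac{1}{\gamma} \, \frac{\ln x_i^{*}}{\beta / \gamma}.
\end{aligned}
```

In the rescaled time $\gamma t$ only the ratio $\beta/\gamma$ appears, so the fixed points $x_i^{*}$ depend on that ratio alone, and at fixed $\beta/\gamma$ the equilibrium values $J_i^{*}$ are proportional to $1/\gamma$.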
3.2 Mean field analysis: Attractors and transitions
The state of the system is described by the set of the preference coefficients $J_1$, $J_2$ and $J_3$ of the agents, i.e. their estimated profit divided by $\gamma$ for the three possible moves, resp. H, M and L. This is an improvement with respect to Axtell et al. (2001), whose phase space dimension was three times the memory size. Our analysis can then proceed using the more powerful methods of dynamical systems and statistical mechanics rather than the Markovian formulation proposed in Axtell et al. (2001).
Trajectories in the $J$ phase space are obtained by solving the mean field equations (4-5) using a Rosenbrock integrator (GRIND et al., 2017). Grids of trajectories help to identify attractors and attraction basins. Since we cannot draw sets of 3D trajectories, we display in figure (1) their projections onto the three coordinate planes of $J$ space for a given choice of parameters. The trajectories start at regular intervals in the projection plane, with the same third coordinate.
Three attractors can be observed: one with large $J_2$, when move M is the preferred choice of all agents, to be called the M attractor; one with large $J_3$, when move L is the preferred choice of all agents, to be called the L attractor; and one with lower values of $J_1$ and $J_3$, to be called the HL attractor.
The Mean Field analysis readily tells us that two of the attractors are such that all agents always play the same strategy either M or L.
The dependence of the $J$'s upon the reduced parameter $\beta/\gamma$ is displayed on the continuation plot of figure 2. We clearly identify the same 3 attractors in the ordered regime above , and only one attractor left with move M as the preferred choice for lower values . The bifurcation is observed around . A steep, but not abrupt, transition further occurs when to a disordered regime such that agents do not display strong preferences for any choice.
4 Agent-based simulations
4.1 Average analysis
Let us now compare the above results with those directly obtained by agent-based simulations. At each time step a pair of agents is randomly chosen. They play the bargaining game, each choosing a move with the probability given by equation (2) using their own specific $J_i$ (not an average as in the mean field approximation), which they update after the session. The procedure is then repeated with a new random pair.
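The simulation loop just described can be sketched as follows. This is a minimal illustration, not the program used for the figures: the function name `simulate`, the default parameter values and the seeding are ours, and an agent's coefficients are assumed to be updated only in the sessions in which that agent plays.

```python
import math
import random

# PAYOFF[i][j]: profit of move i against move j, with 0 = H, 1 = M, 2 = L.
PAYOFF = [[0.0, 0.0, 0.7], [0.0, 0.5, 0.5], [0.3, 0.3, 0.3]]

def boltzmann_probs(J, beta):
    top = max(J)
    weights = [math.exp(beta * (j - top)) for j in J]
    z = sum(weights)
    return [w / z for w in weights]

def simulate(n_agents=30, beta=2.0, gamma=0.1, steps=10000,
             center=(2.0, 2.0, 2.0), width=2.0, seed=0):
    """Agent-based bargaining with moving-average memory and Boltzmann choice."""
    rng = random.Random(seed)
    # Initial preferences: uniform distribution of given width around center.
    J = [[c + width * (rng.random() - 0.5) for c in center]
         for _ in range(n_agents)]
    for _ in range(steps):
        a, b = rng.sample(range(n_agents), 2)  # random pair of distinct agents
        move_a = rng.choices(range(3), weights=boltzmann_probs(J[a], beta))[0]
        move_b = rng.choices(range(3), weights=boltzmann_probs(J[b], beta))[0]
        for agent, mine, theirs in ((a, move_a, move_b), (b, move_b, move_a)):
            profit = PAYOFF[mine][theirs]
            # Discount all coefficients, credit the profit to the played move.
            J[agent] = [(1 - gamma) * j + (profit if i == mine else 0.0)
                        for i, j in enumerate(J[agent])]
    return J

final = simulate()
# Population averages of J_1 (H), J_2 (M), J_3 (L) at the end of the run.
print([sum(agent[i] for agent in final) / len(final) for i in range(3)])
```

Averaging the final coefficients over the population, as in the last line, gives one point of a phase diagram; repeating the run while scanning `beta` would trace a branch.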
We here report 4 types of results:
- Figure 3: the phase transition diagram (to be compared with figure 2).
- Figure 4: individual trajectories in the simplex.
- Figure 5: the distribution of individual $J$'s on 4 attractors.
- Figure 6: a sketch of the attraction basins.
We first monitor the different $J_i$ averaged over the whole population at equilibrium when $\beta$ is scanned downward and upward between 0 and 2 (figure 3). For the decreasing branches, we start at $\beta = 2$ from initial distributions of $J$ close to one of the attractors for each branch and carry on the integration until the attractor is reached. The branches are continued when $\beta$ is lowered, taking as initial conditions the previous values of $J$ on the attractor. The equivalent method is applied when $\beta$ is increased from 0 for the increasing branches. For the sake of clarity, attractor HL and attractor L are represented resp. on the upper and lower plots.
Only attractor M can be reached when is increased from the disordered attractor. When , the path is reversible, but a hysteresis cycle is observed when crosses the bifurcation.
In the ordered region, the transition from attractor HL to attractor M is direct. By contrast, one first observes a continuous transition from L to HL around above the sharp transition at the bifurcation.
Some attractor levels can be readily obtained from equilibrium conditions of equations (10-12). When only one move $i$ is chosen by the agents, the fraction involving exponentials equals 1 and the value of $J_i$ is given by:
$$J_i = \frac{\pi_i}{\gamma}$$
in accordance with simulation results: on attractor M, $J_2$ reaches $0.5/\gamma$ and on attractor L, $J_3$ reaches $0.3/\gamma$. In the case of the disordered attractor for low $\beta$ values, the exponentials are close to one and the $J_i$ are directly computed from equations (10-12).
Phase diagrams of the mean field approximation and of the agent-based simulations look very similar, with the same attractors; the main difference is the dependence of $J$ upon $\beta$ for the HL attractor observed in the Mean Field Approximation.
4.2 Individual positions
The previous results concerned global features. Let us now examine individual agent choices. We use a simplex representation as in Axtell et al. (2001). At any time step, the preference coefficients $J$ of an agent are displayed on the simplex by a point whose position corresponds to the center of gravity of masses proportional to $J_1$, $J_2$, $J_3$ placed at vertices H, M, L. For instance, an agent positioned close to the center of the simplex is indifferent between choices H, M and L, while any agent close to one of the vertices has strong preferences and mostly plays according to that vertex.
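The center-of-gravity construction amounts to barycentric coordinates. A minimal sketch follows; the placement of the vertices (H on the left, M on the right, L at the top of an equilateral triangle of unit side) and the name `simplex_point` are assumptions made for illustration.

```python
import math

# Assumed vertex positions of the simplex in the plane.
VERTICES = {
    "H": (0.0, 0.0),
    "M": (1.0, 0.0),
    "L": (0.5, math.sqrt(3) / 2),
}

def simplex_point(J):
    """Center of gravity of masses J = (J1, J2, J3) placed at H, M, L."""
    total = sum(J)
    if total <= 0:
        raise ValueError("preference coefficients must have a positive sum")
    x = sum(j * VERTICES[v][0] for j, v in zip(J, "HML")) / total
    y = sum(j * VERTICES[v][1] for j, v in zip(J, "HML")) / total
    return (x, y)

print(simplex_point((1.0, 1.0, 1.0)))  # indifferent agent: center of the simplex
print(simplex_point((5.0, 0.0, 0.0)))  # agent committed to H: vertex H
```

Only the ratios between the three coefficients matter for the position, so agents with very different overall levels of $J$ can share the same point.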
Figure (4) represents a typical set of 30 agents trajectories in the simplex for and during 10000 integration steps (each agent has been sampled 666 times on average). We started from a uniform distribution of initial ’s of width 2 centered around (2.0, 2.0, 2.0). Their positions are indicated by a small square. Trajectories are distinctively coloured. After a few initial wanderings, they diverge in the direction of the closest vertex. They remain fixed in the case of L and M vertices. Some might fluctuate around vertex H because of possible encounters with high demanding opponents which result in the decrease of .
Each of the 16 simplices on figure (5) is a snapshot of agents’ s after a given integration time for a given value of and for .
Each red point represents the set of preference coefficients of a single agent. Each row of simplices displays the evolution of agents' preferences at increasing iteration times towards one of the 4 asymptotic configurations for a given value of .
The initial conditions were chosen to favour the attractor to be displayed. We used uniform distributions of width 1.0 around (0.9, 0.9, 0.9) for the disordered attractor D, around (1.0, 0.6, 1.0) for the M attractor, around (1.0, 0.6, 3.0) for the HL attractor and around (0.6, 0.6, 6.5) for the L attractor.
In agreement with previous observations, we see on the first row of figure (5) that for low $\beta$ values the agents' positions remain dispersed inside the simplex even for long iteration times, which corresponds to a disordered phase.
By contrast when is increased, agents in the ordered phase gather towards one or two vertices, even after much smaller iteration times. We have chosen intermediate values of to avoid the accumulation of representative points on simplex vertices which would be observed at larger values, e.g. .
A physical interpretation of the above results would be a comparison with a condensed phase with thermal excitations above the ground state. When $\beta$ further increases, the equivalent of a temperature decrease in physical systems, agents' preferences condense exactly on the vertices (see figure (6) below), a property which helps us to check the basins of attraction.
4.3 Basins of attraction
The next question concerns the extension of the basins of attraction of the different attractors. In fact, a systematic search for displays many more attractors than expected from our preliminary scans.
Figuring basins of attraction in a 3D phase space is not obvious and we once again use a vertex representation. The data are obtained by a triple scan of initial conditions. Each initial condition is a uniform distribution of J’s values of width 1.0 around a given center. For instance, an initial distribution centered on the center of gravity of the simplex is randomly drawn in the cube: . The upper circle of figure 6 corresponds to the initial distribution: etc.
Figure (6) describes data gathered by 3 nested loops across initial $J$ distributions, whose centers are varied from 0.4 to 6.4 by a factor 2. The positions of the centers of the circles in the simplex code the initial distribution centers. We used the condensation of agents' preferences on the 3 vertices to display final distributions as pie charts. Red sectors represent the percentage of agents with choice H, blue sectors the percentage of agents with choice M, and green sectors the percentage of agents with choice L. Rare and narrow white sectors represent the percentage of agents inside the simplex. We checked that they are located in the immediate neighbourhood of vertices.
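The triple scan and the pie-chart summary can be sketched as below. This is an illustration only: the names `scan_centers` and `pie_fractions` are ours, and the rule used to decide that an agent sits "on" a vertex (its largest normalised coefficient exceeds a threshold of 0.9) is an assumption, not a value taken from the paper.

```python
from itertools import product

# Centers of the initial distributions: 0.4 to 6.4 by factors of 2.
CENTERS = [0.4 * 2 ** k for k in range(5)]

def scan_centers():
    """All (J1, J2, J3) centers visited by the 3 nested loops."""
    return list(product(CENTERS, repeat=3))

def pie_fractions(J_list, threshold=0.9):
    """Fractions of agents condensed on H, M, L, or left inside the simplex."""
    counts = {"H": 0, "M": 0, "L": 0, "inside": 0}
    for J in J_list:
        total = sum(J)
        shares = [j / total for j in J] if total > 0 else [1 / 3] * 3
        k = max(range(3), key=shares.__getitem__)
        # An agent counts for a vertex only if it lies close enough to it.
        label = "HML"[k] if shares[k] >= threshold else "inside"
        counts[label] += 1
    return {label: n / len(J_list) for label, n in counts.items()}

print(len(scan_centers()))
print(pie_fractions([[5, 0, 0], [0, 5, 0], [0, 5, 0], [1, 1, 1]]))
```

Each center returned by `scan_centers` would seed one simulation, and `pie_fractions` applied to its final state would give the sectors of the corresponding pie chart.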
Several important conclusions about ordered phases for large $\beta$ can be drawn:
- Mixed strategies inside the simplex are unstable.
- The attractors are distributions of agents on the vertices or very close to them. Agents have strong opinions about the value of their choice and do not change them frequently.
- $J$ space is paved with basins of attraction surrounding the attractors. The dynamics collapse choices to the nearest vertex, with the exception of vertex H.
- No distribution consists of only H preferences, which would give no gain to the agents. When the initial conditions are close to vertex H, the attractors can only be mixed distributions, with very few agents playing M.
- Axtell et al. (2001) and Poza et al. (2011) report the existence of a "fractious attractor" such that agents oscillate between choices H and L for a long transient. We never observed such a "fractious attractor", even after a specific search, and we suspect that it was due to their choice of a constant noise term.
5 Tagged populations
The most striking result in Axtell et al. (2001) is the existence of an inequity norm sustainable among two a priori equivalent tagged groups. Their inequity norm corresponds in our settings to an attractor such that all members of one population with tag T1 play H against any member of the other tagged population (T2) who always plays L against them.
To investigate the basins of attractions for the two tagged populations we proceed with the same scan of initial conditions of population with tag T2 as above, but maintain the same initial conditions for population with tag T1 around the center of the simplex:
.
Figure 7 displays the attractors of the inter-population dynamics, the simplex of T1 above the simplex of T2, with the same conventions as in figure 5, except that the positions on both simplices correspond to the scan of initial conditions of population T2. The colour codes are the same as for figure 6 and reflect the attractors of each population.
The pie charts close to the left vertices of the simplices, for example, are coloured green for T1 and red for T2; the attractors of the dynamics are then L for T1 and H for T2. The pie charts close to the top vertices correspond to a stable mixture of the 3 possible moves H, M, L for T1 and to pure L for T2. The pie charts close to the right vertices correspond to a stable mixture of the 2 possible moves M, L for T1 and to pure M for T2.
Not surprisingly, the attractors reflect the initial conditions of population T2 since the initial conditions of T1 were kind of neutral.
The main difference with the no-tag simulation is the appearance of a pure strategy attractor such that all agents with tag T2 play H against agents with tag T1, who play L. This asymmetrical attractor parallels the findings of Axtell et al. (2001), who refer to a "discriminatory norm". But our analysis characterises an attractor reached from initial conditions that were already biased towards inequity with larger values of . And this configuration is an attractor of the dynamics, not a transient.
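The inter-population game can be sketched by letting each session pair one T1 agent with one T2 agent. This is a minimal illustration restricted to the inter-game: the function name `simulate_inter_game`, the population size and the parameter values are ours, and the intra-games within each population are left out.

```python
import math
import random

# PAYOFF[i][j]: profit of move i against move j, with 0 = H, 1 = M, 2 = L.
PAYOFF = [[0.0, 0.0, 0.7], [0.0, 0.5, 0.5], [0.3, 0.3, 0.3]]

def boltzmann_probs(J, beta):
    top = max(J)
    weights = [math.exp(beta * (j - top)) for j in J]
    z = sum(weights)
    return [w / z for w in weights]

def updated(J, played, profit, gamma):
    return [(1 - gamma) * j + (profit if i == played else 0.0)
            for i, j in enumerate(J)]

def simulate_inter_game(center_t1, center_t2, n_per_tag=20, beta=2.0,
                        gamma=0.1, steps=20000, width=1.0, seed=0):
    """Inter-game only: every session opposes a T1 agent to a T2 agent."""
    rng = random.Random(seed)

    def init(center):
        return [[max(0.0, c + width * (rng.random() - 0.5)) for c in center]
                for _ in range(n_per_tag)]

    J = {"T1": init(center_t1), "T2": init(center_t2)}
    for _ in range(steps):
        a, b = rng.randrange(n_per_tag), rng.randrange(n_per_tag)
        move_a = rng.choices(range(3), weights=boltzmann_probs(J["T1"][a], beta))[0]
        move_b = rng.choices(range(3), weights=boltzmann_probs(J["T2"][b], beta))[0]
        J["T1"][a] = updated(J["T1"][a], move_a, PAYOFF[move_a][move_b], gamma)
        J["T2"][b] = updated(J["T2"][b], move_b, PAYOFF[move_b][move_a], gamma)
    return J

# Initial conditions already biased towards inequity: T1 near L, T2 near H.
result = simulate_inter_game((0.0, 0.0, 3.0), (7.0, 0.0, 0.0), width=0.0)
print(result["T1"][0], result["T2"][0])
```

Started from such biased preferences, the H/L configuration is expected to persist for the whole run, which is the kind of stability discussed above; neutral initial conditions for both populations can be tried by passing the same center twice.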
6 Discussion and Conclusions
The reformulation of the iterated bargaining game of Axtell et al. (2001) using more elaborate cognitive processes, namely taking moving averages of past gains and choosing next moves according to Boltzmann probabilities, allows a more precise description of the dynamics in terms of attractors, regime transitions and basins of attraction. The number of parameters was reduced from two to one. Several transitions are observed between a disordered state and several stable ordered configurations when $\beta$ increases. Because we use the Boltzmann choice function, agents end up using mostly pure strategies for larger values of $\beta$. We never observed any "fractious state" such that agents remain in the interior of the simplex, randomly changing their choice between H and L, as reported in Axtell et al. (2001) and Poza et al. (2011). Our guess is that such behaviour is due to their hypothesis of a random choice with a constant non-zero probability.
As discussed earlier in sections 4.3 and 5, J space is paved with basins of attraction surrounding the attractors and dynamics collapse preference coefficients to the nearest vertex. Unbiased random initial conditions never generate H/L attractors.
Hence our interpretation in terms of social phenomena: game interactions and cognitive processes can increase and stabilise discrimination and inequality among tagged populations, even when tags were a priori neutral. On the other hand, inequality attractors never "emerge" spontaneously from random unbiased initial conditions. (Axtell et al. (2001) specify in a footnote that they "use the term 'emergent' … to mean simply 'arising from the local interactions of agents'". But the word Emergence in the title of their paper evokes the idea of the emergence of classes in a previously egalitarian society.)
The changes we introduced to the model of Axtell et al. (2001) also make it possible to identify which properties can be considered generic, that is to say independent of the details of each model, and which are specific to the exact formulation of the model.
The two versions of the iterated bargaining game, (Axtell et al., 2001) and the present paper, agree that unfair social institutions such as classes and discrimination can result as the downside of a rational cognitive practice, namely memorising or coding previous events to take present decisions. And furthermore, that taking into account a priori irrelevant tags can lead to a dissociation of the two tagged populations into an upper and a lower class.
But we differ in our interpretation in terms of social phenomena: game interactions and cognitive processes can increase and stabilise discrimination, but inequality attractors never "emerge" spontaneously from random unbiased initial conditions. History has taught us that wars and invasions often result in discrimination that is maintained long after these events.
Acknowledgments
We thank Sophie Bienenstock, Bernard Derrida, Alan Kirman, Jean-Pierre Nadal and Jean Roux for helpful discussions, and David Poza for providing his NetLogo program of the lattice version of the Axtell et al. (2001) model.
- Arthur, W. B. (1994). Bounded rationality and inductive behavior (the El Farol problem). American Economic Review 84 (2), 406–411.
- Axelrod, R. (1997). The dissemination of culture: a model with local convergence and global polarization. Journal of Conflict Resolution 41 (2), 203–226.
- Axtell, R. L., Epstein, J. M. & Young, H. P. (2001). The emergence of classes in a multiagent bargaining model. Social Dynamics, 191–211.
- Bouchaud, J.-P. (2013). Crises and collective socio-economic phenomena: simple models and challenges. Journal of Statistical Physics 151 (3-4), 567–606.
- Bowles, S. & Naidu, S. (2006). Persistent institutions. Tech. rep., working paper, Santa Fe Institute.
- Castellano, C., Fortunato, S. & Loreto, V. (2009). Statistical physics of social dynamics. Reviews of Modern Physics 81 (2), 591.
- Castellano, C., Marsili, M. & Vespignani, A. (2000). Nonequilibrium phase transition in a model for social influence. Physical Review Letters 85 (16), 3536.
- Challet, D., Marsili, M., Zhang, Y.-C. et al. (2013). Minority games: interacting agents in financial markets. OUP Catalogue.
