Opinion formation and distribution in a bounded confidence model on various networks
X. Flora Meng, Robert A. Van Gorder, and Mason A. Porter

TL;DR
This paper investigates how network structure and interaction parameters influence opinion formation and convergence time in the Deffuant model across various network types, revealing critical thresholds for consensus and opinion diversity.
Contribution
It provides a comprehensive numerical analysis of the Deffuant model on different networks, highlighting the effects of network topology and parameters on opinion dynamics and convergence behavior.
Findings
Network structure affects convergence time and opinion group formation.
A critical confidence bound triggers a transition from consensus to multiple opinions.
Convergence time varies significantly with network type and model parameters.
| Network | Definition |
|---|---|
| Complete graph | A complete graph has pairwise adjacent nodes West (2001). |
| Cycle | For , a cycle has node set and edge set West (2001). |
| Prism graph | For , let and be the node sets of two disjoint cycles. The prism is defined as the graph obtained by joining the two cycles at the set of edges Boesch and Bogdanowicz (1987). |
| Square lattice | For a positive integer , we define a square lattice of side length as the graph with the node set and edges such that . |
| Complete multipartite graph | For an integer and positive integers , a complete -partite graph is a graph whose node set can be partitioned into subsets (called partite sets) of sizes , respectively, such that two nodes are adjacent if and only if they are from two distinct subsets. We consider complete -partite graphs with equal-sized partite sets and denote such graphs as , where equals the number of partite sets and (a multiple of ) is the size of the node set Chartrand and Zhang (2008). |
| Cycle with random edges | For and , we define as follows: start with and add edges between non-adjacent nodes uniformly at random until there are extra edges on the cycle . |
| Prism graph with random edges | For and , we define as follows: start with and add edges between non-adjacent nodes uniformly at random until there are extra edges on the prism graph . |
| Erdős–Rényi random graph | For and , we generate random graphs from the Erdős–Rényi (ER) model Gilbert (1959) as follows: start with disconnected nodes and place an edge between each distinct pair with independent probability . |
Opinion formation and distribution in a bounded confidence model on various networks
X. Flora Meng1,2, Robert A. Van Gorder1, and Mason A. Porter1,3,4,∗
1Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
2Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
3CABDyN Complexity Centre, University of Oxford, Oxford OX1 1HP, UK
4Department of Mathematics, University of California, Los Angeles, CA 90095, USA
Abstract
In the social, behavioral, and economic sciences, it is an important problem to predict which individual opinions will eventually dominate in a large population, whether there will be a consensus, and how long it takes a consensus to form. This problem has been studied heavily both in physics and in other disciplines, and the answer depends strongly on both the model for opinions and the network structure on which the opinions evolve. One model that was created to study consensus formation quantitatively is the Deffuant model, in which the opinion distribution of a population evolves via sequential random pairwise encounters. To consider the heterogeneity of interactions in a population due to social influence, we study the Deffuant model on various network structures (deterministic synthetic networks, random synthetic networks, and social networks constructed from Facebook data) using several interaction mechanisms. We numerically simulate the Deffuant model and conduct regression analyses to investigate the dependence of the convergence time to equilibrium on parameters, including a confidence bound for opinion updates, the number of participating entities, and their willingness to compromise. We find that network structure and parameter values both affect the convergence time, and for some network topologies, the convergence time undergoes a transition at a critical value of the confidence bound. We discuss the number of opinion groups that form at equilibrium in terms of a confidence-bound threshold for a transition from consensus to multiple-opinion equilibria.
I Introduction
Social interactions play a central role in the process of decision-making and opinion formation in populations of humans and animals Jackson (2008); Couzin et al. (2005). Discussions among acquaintances, coworkers, friends, and family members often lead interlocutors to adjust their viewpoints on politics, participation in a social movement, adoption of technological innovations, or other things DeGroot (1974); Oliver et al. (1985); Siegel (2009); Jackson and Yariv (2011); Jia et al. (2015); and the prediction of collective opinion formation in a population from attributes of individuals is one of the most important problems in the social sciences Castellano et al. (2009); Friedkin et al. (2016). Consensus dynamics is also a key problem in areas such as control theory Jadbabaie (2015); Jadbabaie et al. (2003) and collective dynamics more generally Vicsek and Zafeiris (2012). From a physical and mathematical standpoint, the study of opinion dynamics is one of the key motivating examples for studying the effects of network structure on dynamical processes on networks Porter and Gleeson (2016).
There are various methods for studying opinion formation in social networks, such as through Bayesian learning or generative social-interaction mechanisms Acemoglu and Ozdaglar (2011). Bayesian updating requires some unrealistic assumptions about individuals’ knowledge and reasoning ability, and it becomes computationally infeasible in complex settings Acemoglu and Ozdaglar (2011); Jackson (2008). Even in opinion models that do not suffer from these issues, there remains significant arbitrariness in the choice of specific models and parameters to use, and different choices can lead to markedly (and qualitatively) different results Acemoglu and Ozdaglar (2011); Sobkowicz (2009). A substantial amount of work on non-Bayesian approaches to opinion formation employs models and tools from dynamical systems, probability theory, and statistical physics Castellano et al. (2009). Moreover, a major theme in statistical physics is how global properties can emerge from local rules, which is similar to the question in social sciences of how the collective opinion of a population evolves as the result of individual attitudes and the mutual influence of individuals on each other Kozma and Barrat (2008a). Some notable generative models of opinion formation include voter models Clifford and Sudbury (1973); Holley and Liggett (1975); Holme and Newman (2006); Durrett et al. (2012), majority-rule models Galam (2002), models based on social-impact theory Latané (1981); Nowak et al. (1990), the Sznajd model Sznajd-Weron and Sznajd (2000); Sznajd-Weron (2005), and bounded confidence models Deffuant et al. (2000); Hegselmann and Krause (2002); Krause (2000); Weisbuch et al. (2000).
Bounded-confidence models, first introduced (to our knowledge) by Deffuant et al. Deffuant et al. (2000); Weisbuch et al. (2002) and Hegselmann and Krause Hegselmann and Krause (2002); Krause (2000), capture the notion of a tolerance threshold based on experimental social psychology Millon et al. (2003); Weisbuch et al. (2005). Bounded confidence reflects the psychological concept of selective exposure, which refers to an individual’s tendency to favor information that supports their views while neglecting conflicting arguments Lorenz and Urbig (2007); Sullivan (2009). The Deffuant model and the Hegselmann–Krause (HK) model both consider a set of agents who hold continuous opinions that can vary. Agents are connected to each other by an interaction network, and neighboring agents adjust their opinions at discrete time steps whenever their opinions are sufficiently close to each other. The two models differ primarily in their communication regime. In the HK model, agents interact with all of their compatible neighbors simultaneously at each time step, and they update their opinions to agree with the mean opinion of these neighbors. In contrast, the Deffuant model adopts a sequential updating rule and can be viewed as a discrete-time repeated game that is played in pairwise fashion among a set of agents until the agents’ opinions converge to either a single opinion or multiple opinions Fudenberg and Tirole (1995); Jackson (2008); Jackson and Zenou (2014). One can also tune the speed at which opinions converge in the Deffuant model through an additional parameter, sometimes called a cautiousness parameter, that describes openness to compromises. The Deffuant model was developed to study opinion-formation processes in large populations in which people interact in small groups (such as pairwise interactions in a network), whereas the HK model is suitable for contexts such as meetings with many participants. 
Two questions have drawn considerable interest: (1) how does the parameter space influence the number of opinion groups in an equilibrium state; and (2) how long does it take for a system to reach an equilibrium state Deffuant et al. (2000); Laguna et al. (2004); Lorenz (2005); Weisbuch et al. (2002); Weisbuch (2004)?
Despite its seeming simplicity, the Deffuant model is not analytically solvable in general, and most results about it have been obtained from Monte Carlo simulations. It has been shown numerically, for a few values of the cautiousness parameter, that consensus occurs for large confidence-bound values on complete graphs with probability close to $1$ in the large-population limit, whereas multiple opinion groups persist at equilibrium for low confidence bounds Fortunato (2004); Laguna et al. (2004); Weisbuch et al. (2002); Weisbuch (2004). However, different confidence-bound thresholds have been proposed for the transition from consensus to multiple opinion groups at equilibrium. In the latter case, one can approximate the number of groups by a function of the confidence-bound value Weisbuch et al. (2002); Weisbuch (2004). Numerical simulations have suggested that the time to opinion equilibrium is proportional to the number of agents in the network Laguna et al. (2004). Moreover, a higher value of the cautiousness parameter increases not only the convergence speed but also the number of agents that hold extreme opinions at equilibrium Laguna et al. (2004). Numerical simulations also have illustrated the possibility of forcing or preventing a consensus within a population by manipulating the initial opinion distribution Ben-Naim et al. (2003); Carro et al. (2013).
There has been some research that compares results for the Deffuant model on complete graphs with those on other networks. Results for complete graphs and square lattices are similar for large confidence-bound values, except that a few extreme opinions remain on square lattices at equilibrium Weisbuch et al. (2002). The Deffuant model has also been simulated on random graphs generated by Barabási–Albert (BA), Erdős–Rényi (ER), and Watts–Strogatz (WS) mechanisms Fortunato (2004); Gandica et al. (2010); Jalili (2013); Stauffer and Meyer-Ortmanns (2004). However, different assumptions and update rules are often used, and this poses a major barrier for comparing results across different networks.
There have also been efforts to study the Deffuant model from an analytical perspective using a density function that determines the agents’ density in opinion space Ben-Naim et al. (2003); Lorenz (2005). Such an approach adopts a classical strategy in statistical physics by deriving a rate equation (also called a “master equation”) and can be interpreted as taking the infinite limit of the number of agents Lorenz (2007). These derivations have not led to analytical solutions of the Deffuant model, but they require numerical integration only of the master equation, which is faster than running Monte Carlo simulations of the original model. Unfortunately, however, such density-based methods require fairly restrictive assumptions, such as homogeneous mixing and averaging agents’ opinions as the means of compromise.
The Deffuant model itself also has limitations, and numerous efforts have been made to extend it in order to better reflect reality. For instance, the confidence bound imposes a boundary on interacting agents’ decision whether or not to adjust their opinions. A small change in the difference between their opinions may lead to a different decision being made. For this reason, some scholars have proposed the use of smooth confidence bounds, with which the attraction of agents decreases as their opinion difference increases Deffuant et al. (2002, 2004). Other generalizations of the confidence bound include introducing heterogeneous tolerance thresholds in a population Weisbuch et al. (2002, 2005) and considering time-dependent thresholds Weisbuch et al. (2002). Additionally, the Deffuant model can be extended naturally to incorporate vector-valued opinions, as this only requires redefining opinion distance Fortunato et al. (2005).
Studies of variants of the Deffuant model often compare new results with those of the original model. However, numerical simulations of the original model are usually performed for specific parameter values and networks. Moreover, conclusions are often drawn based on visual inspection and sometimes rely on simplifying assumptions. Furthermore, quantifying the confidence bound and the cautiousness of a population is an open question for many applications. These issues motivate us to take a more systematic approach to the study of the Deffuant model on networks.
We explore the dependence of convergence time and the number of opinion groups at equilibrium on network topology, confidence bound, the number of participating agents, and their willingness to compromise. We conduct regression analyses to model convergence time as a function of the parameters considered and study the qualitative behavior of opinion groups at equilibrium. The networks that we study fall into three categories. The first set of networks — complete graphs, cycles, prism graphs, square lattices, and complete multipartite graphs — are synthetic and deterministic. The Deffuant model has been much studied on complete graphs and square lattices due to their simple structures, and we extend this list of simple network structures and compare simulation results on these networks with those on more complex structures. From our simulations on deterministic graphs, we find that network topology and parameter values of the Deffuant model appear to have an intertwined effect on convergence time, with the behavior of convergence time undergoing a transition at a confidence-bound threshold for some network structures. The second set of networks are (synthetic) random graphs, including cycles with random edges, prisms with random edges, and random graphs generated by an Erdős–Rényi model Gilbert (1959). Due to their simplicity, these models are a good starting point for understanding the Deffuant model on random graphs. Our simulations suggest that the behavior of convergence time on random-graph models is similar to that on their counterpart deterministic networks. The third set of networks are empirical and deterministic. In particular, we use two Facebook100 networks, which are constructed using Facebook “friendship” data Traud et al. (2012), and which are a type of network in which people have discussions and opinions can change over time. 
Using all three types of networks, we discuss the number of opinion groups at equilibrium and phenomena such as a confidence-bound threshold for a transition from consensus to multiple-opinion equilibria.
The rest of our paper is organized as follows. First, we introduce relevant definitions from network science, define the Deffuant model in mathematical terms, and present some important known results for the Deffuant model on networks. We then describe our methodology and introduce the networks and the approach that we use for numerical simulations. We then conduct regression analyses on our simulation results to explore the dependence of convergence time on network structure, confidence bound, the number of participating agents, and their cautiousness. We also discuss the phenomena that we observe about the number of opinion groups at equilibrium, and we discuss our results and their implications for sociology. We give further details on our statistical analysis in an appendix.
II Background
In this section, we recall relevant definitions from network science. We then define the Deffuant model, give some intuition about its design, and present some important known results about the Deffuant model on networks.
II.1 Basic definitions in network science
A network is a set of items (called nodes) with connections (called edges) between them Newman (2010). Many ideas in network science originated in graph theory, and we present some definitions Newman (2010); West (2001) that are pertinent to our study. A graph $G$ is a triple consisting of a node set $V(G)$, an edge set $E(G)$, and a relation that associates each edge with two nodes (not necessarily distinct) called its endpoints. The simplest type of network is a graph. Two nodes are adjacent, and are called neighbors of each other, if and only if they are endpoints of the same edge. The degree of a node is equal to the number of its neighbors. A regular graph is a graph in which each node has the same degree. A random-graph model is a probability distribution on graphs that has some fixed parameters and generates networks randomly in other respects.
II.2 The Deffuant model
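In the Deffuant model, randomly selected neighboring agents interact in a pairwise manner and make a compromise toward each other’s opinion whenever their opinion difference is below a given threshold. (Otherwise, their opinions do not change.) Consider a population of $N$ agents, who are connected to each other socially via a network $G$, and let $[a, b] \subset \mathbb{R}$ be the opinion space. At time $t$, suppose that each agent $i$ holds a time-dependent opinion $x_i(t) \in [a, b]$. Given an initial profile $x(0) \in [a, b]^N$, a confidence bound $c$, and a cautiousness parameter that we call the multiplier $m$, the Deffuant model is the random process $(x(t))_{t \geq 0}$ defined as follows. At time $t$, a pair of neighboring agents $i$ and $j$ is selected uniformly at random (i.e., we select an edge uniformly at random), and the two agents update their opinions according to the equations
$$
x_i(t+1) = \begin{cases} x_i(t) + m\,\Delta_{j,i}(t), & \text{if } |\Delta_{i,j}(t)| < c, \\ x_i(t), & \text{otherwise}, \end{cases}
\qquad
x_j(t+1) = \begin{cases} x_j(t) + m\,\Delta_{i,j}(t), & \text{if } |\Delta_{i,j}(t)| < c, \\ x_j(t), & \text{otherwise}, \end{cases}
$$
where $\Delta_{i,j}(t) = x_i(t) - x_j(t)$.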
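The update rule can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the simulation code used in the study: the function name `deffuant_step`, the edge-list representation, and the `rng` argument are hypothetical choices, and the rule implemented is the standard pairwise Deffuant update with a strict inequality on the confidence bound.

```python
import random

def deffuant_step(opinions, edges, c, m, rng=random):
    """Apply one Deffuant update in place and return the selected edge.

    opinions: list of floats (one opinion per agent)
    edges:    list of (i, j) pairs of neighboring agents
    c:        confidence bound; m: multiplier (cautiousness parameter)
    """
    i, j = rng.choice(edges)            # select an edge uniformly at random
    diff = opinions[i] - opinions[j]    # opinion difference x_i(t) - x_j(t)
    if abs(diff) < c:                   # compromise only below the confidence bound
        opinions[i] -= m * diff         # agent i moves toward agent j
        opinions[j] += m * diff         # agent j moves toward agent i
    return i, j

# Usage on a hypothetical three-agent complete graph
opinions = [0.1, 0.3, 0.9]
edges = [(0, 1), (1, 2), (0, 2)]
deffuant_step(opinions, edges, c=0.25, m=0.5, rng=random.Random(0))
```

Both opinions are updated from the same pre-interaction difference, so the order of the two assignments does not matter.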
The Deffuant model uses a continuous opinion space, as an individual’s stance on a specific matter can vary smoothly from one extreme to another in many real-world scenarios Castellano et al. (2009). For instance, a political position (on a single dimension) is not typically simply “left” or “right” but somewhere in between two extremes. The study of opinion-formation processes has traditionally considered an opinion to be a discrete variable, which is a reasonable assumption for some applications. For instance, the classical voter model Clifford and Sudbury (1973); Holley and Liggett (1975) considers a binary variable that specifies one’s decision in a vote. However, it is important to develop models that incorporate more nuanced opinions.
As in the original paper Deffuant et al. (2000) that introduced the Deffuant model, most later studies treated the initial opinions as being independent and identically distributed according to the uniform distribution on the opinion space $[a, b]$. We also adopt this convention, as our goal is to explore the basic version of the Deffuant model in a systematic manner to provide a point of reference for results of the model’s variants. Nonuniform initial opinion distributions are considered, for example, in Jacobmeier (2006).
The confidence bound $c$ characterizes a population’s tolerance of diverse viewpoints. If the opinion difference between a pair of agents is smaller than this threshold, they reduce their disagreement by making a compromise. Otherwise, the two agents keep their current opinions after they interact (or perhaps are unwilling to discuss the issue at all). In the extreme case of $c = 0$, no interaction can lead to compromise, and the initial opinion profile is a fixed point. At the other extreme, if $c$ exceeds the width of the opinion space, any pair of interacting agents will compromise.
The multiplier $m$, which is also called a convergence parameter in some papers Deffuant et al. (2000); Fortunato et al. (2005); Laguna et al. (2004); Weisbuch et al. (2002), specifies a population’s cautiousness in the modification of judgements. A larger value of $m$ indicates that individuals are more willing to compromise. In the special case $m = 1/2$, pairs of interacting agents agree on the mean of their opinions whenever their opinion difference is below the confidence bound. Most past work has examined homogeneous $m$, but it would be interesting to examine the effects of heterogeneous levels of cautiousness. For example, Deffuant et al. (2004) used a smooth influence function in which agents whose opinions have low uncertainty are more influential than agents whose opinions have high uncertainty, and other types of heterogeneity are also worth exploring.
The Deffuant model, in its original form Deffuant et al. (2000), considers the confidence bound $c$ and the multiplier $m$ to be constant in time and homogeneous across the whole population. In this setting, the mean opinion of two agents is the same before and after their interaction.
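The conservation of the pairwise mean can be checked directly. The following short derivation is a sketch that assumes the standard Deffuant update and uses notation chosen here for illustration: $x_i(t)$ and $x_j(t)$ are the opinions of the two interacting agents, and $m$ is the multiplier. When the two agents compromise, adding and subtracting their updated opinions gives

$$
x_i(t+1) + x_j(t+1) = \big[x_i(t) + m\,(x_j(t) - x_i(t))\big] + \big[x_j(t) + m\,(x_i(t) - x_j(t))\big] = x_i(t) + x_j(t),
$$

$$
x_i(t+1) - x_j(t+1) = (1 - 2m)\,\big[x_i(t) - x_j(t)\big].
$$

Under these assumptions, the sum (and hence the mean) of the two opinions is unchanged, and the opinion difference is multiplied by $1 - 2m$. The difference therefore vanishes after a single interaction when $m = 1/2$, and the two agents exchange opinions when $m = 1$.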
Convergence of opinions is generally defined as the appearance of a stable configuration in which no more changes can occur. At equilibrium, the opinion distribution is a superposition of Dirac delta functions in the opinion space $[a, b]$, such that consecutive spikes are separated by a distance of at least the confidence bound $c$. In other words, any two agents either hold the same opinion or their viewpoints differ by a distance of at least $c$. We use the notation $K$ to denote the number of opinion groups at equilibrium.
II.3 The Deffuant model on various networks
The agents in a Deffuant model are represented by nodes of a network, and a pair of agents on a randomly selected edge can interact with each other. To the best of our knowledge, the Deffuant model has been studied on only a small subset of networks, which includes complete graphs, square lattices, Erdős–Rényi (ER) random graphs, Watts–Strogatz (WS) random graphs, and Barabási–Albert (BA) random graphs Barabási and Albert (1999).
The Deffuant model on complete graphs has received considerable attention Deffuant et al. (2000). Complete graphs can be used to model small communities, where everyone knows each other, such as high-level political leaders in a country or inhabitants of a village. Complete graphs are also sometimes used as approximations for individual communities in large social networks, as individuals within communities are more closely connected with each other than with outsiders Fortunato and Hric (2016); Porter et al. (2009). In the homogeneous mixing case, the population’s opinions always reach equilibrium Lorenz (2005). It has been shown numerically that a large confidence bound yields an equilibrium state of consensus, whereas multiple opinion groups can persist for small values of $c$ Deffuant et al. (2000); Fortunato (2004); Laguna et al. (2004); Weisbuch et al. (2002); Weisbuch (2004). Such results were also obtained in simulations on square lattices, ER random graphs, WS random graphs, and BA random graphs Deffuant et al. (2000); Kozma and Barrat (2008a, b); Stauffer and Meyer-Ortmanns (2004). Moreover, numerical simulations on complete graphs suggest that one can estimate the number $K$ of opinion groups at equilibrium by a function of $c$ Deffuant et al. (2000); Weisbuch et al. (2002); Weisbuch (2004), and that the multiplier $m$ and the number $N$ of participating agents do not have a significant effect on $K$ Deffuant et al. (2000); Weisbuch et al. (2002). However, a later study Laguna et al. (2004) observed that the number of “major opinion” groups that include many agents is a function of $c$, whereas the number of “minor opinion” groups (i.e., groups of minorities) depends on $m$.
On square lattices, WS random graphs, and BA random graphs, the Deffuant model includes behavior that differs from the homogeneous mixing case. For instance, simulations on square lattices and BA random graphs suggest that depends not only on , but also on , when multiple opinion groups persist at equilibrium Deffuant et al. (2000); Stauffer and Meyer-Ortmanns (2004). Simulations on WS random graphs indicate that depends on both and network structures, and that the presence of disorder (i.e., random “shortcut” edges) seems to have only a slight effect on convergence time Gandica et al. (2010).
Existing research on the Deffuant model on ER random graphs has focused mainly on adaptive networks, which evolve along with the game Kozma and Barrat (2008a, b). For WS random graphs, the study of the model has centered around opinion groups at equilibrium Gandica et al. (2010).
III Methods
For each network structure, we conduct a regression analysis to examine convergence time as a function of confidence bound, the number of participating agents, and the multiplier that measures their cautiousness. We then qualitatively study the behavior of the number of opinion groups at equilibrium, as such an approach is more natural than conducting regression analysis because of the complex nature of opinion-group distributions.
III.1 Networks studied
We study the Deffuant model on a variety of networks to develop a better understanding of the effect of network structure on convergence time and the number of opinion groups at equilibrium. Some of the networks that we study have deterministic structures, and others are random graphs. In Table 1, we list the notations, definitions, and examples of these networks. Finally, we conduct numerical simulations using networks that are constructed using Facebook “friendship” data Traud et al. (2012).
The first set of networks that we study are deterministic graphs, including complete graphs, cycles, prism graphs, square lattices, and complete multipartite graphs. These networks have been studied extensively because of their simple structures. Our simulation results on these networks provide references for comparison with conclusions on the variants of the Deffuant model as well as on those of the original Deffuant model on more complicated network structures.
The second set of networks that we study consists of random graphs: cycles with random edges (which are related to WS small-world networks Watts and Strogatz (1998); Porter (2012)), prism graphs with random edges, and random graphs generated by the Erdős–Rényi model. Finally, we investigate the Deffuant model on real social networks constructed using Facebook data.
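Several of these synthetic networks can be generated in a few lines. The following Python sketch is illustrative only: the function names, the integer node labels, and the parameter `k` (the number of extra random edges) are assumptions made here and are not the notation of Table 1. Drawing `k` non-adjacent pairs without replacement is used as an equivalent way of adding random edges one at a time until `k` extra edges are present.

```python
import itertools
import random

def complete_graph(n):
    """All pairs of the n nodes 0..n-1 are adjacent."""
    return list(itertools.combinations(range(n), 2))

def cycle_graph(n):
    """Cycle on nodes 0..n-1 (n >= 3): node i is joined to node i+1 mod n."""
    return [(i, (i + 1) % n) for i in range(n)]

def prism_graph(n):
    """Two disjoint n-cycles (nodes 0..n-1 and n..2n-1) joined by n rungs."""
    inner = cycle_graph(n)
    outer = [(u + n, v + n) for u, v in inner]
    rungs = [(i, i + n) for i in range(n)]
    return inner + outer + rungs

def add_random_edges(n_nodes, edges, k, rng):
    """Add k edges chosen uniformly at random among non-adjacent node pairs."""
    present = {frozenset(e) for e in edges}
    candidates = [p for p in itertools.combinations(range(n_nodes), 2)
                  if frozenset(p) not in present]
    return list(edges) + rng.sample(candidates, k)

def erdos_renyi(n, p, rng):
    """Place an edge between each distinct pair independently with probability p."""
    return [pair for pair in itertools.combinations(range(n), 2)
            if rng.random() < p]

# Usage: a 10-node cycle with 5 extra random edges (hypothetical sizes)
rng = random.Random(42)
cycle_plus = add_random_edges(10, cycle_graph(10), k=5, rng=rng)
```

Each function returns an edge list, which is a convenient input for a simulation that selects an edge uniformly at random at every time step.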
III.2 Simulation specifications
Without loss of generality, we consider the Deffuant model with opinions on the space $[0, 1]$. In other words, we normalize the opinion dynamics so that each agent’s opinion lies between $0$ and $1$ at any time step. We also consider the multiplier $m \in (0, 1]$, as opposed to the interval $(0, 0.5]$ in the original model Deffuant et al. (2000). This generalization is useful, as interacting agents can perhaps be convinced to believe in others’ opinions more than their own. Moreover, considering $m > 0.5$ reveals interesting phenomena that we will discuss in Section IV. A few of the parameter values have specific interpretations. For example, for $c = 1$, any pair of interacting agents makes convergent opinion adjustments that correspond to interaction without a confidence bound. For $m = 0.5$, each pair of interacting agents agrees on their mean opinion whenever their opinion difference is below $c$. Theoretically, there is no upper bound on the number of agents that one can consider in a population, but running numerical simulations on extremely large populations is computationally intensive. For our simulations, we use a maximum of agents, and one can infer the behavior of the model for larger populations from our regression analysis.
The convergence time and the number of opinion groups at equilibrium are both difficult to predict, as the initial opinion profile, the pair of agents that interact at each time step, and the particular graphs generated by random graph ensembles are all stochastic. To smooth out these sources of noise, we run groups of independent simulations for each network in Section III.1 and each combination of the values of $N$, $c$, and $m$ that we consider. During one simulation, we first generate a group of independent and identically distributed initial opinions from a uniform distribution on $[0, 1]$, and we then simulate the evolution of opinion dynamics according to the Deffuant model.
In principle, equilibrium is reached only at infinitely long times, as the opinion space is continuous and opinions approach each other arbitrarily closely without reaching the same value in finite times unless $m = 0.5$ Laguna et al. (2004). However, the emergence of equilibrium is evident at finite times, as consecutive opinion groups must be separated by a distance of at least $c$ to avoid merging. Therefore, in practice, we need to set a convergence criterion in our numerical simulations. For our study, we consider an opinion profile to be at equilibrium if consecutive opinion groups are separated by a distance of at least $c$ and the range of opinions in each group is below . Based on some test runs, we also choose a bailout time of iterations for each simulation. If an equilibrium is reached by the bailout time, we record the convergence time ($T$) and the number ($K$) of opinion groups. Otherwise, we record , a strict upper bound that is higher than all possible convergence times, for the purpose of data visualization.
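A convergence check of this kind can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names and the tolerance argument `tol` are hypothetical, and the tolerance value in the usage example is an arbitrary placeholder rather than the value used in the study. The sketch assumes a non-empty opinion profile.

```python
def opinion_groups(opinions, c):
    """Split sorted opinions into groups wherever consecutive values differ by at least c."""
    ordered = sorted(opinions)
    groups = [[ordered[0]]]
    for x in ordered[1:]:
        if x - groups[-1][-1] >= c:
            groups.append([x])      # a gap of at least c starts a new group
        else:
            groups[-1].append(x)    # otherwise the opinion joins the current group
    return groups

def at_equilibrium(opinions, c, tol):
    """True if every group (split at gaps >= c) has an opinion range below tol."""
    return all(g[-1] - g[0] < tol for g in opinion_groups(opinions, c))

# Usage with an arbitrary placeholder tolerance
profile = [0.2, 0.2005, 0.8, 0.8001]
n_groups = len(opinion_groups(profile, c=0.3))
converged = at_equilibrium(profile, c=0.3, tol=0.01)
```

Because groups are split only at gaps of at least `c`, the separation condition holds by construction, and the check reduces to testing the range of opinions within each group.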
IV Numerical simulations and results
In this section, we study the Deffuant model on various deterministic, randomly generated, and real-world networks by considering different network structures and interaction mechanisms between pairs of agents. For each network structure, we first conduct data exploration and linear regression analysis to model convergence time ($T$) as a function of the number ($N$) of participating agents, confidence bound ($c$), and multiplier ($m$). We then discuss our qualitative observations about the number of opinion groups at equilibrium ($K$). Because the process of data exploration and regression analysis is similar, we only give full details in Appendix A for a subset of the parameter space for our simulations on complete graphs.
For our linear regression analysis, we use the method of ordinary least squares, as the estimator is unbiased and consistent if the errors have the same finite variance and are uncorrelated with the explanatory variables Freeman (2005). If the errors are also normally distributed, ordinary least squares is also the maximum likelihood estimator Freeman (2005). We check these assumptions throughout our model-selection process. For each set of parameters and network structure that we consider, we conduct regression analysis using the mean results of different simulations. We only use simulation results of networks with or more agents in order to reduce the stochasticity introduced by the random initial opinion profile and to ensure a sufficient quantity of data for testing the model assumptions.
IV.1 Complete graphs
The simplest form of the Deffuant model allows any pair of agents in a system to interact Deffuant et al. (2000). This is equivalent to studying the model on a complete graph. Recall that denotes the number of nodes in a graph.
In Fig. 1, we summarize the values of that we observe in simulations for various , as these are representative of the trends that we observe in all simulations. We present a similar set of plots for all other network structures in the following subsections.
Our data exploration suggests that the convergence time has qualitatively different behavior for and , so we consider different regression models for these two cases. For model selection, we use the Akaike information criterion (AIC) Akaike (1973) to select the “best” subset of predictors, as this method balances the trade-off between the goodness of fit and the complexity of a model. This model selection approach aims to minimize the AIC value, which is defined by
$$\mathrm{AIC} = 2k - 2\ln(\hat{L}) \,,$$
where $k$ is the number of estimated parameters and $\hat{L}$ is the maximum value of the likelihood function for the model. The coefficient of determination, $R^2$, is a measure of goodness of fit of a regression model Draper and Smith (1981). Values of $R^2$ that are closer to $1$ indicate better fits. For instance, $R^2 = 0$ implies that the response variable cannot be predicted from the explanatory variables, and $R^2 = 1$ implies that the response variable can be predicted without error from the explanatory variables. Let $\hat{T}_i$ be the predicted value for the $i$th observed convergence time ($T_i$), and let $\bar{T}$ be the mean of the observed convergence times. One then calculates
$$R^2 = 1 - \frac{\sum_i (T_i - \hat{T}_i)^2}{\sum_i (T_i - \bar{T})^2} \,.$$
We use the AIC and to measure the goodness of fit and the simplicity of our regression models.
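As a hedged illustration of these two measures, the following sketch computes the coefficient of determination and the AIC of a normal linear model directly from observed and fitted values. The function names are hypothetical. The AIC uses the Gaussian log-likelihood evaluated at the maximum-likelihood variance estimate, and whether the parameter count includes the error variance is a convention that the sketch leaves to the caller.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - (residual sum of squares) / (total sum of squares)."""
    mean = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

def gaussian_aic(observed, predicted, num_params):
    """AIC = 2k - 2 ln(L_hat) for i.i.d. normal errors.

    The maximized log-likelihood uses the variance estimate RSS / n.
    num_params (k) is supplied by the caller; some software also counts
    the error variance as an estimated parameter.
    """
    n = len(observed)
    rss = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    log_likelihood = -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)
    return 2.0 * num_params - 2.0 * log_likelihood
```

For a fixed fit, each additional parameter raises the AIC by exactly 2, so a more complex model is preferred only if it increases the maximized log-likelihood by more than 1 per added parameter.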
For , AIC-based model selection yields
[TABLE]
and we give our estimates for the coefficients (with ) in Table 2. (This is part of the regression output given by the software environment R Team (2008).) The column for values gives the values of the -statistic for the hypothesis test with the null hypothesis that the corresponding regression coefficient is [math]. The column for Pr() gives the probability for a test statistic to be at least as extreme as the observed value if the null hypothesis were true. A low value of Pr() suggests that it would be rare to obtain a result as extreme as the observed value if the coefficient under consideration were [math], and hence we should keep the corresponding term in our model. For Eq. (4), the values of the AIC and are and , respectively.
For , regression analysis suggests the model
[TABLE]
where we list our coefficient estimates in Table 3. For Eq. (5), the values of the AIC and are and , respectively.
The different forms of Eqs. (4) and (5) confirm our conjecture based on data exploration that undergoes a transition at . More precisely, the regression results suggest that the behavior of differs for and . To determine a more precise transition point for , one should conduct numerical simulations using . For , the multiplier has no statistically significant impact on . Moreover, increases with for , and it decreases with . For , the effects of , , and on seem to be independent (or at least predominantly independent) of each other. In particular, increases with roughly linearly. We also observe that increases with exponentially and has a minimum at , which corresponds to interactions without a confidence bound. In other words, for fixed and , the convergence time on complete graphs is minimal when any pair of interacting agents makes a convergent compromise. Furthermore, increases with exponentially and has a minimum at . This corresponds to the case in which each pair of interacting agents agrees at their mean opinion whenever their opinion difference is below the confidence bound.
For each combination of , , and , we average the number of opinion groups at equilibrium if and only if at least of simulations reach equilibrium within the bailout time. Otherwise, we state that we observe a “division” of opinion for the associated parameter combination. We also use the same standard to determine the number of opinions at equilibrium in our subsequent numerical experiments.
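The reporting rule above can be written as a small helper. This is only a sketch: the function name is hypothetical, the required fraction of converged runs is passed in as `min_fraction` because its value is not reproduced here, and averaging over the converged runs only is an assumption.

```python
def summarize_group_counts(group_counts, min_fraction):
    """Average the number of opinion groups over converged runs.

    group_counts holds one entry per simulation: the number of groups at
    equilibrium, or None if the run hit the bailout time. Returns None
    (a "division" of opinion) if too few runs converged.
    """
    converged = [k for k in group_counts if k is not None]
    if len(converged) / len(group_counts) >= min_fraction:
        return sum(converged) / len(converged)
    return None
```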
In Fig. 2, we summarize the number of opinion groups that persist at equilibrium in our simulations on complete graphs. We observe that depends on only when the confidence bound is , with the most dramatic changes occurring in the region of . For , consensus is reached consistently. For , we observe that generally increases with . For , we obtain . At , we observe that for . Additionally, for and , we observe that is generally larger for closer to . This is reasonable because, as , agents tend to agree on the mean of their opinions, which reduces the length of time for opinions to stabilize, so more opinion groups tend to persist at equilibrium.
IV.2 Cycles
In this subsection, we explore the behavior of convergence time and the number of opinion groups at equilibrium by simulating the Deffuant model on -node cycles. We will compare these simulation results to ones on cycles with additional, randomly-placed “shortcut” edges in Section IV.6.
In Fig. 3, we summarize the values of that we observe in our simulations on cycles. Our simulations suggest that changes rapidly with when is close to . We speculate that a singularity arises at and as . Our linear regression models cannot capture singular points, so we exclude data points that correspond to from our regression analysis for cycles. Our regression analysis gives the model
[TABLE]
where we list our coefficient estimates in Table 4. For Eq. (6), we obtain and .
In contrast to complete graphs, our simulations on cycles indicate that the dependence of on , , and does not undergo a transition with respect to . We observe that decreases with , and, as gets closer to , the value of changes with more rapidly as increases. Moreover, for , the convergence time obtains a global minimum at if and are held constant. Furthermore, increases with for . Additionally, the effects of and on appear to be weakly coupled.
In Fig. 4, we summarize the number of opinion groups that arise in our simulations on cycles. A consensus is reached for . Although some of our simulations for do not converge by the bailout time, we conjecture that all simulations on cycles with large values of will eventually converge, independent of the values of and , if the Deffuant dynamics are continued for sufficiently many iterations. A consensus is reached when for and when for . This observation is reasonable because, with fewer agents adjacent to each other on a cycle, their initial opinions are more dispersed, which compels them to form more groups. Similar to complete graphs, we observe that more opinion groups tend to emerge in the final state as if multiple opinion groups persist at equilibrium.
IV.3 Prism graphs
In this subsection, we explore the behavior of convergence time and the number of opinion groups at equilibrium by simulating the Deffuant model on prism graphs. Prism graphs are a special type of generalized Petersen graph Coxeter (1950). We will compare our simulation results on prism graphs to those on prisms with additional random edges in Section IV.7.
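For concreteness, the cycle and prism constructions can be written as edge lists. The sketch below assumes the usual definitions (a prism joins two disjoint cycles of equal length by connecting corresponding nodes). The node labelling and function names are illustrative choices.

```python
def cycle_edges(n, offset=0):
    """Edges of an n-node cycle (n >= 3) on nodes offset, ..., offset + n - 1."""
    return [(offset + i, offset + (i + 1) % n) for i in range(n)]

def prism_edges(n):
    """Edges of a prism built from two disjoint n-node cycles (2n nodes in total)."""
    first = cycle_edges(n)                    # nodes 0, ..., n - 1
    second = cycle_edges(n, offset=n)         # nodes n, ..., 2n - 1
    rungs = [(i, n + i) for i in range(n)]    # join corresponding nodes
    return first + second + rungs
```

Every node of a cycle has degree 2 and every node of a prism has degree 3, which is a quick sanity check on the construction.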
In Fig. 5, we summarize the values of that we observe in our simulations on prism graphs. Similar to our computations for complete graphs in Section IV.1, scatter plots of versus , , and , exhibit qualitatively distinct behavior for and . We thus conduct separate regression analyses for and . For , regression analysis suggests the model
[TABLE]
where we list our coefficient estimates in Table 5. For Eq. (7), we obtain and . For , regression analysis suggests the model
[TABLE]
where we list our coefficient estimates in Table 6. For Eq. (8), we obtain and .
Similar to the case of complete graphs, the different forms of Eqs. (7) and (8) confirm our conjecture based on data exploration that undergoes a transition at . According to Eqs. (7) and (8), increases with for , and increases with for . The effects of , , and on are coupled to each other for prism graphs. Additionally, increases with respect to more rapidly for than for . For and , the convergence time decreases with if and are held constant. For and , however, obtains a global maximum at if and are held constant. For and , the convergence time decreases with if and are held constant. For and , however, obtains a global minimum at if and are held constant. With fixed values of and , the convergence time obtains a global minimum at .
In Fig. 6, we summarize the number of opinion groups that persist in our simulations on prism graphs. For , a consensus is reached for all simulations on prism graphs. For , the equilibrium state is mostly polarized into distinct opinion groups if and can sometimes have more than opinion groups for . Similar to our simulations on cycles in Section IV.2, we observe that large discrepancies in the initial opinion distribution hinder the agents from agreeing with each other through their interactions on a prism graph.
IV.4 Square lattices
Apart from complete graphs, square lattices are the most common deterministic networks on which the Deffuant model has been studied previously Weisbuch et al. (2002).
In Fig. 7, we summarize the values of that we observe in our simulations on square lattices. For , most of the simulations do not converge by the bailout time, so we conduct regression analysis for . Our regression analysis suggests for that
[TABLE]
where we list our coefficient estimates in Table 7. For Eq. (9), we obtain and .
Similar to our observations for prism graphs, the effects of , , and on are coupled to each other for square lattices. According to Eq. (9), increases with . If , the convergence time increases with . Otherwise, obtains a global minimum at if and are held constant. Moreover, has a minimum at if and are held constant.
In Fig. 8, we summarize the number of opinion groups that persist at equilibrium in our simulations on square lattices. As with our results on prism graphs, a consensus is reached for all simulations on square lattices for .
IV.5 Complete multipartite graphs
In this subsection, we consider complete multipartite graphs with . We use the values , and we note that one can construe a complete graph (see Section IV.1) as a complete multipartite graph with . By varying the value of , we explore the effect of network density (i.e., the ratio of the number of edges to the maximum possible number of edges West (2001)) on the behavior of the Deffuant model.
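To illustrate how the number of partite sets controls density, the following sketch computes the density of a complete multipartite graph from its part sizes. The function name is hypothetical, and the equal-sized parts in the usage note are an assumption made for illustration.

```python
def complete_multipartite_density(part_sizes):
    """Ratio of the number of edges to the maximum possible number of edges.

    Two nodes are adjacent exactly when they lie in different parts, so the
    edge count is (N^2 - sum of squared part sizes) / 2 for N nodes in total.
    """
    n = sum(part_sizes)
    edges = (n * n - sum(s * s for s in part_sizes)) // 2
    return edges / (n * (n - 1) / 2)
```

With 10 nodes, for example, two parts of size 5 give a density of 25/45, five parts of size 2 give 40/45, and ten singleton parts recover the complete graph with density 1.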
In Fig. 9, we summarize the values of that we observe in our simulations on complete -partite graphs (with ) and number of nodes with values , , , , , , , and . For , most of our simulations do not converge by the bailout time ( iterations), so we conduct regression analysis for . For , our regression analysis suggests the model given by Eq. (5), which has the same form as the regression model of complete graphs when but has different coefficient values (see Table 8). Note that one can construe a complete graph of size as a complete -partite graph.
The regression model in Eq. (5) suggests that the behavior of the convergence time on a general multipartite graph is similar to that on a complete graph. As the number of partite sets increases, the growth rate of with respect to decreases slightly if and are held constant. In other words, as a complete multipartite graph becomes more densely connected, adding agents to a network increases the convergence time of the Deffuant model at a slower rate if all other conditions remain the same. Additionally, increases with more slowly as increases.
In Fig. 10, we summarize the number of opinion groups that persist at equilibrium in our simulations on complete -partite graphs (with ). For , consensus is reached for all . We obtain consensus in all of our simulations on bipartite graphs with , whereas some simulations fail to converge by the bailout time for .
IV.6 Cycles with random edges
We consider random graphs generated by the ensemble (see Table 1) for , , and . Cycles with additional, random “shortcut” edges are related to Watts–Strogatz small-world networks Watts and Strogatz (1998); Newman and Watts (1999); Porter (2012) (see also earlier work by Bollobás and Chung (1988)), except that nodes initially have degree , which yields (for cycles that are not too small) a clustering coefficient of [math] for each node before random edges are added.
In Fig. 11, we summarize the values of that we observe in our simulations on for , , and s = . Regression analysis suggests the model
[TABLE]
where the power-transformation parameter is , , and for , , and , respectively. For , the term is statistically insignificant, and we thus drop it. In Table 9, we summarize our coefficient estimates for Eq. (10). For , we obtain and ; for , we obtain and ; and for , we obtain and .
Our data exploration and regression analysis suggest that does not experience a transition with respect to . According to Eq. (10), increases with for , , and . For , the convergence time obtains a global minimum at if and are held constant. For , if , the convergence time increases with . If , then obtains a global minimum at if and are held constant. Finally, obtains a global minimum at if and are held constant.
Similar to our observations for prism graphs and square lattices, the effects of , , and on are coupled to each other for cycles with random edges, in contrast to what we observed using our regression model of cycles (see Eq. (6)), which has only one weak coupling term . Adding random shortcut edges to cycles significantly decreases the convergence time. Additionally, increases much more slowly with on than it does for cycles.
In Fig. 12, we summarize the number of opinion groups that persist at equilibrium in our simulations on cycles with random edges. With only a small proportion (i.e., ) of random edges, the number of opinion groups at equilibrium is roughly the same as what we observed in our simulations on cycles (see Fig. 4). However, as more random edges are added, multiple opinion groups start to emerge at equilibrium for . We conjecture that, as the proportion of random edges increases, the behavior of is more similar to the case of complete graphs than that of cycles.
IV.7 Prism graphs with random edges
We consider random graphs generated by the ensemble (see Table 1) for , , and . We study the effect of random edges on the behavior of the Deffuant model by comparing our simulation results with the ones that we obtained for prism graphs in Section IV.3.
In Fig. 13, we summarize the values of that we observe in our simulations on for , , and . Similar to our results for prism graphs in Section IV.3, we observe qualitatively distinct behavior of the convergence time for and for the Deffuant model on . Therefore, we conduct separate regression analyses for these two cases. For , our regression analysis suggests the model
[TABLE]
where we list our coefficient estimates in Table 10. For , we obtain and ; for , we obtain and ; and for , we obtain and . For , our regression analysis suggests the model
[TABLE]
where we list our coefficient estimates in Table 11. For , we obtain and ; for , we obtain and ; and for , we obtain and .
The different forms of Eqs. (11) and (12) support our conjecture based on our data exploration that undergoes a transition at . If and are held constant, the convergence time obtains a maximum at for and at for . Moreover, increases with exponentially and has a minimum at . If and are held constant, decreases with . For , if , the convergence time decreases with ; otherwise, has a minimum at if and are held constant. In contrast to the coupling effects of , , and that we observed in our simulations on prism graphs, simulations on suggest that only a weak coupling term exists in the regression model (see Eqs. (11) and (12)). Additionally, adding random edges to prism graphs decreases more significantly for than for .
In Fig. 14, we summarize the number of opinion groups that arise in our simulations on prism graphs with randomly-generated extra edges. As we observed for prism graphs, consensus is always reached on prism graphs with random edges for . However, for and , we observe , in contrast to for prism graphs. Therefore, when a population’s confidence bound is small, adding random edges to prism graphs is more favorable to expediting the process of opinions dividing into distinct groups than reaching agreement among the population.
IV.8 Erdős–Rényi networks
We now consider random graphs generated by the Erdős–Rényi model, where is an independent probability for there to be an edge between a pair of nodes. Erdős–Rényi graphs are one of the best-studied models of network science, and they have been used in previous studies of the Deffuant model on networks Alaai et al. (2008); Fortunato (2004); Kozma and Barrat (2008a, b). However, existing research on the Deffuant model on ER random graphs has focused primarily on adaptive networks that evolve along with the game Kozma and Barrat (2008a, b). In our simulations, we consider the ER model for . Complete graphs are a special case of the ER model, as one obtains a complete graph for the parameter value .
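A draw from the Erdős–Rényi model can be sketched as follows. The function name and the seeding are illustrative. The sketch does not check connectivity, so a sparse draw may be disconnected.

```python
import random
from itertools import combinations

def erdos_renyi_edges(n, p, seed=None):
    """Include each of the n(n-1)/2 possible edges independently with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]
```

Setting `p = 1.0` returns every possible edge, which recovers the complete graph, because `rng.random()` always lies in the half-open interval [0, 1).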
In Fig. 15, we show a subset of the values of that we obtain in our simulations. These values are representative of the observed trends in all of our simulations. As in our simulations on complete graphs, we observe qualitatively distinct behavior for for and . Therefore, we conduct separate regression analyses for these two cases. For , regression analysis suggests the model
[TABLE]
where we list estimates for the coefficients in Table 12. Random graphs generated by the ER model are a source of stochasticity for the opinion dynamics. It is thus not surprising that we observe a larger number of outliers for our ER simulations than for complete graphs. Let be the proportion of data points that we identify as outliers and thus exclude from our regression analysis. (We construe a data point as an outlier if . Each data point is a mean over 10 simulations, and we chose this value so that all 10 simulations converge by the bailout time of .) For , we conduct a regression analysis for the graphs for , , and , as for smaller values of (and this would undermine the reliability of the regression analysis). For , we obtain , , and ; for , we obtain , , and ; and for , we obtain , , and . For , our regression analysis suggests the model in Eq. (5) for each value of that we consider. For each , Table 13 summarizes our estimates for the coefficients (), together with the corresponding values of AIC and .
The different forms of Eqs. (5) and (13) support our conjecture from our data exploration that undergoes a transition at . For , the convergence time increases with for and with for . For , the convergence time decreases with . For , if , then increases with ; otherwise, obtains a maximum at . For , Eq. (13) suggests that is proportional to , which contrasts with our regression model for complete graphs (see Eq. (4)), which does not exhibit a statistically significant influence of on . For , our regression model for the ER model is the same as what we obtained for complete graphs, so the behavior of with respect to , , and is similar in that parameter regime. For large values of , the estimated coefficients are very close to those for complete graphs. This suggests that it is probably accurate to use a mean-field approximation to study convergence time on the Erdős–Rényi model if is close to .
In Fig. 16, we summarize the number of opinion groups that arise in our simulations of the Deffuant model on random graphs generated by the ER model. When the connection probability is close to , the behavior of is similar to what we observed for complete graphs. As , the major qualitative difference is that opinions sometimes fail to converge within the bailout time for small values of .
IV.9 Facebook friendship networks
We now simulate the Deffuant model on two Facebook “friendship” networks Traud et al. (2012) — one of Swarthmore College and the other of the California Institute of Technology (Caltech) — from one day in autumn 2005. We consider the largest connected component (LCC) of each network. For the Swarthmore network, the LCC has nodes and edges. The LCC of the Caltech network has nodes and edges.
In Fig. 17, we summarize the values of that we observe in simulations. For , most of the simulations on both networks fail to converge by the bailout time, so we consider only the results of in our regression analysis. For the Swarthmore network, we obtain a regression model of
[TABLE]
where we list our estimates for the coefficients in Table 14. For Eq. (14), we obtain and . For the Caltech network, we obtain a regression model of
[TABLE]
where we list our estimates for the coefficients in Table 15. For Eq. (15), we obtain and .
For both networks, the variables and have an intertwined effect on . Moreover, if is held constant, the convergence time has a minimum at . If is held constant, increases with . The convergence time for both networks behaves qualitatively similarly to what we observed for cycles with random edges (see Section IV.6) of corresponding sizes for and . This empirical observation suggests that simulating the Deffuant model on random graphs generated by this ensemble and similar models (e.g., WS networks) may yield some useful insights about the convergence time for the Deffuant model on social networks.
In Fig. 18, we summarize the number of opinion groups that persist at equilibrium in our simulations on the LCCs of the Swarthmore and the Caltech Facebook networks. In both networks, consensus occurs for all . For , at least half of the simulations fail to converge within the bailout time, but those that converge suggest that increases as approaches [math]. In contrast, our simulations of Deffuant dynamics on cycles with random edges reached equilibrium within the bailout time (see Section IV.6).
V Conclusions and Discussion
We studied the Deffuant model on several types of deterministic and random networks. For each of these networks, we systematically examined the number of groups of different opinions in these networks and the convergence time to reach equilibrium as a function of the number () of agents that participate in the opinion dynamics, the population’s confidence bound (), and their cautiousness (which we measure using the multiplier ). For the convergence time to equilibrium, we used both numerical simulations and regression analyses to obtain qualitative and quantitative insights. For the number of opinion groups at equilibrium, we used our numerical simulations to examine the qualitative behavior of different types of networks.
Studying the effect of network structure on dynamical processes (such as opinion models) is a difficult problem, and our systematic computations yielded several insights about the intertwined effect of network topology and the parameter values of the Deffuant model on the convergence time . For example, our regression analysis suggests that the convergence time undergoes a transition at a critical value of the confidence bound () on complete graphs and prism graphs but not on cycles. We also illustrated that the interplay among the effects of , , and on the convergence time of the Deffuant model can be rather different qualitatively for different families of networks. For instance, the effects of the three parameters on are independent on complete graphs and complete multipartite graphs for large values of (these are mean-field situations), whereas and are weakly coupled for cycles, and all three parameters are coupled for prism graphs and square lattices.
Our results also shed further light on educated guesses and other claims that have appeared in the literature. We examined quantitatively how convergence time increases with the number of agents in a network, and we thereby obtained several insights for different network topologies. For example, although Laguna et al. (2004) speculated that is proportional to , our regression results indicate that the linear relationship need not hold and that it depends on the underlying network topology. Additionally, several papers have concluded based on numerical simulations for a few values of the multiplier that consensus occurs for the Deffuant model for several networks (e.g., ER networks, WS networks, and BA networks) when the confidence bound is large, whereas multiple opinion groups persist at equilibrium for low confidence bounds Fortunato (2004); Laguna et al. (2004); Weisbuch et al. (2002); Weisbuch (2004). However, different transition thresholds (e.g., , , and ) have been proposed for the confidence bound. In the synthetic networks that we study (except for cycles and cycles with random edges), our simulation results suggest that a transition threshold of is most likely for large populations. For , consensus occurs on all of our families of deterministic networks (both synthetic and empirical) except for bipartite graphs. For , more opinion groups persist at equilibrium as node degree increases for our simulations on -regular graphs given by cycles (for which the degree is for each node), prism graphs ( for each node), and complete graphs (of progressively larger size, starting from nodes, and hence of progressively larger degree for each node). This is possibly because agents who have a higher node degree in a regular graph have more neighbors with “competing” opinions, which gives the agents less time to make up their minds, so more opinion groups remain at equilibrium. Additionally, it was proposed in Weisbuch et al. (2002); Weisbuch (2004) based on numerical simulations that one can approximate the number of opinion groups at equilibrium by for large populations. Our simulations show that this statement is not true in general. For instance, for simulations on prism graphs, for when is large.
Our simulations suggest that the equilibrium number of opinion groups is similar for random graph models and appropriate counterpart deterministic networks (at least for the network families that we study). For example, adding a small number of random edges per node (i.e., the number of random edges divided by total number of nodes is small) to cycles and prism graphs does not have an obvious impact on . We conjecture that approaches the situation for complete graphs as the proportion of random edges on cycles and prism graphs increases. For the Erdős–Rényi model, we observed that the behavior of is similar to that on complete graphs when the edge generation probability is close to . This suggests that it would be useful to study the Deffuant model on ER graphs using a mean-field approximation, especially as useful results have been obtained for other dynamical processes in this way Porter and Gleeson (2016).
Our results provide insight into the convergence of opinion dynamics into stable groups of different opinions and on how long it takes to achieve such groups in differently-structured populations. For instance, when it is desirable to achieve a consensus among many individuals (especially in a potentially contentious situation), one may try to obtain agreement as quickly as possible, and it is useful to obtain a better understanding of which network structures can best achieve such useful outcomes. It is also noteworthy that one topic in early studies of bounded-confidence models such as the Deffuant model was to examine how extremism can take hold in a population Deffuant et al. (2002); Amblard and Deffuant (2004); Deffuant et al. (2004), and (perhaps especially given recent events) it seems useful to revisit such applications of these models. In developing models further for such applications, it will be important to incorporate recent insights, such as those in Friedkin et al. (2016).
Our systematic approach for studying the Deffuant model on various network structures is also applicable to other bounded-confidence models and models of opinion dynamics more generally. For example, the Hegselmann–Krause (HK) model was introduced at about the same time as the Deffuant model and has also attracted much research. It would be interesting to study the HK model using a systematic approach that is similar to the one in the present paper. One can also generalize bounded-confidence models to incorporate population heterogeneity, such as by drawing cautiousness parameters from a distribution (analogous to what is done in threshold models of social influence Porter and Gleeson (2016); Watts (2002)) rather than using the same constant for all individuals, as openness to compromise is different for different people. Our regression approach should also be useful more generally for studying dynamical processes on networks, including more general structures such as multilayer networks Kivelä et al. (2014), temporal networks Holme and Saramäki (2012), and adaptive networks Sayama et al. (2013).
Appendix A Statistical Analysis
In this appendix, we illustrate our statistical analysis in detail. For concreteness, we discuss our analysis in the context of the Deffuant model on complete graphs. We performed the same procedure for all of our regression analyses.
The scatter plots in Fig. 19 suggest that the convergence time () depends on the number of participating agents (), the population’s confidence bound (), and possibly on their cautiousness (which we measure using the multiplier ). In particular, the relationship between and seems to undergo a transition at a critical value , below which we observe a larger variation in . In Fig. 20, we show separate scatter plots for and to illustrate the qualitatively distinct behavior in the two regimes.
First, let us consider the case . We start by fitting a normal linear model
[TABLE]
where (with ) are coefficients to be estimated and we assume that is an independent and normally distributed error with mean [math] and constant variance for every observation. To account for the curvature observed in Fig. 20, we include explanatory variables up to the second order in the model in Eq. (16). We will subsequently drop statistically insignificant variables in a model-reduction procedure.
Before proceeding with model selection, we check the validity of our model assumptions. In Fig. 21, we check the assumption that the errors have [math] mean and constant variance by plotting studentized residuals versus the response values predicted by Eq. (16). Ideally, variance should be constant in the vertical direction, and the scatter should be symmetric vertically about [math]. However, Fig. 21 indicates that the variance is not constant, as the points follow a clear wedge-shaped pattern, with the vertical spread of the points increasing with the fitted values. In Fig. 21, we check the assumption of normality by plotting the sample quantiles versus the quantiles of a normal distribution. Data generated from a normal distribution should closely follow the line through the origin, but this is contradicted by the Q–Q plot in Fig. 21. Therefore, the diagnostics show the necessity of stabilizing the variance and thereby making the data more like a normal distribution.
The one-parameter Box–Cox method Box and Cox (1964) is a popular way to determine a transformation on strictly positive responses Osborne (2010). A Box–Cox transformation maps a response $y$ to $g_\lambda(y)$, where the family of transformations indexed by $\lambda$ is defined by
$$
g_\lambda(y) =
\begin{cases}
\dfrac{y^\lambda - 1}{\lambda}\,, & \lambda \neq 0\,, \\
\ln y\,, & \lambda = 0\,.
\end{cases}
\qquad (17)
$$
In Fig. 22, we show that the confidence interval for $\lambda$ at the confidence level is roughly . We choose to set $\lambda = 0$, as this corresponds to taking a natural logarithm. The diagnostics of the new model suggest another log transformation, leading to the model
[TABLE]
where we assume that is an independent and normally distributed error with mean 0 and constant variance for every observation. The variance for is not necessarily the same for Eqs. (16) and (18). However, we use the same notation for , with the understanding that it is allowed to be different for different models.
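The Box–Cox step can be illustrated with a profile log-likelihood over a grid of transformation parameters. This is a minimal sketch on synthetic data. The 95% level (chi-squared cutoff 3.841 with one degree of freedom), the grid, and the helper names are assumptions made for this example, and the interval it prints is unrelated to the one in Fig. 22.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform of strictly positive y; lam = 0 gives ln(y)."""
    logy = np.log(np.asarray(y, dtype=float))
    # expm1 keeps (y**lam - 1) / lam accurate when lam is close to 0.
    return logy if lam == 0 else np.expm1(lam * logy) / lam

def profile_loglik(A, y, lam):
    """Profile log-likelihood of lam for the model g_lam(y) = A @ beta + error."""
    n = len(y)
    z = box_cox(y, lam)
    beta, *_ = np.linalg.lstsq(A, z, rcond=None)
    rss = np.sum((z - A @ beta) ** 2)
    # The last term is the log-Jacobian of the transformation.
    return -0.5 * n * np.log(rss / n) + (lam - 1.0) * np.sum(np.log(y))

def box_cox_interval(A, y, grid, chi2_cut=3.841):
    """Grid maximizer and the range of grid points within chi2_cut / 2 of the maximum."""
    ll = np.array([profile_loglik(A, y, lam) for lam in grid])
    inside = grid[ll >= ll.max() - 0.5 * chi2_cut]
    return float(grid[np.argmax(ll)]), float(inside.min()), float(inside.max())

# Synthetic positive response whose logarithm is linear in x (illustration only).
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, size=300)
y = np.exp(1.0 + 2.0 * x + 0.3 * rng.normal(size=300))
A = np.column_stack([np.ones_like(x), x])

lam_hat, lo, hi = box_cox_interval(A, y, np.linspace(-1.0, 1.0, 201))
print(lam_hat, lo, hi)
```

When the interval is narrow and close to 0, choosing the logarithm is a natural and easily interpretable option.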
This time, Fig. 23 shows approximately constant variance in the vertical direction, and the scatter is roughly symmetric vertically about 0. There are no studentized residuals outside the range, revealing no serious outliers. In Fig. 23, the points closely follow the line through the origin. Therefore, our model assumptions appear to be reasonable for Eq. (18).
It is also important to minimize the number of regression terms in our models. AIC-based model selection drops the first-order term of and all terms that include to yield Eq. (4). The diagnostic graphs of Eq. (4) are similar to those in Fig. 23 and are therefore acceptable.
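To make the idea of AIC-based model reduction concrete, the sketch below runs a simple backward elimination on synthetic data. It uses the Gaussian AIC up to an additive constant, n ln(RSS/n) + 2k, and a greedy rule that removes the first column whose removal lowers the AIC. Both the greedy rule and the helper names `aic` and `backward_select` are assumptions for this example; the text does not specify the exact selection procedure.

```python
import numpy as np

def aic(A, y):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2 * k."""
    n, k = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return n * np.log(rss / n) + 2 * k

def backward_select(A, y, keep=(0,)):
    """Repeatedly drop a column whose removal lowers the AIC; never drop `keep`."""
    cols = list(range(A.shape[1]))
    best = aic(A[:, cols], y)
    improved = True
    while improved:
        improved = False
        for c in [c for c in cols if c not in keep]:
            trial = [d for d in cols if d != c]
            score = aic(A[:, trial], y)
            if score < best:                 # accept the first improvement
                best, cols, improved = score, trial, True
                break
    return cols, best

# Synthetic data: only columns 1 and 2 (plus the intercept) carry signal.
rng = np.random.default_rng(3)
Z = rng.normal(size=(200, 4))
A = np.column_stack([np.ones(200), Z])
y = 1.0 + 2.0 * Z[:, 0] - 1.5 * Z[:, 1] + rng.normal(0.0, 0.5, size=200)

cols, best = backward_select(A, y)
print(cols, best, aic(A, y))
```

Because the AIC penalty is only 2 per coefficient, a pure-noise column can occasionally survive, so the retained set should still be checked with diagnostics.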
Cook’s distance Cook (1977) measures the influence of a data point in a least-squares regression analysis. A commonly used threshold for detecting highly influential observations is , where is the number of observations and is the number of fitting parameters. Fig. 24 reveals influential observations. We remove these points and give the resulting estimates (accurate to significant figures) for the coefficients (with ) of Eq. (4) in Table 2.
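For readers who want to reproduce this kind of influence check, the following sketch computes Cook's distance from the residuals and leverages of a least-squares fit and then refits without the flagged points. The data are synthetic with one planted influential point. The cutoff `4 / n` is a common rule of thumb used here only as an assumption; it is not necessarily the threshold used in the analysis above.

```python
import numpy as np

def cooks_distance(A, y):
    """Cook's distance D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2 for each row."""
    n, p = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    Q, _ = np.linalg.qr(A)                   # thin QR; the hat matrix is Q @ Q.T
    h = np.sum(Q ** 2, axis=1)               # leverages
    s2 = np.sum(resid ** 2) / (n - p)        # estimate of the error variance
    return resid ** 2 / (p * s2) * h / (1.0 - h) ** 2

# Synthetic linear data with one planted high-leverage outlier (index 50).
rng = np.random.default_rng(4)
x = rng.uniform(0.0, 1.0, size=50)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, size=50)
x = np.append(x, 3.0)
y = np.append(y, -10.0)                      # the trend would give about 7 here
A = np.column_stack([np.ones_like(x), x])

d = cooks_distance(A, y)
threshold = 4.0 / len(y)                     # assumed rule of thumb, not the paper's
flagged = np.flatnonzero(d > threshold)

# Refit after removing the flagged observations.
A_clean, y_clean = np.delete(A, flagged, axis=0), np.delete(y, flagged)
beta_clean, *_ = np.linalg.lstsq(A_clean, y_clean, rcond=None)
print(flagged, beta_clean)
```

Comparing the coefficients before and after removal shows how strongly a few observations can steer the fit.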
In Table 16, we summarize the values of AIC and for the regression models that we consider for simulations on complete graphs with . The substantial increase in and decrease in AIC indicate that our final model (see Eq. (4)) has a much better goodness-of-fit and a considerably simpler form than our original model (see Eq. (16)).
For , we go through a similar model-selection process for and thereby obtain
[TABLE]
We include an term in the full model (see Eq. (19)) to account for the linear dependence of on that Fig. 20 suggests. AIC-based model selection indicates the statistical significance of the term. For Eq. (19), we obtain and . In Table 17, we give the estimates for the coefficients (with ) of Eq. (19).
Table 17 suggests combining and into a single term , and it also suggests combining and into . The model with the combined terms has and , which are very close to those of Eq. (19) but have two fewer coefficients to estimate. Therefore, we update our model for to obtain the simpler model in Eq. (5).
Acknowledgements
We thank Mariano Beguerisse Díaz for helpful comments.
References
- [1] Jackson (2008): M. O. Jackson, Social and Economic Networks (Princeton University Press, Princeton, 2008).
- [2] Couzin et al. (2005): I. D. Couzin, J. Krause, N. R. Franks, and S. A. Levin, Nature 433, 513 (2005).
- [3] De Groot (1974): M. H. De Groot, Journal of the American Statistical Association 69, 118 (1974).
- [4] Oliver et al. (1985): P. Oliver, G. Marwell, and R. Teixeira, American Journal of Sociology 91, 522 (1985).
- [5] Siegel (2009): D. A. Siegel, Am. J. Polit. Sci. 53, 122 (2009).
- [6] Jackson and Yariv (2011): M. O. Jackson and L. Yariv, in Handbook of Social Economics, edited by J. Benhabib, A. Bisin, and M. O. Jackson (North Holland Press, 2011), pp. 646–678.
- [7] Jia et al. (2015): P. Jia, A. Mir Tabatabaei, N. E. Friedkin, and F. Bullo, SIAM Review 57, 367 (2015).
- [8] Castellano et al. (2009): C. Castellano, S. Fortunato, and V. Loreto, Rev. Mod. Phys. 81, 591 (2009).
