Influencer identification in dynamical complex systems

Sen Pei; Jiannan Wang; Flaviano Morone; Hernán A Makse

arXiv:1907.13017·physics.soc-ph·August 30, 2019

TL;DR

This review surveys recent methods for identifying key influencers in complex systems, focusing on structural importance and dynamic impact, with applications across various disciplines.

Contribution

It provides a comprehensive overview of state-of-the-art influencer identification techniques from multiple perspectives and objectives.

Findings

01

Discusses minimal node removal for network breakdown

02

Surveys methods for locating nodes affecting global dynamics

03

Highlights differences between structural and dynamic influencer identification

Abstract

The integrity and functionality of many real-world complex systems hinge on a small set of pivotal nodes, or influencers. In different contexts, these influencers are defined as either structurally important nodes that maintain the connectivity of networks, or dynamically crucial units that can disproportionately impact certain dynamical processes. In practice, identification of the optimal set of influencers in a given system has profound implications in a variety of disciplines. In this review, we survey recent advances in the study of influencer identification developed from different perspectives, and present state-of-the-art solutions designed for different objectives. In particular, we first discuss the problem of finding the minimal number of nodes whose removal would break down the network (i.e., the optimal percolation or network dismantle problem), and then survey methods to…

Tables

Table 1: Summary of methods developed for optimal percolation. $N$ is the network size and $M$ is the number of links. “NR” stands for “not reported”.

Name	Description	Complexity	Ref
Collective Influence (CI)	Stability of the NB matrix, greedy approach, easy to interpret and implement, adapted in real-world problems	$O (N \log N)$	[20]
Network dismantle	Optimal decycling, belief-propagation approach, Min-Sum algorithm, solve by iteration until convergence, compute the optimal node set simultaneously	$O (M T)$ per iteration	[21]
Belief-propagation-guided decimation (BPD)	Minimum feedback vertex set, spin glass model, belief-propagation approach, no convergence needed in BP iteration, select nodes iteratively	$O (N \log N)$	[48]
Explosive Immunization (EI)	Explosive percolation, iteratively select less important nodes from candidates, based on score defined for each node	$O (N \log N)$	[49]
Equal graph partitioning (EGP)	Recursively partition networks into clusters of similar size, avoid breaking small clusters	NR	[26]
Generalized network dismantle (GND)	Consider costs of removing nodes, node-weighted partition, recursive equal-size partition, solved by spectral properties of a Laplacian and weighted vertex cover	$O (N \log^{2 + \epsilon} (N))$	[141]

Table 2: Summary of some methods developed for influence maximization in ICMs.

Name	Description	Ref
Monte Carlo simulations	Greedy approach, submodular function, performance guaranteed within a factor $(1 - 1 / e)$	[22]
Cost-Effective Forward algorithm (CEF)	Consider non-unit cost, performance guaranteed within a factor $(1 - 1 / e) / 2$	[146]
Cost-Effective lazy forward (CELF)	Fewer Monte Carlo simulations, higher efficiency, heap structure	[146]
Percolation-based approach	Map to bond percolation, use subgraph to estimate influence	[148]
Degree discount algorithm	Direct influence between immediate neighbors	[148]
Maximum influence arborescence (MIA)	Maximum influence path, Dijkstra shortest-path algorithm, arborescence structure, tradeoff between computational efficiency and performance	[149]
Community-based algorithm	Community detection, dynamic programming algorithm	[152]
Message-passing approach	Belief propagation, Max-Sum algorithm, SIR and SIS model, solved through iteration	[46]
Message-passing approach	Expected size of epidemic outbreaks, insensitive to origin, applicable to weighted networks	[182]
Sequential seeding	Initiate information seeds sequentially, trade-off between coverage and speed	[184]

Table 3: Summary of some methods developed for influence maximization in LTMs and k-core percolation.

Name	Description	Ref
Monte Carlo simulations	Greedy approach, submodular function, performance guaranteed within a factor $(1 - 1 / e)$	[22]
Percolation-based approach	Map to bond percolation, use subgraph to estimate influence, $O (N)$ complexity	[149]
Local directed acyclic graph (LDAG)	Decay of influence with the propagation length, discard the nodes with small influence	[149]
SIMPATH algorithm	Enumerating simple paths, look ahead optimization, trade-off between accuracy and efficiency	[147]
Balanced Index (BI)	Select nodes with high resistance to activation and with large out-degree, fast to compute	[192]
Group Performance Index (GPI)	Measure performance of each node as a seed when it is a part of randomly selected seed set	[192]
Belief-propagation algorithm	Large deviations of LTM dynamics, consider cost and revenue, Max-Sum equations, solved by iteration until convergence	[47]
Survey propagation like algorithm	Near-optimal performance, solved by iteration until convergence, applied to random regular graphs	[100]
Collective Influence in Threshold Model (CI-TM)	Greedy approach, based on linearized message-passing equations, use subcritical clusters to estimate influence, $O (N \log N)$ complexity	[194]
CoreHD	Remove high-degree nodes in 2-core, perform well on loopy networks, $O (N)$ complexity	[132]
WEAK-NEIGHBOR	Improvement over the generalized CoreHD algorithm and CI-TM algorithm, $O (N)$ complexity	[133]

Equations

$$\nu_{i\to j}=n_{i}\left[1-\prod_{m\in\partial i\setminus j}(1-\nu_{m\to i})\right],$$

$$\nu_{i}=n_{i}\left[1-\prod_{m\in\partial i}(1-\nu_{m\to i})\right].$$

$$\mathcal{M}_{m\to n,i\to j}=n_{i}\,\mathcal{B}_{m\to n,i\to j},$$

$$\mathcal{B}_{m\to n,i\to j}=\begin{cases}1 & \text{if } n=i \text{ and } j\neq m,\\ 0 & \text{otherwise.}\end{cases}$$

$$CI_{\ell}(i)=(k_{i}-1)\sum_{j\in\partial Ball(i,\ell)}(k_{j}-1),$$

$$q^{dec}(A)=\frac{1}{N}\sum_{i=1}^{N}\delta_{A_{i}}^{0},$$

$$C_{ij}(A_{i},A_{j})$$

$$Z(\mu)=\sum_{A}e^{\mu N(1-q^{dec}(A))}\prod_{(i,j)\in G}C_{ij}(A_{i},A_{j}),$$

$$x_{i}^{(t+1)}=\begin{cases}1 & \text{if } x_{i}^{t}(S)=1,\\ \mathbb{I}\left[\sum_{j\in\partial i}(1-x_{j}^{t}(S))\leq 1\right] & \text{if } x_{i}^{t}(S)=0.\end{cases}$$

$$\hat{\eta}(S)=\frac{1}{Z(\mu)}e^{\mu|S|}\prod_{i\in V}\mathbb{I}\left[x_{i}^{*}(S)=1\right],$$

$$t_{i}(S)=\phi_{i}(\{t_{j}\}_{j\in\partial i})=1+\max\nolimits_{2}(\{t_{j}(S)\}_{j\in\partial i}),$$

$$Z(\mu)=\sum_{\{t_{i}\}}e^{\mu\sum_{i}\psi_{i}(t_{i})}\prod_{i\in V}\mathbb{I}\left[t_{i}<\infty\right]\mathbb{I}\left[t_{i}=\phi_{i}(\{t_{j}\}_{j\in\partial i})\right],$$

$$\eta_{ij}(t_{i},t_{j})\propto\sum_{\{t_{k}\}_{k\in\partial i\setminus j}}e^{\mu\psi_{i}(t_{i})}\,\mathbb{I}\left[t_{i}=\phi_{i}(\{t_{k}\}_{k\in\partial i})\right]\prod_{k\in\partial i\setminus j}\eta_{ki}(t_{k},t_{i}).$$

$$\sigma_{i}^{(2)}=\begin{cases}\infty & \text{if } G_{\infty}\subsetneq N_{i},\\ |N_{i}| & \text{else, if } \arg\min_{i}|N_{i}| \text{ is unique},\\ |N_{i}|+\epsilon|C_{2}| & \text{else,}\end{cases}$$

$$\frac{1}{2}\sum_{i,j}-\frac{1}{2}(v_{i}v_{j}-1)A_{i,j}(w_{i}+w_{j}-1),$$

$$\varepsilon(s,m)=\mu\sum_{i\in V}s_{i}c_{i}+\epsilon\sum_{i\in V}m_{i},$$

$$m_{ij}=q+(1-q)\left[1-\prod_{k\in\partial i\setminus j}(1-p\,m_{ki})\right],$$

$$m_{i}=q+(1-q)\left[1-\prod_{k\in\partial i}(1-p\,m_{ki})\right].$$

$$t_{i}=\phi_{i}(\{t_{j}\})=\min\left\{t\in T:\sum_{j\in\partial i}\omega_{ji}\,\mathbb{I}\left[t_{j}<t\right]\geq\theta_{i}\right\}.$$

$$P(t)=\frac{1}{Z}e^{-\beta\varepsilon(t)}\prod_{i\in V}\psi_{i}(t_{i},\{t_{j}\}),$$

$$P_{j}(t_{j})\propto\sum_{\{t_{i}\}_{i\in\partial j}}e^{-\beta\varepsilon_{j}(t_{j})}\psi_{j}(t_{j},\{t_{i}\})\prod_{i\in\partial j}H_{ij}(t_{i},t_{j}),$$

$$H_{ij}(t_{i},t_{j})\propto e^{-\beta\varepsilon_{i}(t_{i})}\sum_{\{t_{k}\}}\psi_{i}(t_{i},\{t_{k}\})\prod_{k}H_{ki}(t_{k},t_{i}).$$

Full text

Influencer identification in dynamical complex systems

Sen Pei∗,†

Department of Environmental Health Sciences,

Mailman School of Public Health, Columbia University,

New York, NY 10032, USA

Jiannan Wang†

Research Institute of Frontier Science,

Beihang University, Beijing 100191, China

Levich Institute and Physics Department,

City College of New York, New York, NY 10031, USA

Flaviano Morone

Levich Institute and Physics Department,

City College of New York, New York, NY 10031, USA

and

Hernán A Makse∗

Levich Institute and Physics Department,

City College of New York, New York, NY 10031, USA

∗[email protected] (SP), [email protected] (HAM)

†These authors contributed equally to this work

Abstract

The integrity and functionality of many real-world complex systems hinge on a small set of pivotal nodes, or influencers. In different contexts, these influencers are defined as either structurally important nodes that maintain the connectivity of networks, or dynamically crucial units that can disproportionately impact certain dynamical processes. In practice, identification of the optimal set of influencers in a given system has profound implications in a variety of disciplines. In this review, we survey recent advances in the study of influencer identification developed from different perspectives, and present state-of-the-art solutions designed for different objectives. In particular, we first discuss the problem of finding the minimal number of nodes whose removal would break down the network (i.e., the optimal percolation or network dismantle problem), and then survey methods to locate the essential nodes that are capable of shaping global dynamics with either continuous (e.g., independent cascade models) or discontinuous phase transitions (e.g., threshold models). We conclude the review with a summary and an outlook.

1 Introduction

A wide variety of phenomena in nature and society can be unified under the umbrella of dynamical complex systems. Important social and biological processes such as epidemic outbreaks in populations [1], information diffusion in social media [2], signal transmission in brain networks [3] and the dynamical evolution of ecosystems [4] all boil down to interactions among large numbers of building units of each system, and can therefore be properly described by dynamical models on complex networks [5, 6, 7, 8]. In these systems, complex interactions at the microscopic scale lead to the abundant dynamical behaviors we observe at the macroscopic level. As a result, understanding how network structure impacts the function of dynamical complex systems has become a central topic in modern network science.

In network science, it has been well established that the collective dynamics of complex systems can be shaped by a small number of essential nodes, or influencers. For example, opinion leaders in social media are capable of influencing the public viewpoint on certain trending topics [9]; critical regions in the brain are essential in the formation of memory networks [3, 10, 11, 12]; and keystone species in ecology are responsible for the integrity and stability of ecosystems [13, 14, 15, 16]. Numerical simulations of epidemic processes have also demonstrated that the location of the epidemic origin is critical for the final outbreak size [17] (see Figure 1 for an example).

In general, influencers can be loosely defined as the nodes that are disproportionately “important” to the function of complex systems. In a given context, however, influencers may have a more specific definition: in social networks, influencers are opinion leaders who can influence a large number of people; in brain networks, influencers are important regions that maintain the connection across different functional parts; in ecological systems, influencers are keystone species whose extinction would collapse the network; and in epidemic spreading, influencers are superspreaders who transmit infectious diseases to a large population. Many previous studies have explored how to find influencers in a specific system. (For instance, in social science, various centrality measures have been developed to rank users’ importance in social networks [19].) Because the scope of this literature is vast, we do not attempt to summarize all relevant works here, but instead focus on two important problems with wide applications.

• First, how can we find the structural influencers whose removal would fragment the network? This problem, named optimal percolation [20] or network dismantle [21], is purely structural and does not involve dynamical processes.

• Second, how can we find the dynamical influencers who can trigger the largest cascade under a given spreading model? This problem, named influence maximization [22], depends on both network structure and dynamical rules.

In the following discussion, we refer to the above two problems collectively as influencer identification. The specific definition of influencers is thus context-dependent. Solutions to these two problems can be applied in real-world settings ranging from the maximization of marketing in social networks [22, 23, 24] and the optimization of immunization campaigns [25, 26, 27] to the protection of networks under malicious attacks [28, 29, 30].

Real-world dynamical complex systems generally fall into two major classes:

• Systems with only positive interactions. For instance, in online social media, social ties can facilitate the spread of information among users [9]; in human populations, physical contacts may transmit infectious diseases from person to person [31]; and in mutualistic ecosystems, cooperation between different species benefits their survival [32].

• Systems with both positive and negative interactions. For instance, in neural systems, synaptic connections can be either excitatory or inhibitory [33]; in gene regulatory networks, molecular regulators can activate or inhibit the expression of certain genes [34]; and in ecosystems, both mutualistic and predator-prey relationships coexist among different species [32].

For systems with only positive interactions, influencer identification can be defined using key topological structures such as the giant component (GC) [35, 36, 37] and k-core [38, 39]. In contrast, systems with both positive and negative interactions do not admit the classical definitions of the GC and k-core, so the influencer identification problem in these systems needs to be treated with a different theory. In this review, we only consider the former case, where all links represent positive interactions; the case of inhibition/activation interactions will be treated elsewhere.

For structural influencer identification, the solution depends only on the network structure. For dynamical influencer identification, however, spreading models can be further divided into two classes, with continuous (second order) and discontinuous (first order) phase transitions. In the regular percolation process [35] and independent cascade models [40], the GC emerges continuously from zero size as links are gradually occupied [41]. In contrast, in k-core percolation [42] and threshold models [43], a k-core structure with non-zero size can appear abruptly as more nodes are activated [44, 45]. For these two types of dynamical models, approaches to find influencers are qualitatively different. We therefore discuss the influencer identification problem for models with continuous and discontinuous phase transitions separately.

Heuristically, influencers can be selected by picking vital individual spreaders one by one using a greedy approach, in which the influence of single nodes is estimated via Monte Carlo simulations [22] or various centrality measures [19]. However, influencer identification is intrinsically an NP-hard combinatorial optimization problem [22], and the joint influence of a set of nodes is not simply the sum of their individual influences. Therefore, a collective point of view that considers interactions among multiple spreaders is required. Recent progress has translated the influencer identification problem into other closely related optimization problems such as message passing [46], belief propagation [47], optimal percolation [20], optimal decycling [21, 48] and explosive percolation [49]. These new approaches have enriched our understanding of feasible directions to tackle the influencer identification problem, and provided a number of sophisticated yet efficient methods that are applicable to large-scale complex systems. Classical centrality-based approaches have been extensively discussed in the previous literature. As a result, in this survey, we focus on developments based on other approaches. Readers who are interested in centralities can find details in Refs. [50, 51, 52] and references therein.

The paper is organized as follows. In Section 2, the GC and k-core structure, on which the influencer identification problem is defined, are introduced. In this section, we discuss the links between regular percolation and independent cascade models, as well as the relationship between k-core percolation and threshold models. In Section 3, we discuss the progress in optimal percolation using collective influence, optimal decycling, explosive percolation and large deviations of percolation in detail. This section summarizes recent developments in finding the minimal set of nodes to collapse a network, i.e., the structural influencer identification problem. In Section 4, approaches for models with continuous transitions using greedy search and message passing are presented. In Section 5, methods developed for threshold models with discontinuous transitions are reported. Section 4 and Section 5 survey the methods to solve the dynamical influencer identification problem, with a focus on dynamical models with continuous and discontinuous transitions respectively. Lastly, we conclude the review with an outlook on future directions in Section 6.

2 Giant component and k-core structure

The topological features of a network can be characterized by important structures such as the giant component and k-core. These concepts are fundamental in defining the problem of influencer identification in various dynamical processes. In this section, we introduce the regular percolation and k-core percolation processes, which are used to define the influencer identification problem, and elucidate their connections with commonly used spreading models.

2.1 Percolation and independent cascade models

The connectivity of a network is characterized by the number of nodes in the largest connected component, or the giant component (GC) $G_{\infty}$. In the random graph theory established by Paul Erdős and Alfréd Rényi in the 1960s [35], the percolation process describes the emergence of $G_{\infty}$ as the probability of connection between any pair of nodes is gradually increased [53]. In the inverse process, the giant component $G_{\infty}$ of an initially connected network collapses as an increasing fraction $q$ of nodes or links is removed. This removal process, termed site or bond percolation, leads to a continuous phase transition at a critical value of $q$, above which only fragmented clusters remain, as shown in Fig. 2(a).

For a network $G(V,E)$ with $N=|V|$ nodes and $M=|E|$ edges, we can use a vector $\bm{n}=(n_{1},\cdots,n_{N})$ to represent the configuration of whether a node $i$ is removed ($n_{i}=0$) or not ($n_{i}=1$). After the removal of a fraction $q=1-\sum_{i=1}^{N}n_{i}/N$ of nodes, we define the size of the remaining giant component $G_{\infty}(q)$ as the ratio of the number of nodes in $G_{\infty}$ to the network size $N$. In classical percolation theory [35], nodes are deleted randomly. At the critical value $q_{rand}$, $G_{\infty}$ is completely dismantled and becomes negligible compared with the network size $N$ in a continuous, or second order, phase transition. In the thermodynamic limit $N\to\infty$, we have $\lim_{q\to q_{rand}^{-}}G_{\infty}(q)=0$ and $G_{\infty}(q)=0$ for $q\geq q_{rand}$. In real-world networks, the critical point $q_{rand}$ under random attack depends on the heterogeneity of the network structure. In particular, using generating functions, it was shown that for random networks with a given degree distribution $P(k)$, $q_{rand}$ is estimated by $1-\langle k\rangle/(\langle k^{2}\rangle-\langle k\rangle)$, where $\langle k\rangle=\sum kP(k)$ and $\langle k^{2}\rangle=\sum k^{2}P(k)$ are the first and second moments of $P(k)$ [54]. Because $q$ is the removed fraction, this is one minus the critical occupation probability $\langle k\rangle/(\langle k^{2}\rangle-\langle k\rangle)$. The estimate predicts an extreme robustness to random attack, i.e., $q_{rand}\approx 1$, for scale-free networks with a power law degree distribution $P(k)\propto k^{-\gamma}$ and a very large second moment $\langle k^{2}\rangle$; such networks are ubiquitous in real-world systems [55, 56]. Later, more accurate estimates using the reciprocal of the largest eigenvalue of the adjacency matrix and of the non-backtracking matrix were developed [57, 58].
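As a rough numerical illustration of the moment-based estimate above, the following sketch computes the random-removal threshold from a degree sequence. It assumes that $q$ denotes the removed fraction, so the threshold is written as one minus the critical occupation probability $\langle k\rangle/(\langle k^{2}\rangle-\langle k\rangle)$. The function name and the two example degree sequences are hypothetical and are not taken from the review.

```python
def random_removal_threshold(degrees):
    """Estimate the critical removed fraction q_rand from a degree sequence.

    Assumption: uncorrelated random network, for which the critical
    occupation probability is <k> / (<k^2> - <k>); the critical removed
    fraction is one minus that value, clipped to [0, 1].
    """
    n = len(degrees)
    k1 = sum(degrees) / n                  # first moment <k>
    k2 = sum(k * k for k in degrees) / n   # second moment <k^2>
    if k2 <= k1:
        # Every node has degree <= 1: there is no giant component to destroy.
        return 0.0
    p_c = k1 / (k2 - k1)                   # critical occupation probability
    return max(0.0, 1.0 - p_c)


# Homogeneous example: every node has degree 3.
q_regular = random_removal_threshold([3] * 100)

# Heterogeneous example: a few hubs inflate <k^2> and push q_rand toward 1.
q_hubs = random_removal_threshold([1] * 90 + [30] * 10)
```

The heterogeneous sequence yields a much larger threshold than the homogeneous one, which is the robustness-to-random-attack effect described in the text.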

In random (or regular) percolation, nodes are removed without considering their difference in structural importance. If nodes are removed strategically, however, $G_{\infty}$ can be destroyed well before a fraction $q_{rand}$ of nodes is removed. For example, $G_{\infty}$ in scale-free networks is extremely vulnerable to targeted attacks on hubs [28, 29, 59]. The optimized process, deviating from the mean-field dynamics given by classical percolation theory, expedites the collapse of $G_{\infty}$ and reduces the critical value of $q$. In statistical physics and network science, a number of works have explored the large deviations of percolation, i.e., the deviations from the mean-field theory of percolation [60, 61, 62, 63, 64]. At $q_{rand}$, there exist a number of possible configurations $\bm{n}$ such that $G_{\infty}(\bm{n})=0$. As $q$ decreases, fewer configurations satisfy $G_{\infty}(\bm{n})=0$, until, at $q_{c}$, only one configuration $\bm{n}^{*}$ remains. Below $q_{c}$, there is no solution to $G_{\infty}(q)=0$. Mathematically, $q_{c}=\min\{q\in[0,1]|G_{\infty}(q)=0\}$. The optimal percolation, or network dismantle, problem is to find the unique configuration $\bm{n}^{*}$ corresponding to the minimal $q_{c}$, and influencers are the nodes with $n_{i}=0$ [20]. We note that the general framework of large deviations of percolation includes the optimal percolation problem as an extremal case [60, 63].
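The contrast between removing arbitrary nodes and removing hubs can be sketched on a toy network. The example below is only an illustration under stated assumptions: the graph (two linked hubs with four leaves each), the static highest-degree ranking, and all function names are hypothetical, and a degree-based attack is a simple heuristic rather than the optimal configuration $\bm{n}^{*}$.

```python
from collections import deque


def giant_component_fraction(adj, removed):
    """Largest connected component size, as a fraction of all N nodes,
    after deleting the nodes in `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:                      # breadth-first search of one component
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(adj)


def highest_degree_nodes(adj, fraction):
    """Static hub-targeted attack: pick the given fraction of nodes by degree."""
    count = round(fraction * len(adj))
    ranked = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
    return set(ranked[:count])


# Toy network: hubs 0 and 1 are linked; each hub has four leaves.
edges = [(0, 1)] + [(0, leaf) for leaf in range(2, 6)] + [(1, leaf) for leaf in range(6, 10)]
adj = {u: set() for u in range(10)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

after_hub_attack = giant_component_fraction(adj, highest_degree_nodes(adj, 0.2))
after_leaf_removal = giant_component_fraction(adj, {2, 6})
```

Removing the two hubs leaves only isolated nodes, whereas removing the same number of leaves leaves the rest of the network connected.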

By definition, the percolation process concerns the purely structural integrity of networks. Nevertheless, a class of spreading dynamics can be mapped to the percolation process and therefore studied using percolation theory. One of these dynamics is described by independent cascade models (ICMs) [1, 7, 22]. In ICMs, an individual can be independently infected by any of his/her neighbors in the network. A spreading process starts from a set of infectious “seeds” in a susceptible population. In each time step, a susceptible individual can become infected by each of his/her infected neighbors with a certain transmission probability. Infected individuals remain infectious for a period of time, and then become susceptible again or immune to infection. The spreading process stops when there are no new infections. In applications, the most widely used models include the susceptible-infected (SI) model, the susceptible-infected-susceptible (SIS) model and the susceptible-infected-removed (SIR) model [65, 66, 67]. These models are widely used in the simulation [1, 59, 68, 69, 70, 71], detection [72, 73, 74, 75, 76, 77] and forecast [78, 79, 80, 81, 82] of infectious disease spread and information diffusion. ICMs are closely related to percolation: the dynamical spreading process of an ICM can be mapped to a bond percolation process with a given occupation probability [37]. As a result, the outcome of a dynamical ICM can be mapped to the static final state of an equivalent percolation process. This mapping bypasses the need to run dynamical models and enables us to analyze the spreading process using the tools and properties of the well-studied percolation problem.
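The mapping from an ICM to bond percolation can be sketched as follows. This is a minimal illustration, assuming the simplest independent cascade rule in which every infected node gets exactly one chance to infect each neighbor with the same probability `p`; under that assumption, the final outbreak from a seed set is the set of nodes reachable from the seeds through independently occupied links. The function names, the uniform `p`, and the integer node labels are assumptions made for the example.

```python
import random
from collections import deque


def sample_outbreak(adj, seeds, p, rng):
    """One sample of the final outbreak: occupy each undirected link with
    probability p, then collect every node reachable from the seeds."""
    occupied = {u: set() for u in adj}
    for u in adj:
        for v in adj[u]:
            if u < v and rng.random() < p:   # visit each undirected link once
                occupied[u].add(v)
                occupied[v].add(u)
    reached = set(seeds)
    queue = deque(reached)
    while queue:
        u = queue.popleft()
        for v in occupied[u]:
            if v not in reached:
                reached.add(v)
                queue.append(v)
    return reached


def mean_outbreak_size(adj, seeds, p, runs=1000, seed=0):
    """Monte Carlo estimate of the expected outbreak size over percolation samples."""
    rng = random.Random(seed)
    total = sum(len(sample_outbreak(adj, seeds, p, rng)) for _ in range(runs))
    return total / runs


# Path 0-1-2-3 plus an isolated node 4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}
full = sample_outbreak(adj, [0], 1.0, random.Random(1))
none = sample_outbreak(adj, [0], 0.0, random.Random(1))
average = mean_outbreak_size(adj, [0], 0.5)
```

No time-stepped simulation is needed: each sample is a static percolation configuration, which is the point of the mapping described above.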

2.2 K-core percolation and threshold models

K-core decomposition classifies networks into layers with increasingly dense connections. In a network, the k-core is defined as the largest subgraph in which every node has at least $k$ links to other nodes of the subgraph [38, 39]. For example, the 1-core of a network is simply its GC; the 2-core is composed of all loops. Each node corresponds to a unique k-core index $k_{S}$ that indicates the highest k-core to which it belongs. The k-core index $k_{S}$ is obtained through k-shell decomposition, in which nodes are iteratively pruned according to their remaining degrees [83]. This process can also be viewed as a recursive calculation of the Hirsch-index $h$ [84], in which a node is assigned index $h$ if it has at least $h$ neighbors with degree no smaller than $h$ [85]. Nodes with low $k_{S}$ values are located at the periphery of the network while the center consists of nodes with high $k_{S}$ values. An example of k-core decomposition is shown in Fig. 3. Recently, k-core percolation was generalized to multiplex networks [86].
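The k-shell pruning described above can be sketched in a few lines. This is a straightforward, unoptimized version written for readability; the function name and the toy graph are hypothetical.

```python
def k_shell_index(adj):
    """Return {node: k_S} by iteratively pruning nodes by remaining degree."""
    degree = {u: len(adj[u]) for u in adj}
    remaining = set(adj)
    k_s = {}
    k = 0
    while remaining:
        # Nodes whose remaining degree is at most k belong to shell k.
        stack = [u for u in remaining if degree[u] <= k]
        if not stack:
            k += 1          # nothing left to peel at this level
            continue
        while stack:
            u = stack.pop()
            if u not in remaining:
                continue    # already pruned (the stack may hold duplicates)
            remaining.discard(u)
            k_s[u] = k
            for v in adj[u]:
                if v in remaining:
                    degree[v] -= 1
                    if degree[v] <= k:
                        stack.append(v)
    return k_s


# Triangle 0-1-2, a pendant node 3 attached to node 0, and an isolated node 4.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: set()}
shells = k_shell_index(adj)
```

In this toy graph the triangle forms the 2-shell, the pendant node sits in the 1-shell, and the isolated node gets index 0, even though node 0 has the largest degree.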

K-core structure provides higher-order information on network connectivity beyond the giant component, which can be viewed as a 1-core. Originally proposed on lattices in statistical physics [87], k-core percolation (or bootstrap percolation) describes the formation process of the k-core in networks [44]. In standard k-core percolation, nodes in a given network can be either active or inactive. Initially, a fraction $p$ of nodes is activated; in later steps, inactive nodes with at least $k$ active neighbors become activated recursively. In the final state, active nodes form the percolated k-core. The reverse process of k-core percolation depicts the destruction of the k-core structure. Specifically, a fraction $q$ of nodes is removed from the network, and nodes with fewer than $k$ neighbors are then recursively deleted. The final size of the k-core $|k_{S}|$ is the fraction of nodes left.
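The reverse (pruning) process can be sketched as below: delete a chosen set of nodes, then recursively delete every node left with fewer than $k$ surviving neighbors. The function name and the complete-graph example are hypothetical.

```python
def k_core_after_removal(adj, k, removed=()):
    """Nodes that survive in the k-core once `removed` has been deleted."""
    alive = set(adj) - set(removed)
    degree = {u: sum(1 for v in adj[u] if v in alive) for u in alive}
    stack = [u for u in alive if degree[u] < k]
    while stack:
        u = stack.pop()
        if u not in alive:
            continue            # already pruned (the stack may hold duplicates)
        alive.discard(u)
        for v in adj[u]:
            if v in alive:
                degree[v] -= 1
                if degree[v] < k:
                    stack.append(v)
    return alive


# Complete graph on four nodes: every node has exactly three neighbors.
adj = {u: {v for v in range(4) if v != u} for u in range(4)}

intact_3core = k_core_after_removal(adj, 3)
collapsed_3core = k_core_after_removal(adj, 3, removed=[0])
surviving_2core = k_core_after_removal(adj, 2, removed=[0])
```

Here the removal of a single node destroys the whole 3-core while a 2-core survives, a small-scale version of the abrupt collapse that k-core percolation can exhibit.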

K-core percolation has many important variants developed independently in other disciplines. For instance, in sociology, Granovetter proposed the threshold model of collective behavior in society in 1978 [43]. In the well-studied linear threshold models (LTMs), nodes are activated only when the number of active neighbors exceeds a predefined threshold value. Heterogeneous k-core percolation, in which each node is assigned a local threshold, is a generalization of classical k-core percolation and a special case of the LTM [88, 89, 90]. Within this framework, classical GC percolation can be viewed as a special case of the LTM where the threshold of each node $i$ is $k_{i}-1$ ($k_{i}$ is the degree of node $i$) [20]. Further, weights of interactions and nonlinear threshold rules have been introduced to describe more complex dynamics [22, 45, 91]. Recently, the k-core was also applied as a precursor of the jamming transition in granular materials [92].
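A threshold cascade with node-specific thresholds can be sketched as follows. The sketch assumes unweighted links and the convention that node $i$ activates once it has at least `thresholds[i]` active neighbors (matching the "at least $k$ active neighbors" rule of k-core percolation); models that require the count to strictly exceed the threshold differ by one. All names and the four-node ring are hypothetical.

```python
def threshold_cascade(adj, thresholds, seeds):
    """Final active set of a threshold model with per-node thresholds."""
    active = set(seeds)
    active_neighbors = {u: 0 for u in adj}   # active neighbors counted so far
    frontier = list(active)
    while frontier:
        u = frontier.pop()                   # each active node is processed once
        for v in adj[u]:
            if v in active:
                continue
            active_neighbors[v] += 1
            if active_neighbors[v] >= thresholds[v]:
                active.add(v)
                frontier.append(v)
    return active


# Ring of four nodes; every node needs two active neighbors.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
thresholds = {u: 2 for u in adj}

opposite_seeds = threshold_cascade(adj, thresholds, [0, 2])
single_seed = threshold_cascade(adj, thresholds, [0])
```

Two seeds on opposite sides of the ring activate the whole network, while a single seed activates nothing else, which illustrates why seeds have to be judged collectively rather than one at a time.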

More recently, a generalized k-core percolation was proposed as a generalization of the leaf removal process [93]. In this $k$-leaf removal algorithm, nodes of degree smaller than $k$ and their nearest neighbors, together with all incident links, are recursively pruned. The subgraph left after this pruning is called the Generalized $k$-core, or $Gk$-core. Similar to k-core percolation, the pruning procedure decomposes the network into layers of nested $Gk$-cores. However, as indicated by the authors, unlike k-core decomposition, which classifies nodes according to their topological properties, the $Gk$-cores characterize a specific robustness of the network: a $Gk$-core is the network that remains after an epidemic that attacks weak individuals of degree less than $k$ and their neighbors.

The fundamental difference between k-core percolation and GC percolation is that the k-core size $|k_{S}|$ can undergo a discontinuous, or first order, phase transition under certain circumstances. For example, in Fig. 2(b), the left inset illustrates the 3-core of the network. Upon the removal of the red node, the 3-core is completely destroyed, with only the 2-core left, as shown in the right inset. In this example, the 3-core disappears abruptly from a non-zero size. Such a discontinuous phase transition stems from the threshold rule of percolation, and lies at the heart of catastrophic cascading failures in many real-world systems [94, 95]. A number of seminal works have explored the phase diagram and mechanism of transition in k-core percolation or threshold models [42, 44, 45, 91, 96, 97]. In particular, Watts modeled the global cascade on random networks using a linear threshold model and derived the critical condition for the discontinuous transition [45]. Goltsev et al. found the hybrid phase transition in k-core percolation, with a discontinuous emergence of the k-core as well as a continuous emergence of the GC [44]. In Ref. [44], the authors demonstrated the crucial role of the “corona”, a subset of nodes in the k-core that have exactly $k$ neighbors: a random removal of even one node from the corona will trigger the collapse of a vast region of the k-core around the removed node. Baxter et al. further analytically derived the condition for the discontinuous transition of the k-core in networks with arbitrary degree distributions [42]. The abrupt jump from a k-core with non-zero size to its collapse can be mathematically explained by a bifurcation of the dynamical system describing the k-core percolation. In such a bifurcation, a small change of parameters (e.g., the fraction of removed nodes) leads to a discontinuous shift of the stable point from a non-zero solution (k-core with non-zero size) to a zero solution (no k-core). Such a bifurcation-induced transition is also responsible for the global cascade and vulnerability in interdependent networks and networks of networks [94, 98, 99].

Similar to optimal percolation, the configuration of node removal $\bm{n}$ can be optimized to induce an early transition. For random k-core percolation, at the critical point $q_{rand}$, we have $\lim_{q\to q_{rand}^{-}}|k_{S}|>0$, $\lim_{q\to q_{rand}^{+}}|k_{S}|=0$ and $|k_{S}|(q)=0$ for $q>q_{rand}$. The optimal k-core percolation problem is to find the unique configuration $\bm{n}^{*}$ for which $q_{c}=\min\{q\in[0,1]|\lim_{q^{\prime}\to q^{+}}|k_{S}|(q^{\prime})=0\}$ (Fig. 2(b)). In some literature, this problem is also known as the minimal contagious set problem [100, 101, 102, 103, 104, 105, 106]. For threshold models, the influencer identification problem is to search for a given number of seeds that can lead to the maximal number of activated nodes.
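For very small networks, the seed-selection problem for threshold models can be solved by exhaustive search, which makes the combinatorial nature of the task explicit. The sketch below is only feasible for toy sizes, since the number of candidate seed sets grows combinatorially; it assumes a uniform threshold with the "at least `threshold` active neighbors" activation convention, and all names are hypothetical.

```python
from itertools import combinations


def final_active(adj, threshold, seeds):
    """Final active set when a node activates with >= threshold active neighbors."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for u in adj:
            if u not in active and sum(v in active for v in adj[u]) >= threshold:
                active.add(u)
                changed = True
    return active


def best_seed_set(adj, threshold, n_seeds):
    """Try every seed set of the given size and keep the first one that
    activates the largest number of nodes."""
    best, best_size = None, -1
    for seeds in combinations(sorted(adj), n_seeds):
        size = len(final_active(adj, threshold, seeds))
        if size > best_size:
            best, best_size = set(seeds), size
    return best, best_size


# Ring of four nodes with threshold 2.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
seeds, activated = best_seed_set(adj, threshold=2, n_seeds=2)
```

On the ring, adjacent seeds activate nothing beyond themselves, while a pair of opposite seeds activates all four nodes.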

3 Optimal percolation

Influence maximization is closely related to the optimal percolation problem. In addition, optimal percolation also provides a solution to the optimal immunization problem by dismantling the underlying network on which propagation occurs. Recently, within the message passing framework, Morone and Makse developed an efficient algorithm, the collective influence (CI), that gives a good approximation of optimal percolation [20]. Later, better algorithms based on optimal decycling [21, 48] and explosive percolation [49] were proposed. In this section, we discuss these structural approaches to the influence maximization problem.

3.1 Collective influence

Consider a network $G$ with $N$ nodes and $M$ edges. The vector $\mathbf{n}=(n_{1},\cdots,n_{N})$ encodes whether node $i$ is removed ( $n_{i}=0$ ) or not ( $n_{i}=1$ ). Denoting the fraction of removed nodes by $q=1-\sum_{i=1}^{N}n_{i}/N$ , optimal percolation aims to find the minimal fraction $q_{c}$ of nodes such that the giant component $G_{\infty}$ is fully dismantled. Within the message passing framework, define the message $\nu_{i\to j}$ as the probability that node $i$ belongs to $G_{\infty}$ without being connected to it through node $j$ . Therefore, $\nu_{i\to j}=1$ if and only if $n_{i}=1$ and at least one of $i$ ’s neighbors other than $j$ is connected to $G_{\infty}$ . For a locally tree-like structure, the messages evolve according to the following equations:

$$\nu_{i\to j}=n_{i}\left[1-\prod_{k\in\partial i\setminus j}\left(1-\nu_{k\to i}\right)\right],\tag{1}$$

where $\partial i\setminus j$ denotes the immediate neighbors of $i$ excluding $j$ . Taking node $j$ back into consideration, the probability that $i$ is connected to the giant component is then calculated as

$$\nu_{i}=n_{i}\left[1-\prod_{k\in\partial i}\left(1-\nu_{k\to i}\right)\right].\tag{2}$$
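As an illustration of how such messages can be iterated in practice, the following minimal sketch assumes the standard update $\nu_{i\to j}=n_{i}[1-\prod_{k\in\partial i\setminus j}(1-\nu_{k\to i})]$ and starts from $\nu_{i\to j}=n_{i}$ on every directed edge. The function name `message_passing` and the toy graphs are hypothetical. On small finite graphs the notion of a giant component is only loosely meaningful: any surviving loop can sustain a non-zero fixed point.

```python
from collections import defaultdict

def message_passing(edges, n, iterations=100):
    """Iterate nu_{i->j} = n_i * [1 - prod_{k in di\\j} (1 - nu_{k->i})]."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Initial guess: every present node is assumed connected.
    nu = {(i, j): float(n[i]) for i in adj for j in adj[i]}
    for _ in range(iterations):
        new = {}
        for (i, j) in nu:
            prod = 1.0
            for k in adj[i]:
                if k != j:
                    prod *= 1.0 - nu[(k, i)]
            new[(i, j)] = n[i] * (1.0 - prod)
        nu = new
    # Node marginals: take neighbor j back into the product.
    marginals = {}
    for i in adj:
        prod = 1.0
        for k in adj[i]:
            prod *= 1.0 - nu[(k, i)]
        marginals[i] = n[i] * (1.0 - prod)
    return marginals

# Hypothetical toy graphs: a path (a tree) and K4 with node 0 removed.
path = [(0, 1), (1, 2), (2, 3)]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
tree_marginals = message_passing(path, {i: 1 for i in range(4)})
clique_marginals = message_passing(k4, {0: 0, 1: 1, 2: 1, 3: 1})
```

On the tree all messages decay to the zero solution, whereas the triangle left in K4 keeps the non-zero solution alive.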

By linearizing Eq. (1) around the fixed point $\{\nu_{i\to j}=0\}$ , the stability of this solution is determined by the largest eigenvalue $\lambda(\mathbf{n};q)$ of a linear operator $\mathcal{M}$ . Specifically, $\mathcal{M}$ is the Jacobian of the system defined on $2M\times 2M$ directed edges as $\mathcal{M}_{m\to n,i\to j}\equiv\frac{\partial\nu_{i\to j}}{\partial\nu_{m\to n}}|_{\{\nu_{i\to j}=0\}}$ . A few calculations show that the matrix $\mathcal{M}$ can be represented in terms of the non-backtracking (NB) matrix $\mathcal{B}$ [107] via

$$\mathcal{M}_{m\to n,i\to j}=n_{i}\,\mathcal{B}_{m\to n,i\to j},\tag{3}$$

where the NB matrix is

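$$\mathcal{B}_{m\to n,i\to j}=\begin{cases}1 & \text{if } n=i \text{ and } j\neq m,\\ 0 & \text{otherwise}.\end{cases}\tag{4}$$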

The matrix entry $\mathcal{B}_{m\to n,i\to j}$ is non-zero only when ( $m\to n$ , $i\to j$ ) form a pair of consecutive non-backtracking directed edges, i.e., ( $m\to n$ , $n\to j$ ) with $m\neq j$ . For non-backtracking edges, $\mathcal{B}_{m\to n,n\to j}=1$ .
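The action of the operator $n_{i}\mathcal{B}$ on a vector indexed by directed edges can be applied without building the $2M\times 2M$ matrix explicitly. The sketch below estimates its leading eigenvalue by plain power iteration; the function name `nb_leading_eigenvalue` and the test graphs are assumptions for illustration. The norm ratio used here can oscillate when several eigenvalues share the largest modulus (for instance on bipartite graphs), so it should be read as a rough estimate only.

```python
import math
from collections import defaultdict

def nb_leading_eigenvalue(edges, n=None, iterations=200):
    """Power-method estimate of the leading eigenvalue of M = n_i * B."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    if n is None:
        n = {u: 1 for u in adj}
    # One vector entry per directed edge i -> j (2M entries in total).
    w = {(i, j): 1.0 for i in adj for j in adj[i]}
    lam = 0.0
    for _ in range(iterations):
        old_norm = math.sqrt(sum(x * x for x in w.values()))
        # Entry (i -> j) collects weight from edges (m -> i) with m != j.
        new = {(i, j): n[i] * sum(w[(m, i)] for m in adj[i] if m != j)
               for (i, j) in w}
        norm = math.sqrt(sum(x * x for x in new.values()))
        if norm == 0.0:
            return 0.0  # nilpotent case, e.g. a tree
        lam = norm / old_norm
        w = {e: x / norm for e, x in new.items()}
    return lam

k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
lam_full = nb_leading_eigenvalue(k4)
lam_removed = nb_leading_eigenvalue(k4, n={0: 0, 1: 1, 2: 1, 3: 1})
lam_tree = nb_leading_eigenvalue([(0, 1), (1, 2)])
```

For the complete graph K4 the estimate is $k-1=2$; removing one node leaves a single triangle and lowers it to 1, and on a tree it vanishes.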

Following the Frobenius theorem, the largest eigenvalue $\lambda(\mathbf{n};q)$ is real and positive. The solution $\{\nu_{i\to j}=0\}$ is stable if $\lambda(\mathbf{n};q)\leq 1$ . In this way, the optimal percolation problem can be solved by finding the optimal configuration $\mathbf{n}^{*}$ such that $\lambda(\mathbf{n}^{*};q_{c})=1$ . For $q<q_{c}$ , all configurations lead to $\lambda(\mathbf{n};q)>1$ . On the contrary, for $q>q_{c}$ , there exist two different circumstances. For some non-optimal configurations, the macroscopic component still exists. On the other hand, there are also configurations such that $\lambda(\mathbf{n};q)\leq 1$ , which correspond to a fully fragmented network. As $q\to q_{c}^{+}$ , the number of configurations satisfying $\lambda(\mathbf{n};q)\leq 1$ gradually decreases and eventually vanishes at $q_{c}$ , where the optimal configuration $\mathbf{n}^{*}$ is obtained. To develop a scalable algorithm, the eigenvalue can be approximated using the Power Method [108]. For a given number of iterations $\ell$ , the collective influence (CI) of node $i$ can be defined as:

$$\text{CI}_{\ell}(i)=(k_{i}-1)\sum_{j\in\partial\text{Ball}(i,\ell)}(k_{j}-1),\tag{5}$$

where $\partial\text{Ball}(i,\ell)$ is the frontier of the ball of radius $\ell$ in terms of shortest path centered around node $i$ . By iteratively removing the node with largest CI, the largest eigenvalue of $\mathcal{M}$ can be minimized with high efficiency. After each removal, the CI score of every remaining node in the network is recalculated. This process continues until the network is fully fragmented, i.e. $G_{\infty}\ll 1$ . The optimal configuration $\mathbf{n^{*}}$ and $q_{c}$ are estimated from this removal process.
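A minimal sketch of this adaptive removal is given below, assuming the score $\text{CI}_{\ell}(i)=(k_{i}-1)\sum_{j\in\partial\text{Ball}(i,\ell)}(k_{j}-1)$ with degrees counted among the remaining nodes. The helper names, the stopping parameter `target_size`, the tie-breaking rule (smallest node label) and the toy graph are all hypothetical, and the naive recomputation of every score makes this far slower than the heap-based implementation discussed in the text.

```python
from collections import defaultdict, deque

def build_adj(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def frontier(adj, alive, i, ell):
    """Nodes at shortest-path distance exactly ell from i among alive nodes."""
    dist, queue, out = {i: 0}, deque([i]), []
    while queue:
        u = queue.popleft()
        if dist[u] == ell:
            out.append(u)
            continue
        for v in adj[u]:
            if v in alive and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return out

def ci_score(adj, alive, i, ell):
    def deg(u):
        return len(adj[u] & alive)
    return (deg(i) - 1) * sum(deg(j) - 1 for j in frontier(adj, alive, i, ell))

def largest_component(adj, alive):
    seen, best = set(), 0
    for s in alive:
        if s in seen:
            continue
        seen.add(s)
        queue, size = deque([s]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def ci_dismantle(edges, ell=1, target_size=1):
    """Remove the highest-CI node until the largest component is small enough."""
    adj = build_adj(edges)
    alive, removed = set(adj), []
    while alive and largest_component(adj, alive) > target_size:
        best = max(sorted(alive), key=lambda u: ci_score(adj, alive, u, ell))
        alive.discard(best)
        removed.append(best)
    return removed

# Hypothetical toy graph: two 4-cliques joined through node 4.
barbell = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5),
           (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]
score_of_3 = ci_score(build_adj(barbell), set(range(9)), 3, 1)
removed_nodes = ci_dismantle(barbell, ell=1, target_size=3)
```

In this example the two clique nodes attached to the bridge carry the largest score and are removed first, after which no component exceeds three nodes.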

For $q<q_{c}$ , the network cannot be fully dismantled. In order to obtain the smallest giant component, a greedy reinsertion procedure is performed starting from the optimal configuration $\mathbf{n}^{*}$ . In the reinsertion procedure, an index $c(i)$ is defined for each removed node. Specifically, $c(i)$ is the number of clusters that would be joined together if node $i$ were put back in the network. Nodes with the smallest $c(i)$ score are iteratively reinserted until the fraction of removed nodes decreases to $q$ .
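The reinsertion step can be sketched as follows, taking $c(i)$ to be the number of distinct clusters among the surviving neighbors of a removed node $i$. The function names, the argument `n_final` (the number of nodes that should stay removed) and the toy graph are assumptions for illustration; component labels are recomputed from scratch at each step for clarity rather than speed.

```python
from collections import defaultdict

def component_labels(adj, alive):
    """Label every alive node with a representative of its cluster."""
    label = {}
    for s in alive:
        if s in label:
            continue
        label[s] = s
        stack = [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in alive and v not in label:
                    label[v] = s
                    stack.append(v)
    return label

def greedy_reinsertion(edges, removed, n_final):
    """Reinsert removed nodes with the smallest c(i) until n_final remain removed."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    removed = set(removed)
    alive = set(adj) - removed
    order = []
    while len(removed) > n_final:
        label = component_labels(adj, alive)
        # c(i): number of clusters that node i would join together.
        c = {i: len({label[v] for v in adj[i] if v in alive}) for i in removed}
        i = min(sorted(removed), key=c.get)
        removed.discard(i)
        alive.add(i)
        order.append(i)
    return order

# Hypothetical toy graph: hub 0 with leaves 1, 2, 3 and node 4 attached to 3.
toy = [(0, 1), (0, 2), (0, 3), (3, 4)]
first_back = greedy_reinsertion(toy, removed={0, 4}, n_final=1)
all_back = greedy_reinsertion(toy, removed={0, 4}, n_final=0)
```

Node 4 touches a single cluster and is reinserted first, whereas the hub, which would merge three clusters, is put back last.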

The computational complexity of the CI algorithm is $O(N^{2})$ . In practice, it can be accelerated by limiting the calculation and update of CI inside the $(\ell+1)$ -ball around the removed node. In addition, the complexity can be further reduced to $O(N\log N)$ by sorting the CI scores in a heap structure [109], which makes it scalable to large networks. Simulation results on both synthetic and real-world social networks show that the CI algorithm outperforms the equal graph partitioning (EGP) immunization strategy [26] and frequently used heuristic metrics such as degree centrality, PageRank and k-core index. For a Twitter network with $469,013$ users and a Mexican mobile communication network of $1.4\times 10^{7}$ users, the CI algorithm achieves full fragmentation with a smaller set of influencers [20] (see Fig. 4). For such massive networks, a variant of the CI algorithm can be applied without losing performance by removing a finite fraction of nodes instead of one node at each step.

Given a finite radius $\ell$ , the CI algorithm is local in nature. To incorporate the influence of a node at the global level, Morone et al. improved the CI algorithm using a message-passing approach and proposed the CI propagation algorithm ( $\text{CI}_{\text{P}}$ ) [109]. As a variant of the CI algorithm in the limit of $\ell\to\infty$ , the $\text{CI}_{\text{P}}$ algorithm is able to reach the analytical optimal percolation threshold of random cubic graphs [110]. Another belief-propagation variant of the CI algorithm, $\text{CI}_{\text{BP}}$ , was also proposed. Combining the dynamics of the SIR model with message-passing updating rules, $\text{CI}_{\text{BP}}$ achieves performance similar to that of $\text{CI}_{\text{P}}$ . However, the improvements of $\text{CI}_{\text{P}}$ and $\text{CI}_{\text{BP}}$ over CI come at the expense of increasing the computational complexity from $O(N\log N)$ to $O(N^{2}\log N)$ . Kobayashi and Masuda recently developed an immunization algorithm for networks with community structure that combines the CI algorithm with a coarse-graining procedure in which communities are regarded as supernodes [111]. At this mesoscopic scale, nodes connecting different communities can be identified at a cost of $O((N^{2}/N_{C})\log N)$ ( $N_{C}$ is the number of communities). The optimal percolation problem was also studied on multiplex networks. Osat et al. showed that characteristics in multiplex networks such as edge overlap and interlayer degree-degree correlation could profoundly change the properties of influencers [112]. Neglecting the multiplex structure of a network would lead to significant inaccuracies about its robustness.
In applications, the collective influence theory has been used to locate superspreaders of information in real-world social media [113], find sources of fake news in Twitter during the 2016 US presidential election [114, 115], single out critical regions in brain networks [10, 116], infer personal economic status [117], improve cooperation in evolutionary games [118] and control biological networks [119, 120, 121, 122].

As demonstrated in Ref. [20], the optimal percolation problem can be mapped exactly onto the influence maximization problem for the linear threshold model with threshold $k_{i}-1$ ( $k_{i}$ is the degree of node $i$ ). As a result, the CI algorithm, designed for optimal percolation, also provides a solution to the influence maximization problem for this specific transmission model. For linear threshold models with other threshold values, the CI algorithm was generalized to solve the influence maximization problem with first-order transitions, which will be addressed later. In addition, a detailed discussion on the relation of the CI algorithm with the SIR model can be found in Ref. [20].

3.2 Optimal decycling-based algorithms

Recent works have shown that the optimal percolation problem is closely related to the optimal decycling problem, or minimum feedback vertex set (FVS) problem [21, 48]. A feedback vertex set is a set of nodes whose removal breaks all the loops in the network [123]. The optimal decycling problem is, in fact, equivalent to finding the FVS with the smallest number of nodes. The rationale behind the connection between optimal percolation and optimal decycling is that, for sparse random networks, short loops rarely exist in small connected components [124, 125, 126]. If the long loops in the giant component are cut, the network will break into small tree fragments. As indicated by Braunstein et al. [21], the optimal decycling threshold $q_{c}^{dec}$ acts as an upper bound of the optimal percolation threshold $q_{c}$ . For random networks with light-tailed degree distribution (finite second moment), the minimal size of the decycling set is equal to the minimal size of the dismantling set in the limit $N\to\infty$ . The optimal decycling problem is itself NP-hard, but can be solved approximately via belief propagation algorithms. Two approaches based on decycling algorithms were developed recently [21, 48]. Both of them apply a three-stage procedure: first decycle the network with a minimal number of nodes, then break the trees into small components, and finally reinsert some nodes into the network without increasing the size of the largest component. Compared with the CI algorithm, these two algorithms take into account the global topology of the network and achieve better performance.

The belief-propagation-guided decimation (BPD) algorithm proposed by Mugisha and Zhou is based on the spin glass model of the FVS problem [127]. In order to transform the global acyclic constraint into local ones, a variable $A_{i}$ , which takes the value $0$ , $i$ or $j\in\partial i$ , is assigned to each node [127]. If node $i$ is removed from $G$ , $A_{i}=0$ . Otherwise, $A_{i}=i$ if it is the root of a tree or $A_{i}=j$ if node $i$ has a parent node $j$ . Given a microscopic configuration $\mathbf{A}=\{A_{1},A_{2},\cdots,A_{N}\}$ , the fraction of removed nodes is represented by:

$$q=\frac{1}{N}\sum_{i=1}^{N}\delta_{A_{i}}^{0},$$

where $\delta_{n}^{l}$ is the Kronecker delta function ( $\delta_{n}^{l}=1$ if $n=l$ and 0 otherwise). For each edge $(i,j)$ in the network, an edge factor $C_{ij}(A_{i},A_{j})$ is defined as [127]:

[TABLE]

The edge factor $C_{ij}(A_{i},A_{j})$ is either 1 or 0. The edge $(i,j)$ is regarded as satisfied if $C_{ij}(A_{i},A_{j})=1$ , and unsatisfied otherwise. For a configuration $\mathbf{A}$ , if all edges in a network $G$ are satisfied, we define $\mathbf{A}$ as a solution of $G$ . The definition of satisfied edges relaxes the original problem of acyclic components to allow at most one cycle in each remaining component. Indeed, it has been proven that, for a solution $\mathbf{A}$ , the remaining nodes form a subgraph consisting of several components, each containing at most one cycle. Considering all solutions of the network, a partition function of the system is defined as:

[TABLE]

where $\mu$ is the inverse of temperature. At the limit of zero temperature, the partition function is contributed exclusively by the optimal configuration $\mathbf{A}^{*}$ with the minimal fraction $q_{c}^{dec}$ .

Under the locally tree-like assumption, the marginal probability $q_{i}^{0}(t)$ for node $i$ to be removed from the remaining network $G(t)$ can be calculated through iterations of a set of belief propagation (BP) equations [48]. At each time step $t$ , the BP equations are iterated for a given number of rounds and the removal probability $q_{i}^{0}$ is calculated for each node. The node with the highest probability $q_{i}^{0}$ is removed from the network even if the BP equations do not converge to a fixed point. The process stops when the network becomes acyclic. If the largest component $G_{\infty}$ remains extensive, it can be further fragmented by iteratively deleting nodes that lead to the smallest giant component. The BPD algorithm works well on networks in which short loops are rare. However, for the many networks with abundant communities, the FVS usually contains more nodes than necessary to dismantle the network structure. Therefore, a reinsertion process can be carried out without significantly increasing the size of $G_{\infty}$ . This process can be done through a greedy algorithm, in which the nodes that cause the least increase in $G_{\infty}$ are reinserted one after another until the size of $G_{\infty}$ reaches a predefined threshold.

The BPD algorithm is scalable to large networks with a computational complexity of $O(N\log N)$ . Simulations on random network ensembles and real-world networks indicate that the BPD algorithm is superior to the CI algorithm in optimal percolation problem (see Fig. 5). However, as shown in Ref. [109], the BPD algorithm is relatively slower than the CI algorithm. For large random Erdős-Rényi (ER) networks and scale-free networks, the BPD algorithm manages to fragment the network by removing a smaller set of nodes compared with CI algorithm. In particular, the percolation threshold is close to the minimal value predicted by the replica-symmetric (RS) mean field theory [110, 127]. In the CI algorithm, the size of $G_{\infty}$ decreases almost linearly with the increase of $q$ . In contrast, $G_{\infty}$ features an abrupt collapse under the BPD algorithm. This results from the intrinsic nature of the FVS problem and the efficiency of tree dismantling. With the existence of such collapse, the BPD process can work as an efficient attack strategy, leaving no warning to the system before its total failure. In a recent work on dismantling efficiency and network fractality [128], it was found that the BPD algorithm outperforms the CI algorithm no matter whether the network is fractal or not, while the CI algorithm works better on non-fractal networks, which have high ratios of long-range shortcuts to short-range connections.

Braunstein et al. considered the optimal decycling problem from a different point of view [21]. In this work, the optimal percolation problem was referred to as network dismantle. Noticing that a network is acyclic if and only if its 2-core is empty, the authors mapped the decycling process to a 2-core percolation. Assume a set of nodes $S\subset V$ is initially removed from the network. The 2-core percolation can be described by the evolution of time-dependent binary variables $x_{i}^{t}(S)$ for $1\leq i\leq N$ . Starting from the initial setting $x_{i}^{0}(S)=1$ for removed nodes $i\in S$ and $x_{i}^{0}(S)=0$ for $i\notin S$ at $t=0$ , the evolution follows [21]

[TABLE]

where the indicator function $\mathbb{I}$ is 1 if the argument is true and 0 otherwise. As $x_{i}^{t}$ can only change from 0 to 1, the equations admit a fixed solution $x_{i}^{*}(S)$ as $t\rightarrow\infty$ . In particular, $x_{i}^{*}(S)=0$ iff $i$ belongs to the 2-core of $G\setminus S$ . If $x_{i}^{*}(S)=1$ for all nodes, $G\setminus S$ contains no loops and $S$ is called a decycling set. To find the minimal decycling set, it is convenient to introduce the probability distribution over decycling sets $S$ using the Boltzmann distribution in statistical physics

[TABLE]

where $|S|$ is the number of nodes in $S$ , $\mu$ is the inverse temperature, and $Z(\mu)$ is the partition function that normalizes the distribution. The minimal size of decycling sets can be calculated in the zero-temperature limit: $q_{c}^{dec}=\lim_{\mu\rightarrow-\infty}\frac{1}{N\mu}\ln Z(\mu)$ .

Since $x_{i}^{*}$ depends on $S$ in a global way, it is difficult to compute $Z(\mu)$ directly. To solve this problem, the authors transformed the global constraint $\prod_{i\in V}\mathbb{I}\left[x_{i}^{*}(S)=1\right]$ to its local equivalent. The node removal process in 2-core percolation can be described by an integer $t_{i}(S)=\min\{t:x_{i}^{t}(S)=1\}$ defined for each node $i$ , which encodes the time when node $i$ is removed from the network. For $i\in S$ , it is straightforward that $t_{i}(S)=0$ . For $i\notin S$ , $t_{i}(S)$ depends locally on its neighbors according to

$$t_{i}(S)=1+\max\nolimits_{2}\left(\{t_{j}(S)\}_{j\in\partial i}\right),$$

where the function $\max_{2}$ returns the second largest value in its argument. Under this parameterization, the partition function can be rewritten as

[TABLE]

where $\psi_{i}(t_{i})=\mathbb{I}[t_{i}=0]$ .
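The local relation between removal times can be checked numerically on a small example. The sketch below assumes a parallel pruning rule in which a node outside $S$ is removed one step after all but at most one of its neighbors have been removed, and it pads $\max_{2}$ with zeros for nodes that have fewer than two neighbors. The function names and the toy graph (a 4-cycle with a two-node tail) are hypothetical.

```python
import math
from collections import defaultdict

def removal_times(edges, S):
    """Parallel 2-core pruning; returns t_i (math.inf inside the 2-core)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    t = {i: (0 if i in S else math.inf) for i in adj}
    removed = set(S)
    step = 0
    while True:
        step += 1
        # Nodes with at most one surviving neighbor are pruned at this step.
        newly = [i for i in adj if i not in removed
                 and sum(1 for j in adj[i] if j not in removed) <= 1]
        if not newly:
            return t, adj
        for i in newly:
            t[i] = step
        removed.update(newly)

def max2(values):
    """Second-largest value, padding with zeros when fewer than two exist."""
    padded = sorted(values, reverse=True) + [0, 0]
    return padded[1]

# Hypothetical toy graph: cycle 0-1-2-3-0 with tail 3-4-5.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4), (4, 5)]
times, adj = removal_times(edges, S={0})
consistent = all(times[i] == 1 + max2(times[j] for j in adj[i])
                 for i in adj if i != 0)
times_empty, _ = removal_times(edges, S=set())
```

With $S=\{0\}$ every removal time is finite, so $S$ is a decycling set; with $S=\emptyset$ the four cycle nodes keep an infinite removal time because they form the 2-core.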

The exact computation of Eq. (12) is NP-hard. In calculation, a simplification of the partition function in Eq. (12) can be performed by restricting $t_{i}$ to be no larger than $T$ . All values $t_{i}$ larger than $T$ are regarded as infinity. Under this simplification, trees with diameters larger than $T+1$ are considered to be part of a long cycle. Given a large enough $T$ , the effect of this simplification is negligible. For locally tree-like graphs, the partition function can be computed by the cavity method [129, 130], in which “messages” are exchanged between neighboring nodes. For each link $i\to j$ , a message $\eta_{ij}(t_{i},t_{j})$ as a function of activation times $t_{i}$ and $t_{j}$ is introduced. The messages satisfy the self-consistent BP equations [21]:

[TABLE]

As the temperature approaches zero ( $\mu\to-\infty$ ), the probabilities of the messages $\eta_{ij}(t_{i},t_{j})$ in the BP equations concentrate on the solution to Eq. (9) that minimizes the cost function $\sum_{i}\psi_{i}(t_{i})$ . To develop an algorithm that finds the optimal decycling set, a slightly different cost function is used: $\psi_{i}(t_{i})=\mathbb{I}[t_{i}=0]+\varepsilon_{i}(t_{i})$ , where $\varepsilon_{i}(t_{i})$ is a randomly chosen small cost. Further, the 2-core percolation process is relaxed to allow $t_{i}\geq 1+\max_{2}(\{t_{j}\}_{j\in\partial i})$ in Eq. (9). Define $h_{i}(t_{i})$ as the minimal cost to dismantle the 2-core under the condition that node $i$ is removed at $t_{i}$ . The optimal decycling set is determined by $S^{*}=\{i\in V|t_{i}^{*}=0\}$ , where $t_{i}^{*}=\arg\min h_{i}(t_{i})$ . In practice, $h_{i}(t_{i})$ can be computed using the Min-Sum algorithm, which is derived as the zero-temperature limit of the BP equations. Concretely, the messages $h_{i}(t_{i})$ are obtained by iterating a set of equations [21]. In most cases, convergence can be reached within a small number of iterations, with a computational complexity $O(MT)$ in each iteration. In case the Min-Sum equations do not converge, a reinforcement procedure is applied to damp the system [131].

In the acyclic network $G\backslash S^{*}$ , there may still exist some extensive tree components. These large trees can be fragmented efficiently via a greedy tree breaking procedure with computational complexity of $O(N(\log N+T))$ . In addition, for networks containing many short loops, a reverse greedy (RG) reinsertion procedure is applied to recover the nodes that do not increase the size of the giant component, as performed in the CI and BPD algorithms. The computational cost of this RG procedure is $k_{max}C^{\prime}\log(k_{max}C^{\prime})$ , where $k_{max}$ is the maximal degree and $C^{\prime}$ is the upper bound of $G_{\infty}$ size.

Simulations on both synthetic and real-world social networks demonstrate the effectiveness of this decycling-based algorithm. For an ER random graph of size $N=78,125$ and average degree $d=3.5$ , the $G_{\infty}$ size decreases to $0.032$ when $17.81\%$ of nodes are removed. Compared with degree centrality, eigenvector centrality and the CI algorithm with $\ell=5$ , the three-stage algorithm is superior in dismantling the giant component (see Fig. 6). The Monte Carlo-based simulated annealing (SA) algorithm gives a competitive result. However, its computational complexity is much higher. For the same Twitter network analyzed in Ref. [20], the Min-Sum algorithm with RG performs as well as SA, removing only $3.4\%$ of nodes to break the giant component (leaving components smaller than 1,000 nodes). In comparison, CI needs to remove $4.6\%$ of nodes to achieve the same fragmentation performance.

Inspired by the decycling-based algorithm, a simpler and faster heuristic algorithm with complexity $O(N)$ , CoreHD, was developed [132]. Starting from the 2-core of a network $G$ , CoreHD recursively removes nodes with the highest degree in the 2-core until $G$ is fully dismantled. Despite its simplicity, CoreHD is reported to perform better than the CI algorithm. Specifically, for large random networks, the performance of CoreHD is close to the theoretical solution predicted by replica-symmetry and 1RSB approximation [100]. In addition, this simple algorithm is amenable to rigorous analysis, performing well even on loopy networks which are not accessible for typical message-passing algorithms. In a recent work by Schmidt et al. [133], the CoreHD algorithm was analyzed rigorously by translating the node removal in the CoreHD algorithm to a random process on the degree distribution of the network. The mapped dynamics, described by a set of coupled nonlinear ordinary differential equations, characterize the behavior of the CoreHD algorithm on random graphs. In the analysis, new upper bounds on the size of the minimal contagious sets in random graphs were proposed, which improve the best known results [100, 110]. The CoreHD analysis also inspired an improved heuristic algorithm, WEAK-NEIGHBOR, that works for both optimal percolation and k-core percolation [133]. Details of this algorithm will be introduced in the next section.
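The decycling stage of a CoreHD-style heuristic can be sketched in a few lines. This is an illustrative version only: it recomputes the 2-core from scratch after every removal (so it does not reach the $O(N)$ cost quoted above), it breaks ties by the smallest node label, and it omits the subsequent tree-breaking and reinsertion stages. The function names and toy graphs are assumptions.

```python
from collections import defaultdict

def two_core(adj, alive):
    """Prune nodes of degree < 2 until none are left; return the 2-core."""
    core = set(alive)
    degree = {u: len(adj[u] & core) for u in core}
    stack = [u for u in core if degree[u] < 2]
    while stack:
        u = stack.pop()
        if u not in core:
            continue
        core.discard(u)
        for v in adj[u]:
            if v in core:
                degree[v] -= 1
                if degree[v] < 2:
                    stack.append(v)
    return core

def corehd_decycle(edges):
    """Remove the highest-degree node of the 2-core until the 2-core is empty."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive, removed = set(adj), []
    core = two_core(adj, alive)
    while core:
        hub = max(sorted(core), key=lambda u: len(adj[u] & core))
        removed.append(hub)
        alive.discard(hub)
        core = two_core(adj, alive)
    return removed

# Hypothetical toy graphs: two triangles sharing node 0, and a path.
figure_eight = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]
decycling_set = corehd_decycle(figure_eight)
tree_set = corehd_decycle([(0, 1), (1, 2), (2, 3)])
```

On the figure-eight graph the shared node has the highest degree in the 2-core, and removing it alone leaves a forest; a tree has an empty 2-core, so nothing is removed.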

3.3 Explosive percolation-based immunization

Another approach to optimal percolation was developed by Clusella et al. [49] based on explosive percolation (EP). In contrast with ordinary bond percolation, which usually exhibits second or higher order phase transitions, EP features an unusual threshold behavior: an explosive emergence of the giant component at the critical point [134, 135, 136, 137, 138]. To obtain an explosive transition, Achlioptas et al. proposed a modified edge addition procedure, wherein, at each step, two candidate edges are chosen randomly, but only one of them is actually occupied [134]. Given the weight of a node measured by the size of the connected component it belongs to, the edge with the minimal sum or product of nodes’ weights is selected. These two procedures are referred to as the min-cluster and min-product rules. Compared with the random occupation of edges in ordinary percolation, the min-cluster or min-product rule favors the connection between small components, thereby suppressing the generation of an extensive component.
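The min-product rule can be sketched with a union-find structure that tracks component sizes. The class and function names, the seed, and the choice of drawing both endpoints uniformly at random are assumptions for illustration; candidate edges inside a single component are not excluded in this simplified version.

```python
import random

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the components of a and b; return the merged size."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return self.size[ra]
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return self.size[ra]

def product_rule_process(n, steps, seed=0):
    """Add `steps` edges, each chosen from two candidates by the min-product rule."""
    rng = random.Random(seed)
    ds = DisjointSet(n)
    largest, history = 1, []
    for _ in range(steps):
        candidates = []
        for _ in range(2):
            u, v = rng.sample(range(n), 2)
            product = ds.size[ds.find(u)] * ds.size[ds.find(v)]
            candidates.append((product, u, v))
        _, u, v = min(candidates)  # keep the edge with the smaller product
        largest = max(largest, ds.union(u, v))
        history.append(largest)
    return history

history = product_rule_process(n=200, steps=100, seed=1)
```

The returned list records the size of the largest component after each edge addition, which is the quantity whose delayed but abrupt growth characterizes explosive percolation.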

The explosive immunization (EI) algorithm adopts an inverse strategy that starts from a configuration where all nodes are virtually removed ( $q=1$ ). Then less “dangerous” nodes are progressively unvaccinated. The procedure is performed in two schemes for $q>q_{c}$ and $q<q_{c}$ , each of which uses a score to rank nodes in terms of their suitability to be unvaccinated. Similar to the construction of EP, in each time step, $m$ candidates (typically $m\approx 10^{3}$ ) are randomly selected. For $q>q_{c}$ , the node with the lowest blocking ability (the weakest blocker) is put back into the network. The blocking ability is quantified by a score $\sigma_{i}^{(1)}$ , which is a synthesis of the size of clusters it would join and its local effective connectivity. Specifically, the score $\sigma_{i}^{(1)}$ is defined as [49]: $\sigma_{i}^{(1)}=k_{i}^{(\text{eff})}+\sum_{\mathcal{C}\subset\mathcal{N}_{i}}(\sqrt{|\mathcal{C}|}-1)$ , where $\mathcal{N}_{i}$ is the set of all components connected to node $i$ and $|\mathcal{C}|$ is the size of a component $\mathcal{C}$ . $k_{i}^{(\text{eff})}$ measures the “effective” connectivity of node $i$ based on the local structure of its neighborhood and can be determined by a set of closed equations [49]: $k_{i}^{(\text{eff})}=k_{i}-L_{i}-M_{i}(\{k_{j}^{(\text{eff})}\}_{j\in\partial i})$ , where $k_{i}$ is the degree of node $i$ , $L_{i}$ is the number of leaves in the neighborhood of node $i$ and $M_{i}$ returns the number of strong hubs. The strong hubs are defined recursively as nodes with $k_{i}^{(\text{eff})}$ larger than a threshold value (set as 6 in applications). The terms $L_{i}$ and $M_{i}$ are subtracted from $k_{i}$ since leaves have no contribution to connectivity and hubs are more likely to be removed in explosive immunization.

In the first part of the EI algorithm, the node with the lowest $\sigma_{i}^{(1)}$ score among $m$ candidates is unvaccinated in each iteration. This process eventually reaches a critical fraction of immunized nodes $q_{c}$ where the $G_{\infty}$ size exceeds a small threshold value. In the region of $q<q_{c}$ , however, the same procedure will lead to an abrupt jump of the $G_{\infty}$ size when two large components are joined together. As a consequence, in the second part at $q<q_{c}$ , another score $\sigma_{i}^{(2)}$ is used to suppress such explosive growth of the giant component. The definition of $\sigma_{i}^{(2)}$ reads [49]

[TABLE]

where $|\mathcal{N}_{i}|$ is the number of components connected to $i$ , $\mathcal{C}_{2}$ is the second largest component in $\mathcal{N}_{i}$ , and $\epsilon$ is a small positive number. According to the score $\sigma_{i}^{(2)}$ , the selection is made only among the neighborhood of $G_{\infty}$ . The candidate with the smallest number of neighboring components is favored; if it is not unique, the one with the smallest $|\mathcal{C}_{2}|$ is selected. This process is repeated until the fraction of vaccinated nodes $q$ reaches the expected value.

Using the Newman-Ziff percolation algorithm to identify susceptible components [139], the explosive immunization algorithm is computationally efficient, scaling as $O(N\log N)$ . In addition, it can be accelerated further by considering a smaller number of candidates. Simulations on both synthetic and real-world networks indicate that the explosive immunization algorithm outperforms the CI algorithm (see Fig. 7). As a matter of fact, it achieves the smallest percolation threshold $q_{c}$ except for the belief propagation algorithms in Refs. [21, 48].

3.4 Graph partition-based algorithm

In an earlier work, the optimal immunization problem was solved by an equal graph partitioning (EGP) immunization strategy based on the heuristic optimal partitioning of graphs [26]. In EGP, the network is fragmented into small connected clusters of approximately equal size. In a targeted attack on high-degree nodes, clusters after fragmentation have a broad distribution of sizes, including many small clusters. The targeted strategy may select high-degree nodes in these small clusters, which are unnecessary in breaking down the network. The EGP method avoids fragmenting small clusters, as the clusters all have similar sizes. In the EGP method, a network is first separated into two components with arbitrary size ratio by a minimal number of separators, solved using the nested dissection (ND) algorithm [140]. Then the network can be partitioned into any desirable number of same size clusters by applying ND algorithm recursively. This greedy graph-partitioning strategy provides $5\%$ to $50\%$ improvement over the targeted strategy on model networks and real-world networks.

The original network dismantle problem was recently extended to a generalized network dismantle problem in which the cost of removing a node is considered [141]. In real-world systems, attacking important nodes typically requires a high cost as they are usually well protected. The generalized network dismantle problem seeks to find a set of nodes whose removal would fragment a network at the minimal cost.

The authors of Ref. [141] solved this problem by recursively applying a node-weighted partition, i.e., partitioning a network into two parts of the same size by removing a minimal number of edges. Specifically, define $v_{i}=+1$ if node $i$ belongs to a subgraph $M$ and $v_{i}=-1$ if node $i$ belongs to its complement $\bar{M}$ . Assuming that the cost of cutting a link $(i,j)$ equals the cost of removing nodes $i$ and $j$ , a node-weighted spectral cut objective function was proposed [141]:

[TABLE]

where $A$ is the adjacency matrix, and $w_{i}$ is the cost of removing node $i$ . The optimization problem was then written in matrix notation as minimizing $v^{T}L_{w}v/4$ subject to $\sum_{i}v_{i}=0$ , where $L_{w}=D_{B}-B$ is the node-weighted Laplacian of the matrix $B=AW+WA-A$ ( $W$ and $D_{B}$ are diagonal matrices with elements $W_{ii}=w_{i}$ and $(D_{B})_{ii}=\sum_{j}B_{ij}$ .)

The problem with the integer constraint $v_{i}\in\{+1,-1\}$ is difficult to solve. As a result, the problem is relaxed to allow real values $v_{i}\in\mathbf{R}$ . For the relaxed problem, the solution $v$ is given analytically by the eigenvector of $L_{w}$ associated with its second-smallest eigenvalue, denoted by $v^{(2)}$ . To approximate this solution, the matrix $L_{w}$ is transformed so that $v^{(2)}$ becomes the eigenvector associated with the second-largest eigenvalue. The eigenvector problem is solved by power iteration, with the initial vector set perpendicular to the leading eigenvector of the transformed matrix. Once $v^{(2)}$ is obtained, the separating edges are those connecting nodes with $v_{i}\geq 0$ to nodes with $v_{i}<0$ . The set of nodes to be removed is then chosen to cover all separating edges at minimal cost, which is a weighted vertex cover problem [142]. Finally, a reinsertion procedure is applied to find the nodes that are not necessary to fragment the network.
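The spectral step can be sketched in pure Python as follows. The sketch assumes a connected graph and node costs with $w_{i}+w_{j}\geq 1$ on every edge, so that $B_{ij}=A_{ij}(w_{i}+w_{j}-1)$ is non-negative and $L_{w}$ has the constant vector as its zero mode. The particular shift $cI-L_{w}$ with $c=2\max_{i}(D_{B})_{ii}$ is one possible choice of transformation, not necessarily the one used in Ref. [141], and the function name, the deterministic initial vector and the toy graph are hypothetical. The subsequent weighted vertex cover and reinsertion steps are omitted.

```python
import math

def weighted_spectral_cut(edges, w, iterations=500):
    """Estimate v^(2) of L_w = D_B - B by power iteration on c*I - L_w."""
    nodes = sorted(w)
    idx = {u: k for k, u in enumerate(nodes)}
    n = len(nodes)
    B = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        b = w[u] + w[v] - 1.0  # entry of B = AW + WA - A on edge (u, v)
        B[idx[u]][idx[v]] = b
        B[idx[v]][idx[u]] = b
    d = [sum(row) for row in B]
    c = 2.0 * max(d)  # upper bound on the spectrum of L_w (Gershgorin)
    # Start orthogonal to the constant vector, the leading eigenvector of c*I - L_w.
    x = [k - (n - 1) / 2.0 for k in range(n)]
    for _ in range(iterations):
        y = [(c - d[i]) * x[i] + sum(B[i][j] * x[j] for j in range(n))
             for i in range(n)]
        mean = sum(y) / n
        y = [yi - mean for yi in y]  # re-project against numerical drift
        norm = math.sqrt(sum(yi * yi for yi in y))
        x = [yi / norm for yi in y]
    v2 = dict(zip(nodes, x))
    cut = sorted((u, v) for u, v in edges if (v2[u] >= 0) != (v2[v] >= 0))
    return v2, cut

# Hypothetical toy graph: two triangles joined by the edge (2, 3), unit costs.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
v2, separating_edges = weighted_spectral_cut(edges, {i: 1.0 for i in range(6)})
```

With unit costs $B$ reduces to the adjacency matrix, and the sign pattern of the estimated vector separates the two triangles, so the bridge is the only separating edge.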

The generalized network dismantle (GND) algorithm has complexity $O(N\log^{2+\epsilon}(N))$ , which can be applied to large-scale networks. For nonunit costs, the GND algorithm outperforms current state-of-the-art; for unit cost, it performs better than or comparable to state-of-the-art [141].

3.5 Large deviations of percolation

The optimal percolation problem can be studied within the framework of large deviations of percolation. Generally, in the BP equations that describe the percolation process, the inverse temperature $\beta$ in the Boltzmann distribution of configurations $\mathbf{n}$ , $e^{-\beta\mathcal{E}(\mathbf{n})}$ ( $\mathcal{E}(\mathbf{n})$ is the energy defined by the size of the giant component for $\mathbf{n}$ ), controls the deviation of dynamics from random percolation. For instance, an infinite temperature ( $\beta=0$ ) corresponds to the random scenario, where each configuration is equally probable. As the temperature decreases, the dynamics start to deviate from the random scenario to more extreme cases: the distribution of configurations will concentrate on rare configurations with lower energy, i.e., smaller giant component. In particular, at zero temperature ( $\beta\to\infty$ ), only the configuration with the smallest giant component exists with non-zero probability. In this way, the optimal percolation problem can be interpreted as an extreme case of the large deviations of percolation.
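The role of $\beta$ can be made concrete by brute-force enumeration on a very small graph. The sketch below assumes that the energy of a configuration is the size of the largest connected component after removing a fixed number `r` of nodes; the function names and the toy graph (a path of five nodes) are hypothetical, and exhaustive enumeration is only feasible for tiny systems.

```python
import math
from itertools import combinations
from collections import defaultdict

def largest_component_size(adj, removed):
    alive = set(adj) - set(removed)
    seen, best = set(), 0
    for s in alive:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def boltzmann_over_removals(edges, r, beta):
    """P(S) proportional to exp(-beta * E(S)) over all removal sets S of size r."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    energy = {S: largest_component_size(adj, S)
              for S in combinations(sorted(adj), r)}
    weight = {S: math.exp(-beta * e) for S, e in energy.items()}
    Z = sum(weight.values())
    return {S: wgt / Z for S, wgt in weight.items()}

# Hypothetical toy graph: the path 0-1-2-3-4 with one node removed.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
uniform = boltzmann_over_removals(path, r=1, beta=0.0)
cold = boltzmann_over_removals(path, r=1, beta=20.0)
```

At $\beta=0$ the five single-node removals are equally likely, while at large $\beta$ almost all of the probability sits on removing the middle node, which leaves the smallest largest component.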

Recently, properties of large deviations of percolation have been analyzed using Monte Carlo Markov Chains [61] and Belief Propagation [62]. In particular, Bianconi [60, 62] developed a large deviation theory of percolation that characterizes the response of a sparse network to rare events. This general theory contains both continuous transitions observed for random initial damage and discontinuous transitions corresponding to rare configurations of the initial damage that suppress the GC size. This large deviation theory of percolation was also generalized to multiplex networks [63], based on which a new metric, safeguard centrality, was developed to single out the nodes that control the response of the entire multiplex network to random damage [64]. It was found that the safeguard centrality correlates well with nodes in the optimal percolation problem.

3.6 Summary

It is interesting that the optimal percolation, or network dismantle problem, can be solved from quite different approaches: the CI algorithm optimizes the stability of the zero solution by minimizing the spectral radius of the NB matrix; the BPD and network dismantle algorithms aim to optimally remove cycles in the network; the EI algorithm attempts to gradually identify less vital nodes so that an explosive collapse of the network would occur if the remaining critical nodes are attacked; the EGP and GND algorithms work by recursively partitioning the network into equal-size components; and large deviations of percolation considers the rare events that deviate from random percolation. In terms of implementation, CI proceeds as a greedy adaptive algorithm, which is straightforward to implement; the BPD, network dismantle algorithm and large deviations of percolation need to iterate BP or Min-Sum equations to find the solution; the EI algorithm iteratively selects unvaccinated nodes from a number of candidates; and the EGP and GND algorithms apply graph partition recursively with different techniques. Most of these algorithms require a reinsertion process that excludes unnecessary nodes from the optimal node set. In essence, to solve an intrinsically global optimization problem, most approaches have to transform it into another problem that can be solved locally. For instance, CI defines a centrality based on local structure; the BP equations in the BPD and network dismantle algorithms incorporate local constraints compatible with the global constraints; the score calculation in the EI algorithm depends on local connectivity; and the EGP and GND algorithms are designed to recursively partition smaller local clusters. More features of these algorithms are summarized in Table 1.

4 Dynamics with continuous transitions

The problem of influencer identification in ICMs originated from the work of Domingos and Richardson [24, 143], who aimed to advertise a product through viral marketing. Instead of viewing a market as a set of independent entities, they treated it as a networked system where the potential profit contributed by a customer is mostly determined by his/her interactions with others. This problem was later formalized by Kempe et al. into a well-defined combinatorial optimization problem [22]: given an independent cascade model in a network $G$ and an integer $k$ , how can one find the optimal set of $k$ seeds that initiates the largest-scale propagation? The intrinsic difficulty of this problem is rooted in the configuration space, which grows exponentially with $k$ . In fact, it was proven to belong to the class of the hardest optimization problems, NP-hard [22], and thus in practice can only be solved approximately via heuristic approaches in polynomial time.

4.1 Greedy algorithms

One of the most intuitive solutions is to use a greedy algorithm that selects the $k$ most influential single spreaders to approximate the optimal set of influencers. In this approach, the influence of single influencers can be estimated by averaging a large number of Monte Carlo simulations of spreading processes initiated by each node. As proposed in Kempe et al. [22], the optimal set of influencers $S$ is obtained by recursively adding the node that leads to the largest marginal increase in the total influence. The influence function $\sigma(S)$ , defined as the expected number of active nodes given the seed set $S$ , can be calculated by Monte Carlo simulations. The marginal contribution of an individual influencer $i$ , $\sigma_{S}(i)$ , can then be computed through $\sigma_{S}(i)=\sigma(S\cup\{i\})-\sigma(S)$ . For a general class of spreading models including ICMs, the influence function $\sigma(S)$ was proven to be submodular [144, 145]: a function $\sigma(\cdot)$ is submodular if the marginal gain from adding an element to a set $S$ is at least as high as the marginal gain from adding the same element to a superset of $S$ . In 1978, Nemhauser et al. mathematically proved that, for problems with the submodular property, a greedy heuristic always finds a solution whose value is at least $1-[(K-1)/K]^{K}$ times the optimal value [144, 145]. Here $K$ is the size of the seed set. This bound has a limiting value of $1-1/e$ , which is independent of the size of the network or seed set. Leveraging this theoretical result, the simple greedy algorithm for these models is guaranteed to approximate the optimal influence within a factor of $1-1/e\approx 63\%$ , i.e., $\sigma(S)\geq(1-1/e)\sigma(S^{*})$ , where $S$ is obtained from the greedy algorithm and $S^{*}$ is the actual optimal set.
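The greedy procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of Ref. [22]: the adjacency-dictionary format, the uniform transmission probability `p`, and the function names `simulate_icm`, `estimate_sigma` and `greedy_seeds` are assumptions made here for the example.

```python
import random

def simulate_icm(graph, seeds, p, rng):
    """Run one independent cascade and return the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in graph[u]:
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return len(active)

def estimate_sigma(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of the influence function sigma(S)."""
    return sum(simulate_icm(graph, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Add, k times, the node with the largest estimated sigma(S + {v})."""
    rng = random.Random(seed)
    chosen, current = [], 0.0
    for _ in range(k):
        best_node, best_value = None, -1.0
        for v in graph:
            if v in chosen:
                continue
            value = estimate_sigma(graph, chosen + [v], p, runs, rng)
            if value > best_value:
                best_node, best_value = v, value
        chosen.append(best_node)
        current = best_value
    return chosen, current
```

Each of the $k$ rounds evaluates $\sigma$ for every remaining node, so the cost is of order $kN$ influence estimates, each requiring `runs` simulations. This is the burden that the lazy and percolation-based variants try to reduce.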

When the cost of selecting each node is not identical, the result of this basic greedy algorithm can be far from optimal. In such circumstances, a naive modification of the basic greedy algorithm can be made by favoring the node with the maximum benefit-cost ratio. Unfortunately, this intuitive generalization can perform arbitrarily worse than the optimal solution $S^{*}$ . In order to guarantee a relatively good performance, Leskovec et al. proposed the Cost-Effective Forward (CEF) algorithm [146]. As a combination of the benefit-cost and unit-cost greedy algorithms, the CEF algorithm provides a constant factor $(1-1/e)/2$ approximation of the maximal influence. Even though each of the two basic greedy algorithms can perform arbitrarily badly, it was proved that, in any given circumstance, at least one of them obtains a relatively good performance.

Due to the heavy computational burden of massive Monte Carlo simulations, greedy algorithms do not scale to large networks. This can be partly alleviated by exploiting the sparsity of cost reductions [146]. Furthermore, by exploiting the submodular property of the influence function, the number of simulations can be significantly reduced in practice. Given that the marginal increment of a node decreases monotonically as $S$ grows, there is no need to recompute the marginal increments for all nodes at each time step. Specifically, if the marginal increment of a node $i$ computed in a previous time step is already smaller than that of another node $j$ in the current time step, recomputing $\sigma(i)$ is unnecessary, as it is definitely smaller than $\sigma(j)$ . In calculations, the marginal influence of each node $\sigma(i)$ is marked valid initially. Before the next influencer is selected, the nodes are scanned in decreasing order of their influence. If $\sigma(i)$ for the top node $i$ is invalid, it is recomputed and inserted into the existing order using a priority queue. If the recomputation leads to a new value that still ranks at the top, the node is added to $S$ without calculating the marginal increments of any other nodes. This cost-effective lazy forward (CELF) algorithm leads to far fewer evaluations of the influence function and achieves up to a factor of 700 improvement in speed compared to CEF with equal performance. Further improvement of CELF can be made by recording, in a heap data structure, the node with the largest marginal gain among the nodes that have already been examined in the current iteration [147]. This technique can improve the efficiency of CELF by another 35-55%.
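The lazy evaluation idea can be illustrated with a priority queue, as in the following sketch. It is a simplified rendering under unit node costs; the stamp-based validity check, the helper `icm_spread` and the returned evaluation counter are illustrative choices made here, not details taken from Ref. [146]. Node labels are assumed to be mutually comparable (e.g., integers) so that ties in the heap can be broken.

```python
import heapq
import random

def icm_spread(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of the expected cascade size for the seed set."""
    total = 0
    for _ in range(runs):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v in graph[u]:
                    if v not in active and rng.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / runs

def celf(graph, k, p=0.1, runs=200, seed=0):
    """Lazy greedy selection: recompute a marginal gain only when it is stale."""
    rng = random.Random(seed)
    # Heap entries are (-gain, node, stamp); stamp = |S| when the gain was computed.
    heap = [(-icm_spread(graph, [v], p, runs, rng), v, 0) for v in graph]
    heapq.heapify(heap)
    chosen, total, evaluations = [], 0.0, len(heap)
    while len(chosen) < k and heap:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(chosen):
            # The gain is valid for the current S and tops the queue: select v.
            chosen.append(v)
            total += -neg_gain
        else:
            # Stale gain: recompute against the current S and reinsert.
            gain = icm_spread(graph, chosen + [v], p, runs, rng) - total
            evaluations += 1
            heapq.heappush(heap, (-gain, v, len(chosen)))
    return chosen, total, evaluations
```

Because submodularity implies that stale gains are upper bounds on current gains, a node whose refreshed gain remains at the top of the queue can be selected without touching the rest of the queue. With noisy Monte Carlo estimates this ordering argument holds only approximately.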

Further improvement of greedy algorithms was achieved using the connection between ICMs and percolation. As indicated before, ICMs can be mapped to a bond percolation. Based on this idea, Chen et al. performed a bond percolation on a graph $G$ to estimate the influence of a seed set [148]. Specifically, each link in a graph $G$ is randomly selected with the predefined transmission probability, and the selected links form a subgraph $G^{\prime}$ . Then the influence function $\sigma(S)$ can be quantified by the number of vertices reachable from $S$ in $G^{\prime}$ , where each edge in $G^{\prime}$ is regarded as a real propagation path. With this simplification, the influence of a single node $i$ can be obtained with a linear scan of the graph $G^{\prime}$ , and its marginal increment to $S$ is either $0$ or $\sigma(i)$ , depending on whether $i$ is in the influence range of $S$ or not. This procedure provides an $O(N)$ speedup over the basic greedy algorithm. In implementation, it can be combined with CELF to avoid unnecessary evaluations.
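As a rough illustration of this mapping, the sketch below samples percolated subgraphs $G^{\prime}$ and averages the number of nodes reachable from the seed set. The adjacency-dictionary input (each direction of an undirected link stored and sampled separately) and the names `sample_live_graph`, `reachable` and `sigma_by_percolation` are assumptions of this example, not part of Ref. [148].

```python
import random

def sample_live_graph(graph, p, rng):
    """Keep each directed edge independently with probability p."""
    return {u: [v for v in nbrs if rng.random() < p] for u, nbrs in graph.items()}

def reachable(live, sources):
    """Set of nodes reachable from `sources` along live edges (depth-first scan)."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        u = stack.pop()
        for v in live[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def sigma_by_percolation(graph, seeds, p, snapshots=500, seed=0):
    """Average reachable-set size over sampled live-edge subgraphs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(snapshots):
        live = sample_live_graph(graph, p, rng)
        total += len(reachable(live, seeds))
    return total / snapshots
```

On a single snapshot, one traversal yields the reachable set of $S$, so the marginal gains of many candidate nodes can be read off the same sampled subgraph instead of running a separate simulation per candidate.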

Despite the above improvements of greedy algorithms for the independent cascade model, they are still prohibitive for massively large social networks with millions of users. In order to balance performance and computational efficiency, Chen et al. also proposed a heuristic degree discount algorithm [148]. The basic idea of the degree discount algorithm is that $\sigma(i)$ should be quantified by the degree of node $i$ discounted by the number of its neighbors that are already included in $S$ . For ICMs with a small propagation probability, the indirect influence between multi-hop neighbors is negligible, so only the direct influence between immediate neighbors needs to be taken into account. Under this assumption, a more precise metric was proposed. The performance of this algorithm nearly matches that of the basic greedy algorithms. Furthermore, when combined with a heap data structure, it is far more efficient and scales to large networks.
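A compact sketch of the degree discount heuristic is given below. The discounted score used here, $dd_{v}=d_{v}-2t_{v}-(d_{v}-t_{v})t_{v}p$ with $d_{v}$ the degree of $v$, $t_{v}$ the number of its neighbors already selected and $p$ the propagation probability, is the commonly quoted form of the metric and should be checked against Ref. [148] before use. For brevity the example scans all nodes with `max` rather than maintaining a heap; the function name `degree_discount` is illustrative.

```python
def degree_discount(graph, k, p):
    """Select k seeds by discounted degree; graph maps node -> list of neighbors."""
    degree = {v: len(nbrs) for v, nbrs in graph.items()}
    score = dict(degree)          # discounted degree, initially the plain degree
    t = {v: 0 for v in graph}     # number of already selected neighbors
    chosen = []
    for _ in range(min(k, len(graph))):
        u = max((v for v in graph if v not in chosen), key=lambda v: score[v])
        chosen.append(u)
        for v in graph[u]:
            if v not in chosen:
                t[v] += 1
                # Assumed discount formula (see lead-in): d - 2t - (d - t) t p.
                score[v] = degree[v] - 2 * t[v] - (degree[v] - t[v]) * t[v] * p
    return chosen
```

In the small test network used to check this sketch, plain degree ranking would pick two adjacent hubs, whereas the discount steers the second seed to a hub in a different part of the network.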

Another scalable variant of the basic greedy algorithm was developed based on local influence regions [149]. The maximum influence arborescence (MIA) algorithm assumes that propagations tend to follow the maximum influence paths (MIPs) between each pair of nodes, which are defined as the path with the highest propagation probability among all possible paths connecting the pair. For a given pair of nodes, the MIP between them can be computed efficiently using the Dijkstra shortest-path algorithm [150, 151]. The union of MIPs starting or ending at a node $i$ forms an arborescence structure, which defines its local influence region denoted by $\delta(i)$ . The global influence of a set $S$ is then quantified by the size of the union of all local influence regions: $\sigma(S)=|\bigcup_{i\in S}\delta(i)|$ , where $|\cdot|$ denotes the size of a set. A tuning parameter $\theta$ is introduced so that all MIPs with probability below $\theta$ are discarded. By adjusting the parameter $\theta$ , the size of the local influence regions can be altered so that a tradeoff between computational efficiency and performance is achieved. Based on such approximations, the local marginal increment of a node can be calculated very efficiently. As the local influence function is also submodular, the basic greedy algorithm guarantees the $1-1/e$ approximation bound for influence maximization. The linearity of local marginal influence allows for the efficient update of incremental influence during iterations. More importantly, the update is only required in a local influence region around the selected influencer.
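The reduction of MIP computation to a shortest-path problem can be made concrete as follows: since the probability of a path is the product of its edge probabilities, maximizing it is equivalent to minimizing the sum of $-\log p_{uv}$ along the path. The sketch below applies Dijkstra's algorithm with these weights and discards paths whose probability falls below $\theta$. The input format (`prob[u][v]` as the propagation probability on edge $u\to v$) and the function name are assumptions of this example.

```python
import heapq
import math

def max_influence_paths(prob, source, theta=0.0):
    """Return {v: probability of the maximum influence path from source to v}.

    Paths with probability below theta are discarded.
    """
    dist = {source: 0.0}          # dist[v] = -log(probability of best path found)
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, p in prob.get(u, {}).items():
            if p <= 0.0:
                continue          # an edge with zero probability carries no influence
            nd = d - math.log(p)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return {v: math.exp(-d) for v, d in dist.items() if math.exp(-d) >= theta}
```

The set of nodes returned for a given `source` and `theta` plays the role of the outgoing part of its local influence region; raising `theta` shrinks the region and speeds up the computation at the expense of accuracy.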

Wang et al. proposed a community-based greedy algorithm for mining top-k influential nodes in mobile social networks [152]. In the algorithm, communities with regional information diffusion are first detected, and influential nodes are then located by selecting certain communities using a dynamic programming algorithm. As shown in recent works, the modularity of networks has a significant impact on information diffusion [153, 154, 155]. In essence, the community-based greedy algorithm considers information diffusion within each community separately to disentangle their interactions, and thus simplifies the process of selecting multiple influencers. This algorithm was found to be more than an order of magnitude faster than typical greedy algorithms. In a recent work, Hu et al. employed percolation theory to show that the spreading processes of ICMs are confined to a local area in most cases [156]. Therefore, local structure can be used to identify and quantify influential global spreaders in large-scale social networks, and an efficient percolation-based greedy algorithm was proposed on this basis.

In another line of research, instead of using Monte Carlo simulations, centrality metrics based on the topological structure of the underlying network were adopted to estimate nodes’ influence. These metrics are independent of specific spreading processes and thus can be calculated with high computational efficiency. In addition, they also shed light on the impact of network topology on spreading processes, which is of great significance in both accelerating and confining propagations. Instead of actually running the spreading process, these metrics are mostly based on the local or global topology of a node in the network, for instance, number of immediate neighbors [28, 29, 59, 157], global position [17, 158, 159, 160, 161, 162], number of shortest paths [163, 164, 165, 166], random walks [167, 168, 169], eigenvectors [170, 171, 172, 173], path counting [174, 175, 176, 177], etc. Even though an optimal metric that performs best for all spreading dynamics on all underlying networks does not seem to exist [178, 179, 180], these centrality-based approaches are still widely used due to their simplicity and relatively satisfactory performance in some cases.

4.2 Message passing approach

Although greedy optimization is guaranteed to approximate the maximum influence within a constant factor, it often suffers from the drawback of being trapped in a local optimum. From an optimization point of view, the message-passing approach, which has been well developed in statistical physics [129, 181], can avoid such an undesirable situation. In addition, message-passing algorithms usually scale almost linearly with the number of edges, which makes them applicable to large real-world networks. Based on the message-passing approach, Altarelli et al. developed the belief-propagation (BP) and max-sum (MS) algorithms for the problem of optimal immunization in the SIR and SIS models [46].

For each configuration $\mathbf{s}=(s_{1},s_{2},...,s_{N})$ , the following energy function is considered

$$\mathcal{E}(\mathbf{s},\mathbf{m})=\mu\sum_{i}c_{i}s_{i}+\epsilon\sum_{i}m_{i},$$

where $s_{i}\in\{0,1\}$ ( $s_{i}=1$ if $i$ is immunized, and $s_{i}=0$ otherwise), $c_{i}$ is the cost of immunizing node $i$ and $m_{i}$ is the probability that $i$ is eventually infected in the case of the SIR model, or the probability that it is infected in the stationary state in the case of the SIS model. The parameters $\mu$ and $\epsilon$ control the tradeoff between the cost of immunization and the cost of treating infected patients. The constraint on all feasible configurations is manifested by the local update equations of $m_{i}$ . Based on the energy function $\mathcal{E}(\mathbf{s},\mathbf{m})$ , a Boltzmann weight $e^{-\beta\mathcal{E}}$ is assigned to each feasible configuration, where $\beta$ is the inverse temperature. Taking the SIR model as an example, the probability $m_{ij}$ that node $i$ is infected in the absence of its neighboring node $j$ satisfies a set of equations:

$$m_{ij}=(1-s_{i})\Big[1-(1-q)\prod_{k\in\partial i\setminus j}(1-p\,m_{ki})\Big],$$

where $q$ is the self-infection probability, $p$ is the transmission probability, and $\partial i\setminus j$ denotes the neighbors of node $i$ excluding $j$ . Then the marginal probability $m_{i}$ that node $i$ is eventually infected is

$$m_{i}=(1-s_{i})\Big[1-(1-q)\prod_{k\in\partial i}(1-p\,m_{ki})\Big].$$

Based on the locally tree-like assumption, BP equations can be derived and solved through iteration, making use of the properties of convolutions of messages. As $\beta\to\infty$ , the Boltzmann distribution is concentrated on the optimal configuration with the lowest energy cost. In addition, the MS equations can be developed to find the nearly optimal set of immunized nodes. In simulations, the MS algorithm performs better than topology-based heuristics, the greedy algorithm and simulated annealing.
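To make the local update equations concrete, the following sketch iterates cavity messages for the SIR case with a fixed set of immunized nodes. It assumes the update rule $m_{ij}=(1-s_{i})[1-(1-q)\prod_{k\in\partial i\setminus j}(1-p\,m_{ki})]$ , which is a form consistent with the definitions of $q$ , $p$ and $\partial i\setminus j$ above but should be verified against Ref. [46]. It only evaluates infection probabilities for a given immunization set; it does not implement the BP or MS optimization over $\mathbf{s}$ . All function and variable names are illustrative.

```python
def sir_infection_marginals(graph, immunized, p, q, iters=200, tol=1e-10):
    """Fixed-point iteration of cavity messages; returns {i: m_i}.

    graph maps each node to a list of neighbors (undirected, both directions stored).
    """
    msgs = {(i, j): 0.0 if i in immunized else q
            for i in graph for j in graph[i]}
    for _ in range(iters):
        new, delta = {}, 0.0
        for (i, j) in msgs:
            if i in immunized:
                new[(i, j)] = 0.0          # s_i = 1: node i cannot be infected
                continue
            prod = 1.0
            for k in graph[i]:
                if k != j:
                    prod *= 1.0 - p * msgs[(k, i)]
            new[(i, j)] = 1.0 - (1.0 - q) * prod
            delta = max(delta, abs(new[(i, j)] - msgs[(i, j)]))
        msgs = new
        if delta < tol:
            break
    marginals = {}
    for i in graph:
        if i in immunized:
            marginals[i] = 0.0
            continue
        prod = 1.0
        for k in graph[i]:
            prod *= 1.0 - p * msgs[(k, i)]
        marginals[i] = 1.0 - (1.0 - q) * prod
    return marginals
```

On a tree the iteration converges after a number of sweeps comparable to the diameter. On loopy networks the result is an approximation whose quality depends on the locally tree-like assumption.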

In a recent work by Min [182], the message-passing approach was used to calculate analytically the expected size of epidemic outbreaks originating from a single seed. It was found that, while the probability of triggering an epidemic outbreak depends on the location of the seed, the final size of the outbreak, once it occurs, is insensitive to the seed. This approach is also applicable to weighted networks.

For ICMs, two important problems are connected: the optimal selection of nodes to either minimize or maximize the influence. The minimization problem, equivalent to optimal percolation, aims to find the “superblockers” that should be removed to make $G_{\infty}$ as small as possible. Instead, “superspreaders” are those that maximize the average influence if selected as seeds. Radicchi and Castellano performed an extensive analysis over a range of real-world networks and found that these two optimization problems are not equivalent, i.e., superblockers are not superspreaders [183]. The identification of superblockers is based purely on the topology of the network, while superspreaders in influence maximization problem are strongly dependent on the parameters of the spreading dynamics.

4.3 Sequential seeding

In the studies discussed above, influencers or information seeds are activated simultaneously at the start of diffusion (i.e., single stage seeding). An alternative approach is to initiate seeds sequentially, which allows the diffusion to take place before the next seeds are selected. Such a sequential seeding strategy has the advantage of avoiding the selection of highly ranked nodes that are already activated by previous diffusion. Jankowski et al. introduced several approaches for sequential seeding, and discussed the balance between diffusion speed and coverage [184]. Using experiments in real-world networks, it was found that sequential seeding strategies achieve better coverage than single stage seeding in about 90% of cases. Longer seeding sequences can activate more nodes but prolong the duration of diffusion. The authors proposed several variants of sequential seeding to resolve the trade-off between diffusion coverage and speed.
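The difference between the two strategies can be illustrated on a single realization of the diffusion, represented here as a fixed set of live edges. In this sketch, which is a simplification and not one of the specific variants of Ref. [184], the diffusion is allowed to finish completely before the next seed is drawn from the ranking, and nodes that are already active are skipped. The function names and the input format are assumptions of the example.

```python
def spread_from(live, sources, already_active):
    """Active set after diffusion from `sources` along live edges."""
    active = set(already_active) | set(sources)
    stack = list(sources)
    while stack:
        u = stack.pop()
        for v in live.get(u, []):
            if v not in active:
                active.add(v)
                stack.append(v)
    return active

def single_stage(live, ranking, k):
    """Activate the k top-ranked nodes at once."""
    return spread_from(live, ranking[:k], set())

def sequential(live, ranking, k):
    """Activate seeds one at a time, skipping nodes reached by earlier diffusion."""
    active, used = set(), 0
    for v in ranking:
        if used == k:
            break
        if v in active:
            continue              # no seed is spent on an already active node
        active = spread_from(live, [v], active)
        used += 1
    return active
```

On the same realization, the sequential strategy never spends a seed on a node that the cascade has already reached, which is the mechanism behind its coverage advantage. The price is a longer total diffusion time, since each stage waits for the previous one.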

Jankowski et al. further presented a formal proof that sequential seeding performs at least as well as single stage seeding in terms of spread coverage [185]. It was shown that, under mild assumptions, sequential seeding outperforms single stage seeding using the same number of seeds and node ranking. The authors compared single stage and sequential approaches with the greedy approach in experiments on directed and undirected graphs, and demonstrated that applying sequential seeding to a simple degree-based ranking leads to higher diffusion coverage than the computationally expensive greedy algorithm.

4.4 Summary

We summarize features of the methods introduced in this section in Table 2. For greedy approaches, the central task is to estimate the influence of each node, using either Monte Carlo simulation or local structural information. Following this idea, improvements have been designed along two directions: avoiding unnecessary simulations or developing better local proxies for influence. The performance of greedy algorithms is guaranteed for dynamics with the submodular property. The message-passing approach calculates the spreading outcomes by solving a set of BP equations, and thus considers the problem from a global viewpoint. In addition, it does not require the submodular property. The sequential seeding strategy aims to maximize diffusion coverage by adopting an alternative seeding approach, which draws attention to the trade-off between diffusion coverage and speed.

In a recent work by Erkol et al. [186], the performance of 16 methods for identifying influential spreaders in ICMs was systematically compared on a large corpus of 100 real-world networks. Extensive numerical experiments indicate that the performance of many simple heuristic methods, such as adaptive degree and closeness centrality, is similar to that of more computationally expensive greedy algorithms. This provides practical alternatives for large-scale problems where greedy algorithms are prohibitive. It was also found that the performance can be brought closer to optimality by using hybrid methods that combine multiple topological metrics.

5 Dynamics with discontinuous transitions

Threshold models and k-core percolation are frequently used to describe cascading processes with discontinuous phase transitions in various disciplines, for instance, failure propagation in infrastructure [94], diffusion of innovations in social networks [187], and adoption of new behaviors [188]. By definition, k-core percolation is a special case of a more general class of threshold models where each node has a fixed threshold $k$ . The fundamental difference between threshold models and ICMs is that, in threshold models, the state of a node is collectively determined by the states of all its neighbors. As a consequence, the impact of perturbing one node can propagate to a vast area of the network through long-range chains of interactions, manifested by a discontinuous phase transition in network dynamics. In this section, we first introduce methods developed for linear threshold models (LTMs) using the greedy strategy, belief propagation and collective influence, and then discuss algorithms designed for k-core percolation. Note that algorithms designed for LTMs are applicable to k-core percolation.

5.1 Linear threshold models

Linear threshold models have several different forms. A typical LTM is defined on a weighted network $G=(V,E,\omega)$ , where $\omega:V\times V\rightarrow[0,1]$ is a weight function and $\omega=0$ iff the corresponding edge does not exist. Similar to ICMs, the spreading process in LTMs is initiated by a set of seeds while all other nodes are inactive. In the following steps, a node is activated if the sum of weights of its active neighbors reaches its predefined threshold value $\theta_{i}$ , i.e. $\sum_{j\in\partial i}\omega_{ij}\geq\theta_{i}$ , where the sum runs over the active members of $\partial i$ , the set of neighbors of node $i$ . In another form, a node is activated if it has at least a certain number of active neighbors.
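The activation rule can be written down directly, as in the short sketch below. The input format (`weights[i][j]` standing for $\omega_{ij}$ , the weight that neighbor $j$ contributes to node $i$ , with every node present as a key) and the function name are choices made for this illustration. Nodes are updated in sweeps rather than in synchronous time steps; because activation is irreversible and monotone, the final active set is the same.

```python
def linear_threshold(weights, thresholds, seeds):
    """Return the final set of active nodes of a linear threshold cascade."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for i in weights:
            if i in active:
                continue
            # Sum the weights of the currently active neighbors of i.
            load = sum(w for j, w in weights[i].items() if j in active)
            if load >= thresholds[i]:
                active.add(i)
                changed = True
    return active
```

The check below also shows the collective nature of the dynamics: a node that neither seed can activate alone becomes active once both of its neighbors are active.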

The threshold value for each node can be either a fixed constant or a random variable drawn from a predefined distribution. For LTMs with a uniform fixed threshold value $\theta\in[0,1]$ , Singh et al. studied the cascade size as a function of the fraction of seeds [189]. It was found that even for large threshold values, a critical fraction of seeds exists beyond which the cascade becomes global. In addition, networks with community structure and high clustering were found to be more effective in facilitating cascades than homogeneous random networks. For LTMs with heterogeneous thresholds, Karampourniotis et al. examined how cascade size varies with the standard deviation of the distribution of thresholds [190]. Using a truncated normal distribution, the authors varied the distribution of thresholds between two extreme cases: identical thresholds and a uniform distribution. A non-monotonic change in the cascade size appeared as the standard deviation was varied, indicating that, for a given number of seeds, an optimal variance of the threshold distribution exists.

5.1.1 Greedy approach

The greedy algorithm is also applicable to LTMs. For a special class of LTMs where the weight of each edge and the threshold of each node are drawn uniformly from the interval $[0,1]$ , it was proved that its influence function is submodular [22]. Therefore, the influence maximization problem in this class of LTMs can be approximately solved by greedy algorithms.

Like ICMs, a linear threshold model can also be mapped to a modified percolation process defined as follows: each node $i$ picks at most one of its incoming edges, selecting the edge from $j$ to $i$ with probability $\omega_{ji}$ and selecting none with probability $1-\sum_{j}\omega_{ji}$ . The selected edges are defined as live. Considering the subgraph $G^{\prime}$ composed of live edges, Kempe et al. proved that for a given set $S$ , the number of nodes activated by $S$ in LTMs has the same distribution as the number of nodes reachable from $S$ in the subgraph $G^{\prime}$ [22].
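This live-edge construction lends itself to a simple sampling estimate of the influence, sketched below. The input format (`in_weights[i][j]` standing for $\omega_{ji}$ , the weight of the edge from $j$ to $i$ , with incoming weights summing to at most one and every node present as a key) and the function names are assumptions of this example.

```python
import random

def sample_live_edges(in_weights, rng):
    """Each node keeps at most one incoming edge; returns live out-adjacency lists."""
    live_out = {i: [] for i in in_weights}
    for i, incoming in in_weights.items():
        r = rng.random()
        acc = 0.0
        for j, w in incoming.items():
            acc += w
            if r < acc:
                live_out[j].append(i)   # edge j -> i is live
                break
        # If r exceeds the total incoming weight, node i selects no edge.
    return live_out

def ltm_sigma(in_weights, seeds, snapshots=2000, seed=0):
    """Average number of nodes reachable from the seeds over sampled live graphs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(snapshots):
        live = sample_live_edges(in_weights, rng)
        seen, stack = set(seeds), list(seeds)
        while stack:
            u = stack.pop()
            for v in live[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        total += len(seen)
    return total / snapshots
```

The equivalence with the threshold dynamics holds when node thresholds are drawn uniformly from $[0,1]$ , which is the setting of the submodularity result quoted above.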

Using the same mapping, Chen et al. gave an efficient approximation of the influence of an individual node in a local subgraph [149]. In cases where the weights $\omega_{ij}$ and $\omega_{ji}$ are not symmetrical, the undirected graph $G$ can be transformed into an equivalent directed graph, where edges from $i$ to $j$ and from $j$ to $i$ are both included. Using the randomized algorithm of Cohen [191], the influence of a set $S$ is quantified by the number of nodes reachable from $S$ in the subgraph $G^{\prime}$ . Although computing the exact influence in a network is $\#$ P-hard, this approximation based on a directed acyclic graph (DAG) can be computed in linear time. In order to further accelerate the calculation, a local DAG (LDAG) is considered instead of the full DAG. This approximation is justified by the exponential decay of influence with the propagation length. The construction of the LDAG should include the majority of the influence from other nodes while discarding the nodes with small influence. Similar to the idea in Ref. [149], a threshold is introduced to control the size of the LDAG, so that the tradeoff between efficiency and accuracy can be tuned. Once the LDAG is constructed, the incremental influence of each node can be quantified with great efficiency. As a result, the LDAG algorithm is scalable to networks with millions of nodes and is among the best greedy algorithms in performance.

The LDAG algorithm assumes that the influence of a node is mainly bounded within its LDAG. However, if the spreading process starting from a node can reach outside its LDAG, the estimation of influence in the LDAG algorithm might be inaccurate. Besides, the algorithm depends heavily on the proper choice of a high-quality LDAG, which is itself an NP-hard problem. To avoid these problems, Goyal et al. developed the SIMPATH algorithm, in which the influence of a node is quantified by enumerating the simple paths starting from it [147]. Although this problem is also $\#$ P-hard, it can be well approximated with high efficiency by enumerating paths within a small neighborhood. With this approximation, the influence of a set $S$ can be calculated as the sum of the influence of each node in it on appropriately induced subgraphs. Similar to the arborescence structures constructed in Ref. [149], a tuning parameter is introduced to control the size of the neighborhood, which leads to a direct trade-off between accuracy and computational efficiency. To reduce the number of estimation calls in SIMPATH, a vertex cover optimization was introduced so that only the influence of nodes in the vertex cover set needs to be computed. For the rest of the nodes, their influence can be derived from their neighbors. Besides, as the seed set $S$ grows larger, a look-ahead optimization can be made to accelerate the estimation: it picks the top $l$ most promising candidates as a batch at the start of an iteration and shares the marginal gain computation within the batch. Extensive experiments on real datasets show that, compared with the basic greedy algorithm, the SIMPATH algorithm is more efficient, consumes less memory and produces seed sets with larger influence.
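The core quantity behind path enumeration can be sketched as follows: under the live-edge picture of the LTM, the influence of a single seed equals one (the seed itself) plus the sum, over all simple paths leaving the seed, of the product of edge weights along the path. The sketch below enumerates these paths with a depth-first search and prunes any path whose weight drops below a cutoff `eta`. It omits the vertex cover and look-ahead optimizations; the input format (`out_weights[u][v]` as the weight of edge $u\to v$) and the names are assumptions of this example rather than details of Ref. [147].

```python
def simpath_spread(out_weights, source, eta=1e-3):
    """Approximate single-seed LTM influence by pruned simple-path enumeration."""
    total = 1.0                                   # the seed itself is always active
    stack = [(source, 1.0, frozenset([source]))]  # (node, path weight, visited set)
    while stack:
        u, w, visited = stack.pop()
        for v, b in out_weights.get(u, {}).items():
            if v in visited:
                continue                          # keep the path simple
            path_weight = w * b
            if path_weight < eta:
                continue                          # prune negligible paths
            total += path_weight
            stack.append((v, path_weight, visited | {v}))
    return total
```

Lowering `eta` enlarges the explored neighborhood and approaches the exact value at the cost of an enumeration that can grow exponentially.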

In a recent study by Karampourniotis et al., two different metrics were proposed to find influencers for LTMs with fixed heterogeneous thresholds [192]. The first metric, termed Balanced Index (BI), tends to select nodes with high resistance to activation and those with large out-degree. BI is a linear combination of three properties of a node: its degree, its susceptibility to new information, and the impact its activation would have on its neighbors. The performance of BI depends on the weights of these three properties. The second metric, termed Group Performance Index (GPI), quantifies the impact of each node as a seed when it is part of a randomly selected seed set. For LTMs with fixed and known thresholds, these two metrics were found to be effective for influence maximization.

The performance of most greedy algorithms mentioned above is guaranteed thanks to the submodular property of the influence function. However, for a general LTM with fixed weights and thresholds, the influence function is not always submodular [22]. An important class of LTMs that may not be submodular is defined as follows: a node $i$ is activated only after a certain number $m_{i}$ of its neighbors are activated. Varying the threshold $m_{i}$ can lead to two qualitatively different classes of cascades, featuring either continuous or discontinuous phase transitions. For instance, in the special case when $m_{i}=k_{i}-1$ ( $k_{i}$ is the degree of node $i$ ), the scale of propagation experiences a continuous phase transition [20]. In contrast, for k-core percolation and bootstrap percolation, a first-order, or discontinuous, phase transition may appear [44]. Solutions to the influence maximization problem in LTMs without the submodular property require a better understanding of the physical mechanism of the spreading process, and will be introduced in detail in the following subsections.

5.1.2 Belief-propagation algorithms

For the influence maximization problem on a general LTM, Altarelli et al. treated the optimal solution as a nontypical trajectory that deviates from the average behavior of dynamics initiated by randomly chosen seeds [60]. To explore the dynamical properties of nontypical trajectories of general LTMs, Altarelli et al. proposed a BP algorithm that can estimate statistical properties of nontypical trajectories and find the initial conditions that lead to cascades with desired properties [47]. In contrast to ICMs, the trajectory of a given LTM is determined solely by its initial condition. Due to the irreversibility of LTM dynamics, the spreading process can be parameterized by a configuration $\mathbf{t}=(t_{1},t_{2},...,t_{N})$ , where $t_{i}\in\mathbf{T}=\{0,1,...,T,\infty\}$ is the activation time of node $i$ . Considering the properties of LTM, the dynamical rule can be represented by the constraint on the activation time of a node and its neighbors [47]: $t_{i}=\phi_{i}(\{t_{j}\})$ , where

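$$\phi_{i}(\{t_{j}\})=\min\Big\{t\in\mathbf{T}:\sum_{j\in\partial i}\omega_{ij}\,\mathbb{I}[t_{j}<t]\geq\theta_{i}\Big\}.$$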

Based on this static parametrization of LTM, the following Boltzmann distribution is considered:

$$P(\mathbf{t})=\frac{1}{Z}e^{-\beta\varepsilon(\mathbf{t})}\prod_{i\in V}\psi_{i}(t_{i},\{t_{j}\}),$$

where $\psi_{i}(t_{i},\{t_{j}\})=\mathbb{I}[t_{i}=0]+\mathbb{I}[t_{i}=\phi_{i}(\{t_{j}\})]$ , $Z=\sum_{\mathbf{t}}e^{-\beta\varepsilon(\mathbf{t})}\prod_{i\in V}\psi_{i}(t_{i},\{t_{j}\})$ . The most common form of the energy function is $\varepsilon(\mathbf{t})=\sum_{i}\varepsilon_{i}(t_{i})$ , where $\varepsilon_{i}(t_{i})=\mathbb{I}[t_{i}=0]-\varepsilon\mathbb{I}[t_{i}<\infty]$ . For $\varepsilon=0$ , the distribution degenerates to the spreading dynamics initiated by a random set of seeds.

In order to avoid short loops in the factor graph that describes the constraints of a configuration, a dual factor graph is constructed with a variable node $(t_{i},t_{j})$ introduced to each edge $(i,j)$ . The obtained dual factor graph is locally tree-like if the original network is so. This allows for the application of the cavity method. Denote $P_{j}(t_{j})$ as the marginal probability that node $j$ is activated at time $t_{j}$ . In a tree-like factor graph, it can be calculated as

[TABLE]

where $H_{ij}(t_{i},t_{j})$ is defined as the probability that nodes $i$ and $j$ get activated at $t_{i}$ and $t_{j}$ respectively in the absence of the constraint $\psi_{j}$ and the energy term $\varepsilon_{j}$ . This equation computes the contribution from all neighbors of node $j$ . In the dual factor graph, the quantities $H_{ij}(t_{i},t_{j})$ , named cavity marginals or “beliefs”, satisfy local constraints described by a set of belief-propagation (BP) equations. In particular, the recursive relation of the cavity marginal $H_{ij}(t_{i},t_{j})$ on the dual factor graph defines the following BP equations [47]:

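$$H_{ij}(t_{i},t_{j})\propto e^{-\beta\varepsilon_{i}(t_{i})}\sum_{\{t_{k}\},\,k\in\partial i\setminus j}\psi_{i}(t_{i},\{t_{k}\})\prod_{k\in\partial i\setminus j}H_{ki}(t_{k},t_{i}).$$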

Here $\psi_{i}(t_{i},\{t_{k}\})$ is the local constraint on links connected to node $i$ (except node $j$ ), the product term computes the contribution of “beliefs” from the neighbors of node $i$ excluding node $j$ , the summation term considers different occasions of $t_{k}$ for neighbors of node $i$ , and $e^{-\beta\varepsilon_{i}(t_{i})}$ defines the weight for energy $\varepsilon_{i}(t_{i})$ using the Boltzmann distribution. The BP equations are solved through iteration. Once the fixed values of the cavity marginals are obtained, the marginal $P_{j}(t_{j})$ and other statistics of nontypical trajectories, such as the entropy and distribution of activation time, can be subsequently computed.

On homogeneous random regular graphs, the BP equations can be simplified to a self-consistent equation for a single marginal. Analysis for different threshold values indicates a quantitative difference in the distribution of activation time $P(t)$ between the regimes of continuous and discontinuous transitions. Specifically, for continuous transitions, $P(t)$ is monotonically decreasing. In contrast, $P(t)$ shows a second peak for discontinuous transitions, corresponding to the abrupt cascade of activation. To obtain the optimal set of seeds, Max-Sum equations can be derived by taking the inverse temperature $\beta\to\infty$ in the Boltzmann weight [47]. The authors performed numerical experiments on a real-world network (the Epinions network) with the energy function $\varepsilon(\mathbf{t})=\sum_{i}\{c_{i}\mathbb{I}[t_{i}=0]-r_{i}\mathbb{I}[t_{i}<\infty]\}$, where $c_{i}$ is the cost of seeding node $i$ and $r_{i}$ is the revenue generated by the activation of node $i$. The Max-Sum algorithm was compared with competing methods, including a greedy algorithm based on energy computation (GA), a greedy algorithm based on HITS (HITS), high degree (Hubs) and simulated annealing (SA). The Max-Sum algorithm outperforms the other approaches by selecting the seed set with the best trade-off between revenue and cost. On synthetic networks, the Max-Sum algorithm also outperforms a range of centrality metrics, as shown in Fig. 8.
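To make the cost-revenue energy function concrete, the following minimal sketch evaluates $\varepsilon(\mathbf{t})$ for one trajectory. The function name `energy`, the dictionary-based representation, and the toy numbers are illustrative assumptions and are not taken from Ref. [47]; a node that never activates is represented by `math.inf`.

```python
import math

def energy(activation_times, cost, revenue):
    """Evaluate sum_i { c_i * I[t_i = 0] - r_i * I[t_i < inf] } for one trajectory.

    All three arguments are dicts keyed by node (an assumed representation).
    """
    total = 0.0
    for node, t in activation_times.items():
        if t == 0:            # node is a seed: pay its seeding cost c_i
            total += cost[node]
        if t < math.inf:      # node activates at some finite time: collect r_i
            total -= revenue[node]
    return total

# Toy trajectory (hypothetical): "a" is a seed, "b" activates at t=1, "c" never does.
times = {"a": 0, "b": 1, "c": math.inf}
cost = {"a": 2.0, "b": 2.0, "c": 2.0}
revenue = {"a": 1.0, "b": 1.0, "c": 1.0}
print(energy(times, cost, revenue))  # 0.0
```

In this toy case the seeding cost of 2 is exactly offset by the revenue from the two nodes that activate; a lower energy corresponds to a better trade-off.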

Extending the work carried out under the assumption of replica symmetry, Guggiola and Semerjian [100] studied the minimal contagious set problem for LTM dynamics with and without a constraint on the maximal activation time $T$. In this theoretically impressive work, the authors aimed to find the theoretical limit of the minimal contagious set (i.e., the minimal seed set that can activate the entire graph) in random regular graphs, using the cavity method with replica symmetry breaking taken into account. Following the theoretical development, a survey-propagation-like algorithm [193] was investigated on single instances of random regular graphs to find the exact seed set. It was found that the survey propagation algorithm achieves near-optimal performance for small activation time limits. For large activation time limits, the authors reported convergence issues in the iteration that cannot be effectively solved by simple damping. However, stopping the iterations after a predefined time proved to be a pragmatic and satisfactory strategy. The algorithm was tested only on random regular graphs; how the survey propagation algorithm performs on more realistic networks remains to be tested. Readers interested in the survey propagation algorithm can find details in Ref. [100].

5.1.3 Collective influence in threshold model

The collective influence theory can be generalized to handle influence maximization in general LTMs [194]. For a network $G(V,E)$ with $N$ nodes and $M$ links, we use the vector $\mathbf{n}=(n_{1},n_{2},\cdots,n_{N})$ to record whether a node $i$ is chosen as a seed ($n_{i}=1$) or not ($n_{i}=0$). The LTM spreading starts from a fraction $q=\sum_{i}n_{i}/N$ of active seeds and evolves following a threshold rule: a node $i$ becomes active if it has at least $m_{i}$ active neighbors. Here, the threshold $m_{i}$ is an integer ranging from 1 to the degree of node $i$. Further, we introduce $\nu_{i}$ to indicate the final state of node $i$: active ($\nu_{i}=1$) or inactive ($\nu_{i}=0$). For a given fraction $q$ of seeds, the influence maximization problem aims to find the optimal set of seeds so that the size of the active population is maximized.
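The threshold rule just described can be simulated directly. The sketch below is a minimal illustration assuming an undirected graph stored as a dictionary of neighbor lists; the function name `run_ltm` and the five-node toy graph are hypothetical and serve only to show how the final states $\nu_{i}$ follow from a seed set.

```python
def run_ltm(neighbors, thresholds, seeds):
    """Return the final active set of the deterministic threshold cascade.

    neighbors:  dict node -> iterable of adjacent nodes (undirected graph)
    thresholds: dict node -> integer threshold m_i
    seeds:      iterable of initially active nodes (those with n_i = 1)
    """
    active = set(seeds)
    # Number of already-processed active neighbors seen by each node.
    active_count = {v: 0 for v in neighbors}
    frontier = list(active)
    while frontier:
        u = frontier.pop()
        for v in neighbors[u]:
            if v in active:
                continue
            active_count[v] += 1
            if active_count[v] >= thresholds[v]:  # threshold rule: m_i active neighbors
                active.add(v)
                frontier.append(v)
    return active

# Hypothetical toy graph with heterogeneous thresholds.
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
thresholds = {0: 1, 1: 2, 2: 1, 3: 2, 4: 1}

print(sorted(run_ltm(neighbors, thresholds, {0})))  # [0, 1, 2, 3, 4]
print(sorted(run_ltm(neighbors, thresholds, {4})))  # [4]
```

With the same seed fraction $q=1/5$, seeding node 0 activates the whole graph while seeding node 4 activates nothing else, which is the kind of difference the influence maximization problem tries to exploit.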

For each link $i\to j$, we introduce a binary variable $\nu_{i\to j}$ as the indicator of $i$ being in the active state assuming node $j$ is disconnected from the network. For locally tree-like networks, $\nu_{i\to j}$ satisfies a set of self-consistent message-passing equations. Different from the case of optimal percolation in Ref. [20], the zero solution is not a fixed point. As a consequence, the stability analysis around the zero solution in Ref. [20] is no longer valid for LTMs. However, the solution can be approximated through iteration of the linearized system. By linearizing the equations, it was found that the subsequent activation of nodes in each iteration depends only on the number of subcritical nodes, defined as the nodes with $m_{i}-1$ active neighbors (i.e., nodes whose activation can be triggered by one more active neighbor). Moreover, subcritical nodes can form long subcritical paths that generate long-range cascades of activation, which are at the core of the discontinuous transition in LTM dynamics. Following this idea, the CI-TM (Collective Influence in Threshold Model) algorithm was proposed, which recursively selects nodes with the largest CI-TM score. The CI-TM score counts the subcritical paths starting from each node, and uses this number to quantify the node's spreading capability. With an $O(N\log N)$ computational complexity, the CI-TM algorithm is applicable to large-scale networks. In numerical simulations, the CI-TM algorithm outperforms the greedy algorithm and several widely used heuristic centralities, and achieves performance comparable to the Max-Sum algorithm in synthetic random networks (see Fig. 8).
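The notion of a subcritical node can be illustrated with a few lines of code. The sketch below only identifies subcritical nodes for a given active set; it is not the CI-TM algorithm of Ref. [194], and the function name `subcritical_nodes` and the toy graph are assumptions made for illustration.

```python
def subcritical_nodes(neighbors, thresholds, active):
    """Inactive nodes with exactly m_i - 1 active neighbors.

    One additional active neighbor would activate each returned node.
    """
    active = set(active)
    result = set()
    for v, nbrs in neighbors.items():
        if v in active:
            continue
        n_active = sum(1 for u in nbrs if u in active)
        if n_active == thresholds[v] - 1:
            result.add(v)
    return result

# Hypothetical toy graph; node 4 is the only active node.
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
thresholds = {0: 1, 1: 2, 2: 1, 3: 2, 4: 1}

print(sorted(subcritical_nodes(neighbors, thresholds, {4})))  # [0, 2, 3]
```

Note that, under this definition, an inactive node with threshold 1 and no active neighbors is subcritical, as nodes 0 and 2 are here.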

5.2 K-core percolation

Because k-core percolation is a special case of LTMs, influence maximization algorithms developed for general LTMs can be naturally extended to work for k-core percolation.

In statistical physics and combinatorial optimization, several theoretical works have explored lower and upper bounds on the size of the minimal set whose removal destroys the k-core. When evaluating approximation algorithms, these results help us to assess how far the estimated size of the minimal contagious set is from the theoretical limit. For instance, Bau et al. studied the decycling numbers of random regular graphs [110]. As stated before, the decycling process is equivalent to destroying the 2-core of networks. For a random cubic graph $G$ in which all nodes have degree 3, it was proven that the decycling number $\phi(G)=\lceil N/4+1/2\rceil$ as the graph size $N\to\infty$. For a general random $d$-regular graph $G$ with $N$ nodes ($d\geq 4$), the authors proved that $\phi(G)/N$ is bounded below and above asymptotically almost surely by certain constants that depend solely on $d$. In particular, the lower and upper bounds can be calculated by solving an algebraic equation and a set of differential equations, respectively. Janson and Thomason found that, for sparse random graphs or random regular networks with $N$ nodes and $N\to\infty$, the number of nodes that must be removed so that no component with more than $k$ nodes remains is essentially the same for all values of $k$ if $k\to\infty$ and $k=o(N)$ [195]. Reichman showed that the size of a contagious set is bounded from above by $\sum_{v\in V}\min\left\{1,\frac{k}{d(v)+1}\right\}$ in the destruction of the k-core ($d(v)$ is the degree of node $v$) [103]. Later, using the cavity method with replica symmetry breaking, Guggiola and Semerjian [100] obtained several conjectures on the size of minimal contagious sets for k-core percolation in random regular graphs. In particular, the authors conjectured that the minimal contagious set size, as a fraction of the nodes, is 1/6 for 5-regular random graphs with a threshold of 3, and 1/4 for 6-regular random graphs with a threshold of 4. In addition, they proposed the conjecture that, for ($k$+1)-regular networks with threshold $k$, the minimal contagious set size is $1-2(\ln k)/k-2/k+O(1/(k\ln k))$. According to this conjecture, the minimal contagious set size for 3-regular (cubic) random graphs with a threshold of 2 is 1/4, which is in agreement with the decycling number of cubic random graphs $\phi(G)/N\to 1/4$ obtained in Ref. [110]. Sun et al. also proposed a lower bound for the network dismantling problem by analyzing specific 2-core subnetworks of many real-world networks that have heterogeneous degree distributions [196]. Coja-Oghlan et al. explored the minimal contagious set problem on graphs with expansion properties [197].
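Among the bounds above, the expression attributed to Reichman [103] is simple enough to evaluate by hand or in a few lines. The sketch below computes $\sum_{v\in V}\min\{1,k/(d(v)+1)\}$ from a degree sequence; the function name `reichman_upper_bound` and the 12-node example are illustrative assumptions.

```python
def reichman_upper_bound(degrees, k):
    """Evaluate sum_v min{1, k / (d(v) + 1)} over a list of node degrees d(v)."""
    return sum(min(1.0, k / (d + 1)) for d in degrees)

# Hypothetical example: a 5-regular graph with N = 12 nodes and k = 3.
degrees = [5] * 12
bound = reichman_upper_bound(degrees, k=3)
print(bound)                 # 6.0
print(bound / len(degrees))  # 0.5
```

Each node contributes $3/6=1/2$ in this example, so the bound equals half of the nodes; nodes with degree below $k$ contribute the full value 1.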

Recently, Schmidt et al. [133] studied the minimal contagious sets for k-core percolation in random networks. In this work, the authors proposed a generalized CoreHD algorithm, in which nodes with the highest degree in the k-core are recursively removed until the k-core completely collapses. To analyze the properties of this algorithm, the generalized CoreHD-guided k-core removal was translated into a random process on the degree distribution of the graph [198, 199]. The running time of the process, characterized by a set of nonlinear ordinary differential equations, describes the behavior of the algorithm on a random graph. By analyzing the stopping time, new upper bounds on the minimal contagious set were obtained, which improve on the best previously known ones in Refs. [100, 110]. This approach is applicable not only to random regular graphs, but also to random networks generated from the configuration model with a given degree distribution. Inspired by the analysis of the CoreHD algorithm, an improved algorithm, called WEAK-NEIGHBOR, was developed. In this algorithm, instead of removing high-degree nodes, nodes with the highest value of $k_{i}-\sum_{j\in\partial i}k_{j}/k_{i}$ in the k-core are removed ($k_{i}$ is the degree of node $i$), that is, the degree of a node minus the mean degree of its neighbors. For networks with bounded degree, the algorithm has $O(N)$ complexity, where $N$ is the network size. In numerical experiments, the WEAK-NEIGHBOR algorithm improves over the generalized CoreHD algorithm and the CI-TM algorithm in a range of k-core percolation processes in random regular graphs.
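The removal rule of the generalized CoreHD algorithm, as described above, can be sketched as follows. This is a simplified illustration and not the implementation of Ref. [133]: it recomputes the k-core from scratch after every removal for clarity rather than speed, and the function names, the tie-breaking rule (smallest integer label), and the toy graph are assumptions.

```python
def k_core(neighbors, k, removed):
    """Return (nodes of the k-core, their degrees) after deleting `removed`."""
    alive = set(neighbors) - set(removed)
    degree = {v: sum(1 for u in neighbors[v] if u in alive) for v in alive}
    stack = [v for v in alive if degree[v] < k]
    while stack:  # iteratively prune nodes with fewer than k surviving neighbors
        v = stack.pop()
        if v not in alive:
            continue
        alive.remove(v)
        for u in neighbors[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < k:
                    stack.append(u)
    return alive, degree

def generalized_corehd(neighbors, k):
    """Remove the highest-degree k-core node repeatedly until the k-core is empty."""
    removed = []
    core, degree = k_core(neighbors, k, removed)
    while core:
        # Highest degree inside the k-core; ties broken by smallest label (assumed).
        target = max(core, key=lambda v: (degree[v], -v))
        removed.append(target)
        core, degree = k_core(neighbors, k, removed)
    return removed

# Hypothetical toy graph: two 4-cliques joined by the edge (3, 4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
neighbors = {v: set() for v in range(8)}
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)

print(generalized_corehd(neighbors, k=3))  # [3, 4]
```

Replacing the key in `max` with the degree minus the mean neighbor degree would give a rule in the spirit of WEAK-NEIGHBOR, although the details of that algorithm should be taken from Ref. [133].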

5.3 Summary

For LTMs, the main effort in greedy approaches is to develop more efficient and accurate estimates of marginal increments using local network structure. This pursuit has inspired a variety of techniques. Most greedy methods quantify the marginal increment by the number of nodes that would be activated if a node is selected as a seed. The CI-TM algorithm, in contrast, uses the number of subcritical paths attached to a node to estimate the marginal increment. Belief-propagation approaches treat the problem globally, solving it through iteration, and can flexibly incorporate the cost of activating seeds. Apart from devising practical methods to solve the influence maximization problem for LTMs, analytical works on random regular graphs help to identify how far current approaches are from the theoretical limit on the size of the optimal seed set. Features of the introduced methods are summarized in Table 3.

6 Conclusions and discussions

With an increasing number of real-world complex systems formulated as networks, a theory for identifying influencers is required to facilitate a better understanding and control of various dynamical complex systems. Over the years, this problem has been extensively studied in different contexts by physicists, mathematicians, sociologists, computer scientists, etc. In this survey, we review recent advances in this area. Because this topic spans a wide spectrum of research, we cannot report every relevant work exhaustively. However, we try to organize the survey in a way such that recent developments made in several fields of broad interest are covered.

Despite great advances in influencer identification, many open problems and directions remain to be addressed in future works. First, as shown in several theoretical works, even for homogeneous structures such as random regular networks, there is still a gap between the results obtained from state-of-the-art algorithms and the theoretical limit. This leaves room for improvement in algorithm design. Second, the topological structure of real-world complex systems can be much more complicated than the cases considered under ideal conditions. In a recent comparative analysis, it was found that recently proposed techniques perform well only on specific network types [200]. Further, connections may be time-varying in temporal networks [201], or possess complicated interlayer interactions in multiplex networks [202, 203]. Third, in many systems, links are often of different types with distinct functions. These systems cannot be described by the simple network structure discussed before, and do not even admit a formal definition of influencers. These open problems remain to be explored in more detail in future works.

In terms of applications, the use of influencer identification theory in biological, social and engineering systems is still very limited. As some advanced methodologies in statistical physics are technical and challenging to interpret, applying the latest progress in influencer identification to specific real-world systems can better illustrate and disseminate these techniques. Moreover, current methods are mostly developed under ideal conditions. In real-world systems, errors or noise inevitably exist [204, 205]. How to quantify and alleviate the impact of errors or noise is of great practical value in applications. In addition, certain non-dynamical factors beyond the simplified assumptions of pure modeling studies, e.g., human activity [206, 207, 208, 209, 210, 211, 212, 213, 214], homophily [215, 216, 217], complex contagion [188, 218] and social influence bias [219], may need to be considered. This calls for a deeper understanding of the systems under study and a more integrative application of the influencer identification theory.

Funding

Part of this work was supported by the National Institutes of Health [R01EB022720, U54CA137788, and U54CA132378 to H.A.M.]; National Science Foundation [1515022 to H.A.M.]; Army Research Laboratory [W911NF-09-2-0053 to H.A.M.]; and China Scholarship Council and the Academic Excellence Foundation of BUAA for PhD Students (to J.W.).

Bibliography

[1] Pastor-Satorras, R., Castellano, C., Van Mieghem, P. & Vespignani, A. (2015) Epidemic processes in complex networks. Rev. Mod. Phys., 87(3), 925.
[2] Zhang, Z.-K., Liu, C., Zhan, X.-X., Lu, X., Zhang, C.-X. & Zhang, Y.-C. (2016) Dynamics of information diffusion and its applications on complex networks. Phys. Rep., 651, 1–34.
[3] Bullmore, E. & Sporns, O. (2009) Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci., 10(3), 186.
[4] Montoya, J. M., Pimm, S. L. & Solé, R. V. (2006) Ecological networks and their fragility. Nature, 442(7100), 259.
[5] Newman, M. E. J. (2003) The structure and function of complex networks. SIAM Rev., 45(2), 167–256.
[6] Barrat, A., Barthelemy, M. & Vespignani, A. (2008) Dynamical processes on complex networks. Cambridge University Press, New York.
[7] Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D.-U. (2006) Complex networks: Structure and dynamics. Phys. Rep., 424(4-5), 175–308.
[8] Albert, R. & Barabási, A.-L. (2002) Statistical mechanics of complex networks. Rev. Mod. Phys., 74(1), 47.