A Differential Game Approach to Decentralized Virus-Resistant Weight Adaptation Policy over Complex Networks
Yunhan Huang, Quanyan Zhu

TL;DR
This paper models virus spread in complex networks using a differential game framework, proposing a decentralized weight adaptation policy to mitigate malware propagation and improve network resilience.
Contribution
It introduces a novel differential game approach for decentralized virus mitigation and designs a penalty-based mechanism to align individual actions with social welfare.
Findings
Nash equilibrium structure characterized in the epidemic control game.
Decentralized weight adaptation reduces virus spread effectively.
Penalty scheme improves overall network resilience.
Taxonomy
Topics: Opinion Dynamics and Social Influence · Complex Network Analysis Techniques · Mathematical and Theoretical Epidemiology and Ecology Models
Y. Huang and Q. Zhu are both with the Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA; e-mail: {yh2315,qz494}@nyu.edu.
Abstract
Increasing connectivity of communication networks enables large-scale distributed processing over networks and improves the efficiency of information exchange. However, malware and viruses can take advantage of the high connectivity to spread over the network and take control of devices and servers for illicit purposes. In this paper, we use an SIS epidemic model to capture the virus spreading process and develop a virus-resistant weight adaptation scheme to mitigate the spreading over the network. We propose a differential game framework to provide a theoretical underpinning for decentralized mitigation in which nodes of the network cannot fully coordinate, and each node determines its own control policy based on local interactions with neighboring nodes. We characterize and examine the structure of the Nash equilibrium, and discuss the inefficiency of the Nash equilibrium in terms of minimizing the total cost of the whole network. A mechanism design through a penalty scheme is proposed to reduce the inefficiency of the Nash equilibrium and allow the decentralized policy to achieve social welfare for the whole network. We corroborate our results using numerical experiments and show that virus resistance can be achieved by a distributed weight adaptation scheme.
Index Terms:
Virus Resistance, Malware Spreading, Differential Game, Complex Networks, Decentralized Control, Mechanism Design, Network Security, Epidemic Processes.
I Introduction
The integration of information and communications technologies into systems improves system performance. However, the integration also degrades the security level of these systems and introduces vulnerabilities that undermine the reliability of critical infrastructure. The connectivity and interdependence of cyber networks make the systems even more vulnerable to wide-spreading cyber-attacks, because they give sophisticated and stealthy malware and viruses the opportunity to spread over the network. One noteworthy example is the StuxNet attack [1]. In June 2010, certain control systems of a nuclear-enrichment plant in Iran were infected by a carefully crafted computer worm called StuxNet. The worm, spreading through USB devices, was intended to breach the implemented cyber-protection schemes and alter both the measurement and actuation signals, causing instabilities and damaging the physical plant [2]. More recent examples of wide-spreading cyber-attacks include the WannaCry and Petya ransomware, which have incurred billions of dollars of losses [3].
With an increasing number of wide-spreading cyber-attacks on networks, protection against malware and virus spreading in cyber networks is central to the security of network systems [3]. However, there are many challenges in designing a protection scheme for cyber networks. One challenge is the interdependency between microscopic individual behaviors and the macroscopic spreading phenomenon. The local interactions over a large network, where nodes communicate, share information, and make interdependent decisions, can result in a macroscopic behavior, which in turn affects the agents’ behaviors. This type of microscopic and macroscopic coupling is illustrated in Fig. 1. Another challenge arises from the fact that cyber networks are often formed by a large number of self-interested agents or decision-makers. The lack of cooperation among the agents makes it almost impossible for the system to be coordinated as a whole to defend against wide-spreading cyber-attacks.
To this end, one way to mitigate malware spreading over large networks is to control the intensity of interactions with neighboring nodes. By adapting the rate of communications or contacts, nodes can reduce the likelihood of infection. This type of mechanism is called weight adaptation, as the weights between two nodes of a network capture the intensity of the connectivity [4]. The most fundamental reason that viruses and malware can spread widely is an inherent property of networks: connectivity. Weight adaptation targets this property directly: lowering the weights reduces the connectivity that viruses and malware rely on to spread. Compared with quarantining and link removal [5], weight adaptation does not need to completely disconnect nodes from others but rather adjusts weights to connect more loosely with nodes that have a higher likelihood of infection. Instead of fixing the weights for the whole spreading process, in the weight adaptation scheme each agent dynamically updates its weights in response to the state of the neighboring nodes. Weight adaptation is different from changing the infection rate. The infection rate is usually considered to be determined by interior factors such as the physiological or immunological state of an individual. The weight between two nodes describes how strongly the two nodes are connected, so changing the weight can be interpreted as an exterior change.
We consider a directed weighted network where the nodes and the edges represent the agents and the connections between the agents, respectively. A directed connection between two nodes can be interpreted as one agent acquiring information, data, or packets from another agent. The weight between two agents quantifies the frequency or the volume of communication between them [6]. The original weight is pre-designed by multilateral agreement among agents to achieve certain goals or to optimize the system performance when there is no infection. For example, in distributed estimation or learning problems over networks such as [7, 8, 6], one agent needs to communicate with its neighboring agents at a sufficient rate to find the global estimate of the state. The optimal weighting on the edges quantitatively captures the minimum required frequency of contacting neighboring nodes. As illustrated in Fig. 2, when a virus or cyber-attack is spreading widely, an agent can decrease its likelihood of being infected by reducing the weights of its connections to infected neighbors. The agent then restores the connections once the infected neighbors have recovered. Deviation of the weights from the optimal ones introduces a cost induced by performance degradation and system inefficiency. Infected agents, in turn, may not function normally, so the agents and the network system suffer losses. Thus, it is essential to consider the trade-off between the malfunction cost caused by infection and the inefficiency or performance degradation cost caused by weight deviation.
In this paper, an N-person nonzero-sum differential game-based model is proposed to model the virus spreading and the agents’ adaptive response to virus infection. This model captures the non-cooperative behaviors among agents, the dynamic properties of the spreading process, and the complexity of the local interactions. We characterize the Nash equilibrium (NE) for the game and investigate the network effects under the non-cooperative strategies. We observe that under the open-loop NE, each agent updates its weights based on its own infection level and its out-neighbors’ infection levels as well as the corresponding component of its costate. When the agent’s own infection level is high, it does not care much about the weights of the links to infected out-neighbors. When an out-neighbor’s infection level is high, the agent lowers the weight of the corresponding connection more. The corresponding component of each agent’s costate encodes information about the network structure and the infection of the whole network.
We use a centralized optimal control problem as a benchmark to study the efficiency of the decentralized problem. Under centralized policies, the system operator develops an optimal weight adaptation scheme to achieve the social optimum. Compared with the centralized solution, the open-loop NE solution is not the best from a system point of view, since in the game each agent considers only its own cost. Such inefficiency caused by the selfish behavior of agents has a significant impact on network and service management. One example is the congestion in traffic networks caused by selfish drivers [9]. To address the inefficiency, we propose a dynamic penalty approach by designing a mechanism in which each agent pays for the infection cost of all agents that are reachable to him/her. We show that with this mechanism, the open-loop NE policy achieves the social optimum.
The equilibrium analysis and the mechanism design lead to a distributed algorithm for the network operator and the agents to compute the optimal weight adaptation where each agent only has to know local information. We summarize the principal contributions as follows:
1. We propose a differential game model to develop a virus-resistant weight adaptation scheme for cyber networks formed by a group of self-interested agents.
2. We study the structure of the open-loop NE for the differential game over complex networks and show the weight adaptation rule is based on the agents’ and its out-neighbors’ infection level as well as the costate.
3. We discuss the inefficiency of the NE. A dynamic penalty scheme is proposed to achieve social optimum for the whole network.
4. An implementable distributed virus resistance algorithm is proposed to compute the NE-based control policy.
Game theory has long been a useful tool for designing virus-resistance strategies on network systems [3, 10, 11, 12]. In [10], the authors have proposed a network formation game that balances multiple partially conflicting objectives such as the cost of installing links, the performance of the network, and the resistance to viruses. In their work, an undirected, unweighted, static network is formed. Hayel et al. in [3, 11] have studied large-population games with heterogeneous types of individuals. They focus on the group behavior of a certain type instead of individual behavior. Besides game theory, other tools such as impulse control [13], optimal control [14], and optimization [15] have been used to design strategies to mitigate malware attacks and virus spreading.
Virus spreading over adaptive networks was first studied by Gross et al. in [16]. They investigated adaptive behavior in a homogeneous setting where the whole network applies the same adaptation. Based on the work on epidemic spreading over time-varying networks [17], an optimal control method has been used in [14] to find the optimal time-varying topology response for the network system. However, the centralized optimal control method is impractical and gives individual agents no incentive to comply. The effect of heterogeneous weight adaptation on virus spreading has been studied by Yun et al. in [18, 19]. In [18], the authors have proposed a weight adaptation rule without taking cost into consideration. The weight adaptation rule is based on the infection level of the whole network.
Vaccination and immunity have been studied for the control of virus spreading for decades [13, 15, 20]. However, vaccination may not be effective against some malware and viruses because they evolve quickly and are hard to detect. Also, getting every individual vaccinated is costly. Quarantining [5] is equivalent to removing all connections of one agent. Compared with the weight adaptation scheme, disconnecting all links is an overreaction, since connections with healthy agents cause no harm.
The paper is organized as follows. In Sect. II, preliminaries are given and the N-person nonzero-sum differential game framework is introduced. Sect. III describes the open-loop NE of the differential game and the weight adaptation scheme. Sect. IV studies the efficiency of the NE solution. Comparisons of the differential game-based weight adaptation scheme with the optimal control-based scheme and other numerical results are given in Sect. V. Conclusions are given in Sect. VI.
II Preliminaries and Problem Formulation
In this section, we introduce notations and preliminary results needed in our derivations. Along the way, we describe and develop the problem formulation.
II-A Graph Theory
A weighted, directed graph can be defined by a triple . represents a set of nodes. Define . A set of directed edges is denoted by . The set of in-neighbors of node is defined as . Denote by the cardinality of a set. So, the in-degree of is . Similarly, the set of out-neighbors of is . The out-degree of is . The weight adjacency matrix is denoted by an matrix where refers to the weight of the edge from node to . We assume that graph has no self-loops.
We denote the original weight adjacency matrix by . Let () be the set of out-neighbors (in-neighbors) under the original optimal weight pattern .
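As an illustration of the graph notation above, the following minimal Python sketch extracts in-neighbor and out-neighbor sets and the corresponding degrees from a weight adjacency matrix. The function name neighbor_sets, the example matrix, and the convention that a positive entry W[i, j] encodes an edge from node i to node j are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np


def neighbor_sets(W):
    """Return (out_neighbors, in_neighbors) as lists of sets.

    Assumption for this sketch: W[i, j] > 0 means there is a directed
    edge from node i to node j, and the graph has no self-loops.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    if W.shape != (n, n):
        raise ValueError("W must be a square matrix")
    if np.any(np.diag(W) != 0):
        raise ValueError("self-loops are not allowed")
    out_nbrs = [set(np.flatnonzero(W[i, :] > 0).tolist()) for i in range(n)]
    in_nbrs = [set(np.flatnonzero(W[:, i] > 0).tolist()) for i in range(n)]
    return out_nbrs, in_nbrs


# Hypothetical 3-node directed cycle 0 -> 1 -> 2 -> 0 with arbitrary weights.
W_example = np.array([[0.0, 0.5, 0.0],
                      [0.0, 0.0, 1.0],
                      [0.2, 0.0, 0.0]])
out_nbrs, in_nbrs = neighbor_sets(W_example)
out_degree = [len(s) for s in out_nbrs]
in_degree = [len(s) for s in in_nbrs]
```

Sets are used because the definitions above only need membership and cardinality; the weights themselves stay in the matrix.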
II-B Virus Spreading Model
Because cyber network nodes do not have a human-like antibody or vaccination mechanism that prevents an individual from being infected again, we study the so-called susceptible-infected-susceptible (SIS) model. Consider a population of N agents. Each agent can be either susceptible (S) or infected (I). Infected individuals infect others at rate . The intensity of interaction between and is described by the weight . Denote . We assume that the weight is bounded by . If is susceptible while is infected, there is a possibility that will be infected after the interaction. In addition, each infected agent returns to the susceptible state at some rate . The state of a node at time is a binary random variable , with (), indicating that agent is susceptible (infected). The state vector of all agents is denoted by . With the adaptive weight from agent to , the stochastic state transitions of node from time to can be written as follows:
[TABLE]
The model (1) is computationally challenging under large-scale networks due to the exponentially increasing state space. Hence, we resort to mean-field approximation of the Markov process [17, 21, 22]. Denote as the probability of agent being infected at time . The mean-field approximation then provides
[TABLE]
for . To write this dynamics equation in a more compact form, denote . We have
[TABLE]
where which can be written as where , , , and .
According to the discussion in [17], the N-intertwined model (2) gives an upper bound for the exact probability of infection, . Although the mean-field model considered herein is an approximation, it is well justified because the scale of the network, i.e., N in our model, is large and we focus on the cases where is above the threshold [22].
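To make the mean-field dynamics concrete, the sketch below integrates the standard N-intertwined SIS form dx_i/dt = (1 - x_i) * beta * sum_j W[i, j] * x_j - delta * x_i with a forward Euler scheme. This is an illustrative assumption: the symbol names beta (infection rate) and delta (recovery rate), the orientation of W (agent i is exposed to the nodes j it connects to), the fixed weights, and the Euler discretization are choices made here and may differ from the notation and numerical method used in the paper.

```python
import numpy as np


def sis_rhs(x, W, beta, delta):
    """Right-hand side of the assumed mean-field SIS dynamics."""
    return (1.0 - x) * beta * (W @ x) - delta * x


def simulate_sis(x0, W, beta, delta, T, steps):
    """Forward Euler integration on [0, T]; returns an array (steps + 1, N)."""
    x = np.asarray(x0, dtype=float).copy()
    W = np.asarray(W, dtype=float)
    dt = T / steps
    traj = np.empty((steps + 1, x.size))
    traj[0] = x
    for k in range(steps):
        x = x + dt * sis_rhs(x, W, beta, delta)
        x = np.clip(x, 0.0, 1.0)  # keep infection probabilities in [0, 1]
        traj[k + 1] = x
    return traj


# Hypothetical example: complete graph on 3 nodes with unit weights.
W_full = np.ones((3, 3)) - np.eye(3)
traj_endemic = simulate_sis([0.1, 0.0, 0.0], W_full, beta=1.0, delta=0.5,
                            T=50.0, steps=5000)
traj_dieout = simulate_sis([0.5, 0.5, 0.5], W_full, beta=0.1, delta=1.0,
                           T=50.0, steps=5000)
```

In the first run the infection settles at the endemic level where (1 - x) * 2 * beta = delta, i.e. x = 0.75 for every node; in the second run the recovery rate dominates and the infection dies out.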
The graph and the epidemic spreading process can be viewed as physical constraints. The agents in the network are coupled by these constraints while trying to minimize their own cost. Such behaviors lead to differential games over networks, which will be introduced in the following section.
II-C Differential Game Over Networks
As we mentioned in Section I, the self-interested agents aim to minimize their own costs. One cost arises from malfunction caused by infection. Another cost for agent describes the inefficiency or degradation of system performance caused by deviation from the original weight for all . We consider the original weight as an optimal weight under which the agent achieves the most benefit.
For agent , the infection cost function, given by , is a function of . is assumed to be monotonically increasing to capture the loss of being infected. The weight cost function for the edge from to is given by where is convex. The function satisfies at and only at for all because the original weight is optimal for the agent when there is no infection. It is optimal in terms of the tradeoff between price and performance. The marginal cost of deviation from the optimal weight increases as the distance from the adapted weight to the optimal weight increases. Considering a time duration from 0 to , the cost function of agent during time interval is given by
[TABLE]
As each node determines its own weight adaptation policy, it naturally leads to a differential game framework defined as follows. Consider agents in the network as players with an index set . The duration of the evolution of the game is given by the time interval . Denote . Let be the permissible set of the states. For each fixed , . Let be the controls of player . The admissible control set for player is , i.e., for each fixed , . A differential equation is given by (2) whose solution describes the state trajectory of the game corresponding to the -tuple of control functions and the given initial state . Define a set-valued function for each to characterize the information pattern of player . We consider the open-loop pattern in our case where . We can state our problem as the following differential game problem:
[TABLE]
where and . Each player aims to find a control policy to generate a weight trajectory . Such control policies are open-loop ones that depend on the initial condition of the individual state.
Remark 1.
The game defined by (5) is a differential game over networks where each player's cost depends only on its own state and controls. Nodes interact with their neighbors. The network topology is captured by . The time-varying property of the network is described by for .
Remark 2.
Information structure determines the state information gained and recalled by players at time . We adopt open-loop policies for three reasons. First, the obtained open-loop policy can be implemented as a feedback policy [23], as shown in Section III. Since the dynamics (3) are deterministic, the state at any time can be computed and used to determine the control policy. Second, to obtain strongly time-consistent, optimal, individual feedback policies, we would have to resort to dynamic programming. However, a direct application of dynamic programming does not yield an individual feedback policy. Moreover, computing the feedback control law derived from the Hamilton–Jacobi–Bellman equation requires solving nonlinear PDEs, which makes distributed implementation more difficult. Third, obtaining the open-loop policy relies on the maximum principle, which clearly exposes the structure of the optimal solution. This helps us analyze the inefficiency of the NE and obtain a penalty function that achieves the social optimum, as shown in Section IV.
III Analytic Results
The solutions to the N-person non-cooperative nonzero-sum differential game (5) played with an open-loop information structure are open-loop Nash equilibria.
Definition 1.
The weight adaptation trajectories, i.e., the control trajectories , constitute an open-loop NE solution of the differential game (5) if the inequalities
[TABLE]
hold for all control trajectories . We denote the associated state trajectory for .
The definition states that at an open-loop NE, no agent has an incentive to deviate unilaterally from the optimal trajectory from time 0 to time .
To obtain the necessary conditions for the open-loop NE, we make two mild assumptions.
Assumption 1.
For each , the infection cost function is to be of class.
Assumption 2.
For each , the weight deviation cost function is to be of class.
Each player can decide to receive data or packets from any other agent. The following observation narrows down the set of possible solutions of the open-loop NE.
Observation 1.
If is an open-loop NE solution for the following differential game
[TABLE]
with for , and is an open-loop NE solution for the differential game defined by (5), then we have
[TABLE]
for all player and for each .
Proof: See Appendix B-A.
Observation 1 simplifies the search for the open-loop NE. Instead of analyzing problem (5), we can focus on problem (7), which has a smaller admissible control set. Define . To be specific, the admissible control set of game problem (7) for player is . From Theorem 5.1 of [23] and Lemma 1, the differential equation in (7) admits a unique solution if the weight adaptation control is continuous in .
Next, we discuss the derivation of candidate NE solutions for differential game (7) when the information structure of the game is open-loop pattern. Utilizing techniques in optimal control theory, we arrive at the following result.
Theorem 1.
For the N-person differential game (7), suppose that Assumptions 1 and 2 hold. Then, if is an open-loop NE solution and is the corresponding state trajectory, there exist costate functions , whose -th component is denoted by , such that the following relations are satisfied:
[TABLE]
[TABLE]
[TABLE]
where
[TABLE]
and is a matrix given by
[TABLE]
* is a vector whose -th component is and other components are zero, for .*
Proof: See Appendix B-B.
Note that turns out to be the same for different . In later discussion, we shall omit the index . Now, the dynamics of the costate function can be given as for , which sheds some light on the design for achieving social welfare in the following section. is a -matrix [24] for every where the diagonal entries of are positive and all off-diagonal entries are non-positive. Therefore is structurally in line with the graph Laplacian whose diagonal entries are the out-degrees of the agents [25]. That is, for every zero or negative entry of the matrix , the corresponding entry of the graph Laplacian is zero or negative, respectively, and vice versa. If the original graph is a directed acyclic graph, is a lower triangular matrix under a proper permutation of the indices. Other than the topology information, also contains the infection information. Note that even though we write the dynamics of the costates in an affine form, they are actually not affine, because depends on and , as we can see from (13), and depends on , as we will show next in Theorem 2.
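The sign structure described above (positive diagonal entries, non-positive off-diagonal entries) can be checked numerically and compared with that of an out-degree graph Laplacian. The sketch below is only an illustration of this structural property; the helper names, the weighted out-degree Laplacian D_out - W, and the example matrix are assumptions of this example, not objects defined in the paper.

```python
import numpy as np


def has_laplacian_sign_pattern(A):
    """True if all diagonal entries are positive and all off-diagonal
    entries are non-positive."""
    A = np.asarray(A, dtype=float)
    off_diag = A - np.diag(np.diag(A))
    return bool(np.all(np.diag(A) > 0) and np.all(off_diag <= 0))


def out_degree_laplacian(W):
    """Weighted out-degree Laplacian D_out - W, assuming W[i, j] is the
    weight of the edge from node i to node j."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


# Hypothetical 3-node directed cycle with arbitrary positive weights.
W_cycle = np.array([[0.0, 0.5, 0.0],
                    [0.0, 0.0, 1.0],
                    [0.2, 0.0, 0.0]])
L_cycle = out_degree_laplacian(W_cycle)
pattern_ok = has_laplacian_sign_pattern(L_cycle)
```

Note that the check requires every node to have at least one outgoing edge; otherwise the corresponding diagonal entry of the Laplacian is zero rather than positive.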
Theorem 2.
Define where is the th component of the costate function . The basic structure of the NE-based optimal weight control, i.e., the solution to (10), can be written as:
[TABLE]
for .
Proof: See Appendix B-C.
Condition (10) in Theorem 1 can thus be replaced by . Theorem 1 together with (14) provides a weight adaptation scheme where each agent adapts its weights to minimize the possibility of being infected and the loss of efficiency/interest. The weight of the edge from player to player , controlled by player , is based on the costate component , player ’s own infection, and its out-neighbors’ infection. Evidently, the higher the infection level of agent is, the lower the weight of edge should be. As shown in (11) and (13), is highly coupled and contains information about the effect of the whole network.
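The qualitative behavior described above can be illustrated with a small sketch. It is not a reproduction of (14). It assumes the standard mean-field SIS dynamics dx_i/dt = (1 - x_i) * beta * sum_j W[i, j] * x_j - delta * x_i and a quadratic deviation cost (c / 2) * (w - w0)^2, for which pointwise minimization of agent i's Hamiltonian over the interval [0, w0] gives w = clip(w0 - beta * p_i * (1 - x_i) * x_j / c, 0, w0). The names adapted_weights, beta, c, and the use of a single nonnegative costate value p[i] per agent are assumptions of this example.

```python
import numpy as np


def adapted_weights(x, p, W0, beta, c):
    """Pointwise weight rule under the assumptions stated in the lead-in.

    x  : infection levels, shape (N,)
    p  : each agent's own (nonnegative) costate component, shape (N,)
    W0 : original weights, shape (N, N); W0[i, j] is controlled by agent i
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    W0 = np.asarray(W0, dtype=float)
    # Entry (i, j) of the reduction is beta * p_i * (1 - x_i) * x_j / c.
    reduction = beta * np.outer(p * (1.0 - x), x) / c
    # Weights stay between 0 and their original values.
    return np.clip(W0 - reduction, 0.0, W0)


# Hypothetical two-agent example: agent 0 is healthy, agent 1 is infected.
W0_pair = np.array([[0.0, 1.0],
                    [1.0, 0.0]])
W_mild = adapted_weights([0.0, 1.0], [1.0, 1.0], W0_pair, beta=1.0, c=2.0)
W_strong = adapted_weights([0.0, 1.0], [1.0, 1.0], W0_pair, beta=1.0, c=0.5)
```

In this example the healthy agent halves its link to the infected neighbor when deviation is expensive (c = 2) and cuts it entirely when deviation is cheap (c = 0.5), while the fully infected agent keeps its original weight, which matches the qualitative description in the text.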
Remark 3.
Based on the structure of the optimal control (14), the dynamics of the costates (11), and Lemma 1, we can infer that the NE-based optimal control trajectory is continuous for every , which means there is no switching in the optimal weight adaptation.
Remark 4.
From (14), we know the weight between agents and may be adapted to zero at a certain time, as one can see that if . That means the connection between agent and may be disconnected temporarily and will be restored according to Theorem 3. For agent , if all its out-links have weight zero, i.e., for all , we can view this agent as being quarantined from infection. We say "quarantined from infection" because there might still be in-links connecting to agent , which means that the concept of being quarantined here is different from the concept in undirected graphs. Besides, the weight adaptation scheme we propose is different from quarantining in the sense that it does not need to completely disconnect nodes from all others but rather adjusts weights to connect more loosely with particular nodes that have a higher likelihood of infection.
Corollary 1.
If is concave, i.e., the marginal cost of deviation decreases as the adapted weight moves farther from the optimal weight, the optimal control policy is given as follows
[TABLE]
for , where is defined in Theorem 2.
We can see that if is concave, the optimal control policy switches between 0 and . In this paper, we focus our study on the case when is convex.
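The switching behavior under a concave deviation cost can be illustrated with a pointwise argument: a concave function plus a linear term is concave, so its minimum over an interval is attained at an endpoint. The sketch below assumes a pointwise objective of the form g(w - w0) + k * w on [0, w0], where k >= 0 stands for the marginal infection pressure on the link and g(0) = 0; the function name, the symbol k, and the square-root cost are hypothetical choices for this example and are not taken from Corollary 1.

```python
import math


def bang_bang_weight(k, w0, g):
    """Pick the endpoint of [0, w0] that minimizes g(w - w0) + k * w.

    g is assumed concave on [-w0, 0] with g(0) = 0, so only the two
    endpoints w = w0 (keep the link) and w = 0 (cut the link) matter.
    """
    cost_keep = g(0.0) + k * w0   # objective at w = w0
    cost_cut = g(-w0)             # objective at w = 0
    return w0 if cost_keep <= cost_cut else 0.0


def concave_cost(d):
    """Hypothetical concave deviation cost sqrt(|d|)."""
    return math.sqrt(abs(d))


w_low_pressure = bang_bang_weight(0.5, 1.0, concave_cost)
w_high_pressure = bang_bang_weight(2.0, 1.0, concave_cost)
```

With low infection pressure the link keeps its original weight, and with high pressure it is cut entirely; no intermediate weight is ever optimal in this setting.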
Before stepping into the numerical computation of the open-loop NE candidates, we go into further analysis and obtain other structural results that would be beneficial for more insightful understanding of the weight adaptation mechanism.
Theorem 3.
The costate function and the open-loop control trajectories have the following properties:
- (i)
Along the open-loop NE trajectory, holds for all and all . Furthermore, stays positive for all and all .
- (ii)
The open-loop NE control trajectory , for , satisfies at and only at and for , .
- (iii)
If , i.e., the in-degree of player is zero in the original graph, under linear infection cost function , the component is bounded above by . That is, for .
- (iv)
If , i.e., the out-degree of player is zero in the original graph, under linear infection cost function where , the costate component is strictly monotonically decreasing over .
Proof: See Appendix B-D.
Theorem 3 indicates that during the time interval , the agents, with an incentive to lower their own costs, adapt their weights accordingly to impede the spreading of the virus. After the prescribed alert duration , the topology is always restored so as to attain the minimum cost. Also, from Theorem 3 (iii), we know that for an agent with no in-neighbors, the weight of its out-link will never be 0 if . This can be readily shown by .
IV Inefficiency of Nash Equilibrium
It is well known that the non-cooperative NE in nonzero-sum games is generally inefficient [26]. There is a need to develop a mechanism to attain higher social welfare or lower aggregate costs through cooperative behavior [27]. The notion of the price of anarchy was introduced in [28] to quantify the inefficiency. In the network, the social cost is the aggregate cost of all players. Let where be the weight control variable for the whole network with admissible set . Denote by the social optimal solution. The social optimum can be attained by solving the optimal control problem:
[TABLE]
where . An application of maximum principle gives the following: the optimal control and corresponding trajectory must satisfy the following so-called canonical equations:
[TABLE]
for all , where is the same as the one given in (13) for the dynamics of the costate in the differential game problem and , the Hamiltonian of the optimal control problem, is defined as
[TABLE]
and is the costate function, and is its -th component. The Hamiltonian of the optimal control problem (20) differs from the individual Hamiltonian defined in (12), but the two are related. The Hamiltonian of the optimal control problem includes the cost of all agents over the network instead of just an individual’s cost. Also, the costate corresponds to the state of all agents . The counterpart of in the individual Hamiltonian defined for the game problem is . Because the Hamiltonian of the optimal control problem and the individual Hamiltonians for the game problem have a similar structure, applying the maximum principle yields (17)-(19), which have the same structure as (9)-(11).
An optimal point can in principle be computed centrally by the network operator to achieve the social optimum. However, this would require the network operator to be omniscient, and not all agents have an incentive to adapt their connection weights according to a rule designed to minimize the aggregate cost. Moreover, for large-scale networks, a centralized solution raises computational and implementability issues. A centralized optimal control solution is therefore impractical. To achieve the social optimum in a distributed way, a mechanism needs to be designed on behalf of the network operator. The strategy for the network operator is to set penalties so that the cost for player at time is . To set a proper penalty for each player, we use the theory of potential differential games [29], which extends the potential game concept for static games [30]. The following is the definition of potential differential games.
Definition 2.
A differential game with cost and dynamics defined by (2) is a potential differential game if there exists a function that satisfies the following condition for every player
[TABLE]
for all , where are the corresponding states under controls and respectively. Here, denotes the collection of controls of all players except player , i.e., .
If we can find for every such that relation (21) holds for , then the differential game is a potential game whose open-loop NE solution is equivalent to the open-loop solution of optimal control problem defined by (16). The following theorem gives important insights about choosing proper penalties .
Theorem 4.
Consider a differential game with penalties where the cost of player is given by
[TABLE]
which is obtained by adding a penalty term to in (7), and the constraint dynamics is in accordance with the differential game defined by (7). Let . Then, the differential game with (22) is a potential differential game corresponding to the optimal control problem defined by (16), and if is an open-loop NE solution for the new differential game with cost (22), and is the corresponding state trajectory, the relations (17), (18), and (19) hold for and with replaced by and replaced by .
Proof: See Appendix B-E.
Theorem 4 indicates that with a proper choice of the penalties , the necessary conditions for the open-loop NE solution of the differential game are aligned with the necessary conditions for the optimal solution of the social cost optimal control problem. The counterpart of condition (11) for the penalty-based differential game can be written as
[TABLE]
where the th component of is . In implementation, the system operator sends the penalty function to each agent. If the open-loop NE and the socially optimal solution both exist and are unique, the open-loop NE achieves the social optimum.
Definition 3.
A directed graph is strongly connected if it contains a directed path from to for every pair of vertices .
In a directed graph, a directed path is a sequence of edges which join a sequence of vertices, but with the restriction that the edges all be directed in the same direction. What we have developed so far in this section is for cases where the original weighted network is strongly connected. If the original network is not strongly connected, we have the following.
Definition 4.
Given a graph, if there exists a directed path from vertex to vertex , we say is reachable from . Denote by the set of vertices that are reachable from .
In graph theory, a single vertex is defined to connect to itself by a trivial path. We have . If the graph is strongly connected, for every , . Otherwise, for some , we have . Denote the counterpart of for the original graph defined by .
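Definitions 3 and 4 can be made concrete with a breadth-first search. The sketch below is illustrative only: the adjacency-list representation, the function names `reachable_set` and `is_strongly_connected`, and the four-vertex example graph are assumptions introduced here, and the direction convention (an entry `adj[u]` lists the heads of edges leaving `u`) is one possible reading of the definition.

```python
from collections import deque

def reachable_set(adj, source):
    """Vertices reachable from `source` along directed edges.

    `source` is always included, reflecting the trivial path from a
    vertex to itself.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def is_strongly_connected(adj):
    """True if every vertex reaches every other vertex."""
    nodes = set(adj) | {v for heads in adj.values() for v in heads}
    return all(reachable_set(adj, s) == nodes for s in nodes)

# Hypothetical example: vertex 4 reaches everything, but nothing reaches 4.
example = {1: [2], 2: [3], 3: [2], 4: [1]}
sets = {v: reachable_set(example, v) for v in example}
```

In this example the reachable sets differ across vertices, which is the situation the non-strongly-connected case is meant to cover.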
Corollary 2.
Consider the differential game with cost functions defined in (22) and dynamics given by (7) as in Theorem 4. Let . Then, if is an open-loop NE solution for the new differential game, and is the corresponding state trajectory, the relations (17), (18), and (19) hold for and with replaced by and replaced by .
Proof.
The proof of Corollary 2 simply follows from the proof for Theorem 4. ∎
To illustrate Corollary 2, we present an example in Appendix C.
V Algorithms and Case Studies
In this section, we provide the set-up information for the case studies. In addition, based on the equilibrium analysis and the optimal control analysis, we propose an algorithm to compute the optimal weight adaptation trajectory for the system operator and the agents.
V-A Preliminaries
In the simulation, the infection cost function is taken to be linear in , i.e., . Here, we set . The weight adaptation cost is taken to be quadratic where for all . Unless otherwise stated, let for all . Note that under this setting, Assumptions 1 and 2 hold and is even and convex.
The original network in the simulation is a bi-directional scale-free network with agents generated based on the Barabási-Albert model [31, 32]. We select this model since many kinds of computer networks, including the internet and the web graph of the World Wide Web, have scale-free properties. We generate the network by following the growth and preferential attachment properties given in Section VII of [31]. For simplicity, the original weight is set to be for every edge . Let be the average in-degree (out-degree) of the network we generated. We have .
For simplicity, we take the same infection rates and curing rates for all players. Unless otherwise stated, let and . From the result in [34] and the fact that the largest real part of the eigenvalues of matrix is , we conclude that the virus epidemic will break out in the original network. The initial infection level is also set to be the same for all players, for all . Table I is a summary of the setups.
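The growth and preferential attachment steps can be sketched in a few lines. This is a minimal illustration, not the generator used in the paper: the function names, the seed graph (a small complete core), and the values `n=50`, `m=2`, and `w0=1.0` are all hypothetical choices made here, since the paper's network size and original weight are not reproduced in this text.

```python
import random

def barabasi_albert(n, m, seed=0):
    """Undirected scale-free graph as {node: set(neighbors)}.

    n (number of agents) and m (links added per new node) are
    hypothetical parameters chosen for illustration.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Growth starts from a fully connected core of m + 1 nodes (an assumption).
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
    # A node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` selects nodes with probability proportional to degree.
    stubs = [v for v in range(m + 1) for _ in adj[v]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend([new, t])
    return adj

def uniform_weights(adj, w0=1.0):
    """Bi-directional weights: (i, j) and (j, i) both receive w0."""
    return {(i, j): w0 for i in adj for j in adj[i]}

graph = barabasi_albert(n=50, m=2)
weights = uniform_weights(graph)
avg_degree = sum(len(nb) for nb in graph.values()) / len(graph)
```

Because every link is stored in both directions, the average in-degree and out-degree coincide, which matches the bi-directional setting described above.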
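The outbreak check can be illustrated numerically. The sketch below assumes, as is common for linearized SIS models with homogeneous rates, that the relevant matrix has the form βW − δI, so that its largest eigenvalue equals β times the spectral radius of W minus δ; the exact matrix used in the paper is not reproduced in this text, so this form, the function names, and the numbers in the example are assumptions. The spectral radius is estimated by power iteration on W + I, where the shift avoids oscillation on bipartite graphs.

```python
def spectral_radius(W, iters=5000, tol=1e-13):
    """Estimate the Perron root of a nonnegative square matrix W."""
    n = len(W)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        # Multiply by (W + I); the max-norm scale converges to rho(W) + 1.
        y = [v[i] + sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        scale = max(y)
        v = [val / scale for val in y]
        converged = abs(scale - 1.0 - lam) < tol
        lam = scale - 1.0
        if converged:
            break
    return lam

def outbreak_margin(W, beta, delta):
    """Largest eigenvalue of beta*W - delta*I (assumed matrix form)."""
    return beta * spectral_radius(W) - delta

# Hypothetical example: complete graph on four nodes with unit weights.
K4 = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]
margin = outbreak_margin(K4, beta=0.5, delta=1.0)
```

Under the stated assumption, a positive margin indicates that the disease-free state is unstable, i.e., an outbreak is expected.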
V-B Computational Algorithm
Our aim is to propose an implementable distributed virus resistance algorithm (DVR algorithm). Based on the algorithm proposed in [33] for computing open-loop NE of nonzero-sum differential games, we present, in Algorithm 1, the DVR algorithm to compute the candidate open-loop NE solutions for the differential game described by (7) and the penalty-based differential game defined in Theorem 4. The solution of the penalty-based differential game is in line with the solution of the optimal control problem defined in (16).
Initially, the input data includes initial infection data: for all ; infection rate , recovery rate for all ; the original topology ; the cost functions for all and a stopping value to stop the algorithm. In the first step, each player arbitrarily selects a continuous control trajectory within the admissible control set for every out-link it has: for each , for all , and reports the weight adaptation scheme to the network operator. In step , each player uses the initial infection data and the control policy , solves (9) forward in time to obtain , and reports it to the network operator. If the system aims to solve the socially optimal control problem, then the algorithm goes to step . Otherwise, the algorithm goes to step . In step , the system operator uses the reported and , and the infection damage cost , to compute backward based on (23) and sends back to the corresponding player . In step , the system operator uses the reported and , and the infection damage cost , to compute backward based on (11) and sends back to the corresponding player . In the next step, each player updates its control based on (14), which only requires its out-neighbors’ infection information, and reports the updated control policy to the network operator. Denote by the updated control policy. If , the algorithm moves back to step . Otherwise, the latest updated policy is the optimal control policy for agent .
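The iteration described above has the shape of a forward-backward sweep, which the following sketch illustrates for the game branch only (the penalty-based branch is omitted). Several ingredients are assumptions made here because the corresponding equations are not reproduced in this text: SIS dynamics of the form dx_i/dt = (1 − x_i) β Σ_j u_ij x_j − δ x_i, a linear infection cost `a * x_i`, a quadratic adaptation cost `(c/2) * (u_ij - w0_ij)**2`, an admissible set `[0, w0_ij]`, explicit Euler discretization, and all function and parameter names.

```python
def forward(x0, u, beta, delta, dt):
    """Euler-integrate the assumed SIS dynamics forward in time."""
    n, xs = len(x0), [list(x0)]
    for uk in u[:-1]:
        x = xs[-1]
        xs.append([x[i] + dt * ((1 - x[i]) * beta * sum(uk[i][j] * x[j] for j in range(n))
                                - delta * x[i]) for i in range(n)])
    return xs

def backward(i, xs, u, beta, delta, dt, a):
    """Integrate player i's costate vector backward from p(T) = 0."""
    n, K = len(xs[0]), len(xs) - 1
    p = [[0.0] * n for _ in range(K + 1)]
    for k in range(K, 0, -1):
        x, uk, pk = xs[k], u[k], p[k]
        for m in range(n):
            # dH accumulates the derivative of the assumed Hamiltonian w.r.t. x_m.
            dH = a if m == i else 0.0
            dH -= pk[m] * (beta * sum(uk[m][j] * x[j] for j in range(n)) + delta)
            dH += sum(pk[l] * (1 - x[l]) * beta * uk[l][m] for l in range(n))
            p[k - 1][m] = pk[m] + dt * dH
    return p

def update(xs, p_own, w0, beta, c):
    """Clip the stationary point of the assumed Hamiltonian to [0, w0_ij]."""
    n = len(w0)
    return [[[min(max(w0[i][j] - p_own[i][k] * beta * (1 - x[i]) * x[j] / c, 0.0), w0[i][j])
              for j in range(n)] for i in range(n)] for k, x in enumerate(xs)]

def dvr_sweep(x0, w0, beta, delta, a, c, T, K, eps=1e-6, max_iter=30):
    n, dt = len(x0), T / K
    u = [[row[:] for row in w0] for _ in range(K + 1)]  # initial guess: original weights
    for _ in range(max_iter):
        xs = forward(x0, u, beta, delta, dt)
        # Only the own-state costate component p_ii is passed back to player i.
        p_own = [[pk[i] for pk in backward(i, xs, u, beta, delta, dt, a)] for i in range(n)]
        new_u = update(xs, p_own, w0, beta, c)
        gap = max(abs(new_u[k][i][j] - u[k][i][j])
                  for k in range(K + 1) for i in range(n) for j in range(n))
        u = new_u
        if gap < eps:
            break
    return forward(x0, u, beta, delta, dt), u, p_own

# Hypothetical three-node example with unit original weights.
w0 = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
xs, u, p_own = dvr_sweep([0.2, 0.2, 0.2], w0, beta=0.8, delta=0.3,
                         a=1.0, c=5.0, T=2.0, K=100)
```

Convergence of this plain fixed-point update is not guaranteed in general; a damped update or a smaller horizon may be needed in practice.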
V-C Numerical Results
In this subsection, we present the numerical results. First, we show the dynamics of the costate function for all players. Then, we show the evolution of the weight adaptation, the infection, and the costate of selected agents to illustrate individual behaviors. Finally, we compare the optimal control-based weight adaptation scheme (which is equivalent to the penalized differential game-based scheme) with the differential game-based weight adaptation scheme. The optimal control-based scheme is obtained by solving the optimal control problem (16). The two schemes, together with the case of no weight adaptation, are compared in terms of the total cost and the infection level of the whole network.
From and , we know that the weight adaptation of player is based on its own infection, its out-neighbors’ infection, and the costate component . The infection levels of player and its neighbors are local information. From (11) and (13), we can see that the effect of the whole network’s situation is conveyed by costate component to the weight adaptation strategy of player . Thus, we investigate the dynamics of in Fig. 4, where the costate component ’s dynamics for all agents are plotted. As we can see, is positive for all during the whole time interval, which corroborates Theorem 3. For most of the players, the value of is high at the very beginning and then decreases to [math]. One interpretation is that players are more sensitive to their out-neighbors’ infection at the beginning and tend to cut their weights more heavily.
To see individual behaviors and states, we rank the agents based on their out-degrees. Agent has the largest out-degree. From the first plot of Fig. 5, agent is more likely to be infected due to its large degree. We can see that all weights equal at and only at , which corroborates Theorem 3. The weight is reduced to [math] for some time. This phenomenon occurs because the costate and its out-neighbors’ infection levels are high during that time period. The third plot shows that agents with higher out-degrees reduce less weight. Conventionally, one would expect to cut more weight on highly connected nodes to slow the infection propagation. However, the weight adaptation scheme obtained in this paper accounts for both the infection and the loss of efficiency of the network agents. There is a trade-off between maintaining the network’s performance and lowering the infection, so agents with higher out-degrees may cut less weight to maintain their performance.
To show that each agent adapts its weights heterogeneously across neighbors, we present Fig. 6. Fig. 6 shows that agent adapts the weights on its out-neighbors according to the evolution of their infection levels. As we can see, agent cuts more weight on neighbors with higher infection levels. For example, the infection level of agent is higher than that of agent at all times. Thus, weight is lower than . Also, agent reduces its weight on agent to zero due to the latter’s high infection level, while its weight on agent remains above .
Here, we compare the NE-based weight adaptation scheme, the optimal control-based weight adaptation scheme, and the case of no weight adaptation. In Fig. 7, we plot the total cost under the three schemes for different . We observe that no adaptation incurs the highest total cost. For all values of , the NE-based scheme incurs a higher cost than the optimal control-based scheme, which indicates the inefficiency of the NE solution. From the plot, we see that a higher causes more inefficiency.
Fig. 8 shows the virus resistance of the proposed schemes. The black line shows the infection level with no adaptation, the blue line shows the game-based scheme, and the green line shows the optimal control-based scheme. Even though the game-based scheme is inefficient in terms of total cost, it outperforms the optimal control-based scheme in terms of virus resistance: its infection level is always lower than that under the optimal control-based scheme. In both cases, the proposed schemes are virus-resistant and incur a lower total cost than the scheme without adaptation.
VI Conclusion and Future Work
In this paper, we have established a differential game framework to develop decentralized virus-resistant mechanisms over complex networks. We have shown that weight adaptation policies allow nodes to change weights to mitigate their infection. The differential game approach has captured the strategic and dynamic behaviors of a large number of self-interested agents over time-varying networks. Each player adapts its weight based on its own infection and its out-neighbors’ infection. It has been observed that higher infection levels among a player’s out-neighbors lead to lower weights. The effect of non-local behaviors on the adaptation strategy has been encoded in the costate function. We have discussed the inefficiency of the open-loop Nash equilibrium and have proposed a penalty-based mechanism to achieve efficiency by imposing local costs induced by reachable nodes. The differential game framework has enabled the design and implementation of a distributed algorithm over large-scale networks to control the macroscopic behaviors of the virus spreading over networks. Numerical examples have been used to illustrate the virus resistance of the proposed scheme and the inefficiency of the Nash equilibrium. In these examples, the differential game approach achieves better performance than its centralized counterpart in terms of mitigating the virus spread. One future direction for this work would be to study the steady-state behavior of a long-term virus-resistance scheme where the duration of virus spreading is sufficiently long.
Appendix A Lemmas
Lemma 1.
The dynamics equation is uniformly Lipschitz in and for each .
Proof.
From (3), we have
[TABLE]
where and , and . Since is bounded by , is bounded. So, is uniformly Lipschitz in . The proof that is uniformly Lipschitz in for all follows similar steps. ∎
Lemma 2.
Let , be the corresponding solution to the ODEs in (2). For all , given , holds for all .
Proof.
The proof follows Lemma 1 of [17]. It is clear that is a continuous function of time. When , then from (2), we have , which means once reaches , it cannot stay there. On the other hand, when , the solution would always lie in . Otherwise, suppose that there exists such that or . In the first case, note that holds for the time interval , which gives . It yields a contradiction. In the second case where , we have over time interval . So, we have which contradicts the fact obtained from (2) that . ∎
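The invariance property in Lemma 2 can be checked numerically on a small example. The sketch below assumes SIS dynamics of the form dx_i/dt = (1 − x_i) β Σ_j w_ij x_j − δ x_i, since equation (2) is not reproduced in this text; the function names, the three-node ring, and the parameter values are hypothetical. With explicit Euler and a step satisfying dt (β Σ_j w_ij + δ) ≤ 1, each update is a convex combination of values in the unit interval, so the discrete trajectory also remains in it.

```python
def sis_step(x, W, beta, delta, dt):
    """One explicit Euler step of the assumed SIS dynamics."""
    n = len(x)
    new_x = []
    for i in range(n):
        pressure = beta * sum(W[i][j] * x[j] for j in range(n))
        new_x.append(x[i] + dt * ((1.0 - x[i]) * pressure - delta * x[i]))
    return new_x

def simulate(x0, W, beta, delta, dt, steps):
    """Return the list of states x(0), x(dt), ..., x(steps * dt)."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(sis_step(traj[-1], W, beta, delta, dt))
    return traj

# Hypothetical ring of three nodes; the initial states touch both boundaries.
ring = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
traj = simulate([0.0, 0.5, 1.0], ring, beta=0.8, delta=0.3, dt=0.01, steps=2000)
stays_in_unit_interval = all(0.0 <= v <= 1.0 for x in traj for v in x)
```

The node that starts fully infected immediately moves below 1, and the node that starts uninfected immediately moves above 0, in line with the boundary arguments in the proof.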
Appendix B Proofs
B-A Proof of Observation 1
Proof.
Given the original weight pattern , suppose that , i.e., and . Obviously, player can lower his own cost by deviating from to [math], which contradicts the fact that is the open-loop NE. A similar statement can be made for the case when . ∎
B-B Proof of Theorem 1
Proof.
Based on Theorem 6.11 in [23], conditions (9) and (10) are directly derived. To obtain (11), we have
[TABLE]
Reformulating (25), we obtain condition (11). As the terminal cost is unspecified and the final state is free, we have the transversality condition . ∎
B-C Proof of Theorem 2
Proof.
By Assumption 2 and the fact that is convex, we know that the Hamiltonian is differentiable and convex on for every . Also, the admissible control set is convex. Thus, the solution of (10) can be obtained by letting , i.e.,
[TABLE]
Suppose . If which happens if and only if , then according to (10), . Otherwise, if , while if , . Thus, we have the optimal control rule in the form of (14). ∎
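The case analysis above amounts to projecting an unconstrained stationary point onto the admissible interval. The sketch below shows one form such a rule could take. It assumes a quadratic adaptation cost `(c/2) * (u - w0)**2`, an admissible set `[0, w0]`, and a stationarity condition `c * (u - w0) + p_ii * beta * (1 - x_i) * x_j = 0`; these expressions and the function name `adapted_weight` are assumptions introduced here, since equations (10) and (14) are not reproduced in this text.

```python
def adapted_weight(p_ii, x_i, x_j, w0, beta, c):
    """Clip the unconstrained minimizer of the assumed Hamiltonian to [0, w0].

    p_ii : own-state costate component of player i
    x_i  : infection level of player i
    x_j  : infection level of out-neighbor j
    w0   : original weight on link (i, j)
    beta : infection rate, c : quadratic cost coefficient (hypothetical)
    """
    unconstrained = w0 - p_ii * beta * (1.0 - x_i) * x_j / c
    return min(max(unconstrained, 0.0), w0)

# A zero costate leaves the weight untouched; a large one cuts the link.
no_cut = adapted_weight(0.0, 0.5, 0.5, w0=1.0, beta=1.0, c=1.0)
partial = adapted_weight(1.0, 0.5, 0.5, w0=1.0, beta=1.0, c=1.0)
full_cut = adapted_weight(10.0, 0.5, 0.5, w0=1.0, beta=1.0, c=1.0)
```

Under these assumptions the rule uses only the player's own infection, its out-neighbor's infection, and a single costate component, which is consistent with the locality of the update described in the main text.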
B-D Proof of Theorem 3
Proof.
To prove (i), note that if is a continuous and piecewise differentiable function over such that while for all in , then and cannot be negative simultaneously. From (11), we have , which gives . Hence, there exists such that and for over . Suppose that one of the costate components is the first to violate the inequality at , i.e., we have for ; then we obtain , which is not feasible. If it is that first violates the inequality at time , we have , which contradicts the fact above.
To prove (ii), note that at time , we have . Combined with the fact that is convex and continuously differentiable, and , by expression (14), we have that holds only at .
To prove (iii), from Observation 1, we know that if , the optimal weight adaptation for every . Here, indicates for all . Based on (11) and (13), the dynamics of the costate component can be written as
[TABLE]
Note that the first term in the bracket is non-negative and . Moving backward from , it follows that is bounded above by .
Under conditions stated in (iv), the dynamics of the costate component can be written as
[TABLE]
We have proved in (i) that holds for all and for , which indicates . Thus, is strictly monotonically decreasing over . ∎
B-E Proof of Theorem 4
Proof.
Assume that is an open-loop optimal control of the centralized control problem (16), and is the state path under the optimal control. Fix an arbitrary , and let be an open-loop strategy for player . Let be the new state trajectory given by (16) corresponding to . As and are optimal for the optimal control problem (16), then
[TABLE]
Adding to both sides of this inequality the constant
[TABLE]
we obtain that for all . According to the definition of open-loop NE for differential games in (6), we know is also an open-loop NE for the differential game with penalties.
To show that the optimal control problem (16) shares the same necessary conditions with the new differential game, we again utilize the maximum principle. The Hamiltonian of player for the new differential game is . We find that relations (9), (10), and (11) under the Hamiltonian are aligned with relations (17), (18), and (19), where at each for all . ∎
Appendix C Example
To illustrate Corollary 2, we consider a directed network in Fig. 3. Here, , , . The associated with this network can be rewritten as an upper triangular block matrix. The upper triangular matrix is denoted by where the first rows and columns of this matrix represent the vertices in in an ascending order. The last rows and columns represent the rest of the vertices in in an ascending order. For example, the permutation for agent is . Thus, the dynamics of under the differential game given in Corollary 2 can be written as
[TABLE]
where
[TABLE]
Thus, if we let , the dynamics of described by is consistent with the dynamics of the th component of described by (18). By solving the optimization problem (19), we know that the optimal control problem shares the same control rule (14) with the differential game problem. Since for every , we can see the statement in Corollary 2 holds.
References
- [1] Farwell, J.P., Rohozinski, R. “Stuxnet and the future of cyber war”. Survival, vol. 53, no. 1, 2011, pp. 23-40.
- [2] Pasqualetti, F., Dorfler, F., Bullo, F. “Control-theoretic methods for cyberphysical security: Geometric principles for optimal cross-layer resilient control systems”. IEEE Control Systems, vol. 35, no. 1, 2015, pp. 110-127.
- [3] Hayel, Y., Zhu, Q. “Dynamics of Strategic Protection Against Virus Propagation in Heterogeneous Complex Networks”. In International Conference on Decision and Game Theory for Security, Springer, 2017, pp. 506-518.
- [4] Guo, D., Trajanovski, S., van de Bovenkamp, R., Wang, H., Van Mieghem, P. “Epidemic threshold and topological structure of susceptible-infectious-susceptible epidemics in adaptive networks”. Physical Review E, vol. 88, no. 4, 2013, p. 042802.
- [5] Khouzani, M. H. R., Altman, E., Sarkar, S. “Optimal quarantining of wireless malware through reception gain control”. IEEE Transactions on Automatic Control, vol. 57, no. 1, 2012, pp. 49-61.
- [6] Zhu, Q., Fung, C., Boutaba, R., Başar, T. “GUIDEX: A game-theoretic incentive-based mechanism for intrusion detection networks”. IEEE Journal on Selected Areas in Communications, vol. 30, no. 11, 2012, pp. 2220-2230.
- [7] Mai, V., Abed, E. “Distributed optimization over weighted directed graphs using row stochastic matrix”. 2016 American Control Conference (ACC), Boston, MA, 2016, pp. 7165-7170.
- [8] Zhang, T., Zhu, Q. “Dynamic differential privacy for ADMM-based distributed classification learning”. IEEE Transactions on Information Forensics and Security, vol. 12, no. 1, 2017, pp. 172-187.
