Persuasion-based Robust Sensor Design Against Attackers with Unknown Control Objectives
Muhammed O. Sayin, Tamer Basar

TL;DR
This paper develops a robust sensor design framework that uses persuasion strategies to influence attacker beliefs in stochastic control systems, minimizing damage from attackers with unknown objectives.
Contribution
It introduces a linear-plus-noise signaling strategy and a semi-definite programming approach for robust sensor design against unknown attacker objectives in stochastic control.
Findings
Closed-form solution for signaling strategy
Linear matrix inequality condition for belief covariance
Semi-definite program for global optimization
Muhammed O. Sayin and Tamer Başar
This research was supported by the U.S. Office of Naval Research (ONR) MURI grant N00014-16-1-2710.
Muhammed O. Sayin is with the Laboratory for Information and Decision Systems at MIT, Cambridge, MA 02139. E-mail: [email protected]
Tamer Başar is with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA. E-mail: [email protected]
Abstract
In this paper, we introduce a robust sensor design framework to provide “persuasion-based” defense in stochastic control systems against an unknown type attacker with a control objective exclusive to its type. For effective control, such an attacker’s actions depend on its belief on the underlying state of the system. We design a robust “linear-plus-noise” signaling strategy to encode sensor outputs in order to shape the attacker’s belief in a strategic way and correspondingly to persuade the attacker to take actions that lead to minimum damage with respect to the system’s objective. The specific model we adopt is a Gauss-Markov process driven by a controller with a (partially) “unknown” malicious/benign control objective. We seek to defend against the worst possible distribution over control objectives in a robust way under the solution concept of Stackelberg equilibrium, where the sensor is the leader. We show that a necessary and sufficient condition on the covariance matrix of the posterior belief is a certain linear matrix inequality and we provide a closed-form solution for the associated signaling strategy. This enables us to formulate an equivalent tractable problem, indeed a semi-definite program, to compute the robust sensor design strategies “globally” even though the original optimization problem is non-convex and highly nonlinear. We also extend this result to scenarios where the sensor makes noisy or partial measurements. Finally, we analyze the ensuing performance numerically for various scenarios.
Index Terms:
Stackelberg games, Stochastic control, Security, Sensor placement, Semi-definite programming.
I Introduction
Cyber connectedness of control systems has brought in new security challenges where attackers can manipulate control systems at unprecedented levels with various malicious tasks on their agenda [1, 2, 3]. We can view such intelligent attackers as decision makers that take actions driven by and compatible with their own objective and the information available to them. This implies that the kind of information available to an attacker has an indirect impact on what actions the attacker would take. Correspondingly, it is intuitively expected that if we can manipulate the information available to attackers, then we can persuade them to attack the system in a way that is in line with the system’s objective to the extent possible, so that the attack would cause minimum damage to the system.
There are certain distinct challenges for persuasion-based defense measures. For example, attackers make their decisions based on the belief they have formed using the information available to them. A challenge is how to shape that belief in a desired way by controlling the information available to the attacker. The first requirement (and challenge) here is to be able to identify how an attacker would form its belief. However, when an attacker forms its belief strategically, there might be multiple Nash equilibria in a general-sum setting [Footnote 1], as shown in the strategic information transmission framework [4]. In that case how an attacker forms its belief is not well defined since the attacker might be forming its belief differently at different equilibria and whether (and which) equilibrium will be realized would not be known, and constitutes an uncertainty. Furthermore, even when there exists a well-defined characterization of how an attacker would form its belief, another challenge is what decision the attacker would make based on that belief, since that decision would depend on the attacker’s malicious objective. This implies that persuasion-based defense is attack-specific. Therefore another challenge is how to persuade an attacker whose objective is not known to act in a certain way.
Footnote 1: In a zero-sum setting, there exists a unique “babbling equilibrium” where the attacker forms a belief independent of the information provided by a defender (since the defender would not share anything useful) whereas the defender just makes irrelevant information available to the attacker (since a more informed attacker would not cause less damage to the system).
Before addressing these challenges, let us briefly review the literature from a broader perspective where a decision maker seeks to induce another one to take certain actions. A closely related framework is Bayesian persuasion, introduced by Kamenica and Gentzkow in their seminal paper [5]. They addressed how a sender could persuade a receiver to take certain actions under the solution concept of Stackelberg equilibrium [6] where the sender (the receiver) is the leader (the follower). In other words, the receiver is aware of the sender’s signaling strategy while taking its actions. Such a leader-follower scheme creates an environment where the receiver’s actions are not strategic, and therefore there is a well-defined characterization for how the belief is formed.
In the Bayesian persuasion framework, a necessary and sufficient condition for a distribution over posterior beliefs to be induced by some signaling strategy is that the mean of the posterior beliefs equals the prior belief. This enables us to formulate an equivalent problem over distributions over posterior beliefs under a linear equality constraint. The authors provided a geometrical interpretation to compute the solution, which necessitates computation of the convex envelope of a function. Although this interpretation provides essential insight for designing signaling strategies and has led to various applications (see the recent survey on Bayesian persuasion [7] and the references therein), it is viable only for scenarios where the state space is very small.
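The condition that posteriors average back to the prior can be checked numerically on a toy example. The following is a minimal sketch for a binary state and a binary signal; the prior and the signaling probabilities are hypothetical values chosen only for illustration and do not come from [5].

```python
import numpy as np

# Hypothetical prior over a binary state {0, 1}.
prior = np.array([0.7, 0.3])

# signaling[i, j] = P(signal j | state i); illustrative values only.
signaling = np.array([[0.8, 0.2],
                      [0.1, 0.9]])

# Joint distribution of (state, signal) and marginal probability of each signal.
joint = prior[:, None] * signaling
signal_prob = joint.sum(axis=0)

# Posterior over states after observing each signal (one column per signal).
posteriors = joint / signal_prob

# Averaging the posteriors with the signal probabilities recovers the prior.
mean_posterior = posteriors @ signal_prob
print(mean_posterior)
```

Any other row-stochastic `signaling` matrix gives the same average, which is why the sender's problem can be posed over distributions of posteriors subject to this single linear constraint.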
We emphasize that [5] studies the Bayesian persuasion problem in a very general framework, where the underlying distribution is arbitrary as long as its support set is compact, cost functions are arbitrary, and the signaling strategy is any stochastic kernel (between the state space and the signal space). Therefore if we consider specific distributions and cost functions, we should be able to obtain more tractable solutions. For example in [8], the author addressed the Bayesian persuasion problem for multivariate Gaussian information and quadratic cost functions, and provided a closed-form solution for the optimal signaling strategy, which turns out to be a linear function. We note that the studies [5, 8] focused on static systems. To be able to adapt this framework to control systems, an important step would be to extend the framework to dynamic systems. In [9], we have extended the results in [8] to dynamic environments where the underlying information is a discrete-time Gauss-Markov process. We showed that there exists a linear signaling strategy that is optimal within the general class of measurable policies and provided a semi-definite program (SDP) to compute optimal signaling strategies numerically.
Before delving into persuasion-based defense measures in control systems, let us also review the literature on security of control systems. To this end, we selectively focus on studies where an attacker can monitor and intervene in the links from sensor to controller and from controller to actuator so that there would be an information flow to the attacker who monitors these links. In [10], the authors showed that an attacker can cause unbounded estimation error by injecting false data into the link from sensor to controller when there exists a certain threshold-based detector monitoring the link. In [11, 12], the authors characterized the reachable set to which an evasive attacker can drive the system by injecting data into both links jointly. In [13, 14], the authors analyzed attacks where an attacker seeks to drive the state of the system according to his/her adversarial goal evasively by injecting data into both links jointly. In [15], the authors analyzed optimal attack strategies to maximize the quadratic cost of a system with linear Gaussian dynamics by injecting false data into the link from controller to actuator without being detected.
As seen in the literature reviewed above, an attacker can monitor and intervene in the links in a control system, e.g., as illustrated in Fig. 1 for a linear quadratic Gaussian (LQG) control system. This implies that an attacker can bypass the controller of a system by monitoring and intervening in both links jointly. Then the attacker would receive the sensor outputs and could generate its own control inputs to pursue its malicious control objective, similar to the controller of the system. From this viewpoint, within the Bayesian persuasion framework, we can encode the sensor outputs to persuade the attacker to generate control inputs that would minimize the system’s very own control objective as much as possible. We partially addressed this challenge in [9, 16, 17]. In [9], we formulated an optimal linear signaling rule in an LQG setting when the attacker’s control objective is known to the sensor. This limits applicability of the solution since the sensor must know beforehand when there is an attack and what the control objective of the attacker is. In the follow-up papers [16, 17], we showed that this limitation could be relaxed in a straightforward way if the sensor knows the distribution over the control objectives of attackers that may attack the system and with what probability there might be an attack. Correspondingly, a risk-neutral sensor could minimize the expected damage with respect to that distribution. However, this still limits its applicability since the sensor must know the underlying distribution over the attack space.
Although it is not a persuasion-based defense, it is worth noting that in [18], the authors proposed linear encoding schemes (we use the terms encoding scheme and signaling rule/strategy interchangeably) for sensor outputs in an LQG control problem in order to enhance detectability of false data injection attacks under an essential assumption that the encoding matrix is oblivious to attackers. In that respect, the encoding scheme could be viewed as corrupting the information available to attackers (without impacting the information flowing to the controller of the system) in order to limit the attacker’s capability to evade detection rather than persuading the attacker to attack in a certain way. And this security measure becomes undermined completely once the encoding scheme is compromised.
Coming to the specifics of this paper, we also consider a discrete-time LQG control problem, similar to the studies reviewed above [13, 14, 15, 16, 17, 18], but with important differences. We particularly seek to design linear-plus-noise signaling rules for persuasion-based robust sensor design against an unknown type attacker with a control objective exclusive to its type. It is robust in the sense that sensor outputs are designed against the worst possible distribution over all possible control objectives of the attacker. We consider the scenario where the set of types is finite and known to the system. Under the solution concept of Stackelberg equilibrium, we consider the setting where the sensor is the leader and the attacker is the follower. This yields that the underlying encoding scheme is not necessarily oblivious to the attacker. Furthermore, we address the scenarios where the sensor can have partial or noisy measurements different from the models in [9, 16, 17]. Note that the worst case distribution over the type set turns out to be not necessarily a degenerate distribution, i.e., defending only against the (strongest) type of attack that leads to largest cost for the system is not necessarily optimal with respect to the system’s objective.
Due to the underlying leader-follower setting, the response of the follower (the attacker) is non-strategic and the problem faced by the leader (the sensor) turns out to be an optimization problem rather than a fixed-point problem. By using the classical method of completion of squares [19], we can show that the objective function in that optimization problem is linear in the covariance matrix of the posterior estimate of the underlying (control-free) state. However that covariance matrix depends on the encoding scheme in a nonlinear way, which leads to a non-convex and highly nonlinear optimization problem, to be solved globally in order to compute robust sensor design strategies. In [9], our previous inspection revealed a necessary condition on this covariance matrix, which is just a linear matrix inequality. Here we show that this necessary condition is also a sufficient condition when the sensor selects linear plus noise signaling strategies. Furthermore for any matrix satisfying the necessary condition, we provide a closed-form solution for the associated linear plus noise signaling rule. Therefore instead of trying to solve a non-convex and highly nonlinear optimization problem, we are able to bring the problem to one of solving a linear optimization problem under linear matrix inequality constraints, which can be done numerically using existing SDP solvers effectively, e.g., [20, 21]. This solution concept can be seen to have similar flavor with our previous paper [9], reviewed above. However, here we develop and present a completely new and more comprehensive set of technical tools since the results of [9] cannot be adopted for the general setting of this paper. The reader can refer to Appendix A for a detailed discussion on this matter.
We now highlight the main contributions of this paper as follows:
- We show that a necessary and sufficient condition on the covariance matrix of the posterior estimate of the control-free state is a certain linear matrix inequality. This result is important by itself since it yields a tractable solution concept to design sensor outputs in general settings not limited to the special setting studied in this paper.
- Based on this necessary and sufficient condition, we provide an SDP equivalent to the original optimization problem faced by the sensor, which is non-convex and highly nonlinear, while the SDP can be solved globally and effectively using existing powerful computational tools [20, 21]. Furthermore, this result can be extended to scenarios where there are partial or noisy measurements.
- We note that the robust signaling rule can dictate that sensors introduce irrelevant information into sensor outputs, quite contrary to other settings, e.g., when there is no uncertainty (or there is imperfect information) on the attacker’s type. This observation enables us to draw the following conclusions:
  - The equivalence result does not necessarily hold if the sensor can only use linear signaling rules.
  - Based on Blackwell’s Irrelevant Information Theorem [22, Theorem D.1.1], linear signaling strategies are not optimal within the general class of measurable strategies in this robust setting.
The paper is organized as follows: In Section II, we formulate the robust sensor design game. In Section III, we analyze the equilibrium of the robust sensor design game under perfect measurements. In Section IV, we extend the results to the cases where there are partial or noisy measurements. In Section V, we examine numerically the performance of the proposed scheme for various scenarios. We conclude the paper in Section VI with several remarks and possible research directions. An appendix provides further discussion on related literature, proofs of all technical results in the order they appear throughout the paper, and some closed-form expressions for the reader’s reference.
Notation: We denote a collection of parameters via a subscript, e.g., , by dropping the subscript, e.g., , for notational brevity. For a vector and a matrix , and denote their transposes, and denotes the Euclidean () norm of the vector . For a matrix , and denote its trace and rank, respectively. We denote the identity and zero matrices with the associated dimensions by and , respectively. We denote the set of -by- symmetric, positive semi-definite, and positive definite matrices by , , and , respectively. For a matrix , denotes its Moore-Penrose inverse. For positive semi-definite matrices and , means that is also a positive semi-definite matrix. We denote the Kronecker product of matrices and by .
We denote an ordered set and its augmented vector version, , by , by some abuse of notation. denotes the multivariate Gaussian distribution with zero mean and designated covariance matrix. We denote random variables by bold lower case letters, e.g., . We denote expectation and (co)variance of a random variable by and , respectively. For random variables and , denotes the expectation of with respect to the random variable . We denote the set of all possible probability distributions over a set by .
II Problem Formulation
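Consider a control system, as illustrated in Fig. 1, whose underlying state dynamics and sensor measurements are described, respectively, by:
$$\boldsymbol{x}_{k+1} = A\boldsymbol{x}_k + B\boldsymbol{u}_k + \boldsymbol{w}_k, \tag{1}$$
$$\boldsymbol{y}_k = C\boldsymbol{x}_k + \boldsymbol{v}_k, \tag{2}$$
for $k = 1, \ldots, \kappa$, where $A \in \mathbb{R}^{m \times m}$, $B \in \mathbb{R}^{m \times r}$, and $C \in \mathbb{R}^{n \times m}$, and the initial state $\boldsymbol{x}_1 \sim \mathcal{N}(0, \Sigma_1)$ [Footnote 3]. The additive state and measurement noise sequences $\{\boldsymbol{w}_k\}$ and $\{\boldsymbol{v}_k\}$, respectively, are white Gaussian vector processes, i.e., $\boldsymbol{w}_k \sim \mathcal{N}(0, \Sigma_w)$ and $\boldsymbol{v}_k \sim \mathcal{N}(0, \Sigma_v)$; and are independent of the initial state $\boldsymbol{x}_1$ and of each other.
Footnote 3: Even though we consider time-invariant matrices $A$, $B$, and $C$ for notational simplicity, the provided results could be extended to time-variant cases rather routinely. Furthermore, we consider all the random parameters to have zero mean; however, the derivations can be extended to the non-zero mean case in a straightforward way.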
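To make the model concrete, the following is a minimal simulation sketch, assuming the standard linear-Gaussian form $x_{k+1} = A x_k + B u_k + w_k$, $y_k = C x_k + v_k$ with zero-mean Gaussian initial state and noises. The function name `simulate` and all numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def simulate(A, B, C, Sigma_1, Sigma_w, Sigma_v, controls, rng):
    """Simulate x_{k+1} = A x_k + B u_k + w_k and y_k = C x_k + v_k."""
    m, n = A.shape[0], C.shape[0]
    x = rng.multivariate_normal(np.zeros(m), Sigma_1)  # initial state
    states, measurements = [], []
    for u in controls:
        # Measurement of the current state, corrupted by Gaussian noise.
        y = C @ x + rng.multivariate_normal(np.zeros(n), Sigma_v)
        states.append(x)
        measurements.append(y)
        # State update driven by the control input and the process noise.
        x = A @ x + B @ u + rng.multivariate_normal(np.zeros(m), Sigma_w)
    return np.array(states), np.array(measurements)

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 0.9]])   # illustrative system matrices
B = np.array([[0.0], [1.0]])
C = np.eye(2)
controls = [np.zeros(1) for _ in range(5)]  # zero input over a 5-step horizon
states, measurements = simulate(
    A, B, C, np.eye(2), 0.1 * np.eye(2), 0.01 * np.eye(2), controls, rng
)
print(states.shape, measurements.shape)
```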
As seen in Fig. 1, measurements are encoded into a signal through a signaling strategy , which is a stochastic kernel, and almost everywhere over . We specifically consider “linear plus noise” signaling rules such that the signal is given by
[TABLE]
almost everywhere over , where can be any -by- deterministic matrix, and is a zero mean multivariate Gaussian noise independent of every other parameter, and its covariance matrix can be any -by- positive semi-definite matrix. We denote the set of such signaling rules by , i.e., . Furthermore, the closed-loop control input is given by
[TABLE]
almost everywhere over , where can be any Borel measurable function from to . We denote the set of such control policies by , i.e., . For notational brevity, let us denote signaling (control) strategies and the associated sets across the horizon by and ( and ), respectively.
II-A Defense Model
We consider an LQG control problem, where the controller of the system selects a measurable control strategy in order to minimize
[TABLE]
where and (for notational simplicity, we consider time-invariant and ; however, the results provided could be extended to the general time-variant case rather routinely). As illustrated in Fig. 1, we consider the encoder of the system as a decision maker, denoted by . selects the signaling strategy in order to minimize the same objective with the controller, (5). Note that if there were no attacks, identity function, where the measurements are shared with the controller fully, would be an optimal encoding scheme due to the data processing inequality [23, Theorem 2.8.1]. However, we consider here the scenarios where there can be an attack with an unknown control objective. Correspondingly there might be encoding schemes that do not share the measurements fully and can lead to better performance with respect to (5).
II-B Attack Model
We consider an attacker who is aware of the underlying state dynamics, i.e., gain matrices , , and ; covariance matrices , , and ; and the encoding scheme, i.e., . The attacker is of an unknown type, which determines its control objective. Let us denote the set of all possible types by . We suppose that is finite and known by the system. We seek to provide a compact and unified analysis. Therefore we consider that the control objective of type is given by
[TABLE]
where and are exclusive to the type . Note that when there is an attack, the attacker selects a control strategy in order to construct its control input and correspondingly its control objective includes the term . We also denote the state driven by type- attacker by , to make it explicit.
Remark 1 (Versatility of Control Objectives).
We model the control objectives of the system and the attacker by (5) and (6), respectively, within a unified framework. However, arbitrariness of weight matrices in the control objectives (5) and (6) brings in flexibility to model various attack scenarios (that are not in the exact form of (6)) through the transformation of the underlying state space as exemplified in Section V.
II-C Game Model
We analyze the interaction between the attacker and under the solution concept of Stackelberg equilibrium where is the leader. Note that from the viewpoint of , either the controller of the system is receiving the sensor output and driving the state or there is an unknown type attack, and it is getting the sensor output and it is generating the control input. Since whether there is an attack or not is also an uncertainty, let us view the attacker and the controller of the system as a single player in a unified way, denoted by , with an unknown type from the extended type set , where type- corresponds to the controller of the system, i.e., and . Therefore, we consider a Stackelberg game between the leader and the follower (of an unknown type).
We note that depending on its type, selects different control policies, which lead to different control inputs, and states. Therefore for type- , we use , and . The objective of type- is given by
[TABLE]
On the other hand, the objective of is given by
[TABLE]
where denotes all possible distributions over the extended type set . Note that the maximization in (8) computes the cost of for the worst case distribution over .
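For a fixed signaling strategy, the expected cost is linear in the distribution over types, so the inner maximization over the probability simplex is attained at a single type. The sketch below checks this numerically; the per-type costs are hypothetical numbers, not values from the paper. At the solution of the outer minimization several types may tie, which is why the worst-case distribution of the overall problem need not be degenerate.

```python
import numpy as np

# Hypothetical costs to the leader, one per follower type, for a fixed signaling rule.
costs = np.array([3.0, 5.0, 4.0])

def expected_cost(p, costs):
    """Expected cost under a distribution p over types (linear in p)."""
    return float(p @ costs)

# Sample many distributions from the simplex and compare with the best vertex.
rng = np.random.default_rng(1)
samples = rng.dirichlet(np.ones(len(costs)), size=1000)
sampled_max = max(expected_cost(p, costs) for p in samples)
worst_case = float(costs.max())  # value at the vertex of the most costly type

print(sampled_max, worst_case)
```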
Before describing the game formally, let us take a closer look at ’s objective (8). We can view (8) consisting of two parts, one of which is
[TABLE]
where corresponds to the probability that the controller of the system drives the state under the worst case distribution and the summation is identical to (5) since and . The other part is
[TABLE]
which implies that seeks to minimize when type- attacker drives the state. Note that it includes only the first term in (5) since we consider that would not necessarily want the attacker to have small size control inputs.
Definition (Robust Sensor Design Game).
The robust sensor design game
[TABLE]
is a Stackelberg game between the leader and the follower . Objectives of and are given by (7) and (8), respectively. Then a tuple of strategies attains the Stackelberg equilibrium provided that
[TABLE]
where, by an abuse of notation, we denote type- ’s strategy by to show its dependence on ’s signaling rules due to the leader-follower scheme, explicitly.
Note that there might be multiple best responses by for a signaling strategy. Correspondingly (12a) would not be well defined if these best responses lead to different costs for . However, as we will show later in detail, reaction set of type- turns out to be an equivalence class such that all in the reaction set lead to the same control input almost surely, and correspondingly lead to the same cost for .
III Robust Sensor Design Framework
In this section, we consider the special case where has access to perfect measurements, i.e., for ; the general noisy/partial measurements case will be addressed later in Section IV. To compute the equilibrium of the game , we first focus on best response strategy of for a given signaling strategy. This enables us to formulate the optimization problem faced by to compute robust signaling strategies. Even though this is a finite-dimensional optimization problem, further inspection reveals that it is highly nonlinear and non-convex. Therefore a generic approach would not be able to address it globally. To mitigate this issue, we formulate a tractable problem equivalent to the original optimization problem. We can solve this tractable problem globally using existing computational tools effectively. Given that solution, we also show how we can compute the associated signaling strategies. We now provide the details of these steps.
An important challenge in the design of encoding schemes in control systems compared to communication systems is that the underlying state depends on control inputs. To mitigate this issue, we introduce the control-free state $\{\boldsymbol{x}^o_k\}$ evolving according to
$$\boldsymbol{x}^o_{k+1} = A\boldsymbol{x}^o_k + \boldsymbol{w}_k \tag{13}$$
and $\boldsymbol{x}^o_1 = \boldsymbol{x}_1$. As shown in [24], the routine technique of completion of squares yields that
[TABLE]
where the matrices , , and scalar are given by
[TABLE]
and satisfies the following discrete-time dynamic Riccati equation
[TABLE]
and . Furthermore depends on the control inputs through the following transformation:
[TABLE]
Note that is positive definite for all .
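As a point of reference, the following sketch runs the textbook finite-horizon discrete-time Riccati recursion and checks that the matrix $R + B' P B$ that appears in the completion of squares is positive definite at every stage. This is the standard form with terminal condition $P = Q$, which is an assumption here; the paper's own recursion and indexing may differ in detail. The function name and the numerical values are hypothetical.

```python
import numpy as np

def riccati_backward(A, B, Q, R, horizon):
    """Textbook backward Riccati recursion; returns time-ordered P matrices and gains."""
    P = Q.copy()                                  # assumed terminal condition
    Ps, gains = [P], []
    for _ in range(horizon):
        Delta = R + B.T @ P @ B                   # positive definite whenever R is
        K = np.linalg.solve(Delta, B.T @ P @ A)   # feedback gain, u = -K x
        P = Q + A.T @ P @ (A - B @ K)             # Riccati update
        P = 0.5 * (P + P.T)                       # symmetrize against round-off
        Ps.append(P)
        gains.append(K)
    return Ps[::-1], gains[::-1]

A = np.array([[1.0, 0.1], [0.0, 0.9]])   # illustrative values
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[0.5]])
Ps, gains = riccati_backward(A, B, Q, R, horizon=10)
print(Ps[0])
```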
Contrary to team problems (where all decision makers have the same objective), as studied in [24], in non cooperative settings, (14) does not imply that a control input leading to is optimal since control inputs can have an impact on the signals that will be generated in future stages. However, as we will show below, cannot impact the signals generated in future stages when uses linear plus noise signaling strategies only. To show this, we let the gain matrix in signaling strategy , as described in (3), be partitioned as , where . Then, signal can be written as
[TABLE]
where the term in-between is - measurable. Similar to [9, Lemma 12] (which shows a result similar to (16) when the sender selects only linear and memoryless signaling strategies), this yields that
[TABLE]
where we define . Correspondingly, since is positive definite for all , right-hand sides of (14) and (15) yield that optimal reaction of type- is indeed the one that leads to
[TABLE]
almost everywhere over , since does not depend on . This shows that all strategies in the best reaction set of type- lead to (16) and therefore lead to the same cost for .
Based on (16), the following lemma shows that we can write the optimization objective in (8) as a linear function of the covariance matrix of the posterior estimate of control-free state (henceforth we say “covariance matrix” instead of “covariance matrix of posterior estimate of control-free state”, and we say “posterior” instead of “posterior estimate of control-free state”), i.e.,
[TABLE]
for .
Lemma 1.
The problem faced by , i.e., (12a), can be written as
[TABLE]
where is a certain symmetric matrix, described in (50), that does not depend on the optimization arguments, and , described in (44) and (49), is a certain fixed scalar.
We emphasize that complexity of the objective functions (5) and (8) is buried in fixed parameters . Based on this observation, we make the following remarks:
Remark 2 (Versatility of the results with respect to ’s objective).
The problem faced by is a linear function of the covariance matrices since optimal reaction of turns out to be linear in the posterior estimate of the control-free state, i.e., , as seen in (16). Therefore the results henceforth hold for any other scenarios where has any other objective in which optimal reaction of still turns out to be a linear function of the posterior estimate. Note that we need to compute the associated accordingly.
Remark 3 (Versatility of the results with respect to ’s objective).
We motivate and consider the case where has objective (8). However, the results henceforth would also hold for scenarios where ’s objective is any other (convex or non-convex) quadratic function of the state and control input. Note that we also need to compute the associated accordingly.
Compared to original form of the optimization problem (12a), the optimization problem (17) has a concrete structure showing that the optimization function depends on the signaling strategy through the covariance matrix only, and it is a linear function of the covariance matrix. Therefore it is instructive to study the relationship between the covariance matrix and the signaling strategy. Since the underlying distributions are all jointly Gaussian, we can express the covariance matrix in closed form:
[TABLE]
and the signal is given by (3). This yields that even though computing robust signaling strategies would mean finding a certain number of matrices, it is a highly nonlinear and non-convex optimization problem due to the matrix inversion in (18). Therefore it is difficult to obtain a global solution through a generic attempt, e.g., genetic algorithm [25] or particle swarm optimization [26]. On the other hand, the following proposition shows that there is a computationally tractable relationship between the covariance matrix and linear plus noise signaling strategies.
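The nonlinearity is visible already in a single-stage case. The sketch below uses the standard Gaussian conditioning formula: for $x \sim \mathcal{N}(0, \Sigma)$ and a signal $s = L'x + \vartheta$ with independent noise $\vartheta \sim \mathcal{N}(0, \Theta)$, the covariance of the posterior estimate is $\Sigma L (L'\Sigma L + \Theta)^{\dagger} L'\Sigma$. The symbols $L$, $\Theta$, the function name, and the numbers are assumptions made for illustration; the multi-stage expression (18) is not reproduced here.

```python
import numpy as np

def posterior_covariance(Sigma, L, Theta):
    """cov(E[x | s]) for x ~ N(0, Sigma) and s = L' x + noise, noise ~ N(0, Theta)."""
    cross = Sigma @ L                       # cov(x, s)
    signal_cov = L.T @ Sigma @ L + Theta    # cov(s)
    # The pseudo-inverse of cov(s) is where the signaling gain enters nonlinearly.
    return cross @ np.linalg.pinv(signal_cov) @ cross.T

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
L = np.array([[1.0], [0.0]])      # signal reveals the first component only
Theta = np.array([[0.5]])         # variance of the added noise
H = posterior_covariance(Sigma, L, Theta)
print(H)
```

For these numbers the result is `[[1.6, 0.4], [0.4, 0.1]]`, and `Sigma - H` is positive semi-definite, as expected for the covariance of a posterior estimate.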
Proposition 1 (A Necessary and Sufficient Condition).
For any signaling rule , covariance matrix of posterior estimate of the control-free state, , satisfies
[TABLE]
where . (By the definition of , we have .)
Furthermore for any collection of positive semi-definite matrices satisfying
[TABLE]
there exists a (memoryless) linear-plus-noise signaling strategy (described in (21) later) such that .
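The necessity direction can be checked numerically on a two-stage example. The sketch below assumes that the condition takes the form $\Sigma^o_k \succeq H_k \succeq A H_{k-1} A'$, where $\Sigma^o_k$ is the covariance of the control-free state and $H_k$ is the covariance of its posterior estimate; this form is an assumption, since the displayed inequality is not reproduced in this text. Memoryless signals $s_k = L_k' x^o_k + \vartheta_k$ with randomly drawn gains and noise covariances are used, and all names are hypothetical.

```python
import numpy as np

def min_eig(M):
    """Smallest eigenvalue of the symmetric part of M."""
    return float(np.linalg.eigvalsh(0.5 * (M + M.T)).min())

def two_stage_posterior_covariances(A, Sigma_1, Sigma_w, L1, Theta1, L2, Theta2):
    """Posterior-estimate covariances H1, H2 for x2 = A x1 + w and memoryless signals."""
    Sigma_2 = A @ Sigma_1 @ A.T + Sigma_w
    C_s1 = L1.T @ Sigma_1 @ L1 + Theta1          # cov(s1)
    C_s2 = L2.T @ Sigma_2 @ L2 + Theta2          # cov(s2)
    C_s2s1 = L2.T @ A @ Sigma_1 @ L1             # cov(s2, s1)
    C_ss = np.block([[C_s1, C_s2s1.T], [C_s2s1, C_s2]])
    C_x2s = np.hstack([A @ Sigma_1 @ L1, Sigma_2 @ L2])   # cov(x2, [s1, s2])
    H1 = Sigma_1 @ L1 @ np.linalg.pinv(C_s1) @ L1.T @ Sigma_1
    H2 = C_x2s @ np.linalg.pinv(C_ss) @ C_x2s.T
    return Sigma_2, H1, H2

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.2], [0.0, 0.8]])
Sigma_1, Sigma_w = np.eye(2), 0.5 * np.eye(2)
L1, L2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
G1, G2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
Theta1, Theta2 = G1 @ G1.T, G2 @ G2.T            # random PSD noise covariances

Sigma_2, H1, H2 = two_stage_posterior_covariances(A, Sigma_1, Sigma_w, L1, Theta1, L2, Theta2)
print(min_eig(Sigma_2 - H2), min_eig(H2 - A @ H1 @ A.T))
```

Both printed values should be nonnegative up to round-off: the upper bound holds because an estimate cannot carry more variance than the state itself, and the lower bound holds because the second signal can only add information to the prediction based on the first one.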
In the following, we provide a description of signaling strategies (a derivation of the associated signaling strategies can be found in the constructive proof of Proposition 1, which is provided in Appendix C) that lead to covariance matrices matching with a given collection of positive semi-definite matrices satisfying (35). To this end, we let
[TABLE]
be the eigen-decomposition such that , where . Furthermore, we let
[TABLE]
have the eigen-decomposition with eigenvalues, e.g., . We note that turns out to be in . Then, the associated signaling strategy is given by
[TABLE]
where with , and the gain matrix is given by
[TABLE]
where is a diagonal matrix such that satisfies
[TABLE]
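To make the construction concrete, here is a minimal single-stage sketch, assuming a positive definite prior covariance `Sigma` (the paper also handles the singular case) and the signal convention `s = L^T x + n` with `n ~ N(0, Theta)`. Given a target `S` with `0 <= S <= Sigma`, it picks a gain and a diagonal noise covariance from an eigen-decomposition so that the posterior-estimate covariance equals `S`. The variable names and the particular gain/noise parametrization are illustrative assumptions and need not coincide with (21).

```python
import numpy as np

rng = np.random.default_rng(1)
m = 3
M = rng.standard_normal((m, m))
Sigma = M @ M.T + m * np.eye(m)  # assumed positive definite prior covariance

# Symmetric square root of Sigma and its inverse.
w, V = np.linalg.eigh(Sigma)
Sigma_half = V @ np.diag(np.sqrt(w)) @ V.T
Sigma_half_inv = V @ np.diag(1.0 / np.sqrt(w)) @ V.T

# Target covariance S = Sigma^{1/2} T Sigma^{1/2} with eig(T) in [0, 1],
# deliberately including the boundary values 0 and 1.
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
S = Sigma_half @ (Q @ np.diag([0.0, 0.4, 1.0]) @ Q.T) @ Sigma_half

def signaling_for(S, eps=1e-10):
    """Return (L, Theta) such that s = L^T x + n, n ~ N(0, Theta), attains S."""
    T = Sigma_half_inv @ S @ Sigma_half_inv
    lam, U = np.linalg.eigh((T + T.T) / 2)
    lam = np.clip(lam, 0.0, 1.0)
    keep = lam > eps                         # directions that are disclosed
    gain = keep.astype(float)                # zero gain hides a direction
    safe = np.where(keep, lam, 1.0)          # avoid division by zero
    theta = np.where(keep, 1.0 / safe - 1.0, 1.0)  # noise variance per direction
    return Sigma_half_inv @ U @ np.diag(gain), np.diag(theta)

L, Theta = signaling_for(S)
C_xs = Sigma @ L
H = C_xs @ np.linalg.solve(L.T @ Sigma @ L + Theta, C_xs.T)
print(np.allclose(H, S, atol=1e-8))
```

An eigenvalue of 1 corresponds to a direction revealed without noise, an eigenvalue of 0 to a direction that is fully hidden, and intermediate values to a noisy disclosure.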
The following corollary to Proposition 1 provides a problem equivalent to (17).
Corollary 1 (Equivalence Result).
The problem faced by , i.e., (8), is equivalent to
[TABLE]
subject to the following linear matrix inequalities:
[TABLE]
Given a solution , an associated signaling strategy can be computed according to (21).
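For intuition about why a linear objective over linear matrix inequalities is tractable, consider the single-stage special case: minimize `tr(V S)` subject to `0 <= S <= Sigma`. This case admits a closed-form minimizer obtained by projecting onto the negative eigenspace of `Sigma^{1/2} V Sigma^{1/2}`. The sketch below assumes a positive definite `Sigma` and a symmetric weight `V` (both hypothetical stand-ins), and compares the closed form against randomly sampled feasible points. The multi-stage problem in the corollary generally requires a numerical SDP solver instead.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 4
M = rng.standard_normal((m, m))
Sigma = M @ M.T + m * np.eye(m)      # assumed positive definite
V = rng.standard_normal((m, m))
V = (V + V.T) / 2                    # symmetric, generally indefinite weight

w, E = np.linalg.eigh(Sigma)
Sigma_half = E @ np.diag(np.sqrt(w)) @ E.T

# With S = Sigma^{1/2} T Sigma^{1/2} and 0 <= T <= I, tr(V S) = tr(Wm T).
Wm = Sigma_half @ V @ Sigma_half
mu, U = np.linalg.eigh((Wm + Wm.T) / 2)
neg = mu < 0
P = U[:, neg] @ U[:, neg].T          # projector onto the negative eigenspace
S_star = Sigma_half @ P @ Sigma_half
opt = np.trace(V @ S_star)           # equals the sum of negative eigenvalues

def random_feasible():
    """Sample some S with 0 <= S <= Sigma."""
    Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    T = Q @ np.diag(rng.uniform(0.0, 1.0, m)) @ Q.T
    return Sigma_half @ T @ Sigma_half

best_random = min(np.trace(V @ random_feasible()) for _ in range(2000))
print(opt, best_random)
```

The minimizer here is an extreme point of the constraint set, which is the typical behavior when the objective is linear.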
Henceforth, we will be working with (22) instead of (17) while analyzing the equilibrium of the game . Furthermore, for brevity of presentation, let us introduce
[TABLE]
and be the set corresponding to the constraints (23) in this new high-dimensional space, i.e., . With this new notation, the problem faced by , i.e., (22), can be written as
[TABLE]
The following proposition addresses the existence of an equilibrium for the game .
Proposition 2 (Existence Result).
There exists at least one tuple of strategies attaining the equilibrium of the Stackelberg game , i.e., satisfying (12).
It is instructive to examine whether optimal signaling strategies end up using irrelevant information or not. The inner optimization problem in (24)
[TABLE]
is a convex function of since the maximum of any family of linear/affine functions is a convex function [27]. Therefore, there might be examples where no extreme point of the constraint set is a solution for (24). Correspondingly, an optimal signaling strategy may turn out to include irrelevant information. This is interesting in view of Blackwell’s Irrelevant Information Theorem [22, Theorem D.1.1]. In particular, the theorem says that, given a cost measure, for any Borel measurable function that uses irrelevant information, there exists another Borel measurable function that does not use any irrelevant information and leads to a cost less than or equal to the one attained with the former function. Therefore we can conclude that in this robust setting, linear signaling strategies are not the best ones within the general class of measurable strategies.
Next, we seek to compute the equilibrium of . To this end, we examine the equilibrium conditions further. In particular, according to (24), given , maximizing is given by
[TABLE]
since the optimization objective in (24) is a linear function of . Based on the observation (26) and the assumption that the type set is finite, the following theorem provides an algorithm to compute robust sensor outputs.
Theorem 1 (Computing the Equilibrium).
The value of the Stackelberg equilibrium (24) is given by , where
[TABLE]
Furthermore, let and
[TABLE]
Then, given , an associated linear-plus-noise signaling strategy can be computed according to (21).
For the reader’s reference, in the following, we list the steps to compute robust sensor design strategies:
- We first compute the matrices , described in (50), and scalars , described in (44) and (49), for each type of control objective, . This step includes computation of the gain matrices of the optimal control input, e.g., (16), in an LQG control problem.
- Given the computed , we numerically solve the SDP described in (27) for each type by using an SDP solver, e.g., [20, 21].
- Once we compute according to (28), we can compute the associated signaling strategies according to (30). Note that this step includes computation of the eigen-decomposition of some matrices.
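The enumeration over types can be illustrated on a scalar toy problem. The sketch assumes that the per-type problem has the form "minimize the cost of type `o` subject to `o` being a worst-case type", which is how the proof of Theorem 1 reads; the display (27) itself is not reproduced here, so treat this form as an assumption. The feasible set is `0 <= S <= sigma2`, and the per-type costs `a[w] * S + b[w]` use made-up numbers.

```python
import numpy as np

sigma2 = 1.0  # prior variance: the feasible set is 0 <= S <= sigma2
# Hypothetical per-type costs a[w] * S + b[w] (illustrative numbers only).
a = np.array([1.0, -1.0, 0.2])
b = np.array([0.0, 0.8, 0.3])

def per_type_value(o):
    """Minimize a[o]*S + b[o] subject to type o attaining the maximum cost."""
    lo, hi = 0.0, sigma2
    for w in range(len(a)):
        if w == o:
            continue
        d, rhs = a[o] - a[w], b[w] - b[o]  # constraint: d * S >= rhs
        if d > 0:
            lo = max(lo, rhs / d)
        elif d < 0:
            hi = min(hi, rhs / d)
        elif rhs > 0:
            return np.inf, None            # type o can never be the worst case
    if lo > hi:
        return np.inf, None                # infeasible for this type
    S = lo if a[o] >= 0 else hi            # a linear objective sits at an endpoint
    return a[o] * S + b[o], S

value, S_star = min((per_type_value(o) for o in range(len(a))),
                    key=lambda t: t[0])

# Brute-force check of min over S of max over types on a fine grid.
grid = np.linspace(0.0, sigma2, 1001)
grid_val = np.max(a[:, None] * grid[None, :] + b[:, None], axis=0).min()
print(value, S_star, grid_val)
```

In this toy instance the robust solution lies strictly inside the interval, where two type costs cross, rather than at an endpoint of the feasible set.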
We re-emphasize that we provide an algorithm to compute robust signaling strategies globally even though the problem is highly nonlinear and non-convex. There is, however, still room to develop computationally more efficient approaches, and the solution concept proposed in this paper can serve as a benchmark to evaluate the performance of such algorithms.
IV Noisy or Partial Measurements
In this section, we seek to obtain robust signaling strategies under noisy or partial (i.e., imperfect) measurements of the type (2). To this end, we transform the problem into the same structure as the case under perfect measurements, based on a recent result from [28], and then invoke the results from the previous section.
There are several challenges in robust sensor design under imperfect measurements. For example, Proposition 1 no longer holds. To obtain a result similar to Proposition 1, a first attempt would be to focus on the covariance matrix of the posterior estimate of control-free measurements, i.e., , where , instead of . In contrast to the control-free state , however, the control-free measurements do not necessarily constitute a Markov process since in general . Therefore, we focus on the augmented vector of measurements .
Since the measurements are jointly Gaussian, we have
[TABLE]
This implies that the augmented measurements evolve according to
[TABLE]
where we define
[TABLE]
Note that neither nor depends on signaling or control strategies. Furthermore, similar to (15), it can be shown that
[TABLE]
where we now have .
Since is - measurable, and are independent of each other conditioned on . This implies that , , and can be viewed as forming a Markov chain in the order . In that respect, [28, Lemma 1] shows that when are jointly Gaussian and form a Markov chain in the order , there exists a linear relation between and irrespective of , and the relation is given by
[TABLE]
where is defined by
[TABLE]
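The linear relation between the two posterior estimates can be checked numerically in a static example. The sketch assumes zero-mean jointly Gaussian `x`, `y = C x + v`, and `s = L^T y + n`, so that `x`, `y`, `s` form a Markov chain in that order, and it assumes the relation takes the form `E[x|s] = D E[y|s]` with `D = cov(x, y) cov(y)^{-1}`. All matrices and names are illustrative stand-ins for the paper's symbols.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 3, 2, 2  # dimensions of x, y, s (illustrative)

M = rng.standard_normal((n, n))
Sx = M @ M.T + np.eye(n)            # cov(x)
C = rng.standard_normal((p, n))     # measurement matrix
R = 0.5 * np.eye(p)                 # measurement noise covariance
L = rng.standard_normal((p, q))     # signaling gain acting on y
Th = 0.3 * np.eye(q)                # signaling noise covariance

Sxy = Sx @ C.T                      # cov(x, y)
Syy = C @ Sx @ C.T + R              # cov(y)
Sxs = Sxy @ L                       # cov(x, s), since s depends on x only via y
Sys = Syy @ L                       # cov(y, s)
Sss = L.T @ Syy @ L + Th            # cov(s)

Kx = Sxs @ np.linalg.inv(Sss)       # E[x | s] = Kx s
Ky = Sys @ np.linalg.inv(Sss)       # E[y | s] = Ky s
D = Sxy @ np.linalg.inv(Syy)        # E[x | y] = D y

print(np.allclose(Kx, D @ Ky))
```

The matrix `D` does not involve `L` or `Th`, which is why the relation holds irrespective of the signaling strategy.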
Under imperfect measurements, the counterpart of the covariance matrix is the covariance matrix of the posterior estimate of the control-free augmented measurements, denoted by
[TABLE]
Based on (30) and (31), we have
[TABLE]
Correspondingly, (17), i.e., the problem faced by , can be written as
[TABLE]
where we define , which can be viewed as the counterpart of under imperfect measurements. (Footnote 10: We use the cyclic property of the trace operator.) We remark the resemblance between (17) and (32), where we have instead of and instead of .
Recall that , which yields that . Under the assumption that instead of (so that can disclose the auxiliary perfectly), we would have transformed the problem under imperfect measurements into a problem under perfect measurements, and correspondingly we could have invoked the results from the previous section directly. The following lemma shows that results for the case where would also hold for the case where even when .
Lemma 2.
Let us denote signaling strategies when by and the associated strategy space by . Then for any , there exists a such that they both lead to the same control strategy and correspondingly the same cost for . And such a signaling strategy is given by
[TABLE]
Based on Lemma 2, the following corollary to Proposition 1 provides a tractable necessary and sufficient condition on .
Corollary 2 (A Necessary and Sufficient Condition Under Imperfect Measurements).
For any signaling rule , the covariance matrix of the posterior estimate of the control-free measurements, , satisfies
[TABLE]
where .
Furthermore for any collection of positive semi-definite matrices satisfying
[TABLE]
there exists a linear-plus-noise signaling strategy such that .
The following corollary provides a problem equivalent to (32).
Corollary 3.
The problem faced by , i.e., (32), is equivalent to
[TABLE]
subject to the following linear matrix inequalities:
[TABLE]
Furthermore, given a solution for (36), we can compute an optimal signaling strategy for (32) as follows:
- Compute signaling strategies as if signal can be dimensional, i.e., , according to (21), where we have instead of , instead of , and instead of . (Footnote 11: We provide closed-form expressions for the auxiliary parameters , and in Appendix G for the reader’s reference.)
- For the computed , compute the associated signaling strategies according to (33).
We remark that under imperfect measurements optimal signaling strategies are not necessarily memoryless anymore.
Through a similar notational convention as in the previous section, we can write (36) as
[TABLE]
where we let be the set corresponding to the constraints (37). Then, based on Corollary 3, the following corollary to Theorem 1 provides a computationally tractable way to compute robust sensor outputs under imperfect measurements.
Corollary 4 (Computing the Equilibrium Under Imperfect Measurements).
The value of the Stackelberg equilibrium (38) is given by , where
[TABLE]
Furthermore, let and
[TABLE]
Then, given , an associated linear-plus-noise signaling strategy can be computed according to Corollary 3.
V Illustrative Examples
In this section, we examine the performance of the proposed defense measure over various attack scenarios. As an illustrative example, we set the length of the time horizon to , the dimension of the state to , and the dimension of the control input to . We consider the scenario where the system’s control objective is to track an exogenous process evolving according to
[TABLE]
where , and is a white Gaussian noise process, and they are independent of each other and every other parameter. Correspondingly, the system’s control objective can be written as
[TABLE]
We set and while and . The sensor has access to the measurements:
[TABLE]
where is a white Gaussian measurement noise independent of every other parameter.
Note that the control objective is not in the form of (5); however, we can transform it into the form of (5) by introducing the augmented state evolving according to
[TABLE]
and augmented measurements are then given by
[TABLE]
Correspondingly (41) can be written as
[TABLE]
where
[TABLE]
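The augmentation step can be sketched as follows, under the assumption that the tracking cost is a weighted quadratic in the error `x - r` with weight `Q`. Stacking state and reference into `z = [x; r]` turns the tracking cost into a quadratic form in `z`, and the two linear recursions into one. The system matrices and the weight below are hypothetical placeholders, not the values used in the example.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 2  # state and reference dimension (illustrative)

# Hypothetical dynamics: x+ = A x + B u + w and r+ = A_r r + w_r.
A = 0.9 * np.eye(m)
B = np.eye(m)
A_r = 0.8 * np.eye(m)
Q = np.diag([1.0, 2.0])  # assumed tracking weight

# Augmented dynamics and cost weight for z = [x; r].
Z = np.zeros((m, m))
A_aug = np.block([[A, Z], [Z, A_r]])
B_aug = np.vstack([B, Z])
Q_aug = np.block([[Q, -Q], [-Q, Q]])

x, r, u = (rng.standard_normal(m) for _ in range(3))
w, w_r = (rng.standard_normal(m) for _ in range(2))
z = np.concatenate([x, r])

tracking_cost = (x - r) @ Q @ (x - r)
augmented_cost = z @ Q_aug @ z

z_next = A_aug @ z + B_aug @ u + np.concatenate([w, w_r])
z_next_direct = np.concatenate([A @ x + B @ u + w, A_r @ r + w_r])

print(np.isclose(tracking_cost, augmented_cost),
      np.allclose(z_next, z_next_direct))
```

The control input enters only the state block of the augmented dynamics, so the reference process remains exogenous after augmentation.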
As examples of attack scenarios, we consider a type set . Let us partition the underlying state and the exogenous process , where and (or and ) correspond to the first (or the second) entries of the state and the exogenous process, respectively. We assume that type- attacker seeks to make track instead of whereas it is not interested in . Then its control objective can be written as
[TABLE]
Similarly type- attacker seeks to make track instead of whereas it is not interested in . Then its control objective can be written as
[TABLE]
On the other hand, type- attacker is interested in both entries and seeks to make the entire state track . Accordingly, its control objective can be written as
[TABLE]
We note that numerical simulations show that mixtures of types , and can lead to larger costs for the system than any single type, including type-. In the following, we examine the cost of under perfect and imperfect measurements separately.
Under perfect measurements, the sensor has access to . Note that perfect measurements provide the utmost freedom for to shape the belief of the attacker. For any cost that can attain under imperfect measurements, can select a signaling strategy under perfect measurements to attain the same cost. Therefore, attains the lowest possible cost under perfect measurements in the robust sensor design framework.
In Table I, we tabulate the cost to for the scenarios where there is no attack, i.e., type of is ; there is an attack by type- attacker, i.e., type of is ; there is an attack by type- attacker, i.e., type of is ; and there is an attack by type- attacker, i.e., type of is . The cost to varies depending on the type of and on how prepared is while constructing the sensor outputs. In other words, constructs the sensor outputs according to an extended type set.
For example, would have constructed signaling strategies according to if it expected that there would not be any attack. Correspondingly, if there is no attack, then the cost would be . However, if there is an attack by, e.g., type- attacker, then the cost would be , which is significantly higher compared to . Next, consider that has constructed the sensor outputs according to . Then, the system would be prepared against an attack by type- attacker, and the cost would be when type- attacker attacks. This is significantly lower than the cost obtained when constructs sensor outputs without any concern about possible attacks. However, now the cost to is if there is no attack and the type of is . This is higher than the cost obtained when constructs sensor outputs without any concern about possible attacks. It is uncertain whether there will be an attack or not. Correspondingly, if constructs the sensor outputs according to , then the cost would be when there is no attack, which is lower than the cost obtained before. On the other hand, the cost would still be around when type- attacker attacks the system.
Even though is prepared for an attack by type- attacker by constructing the sensor outputs according to , there can be an attack by another type of attacker, e.g., type-. Then the cost would be , which is significantly higher than the cost that would be obtained when type- attacker attacks. The system can decrease this cost by also considering the possibility of attacks by type- attacker while constructing the sensor outputs. For example, if constructs the sensor outputs by taking into account types , , and attacks, then the cost would be at most if any of those types of attacks occurs and the cost would be if there is no attack. These examples show the importance of constructing sensor outputs in a robust way.
As an example of imperfect measurements, we take , where , and , where . In Table II, we tabulate the costs to over various scenarios, similar to Table I. Table II shows that imperfect measurements lead to a larger cost for the system when there is no attack, as expected. However, in certain scenarios, imperfect measurements can lead to better performance for the system. For example, when constructs sensor outputs by considering that there would not be any attack, i.e., according to , and there is an attack by type- attacker, the cost would be , which is lower than the cost obtained under perfect measurements. For this scenario, imperfect measurements end up obfuscating type- attacker and lead to a lower cost. In that respect, robust sensor outputs can be viewed as optimal imperfect measurements that lead to minimum cost for the system. Furthermore, the cost to increases under imperfect measurements over the scenarios for which it is prepared. In Table II, we highlight those scenarios by shading their cells. A comparison of the shaded cells of Tables I and II verifies the observation emphasized above, i.e., imperfect measurements limit ’s ability to persuade .
VI Concluding remarks
In this paper, we have proposed and addressed persuasion-based robust sensor design as a security measure in control systems against attackers with unknown control objectives. By designing sensor outputs cautiously in advance, we have sought to shape attackers’ beliefs about the underlying state of the system in order to induce them to act on (or attack) the system in line with the system’s normal operation. We have modeled the problem formally under the solution concept of Stackelberg equilibrium where the defender/sensor is the leader. Non-strategic reaction of the follower/attacker implies that the defender faces an optimization problem while seeking to design robust sensor strategies. We have shown that the optimization problem is non-convex and highly nonlinear. To mitigate this issue, we have formulated a tractable problem equivalent to that problem and shown how to compute the associated signaling strategies. We have also extended the results to scenarios where there are imperfect measurements of the underlying state. Finally, we have examined the performance of the proposed framework across various attack scenarios.
Future directions of research on this topic include the development of computationally efficient algorithms to compute optimal signaling rules, and the development of persuasion-based sensor design strategies for scenarios where attackers have partial information about the underlying state dynamics instead of full knowledge of it, or have side information about the state instead of relying on sensor outputs only. Another interesting research direction would be the application of the framework to sensor placement or sensor selection.
Furthermore, even though we have motivated the framework by relating it to security, the framework could also address strategic information disclosure over multi-agent control networks where agents have different control objectives. Particularly, independent of how we motivate and set up the signaling problem (e.g., a security application or a multi-agent non-cooperative control network), the solution concept developed can be adopted in various settings in a straightforward way provided that
- Information of interest and all random variables/vectors are jointly Gaussian
- There is a single sender and possibly multiple receivers
- Optimal reaction of each receiver is linear in its posterior belief
- The sender’s objective depends on receivers’ reactions only through an arbitrary quadratic function of their posterior beliefs
In this paper, we have used this result to address uncertainties regarding attackers’ (or receivers’) objectives in the security of control systems over a finite horizon. However, the result could be adopted in several other scenarios as well, such as:
- over infinite horizon (as we did in [29])
- when there are additional tractable constraints on the covariance of the posterior belief (as we did in [28] for a power constraint over the signals when there is an additive Gaussian noise channel between the sender and the receiver)
Furthermore, the ability to turn signaling problems (which lead to highly nonlinear and non-convex optimization problems) into linear optimization problems (over the space of positive semi-definite matrices with linear matrix inequality constraints) facilitates analysis of problems over more complex settings, e.g., where
- There can be multiple controllers seeking to drive the same system
- There can be multiple senders that compete with each other to induce a controller to take certain actions
Appendix A Novelty Relative to Reference [9]
In Reference [9], we formulated an SDP equivalent to the original optimization problem in scenarios where there is no uncertainty about the attacker’s objective. Although the approach may seem to have a similar flavor, in [9] we used different technical tools, and these tools cannot be adopted for the settings of this paper. In particular, in [9] we exploited the fact that a solution of a linear optimization problem over a compact convex set lies at extreme points of the constraint set. (Footnote 12: We say that a point in a convex set is an extreme point if it cannot be expressed as a convex combination of any other two points in the set.) Even though we were able to characterize the extreme points of the constraint set for the specific optimization problem in [9], characterization of extreme points is challenging in general, e.g., see [30]. Furthermore, in the settings of this paper, the associated optimization problem includes an inner maximization induced by the sensor’s robustness concern. Therefore the techniques developed in [9] cannot be used in this setting since the objective function is no longer linear in the optimization arguments due to the inner maximization. It is instead a convex function since the maximum of any family of linear/affine functions is a convex function [27]. Correspondingly, the solution does not necessarily lie at an extreme point of the constraint set. To be able to solve this optimization problem globally, in this paper we have shown that those linear matrix inequalities provide not only necessary but also sufficient conditions. This leads to a more comprehensive solution concept since it can be adopted in other sensor design settings in a straightforward way in order to obtain an equivalent tractable optimization problem, e.g., for the settings over imperfect measurements as we did here, infinite horizon as shown in [29] (based on the results of this paper), and several others.
Appendix B Proof of Lemma 1
Based on (16) and (12a), we obtain (17) through some algebra as detailed below. We focus on part of the cost induced by the controller of the system, e.g., (9), and part of the cost induced by the attacker, e.g., (10), separately.
Part- For notational brevity, let us first introduce the following matrices
[TABLE]
Then, the right hand side of (14) can be written in a compact form as
[TABLE]
in terms of the augmented vectors and . Correspondingly, for type- , we obtain
[TABLE]
where is the augmented vector of posteriors. Related to (9), this yields
[TABLE]
where we define ,
[TABLE]
and , which follows since due to the law of iterated expectations.
Part- The state can be written in terms of control-free state and control inputs as follows:
[TABLE]
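The decomposition can be checked by simulation. The sketch assumes time-invariant dynamics `x_{k+1} = A x_k + B u_k + w_k` and a control-free recursion `xo_{k+1} = A xo_k + w_k` started from the same initial state, so that the difference `x_k - xo_k` equals the accumulated effect of past inputs, `sum_j A^(k-1-j) B u_j`. The matrices, the zero-based indexing, and the names are illustrative assumptions; the paper's own indexing may differ.

```python
import numpy as np

rng = np.random.default_rng(5)
m, r, kappa = 2, 2, 6  # state dim, input dim, horizon (illustrative)

A = 0.5 * rng.standard_normal((m, m))
B = rng.standard_normal((m, r))
x0 = rng.standard_normal(m)
us = rng.standard_normal((kappa, r))   # arbitrary control inputs
ws = rng.standard_normal((kappa, m))   # shared process noise realization

# Simulate the controlled state and the control-free state side by side.
xs, xos = [x0], [x0]
for k in range(kappa):
    xs.append(A @ xs[-1] + B @ us[k] + ws[k])
    xos.append(A @ xos[-1] + ws[k])

# Compare x_k - xo_k with the accumulated effect of the inputs.
max_err = 0.0
for k in range(kappa + 1):
    drive = np.zeros(m)
    for j in range(k):
        drive += np.linalg.matrix_power(A, k - 1 - j) @ B @ us[j]
    max_err = max(max_err, np.abs(xs[k] - xos[k] - drive).max())

print(max_err)
```

Because the control-free state does not depend on the inputs, its statistics are unaffected by the control and signaling strategies, which is what makes it a convenient object for the analysis.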
Let us define
[TABLE]
Then (42) and (45) yield that for , we have
[TABLE]
where , , , , and for all we define
[TABLE]
Note that (47) follows since is a white noise and turns out to be an upper triangular (block) matrix.
Combining Parts and together, we obtain that faces the following problem:
[TABLE]
where and for are as described in (43), (44), (48), and (49). Based on the definition of , it can be shown that
[TABLE]
Therefore, the corresponding in (17) is defined by
[TABLE]
where is an block of , with indexing from the right-bottom to the left-top.
Appendix C Proof of Proposition 1
The necessity of the condition has been shown in [9, Lemma 3].
In order to show the sufficiency of the condition, suppose that a collection of positive semi-definite matrices satisfying (35) is given. Then satisfies
[TABLE]
Note that can be singular. Therefore let be the eigen-decomposition such that and . When we multiply the terms in (51) from the right by the unitary matrix and from the left by the transpose of the unitary matrix, i.e., , we obtain
[TABLE]
which implies that
[TABLE]
where we let be the corresponding partitioning, e.g., .
Since , we have [31, Observation 7.1.2]. However, the bottom-right block of the positive semi-definite matrix (the whole term) on the left-hand-side of the inequality (52), i.e., , must also be a positive semi-definite matrix, which implies . Therefore we can conclude that .
Next we invoke [32, Lemma 3] yielding . Therefore (52) can be written as
[TABLE]
We define , let be its eigen-decomposition, and let be its eigenvalues. Then (53) yields that and correspondingly , which implies that for all .
Since , , and , can be written as
[TABLE]
Since is not a singular matrix, (54) yields that there exists a bijective relation between and , i.e., given , we can compute and vice versa. Therefore, we can just focus on instead of . To this end, consider a signaling strategy , where . Then, the covariance matrix is given by
[TABLE]
Note that we can set and arbitrarily. Given the eigenvalues of , it can be verified that if we set and such that
[TABLE]
and the entries satisfy
[TABLE]
then we obtain exactly. Particularly, for such and , eigen-decomposition of yields that can be written as
[TABLE]
since unitary matrices satisfy . On the other hand, can be written as
[TABLE]
Therefore can be written as
[TABLE]
which follows from (56). Recall that . Therefore (57) is equivalent to (54), which verifies the claim. Note also that there always exist satisfying (56) since for all .
We have shown that given satisfying (51), we can select a signaling strategy such that exactly. Next, by following similar lines, we compute the associated signaling strategies for under the assumption that we have obtained them up to .
Suppose that for . Then, satisfies
[TABLE]
which is equivalent to
[TABLE]
Correspondingly, can be singular. Let be the eigen-decomposition such that and . When we multiply the terms in (58) from the right by the unitary matrix and from the left by the transpose of the unitary matrix, i.e., , we obtain
[TABLE]
Following the same reasoning as in the case , [32, Lemma 3] yields that there exists a symmetric matrix such that
[TABLE]
and there exists a bijective relation between and . Similarly, (59) and (60) yield that , which implies that has eigenvalues in the closed interval . Let be the eigen-decomposition and be the associated eigenvalues.
Furthermore, consider a signaling strategy , where . Then, the covariance matrix is given by
[TABLE]
which follows since
[TABLE]
due to the independence of the jointly Gaussian and . We can again set and arbitrarily. Given the eigenvalues of , it can be verified that if we set and such that
[TABLE]
and the entries satisfy
[TABLE]
then we would obtain exactly. Therefore, by induction, we conclude that for any satisfying (35), there exists a certain signaling strategy such that for all .
Appendix D Proof of Proposition 2
Since the objective function in (24) is continuous in the optimization arguments, and the constraint sets are decoupled and compact, the extreme value theorem and the maximum theorem (showing the continuity of parametric maximization under certain conditions [33]) yield the existence of a solution to (24).
Appendix E Proof of Theorem 1
Based on the existence result in Proposition 2, suppose that solves (24) and is the maximizer of (25) for . Since , there must be at least one type with positive weight. For example, suppose that the type has positive weight, i.e., . This implies that
[TABLE]
since by (26). Furthermore, this also implies that
[TABLE]
since if . These necessary conditions yield that (24) is equivalent to
[TABLE]
which is an SDP isolated from the distribution over the extended type set. Therefore, by searching over the extended type set , we can compute the minimum of (24), which is the minimum over . Once the minimum value is computed, can be computed according to the corresponding type, i.e., (28).
Appendix F Proof of Lemma 2
For a signaling strategy as described in (33), we would have
[TABLE]
where the last line follows since , for , is just a measurable function of while is already conditioned on . This yields that they both would lead to the same posterior. Recall that the best reaction of is a linear function of as described in (16). Therefore they both would lead to the same control inputs almost surely and, in turn, the same cost for .
Appendix G Closed-Form Expressions for Auxiliary Parameters Under Imperfect Measurements
Note that , , and can be written as
[TABLE]
- [1] J. Giraldo, E. Sarkar, A. A. Cardenas, M. Maniatakos, and M. Kantarcioglu, “Security and privacy in cyber-physical systems: A survey of surveys,” IEEE Design & Test, vol. 34, pp. 7–17, 2017.
- [2] A. Humayed, J. Lin, F. Li, and B. Luo, “Cyber-physical systems security – A survey,” IEEE Internet of Things Journal, vol. 4, no. 6, 2017.
- [3] N. Nelson, “The impact of Dragonfly malware on industrial control systems,” The SANS Institute, 2016.
- [4] V. Crawford and J. Sobel, “Strategic information transmission,” Econometrica, vol. 50, no. 6, pp. 1431–1451, 1982.
- [5] E. Kamenica and M. Gentzkow, “Bayesian persuasion,” American Economic Review, vol. 101, pp. 2590–2615, 2011.
- [6] T. Başar and G. J. Olsder, Dynamic Noncooperative Game Theory. Society for Industrial Mathematics (SIAM) Series in Classics in Applied Mathematics, 1999.
- [7] E. Kamenica, “Bayesian persuasion and information design,” Annual Review of Economics, vol. 11, 2019.
- [8] W. Tamura, “A theory of multidimensional information disclosure,” Working paper, available at SSRN 1987877, 2014.
