Hypothesis Testing under Subjective Priors and Costs as a Signaling Game

Serkan Sarıtaş; Sinan Gezici; Serdar Yüksel

arXiv:1906.04577·math.OC·September 9, 2019·IEEE Trans. Signal Process.


TL;DR

This paper analyzes binary signaling games with subjective priors and costs, exploring equilibrium existence and properties under different game-theoretic frameworks, and examining robustness to perturbations.

Contribution

It formulates signaling problems as Bayesian games under Nash and Stackelberg equilibria, deriving conditions for informative equilibria and analyzing robustness to perturbations.

Findings

1. Both informative and non-informative equilibria can exist under the Stackelberg and Nash assumptions.

2. In the team setup, an equilibrium typically exists and is informative.

3. Around the team setup, the Stackelberg equilibrium behavior is not robust to small perturbations in the priors and costs, whereas the Nash equilibrium is.


Equations

$$\mathrm{sgn}(x)=\begin{cases}-1 & \text{if } x<0\\ 0 & \text{if } x=0\\ 1 & \text{if } x>0\end{cases}.$$

$$\mathcal{H}_{0}:Y=S_{0}+N,\qquad \mathcal{H}_{1}:Y=S_{1}+N,$$

$$r(\delta)=\pi_{0}R_{0}(\delta)+\pi_{1}R_{1}(\delta),$$

$$R_{i}(\delta)=C_{0i}\mathsf{P}_{0i}+C_{1i}\mathsf{P}_{1i},$$

$$\delta:\Big\{\pi_{1}(C_{01}-C_{11})\,p_{1}(y)\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\pi_{0}(C_{10}-C_{00})\,p_{0}(y),$$

$$r^{j}(\mathcal{S},\delta)=\pi_{0}^{j}(C_{00}^{j}\mathsf{P}_{00}+C_{10}^{j}\mathsf{P}_{10})+\pi_{1}^{j}(C_{01}^{j}\mathsf{P}_{01}+C_{11}^{j}\mathsf{P}_{11}),$$

$$\mathbb{S}\triangleq\{\mathcal{S}=\{S_{0},S_{1}\}:|S_{0}|^{2}\leq P_{0},\ |S_{1}|^{2}\leq P_{1}\},$$

$$\begin{aligned}r^{t}(\mathcal{S}^{*},\delta^{*})&\leq r^{t}(\mathcal{S},\delta^{*})\quad\forall\,\mathcal{S}\in\mathbb{S},\\ r^{r}(\mathcal{S}^{*},\delta^{*})&\leq r^{r}(\mathcal{S}^{*},\delta)\quad\forall\,\delta\in\Delta.\end{aligned}$$

$$r^{t}(\mathcal{S}^{*},\delta^{*}_{\mathcal{S}^{*}})\leq r^{t}(\mathcal{S},\delta^{*}_{\mathcal{S}})\quad\forall\,\mathcal{S}\in\mathbb{S},\quad\text{where }\delta^{*}_{\mathcal{S}}\text{ satisfies }r^{r}(\mathcal{S},\delta^{*}_{\mathcal{S}})\leq r^{r}(\mathcal{S},\delta_{\mathcal{S}})\quad\forall\,\delta_{\mathcal{S}}\in\Delta.$$

$$\begin{aligned}r^{t}(\mathcal{S},\delta)&\ \Rightarrow\ C_{01}^{t}=C_{10}^{t}=\alpha\ \text{ and }\ C_{00}^{t}=C_{11}^{t}=1-\alpha,\\ r^{r}(\mathcal{S},\delta)&\ \Rightarrow\ C_{01}^{r}=C_{10}^{r}=1\ \text{ and }\ C_{00}^{r}=C_{11}^{r}=0.\end{aligned}$$

$$r^{j}(\mathcal{S},\delta)=\pi_{0}^{j}C_{00}^{j}+\pi_{1}^{j}C_{11}^{j}+\pi_{0}^{j}(C_{10}^{j}-C_{00}^{j})\mathsf{P}_{10}+\pi_{1}^{j}(C_{01}^{j}-C_{11}^{j})\mathsf{P}_{01},$$

$$\delta:\Big\{\zeta\,\frac{p_{1}(y)}{p_{0}(y)}\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\,\frac{\pi_{0}^{r}(C^{r}_{10}-C^{r}_{00})}{\pi_{1}^{r}(C^{r}_{01}-C^{r}_{11})}=\zeta\tau.$$

$\delta^{*}_{S_{0},S_{1}}$

$$\mathsf{P}_{10}=\mathcal{Q}\left(\zeta\left(\frac{\sigma\ln(\tau)}{|S_{1}-S_{0}|}+\frac{|S_{1}-S_{0}|}{2\sigma}\right)\right),$$

$$\begin{split}\frac{\mathrm{d}\,r^{t}(\mathcal{S},\delta)}{\mathrm{d}\,d}&=-\frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{(\ln\tau)^{2}}{2d^{2}}\right\}\exp\left\{-\frac{d^{2}}{8}\right\}\\ &\quad\times\Bigg(\pi_{0}^{t}\zeta(C^{t}_{10}-C^{t}_{00})\tau^{-\frac{1}{2}}\left(-\frac{\ln\tau}{d^{2}}+\frac{1}{2}\right)+\pi_{1}^{t}\zeta(C^{t}_{01}-C^{t}_{11})\tau^{\frac{1}{2}}\left(\frac{\ln\tau}{d^{2}}+\frac{1}{2}\right)\Bigg).\end{split}$$

$\frac{\mathrm{d}\,r^{t}(\mathcal{S},\delta)}{\mathrm{d}\,d}$

$$\begin{aligned}\pi_{0}^{t}(C_{10}^{t}-C_{00}^{t})\ &\overset{d^{*}=d_{\max}}{\underset{d^{*}=0}{\gtreqless}}\ \pi_{0}^{t}(C_{10}^{t}-C_{00}^{t})\,\mathcal{Q}\left(\zeta\left(\frac{\ln(\tau)}{d_{\max}}+\frac{d_{\max}}{2}\right)\right)+\pi_{1}^{t}(C_{01}^{t}-C_{11}^{t})\,\mathcal{Q}\left(\zeta\left(-\frac{\ln(\tau)}{d_{\max}}+\frac{d_{\max}}{2}\right)\right)\\ \pi_{0}^{t}(C_{10}^{t}-C_{00}^{t})\,\mathcal{Q}\left(\zeta\left(-\frac{\ln(\tau)}{d_{\max}}-\frac{d_{\max}}{2}\right)\right)\ &\overset{d^{*}=d_{\max}}{\underset{d^{*}=0}{\gtreqless}}\ \pi_{1}^{t}(C_{01}^{t}-C_{11}^{t})\,\mathcal{Q}\left(\zeta\left(-\frac{\ln(\tau)}{d_{\max}}+\frac{d_{\max}}{2}\right)\right)\\ \zeta k_{0}\tau\,\mathcal{Q}\left(\zeta\left(-\frac{\ln(\tau)}{d_{\max}}-\frac{d_{\max}}{2}\right)\right)\ &\overset{d^{*}=d_{\max}}{\underset{d^{*}=0}{\gtreqless}}\ \zeta k_{1}\,\mathcal{Q}\left(\zeta\left(-\frac{\ln(\tau)}{d_{\max}}+\frac{d_{\max}}{2}\right)\right).\end{aligned}$$

$$\frac{k_{0}\tau}{k_{1}}\,\mathcal{Q}\left(-\frac{\ln(\tau)}{d_{\max}}-\frac{d_{\max}}{2}\right)-\mathcal{Q}\left(-\frac{\ln(\tau)}{d_{\max}}+\frac{d_{\max}}{2}\right)\ \overset{d^{*}=d_{\max}}{\underset{d^{*}=0}{\gtreqless}}\ 0.$$

$$\frac{k_{1}}{k_{0}\tau}\,\mathcal{Q}\left(\frac{\ln(\tau)}{d_{\max}}-\frac{d_{\max}}{2}\right)-\mathcal{Q}\left(\frac{\ln(\tau)}{d_{\max}}+\frac{d_{\max}}{2}\right)\ \overset{d^{*}=d_{\max}}{\underset{d^{*}=0}{\gtreqless}}\ 0.$$

$$\zeta k_{1}\,\mathcal{Q}\left(\zeta\left(\frac{\ln(\tau)}{d_{\max}}-\frac{d_{\max}}{2}\right)\right)\ \overset{d^{*}=d_{\max}}{\underset{d^{*}=0}{\gtreqless}}\ \zeta k_{0}\tau\,\mathcal{Q}\left(\zeta\left(\frac{\ln(\tau)}{d_{\max}}+\frac{d_{\max}}{2}\right)\right).$$

$$\left(\frac{k_{1}}{k_{0}\tau}\right)^{\mathrm{sgn}(\ln(\tau))}\mathcal{Q}\left(\frac{|\ln(\tau)|}{d_{\max}}-\frac{d_{\max}}{2}\right)-\mathcal{Q}\left(\frac{|\ln(\tau)|}{d_{\max}}+\frac{d_{\max}}{2}\right)\ \overset{d^{*}=d_{\max}}{\underset{d^{*}=0}{\gtreqless}}\ 0.$$

$\tau$

$k_{0}$

$k_{1}$

$$\tau=\frac{\pi_{0}}{\pi_{1}},\quad k_{0}=\pi_{0}\pi_{1}(2\alpha-1),\quad k_{1}=\pi_{0}\pi_{1}(2\alpha-1).$$

$r^{t}(\mathcal{S},\delta)=\pi_{0}^{t}C_{00}^{t}$


Full text

Hypothesis Testing under Subjective Priors and Costs as a Signaling Game

(Footnote 1) This research was supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada. Part of this work [1] was presented at the 57th IEEE Conference on Decision and Control (CDC 2018).

Serkan Sarıtaş (1), Sinan Gezici (2), Serdar Yüksel (3)

(1) Division of Network and Systems Engineering, KTH Royal Institute of Technology, SE-10044, Stockholm, Sweden. Email: [email protected].

(2) Department of Electrical and Electronics Engineering, Bilkent University, 06800, Ankara, Turkey. Email: [email protected].

(3) Department of Mathematics and Statistics, Queen’s University, K7L 3N6, Kingston, Ontario, Canada. Email: [email protected].

Abstract

Many communication, sensor network, and networked control problems involve agents (decision makers) which have either misaligned objective functions or subjective probabilistic models. In the context of such setups, we consider binary signaling problems in which the decision makers (the transmitter and the receiver) have subjective priors and/or misaligned objective functions. Depending on the commitment nature of the transmitter to his policies, we formulate the binary signaling problem as a Bayesian game under either Nash or Stackelberg equilibrium concepts and establish equilibrium solutions and their properties. We show that there can be informative or non-informative equilibria in the binary signaling game under the Stackelberg and Nash assumptions, and derive the conditions under which an informative equilibrium exists for the Stackelberg and Nash setups. For the corresponding team setup, however, an equilibrium typically always exists and is always informative. Furthermore, we investigate the effects of small perturbations in priors and costs on equilibrium values around the team setup (with identical costs and priors), and show that the Stackelberg equilibrium behavior is not robust to small perturbations whereas the Nash equilibrium is.

Index terms— Signal detection, hypothesis testing, signaling games, Nash equilibrium, Stackelberg equilibrium, subjective priors.

1 INTRODUCTION

In many decentralized and networked control problems, decision makers have either misaligned criteria or have subjective priors, which necessitates solution concepts from game theory. For example, detecting attacks, anomalies, and malicious behavior with regard to security in networked control systems can be analyzed under a game theoretic perspective, see e.g., [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13].

In this paper, we consider signaling games, a class of two-player games of incomplete information in which an informed decision maker (transmitter or encoder) transmits information to another decision maker (receiver or decoder), in the hypothesis testing context. In the following, we first provide the preliminaries and introduce the problems considered in the paper, and then briefly review the related literature.

1.1 Notation

We denote random variables with capital letters, e.g., $Y$ , whereas possible realizations are shown by lower-case letters, e.g., $y$ . The absolute value of scalar $y$ is denoted by $|y|$ . The vectors are denoted by bold-faced letters, e.g., $\mathbf{y}$ . For vector $\mathbf{y}$ , $\mathbf{y}^{T}$ denotes the transpose and $\|\mathbf{y}\|$ denotes the Euclidean ( $L_{2}$ ) norm. $\mathds{1}_{\{D\}}$ represents the indicator function of an event $D$ , $\oplus$ stands for the exclusive-or operator, $\mathcal{Q}$ denotes the standard $\mathcal{Q}$ -function; i.e., $\mathcal{Q}(x)={1\over\sqrt{2\pi}}\int_{x}^{\infty}\exp\{-{t^{2}\over 2}\}{\rm{d}}t$ , and the sign of $x$ is defined as

$$\mathrm{sgn}(x)=\begin{cases}-1 & \text{if } x<0\\ 0 & \text{if } x=0\\ 1 & \text{if } x>0\end{cases}.$$
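Both functions from the notation are straightforward to evaluate numerically. The following is a minimal sketch using only Python's standard library and the identity $\mathcal{Q}(x)=\frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$; the names `q_function` and `sgn` are illustrative and do not come from the paper.

```python
import math


def q_function(x: float) -> float:
    """Standard Gaussian tail probability Q(x) = Pr(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def sgn(x: float) -> int:
    """Sign of x: -1 for x < 0, 0 for x = 0, and 1 for x > 0."""
    return (x > 0) - (x < 0)
```

The symmetry $\mathcal{Q}(-x)=1-\mathcal{Q}(x)$ is a convenient sanity check for any such implementation.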

1.2 Preliminaries

Consider a binary hypothesis-testing problem:

$$\mathcal{H}_{0}:Y=S_{0}+N,\qquad \mathcal{H}_{1}:Y=S_{1}+N,$$

where $Y$ is the observation (measurement) that belongs to the observation set $\Gamma=\mathbb{R}$ , $S_{0}$ and $S_{1}$ denote the deterministic signals under hypothesis $\mathcal{H}_{0}$ and hypothesis $\mathcal{H}_{1}$ , respectively, and $N$ represents Gaussian noise; i.e., $N\sim\mathcal{N}(0,\sigma^{2})$ . In the Bayesian setup, it is assumed that the prior probabilities of $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ are available, which are denoted by $\pi_{0}$ and $\pi_{1}$ , respectively, with $\pi_{0}+\pi_{1}=1$ .

In the conventional Bayesian framework, the aim of the receiver is to design the optimal decision rule (detector) based on $Y$ in order to minimize the Bayes risk, which is defined as [14]

$$r(\delta)=\pi_{0}R_{0}(\delta)+\pi_{1}R_{1}(\delta), \tag{2}$$

where $\delta$ is the decision rule, and $R_{i}(\cdot)$ is the conditional risk of the decision rule when hypothesis $\mathcal{H}_{i}$ is true for $i\in\{0,1\}$ . In general, a decision rule corresponds to a partition of the observation set $\Gamma$ into two subsets $\Gamma_{0}$ and $\Gamma_{1}$ , and the decision becomes $\mathcal{H}_{i}$ if the observation $y$ belongs to $\Gamma_{i}$ , where $i\in\{0,1\}$ .

The conditional risks in (2) can be calculated as

$$R_{i}(\delta)=C_{0i}\mathsf{P}_{0i}+C_{1i}\mathsf{P}_{1i}, \tag{3}$$

for $i\in\{0,1\}$ , where $C_{ji}\geq 0$ is the cost of deciding for $\mathcal{H}_{j}$ when $\mathcal{H}_{i}$ is true, and $\mathsf{P}_{ji}=\mathsf{Pr}(y\in\Gamma_{j}|\mathcal{H}_{i})$ represents the conditional probability of deciding for $\mathcal{H}_{j}$ given that $\mathcal{H}_{i}$ is true, where $i,j\in\{0,1\}$ [14].

It is well-known that the optimal decision rule $\delta$ which minimizes the Bayes risk is the following test, known as the likelihood ratio test (LRT):

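$$\delta:\Big\{\pi_{1}(C_{01}-C_{11})\,p_{1}(y)\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\pi_{0}(C_{10}-C_{00})\,p_{0}(y), \tag{4}$$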

where $p_{i}(y)$ represents the probability density function (PDF) of $Y$ under $\mathcal{H}_{i}$ for $i\in\{0,1\}$ [14].
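For the Gaussian model above, the test in (4) can be evaluated directly from the two densities. The sketch below is illustrative only: the function names are hypothetical, the cost matrix is indexed as `cost[j][i]` for $C_{ji}$, and ties are resolved in favor of $\mathcal{H}_{1}$, which is an arbitrary choice since either decision is optimal on equality.

```python
import math


def gaussian_pdf(y: float, mean: float, sigma: float) -> float:
    """Density of N(mean, sigma^2) evaluated at y."""
    return math.exp(-((y - mean) ** 2) / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))


def lrt_decide(y, s0, s1, sigma, pi0, pi1, cost):
    """Return 1 if the test in (4) decides H1 for observation y, else 0.

    cost[j][i] is the cost C_ji of deciding H_j when H_i is true.
    """
    lhs = pi1 * (cost[0][1] - cost[1][1]) * gaussian_pdf(y, s1, sigma)
    rhs = pi0 * (cost[1][0] - cost[0][0]) * gaussian_pdf(y, s0, sigma)
    return 1 if lhs >= rhs else 0  # tie-break toward H1 (assumption)


# Usage: equal priors and uniform costs with S0 = -1, S1 = 1 give a threshold at y = 0.
uniform_cost = [[0.0, 1.0], [1.0, 0.0]]
decision = lrt_decide(0.3, -1.0, 1.0, 1.0, 0.5, 0.5, uniform_cost)
```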

If the transmitter and the receiver have the same objective function specified by (2) and (3), then the signals can be designed to minimize the Bayes risk corresponding to the decision rule in (4). This leads to a conventional formulation which has been studied intensely in the literature [14, 15].

On the other hand, the transmitter and the receiver may have non-aligned Bayes risks. In particular, they may have different objective functions or priors: Let $C^{t}_{ji}$ and $C^{r}_{ji}$ represent the costs from the perspective of the transmitter and the receiver, respectively, where $i,j\in\{0,1\}$. Also let $\pi_{i}^{t}$ and $\pi_{i}^{r}$ for $i\in\{0,1\}$ denote the priors from the perspective of the transmitter and the receiver, respectively, with $\pi_{0}^{j}+\pi_{1}^{j}=1$, where $j\in\{t,r\}$. Here, the priors from the transmitter's and the receiver's perspectives are assumed to be mutually absolutely continuous with respect to each other; i.e., $\pi_{i}^{t}=0\Rightarrow\pi_{i}^{r}=0$ and $\pi_{i}^{r}=0\Rightarrow\pi_{i}^{t}=0$ for $i\in\{0,1\}$. This condition ensures that a hypothesis is impossible for the transmitter if and only if it is impossible for the receiver. The aim of the transmitter is to design the signals $\mathcal{S}=\{S_{0},S_{1}\}$ optimally to minimize his Bayes risk, whereas the aim of the receiver is to determine the optimal decision rule $\delta$ over all possible decision rules $\Delta$ to minimize his Bayes risk.

The Bayes risks are defined as follows for the transmitter and the receiver:

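$$r^{j}(\mathcal{S},\delta)=\pi_{0}^{j}(C_{00}^{j}\mathsf{P}_{00}+C_{10}^{j}\mathsf{P}_{10})+\pi_{1}^{j}(C_{01}^{j}\mathsf{P}_{01}+C_{11}^{j}\mathsf{P}_{11}), \tag{5}$$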

for $j\in\{t,r\}$ . Here, the transmitter performs the optimal signal design problem under the power constraint below:

$$\mathbb{S}\triangleq\{\mathcal{S}=\{S_{0},S_{1}\}:|S_{0}|^{2}\leq P_{0},\ |S_{1}|^{2}\leq P_{1}\},$$

where $P_{0}$ and $P_{1}$ denote the power limits [14, p. 62].
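To make (5) and the power constraint concrete, the sketch below evaluates a player's Bayes risk when the receiver applies a likelihood ratio test with a finite threshold $\tau>0$. It assumes receiver costs with $C^{r}_{10}>C^{r}_{00}$ and $C^{r}_{01}>C^{r}_{11}$ and distinct signals $S_{0}\neq S_{1}$, in which case the test reduces to a single threshold on $y$ and the error probabilities take the standard Gaussian form used in the code. All function names are hypothetical.

```python
import math


def q_function(x: float) -> float:
    """Standard Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_probabilities(s0, s1, sigma, tau):
    """Return (P10, P01) for an LRT with threshold tau > 0, assuming s0 != s1."""
    d = abs(s1 - s0) / sigma  # normalized distance between the signals
    p10 = q_function(math.log(tau) / d + d / 2.0)   # decide H1 when H0 is true
    p01 = q_function(-math.log(tau) / d + d / 2.0)  # decide H0 when H1 is true
    return p10, p01


def bayes_risk(prior, cost, p10, p01):
    """Bayes risk (5) for one player; cost[j][i] = C_ji, prior = (pi0, pi1)."""
    pi0, pi1 = prior
    return (pi0 * (cost[0][0] * (1.0 - p10) + cost[1][0] * p10)
            + pi1 * (cost[0][1] * p01 + cost[1][1] * (1.0 - p01)))


def is_feasible(s0, s1, power0, power1):
    """Check the power constraint |S0|^2 <= P0 and |S1|^2 <= P1."""
    return s0**2 <= power0 and s1**2 <= power1


# Usage: antipodal signals at the power limit, equal priors, uniform costs, tau = 1.
p10, p01 = error_probabilities(-1.0, 1.0, 1.0, 1.0)
risk = bayes_risk((0.5, 0.5), [[0.0, 1.0], [1.0, 0.0]], p10, p01)
```

With these inputs the risk is the error probability $\mathcal{Q}(1)$, since the normalized distance is $d=2$ and the threshold term vanishes.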

Although the transmitter and the receiver act sequentially in the game as described above, how and when the decisions are made and the nature of the commitments to the announced policies significantly affect the analysis of the equilibrium structure. Here, two different types of equilibria are investigated:

1. Nash equilibrium: the transmitter and the receiver make simultaneous decisions.

2. Stackelberg equilibrium: the transmitter and the receiver make sequential decisions where the transmitter is the leader and the receiver is the follower.

In this paper, the terms Nash game and the simultaneous-move game will be used interchangeably, and similarly, the Stackelberg game and the leader-follower game will be used interchangeably.

In the simultaneous-move game, the transmitter and the receiver announce their policies at the same time, and a pair of policies $(\mathcal{S}^{*},\delta^{*})$ is said to be a Nash equilibrium [16] if

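$$\begin{aligned}r^{t}(\mathcal{S}^{*},\delta^{*})&\leq r^{t}(\mathcal{S},\delta^{*})\quad\forall\,\mathcal{S}\in\mathbb{S},\\ r^{r}(\mathcal{S}^{*},\delta^{*})&\leq r^{r}(\mathcal{S}^{*},\delta)\quad\forall\,\delta\in\Delta.\end{aligned} \tag{6}$$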

As noted from the definition in (6), under the Nash equilibrium, each individual player chooses an optimal strategy given the strategies chosen by the other player.

In the leader-follower game, by contrast, the leader (transmitter) commits to and announces his optimal policy before the follower (receiver) does. The follower observes what the leader is committed to before choosing and announcing his own optimal policy. A pair of policies $(\mathcal{S}^{*},\delta^{*}_{\mathcal{S}^{*}})$ is said to be a Stackelberg equilibrium [16] if

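$$r^{t}(\mathcal{S}^{*},\delta^{*}_{\mathcal{S}^{*}})\leq r^{t}(\mathcal{S},\delta^{*}_{\mathcal{S}})\quad\forall\,\mathcal{S}\in\mathbb{S},\quad\text{where }\delta^{*}_{\mathcal{S}}\text{ satisfies }r^{r}(\mathcal{S},\delta^{*}_{\mathcal{S}})\leq r^{r}(\mathcal{S},\delta_{\mathcal{S}})\quad\forall\,\delta_{\mathcal{S}}\in\Delta. \tag{7}$$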

As observed from the definition in (7), the receiver takes his optimal action $\delta^{*}_{\mathcal{S}}$ after observing the policy of the transmitter $\mathcal{S}$ . Further, in the Stackelberg game (also often called Bayesian persuasion games in the economics literature, see [17] for a detailed review), the leader cannot backtrack on his commitment, but he has a leadership role since he can manipulate the follower by anticipating the actions of the follower.
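The nested structure of (7), in which the receiver best-responds to every candidate signal design and the transmitter optimizes while anticipating that response, can be illustrated with a brute-force search. The sketch below is not the paper's solution method. It assumes the receiver's best response is a likelihood ratio test with a finite threshold $\tau>0$ built from his own priors and costs (with $C^{r}_{10}>C^{r}_{00}$ and $C^{r}_{01}>C^{r}_{11}$), parameterizes the signal design by the normalized distance $d=|S_{1}-S_{0}|/\sigma$, and takes the largest feasible distance `d_max` as a given input. The grid excludes $d=0$, so the non-informative limit is only approached, never evaluated.

```python
import math


def q_function(x: float) -> float:
    """Standard Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bayes_risk(prior, cost, p10, p01):
    """Bayes risk of one player; cost[j][i] = C_ji, prior = (pi0, pi1)."""
    pi0, pi1 = prior
    return (pi0 * (cost[0][0] * (1.0 - p10) + cost[1][0] * p10)
            + pi1 * (cost[0][1] * p01 + cost[1][1] * (1.0 - p01)))


def stackelberg_grid_search(prior_t, cost_t, prior_r, cost_r, d_max, num=2000):
    """Grid search over d in (0, d_max] for the leader's risk-minimizing distance."""
    # Follower: LRT threshold computed from the receiver's own priors and costs.
    tau = (prior_r[0] * (cost_r[1][0] - cost_r[0][0])) / (
        prior_r[1] * (cost_r[0][1] - cost_r[1][1]))
    log_tau = math.log(tau)
    best_d, best_risk = None, math.inf
    for k in range(1, num + 1):
        d = d_max * k / num
        p10 = q_function(log_tau / d + d / 2.0)
        p01 = q_function(-log_tau / d + d / 2.0)
        # Leader: evaluates the induced error probabilities with his own priors and costs.
        risk = bayes_risk(prior_t, cost_t, p10, p01)
        if risk < best_risk:
            best_d, best_risk = d, risk
    return best_d, best_risk


# Usage: a team-like instance with identical priors and uniform costs.
uniform_cost = [[0.0, 1.0], [1.0, 0.0]]
best_d, best_risk = stackelberg_grid_search((0.5, 0.5), uniform_cost, (0.5, 0.5), uniform_cost, 2.0)
```

In this aligned instance the transmitter's risk equals $\mathcal{Q}(d/2)$, which is decreasing in $d$, so the search returns the largest grid point.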

If an equilibrium is achieved when $\mathcal{S}^{*}$ is non-informative (e.g., $S_{0}^{*}=S_{1}^{*}$ ) and $\delta^{*}$ uses only the priors (since the received message is useless), then we call such an equilibrium a non-informative (babbling) equilibrium [18, Theorem 1].

1.3 Two Motivating Setups

We present two different scenarios that fit into the binary signaling context discussed here and revisit these setups throughout the paper.

(Footnote 2) Besides the setups discussed here (and throughout the paper), the deception game can also be modeled as follows. In the deception game, the transmitter aims to fool the receiver by sending deceiving messages, and this goal can be realized by adjusting the transmitter costs as $C^{t}_{00}>C^{t}_{10}$ and $C^{t}_{11}>C^{t}_{01}$; i.e., the transmitter is penalized if the receiver correctly decodes the original hypothesis. Similar to the standard communication setups, the goal of the receiver is to correctly identify the hypothesis; i.e., $C^{r}_{00}<C^{r}_{10}$ and $C^{r}_{11}<C^{r}_{01}$.

1.3.1 Subjective Priors

In almost all practical applications, there is some mismatch between the true and an assumed probabilistic system/data model, which results in performance degradation. This performance loss due to mismatch has been studied extensively in various setups (see, e.g., [19], [20], [21] and references therein). In this paper, decentralization adds a further salient aspect: the mismatch is between the transmitter and the receiver. We note that in decentralized decision making, there have been a number of studies on the presence of a mismatch in the priors of decision makers [22, 23, 24]. In such setups, even when the objective functions to be optimized are identical, the presence of subjective priors alters the formulation from a team problem to a game problem (see [25, Section 12.2.3] for a comprehensive literature review on subjective priors, also from a statistical decision making perspective).

With this motivation, we will consider a setup where the transmitter and the receiver have different priors on the hypotheses $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ , and the costs of the transmitter and the receiver are identical. In particular, from transmitter’s perspective, the priors are $\pi_{0}^{t}$ and $\pi_{1}^{t}$ , whereas the priors are $\pi_{0}^{r}$ and $\pi_{1}^{r}$ from receiver’s perspective, and $C_{ji}=C^{t}_{ji}=C^{r}_{ji}$ for $i,j\in\{0,1\}$ . We will investigate equilibrium solutions for this setup throughout the paper.

1.3.2 Biased Transmitter Cost

(Footnote 3) Here, the cost refers to the objective function (Bayes risk), not the cost of a particular decision, $C_{ji}$. Note that, throughout the manuscript, the cost refers to $C_{ji}$ except when it is used in the phrase Biased Transmitter Cost.

A further application will be for a setup where the transmitter and the receiver have misaligned objective functions. Consider a binary signaling game in which the transmitter encodes a random binary signal $x=i$ as $\mathcal{H}_{i}$ by choosing the corresponding signal level $S_{i}$ for $i\in\{0,1\}$ , and the receiver decodes the received signal $y$ as $u=\delta(y)$ . Let the priors from the perspectives of the transmitter and the receiver be the same; i.e., $\pi_{i}=\pi_{i}^{t}=\pi_{i}^{r}$ for $i\in\{0,1\}$ , and the Bayes risks of the transmitter and the receiver be defined as $r^{t}(\mathcal{S},\delta)=\mathbb{E}[\mathds{1}_{\{1=(x\oplus u\oplus b)\}}]$ and $r^{r}(\mathcal{S},\delta)=\mathbb{E}[\mathds{1}_{\{1=(x\oplus u)\}}]$ , respectively, where $b$ is a random variable with a Bernoulli distribution; i.e., $\alpha\triangleq\mathsf{Pr}(b=0)=1-\mathsf{Pr}(b=1)$ , and $\alpha$ can be translated as the probability that the Bayes risks (objective functions) of the transmitter and the receiver are aligned. Then, the following relations can be observed:

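$$\begin{aligned}r^{t}(\mathcal{S},\delta)&=\mathbb{E}[\mathds{1}_{\{1=(x\oplus u\oplus b)\}}]=\alpha(\pi_{0}\mathsf{P}_{10}+\pi_{1}\mathsf{P}_{01})+(1-\alpha)(\pi_{0}\mathsf{P}_{00}+\pi_{1}\mathsf{P}_{11})\ \Rightarrow\ C_{01}^{t}=C_{10}^{t}=\alpha\ \text{ and }\ C_{00}^{t}=C_{11}^{t}=1-\alpha,\\ r^{r}(\mathcal{S},\delta)&=\mathbb{E}[\mathds{1}_{\{1=(x\oplus u)\}}]=\pi_{0}\mathsf{P}_{10}+\pi_{1}\mathsf{P}_{01}\ \Rightarrow\ C_{01}^{r}=C_{10}^{r}=1\ \text{ and }\ C_{00}^{r}=C_{11}^{r}=0.\end{aligned}$$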

Note that, in the formulation above, the misalignment between the Bayes risks of the transmitter and the receiver is due to the presence of the bias term $b$ (i.e., the discrepancy between the Bayes risks of the transmitter and the receiver) in the Bayes risk of the transmitter. This can be viewed as an analogous setup to what was studied in a seminal work due to Crawford and Sobel [18], who obtained the striking result that such a bias term in the objective function of the transmitter may have a drastic effect on the equilibrium characteristics; in particular, under regularity conditions, all equilibrium policies under a Nash formulation involve information hiding; for some extensions under quadratic criteria please see [26] and [27].
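The cost values implied by the biased objective can be checked by enumerating the bias bit $b$. The sketch below is illustrative, with hypothetical function names: it averages the indicator $\mathds{1}_{\{1=(x\oplus u\oplus b)\}}$ over $b$ with $\mathsf{Pr}(b=0)=\alpha$ for each pair of transmitted bit $x$ and decoded bit $u$.

```python
def transmitter_cost(x: int, u: int, alpha: float) -> float:
    """Expected value over b of 1{x xor u xor b = 1}, with Pr(b = 0) = alpha."""
    return sum(p for b, p in ((0, alpha), (1, 1.0 - alpha)) if (x ^ u ^ b) == 1)


def receiver_cost(x: int, u: int) -> float:
    """Receiver indicator cost 1{x xor u = 1}: 1 on a decoding error, 0 otherwise."""
    return float(x ^ u)


# Usage: tabulate the transmitter cost for every (decision u, hypothesis x) pair.
alpha = 0.8
cost_t = [[transmitter_cost(x, u, alpha) for x in (0, 1)] for u in (0, 1)]
```

Here `cost_t[u][x]` plays the role of $C^{t}_{ux}$: a decoding error costs $\alpha$ and a correct decision costs $1-\alpha$, so for $\alpha<1/2$ the transmitter is penalized more when the receiver decodes correctly.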

1.4 Related Literature

In game theory, Nash and Stackelberg equilibria are drastically different concepts. Both equilibrium concepts find applications depending on the assumptions on the leader, that is, the transmitter, in view of the commitment conditions. Stackelberg games are commonly used to model attacker-defender scenarios in security domains [28]. In many frameworks, the defender (leader) acts first by committing to a strategy, and the attacker (follower) chooses how and where to attack after observing defender’s choice. However, in some situations, security measures may not be observable for the attacker; therefore, a simultaneous-move game is preferred to model such situations; i.e., the Nash equilibrium analysis is needed [29]. These two concepts may have equilibria that are quite distinct: As discussed in [26, 17], in the Nash equilibrium case, building on [18], equilibrium properties possess different characteristics as compared to team problems; whereas for the Stackelberg case, the leader agent is restricted to be committed to his announced policy, which leads to similarities with team problem setups [30, 27, 31]. However, in the context of binary signaling, we will see that the distinction is not as sharp as it is in the case of quadratic signaling games [26, 17].

Standard binary hypothesis testing has been extensively studied over several decades under different setups [14, 15], which can also be viewed as a decentralized control/team problem involving a transmitter and a receiver who wish to minimize a common objective function. However, there exist many scenarios in which the analysis falls within the scope of game theory; either because the goals of the decision makers are misaligned, or because the probabilistic model of the system is not common knowledge among the decision makers.

A game theoretic perspective can be utilized for hypothesis testing problems in a variety of setups. For example, detecting attacks, anomalies, and malicious behavior in network security can be analyzed under the game theoretic perspective [2, 3, 4, 5, 6]. In this direction, the hypothesis testing and the game theory approaches can be utilized together to investigate attacker-defender type applications [7, 8, 9, 11, 12, 13, 10], multimedia source identification problems [32], inspection games [33, 34, 35], and deception games [36]. In [8], a Nash equilibrium of a zero-sum game between Byzantine (compromised) nodes and the fusion center (FC) is investigated. The strategy of the FC is to set the local sensor thresholds that are utilized in the likelihood-ratio tests, whereas the strategy of the Byzantines is to choose their flipping probability of the bit to be transmitted. In [9], a zero-sum game of a binary hypothesis testing problem is considered over finite alphabets. The attacker has control over the channel, and a randomized decision strategy is assumed for the defender. The dominant strategies in Neyman-Pearson and Bayesian setups are investigated under the Nash assumption. The authors of [34, 35] investigate both Nash and Stackelberg equilibria of a zero-sum inspection game where an inspector (environmental agency) verifies, with the help of randomly sampled measurements, whether the amount of pollutant released by the inspectee (management of an industrial plant) is higher than the permitted level. The inspector chooses a false alarm probability $\alpha$, and determines his optimal strategy over the set of all statistical tests with false alarm probability $\alpha$ to minimize the non-detection probability. On the other side, the inspectee chooses the signal levels (violation strategies) to maximize the non-detection probability. [10] considers a complete-information zero-sum game between a centralized detection network and a jammer equipped with multiple antennas and investigates pure strategy Nash equilibria for this game. The FC chooses the optimal threshold of a single-threshold rule in order to minimize his error probability based on the observations coming from multiple sensors, whereas the jammer disrupts the channel in order to maximize the FC’s error probability under instantaneous power constraints. However, unlike the setups described above, in this work, we assume an additive Gaussian noise channel, and in the game setup, a Bayesian hypothesis testing setup is considered in which the transmitter chooses signal levels to be transmitted and the receiver determines the optimal decision rule. Both players aim to minimize their individual Bayes risks, which leads to a nonzero-sum game. [36] investigates the perfect Bayesian Nash equilibrium (PBNE) solution of a cyber-deception game in which the strategically deceptive interaction between the deceiver (privately-informed player, sender) and the deceivee (uninformed player, receiver) is modeled by a signaling game framework. It is shown that the hypothesis testing game admits no separating (pure, fully informative) equilibria; only pooling and partially-separating-pooling equilibria exist; i.e., non-informative equilibria. Note that, in [36], the received message is designed by the deceiver (transmitter), whereas we assume a Gaussian channel between the players. Further, the belief of the receiver (deceivee) about the priors is affected by the design choices of the transmitter (deceiver), unlike this setup, in which constant beliefs are assumed.

Within the scope of the discussions above, the binary signaling problem investigated here can be motivated under different application contexts: subjective priors and the presence of a bias in the objective function of the transmitter compared to that of the receiver. In the former setup, players have a common goal but subjective prior information, which necessarily alters the setup from a team problem to a game problem. The latter one is the adaptation of the biased objective function of the transmitter in [18] to the binary signaling problem considered here. We discuss these further in the following.

1.5 Contributions

The main contributions of this paper can be summarized as follows: (i) A game theoretic formulation of the binary signaling problem is established under subjective priors and/or subjective costs. (ii) The corresponding Stackelberg and Nash equilibrium policies are obtained, and their properties (such as uniqueness and informativeness) are investigated. It is proved that an equilibrium is almost always informative for a team setup, whereas in the case of subjective priors and/or costs, it may cease to be informative. (iii) Furthermore, the robustness of equilibrium solutions to small perturbations in the priors or costs is established. It is shown that the game equilibrium behavior around the team setup is robust under the Nash assumption, whereas it is not robust under the Stackelberg assumption. (iv) For each of the results, applications to two motivating setups (involving subjective priors and the presence of a bias in the objective function of the transmitter) are presented.

In the conference version of this study [1], some of the results (in particular, the Nash and Stackelberg equilibrium solutions and their robustness properties) appear without proofs. Here we provide the full proofs of the main theorems and also include the continuity analysis of the equilibrium. Furthermore, the setup and analysis presented in [1] are extended to the multi-dimensional case and partially to the case with an average power constraint.

The remainder of the paper is organized as follows. The team setup, the Stackelberg setup, and the Nash setup of the binary signaling game are investigated in Section II, Section III, and Section IV, respectively. In Section V, the multi-dimensional setup is studied, and in Section VI, the setup under an average power constraint is investigated. The paper ends with Section VII, where some conclusions are drawn and directions for future research are highlighted.

2 TEAM THEORETIC ANALYSIS: CLASSICAL SETUP with IDENTICAL COSTS and PRIORS

Consider the team setup where the costs and the priors are assumed to be the same and available for both the transmitter and the receiver; i.e., $C_{ji}=C^{t}_{ji}=C^{r}_{ji}$ and $\pi_{i}=\pi_{i}^{t}=\pi_{i}^{r}$ for $i,j\in\{0,1\}$ . Thus the common Bayes risk becomes $r^{t}(\mathcal{S},\delta)=r^{r}(\mathcal{S},\delta)=\pi_{0}(C_{00}\mathsf{P}_{00}+C_{10}\mathsf{P}_{10})+\pi_{1}(C_{01}\mathsf{P}_{01}+C_{11}\mathsf{P}_{11})$ . The arguments for the proof of the following result follow from the standard analysis in the detection and estimation literature [14, 15]. However, for completeness, and for the relevance of the analysis in the following sections, a proof is included.

Theorem 2.1.

Let $\tau\triangleq{\pi_{0}(C_{10}-C_{00})\over\pi_{1}(C_{01}-C_{11})}$ . If $\tau\leq 0$ or $\tau=\infty$ , the team solution of the binary signaling setup is non-informative. Otherwise; i.e., if $0<\tau<\infty$ , the team solution is always informative.

Proof.

The players adjust $S_{0}$ , $S_{1}$ , and $\delta$ so that $r^{t}(\mathcal{S},\delta)=r^{r}(\mathcal{S},\delta)$ is minimized. The Bayes risk of the transmitter and the receiver in (5) can be written as follows (note that we still keep the parameters of the transmitter and the receiver distinct in order to be able to utilize the expressions for the game formulations):

[TABLE]

for $j\in\{t,r\}$ .

Here, first the receiver chooses the optimal decision rule $\delta^{*}_{S_{0},S_{1}}$ for any given signal levels $S_{0}$ and $S_{1}$ , and then the transmitter chooses the optimal signal levels $S_{0}^{*}$ and $S_{1}^{*}$ depending on the optimal receiver policy $\delta^{*}_{S_{0},S_{1}}$ .

Assuming non-zero priors $\pi_{0}^{t},\pi_{0}^{r},\pi_{1}^{t}$ , and $\pi_{1}^{r}$ , the different cases for the optimal receiver decision rule can be investigated by utilizing (4) as follows:

1. If $C^{r}_{01}>C^{r}_{11}$ ,

(a) if $C^{r}_{10}>C^{r}_{00}$ , the LRT in (4) must be applied to determine the optimal decision.

(b) if $C^{r}_{10}\leq C^{r}_{00}$ , the left-hand side (LHS) of the inequality in (4) is always greater than the right-hand side (RHS); thus, the receiver always chooses $\mathcal{H}_{1}$ .

2. If $C^{r}_{01}=C^{r}_{11}$ ,

(a) if $C^{r}_{10}>C^{r}_{00}$ , the LHS of the inequality in (4) is always less than the RHS; thus, the receiver always chooses $\mathcal{H}_{0}$ .

(b) if $C^{r}_{10}=C^{r}_{00}$ , the LHS and RHS of the inequality in (4) are equal; hence, the receiver is indifferent between deciding $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ .

(c) if $C^{r}_{10}<C^{r}_{00}$ , the LHS of the inequality in (4) is always greater than the RHS; thus, the receiver always chooses $\mathcal{H}_{1}$ .

3. If $C^{r}_{01}<C^{r}_{11}$ ,

(a) if $C^{r}_{10}\geq C^{r}_{00}$ , the LHS of the inequality in (4) is always less than the RHS; thus, the receiver always chooses $\mathcal{H}_{0}$ .

(b) if $C^{r}_{10}<C^{r}_{00}$ , the LRT in (4) must be applied to determine the optimal decision.

The analysis above is summarized in Table 1:

As it can be observed from Table 1, the LRT is needed only when $\tau\triangleq{\pi_{0}^{r}(C^{r}_{10}-C^{r}_{00})\over\pi_{1}^{r}(C^{r}_{01}-C^{r}_{11})}$ takes a finite positive value; i.e., $0<\tau<\infty$ . Otherwise; i.e., $\tau\leq 0$ or $\tau=\infty$ , since the receiver does not consider any message sent by the transmitter, the equilibrium is non-informative.
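The case analysis summarized in Table 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `receiver_regime` and its argument order are assumptions made here, strictly positive priors are assumed as in the text, and the costs are compared with exact equality, which is reasonable only for exactly representable inputs.

```python
import math


def receiver_regime(pi0, pi1, c00, c01, c10, c11):
    """Classify the receiver's optimal behavior (hypothetical helper).

    c_ji is the receiver's cost of deciding H_j when H_i is true.
    Assumes pi0 > 0 and pi1 > 0. Returns (regime, tau).
    """
    num = pi0 * (c10 - c00)
    den = pi1 * (c01 - c11)
    if den == 0:
        # Case 2 of the list: C01 = C11.
        if c10 > c00:
            return "always H0", math.inf
        if c10 == c00:
            return "indifferent", math.nan
        return "always H1", -math.inf
    tau = num / den
    if tau > 0:
        # Cases 1(a) and 3(b): the likelihood ratio test is needed.
        return "LRT", tau
    # tau <= 0: Case 1(b) if C01 > C11, Case 3(a) if C01 < C11.
    return ("always H1" if den > 0 else "always H0"), tau
```

For instance, `receiver_regime(0.5, 0.5, 0, 1, 1, 0)` falls into the LRT regime with $\tau=1$, whereas swapping the roles of the costs so that $C^{r}_{10}<C^{r}_{00}$ and $C^{r}_{01}>C^{r}_{11}$ yields a receiver that always chooses $\mathcal{H}_{1}$.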

For $0<\tau<\infty$ , let $\zeta\triangleq\text{sgn}(C^{r}_{01}-C^{r}_{11})$ (notice that $\zeta=\text{sgn}(C^{r}_{01}-C^{r}_{11})=\text{sgn}(C^{r}_{10}-C^{r}_{00})$ and $\zeta\in\{-1,1\}$ ). Then, the optimal decision rule for the receiver in (4) becomes

[TABLE]

Let the transmitter choose optimal signals $\mathcal{S}=\{S_{0},S_{1}\}$ . Then the measurements in (1) become $\mathcal{H}_{i}:Y\sim\mathcal{N}(S_{i},\sigma^{2})$ for $i\in\{0,1\}$ , as $N\sim\mathcal{N}(0,\sigma^{2})$ , and the optimal decision rule for the receiver is obtained by utilizing (9) as

[TABLE]

Since $\zeta Y(S_{1}-S_{0})$ is distributed as $\mathcal{N}\Big{(}\zeta(S_{1}-S_{0})S_{i},(S_{1}-S_{0})^{2}\sigma^{2}\Big{)}$ under $\mathcal{H}_{i}$ for $i\in\{0,1\}$ , the conditional probabilities can be written based on (10) as follows:

[TABLE]

and similarly, $\mathsf{P}_{01}$ can be derived as $\mathsf{P}_{01}=\mathcal{Q}\left(\zeta\left(-{\sigma\ln(\tau)\over|S_{1}-S_{0}|}+{|S_{1}-S_{0}|\over 2\sigma}\right)\right)$ .

By defining $d\triangleq{|S_{1}-S_{0}|\over\sigma}$ , $\mathsf{P}_{10}=\mathcal{Q}\left(\zeta\left({\ln(\tau)\over d}+{d\over 2}\right)\right)$ and $\mathsf{P}_{01}=\mathcal{Q}\left(\zeta\left(-{\ln(\tau)\over d}+{d\over 2}\right)\right)$ can be obtained. Then, the optimum behavior of the transmitter can be found by analyzing the derivative of the Bayes risk of the transmitter in (8) with respect to $d$ :

[TABLE]

In (12), if we utilize $C_{ji}=C^{t}_{ji}=C^{r}_{ji}$ , $\pi_{i}=\pi_{i}^{t}=\pi_{i}^{r}$ and $\tau={\pi_{0}(C_{10}-C_{00})\over\pi_{1}(C_{01}-C_{11})}$ , we obtain the following:

[TABLE]

Thus, in order to minimize the Bayes risk, the transmitter always prefers the maximum $d$ , i.e., $d^{*}={\sqrt{P_{0}}+\sqrt{P_{1}}\over\sigma}$ , and the equilibrium is informative. ∎
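As a numerical illustration of this monotonicity, the following sketch evaluates the common Bayes risk as a function of $d$ using the expressions for $\mathsf{P}_{10}$ and $\mathsf{P}_{01}$ given above. The function names and the example parameter values are assumptions chosen here for illustration; the $\mathcal{Q}$-function is computed from the complementary error function.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_probabilities(d, tau, zeta):
    """P10 and P01 as functions of d = |S1 - S0| / sigma (requires d > 0)."""
    log_tau = math.log(tau)
    p10 = q_function(zeta * (log_tau / d + d / 2.0))
    p01 = q_function(zeta * (-log_tau / d + d / 2.0))
    return p10, p01


def team_bayes_risk(d, pi0, pi1, c00, c01, c10, c11):
    """Common Bayes risk in the team setup; assumes 0 < tau < infinity."""
    tau = pi0 * (c10 - c00) / (pi1 * (c01 - c11))
    zeta = 1.0 if c01 > c11 else -1.0
    p10, p01 = error_probabilities(d, tau, zeta)
    return (pi0 * c00 + pi1 * c11
            + pi0 * (c10 - c00) * p10
            + pi1 * (c01 - c11) * p01)


# Illustrative values (assumed): unequal priors, 0-1 costs.
risks = [team_bayes_risk(d, 0.3, 0.7, 0.0, 1.0, 1.0, 0.0)
         for d in (0.5, 1.0, 2.0, 4.0)]
```

With these assumed values, the list `risks` is strictly decreasing, in line with the conclusion that the transmitter prefers the largest admissible $d$.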

Remark 2.1.

(i)

Note that there are two informative equilibrium points which satisfy $d^{*}={\sqrt{P_{0}}+\sqrt{P_{1}}\over\sigma}$ : $(S_{0}^{*},S_{1}^{*})=\left(-\sqrt{P_{0}},\sqrt{P_{1}}\right)$ and $(S_{0}^{*},S_{1}^{*})=\left(\sqrt{P_{0}},-\sqrt{P_{1}}\right)$ , and the decision rule of the receiver is chosen based on the rule in (10) accordingly. Actually, these equilibrium points are essentially unique; i.e., they result in the same Bayes risks for the transmitter and the receiver.

(ii)

In the non-informative equilibrium, the receiver chooses either $\mathcal{H}_{0}$ or $\mathcal{H}_{1}$ as depicted in Table 1. Since the message sent by the transmitter has no effect on the equilibrium, there are infinitely many ways of signal selection, which implies infinitely many equilibrium points. However, all these points are essentially unique; i.e., they result in the same Bayes risks for the transmitter and the receiver. Actually, if the receiver always chooses $\mathcal{H}_{i}$ , the Bayes risks of the players are $r^{j}(\mathcal{S},\delta)=\pi^{j}_{0}C^{j}_{i0}+\pi_{1}^{j}C_{i1}^{j}$ for $i\in\{0,1\}$ and $j\in\{t,r\}$ .

3 STACKELBERG GAME ANALYSIS

Under the Stackelberg assumption, first the transmitter (the leader agent) announces and commits to a particular policy, and then the receiver (the follower agent) acts accordingly. In this direction, first the transmitter chooses optimal signals $\mathcal{S}=\{S_{0},S_{1}\}$ to minimize his Bayes risk $r^{t}(\mathcal{S},\delta)$ , then the receiver chooses an optimal decision rule $\delta$ accordingly to minimize his Bayes risk $r^{r}(\mathcal{S},\delta)$ . Due to the sequential structure of the Stackelberg game, besides his own priors and costs, the transmitter also knows the priors and the costs of the receiver so that he can adjust his optimal policy accordingly. On the other hand, besides his own priors and costs, the receiver knows only the policy and the action (signals $\mathcal{S}=\{S_{0},S_{1}\}$ ) of the transmitter as he announces during the game-play; i.e., the costs and priors of the transmitter are not available to the receiver.

3.1 Equilibrium Solutions

Under the Stackelberg assumption, the equilibrium structure of the binary signaling game can be characterized as follows:

Theorem 3.1.

If $\tau\triangleq{\pi_{0}^{r}(C^{r}_{10}-C^{r}_{00})\over\pi_{1}^{r}(C^{r}_{01}-C^{r}_{11})}\leq 0$ or $\tau=\infty$ , the Stackelberg equilibrium of the binary signaling game is non-informative. Otherwise; i.e., if $0<\tau<\infty$ , let $d\triangleq{|S_{1}-S_{0}|\over\sigma}$ , $d_{\max}\triangleq{\sqrt{P_{0}}+\sqrt{P_{1}}\over\sigma}$ , $\zeta\triangleq\text{sgn}(C^{r}_{01}-C^{r}_{11})$ , $k_{0}\triangleq\pi_{0}^{t}\zeta(C^{t}_{10}-C^{t}_{00})\tau^{-{1\over 2}}$ , and $k_{1}\triangleq\pi_{1}^{t}\zeta(C^{t}_{01}-C^{t}_{11})\tau^{1\over 2}$ . Then, the Stackelberg equilibrium structure can be characterized as in Table 2, where $d^{*}=0$ stands for a non-informative equilibrium, and a nonzero $d^{*}$ corresponds to an informative equilibrium.

Before proving Theorem 3.1, we make the following remark:

Remark 3.1.

As we observed in Theorem 2.1, for a team setup, an equilibrium is almost always informative (practically, $0<\tau<\infty$ ), whereas in the case of subjective priors and/or costs, it may cease to be informative.

Proof.

By applying the same case analysis as in the proof of Theorem 2.1, it can be deduced that the equilibrium is non-informative if $\tau\leq 0$ or $\tau=\infty$ (see Table 1). Thus, $0<\tau<\infty$ can be assumed. Then, from (12), $r^{t}(\mathcal{S},\delta)$ is a monotone decreasing (increasing) function of $d$ if $k_{0}\left(-{\ln\tau\over d^{2}}+{1\over 2}\right)+k_{1}\left({\ln\tau\over d^{2}}+{1\over 2}\right)$ , or equivalently $d^{2}(k_{0}+k_{1})-2\ln\tau\,(k_{0}-k_{1})$ is positive (negative) $\forall d$ , where $k_{0}$ and $k_{1}$ are as defined in the theorem statement. Therefore, one of the following cases is applicable:

1. if $\ln\tau\;(k_{0}-k_{1})<0$ and $k_{0}+k_{1}\geq 0$ , then $d^{2}(k_{0}+k_{1})>2\ln\tau(k_{0}-k_{1})$ is satisfied $\forall d$ , which means that $r^{t}(\mathcal{S},\delta)$ is a monotone decreasing function of $d$ . Therefore, the transmitter tries to maximize $d$ ; i.e., chooses the maximum of $|S_{1}-S_{0}|$ under the constraints $|S_{0}|^{2}\leq P_{0}$ and $|S_{1}|^{2}\leq P_{1}$ , hence $d^{*}=\max{|S_{1}-S_{0}|\over\sigma}={\sqrt{P_{0}}+\sqrt{P_{1}}\over\sigma}=d_{\max}$ , which entails an informative equilibrium.

2. if $\ln\tau\;(k_{0}-k_{1})<0$ , $k_{0}+k_{1}<0$ , and $d_{\max}^{2}<\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}$ , then $r^{t}(\mathcal{S},\delta)$ is a monotone decreasing function of $d$ . Therefore, the transmitter maximizes $d$ as in the previous case.

3. if $\ln\tau\;(k_{0}-k_{1})<0$ , $k_{0}+k_{1}<0$ , and $d_{\max}^{2}\geq\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}$ , since $d^{2}(k_{0}+k_{1})-2\ln\tau\,(k_{0}-k_{1})$ is initially positive then negative, $r^{t}(\mathcal{S},\delta)$ is first decreasing and then increasing with respect to $d$ . Therefore, the transmitter chooses the optimal $d^{*}$ such that $(d^{*})^{2}=\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}$ which results in a minimal Bayes risk $r^{t}(\mathcal{S},\delta)$ for the transmitter. This is depicted in Figure 1.

4. if $\ln\tau\;(k_{0}-k_{1})\geq 0$ and $k_{0}+k_{1}<0$ , then $d^{2}(k_{0}+k_{1})<2\ln\tau(k_{0}-k_{1})$ is satisfied $\forall d$ , which means that $r^{t}(\mathcal{S},\delta)$ is a monotone increasing function of $d$ . Therefore, the transmitter tries to minimize $d$ ; i.e., chooses $S_{0}=S_{1}$ so that $d^{*}=0$ . In this case, the transmitter does not provide any information to the receiver and the decision rule of the receiver in (9) becomes $\delta:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ ; i.e., the receiver uses only the prior information, thus the equilibrium is non-informative.

5. if $\ln\tau\;(k_{0}-k_{1})\geq 0$ , $k_{0}+k_{1}\geq 0$ , and $d_{\max}^{2}<\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}$ , then $r^{t}(\mathcal{S},\delta)$ is a monotone increasing function of $d$ . Therefore, the transmitter chooses $S_{0}=S_{1}$ so that $d^{*}=0$ . Similar to the previous case, the equilibrium is non-informative.

6. if $\ln\tau\;(k_{0}-k_{1})\geq 0$ , $k_{0}+k_{1}\geq 0$ , and $d_{\max}^{2}\geq\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}$ , $r^{t}(\mathcal{S},\delta)$ is first an increasing then a decreasing function of $d$ , which makes the transmitter choose either the minimum $d$ or the maximum $d$ ; i.e., he chooses the one that results in a lower Bayes risk $r^{t}(\mathcal{S},\delta)$ for the transmitter. If the minimum Bayes risk is achieved when $d^{*}=0$ , then the equilibrium is non-informative; otherwise (i.e., when the minimum Bayes risk is achieved when $d^{*}=d_{\max}$ ), the equilibrium is an informative one. There are three possible cases:

(a) $\zeta(1-\tau)>0$ :

i. If $d^{*}=0$ , since $\delta:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ , the receiver always chooses $\mathcal{H}_{1}$ , thus $\mathsf{P}_{10}=\mathsf{P}_{11}=1$ and $\mathsf{P}_{00}=\mathsf{P}_{01}=0$ . Then, from (8), $r^{t}(\mathcal{S},\delta)=\pi_{0}^{t}C^{t}_{00}+\pi_{1}^{t}C^{t}_{11}+\pi_{0}^{t}(C^{t}_{10}-C^{t}_{00})$ .

ii. If $d^{*}=d_{\max}$ , by utilizing (8) and (11), $r^{t}(\mathcal{S},\delta)=\pi_{0}^{t}C^{t}_{00}+\pi_{1}^{t}C^{t}_{11}+\pi_{0}^{t}(C^{t}_{10}-C^{t}_{00})\mathcal{Q}\left(\zeta\left({\ln(\tau)\over d_{\max}}+{d_{\max}\over 2}\right)\right)+\pi_{1}^{t}(C^{t}_{01}-C^{t}_{11})\mathcal{Q}\left(\zeta\left(-{\ln(\tau)\over d_{\max}}+{d_{\max}\over 2}\right)\right)$ .

Then the decision of the transmitter is determined by the following:

[TABLE]

For (13), there are two possible cases:

i. $\zeta=1$ and $0<\tau<1$ : Since $\ln\tau(k_{0}-k_{1})\geq 0\Rightarrow k_{0}-k_{1}\leq 0$ and $k_{0}+k_{1}\geq 0$ , $k_{1}\geq 0$ always. Then, (13) becomes

[TABLE]

ii. $\zeta=-1$ and $\tau>1$ : Since $\ln\tau(k_{0}-k_{1})\geq 0\Rightarrow k_{0}-k_{1}\geq 0$ and $k_{0}+k_{1}\geq 0$ , $k_{0}\geq 0$ always. Then, (13) becomes

[TABLE]

(b) $\zeta(1-\tau)=0\Leftrightarrow\tau=1$ : Since $k_{0}+k_{1}\geq 0$ and $d^{2}(k_{0}+k_{1})-2\ln\tau\,(k_{0}-k_{1})\geq 0$ , $r^{t}(\mathcal{S},\delta)$ is a monotone decreasing function of $d$ , which implies $d^{*}=d_{\max}$ and informative equilibrium.

(c) $\zeta(1-\tau)<0$ :

i. If $d^{*}=0$ , since $\delta:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ , the receiver always chooses $\mathcal{H}_{0}$ , thus $\mathsf{P}_{00}=\mathsf{P}_{01}=1$ and $\mathsf{P}_{10}=\mathsf{P}_{11}=0$ . Then, from (8), $r^{t}(\mathcal{S},\delta)=\pi_{0}^{t}C^{t}_{00}+\pi_{1}^{t}C^{t}_{11}+\pi_{1}^{t}(C^{t}_{01}-C^{t}_{11})$ .

ii. If $d^{*}=d_{\max}$ , by utilizing (8) and (11), $r^{t}(\mathcal{S},\delta)=\pi_{0}^{t}C^{t}_{00}+\pi_{1}^{t}C^{t}_{11}+\pi_{0}^{t}(C^{t}_{10}-C^{t}_{00})\mathcal{Q}\left(\zeta\left({\ln(\tau)\over d_{\max}}+{d_{\max}\over 2}\right)\right)+\pi_{1}^{t}(C^{t}_{01}-C^{t}_{11})\mathcal{Q}\left(\zeta\left(-{\ln(\tau)\over d_{\max}}+{d_{\max}\over 2}\right)\right)$ .

Then, similar to the analysis in Case (a), the decision of the transmitter is determined by the following:

[TABLE]

For (14), there are two possible cases:

i. $\zeta=-1$ and $0<\tau<1$ : Since $\ln\tau(k_{0}-k_{1})\geq 0\Rightarrow k_{0}-k_{1}\leq 0$ and $k_{0}+k_{1}\geq 0$ , $k_{1}\geq 0$ always. Then, (14) becomes

[TABLE]

ii. $\zeta=1$ and $\tau>1$ : Since $\ln\tau(k_{0}-k_{1})\geq 0\Rightarrow k_{0}-k_{1}\geq 0$ and $k_{0}+k_{1}\geq 0$ , $k_{0}\geq 0$ always. Then, (14) becomes

[TABLE]

Thus, by combining all the cases, the comparison of the transmitter Bayes risks for $d^{*}=0$ and $d^{*}=d_{\max}$ reduces to the following rule:

[TABLE]

∎

The most interesting case is Case-3 in which $\ln\tau\;(k_{0}-k_{1})<0,k_{0}+k_{1}<0,$ and $d_{\max}^{2}\geq\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}$ , since in all other cases, the transmitter chooses either the minimum or the maximum distance between the signal levels. Further, for classical hypothesis-testing in the team setup, the optimal distance corresponds to the maximum separation [14]. However, in Case-3, there is an optimal distance $d^{*}=\sqrt{\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}}<d_{\max}$ that makes the Bayes risk of the transmitter minimum as it can be seen in Figure 1.
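The six cases of the proof can be sketched as a small classifier that returns the case index and the resulting $d^{*}$. This is a minimal sketch under stated assumptions: the function `stackelberg_d_star`, its tuple-based argument layout, and the example values are hypothetical; $0<\tau<\infty$ and $d_{\max}>0$ are assumed; and in Case-6 ties between the two risks are broken in favor of $d^{*}=0$, which is an arbitrary choice made here.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def stackelberg_d_star(pi_r, c_r, pi_t, c_t, d_max):
    """Return (case, d_star) following Cases 1-6 of the proof.

    pi_* = (pi0, pi1); c_* = (c00, c01, c10, c11), where c_ji is the cost
    of deciding H_j when H_i is true. Requires 0 < tau < infinity.
    """
    r00, r01, r10, r11 = c_r
    t00, t01, t10, t11 = c_t
    num = pi_r[0] * (r10 - r00)
    den = pi_r[1] * (r01 - r11)
    if den == 0 or num / den <= 0:
        raise ValueError("tau must be finite and positive")
    tau = num / den
    zeta = 1.0 if r01 > r11 else -1.0
    k0 = pi_t[0] * zeta * (t10 - t00) / math.sqrt(tau)
    k1 = pi_t[1] * zeta * (t01 - t11) * math.sqrt(tau)
    s = k0 + k1
    m = math.log(tau) * (k0 - k1)
    crit_sq = abs(2.0 * m / s) if s != 0 else math.inf

    if m < 0:
        if s >= 0:
            return 1, d_max
        if d_max ** 2 < crit_sq:
            return 2, d_max
        return 3, math.sqrt(crit_sq)
    if s < 0:
        return 4, 0.0
    if d_max ** 2 < crit_sq:
        return 5, 0.0

    # Case 6: compare the transmitter's risks at d = 0 and d = d_max.
    if tau == 1.0:
        return 6, d_max  # sub-case (b): the risk is decreasing in d
    base = pi_t[0] * t00 + pi_t[1] * t11
    if zeta * (1.0 - tau) > 0:
        risk_zero = base + pi_t[0] * (t10 - t00)  # receiver always picks H1
    else:
        risk_zero = base + pi_t[1] * (t01 - t11)  # receiver always picks H0
    log_tau = math.log(tau)
    risk_max = (base
                + pi_t[0] * (t10 - t00)
                * q_function(zeta * (log_tau / d_max + d_max / 2.0))
                + pi_t[1] * (t01 - t11)
                * q_function(zeta * (-log_tau / d_max + d_max / 2.0)))
    return 6, (d_max if risk_max < risk_zero else 0.0)
```

As an assumed example of Case-3, take receiver parameters `pi_r = (0.5, 0.5)`, `c_r = (0, 1, 2, 0)` (so $\tau=2$) and transmitter parameters `pi_t = (0.5, 0.5)`, `c_t = (0, 1, -3, 0)`. Then $k_{0}+k_{1}<0$ and $\ln\tau\,(k_{0}-k_{1})<0$, and with `d_max = 3.0` the sketch returns an interior distance with $(d^{*})^{2}=10\ln 2$, which is strictly below $d_{\max}^{2}$.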

Remark 3.2.

Similar to the team setup analysis, for every possible case in Table 2, there is more than one equilibrium point, and these points are essentially unique since the Bayes risks of the transmitter and the receiver depend on the signals only through $d$ . In particular,

(i)

for $d^{*}=d_{\max}$ , the equilibrium is informative, $(S_{0}^{*},S_{1}^{*})=\left(-\sqrt{P_{0}},\sqrt{P_{1}}\right)$ and $(S_{0}^{*},S_{1}^{*})=\left(\sqrt{P_{0}},-\sqrt{P_{1}}\right)$ are the only possible choices for the transmitter, which are essentially unique, and the decision rule of the receiver is chosen based on the rule in (10).

(ii)

for $d^{*}=\sqrt{\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}}$ , the equilibrium is informative, there are infinitely many choices for the transmitter and the receiver, and all of them are essentially unique; i.e., they result in the same Bayes risks for the transmitter and the receiver.

(iii)

for $d^{*}=0$ or $\tau\notin(0,\infty)$ , the equilibrium is non-informative and there are infinitely many equilibrium points which are essentially unique; see Remark 2.1-(ii).

3.2 Continuity and Robustness to Perturbations around the Team Setup

We now investigate the effects of small perturbations in priors and costs on equilibrium values. In particular, we consider the perturbations around the team setup; i.e., at the point of identical priors and costs.

Define the perturbation around the team setup as $\boldsymbol{\epsilon}=\{\epsilon_{\pi 0},\epsilon_{\pi 1},\epsilon_{00},\epsilon_{01},\epsilon_{10},\epsilon_{11}\}\in\mathbb{R}^{6}$ such that $\pi_{i}^{t}=\pi_{i}^{r}+\epsilon_{\pi i}$ and $C_{ji}^{t}=C_{ji}^{r}+\epsilon_{ji}$ for $i,j\in\{0,1\}$ (note that the transmitter parameters are perturbed around the receiver parameters which are assumed to be fixed). Then, for $0<\tau<\infty$ , at the point of identical priors and costs, small perturbations in both priors and costs imply $k_{0}=(\pi_{0}^{r}+\epsilon_{\pi 0})\zeta(C^{r}_{10}-C^{r}_{00}+\epsilon_{10}-\epsilon_{00})\tau^{-{1\over 2}}$ and $k_{1}=(\pi_{1}^{r}+\epsilon_{\pi 1})\zeta(C^{r}_{01}-C^{r}_{11}+\epsilon_{01}-\epsilon_{11})\tau^{1\over 2}$ . Since, for $0<\tau<\infty$ , $k_{0}=k_{1}=\sqrt{\pi^{r}_{0}\pi^{r}_{1}}\sqrt{(C^{r}_{10}-C^{r}_{00})(C^{r}_{01}-C^{r}_{11})}>0$ at the point of identical priors and costs, it is possible to obtain both positive and negative $(k_{0}-k_{1})$ by choosing the appropriate perturbation $\boldsymbol{\epsilon}$ around the team setup. Then, as can be observed from Table 2, the equilibrium may even change from an informative one to a non-informative one; hence, under the Stackelberg equilibrium, the policies are not continuous with respect to small perturbations around the point of identical priors and costs, and the equilibrium behavior is not robust to small perturbations in both priors and costs.
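The sign change of $(k_{0}-k_{1})$ under arbitrarily small perturbations can be checked numerically. In the following sketch, the helper `perturbed_k` and the numerical values are assumptions made for illustration; only the transmitter priors are perturbed, with $\epsilon_{\pi 1}=-\epsilon_{\pi 0}$ so that they still sum to one.

```python
import math


def perturbed_k(pi_r, c_r, eps_pi=(0.0, 0.0), eps_c=(0.0, 0.0, 0.0, 0.0)):
    """Return (tau, k0, k1) for transmitter parameters perturbed around the receiver's.

    pi_r = (pi0, pi1); c_r and eps_c are ordered as (c00, c01, c10, c11).
    Assumes 0 < tau < infinity.
    """
    r00, r01, r10, r11 = c_r
    e00, e01, e10, e11 = eps_c
    tau = pi_r[0] * (r10 - r00) / (pi_r[1] * (r01 - r11))
    zeta = 1.0 if r01 > r11 else -1.0
    k0 = (pi_r[0] + eps_pi[0]) * zeta * (r10 - r00 + e10 - e00) / math.sqrt(tau)
    k1 = (pi_r[1] + eps_pi[1]) * zeta * (r01 - r11 + e01 - e11) * math.sqrt(tau)
    return tau, k0, k1


pi_r, c_r = (0.3, 0.7), (0.0, 1.0, 1.0, 0.0)  # assumed team parameters
tau, k0_up, k1_up = perturbed_k(pi_r, c_r, eps_pi=(0.01, -0.01))
_, k0_dn, k1_dn = perturbed_k(pi_r, c_r, eps_pi=(-0.01, 0.01))

# Sign of ln(tau) * (k0 - k1), which selects between the cases of the proof.
m_up = math.log(tau) * (k0_up - k1_up)
m_dn = math.log(tau) * (k0_dn - k1_dn)
```

With these assumed values, `m_up` is negative while `m_dn` is positive, so the two perturbed games are governed by different cases of the proof of Theorem 3.1 even though both are arbitrarily close to the team setup as the perturbation shrinks.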

3.3 Application to the Motivating Examples

3.3.1 Subjective Priors

Referring to Section 1.3.1, for $0<\tau<\infty$ , the related parameters can be found as follows (note that the equilibrium is non-informative if $\tau\leq 0$ or $\tau=\infty$ ):

[TABLE]

Since $k_{0}+k_{1}>0$ , depending on the values of $\ln\tau\;(k_{0}-k_{1})$ , $d_{\max}^{2}$ , and $\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}$ , Case-1, Case-5 or Case-6 of Theorem 3.1 may hold as depicted in Table 3. Here, the decision rule in Case-6 is the same as (15).

3.3.2 Biased Transmitter Cost

Based on the arguments in Section 1.3.2, the related parameters can be found as follows:

[TABLE]

Then, $\ln\tau\;(k_{0}-k_{1})=0$ and $k_{0}+k_{1}=2\sqrt{\pi_{0}\pi_{1}}(2\alpha-1)$ ; hence, either Case-4 or Case-6 of Theorem 3.1 applies. Namely, if $\alpha<1/2$ (Case-4 of Theorem 3.1 applies), the transmitter chooses $S_{0}=S_{1}$ to minimize $d$ and the equilibrium is non-informative; i.e., he does not send any meaningful information to the receiver, and the receiver considers only the priors. If $\alpha=1/2$ , the transmitter has no control over his Bayes risk; hence, the equilibrium is non-informative. Otherwise; i.e., if $\alpha>1/2$ (Case-6 of Theorem 3.1 applies), the equilibrium is always informative. In other words, if $\alpha>1/2$ , the players act like a team. As can be seen, the informativeness of the equilibrium depends on $\alpha=\mathsf{Pr}(b=0)$ , the probability that the Bayes risks of the transmitter and the receiver are aligned.

4 NASH GAME ANALYSIS

Under the Nash assumption, the transmitter chooses optimal signals $\mathcal{S}=\{S_{0},S_{1}\}$ to minimize $r^{t}(\mathcal{S},\delta)$ , and the receiver chooses optimal decision rule $\delta$ to minimize $r^{r}(\mathcal{S},\delta)$ simultaneously. In this Nash setup, the transmitter and the receiver do not need to know the priors and the costs of each other; they need to know only their own priors and costs while calculating the best response to a given action of the other player. Further, there is no commitment between the transmitter and the receiver. Due to this difference, the equilibrium structure and robustness properties of the Nash equilibrium show significant differences from the ones in the Stackelberg equilibrium, as stated in the following.

In the analysis, we assume deterministic policies for the transmitter and receiver, and we restrict the receiver to use only the single-threshold rules. Although a single-threshold rule is sub-optimal for the receiver in general, it is always optimal for Gaussian densities, and always optimal for uni-modal densities under the maximum likelihood decision rule [14, 37].

4.1 Equilibrium Solutions

Under the Nash assumption, the equilibrium structure of the binary signaling game can be characterized as follows:

Theorem 4.1.

Let $\tau\triangleq{\pi_{0}^{r}(C^{r}_{10}-C^{r}_{00})\over\pi_{1}^{r}(C^{r}_{01}-C^{r}_{11})}$ and $\zeta\triangleq\text{sgn}(C^{r}_{01}-C^{r}_{11})$ , $\xi_{0}\triangleq{C^{t}_{10}-C^{t}_{00}\over C^{r}_{10}-C^{r}_{00}}$ , and $\xi_{1}\triangleq{C^{t}_{01}-C^{t}_{11}\over C^{r}_{01}-C^{r}_{11}}$ . If $\tau\leq 0$ or $\tau=\infty$ , then the Nash equilibrium of the binary signaling game is non-informative. Otherwise; i.e., if $0<\tau<\infty$ , the Nash equilibrium structure is as depicted in Table 4.

Proof.

Let the transmitter choose any signals $\mathcal{S}=\{S_{0},S_{1}\}$ . Assuming nonzero priors $\pi_{0}^{t},\pi_{0}^{r},\pi_{1}^{t}$ and $\pi_{1}^{r}$ , the optimal decision for the receiver is given by (10). By applying the same extreme case analysis as in the proof of Theorem 2.1, the equilibrium is non-informative if $\tau\leq 0$ or $\tau=\infty$ (see Table 1); thus, $0<\tau<\infty$ can be assumed.

Now assume that the receiver applies a single-threshold rule; i.e., $\delta:ay\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\eta$ , where $a\in\mathbb{R}$ and $\eta\in\mathbb{R}$ .

Remark 4.1.

Note that for $a=0$ , the receiver chooses either always $\mathcal{H}_{0}$ or always $\mathcal{H}_{1}$ without considering the value of $y$ , which implies a non-informative equilibrium. Therefore, $S_{0}^{*}=S_{1}^{*}$ , $a^{*}=0$ , and $\eta^{*}=\zeta(\tau-1)$ (i.e., the decision rule of the receiver is $\delta^{*}:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ ) constitute a non-informative equilibrium regardless of the values of the priors and costs of the players.

Thus, due to the remark above, it can be assumed that $a\neq 0$ holds. Since $aY\sim\mathcal{N}\Big{(}aS_{i},a^{2}\sigma^{2}\Big{)}$ under $\mathcal{H}_{i}$ for $i\in\{0,1\}$ , the conditional probabilities are $\mathsf{P}_{10}=\mathcal{Q}\left(\eta-aS_{0}\over|a|\sigma\right)$ and $\mathsf{P}_{01}=\mathcal{Q}\left(-{\eta-aS_{1}\over|a|\sigma}\right)$ . Then, the Bayes risk of the transmitter becomes

[TABLE]

Since the power constraints are $|S_{0}|^{2}\leq P_{0}$ and $|S_{1}|^{2}\leq P_{1}$ , the signals $S_{0}$ and $S_{1}$ can be regarded as independent, and the optimum signals $\mathcal{S}=\{S_{0},S_{1}\}$ can be found by analyzing the derivative of the Bayes risk of the transmitter with respect to the signals:

[TABLE]

Then, for $i\in\{0,1\}$ , the following cases hold:

1. $C^{t}_{1i}=C^{t}_{0i}$ $\Rightarrow$ $S_{i}$ has no effect on the Bayes risk of the transmitter.

2. $C^{t}_{1i}\neq C^{t}_{0i}$ $\Rightarrow$ $r^{t}(\mathcal{S},\delta)$ is a decreasing (increasing) function of $S_{i}$ if $a(C^{t}_{1i}-C^{t}_{0i})$ is negative (positive); thus the transmitter chooses the optimal signal levels as $S_{0}=-\text{sgn}(a)\text{sgn}(C^{t}_{10}-C^{t}_{00})\sqrt{P_{0}}$ and $S_{1}=\text{sgn}(a)\text{sgn}(C^{t}_{01}-C^{t}_{11})\sqrt{P_{1}}$ .

By using the expressions above, the cases can be listed as follows:

1. $\tau\leq 0$ or $\tau=\infty$ $\Rightarrow$ The equilibrium is non-informative.

2. $C^{t}_{10}=C^{t}_{00}$ (and/or $C^{t}_{01}=C^{t}_{11}$ ) $\Rightarrow$ $S_{0}$ (and/or $S_{1}$ ) has no effect on the Bayes risk of the transmitter; thus, it can be chosen arbitrarily by the transmitter. In this case, if the transmitter chooses $S_{0}=S_{1}$ (i.e., he does not send anything useful to the receiver) and the receiver applies the decision rule $\delta:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ (i.e., he considers only the prior information and totally discards the information sent by the transmitter), neither player can improve by deviating. Therefore, there exists a non-informative equilibrium.

3. Notice that, since $0<\tau<\infty$ is assumed, $\zeta=\text{sgn}(C^{r}_{01}-C^{r}_{11})=\text{sgn}(C^{r}_{10}-C^{r}_{00})$ is obtained. Now, assume that the decision rule of the receiver is $\delta:ay\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\eta$ . Then, the transmitter selects $S_{0}=-\text{sgn}(a)\text{sgn}(C^{t}_{10}-C^{t}_{00})\sqrt{P_{0}}$ and $S_{1}=\text{sgn}(a)\text{sgn}(C^{t}_{01}-C^{t}_{11})\sqrt{P_{1}}$ as optimal signals, and the decision rule becomes (10). By combining the best responses of the transmitter and the receiver,

[TABLE]

is obtained. Here, unless (17) is satisfied, the best responses of the transmitter and the receiver cannot match each other. Then, there are four possible cases:

(a) $\xi_{0}<0$ and $\xi_{1}<0$ $\Rightarrow$ (17) cannot be satisfied; thus, the best responses of the transmitter and the receiver do not match each other, which results in the absence of a Nash equilibrium for $a\neq 0$ . However, as discussed in Remark 4.1, $S_{0}^{*}=S_{1}^{*}$ , $a^{*}=0$ , and $\eta^{*}=\zeta(\tau-1)$ always constitute a non-informative equilibrium.

(b) $\xi_{0}<0$ and $\xi_{1}>0$ $\Rightarrow$ (17) is satisfied only when $\sqrt{P_{1}}>\sqrt{P_{0}}$ . If $\sqrt{P_{1}}<\sqrt{P_{0}}$ , (17) cannot be satisfied and the best responses of the transmitter and the receiver do not match each other, which results in the absence of a Nash equilibrium for $a\neq 0$ . However, due to Remark 4.1, for $a=0$ , there always exist non-informative equilibria. If $\sqrt{P_{1}}=\sqrt{P_{0}}$ (which implies $S_{0}=S_{1}$ ), then the receiver applies $\delta:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ as in Case-2, and the receiver chooses either always $\mathcal{H}_{0}$ or always $\mathcal{H}_{1}$ . Hence, there exists a non-informative equilibrium; i.e., the transmitter sends dummy signals, and the receiver makes a decision without considering the transmitted signals.

(c) $\xi_{0}>0$ and $\xi_{1}<0$ $\Rightarrow$ (17) is satisfied only when $\sqrt{P_{0}}>\sqrt{P_{1}}$ . If $\sqrt{P_{0}}<\sqrt{P_{1}}$ , (17) cannot be satisfied and the best responses of the transmitter and the receiver do not match each other, which results in the absence of a Nash equilibrium for $a\neq 0$ . However, due to Remark 4.1, for $a=0$ , there always exist non-informative equilibria. If $\sqrt{P_{0}}=\sqrt{P_{1}}$ (which implies $S_{0}=S_{1}$ ), then the receiver applies $\delta:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ as in Case-2, and the equilibrium is non-informative.

(d) $\xi_{0}>0$ and $\xi_{1}>0$ $\Rightarrow$ (17) is always satisfied; thus, the consistency is established, and there exists an informative equilibrium.

∎

As can be deduced from Table 4, when the costs related to both hypotheses are aligned for the transmitter and the receiver, the Nash equilibrium is informative. Here, $\xi_{i}$ indicates whether the transmitter and the receiver have similar preferences about hypothesis $\mathcal{H}_{i}$ ; i.e., if $\xi_{i}>0$ , then both the transmitter and the receiver aim to transmit and decode the hypothesis $\mathcal{H}_{i}$ correctly (or incorrectly), whereas if $\xi_{i}<0$ , they have conflicting goals over hypothesis $\mathcal{H}_{i}$ ; i.e., one of them tries to achieve the correct transmission and decoding, whereas the goal of the other player is the opposite. If the power limit corresponding to the hypothesis that has aligned costs for the transmitter and receiver is greater than the power limit of the other hypothesis, again, there exists an informative equilibrium. For the other cases, there may exist a non-informative equilibrium.
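The conditions obtained in the proof of Theorem 4.1 can be condensed into a short classifier. This is a hedged sketch rather than a reproduction of Table 4: the function name is hypothetical, $0<\tau<\infty$ is assumed, and the label "informative" only means that an informative equilibrium exists (by Remark 4.1, a non-informative equilibrium with $a^{*}=0$ exists in every case).

```python
def nash_equilibrium_type(xi0, xi1, p0, p1):
    """Classify the Nash equilibrium for 0 < tau < infinity (hypothetical helper).

    xi0, xi1: cost-alignment ratios of the transmitter and the receiver.
    p0, p1: power limits for the signals under H0 and H1.
    """
    if xi0 > 0 and xi1 > 0:
        return "informative"  # case (d): costs aligned for both hypotheses
    if xi0 < 0 and xi1 > 0:
        # case (b): informative only if the aligned hypothesis has more power
        return "informative" if p1 > p0 else "non-informative"
    if xi0 > 0 and xi1 < 0:
        # case (c): symmetric to case (b)
        return "informative" if p0 > p1 else "non-informative"
    # case (a), or xi_i = 0 (some S_i has no effect on the transmitter's risk)
    return "non-informative"
```

For example, `nash_equilibrium_type(-1.0, 1.0, 1.0, 4.0)` reports an informative equilibrium, whereas the same cost alignment with `p0 = 4.0` and `p1 = 1.0` does not.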

Remark 4.2.

(i)

We emphasize that, under the Nash formulation, while calculating the best responses, the transmitter and the receiver do not need to know the priors and the costs of each other. In particular,

– for a given decision rule of the receiver $\delta:ay\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\eta$ , the best response of the transmitter is $S_{0}^{\mathrm{BR}}=-\text{sgn}(a)\text{sgn}(C^{t}_{10}-C^{t}_{00})\sqrt{P_{0}}$ and $S_{1}^{\mathrm{BR}}=\text{sgn}(a)\text{sgn}(C^{t}_{01}-C^{t}_{11})\sqrt{P_{1}}$ .

– similarly, for a given signal design $S_{0}$ and $S_{1}$ of the transmitter, the best response of the receiver is $a^{\mathrm{BR}}=\zeta(S_{1}-S_{0})$ and $\eta^{\mathrm{BR}}=\zeta\left(\sigma^{2}\ln(\tau)+{(S_{1})^{2}-(S_{0})^{2}\over 2}\right)$ .

(ii)

As shown in Theorem 4.1, at the informative Nash equilibrium, the transmitter selects $S_{0}^{*}=-\text{sgn}(a^{*})\text{sgn}(C^{t}_{10}-C^{t}_{00})\sqrt{P_{0}}$ and $S_{1}^{*}=\text{sgn}(a^{*})\text{sgn}(C^{t}_{01}-C^{t}_{11})\sqrt{P_{1}}$ , and the decision rule of the receiver is $\delta^{*}:a^{*}y\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\eta^{*}$ , where $a^{*}=\zeta(S_{1}^{*}-S_{0}^{*})$ and $\eta^{*}=\zeta\left(\sigma^{2}\ln(\tau)+{(S_{1}^{*})^{2}-(S_{0}^{*})^{2}\over 2}\right)$ . Similar to the team and Stackelberg setup analyses, the informative equilibrium is essentially unique in the Nash case, too; i.e., if $(S_{0}^{*},S_{1}^{*},a^{*},\eta^{*})$ is an equilibrium point, then $(-S_{0}^{*},-S_{1}^{*},-a^{*},\eta^{*})$ is another equilibrium point, and they both result in the same Bayes risks for the transmitter and the receiver.

(iii)

For the non-informative equilibrium, as discussed in Remark 4.1, the optimal strategies of the transmitter and the receiver are determined by $S_{0}^{*}=S_{1}^{*}$ , $a^{*}=0$ , and $\eta^{*}=\zeta(\tau-1)$ ; which results in essentially unique equilibria (see Remark 2.1-(ii)).

Even though the transmitter and the receiver do not know the private parameters of each other, they can reach (converge to) an equilibrium. Note that, due to Remark 4.2-(i), for any arbitrary receiver strategy $(a,\eta)$ , the best response of the transmitter $(S_{0}^{\mathrm{BR}},S_{1}^{\mathrm{BR}})$ is one of the four possibilities: $(\sqrt{P_{0}},\sqrt{P_{1}})$ , $(-\sqrt{P_{0}},\sqrt{P_{1}})$ , $(\sqrt{P_{0}},-\sqrt{P_{1}})$ , or $(-\sqrt{P_{0}},-\sqrt{P_{1}})$ . Then, the corresponding best responses of the receiver are characterized by $(a^{\mathrm{BR}}_{1},\eta^{\mathrm{BR}})$ , $(a^{\mathrm{BR}}_{2},\eta^{\mathrm{BR}})$ , $(-a^{\mathrm{BR}}_{2},\eta^{\mathrm{BR}})$ , or $(-a^{\mathrm{BR}}_{1},\eta^{\mathrm{BR}})$ , respectively, where $a^{\mathrm{BR}}_{1}\triangleq\zeta(\sqrt{P_{1}}-\sqrt{P_{0}})$ , $a^{\mathrm{BR}}_{2}\triangleq\zeta(\sqrt{P_{1}}+\sqrt{P_{0}})$ , and $\eta^{\mathrm{BR}}=\zeta\left(\sigma^{2}\ln(\tau)+{P_{1}-P_{0}\over 2}\right)$ . By continuing these iterations, the best responses of the transmitter and the receiver can be combined and (17) is obtained. If their private parameters (priors and costs) satisfy the condition of the unique informative equilibrium in Table 4, their best responses match each other, so the best-response dynamics converge to an equilibrium (e.g., $(a,\eta)\rightarrow(\sqrt{P_{0}},\sqrt{P_{1}})\rightarrow(a^{\mathrm{BR}}_{1},\eta^{\mathrm{BR}})\rightarrow(\sqrt{P_{0}},\sqrt{P_{1}})\rightarrow\cdots$ ). Otherwise, the optimal strategies (best responses) of the transmitter and the receiver oscillate between two best responses; e.g., $(a,\eta)\rightarrow(\sqrt{P_{0}},\sqrt{P_{1}})\rightarrow(a^{\mathrm{BR}}_{1},\eta^{\mathrm{BR}})\rightarrow(-\sqrt{P_{0}},-\sqrt{P_{1}})\rightarrow(-a^{\mathrm{BR}}_{1},\eta^{\mathrm{BR}})\rightarrow(\sqrt{P_{0}},\sqrt{P_{1}})\rightarrow\cdots$ .
Then, they deduce that there exist only non-informative equilibria, in which $S_{0}^{*}=S_{1}^{*}$ , $a^{*}=0$ , and $\eta^{*}=\zeta(\tau-1)$ (see Remark 4.2-(iii)).
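The best-response iteration described above can be simulated directly from the expressions in Remark 4.2-(i). The sketch below is illustrative only: the function names, the stopping rule, and the example parameters are assumptions, a nonzero initial slope $a$ is assumed, and if a transmitter cost difference is exactly zero the corresponding signal is simply set to zero here (the text allows an arbitrary choice in that case).

```python
import math


def sgn(x):
    """Sign function with sgn(0) = 0."""
    return (x > 0) - (x < 0)


def transmitter_best_response(a, c_t, p0, p1):
    """Best-response signals to the receiver rule a*y >< eta; c_t = (c00, c01, c10, c11)."""
    t00, t01, t10, t11 = c_t
    s0 = -sgn(a) * sgn(t10 - t00) * math.sqrt(p0)
    s1 = sgn(a) * sgn(t01 - t11) * math.sqrt(p1)
    return s0, s1


def receiver_best_response(s0, s1, tau, zeta, sigma):
    """Best-response slope and threshold (a, eta) to the signals (s0, s1)."""
    a = zeta * (s1 - s0)
    eta = zeta * (sigma ** 2 * math.log(tau) + (s1 ** 2 - s0 ** 2) / 2.0)
    return a, eta


def best_response_dynamics(a, eta, c_t, tau, zeta, sigma, p0, p1, max_rounds=20):
    """Alternate best responses; report convergence or oscillation."""
    seen = []
    for _ in range(max_rounds):
        signals = transmitter_best_response(a, c_t, p0, p1)
        new_rule = receiver_best_response(signals[0], signals[1], tau, zeta, sigma)
        if new_rule == (a, eta):
            return "converged", signals, new_rule
        if (signals, new_rule) in seen:
            return "oscillates", signals, new_rule
        seen.append((signals, new_rule))
        a, eta = new_rule
    return "undecided", signals, new_rule
```

With aligned costs, e.g. `c_t = (0, 1, 1, 0)` and `zeta = 1`, the iteration started from `a = 1.0` settles after one round at signals $(-\sqrt{P_{0}},\sqrt{P_{1}})$. With fully conflicting costs, e.g. `c_t = (1, 0, 0, 1)`, the same call alternates between two signal pairs and is reported as oscillating.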

Note that, when $a\neq 0$ , the misalignment between the costs can even induce a scenario in which there exists no equilibrium. For $a\neq 0$ , the main reason for the absence of a non-informative (babbling) equilibrium under the Nash assumption is that, in the binary signaling game setup, the receiver is forced to make a decision. Using only the prior information, the receiver always chooses one of the hypotheses. Knowing this, the transmitter can manipulate his signaling strategy for his own benefit. However, after this manipulation, the receiver no longer keeps his decision rule the same; namely, the best response of the receiver changes with the signaling strategy of the transmitter, which in turn entails another change in the best response of the transmitter. Due to such an infinite recursion, the optimal policies of the transmitter and the receiver keep changing, and thus, there does not exist a pure Nash equilibrium unless $a=0$ ; i.e., due to Remark 4.1, there always exist non-informative equilibria with $S_{0}^{*}=S_{1}^{*}$ , $a^{*}=0$ , and $\eta^{*}=\zeta(\tau-1)$ .
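The convergence and oscillation behaviors described above can be illustrated with a minimal Python sketch. It assumes the scalar best responses $S_{0}=-\text{sgn}(a)\,\text{sgn}(C^{t}_{10}-C^{t}_{00})\sqrt{P_{0}}$ and $S_{1}=\text{sgn}(a)\,\text{sgn}(C^{t}_{01}-C^{t}_{11})\sqrt{P_{1}}$ for the transmitter, and $a=\zeta(S_{1}-S_{0})$ , $\eta=\zeta\left(\sigma^{2}\ln(\tau)+{S_{1}^{2}-S_{0}^{2}\over 2}\right)$ for the receiver, with $0<\tau<\infty$ . The function names and the cost layout `C[j][i]` (cost of deciding $\mathcal{H}_{j}$ when $\mathcal{H}_{i}$ is true) are illustrative choices, not notation from the paper.

```python
import math


def sgn(x):
    """Sign function returning -1, 0, or 1."""
    return (x > 0) - (x < 0)


def transmitter_best_response(a, Ct, P0, P1):
    """Peak-power signal levels against a receiver rule a*y >< eta (a != 0).

    Ct[j][i] is the transmitter's cost of deciding H_j when H_i is true.
    """
    s0 = -sgn(a) * sgn(Ct[1][0] - Ct[0][0]) * math.sqrt(P0)
    s1 = sgn(a) * sgn(Ct[0][1] - Ct[1][1]) * math.sqrt(P1)
    return s0, s1


def receiver_best_response(s0, s1, Cr, pi0_r, sigma):
    """Single-threshold rule (a, eta) against signal levels (s0, s1).

    Assumes 0 < tau < infinity, so the receiver cost differences are nonzero.
    """
    pi1_r = 1.0 - pi0_r
    tau = pi0_r * (Cr[1][0] - Cr[0][0]) / (pi1_r * (Cr[0][1] - Cr[1][1]))
    if not 0 < tau < math.inf:
        raise ValueError("sketch only covers 0 < tau < infinity")
    zeta = sgn(Cr[0][1] - Cr[1][1])
    a = zeta * (s1 - s0)
    eta = zeta * (sigma ** 2 * math.log(tau) + (s1 ** 2 - s0 ** 2) / 2)
    return a, eta


def run_dynamics(a0, Ct, Cr, pi0_r, P0, P1, sigma, steps=6):
    """Alternate best responses, starting from a receiver slope a0 != 0."""
    a, history = a0, []
    for _ in range(steps):
        s0, s1 = transmitter_best_response(a, Ct, P0, P1)
        a, eta = receiver_best_response(s0, s1, Cr, pi0_r, sigma)
        history.append((s0, s1, a, eta))
    return history
```

With identical costs for both players, every iterate in the returned history is the same, which corresponds to the convergent case; if the transmitter's cost differences have the opposite sign to the receiver's (so that $\xi_{0}<0$ and $\xi_{1}<0$ ), the sign of $a$ alternates from one iterate to the next, which corresponds to the oscillating case.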

4.2 Continuity and Robustness to Perturbations around the Team Setup

Similar to that in Section 3.2 for the Stackelberg setup, the effects of small perturbations in priors and costs on equilibrium values around the team setup are investigated for the Nash setup as follows:

Define the perturbation around the team setup as $\boldsymbol{\epsilon}=\{\epsilon_{\pi 0},\epsilon_{\pi 1},\epsilon_{00},\epsilon_{01},\epsilon_{10},\epsilon_{11}\}\in\mathbb{R}^{6}$ such that $\pi_{i}^{t}=\pi_{i}^{r}+\epsilon_{\pi i}$ and $C_{ji}^{t}=C_{ji}^{r}+\epsilon_{ji}$ for $i,j\in\{0,1\}$ (note that the transmitter parameters are perturbed around the receiver parameters which are assumed to be fixed). Then, for $0<\tau<\infty$ , at the point of identical priors and costs, small perturbations in priors and costs imply $\xi_{0}={C^{r}_{10}-C^{r}_{00}+\epsilon_{10}-\epsilon_{00}\over C^{r}_{10}-C^{r}_{00}}$ and $\xi_{1}={C^{r}_{01}-C^{r}_{11}+\epsilon_{01}-\epsilon_{11}\over C^{r}_{01}-C^{r}_{11}}$ . As it can be seen, the Nash equilibrium is not affected by small perturbations in priors. Further, since $\xi_{0}=\xi_{1}=1$ at the point of identical priors and costs for $0<\tau<\infty$ , as long as the perturbation $\boldsymbol{\epsilon}$ is chosen such that $\Big{\lvert}{\epsilon_{10}-\epsilon_{00}\over C^{r}_{10}-C^{r}_{00}}\Big{\rvert}<1$ and $\Big{\lvert}{\epsilon_{01}-\epsilon_{11}\over C^{r}_{01}-C^{r}_{11}}\Big{\rvert}<1$ , we always obtain positive $\xi_{0}$ and $\xi_{1}$ in Table 4. Thus, under the Nash assumption, the equilibrium behavior is robust to small perturbations in both priors and costs.

For the continuity analysis, first consider a non-informative equilibrium; i.e., the policies are $S_{0}^{*}=S_{1}^{*}$ , $a^{*}=0$ , and $\eta^{*}=\zeta(\tau-1)$ , which are independent of the values of the priors and costs of the players. Thus, consider when $a\neq 0$ ; i.e., an informative equilibrium: if the priors and costs are perturbed around the team setup, $S_{0}=-\text{sgn}(a)\text{sgn}(C^{r}_{10}-C^{r}_{00}+\epsilon_{10}-\epsilon_{00})\sqrt{P_{0}}$ and $S_{1}=\text{sgn}(a)\text{sgn}(C^{r}_{01}-C^{r}_{11}+\epsilon_{01}-\epsilon_{11})\sqrt{P_{1}}$ are obtained. As long as the perturbation $\boldsymbol{\epsilon}$ is chosen such that $\Big{\lvert}{\epsilon_{10}-\epsilon_{00}\over C^{r}_{10}-C^{r}_{00}}\Big{\rvert}<1$ and $\Big{\lvert}{\epsilon_{01}-\epsilon_{11}\over C^{r}_{01}-C^{r}_{11}}\Big{\rvert}<1$ , the changes in $\eta$ , $S_{0}$ and $S_{1}$ are continuous with respect to perturbations; actually, the values of the equilibrium parameters remain constant; i.e., either $(S_{0}^{*},S_{1}^{*},a^{*},\eta^{*})=\left(-\zeta\sqrt{P_{0}},\zeta\sqrt{P_{1}},(\sqrt{P_{0}}+\sqrt{P_{1}}),\zeta\left(\sigma^{2}\ln(\tau)+{S_{1}^{2}-S_{0}^{2}\over 2}\right)\right)$ or the essentially equivalent one $(S_{0}^{*},S_{1}^{*},a^{*},\eta^{*})=\left(\zeta\sqrt{P_{0}},-\zeta\sqrt{P_{1}},-(\sqrt{P_{0}}+\sqrt{P_{1}}),\zeta\left(\sigma^{2}\ln(\tau)+{S_{1}^{2}-S_{0}^{2}\over 2}\right)\right)$ holds. Thus, the policies are continuous with respect to small perturbations around the point of identical priors and costs.
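The robustness condition above can be checked numerically. The following Python sketch evaluates $\xi_{0}$ and $\xi_{1}$ for a cost perturbation around the team setup and tests the sufficient condition on the perturbation size. The helper names and the layout `C[j][i]` for $C_{ji}$ are assumptions made for illustration; prior perturbations are omitted because they do not enter $\xi_{0}$ or $\xi_{1}$ .

```python
def xi_after_perturbation(Cr, eps):
    """Return (xi0, xi1) when the transmitter costs are Cr[j][i] + eps[j][i].

    Cr[j][i] is the receiver's cost of deciding H_j when H_i is true.
    """
    d0 = Cr[1][0] - Cr[0][0]
    d1 = Cr[0][1] - Cr[1][1]
    xi0 = (d0 + eps[1][0] - eps[0][0]) / d0
    xi1 = (d1 + eps[0][1] - eps[1][1]) / d1
    return xi0, xi1


def perturbation_is_small(Cr, eps):
    """Sufficient condition from the text: both relative changes are below 1."""
    d0 = Cr[1][0] - Cr[0][0]
    d1 = Cr[0][1] - Cr[1][1]
    return (abs((eps[1][0] - eps[0][0]) / d0) < 1
            and abs((eps[0][1] - eps[1][1]) / d1) < 1)
```

Whenever `perturbation_is_small` returns `True`, both values returned by `xi_after_perturbation` are positive, so the signs that determine $S_{0}$ and $S_{1}$ are unchanged.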

4.3 Application to the Motivating Examples

4.3.1 Subjective Priors

The related parameters are $\tau={\pi_{0}^{r}(C_{10}-C_{00})\over\pi_{1}^{r}(C_{01}-C_{11})}$ , $\xi_{0}=1$ , and $\xi_{1}=1$ . Thus, if $\tau\leq 0$ or $\tau=\infty$ , the equilibrium is non-informative; otherwise (i.e., if $0<\tau<\infty$ ), there always exists a unique informative equilibrium.
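As a small illustration of the rule above, the Python sketch below computes $\tau$ from the receiver prior and the common costs and reports whether the informative regime $0<\tau<\infty$ applies. The function names, the convention of returning infinity when only the denominator vanishes, and the layout `C[j][i]` for $C_{ji}$ are assumptions of this sketch.

```python
import math


def tau_subjective_priors(pi0_r, C):
    """tau = pi0_r (C10 - C00) / (pi1_r (C01 - C11)), with pi1_r = 1 - pi0_r."""
    num = pi0_r * (C[1][0] - C[0][0])
    den = (1.0 - pi0_r) * (C[0][1] - C[1][1])
    if den == 0:
        # Assumed convention: infinite tau when only the denominator vanishes.
        return math.inf if num != 0 else math.nan
    return num / den


def has_informative_equilibrium(pi0_r, C):
    """True exactly when 0 < tau < infinity."""
    tau = tau_subjective_priors(pi0_r, C)
    return 0 < tau < math.inf
```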

4.3.2 Biased Transmitter Cost

Based on the arguments in Section 1.3.2, the related parameters can be found as follows:

[TABLE]

If $\alpha>1/2$ (Case-3-d of Theorem 4 applies), the players act like a team and the equilibrium is informative. If $\alpha=1/2$ (Case-2 of Theorem 4 applies), the equilibrium is non-informative. Otherwise; i.e., if $\alpha<1/2$ (Case-3-a of Theorem 4 applies), there exist non-informative equilibria. As it can be seen, the existence of the equilibrium depends on $\alpha=\mathsf{Pr}(b=0)$ , the probability that the Bayes risks of the transmitter and the receiver are aligned.

5 EXTENSION to the MULTI-DIMENSIONAL CASE

When the transmitter sends a multi-dimensional signal over a multi-dimensional channel, or the receiver takes multiple samples from the observed waveform, the scalar analysis considered so far no longer applies, and the vector case must be investigated. To this end, the binary hypothesis-testing problem considered above is modified as

[TABLE]

where $\mathbf{Y}$ is the observation (measurement) vector that belongs to the observation set $\Gamma=\mathbb{R}^{n}$ , $\mathbf{S}_{0}$ and $\mathbf{S}_{1}$ denote the deterministic signals under hypotheses $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ , respectively, which are chosen from the set $\mathbb{S}\triangleq\{\mathcal{S}:\lVert\mathbf{S}_{0}\rVert^{2}\leq P_{0}\,,\;\lVert\mathbf{S}_{1}\rVert^{2}\leq P_{1}\}$ , and $\mathbf{N}$ represents a zero-mean Gaussian noise vector with the positive definite covariance matrix $\Sigma$ ; i.e., $\mathbf{N}\sim\mathcal{N}(\mathbf{0},\Sigma)$ . All the other parameters ( $\pi_{i}^{k}$ and $C^{k}_{ji}$ for $i,j\in\{0,1\}$ and $k\in\{t,r\}$ ) and their definitions remain unchanged.

5.1 Team Setup Analysis

Theorem 5.1.

Theorem 2.1 also holds for the vector case: if $0<\tau<\infty$ , the team solution is always informative; otherwise, there exist only non-informative equilibria.

Proof.

Let the transmitter choose optimal signals $\mathcal{S}=\{\mathbf{S}_{0},\mathbf{S}_{1}\}$ . Then the measurements become $\mathcal{H}_{i}:\mathbf{Y}\sim\mathcal{N}(\mathbf{S}_{i},\Sigma)$ for $i\in\{0,1\}$ . As in the scalar case in Theorem 2.1, the equilibrium is non-informative for $\tau\leq 0$ or $\tau=\infty$ ; hence, $0<\tau<\infty$ can be assumed. Similar to (10), the optimal decision rule for the receiver is obtained by utilizing (9) as

[TABLE]

Since, under hypothesis $\mathcal{H}_{i}$ , $\zeta(\mathbf{S}_{1}-\mathbf{S}_{0})^{T}\Sigma^{-1}\mathbf{Y}\sim\mathcal{N}\left(\zeta(\mathbf{S}_{1}-\mathbf{S}_{0})^{T}\Sigma^{-1}\mathbf{S}_{i},(\mathbf{S}_{1}-\mathbf{S}_{0})^{T}\Sigma^{-1}(\mathbf{S}_{1}-\mathbf{S}_{0})\right)$ for $i\in\{0,1\}$ , by defining $d^{2}\triangleq(\mathbf{S}_{1}-\mathbf{S}_{0})^{T}\Sigma^{-1}(\mathbf{S}_{1}-\mathbf{S}_{0})$ , the conditional probabilities can be written as follows:

[TABLE]

Notice that the conditional probabilities are the same in (11) and (19); therefore, in the vector case, the equilibrium is always informative, and the transmitter always prefers the maximum distance $d$ , as in the scalar case. However, selecting optimal vector signals is not as trivial as in the scalar case; see [14, pp. 61–63] for details. Since the eigenvector with the largest (smallest) eigenvalue of $\Sigma$ corresponds to the direction along which the noise is most (least) powerful, signaling in the least noisy direction results in the highest signal-to-noise power ratio for the system. Accordingly, the optimum signals are $\mathbf{S}_{0}=\pm\sqrt{P_{0}}{\boldsymbol{\nu}_{\min}\over\|\boldsymbol{\nu}_{\min}\|}$ and $\mathbf{S}_{1}=\mp\sqrt{P_{1}}{\boldsymbol{\nu}_{\min}\over\|\boldsymbol{\nu}_{\min}\|}$ , which corresponds to $d_{\max}^{2}={(\sqrt{P_{0}}+\sqrt{P_{1}})^{2}\over\lambda_{\min}}$ , where $\lambda_{\min}$ is the minimum eigenvalue of $\Sigma$ and $\boldsymbol{\nu}_{\min}$ is the eigenvector corresponding to $\lambda_{\min}$ [14, pp. 61–63]. ∎
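The signal choice in the proof above can be checked numerically. The Python sketch below is restricted to a $2\times 2$ symmetric positive definite $\Sigma$ so that the eigen-decomposition has a closed form and only the standard library is needed; the helper names are illustrative assumptions. It places antipodal signals along $\boldsymbol{\nu}_{\min}$ and evaluates $d^{2}=(\mathbf{S}_{1}-\mathbf{S}_{0})^{T}\Sigma^{-1}(\mathbf{S}_{1}-\mathbf{S}_{0})$ .

```python
import math


def min_eigpair_2x2(S):
    """Smallest eigenvalue and a unit eigenvector of symmetric S = ((a, b), (b, c))."""
    (a, b), (_, c) = S
    lam = (a + c) / 2 - math.hypot((a - c) / 2, b)
    if b != 0:
        v = (b, lam - a)  # solves (a - lam) v0 + b v1 = 0
    else:
        v = (1.0, 0.0) if a <= c else (0.0, 1.0)
    n = math.hypot(v[0], v[1])
    return lam, (v[0] / n, v[1] / n)


def mahalanobis_sq(u, S):
    """u^T S^{-1} u for a 2x2 symmetric positive definite S."""
    (a, b), (_, c) = S
    det = a * c - b * b
    return (c * u[0] ** 2 - 2 * b * u[0] * u[1] + a * u[1] ** 2) / det


def team_signals(S, P0, P1):
    """Antipodal peak-power signals along the least noisy direction."""
    lam, v = min_eigpair_2x2(S)
    s0 = tuple(-math.sqrt(P0) * x for x in v)
    s1 = tuple(math.sqrt(P1) * x for x in v)
    return lam, s0, s1
```

For instance, with $\Sigma$ having entries $2$ on the diagonal and $1$ off the diagonal, $P_{0}=1$ , and $P_{1}=4$ , the value of `mahalanobis_sq` on $\mathbf{S}_{1}-\mathbf{S}_{0}$ agrees with $(\sqrt{P_{0}}+\sqrt{P_{1}})^{2}/\lambda_{\min}$ up to floating-point error.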

5.2 Stackelberg Game Analysis

Theorem 5.2.

Let $d\triangleq\sqrt{(\mathbf{S}_{1}-\mathbf{S}_{0})^{T}\Sigma^{-1}(\mathbf{S}_{1}-\mathbf{S}_{0})}$ and $d_{\max}^{2}\triangleq{(\sqrt{P_{0}}+\sqrt{P_{1}})^{2}\over\lambda_{\min}}$ , where $\lambda_{\min}$ is the minimum eigenvalue of $\Sigma$ . Then Theorem 2 also holds for the vector case.

Proof.

The proof of Theorem 2 can be applied by modifying the definitions of $d$ and $d_{\max}$ as in the statement. For $d^{*}=d_{\max}$ , the method described in the proof of Theorem 5.1 can be applied for the optimal signal selection, whereas, for $d^{*}=0$ , by choosing $\mathbf{S}_{0}=\mathbf{S}_{1}$ , the non-informative equilibrium can be achieved. Further, for Case-3 of Theorem 2, in order to achieve $(d^{*})^{2}=\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}<d_{\max}^{2}$ , the signals can be chosen in the direction of $\boldsymbol{\nu}_{\min}$ , that is, the eigenvector corresponding to $\lambda_{\min}$ . Accordingly, $\mathbf{S}_{0}=(-\sqrt{P_{0}}+t){\boldsymbol{\nu}_{\min}\over\|\boldsymbol{\nu}_{\min}\|}$ and $\mathbf{S}_{1}=(-\sqrt{P_{0}}+d^{*}+t){\boldsymbol{\nu}_{\min}\over\|\boldsymbol{\nu}_{\min}\|}$ for $t\in[0,\sqrt{P_{1}}+\sqrt{P_{0}}-d^{*}]$ are possible optimal signal pairs. Similarly, $\mathbf{S}_{0}=(\sqrt{P_{0}}-t){\boldsymbol{\nu}_{\min}\over\|\boldsymbol{\nu}_{\min}\|}$ and $\mathbf{S}_{1}=(\sqrt{P_{0}}-d^{*}-t){\boldsymbol{\nu}_{\min}\over\|\boldsymbol{\nu}_{\min}\|}$ for $t\in[0,\sqrt{P_{1}}+\sqrt{P_{0}}-d^{*}]$ constitute another set of possible optimal signal pairs. Note that it may be possible to find optimal signal pairs $\{\mathbf{S}_{0},\mathbf{S}_{1}\}\in\mathbb{S}$ that satisfy $(\mathbf{S}_{1}-\mathbf{S}_{0})^{T}\Sigma^{-1}(\mathbf{S}_{1}-\mathbf{S}_{0})=\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}$ in directions other than that of $\boldsymbol{\nu}_{\min}$ ; however, finding a single pair that corresponds to an equilibrium is sufficient. ∎

5.3 Nash Game Analysis

Theorem 5.3.

Theorem 4 also holds for the vector case.

Proof.

Let the transmitter choose any signals $\mathcal{S}=\{\mathbf{S}_{0},\mathbf{S}_{1}\}$ . Assuming nonzero priors $\pi_{0}^{t},\pi_{0}^{r},\pi_{1}^{t}$ and $\pi_{1}^{r}$ , the optimal decision rule for the receiver is given by (18). Similar to the team case analysis in Section 5.1, the equilibrium is non-informative if $\tau\leq 0$ or $\tau=\infty$ ; thus, $0<\tau<\infty$ can be assumed.

Now assume that the receiver applies a single-threshold rule; i.e., $\delta:\Bigg{\{}\mathbf{a}^{T}\mathbf{y}\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\eta$ where $\mathbf{a}\in\mathbb{R}^{n}$ and $\eta\in\mathbb{R}$ .

Remark 5.1.

Note that for $\mathbf{a}=\mathbf{0}$ , the receiver chooses either always $\mathcal{H}_{0}$ or always $\mathcal{H}_{1}$ without considering the value of $\mathbf{y}$ , which implies a non-informative equilibrium. Therefore, $\mathbf{S}_{0}^{*}=\mathbf{S}_{1}^{*}$ , $\mathbf{a}^{*}=\mathbf{0}$ , and $\eta^{*}=\zeta(\tau-1)$ (i.e., the decision rule of the receiver is $\delta^{*}:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ ) constitute a non-informative equilibrium regardless of the values of the priors and costs of the players.

Thus, due to the remark above, it can be assumed that $\mathbf{a}\neq\mathbf{0}$ holds. Since $\mathbf{a}^{T}\mathbf{Y}\sim\mathcal{N}\Big{(}\mathbf{a}^{T}\mathbf{S}_{i},\mathbf{a}^{T}\Sigma\mathbf{a}\Big{)}$ under $\mathcal{H}_{i}$ for $i\in\{0,1\}$ , the conditional probabilities are $\mathsf{P}_{10}=\mathcal{Q}\left(\eta-\mathbf{a}^{T}\mathbf{S}_{0}\over\sqrt{\mathbf{a}^{T}\Sigma\mathbf{a}}\right)$ and $\mathsf{P}_{01}=\mathcal{Q}\left(-{\eta-\mathbf{a}^{T}\mathbf{S}_{1}\over\sqrt{\mathbf{a}^{T}\Sigma\mathbf{a}}}\right)$ . Then, the Bayes risk of the transmitter becomes

[TABLE]

Since the power constraints are $\lVert\mathbf{S}_{0}\rVert^{2}\leq P_{0}$ and $\lVert\mathbf{S}_{1}\rVert^{2}\leq P_{1}$ , the signals $\mathbf{S}_{0}$ and $\mathbf{S}_{1}$ can be chosen independently. Since the $\mathcal{Q}$ -function is monotone decreasing, the following cases hold for $i\in\{0,1\}$ :

1. $C^{t}_{1i}<C^{t}_{0i}$ $\Rightarrow$ Then, $r^{t}(\mathcal{S},\delta)$ is a decreasing function of $\mathbf{a}^{T}\mathbf{S}_{i}$ , thus the transmitter always chooses $\mathbf{a}^{T}\mathbf{S}_{i}$ as maximum subject to $\lVert\mathbf{S}_{i}\rVert^{2}\leq P_{i}$ ; i.e., $\mathbf{S}_{i}=\sqrt{P_{i}}{\mathbf{a}\over\|\mathbf{a}\|}$ .

2. $C^{t}_{1i}=C^{t}_{0i}$ $\Rightarrow$ Then $\mathbf{S}_{i}$ has no effect on the Bayes risk of the transmitter.

3. $C^{t}_{1i}>C^{t}_{0i}$ $\Rightarrow$ Then, $r^{t}(\mathcal{S},\delta)$ is an increasing function of $\mathbf{a}^{T}\mathbf{S}_{i}$ , thus the transmitter always chooses $\mathbf{a}^{T}\mathbf{S}_{i}$ as minimum subject to $\lVert\mathbf{S}_{i}\rVert^{2}\leq P_{i}$ ; i.e., $\mathbf{S}_{i}=-\sqrt{P_{i}}{\mathbf{a}\over\|\mathbf{a}\|}$ .

Thus, the optimal signals can be characterized as $\mathbf{S}_{0}=-\text{sgn}(C^{t}_{10}-C^{t}_{00})\sqrt{P_{0}}{\mathbf{a}\over\|\mathbf{a}\|}$ and $\mathbf{S}_{1}=\text{sgn}(C^{t}_{01}-C^{t}_{11})\sqrt{P_{1}}{\mathbf{a}\over\|\mathbf{a}\|}$ .

By using the expressions above, the cases can be listed as follows:

1. $\tau\leq 0$ or $\tau=\infty$ $\Rightarrow$ The equilibrium is non-informative.

2. $C^{t}_{10}=C^{t}_{00}$ (and/or $C^{t}_{01}=C^{t}_{11}$ ) $\Rightarrow$ $\mathbf{S}_{0}$ (and/or $\mathbf{S}_{1}$ ) has no effect on the Bayes risk of the transmitter, thus it can arbitrarily be chosen by the transmitter. In this case, if the transmitter chooses $\mathbf{S}_{0}=\mathbf{S}_{1}$ (i.e., he does not send anything useful to the receiver) and the receiver applies the decision rule $\delta:\zeta\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\zeta\tau$ (i.e., he only considers the prior information and totally discards the information sent by the transmitter), then there exists a non-informative equilibrium.

3. Otherwise $\Rightarrow$ Notice that, since $0<\tau<\infty$ is assumed, $\zeta=\text{sgn}(C^{r}_{01}-C^{r}_{11})=\text{sgn}(C^{r}_{10}-C^{r}_{00})$ is obtained. Now, assume that the decision rule of the receiver is $\delta:\Bigg{\{}\mathbf{a}^{T}\mathbf{y}\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\eta$ . Then, the transmitter selects $\mathbf{S}_{0}=-\text{sgn}(C^{t}_{10}-C^{t}_{00})\sqrt{P_{0}}{\mathbf{a}\over\|\mathbf{a}\|}$ and $\mathbf{S}_{1}=\text{sgn}(C^{t}_{01}-C^{t}_{11})\sqrt{P_{1}}{\mathbf{a}\over\|\mathbf{a}\|}$ as optimal signals, and the decision rule becomes (18). By combining the best responses of the transmitter and the receiver,

[TABLE]

Notice that the expressions in (20) and (17) of Theorem 4 are the same, and Remark 4.1 and Remark 5.1 are equivalent; hence, the Nash equilibrium solution of Theorem 4 also holds for the vector case.

∎
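The transmitter's best response used in the proof of Theorem 5.3 can be sketched in Python as follows. The sketch takes the receiver direction $\mathbf{a}\neq\mathbf{0}$ as a plain list, and uses the layout `Ct[j][i]` for $C^{t}_{ji}$ ; these names are assumptions for illustration only. Each signal is placed on the boundary of its power ball, along $\pm\mathbf{a}/\|\mathbf{a}\|$ , with the sign determined by the transmitter's cost differences.

```python
import math


def sgn(x):
    """Sign function returning -1, 0, or 1."""
    return (x > 0) - (x < 0)


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def vector_best_response(a, Ct, P0, P1):
    """Signals S0, S1 that are optimal against the rule a^T y >< eta.

    A zero cost difference yields the zero vector here; in that case any
    feasible signal is equally good for the transmitter.
    """
    norm = math.sqrt(dot(a, a))
    if norm == 0:
        raise ValueError("the receiver direction a must be nonzero")
    unit = [x / norm for x in a]
    k0 = -sgn(Ct[1][0] - Ct[0][0]) * math.sqrt(P0)
    k1 = sgn(Ct[0][1] - Ct[1][1]) * math.sqrt(P1)
    return [k0 * u for u in unit], [k1 * u for u in unit]
```

By the Cauchy–Schwarz inequality, $|\mathbf{a}^{T}\mathbf{S}_{i}|\leq\|\mathbf{a}\|\sqrt{P_{i}}$ for every feasible $\mathbf{S}_{i}$ , and the returned signals attain this bound with the appropriate sign.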

6 EXTENSION TO a SCENARIO with an AVERAGE POWER CONSTRAINT

Besides the peak power constraint considered in the previous sections, the average power constraint can be assumed at the transmitter side. Before presenting the technical results, we provide the following lemma which will be utilized in the equilibrium analyses of the team and Stackelberg setups.

Lemma 6.1.

The optimal solutions to the optimization problem

[TABLE]

are $(S_{0}^{*},S_{1}^{*})=\left(-\sqrt{{\beta_{1}\over\beta_{0}(\beta_{0}+\beta_{1})}P},\sqrt{{\beta_{0}\over\beta_{1}(\beta_{0}+\beta_{1})}P}\right)$ and $(S_{0}^{*},S_{1}^{*})=\left(\sqrt{{\beta_{1}\over\beta_{0}(\beta_{0}+\beta_{1})}P},-\sqrt{{\beta_{0}\over\beta_{1}(\beta_{0}+\beta_{1})}P}\right)$ .

Proof.

Observe the following inequalities:

[TABLE]

Here, (b) follows from the inequality for the arithmetic and geometric mean, and the equality holds iff $\beta_{1}^{2}S_{1}^{2}=\beta_{0}^{2}S_{0}^{2}$ . For (a), the equality holds iff $S_{1}S_{0}\leq 0$ ; and for (c), the equality holds iff $\beta_{0}S_{0}^{2}+\beta_{1}S_{1}^{2}=P$ . Thus, the upper bound of $(S_{1}-S_{0})^{2}$ can be achieved with optimal solutions $(S_{0}^{*},S_{1}^{*})=\left(-\sqrt{{\beta_{1}\over\beta_{0}(\beta_{0}+\beta_{1})}P},\sqrt{{\beta_{0}\over\beta_{1}(\beta_{0}+\beta_{1})}P}\right)$ or $(S_{0}^{*},S_{1}^{*})=\left(\sqrt{{\beta_{1}\over\beta_{0}(\beta_{0}+\beta_{1})}P},-\sqrt{{\beta_{0}\over\beta_{1}(\beta_{0}+\beta_{1})}P}\right)$ so that $(S_{1}^{*}-S_{0}^{*})^{2}={\beta_{0}+\beta_{1}\over\beta_{0}\beta_{1}}P$ . ∎
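Lemma 6.1 can be sanity-checked numerically. The Python sketch below assumes, as the proof suggests, that the optimization problem is to maximize $(S_{1}-S_{0})^{2}$ subject to $\beta_{0}S_{0}^{2}+\beta_{1}S_{1}^{2}\leq P$ with $\beta_{0},\beta_{1}>0$ ; the function names are illustrative. It compares the closed-form solution with a grid search over the constraint boundary, which suffices because scaling a feasible pair up to the boundary cannot decrease $(S_{1}-S_{0})^{2}$ .

```python
import math


def lemma_signals(beta0, beta1, P):
    """Closed-form maximizer with S0 < 0 < S1 (its negation is also optimal)."""
    s0 = -math.sqrt(beta1 * P / (beta0 * (beta0 + beta1)))
    s1 = math.sqrt(beta0 * P / (beta1 * (beta0 + beta1)))
    return s0, s1


def max_gap_on_boundary(beta0, beta1, P, n=3600):
    """Grid search for max (S1 - S0)^2 on beta0*S0^2 + beta1*S1^2 = P."""
    best = 0.0
    for k in range(n):
        theta = 2 * math.pi * k / n
        s0 = math.sqrt(P / beta0) * math.cos(theta)
        s1 = math.sqrt(P / beta1) * math.sin(theta)
        best = max(best, (s1 - s0) ** 2)
    return best
```

For any positive `beta0`, `beta1`, and `P`, the grid maximum stays below the closed-form value ${\beta_{0}+\beta_{1}\over\beta_{0}\beta_{1}}P$ (up to rounding) and approaches it as the grid is refined.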

Consider a transmitter with an average power constraint; i.e., the transmitter performs the optimal signal design problem under the power constraint below:

[TABLE]

where $P_{\mathrm{avg}}$ denotes the average power limit.

6.1 Team Theoretic Analysis

In order to minimize the Bayes risk, the transmitter always prefers the maximum $d={|S_{1}-S_{0}|\over\sigma}$ . Thus, by Lemma 6.1, the optimal signal levels are chosen as either $(S_{0}^{*},S_{1}^{*})=\left(-\sqrt{{\pi_{1}^{t}\over\pi_{0}^{t}(\pi_{0}^{t}+\pi_{1}^{t})}P_{\mathrm{avg}}},\sqrt{{\pi_{0}^{t}\over\pi_{1}^{t}(\pi_{0}^{t}+\pi_{1}^{t})}P_{\mathrm{avg}}}\right)$ or $(S_{0}^{*},S_{1}^{*})=\left(\sqrt{{\pi_{1}^{t}\over\pi_{0}^{t}(\pi_{0}^{t}+\pi_{1}^{t})}P_{\mathrm{avg}}},-\sqrt{{\pi_{0}^{t}\over\pi_{1}^{t}(\pi_{0}^{t}+\pi_{1}^{t})}P_{\mathrm{avg}}}\right)$ . The corresponding optimal decision rule of the receiver is chosen based on the rule in (10) accordingly. Actually, the equilibrium points are essentially unique; i.e., they result in the same Bayes risks for the transmitter and the receiver.

6.2 Stackelberg Game Analysis

Similar to the team setup analysis, for every possible case in Table 2, there is more than one equilibrium point, and these equilibria are essentially unique since the Bayes risks of the transmitter and the receiver depend on the signals only through $d$ . For example, for $d^{*}=d_{\max}\triangleq{\sqrt{{\pi_{0}^{t}+\pi_{1}^{t}\over\pi_{0}^{t}\pi_{1}^{t}}P_{\mathrm{avg}}}\over\sigma}$ , $(S_{0}^{*},S_{1}^{*})=\left(-\sqrt{{\pi_{1}^{t}\over\pi_{0}^{t}(\pi_{0}^{t}+\pi_{1}^{t})}P_{\mathrm{avg}}},\sqrt{{\pi_{0}^{t}\over\pi_{1}^{t}(\pi_{0}^{t}+\pi_{1}^{t})}P_{\mathrm{avg}}}\right)$ and $(S_{0}^{*},S_{1}^{*})=\left(\sqrt{{\pi_{1}^{t}\over\pi_{0}^{t}(\pi_{0}^{t}+\pi_{1}^{t})}P_{\mathrm{avg}}},-\sqrt{{\pi_{0}^{t}\over\pi_{1}^{t}(\pi_{0}^{t}+\pi_{1}^{t})}P_{\mathrm{avg}}}\right)$ are the only possible choices for the transmitter, and the decision rule of the receiver is chosen based on the rule in (10). However, for $d^{*}=0$ , there are infinitely many choices for the transmitter and the receiver, and all of them are essentially unique; i.e., they result in the same Bayes risks for the transmitter and the receiver. A similar argument holds for $d^{*}=\sqrt{\Big{|}{2\ln\tau(k_{0}-k_{1})\over(k_{0}+k_{1})}\Big{|}}$ ; i.e., there are infinitely many choices for the transmitter and the receiver, and all of them are essentially unique.

6.3 Nash Game Analysis

For $0<\tau<\infty$ , suppose that the receiver applies a single-threshold rule; i.e., $\delta:\Bigg{\{}ay\overset{\mathcal{H}_{1}}{\underset{\mathcal{H}_{0}}{\gtreqless}}\eta$ where $a\in\mathbb{R}-\{0\}$ and $\eta\in\mathbb{R}$ . (Footnote 7: Due to Remark 4.1, $S_{0}^{*}=S_{1}^{*}$ , $a^{*}=0$ , and $\eta^{*}=\zeta(\tau-1)$ always constitute a non-informative equilibrium regardless of the values of the priors and costs of the players.) Then, after analyzing the derivative of the Bayes risk of the transmitter in (16) with respect to the signals, the following can be obtained:

1. $C^{t}_{1i}=C^{t}_{0i}$ $\Rightarrow$ $S_{i}$ has no effect on the Bayes risk of the transmitter.

2. $C^{t}_{1i}<C^{t}_{0i}$ or $C^{t}_{1i}>C^{t}_{0i}$ $\Rightarrow$ $r^{t}(\mathcal{S},\delta)$ is a decreasing (increasing) function of $S_{i}$ if $a(C^{t}_{1i}-C^{t}_{0i})$ is negative (positive); thus the transmitter chooses the optimal signal level $S_{i}$ as large as possible in absolute value. Therefore, the transmitter prefers to utilize the maximum possible total power; i.e., the power constraint can be considered as $\pi_{1}^{t}S_{1}^{2}+\pi_{0}^{t}S_{0}^{2}=P_{\mathrm{avg}}$ rather than $\pi_{1}^{t}S_{1}^{2}+\pi_{0}^{t}S_{0}^{2}\leq P_{\mathrm{avg}}$ .

By using the analysis above, the cases can be listed as follows:

1. $C^{t}_{1i}=C^{t}_{0i}$ $\Rightarrow$ If $C^{t}_{1j}=C^{t}_{0j}$ also holds for $j\neq i$ , then neither $S_{0}$ nor $S_{1}$ changes the Bayes risk of the transmitter; thus, there exists a non-informative equilibrium. Otherwise; i.e., $C^{t}_{1j}\neq C^{t}_{0j}$ for $j\neq i$ , the transmitter chooses the optimal signal levels as $S_{i}=0$ and $S_{j}=-\text{sgn}\left(a(C^{t}_{1j}-C^{t}_{0j})\right)\sqrt{P_{\mathrm{avg}}\over\pi_{j}^{t}}$ , and the equilibrium is informative.

2. $C^{t}_{10}\neq C^{t}_{00}$ and $C^{t}_{11}\neq C^{t}_{01}$ $\Rightarrow$ Since the transmitter adjusts the signal levels such that $\pi_{1}^{t}S_{1}^{2}+\pi_{0}^{t}S_{0}^{2}=P_{\mathrm{avg}}$ , the optimal signals must be in the form of $S_{0}=-\text{sgn}\big{(}a(C^{t}_{10}-C^{t}_{00})\big{)}x$ and $S_{1}=\text{sgn}\big{(}a(C^{t}_{01}-C^{t}_{11})\big{)}\sqrt{P_{\mathrm{avg}}-\pi_{0}^{t}x^{2}\over\pi_{1}^{t}}$ for $x\in\left[0,\sqrt{P_{\mathrm{avg}}\over\pi_{0}^{t}}\right]$ . Then, the Bayes risk of the transmitter in (16) can be expressed as

[TABLE]

Note that the convexity of $r^{t}(\mathcal{S},\delta)$ in (23) with respect to $x$ changes depending on the other parameters (i.e., priors, costs and the receiver policy); hence, the optimal $x$ cannot be expressed in a closed form. Let $x^{*}$ be an optimal solution to (23); i.e., $x^{*}=\arg\min_{x\in\left[0,\sqrt{P_{\mathrm{avg}}\over\pi_{0}^{t}}\right]}r^{t}(\mathcal{S},\delta)$ , which implies that the optimal signal levels are $S_{0}=-\text{sgn}\big{(}a(C^{t}_{10}-C^{t}_{00})\big{)}x^{*}$ and $S_{1}=\text{sgn}\big{(}a(C^{t}_{01}-C^{t}_{11})\big{)}\sqrt{P_{\mathrm{avg}}-\pi_{0}^{t}(x^{*})^{2}\over\pi_{1}^{t}}$ . Then, similar to (17), the following condition on the existence of an equilibrium can be obtained:

[TABLE]

Here, similar to the analysis under the individual power constraint in Theorem 4, unless (24) is satisfied, the best responses of the transmitter and the receiver cannot match each other. In particular,

(a) $\xi_{0}<0$ and $\xi_{1}<0$ $\Rightarrow$ There does not exist a Nash equilibrium for $a\neq 0$ ; however, due to Remark 4.1, for $a=0$ , there always exist non-informative equilibria.

(b) $\xi_{0}<0$ and $\xi_{1}>0$ $\Rightarrow$ If $\sqrt{P_{\mathrm{avg}}-\pi_{0}^{t}(x^{*})^{2}\over\pi_{1}^{t}}>x^{*}\Rightarrow x^{*}<\sqrt{P_{\mathrm{avg}}}$ , then the Nash equilibrium is informative. If $x^{*}=\sqrt{P_{\mathrm{avg}}}$ , there exists a non-informative equilibrium. Otherwise; i.e., if $x^{*}>\sqrt{P_{\mathrm{avg}}}$ , there does not exist a Nash equilibrium for $a\neq 0$ ; however, due to Remark 4.1, for $a=0$ , there always exist non-informative equilibria.

(c) $\xi_{0}>0$ and $\xi_{1}<0$ $\Rightarrow$ If $x^{*}>\sqrt{P_{\mathrm{avg}}}$ , then the Nash equilibrium is informative. If $x^{*}=\sqrt{P_{\mathrm{avg}}}$ , there exists a non-informative equilibrium. Otherwise; i.e., if $x^{*}<\sqrt{P_{\mathrm{avg}}}$ , there does not exist a Nash equilibrium for $a\neq 0$ ; however, due to Remark 4.1, for $a=0$ , there always exist non-informative equilibria.

(d) $\xi_{0}>0$ and $\xi_{1}>0$ $\Rightarrow$ There exists an informative Nash equilibrium.
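Since the optimal $x^{*}$ in the case with $C^{t}_{10}\neq C^{t}_{00}$ and $C^{t}_{11}\neq C^{t}_{01}$ has no closed form, it can be found by a one-dimensional numerical search. The Python sketch below is one way to do this under stated assumptions: it takes the transmitter's Bayes risk to be the standard expression $\pi_{0}^{t}\big(C^{t}_{00}+(C^{t}_{10}-C^{t}_{00})\mathsf{P}_{10}\big)+\pi_{1}^{t}\big(C^{t}_{11}+(C^{t}_{01}-C^{t}_{11})\mathsf{P}_{01}\big)$ with $\mathsf{P}_{10}=\mathcal{Q}\big({\eta-aS_{0}\over|a|\sigma}\big)$ and $\mathsf{P}_{01}=\mathcal{Q}\big(-{\eta-aS_{1}\over|a|\sigma}\big)$ (the scalar counterpart of the expressions in Section 5.3), and it uses a plain grid search. All function names and the layout `Ct[j][i]` for $C^{t}_{ji}$ are hypothetical.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def sgn(x):
    """Sign function returning -1, 0, or 1."""
    return (x > 0) - (x < 0)


def transmitter_risk(x, a, eta, sigma, pi0_t, Ct, P_avg):
    """Assumed Bayes risk of the transmitter as a function of x = |S0|."""
    pi1_t = 1.0 - pi0_t
    s0 = -sgn(a * (Ct[1][0] - Ct[0][0])) * x
    # max(..., 0) guards against tiny negative values from rounding.
    s1_abs = math.sqrt(max(P_avg - pi0_t * x * x, 0.0) / pi1_t)
    s1 = sgn(a * (Ct[0][1] - Ct[1][1])) * s1_abs
    scale = abs(a) * sigma
    p10 = q_func((eta - a * s0) / scale)
    p01 = q_func(-(eta - a * s1) / scale)
    return (pi0_t * (Ct[0][0] + (Ct[1][0] - Ct[0][0]) * p10)
            + pi1_t * (Ct[1][1] + (Ct[0][1] - Ct[1][1]) * p01))


def best_x(a, eta, sigma, pi0_t, Ct, P_avg, grid=2001):
    """Grid search for x* over [0, sqrt(P_avg / pi0_t)]."""
    x_max = math.sqrt(P_avg / pi0_t)
    xs = [x_max * k / (grid - 1) for k in range(grid)]
    return min(xs, key=lambda x: transmitter_risk(
        x, a, eta, sigma, pi0_t, Ct, P_avg))
```

A grid search is used rather than a derivative-based method because, as noted above, the convexity of the risk in $x$ depends on the other parameters; the grid resolution is a tunable assumption.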

7 CONCLUDING REMARKS

In this paper, we considered binary signaling problems in which the decision makers (the transmitter and the receiver) have subjective priors and/or misaligned objective functions. Depending on the commitment nature of the transmitter to his policies, we formulated the binary signaling problem as a Bayesian game under either Nash or Stackelberg equilibrium concepts and established equilibrium solutions and their properties.

We showed that there can be informative or non-informative equilibria in the binary signaling game under the Stackelberg and Nash assumptions, and derived the conditions under which an informative equilibrium exists. We also studied the effects of small perturbations around the team setup (with identical priors and costs) and showed that the game equilibrium behavior around the team setup is robust under the Nash assumption, whereas it is not robust under the Stackelberg assumption.

The binary setup considered here can be extended to the $M$ -ary hypothesis testing setup, and the corresponding signaling game structure can be formed in order to model a game between players with a multiple-bit communication channel. The extension to more general noise distributions is possible: the Nash equilibrium analysis holds identically when the noise distribution leads to a single-threshold test. Finally, in addition to the Bayesian approach considered here, different cost structures and parameters can be introduced by investigating the game under Neyman-Pearson and mini-max criteria.

Bibliography

[1] S. Sarıtaş, S. Gezici, and S. Yüksel, “Binary signaling under subjective priors and costs as a game,” in 57th IEEE Conference on Decision and Control (CDC), Dec. 2018, pp. 1130–1135.
[2] H. Sandberg, S. S. Amin, and K. H. Johansson, “Cyberphysical security in networked control systems: An introduction to the issue,” IEEE Control Systems, vol. 35, no. 1, pp. 20–23, 2015.
[3] A. Teixeira, I. Shames, H. Sandberg, and K. H. Johansson, “A secure control framework for resource-limited adversaries,” Automatica, vol. 51, pp. 135–148, 2015.
[4] T. Alpcan and T. Başar, Network Security: A Decision and Game-Theoretic Approach, 1st ed. New York, NY, USA: Cambridge University Press, 2010.
[5] Y. Mo, T.-H. Kim, K. Brancik, D. Dickinson, H. Lee, A. Perrig, and B. Sinopoli, “Cyber physical security of a smart grid infrastructure,” Proceedings of the IEEE, vol. 100, no. 1, pp. 195–209, Jan. 2012.
[6] G. Dán and H. Sandberg, “Stealth attacks and protection schemes for state estimators in power systems,” in First IEEE International Conference on Smart Grid Communications (Smart Grid Comm), 2010, pp. 214–219.
[7] A. S. Rawat, P. Anand, H. Chen, and P. K. Varshney, “Collaborative spectrum sensing in the presence of Byzantine attacks in cognitive radio networks,” IEEE Transactions on Signal Processing, vol. 59, no. 2, pp. 774–786, Feb. 2011.
[8] W. Hashlamoun, S. Brahma, and P. K. Varshney, “Mitigation of Byzantine attacks on distributed detection systems using audit bits,” IEEE Transactions on Signal and Information Processing over Networks, vol. 4, no. 1, pp. 18–32, Mar. 2018.


Footnote 1 (to the title, Hypothesis Testing under Subjective Priors and Costs as a Signaling Game): This research was supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada. Part of this work [1] was presented at the 57th IEEE Conference on Decision and Control (CDC 2018).


Footnote 3 (to the heading Biased Transmitter Cost): Here, the cost refers to the objective function (Bayes risk), not the cost of a particular decision, $C_{ji}$ . Note that, throughout the manuscript, the cost refers to $C_{ji}$ except when it is used in the phrase Biased Transmitter Cost.
