Nash equilibria in games over graphs equipped with a communication mechanism

Patricia Bouyer; Nathan Thomasset

arXiv:1906.07753·cs.GT·June 27, 2019

TL;DR

This paper investigates pure Nash equilibria in infinite-duration graph games with partial visibility and communication, proposing an epistemic approach to characterize equilibria through a simple reporting mechanism.

Contribution

It introduces a novel epistemic game construction that captures players' knowledge and characterizes Nash equilibria using a simple communication pattern.

Findings

01. Communication mechanism effectively characterizes Nash equilibria.

02. Epistemic construction records players' knowledge for equilibrium analysis.

03. Potential for efficient algorithms to compute equilibria based on the construction.

Abstract

We study pure Nash equilibria in infinite-duration games on graphs, with partial visibility of actions but communication (based on a graph) among the players. We show that a simple communication mechanism consisting in reporting the deviator when seeing it and propagating this information is sufficient for characterizing Nash equilibria. We propose an epistemic game construction, which conveniently records important information about the knowledge of the players. With this abstraction, we are able to characterize Nash equilibria which follow the simple communication pattern via winning strategies. We finally discuss the size of the construction, which would allow efficient algorithmic solutions to compute Nash equilibria in the original game.

Equations

v_{0}\cdot(m_{0},\textsf{mes}_{0})\cdot v_{1}\cdot(m_{1},\textsf{mes}_{1})\cdot v_{2}\ldots(m_{s-1},\textsf{mes}_{s-1})\cdot v_{s}\in V\cdot\big{(}(\textup{Act}^{\mathpzc{P}}\times(\{0,1\}^{*})^{\mathpzc{P}})\cdot V\big{)}^{*}

v_{0}\xrightarrow{m_{0},\textsf{mes}_{0}}v_{1}\xrightarrow{m_{1},\textsf{mes}_{1}}v_{2}\ldots\xrightarrow{m_{s-1},\textsf{mes}_{s-1}}v_{s}

v_{0}\cdot(m_{0}({\mathbbm{n}}(a)),\textsf{mes}_{0}({\mathbbm{n}}(a)))\cdot v_{1}\cdot(m_{1}({\mathbbm{n}}(a)),\textsf{mes}_{1}({\mathbbm{n}}(a)))\cdot v_{2}\ldots(m_{s-1}({\mathbbm{n}}(a)),\textsf{mes}_{s-1}({\mathbbm{n}}(a)))\cdot v_{s}\in V\cdot\big{(}(\textup{Act}^{{\mathbbm{n}}(a)}\times(\{0,1\}^{*})^{{\mathbbm{n}}(a)})\cdot V\big{)}^{*}

\textup{payoff}(\rho)=\left\{\begin{array}[]{ll}(0,0,1,1,1)&\text{if}\ \rho\ \text{visits}\ v_{1}\ \text{infinitely often}\\ (0,0,2,2,2)&\text{if}\ \rho\ \text{visits}\ v_{1}\ \text{finitely often and}\ v^{\prime}_{1}\ \text{infinitely often}\\ (0,0,0,2,2)&\text{if}\ \rho\ \text{ends up in}\ v_{2}\\ (0,0,2,0,2)&\text{if}\ \rho\ \text{ends up in}\ v_{3}\\ (0,0,2,2,0)&\text{if}\ \rho\ \text{ends up in}\ v_{4}\\ (0,0,3,3,3)&\text{if}\ \rho\ \text{ends up in}\ v^{\prime}_{0}\end{array}\right.

\rho=\big{(}v_{0}\cdot(\alpha^{5},\textsf{mes}_{\epsilon})\cdot v_{1}\cdot(\alpha^{5},\textsf{mes}_{\epsilon})\big{)}^{\omega}

f(2)(0)=f(3)(0)\qquad f(2)(1)=f(3)(1)=f(4)(1)\qquad f(3)(2)=f(4)(2)\qquad f(2)(3)=f(4)(3)

|S_{\textup{Eve}}|\leq|V|+|V|\cdot|\textup{Tab}|^{2}\cdot(\textup{diam}(G)+2)\ \text{and}\ |S_{\textup{Adam}}|\leq|S_{\textup{Eve}}|\cdot|\textup{Act}|^{|\mathpzc{P}|^{2}}

\rho=v_{0}\cdot(m_{0},\textsf{mes}_{0})\cdot v_{1}\cdot(m_{1},\textsf{mes}_{1})\cdot v_{2}\ldots(m_{r-1},\textsf{mes}_{r-1})\cdot v_{r}\ldots

\rho=v_{0}\cdot(m_{0},\textsf{mes}_{\epsilon})\cdot v_{1}\cdot(m_{1},\textsf{mes}_{\epsilon})\cdot v_{2}\ldots(m_{r-1},\textsf{mes}_{\epsilon})\cdot v_{r}\ldots

h=v_{0}\cdot(\bar{m}_{0},\overline{\textsf{mes}}_{0})\cdot v_{1}\cdot(\bar{m}_{1},\overline{\textsf{mes}}_{1})\ldots v_{r}\cdot(\bar{m}_{r},\overline{\textsf{mes}}_{r})\cdot h_{1}

\widetilde{\sigma}_{a}(h)=\sigma_{a}\big{(}h_{\leq r}\cdot(\bar{m}_{r},\textsf{mes}_{r}^{+d})\cdot h_{1}^{+d}\big{)}

E_{\zeta}\big{(}h\cdot\pi_{a}\big{(}\zeta(E_{\zeta}(h)),\textsf{mes}_{\epsilon}\big{)}\cdot v^{\prime}\big{)}=E_{\zeta}(h)\cdot((v,\emptyset),\zeta(E_{\zeta}(h)))\cdot(v^{\prime},\emptyset)

E_{\zeta}\big{(}h\cdot\pi_{a}\big{(}\zeta(E_{\zeta}(h))[d/\delta],\textsf{mes}_{d}\big{)}\cdot v^{\prime}\big{)}=E_{\zeta}(h)\cdot((v,\emptyset),\zeta(E_{\zeta}(h)))\cdot(v^{\prime},\emptyset)

E_{\zeta}\big{(}h\cdot\pi_{a}\big{(}\zeta(E_{\zeta}(h))[d/\delta],\textsf{mes}_{d}\big{)}\cdot v^{\prime}\big{)}=E_{\zeta}(h)\cdot((v,\emptyset),\zeta(E_{\zeta}(h)))\cdot(v^{\prime},X^{\prime})

\sigma_{a}(h)=\left\{\begin{array}[]{ll}(m(a),\textsf{id}_{d})&\text{if there is}\ b\in{\mathbbm{n}}(a)\ \text{s.t.}\ \textsf{mes}(b)=\textsf{id}_{d},\\ (m(a),\epsilon)&\text{otherwise.}\end{array}\right.

E_{\zeta}\big{(}h\cdot\pi_{a}(m[d/\delta],\textsf{mes}^{\prime})\cdot v^{\prime}\big{)}=E_{\zeta}(h)\cdot((v,X),\zeta(E_{\zeta}(h)))\cdot(v^{\prime},X^{\prime})

\sigma_{a}(h^{\prime}_{r})=\left\{\begin{array}[]{ll}\Big{(}\big{(}\zeta(E_{\zeta}(\pi_{a}(h^{\prime}_{r})))(d)\big{)}_{a},\textsf{id}_{d}\Big{)}&\text{if there is}\ b\in{\mathbbm{n}}(a)\ \text{s.t.}\ \textsf{mes}^{\prime}_{r-1}(b)\neq\epsilon\\ \Big{(}\big{(}\zeta(E_{\zeta}(\pi_{a}(h^{\prime}_{r})))(d)\big{)}_{a},\epsilon\Big{)}&\text{otherwise}\end{array}\right.

h^{\prime}_{r+1}=h^{\prime}_{r}\cdot(m^{\prime}[d/\delta],\textsf{mes}^{\prime}_{r})\cdot v^{\prime}_{r+1}

R^{\prime}_{r+1}=R^{\prime}_{r}\cdot((v^{\prime}_{r},X^{\prime}_{r}),\zeta(R^{\prime}_{r}))\cdot(v^{\prime}_{r+1},X^{\prime}_{r+1})

h=v_{0}\cdot(f_{0},\textsf{mes}_{\epsilon})\cdot v_{1}\cdot(f_{1},\textsf{mes}_{\epsilon})\cdot v_{2}\ldots(f_{s-1},\textsf{mes}_{\epsilon})\cdot v_{s}

h_{d}=v_{0}\cdot(m_{0},\textsf{mes}_{0})\cdot v_{1}\ldots(m_{s-1},\textsf{mes}_{s_{1}})\cdot v_{s}

v^{\prime}_{0}\cdot(f^{\prime}_{0},\textsf{mes}_{0})\cdot v^{\prime}_{1}\cdot(f^{\prime}_{1},\textsf{mes}_{1})\cdot v^{\prime}_{2}\ldots(f^{\prime}_{s-1},\textsf{mes}_{s-1})\cdot v^{\prime}_{s}=\Lambda(R^{\prime}_{\leq s})(d)

\Lambda(R^{\prime}_{\leq s})(d)\cdot(m[d/\delta],\textsf{mes})\cdot v^{\prime}

\Lambda(R^{\prime}_{\leq s})(d)\cdot(\zeta(R^{\prime}_{\leq s})(d)[d/\delta],\textsf{mes})\cdot v^{\prime}_{s+1}=\Lambda(R^{\prime}_{\leq s+1})(d)

\rho^{\prime}=\textup{out}(\sigma[d/\sigma^{\prime}_{d}],v_{0})=\Lambda(R^{\prime})(d)

Full text

LSV, CNRS, ENS Paris-Saclay, Université Paris-Saclay, https://orcid.org/0000-0002-2823-091

LSV, ENS Paris-Saclay, CNRS, Université Paris-Saclay

Related version: Full version available at http://arxiv.org/abs/1906.07753

Funding: This work has been partly supported by ERC project EQualIS (FP7-308087).

Copyright: P. Bouyer and N. Thomasset

Subject classification: Theory of computation; Solution concepts in game theory; Verification by model checking

Published in: 44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019), August 26–30, 2019, Aachen, Germany. Editors: Peter Rossmanith, Pinar Heggernes, and Joost-Pieter Katoen. Series volume 138, article no. 14.

Keywords: Multiplayer games, Nash equilibria, partial information

1 Introduction

Multiplayer concurrent games over graphs make it possible to model rich interactions between players. Those games are played as follows. In a state, each player privately and independently chooses an action, and these choices together define a move (one action per player); the next state of the game is then the successor (in the graph) of the current state under that move; the players continue playing from that new state, and thereby form an (infinite) play. Each player then gets a reward given by a payoff function (one function per player). In particular, the objectives of the players need not be contradictory: those games are non-zero-sum games, contrary to the two-player games used for controller or reactive synthesis [16, 12].

Using solution concepts borrowed from game theory, one can describe interactions among the players, and in particular rational behaviours of selfish players. One of the most basic and classically studied solution concepts is that of Nash equilibria [13]. A Nash equilibrium is a strategy profile where no player can improve her payoff by unilaterally changing her strategy. The outcome of a Nash equilibrium can therefore be seen as a rational behaviour of the system. While extensively studied by game theorists, e.g. over (repeated) matrix games, this concept (and variants thereof) has only rather recently been studied over infinite-duration games on graphs. Probably the first works in that direction are [9, 8, 17, 18]. Several series of works have followed. To give a rough idea of the existing results, pure Nash equilibria always exist in turn-based games for $\omega$-regular objectives [20] but not in concurrent games, even with simple objectives; they can nevertheless be computed [20, 4, 7, 3] for large classes of objectives. The problem becomes harder with mixed (that is, stochastic) Nash equilibria, whose existence often cannot be decided [19, 5].

Computing Nash equilibria requires (i) finding a good behaviour of the system; (ii) detecting deviations from that behaviour and identifying the deviating players; and (iii) punishing them. This simple characterization of Nash equilibria is made explicit in [10]. Variants of Nash equilibria require slightly different ingredients, but they are mostly in a similar vein.

Many of those results are proven using the construction of a two-player game, in which winning strategies correspond (in some precise sense) to Nash equilibria in the original game. This two-player game basically records the knowledge of the various players about everything which can be uncertain: (a) the possible deviators in [4], and (b) the possible states the game can be in [3]. Extensions of this construction can be used for other solution concepts like robust equilibria [7] or rational synthesis [11].

In this work, we consider infinite-duration games on graphs, in which the game arena is perfectly known by all the players, but players have only partial information about the actions played by the other players.

The partial-information setting of this work is inspired by [15], which considers repeated games played on matrices, where players only see the actions of their neighbours. Neighbours are specified by a communication graph. To ensure a correct detection of deviators, the solution is to propagate the identity of the deviator along the communication graph. A fingerprint (finite sequence of actions) of every player is agreed upon at the beginning, and the propagation can be made properly if and only if the communication graph is 2-connected, ensuring large sets of Nash equilibria (formalized as a folk theorem). Fingerprints are not well suited to the setting of graphs, since they may delay the time at which a player learns the identity of the deviator, which may be prohibitive if a bad component of the graph is reached in the meantime.

We therefore propose to add real communication among players. Similarly to [15], a player can communicate only with her neighbours (also specified by a communication graph), but can send arbitrary messages (modelled as arbitrary words over the alphabet $\{0,1\}$). We assume that visited states are known by the players, hence only the deviator (if any) may be unknown to the players. In this setting, we show the following results:

• We show that a very simple epidemic-like communication mechanism is sufficient for defining Nash equilibria. It consists in (a) reporting the deviator (for the neighbours of the deviator) as soon as it is detected, and (b) propagating this information (for the other players).

• We build an epistemic game, which tracks those strategy profiles which follow the above simple communication pattern. This is a two-player turn-based game, in which Eve (the first player) suggests moves, and Adam (the second player) either complies (to generate the main outcome) or does not (to mimic single-player deviations). The correctness of the construction is formulated as follows: there is a Nash equilibrium in the original game of payoff $p$ if and only if there is a strategy for Eve in the epistemic game which is winning for $p$.

• We analyze the complexity of this construction.
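The epidemic-like mechanism of the first item can be illustrated by a small simulation. The following Python sketch is not taken from the paper: the function name `propagation_rounds`, the round-counting convention (players who directly observe the deviator learn her identity at round 1, and each relay costs one further round) and the example graph are assumptions made for illustration. An edge `(a, b)` is read as "player `b` sees the actions and messages of player `a`", and the deviator herself is never relied upon to relay the information.

```python
def propagation_rounds(players, edges, deviator):
    """Round at which each other player learns who deviated (None if never).

    Assumed convention: direct observers of the deviator learn at round 1;
    every informed player then forwards the identity to the players who
    can see her, one hop per round.
    """
    learned = {}
    # Players who see the deviator's action report her identity first.
    frontier = {b for b in players if b != deviator and (deviator, b) in edges}
    rnd = 1
    while frontier:
        for b in frontier:
            learned[b] = rnd
        # Players who receive the report from an already informed player.
        frontier = {
            b for b in players
            if b != deviator and b not in learned
            and any((a, b) in edges for a in learned)
        }
        rnd += 1
    return {b: learned.get(b) for b in players if b != deviator}


# Hypothetical chain 3 -> 4 -> 0 -> 1; player 2 is isolated.
players = {0, 1, 2, 3, 4}
edges = {(3, 4), (4, 0), (0, 1)}
rounds_if_3_deviates = propagation_rounds(players, edges, 3)
rounds_if_2_deviates = propagation_rounds(players, edges, 2)
```

In this toy graph, a deviation of player 3 reaches players 4, 0 and 1 in successive rounds, whereas a deviation of player 2 is seen by nobody.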

Note that we do not assume connectedness of the communication graph, hence the particular case of a graph with no edges allows us to recover the setting of [4], while a complete graph allows us to recover the settings of [20, 7].

In Section 2, we define our model and give an example to illustrate the role of the communication graph. In Section 3, we prove that the simple communication pattern is sufficient. In Section 4, we construct the epistemic game and discuss its correctness. In Section 5, we discuss complexity issues. All proofs are postponed to the technical appendix.

2 Definitions

We use the standard notations $\mathbb{R}$ (resp. $\mathbb{Q}$, $\mathbb{N}$) for the set of real (resp. rational, natural) numbers. If $S$ is a subset of $\mathbb{R}$, we write $\overline{S}$ for $S\cup\{-\infty,+\infty\}$.

Let $S$ be a finite set and $R\subseteq S$. If $m$ is an $S$-vector over some set $\Sigma$, we write $m(R)$ (resp. $m(-R)$) for the vector composed of the $R$-components of $m$ (resp. all but the $R$-components). Abusing notation, we also write $m(i)$ (resp. $m(-i)$) when $i$ is a single element of $S$, and may sometimes even write $m_{i}$ if this is clear from the context. Also, if $s\in S$ and $a\in\Sigma$, then $m[s/a]$ is the vector where the value $m(s)$ is replaced by $a$.

If $S$ is a finite set, we write $S^{*}$ (resp. $S^{+}$, $S^{\omega}$) for the set of words (resp. non-empty words, infinite words) over the alphabet $S$.

2.1 Concurrent games and communication graphs

We use the model of concurrent multi-player games [4], based on the two-player model of [1].

Definition 2.1.

A concurrent multiplayer game is a tuple $\mathcal{G}=\langle V,v_{\textup{init}},\textup{Act},\mathpzc{P},\Sigma,\textup{Allow},\textup{Tab},(\textup{payoff}_{a})_{a\in\mathpzc{P}}\rangle$, where $V$ is a finite set of vertices, $v_{\textup{init}}\in V$ is the initial vertex, $\textup{Act}$ is a finite set of actions, $\mathpzc{P}$ is a finite set of players, $\Sigma$ is a finite alphabet, $\textup{Allow}\colon V\times\mathpzc{P}\to 2^{\textup{Act}}\setminus\{\emptyset\}$ is a mapping indicating the actions available to a given player in a given vertex (excluding the empty set ensures that the game is non-blocking), $\textup{Tab}\colon V\times\textup{Act}^{\mathpzc{P}}\to V$ associates a target vertex with a given vertex and a given action tuple, and, for every $a\in\mathpzc{P}$, $\textup{payoff}_{a}\colon V^{\omega}\to\mathbb{D}$ is a payoff function with values in a domain $\mathbb{D}\subseteq\overline{\mathbb{R}}$.

An element of $\textup{Act}^{\mathpzc{P}}$ is called a move. As is standard (see [1] for two-player games and [4] for the multiplayer extension), concurrent games are played as follows: from a given vertex $v$, each player independently selects an action (allowed by Allow), and these actions together form a move $m$; then, the game proceeds to the next vertex, given by $\textup{Tab}(v,m)$; the game continues from that new vertex.
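One round of such a game can be sketched in a few lines of Python. The arena below is hypothetical and not taken from the paper; the names `ALLOW`, `tab`, `legal_moves` and `step` are illustrative only. A move is represented as a dictionary mapping each player to her action.

```python
from itertools import product

# Hypothetical two-player arena (an assumption, not the paper's example).
PLAYERS = (0, 1)
ALLOW = {  # Allow(v, a): non-empty set of actions of player a in vertex v
    ("v0", 0): {"alpha", "beta"},
    ("v0", 1): {"alpha", "beta"},
    ("v1", 0): {"alpha"},
    ("v1", 1): {"alpha"},
}


def tab(vertex, move):
    """Tab(v, m): from v0, matching actions lead to v1; v1 is a sink."""
    if vertex == "v0":
        return "v1" if move[0] == move[1] else "v0"
    return "v1"


def legal_moves(vertex):
    """All moves (one allowed action per player) available in `vertex`."""
    choices = [sorted(ALLOW[(vertex, a)]) for a in PLAYERS]
    return [dict(zip(PLAYERS, combo)) for combo in product(*choices)]


def step(vertex, move):
    """Play one round: check the move against Allow, then apply Tab."""
    for a in PLAYERS:
        if move[a] not in ALLOW[(vertex, a)]:
            raise ValueError(f"action {move[a]!r} not allowed for player {a}")
    return tab(vertex, move)
```

Since `ALLOW` never maps to the empty set, `legal_moves` always returns at least one move, which mirrors the non-blocking condition.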

Our setting will refine this model, in that at each round, each player will also broadcast a message, which will be received by some of the players. The players that can receive a message will be specified using a communication graph that we will introduce later. The role of the messages will remain unclear until we commit to the definition of a strategy.

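Formally, a full history $h$ in $\mathcal{G}$ is a finite sequence

$$v_{0}\cdot(m_{0},\textsf{mes}_{0})\cdot v_{1}\cdot(m_{1},\textsf{mes}_{1})\cdot v_{2}\ldots(m_{s-1},\textsf{mes}_{s-1})\cdot v_{s}\in V\cdot\big{(}(\textup{Act}^{\mathpzc{P}}\times(\{0,1\}^{*})^{\mathpzc{P}})\cdot V\big{)}^{*}$$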

such that for every $0\leq r<s$ and every $a\in\mathpzc{P}$, $m_{r}(a)\in\textup{Allow}(v_{r},a)$ and $v_{r+1}=\textup{Tab}(v_{r},m_{r})$. For every $0\leq r<s$ and every $a\in\mathpzc{P}$, the word $\textsf{mes}_{r}(a)$ is the message appended to action $m_{r}(a)$ at step $r+1$, which will be broadcast to some other players. For readability we will also write $h$ as

$$v_{0}\xrightarrow{m_{0},\textsf{mes}_{0}}v_{1}\xrightarrow{m_{1},\textsf{mes}_{1}}v_{2}\ldots\xrightarrow{m_{s-1},\textsf{mes}_{s-1}}v_{s}$$

We write $\mathit{vertices}(h)=v_{0}\cdot v_{1}\cdots v_{s}$, and $\textit{last}(h)$ for the last vertex of $h$ (that is, $v_{s}$). If $r\leq s$, we also write $h_{\geq r}$ (resp. $h_{\leq r}$) for the suffix $v_{r}\cdot(m_{r},\textsf{mes}_{r})\cdot v_{r+1}\cdot(m_{r+1},\textsf{mes}_{r+1})\ldots(m_{s-1},\textsf{mes}_{s-1})\cdot v_{s}$ (resp. prefix $v_{0}\cdot(m_{0},\textsf{mes}_{0})\cdot v_{1}\cdot(m_{1},\textsf{mes}_{1})\ldots(m_{r-1},\textsf{mes}_{r-1})\cdot v_{r}$). We write $\textup{Hist}_{\mathcal{G}}(v_{0})$ (or simply $\textup{Hist}(v_{0})$ if $\mathcal{G}$ is clear from the context) for the set of full histories in $\mathcal{G}$ that start at $v_{0}$. If $h\in\textup{Hist}(v_{0})$ and $h^{\prime}\in\textup{Hist}(\textit{last}(h))$, then we write $h\cdot h^{\prime}$ for the obvious concatenation of histories (it then belongs to $\textup{Hist}(v_{0})$).

We add a (directed) communication graph $G=(\mathpzc{P},E)$ to the context. The set of vertices of $G$ is the set of players, and edges define a neighbourhood relation. An edge $(a,b)\in E$ (with $a\neq b$) means that player $b$ can see which actions are played by player $a$ together with the messages broadcast by player $a$. Later we write $a{\mathord{\rightarrowtriangle}}b$ whenever $(a,b)\in E$ or $a=b$, and ${\mathbbm{n}}(b)=\{a\in\mathpzc{P}\mid a{\mathord{\rightarrowtriangle}}b\}$ for the so-called neighbourhood of $b$ (that is, the set of players about which player $b$ has information). If $a,b\in\mathpzc{P}$, we write $\textsf{dist}_{G}(a,b)$ for the distance in $G$ from $a$ to $b$ ($+\infty$ if there is no path from $a$ to $b$).
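Both notions are easy to compute. The sketch below is an illustration under assumptions (the function names and the example graph are not from the paper): the neighbourhood follows the definition directly, and the distance is obtained by a breadth-first search along directed edges.

```python
import math
from collections import deque


def neighbourhood(players, edges, b):
    """n(b): the players a such that (a, b) is an edge or a == b."""
    return {a for a in players if a == b or (a, b) in edges}


def dist(edges, a, b):
    """dist_G(a, b): length of a shortest directed path, +inf if none."""
    seen = {a: 0}
    queue = deque([a])
    while queue:
        x = queue.popleft()
        if x == b:
            return seen[x]
        for (src, dst) in edges:
            if src == x and dst not in seen:
                seen[dst] = seen[x] + 1
                queue.append(dst)
    return math.inf


# Hypothetical graph: player 0 sees player 2, and player 1 sees player 0.
players = {0, 1, 2}
edges = {(2, 0), (0, 1)}
```

Here player 1 has information about herself and player 0 only, and information about player 2 needs two hops to reach her.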

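Let $a\in\mathpzc{P}$ be a player. The projection of $h$ for $a$ is denoted $\pi_{a}(h)$ and is defined by

$$v_{0}\cdot(m_{0}({\mathbbm{n}}(a)),\textsf{mes}_{0}({\mathbbm{n}}(a)))\cdot v_{1}\cdot(m_{1}({\mathbbm{n}}(a)),\textsf{mes}_{1}({\mathbbm{n}}(a)))\cdot v_{2}\ldots(m_{s-1}({\mathbbm{n}}(a)),\textsf{mes}_{s-1}({\mathbbm{n}}(a)))\cdot v_{s}\in V\cdot\big{(}(\textup{Act}^{{\mathbbm{n}}(a)}\times(\{0,1\}^{*})^{{\mathbbm{n}}(a)})\cdot V\big{)}^{*}$$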

This will be the information available to player $a$. In particular, messages broadcast by the players are part of this information. Note that we assume perfect recall, that is, while playing, player $a$ will remember all her past knowledge, that is, all of $\pi_{a}(h)$ if $h$ has been played so far. We define the undistinguishability relation $\sim_{a}$ as the equivalence relation over full histories induced by $\pi_{a}$: for two histories $h$ and $h^{\prime}$, $h\sim_{a}h^{\prime}$ iff $\pi_{a}(h)=\pi_{a}(h^{\prime})$. While playing, if $h\sim_{a}h^{\prime}$, player $a$ will not be able to know whether $h$ or $h^{\prime}$ has been played. We write $\textup{Hist}_{\mathcal{G},a}(v_{0})$ for the set of histories for player $a$ (also called $a$-histories) from $v_{0}$.
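The projection and the induced relation can be made concrete as follows. This is a minimal sketch under assumed encodings, none of which come from the paper: a full history is a Python list alternating vertices (strings) and pairs `(move, messages)`, where both components are dictionaries indexed by players, and `nbhd` stands for the neighbourhood of the observing player.

```python
def project(history, nbhd):
    """pi_a(h): keep every vertex, and restrict each (move, messages)
    pair to the players in nbhd, the neighbourhood of player a."""
    projected = []
    for item in history:
        if isinstance(item, tuple):  # a (move, messages) pair
            move, mes = item
            projected.append((
                {p: move[p] for p in nbhd},
                {p: mes[p] for p in nbhd},
            ))
        else:  # a vertex
            projected.append(item)
    return projected


def indistinguishable(h1, h2, nbhd):
    """h1 ~_a h2 iff both histories have the same projection for a."""
    return project(h1, nbhd) == project(h2, nbhd)


# Two hypothetical one-step histories that differ only in player 2's action.
silent = {0: "", 1: "", 2: ""}
h1 = ["v0", ({0: "alpha", 1: "alpha", 2: "alpha"}, silent), "v1"]
h2 = ["v0", ({0: "alpha", 1: "alpha", 2: "beta"}, silent), "v1"]
```

A player whose neighbourhood is `{0, 1}` cannot tell `h1` from `h2`, while a player whose neighbourhood contains player 2 can.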

We extend all the above notions to infinite sequences in a straightforward way and to the notion of full play. We write $\textup{Plays}_{\mathcal{G}}(v_{0})$ (or simply $\textup{Plays}(v_{0})$ if $\mathcal{G}$ is clear from the context) for the set of full plays in $\mathcal{G}$ that start at $v_{0}$.

A strategy for a player $a\in\mathpzc{P}$ from vertex $v_{0}$ is a mapping $\sigma_{a}\colon\textup{Hist}(v_{0})\to\textup{Act}\times\{0,1\}^{*}$ such that for every history $h\in\textup{Hist}(v_{0})$, $\sigma_{a}(h)[1]\in\textup{Allow}(\textit{last}(h),a)$, where the notation $\sigma_{a}(h)[1]$ denotes the first component of the pair $\sigma_{a}(h)$. The value $\sigma_{a}(h)[1]$ represents the action that player $a$ will play after $h$, while $\sigma_{a}(h)[2]$ is the message that she will append to her action and broadcast to all players $b$ such that $a{\mathord{\rightarrowtriangle}}b$. The strategy $\sigma_{a}$ is said to be $G$-compatible if, furthermore, for all histories $h,h^{\prime}\in\textup{Hist}(v_{0})$, $h\sim_{a}h^{\prime}$ implies $\sigma_{a}(h)=\sigma_{a}(h^{\prime})$. In that case, $\sigma_{a}$ can equivalently be seen as a mapping $\textup{Hist}_{\mathcal{G},a}(v_{0})\to\textup{Act}\times\{0,1\}^{*}$. An outcome of $\sigma_{a}$ is an (infinite) play $\rho=v_{0}\cdot(m_{0},\textsf{mes}_{0})\cdot v_{1}\cdot(m_{1},\textsf{mes}_{1})\ldots$ such that for every $r\geq 0$, $\sigma_{a}(\rho_{\leq r})=(m_{r}(a),\textsf{mes}_{r}(a))$. We write $\textup{out}(\sigma_{a},v_{0})$ for the set of outcomes of $\sigma_{a}$ from $v_{0}$.

A strategy profile is a tuple $\sigma_{\mathpzc{P}}=(\sigma_{a})_{a\in\mathpzc{P}}$, where, for every player $a\in\mathpzc{P}$, $\sigma_{a}$ is a strategy for player $a$. The strategy profile is said to be $G$-compatible whenever each $\sigma_{a}$ is $G$-compatible. We write $\textup{out}(\sigma_{\mathpzc{P}},v_{0})$ for the unique full play from $v_{0}$ which is an outcome of all the strategies in $\sigma_{\mathpzc{P}}$.

When $\sigma_{\mathpzc{P}}$ is a strategy profile and $\sigma^{\prime}_{d}$ a $G$-compatible strategy of player $d$, we write $\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}]$ for the profile where player $d$ plays according to $\sigma^{\prime}_{d}$, and each other player $a$ ($\neq d$) plays according to $\sigma_{a}$. The strategy $\sigma^{\prime}_{d}$ is a deviation of player $d$, or a $d$-deviation w.r.t. $\sigma_{\mathpzc{P}}$. Such a $d$-deviation is said to be profitable w.r.t. $\sigma_{\mathpzc{P}}$ whenever $\textup{payoff}_{d}\Big{(}\mathit{vertices}(\textup{out}(\sigma_{\mathpzc{P}},v_{0}))\Big{)}<\textup{payoff}_{d}\Big{(}\mathit{vertices}(\textup{out}(\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}],v_{0}))\Big{)}$.

Definition 2.2.

A Nash equilibrium from $v_{0}$ is a $G$-compatible strategy profile $\sigma_{\mathpzc{P}}$ such that for every $d\in\mathpzc{P}$, there is no profitable $d$-deviation w.r.t. $\sigma_{\mathpzc{P}}$.

In this definition, the deviation $\sigma^{\prime}_{d}$ need not actually be $G$-compatible, since the only meaningful part of $\sigma^{\prime}_{d}$ is along $\textup{out}(\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}],v_{0})$, where there are no $\sim_{d}$-equivalent histories: any deviation can be made $G$-compatible without affecting the profitability of the resulting outcome.

Remark 2.3.

Before pursuing our study, let us make clear what information players have: a player knows the full arena of the game and the whole communication graph; when playing the game, a player sees the states which are visited, and sees the actions of and messages from her neighbours (in the communication graph). When playing the profile of a Nash equilibrium, all players know all strategies, hence a player knows precisely what the main outcome is expected to be; in particular, when the play leaves the main outcome, each player knows that a deviation has occurred, even though she did not see the deviator or receive any message. Note that deviations which do not leave the main outcome may occur; in this case, only the neighbours of the deviator will know that such a deviation occurred; we will see that such deviations need not be taken into account.

2.2 An example

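We consider the five-player game described in Figure 1, in which the set of players is $\mathpzc{P}=\{0,1,2,3,4\}$. The action alphabet is $\textup{Act}=\{\alpha,\beta\}$, and the initial vertex is assumed to be $v_{0}$. We suppose that the payoff function vector is defined as follows (to be read as the list of payoffs of the players):

$$\textup{payoff}(\rho)=\left\{\begin{array}[]{ll}(0,0,1,1,1)&\text{if}\ \rho\ \text{visits}\ v_{1}\ \text{infinitely often}\\ (0,0,2,2,2)&\text{if}\ \rho\ \text{visits}\ v_{1}\ \text{finitely often and}\ v^{\prime}_{1}\ \text{infinitely often}\\ (0,0,0,2,2)&\text{if}\ \rho\ \text{ends up in}\ v_{2}\\ (0,0,2,0,2)&\text{if}\ \rho\ \text{ends up in}\ v_{3}\\ (0,0,2,2,0)&\text{if}\ \rho\ \text{ends up in}\ v_{4}\\ (0,0,3,3,3)&\text{if}\ \rho\ \text{ends up in}\ v^{\prime}_{0}\end{array}\right.$$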

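We consider a (partial) strategy profile $\sigma$ whose main outcome is:

$$\rho=\big{(}v_{0}\cdot(\alpha^{5},\textsf{mes}_{\epsilon})\cdot v_{1}\cdot(\alpha^{5},\textsf{mes}_{\epsilon})\big{)}^{\omega}$$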

where $\textsf{mes}_{\epsilon}(a)=\epsilon$ for every $a\in\mathpzc{P}$. Note that players $0$ and $1$ cannot benefit from any deviation since their payoff is uniformly $0$. Then notice that no player alone can deviate from $\rho$ to $v^{\prime}_{0}$. Now, each of the three players $2$, $3$ and $4$ can alone deviate to $v^{\prime}_{1}$ and (try to) do so infinitely often. We examine those deviations. If players $0$ and $1$ manage to learn who the deviator is, then, together, they can punish the deviator: if they learn that player $2$ (resp. $3$, $4$) is the deviator, then they will enforce vertex $v_{2}$ (resp. $v_{3}$, $v_{4}$). If they do not manage to learn who the deviator is, then they will not know what to do, and therefore, in any completion of the strategy profile, there will be some profitable deviation for at least one of the players (hence there will not be any Nash equilibrium whose main outcome is $\rho$).

We now examine the three communication graphs $G_{1}$, $G_{2}$ and $G_{3}$ depicted in Figure 1. Using communication based on graph $G_{1}$, if player $4$ deviates, then player $0$ will see this immediately and will be able to communicate this fact to player $1$; if player $3$ deviates, then player $4$ will see this immediately and will be able to communicate this fact to player $0$, who will transmit it to player $1$; if player $2$ deviates, then no one will see anything, and from this the players can infer that the deviator is player $2$. Hence the identity of the deviator can be deduced in all cases.

Using communication based on graph $G_{2}$, if either player $3$ or player $4$ deviates, then player $0$ will see this immediately and will be able to communicate this fact to player $1$ using the richness of the communication scheme (words over $\{0,1\}$). As before, the identity of deviator $2$ will be inferred after a while.

Using communication based on graph $G_{3}$, if player $4$ deviates, then player $0$ will see this immediately and will be able to communicate this fact to player $1$ (as before); however, if player $2$ or player $3$ deviates, no one (except players $2$ and $3$) will be able to learn who is deviating.

We can conclude that there is a Nash equilibrium with graph $G_{1}$ or $G_{2}$ whose main outcome is $\rho$, but not with graph $G_{3}$.
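The payoff comparisons behind this example can be checked mechanically. The sketch below encodes only the payoff case distinction of this example; it does not model the arena of Figure 1. As an assumption made for illustration, a play is taken to be ultimately periodic and is represented by the list of vertices on its repeated cycle, with primed vertices written `"v0'"` and `"v1'"`; the names `payoff` and `profitable` are hypothetical.

```python
def payoff(cycle):
    """Payoff vector of an ultimately periodic play, given the vertices
    that are repeated forever (the finite prefix does not matter)."""
    if "v1" in cycle:      # v1 is visited infinitely often
        return (0, 0, 1, 1, 1)
    if "v1'" in cycle:     # v1 finitely often, v1' infinitely often
        return (0, 0, 2, 2, 2)
    ends_up_in = {
        "v2": (0, 0, 0, 2, 2),
        "v3": (0, 0, 2, 0, 2),
        "v4": (0, 0, 2, 2, 0),
        "v0'": (0, 0, 3, 3, 3),
    }
    for vertex, vector in ends_up_in.items():
        if set(cycle) == {vertex}:
            return vector
    raise ValueError("play not covered by the case distinction")


def profitable(d, main_cycle, deviated_cycle):
    """True iff player d strictly gains by moving to the deviated play."""
    return payoff(main_cycle)[d] < payoff(deviated_cycle)[d]


main = ["v0", "v1"]            # cycle of the main outcome
unpunished = ["v0", "v1'"]     # a deviation that keeps reaching v1'
punished_2 = ["v2"]            # players 0 and 1 enforce v2 against player 2
```

With these encodings, an unpunished deviation through $v^{\prime}_{1}$ raises the payoff of player $2$ from $1$ to $2$, whereas ending up in $v_{2}$ lowers it to $0$, which is why identifying the deviator matters.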

2.3 Two-player turn-based game structures

Two-player turn-based game structures are specific cases of the previous model, where at each vertex, at most one player has more than one action in her set of allowed actions. But for convenience, we will give a simplified definition, with only objects that will be useful.

A two-player turn-based game structure is a tuple $G=\langle S,S_{\textup{Eve}},S_{\textup{Adam}},s_{\textup{init}},A,\textup{Allow},\textup{Tab}\rangle$, where $S=S_{\textup{Eve}}\sqcup S_{\textup{Adam}}$ is a finite set of states (states in $S_{\textup{Eve}}$ belong to player Eve whereas states in $S_{\textup{Adam}}$ belong to player Adam), $s_{\textup{init}}\in S$ is the initial state, $A$ is a finite alphabet, $\textup{Allow}\colon S\to 2^{A}\setminus\{\emptyset\}$ gives the set of available actions, and $\textup{Tab}\colon S\times A\to S$ is the next-state function. If $s\in S_{\textup{Eve}}$ (resp. $S_{\textup{Adam}}$), $\textup{Allow}(s)$ is the set of actions allowed to Eve (resp. Adam) in state $s$.

In this context, strategies will use sequences of states. That is, if $a$ denotes Eve or Adam, an $a$ -strategy is a partial function $\sigma_{a}:S^{*}\cdot S_{a}\to A$ such that for every $H\in S^{*}\cdot S_{a}$ such that $\sigma_{a}(H)$ is defined, $\sigma_{a}(H)\in\textup{Allow}(\textit{last}(H))$ . Note that we do not include any winning condition or payoff function in the tuple, hence the name structure.
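A structure of this kind, together with strategies reading sequences of states, can be sketched as follows. The class name `TurnBasedStructure`, its field names and the toy instance are assumptions made for illustration; only the shape of the tuple follows the definition above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

# A strategy maps a sequence of states to an action.
Strategy = Callable[[Tuple[str, ...]], str]


@dataclass
class TurnBasedStructure:
    eve_states: FrozenSet[str]
    adam_states: FrozenSet[str]
    init: str
    allow: Dict[str, FrozenSet[str]]   # Allow: S -> non-empty set of actions
    tab: Dict[Tuple[str, str], str]    # Tab: S x A -> S

    def play(self, eve: Strategy, adam: Strategy, rounds: int) -> List[str]:
        """Unfold `rounds` steps from the initial state."""
        history = [self.init]
        for _ in range(rounds):
            state = history[-1]
            strategy = eve if state in self.eve_states else adam
            action = strategy(tuple(history))
            if action not in self.allow[state]:
                raise ValueError(f"{action!r} not allowed in {state!r}")
            history.append(self.tab[(state, action)])
        return history


# Hypothetical structure: Eve owns state "e", Adam owns state "a".
toy = TurnBasedStructure(
    eve_states=frozenset({"e"}),
    adam_states=frozenset({"a"}),
    init="e",
    allow={"e": frozenset({"x", "y"}), "a": frozenset({"go"})},
    tab={("e", "x"): "a", ("e", "y"): "e", ("a", "go"): "e"},
)
alternating = toy.play(lambda h: "x", lambda h: "go", 4)
```

As in the definition, no winning condition or payoff is attached: the class only describes who moves where.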

2.4 The problems we are looking at

We are interested in the constrained existence of a Nash equilibrium. For simplicity, we define rectangular threshold constraints, but could well impose more complex constraints, like Boolean combinations of linear constraints.

Problem 2.4 (Constrained existence problem).

Given a concurrent game $\langle V,v_{\textup{init}},\mathpzc{P},\textup{Act},\Sigma,\textup{Allow},\textup{Tab},(\textup{payoff}_{a})_{a\in\mathpzc{P}}\rangle$ , a communication graph $G$ for $\mathpzc{P}$ , and a predicate $P$ over $\mathbb{R}^{|\mathpzc{P}|}$ , can we decide whether there exists a Nash equilibrium $\sigma_{\mathpzc{P}}$ from $v_{\textup{init}}$ such that $\textup{payoff}(\mathit{vertices}(\textup{out}(\sigma_{\mathpzc{P}},v_{\textup{init}})))\in P$ ? If so, compute one. If the predicate $P$ is trivial, we simply speak of the existence problem.
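To make the predicate $P$ concrete, here is a minimal Python sketch under one possible reading of a rectangular threshold constraint, namely an independent lower and upper bound on each player's payoff. The function name `rectangular_threshold` and the dictionary encoding of payoff vectors are assumptions made for illustration only.

```python
import math


def rectangular_threshold(lower, upper):
    """Build a predicate P over payoff vectors from per-player bounds.

    lower, upper: dicts player -> bound; a missing player is unconstrained.
    """
    def predicate(payoff):
        return all(
            lower.get(a, -math.inf) <= value <= upper.get(a, math.inf)
            for a, value in payoff.items()
        )
    return predicate


# The trivial predicate (plain existence problem) and a one-sided constraint.
trivial = rectangular_threshold({}, {})
at_least_one_for_0 = rectangular_threshold({0: 1.0}, {})
```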

The case where the communication graph has no edge was studied in depth in [4], with a generic two-player construction called the suspect construction, which allows one to decide the constrained existence problem for many kinds of payoff functions. The case where the communication graph is a clique was the subject of [7]. The general case of an arbitrary communication graph has not been investigated so far, but induces interesting developments. In the next section, we show that we can restrict the search for Nash equilibria to the search for so-called normed strategy profiles, where the communication via messages follows a very simple pattern. We also argue that deviations which do not impact the visited vertices need not be considered in the analysis. Given those reductions, we then propose the construction of a two-player game, which will track those normed profiles. This construction is inspired by the suspect-game construction of [4] and by the epistemic game of [3].

3 Reduction to profiles following a simple communication mechanism

We fix a concurrent game $\langle V,v_{\textup{init}},\textup{Act},\mathpzc{P},\Sigma,\textup{Allow},\textup{Tab},(\textup{payoff}_{a})_{a\in\mathpzc{P}}\rangle$ and a communication graph $G$ . We assume that $v_{\textup{init}}=v_{0}$ . We will reduce the search for Nash equilibria to the search for strategy profiles with a very specific shape. In particular, we will show that the richness of the communication offered by the setting is somehow useless, and that a very simple communication pattern will be sufficient for characterizing Nash equilibria.

In the following, we write $\textsf{mes}_{\epsilon}$ for the vector assigning the empty word $\epsilon$ to every player $a\in\mathpzc{P}$ . Furthermore, for every $d\in\mathpzc{P}$ , we pick a word $\textsf{id}_{d}\in\{0,1\}^{+}$ , such that these words are pairwise distinct (and, being nonempty, all different from $\epsilon$ ).

We first define restrictions for deviations. Let $\sigma_{\mathpzc{P}}$ be a strategy profile. A player- $d$ deviation $\sigma^{\prime}_{d}$ is said to be immediately visible whenever, writing $h$ for the longest common prefix of $\textup{out}(\sigma_{\mathpzc{P}},v_{0})$ and $\textup{out}(\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}],v_{0})$ , $\textup{Tab}(\textit{last}(h),m)\neq\textup{Tab}(\textit{last}(h),m^{\prime})$ , where $m=\sigma_{\mathpzc{P}}(h)[1]$ and $m^{\prime}=\big(\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}](h)\big)[1]$ are the next moves according to $\sigma_{\mathpzc{P}}$ and $\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}]$ . That is, at the first position where player $d$ changes her strategy, it becomes public information that a deviation has occurred (even though only some players, namely all the players $a$ with $d{\mathord{\rightarrowtriangle}}a$ , know who deviated, while the others do not). It is furthermore called honest whenever for every $h^{\prime}\in\textup{out}(\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}],v_{0})$ such that $h$ is a (non-strict) prefix of $h^{\prime}$ , $\sigma^{\prime}_{d}(h^{\prime})[2]=\textsf{id}_{d}$ . In other words, player $d$ admits she deviated, and does so immediately and forever.

The simple communication mechanism that we will design consists in reporting the deviator (role of the direct neighbours of the deviator), and propagating this information along the communication graph (for all the other players). Formally, let $\sigma_{\mathpzc{P}}$ be a strategy profile, and let $\rho$ be its main outcome. The profile $\sigma_{\mathpzc{P}}$ is said to be normed whenever the following conditions hold:

1. for every $h\in\textup{out}(\sigma_{\mathpzc{P}})\cup\bigcup_{d\in\mathpzc{P},\ \sigma^{\prime}_{d}}\textup{out}(\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}],v_{0})$ , if $\mathit{vertices}(h)$ is a prefix of $\mathit{vertices}(\rho)$ , then for every $a\in\mathpzc{P}$ , $\sigma_{a}(h)[2]=\epsilon$ ;

2. for every $d\in\mathpzc{P}$ , for every $d$ -strategy $\sigma^{\prime}_{d}$ , if $h\cdot(m,\textsf{mes})\cdot v\in\textup{out}(\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}],v_{0})$ is the first step out of $\mathit{vertices}(\rho)$ , then for every $d{\mathord{\rightarrowtriangle}}a$ , $\sigma_{a}(h\cdot(m,\textsf{mes})\cdot v)[2]=\textsf{id}_{d}$ ;

3. for every $d\in\mathpzc{P}$ , for every $d$ -strategy $\sigma^{\prime}_{d}$ , if $h\cdot(m,\textsf{mes})\cdot v\in\textup{out}(\sigma_{\mathpzc{P}}[d/\sigma^{\prime}_{d}],v_{0})$ has left the main outcome for more than one step, then for every $a\in\mathpzc{P}$ , $\sigma_{a}(h\cdot(m,\textsf{mes})\cdot v)[2]=\epsilon$ if for all $b{\mathord{\rightarrowtriangle}}a$ , $\textsf{mes}(b)=\epsilon$ and $\sigma_{a}(h\cdot(m,\textsf{mes})\cdot v)[2]=\textsf{id}_{d}$ if there is $b{\mathord{\rightarrowtriangle}}a$ such that $\textsf{mes}(b)=\textsf{id}_{d}$ ; note that this is well defined since at most one id can be transmitted.

The first condition says that, as long as a deviation is not visible, then no message needs to be sent; the second condition says that as soon as a deviation becomes visible, then messages denouncing the deviator should be sent by “those who know”, that is, the (immediate) neighbours of the deviator; the third condition says that the name (actually, the id) of the deviator should propagate according to the communication graph in an epidemic way.
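The reporting and propagation rules can be simulated round by round. The Python sketch below is only an illustration under stated assumptions: the communication graph is encoded as a set of pairs `(b, a)` standing for $b{\mathord{\rightarrowtriangle}}a$, the deviation is honest (the deviator sends her id at every round), and the function name `next_messages`, the toy graph and the id `"1"` are hypothetical.

```python
EPS = ""  # the empty word, i.e. "no message"


def next_messages(edges, players, deviator, ident, prev_mes):
    """One communication round after the deviation has become visible.

    edges: set of pairs (b, a) meaning b -> a (a sees what b sends).
    prev_mes: messages sent in the previous round, as a dict player -> word.
    """
    nxt = {}
    for a in players:
        if a == deviator:
            nxt[a] = ident  # honest deviation: d keeps sending id_d
        elif any((b, a) in edges and prev_mes[b] == ident for b in players):
            nxt[a] = ident  # conditions 2 and 3: report or relay id_d
        else:
            nxt[a] = EPS    # nothing received, nothing sent
    return nxt


# Toy graph (assumed): 0 -> 1 -> 2, and player 3 hears nobody.
players = [0, 1, 2, 3]
edges = {(0, 1), (1, 2)}
round0 = {0: "1", 1: EPS, 2: EPS, 3: EPS}  # deviator 0 sends id_0 = "1"
round1 = next_messages(edges, players, 0, "1", round0)
round2 = next_messages(edges, players, 0, "1", round1)
```

In this toy run the id reaches player 1 after one round and player 2 after two rounds, while player 3, who has no incoming edge, never sends it.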

Note that the profiles discussed in Section 2.2 were actually normed.

Theorem 3.1.

The existence of a Nash equilibrium $\sigma_{\mathpzc{P}}$ with payoff $p$ is equivalent to the existence of a normed strategy profile $\sigma^{\prime}_{\mathpzc{P}}$ with payoff $p$ , which is resistant to immediately visible and honest single-player deviations.

The proof of this theorem is rather technical and is therefore postponed to Appendix A. We only give some intuition here. First, we explain why being resistant to immediately visible and honest deviations is enough. Notice that as long as the sequence of vertices follows the main outcome, one can simply ignore the deviation and act only when the deviation becomes visible, as if the deviator had started deviating only at this moment. This will be enough to punish the deviator. The “honest” part comes from the fact that one should simply ignore the messages sent by the deviator, since taking them into account can only be in her interest (if it were not, then why would she send any message at all?).

Second, we show why no one should communicate as long as the sequence of vertices follows the main outcome. The reason is that if no one has deviated then any message is essentially useless, and if a deviation has happened, as explained earlier it can just be ignored as long as it has not become visible.

Finally, we demonstrate why the richness of the communication mechanism is in a way useless. Intuitively, one can understand that the only factors that should matter when playing are the sequence of the vertices that have been visited (because payoff functions only take into account the visited vertices) and the identity of the deviator. Thus the messages should only be used so that players can know of the identity of the deviator in the fastest possible way, and we show that nothing is faster than a sort of epidemic mechanism where one simply broadcasts the identity of the deviator whenever one received the information.

4 The epistemic game abstraction

We fix a concurrent game $\langle V,v_{\textup{init}},\textup{Act},\mathpzc{P},\Sigma,\textup{Allow},\textup{Tab},(\textup{payoff}_{a})_{a\in\mathpzc{P}}\rangle$ for the rest of the section, and a communication graph $G$ for $\mathpzc{P}$ . We will implement an epistemic abstraction, which will track normed strategy profiles, and check that there are no profitable immediately visible and honest single-player deviations.

4.1 Description of the epistemic game

A situation is a triple $(d,I,K)$ in $\mathpzc{P}\times 2^{\mathpzc{P}}\times\big(2^{\mathpzc{P}}\big)^{\mathpzc{P}}$ , which consists of a deviator $d\in\mathpzc{P}$ , a list of players $I$ having received the information that $d$ is the deviator, and a knowledge function $K$ that associates to every player $a$ a list of suspects $K(a)$ ; in particular, it should be the case that $d\in I$ and for every $a\in I$ , $K(a)=\{d\}$ . We write $\mathsf{Sit}$ for the set of situations.

The epistemic game ${}_{{}^{G}}$ of the concurrent game and $G$ is defined as a two-player game structure $\langle S,S_{\textup{{Eve}}},S_{\textup{{Adam}}},s_{\textup{init}},\Sigma^{\prime},\textup{Allow}^{\prime},\textup{Tab}^{\prime}\rangle$ . We describe the states and the transitions leaving those states; in particular, components $\Sigma^{\prime}$ , $\textup{Allow}^{\prime}$ , $\textup{Tab}^{\prime}$ of the above tuple will only be implicitly defined.

Eve’s states $S_{\textup{{Eve}}}$ consist of elements of $V\times 2^{\mathsf{Sit}}$ such that if $(v,X)$ is a state then for all $a\in\mathpzc{P}$ the set $\{(d,I,K)\in X\mid d=a\}$ is either a singleton or empty (there is at most one situation associated with a given player $a$ ). We write $\textsf{dev}(X)$ for the set $\{d\in\mathpzc{P}\mid\exists(d,I,K)\in X\}$ of agents which are the deviator in some situation of $X$ . If $d\in\textsf{dev}(X)$ , we write $(d,I^{X}_{d},K^{X}_{d})$ for the unique triple belonging to $X$ having deviator $d$ . Hence, $X=\{(d,I^{X}_{d},K^{X}_{d})\mid d\in\textsf{dev}(X)\}$ . Intuitively, a state $(v,X)$ of Eve will correspond to a situation where the game has proceeded to vertex $v$ , but, if $\textsf{dev}(X)\neq\emptyset$ , several players may be the one who deviated. Each player $d\in\textsf{dev}(X)$ may be responsible for the deviation; some players will have received a message denouncing $d$ (those are in the set $I^{X}_{d}$ ), and some will deduce things from what they observe (this is given by $K^{X}_{d}$ ). Note that the (un)distinguishability relation of a player $a$ will be deduced from $X$ : if $d$ deviated and $a\in I^{X}_{d}$ , then $a$ will know $d$ deviated; if $a$ is neither in $I^{X}_{d}$ nor in $I^{X}_{d^{\prime}}$ , then $a$ will not be able to know whether $d$ or $d^{\prime}$ deviated (as we will prove later, in Lemma 4.1).

First let us consider the case where $X=\emptyset$ , which is to be understood as the case where no deviation has arisen yet. In state $(v,\emptyset)$ , Eve’s actions are the moves enabled in $v$ in the concurrent game. When she plays move $m\in\textup{Act}^{\mathpzc{P}}$ , the game progresses to Adam’s state $((v,\emptyset),m)\in S_{\textup{{Adam}}}$ where Adam’s actions are vertices $v^{\prime}\in V$ such that there exists a player $d\in\mathpzc{P}$ and an action $\delta\in\textup{Act}$ such that $\textup{Tab}(v,(m[d/\delta]))=v^{\prime}$ . When Adam plays $v^{\prime}$ , either $v^{\prime}=\textup{Tab}(v,m)$ and the game progresses to Eve’s state $(v^{\prime},\emptyset)$ or $v^{\prime}\neq\textup{Tab}(v,m)$ and the game progresses to Eve’s state $(v^{\prime},X^{\prime})$ where:

• $d\in\textsf{dev}(X^{\prime})$ if and only if there is $\delta\in\textup{Act}$ such that $\textup{Tab}(v,(m[d/\delta]))=v^{\prime}$ . It means that given the next state $v^{\prime}$ , $d$ is a possible deviator;

• if $d\in\textsf{dev}(X^{\prime})$ , then:

  – $I^{X^{\prime}}_{d}=\{a\in\mathpzc{P}\mid d{\mathord{\rightarrowtriangle}}a\}$ ;

  – for every $a\in I_{d}^{X^{\prime}}$ , $K_{d}^{X^{\prime}}(a)=\{d\}$ ;

  – for every $a\notin I^{X^{\prime}}_{d}$ , $K^{X^{\prime}}_{d}(a)=\{b\in\mathpzc{P}\mid\exists\beta\in\textup{Act}\ \text{s.t.}\ \textup{Tab}(v,(m[b/\beta]))=v^{\prime}\}\setminus\{b\in\mathpzc{P}\mid b{\mathord{\rightarrowtriangle}}a\}$ . Those are all the players that can be suspected by $a$ , given the vertex $v^{\prime}$ , and the absence of messages so far.

We write $X^{\prime}=\textsf{upd}((v,\emptyset),m,v^{\prime})$ . Note that $X^{\prime}=\emptyset$ whenever (and only when) $\textup{Tab}(v,m)=v^{\prime}$ .
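The first update can be sketched in Python as follows. This is an illustrative reading of the definition, not code from the paper: players are assumed to be the indices `0..n-1` of a move tuple, the graph is a set of pairs `(b, a)` for $b{\mathord{\rightarrowtriangle}}a$ that includes self-loops (so that $d\in I_{d}$ ), a set of situations is encoded as a dict `d -> (I_d, K_d)`, and `toy_tab` is a made-up transition function.

```python
def swap(m, b, beta):
    """m[b/beta]: the move m where player b plays beta instead."""
    return m[:b] + (beta,) + m[b + 1:]


def initial_update(players, actions, edges, tab, v, m, v_next):
    """X' = upd((v, emptyset), m, v') as a dict d -> (I_d, K_d)."""
    if tab(v, m) == v_next:
        return {}  # Adam complied: still no deviation
    # Players who can reach v' by changing only their own action.
    suspects = {b for b in players
                if any(tab(v, swap(m, b, beta)) == v_next for beta in actions)}
    X = {}
    for d in suspects:
        informed = {a for a in players if (d, a) in edges}
        knowledge = {}
        for a in players:
            if a in informed:
                knowledge[a] = {d}
            else:
                # a rules out every player whose action she observes.
                seen_by_a = {b for b in players if (b, a) in edges}
                knowledge[a] = suspects - seen_by_a
        X[d] = (informed, knowledge)
    return X


def toy_tab(v, m):
    """Made-up table: player 0 deviating leads to x, players 1 or 2 to y."""
    if m[0] == "b":
        return "x"
    if "b" in m[1:]:
        return "y"
    return "ok"


players = [0, 1, 2]
edges = {(0, 0), (1, 1), (2, 2), (1, 0)}  # self-loops plus 1 -> 0
X = initial_update(players, ["a", "b"], edges, toy_tab,
                   "v0", ("a", "a", "a"), "y")
```

In this toy instance the suspects are players 1 and 2, and only player 0 is directly informed when player 1 deviates.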

In a state $(v,X)\in S_{\textup{{Eve}}}$ where $X\neq\emptyset$ , Eve’s actions consist of functions from $\textsf{dev}(X)$ to $\textup{Act}^{\mathpzc{P}}$ that are compatible with players’ knowledge, that is: $f:\textsf{dev}(X)\to\textup{Act}^{\mathpzc{P}}$ is an action enabled in $(v,X)$ if and only if (i) for all $d\in\textsf{dev}(X)$ , for each $a\in\mathpzc{P}$ , $f(d)(a)\in\textup{Allow}(v,a)$ , (ii) for all $d,d^{\prime}\in\textsf{dev}(X)$ , for all $a\in\mathpzc{P}$ , if $a\notin I^{X}_{d}\cup I^{X}_{d^{\prime}}$ and $K^{X}_{d}(a)=K^{X}_{d^{\prime}}(a)$ then $f(d)(a)=f(d^{\prime})(a)$ ; that is, if a player has not received any message so far but has the same knowledge about the possible deviators in two situations, then Eve’s suggestion for that player’s action must be the same in both situations. (Footnote 2: note in particular that “ $K^{X}_{d}(a)$ is a singleton” does not imply $a\in I_{d}^{X}$ ; those are two distinguishable situations: the message with the identity of the deviator may not have been received in the first case, while it has been received in the second case.) When Eve plays action $f$ in $(v,X)$ , the next state is $((v,X),f)\in S_{\textup{{Adam}}}$ , where Adam’s actions correspond to states of the game that are compatible with $(v,X)$ and $f$ , that is states $v^{\prime}$ such that there exists $d\in\textsf{dev}(X)$ and $\delta\in\textup{Act}$ such that $\textup{Tab}(v,f(d)[d/\delta])=v^{\prime}$ .
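Conditions (i) and (ii) on Eve's actions translate into a direct check. The sketch below assumes a set of situations encoded as a dict `d -> (I_d, K_d)` and moves encoded as tuples indexed by player; the function name `is_enabled`, the parameter `allowed` (standing for $a\mapsto\textup{Allow}(v,a)$ at the current vertex) and the toy data are hypothetical.

```python
def is_enabled(X, f, players, allowed):
    """Check conditions (i) and (ii) for Eve's action f in a state (v, X).

    X: dict d -> (I_d, K_d); f: dict d -> move (tuple indexed by player);
    allowed: dict player -> set of actions allowed in the current vertex.
    """
    for d in X:
        if any(f[d][a] not in allowed[a] for a in players):
            return False  # (i) every suggested action must be allowed
    for d in X:
        for d2 in X:
            I_d, K_d = X[d]
            I_d2, K_d2 = X[d2]
            for a in players:
                uninformed = a not in I_d and a not in I_d2
                if uninformed and K_d[a] == K_d2[a] and f[d][a] != f[d2][a]:
                    return False  # (ii) a cannot tell d from d2
    return True


# Toy data (assumed): player 2 suspects both 0 and 1 and is never informed.
players = [0, 1, 2]
allowed = {a: {"a", "b"} for a in players}
X = {
    0: ({0}, {0: {0}, 1: {0}, 2: {0, 1}}),
    1: ({1}, {0: {1}, 1: {1}, 2: {0, 1}}),
}
f_good = {0: ("a", "a", "a"), 1: ("b", "a", "a")}
f_bad = {0: ("a", "a", "a"), 1: ("a", "a", "b")}
```

Here `f_good` may differ on player 0, who knows whether she herself deviated, whereas `f_bad` is rejected because it asks player 2 to act differently in two situations she cannot tell apart.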

When Adam chooses the action $v^{\prime}$ in $((v,X),f)$ , the game progresses to Eve’s state $(v^{\prime},X^{\prime})$ , where:

• $d\in\textsf{dev}(X^{\prime})$ if and only if $d\in\textsf{dev}(X)$ and there exists $\delta\in\textup{Act}$ such that $\textup{Tab}(v,f(d)[d/\delta])=v^{\prime}$ . It corresponds to a case where $d$ was already a possible deviator and can continue deviating so that the game goes to $v^{\prime}$ ;

• if $d\in\textsf{dev}(X^{\prime})$ , then:

  – $I_{d}^{X^{\prime}}=I_{d}^{X}\cup\{a\in\mathpzc{P}\mid\exists b\in I_{d}^{X}\ \text{s.t.}\ b{\mathord{\rightarrowtriangle}}a\}$ . New players receive a message with the deviator id;

  – for every $a\in I_{d}^{X^{\prime}}$ , $K_{d}^{X^{\prime}}(a)=\{d\}$ ;

  – for every $a\notin I_{d}^{X^{\prime}}$ , $K_{d}^{X^{\prime}}(a)=\{b\in K_{d}^{X}(a)\mid\exists\beta\in\textup{Act}\ \text{s.t.}\ \textup{Tab}(v,f(b)[b/\beta])=v^{\prime}\}\setminus\{c\in\mathpzc{P}\mid\textsf{dist}_{G}(c,a)\leq\max\{\textsf{dist}_{G}(c,c^{\prime})\mid c^{\prime}\in I_{c}^{X}\}+1\}$ . Those are the players that could have deviated but for which player $a$ would not have received the signal yet.

We write $X^{\prime}=\textsf{upd}((v,X),f,v^{\prime})$ . Note that $X^{\prime}\neq\emptyset$ and that $\textsf{dev}(X^{\prime})\subseteq\textsf{dev}(X)$ .
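The update from a nonempty set of situations can be sketched along the same lines. The Python code below is an illustration under assumptions: players are indices of move tuples, the graph is a set of pairs `(b, a)` for $b{\mathord{\rightarrowtriangle}}a$ with self-loops, situations are encoded as a dict `d -> (I_d, K_d)`, distances are computed by breadth-first search, and the excluded set is only evaluated for players that are still suspects (for which $I_{c}^{X}$ is defined). All names and the toy data are hypothetical.

```python
from collections import deque


def distances_from(edges, players, src):
    """BFS distances from src along edges; unreachable players are omitted."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in players:
            if (u, w) in edges and w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def update(players, actions, edges, tab, v, X, f, v_next):
    """X' = upd((v, X), f, v') for X nonempty, as a dict d -> (I_d, K_d)."""
    def can_reach(b):
        m = f[b]
        return any(tab(v, m[:b] + (beta,) + m[b + 1:]) == v_next
                   for beta in actions)

    new_X = {}
    for d in X:
        if not can_reach(d):
            continue  # d is no longer a possible deviator
        informed_old, knowledge_old = X[d]
        # Informed players relay id_d to their out-neighbours.
        informed = informed_old | {
            a for a in players if any((b, a) in edges for b in informed_old)}
        knowledge = {}
        for a in players:
            if a in informed:
                knowledge[a] = {d}
                continue
            suspects = {b for b in knowledge_old[a]
                        if b in X and can_reach(b)}
            cleared = set()
            for c in suspects:
                dist = distances_from(edges, players, c)
                radius = max((dist[c2] for c2 in X[c][0] if c2 in dist),
                             default=0)
                if a in dist and dist[a] <= radius + 1:
                    cleared.add(c)  # a would already have heard about c
            knowledge[a] = suspects - cleared
        new_X[d] = (informed, knowledge)
    return new_X


def toy_tab(v, m):
    """Made-up table: a deviation by player 0 or player 3 leads to y."""
    return "y" if m[0] == "b" or m[3] == "b" else "ok"


players = [0, 1, 2, 3]
edges = {(0, 0), (1, 1), (2, 2), (3, 3), (0, 1), (1, 2)}
X = {
    0: ({0, 1}, {0: {0}, 1: {0}, 2: {0, 3}, 3: {0}}),
    3: ({3}, {0: {3}, 1: {3}, 2: {0, 3}, 3: {3}}),
}
f = {0: ("a", "a", "a", "a"), 3: ("a", "a", "a", "a")}
new_X = update(players, ["a", "b"], edges, toy_tab, "y", X, f, "y")
```

In the toy run, player 2 is two steps away from player 0: if player 3 is the deviator, player 2 receives no id at the second round and can therefore clear player 0.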

We let $R=(v_{0},X_{0})\cdot((v_{0},X_{0}),f_{0})\cdot(v_{1},X_{1})\dots$ be an infinite play from $(v_{0},X_{0})=(v_{\textup{init}},\emptyset)$ . We write $\mathit{visited}(R)$ for $v_{0}v_{1}\dots\in V^{\omega}$ the sequence of vertices visited along $R$ . We also define $\textsf{dev}(R)=\emptyset$ if $X_{r}=\emptyset$ for every $r$ , and $\textsf{dev}(R)=\lim_{r\to+\infty}\textsf{dev}(X_{r})$ otherwise. This is the set of possible deviators along $R$ .

4.1.1 Winning condition of Eve.

A zero-sum game will be played on the game structure ${}_{{}^{G}}$ , and the winning condition of Eve will be given on the branching structure of the set of outcomes of a strategy for Eve, and not individually on each outcome, as is standard in two-player zero-sum games. We write $s_{\textup{init}}=(v_{\textup{init}},\emptyset)$ for the initial state. Let $p=(p_{a})_{a\in\mathpzc{P}}\in\mathbb{R}^{\mathpzc{P}}$ , and $\zeta$ be a strategy for Eve in ${}_{{}^{G}}$ ; it is said to be winning for $p$ from $s_{\textup{init}}$ whenever $\textup{payoff}(\mathit{visited}(R))=p$ , where $R$ is the unique outcome of $\zeta$ from $s_{\textup{init}}$ where Adam complies with Eve’s suggestions, and for every other outcome $R^{\prime}$ of $\zeta$ , for every $d\in\textsf{dev}(R^{\prime})$ , $\textup{payoff}_{d}(\mathit{visited}(R^{\prime}))\leq p_{d}$ .
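The shape of this condition can be illustrated on a finite summary of the outcomes of a strategy. The Python sketch below is only a reading aid: actual outcomes are infinite plays and there may be infinitely many of them, so the function (hypothetically named `is_winning_summary`) assumes that each outcome $R^{\prime}$ has already been abstracted into the pair $(\textsf{dev}(R^{\prime}),\textup{payoff}(\mathit{visited}(R^{\prime})))$ .

```python
def is_winning_summary(p, main_payoff, other_outcomes):
    """Finite-summary check of Eve's winning condition for payoff p.

    main_payoff: payoff vector of the outcome where Adam always complies.
    other_outcomes: iterable of (dev(R'), payoff(visited(R'))) pairs.
    Payoff vectors are dicts player -> value.
    """
    if main_payoff != p:
        return False
    # Only the possible deviators of each outcome must not gain.
    return all(payoff[d] <= p[d]
               for deviators, payoff in other_outcomes
               for d in deviators)


p = {0: 1, 1: 0}
harmless = [({0}, {0: 1, 1: 5}), ({0, 1}, {0: 0, 1: 0})]
profitable = [({1}, {0: 0, 1: 1})]
```

Note that in `harmless` player 1 obtains 5 along an outcome where only player 0 is a possible deviator; this does not violate the condition, which constrains deviators only.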

4.2 An example

In Figure 2 we present a part of the epistemic game corresponding to the game we described in Figure 1 with graph $G_{1}$ . In state $(v_{0},\emptyset)$ , Eve can play the action profile $\alpha^{5}$ and make the game go to $((v_{0},\emptyset),\alpha^{5})$ where Adam can either play $v_{1}=\textup{Tab}(v_{0},\alpha^{5})$ (we say that Adam complies with Eve) or choose a different state accessible from $v_{0}$ and an action profile that consists in a single-player deviation from $\alpha$ , for instance $v^{\prime}_{1}=\textup{Tab}(v_{0},\alpha^{2}\beta\alpha^{2})=\textup{Tab}(v_{0},\alpha^{3}\beta\alpha)=\textup{Tab}(v_{0},\alpha^{4}\beta)$ . If Adam chooses $v^{\prime}_{1}$ , then three players are possible deviators: $2$ , $3$ and $4$ . We write $X$ for the corresponding set of situations, and we already know that $\textsf{dev}(X)=\{2,3,4\}$ .

• If player $2$ is the deviator, then no one (except himself) directly receives this information. Player $0$ knows that player $4$ did not deviate (since $4{\mathord{\rightarrowtriangle}}0$ in $G_{1}$ ), hence $K_{2}^{X}(0)=\{2,3\}$ ; Player $1$ has no information hence $K_{2}^{X}(1)=\{2,3,4\}$ ; Player $3$ knows that he is not the deviator but cannot know more, hence $K_{2}^{X}(3)=\{2,4\}$ ; Finally, player $4$ can deduce many things: he knows he is not the deviator, and he saw that player $3$ is not the deviator (since $3{\mathord{\rightarrowtriangle}}4$ in $G_{1}$ ), hence $K_{2}^{X}(4)=\{2\}$ .

• If player $3$ is the deviator, then both players $3$ and $4$ get the information, hence $I^{X}_{3}=\{3,4\}$ . Other players can guess some things, for instance player $0$ sees that player $4$ cannot be the deviator, this is why $K^{X}_{3}(0)=\{2,3\}$ . Etc.

• The reasoning for player $4$ is similar.

In the situation we have just described, when the game proceeds to $v^{\prime}_{1}$ , then either player $0$ knows that player $4$ has deviated, or he knows that player $4$ didn’t deviate but he suspects both $2$ and $3$ . On the other hand, player $4$ will precisely know who deviated. And player $3$ knows whether he deviated or not, but if he didn’t, then he cannot know whether it was player $2$ or player $4$ who deviated. This knowledge is stored in the set of situations $X$ we have described, which is fully given in Figure 2.

Let us now illustrate how actions of Eve are defined in states with a non-empty set of situations. Assume we are in Eve’s state $(v_{0},X)$ , with $X$ as previously defined. From that state, an action for Eve is a mapping $f:\{2,3,4\}\to\textup{Act}^{\mathpzc{P}}$ such that:

[TABLE]

The intuition behind these constraints is the following: Player $0$ knows whether Player $4$ deviated or not but, if Player $4$ did not, cannot tell whether Player $2$ or Player $3$ deviated; Player $1$ does not know who deviated, hence should play the same action in the three cases (which she cannot distinguish); Player $2$ only knows whether she herself deviated, hence, if she did not, cannot tell whether Player $3$ or Player $4$ deviated; the case of Player $3$ is similar; finally, Player $4$ knows for sure who deviated: she saw whether Player $3$ deviated and knows whether she herself deviated, and thus can distinguish between the three cases.

4.3 Correctness of the epistemic game construction

When constructing the epistemic game, we mentioned that Eve’s states will allow us to properly define the undistinguishability relation for all the players. Towards that goal, we show the following result by an immediate induction:

Lemma 4.1.

If $(v,X)$ is an Eve’s state reachable from some $(v_{0},\emptyset)$ in ${}_{{}^{G}}$ , then for all $d\in\textsf{dev}(X)$ :

• for all $a\in I^{X}_{d}$ , $K^{X}_{d}(a)=\{d\}$ ;

• for all $a\notin I^{X}_{d}$ , $K^{X}_{d}(a)=\textsf{dev}(X)\setminus\{d^{\prime}\in\textsf{dev}(X)\mid a\in I^{X}_{d^{\prime}}\}$ .

In particular, for all $d,d^{\prime}\in\textsf{dev}(X)$ , for all $a\notin I^{X}_{d}\cup I^{X}_{d^{\prime}}$ , $K^{X}_{d}(a)=K^{X}_{d^{\prime}}(a)$ .
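The invariant of Lemma 4.1 is easy to check mechanically on a given set of situations. The following Python sketch assumes $X$ is encoded as a dict `d -> (I_d, K_d)`; the function name `satisfies_lemma_4_1` and the two toy sets of situations are hypothetical.

```python
def satisfies_lemma_4_1(X, players):
    """Check both items of Lemma 4.1 on X, a dict d -> (I_d, K_d)."""
    deviators = set(X)
    for d, (informed, knowledge) in X.items():
        for a in players:
            if a in informed:
                expected = {d}
            else:
                # dev(X) minus the deviators d2 whose id a has received.
                expected = {d2 for d2 in deviators if a not in X[d2][0]}
            if knowledge[a] != expected:
                return False
    return True


players = [0, 1, 2, 3]
X_ok = {
    0: ({0, 1}, {0: {0}, 1: {0}, 2: {0, 3}, 3: {0}}),
    3: ({3}, {0: {3}, 1: {3}, 2: {0, 3}, 3: {3}}),
}
# Same as X_ok, except that player 2 wrongly discards suspect 3.
X_bad = {
    0: ({0, 1}, {0: {0}, 1: {0}, 2: {0}, 3: {0}}),
    3: ({3}, {0: {3}, 1: {3}, 2: {0, 3}, 3: {3}}),
}
```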

So, either a player $a$ will have received from a neighbour the identity of the deviator, or she will not have received any deviator identity yet, and she will have a set of suspected deviators that she will not be able to distinguish.

This allows us to deduce the following correspondence between the original game and ${}_{{}^{G}}$ :

Proposition 4.2.

There is a winning strategy for Eve in ${}_{{}^{G}}$ for payoff $p$ if and only if there is a normed strategy profile in the original game whose main outcome has payoff $p$ and which is resistant to single-player immediately visible and honest deviations.

The proof of correctness of the epistemic game then goes through the following steps, which are detailed in Appendix B. First, given a strategy $\zeta$ of Eve, we build a function $E_{\zeta}$ associating with $a$ -histories (for every $a\in\mathpzc{P}$ ) in the original game Eve’s histories in the epistemic game such that Eve plays according to $\zeta$ along $E_{\zeta}$ .

Then we use this function to create a strategy profile $\Omega(\zeta)$ for the original game where the action prescribed by this profile to player $a$ after history $h$ corresponds in some sense to $\zeta(E_{\zeta}(h))(d)(a)$ , where $d$ is a suspected deviator according to player $a$ . This works because, thanks to Lemma 4.1, we know that either player $a$ knows who the deviator is, or player $a$ has a subset of suspect deviators and Eve’s suggestion for $a$ (by construction of ${}_{{}^{G}}$ ) is the same for all those possible deviators.

Finally we prove that if $\zeta$ is a winning strategy for Eve then $\Omega(\zeta)$ is both normed and resistant to single-player immediately visible and honest deviations in the original game.

To prove the converse proposition we build a function $\Lambda$ associating with Eve’s histories in the epistemic game families of single-player histories in the original game. We then use this correspondence to build a function $\Upsilon$ associating with normed strategy profiles Eve’s strategies in a natural way.

Finally we prove that if $\sigma$ is normed and resistant to single-player immediately visible honest deviations, then $\Upsilon(\sigma)$ is a winning strategy for Eve.

Gathering results of Theorem 3.1 and of this proposition, we get the following theorem:

Theorem 4.3.

There is a Nash equilibrium with payoff $p$ in the original game if and only if there is a winning strategy for Eve in ${}_{{}^{G}}$ for payoff $p$ .

Remark 4.4.

Note that all the results are constructive, hence if one can synthesize a winning strategy for Eve in ${}_{{}^{G}}$ , then one can synthesize a corresponding Nash equilibrium in the original game.

5 Complexity analysis

We borrow all notations of previous sections. A rough analysis of the size of the epistemic game ${}_{{}^{G}}$ gives an exponential bound. We will give a more precise bound, pinpointing the part with an exponential blowup. We write $\textsf{diam}(G)$ for the diameter of $G$ , that is $\textsf{diam}(G)=\max\{\textsf{dist}_{G}(a,b)\mid\textsf{dist}_{G}(a,b)<+\infty\}$ .
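For concreteness, the diameter as defined here (the largest finite distance, ignoring unreachable pairs) can be computed by one breadth-first search per player. The Python sketch below assumes the graph is given as a set of pairs `(b, a)` for $b{\mathord{\rightarrowtriangle}}a$ ; the function name `diameter` is hypothetical.

```python
from collections import deque


def diameter(edges, players):
    """Largest finite distance dist_G(a, b) over all pairs of players."""
    best = 0
    for src in players:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in players:
                if (u, w) in edges and w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        # Unreachable players never enter `dist`, so they are ignored.
        best = max(best, max(dist.values()))
    return best
```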

Lemma 5.1.

Assuming that Tab is given explicitly in the original game, the number of states in the reachable part of ${}_{{}^{G}}$ from $s_{\textup{init}}=(v_{\textup{init}},\emptyset)$ is bounded by

[TABLE]

The number of edges is bounded by $|S_{\textup{{Adam}}}|+|S_{\textup{{Adam}}}|\cdot|S_{\textup{{Eve}}}|$ .

*If $|\mathpzc{P}|$ is assumed to be a constant of the problem, then the size of ${}_{{}^{G}}$ is polynomial in the size of the original game.*

We will not detail algorithmic issues, but the winning condition of Eve in ${}_{{}^{G}}$ is very similar to the winning condition of Eve in the suspect-game construction of [4] (for Boolean or ordered objectives), or in the deviator-game construction of [7] (for mean-payoff), or, in a closer context, in the epistemic-game construction of [3]. Hence, when the size of the epistemic game is polynomial, rather efficient algorithms can be designed to compute Nash equilibria. For instance, in a setting where the size of ${}_{{}^{G}}$ is polynomial, using a bottom-up labelling algorithm similar to that of [2, Sect. 4.3], one obtains a polynomial space algorithm for deciding the (constrained) existence of a Nash equilibrium when payoffs are Boolean payoffs corresponding to parity conditions.

6 Conclusion

In this paper, we have studied multiplayer infinite-duration games over graphs, and focused on games where players can communicate with their neighbours, as given by a directed graph. We have shown that a very simple communication mechanism is sufficient to describe Nash equilibria. This mechanism is, in a sense, epidemic: if a player deviates, then his neighbours will see it and transmit the information to their own neighbours; the information then propagates along the communication graph. This framework encompasses two standard existing frameworks, one where the actions are invisible (represented by a graph with no edges), and one where all actions are visible (represented by the complete graph). We know from previous works that in both frameworks, one can compute Nash equilibria for many kinds of payoff functions. In this paper, we also show that we can compute Nash equilibria in this generalized framework, by providing a reduction to a two-player game, the so-called epistemic game construction. The winning condition in this two-player game is very similar to winning conditions encountered in the past, yielding algorithmic solutions to the computation of Nash equilibria. We have also analyzed the size of the abstraction, which is polynomial when the number of players is considered as a constant of the problem.

The current framework assumes messages can be appended to actions by players, allowing a rich communication between players. The original framework of [15] did not allow additional messages, but did encode identities of deviators by sequences of actions. This was possible in [15] since games were repeated matrix games, but it is harder to see how we could extend this approach and how we could encode identities of players with actions, taking into account the graph structure. For instance, because of the graph structure, overly long identifiers might make it impossible to transmit the identity of the deviator within a short delay. Nevertheless, it would be interesting to see whether something can be done in this framework.

Appendix A Proof of Section 3

See Theorem 3.1.

We will divide the proof of this theorem into several parts and prove several intermediary results.

A.1 Reduction to immediately visible and honest deviations

Lemma A.1.

Assume $\sigma_{\mathpzc{P}}$ is a strategy profile from $v_{0}$ such that there is no profitable single-player immediately visible honest deviation w.r.t. $\sigma_{\mathpzc{P}}$ . Then one can construct a Nash equilibrium $\widetilde{\sigma}_{\mathpzc{P}}$ from $v_{0}$ such that (i) $\mathit{vertices}(\textup{out}(\sigma_{\mathpzc{P}},v_{0}))=\mathit{vertices}(\textup{out}(\widetilde{\sigma}_{\mathpzc{P}},v_{0}))$ ; (ii) for every $h\in\textup{Hist}(v_{0})$ such that $\mathit{vertices}(h)$ is a prefix of $\mathit{vertices}(\textup{out}(\widetilde{\sigma}_{\mathpzc{P}},v_{0}))$ , for every $a\in\mathpzc{P}$ , $\widetilde{\sigma}_{a}(h)[2]=\epsilon$ .

Condition (ii) in the statement means somehow that no message is required along the main outcome and along deviations that are not visible yet (in the sense that they follow the same sequence of vertices as the main outcome).

Proof A.2.

From $\sigma_{\mathpzc{P}}$ , we will build a profile $\widetilde{\sigma}_{\mathpzc{P}}$ such that:

• $\widetilde{\sigma}_{\mathpzc{P}}$ ’s and $\sigma_{\mathpzc{P}}$ ’s main outcomes visit the same sequence of vertices;

• no message is broadcast by $\widetilde{\sigma}_{\mathpzc{P}}$ as long as the sequence of vertices of the main outcome is followed;

• any single-player $d$ -deviation of $\widetilde{\sigma}_{\mathpzc{P}}$ will correspond to an immediately visible single-player honest $d$ -deviation of $\sigma_{\mathpzc{P}}$ .

Let $\rho=\textup{out}(\sigma_{\mathpzc{P}},v_{0})$ . For every $h\in\textup{Hist}(v_{0})$ such that $\mathit{vertices}(h)=\mathit{vertices}(\rho_{\leq r})$ for some $r$ , for every $a\in\mathpzc{P}$ , we define $\widetilde{\sigma}_{a}(h)=\big(\sigma_{a}(\rho_{\leq r})[1],\epsilon\big)$ . As a consequence of this definition, the main outcome $\widetilde{\rho}$ of $\widetilde{\sigma}_{\mathpzc{P}}$ visits the same sequence of vertices (and even follows the same moves) as the main outcome $\rho$ of $\sigma_{\mathpzc{P}}$ ; only the messages which are broadcast may differ. Thereafter, we write:

[TABLE]

and, by definition of $\widetilde{\sigma}_{\mathpzc{P}}$ :

[TABLE]

where $\widetilde{\rho}=\textup{out}(\widetilde{\sigma}_{\mathpzc{P}},v_{0})$ . Note also that the above definition takes care of single-player deviations w.r.t. $\widetilde{\sigma}_{\mathpzc{P}}$ which do not ‘leave’ the main outcome (that is, which visit the same sequence of vertices): against such deviations, simply doing nothing and going on as if no one deviated will be enough to punish the deviator.

We will now extend the definition of profile $\widetilde{\sigma}_{\mathpzc{P}}$ to histories generated by single-player deviations. We will do so by induction, by considering histories that can be derived by single-player deviations of the so-far-defined profile, and which have become visible.

Let $h$ be a history resulting from a single-player (named $d$ ) deviation w.r.t. $\widetilde{\sigma}_{\mathpzc{P}}$ , which has become visible. It can be decomposed as:

[TABLE]

where $\textit{first}(h_{1})\neq v_{r+1}$ (that is, the deviation becomes visible at that point!). Note however that it may be the case that player $d$ started deviating earlier, but this will not be taken into account. We remark that, by definition of $\widetilde{\sigma}_{\mathpzc{P}}$ along the main outcome or for invisible deviations, it is the case that for every $0\leq s\leq r$ , $\bar{m}_{s}(-d)=m_{s}(-d)$ and $\bar{\textsf{mes}}_{s}(-d)=\textsf{mes}_{\epsilon}(-d)$ .

With these notations, we can set:

[TABLE]

where $\textsf{mes}_{r}^{+d}(d)=\textsf{id}_{d}$ , $\textsf{mes}_{r}^{+d}(-d)=\textsf{mes}_{r}(-d)=\epsilon$ , and $h_{1}^{+d}$ is the same as $h_{1}$ , but each message of player $d$ is set to $\textsf{id}_{d}$ . In some sense, we tell the players to ignore any deviation as long as it has not become visible and then treat it as if it were an immediately visible and honest deviation. Indeed, $h_{\leq r}\cdot(m^{\prime}_{r},\textsf{mes}_{r}^{+d})\cdot h_{1}^{+d}$ is the history resulting from playing an immediately visible and honest deviation visiting the same vertices as $h$ .

We argue why this is well-defined. Pick two such deviations, for players $d$ and $d^{\prime}$ respectively, which generate histories $h=h_{\leq r}\cdot(\bar{m}_{r},\bar{\textsf{mes}}_{r})\cdot h_{1}$ and $h^{\prime}=h^{\prime}_{\leq r}\cdot(\bar{m}^{\prime}_{r},\bar{\textsf{mes}}^{\prime}_{r})\cdot h^{\prime}_{1}$ respectively. We assume $a\neq d,d^{\prime}$ . It is the case that $a$ cannot distinguish between $h$ and $h^{\prime}$ if and only if $\pi_{a}(h)=\pi_{a}(h^{\prime})$ . By considering the players in the neighbourhood of $d$ , it is not difficult to get that $\pi_{a}(h)=\pi_{a}(h^{\prime})$ implies $\pi_{a}(h_{\leq r}\cdot(\bar{m}_{r},\textsf{mes}_{r}^{+d})\cdot h_{1}^{+d})=\pi_{a}(h^{\prime}_{\leq r}\cdot(\bar{m}^{\prime}_{r},{\textsf{mes}^{\prime}_{r}}^{+d})\cdot{h^{\prime}_{1}}^{+d})$ . Hence, this is well-defined.

Now, let $\rho^{\prime}$ be the outcome of a $d$ -deviation w.r.t. $\widetilde{\sigma}_{\mathpzc{P}}$ . It can be written as $v_{0}\cdot(\bar{m}_{0},\bar{\textsf{mes}}_{0})\cdot v_{1}\cdot(\bar{m}_{1},\bar{\textsf{mes}}_{1})\ldots v_{r}\cdot(\bar{m}_{r},\bar{\textsf{mes}}_{r})\cdot\rho_{1}$ (as before). We have that $\rho_{\leq r}\cdot(\bar{m}_{r},\textsf{mes}_{r}^{+d})\cdot\rho_{1}^{+d}$ is the outcome of an immediately visible and honest $d$ -deviation of $\sigma_{\mathpzc{P}}$ (since one can check all players but $d$ play according to $\sigma_{\mathpzc{P}}$ along the play). Hence, it cannot be profitable to the deviator (by hypothesis on $\sigma_{\mathpzc{P}}$ ). Since the two sequences $\mathit{vertices}(\rho^{\prime})$ and $\mathit{vertices}(\rho_{\leq r}\cdot(\bar{m}_{r},\textsf{mes}_{r}^{+d})\cdot\rho_{1}^{+d})$ coincide, we conclude that $\rho^{\prime}$ is not a profitable deviation, and therefore that $\widetilde{\sigma}_{\mathpzc{P}}$ is a Nash equilibrium.

A.2 A simple communication pattern is sufficient! Reduction to normed strategy profiles

In this part, we show that the richness of the communication offered by the setting is not needed: a very simple communication pattern is sufficient for generating Nash equilibria. This pattern consists in reporting the deviator (the role of the direct neighbours of the deviator) and propagating this information (the role of all the other players, following the communication graph).

We recall the notion of a normed profile. Let $\sigma{P}$ be a strategy profile. Let $\rho$ be its main outcome. The profile $\sigma{P}$ is said to be normed whenever the following conditions hold:

1. for every $h\in\textup{out}(\sigma{P})\cup\bigcup_{d\in\mathpzc{P},\ \sigma^{\prime}_{d}}\textup{out}(\sigma{P}[d/\sigma^{\prime}_{d}],v_{0})$ , if $\mathit{vertices}(h)$ is a prefix of $\mathit{vertices}(\rho)$ , then for every $a\in\mathpzc{P}$ , $\sigma_{a}(h)[2]=\epsilon$ ;

2. for every $d\in\mathpzc{P}$ , for every $d$ -strategy $\sigma^{\prime}_{d}$ , if $h\cdot(m,\textsf{mes})\cdot v\in\textup{out}(\sigma{P}[d/\sigma^{\prime}_{d}],v_{0})$ is the first step out of $\mathit{vertices}(\rho)$ , then for every $d{\mathord{\rightarrowtriangle}}a$ , $\sigma_{a}(h\cdot(m,\textsf{mes})\cdot v)[2]=\textsf{id}_{d}$ ;

3. for every $d\in\mathpzc{P}$ , for every $d$ -strategy $\sigma^{\prime}_{d}$ , if $h\cdot(m,\textsf{mes})\cdot v\in\textup{out}(\sigma{P}[d/\sigma^{\prime}_{d}],v_{0})$ has left the main outcome for more than one step, then for every $a\in\mathpzc{P}$ , $\sigma_{a}(h\cdot(m,\textsf{mes})\cdot v)[2]=\epsilon$ if for all $b{\mathord{\rightarrowtriangle}}a$ , $\textsf{mes}(b)=\epsilon$ and $\sigma_{a}(h\cdot(m,\textsf{mes})\cdot v)[2]=\textsf{id}_{d}$ if there is $b{\mathord{\rightarrowtriangle}}a$ such that $\textsf{mes}(b)=\textsf{id}_{d}$ ; note that this is well defined since at most one id can be transmitted.

The first condition says that, as long as a deviation is not visible, then no message needs to be sent; the second condition says that as soon as a deviation becomes visible, then messages denouncing the deviator should be sent by “those who know”, that is, the (immediate) neighbours of the deviator; the third condition says that the name (actually, the id) of the deviator should propagate according to the communication graph.

Our first issue is that the deviator may be able to deviate in two distinct ways (two different actions or messages) that lead to the same sequence of vertices. Those two deviations should be treated in the same way by the other players. This might in general not be the case, hence we provide a construction to ensure it.

First, let us introduce a new equivalence relation on histories. Given a strategy profile $\sigma{P}$ we say that $h\equiv_{\sigma{P}}h^{\prime}$ if either $h=h^{\prime}$ or writing $h=v_{0}\cdot(m_{0},\textsf{mes}_{0})\cdot v_{1}...\cdot v_{s-1}\cdot(m_{s-1},\textsf{mes}_{s-1})\cdot v_{s}$ and $h^{\prime}=v_{0}\cdot(m^{\prime}_{0},\textsf{mes}^{\prime}_{0})\cdot v^{\prime}_{1}...\cdot v^{\prime}_{s-1}\cdot(m^{\prime}_{s-1},\textsf{mes}^{\prime}_{s-1})\cdot v^{\prime}_{s}$ , we have:

•

$\mathit{vertices}(h)=\mathit{vertices}(h^{\prime})$ ;

•

there is $d\in\mathpzc{P}$ such that $h$ and $h^{\prime}$ are both $d$ -deviations w.r.t. $\sigma{P}$ , and $\min\{l\mid(m_{l},\textsf{mes}_{l})\neq\sigma{P}(h_{\leq l})\}=\min\{l\mid(m^{\prime}_{l},\textsf{mes}^{\prime}_{l})\neq\sigma{P}(h^{\prime}_{\leq l})\}$ .

Essentially both $h$ and $h^{\prime}$ leave the main outcome of $\sigma{P}$ due to a single-player deviation from the same player, and follow the same sequence of vertices.
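The relation $\equiv_{\sigma{P}}$ can be illustrated by a small executable check. The following is only a sketch under assumptions that are not part of the paper: a history is encoded as a pair (list of vertices, list of steps), a step maps each player to an (action, message) pair, and the profile is a hypothetical function `profile(prefix)` returning the step prescribed after a prefix.

```python
from typing import Callable, Dict, List, Optional, Tuple

Player = str
# One step: player -> (action, message); None encodes the empty message.
Step = Dict[Player, Tuple[str, Optional[str]]]
# A history: vertices v_0..v_s and steps 0..s-1 (illustrative encoding).
History = Tuple[List[str], List[Step]]
Profile = Callable[[History], Step]


def first_deviation(profile: Profile, h: History) -> Optional[int]:
    """Smallest index l whose step differs from what the profile prescribes."""
    vertices, steps = h
    for l, step in enumerate(steps):
        if step != profile((vertices[: l + 1], steps[:l])):
            return l
    return None


def is_deviation_by(profile: Profile, h: History, d: Player) -> bool:
    """True if h deviates and every player other than d follows the profile."""
    vertices, steps = h
    for l, step in enumerate(steps):
        expected = profile((vertices[: l + 1], steps[:l]))
        if any(step[p] != expected[p] for p in step if p != d):
            return False
    return first_deviation(profile, h) is not None


def equivalent(profile: Profile, players: List[Player],
               h1: History, h2: History) -> bool:
    """Same vertices, same deviator, same first deviation index (or h1 == h2)."""
    if h1 == h2:
        return True
    if h1[0] != h2[0]:
        return False
    l1, l2 = first_deviation(profile, h1), first_deviation(profile, h2)
    if l1 is None or l1 != l2:
        return False
    return any(is_deviation_by(profile, h1, d) and is_deviation_by(profile, h2, d)
               for d in players)
```

With a constant profile, two histories in which the same player deviates with different actions but reaches the same vertex are equivalent, whereas histories blamed on different players are not.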

Lemma A.3.

Let $\sigma{P}$ be a strategy profile which is resistant to immediately visible single-player honest deviations w.r.t. $\sigma{P}$ . For every history $h$ we define a canonical representative of the equivalence class of $h$ under $\equiv_{\sigma{P}}$ which we denote by $\overline{h}$ , with the constraint that whenever possible $\overline{h}$ will be generated by playing $\sigma{P}$ against a single-player deviation (there exists $d$ , $\tau_{d}$ and $k$ such that $\overline{h}=\textup{out}(\sigma{P}[d/\tau_{d}],v_{0})_{\leq k}$ ). We define the strategy profile $\sigma^{\prime}{P}$ inductively by

•

if $h=\textup{out}(\sigma{P},v_{0})_{\leq k}$ for some $k$ , then $\sigma^{\prime}{P}(h)=\sigma{P}(h)$ ;

•

otherwise, if $h=\textup{out}(\sigma^{\prime}{P}[d/\tau_{d}],v_{0})_{\leq k}$ for some $k$ , some player $d\in\mathpzc{P}$ and some $d$ -deviation $\tau_{d}$ , then:

–

if $d{\mathord{\rightarrowtriangle}}a$ , then $\sigma^{\prime}_{a}(h)=\sigma_{a}(\overline{h})$ ;

–

else $\sigma^{\prime}_{a}(h)=\sigma_{a}(h)$ .

Then, $\sigma^{\prime}{P}$ is a strategy profile which is resistant to immediately visible single-player honest deviations w.r.t. $\sigma^{\prime}{P}$ , has the same outcome as $\sigma{P}$ , and satisfies the following: if $h\equiv_{\sigma^{\prime}{P}}h^{\prime}$ then $\sigma^{\prime}{P}(h)=\sigma^{\prime}{P}(h^{\prime})$ .

Proof A.4.

First we argue why $\sigma^{\prime}{P}$ is well-defined. Consider two histories $h$ and $h^{\prime}$ generated by playing $\sigma^{\prime}{P}$ against single-player deviations with deviators $d$ and $d^{\prime}$ respectively, and a player $a$ such that $h\sim_{a}h^{\prime}$ . If $d{\mathord{\rightarrowtriangle}}a$ or $d^{\prime}{\mathord{\rightarrowtriangle}}a$ , then, since $h\sim_{a}h^{\prime}$ , we deduce that $d=d^{\prime}$ . Thus, by hypothesis, $\overline{h}=\overline{h^{\prime}}$ , which implies $\sigma^{\prime}_{a}(h)=\sigma^{\prime}_{a}(h^{\prime})=\sigma_{a}(\overline{h})$ . If $d{\mathord{\not\rightarrowtriangle}}a$ and $d^{\prime}{\mathord{\not\rightarrowtriangle}}a$ , then $\sigma^{\prime}_{a}(h)=\sigma_{a}(h)=\sigma_{a}(h^{\prime})=\sigma^{\prime}_{a}(h^{\prime})$ .

It is fairly easy to show by induction that if $h\equiv_{\sigma^{\prime}{P}}h^{\prime}$ then $\sigma^{\prime}{P}(h)=\sigma^{\prime}{P}(h^{\prime})$ .

Finally, we explain why $\sigma^{\prime}{P}$ is resistant to immediately visible single-player deviations. We show by induction that for every $k\in\mathbb{N}$ , for every deviator $d$ and every $d$ -deviation $\tau_{d}$ w.r.t. $\sigma^{\prime}{P}$ there exists a $d$ -deviation $\tau^{\prime}_{d}$ w.r.t. $\sigma{P}$ such that $\mathit{vertices}(\textup{out}(\sigma^{\prime}{P}[d/\tau_{d}],v_{0})_{\leq k})=\mathit{vertices}(\textup{out}(\sigma{P}[d/\tau^{\prime}_{d}],v_{0})_{\leq k})$ .

Hence, since $\sigma{P}$ is resistant to immediately visible single-player deviations, so is $\sigma^{\prime}{P}$ .

Let us now use this to prove the following result.

Lemma A.5.

Assume $\sigma{P}$ is a strategy profile which is resistant to immediately visible single-player honest deviations w.r.t. $\sigma{P}$ and such that, if $\mathit{vertices}(h)$ is a prefix of $\mathit{vertices}(\textup{out}(\sigma{P},v_{0}))$ , then $\sigma{P}(h)[2]=\epsilon$ . Then one can construct a normed strategy profile $\widehat{\sigma}{P}$ which is resistant to immediately visible honest single-player deviations, and such that $\mathit{vertices}(\textup{out}(\sigma{P},v_{0}))=\mathit{vertices}(\textup{out}(\widehat{\sigma}{P},v_{0}))$ .

Proof A.6.

Let $\sigma{P}$ be a strategy profile which is resistant to immediately visible single-player honest deviations, with no communication along the main outcome. Furthermore, by Lemma A.3, we can suppose that for all histories $h$ and $h^{\prime}$ such that $h\equiv_{\sigma{P}}h^{\prime}$ , we have that $\sigma{P}(h)=\sigma{P}(h^{\prime})$ . We will build a profile $\widehat{\sigma}{P}$ which will be normed and will coincide in some sense with $\sigma{P}$ . Let $\rho$ be the main outcome of $\sigma{P}$ . We analyse immediately visible deviations from $\sigma{P}$ at some given step of $\rho$ , say after prefix $h_{0}=\rho_{\leq r}$ , and show that the indistinguishability relation of deviations from that point is then very simple.

Lemma A.7.

Let $a$ be a player, and assume $d$ and $d^{\prime}$ are two players such that $\textsf{dist}_{G}(d,a)>n$ and $\textsf{dist}_{G}(d^{\prime},a)>n$ . Then, for each length- $(r+n)$ immediately visible $d$ -deviation $h$ w.r.t. $\sigma{P}$ , after $h_{0}$ , for each length- $(r+n)$ immediately visible $d^{\prime}$ -deviation $h^{\prime}$ w.r.t. $\sigma{P}$ , after $h_{0}$ , if the projection over vertices of $h$ and $h^{\prime}$ coincide, then $h\sim_{a}h^{\prime}$ .

Proof A.8.

We do the proof by induction on $n$ .

Case $n=1$ : We look at one-step deviations after $h_{0}$ , say $h=h_{0}\cdot(m,\textsf{mes})\cdot v$ and $h^{\prime}=h_{0}\cdot(m^{\prime},\textsf{mes}^{\prime})\cdot v$ (the final vertex $v$ is assumed to be the same, otherwise those two deviations will for sure be distinguished by every player). Then, the following holds:

•

$m(-d)=\sigma{P}(h_{0})(-d)[1]$ and $\textsf{mes}(-d)=\textsf{mes}_{\epsilon}(-d)$ ;

•

$m^{\prime}(-d^{\prime})=\sigma{P}(h_{0})(-d^{\prime})[1]$ and $\textsf{mes}^{\prime}(-d^{\prime})=\textsf{mes}_{\epsilon}(-d^{\prime})$ .

The fact that $\textsf{dist}_{G}(d,a)>n$ and $\textsf{dist}_{G}(d^{\prime},a)>n$ means that $d{\mathord{\not\rightarrowtriangle}}a$ and $d^{\prime}{\mathord{\not\rightarrowtriangle}}a$ . Hence, for every $b{\mathord{\rightarrowtriangle}}a$ , $m(b)=\sigma_{b}(h_{0})[1]=m^{\prime}(b)$ and $\textsf{mes}(b)=\epsilon=\textsf{mes}^{\prime}(b)$ . This implies that $\pi_{a}(h_{0}\cdot(m,\textsf{mes})\cdot v)=\pi_{a}(h_{0}\cdot(m^{\prime},\textsf{mes}^{\prime})\cdot v)$ , which means that $h\sim_{a}h^{\prime}$ . This concludes the case $n=1$ .

Inductive step: assume that $\textsf{dist}_{G}(d,a)>n+1$ and $\textsf{dist}_{G}(d^{\prime},a)>n+1$ , and that $h\cdot(m,\textsf{mes})\cdot v$ and $h^{\prime}\cdot(m^{\prime},\textsf{mes}^{\prime})\cdot v$ are respectively length- $(r+n+1)$ immediately visible $d$ - (resp. $d^{\prime}$ -)deviations after $h_{0}$ w.r.t. $\sigma{P}$ , which project on the same sequences of vertices. For every $b{\mathord{\rightarrowtriangle}}a$ , $\textsf{dist}_{G}(d,b)>n$ and $\textsf{dist}_{G}(d^{\prime},b)>n$ , hence, by induction hypothesis, $h\sim_{b}h^{\prime}$ . It implies that $\sigma_{b}(h)=\sigma_{b}(h^{\prime})$ , hence $m(b)=m^{\prime}(b)$ and $\textsf{mes}(b)=\textsf{mes}^{\prime}(b)$ . We deduce that $h\cdot(m,\textsf{mes})\cdot v\sim_{a}h^{\prime}\cdot(m^{\prime},\textsf{mes}^{\prime})\cdot v$ . This concludes the proof of the inductive step.
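The hypothesis of Lemma A.7 depends only on distances in the communication graph. The following sketch computes, for a given deviator $d$ and a number of steps $n$, the players to which the lemma applies. The encoding is an assumption made for illustration: `succ[b]` lists the players $a$ with $b{\mathord{\rightarrowtriangle}}a$, and unreachable players are treated as infinitely far.

```python
from collections import deque
from typing import Dict, Set

Player = str


def distances_from(succ: Dict[Player, Set[Player]], d: Player) -> Dict[Player, int]:
    """Breadth-first search: shortest path lengths from d along the edges b -> a."""
    dist = {d: 0}
    queue = deque([d])
    while queue:
        b = queue.popleft()
        for a in succ.get(b, ()):
            if a not in dist:
                dist[a] = dist[b] + 1
                queue.append(a)
    return dist


def far_players(succ: Dict[Player, Set[Player]], d: Player, n: int) -> Set[Player]:
    """Players a with dist_G(d, a) > n, i.e. those covered by Lemma A.7 for n."""
    dist = distances_from(succ, d)
    players = set(succ) | {a for targets in succ.values() for a in targets}
    return {a for a in players if dist.get(a, float("inf")) > n}
```

On a line graph where $d$ is heard by $a$, $a$ by $b$, and $b$ by $c$, one step after the deviation the players $b$ and $c$ are still far, and after three steps nobody is.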

We will now see that this simple indistinguishability relation can be defined using the message propagation mechanism of normed profiles: a player doesn’t send any message, unless she receives the id of the deviator from (at least) one neighbour.

We first define normalized versions of immediately visible single-player deviations, and for this we set $\eta(\rho_{\leq r})=\rho_{\leq r}$ for every integer $r$ . Let now $h$ be an immediately visible and honest $d$ -deviation w.r.t. $\sigma{P}$ , which becomes visible right after $\rho_{\leq r}$ . We define $\eta(h)$ inductively as follows:

•

If $h=\rho_{\leq r}\cdot(m,\textsf{mes})\cdot v$ , we set $\eta(h)=(\rho_{\leq r}\cdot(m,\textsf{mes}_{d})\cdot v)$ where $\textsf{mes}_{d}(d)=\textsf{id}_{d}$ and $\textsf{mes}_{d}(-d)=\textsf{mes}_{\epsilon}(-d)$ ;

•

if $h$ already extends $\rho_{\leq r}$ , then $\eta(h\cdot(m,\textsf{mes})\cdot v)=(\eta(h)\cdot(m,\textsf{mes}^{\prime})\cdot v)$ where, for every $a\in\mathpzc{P}$ , $\textsf{mes}^{\prime}(a)=\epsilon$ if for every $b{\mathord{\rightarrowtriangle}}a$ , $\bar{\textsf{mes}}(b)=\epsilon$ and $\textsf{mes}^{\prime}(a)=\textsf{id}_{d}$ if there is some $b{\mathord{\rightarrowtriangle}}a$ such that $\bar{\textsf{mes}}(b)=\textsf{id}_{d}$ , where $\bar{\textsf{mes}}$ is the last message along $\eta(h)$ .

In effect, $\eta$ propagates messages properly after a deviation has happened.
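The inductive definition of $\eta$ can be sketched as a rewriting of the steps played after $\rho_{\leq r}$. This is a minimal illustration under stated assumptions, not the authors' implementation: `preds[a]` is the set of players $b$ with $b{\mathord{\rightarrowtriangle}}a$, a step is a triple (move, messages, vertex), `None` encodes $\epsilon$, and the string `"id_" + d` stands for $\textsf{id}_{d}$.

```python
from typing import Dict, List, Optional, Set, Tuple

Player = str
Message = Optional[str]                      # None encodes the empty message
Move = Dict[Player, str]                     # one action per player
Step = Tuple[Move, Dict[Player, Message], str]   # (m, mes, v)


def eta_suffix(preds: Dict[Player, Set[Player]], d: Player,
               steps: List[Step]) -> List[Step]:
    """Replace the messages of the steps after rho_{<=r}; keep moves and vertices."""
    players = list(preds)
    ident = "id_" + d
    out: List[Step] = []
    last: Dict[Player, Message] = {}
    for i, (move, _original_mes, vertex) in enumerate(steps):
        if i == 0:
            # First step out: only the deviator's message is set to its id.
            new_mes = {a: (ident if a == d else None) for a in players}
        else:
            # Later steps: forward the id iff some in-neighbour sent it last.
            new_mes = {a: (ident if any(last[b] is not None for b in preds[a])
                           else None)
                       for a in players}
        out.append((move, new_mes, vertex))
        last = new_mes
    return out
```

Whether a player keeps sending the id once she has sent it depends on whether she is listed among her own in-neighbours; the sketch leaves that choice to the caller's `preds`.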

Now define profile $\widehat{\sigma}{P}$ as follows: for every $r$ , set $\widehat{\sigma}{P}(\rho_{\leq r})=\sigma{P}(\rho_{\leq r})$ . In particular, the main outcome of $\widehat{\sigma}{P}$ is $\rho$ . Let now $h$ be an immediately visible and honest $d$ -deviation w.r.t. $\sigma{P}$ , after $\rho_{\leq r}$ ; we set for every $a\in\mathpzc{P}$ :

•

if $|h|\geq\textsf{dist}_{G}(d,a)-r$ , we set $\widehat{\sigma}_{a}(\eta(h))=(\sigma_{a}(h)[1],\textsf{id}_{d})$ ;

•

if $|h|<\textsf{dist}_{G}(d,a)-r$ , we set $\widehat{\sigma}_{a}(\eta(h))=(\sigma_{a}(h)[1],\epsilon)$ .

We argue why $\widehat{\sigma}{P}$ is defined everywhere it should be defined. For that we show that every immediately visible single-player deviation of $\widehat{\sigma}{P}$ is the image by $\eta$ of some immediately visible single-player deviation of $\sigma{P}$ . This can easily be done by induction on the length of the deviation.

We then notice that, by construction, $\widehat{\sigma}{P}$ propagates messages properly. We finally argue below why $\widehat{\sigma}{P}$ is well-defined.

Assume that $h$ and $h^{\prime}$ are two histories generated by immediately visible $d$ - (resp. $d^{\prime}$ -)deviations w.r.t. $\widehat{\sigma}{P}$ such that $\eta(h)\sim_{a}\eta(h^{\prime})$ but such that $h\not\sim_{a}h^{\prime}$ . From Lemma A.7, we deduce that $\textsf{dist}_{G}(d,a)<r+n$ or $\textsf{dist}_{G}(d^{\prime},a)<r+n$ where $h$ and $h^{\prime}$ become visible at step $r+1$ and are of length $r+n$ (we know they are of the same length, become visible on the same step and that $\mathit{vertices}(h)=\mathit{vertices}(h^{\prime})$ from $\eta(h)\sim_{a}\eta(h^{\prime})$ ). In turn, by the construction of $\widehat{\sigma}{P}$ , this means that player $a$ received message $\textsf{id}_{d}$ at some step along $\eta(h)$ or message $\textsf{id}_{d^{\prime}}$ at some step along $\eta(h^{\prime})$ . Hence we have $d=d^{\prime}$ and player $a$ received message $\textsf{id}_{d}$ at the same step along both $\eta(h)$ and $\eta(h^{\prime})$ since otherwise, she would be able to tell the difference between them. Thus, $h$ and $h^{\prime}$ are generated by playing two $d$ -deviations visiting the same vertices and deviating from the main outcome at the same time. Hence, we have that $h\equiv_{\sigma{P}}h^{\prime}$ , which means from our hypothesis on $\sigma{P}$ that $\sigma{P}(h)=\sigma{P}(h^{\prime})$ , thus $\widehat{\sigma}{P}$ is well-defined.

Now we prove that $\widehat{\sigma}{P}$ is resistant to immediately visible single-player deviations. Toward a contradiction, assume there is a profitable immediately visible $d$ -deviation. As noticed above, this deviation is the image by $\eta$ of some immediately visible $d$ -deviation w.r.t. $\sigma{P}$ . Since $\eta$ preserves the sequences of vertices, this deviation is profitable as well. This is not possible, by assumption on $\sigma{P}$ , hence $\widehat{\sigma}{P}$ is resistant to immediately visible single-player deviations.

A.3 Proof of Theorem 3.1

See Theorem 3.1.

Proof A.9.

Assume $\sigma{P}$ is a Nash equilibrium with payoff $p$ . It is in particular resistant to immediately visible honest single-player deviations. Applying Lemma A.1, $\sigma{P}$ can be turned into a profile $\widetilde{\sigma}{P}$ satisfying the hypotheses of Lemma A.5, and having payoff $p$ . Applying Lemma A.5, there is another profile $\widehat{\sigma}{P}$ , which is normed and resistant to immediately visible single-player deviations, and such that its main outcome visits the same sequence of vertices as the main outcome of $\sigma{P}$ . In particular, it has payoff $p$ .

Assume that $\sigma{P}$ is a normed strategy profile, which is resistant to immediately visible and honest single-player deviations, and which has payoff $p$ . Thanks to Lemma A.1, one can construct a Nash equilibrium $\widetilde{\sigma}{P}$ , whose main outcome visits the same sequence of vertices as the main outcome of $\sigma{P}$ , hence its payoff is $p$ as well.

Appendix B Correctness of the epistemic game construction

In this section, we fix a concurrent game (the original game) and a communication graph $G$ for its players, and we let ${}_{{}^{G}}$ be the corresponding epistemic game. We assume all previous notations.

We write $\textsf{ID}=\{\textsf{id}_{d}\mid d\in\mathpzc{P}\}$ and $\textsf{ID}_{\epsilon}=\textsf{ID}\cup\{\epsilon\}$ .

B.1 Basic properties of the epistemic game

Considering a G-history $H=(v_{0},X_{0}),(v_{0},X_{0},f_{0})...(v_{k},X_{k})$ (notice that Eve’s states and Adam’s states always alternate perfectly), we write $\mathit{visited}(H)$ for $v_{0}...v_{k}\in V^{*}$ . We denote by $\textup{Hist}_{{}_{{}^{G}}}(v_{0},X_{0})$ the set of such histories. An important result that will be useful later is the following:

See Lemma 4.1.

Proof B.1.

A proof by induction follows immediately from the structure of the epistemic game.

B.2 From Eve’s strategies in ${}_{{}^{G}}$ to normed profiles in the original game

We fix a strategy $\zeta$ for Eve, associating an action of Eve with every possible G-history. From $\zeta$ , one defines inductively (on the size of histories) a partial function $E_{\zeta}:\bigcup_{a\in\mathpzc{P}}\textup{Hist}_{,a}(v_{0})\to\textup{Hist}_{{}_{{}^{G}}}(v_{0},\emptyset)$ associating a G-history with some $a$ -history $h$ in the original game such that if $(m,\textsf{mes})\cdot v$ is a suffix of $h$ , then

•

(i) for every $b\in{\mathbbm{n}}(a)$ , $\textsf{mes}(b)\in\textsf{ID}_{\epsilon}$ ;

•

(ii) for every $b,c\in{\mathbbm{n}}(a)$ , $\textsf{mes}(b)\neq\textsf{mes}(c)$ implies $\textsf{mes}(b)=\epsilon$ or $\textsf{mes}(c)=\epsilon$ ; for every $b\in{\mathbbm{n}}(a)$ , if the message sent by $b$ earlier along $h$ is $\textsf{id}_{d}$ for some $d\in\mathpzc{P}$ , then $\textsf{mes}(b)=\textsf{id}_{d}$ ;

•

(iii) $\textit{last}(E_{\zeta}(h))=(v,X)$ , with the following additional properties:

–

$X=\emptyset$ implies for each $b\in{\mathbbm{n}}(a)$ , $\textsf{mes}(b)=\epsilon$ ;

–

if there is $b\in{\mathbbm{n}}(a)$ with $\textsf{mes}(b)=\textsf{id}_{d}$ , then $d\in\textsf{dev}(X)$ and $a\in I^{X}_{d}$ ;

–

if $X\neq\emptyset$ and for every $b\in{\mathbbm{n}}(a)$ , $\textsf{mes}(b)=\epsilon$ , then there exists $c\in\textsf{dev}(X)$ such that $a\notin I^{X}_{c}$ ;

We refer to these conditions as $({\ddagger})$ .

Together with this partial function $E_{\zeta}$ , we define a strategy profile $\sigma{P}$ in the original game, which will correspond to $\zeta$ in a sense that we make explicit later.

For history $v_{0}\in\textup{Hist}_{,a}(v_{0})$ , we abusively assume that the last message (denoted mes) assigns $\epsilon$ to every player in ${\mathbbm{n}}(a)$ (that is, $\textsf{mes}({\mathbbm{n}}(a))=\textsf{mes}_{\epsilon}({\mathbbm{n}}(a))$ ). All conditions $({\ddagger})$ are then immediately satisfied and we define $E_{\zeta}(v_{0})=(v_{0},\emptyset)$ .

Pick now a history $h\in\textup{Hist}_{,a}(v_{0})$ , ending with $(m,\textsf{mes})\cdot v$ , such that $E_{\zeta}(h)$ is well-defined (induction hypothesis). Notice this implies that $h$ satisfies all conditions $({\ddagger})$ . Writing $\textit{last}(E_{\zeta}(h))=(v,X)$ , we will define $\sigma_{a}(h)$ , and extend $E_{\zeta}$ to several extensions of $h$ . We distinguish several cases:

•

Assume first that $X=\emptyset$ . Then, $\zeta(E_{\zeta}(h))\in\textup{Act}{P}$ . We set $\sigma_{a}(h)=\big{(}\zeta(E_{\zeta}(h))(a),\epsilon\big{)}$ .

We extend $E_{\zeta}$ as follows.

–

First, if all players follow the suggestion of Eve (that is, do not deviate), then the next state will be $v^{\prime}=\textup{Tab}\big{(}v,\zeta(E_{\zeta}(h))\big{)}$ . In this case, no message should be broadcast, and in ${}_{{}^{G}}$ (under $E_{\zeta}$ ), Adam complies with the suggestion of Eve and goes to $(v^{\prime},\emptyset)$ :

[TABLE]

–

Else, for every $d\in\mathpzc{P}$ and $\delta\in\textup{Act}$ , writing $v^{\prime}=\textup{Tab}\big{(}v,\zeta(E_{\zeta}(h))[d/\delta]\big{)}$ and assuming that $v^{\prime}=\textup{Tab}\big{(}v,\zeta(E_{\zeta}(h))\big{)}$ (which means it is an invisible deviation), no message should be broadcast as well since the deviation is somehow harmless, and in ${}_{{}^{G}}$ (under $E_{\zeta}$ ), Adam complies with the suggestion of Eve and goes to $(v^{\prime},\emptyset)$ :

[TABLE]

where $\textsf{mes}_{d}(d)=\textsf{id}_{d}$ and $\textsf{mes}_{d}(-d)=\textsf{mes}_{\epsilon}(-d)$ .

–

Finally, for every $d\in\mathpzc{P}$ and $\delta\in\textup{Act}$ , writing $v^{\prime}=\textup{Tab}\big{(}v,\zeta(E_{\zeta}(h))[d/\delta]\big{)}$ and assuming that $v^{\prime}\neq\textup{Tab}\big{(}v,\zeta(E_{\zeta}(h))\big{)}$ (which means it is a visible deviation), and $X^{\prime}=\textsf{upd}((v,\emptyset),\zeta(E_{\zeta}(h)),v^{\prime})$ (which is then defined and nonempty):

[TABLE]

where $\textsf{mes}_{d}(d)=\textsf{id}_{d}$ and $\textsf{mes}_{d}(-d)=\textsf{mes}_{\epsilon}(-d)$ .

Note that in all cases, the expected conditions are satisfied.

•

Assume that $X\neq\emptyset$ . If there is $b\in{\mathbbm{n}}(a)$ such that $\textsf{mes}(b)\neq\epsilon$ , then pick $d\in\textsf{dev}(X)$ such that $\textsf{mes}(b)=\textsf{id}_{d}$ ; otherwise pick any $d\in\textsf{dev}(X)$ such that $a\notin I^{X}_{d}$ (it must exist as well by induction hypothesis). We let $m=\zeta(E_{\zeta}(h))(d)$ , and we set:

[TABLE]

In the first case, player $a$ has received the message blaming player $d$ , the deviator, while in the second case player $a$ has received no message.

This is well-defined for two different reasons: (i) those two cases can be distinguished in $h$ thanks to the messages $\pi_{a}(\textsf{mes})=\textsf{mes}({\mathbbm{n}}(a))$ ; (ii) in the second case, thanks to Lemma 4.1 and to the definition of actions in the epistemic game, the value $\zeta(E_{\zeta}(h))(d)(a)$ is independent of the choice of $d\in\textsf{dev}(X)$ satisfying $a\notin I^{X}_{d}$ .

We furthermore extend $E_{\zeta}$ as follows.

–

For every $\delta\in\textup{Act}$ , writing $v^{\prime}=\textup{Tab}(v,m[d/\delta])$ , and $X^{\prime}=\textsf{upd}((v,X),\zeta(E_{\zeta}(h)),v^{\prime})$ (which is then defined and nonempty):

[TABLE]

with $\textsf{mes}^{\prime}(b)=\textsf{id}_{d}$ if $b\in I^{X}_{d}$ and $\textsf{mes}^{\prime}(b)=\epsilon$ otherwise.

Note that all conditions of the induction hypothesis are satisfied.

Note that $({\ddagger})$ are satisfied by every $h$ such that $E_{\zeta}(h)$ is defined.

This allows us to define a function $\Omega$ associating a strategy profile in the original game with every strategy of Eve in the epistemic game.

Pick a strategy $\zeta$ for Eve in the epistemic game. Let $\sigma=\Omega(\zeta)$ be the strategy profile in the original game we have just constructed, together with the partial function $E_{\zeta}$ . We show the following lemma:

Lemma B.2.

•

Profile $\sigma$ is normed.

•

Let $\rho$ be the main outcome of profile $\sigma$ . Then, taking $E_{\zeta}$ at the limit, for every player $a\in\mathpzc{P}$ , $E_{\zeta}(\pi_{a}(\rho))$ is well-defined, and equal to $R$ , the unique outcome of $\zeta$ where Adam complies with Eve’s suggestions. In particular, $E_{\zeta}(\pi_{a}(\rho))$ is independent of the choice of player $a\in\mathpzc{P}$ .

•

Consider an honest and immediately visible single-player deviation, and let $\rho^{\prime}$ be the corresponding outcome. Then, taking $E_{\zeta}$ at the limit, for every $a\in\mathpzc{P}$ , $E_{\zeta}(\pi_{a}(\rho^{\prime}))$ is well-defined, and equal to $R^{\prime}$ , an outcome of $\zeta$ , where Adam does not comply with Eve’s suggestions. In particular, $R^{\prime}\neq R$ .

Proof B.3.

Let $R=(v_{0},\emptyset)\cdot((v_{0},\emptyset),f_{0})\cdot(v_{1},\emptyset)\dots(v_{s},\emptyset)\dots$ be the outcome of $\zeta$ , where Adam complies with Eve’s suggestions. For every $r\geq 0$ , $f_{r}\in\textup{Act}{P}$ . We let $\rho=v_{0}\cdot(f_{0},\textsf{mes}_{\epsilon})\cdot v_{1}\dots(f_{s-1},\textsf{mes}_{\epsilon})\cdot v_{s}\dots$ . A straightforward induction on $s$ (omitted) shows that (i) for every $a$ , $E_{\zeta}(\pi_{a}(v_{0}\cdot(f_{0},\textsf{mes}_{\epsilon})\cdot v_{1}\dots(f_{s-1},\textsf{mes}_{\epsilon})\cdot v_{s}))$ is defined, and is equal to $(v_{0},\emptyset)\cdot((v_{0},\emptyset),f_{0})\cdot(v_{1},\emptyset)\dots(v_{s},\emptyset)$ , the prefix of length $s$ of $R$ ; and (ii) $\sigma_{a}(v_{0}\cdot(f_{0},\textsf{mes}_{\epsilon})\cdot v_{1}\dots(f_{s-1},\textsf{mes}_{\epsilon})\cdot v_{s})=((f_{s})_{a},\epsilon)$ . In particular, the unique outcome of profile $\sigma$ is $\rho$ , and messages are handled correctly by $\sigma$ on that part.

Then, notice that single-player deviations that do follow the sequence of vertices of the main outcome of $\sigma$ are considered as non-deviations by $\sigma$ (by construction, second item of the case $X=\emptyset$ ).

Pick an ‘honest and visible’ deviation $\sigma^{\prime}_{d}$ for player $d$ . We show that messages propagate properly, and that the outcome of $\sigma[d/\sigma^{\prime}_{d}]$ corresponds to an outcome of $\zeta$ distinct from the main outcome (the one where Adam complies with Eve’s suggestions). We write $\rho^{\prime}=v^{\prime}_{0}\cdot(f^{\prime}_{0},\textsf{mes}^{\prime}_{0})\cdot v^{\prime}_{1}\dots(f^{\prime}_{s-1},\textsf{mes}^{\prime}_{s-1})\cdot v^{\prime}_{s}\dots$ for the outcome of $\sigma[d/\sigma^{\prime}_{d}]$ . There exists $s$ such that $v_{0}\cdot(f_{0},\textsf{mes}_{\epsilon})\cdot v_{1}\dots(f_{s-1},\textsf{mes}_{\epsilon})\cdot v_{s}=v^{\prime}_{0}\cdot(f^{\prime}_{0},\textsf{mes}^{\prime}_{0})\cdot v^{\prime}_{1}\dots(f^{\prime}_{s-1},\textsf{mes}^{\prime}_{s-1})\cdot v^{\prime}_{s}$ , $v^{\prime}_{s+1}\neq v_{s+1}$ and $f^{\prime}_{s}=f_{s}[d/\delta]$ for some action $\delta$ , while $\textsf{mes}^{\prime}_{s}(d)=\textsf{id}_{d}$ and $\textsf{mes}^{\prime}_{s}(a)=\epsilon$ if $a\neq d$ (this is by definition of an honest and visible deviation). So far, the message propagation system is working well. For every $r\geq s+1$ , we write $h^{\prime}_{r}$ for the prefix of length $r$ of $\rho^{\prime}$ . We show by induction on $r\geq s+1$ that for every $a$ , $E_{\zeta}(\pi_{a}(h^{\prime}_{r}))$ is well-defined and equal to $R^{\prime}_{r}$ , with $\textit{last}(R^{\prime}_{r})=(v^{\prime}_{r},X^{\prime}_{r})$ ( $X^{\prime}_{r}\neq\emptyset$ ), $R^{\prime}_{r}$ is (a prefix of) an outcome of $\zeta$ , and the message propagation system has been working properly along $h^{\prime}_{r}$ .

•

By construction of partial function $E_{\zeta}$ , for every $a$ , $E_{\zeta}(\pi_{a}(h^{\prime}_{s+1}))$ is well-defined, and is equal to $R^{\prime}_{s+1}=(v_{0},\emptyset)\cdot((v_{0},\emptyset),f_{0})\cdot(v_{1},\emptyset)\dots(v_{s},\emptyset)\cdot((v_{s},\emptyset),f_{s})\cdot(v^{\prime}_{s+1},X^{\prime}_{s+1})$ , where $X^{\prime}_{s+1}=\textsf{upd}((v_{s},\emptyset),f_{s},v^{\prime}_{s+1})$ (with $X^{\prime}_{s+1}\neq\emptyset$ ). Obviously, $R^{\prime}_{s+1}$ is a prefix of an outcome of $\zeta$ . So far, the message propagating system has worked well along $h^{\prime}_{s+1}$ .

•

We assume that for every $a$ , $E_{\zeta}(\pi_{a}(h^{\prime}_{r}))$ (with $r\geq s+1$ ) is well-defined and equal to $R^{\prime}_{r}$ , with $\textit{last}(R^{\prime}_{r})=(v^{\prime}_{r},X^{\prime}_{r})$ ( $X^{\prime}_{r}\neq\emptyset$ ). We also assume that $R^{\prime}_{r}$ is (a prefix of) an outcome of $\zeta$ . Finally we assume that the message propagation system has worked properly along $h^{\prime}_{r}$ .

We show that the same properties hold for $h^{\prime}_{r+1}$ . We fix a player $a$ . Since $E_{\zeta}(\pi_{a}(h^{\prime}_{r}))=R^{\prime}_{r}$ , we have that $X^{\prime}_{r}$ and $\textsf{mes}^{\prime}_{r-1}$ satisfy the conditions $({\ddagger})$ given in Section B.2.

To do the inductive step, we first look at $\sigma_{a}(h^{\prime}_{r})$ :

[TABLE]

Hence, in all cases, we have $\sigma_{a}(h^{\prime}_{r})[1]=\big{(}\zeta(E_{\zeta}(\pi_{a}(h^{\prime}_{r})))(d)\big{)}_{a}=\big{(}\zeta(R^{\prime}_{r})(d)\big{)}_{a}$ , and messages are properly propagated by player $a$ . We write $m^{\prime}=\zeta(R^{\prime}_{r})(d)$ , and define the message $\textsf{mes}^{\prime}_{r}$ by $\textsf{mes}^{\prime}_{r}(a)=\textsf{id}_{d}$ if there is $b\in{\mathbbm{n}}(a)$ such that $\textsf{mes}^{\prime}_{r-1}(b)=\textsf{id}_{d}$ , and $\textsf{mes}^{\prime}_{r}(a)=\epsilon$ otherwise. Now, there is an action $\delta\in\textup{Act}$ such that

[TABLE]

We define

[TABLE]

with $X^{\prime}_{r+1}=\textsf{upd}((v^{\prime}_{r},X^{\prime}_{r}),\zeta(R^{\prime}_{r}),v^{\prime}_{r+1})$ . Note that $X^{\prime}_{r+1}$ is well-defined and nonempty since this is witnessed by $\textup{Tab}(v^{\prime}_{r},\big{(}\zeta(R^{\prime}_{r})(d)\big{)}[d/\delta])=v^{\prime}_{r+1}$ . By construction of $E_{\zeta}$ , we have that $E_{\zeta}(\pi_{a}(h^{\prime}_{r+1}))$ is well-defined and equal to $R^{\prime}_{r+1}$ . Finally, along $h^{\prime}_{r+1}$ , the communication system has been working properly, and $R^{\prime}_{r+1}$ is obviously (a prefix of) an outcome of $\zeta$ (distinct from $R$ ). This concludes the proof of the inductive step, and of the lemma.

The following statement is an obvious consequence of the construction of $E_{\zeta}$ :

Lemma B.4.

If $\zeta$ is a winning strategy for Eve in ${}_{{}^{G}}$ , then $\sigma=\Omega(\zeta)$ is a normed strategy profile, which is resistant to single-player visible and honest deviations, and whose payoff is equal to the payoff of the outcome of $\zeta$ where Adam complies with Eve’s suggestions.

Proof B.5.

We assume that $\zeta$ is a winning strategy for Eve, and that the payoff of the main outcome $R$ of $\zeta$ is $p=(p_{a})_{a\in\mathpzc{P}}$ . Then, for each other outcome of $\zeta$ , the payoff is bounded by $p$ .

Applying Lemma B.2, the main outcome $\rho$ of $\sigma$ is such that $E_{\zeta}(\rho)=R$ , yielding payoff $p$ for $\rho$ . Pick an honest and visible $d$ -deviation $\sigma^{\prime}_{d}$ , and let $\rho^{\prime}$ be the outcome of $\sigma[d/\sigma^{\prime}_{d}]$ . Then, $E_{\zeta}(\rho^{\prime})$ is defined and is an outcome of $\zeta$ (again by application of Lemma B.2), whose payoff is therefore bounded by $p$ . Hence, $\sigma$ satisfies the expected conditions.

B.3 From normed profiles in the original game to Eve’s strategies in ${}_{{}^{G}}$

We assume an arbitrary total order $<$ on the set Act. This will be used to define unique corresponding (local) histories in the original game.

We first define a mapping assigning families of (local) histories in the original game to histories in ${}_{{}^{G}}$ . Consider a history of Eve $H=(v_{0},X_{0})\cdot(v_{0},X_{0},f_{0})\dots(v_{s},X_{s})$ in ${}_{{}^{G}}$ .

•

If $X_{s}=\emptyset$ , then for every $r<s$ , $X_{r}=\emptyset$ as well, and therefore $f_{r}\in\textup{Act}{P}$ . We then associate with $H$ the single full history, which is easily seen to be well-defined:

[TABLE]

•

If $X_{s}\neq\emptyset$ , then there is a smallest index $0<r_{0}\leq s$ such that $X_{r_{0}}\neq\emptyset$ , and for every $r_{0}\leq r\leq s$ , $X_{r}\neq\emptyset$ . Note also that for every $r_{0}\leq r_{1}\leq r_{2}\leq s$ , $\textsf{dev}(X_{r_{2}})\subseteq\textsf{dev}(X_{r_{1}})$ . We then associate with $H$ and with every deviator $d\in\textsf{dev}(X_{s})$ , the (unique) full history:

[TABLE]

such that:

–

$l<r_{0}$ implies $m_{l}=f_{l}$ and $\textsf{mes}_{l}=\textsf{mes}_{\epsilon}$ ;

–

Let $l\geq r_{0}$ . For every $a\in\mathpzc{P}$ , $\textsf{mes}_{l}(a)=\textsf{id}_{d}$ if $a\in I^{X_{l}}_{d}$ , otherwise $\textsf{mes}_{l}(a)=\epsilon$ . Then, $m_{l}(-d)=f_{l}(d)(-d)$ , and $m_{l}(d)$ is the $<$ -smallest action $\delta\in\textup{Act}$ such that $v_{l+1}=\textup{Tab}(v_{l},f_{l}(d)[d/\delta])$ .

We denote by $\Lambda$ the function that associates with every Eve’s history $H$ in ${}_{{}^{G}}$ , either the single full history $h$ (first case), or the family of full histories $(h_{d})_{d\in\textsf{dev}(X_{s})}$ (second case).
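The only non-obvious choice in $\Lambda$ is the deviator's move $m_{l}(d)$, fixed as the $<$ -smallest action consistent with the observed transition. A minimal sketch of that choice follows. It assumes, for illustration only, that the transition table is available as a function `tab(v, move)`, that a move is a dictionary from players to actions, and that Python's ordering on action names plays the role of the total order $<$.

```python
from typing import Callable, Dict, Iterable, Optional

Player = str
Move = Dict[Player, str]


def smallest_consistent_action(tab: Callable[[str, Move], str], v: str,
                               suggested: Move, d: Player, v_next: str,
                               actions: Iterable[str]) -> Optional[str]:
    """Return the smallest delta with tab(v, suggested[d/delta]) == v_next."""
    for delta in sorted(actions):            # sorted() stands in for the order <
        candidate = dict(suggested)          # copy, so the suggestion is untouched
        candidate[d] = delta                 # player d plays delta instead
        if tab(v, candidate) == v_next:
            return delta
    return None                              # no action of d explains v_next
```

Fixing the smallest such action makes the full history associated with $H$ and $d$ unique even when several actions of $d$ lead to the same vertex.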

Let $\sigma$ be a normed strategy profile in the original game. We define Eve’s strategy $\zeta$ as follows:

•

if $H$ is such that $\textit{last}(H)=(v,\emptyset)$ , then $\zeta(H)\in\textup{Act}{P}$ and $\zeta(H)(a)=\sigma_{a}(\Lambda(H))[1]$ ;

•

if $H$ is such that $\textit{last}(H)=(v,X)$ with $X\neq\emptyset$ , then $\zeta(H):\textsf{dev}(X)\to\textup{Act}{P}$ is such that $\zeta(H)(d)(a)=\sigma_{a}(\Lambda(H)(d))[1]$ for all $d\in\textsf{dev}(X)$ .
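The two cases of this definition amount to keeping the action component of each player's strategy. The sketch below is illustrative only: it assumes that each $\sigma_{a}$ is given as a Python function from a history to an (action, message) pair, and that the histories $\Lambda(H)$ or $\Lambda(H)(d)$ have already been computed and are passed in.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Player = str
Strategy = Callable[[Any], Tuple[str, Optional[str]]]   # history -> (action, message)


def eve_suggestion(sigma: Dict[Player, Strategy], history: Any) -> Dict[Player, str]:
    """Case X = emptyset: one action per player, the first component of sigma_a."""
    return {a: strategy(history)[0] for a, strategy in sigma.items()}


def eve_suggestion_after_deviation(
        sigma: Dict[Player, Strategy],
        history_by_suspect: Dict[Player, Any]) -> Dict[Player, Dict[Player, str]]:
    """Case X nonempty: one joint action per suspected deviator d in dev(X)."""
    return {d: eve_suggestion(sigma, h_d) for d, h_d in history_by_suspect.items()}
```

Messages are dropped on purpose: for a normed profile they are determined by the propagation rule, so only the actions need to be suggested by Eve.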

This allows us to define a function $\Upsilon$ associating a strategy in the epistemic game with every normed profile in the original game.
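The definition of $\zeta$ from $\sigma$ can be sketched as a higher-order function. This is only an illustration under stated assumptions: `lam` is a stand-in for $\Lambda$, `dev` returns the set of possible deviators at the end of a history of Eve, each `sigma[a]` maps a history to an (action, message) pair, and the toy data at the end are hypothetical. The paper's projection $[1]$ corresponds to index `0` in Python.

```python
def upsilon(sigma, lam, dev):
    """Build Eve's strategy zeta from a profile sigma.

    sigma: dict mapping each player a to a function history -> (action, message)
    lam:   stand-in for Lambda; returns one history if there is no deviator,
           otherwise a dict {d: history} indexed by the possible deviators
    dev:   returns the set of possible deviators at the end of an Eve history
    """
    def zeta(eve_history):
        deviators = dev(eve_history)
        if not deviators:
            h = lam(eve_history)
            # First case: one suggested action per player.
            return {a: strat(h)[0] for a, strat in sigma.items()}
        family = lam(eve_history)
        # Second case: one suggested action profile per possible deviator d.
        return {d: {a: strat(family[d])[0] for a, strat in sigma.items()}
                for d in deviators}
    return zeta


# Toy data (hypothetical): histories are tuples, actions are integers.
sigma = {"a": lambda h: (len(h), ""), "b": lambda h: (0, "")}
zeta_main = upsilon(sigma, lam=lambda H: H, dev=lambda H: set())
zeta_dev = upsilon(sigma, lam=lambda H: {"b": H + ("x",)}, dev=lambda H: {"b"})
```

Here `zeta_main` covers the case $\textit{last}(H)=(v,\emptyset)$ and `zeta_dev` the case where player `b` is the only possible deviator.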

Pick a normed strategy profile $\sigma$ in the original game, and write $\zeta=\Upsilon(\sigma)$ for the corresponding strategy in the epistemic game.

Lemma B.6.

• Let $R$ be the unique outcome of $\zeta$ where Adam complies with Eve’s suggestions. Then, at the limit, $\Lambda(R)$ is the unique outcome $\rho$ of profile $\sigma$.

• Let $R^{\prime}$ be an outcome of $\zeta$ along which Adam does not always comply with Eve’s suggestions. Then, for every $d\in\lim_{s\to+\infty}\textsf{dev}(X^{\prime}_{s})$, there exists some honest and visible deviation $\sigma^{\prime}_{d}$ such that $\Lambda(R^{\prime})(d)=\textup{out}(\sigma[d/\sigma^{\prime}_{d}],v_{0})$.

Proof B.7.

Write $R=(v_{0},\emptyset)\cdot((v_{0},\emptyset),f_{0})\cdot(v_{1},\emptyset)\cdot((v_{1},\emptyset),f_{1})\cdot(v_{2},\emptyset)\dots$ for the outcome of $\zeta$ along which Adam complies with Eve’s suggestions. Then it is easy to argue that the outcome of $\sigma$ is $\rho=v_{0}\cdot(f_{0},\textsf{mes}_{\epsilon})\cdot v_{1}\cdot(f_{1},\textsf{mes}_{\epsilon})\cdot v_{2}\dots$, and $\rho$ coincides with $\Lambda(R)$ (at the limit).

Let $R^{\prime}=(v^{\prime}_{0},X^{\prime}_{0})\cdot((v^{\prime}_{0},X^{\prime}_{0}),f^{\prime}_{0})\cdot(v^{\prime}_{1},X^{\prime}_{1})\dots$ be an outcome of $\zeta$ for which there is some index $r$ with $X^{\prime}_{r}\neq\emptyset$, and take $r$ to be the first such index. We show by induction on $s\geq r$ that for every $d\in\textsf{dev}(X^{\prime}_{s})$, there is an honest $d$-deviation $\sigma^{\prime}_{d}$ such that the length-$s$ outcome of $\sigma[d/\sigma^{\prime}_{d}]$ is

$\Lambda(R^{\prime}_{\leq s})(d)$.

In the same induction, we will prove that $I^{X^{\prime}_{s}}_{d}=\{a\in\mathpzc{P}\mid\exists b\in{\mathbbm{n}}(a)\ \text{s.t.}\ \textsf{mes}_{s-1}(b)=\textsf{id}_{d}\}$ .

Before proving the induction, notice that $v^{\prime}_{0}\cdot(f^{\prime}_{0},\textsf{mes}_{0})\cdot v^{\prime}_{1}\cdot(f^{\prime}_{1},\textsf{mes}_{1})\dots v^{\prime}_{r-1}=v_{0}\cdot(f_{0},\textsf{mes}_{\epsilon})\cdot v_{1}\cdot(f_{1},\textsf{mes}_{\epsilon})\dots v_{r-1}$ (that is, it follows the main outcome), and $v^{\prime}_{r}\neq v_{r}$ .

• Assume $s=r$. Then, $X^{\prime}_{r}=\textsf{upd}((v_{r-1},\emptyset),\zeta(R_{\leq r-1}),v^{\prime}_{r})$ (defined and nonempty), with $\zeta(R_{\leq r-1})(a)=\sigma_{a}(\rho_{\leq r-1})[1]$. By construction, for every $d\in\textsf{dev}(X^{\prime}_{r})$, there is $\delta\in\textup{Act}$ such that $v^{\prime}_{r}=\textup{Tab}(v_{r-1},\sigma(\rho_{\leq r-1})[d/\delta])$, where, by abuse of notation, $\sigma(\rho_{\leq r-1})$ denotes the tuple $(\sigma_{a}(\rho_{\leq r-1})[1])_{a\in\mathpzc{P}}$. We define $\sigma^{\prime}_{d}(\rho_{\leq r-1})=(\delta,\textsf{id}_{d})$ (this is an honest and visible deviation). Then, $\rho_{\leq r}\in\textup{out}(\sigma[d/\sigma^{\prime}_{d}],v_{0})$. Also, since message propagation under $\sigma$ behaves well, for every $a$, $\textsf{mes}_{s-1}(a)=\epsilon$ if $a\neq d$, and $\textsf{mes}_{s-1}(d)=\textsf{id}_{d}$ (as given by $\sigma^{\prime}_{d}(\rho_{\leq r-1})$). The property on messages is then easily checked.

• Assume we have proven the result at rank $s$, and consider $R^{\prime}_{\leq s+1}=R^{\prime}_{\leq s}\cdot((v^{\prime}_{s},X^{\prime}_{s}),f^{\prime}_{s})\cdot(v^{\prime}_{s+1},X^{\prime}_{s+1})$. Pick $d\in\textsf{dev}(X^{\prime}_{s+1})$. Since $d\in\textsf{dev}(X^{\prime}_{s})$ as well, we can apply the induction hypothesis to $R^{\prime}_{\leq s}$ and obtain a deviation $\sigma^{\prime}_{d}$ such that the length-$s$ outcome of $\sigma[d/\sigma^{\prime}_{d}]$ is $\Lambda(R^{\prime}_{\leq s})(d)$.

By definition, $X^{\prime}_{s+1}=\textsf{upd}((v^{\prime}_{s},X^{\prime}_{s}),f^{\prime}_{s},v^{\prime}_{s+1})$, hence there exists $\delta\in\textup{Act}$ such that $\textup{Tab}(v^{\prime}_{s},f^{\prime}_{s}(d)[d/\delta])=v^{\prime}_{s+1}$. We then define $\sigma^{\prime}_{d}(\Lambda(R^{\prime}_{\leq s})(d))=(\delta,\textsf{id}_{d})$. The length-$(s+1)$ outcome of $\sigma[d/\sigma^{\prime}_{d}]$ is

$\Lambda(R^{\prime}_{\leq s})(d)\cdot(m[d/\delta],\textsf{mes}[d/\textsf{id}_{d}])\cdot v^{\prime}$

where:

– for every $a\in\mathpzc{P}$, $m(a)=\sigma_{a}(\Lambda(R^{\prime}_{\leq s})(d))[1]$;

– for every $a\in\mathpzc{P}$, $\textsf{mes}(a)=\sigma_{a}(\Lambda(R^{\prime}_{\leq s})(d))[2]$;

– $v^{\prime}=\textup{Tab}(v^{\prime}_{s},m[d/\delta])$.

Now, we show that $m=f^{\prime}_{s}(d)$, so that $m[d/\delta]=f^{\prime}_{s}(d)[d/\delta]$: for every $a\in\mathpzc{P}$, $f^{\prime}_{s}(d)(a)=(\zeta(R^{\prime}_{\leq s})(d))(a)=\sigma_{a}(\Lambda(R^{\prime}_{\leq s})(d))[1]=m(a)$. Hence, $v^{\prime}=v^{\prime}_{s+1}$. We conclude that the length-$(s+1)$ outcome of $\sigma[d/\sigma^{\prime}_{d}]$ is

$\Lambda(R^{\prime}_{\leq s+1})(d)$.

Concerning the messages: $I^{X^{\prime}_{s+1}}_{d}=\{a\in\mathpzc{P}\mid\exists b\in I^{X^{\prime}_{s}}_{d}\ \text{s.t.}\ \textsf{dist}_{G}(a,b)\leq 1\}$ . Since $d$ is the deviator and the deviation is honest, $\textsf{mes}_{s-1}(d)=\textsf{mes}_{s}(d)=\textsf{id}_{d}$ , and for every $a\neq d$ , $\textsf{mes}_{s}(a)=\textsf{mes}(a)=\sigma_{a}(\Lambda(R^{\prime}_{\leq s})(d))[2]$ . Since $\sigma$ is normed, messages are propagated correctly by all players $a\neq d$ . Hence the expected equality holds.

Lemma B.8.

If $\sigma$ is a normed strategy profile in the original game that is resistant to single-player immediately visible and honest deviations, then $\zeta=\Upsilon(\sigma)$ is a winning strategy in the epistemic game.

Proof B.9.

Let $p$ be the payoff associated with $\rho=\textup{out}(\sigma,v_{0})$. The outcome of $\zeta$ when Adam complies with Eve’s suggestions is $R$ such that $\rho=\Lambda(R)$. In particular, $R$ has the same payoff as $\rho$, that is, $p$.

Assume now that $R^{\prime}=(v^{\prime}_{0},X^{\prime}_{0})\cdot((v^{\prime}_{0},X^{\prime}_{0}),f^{\prime}_{0})\cdot(v^{\prime}_{1},X^{\prime}_{1})\dots$ is a play in the epistemic game such that from some point on, $X^{\prime}_{i}\neq\emptyset$. Then, from Lemma B.6, for every $d\in\lim_{s\to+\infty}\textsf{dev}(X^{\prime}_{s})$, there exists $\sigma^{\prime}_{d}$ such that

$\rho^{\prime}=\textup{out}(\sigma[d/\sigma^{\prime}_{d}],v_{0})=\Lambda(R^{\prime})(d)$.

The payoffs of $\rho^{\prime}$ and $R^{\prime}$ therefore coincide (and are equal to $p^{\prime}$). Since $\sigma$ is resistant to single-player immediately visible and honest deviations, $p^{\prime}_{d}\leq p_{d}$. Hence for every $d\in\lim_{s\to+\infty}\textsf{dev}(X^{\prime}_{s})$, the payoff of player $d$ along $R^{\prime}$ is bounded by $p_{d}$, and $\zeta$ is winning.

B.4 Conclusion

As a consequence of Theorem 3.1 and Lemmas B.4 and B.8, we get Theorem 4.3.

Note that all the results are constructive; hence, if one can synthesize a winning strategy for Eve in the epistemic game, then one can synthesize a corresponding Nash equilibrium in the original game.

Appendix C Complexity analysis

By application of Lemma 4.1, we get:

Lemma C.1.

Let $(v,\emptyset)\cdot((v,\emptyset),f_{0})\cdot(v_{1},X_{1})\cdot((v_{1},X_{1}),f_{1})\cdot(v_{2},X_{2})\dots$ with $X_{1}\neq\emptyset$ be a history in the epistemic game. Then for every $r\geq 1$:

• $\textsf{dev}(X_{r})\subseteq\textsf{dev}(X_{1})$;

• for every $d\in\textsf{dev}(X_{r})$ and every $a\in\mathpzc{P}$, $\textsf{dist}_{G}(d,a)\leq r$ iff $a\in I_{d}^{X_{r}}$;

• if $d\in\textsf{dev}(X_{r})$ and $\textsf{dist}_{G}(d,a)\leq r$, then $K_{d}^{X_{r}}(a)=\{d\}$;

• if $d\in\textsf{dev}(X_{r})$ and $\textsf{dist}_{G}(d,a)>r$, then $K_{d}^{X_{r}}(a)=\textsf{dev}(X_{r})\setminus\{d^{\prime}\in\textsf{dev}(X_{r})\mid\textsf{dist}_{G}(d^{\prime},a)\leq r\}$.
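The second item can be checked on small instances with a short Python sketch. It iterates the update $I^{X_{s+1}}_{d}=\{a\mid\exists b\in I^{X_{s}}_{d}\ \text{s.t.}\ \textsf{dist}_{G}(a,b)\leq 1\}$ and compares the result with distance balls around the deviator. The sketch assumes, for illustration only, that the communication graph is undirected and given as a dictionary of neighbour sets; the function names and the example graph `path` are hypothetical.

```python
from collections import deque


def distances_from(graph, source):
    """BFS distances in an undirected graph given as {node: set_of_neighbours}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def informed_sets(graph, deviator, rounds):
    """Return [I_1, ..., I_rounds]: the players informed after each round."""
    informed = {deviator} | set(graph[deviator])   # players at distance <= 1
    result = [set(informed)]
    for _ in range(rounds - 1):
        # One propagation step: neighbours of informed players become informed.
        informed = informed | {w for b in informed for w in graph[b]}
        result.append(set(informed))
    return result


# Hypothetical communication graph: a path 0-1-2-3 plus an isolated player 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}
sets = informed_sets(path, deviator=0, rounds=3)
dist = distances_from(path, 0)
balls = [{a for a in path if dist.get(a, float("inf")) <= r} for r in (1, 2, 3)]
```

On this example the informed sets coincide with the balls of radius 1, 2 and 3 around the deviator, and the isolated player 4 is never informed.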

See Lemma 5.1.

Proof C.2.

We start by evaluating the number of Eve’s states. First, the number of Eve’s states $(v,\emptyset)$ is obviously $|V|$ .

Then, pick a state $(v^{\prime},X^{\prime})$ of Eve with $X^{\prime}\neq\emptyset$, such that there is a transition $((v,\emptyset),f)\to(v^{\prime},X^{\prime})$ in the epistemic game (these are immediately visible deviations). Then, following an argument used in [4, Prop. 4.8], we can show that $|\textup{Tab}|\geq 2^{|\textsf{dev}(X^{\prime})|}$: indeed, each player in $\textsf{dev}(X^{\prime})$ has been able to deviate, hence has at least two available actions from the current state, yielding the claimed bound.

We now analyze the part that is reachable from $(v^{\prime},X^{\prime})$. By Lemma C.1, any state $(v^{\prime\prime},X^{\prime\prime})$ of Eve reachable from $(v^{\prime},X^{\prime})$ satisfies $\textsf{dev}(X^{\prime\prime})\subseteq\textsf{dev}(X^{\prime})$ and is fully characterized by the triple $(v^{\prime\prime},\textsf{dev}(X^{\prime\prime}),r)$, where $r$ is the distance from $(v^{\prime},X^{\prime})$. Hence, the number of such states is bounded by $|V|\cdot 2^{|\textsf{dev}(X^{\prime})|}\cdot(\textsf{diam}(G)+2)$, where $\textsf{diam}(G)$ is the maximal diameter of the connected components of $G$ (the $+2$ term is for “distance $+\infty$” and for “distance larger than the diameter”). Hence it is bounded by $|V|\cdot|\textup{Tab}|\cdot(\textsf{diam}(G)+2)$.

Since there are at most $|\textup{Tab}|$ possible deviation starting points $(v^{\prime},X^{\prime})$ , the number of Eve’s states is bounded by $|V|+|V|\cdot|\textup{Tab}|^{2}\cdot(\textsf{diam}(G)+2)$ .

Now we evaluate the number of Adam’s states. There are at most $|\textup{Tab}|$ states of the form $((v,\emptyset),f)$. Now, from a state $(v,X)$ of Eve with $X\neq\emptyset$, there are Adam’s states $((v,X),f)$ with $f:\textsf{dev}(X)\to\textup{Act}^{\mathpzc{P}}$. It is a priori difficult to reduce the number of such $f$, which is bounded by $|\textup{Act}|^{|\textsf{dev}(X)|\cdot|\mathpzc{P}|}$, hence by $|\textup{Act}|^{|\mathpzc{P}|^{2}}$.
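To get a feeling for the orders of magnitude, the two bounds of this proof can be evaluated numerically. The helper names and the sample parameter values below are illustrative assumptions, not data from the paper.

```python
def eve_state_bound(num_vertices, tab_size, diameter):
    """Bound on Eve's states: |V| + |V| * |Tab|^2 * (diam(G) + 2)."""
    return num_vertices + num_vertices * tab_size ** 2 * (diameter + 2)


def suggestion_bound(num_actions, num_deviators, num_players):
    """|Act|^(|dev(X)| * |P|): number of maps f from dev(X) to Act^P."""
    return num_actions ** (num_deviators * num_players)


# Hypothetical instance: 3 vertices, |Tab| = 4, diameter 2, 2 actions, 3 players.
eve = eve_state_bound(num_vertices=3, tab_size=4, diameter=2)
per_state = suggestion_bound(num_actions=2, num_deviators=2, num_players=3)
worst_case = suggestion_bound(num_actions=2, num_deviators=3, num_players=3)
```

The bound on Eve's states stays polynomial in $|V|$, $|\textup{Tab}|$ and $\textsf{diam}(G)$, whereas the number of suggestions $f$ available at a single state of Eve grows exponentially with $|\textsf{dev}(X)|\cdot|\mathpzc{P}|$.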

Bibliography


[1] Rajeev Alur, Thomas A. Henzinger, and Orna Kupferman. Alternating-time temporal logic. Journal of the ACM, 49:672–713, 2002.
[2] Patricia Bouyer. Games on graphs with a public signal monitoring. Research Report https://arxiv.org/abs/1710.07163, arXiv, 2017.
[3] Patricia Bouyer. Games on graphs with a public signal monitoring. In Proc. 21st International Conference on Foundations of Software Science and Computation Structures (FoSSaCS’18), volume 10803 of Lecture Notes in Computer Science, pages 530–547. Springer, 2018.
[4] Patricia Bouyer, Romain Brenguier, Nicolas Markey, and Michael Ummels. Pure Nash equilibria in concurrent games. Logical Methods in Computer Science, 11(2:9), 2015.
[5] Patricia Bouyer, Nicolas Markey, and Daniel Stan. Mixed Nash equilibria in concurrent games. In Proc. 33rd Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS’14), volume 29 of LIPIcs, pages 351–363. Leibniz-Zentrum für Informatik, 2014.
[6] Patricia Bouyer and Nathan Thomasset. Nash equilibria in games over graphs equipped with a communication mechanism. Research Report https://arxiv.org/abs/1906.07753, arXiv, 2019.
[7] Romain Brenguier. Robust equilibria in mean-payoff games. In Proc. 19th International Conference on Foundations of Software Science and Computation Structures (FoSSaCS’16), volume 9634 of Lecture Notes in Computer Science, pages 217–233. Springer, 2016.
[8] Krishnendu Chatterjee, Thomas A. Henzinger, and Marcin Jurdziński. Games with secure equilibria. Theoretical Computer Science, 365(1-2):67–82, 2006.
