Computer aided synthesis: a game theoretic approach

Véronique Bruyère

arXiv:1706.00652·cs.GT·June 5, 2017

TL;DR

This paper introduces game theory concepts and methods applied to computer aided synthesis, covering classical results on two-player and multi-player games, with connections to automata theory and solution approaches.

Contribution

It provides a comprehensive overview of game theoretic approaches in synthesis, including classical results and solution strategies, with illustrative examples and proof intuitions.

Findings

01 Classical results on two-player zero-sum games

02 Extensions to multi-player non-zero-sum games

03 Connections between one-player games and automata theory

Abstract

In this invited contribution, we propose a comprehensive introduction to game theory applied in computer aided synthesis. In this context, we give some classical results on two-player zero-sum games and then on multi-player non zero-sum games. The simple case of one-player games is strongly related to automata theory on infinite words. All along the article, we focus on general approaches to solve the studied problems, and we provide several illustrative examples as well as intuitions on the proofs.

Taxonomy

Topics: Formal Methods in Verification · Logic, programming, and type systems · Semigroups and automata theory

Full text

Computer Science Department, University of Mons

20 Place du Parc, B-7000-Mons, Belgium

Email: [email protected]

Keywords:

Games played on graphs, Boolean objective, quantitative objective, winning strategy, Nash equilibrium, synthesis.

1 Introduction

Game theory is a well-developed branch of mathematics that is applied to various domains like economics, biology, computer science, etc. It is the study of mathematical models of interaction and conflict between individuals and the understanding of their decisions assuming that they are rational [54, 70].

The last decades have seen a lot of research on algorithmic questions in game theory motivated by problems from computer aided synthesis. One important line of research is concerned with reactive systems that must continuously react to the uncontrollable events produced by the environment in which they evolve. A controller of a reactive system indicates which actions it has to perform to satisfy a certain objective against any behavior of the environment. An example in air traffic management is the autopilot that controls the speed of the plane but has no control over the weather conditions. Such a situation can be modeled by a two-player game played on a graph: the system and the environment are the two players, the vertices of the graph model the possible configurations, and the infinite paths in the graph model all the continuous interactions between the system and the environment. In this game, the system wants to achieve a certain objective while the environment tries to prevent it from doing so. The objectives of the two players are thus antagonistic and we speak of zero-sum games. In this framework, checking whether the system is able to achieve its objective reduces to deciding the existence of a winning strategy in the corresponding game, and building a controller reduces to computing such a strategy [38]. Whether such a controller can be automatically designed from the objective is known as the synthesis problem.

Another, more recent, line of research is concerned with the modeling and the study of complex systems. Instead of the simple situation of a system embedded in a hostile environment, we are faced with systems/environments formed of several components, each with its own objective, and these objectives are not necessarily conflicting. Imagine the situation of several users behind their computers on a shared network. In this case, we use the model of multi-player non zero-sum games played on graphs: the components are the different players, each of them aiming at satisfying his objective. In this context, the synthesis problem is a little different: winning strategies are no longer appropriate and are replaced by the concept of equilibrium, that is, a strategy profile where no player has an incentive to deviate [39]. Different kinds of equilibria have been investigated, among which the famous notion of Nash equilibrium [53].

Much work has been devoted to Boolean objectives, in particular to the class of $\omega$-regular objectives, like avoiding a deadlock, always granting a request, etc. [38]. An infinite path in the game graph is either winning or losing depending on whether the objective is satisfied or not. To allow richer objectives, such as minimizing the energy consumption or guaranteeing a limited response time to a request, existing models have been enriched with quantitative aspects so as to associate a payoff (or a cost) with every path in the game graph [19]. In this setting, we speak of quantitative objectives, and a classical decision problem in two-player zero-sum games is whether there exists a winning strategy for the system that ensures a payoff satisfying some given constraints no matter how the environment behaves. For instance, we would like an energy consumption lying within a certain given interval. The same kind of question is also considered for multi-player non zero-sum games, that is, whether there exists an equilibrium such that the payoff of each player satisfies the constraints.

Decidability of those problems is not enough. Indeed, in case of a positive answer, it is important to know the exact complexity class of the problem and how complex the strategies used to solve it are. Given past interactions between the players, a strategy for a player indicates the next action he has to perform. The amount of memory needed about those past interactions is one of the ways to express the complexity of the strategy. The simplest strategies are those that require no memory at all. When all these characteristics are known and indicate practical applicability of the models, the final step is the implementation of the solving strategies into a program (for instance a controller for a reactive system) by using adequate data structures and possibly heuristics.

In this article, we propose a comprehensive introduction to classical algorithmic solutions to the synthesis problem for two-player zero-sum games and for multi-player non zero-sum games. A complementary survey can be found in [9], and detailed expositions in the case of Boolean objectives are provided in [38, 39]. We study the existence of winning strategies (in two-player zero-sum games) and equilibria (in multi-player non zero-sum games) satisfying some given constraints, in particular the complexity class of the decision problem and the memory required for the related strategies. We provide several illustrative examples as well as intuitions on some proofs. We do not intend to present an exhaustive survey, but rather focus on some lines of research, with an emphasis on general approaches. In particular, we only consider (i) turn-based (and not concurrent) games, in which the players choose their actions in turn (and not concurrently), (ii) deterministic (and not stochastic) games, in which edges are deterministic and not labeled by probabilities, and (iii) pure (and not randomized) strategies, in which the next action is chosen in a deterministic way (and not according to a probability distribution).

Our approach is as follows. We begin with a general definition of game that includes the class of games with Boolean objectives and the class of games with quantitative objectives. For two-player zero-sum games, we present a criterion [36] that implies, for several large families of games, the existence of memoryless winning strategies ensuring a payoff satisfying some given constraints. For non zero-sum multi-player games, we present a characterization of plays (used for instance in [14, 65]) that are the outcome of a Nash equilibrium. The existence of a Nash equilibrium in many different families of games is derived from this characterization, as well as results on the existence of a Nash equilibrium satisfying some constraints. We also present two other well-studied equilibria: the secure equilibria [23] and the subgame perfect equilibria [61]. For the studied decision problems, in addition to the results derived from our general approaches, we provide in this survey an overview of known results for games with Boolean and quantitative objectives.

The article is organized in the following way. In Section 2, we introduce the concepts of game and strategy, we then present the studied decision problems, and we finally recall the Boolean and quantitative objectives that are classically studied. In Section 3, devoted to two-player zero-sum games, we begin with the simple case of one-player games, and show how the decision problems are connected to problems in automata theory and numeration systems. We then present the general criterion mentioned before, and then the solutions to the decision problems for the classes of games with Boolean and quantitative objectives. Finally, we present several recent extensions of those classes of games, where for instance the single objective is replaced by a Boolean intersection of several objectives. The case of multi-player non zero-sum games is investigated in Section 4, starting with the characterization of outcomes of Nash equilibria. Derived results on the existence of a Nash equilibrium (under some given constraints) are then detailed, followed by a study of other kinds of equilibria like secure and subgame perfect equilibria. We provide a short conclusion in Section 5.

2 Terminology and studied problems

We consider multi-player turn-based games played on finite directed graphs. The set of vertices is partitioned among the different players. A play is an infinite sequence of vertices obtained by moving an imaginary pebble from vertex to vertex according to existing edges. The owner of the current vertex decides the next move of the pebble according to some strategy. Each player follows a strategy so as to achieve a certain objective. This objective depends on a preference relation that the player has on the payoffs assigned to plays. In this section, we introduce all these notions and state the problems studied in this article.

2.1 Preliminaries

2.1.1 Games

We begin with the notions of arena and game.

Definition 1

An arena is a tuple $A=(\Pi,V,(V_{i})_{i\in\Pi},E)$ where:

• $\Pi$ is a finite set of players,

• $V$ is a finite set of vertices and $E\subseteq V\times V$ is a set of edges, such that each vertex has at least one outgoing edge (this condition guarantees that there is no deadlock; it can be assumed w.l.o.g. for all the problems considered in this article),

• $(V_{i})_{i\in\Pi}$ is a partition of $V$, where $V_{i}$ is the set of vertices owned by player $i\in\Pi$ (we also say that player $i$ controls the vertices of $V_{i}$).

A play is an infinite sequence $\rho=\rho_{0}\rho_{1}\ldots\in V^{\omega}$ of vertices such that $(\rho_{k},\rho_{k+1})\in E$ for all $k\in\mathbb{N}$. Histories are finite sequences $h=h_{0}\ldots h_{n}\in V^{*}$ defined in the same way. We often use the notation $hv$ to single out the last vertex $v\in V$ of the history. The set of plays is denoted by $Plays$ and the set of nonempty histories (resp. ending with a vertex in $V_{i}$) by $Hist$ (resp. by $Hist_{i}$). A prefix (resp. suffix) of a play $\rho=\rho_{0}\rho_{1}\ldots$ is a finite sequence $\rho_{\leq n}=\rho_{0}\dots\rho_{n}$ (resp. infinite sequence $\rho_{\geq n}=\rho_{n}\rho_{n+1}\ldots$). We often use the notation $h\rho$ for a play of which the history $h$ is a prefix. Given a play $\rho$, we denote by $in\!f(\rho)$ the set of vertices visited infinitely often by $\rho$. We say that $\rho$ is a lasso if it is equal to $hg^{\omega}$ with $h,g$ being two histories. This lasso is called simple if $hg$ has no repeated vertices.
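As a minimal sketch of Definition 1 and of the notions of history and lasso, the following Python class stores an arena and checks the conditions stated above. The class name `Arena`, its fields and its methods are illustrative assumptions rather than notation from the article, and the two-vertex arena at the end is a hypothetical toy example.

```python
from dataclasses import dataclass


@dataclass
class Arena:
    """Arena (Pi, V, (V_i), E); `owner` encodes the partition (V_i) of V."""
    players: set
    owner: dict   # vertex -> player owning it
    edges: set    # set of pairs (u, v), a subset of V x V

    def __post_init__(self):
        vertices = set(self.owner)
        if not set(self.owner.values()) <= set(self.players):
            raise ValueError("every vertex must be owned by a declared player")
        if any(u not in vertices or v not in vertices for u, v in self.edges):
            raise ValueError("edges must connect declared vertices")
        if any(not self.successors(v) for v in vertices):
            raise ValueError("every vertex needs at least one outgoing edge")

    def successors(self, v):
        return {t for (s, t) in self.edges if s == v}

    def is_history(self, h):
        """True if h is a nonempty finite path of the arena."""
        h = list(h)
        return (len(h) > 0
                and all(v in self.owner for v in h)
                and all((a, b) in self.edges for a, b in zip(h, h[1:])))

    def is_lasso(self, h, g):
        """True if h g^omega is a play: h g is a path and g closes into a cycle."""
        h, g = list(h), list(g)
        return len(g) > 0 and self.is_history(h + g) and (g[-1], g[0]) in self.edges

    def is_simple_lasso(self, h, g):
        """A lasso h g^omega is simple if h g has no repeated vertex."""
        hg = list(h) + list(g)
        return self.is_lasso(h, g) and len(set(hg)) == len(hg)


# Hypothetical toy arena: player 1 owns a, player 2 owns b.
toy = Arena(players={1, 2},
            owner={"a": 1, "b": 2},
            edges={("a", "b"), ("b", "a"), ("b", "b")})
```

With this toy arena, `toy.is_simple_lasso(["a"], ["b"])` describes the play $ab^{\omega}$, while `toy.is_lasso(["b"], ["a"])` fails because vertex `a` has no self-loop.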

Definition 2

A game $G$ is an arena $A=(\Pi,V,(V_{i})_{i\in\Pi},E)$ such that each player $i$ has:

• a payoff function $f_{i}:Plays\to P_{i}$ where $P_{i}$ is a set of payoffs,

• a preference relation $\prec_{i}\,\subseteq P_{i}\times P_{i}$ on his set of payoffs.

A preference relation $\prec_{i}$ is a strict total order, that is, an irreflexive, transitive and total binary relation. It allows player $i$ to compare two plays $\rho,\rho^{\prime}\in Plays$ with respect to their payoffs: $f_{i}(\rho)\prec_{i}f_{i}(\rho^{\prime})$ means that player $i$ prefers $\rho^{\prime}$ to $\rho$. Given $p,p^{\prime}\in P_{i}$, we write $p\preceq_{i}p^{\prime}$ when $p\prec_{i}p^{\prime}$ or $p=p^{\prime}$; notice that $p\nprec_{i}p^{\prime}$ iff $p^{\prime}\preceq_{i}p$ since $\prec_{i}$ is total.

A payoff function $f_{i}$ is prefix-independent if $f_{i}(h\rho)=f_{i}(\rho)$ for all $h\rho\in Plays$ . It is prefix-linear if for all $h\rho,h\rho^{\prime}\in Plays$ ,

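$$f_{i}(\rho)\preceq_{i}f_{i}(\rho^{\prime})\ \Rightarrow\ f_{i}(h\rho)\preceq_{i}f_{i}(h\rho^{\prime})\qquad(1)$$

$$f_{i}(\rho)\prec_{i}f_{i}(\rho^{\prime})\ \Rightarrow\ f_{i}(h\rho)\prec_{i}f_{i}(h\rho^{\prime})\qquad(2)$$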

Any prefix-independent function $f_{i}$ is prefix-linear.

When an initial vertex $v_{0}\in V$ is fixed, we call $(G,v_{0})$ an initialized game. In this case, plays and histories are supposed to start in $v_{0}$ , and we then use notations $Plays(v_{0})$ , $Hist(v_{0})$ , and $Hist_{i}(v_{0})$ (instead of $Plays$ , $Hist$ , and $Hist_{i}$ ).

Example 1

Consider the initialized two-player game $(G,v_{0})$ in Figure 1 such that player $1$ (resp. player $2$) controls vertices $v_{0},v_{2},v_{3}$ (resp. vertex $v_{1}$). In all examples of this article, circle (resp. square) vertices are controlled by player $1$ (resp. player $2$). Both players use the same set $P$ of payoffs equal to $\{p_{1},p_{2},p_{3}\}$, and the same payoff function $f$ that is prefix-independent: $f((v_{0}v_{1})^{\omega})=p_{1}$, $f(v_{2}^{\omega})=p_{2}$, and $f(v_{3}^{\omega})=p_{3}$. The preference relation for player $1$ (resp. player $2$) is $p_{1}\prec_{1}p_{2}\prec_{1}p_{3}$ (resp. $p_{2}\prec_{2}p_{3}\prec_{2}p_{1}$).

2.1.2 Strategies

Let $(G,v_{0})$ be an initialized game. A strategy $\sigma_{i}$ for player $i$ in $(G,v_{0})$ is a function $\sigma_{i}:Hist_{i}(v_{0})\to V$ assigning to each history $hv\in Hist_{i}(v_{0})$ a vertex $v^{\prime}=\sigma_{i}(hv)$ such that $(v,v^{\prime})\in E$ . Thus $\sigma_{i}(hv)$ is the next vertex chosen by player $i$ (that controls vertex $v$ ) after history $hv$ has been played. A play $\rho\in Plays(v_{0})$ is consistent with $\sigma_{i}$ if $\rho_{n+1}=\sigma_{i}(\rho_{\leq n})$ for all $n$ such that $\rho_{n}\in V_{i}$ .

A strategy $\sigma_{i}$ for player $i$ is positional if it only depends on the last vertex of the history, i.e., $\sigma_{i}(hv)=\sigma_{i}(v)$ for all $hv\in Hist_{i}(v_{0})$. More generally, it is finite-memory if $\sigma_{i}(hv)$ needs only a finite amount of information about the history $hv$. This is possible with a finite-state machine that keeps track of histories of plays. The strategy chooses the next vertex depending on the current state of the machine and the current vertex in the game. (This informal definition is enough for this survey; see for instance [38] for a formal definition.) The previous definition of positional strategy $\sigma_{i}$ for player $i$ is given for an initialized game $(G,v_{0})$. We call it uniform if it is defined for all $hv\in Hist_{i}$ (instead of $Hist_{i}(v_{0})$), that is, when $\sigma_{i}$ is a positional strategy in all initialized games $(G,v)$, $v\in V$.

A strategy profile is a tuple $(\sigma_{i})_{i\in\Pi}$ of strategies, where each $\sigma_{i}$ is a strategy of player $i$ . It is called positional (resp. uniform, finite-memory) if all $\sigma_{i}$ , $i\in\Pi$ , are positional (resp. uniform, finite-memory). Given an initial vertex $v_{0}$ , such a strategy profile determines a unique play of $(G,v_{0})$ that is consistent with all strategies $\sigma_{i}$ . This play is called the outcome of $(\sigma_{i})_{i\in\Pi}$ in $(G,v_{0})$ and is denoted by $\langle(\sigma_{i})_{i\in\Pi}\rangle_{v_{0}}$ .

Example 1 (continued)

An example of strategy profile $(\sigma_{1},\sigma_{2})$ in $(G,v_{0})$ is the following one:

• the positional strategy $\sigma_{2}$ for player $2$ is defined such that $\sigma_{2}(hv_{1})=v_{3}$ for all $hv_{1}\in Hist(v_{0})$,

• the finite-memory strategy $\sigma_{1}$ for player $1$ is defined such that $\sigma_{1}(v_{0})=v_{1}$ and $\sigma_{1}(hv_{0})=v_{2}$ for all $hv_{0}\in Hist(v_{0})\setminus\{v_{0}\}$ (as player $1$ can only loop on vertices $v_{2}$ and $v_{3}$, we do not formally define $\sigma_{1}$ on histories ending with $v_{2}$ or $v_{3}$). Hence player $1$ chooses to move to $v_{1}$ (resp. to $v_{2}$) at the first visit (resp. next visits) to $v_{0}$. The needed memory is whether the current history has visited $v_{0}$ once or more than once.

The outcome $\langle(\sigma_{1},\sigma_{2})\rangle_{v_{0}}$ is equal to $v_{0}v_{1}v_{3}^{\omega}$ with payoff $p_{3}$ .
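The outcome of this profile can be reproduced by a short simulation. The sketch below is only illustrative: the function name `outcome_prefix` and the encoding of strategies as Python functions on tuples of vertices are assumptions, and only a finite prefix of the (infinite) outcome is computed. Ownership of vertices and both strategies are taken from Example 1.

```python
def outcome_prefix(owner, strategies, v0, length):
    """Return the first `length` vertices of the outcome of a strategy profile from v0.

    `owner` maps each vertex to its player and `strategies` maps each player
    to a function from histories (tuples of vertices) to the next vertex.
    """
    history = [v0]
    while len(history) < length:
        player = owner[history[-1]]
        history.append(strategies[player](tuple(history)))
    return history


# Ownership as stated in Example 1: player 1 controls v0, v2, v3 and player 2 controls v1.
OWNER = {"v0": 1, "v1": 2, "v2": 1, "v3": 1}


def sigma_1(history):
    """Finite-memory strategy of player 1: v1 at the first visit to v0, v2 afterwards."""
    last = history[-1]
    if last == "v0":
        return "v1" if history == ("v0",) else "v2"
    return last  # player 1 can only loop on v2 and v3


def sigma_2(history):
    """Positional strategy of player 2: always move from v1 to v3."""
    return "v3"


print(outcome_prefix(OWNER, {1: sigma_1, 2: sigma_2}, "v0", 6))
```

The printed prefix is `v0, v1, v3, v3, v3, v3`, in agreement with the outcome $v_{0}v_{1}v_{3}^{\omega}$.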

2.2 Studied problems

In this paper, we want to study two problems. In the first problem, one designated player, say player $1$ , wants to apply a strategy that guarantees certain constraints on the payoffs of the plays (with respect to his preference relation) against any strategy of the other players. The other players can thus be considered as one player, say player $2$ , being the opponent of player $1$ . This is the class of so-called two-player zero-sum games.

Problem 1

Let $(G,v_{0})$ be an initialized two-player zero-sum game and $\mu,\nu\in P_{1}$ be two bounds. Decide whether player $1$ has a strategy $\sigma_{1}$ such that $\mu\preceq_{1}f_{1}(\rho)$ (resp. $\mu\preceq_{1}f_{1}(\rho)\preceq_{1}\nu$) for all plays $\rho\in Plays(v_{0})$ consistent with $\sigma_{1}$. (This problem is focused on player $1$: the payoff function $f_{2}$ and preference relation $\prec_{2}$ of player $2$ do not matter.)

Case $\mu\preceq_{1}f_{1}(\rho)$ is called the threshold problem whereas case $\mu\preceq_{1}f_{1}(\rho)\preceq_{1}\nu$ is called the constraint problem. When a strategy $\sigma_{1}$ as required in Problem 1 exists, it is called winning and a play $\rho$ consistent with $\sigma_{1}$ is also called winning; we also say that player $1$ can ensure a payoff $f_{1}(\rho)$ such that $\mu\preceq_{1}f_{1}(\rho)$ (resp. $\mu\preceq_{1}f_{1}(\rho)\preceq_{1}\nu$ ). When this problem is decidable, we are interested in finding its complexity class and the simplest winning strategies $\sigma_{1}$ , like positional or finite-memory ones when they exist.

In a two-player zero-sum game $G$, the opposition between player $1$ and player $2$ is most often described in terms of objectives. An objective $\Omega$ for player $1$ is a subset of $Plays$, here the set of plays $\rho$ such that $\mu\preceq_{1}f_{1}(\rho)$ (resp. $\mu\preceq_{1}f_{1}(\rho)\preceq_{1}\nu$). Player $1$ wants to ensure a play in $\Omega$ against any strategy of player $2$. As an opponent, player $2$ wants to avoid plays in $\Omega$, that is, to ensure the opposite objective $Plays\setminus\Omega$. We say that the game $G$ with objective $\Omega$ is determined if for each initial vertex $v_{0}$, either player $1$ has a winning strategy to ensure $\Omega$ in $(G,v_{0})$ or player $2$ has a winning strategy to ensure $Plays\setminus\Omega$. Martin's theorem [51] states that every two-player zero-sum game with Borel objectives is determined. Nevertheless, it gives no information on which player has a winning strategy and on the shape of such a winning strategy. This motivates studying Problem 1.

Example 2

Let us come back to the game of Figure 1 seen as a two-player zero-sum game (we thus focus on player $1$ ). In $(G,v_{0})$ , player $1$ has a winning strategy $\sigma_{1}$ for the threshold problem with $\mu=p_{2}$ , that is, for the objective $\Omega=\{\rho\mid f(\rho)\in\{p_{2},p_{3}\}\}$ : take the positional strategy $\sigma_{1}$ such that $\sigma_{1}(v_{0})=v_{2}$ . However he has no winning strategy for the threshold problem with $\mu=p_{3}$ . Indeed with the positional strategy $\sigma_{2}$ such that $\sigma_{2}(v_{1})=v_{0}$ , player $2$ has a winning strategy for the opposite objective since he can ensure a payoff equal to $p_{1}$ or $p_{2}$ .
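Example 2 can be checked mechanically. In this particular game, a play has payoff in $\{p_{2},p_{3}\}$ exactly when it reaches $v_{2}$ or $v_{3}$, so both threshold problems become reachability questions. The sketch below uses the classical attractor computation for reachability games, a standard technique that is not described in this section. The function name `attractor` is an assumption, and the edge set is inferred from the text of Examples 1 and 2 because Figure 1 is not reproduced here.

```python
def attractor(owner, edges, player, target):
    """Set of vertices from which `player` can force a visit to `target`.

    Assumes that every vertex has at least one outgoing edge (no deadlock).
    """
    succ = {v: {t for (s, t) in edges if s == v} for v in owner}
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in owner:
            if v in attr:
                continue
            if owner[v] == player:
                forced = bool(succ[v] & attr)   # the player picks one edge into attr
            else:
                forced = succ[v] <= attr        # the opponent cannot avoid attr
            if forced:
                attr.add(v)
                changed = True
    return attr


# Arena of Figure 1 as inferred from the text (an assumption, not a reproduction of the figure).
OWNER = {"v0": 1, "v1": 2, "v2": 1, "v3": 1}
EDGES = {("v0", "v1"), ("v0", "v2"), ("v1", "v0"), ("v1", "v3"),
         ("v2", "v2"), ("v3", "v3")}

wins_threshold_p2 = "v0" in attractor(OWNER, EDGES, 1, {"v2", "v3"})
wins_threshold_p3 = "v0" in attractor(OWNER, EDGES, 1, {"v3"})
print(wins_threshold_p2, wins_threshold_p3)
```

Under these assumptions the first test succeeds and the second fails, matching the example: player $1$ can ensure a payoff of at least $p_{2}$ from $v_{0}$ but not a payoff of $p_{3}$.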

In the second problem studied in this article, we come back to multi-player games where each player has his own payoff function and preference relation. Here, the players are not necessarily antagonistic: this is the class of so-called multi-player non zero-sum games. Instead of looking for a strategy ensuring a certain objective for one designated player, we are now interested in strategy profiles, called solution profiles, that provide payoffs satisfactory to all players with respect to their own objectives. A classical example of solution profile is the notion of Nash equilibrium (NE) [53]. Informally, a strategy profile is an NE if no player has an incentive to deviate (with respect to his preference relation) when the other players stick to their own strategies. In other words, an NE can be seen as a contract that makes every player satisfied in the sense that nobody wants to break the contract if the others follow it.

Definition 3

Given an initialized game $(G,v_{0})$ , a strategy profile $(\sigma_{i})_{i\in\Pi}$ is a Nash equilibrium if $f_{i}(\langle(\sigma_{i})_{i\in\Pi}\rangle_{v_{0}})\nprec_{i}f_{i}(\langle\sigma^{\prime}_{i},\sigma_{-i}\rangle_{v_{0}})$ for all players $i\in\Pi$ and all strategies $\sigma^{\prime}_{i}$ of player $i$ .

In this definition, notation $(\sigma^{\prime}_{i},\sigma_{-i})$ means the strategy profile such that all players stick to their own strategy except player $i$ who shifts from strategy $\sigma_{i}$ to strategy $\sigma^{\prime}_{i}$ . We say that $\sigma^{\prime}_{i}$ is a deviating strategy from $\sigma_{i}$ . When $f_{i}(\langle(\sigma_{i})_{i\in\Pi}\rangle_{v_{0}})\prec_{i}f_{i}(\langle\sigma^{\prime}_{i},\sigma_{-i}\rangle_{v_{0}})$ , $\sigma^{\prime}_{i}$ is called a profitable deviation for player $i$ with respect to $(\sigma_{i})_{i\in\Pi}$ .

Example 1 (continued)

Let us reconsider the non zero-sum game $G$ of Figure 1 and the strategy profile $(\sigma_{1},\sigma_{2})$ given previously in $(G,v_{0})$ ( $\sigma_{1}(v_{0})=v_{1}$ , $\sigma_{1}(hv_{0})=v_{2}$ for all $hv_{0}\in Hist(v_{0})\setminus\{v_{0}\}$ , and $\sigma_{2}(hv_{1})=v_{3}$ for all $hv_{1}\in Hist(v_{0})$ ). This strategy profile is an NE with outcome $\langle(\sigma_{1},\sigma_{2})\rangle_{v_{0}}=v_{0}v_{1}v_{3}^{\omega}$ . Indeed, player $1$ has no incentive to deviate since the payoff $p_{3}$ of $\langle\sigma_{1},\sigma_{2}\rangle_{v_{0}}$ is the best possible with respect to $\prec_{1}$ . If player $2$ uses the deviating strategy $\sigma^{\prime}_{2}$ from $\sigma_{2}$ such that $\sigma^{\prime}_{2}(v_{0}v_{1})=v_{0}$ , then the resulting outcome $\langle\sigma_{1},\sigma^{\prime}_{2}\rangle_{v_{0}}=v_{0}v_{1}v_{0}v_{2}^{\omega}$ has a less preferable payoff for him since $p_{2}\prec_{2}p_{3}$ . So player $2$ has no profitable deviation.

Other kinds of solution profiles will be studied in Section 4.

Problem 2

Let $(G,v_{0})$ be an initialized multi-player non zero-sum game and $(\mu_{i})_{i\in\Pi},(\nu_{i})_{i\in\Pi}\in(P_{i})_{i\in\Pi}$ be two tuples of bounds. Decide whether there exists a solution profile $(\sigma_{i})_{i\in\Pi}$ such that $\mu_{i}\preceq_{i}f_{i}(\langle(\sigma_{i})_{i\in\Pi}\rangle_{v_{0}})$ (resp. $\mu_{i}\preceq_{i}f_{i}(\langle(\sigma_{i})_{i\in\Pi}\rangle_{v_{0}})\preceq_{i}\nu_{i}$ ) for all players $i\in\Pi$ .

Similarly to Problem 1, the two cases are respectively called threshold problem and constraint problem, and we want to compute the complexity class and the simplest solution profiles in case of decidability.

In Sections 3 and 4, we present some known results about solutions to Problems 1 and 2 respectively, with an emphasis on general approaches. Before that, we end Section 2 with a list of payoff functions that are classically studied.

2.3 Classical payoff functions

In the classes of games that are classically studied, each player $i\in\Pi$ uses a real-valued payoff function $f_{i}:Plays\to\mathbb{R}$ and a preference relation $\prec_{i}$ equal to the usual ordering $<$ on $P_{i}=\mathbb{R}$. Hence, player $i$ prefers to maximize the payoff $f_{i}(\rho)$ of a play $\rho$ (alternatively, $\prec_{i}$ can be the ordering $>$, meaning that player $i$ prefers to minimize the payoff of a play). In this classical setting, we focus on two particular subclasses: the Boolean payoff functions and the quantitative payoff functions.

Boolean payoff functions

A particular subclass of games $G$ are those equipped with Boolean functions $f_{i}:Plays\to\{0,1\}$, for all $i\in\Pi$, where payoff $1$ (resp. payoff $0$) means that the play is the most (resp. the least) preferred by player $i$. Particularly interesting related objectives are $\Omega_{i}=\{\rho\in Plays\mid f_{i}(\rho)=1\}$, $i\in\Pi$. Classical such objectives $\Omega_{i}$ are $\omega$-regular objectives like the following ones [38, 39, 55].

Definition 4

• Let $U\subseteq V$,

– Reachability: $\Omega_{i}=\{\rho\in Plays\mid\rho\text{ visits a vertex of }U\text{ at least once}\}$,

– Safety: $\Omega_{i}=\{\rho\in Plays\mid\rho\text{ visits no vertex of }U\}$,

– Büchi: $\Omega_{i}=\{\rho\in Plays\mid in\!f(\rho)\cap U\neq\emptyset\}$,

– Co-Büchi: $\Omega_{i}=\{\rho\in Plays\mid in\!f(\rho)\cap U=\emptyset\}$.

• Let $c:V\rightarrow\mathbb{N}$ be a coloring of the vertices by integers,

– Parity: $\Omega_{i}=\{\rho\in Plays\mid\text{the maximum color seen infinitely often along }c(\rho_{0})c(\rho_{1})\ldots\text{ is even}\}$.

• Let $(F_{k},G_{k})_{1\leq k\leq l}$ be a family of pairs of sets $F_{k},G_{k}\subseteq V$,

– Rabin: $\Omega_{i}=\{\rho\in Plays\mid\exists k,\ 1\leq k\leq l,\text{ such that }in\!f(\rho)\cap F_{k}=\emptyset\text{ and }in\!f(\rho)\cap G_{k}\neq\emptyset\}$,

– Streett: $\Omega_{i}=\{\rho\in Plays\mid\forall k,\ 1\leq k\leq l,\ in\!f(\rho)\cap F_{k}\neq\emptyset\text{ or }in\!f(\rho)\cap G_{k}=\emptyset\}$.

• Let ${\cal F}\subseteq 2^{V}$ be a family of subsets of vertices,

– Muller: $\Omega_{i}=\{\rho\in Plays\mid in\!f(\rho)\in{\cal F}\}$. (A colored variant of the Muller objective is defined from a coloring $c:V\rightarrow\mathbb{N}$ of the vertices: the family $\cal F$ is composed of subsets of $c(V)$ (instead of $V$) and $\Omega_{i}=\{\rho\in Plays\mid in\!f(c(\rho_{0})c(\rho_{1})\ldots)\in{\cal F}\}$ [39]. See [42] for several variants of Muller games.)

Notice that reachability and safety (resp. Büchi and co-Büchi, Rabin and Streett) are dual objectives. The complement of a parity (resp. Muller) objective is again a parity (resp. Muller) objective: from the coloring function $c:V\rightarrow\mathbb{N}$, define the new function $c^{\prime}$ such that $c^{\prime}(v)=c(v)+1$ for all $v\in V$ (resp. from the family ${\cal F}\subseteq 2^{V}$, define the new family ${\cal F^{\prime}}=2^{V}\setminus{\cal F}$). A Büchi (resp. co-Büchi) objective is a particular case of a parity objective: assign color $2$ to vertices of $U$ and $1$ to vertices of $V\setminus U$ (resp. color $1$ to $U$ and $0$ to $V\setminus U$). Similarly, one can easily prove that a parity objective is both a Rabin and a Streett objective, which are themselves Muller objectives [38].

In the previous definition, the payoff function $f_{i}$ is prefix-independent in each case except for reachability and safety where only condition (1) of prefix-linearity is satisfied.

Example 3

Suppose that in the game of Figure 1, player $1$ wants to achieve the Büchi objective with $U=\{v_{2},v_{3}\}$ whereas player $2$ wants to achieve the Muller objective with ${\cal F}=\{\{v_{0},v_{1}\},\{v_{3}\}\}$ . Then the play $\rho=(v_{0}v_{1})^{\omega}$ has payoff $(0,1)$ , that is a payoff 0 for player $1$ and a payoff $1$ for player $2$ .
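For a lasso $hg^{\omega}$, the set $in\!f(\rho)$ is exactly the set of vertices of $g$, so the objectives of Definition 4 can be evaluated directly. The following sketch assumes plays are given as lassos (two lists of vertices); all function names are illustrative assumptions. It recomputes the payoff of Example 3.

```python
def inf_set(g):
    """Vertices visited infinitely often by the lasso h g^omega."""
    return frozenset(g)


def reachability(h, g, U):
    return bool((set(h) | set(g)) & set(U))


def safety(h, g, U):
    return not reachability(h, g, U)


def buchi(inf, U):
    return bool(inf & set(U))


def co_buchi(inf, U):
    return not buchi(inf, U)


def parity(inf, color):
    """True if the maximum color seen infinitely often is even."""
    return max(color[v] for v in inf) % 2 == 0


def rabin(inf, pairs):
    return any(not (inf & set(F)) and bool(inf & set(G)) for F, G in pairs)


def streett(inf, pairs):
    return all(bool(inf & set(F)) or not (inf & set(G)) for F, G in pairs)


def muller(inf, family):
    """`family` is a collection of vertex sets."""
    return inf in {frozenset(S) for S in family}


# Example 3: the play (v0 v1)^omega, with an empty stem h.
inf = inf_set(["v0", "v1"])
payoff = (int(buchi(inf, {"v2", "v3"})),
          int(muller(inf, [{"v0", "v1"}, {"v3"}])))
print(payoff)
```

The computed pair is `(0, 1)`, as in Example 3. The same helpers illustrate the remarks above: with colors $2$ on $U$ and $1$ elsewhere, `parity` agrees with `buchi`, and `streett` is the negation of `rabin` on the same family of pairs.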

Quantitative payoff functions

Classical quantitative payoff functions $f_{i}:Plays\to\mathbb{R}$ are defined from a weight function $w_{i}~{}:E\to\mathbb{Q}$ as follows [19] (each edge of the game $G$ is thus labeled by a $|\Pi|$ -tuple of weights).

Definition 5

Let $w_{i}~{}:E\to\mathbb{Q}$ be a weight function and $\lambda\in\;]0,1[$ be a rational discount factor. Then $f_{i}~{}:Plays\to\mathbb{R}$ is defined as one among the following payoff functions: let $\rho=\rho_{0}\rho_{1}\ldots\in Plays$ ,

• Supremum: ${\sf Sup}_{i}(\rho)=\sup_{n\in\mathbb{N}}w_{i}(\rho_{n},\rho_{n+1})$,

• Infimum: ${\sf Inf}_{i}(\rho)=\inf_{n\in\mathbb{N}}w_{i}(\rho_{n},\rho_{n+1})$,

• Limsup: ${\sf LimSup}_{i}(\rho)=\limsup\limits_{n\to\infty}w_{i}(\rho_{n},\rho_{n+1})$,

• Liminf: ${\sf LimInf}_{i}(\rho)=\liminf\limits_{n\to\infty}w_{i}(\rho_{n},\rho_{n+1})$,

• Mean-payoff ${\sf\overline{MP}}_{i}$: ${\sf\overline{MP}}_{i}(\rho)=\limsup\limits_{n\to\infty}\frac{1}{n}\sum\limits_{k=0}^{n-1}w_{i}(\rho_{k},\rho_{k+1})$,

• Mean-payoff ${\sf\underline{MP}}_{i}$: ${\sf\underline{MP}}_{i}(\rho)=\liminf\limits_{n\to\infty}\frac{1}{n}\sum\limits_{k=0}^{n-1}w_{i}(\rho_{k},\rho_{k+1})$,

• Discounted sum: ${\sf Disc}^{\lambda}_{i}(\rho)=\sum_{n=0}^{\infty}w_{i}(\rho_{n},\rho_{n+1})\lambda^{n}$.

Some of these payoff functions provide natural generalizations of the previous $\omega$ -regular objectives. Indeed the supremum (resp. infimum, limsup, liminf) function is a quantitative generalization of the reachability (resp. safety, Büchi, co-Büchi) objective. The mean-payoff and discounted sum functions are much studied in classical game theory [33].

There are two variants of mean-payoff functions because the limit may not exist. Nevertheless in case of a lasso $\rho=hg^{\omega}$ , both payoffs ${\sf\overline{MP}}_{i}(\rho)$ and ${\sf\underline{MP}}_{i}(\rho)$ coincide and are equal to the average weight of the cycle $g$ (with respect to the weight function $w_{i}$ ).
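On a lasso $hg^{\omega}$, every payoff function of Definition 5 has a closed form, which the following sketch computes with exact rational arithmetic. The function names are assumptions, and the weight function `W` is hypothetical: it is not the weight function of Figure 2, which is not reproduced here. For the discounted sum, the cycle contributes a geometric series with ratio $\lambda^{|g|}$.

```python
from fractions import Fraction


def split_weights(w, h, g):
    """Weights read along h g^omega: the transient part, then one turn of the cycle g."""
    path = list(h) + list(g) + [g[0]]
    weights = [Fraction(w[(a, b)]) for a, b in zip(path, path[1:])]
    return weights[:len(h)], weights[len(h):]


def sup_payoff(w, h, g):
    transient, cycle = split_weights(w, h, g)
    return max(transient + cycle)


def inf_payoff(w, h, g):
    transient, cycle = split_weights(w, h, g)
    return min(transient + cycle)


def limsup_payoff(w, h, g):
    return max(split_weights(w, h, g)[1])   # only the cycle matters


def liminf_payoff(w, h, g):
    return min(split_weights(w, h, g)[1])


def mean_payoff(w, h, g):
    """Both mean-payoff variants coincide on a lasso: average weight of the cycle."""
    cycle = split_weights(w, h, g)[1]
    return sum(cycle) / len(cycle)


def discounted_sum(w, h, g, lam):
    transient, cycle = split_weights(w, h, g)
    lam = Fraction(lam)
    head = sum(x * lam ** n for n, x in enumerate(transient))
    turn = sum(x * lam ** n for n, x in enumerate(cycle))
    # The cycle repeats forever: geometric series with ratio lam^{|g|}.
    return head + lam ** len(transient) * turn / (1 - lam ** len(cycle))


# Hypothetical weights for a single player (not those of Figure 2).
W = {("v0", "v1"): 1, ("v1", "v0"): 3, ("v0", "v2"): 0, ("v2", "v2"): 2}

print(mean_payoff(W, [], ["v0", "v1"]))                      # average of 1 and 3
print(discounted_sum(W, [], ["v0", "v1"], Fraction(1, 2)))
```

With these hypothetical weights, the play $(v_{0}v_{1})^{\omega}$ has mean-payoff $2$ and, for $\lambda=\frac{1}{2}$, discounted sum $\frac{10}{3}$.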

In Definition 5, the payoff function $f_{i}$ is prefix-independent in limsup, liminf and mean-payoff cases, prefix-linear in discounted sum case, and satisfies condition (1) of prefix-linearity in supremum and infimum cases.
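The claim for the discounted sum can be checked by a short computation. The following derivation is a sketch under one explicit assumption: the plays $\rho$ and $\rho^{\prime}$ start in the same vertex $v$, so that the history $h=h_{0}\ldots h_{m-1}$ contributes the same $m$ weighted edges to $h\rho$ and to $h\rho^{\prime}$. The constant $c_{h}$ is a name introduced here for illustration.

```latex
% Assumption: \rho_0 = \rho'_0 = v, and we set h_m = v.
\begin{align*}
{\sf Disc}^{\lambda}_{i}(h\rho)
  &= \underbrace{\sum_{n=0}^{m-1} w_{i}(h_{n},h_{n+1})\,\lambda^{n}}_{c_{h}}
     \;+\; \lambda^{m}\sum_{n=0}^{\infty} w_{i}(\rho_{n},\rho_{n+1})\,\lambda^{n} \\
  &= c_{h} + \lambda^{m}\,{\sf Disc}^{\lambda}_{i}(\rho).
\end{align*}
% Since \lambda^m > 0, the map x \mapsto c_h + \lambda^m x is strictly increasing, so
% Disc(\rho) \le Disc(\rho')  implies  Disc(h\rho) \le Disc(h\rho'),
% Disc(\rho) <   Disc(\rho')  implies  Disc(h\rho) <   Disc(h\rho').
```

By contrast, for the supremum function a large weight inside $h$ can make ${\sf Sup}_{i}(h\rho)={\sf Sup}_{i}(h\rho^{\prime})$ even when ${\sf Sup}_{i}(\rho)<{\sf Sup}_{i}(\rho^{\prime})$, which is why only the non-strict implication survives in that case.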

Example 4

We equip the game of Figure 1 with two weight functions $w_{1},w_{2}$ , leading to the game of Figure 2. Suppose that $f_{1}={\sf LimSup}_{1}$ and $f_{2}={\sf\overline{MP}}_{2}$ . The preferences of the players with respect to plays $(v_{0}v_{1})^{\omega}$ and $v_{0}v_{2}^{\omega}$ are opposed since $f_{1}((v_{0}v_{1})^{\omega})=1<f_{1}(v_{0}v_{2}^{\omega})=2$ for player $1$ , and $f_{2}(v_{0}v_{2}^{\omega})=1<f_{2}((v_{0}v_{1})^{\omega})=3$ for player $2$ .

In the sequel, games with the Boolean payoff functions of Definition 4 are called Boolean games. Similarly games with the quantitative payoff functions of Definition 5 are called quantitative games. We also speak about reachability game, supremum game, etc, when we want to refer to a game where all the players use the same type of payoff function. The complexity results mentioned later depend on the number of vertices, edges and players, as well as on the number of colors (resp. pairs, elements of $\cal F$ ) for parity (resp. Rabin/Streett, Muller) games, and on numerical rational values (of weights, discount factor, and bounds) given in binary for quantitative games.

3 Two-player zero-sum games

In two-player zero-sum games, players $1$ and $2$ have opposite objectives. This class of games has been much studied. In particular solutions to Problem 1 are well established for Boolean games and quantitative games as introduced in Section 2.3. Before presenting them, we begin with the simplest situation of games played by a unique player and we show that the problems studied in this article are connected to problems in automata theory and numeration systems.

3.1 One-player games

In one-player games, player 1 has no opponent: he is the only player to choose the next vertex at any moment of a play. In other words, a strategy $\sigma_{1}$ for player $1$ is nothing else than a play $\rho$ in the game. The statement of Problem 1 thus simplifies as follows (in Section 3.1, we omit index $1$ everywhere since player $1$ is the unique player of the game):

Problem 3

Let $(G,v_{0})$ be an initialized one-player game. Let $\mu,\nu\in P$ be two bounds. Decide whether there exists a play $\rho\in Plays(v_{0})$ such that $\mu\preceq f(\rho)$ (resp. $\mu\preceq f(\rho)\preceq\nu$ )?

3.1.1 Boolean games

For Boolean games, this problem is interesting only with bounds $\mu=\nu=1$ . Indeed recall that the payoff function $f$ is Boolean and that player $1$ prefers plays $\rho$ such that $f(\rho)=1$ . This is the classical well-known non emptiness problem for automata [55]. For instance, Problem 3 for one-player reachability (resp. Büchi) games with $\mu=\nu=1$ is the non emptiness problem for automata accepting finite words (resp. Büchi automata accepting infinite words).

Theorem 3.1

Let $(G,v_{0})$ be an initialized one-player Boolean game. Then Problem 3 (with $\mu=\nu=1$ ) is decidable in polynomial time with positional winning strategies, except for Streett and Muller games where finite-memory strategies are necessary and sufficient.

Let us comment on this theorem. Notice that a winning strategy for player 1 that is finite-memory (resp. positional) means that the corresponding winning play $\rho$ , or in terms of automata the accepted word, is a lasso (resp. a simple lasso). It is well-known that positional strategies are sufficient for Büchi objectives. This also happens for the other objectives except for Streett and Muller objectives (we will discuss this point in more detail in Section 3.2, see Theorem 3.5). Example 5 illustrates that finite-memory strategies are necessary for Streett and Muller games. In cases where positional strategies are sufficient, an algorithm for Problem 3 thus has to concentrate on the existence of winning simple lassos, which can be easily done in polynomial time. The case of Streett and Muller games can also be solved in polynomial time [31, 41]. Problem 3 is NL-complete for reachability and Büchi games [45, 67] as well as for safety, co-Büchi, Rabin, parity, and Muller games, and it is P-complete for Streett games [31, 60].
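For reachability and Büchi objectives, the search for a winning simple lasso amounts to plain graph reachability. The sketch below is a straightforward polynomial-time check, not an NL procedure; the graph encoding (a dictionary of successor lists) and the function names are assumptions made for this illustration.

```python
from collections import deque

def reachable(edges, start):
    """Set of vertices reachable from `start` (including `start`), by BFS."""
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in edges.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def has_reachability_play(edges, v0, target):
    """Is there a play from v0 that visits some vertex of `target`?"""
    return bool(reachable(edges, v0) & set(target))

def has_buchi_play(edges, v0, target):
    """Is there a play from v0 that visits `target` infinitely often?"""
    for u in reachable(edges, v0) & set(target):
        # u lies on a cycle iff u is reachable from one of its successors;
        # the path to u followed by that cycle is a winning lasso.
        if any(u in reachable(edges, v) for v in edges.get(u, ())):
            return True
    return False

chain = {"a": ["b"], "b": ["c"], "c": ["c"]}
star = {"v0": ["v1", "v2"], "v1": ["v0"], "v2": ["v0"]}
```

In `chain`, vertex `b` can be reached but not revisited, so the reachability objective with target `{b}` is satisfiable whereas the Büchi objective with the same target is not.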

Example 5

Consider the initialized one-player game $(G,v_{0})$ of Figure 3 with $V=\{v_{0},v_{1},v_{2}\}$ . For the Muller objective with ${\cal F}=\{V\}$ (or the Streett objective with the two pairs $(F_{1},G_{1}),(F_{2},G_{2})$ such that $F_{1}=\{v_{1}\}$ , $G_{1}=V$ and $F_{2}=\{v_{2}\}$ , $G_{2}=V$ ), a winning play $\rho\in Plays(v_{0})$ cannot be a simple lasso as it has to alternate between $v_{1}$ and $v_{2}$ .

3.1.2 Quantitative games

Let us turn to quantitative games. The existence of plays $\rho$ with $\mu\leq f(\rho)$ in one-player quantitative games (threshold problem) has been studied in [19].

Theorem 3.2

[19] Let $(G,v_{0})$ be an initialized one-player quantitative game, and $\mu\in\mathbb{Q}$ be a rational threshold. Then deciding whether there exists a play $\rho\in Plays(v_{0})$ such that $\mu\leq f(\rho)$ is solvable in polynomial time with positional strategies.

Let us comment on this theorem. Dealing with functions ${\sf Sup}$ , ${\sf Inf}$ , ${\sf LimSup}$ , and ${\sf LimInf}$ is equivalent to considering, respectively, reachability, safety, Büchi, and co-Büchi objectives (studied in Theorem 3.1). For instance, satisfying $\mu\leq{\sf Sup}(\rho)$ is equivalent to visiting an edge with a weight $\geq\mu$ along $\rho$ . For functions ${\sf\overline{MP}}$ , ${\sf\underline{MP}}$ , and ${\sf Disc}^{\lambda}$ , once one knows that positional strategies are sufficient (we will discuss this point in more detail in Section 3.2), the problem again reduces to the existence of a simple lasso $\rho=hg^{\omega}$ with maximum payoff $f(\rho)$ . In case of mean-payoff function, recall that both payoffs ${\sf\overline{MP}}(\rho)$ , ${\sf\underline{MP}}(\rho)$ coincide and are equal to the average weight of the cycle $g$ . A polynomial algorithm is proposed in [47] to compute a cycle in a weighted graph with maximum average weight. The case of function ${\sf Disc}^{\lambda}$ is polynomially solved by a linear programming approach in [2].
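To make the mean-payoff case concrete, here is a minimal sketch of a maximum cycle mean computation, based on the dynamic-programming characterization usually attributed to Karp. Whether this coincides with the algorithm of [47] is an assumption; the function name and the encoding of edges as triples `(u, v, w)` are also choices made for this illustration.

```python
from fractions import Fraction

def max_cycle_mean(vertices, edges, v0):
    """Maximum average weight of a cycle reachable from v0 (None if no cycle).

    edges: list of triples (u, v, w) with w an integer or Fraction.
    """
    n = len(vertices)
    # D[k][v] = maximum weight of a walk with exactly k edges from v0 to v,
    # or None when no such walk exists.
    D = [{v: None for v in vertices} for _ in range(n + 1)]
    D[0][v0] = Fraction(0)
    for k in range(1, n + 1):
        for u, v, w in edges:
            if D[k - 1][u] is not None:
                cand = D[k - 1][u] + Fraction(w)
                if D[k][v] is None or cand > D[k][v]:
                    D[k][v] = cand
    best = None
    for v in vertices:
        if D[n][v] is None:
            continue
        # A walk with n edges repeats a vertex, so some shorter walk to v exists.
        worst = min((D[n][v] - D[k][v]) / (n - k)
                    for k in range(n) if D[k][v] is not None)
        if best is None or worst > best:
            best = worst
    return best

# A three-vertex graph with weights 0 on one side and 2 on the other (illustrative).
V = ["v0", "v1", "v2"]
E = [("v0", "v1", 0), ("v1", "v0", 0), ("v0", "v2", 2), ("v2", "v0", 2)]
best_mean = max_cycle_mean(V, E, "v0")
```

The threshold problem for a mean-payoff function then reduces to comparing the returned value with $\mu$.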

We now discuss the existence of a play $\rho$ such that $\mu\leq f(\rho)\leq\nu$ , given two rational bounds $\mu,\nu\in\mathbb{Q}$ (constraint problem). The problem is more involved, in particular it is currently unsolved for function ${\sf Disc}^{\lambda}$ .

Theorem 3.3

[43, 66] Let $(G,v_{0})$ be an initialized one-player quantitative (except discounted sum) game, and $\mu,\nu\in\mathbb{Q}$ be two rational bounds. Then deciding whether there exists a play $\rho\in Plays(v_{0})$ such that $\mu\leq f(\rho)\leq\nu$ is solvable in polynomial time. Positional strategies are sufficient for supremum, infimum, limsup, and liminf games, whereas finite-memory is necessary and sufficient for mean-payoff ${\sf\overline{MP}}$ and ${\sf\underline{MP}}$ games.

Let us comment on this theorem. If we focus on function ${\sf LimSup}$ , looking for a play $\rho$ such that $\mu\leq{\sf LimSup}(\rho)\leq\nu$ reduces to the non emptiness problem for Rabin automata (studied in Theorem 3.1). Indeed the required play $\rho$ is such that at least one weight seen infinitely often along $\rho$ is $\geq\mu$ and none of them is $>\nu$ . A similar approach exists for functions ${\sf Sup}$ , ${\sf Inf}$ and ${\sf LimInf}$ . Whereas positional winning strategies are sufficient in all these cases, finite-memory is needed for mean-payoff functions as indicated in Example 6. Since finite-memory strategies are sufficient [43], the problem in both cases ${\sf\overline{MP}}$ , ${\sf\underline{MP}}$ reduces to the existence of a lasso $\rho$ satisfying the constraints. This can be checked in polynomial time by solving a linear program [66].
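The condition on the weights seen infinitely often can be checked directly on the graph: it suffices to find a reachable cycle that uses only edges of weight $\leq\nu$ and contains at least one edge of weight $\geq\mu$. The sketch below follows this idea; it is a direct graph search rather than an explicit Rabin automaton construction, and the names and edge encoding are assumptions of this illustration.

```python
from collections import deque

def _reach(succ, start):
    """Vertices reachable from `start` in the successor map `succ`."""
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in succ.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def limsup_constraint(edges, v0, mu, nu):
    """Is there a play from v0 with mu <= LimSup <= nu?  edges: (u, v, w) triples."""
    full, low = {}, {}
    for u, v, w in edges:
        full.setdefault(u, []).append(v)
        if w <= nu:
            # Only these edges may be taken infinitely often.
            low.setdefault(u, []).append(v)
    from_v0 = _reach(full, v0)  # the finite prefix may use any edge
    for u, v, w in edges:
        # Edge (u, v) must lie on a cycle made of edges with weight <= nu.
        if mu <= w <= nu and u in from_v0 and u in _reach(low, v):
            return True
    return False

E = [("a", "b", 5), ("b", "b", 1)]
```

With `E`, the play $ab^{\omega}$ sees weight $5$ once and weight $1$ forever, so the constraint holds for $\mu=\nu=1$ but not for $\mu=\nu=5$.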

Example 6

Consider the game of Figure 3 equipped with the weight function $w$ that labels the two left edges by $0$ and the two right edges by $2$ . A winning play $\rho$ for $\mu=\nu=1$ cannot be a simple lasso (with payoff either $0$ or $2$ ). However the non simple lasso $\rho=(v_{0}v_{1}v_{0}v_{2})^{\omega}$ is winning.
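The payoffs of this example can be recomputed as cycle averages. The sketch below assumes that the left edges (between $v_{0}$ and $v_{1}$) carry weight $0$, the value for which the stated play has mean-payoff $1$, and that Figure 3 has edges in both directions between $v_{0}$ and each of $v_{1},v_{2}$; both the weight table and the helper `cycle_mean` are assumptions of this illustration.

```python
from fractions import Fraction

# Assumed weights: 0 on the two left edges, 2 on the two right edges.
w = {("v0", "v1"): 0, ("v1", "v0"): 0, ("v0", "v2"): 2, ("v2", "v0"): 2}

def cycle_mean(cycle):
    """Average weight of a cycle given as a list of vertices (closing edge included)."""
    edges = zip(cycle, cycle[1:] + cycle[:1])
    return Fraction(sum(w[e] for e in edges), len(cycle))

simple_left = cycle_mean(["v0", "v1"])               # play (v0 v1)^omega
simple_right = cycle_mean(["v0", "v2"])              # play (v0 v2)^omega
alternating = cycle_mean(["v0", "v1", "v0", "v2"])   # play (v0 v1 v0 v2)^omega
```

Only the alternating cycle, which visits $v_{0}$ twice and hence needs memory, has average exactly $1$.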

Concerning function ${\sf Disc}^{\lambda}$ , Problem 3 is open. It is closely related to the following open problem, called target discounted-sum problem in [6].

Problem 4

Given three rational numbers $a,b$ and $t$ , and a rational discount factor $\lambda\in\,]0,1[$ , does there exist an infinite sequence $u=u_{0}u_{1}\ldots\in\{a,b\}^{\omega}$ such that $\sum_{n=0}^{\infty}u_{n}\lambda^{n}$ is equal to $t$ ?

The authors of [6] show that Problem 4 is related to several open questions in mathematics and computer science. In particular it is related to numeration systems and more precisely to $\beta$ -representations of real numbers [4, 50]. Given $\beta>1$ a real number (the base) and $A\subseteq\mathbb{N}$ a finite alphabet (the set of digits), a $\beta$ -representation of a real number $x\geq 0$ is an infinite sequence $(x_{n})_{n\leq k}\in A^{\omega}$ , also written $x_{k}\ldots x_{0}.x_{-1}x_{-2}\ldots$ , such that $x=\sum_{n\leq k}x_{n}\beta^{n}$ . A well-known result [58] is that every $x\geq 0$ has a $\beta$ -representation using $A=\{0,1,\ldots,\lceil\beta-1\rceil\}$ . It follows that Problem 4 asks whether $t$ has a $\beta$ -representation $x_{0}.x_{-1}x_{-2}\ldots$ (with $k=0$ ) using $\beta=\frac{1}{\lambda}$ and $A=\{a,b\}$ . This problem is therefore decidable when $a=0,b=1$ and $\lambda\geq\frac{1}{2}$ . Indeed the largest value of a sequence $x_{0}.x_{-1}x_{-2}\ldots\in\{0,1\}^{\omega}$ is $\sum_{n\geq 0}\beta^{-n}=\frac{\beta}{\beta-1}$ ; hence, using the result of [58], either $t<0$ or $t>\frac{\beta}{\beta-1}$ and $t$ has no $\beta$ -representation $x_{0}.x_{-1}x_{-2}\ldots\in\{0,1\}^{\omega}$ , or $0\leq t\leq\frac{\beta}{\beta-1}$ and it has such a $\beta$ -representation. Other partial results to Problem 4 can be found in [6].

3.2 Two-player games

We now turn to two-player zero-sum games. In Problem 1, the objective of player 1 is the set $\Omega$ of plays $\rho$ such that $\mu\preceq_{1}f_{1}(\rho)$ (resp. $\mu\preceq_{1}f_{1}(\rho)\preceq_{1}\nu$ ), whereas player 2 has the opposite objective $Plays\setminus\Omega$ . Examples of the threshold problem are the following ones: in a reachability game, player 1 aims at reaching some target set of vertices whereas player 2 tries to prevent him from reaching it; in a limsup game, player 1 aims at maximizing the payoff ${\sf LimSup}(\rho)$ of the play $\rho$ (so that it is $\geq\mu$ ) whereas player 2 tries to minimize it. Recall that by Martin’s theorem, every two-player zero-sum game with Borel objectives is determined. This large class of games includes the objectives $\Omega$ of player $1$ in Problem 1 for the Boolean and quantitative games introduced in Section 2.3. A lot of research has been devoted to solving Problem 1; we present it in this section. In Sections 3.2 and 3.3, as the objectives $\Omega$ and $Plays\setminus\Omega$ of players $1$ and $2$ only depend on $f_{1}$ , $\prec_{1}$ , and $P_{1}$ , we simplify the notation by omitting index $1$ .

3.2.1 Criterion for uniform optimal strategies

We begin by studying the winning strategies that player $1$ can use for the threshold problem in Problem 1. This is related to the notion of value and optimal strategy.

Definition 6

Let $(G,v_{0})$ be an initialized two-player zero-sum game. If there exists ${val}(v_{0})\in P$ such that

•

player $1$ has a strategy $\sigma_{1}$ such that ${val}(v_{0})\preceq f(\rho)$ for all plays $\rho$ in $Plays(v_{0})$ consistent with $\sigma_{1}$ , and

•

player $2$ has a strategy $\sigma_{2}$ such that $f(\rho)\preceq{val}(v_{0})$ for all plays $\rho$ in $Plays(v_{0})$ consistent with $\sigma_{2}$ ,

then ${val}(v_{0})=f(\langle\sigma_{1},\sigma_{2}\rangle_{v_{0}})$ is the value of $v_{0}$ and $\sigma_{1}$ (resp. $\sigma_{2}$ ) is an optimal strategy for player $1$ (resp. player $2$ ).

Intuitively, ${val}(v_{0})$ is the highest threshold $\mu$ for which player $1$ can ensure (with an optimal strategy) a payoff $f(\rho)$ such that $\mu\preceq f(\rho)$ . In this definition, the antagonistic player $2$ behaves in the opposite way. When the value ${val}(v_{0})$ exists and is computable, the threshold problem is easily solved: we just check whether the given threshold $\mu$ satisfies $\mu\preceq{val}(v_{0})$ . Moreover both players can limit themselves to use optimal strategies, that is, if player $1$ has a winning strategy (resp. no winning strategy) for the threshold problem, then player $1$ (resp. player $2$ ) can use an optimal strategy as winning strategy (resp. for the opposite objective).

Example 2 (continued)

Let us come back to the two-player zero-sum game of Example 2. Recall that in $(G,v_{0})$ , player $1$ has a winning strategy for the threshold problem with $\mu=p_{2}$ but not with $\mu=p_{3}$ , meaning that ${val}(v_{0})=p_{2}$ . Indeed, one can check that ${val}(v_{0})={val}(v_{1})={val}(v_{2})=p_{2}$ and ${val}(v_{3})=p_{3}$ , and that both players have optimal strategies that are positional, and even uniform. The values are indicated under the vertices in Figure 4, and the two uniform optimal strategies are given as thick edges.

We will see later in this section that Boolean and quantitative games often have uniform optimal strategies (see Theorems 3.5 and 3.6). In [36], the authors propose a unified approach to all these results: they give a general criterion on the payoff function that guarantees uniform optimal strategies for both players.

Theorem 3.4

[36] Let $G$ be a two-player zero-sum game with a preference relation $\prec$ on $P$ such that each subset of $P$ has an infimum and a supremum. If the payoff function $f$ is fairly mixing, that is,

1. $\forall h\rho,h\rho^{\prime}\in Plays$ , if $f(\rho)\preceq f(\rho^{\prime})$ then $f(h\rho)\preceq f(h\rho^{\prime})$ ,

2. $\forall h\rho,h\rho^{\prime}\in Plays$ , $\min\{f(\rho),f(h^{\omega})\}\preceq f(h\rho)\preceq\max\{f(\rho),f(h^{\omega})\}$ ,

3. $\forall h_{k}\in Hist,k\in\mathbb{N}$ , $\min\{f(h_{0}h_{2}h_{4}\ldots),f(h_{1}h_{3}h_{5}\ldots),\inf_{k}f(h_{k}^{\omega})\}\preceq f(h_{0}h_{1}h_{2}h_{3}\ldots)\preceq\max\{f(h_{0}h_{2}h_{4}\ldots),f(h_{1}h_{3}h_{5}\ldots),\sup_{k}f(h_{k}^{\omega})\}$ ,

then both players have uniform optimal strategies. (The hypotheses of this theorem are those given in the full version of [36] available at http://www.labri.fr/perso/gimbert/ .)

Let us comment on this theorem. The first condition is condition (1) of prefix-linearity. If $f$ is prefix-independent, then the first and the second conditions are trivially satisfied. The third condition is concerned with shuffles of histories. Let us apply this theorem to quantitative games, for instance to function ${\sf LimSup}$ (see Definition 5). This function is prefix-independent and satisfies the third condition since $\inf_{k}{\sf LimSup}(h_{k}^{\omega})\leq{\sf LimSup}(h_{0}h_{1}h_{2}h_{3}\ldots)\leq\sup_{k}{\sf LimSup}(h_{k}^{\omega})$ . One can check that the payoff functions of all quantitative games are fairly mixing, as well as the payoff functions of the Boolean games with reachability, safety, Büchi, co-Büchi, and parity objectives [36] (but not with Streett and Muller objectives).

The proof of Theorem 3.4 is simple and elegant; it is by induction on $|E|-|V|$ . (Theorem 3.4 is given in [36] for real-valued payoff functions $f:Plays\to\mathbb{R}$ and the usual ordering $<$ , but its proof is easily generalized to the statement given here.) If $|E|=|V|$ then there is exactly one outgoing edge for each vertex and thus both players have a unique possible strategy that is therefore uniform and optimal. Suppose that $|E|>|V|$ and let us focus on player $1$ (a symmetric argument is used for player $2$ ). If all vertices $v\in V_{1}$ have only one outgoing edge, then player $1$ has a unique strategy, and it is uniform and optimal. Suppose that some $v\in V_{1}$ has at least two outgoing edges. We partition this set of edges into two non empty subsets $E^{\prime}_{v}$ and $E^{\prime\prime}_{v}$ . From $G$ we define two smaller games $G^{\prime}$ and $G^{\prime\prime}$ with the same vertices and edges except that the set of outgoing edges from $v$ is restricted to $E^{\prime}_{v}$ in $G^{\prime}$ and to $E^{\prime\prime}_{v}$ in $G^{\prime\prime}$ . By induction hypothesis, $v$ has a value ${val}^{\prime}(v)$ in $G^{\prime}$ and ${val}^{\prime\prime}(v)$ in $G^{\prime\prime}$ , and both players have uniform optimal strategies, respectively $\sigma^{\prime}_{1},\sigma^{\prime}_{2}$ in $G^{\prime}$ and $\sigma^{\prime\prime}_{1},\sigma^{\prime\prime}_{2}$ in $G^{\prime\prime}$ . W.l.o.g. ${val}^{\prime\prime}(v)\preceq{val}^{\prime}(v)$ , we then choose $\sigma^{\prime}_{1}$ as optimal strategy for player $1$ in $G$ and for all $u\in V$ , we take their value ${val}^{\prime}(u)$ in $G^{\prime}$ as their value in $G$ . Clearly $\sigma^{\prime}_{1}$ is optimal and uniform in $G$ . The rest of the proof consists in defining a strategy for player $2$ (from $\sigma^{\prime}_{2}$ and $\sigma^{\prime\prime}_{2}$ ) that is optimal in $G$ . This is possible thanks to the three conditions of Theorem 3.4 applied on plays decomposed according to occurrences of $v$ .

Further results can be found in [37]: a characterization of payoff functions is given guaranteeing the existence of uniform optimal strategies for both players. From this characterization, it follows that if both players have uniform optimal strategies when playing solitary in one-player games, then they also have uniform optimal strategies in zero-sum two-player games.

3.2.2 Boolean games

Let us now focus on Boolean games. As for one-player games, we limit the study of Problem 1 (threshold and constraint problems) to the only interesting case $\mu=\nu=1$ . The following theorem for two-player games is the counterpart of Theorem 3.1 for one-player games.

Theorem 3.5

Let $(G,v_{0})$ be an initialized two-player zero-sum Boolean game. Then Problem 1 (with $\mu=\nu=1$ ) is

•

P-complete with uniform winning strategies for reachability, safety, Büchi, and co-Büchi objectives [3, 30, 38, 44],

•

P-complete with finite-memory winning strategies for Muller objective [41] (it is PSPACE-complete for the colored variant of Muller objective [42, 52]),

•

NP-complete with uniform winning strategies for Rabin objective [29, 30],

•

co-NP-complete with finite-memory winning strategies for Streett objective [16, 30],

•

in NP $\cap$ co-NP with uniform winning strategies for parity objectives [30].

Let us comment on this theorem. The existence of uniform winning strategies (for all objectives except Rabin and Muller objectives) was previously mentioned as a consequence of Theorem 3.4 [36]. Notice that here a value ${val}(v_{0})=1$ is equivalent to saying that player $1$ has a winning strategy for Problem 1. In case player $1$ has no winning strategy ( ${val}(v_{0})=0$ ), it follows that player $2$ has a winning strategy for the opposite objective by Martin’s theorem. Hence Theorem 3.5 also gives information for player $2$ by considering the opposite objective. In [48], the author gives general conditions on Boolean objectives that guarantee the existence of a uniform winning strategy for one of the players (and not necessarily for both players). This includes the case of Rabin games where player $1$ has a uniform winning strategy (whereas player $2$ needs to use a finite-memory strategy to win the opposite Streett objective).

Problem 1 is decidable in $O(|V|+|E|)$ time for reachability and safety games [38], and the current best algorithm for Büchi and co-Büchi games is in $O(|V|^{2})$ time [22]. For Muller games with ${\cal F}\subseteq 2^{V}$ , the complexity is in $O(|{\cal F}|\cdot(|{\cal F}|+|V|\cdot|E|)^{2})$ time [41], whereas for Rabin and Streett games with $l$ pairs $(F_{k},G_{k})$ , it is in $O(|V|^{l+1}l!)$ time [56]. Concerning parity games, the complexity class of Problem 1 is refined to UP $\cap$ co-UP in [46] and a major open problem is whether it can be solved in polynomial time. Very recently, a breakthrough quasi-polynomial time algorithm has been proposed in [17] for parity games.
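For reachability games, a linear-time procedure can be sketched with the standard attractor construction: starting from the target set, one repeatedly adds the vertices of player $1$ having some successor in the current set and the vertices of player $2$ having all their successors in it. That this is the method behind the bound cited from [38] is an assumption; the function name and graph encoding below are hypothetical as well.

```python
def attractor(vertices, edges, owner, target):
    """Vertices from which player 1 can force a visit to `target`.

    edges: dict vertex -> list of successors (without duplicates)
    owner: dict vertex -> 1 or 2
    """
    preds = {v: [] for v in vertices}
    for u in vertices:
        for v in edges[u]:
            preds[v].append(u)
    # For each vertex, number of successors not yet known to be winning.
    remaining = {v: len(edges[v]) for v in vertices}
    win = set(target)
    stack = list(win)
    while stack:
        v = stack.pop()
        for u in preds[v]:
            if u in win:
                continue
            remaining[u] -= 1
            # Player 1 needs one good successor, player 2 must have no escape.
            if owner[u] == 1 or remaining[u] == 0:
                win.add(u)
                stack.append(u)
    return win

V = ["a", "b", "c", "d", "t"]
E = {"a": ["b", "c"], "b": ["t", "d"], "c": ["t"], "d": ["d"], "t": ["t"]}
owner = {"a": 1, "b": 2, "c": 2, "d": 1, "t": 1}
winning = attractor(V, E, owner, {"t"})
```

Each edge is examined at most once, which gives the $O(|V|+|E|)$ running time. In the example, player $2$ escapes from `b` to the sink `d`, so `b` and `d` are not winning for player $1$.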

3.2.3 Quantitative games

Let us turn to quantitative games for which we first give results for the threshold problem, and then for the constraint problem. The following theorem provides the known results to the threshold problem. It describes the simplest form of winning strategies for player $1$ (resp. player $2$ ) when he has a winning strategy for this problem (resp. for ensuring the opposite objective when player $1$ has no winning strategy).

Theorem 3.6

Let $(G,v_{0})$ be an initialized two-player zero-sum quantitative game, and $\mu\in\mathbb{Q}$ be a rational bound. Then the threshold problem (in Problem 1) is

•

P-complete for supremum, infimum, limsup, and liminf games with uniform winning strategies for both players,

•

in NP $\cap$ co-NP for mean-payoff and discounted sum games with uniform winning strategies for both players [71].

Let us comment on this theorem. We already know the existence of uniform winning strategies from Theorem 3.4 [36]. The P-completeness for supremum, infimum, limsup, and liminf games follows from the P-completeness for reachability, safety, Büchi, and co-Büchi games in Theorem 3.5. Parity games are polynomially reducible to mean-payoff games [46] which are themselves polynomially reducible to discounted sum games [71]. For these three classes of games, from the existence of uniform winning strategies, we get a threshold problem in NP as follows: guess a uniform strategy $\sigma_{1}$ for player $1$ (by choosing one outgoing edge $(v,v^{\prime})$ for all $v\in V_{1}$ ), fix this strategy $\sigma_{1}$ in the game $G$ to get a one-player game $G_{\sigma_{1}}$ , apply the related polynomial time algorithm of Theorems 3.1 or 3.2 (from the point of view of player $2$ who controls $G_{\sigma_{1}}$ ). The co-NP membership is symmetrically obtained with player $2$ .
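The guess-and-check argument can be illustrated for mean-payoff games by the sketch below, where the nondeterministic guess is replaced by an exhaustive (exponential) enumeration of the memoryless strategies of player $1$. The one-player check uses a Bellman-Ford negative-cycle test on the shifted weights $w-\mu$, a substitute chosen here for brevity rather than the algorithm cited in the text; winning is checked from $v_{0}$ only, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def player2_can_go_below(vertices, edges, v0, mu):
    """Is some cycle with average weight < mu reachable from v0?

    Equivalent to a reachable negative cycle for the weights w - mu.
    """
    dist = {v: None for v in vertices}  # None stands for +infinity
    dist[v0] = Fraction(0)

    def relax():
        changed = False
        for u, v, w in edges:
            if dist[u] is not None:
                cand = dist[u] + Fraction(w) - mu
                if dist[v] is None or cand < dist[v]:
                    dist[v] = cand
                    changed = True
        return changed

    for _ in range(len(vertices) - 1):
        if not relax():
            return False  # distances stabilized: no negative cycle
    return relax()        # still improving after |V|-1 rounds: negative cycle

def mean_payoff_threshold(vertices, edges, owner, v0, mu):
    """Return a memoryless strategy of player 1 ensuring MP >= mu from v0, or None."""
    mu = Fraction(mu)
    v1 = [v for v in vertices if owner[v] == 1]
    choices = [[e for e in edges if e[0] == v] for v in v1]
    for pick in product(*choices):  # deterministic stand-in for the NP guess
        # One-player game G_sigma1: player 2 keeps all of its edges.
        kept = list(pick) + [e for e in edges if owner[e[0]] == 2]
        if not player2_can_go_below(vertices, kept, v0, mu):
            return {e[0]: e[1] for e in pick}
    return None

V = ["a", "b"]
owner = {"a": 1, "b": 2}
E = [("a", "a", 1), ("a", "b", 0), ("b", "a", 0), ("b", "b", 3)]
```

In this small game player $1$ can secure mean-payoff $1$ by looping on `a`, but not $2$, since player $2$ sends every visit to `b` back to `a` through edges of weight $0$.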

Concerning the constraint problem, recall that it is more complex already for one-player games (see Section 3.1) with no known solution for discounted sum games (see Problem 4).

Theorem 3.7

Let $(G,v_{0})$ be an initialized two-player zero-sum quantitative (except discounted sum) game, and $\mu,\nu\in\mathbb{Q}$ be rational bounds. Then the constraint problem (in Problem 1) is

•

P-complete for supremum, infimum, limsup, and liminf games with uniform winning strategies for both players [13, 43],

•

in NP $\cap$ co-NP for mean-payoff games with finite-memory (resp. uniform) winning strategies for player $1$ (resp. player $2$ ) [43].

Discounted sum games are studied with bounds $\mu,\nu$ such that $\mu<\nu$ (to avoid the case $\mu=\nu$ of Problem 4) in [43] where it is proved that the constraint problem is PSPACE-complete with finite-memory winning strategies for both players.

3.3 Variants of preferences

Several extensions of two-player zero-sum Boolean and quantitative games have been studied in the literature, by using preferences that are irreflexive and transitive but not necessarily total, or more generally by using preorders $\preceq$ that are reflexive and transitive binary relations (hence, $\preceq$ is not supposed to be total and one can have $p\preceq p^{\prime}$ and $p^{\prime}\preceq p$ such that $p\neq p^{\prime}$ ). (The reader who prefers to know classical solutions to Problem 2 for multi-player non zero-sum games can skip this section and go directly to Section 4.)

Such variants naturally appear when we study intersection of objectives instead of a single objective as in Section 2.3:

•

Intersection of homogeneous objectives. For instance player $1$ has $l$ reachability objectives $U_{1},\ldots,U_{l}$ (instead of just one), and he wants to visit all the sets $U_{1},\ldots,U_{l}$ .

•

Intersection of heterogeneous objectives. In this more general case, player $1$ has several objectives not necessarily of the same type. Let us imagine a situation where he has two quantitative objectives depending on two weight functions on the graph, like ensuring a threshold for the liminf of weights with respect to the first weight function and another threshold for the mean-payoff with respect to the second weight function.

In this context, for player $1$ , we consider a tuple $\bar{f}$ of payoff functions and a tuple $\bar{w}$ of weight functions (instead of a single payoff function $f$ defined from a single weight function $w$ ) such that each function $f_{k}:Plays\to\mathbb{R}$ is defined from $w_{k}:E\to\mathbb{Q}$ . (This tuple of payoff functions is used by player $1$ , contrary to Definition 2 where function $f_{i}$ is used by player $i$ for all $i\in\Pi$ .) Tuples of payoffs $\bar{p}=\bar{f}(\rho)$ and $\bar{p}^{\prime}=\bar{f}(\rho^{\prime})$ are then compared using the usual ordering on tuples of reals: $\bar{p}\prec_{\sf ord}\bar{p}^{\prime}$ iff $p_{k}\leq p^{\prime}_{k}$ for all components $k$ and there exists $k$ such that $p_{k}<p^{\prime}_{k}$ (the preference relation $\prec_{\sf ord}$ is not total). Let us mention some results first for quantitative objectives and then for Boolean objectives.

3.3.1 Combination of quantitative objectives

The threshold problem takes the following form: given a tuple $\bar{\mu}$ of rational thresholds, decide whether player $1$ has a strategy $\sigma_{1}$ that ensures a payoff $\bar{f}(\rho)$ such that $\bar{\mu}\prec_{\sf ord}\bar{f}(\rho)$ for all plays $\rho$ consistent with $\sigma_{1}$ .

Theorem 3.8

[69] Let $(G,v_{0})$ be an initialized two-player zero-sum game with homogeneous intersections of mean-payoff objectives. Then the threshold problem (in Problem 1) is

•

in NP $\cap$ co-NP for functions ${\sf\overline{MP}}$ ,

•

co-NP-complete for functions ${\sf\underline{MP}}$ .

In both cases, infinite memory is required for winning strategies of player $1$ whereas uniform winning strategies are sufficient for player $2$ .

This theorem indicates different behaviors for the functions ${\sf\overline{MP}}$ and ${\sf\underline{MP}}$ . This is illustrated with the example of the initialized one-player game $(G,v_{0})$ depicted in Figure 5, where player $1$ wants to ensure the intersection of two homogeneous objectives. It is shown in [69] that for a pair of functions ${\sf\underline{MP}}$ , player $1$ can ensure a threshold $(1,1)$ , and that for a pair of functions ${\sf\overline{MP}}$ , he can ensure a threshold $(2,2)$ (which is impossible with ${\sf\underline{MP}}$ ). In both cases infinite memory is necessary. Indeed recall that with a finite-memory strategy the produced play is a lasso $\rho=hg^{\omega}$ such that ${\sf\overline{MP}}(\rho)={\sf\underline{MP}}(\rho)$ is the average weight of the cycle $g$ . Here this average weight has the form $a\cdot(2,0)+b\cdot(0,0)+c\cdot(0,2)=(2a,2c)$ , with $a+b+c=1$ and $b>0$ . Clearly $(1,1)\not\prec_{\sf ord}(2a,2c)$ showing that player $1$ is losing for threshold $(1,1)$ with finite-memory strategies.

In [68], the author studies objectives equal to Boolean combinations of inequalities $f_{k}(\rho)\sim\mu_{k}$ , with $\sim$ $\in\{\leq,\geq\}$ and $f_{k}\in\{{\sf\overline{MP}},{\sf\underline{MP}}\}$ : deciding whether player $1$ has a winning strategy in $(G,v_{0})$ becomes undecidable. However, this problem remains decidable and is EXPTIME-complete for CNF/DNF Boolean combinations of functions taken among $\{{\sf Sup},{\sf Inf},{\sf LimSup},{\sf LimInf},\sf{WMP}\}$ [13], where $\sf{WMP}$ is an interesting window variant of mean-payoff introduced in [20]. The threshold problem is P-complete (resp. EXPTIME-complete) for a single $\sf{WMP}$ objective (resp. an intersection of $\sf{WMP}$ objectives) [20]. Recall that it is in NP $\cap$ co-NP for a single function ${\sf\overline{MP}}$ or ${\sf\underline{MP}}$ (see Theorem 3.6).

3.3.2 Combination of Boolean objectives

Concerning one-player games, Boolean combinations of Büchi and co-Büchi objectives are introduced in [31] as a generalization of Rabin and Streett objectives. It is proved that the non emptiness problem for this class of automata is NP-complete (for a comparison see Theorem 3.1). Concerning two-player games, the intersection of homogeneous objectives is simple for safety, co-Büchi, Streett, and Muller cases. Indeed the intersection of safety (resp. co-Büchi, Streett, Muller) objectives is again a safety (resp. co-Büchi, Streett, Muller) objective. In the other cases, we have the following results to be compared with those of Theorem 3.5.

Theorem 3.9

Let $(G,v_{0})$ be an initialized two-player zero-sum game with an intersection of homogeneous objectives. Then Problem 1 is

•

PSPACE-complete for reachability objectives with finite-memory winning strategies for both players [32],

•

P-complete for Büchi objectives with finite-memory (resp. uniform) winning strategies for player $1$ (resp. player $2$ ) [21],

•

co-NP-complete for parity objectives with finite-memory (resp. uniform) winning strategies for player $1$ (resp. player $2$ ) [24],

•

PSPACE-complete for Rabin objectives with finite-memory winning strategies for both players. (We found no reference for this result. The PSPACE membership (resp. the finite memory of the strategies) follows from [1] (resp. [13]). In [1], games with a union of a Streett objective and a Rabin objective are shown to be PSPACE-hard. It is thus also the case for games with a union of Streett objectives. By Martin’s theorem, it follows that games with an intersection of Rabin objectives are PSPACE-hard.)

Problem 1 is PSPACE-complete for heterogeneous intersections of reachability and Büchi objectives [13] as well as for Boolean combinations of Büchi objectives [1, 42].

3.3.3 Lexicographic and secure preferences

For a tuple $\bar{f}$ of payoff functions defined from a tuple $\bar{w}$ of weight functions for player $1$ , let us mention two other natural preference relations $\prec$ .

Definition 7

Let $\bar{p},\bar{p}^{\prime}$ be two tuples of real payoffs.

•

lexicographic preference: $\bar{p}\prec_{\sf lex}\bar{p}^{\prime}$ iff there exists $k$ such that $p_{k}<p^{\prime}_{k}$ and $p_{j}=p^{\prime}_{j}$ for all $j<k$ . That is, player $1$ prefers to first maximize the first component, then the second, then the third, etc (see for instance [5]).

•

secure preference: $\bar{p}\prec_{\sf sec}\bar{p}^{\prime}$ iff either $p_{1}<p^{\prime}_{1}$ or { $p_{1}=p^{\prime}_{1}$ , $p_{k}\geq p^{\prime}_{k}$ for all components $k>1$ , and there exists $k>1$ such that $p_{k}>p^{\prime}_{k}$ }. That is, player $1$ prefers to first maximize the first component, and then to minimize all the other components (see for instance [28]).

The lexicographic preference is total whereas the secure preference is total only for pairs (instead of tuples) of payoffs. In the latter case, we get a preference which is close to the lexicographic ordering: player $1$ prefers to maximize the first component, and then to minimize the second one. The secure preference is used in the notion of secure equilibrium discussed later in Section 4.3.1.
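The three preference relations on tuples of payoffs used in this section can be written as simple predicates. In the sketch below, tuples are Python tuples of numbers of equal length, the lexicographic condition is read as agreement on all components strictly before $k$, and the helper `comparable` is a hypothetical name used only to test totality.

```python
def prec_ord(p, q):
    """Componentwise order: p <= q on every component, p < q on at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def prec_lex(p, q):
    """Lexicographic preference: p is smaller on the first component where they differ."""
    for a, b in zip(p, q):
        if a != b:
            return a < b
    return False

def prec_sec(p, q):
    """Secure preference: smaller first component, or equal first component and
    p >= q on all other components with p > q on at least one of them."""
    if p[0] != q[0]:
        return p[0] < q[0]
    rest = list(zip(p[1:], q[1:]))
    return all(a >= b for a, b in rest) and any(a > b for a, b in rest)

def comparable(prec, p, q):
    """True when p and q are equal or strictly ordered one way or the other."""
    return p == q or prec(p, q) or prec(q, p)
```

For instance, `(1, 2)` and `(2, 1)` are incomparable for `prec_ord`, and `(1, 2, 3)` and `(1, 3, 2)` are incomparable for `prec_sec`, whereas any two pairs are comparable for `prec_sec`.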

Theorem 3.10

Let $(G,v_{0})$ be an initialized two-player zero-sum game.

•

Suppose that $\prec$ is the lexicographic preference $\prec_{\sf lex}$ . Then the threshold problem (in Problem 1) for function ${\sf\underline{MP}}$ is in NP $\cap$ co-NP with uniform winning strategies for both players [5].

•

Suppose that $\prec$ is the secure preference $\prec_{\sf sec}$ on pairs of payoffs. Then the threshold problem (in Problem 1) is in NP $\cap$ co-NP (resp. P-complete) for functions ${\sf\overline{MP}}$ , ${\sf\underline{MP}}$ , and ${\sf Disc}^{\lambda}$ (resp. for functions ${\sf Sup}$ , ${\sf Inf}$ , ${\sf LimSup}$ , and ${\sf LimInf}$ ). Moreover both players have uniform (resp. positional) winning strategies for functions ${\sf LimSup}$ , ${\sf LimInf}$ , ${\sf\overline{MP}}$ , ${\sf\underline{MP}}$ , and ${\sf Disc}^{\lambda}$ (resp. for functions ${\sf Sup}$ and ${\sf Inf}$ ) [14].

In this theorem, it is supposed that the components $f_{k}$ of $\bar{f}$ are all of the same type (for instance they are all limsup functions); and some results stating the existence of uniform winning strategies can be established thanks to Theorem 3.4. Notice that the secure preference is limited to pairs of payoffs in a way to be total, which is a necessary condition when dealing with values. Notice also that the authors in [5] consider liminf average of the weight vector under lexicographic ordering whereas the authors in [14] consider the secure ordering of components where each component is the liminf average value.

The threshold problem is studied in [7] in a general context: the players can use various preorders (like the lexicographic preference, a preorder given by a Boolean circuit, etc.), the players play concurrently and not in a turn-based way, and the objectives are Boolean as in Definition 4.

4 Multi-player non zero-sum games

In multi-player non zero-sum games, the different players $i\in\Pi$ are not necessarily antagonistic: they have their own payoff functions $f_{i}$ and preference relations $\prec_{i}$ . Each of them follows a strategy $\sigma_{i}$ , and the resulting strategy profile $(\sigma_{i})_{i\in\Pi}$ induces a play that should be satisfactory to all players. As explained in Section 2.2 (see Definition 3), a classical solution profile is the notion of NE, where no player has an incentive to deviate when the other players stick to their own strategies. It is proved in [25, 39] that there exists an NE in every initialized multi-player non zero-sum game with Borel Boolean objectives. We go further by presenting in this section additional existence results for quantitative games and some known results for NEs as a solution to Problem 2 (threshold problem and constraint problem). As in Section 3, we focus on general approaches.

4.1 Characterization of outcomes of NE

Given a multi-player non zero-sum game $G$ and an initial vertex $v_{0}$ , we begin with a characterization of plays $\rho\in Plays(v_{0})$ that are the outcome of an NE $(\sigma_{i})_{i\in\Pi}$ in $(G,v_{0})$ . It will imply the existence of NE in large classes of games (see Corollaries 1 and 2), and will be useful for the study of Problem 2 (see Theorems 4.1 and 4.2). This characterization is related to a family of two-player zero-sum games $G_{i}$ , one for each $i\in\Pi$ , associated with $G$ and defined as follows. (i) The game $G_{i}$ has the same arena as $G$ , (ii) the two players are player $i$ (player $1$ ) and player $-i$ (player $2$ ) formed by the coalition of the other players $j\in\Pi\setminus\{i\}$ , (iii) the payoff function of player $i$ is equal to $f_{i}$ and his preference relation is equal to $\prec_{i}$ (footnote 17: recall that the payoff function and the preference relation of the second player do not matter in two-player zero-sum games). For all $v\in V$ , when it exists, we denote by ${val}_{i}(v)$ the value of vertex $v$ in game $G_{i}$ , and by $\tau_{i}^{v}$ , $\tau_{-i}^{v}$ the related optimal strategies for players $i,-i$ respectively in $(G_{i},v)$ (see Definition 6).

Proposition 1

Let $G$ be a multi-player non zero-sum game such that for all $i\in\Pi$ ,

•

the payoff function $f_{i}$ is prefix-linear, and

•

in the game $G_{i}$ , every vertex has a value.

Then $\rho=\rho_{0}\rho_{1}\ldots\in Plays(v_{0})$ is the outcome of an NE in $(G,v_{0})$ iff ${val}_{i}(\rho_{k})\preceq_{i}f_{i}(\rho_{\geq k})$ for all $i\in\Pi$ and all $k\in\mathbb{N}$ such that $\rho_{k}\in V_{i}$ .

The condition of this proposition asks that for all $k$ , if vertex $\rho_{k}$ is controlled by player $i$ , then in the two-player zero-sum game $G_{i}$ , its value is less preferred than or equal to the payoff of the suffix $\rho_{\geq k}$ . The proposed characterization appears under various particular forms, for instance in [14, 39, 65]. It is here given under two general conditions already studied in Section 3.2. Recall that almost all the payoff functions considered in Section 2.3 are prefix-linear and that for all the related two-player zero-sum games $G_{i}$ , the vertices have a value. Notice that when $f_{i}$ is prefix-independent, condition ${val}_{i}(\rho_{k})\preceq_{i}f_{i}(\rho_{\geq k})$ for all $k\in\mathbb{N}$ with $\rho_{k}\in V_{i}$ simplifies to $\max\{{val}_{i}(\rho_{k})\mid k\in\mathbb{N},\rho_{k}\in V_{i}\}\preceq_{i}f_{i}(\rho)$ (the maximum exists since $V_{i}$ is finite).

Example 1 (continued)

An example of NE with outcome $\rho=v_{0}v_{1}v_{3}^{\omega}$ was given in Example 1 for the initialized game $(G,v_{0})$ of Figure 1. Let us verify that $\rho$ satisfies the characterization of Proposition 1. Recall that both players use the same payoff function $f$ that is prefix-independent. The values of $G_{1}$ were computed in Example -8: ${val}_{1}(v_{0})={val}_{1}(v_{1})={val}_{1}(v_{2})=p_{2}$ and ${val}_{1}(v_{3})=p_{3}$ . Similarly one can compute the values of $G_{2}$ : ${val}_{2}(v_{0})={val}_{2}(v_{2})=p_{2}$ and ${val}_{2}(v_{1})={val}_{2}(v_{3})=p_{3}$ . One checks that $\max\{{val}_{i}(\rho_{k})\mid k\in\mathbb{N},\rho_{k}\in V_{i}\}\preceq_{i}f(\rho)=p_{3}$ , for $i=1,2$ .
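For prefix-independent payoffs, the check performed in this example reduces to one comparison per player, which a few lines of Python can sketch. The encoding below is an assumption made for illustration only: payoffs are numbers with $p_{2}$ encoded as `2` and $p_{3}$ as `3`, and the ownership map `owner` is hypothetical (it is not read off Figure 1). Only the values ${val}_{i}$ are those listed in the example.

```python
from typing import Dict, Hashable, Iterable


def satisfies_characterization(
    visited: Iterable[Hashable],
    owner: Dict[Hashable, int],
    val: Dict[int, Dict[Hashable, float]],
    payoff: Dict[int, float],
) -> bool:
    """Check max{val_i(v) | v visited, v in V_i} <= f_i(rho) for every player i.

    `visited` is the set of vertices occurring in the play rho and
    `payoff[i]` is f_i(rho); the usual order on numbers stands in for the
    preference relation of each player.
    """
    visited = set(visited)
    for i, f_i in payoff.items():
        own_values = [val[i][v] for v in visited if owner[v] == i]
        if own_values and max(own_values) > f_i:
            return False
    return True


P2, P3 = 2, 3  # assumed numeric encoding of the payoffs p_2 and p_3
owner = {"v0": 1, "v1": 2, "v2": 2, "v3": 1}  # hypothetical ownership
val = {
    1: {"v0": P2, "v1": P2, "v2": P2, "v3": P3},
    2: {"v0": P2, "v1": P3, "v2": P2, "v3": P3},
}

# The play v0 v1 v3^omega visits v0, v1, v3 and yields payoff p_3 to both players.
ok = satisfies_characterization(["v0", "v1", "v3"], owner, val, {1: P3, 2: P3})

# A hypothetical play through the same vertices with payoff p_2 would fail.
ko = satisfies_characterization(["v0", "v1", "v3"], owner, val, {1: P2, 2: P2})
```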

The proof of Proposition 1 is easy to establish.

Firstly suppose that $\rho$ is the outcome of an NE $(\sigma_{i})_{i\in\Pi}$ and that there exist $i\in\Pi$ and $k\in\mathbb{N}$ with $\rho_{k}\in V_{i}$ such that $f_{i}(\rho_{\geq k})\prec_{i}{val}_{i}(\rho_{k})$ . Let us show that player $i$ has a profitable deviation $\sigma^{\prime}_{i}$ with respect to $(\sigma_{i})_{i\in\Pi}$ in contradiction with $(\sigma_{i})_{i\in\Pi}$ being an NE. The strategy $\sigma^{\prime}_{i}$ consists in playing according to $\sigma_{i}$ until producing $\rho_{\leq k}$ and from $\rho_{k}$ in playing according to his optimal strategy $\tau_{i}^{\rho_{k}}$ (in $(G_{i},\rho_{k})$ ). The payoff of the resulting play $\pi$ from $\rho_{k}$ is such that ${val}_{i}(\rho_{k})\preceq_{i}f_{i}(\pi)$ by optimality of $\tau_{i}^{\rho_{k}}$ , and thus $f_{i}(\rho_{\geq k})\prec_{i}f_{i}(\pi)$ . From prefix-linearity of $f_{i}$ it follows that $f_{i}(\rho)=f_{i}(\rho_{<k}\rho_{\geq k})\prec_{i}f_{i}(\rho_{<k}\pi)$ as required.

Secondly suppose that ${val}_{i}(\rho_{k})\preceq_{i}f_{i}(\rho_{\geq k})$ for all $i\in\Pi$ and all $k\in\mathbb{N}$ such that $\rho_{k}\in V_{i}$ . We are going to construct an NE by using a well-known method in classical game theory that is used in the proof of the Folk Theorem in repeated games [54]. We define a strategy profile $(\sigma_{i})_{i\in\Pi}$ that produces $\rho$ as outcome, and as soon as some player $i$ deviates from $\rho$ , say at vertex $\rho_{k}$ , all the other players (as a coalition) punish him by playing from $\rho_{k}$ the optimal strategy $\tau_{-i}^{\rho_{k}}$ (in $(G_{i},\rho_{k})$ ). Let us show that $(\sigma_{i})_{i\in\Pi}$ is an NE. Let $\sigma^{\prime}_{i}$ be a deviating strategy from $\sigma_{i}$ for player $i$ , and let $\rho^{\prime}$ be the outcome of the strategy profile $(\sigma^{\prime}_{i},\sigma_{-i})$ . Consider the longest common prefix $\rho_{\leq k}$ of $\rho$ and $\rho^{\prime}$ . Then $\rho_{k}\in V_{i}$ and by optimality of $\tau_{-i}^{\rho_{k}}$ , we get $f_{i}(\rho^{\prime}_{\geq k})\preceq_{i}{val}_{i}(\rho_{k})$ and thus $f_{i}(\rho^{\prime}_{\geq k})\preceq_{i}f_{i}(\rho_{\geq k})$ . From prefix-linearity of $f_{i}$ it follows that $f_{i}(\rho^{\prime})\preceq_{i}f_{i}(\rho)$ showing that $\sigma^{\prime}_{i}$ is not a profitable deviation for player $i$ .
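The punishing profile used in this second implication can be sketched as a single function returning the move prescribed after a given history. This is a simplified illustration under stated assumptions: the play $\rho$ is a lasso, the punishing strategies are positional and stored in a hypothetical table `punish[i][v]` (the proof only requires optimal strategies $\tau_{-i}^{\rho_{k}}$, which may need memory in general), and the moves of the deviator after his own deviation are also read from that table since they are irrelevant for the argument.

```python
from typing import Callable, Dict, List, Sequence


def lasso(prefix: Sequence[str], cycle: Sequence[str]) -> Callable[[int], str]:
    """Return k -> rho_k for the play prefix . cycle^omega."""
    def rho(k: int) -> str:
        if k < len(prefix):
            return prefix[k]
        return cycle[(k - len(prefix)) % len(cycle)]
    return rho


def next_move(
    history: List[str],
    rho: Callable[[int], str],
    owner: Dict[str, int],
    punish: Dict[int, Dict[str, str]],
) -> str:
    """Move prescribed by the profile after `history` (a history from rho_0)."""
    if not history or history[0] != rho(0):
        raise ValueError("history must start at the initial vertex of rho")
    # Length of the longest common prefix of the history and rho.
    k = 0
    while k < len(history) and history[k] == rho(k):
        k += 1
    if k == len(history):
        return rho(k)  # nobody has deviated: keep following rho
    # The deviation happened at rho_{k-1}; its owner is punished from then on.
    deviator = owner[history[k - 1]]
    return punish[deviator][history[-1]]


# Hypothetical toy data: owner, rho and the punishing tables are illustrative.
owner = {"v0": 1, "v1": 2, "v2": 2, "v3": 1}
rho = lasso(["v0", "v1"], ["v3"])
punish = {
    1: {"v0": "v2", "v1": "v2", "v2": "v2", "v3": "v3"},
    2: {"v0": "v1", "v1": "v3", "v2": "v2", "v3": "v3"},
}
```

With this data, `next_move(["v0"], rho, owner, punish)` follows $\rho$, whereas after the history `["v0", "v2"]` the owner of `v0` is identified as the deviator and the table `punish[1]` takes over.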

Notice that in this proof, the first (resp. second) implication only requires condition (2) (resp. (1)) of prefix-linearity of $f_{i}$ . The next corollary follows from this observation and Proposition 1.

Corollary 1

[27] Let $G$ be a multi-player non zero-sum game such that for all $i\in\Pi$ ,

•

the payoff function $f_{i}$ satisfies $f_{i}(\rho)\preceq_{i}f_{i}(\rho^{\prime})\Rightarrow f_{i}(h\rho)\preceq_{i}f_{i}(h\rho^{\prime})$ for all $h\rho,h\rho^{\prime}\in Plays$ , and

•

each game $G_{i}$ has uniform optimal strategies for both players.

Then there exists a finite-memory NE in each initialized game $(G,v_{0})$ .

This corollary is a generalization of a theorem given in [12, 27] (footnote 18: in [12], one hypothesis is missing: the required optimal strategies must be uniform) for the existence of NEs in games equipped with payoff functions $f_{i}:Plays\to\mathbb{R}$ , $i\in\Pi$ . The proof of Corollary 1 is as follows. Let us consider the play $\rho\in Plays(v_{0})$ produced by the players when each player $i$ plays according to his optimal strategy $\tau_{i}$ in $(G_{i},v_{0})$ ( $\tau_{i}^{v}=\tau_{i}$ for all vertices $v$ since it is uniform). By construction, $\rho$ is the outcome of an NE because it satisfies the characterization of Proposition 1. Notice that $\rho$ is a simple lasso since each $\tau_{i}$ , $i\in\Pi$ , is uniform. Therefore the strategies of the constructed NE are finite-memory with a small memory size bounded by $|V|+|\Pi|$ to remember this lasso and the first player who deviates from $\rho$ .
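The play used in this proof is easy to compute once the uniform strategies are known: follow them from $v_{0}$ until a vertex repeats. The sketch below assumes that each uniform strategy is given as a table `tau[i][v]` mapping a vertex of player $i$ to the chosen successor; the table and the toy data are illustrative, and computing the optimal strategies themselves is not addressed.

```python
from typing import Dict, List, Tuple


def outcome_lasso(
    v0: str,
    owner: Dict[str, int],
    tau: Dict[int, Dict[str, str]],
) -> Tuple[List[str], List[str]]:
    """Return (prefix, cycle) of the play obtained when every player i
    follows the positional strategy tau[i] from v0."""
    position: Dict[str, int] = {}
    play: List[str] = []
    v = v0
    while v not in position:
        position[v] = len(play)
        play.append(v)
        v = tau[owner[v]][v]  # the owner of v picks the successor
    start = position[v]       # first occurrence of the repeated vertex
    return play[:start], play[start:]


# Hypothetical toy data.
owner = {"v0": 1, "v1": 2, "v2": 2, "v3": 1}
tau = {1: {"v0": "v1", "v3": "v3"}, 2: {"v1": "v3", "v2": "v2"}}
prefix, cycle = outcome_lasso("v0", owner, tau)
```

Since no vertex is repeated before the cycle closes, the prefix and the cycle together contain at most $|V|$ vertices, which matches the memory bound mentioned above.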

The existence of an NE is also guaranteed in the following corollary that does not require optimal strategies that are uniform, but in counterpart requires payoff functions that are prefix-independent.

Corollary 2

[27] Let $G$ be a multi-player non zero-sum game such that for all $i\in\Pi$ ,

•

the payoff function $f_{i}$ is prefix-independent, and

•

each game $G_{i}$ has (resp. finite-memory) optimal strategies for both players.

Then there exists an (resp. finite-memory) NE in each initialized game $(G,v_{0})$ .

This is a generalization of a result given in [27] for games equipped with payoff functions $f_{i}:Plays\to\mathbb{R}$ , $i\in\Pi$ . The proof is as follows: under the hypotheses of Corollary 2, one can show that there exist optimal strategies ${\tau}_{i}^{v_{0}}$ in $(G_{i},v_{0})$ , $i\in\Pi$ , such that for all plays $\rho\in Plays(v_{0})$ consistent with ${\tau}_{i}^{v_{0}}$ , we have $\max\{{val}_{i}(\rho_{k})\mid k\in\mathbb{N},\rho_{k}\in V_{i}\}\preceq_{i}f_{i}(\rho)$ . Then as in Corollary 1, we consider the play $\rho\in Plays(v_{0})$ obtained when each player $i$ plays according to his optimal strategy ${\tau}_{i}^{v_{0}}$ . As each $f_{i}$ is prefix-independent, $\rho$ satisfies the characterization of Proposition 1.

From Corollaries 1 and 2, it follows that there exists an NE (which can be constructed) in every game of Section 2.3; the case of Boolean (resp. quantitative) game is proved in [25, 39] (resp. in [12, 27]). The existence of an NE in discounted sum games can be obtained in a second way: the function ${\sf Disc}^{\lambda}$ is continuous and all games with real-valued continuous payoff functions always have an NE [35, 40]. Notice that the two previous corollaries allow mixing the types of functions $f_{i}$ , like for instance $f_{1}$ associated with a Büchi objective, a limsup function $f_{2}$ , a mean-payoff function $f_{3}$ , etc.

Conditions generalizing those of Corollaries 1 and 2 are given in [59] that guarantee the existence of a finite-memory NE. Moreover, for most of the given conditions counterexamples are provided that show that they cannot be dispensed with.

4.2 Solution to Problem 2

In this section we study how to solve Problem 2 for NEs (threshold problem and constraint problem). The characterization given in Proposition 1 provides a general approach to solve this problem. Indeed consider the case of initialized games $(G,v_{0})$ with prefix-independent payoff functions $f_{i}$ and such that the vertices of each game $G_{i}$ , $i\in\Pi$ , have a value. Then given two tuples of bounds $(\mu_{i})_{i\in\Pi},(\nu_{i})_{i\in\Pi}$ , we simply have to check whether there exists a play $\rho\in Plays(v_{0})$ such that for all $i\in\Pi$ ,

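$$\max\{{val}_{i}(\rho_{k})\mid\rho_{k}\in V_{i}\}\preceq_{i}f_{i}(\rho)\ \mbox{ and }\ \mu_{i}\preceq_{i}f_{i}(\rho)\ \mbox{ (resp. }\mu_{i}\preceq_{i}f_{i}(\rho)\preceq_{i}\nu_{i}\mbox{)}.\qquad(3)$$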

Thanks to this general approach or variations based on Proposition 1, Problem 2 is solved for Büchi, co-Büchi, Streett, and parity games in [64], and for the other Boolean games in [26].

Theorem 4.1

[26, 64] Let $(G,v_{0})$ be an initialized multi-player non zero-sum Boolean game. Then Problem 2 is

•

P-complete for Büchi and Muller games (footnote 19: we found no reference for Muller objectives; a sketch of proof is given in the appendix. Problem 2 is PSPACE-complete for the colored variant of Muller objectives [26]),

•

NP-complete for reachability, safety, co-Büchi, parity and Streett games,

•

in ${\sf P^{NP}}$ , and NP-hard, co-NP-hard for Rabin games.

Let us explain the proof of NP membership for parity games and the constraint problem with bounds $(\mu_{i})_{i\in\Pi},(\nu_{i})_{i\in\Pi}$ . As each $G_{i}$ is a two-player zero-sum parity game, recall that the constraint problem is in NP $\cap$ co-NP with uniform winning strategies for both players (see Theorem 3.5). The required algorithm in NP is as follows. (i) For all $i\in\Pi$ , in the game $G_{i}$ , guess a subset $U_{i}\subseteq V$ of vertices and two uniform strategies $\tau_{i},\tau_{-i}$ for players $i,-i$ respectively (intuitively we guess $U_{i}=\{v\in V\mid{val}_{i}(v)=1\}$ and $V\setminus U_{i}=\{v\in V\mid{val}_{i}(v)=0\}$ ). Check in polynomial time (footnote 20: recall our comment after Theorem 3.6) that $\tau_{i}$ is a winning strategy for player $i$ for the constraint problem in each $(G_{i},v)$ with $v\in U_{i}$ and that $\tau_{-i}$ is a winning strategy for player $-i$ for the opposite objective in each $(G_{i},v)$ with $v\in V\setminus U_{i}$ . (ii) Then for all $i\in\Pi$ , we guess $r_{i}\in V_{i}$ (intuitively we guess $r_{i}$ such that ${val}_{i}(r_{i})=\max\{{val}_{i}(\rho_{k})\mid k\in\mathbb{N},\rho_{k}\in V_{i}\}$ for the required play $\rho$ ). Construct in polynomial time a one-player game $G^{\prime}$ from $G$ such that each set $V_{i}$ of vertices is limited to $\{v\in V_{i}\mid{val}_{i}(v)\leq{val}_{i}(r_{i})\}$ and the unique player is formed by the coalition of all players $i\in\Pi$ . (iii) By (3) it remains to check whether there exists a play $\rho$ in $(G^{\prime},v_{0})$ such that for all $i\in\Pi$ , ${val}_{i}(r_{i})\leq f_{i}(\rho)$ and $\mu_{i}\leq f_{i}(\rho)\leq\nu_{i}$ . Recall that the existence of plays satisfying certain constraints in one-player games was studied in Section 3.1, see Theorem 3.1. Here we are faced with the existence of a play in a game with an intersection of parity objectives which can be checked in polynomial time by [31].
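Step (ii) of this algorithm is a plain graph restriction. The Python sketch below illustrates only that step, assuming the values ${val}_{i}$ have already been guessed and verified as in step (i); the dictionaries `edges`, `owner`, `val` and `r` are hypothetical encodings of the arena, the partition $(V_{i})_{i\in\Pi}$, the values and the guessed vertices $r_{i}$. The search for a suitable play in step (iii) is not covered.

```python
from typing import Dict, List


def restrict_arena(
    edges: Dict[str, List[str]],
    owner: Dict[str, int],
    val: Dict[int, Dict[str, int]],
    r: Dict[int, str],
) -> Dict[str, List[str]]:
    """Keep a vertex v of player i only if val_i(v) <= val_i(r_i),
    and keep an edge only if both of its endpoints are kept."""
    keep = {
        v for v, i in owner.items()
        if val[i][v] <= val[i][r[i]]
    }
    return {
        v: [w for w in successors if w in keep]
        for v, successors in edges.items()
        if v in keep
    }


# Hypothetical toy instance with Boolean values 0 and 1.
edges = {"a": ["b", "c"], "b": ["a"], "c": ["a", "c"]}
owner = {"a": 1, "b": 1, "c": 2}
val = {1: {"a": 0, "b": 1, "c": 0}, 2: {"a": 0, "b": 0, "c": 1}}
r = {1: "a", 2: "c"}
restricted = restrict_arena(edges, owner, val, r)
```

Here vertex `b` is removed because its value for its owner exceeds the value of the guessed vertex `a`, so any play of the restricted arena respects the first inequality of (3) with respect to the guessed maxima.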

Problem 2 can be similarly solved for quantitative games.

Theorem 4.2

[26, 64, 65] Let $(G,v_{0})$ be an initialized multi-player non zero-sum quantitative (except discounted sum) game. Then Problem 2 is

•

P-complete for limsup games,

•

NP-complete for supremum, infimum, liminf, mean-payoff ${\sf\underline{MP}}_{i}$ , and mean-payoff ${\sf\overline{MP}}_{i}$ games.

The case of supremum, infimum, limsup and liminf games is equivalent to the case of reachability, safety, Büchi and co-Büchi games presented in Theorem 4.1, whereas the case of mean-payoff games is studied in [65]. The proof of NP membership for mean-payoff games is based on the approach (3), and is similar to the one given above for parity games. The case of discounted sum games is open. Indeed it is proved in [14] that Problem 4 reduces to Problem 2 with the discounted sum function (footnote 21: the reduction is given for another kind of solution profile but it also works for NEs).

Problem 2 is studied in [7] in a general context: the players can use various preorders, they play concurrently and not in a turn-based way, and the objectives are Boolean as in Definition 4. The general approach proposed in [7] is different from the one of Proposition 1.

4.3 Other solution profiles

In this section, we present some other solution profiles. Indeed the notion of NE has several drawbacks: (i) Each player is selfish since he is only concerned with his own payoff, and not with the payoff of the other players. (ii) An NE does not take into account the sequential nature of games played on graphs. We illustrate these drawbacks in the following two examples of quantitative games.

Example 7

Consider the two-player quantitative game of Figure 7 such that $f_{i}={\sf LimSup}_{i}$ for $i=1,2$ . The strategy profile depicted with thick edges is an NE. Notice that player $1$ could decide to deviate at $v_{0}$ by moving to $v_{2}$ . Indeed he then keeps the same payoff of $1$ but also decreases the payoff of player $2$ (from $2$ to $1$ ) which is bad for player $2$ . To avoid such a drawback, we will introduce hereafter the concept of secure equilibrium, where each player takes care of his own payoff as well as the payoff of the other players (but in a negative way).

Consider now the game of Figure 7 where the weights of the loops have been modified. The depicted strategy profile is again an NE. Player $1$ has no incentive to deviate at $v_{0}$ due to the threat of player $2$ : player $1$ will receive a payoff of $0<1$ . Such a threat of player $2$ is non credible because in the subgame induced by $v_{2},v_{3},v_{4}$ , at vertex $v_{2}$ , it is more rational for player $2$ to move to $v_{4}$ to get a payoff of $2$ instead of going to $v_{3}$ where he only receives a payoff of $1$ . To avoid such a drawback, we will introduce hereafter the concept of subgame perfect equilibrium that takes into account rational behaviors of the players in all subgames of the initial game.

4.3.1 Secure equilibria

The notion of secure equilibrium (SE) is introduced in [23] for two-player non zero-sum games. The idea of an SE is that no player has an incentive to deviate in the following sense: he will not be able to increase his payoff, and keeping the same payoff he will not be able to decrease the payoff of the other player. An SE can thus be seen as a contract between the two players which strengthens cooperation: if a player chooses another strategy that is not harmful to himself, then this cannot harm the other player if the latter follows the contract.

The definition of an SE is given in the context of games equipped with payoff functions $f_{i}:Plays\to\mathbb{R}$ , $i\in\Pi$ . It uses the notion of secure preference introduced in Section 3.3 (see Definition 7; footnote 22: the definition was given for player $1$ ). Let us recall the secure preference $\prec_{{\sf sec},i}$ for player $i$ : given $\bar{p}=(f_{i}(\rho))_{i\in\Pi},\bar{p}^{\prime}=(f_{i}(\rho^{\prime}))_{i\in\Pi}$ , we have $\bar{p}\prec_{{\sf sec},i}\bar{p}^{\prime}$ iff either $p_{i}<p^{\prime}_{i}$ or { $p_{i}=p^{\prime}_{i}$ , $p_{k}\geq p^{\prime}_{k}$ for all components $k\neq i$ , and there exists $k\neq i$ such that $p_{k}>p^{\prime}_{k}$ }. Hence player $i$ prefers to increase his own payoff, and in case of equality to decrease the payoffs of all the other players. This preference relation is not total except when there are only two players.

The definition of an SE is very close to the one of NE (see Definition 3). The only difference is that it uses the secure preference:

Definition 8

Given an initialized game $(G,v_{0})$ , a strategy profile $(\sigma_{i})_{i\in\Pi}$ is a secure equilibrium if

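$$(f_{i}(\langle(\sigma_{i})_{i\in\Pi}\rangle_{v_{0}}))_{i\in\Pi}\not\prec_{{\sf sec},i}(f_{i}(\langle\sigma^{\prime}_{i},\sigma_{-i}\rangle_{v_{0}}))_{i\in\Pi}$$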

for all players $i\in\Pi$ and all strategies $\sigma^{\prime}_{i}$ of player $i$ .

Example 7 (continued)

The strategy profile of Figure 7 is not an SE because player $1$ has a profitable deviation if at $v_{0}$ he chooses to move to $v_{2}$ : $(1,2)\prec_{{\sf sec},1}(1,1)$ .
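Spelled out against the definition of $\prec_{{\sf sec},1}$ , with $\bar{p}=(1,2)$ the payoff of the depicted profile and $\bar{p}^{\prime}=(1,1)$ the payoff after the deviation, this comparison is a worked check of the second case of the definition:

$$p_{1}=1=p^{\prime}_{1},\qquad p_{2}=2>1=p^{\prime}_{2}\quad\Longrightarrow\quad\bar{p}\prec_{{\sf sec},1}\bar{p}^{\prime}.$$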

By definition, every SE is an NE but the converse is false as shown in the previous example. It is proved in [23] that every two-player non zero-sum game with Borel Boolean objectives has an SE; this result is generalized to multi-player games in [28].

Let us turn to quantitative games such that all the players have the same type of payoff function $f_{i}$ . General hypotheses are provided in [28] that guarantee the existence of an SE in quantitative games, except for functions ${\sf\overline{MP}}_{i}$ and ${\sf\underline{MP}}_{i}$ . Thanks to Corollary 1 and Theorem 3.10, for two-player quantitative games (now including functions ${\sf\overline{MP}}_{i}$ and ${\sf\underline{MP}}_{i}$ ), there exists such an SE that is finite-memory [14] (footnote 23: a restriction to two-player games is necessary to deal with a secure preference that is total). Moreover, with the same general approach (3) described previously for NEs, Problem 2 is solved as follows for SEs.

Theorem 4.3

[14] Let $(G,v_{0})$ be an initialized two-player non zero-sum quantitative (except discounted sum) game. Then Problem 2 for SEs is

•

P-complete for supremum, infimum, limsup, and liminf functions,

•

in NP $\cap$ co-NP for functions ${\sf\overline{MP}}_{i}$ and ${\sf\underline{MP}}_{i}$ .

The case of discounted sum function is open since it is proved in [14] that Problem 4 reduces to Problem 2 with ${\sf Disc}^{\lambda}$ . The complexity class of the problem of deciding whether, in an initialized two-player parity game $(G,v_{0})$ , there exists an SE with payoff respectively equal to $(0,0)$ , $(0,1)$ , $(1,0)$ , and $(1,1)$ , is studied in [23, 39].

4.3.2 Subgame perfect equilibria

A solution profile that avoids incredible threats by taking into account the sequential nature of games played on graphs is the notion of subgame perfect equilibrium (SPE) [61]. To be an SPE, a strategy profile is required to be an NE not only from the initial vertex but also after every possible history of the game.

Before giving the definition of an SPE, we need to introduce the following concepts for an initialized game $(G,v_{0})$ with payoff functions $f_{i}$ and preference relations $\prec_{i}$ , for all $i\in\Pi$ . Given a history $hv\in Hist(v_{0})$ , the subgame $({G}_{|{h}},v)$ of $(G,v_{0})$ is the initialized game with payoff functions ${f_{i}}_{|{h}}$ , $i\in\Pi$ , such that ${f_{i}}_{|{h}}(\rho)=f_{i}(h\rho)$ for all plays $\rho\in Plays(v)$ (the preference relation of player $i$ is his preference $\prec_{i}$ in $G$ ). Given a strategy $\sigma_{i}$ for player $i$ in $(G,v_{0})$ , the strategy ${\sigma_{i}}_{|{h}}$ in $({G}_{|{h}},v)$ is defined as ${\sigma_{i}}_{|{h}}(h^{\prime})=\sigma_{i}(hh^{\prime})$ for all $h^{\prime}\in Hist_{i}(v)$ .

Definition 9

Given an initialized game $(G,v_{0})$ , a strategy profile $(\sigma_{i})_{i\in\Pi}$ is a subgame perfect equilibrium if $({\sigma_{i}}_{|{h}})_{i\in\Pi}$ is an NE in each subgame $({G}_{|{h}},v)$ of $(G,v_{0})$ with $hv\in Hist(v_{0})$ .

Example 7 (continued)

The strategy profile $(\sigma_{1},\sigma_{2})$ of Figure 7 is not an SPE because in the subgame $({G}_{|{v_{0}}},v_{2})$ , player $2$ has a profitable deviation with respect to $({\sigma_{1}}_{|{v_{0}}},{\sigma_{2}}_{|{v_{0}}})$ if at $v_{2}$ he chooses to move to $v_{4}$ .

By definition, every SPE is an NE but the converse is false as shown in the previous example. A well-known result is the existence of an SPE in every initialized game $(G,v_{0})$ such that its arena is a tree rooted at $v_{0}$ [49] (footnote 24: in this particular context, plays are finite paths). The SPE is constructed backwards from the leaves to the initial vertex $v_{0}$ in the following way. Suppose that the current vertex $v$ is controlled by player $i$ , and that for each son $v^{\prime}$ of $v$ one has already constructed an SPE $(\sigma_{i}^{v^{\prime}})_{i\in\Pi}$ in the subtree rooted at $v^{\prime}$ . Then player $i$ chooses the edge $(v,v^{\prime})$ such that $(\sigma_{i}^{v^{\prime}})_{i\in\Pi}$ has the best outcome with respect to his preference relation $\prec_{i}$ . The resulting strategy profile $(\sigma^{v}_{i})_{i\in\Pi}$ is an SPE in the subtree rooted at $v$ .
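This backward construction can be sketched in Python for a finite tree. The sketch makes several simplifying assumptions that are not in the text: payoffs are attached to the leaves as tuples of numbers, player $i$ prefers a larger $i$-th component (players are indexed from `0`), and ties are broken in favour of the first son. The names `tree`, `owner` and `leaf_payoff` are illustrative.

```python
from typing import Dict, List, Tuple


def backward_induction(
    tree: Dict[str, List[str]],
    owner: Dict[str, int],
    leaf_payoff: Dict[str, Tuple[float, ...]],
    root: str,
) -> Tuple[Tuple[float, ...], Dict[str, str]]:
    """Return the payoff vector of the constructed profile and the choice
    made at every internal vertex (a positional description of the profile)."""
    choice: Dict[str, str] = {}

    def solve(v: str) -> Tuple[float, ...]:
        sons = tree.get(v, [])
        if not sons:
            return leaf_payoff[v]
        i = owner[v]
        best_son, best_outcome = None, None
        for son in sons:
            outcome = solve(son)  # outcome of the profile built in the subtree
            if best_outcome is None or outcome[i] > best_outcome[i]:
                best_son, best_outcome = son, outcome
        choice[v] = best_son
        return best_outcome

    return solve(root), choice


# Hypothetical toy tree: player 0 moves at "r", player 1 moves at "m".
tree = {"r": ["l", "m"], "m": ["x", "y"]}
owner = {"r": 0, "m": 1}
leaf_payoff = {"l": (1, 1), "x": (0, 1), "y": (2, 2)}
outcome, choice = backward_induction(tree, owner, leaf_payoff, "r")
```

In this toy tree, player 1 would choose `y` at `m`, so a threat to play `x` there is not credible and player 0 moves to `m` rather than to `l`.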

It is proved in [63] that there exists an SPE in every multi-player non zero-sum game with Borel Boolean objectives, and that in case of $\omega$ -regular objectives there exists one that is finite-memory. Existence of an SPE also holds for games with continuous real-valued payoff functions [35, 40] (this also holds when the functions are upper-semicontinuous (resp. lower-semicontinuous) and with finite range [34] (resp. [57])).

For subgame perfect equilibria, we are not aware of a characterization like the one in Proposition 1. Therefore a solution to Problem 2 for SPEs needs a different approach. Few solutions are known: this problem is in EXPTIME for Rabin games [63] and is NP-hard for co-Büchi games [39].

Whereas NEs exist for large classes of games, see Corollaries 1 and 2, SPEs fail to exist even in simple games like the one of Figure 1 [62]. Variants of SPE, weak SPE and very weak SPE, have thus been proposed in [11] as interesting alternatives. In a weak SPE (resp. very weak SPE), a player who deviates from a strategy $\sigma$ is allowed to use deviating strategies that differ from $\sigma$ on a finite number of histories only (resp. only on the initial vertex). Deviating strategies that only differ on the initial vertex are a well-known notion that for instance appears in the proof of Kuhn’s theorem [49] with the one-step deviation property. By definition, every SPE is a weak SPE, and every weak SPE is a very weak SPE. Weak SPE and very weak SPE are equivalent notions, but this is not true for SPE and weak SPE [11].

The following theorem gives two general conditions such that each of them separately guarantees the existence of a weak SPE.

Theorem 4.4

[15] Let $G$ be a multi-player non zero-sum game such that

•

either each payoff function $f_{i}$ , $i\in\Pi$ , is prefix-independent,

•

or each $f_{i}$ , $i\in\Pi$ , has a finite range.

Then there exists a weak SPE in each initialized game $(G,v_{0})$ .

This theorem has to be compared with Corollary 2 that gives general conditions for the existence of an NE, one of them being prefix-independence of $f_{i}$ , $i\in\Pi$ . This latter condition is here enough to guarantee the existence of a weak SPE (the existence of an SPE is not possible as mentioned before with the game of Figure 1 [62]). It follows from Theorem 4.4 that there exists a weak SPE in all the Boolean and quantitative games of Section 2.3 (except for the case of discounted sum payoff that is neither prefix-independent nor with finite range).

In addition to SEs and (weak) SPEs, other solution profiles have been recently proposed, like Doomsday equilibria in [18], robust equilibria in [8], and equilibria using admissible strategies in [10]. We also refer the reader to the survey [9].

5 Conclusion

In this invited contribution, we gave an overview of classical as well as recent results about the threshold and constraint problems for games played on graphs. Solutions to these problems are winning strategies in case of two-player zero-sum games, and equilibria in case of multi-player non zero-sum games. We tried to present a unified approach through the notion of games equipped with a payoff function and a preference relation for each player, in a way to include classes of Boolean games and quantitative games that are usually studied. We also focussed on general approaches from which one can derive several different results: a criterion that guarantees the existence of uniform optimal strategies in two-player zero-sum games, and a characterization of plays that are the outcome of a Nash equilibrium in multi-player non zero-sum games. Several illustrative examples were provided as well as some intuition on the proofs when they are simple.

Acknowledgments

We would like to thank Patricia Bouyer, Thomas Brihaye, Emmanuel Filiot, Hugo Gimbert, Quentin Hautem, Mickaël Randour, and Jean-François Raskin for their useful discussions and comments that helped us to improve the presentation of this article.

Appendix

In this appendix, we give a sketch of proof for Muller games in Theorem 4.1. Recall that each player $i$ has the objective $\Omega_{i}=\{\rho\in Plays\mid in\!f(\rho)\in{\cal F}_{i}\}$ with ${\cal F}_{i}\subseteq 2^{V}$ , and that the values $val_{i}(v)$ , $v\in V$ , in each game $G_{i}$ can be computed in polynomial time (Theorem 3.5). To prove P membership for the constraint problem with bounds $(\mu_{i})_{i\in\Pi},(\nu_{i})_{i\in\Pi}$ , we apply the approach (3). Notice that for the required play $\rho\in Plays(v_{0})$ in (3), the set $U=in\!f(\rho)$ must be a strongly connected component that is reachable from the initial vertex $v_{0}$ . Moreover if for some $i$ , $f_{i}(\rho)=0$ then ${val}_{i}(\rho_{k})=0$ for all $\rho_{k}\in V_{i}$ , and if $\nu_{i}=0$ , then $f_{i}(\rho)=0$ . We thus proceed as follows. (i) For each $i$ such that $\nu_{i}=1$ , for each $U\in{\cal F}_{i}$ (seen as a potential $U=in\!f(\rho)$ ), the following computations are done in polynomial time for all $j\in\Pi$ :

•

if $\mu_{j}=1$ ((3) imposes $f_{j}(\rho)=1$ ), test whether $U\in{\cal F}_{j}$ ,

•

if $\nu_{j}=0$ ((3) imposes $f_{j}(\rho)=0$ ), test whether $U\not\in{\cal F}_{j}$ and whether each $v\in U\cap V_{j}$ has value ${val}_{j}(v)=0$ ,

•

if $\mu_{j}=0$ and $\nu_{j}=1$ ((3) allows either $f_{j}(\rho)=0$ or $f_{j}(\rho)=1$ ), then if $U\not\in{\cal F}_{j}$ , test whether each $v\in U\cap V_{j}$ has value ${val}_{j}(v)=0$ .

Finally, construct in polynomial time the game $G^{\prime}$ from $G$ such that each $V_{j}$ is limited to $\{v\in V_{j}\mid{val}_{j}(v)=0\}$ whenever $U\not\in{\cal F}_{j}$ , and test whether $U$ is a strongly connected component that is reachable from $v_{0}$ in $G^{\prime}$ . As soon as this sequence of tests is positive, there exists $\rho$ satisfying (3). (ii) It may happen that step (i) cannot be applied (because there is no $j$ such that $\mu_{j}=1$ , and for $j$ such that $\mu_{j}=0$ and $\nu_{j}=1$ , there is no potential $U=in\!f(\rho)$ ). In this case, we construct in polynomial time a two-player game $G^{\prime}$ from $G$ such that each $V_{i}$ is limited to $\{v\in V_{i}\mid{val}_{i}(v)=0\}$ , player $1$ controls no vertex and player $2$ is formed by the coalition of all $i\in\Pi$ , and the objective is a Muller objective with ${\cal F}=\cup_{i\in\Pi}{\cal F}_{i}$ . We then test in polynomial time whether player $1$ has no winning strategy from $v_{0}$ in this Muller game.

Bibliography


[1] Rajeev Alur, Salvatore La Torre, and P. Madhusudan. Playing games with boxes and diamonds. In CONCUR Proceedings, volume 2761 of Lecture Notes in Comput. Sci., pages 127–141. Springer, 2003.
[2] Daniel Andersson. An improved algorithm for discounted payoff games. In ESSLLI Student Session, pages 91–98, 2006.
[3] Catriel Beeri. On the membership problem for functional and multivalued dependencies in relational databases. ACM Trans. Database Syst., 5(3), September 1980.
[4] Valérie Berthé and Michel Rigo, editors. Combinatorics, Words and Symbolic Dynamics, volume 135. Cambridge University Press, 2016.
[5] Roderick Bloem, Krishnendu Chatterjee, Thomas A. Henzinger, and Barbara Jobstmann. Better quality in synthesis through quantitative objectives. In CAV Proceedings, volume 5643 of Lecture Notes in Comput. Sci., pages 140–156. Springer, 2009.
[6] Udi Boker, Thomas A. Henzinger, and Jan Otop. The target discounted-sum problem. In LICS Proceedings, pages 750–761. IEEE Computer Society, 2015.
[7] Patricia Bouyer, Romain Brenguier, Nicolas Markey, and Michael Ummels. Pure Nash equilibria in concurrent deterministic games. Logical Methods in Comput. Sci., 11(2), 2015.
[8] Romain Brenguier. Robust equilibria in mean-payoff games. In Proceedings of FOSSACS, volume 9634 of Lecture Notes in Comput. Sci., pages 217–233. Springer, 2016.