Probabilistic approach to finite state mean field games

Alekos Cecchin; Markus Fischer

arXiv:1704.00984·math.PR·February 1, 2018

TL;DR

This paper introduces a probabilistic framework for finite state mean field games using stochastic differential equations driven by Poisson measures, establishing existence, approximation, and uniqueness results.

Contribution

It develops a probabilistic representation for finite state mean field games, proving existence of solutions and their role as approximate Nash equilibria for large N-player games.

Findings

01. Existence of solutions in relaxed controls

02. Mean field game solutions yield approximate Nash equilibria with error of order $1/\sqrt{N}$

03. Uniqueness under a small time horizon or monotonicity

Abstract

We study mean field games and corresponding $N$-player games in continuous time over a finite time horizon where the position of each agent belongs to a finite state space. As opposed to previous works on finite state mean field games, we use a probabilistic representation of the system dynamics in terms of stochastic differential equations driven by Poisson random measures. Under mild assumptions, we prove existence of solutions to the mean field game in relaxed open-loop as well as relaxed feedback controls. Relying on the probabilistic representation and a coupling argument, we show that mean field game solutions provide symmetric $\varepsilon_{N}$-Nash equilibria for the $N$-player game, both in open-loop and in feedback strategies (not relaxed), with $\varepsilon_{N}\leq\frac{\text{constant}}{\sqrt{N}}$. Under stronger assumptions, we also find solutions of the mean field game in ordinary…

Equations

X_{i}^{N}(t) = \xi_{i}^{N} + \int_{0}^{t}\int_{U} f(s, X_{i}^{N}(s^{-}), u, \alpha_{i}^{N}(s), \mu^{N}(s^{-}))\,\mathcal{N}_{i}^{N}(ds, du), \quad i = 1, \dots, N,

X(t) = \xi + \int_{0}^{t}\int_{U} f(s, X(s^{-}), u, \alpha(s), m(s))\,\mathcal{N}(ds, du),

f(t, x, u, a, p) := \sum_{y \in \Sigma} (y - x)\,\mathbf{1}_{]0, \lambda(t, x, y, a, p)[}(u_{y}).

P[X(t + h) = y \mid X(t) = x] = \lambda(t, x, y, \alpha, m) \cdot h + o(h)

S := \mathcal{P}(\Sigma) = \{p \in \mathbb{R}^{d} : p_{j} \geq 0,\ j = 1, \dots, d;\ p_{1} + \dots + p_{d} = 1\}

\mathcal{L} := \{m : [0, T] \longrightarrow S : |m(t) - m(s)| \leq K|t - s|,\ m(0) = m_{0}\}

\int_{U} |f(s, x, u, a, p) - f(s, y, u, a, p)|\,\nu(du) \leq K_{1}|x - y|

\int_{U} |f(t, x, u, a, p) - f(s, x, u, b, q)|\,\nu(du) \leq w_{f}(|t - s| + dist(a, b) + |p - q|)

\int_{U} |f(t, x, u, a, p) - f(t, y, u, a, q)|\,\nu(du) \leq K_{1}(|x - y| + |p - q|);

\int_{U} |f(t, x, u, a, p) - f(t, y, u, b, q)|\,\nu(du) \leq K_{1}(|x - y| + |p - q| + dist(a, b));

|c(t, x, a, p) - c(t, y, a, q)| + |\psi(x, p) - \psi(y, q)| \leq K_{2}(|x - y| + |p - q|);

|c(t, x, a, p) - c(t, y, b, q)| + |\psi(x, p) - \psi(y, q)| \leq K_{2}[|x - y| + dist(a, b) + |p - q|].

\Lambda_{t}^{a, p} g(x) := \int_{U} [g(x + f(t, x, u, a, p)) - g(x)]\,\nu(du)

H(t, x, a, p, g) := \Lambda_{t}^{a, p} g(x) + c(t, x, a, p).

X_{i}^{N}(t) = \xi_{i}^{N} + \int_{0}^{t}\int_{U} f(s, X_{i}^{N}(s^{-}), u, \alpha_{i}^{N}(s), \mu^{N}(s^{-}))\,\mathcal{N}_{i}^{N}(ds, du), \quad i = 1, \dots, N,

\mathcal{A}^{N} := \{((\Omega, \mathcal{F}, P; \mathbb{F}), \alpha^{N}, \xi^{N}, \mathcal{N}^{N})\}

\mathbb{A}^{N} := \{((\Omega, \mathcal{F}, P; \mathbb{F}), \gamma^{N}, \xi^{N}, \mathcal{N}^{N})\}

X_{i}^{N}(t) = \xi_{i}^{N} + \int_{0}^{t}\int_{U} f(s, X_{i}^{N}(s^{-}), u, \gamma_{i}^{N}(s, X^{N}(s^{-})), \mu^{N}(s^{-}))\,\mathcal{N}_{i}^{N}(ds, du)

\alpha^{N}[\gamma^{N}]_{i}(s) = \gamma_{i}^{N}(s, X^{N}(s^{-})).

J_{i}^{N}(\alpha^{N}) := E\left[\int_{0}^{T} c(t, X_{i}^{N}(t), \alpha_{i}^{N}(t), \mu^{N}(t))\,dt + \psi(X_{i}^{N}(T), \mu^{N}(T))\right].

[\alpha^{N,-i}; \beta]_{j} = \begin{cases} \alpha_{j}^{N} & j \neq i, \\ \beta & j = i. \end{cases}

\begin{cases} \widetilde{X}_{i}^{N}(t) = \xi_{i}^{N} + \int_{0}^{t}\int_{U} f(s, \widetilde{X}_{i}^{N}(s^{-}), u, \beta(s), \mu^{N}(s^{-}))\,\mathcal{N}_{i}^{N}(ds, du), \\ \widetilde{X}_{j}^{N}(t) = \xi_{j}^{N} + \int_{0}^{t}\int_{U} f(s, \widetilde{X}_{j}^{N}(s^{-}), u, \gamma_{j}^{N}(s, \widetilde{X}^{N}(s^{-})), \mu^{N}(s^{-}))\,\mathcal{N}_{j}^{N}(ds, du) & \text{if } j \neq i. \end{cases}

[\gamma^{N,-i}; \beta]_{j}(t) = \begin{cases} \gamma_{j}^{N}(t, \widetilde{X}_{j}^{N}(t^{-})) & j \neq i, \\ \beta(t) & j = i. \end{cases}

J_{i}^{N}(\alpha^{N}) \leq J_{i}^{N}([\alpha^{N,-i}; \beta]) + \varepsilon

J_{i}^{N}(\gamma^{N}) \leq J_{i}^{N}([\gamma^{N,-i}; \beta]) + \varepsilon

X(t) = \xi + \int_{0}^{t}\int_{U} f(s, X(s^{-}), u, \alpha(s), m(s))\,\mathcal{N}(ds, du), \quad t \in [0, T].

\mathcal{A} := \{((\Omega, \mathcal{F}, P; \mathbb{F}), \alpha, \xi, \mathcal{N})\}

\mathbb{A} := \{((\Omega, \mathcal{F}, P; \mathbb{F}), \gamma, \xi, \mathcal{N})\}

X(t) = \xi + \int_{0}^{t}\int_{U} f(s, X(s^{-}), u, \gamma(s, X(s^{-})), m(s))\,\mathcal{N}(ds, du), \quad t \in [0, T].

\alpha^{\gamma}(t) := \gamma(t, X_{\gamma, m}(t^{-})).

Full text

Probabilistic approach to finite state mean field games

Alekos Cecchin

Department of Mathematics “Tullio Levi Civita”

University of Padua

Via Trieste 63, 35121 Padova, Italy

and

Markus Fischer

http://www.math.unipd.it/~fischer

(Date: April 2, 2017; revised December 1, 2017)

Abstract.

We study mean field games and corresponding $N$ -player games in continuous time over a finite time horizon where the position of each agent belongs to a finite state space. As opposed to previous works on finite state mean field games, we use a probabilistic representation of the system dynamics in terms of stochastic differential equations driven by Poisson random measures. Under mild assumptions, we prove existence of solutions to the mean field game in relaxed open-loop as well as relaxed feedback controls. Relying on the probabilistic representation and a coupling argument, we show that mean field game solutions provide symmetric $\varepsilon_{N}$ -Nash equilibria for the $N$ -player game, both in open-loop and in feedback strategies (not relaxed), with $\varepsilon_{N}\leq\frac{\text{constant}}{\sqrt{N}}$ . Under stronger assumptions, we also find solutions of the mean field game in ordinary feedback controls and prove uniqueness either in case of a small time horizon or under monotonicity.

Key words and phrases:

Mean field games, finite state space, relaxed controls, relaxed Poisson measures, $N$ -person games, approximate Nash equilibria, chattering lemma

1991 Mathematics Subject Classification:

60J27, 60K35, 91A10, 93E20

The first author is supported by the PhD program in Mathematical Sciences, Department of Mathematics, University of Padua (Italy) and Progetto Dottorati - Fondazione Cassa di Risparmio di Padova e Rovigo (CaRiPaRo). The second author acknowledges partial support through the research projects “Mean Field Games and Nonlinear PDEs” (CPDA157835) of the University of Padua and “Nonlinear Partial Differential Equations: Asymptotic Problems and Mean-Field Games” of the Fondazione CaRiPaRo. Both authors thank an anonymous Referee for her/his helpful critique and detailed comments and suggestions.

1. Introduction

Mean field games, as independently introduced by Lasry and Lions (2007) and by Huang et al. (2006), represent limit models for symmetric non-zero-sum non-cooperative $N$ -player dynamic games with mean field interactions when the number $N$ of players tends to infinity. For an introduction to mean field games see Cardaliaguet (2013), Carmona et al. (2013) and Bensoussan et al. (2013); the latter two works also deal with optimal control problems of McKean-Vlasov type. There is by now a wealth of works dealing with different classes of mean field games; for a partial overview see Gomes et al. (2015) and the references therein. Here, we restrict attention to a class of finite time horizon problems with continuous time dynamics and fully symmetric cost structure, where the position of each agent belongs to a finite state space.

The relation between the limit model (the mean field game) and the corresponding prelimit models (the $N$-player games) can be understood in two different directions: approximation and convergence. By approximation we mean that a solution of the mean field game allows one to construct approximate Nash equilibria for the $N$-player games, where the approximation error is arbitrarily small for $N$ large enough. By convergence we mean that Nash equilibria for the $N$-player games may be expected to converge to a solution of the mean field game as $N$ tends to infinity.

Results in the approximation direction are more common and usually provide the justification for the definition of the mean field game. When the underlying dynamics is of Itô type without jumps, such results were established by Huang et al. (2006) and, more recently, by Carmona and Delarue (2013), Carmona and Lacker (2015) and Bensoussan et al. (2016), among others. When the dynamics is driven by generators of Lévy type, but with the control appearing only in the drift, an approximation result is found in Kolokoltsov et al. (2011).

Rigorous results on convergence to the mean field game limit in the non stationary case (finite time horizon) are even more recent. While the limits of $N$ -player Nash equilibria in stochastic open-loop strategies can be completely characterized (see Lacker (2016) and Fischer (2017) for general systems of Itô type), the convergence problem is more difficult for Nash equilibria in Markov feedback strategies with global state information. A breakthrough was achieved by Cardaliaguet et al. (2015). Their proof of convergence relies on having a regular solution to the so-called master equation. This is a kind of transport equation on the space of probability measures associated with the mean field game; its solution yields a solution to the mean field game for any initial time and initial distribution. If the mean field game is such that its master equation possesses a unique regular solution, then that solution can be used to prove convergence of the costs associated with the $N$ -player Nash equilibria, as well as a weak form of convergence of the corresponding feedback strategies. An important ingredient in the proof is a coupling argument similar to the one employed in deriving the propagation of chaos property for uncontrolled mean field systems (cf. Sznitman, 1991). This kind of coupling argument, in which independent copies of the limit process are compared to their prelimit counterparts, is useful also for obtaining approximation results; cf. for instance the above cited works by Huang et al. (2006) and Carmona and Delarue (2013).

In this paper, we focus on games where the position of each agent belongs to a given finite state space $\Sigma:=\left\{1,\ldots,d\right\}$ . Such games have been studied by Gomes et al. (2013), and also by Basna et al. (2014). Their approach to the problem is based on PDE / ODE methods and the infinitesimal generator ( $Q$ matrix) of the system dynamics; we will return to this shortly.

Here, we adopt a different approach based on a probabilistic representation. We write the dynamics of the $N$ -player game as a system of stochastic differential equations driven by independent stationary Poisson random measures with the same intensity measure $\nu$ , weakly coupled through the empirical measure of the system states:

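$$X_{i}^{N}(t)=\xi_{i}^{N}+\int_{0}^{t}\int_{U}f(s,X_{i}^{N}(s^{-}),u,\alpha_{i}^{N}(s),\mu^{N}(s^{-}))\,\mathcal{N}_{i}^{N}(ds,du),\qquad i=1,\ldots,N, \tag{1.1}$$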

where $\alpha^{N}_{i}$ is the control of player $i$ (here in open loop form) with values in a compact set $A$ and $\mu^{N}(s^{-})$ is the empirical measure of the system immediately before time $s$ . The dynamics for the one representative player of the mean field limit is analogously written as

$$X(t)=\xi+\int_{0}^{t}\int_{U}f(s,X(s^{-}),u,\alpha(s),m(s))\,\mathcal{N}(ds,du), \tag{1.2}$$

where $\alpha$ is the control and $m:[0,T]\rightarrow\mathcal{P}(\Sigma)$ a deterministic flow of probability measures, which takes the place of $\mu^{N}$ .

Representations (1.1) and (1.2) of the system dynamics allow us to obtain approximation results with error bounds of the form $\frac{\text{constant}}{\sqrt{N}}$ for the approximate $N$-player Nash equilibria via the aforementioned coupling argument. This is what we will do here. The probabilistic representation is useful also for the problem of convergence to the mean field limit; see below.
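The order $1/\sqrt{N}$ can be traced back to an elementary estimate for independent samples. The following is only a sketch of that estimate, not the coupling argument itself, which also has to handle the controls and the interaction through the empirical measure. The variables $Y_{1},\ldots,Y_{N}$ (i.i.d., $\Sigma$-valued with law $m$) and their empirical measure $\bar{\mu}^{N}$ are notation introduced here for illustration, and $|\cdot|$ is the Euclidean norm on $\mathbb{R}^{d}$.

```latex
\mathbb{E}\bigl[|\bar{\mu}^{N}-m|^{2}\bigr]
  = \sum_{j=1}^{d}\operatorname{Var}\Bigl(\frac{1}{N}\sum_{i=1}^{N}\mathbf{1}_{\{Y_{i}=j\}}\Bigr)
  = \frac{1}{N}\sum_{j=1}^{d} m_{j}(1-m_{j})
  \leq \frac{1}{N},
\qquad\text{hence}\qquad
\mathbb{E}\bigl[|\bar{\mu}^{N}-m|\bigr]\leq\frac{1}{\sqrt{N}} .
```

The last step is Jensen's inequality. A coupling between the $N$-player system and independent copies of the limit process is one way to transfer a bound of this type to the interacting system.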

The function $f$ appearing in (1.1) and (1.2) can be chosen so that the corresponding state processes $X^{N}_{i}$ , $X$ have prescribed transition rates when the control and measure variable are held constant. Following an idea of Graham (1992), we choose $U\subset\mathbb{R}^{d}$ , let the intensity measure $\nu$ be given by $d$ copies of Lebesgue measure on the line (cf. (2.29) below), and set

$$f(t,x,u,a,p):=\sum_{y\in\Sigma}(y-x)\,\mathbf{1}_{]0,\lambda(t,x,y,a,p)[}(u_{y}). \tag{1.3}$$

With this $f$ we have, as $h\rightarrow 0$ ,

$$P[X(t+h)=y\mid X(t)=x]=\lambda(t,x,y,\alpha,m)\cdot h+o(h)$$

if $y\neq x$ , for any constant control $\alpha$ and probability measure $m$ . Thus, $\lambda(t,x,y,\alpha,m)$ is the transition rate from state $x$ to state $y$ .
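To make the role of $f$ concrete, here is a minimal Python sketch of the construction for constant rates. It rests on assumptions that are not spelled out in this part of the text: the marks are taken to live on the $d$ coordinate axes of $[0,M]^{d}$, each axis carrying Lebesgue measure (so that $\nu(U)=dM$), and all rates are assumed to be bounded by $M$. The names `jump_increment`, `simulate_path` and the parameter `M` are hypothetical.

```python
import random


def jump_increment(x, u, rates):
    """Evaluate f = sum over y of (y - x) * 1{0 < u_y < lambda(x, y)}.

    x     : current state in {1, ..., d}
    u     : mark in R^d, with u[y - 1] playing the role of u_y
    rates : rates[y - 1] = lambda(t, x, y, a, p) for frozen t, a, p
    """
    return sum((y - x) for y in range(1, len(rates) + 1)
               if 0.0 < u[y - 1] < rates[y - 1])


def simulate_path(x0, rate_matrix, T, M, rng):
    """Sample one path on [0, T] for constant rates rate_matrix[x-1][y-1] <= M."""
    d = len(rate_matrix)
    x, t, path = x0, 0.0, [(0.0, x0)]
    while True:
        # Atoms of the Poisson measure arrive at total rate nu(U) = d * M (assumed).
        t += rng.expovariate(d * M)
        if t > T:
            return path
        # Each atom sits on one coordinate axis, uniformly placed in [0, M].
        u = [0.0] * d
        u[rng.randrange(d)] = rng.uniform(0.0, M)
        jump = jump_increment(x, u, rate_matrix[x - 1])
        if jump != 0:
            x += jump
            path.append((t, x))
```

Because a mark on the axis of $y$ has all other coordinates equal to zero and the interval $]0,\lambda[$ is open, at most one indicator is active per atom, so the state jumps to exactly one target $y$ and stays in $\Sigma$. An atom on that axis triggers the jump with probability $\lambda/M$, which gives jump rate $M\cdot\lambda/M=\lambda$.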

We will consider several types of controls: open-loop, feedback, relaxed open-loop and relaxed feedback. Each player wants to optimize his cost functional over a finite time horizon $T$ . The coefficients representing running and terminal costs may depend on the measure variable and are the same for all players.

We first study the mean field game and show that it admits a solution in relaxed controls. The solution of the mean field game can be seen as a fixed point: for a given flow of measures $m(\cdot)$, find a strategy $\alpha_{m}$ that is optimal and let $X^{\alpha_{m},m}$ be the corresponding solution of Eq. (1.2); then find $m$ such that $\mathrm{Law}(X^{\alpha_{m},m}(t))=m(t)$ for all $t\in[0,T]$. Under mild hypotheses, we prove existence of solutions in relaxed open-loop controls using the Ky Fan fixed point theorem for point-to-set maps. This is analogous to the existence result obtained by Lacker (2015) for general dynamics driven by Wiener processes. As there, we will characterize solutions to Eq. (1.2) through the associated controlled martingale problem. In order to write the dynamics when using a relaxed control, we need to work with relaxed Poisson measures in the sense of Kushner and Dupuis (2001); also see Appendix A below. The same assumptions that give existence in relaxed open-loop controls also yield existence of solutions in relaxed feedback controls. Relaxed controls are used only for the limit model.
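The fixed-point structure can be illustrated numerically. The sketch below is not the existence argument of the paper (which uses the Ky Fan theorem and relaxed controls); it swaps in a plain Picard iteration with explicit Euler steps, a finite action grid and ordinary feedback controls, and such an iteration need not converge. The function name `solve_mfg_picard`, the 0-based state labels and the callables `rate`, `cost`, `terminal` are all hypothetical.

```python
def solve_mfg_picard(rate, cost, terminal, m0, actions, T, steps, sweeps):
    """Iterate m -> Law(X^{alpha_m, m}) on a time grid; states are 0, ..., d-1.

    rate(t, x, y, a, p)  : transition rate from x to y
    cost(t, x, a, p)     : running cost
    terminal(x, p)       : terminal cost
    """
    d, dt = len(m0), T / steps
    flow = [list(m0) for _ in range(steps + 1)]  # initial guess: constant flow
    policy = [[actions[0]] * d for _ in range(steps)]
    for _ in range(sweeps):
        # Backward pass: dynamic programming for the frozen flow.
        V = [terminal(x, flow[steps]) for x in range(d)]
        for k in range(steps - 1, -1, -1):
            t, p, V_new = k * dt, flow[k], [0.0] * d
            for x in range(d):
                def H(a):
                    # Generator applied to V plus running cost.
                    gen = sum(rate(t, x, y, a, p) * (V[y] - V[x])
                              for y in range(d) if y != x)
                    return gen + cost(t, x, a, p)
                best = min(actions, key=H)
                policy[k][x] = best
                V_new[x] = V[x] + dt * H(best)
            V = V_new
        # Forward pass: law of the state under the computed feedback.
        new_flow = [list(m0)]
        for k in range(steps):
            t, p, q = k * dt, flow[k], new_flow[-1]
            nxt = list(q)
            for x in range(d):
                for y in range(d):
                    if y != x:
                        mass = dt * q[x] * rate(t, x, y, policy[k][x], p)
                        nxt[x] -= mass
                        nxt[y] += mass
            new_flow.append(nxt)
        flow = new_flow
    return flow, policy
```

The forward pass keeps the measure argument of the rates frozen at the previous iterate, mirroring the map $m\mapsto\mathrm{Law}(X^{\alpha_{m},m})$. The Euler step preserves total mass exactly and keeps the entries nonnegative as long as `dt` times the total exit rate stays below one.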

Then we show that those relaxed mean field game solutions provide $\varepsilon_{N}$ -Nash equilibria for the $N$ -player game both in ordinary open-loop and ordinary feedback strategies. To this end, we approximate a limiting optimal relaxed control by an ordinary one, using a version of the chattering lemma that also works for feedback controls, at least in our finite setting. The approximating control is then shown to provide a symmetric $\varepsilon_{N}$ -Nash equilibrium, with $\varepsilon_{N}\leq\frac{\text{constant}}{\sqrt{N}}$ , decentralized when considering feedback strategies. As explained above, our proof relies on the probabilistic representation of the system and a coupling argument.

We also study the problem of finding solutions of the mean field game in ordinary feedback controls. There, we need stronger assumptions in order to guarantee the uniqueness of an optimal feedback control for any fixed $m$ (existence always holds). Moreover, we prove that the feedback mean field game solution is unique either if the time horizon $T$ is small enough or if the cost coefficients satisfy the monotonicity conditions of Lasry and Lions (cf. below).

Roughly speaking, we need to assume only the continuity of the rates $\lambda$ in order to have relaxed or relaxed feedback mean field game solutions and to obtain $\varepsilon_{N}$ -Nash equilibria for the $N$ -player game, both open-loop and feedback. Under stronger assumptions, namely affine dependence of $\lambda$ on the control and strict convexity of the cost, we have uniqueness of the optimal feedback control for any $m$ through the uniqueness of the minimizer of the associated Hamiltonian. Under assumptions similar to these latter, Basna et al. (2014) study the problem in the framework of non-linear Markov processes and find $\frac{1}{N}$ -Nash equilibria for the $N$ -player game. In Gomes et al. (2013), the transition rates coincide with the control, in analogy with the original works of Lasry and Lions, and $\frac{1}{\sqrt{N}}$ -Nash equilibria are obtained. Both these works consider ordinary feedback controls only, hence feedback solutions of the mean field game.

The work by Gomes et al. (2013) also contains a result in the convergence direction. More precisely, convergence of $N$ -player Nash equilibria in feedback controls to the mean field limit is established, but only if the time horizon $T$ is sufficiently small. Moreover, the authors prove a result about the uniqueness of feedback mean field game solutions for arbitrary time horizon in case the Lasry-Lions monotonicity conditions hold.

Lastly, let us mention several recent preprints. In Doncel et al. (2016), continuous time mean field games with finite state space and finite action space are studied. The authors prove existence of solutions to the mean field game, corresponding to what we call solutions in relaxed feedback controls. Their prelimit models (the $N$ -player games) are different and difficult to compare to ours since they are set in discrete time. The second work we mention is Benazzoli et al. (2017). There, the authors study a class of mean field games with jump diffusion dynamics. An existence result for the mean field game in the spirit of Lacker (2015) is given. The authors also obtain a convergence result in a special situation where Nash equilibria for the $N$ -player games can be found explicitly. In their model, the jump heights are directly (and linearly) controlled, not the jump intensities.

The last two preprints, which appeared nearly simultaneously, after submission of the present paper, concern the convergence problem for finite state mean field games. In Cecchin and Pelino (2017), a joint work of the first author, the convergence of feedback Nash equilibria to solutions of the mean field game is studied following the ideas of Cardaliaguet et al. (2015) sketched above. The Master Equation, which in this case is a first order PDE stated in $\mathcal{P}(\Sigma)$ , is employed to obtain convergence of the feedback Nash equilibria, the value functions and a propagation of chaos property for the $N$ -player optimal trajectories. Provided that the Master Equation possesses a (unique) classical solution, convergence is established through a coupling argument, which relies on the probabilistic representation of the dynamics introduced here. Existence of a unique classical solution to the Master Equation is verified under the Lasry-Lions monotonicity conditions. In addition, a central limit theorem and a large deviation principle for the $N$ -player empirical measure processes are proved. In the independent work by Bayraktar and Cohen (2017), the authors again use the Master Equation in the spirit of Cardaliaguet et al. (2015) to find the same convergence result as in Cecchin and Pelino (2017), but using a slightly different probabilistic representation of the dynamics. They also obtain a central limit theorem for the fluctuations of the $N$ -player empirical measure processes.

Structure of the paper

In Section 2, we introduce the notation and various assumptions to be used in the sequel. Then we describe the $N$ -player games as well as the corresponding mean field game, giving the relevant definitions of Nash equilibrium and solution of the mean field game. Relaxed controls (open-loop and feedback) are introduced there as well, while a proper definition of relaxed Poisson measures is given in Appendix A. All main assumptions are verified to hold for the natural shape of $f$ in (1.3).

In Section 3, we establish existence of solutions to the mean field game in relaxed open-loop as well as relaxed feedback controls.

In Section 4, we find, under additional assumptions, mean field game solutions in non-relaxed feedback controls by proving the uniqueness of the optimal control for any flow of measures. Moreover, uniqueness of solutions is proved either for small $T$ or under the Lasry-Lions monotonicity conditions.

In Section 5, we first establish a version of the chattering lemma that works also for feedback controls. Then we turn to the construction of approximate Nash equilibria coming from a solution of the mean field game, and derive the error bound mentioned above for feedback as well as open-loop strategies.

Section 6 contains a summary of the main results.

2. Description of the model

2.1. Notations and assumptions

Throughout the paper, we fix $\Sigma=\{1,\ldots,d\}$ to be the finite state space of any player. Let $T$ be the finite time horizon and $(A,dist)$ be a compact metric space, the space of control values. Let $U$ be a compact set in $\mathbb{R}^{d}$ and let $\nu$ be a Radon measure on $U$ . Let

$$S:=\mathcal{P}(\Sigma)=\{p\in\mathbb{R}^{d}:p_{j}\geq 0,\ j=1,\ldots,d;\ p_{1}+\ldots+p_{d}=1\}$$

be the space of probability measures on $\Sigma$ , which is the probability simplex in $\mathbb{R}^{d}$ . Let $f:[0,T]\times\Sigma\times U\times A\times S\longrightarrow\{-d,\ldots,d\}$ be a measurable function (the one appearing in the dynamics (1.2) and (1.1)) such that $f(t,x,u,a,p)\in\{1-x,\ldots,d-x\}$ . Let $c:[0,T]\times\Sigma\times A\times S\longrightarrow\mathbb{R}$ , $\psi:\Sigma\times S\longrightarrow\mathbb{R}$ be measurable functions, representing the running and the terminal costs, respectively, which will be the same for all players.

We will denote by $\mathcal{N}$ any stationary Poisson random measure on $[0,T]\times U$ with intensity measure $\nu$ on $U$ , and by $\mathcal{N}^{N}=(\mathcal{N}_{1}^{N},\ldots,\mathcal{N}_{N}^{N})$ a vector of $N$ i.i.d. stationary Poisson random measures, each with the same law as $\mathcal{N}$ . The initial point of the $N$ -player game will be represented by $N$ i.i.d. random variables $\xi_{1},\ldots,\xi_{N}$ with values in $\Sigma$ and common distribution $m_{0}\in S$ , which will be fixed throughout. Similarly, the initial point of the limiting system will be represented by a random variable $\xi$ with law $m_{0}$ .

The state of player $i$ at time $t$ is denoted by $X_{i}^{N}(t)$ . The trajectories of any process $X_{i}^{N}$ are assumed to be in $D([0,T],\Sigma)$ , which denotes the space of càdlàg functions from $[0,T]$ to $\Sigma$ , endowed with the Skorokhod $J_{1}$ -topology. Let $\mu^{N}(t):=\frac{1}{N}\sum_{i=1}^{N}\delta_{X^{N}_{i}(t)}$ be the empirical measure of the system of $N$ players. In the limiting dynamics, the empirical measure is replaced by a deterministic flow of probability measures $m:[0,T]\longrightarrow S$ .
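As a small illustration, the empirical measure of a configuration can be stored as a point of the simplex $S$. The helper name `empirical_measure` is hypothetical.

```python
def empirical_measure(states, d):
    """Return mu^N as a list of length d; entry j-1 is the fraction of players in state j."""
    counts = [0] * d
    for x in states:
        counts[x - 1] += 1
    return [c / len(states) for c in counts]
```

For instance, four players in states `[1, 1, 2, 3]` with $d=3$ give the vector `[0.5, 0.25, 0.25]`.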

The space of measures $S$ can be equipped with any norm in $\mathbb{R}^{d}$ , as they are all equivalent, so we choose the Euclidean norm $|p|$ . We observe that $S$ is a compact and convex subset of $\mathbb{R}^{d}$ . Denote by $\mathcal{C}([0,T],S)$ the space of continuous functions from $[0,T]$ to $S$ , endowed with the uniform norm. The space of flows of probability measures on $S$ will be denoted by $\mathcal{L}\subset\mathcal{C}([0,T],S)$ , which will be shown in Subsection 3.1 to be

$$\mathcal{L}:=\{m:[0,T]\longrightarrow S:\ |m(t)-m(s)|\leq K|t-s|,\ m(0)=m_{0}\}$$

where the constant is given by $K:=2\nu(U)\sqrt{d}$ .

We will study several types of controls. Pathwise existence and uniqueness of solutions to the controlled dynamics (1.2), with trajectories that remain in $\Sigma$ , is guaranteed by the following Lipschitz condition:

$$\int_{U}|f(s,x,u,a,p)-f(s,y,u,a,p)|\,\nu(du)\leq K_{1}|x-y| \tag{2.1}$$

for every $x,y\in\Sigma$, $s\in[0,T]$, $a\in A$ and $p\in S$, where $K_{1}$ is a constant. The above condition is always satisfied in our model since $|x-y|\geq 1$ for each $x\neq y\in\Sigma$ and $\int_{U}|f(s,x,u,a,p)|\,\nu(du)\leq\nu(U)d$; thus we may take $K_{1}=2\nu(U)d$.
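Spelled out, the bound uses only the triangle inequality and the fact that $f$ takes values in $\{-d,\ldots,d\}$. For $x\neq y$:

```latex
\int_{U}|f(s,x,u,a,p)-f(s,y,u,a,p)|\,\nu(du)
  \leq \int_{U}|f(s,x,u,a,p)|\,\nu(du)+\int_{U}|f(s,y,u,a,p)|\,\nu(du)
  \leq 2\,\nu(U)\,d
  \leq 2\,\nu(U)\,d\,|x-y| .
```

For $x=y$ both sides vanish, so the inequality holds trivially.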

Let us summarize here the various sets of assumptions we will make use of:

(A)

The function $\tilde{f}:[0,T]\times\Sigma\times A\times S\longrightarrow L^{1}(\nu)$ defined by $\tilde{f}(t,x,a,p):=f(t,x,\cdot,a,p)\in L^{1}(\nu)$ is bounded and uniformly continuous in $t,a,p$; that is, there exists a function $w_{f}$ with $\lim_{h\rightarrow 0}w_{f}(h)=0$ such that

$$\int_{U}|f(t,x,u,a,p)-f(s,x,u,b,q)|\,\nu(du)\leq w_{f}\bigl(|t-s|+dist(a,b)+|p-q|\bigr)$$

for every $t,s\in[0,T]$ , $x\in\Sigma$ , $a,b\in A$ , $p,q\in S$ ;

(A’)

Assumption (A) holds and $\tilde{f}$ is Lipschitz in $p\in S$ :

$$\int_{U}|f(t,x,u,a,p)-f(t,y,u,a,q)|\,\nu(du)\leq K_{1}\bigl(|x-y|+|p-q|\bigr);$$

(A”)

Assumption (A’) holds and $\tilde{f}$ is Lipschitz also in $a\in A$ :

$$\int_{U}|f(t,x,u,a,p)-f(t,y,u,b,q)|\,\nu(du)\leq K_{1}\bigl(|x-y|+|p-q|+dist(a,b)\bigr);$$

(B)

The running cost $c$ is continuous (and bounded) in $t,x,a,p$ and the terminal cost is continuous (and bounded) in $x,p$ ;

(B’)

Assumption (B) holds and the costs $c$ and $\psi$ are Lipschitz in $p$ :

$$|c(t,x,a,p)-c(t,y,a,q)|+|\psi(x,p)-\psi(y,q)|\leq K_{2}\bigl(|x-y|+|p-q|\bigr);$$

(B”)

Assumption (B’) holds and the running cost $c$ is Lipschitz also in $a$ :

$$|c(t,x,a,p)-c(t,y,b,q)|+|\psi(x,p)-\psi(y,q)|\leq K_{2}\bigl[|x-y|+dist(a,b)+|p-q|\bigr].$$

The above assumptions will be used in Sections 3 and 5 to find solutions of the mean field game and then approximate Nash equilibria for the $N$ -player game, both in open-loop and in feedback form.

Our last assumption will be more implicit. We identify the set of functions $g:\Sigma\longrightarrow\mathbb{R}$ with $\mathbb{R}^{d}$ and observe that any $g$ is bounded and Lipschitz. For any $x\in\Sigma$ , $0\leq t\leq T$ , $a\in A$ , $p\in S$ and $g\in\mathbb{R}^{d}$ define the generator

$$\Lambda_{t}^{a,p}g(x):=\int_{U}[g(x+f(t,x,u,a,p))-g(x)]\,\nu(du)$$

and the pre-Hamiltonian

$$H(t,x,a,p,g):=\Lambda_{t}^{a,p}g(x)+c(t,x,a,p).$$

In order to obtain existence and uniqueness of feedback mean field game solutions, in Section 4, we will make the additional hypothesis:

(C)

For any $t$, $x$, $p$ and $g$ there exists a unique minimizer $a^{\ast}=a^{\ast}(t,x,p,g)$ of $H(t,x,a,p,g)$ over $a\in A$.

We observe that for any fixed $p$ and $g$ the function $a^{\ast}(t,x)$ is measurable, thanks to Theorem D.5 in Hernández-Lerma and Lasserre (1996). We remark also that the limiting dynamics (1.2) always admits a pathwise unique solution thanks to (2.1).
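The generator and the pre-Hamiltonian are easy to evaluate when $f$ has the form (1.3). Under the assumption that the marks live on the coordinate axes with Lebesgue measure and that the rates do not exceed the size of $U$, the integral reduces to $\Lambda_{t}^{a,p}g(x)=\sum_{y}\lambda(t,x,y,a,p)\,[g(y)-g(x)]$. The sketch below uses this reduced form with $t$ and $p$ frozen inside the callables; the names `generator`, `pre_hamiltonian`, `minimizer_on_grid` and the finite action grid are hypothetical, and a grid search only approximates the minimizer required in (C).

```python
def generator(x, g, rate_row):
    """Lambda g(x) = sum_y lambda(x, y) * (g(y) - g(x)); states are 1, ..., d."""
    d = len(g)
    return sum(rate_row[y - 1] * (g[y - 1] - g[x - 1]) for y in range(1, d + 1))


def pre_hamiltonian(x, a, g, rate, cost):
    """H(t, x, a, p, g) with t and p frozen inside rate(x, y, a) and cost(x, a)."""
    row = [rate(x, y, a) for y in range(1, len(g) + 1)]
    return generator(x, g, row) + cost(x, a)


def minimizer_on_grid(x, g, rate, cost, grid):
    """Approximate a*(t, x, p, g) by minimizing H over a finite grid of actions."""
    return min(grid, key=lambda a: pre_hamiltonian(x, a, g, rate, cost))
```

With two states, $g=(0,1)$, rate equal to the action and cost $a^{2}/2$, the Hamiltonian at state 1 is $a+a^{2}/2$ and at state 2 is $-a+a^{2}/2$, so on the grid $\{0,0.5,1\}$ the minimizers are $0$ and $1$ respectively. Strict convexity in $a$ is the kind of structure that makes the minimizer unique in this toy case.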

2.2. N-player game

In the prelimit, we consider a system of $N$ symmetric players governed by the dynamics

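$$X_{i}^{N}(t)=\xi_{i}^{N}+\int_{0}^{t}\int_{U}f(s,X_{i}^{N}(s^{-}),u,\alpha_{i}^{N}(s),\mu^{N}(s^{-}))\,\mathcal{N}_{i}^{N}(ds,du),\qquad i=1,\ldots,N, \tag{2.9}$$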

where $X^{N}=(X^{N}_{1},\ldots,X^{N}_{N})$ and $\mu^{N}(t):=\frac{1}{N}\sum_{i=1}^{N}\delta_{X^{N}_{i}(t)}$ . Here, the controls $\alpha_{i}^{N}$ are in open-loop form. Let us specify the controls to be used in the $N$ -player game.

Definition 1.

Define the set of strategy vectors as

$$\mathcal{A}^{N}:=\{((\Omega,\mathcal{F},P;\mathbb{F}),\alpha^{N},\xi^{N},\mathcal{N}^{N})\}$$

where $(\Omega,\mathcal{F},P;\mathbb{F})$ is a filtered probability space, $\xi^{N}:=(\xi^{N}_{1},\ldots,\xi^{N}_{N})$ is a vector of $N$ i.i.d. $\mathcal{F}_{0}$ -measurable random variables with law $m_{0}$ , the initial points, $\mathcal{N}^{N}=(\mathcal{N}_{1}^{N},\ldots,\mathcal{N}_{N}^{N})$ is a vector of $N$ i.i.d. stationary Poisson random measures with respect to the filtration $\mathbb{F}=(\mathcal{F}_{t})_{t\in[0,T]}$ with intensity measure $\nu$ on $U$ , $\mathcal{F}_{T}=\mathcal{F}$ , and $\alpha^{N}=(\alpha^{N}_{1},\ldots,\alpha^{N}_{N})$ is a vector of $A$ -valued $\mathbb{F}$ -predictable processes $\alpha^{N}_{i}$ . We will often write $\alpha^{N}\in\mathcal{A}^{N}$ to indicate the process $\alpha^{N}$ .

Define the set of feedback strategy vectors as

$$\mathbb{A}^{N}:=\{((\Omega,\mathcal{F},P;\mathbb{F}),\gamma^{N},\xi^{N},\mathcal{N}^{N})\}$$

where $\gamma^{N}=(\gamma^{N}_{1},\ldots,\gamma^{N}_{N}):[0,T]\times\Sigma^{N}\rightarrow A^{N}$ is measurable and the filtered probability space and the $\xi^{N}$ and $\mathcal{N}^{N}$ are as above. We will often write $\gamma^{N}\in\mathbb{A}^{N}$ to indicate the function $\gamma^{N}$ .

We observe that the above definition of feedback strategy vector is not standard, as it is given together with the probability space and the noise. We give such a definition because in this way any strategy gives a unique pathwise solution to dynamics (2.9). Indeed, provided that $\tilde{f}$ is Lipschitz in $p$ , we have pathwise existence and uniqueness of solutions to the system (2.9), for any $\alpha^{N}=(\alpha^{N}_{1},\ldots,\alpha^{N}_{N})\in\mathcal{A}^{N}$ .

Given a feedback strategy vector $\gamma^{N}=(\gamma^{N}_{1},\ldots,\gamma^{N}_{N})\in\mathbb{A}^{N}$ , equation (2.9) is written as

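$$X_{i}^{N}(t)=\xi_{i}^{N}+\int_{0}^{t}\int_{U}f(s,X_{i}^{N}(s^{-}),u,\gamma_{i}^{N}(s,X^{N}(s^{-})),\mu^{N}(s^{-}))\,\mathcal{N}_{i}^{N}(ds,du)$$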

for each $i=1,\ldots,N$ . The same assumption as above provides existence and uniqueness of solutions $X^{N}_{i}$ to this equation, so we can define the related open-loop control $\alpha^{N}[\gamma^{N}]$ by

$$\alpha^{N}[\gamma^{N}]_{i}(s)=\gamma_{i}^{N}(s,X^{N}(s^{-})).$$

In view of Definition 1, the open-loop control $\alpha^{N}[\gamma^{N}]$ has to be given together with a filtered probability space, a vector of initial conditions and a vector of Poisson random measures, which we impose to be the same as those given with the feedback control $\gamma^{N}$ .

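Next, we define the object of the minimization. Let $\alpha^{N}=(\alpha^{N}_{1},\ldots,\alpha^{N}_{N})\in\mathcal{A}^{N}$ be a strategy vector and $X^{N}=(X^{N}_{1},\ldots,X^{N}_{N})$ be the solution to dynamics (2.9). For $i=1,\ldots,N$ set

$$J_{i}^{N}(\alpha^{N}):=E\left[\int_{0}^{T}c(t,X_{i}^{N}(t),\alpha_{i}^{N}(t),\mu^{N}(t))\,dt+\psi(X_{i}^{N}(T),\mu^{N}(T))\right].$$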

Define also $J^{N}_{i}(\gamma^{N}):=J^{N}_{i}(\alpha^{N}[\gamma^{N}])$ for any $\gamma^{N}\in\mathbb{A}^{N}$ .
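A cost of this type can be estimated by simulation. The following is a minimal Monte Carlo sketch under several simplifying assumptions that go beyond the text: every player uses the same decentralized feedback `gamma(t, x)` (a special case of a feedback strategy vector), the marks are placed on the coordinate axes of $[0,M]^{d}$ as in the choice (1.3), all rates are bounded by `M`, and the running cost is integrated with a left-endpoint rule between events (exact when `c` and `gamma` do not depend on time). All function and parameter names are hypothetical.

```python
import random


def cost_sample(i, gamma, rate, c, psi, m0, N, T, M, rng):
    """One sample of the cost of player i (0-based) in the N-player system."""
    d = len(m0)
    X = rng.choices(range(1, d + 1), weights=m0, k=N)  # i.i.d. initial states
    t, total = 0.0, 0.0
    while True:
        mu = [X.count(j) / N for j in range(1, d + 1)]  # empirical measure before the jump
        # Superposition of the N Poisson measures: next atom at rate N * d * M.
        t_next = t + rng.expovariate(N * d * M)
        end = min(t_next, T)
        total += (end - t) * c(t, X[i], gamma(t, X[i]), mu)
        if t_next >= T:
            return total + psi(X[i], mu)
        t = t_next
        k = rng.randrange(N)           # player whose measure carries the atom
        y = rng.randrange(1, d + 1)    # coordinate axis, i.e. candidate target state
        mark = rng.uniform(0.0, M)     # position of the atom on that axis
        if y != X[k] and mark < rate(t, X[k], y, gamma(t, X[k]), mu):
            X[k] = y


def estimate_cost(i, gamma, rate, c, psi, m0, N, T, M, n_samples, seed=0):
    """Average of independent cost samples; a plain Monte Carlo estimate."""
    rng = random.Random(seed)
    samples = [cost_sample(i, gamma, rate, c, psi, m0, N, T, M, rng)
               for _ in range(n_samples)]
    return sum(samples) / n_samples
```

A quick sanity check is to take `c` identically equal to one and `psi` identically zero: every sample then equals $T$ up to rounding, whatever the rates and the feedback are.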

We look for approximate Nash equilibria for the $N$-player game. To this end, we first define the perturbed strategy vectors under consideration.

Notation 1.

Let $\beta$ be an $A$ -valued $\mathbb{F}$ -predictable process. For a strategy vector $\alpha^{N}=(\alpha^{N}_{1},\ldots,\alpha^{N}_{N})$ in $\mathcal{A}^{N}$ denote by $[\alpha^{N,-i};\beta]$ the strategy vector such that

$$[\alpha^{N,-i};\beta]_{j}=\begin{cases}\alpha_{j}^{N}&j\neq i,\\ \beta&j=i.\end{cases}$$

For a feedback strategy vector $\gamma^{N}=(\gamma^{N}_{1},\ldots,\gamma^{N}_{N})\in\mathbb{A}^{N}$ , let $\widetilde{X}^{N}$ be the solution to

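$$\begin{cases}\widetilde{X}_{i}^{N}(t)=\xi_{i}^{N}+\int_{0}^{t}\int_{U}f(s,\widetilde{X}_{i}^{N}(s^{-}),u,\beta(s),\mu^{N}(s^{-}))\,\mathcal{N}_{i}^{N}(ds,du),\\ \widetilde{X}_{j}^{N}(t)=\xi_{j}^{N}+\int_{0}^{t}\int_{U}f(s,\widetilde{X}_{j}^{N}(s^{-}),u,\gamma_{j}^{N}(s,\widetilde{X}^{N}(s^{-})),\mu^{N}(s^{-}))\,\mathcal{N}_{j}^{N}(ds,du)&\text{if } j\neq i.\end{cases}$$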

Denote then by $[\gamma^{N,-i};\beta]\in\mathcal{A}^{N}$ the strategy vector such that

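$$[\gamma^{N,-i};\beta]_{j}(t)=\begin{cases}\gamma_{j}^{N}(t,\widetilde{X}_{j}^{N}(t^{-}))&j\neq i,\\ \beta(t)&j=i.\end{cases}$$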

Definition 2.

Let $\varepsilon>0$ . A strategy vector $\alpha_{N}$ is said to be an $\varepsilon$ -Nash equilibrium if for each $i=1,\ldots,N$

$$J^{N}_{i}(\alpha^{N})\leq J^{N}_{i}([\alpha^{N,-i};\beta])+\varepsilon$$

for every $\beta$ such that $[\alpha^{N,-i};\beta]$ is a strategy vector.

A vector $\gamma^{N}\in\mathbb{A}^{N}$ is called a feedback $\varepsilon$ -Nash equilibrium if

$$J^{N}_{i}(\gamma^{N})\leq J^{N}_{i}([\gamma^{N,-i};\beta])+\varepsilon,\qquad i=1,\ldots,N,$$

for every $\beta$ such that $[\gamma^{N,-i};\beta]$ is a strategy vector.

We remark that the above definition of feedback $\varepsilon$ -Nash equilibrium is not standard. Indeed, the perturbed strategy vector $[\gamma^{N,-i};\beta]$ is usually required to be in feedback form. In our definition, a slightly more restrictive (or stronger) condition is used since the perturbing strategy $\beta$ is allowed to be in open-loop form. As a consequence, the approximation result of Section 5 will be slightly stronger than with the standard definition.

2.3. Mean field game

The mean field limiting system consists of a single player whose state evolves according to the dynamics

[TABLE]

Here the empirical measure appearing in (2.9) is replaced by a deterministic flow of probability measures $m:[0,T]\longrightarrow S$ .

Definition 3.

The set of open-loop controls is the set

[TABLE]

where $(\Omega,\mathcal{F},P;\mathbb{F})$ is a filtered probability space, $\xi$ is an $\mathcal{F}_{0}$ -measurable random variable with law $m_{0}$ , the initial condition, $\mathcal{N}$ is a stationary Poisson random measure with respect to the filtration $\mathbb{F}=(\mathcal{F}_{t})_{t\in[0,T]}$ with intensity measure $\nu$ on $U$ , $\mathcal{F}_{T}=\mathcal{F}$ , and $\alpha$ is an $A$ -valued $\mathbb{F}$ -predictable process. We will often write $\alpha\in\mathcal{A}$ to indicate the process $\alpha$ .

Define the set of feedback controls as

[TABLE]

where $\gamma:[0,T]\times\Sigma\rightarrow A$ is measurable and the filtered probability space, the initial condition and the Poisson random measure $\mathcal{N}$ are as above. We will often write $\gamma\in\mathbb{A}$ to indicate the function $\gamma$ .

We remark that the feedback control is given with the probability space and the noise, in analogy with Definition 1 for the prelimit system.

Thanks to the Lipschitz condition (2.1), the limiting dynamics is well defined. More precisely, given any open-loop control $((\Omega,\mathcal{F},P;\mathbb{F}),\alpha,\xi,\mathcal{N})\in\mathcal{A}$ and flow of measures $m\in\mathcal{L}$ , there exists a pathwise unique solution $X$ of Eq. (2.10), which we will denote by $X_{\alpha,m}$ . Similarly, given any feedback control $((\Omega,\mathcal{F},P;\mathbb{F}),\gamma,\xi,\mathcal{N})\in\mathbb{A}$ and flow of measures $m\in\mathcal{L}$ , there exists a pathwise unique process $X=X_{\gamma,m}$ solving

[TABLE]

The corresponding open-loop control is then defined as

[TABLE]

In view of Definition 3, the open-loop control $\alpha^{\gamma}$ has to be given together with a filtered probability space, an initial condition and a Poisson random measure, which we impose to be the same as those given with the feedback control $\gamma$ .

We define the object of the minimization for the mean field game. For any $\alpha\in\mathcal{A}$ and $m\in\mathcal{L}$ set

[TABLE]

Define also $J(\gamma,m):=J(\alpha^{\gamma},m)$ for any $\gamma\in\mathbb{A}$ .

The notion of solution for the limiting mean field game, which will provide approximate Nash equilibria for the $N$ -player game, is the following.

Definition 4.

An open-loop solution of the mean field game (2.10) is a triple

[TABLE]

such that

(1) $\left((\Omega,\mathcal{F},P;\mathbb{F}),\alpha,\xi,\mathcal{N}\right)\in\mathcal{A}$ , $m\in\mathcal{L}$ , $(X(t))_{t\in[0,T]}$ is adapted to the filtration $\mathbb{F}$ and $X=X_{\alpha,m}$ ;

(2) Optimality: $J(\alpha,m)\leq J(\beta,m)$ for every $\beta\in\mathcal{A}$ ;

(3) Mean Field Condition: $Law(X(t))=m(t)$ for every $t\in[0,T]$ .

We say that $\left(\left((\Omega,\mathcal{F},P;\mathbb{F}),\gamma,\xi,\mathcal{N}\right),m,X\right)$ is a feedback solution of the mean field game if $\gamma\in\mathbb{A}$ and $\left(\left((\Omega,\mathcal{F},P;\mathbb{F}),\alpha^{\gamma},\xi,\mathcal{N}\right),m,X\right)$ is an open-loop solution of the mean field game, where $\alpha^{\gamma}$ is defined in (2.12).

In our writing, we will often drop the filtered probability space and the Poisson random measure from the notation.

In condition (3) of the above definition, $Law(X(t)):=P\circ X(t)^{-1}$ as usual. Let us denote by $Flow(X):[0,T]\longrightarrow S$ the flow of the process $X$ , that is, $Flow(X)(t):=Law(X(t))$ . Then the mean field condition can be written as $Flow(X)=m$ .

2.4. Relaxed controls

The space $\mathcal{A}$ is not itself compact. In order to always have convergence along subsequences, we need to enlarge the space of controls, considering relaxed controls and related relaxed Poisson measures. They are used only for the limiting system.

Definition 5.

A deterministic relaxed control is a measure $\rho$ on the Borel sets $\mathcal{B}([0,T]\times A)$ such that

[TABLE]

The space of deterministic relaxed controls will be denoted by $\mathcal{D}$ .

Given $\rho\in\mathcal{D}$ , the time derivative exists for Lebesgue-almost every $t\in(0,T]$ ; it is the probability measure $\rho_{t}$ on $A$ given by

[TABLE]

As a consequence, $\rho$ can be factorized according to

[TABLE]

The space $\mathcal{D}$ is endowed with the topology of weak convergence of measures, i.e. $\rho_{n}\rightarrow\rho$ if and only if

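$$\int_{[0,T]\times A}\varphi(t,a)\,\rho_{n}(dt,da)\longrightarrow\int_{[0,T]\times A}\varphi(t,a)\,\rho(dt,da)$$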

for every continuous $\varphi$ on $[0,T]\times A$ . Moreover, there exists a metric that makes $\mathcal{D}$ a compact metric space (see, for instance, Kushner and Dupuis, 2001).

Definition 6.

The space of (stochastic) relaxed controls is

[TABLE]

where $(\Omega,\mathcal{F},P;\mathbb{F})$ is a filtered probability space, $\rho$ is a $\mathcal{D}$ -valued random variable such that $\rho([0,\cdot]\times E)$ is $\mathbb{F}$ -adapted for every $E\in\mathcal{B}(A)$ , and $\mathcal{N}$ is a stationary Poisson random measure with respect to the filtration $\mathbb{F}$ with intensity measure $\nu$ on $U$ . We will often write $\rho\in\mathcal{R}$ to denote the process $\rho$ .

The space of relaxed feedback controls is the set

[TABLE]

where $\widehat{\gamma}:[0,T]\times\Sigma\longrightarrow\mathcal{P}(A)$ is measurable, $\mathcal{P}(A)$ is endowed with the topology of weak convergence, and the filtered probability space, the initial condition and the Poisson random measure are as above. We will often write $\widehat{\gamma}\in\widehat{\mathbb{A}}$ to denote the function $\widehat{\gamma}$ .

The relaxed feedback control is given with the probability space and the noise, in analogy with Definition 3. Because of (2.14), the derivative $(\rho_{t}(E))_{0\leq t\leq T}$ is an $\mathbb{F}$ -predictable process for any $E\in\mathcal{B}(A)$ . An ordinary open-loop control $\alpha\in\mathcal{A}$ can be viewed as a relaxed control $\rho^{\alpha}\in\mathcal{R}$ in which the derivative in time is a Dirac measure:

[TABLE]
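The embedding of ordinary controls into relaxed ones, together with the weak topology on $\mathcal{D}$ , can be illustrated numerically. The following is a minimal sketch under stated assumptions: the action set $A=\{0,1\}$ , the test function `phi` and all helper names are hypothetical and not taken from the paper. It compares the integral of `phi` against a relaxed control whose time derivative is the fixed mixture $\frac{1}{2}\delta_{0}+\frac{1}{2}\delta_{1}$ with the integral against ordinary controls that switch between the two actions faster and faster.

```python
import math

T = 1.0
STEPS = 25600  # multiple of every 2n used below, so switching times sit on cell boundaries


def phi(t, a):
    # Hypothetical continuous test function on [0, T] x A.
    return math.cos(3.0 * t) * (1.0 + a) + a * t


def integral_ordinary(alpha, steps=STEPS):
    """Integrate phi against rho^alpha(dt, da) = delta_{alpha(t)}(da) dt (midpoint rule)."""
    h = T / steps
    return sum(phi((k + 0.5) * h, alpha((k + 0.5) * h)) for k in range(steps)) * h


def integral_relaxed(weights, steps=STEPS):
    """Integrate phi against rho(dt, da) = rho_t(da) dt, rho_t = sum_a weights[a] delta_a."""
    h = T / steps
    return sum(w * phi((k + 0.5) * h, a)
               for k in range(steps) for a, w in weights.items()) * h


def chattering(n):
    """Ordinary control alternating between actions 0 and 1 on 2n equal subintervals."""
    return lambda t: float(int(math.floor(2 * n * t / T)) % 2)


target = integral_relaxed({0.0: 0.5, 1.0: 0.5})
errors = [abs(integral_ordinary(chattering(n)) - target) for n in (1, 4, 16, 64)]
print(errors)
```

In this toy setting the discrepancy shrinks as the switching frequency grows, which is the kind of approximation of relaxed controls by ordinary ones that the weak topology is designed to capture.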

We also have to introduce the corresponding relaxed Poisson measure in order to have well-defined dynamics. This will be done properly in Appendix A. Given any $\rho\in\mathcal{R}$ , Borel sets $U_{0}\subseteq U$ , $A_{0}\subseteq A$ , the relaxed Poisson measure $\mathcal{N}_{\rho}$ related to the relaxed control $\rho$ has the property that the processes

[TABLE]

are $\mathbb{F}$ -martingales, and are orthogonal for disjoint $U_{0}\times A_{0}$ . This martingale property and the fact that $\mathcal{N}_{\rho}$ is a counting measure valued process define the distribution of $\mathcal{N}_{\rho}$ and the joint law of $(\mathcal{N}_{\rho},\rho,\xi,\mathcal{N})$ uniquely (see Appendix A). The martingale property (2.17) also implies that the process

[TABLE]

is an $\mathbb{F}$ -martingale, for any bounded and measurable $\varphi$ . For an ordinary control $\alpha\in\mathcal{A}$ (or the relaxed control it induces), the corresponding relaxed Poisson measure is explicitly given by

[TABLE]

The stochastic differential equation (2.10) in this more general framework with a relaxed Poisson measure is written as

[TABLE]

for any relaxed control $\rho\in\mathcal{R}$ and $m\in\mathcal{L}$ .

Given a relaxed feedback control $\widehat{\gamma}\in\widehat{\mathbb{A}}$ and a process $X$ , define the corresponding relaxed open-loop control through

[TABLE]

Let $\mathcal{N}_{\rho^{\widehat{\gamma},X}}$ be the relaxed Poisson measure corresponding to $\rho^{\widehat{\gamma},X}$ . Equation (2.20) then becomes

[TABLE]

where the solution process $X$ appears also in the relaxed Poisson measure.

The proof of the following lemma is given in Appendix A.1.

Lemma 1.

For any $m\in\mathcal{L}$ and $\rho\in\mathcal{R}$ , respectively $\widehat{\gamma}\in\widehat{\mathbb{A}}$ , there exists a pathwise unique solution to the stochastic differential equation (2.20), respectively (2.22).

The solutions to (2.20) and (2.22) will be denoted by $X_{\rho,m}$ and $X_{\widehat{\gamma},m}$ respectively. For $\widehat{\gamma}\in\widehat{\mathbb{A}}$ , let $\rho^{\widehat{\gamma}}$ denote the corresponding relaxed control defined by (2.21), that is, $\rho^{\widehat{\gamma}}$ is the relaxed open-loop control such that

[TABLE]

In view of Definition 6, the relaxed open-loop control $\rho^{\widehat{\gamma}}$ has to be given together with a filtered probability space, an initial condition and a Poisson random measure, which we impose to be the same as those coming with the relaxed feedback control $\widehat{\gamma}$ .

Let $\rho\in\mathcal{R}$ and $m\in\mathcal{L}$ . Let $X=X_{\rho,m}$ . Thanks to the martingale property (2.18), we obtain that the process

[TABLE]

is an $\mathbb{F}$ -martingale, for any $g\in\mathbb{R}^{d}$ . This yields the Dynkin formula

[TABLE]

The cost to be minimized is

[TABLE]

Define also $J(\widehat{\gamma},m):=J(\rho^{\widehat{\gamma}},m)$ for $\widehat{\gamma}\in\widehat{\mathbb{A}}$ . The definitions of relaxed solution of the mean field game (2.20) and relaxed feedback solution are analogous to Definition 4, where ordinary controls are replaced by relaxed controls.

Definition 7.

A relaxed solution of the mean field game (2.10) is a triple

[TABLE]

such that

(1) $\left((\Omega,\mathcal{F},P;\mathbb{F}),\rho,\xi,\mathcal{N}\right)\in\mathcal{R}$ , $m\in\mathcal{L}$ , $(X(t))_{t\in[0,T]}$ is adapted to the filtration $\mathbb{F}$ and $X=X_{\rho,m}$ ;

(2) Optimality: $J(\rho,m)\leq J(\sigma,m)$ for every $\sigma\in\mathcal{R}$ ;

(3) Mean Field Condition: $Law(X(t))=m(t)$ for every $t\in[0,T]$ .

We say that $\left(\left((\Omega,\mathcal{F},P;\mathbb{F}),\widehat{\gamma},\xi,\mathcal{N}\right),m,X\right)$ is a relaxed feedback solution of the mean field game if $\widehat{\gamma}\in\widehat{\mathbb{A}}$ and $\left(\left((\Omega,\mathcal{F},P;\mathbb{F}),\rho^{\widehat{\gamma}},\xi,\mathcal{N}\right),m,X\right)$ is a relaxed solution of the mean field game, where $\rho^{\widehat{\gamma}}$ is defined in (2.23).

In our writing, we will often drop the filtered probability space and the Poisson random measure from the notation.

In Section 3.2 we will show the existence of relaxed mean field game solutions via a fixed point argument, while existence of a relaxed feedback mean field game solution is established in Section 3.3.

We will use the characterization of solutions to (2.20) via the controlled martingale problem. The proof of the following lemma is omitted; it can be obtained by adapting the proof of Theorem 2.8.1 in Kushner (1990, p. 42).

Lemma 2.

Let $\left((\Omega^{\prime},\mathcal{F}^{\prime},P^{\prime};\mathbb{F}^{\prime}),\rho,\xi,\mathcal{N}\right)\in\mathcal{R}$ and $m\in\mathcal{L}$ . Then $X$ solves equation (2.20) in distribution if and only if the process $M_{g}^{X}(t)$ defined in (2.24) is an $\mathbb{F}$ -martingale for any $g\in\mathbb{R}^{d}$ . The underlying filtered probability space can always be assumed to be $D([0,T],\Sigma)\times\Omega$ , where $\Omega$ is the canonical space for $(\mathcal{N}_{\rho},\rho,\xi,\mathcal{N})$ defined in Appendix A, $\mathbb{F}$ the canonical filtration, and $X$ is the canonical process.

The martingale property holds if and only if

[TABLE]

for every $h:\Sigma^{j}\rightarrow\mathbb{R}$ and every choice of $j$ , $t$ , $s$ , $t_{i}$ , $i=1,\ldots,j$ such that $0\leq t_{i}\leq t\leq t+s$ .

In Section 4, under additional assumptions, we will prove existence of feedback mean field game solutions (not relaxed); such solutions will be shown to be unique either if the time horizon is small or if the Lasry-Lions monotonicity assumptions apply.

2.5. Example

We show that our assumptions are satisfied for a natural form of the function $f$ for which, when $\alpha$ and $m$ are constant, the transition rates of the Markov chain $X$ that solves the dynamics (2.10) appear explicitly. Consider then $f$ defined by

[TABLE]

and the intensity measure $\nu$ on $U\in\mathcal{B}(\mathbb{R}^{d})$ defined by

[TABLE]

where $U_{y}:=\left\{u\in U:u_{z}=0\quad\forall z\neq y\right\}$ , which is viewed as a subset of $\mathbb{R}$ , and $\ell$ is the Lebesgue measure on $\mathbb{R}$ .

The function $\lambda$ appearing in (2.28) yields the transition rates of the Markov chain $X$ solution of (2.10), that is, for $x\neq y$ , as $h\rightarrow 0$ ,

[TABLE]

Moreover, the measure $\nu$ defined in (2.29) has the property that

[TABLE]

for any bounded and measurable $\varphi:\mathbb{R}^{d}\longrightarrow\mathbb{R}$ . In particular,

[TABLE]

for any function $\varphi_{y}:\mathbb{R}\longrightarrow\mathbb{R}$ such that $\varphi_{y}(0)=0$ , $y\in\Sigma$ .

If we want $f$ to depend also on a control and a flow of measures, we may consider the rate $\lambda$ to depend also on $a\in A$ and $p\in S$ , so that (2.28) is rewritten as

[TABLE]

We also assume that $\lambda$ is bounded by a constant $M$ (which holds a posteriori by the assumptions of the next lemma) and $U:=[0,M]^{d}$ . With this $f$ , (2.30) becomes

[TABLE]

where $X=X_{\alpha,m}$ is the solution of (2.10) under the control $\alpha\in\mathcal{A}$ and flow of measures $m\in\mathcal{L}$ and $E_{t,x}$ denotes expectation with respect to the conditional probability $P[\cdot|X(t)=x]$ provided $P(X(t)=x)>0$ . In particular, if $\alpha(t)=\gamma(X(t))$ , then the transition rate is $\lambda(t,x,y,\gamma(x),m(t))$ . A proof of (2.30), (2.31) and (2.34) can be found in Turchi (2015), where the examples (2.28) and (2.33) were treated.
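The role of the rate $\lambda$ in this example can be checked by simulation. The sketch below is only an illustration under stated assumptions: it takes time-homogeneous rates without control or measure dependence, labels the states $0,\ldots,d-1$ , and assumes that a point of the Poisson random measure carried by the axis $U_{y}$ with mark $u_{y}$ moves the chain from $x$ to $y$ exactly when $u_{y}<\lambda(x,y)$ . The function name `simulate_chain` and the numerical values are hypothetical.

```python
import math
import random


def simulate_chain(x0, rates, M, T, rng):
    """Simulate X on {0,...,d-1} driven by a Poisson random measure on [0,T] x [0,M]^d.

    Assumed dynamics: a point (s, u) carried by axis y moves the chain from x
    to y when u_y < rates[x][y]; otherwise the state is unchanged.
    """
    d = len(rates)
    t, x = 0.0, x0
    while True:
        # Points of the measure arrive at total rate nu(U) = d * M.
        t += rng.expovariate(d * M)
        if t > T:
            return x
        y = rng.randrange(d)          # axis U_y carrying the point
        u_y = rng.uniform(0.0, M)     # its mark, uniformly spread over [0, M]
        if y != x and u_y < rates[x][y]:
            x = y


rates = [[0.0, 1.5], [0.5, 0.0]]      # hypothetical two-state rates, bounded by M
M, T, n_paths = 2.0, 1.0, 20000
rng = random.Random(0)
freq = sum(simulate_chain(0, rates, M, T, rng) for _ in range(n_paths)) / n_paths

# Closed-form transition probability of a two-state chain, for comparison.
total = rates[0][1] + rates[1][0]
exact = rates[0][1] / total * (1.0 - math.exp(-total * T))
print(freq, exact)
```

Points on each axis arrive at rate $M$ and are accepted with probability $\lambda(x,y)/M$ , so the effective jump rate from $x$ to $y$ is $\lambda(x,y)$ ; the empirical frequency should therefore be close to the closed-form value up to Monte Carlo error.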

Let us check whether our assumptions on the model are satisfied for the above choice of $f$ and $\nu$ .

Lemma 3.

Let $f$ be defined by (2.33) and $\nu$ by (2.29).

• If the rate $\lambda$ appearing in (2.33) is continuous in $t,a$ and $p$ , then (A) holds;

• If in addition $\lambda$ is Lipschitz in $p$ , then (A’) holds;

• If in addition $\lambda$ is Lipschitz also in $a$ , then (A”) holds.

Proof.

Let $t,s\in[0,T]$ , $a,b\in A$ , $p,q\in S$ and fix $x\in\Sigma$ . Then

[TABLE]

Applying (2.32), the last expression above is equal to

[TABLE]

which gives the claims. ∎

In order to verify assumption (C), we need additional hypotheses on the structure of the model.

Lemma 4.

Let $f$ be defined by (2.33) and $\nu$ by (2.29). Assume that $A$ is a compact and convex subset of a metric topological vector space. Let the running cost $c$ be strictly convex in $a$ and the rate $\lambda$ appearing in (2.33) be affine (in the sense of being both convex and concave) in $a$ . Then assumption (C) is satisfied.

Proof.

We have $H(t,x,a,p,g)=\Lambda^{a,p}_{t}g(x)+c(t,x,a,p)$ where

[TABLE]

Applying formula (2.31) we obtain

[TABLE]

which is an affine function of $a$ if $\lambda$ is affine in $a$ . Therefore, $H$ is a strictly convex function of $a$ if $c$ is strictly convex, and thus it has a unique minimum in $A$ . ∎
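A small numerical instance may help to see why an affine rate and a strictly convex cost give a unique minimizer. The sketch below assumes a hypothetical two-state model with $A=[0,2]$ , rate $\lambda(a)=a$ and running cost $c(a)=a^{2}/2$ , so that $H$ reduces to $a\,\Delta+a^{2}/2$ with $\Delta=g(y)-g(x)$ ; none of these choices come from the paper. It compares a ternary search, which is valid for strictly convex functions, with the closed-form projection of $-\Delta$ onto $A$ .

```python
def pre_hamiltonian(a, delta):
    # Hypothetical instance: affine rate lambda(a) = a, strictly convex cost a^2 / 2.
    return a * delta + 0.5 * a * a


def argmin_convex(fun, lo, hi, iters=200):
    """Ternary search for the minimizer of a strictly convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if fun(m1) < fun(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def a_star(delta, lo=0.0, hi=2.0):
    # Closed form for this instance: projection of -delta onto A = [lo, hi].
    return min(max(-delta, lo), hi)


results = []
for delta in (-3.0, -1.2, 0.0, 0.7):
    numeric = argmin_convex(lambda a: pre_hamiltonian(a, delta), 0.0, 2.0)
    results.append((delta, numeric, a_star(delta)))
    print(delta, numeric, a_star(delta))
```

If the cost were merely convex, or the rate not affine, several minimizers could coexist and the search would return only one of them.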

3. Relaxed Mean Field Game Solutions

3.1. The space $\mathcal{L}$

In order to prove the existence of solutions we use a fixed point theorem. First of all, we want to find a suitable space where all the flows of probability measures lie. Set $K:=2\nu(U)\sqrt{d}$ and denote by

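$$\mathcal{L}:=\left\{m:[0,T]\longrightarrow S\;:\;|m(t)-m(s)|\leq K|t-s|\ \text{for all }s,t\in[0,T],\ m(0)=m_{0}\right\}\tag{3.1}$$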

the space of Lipschitz continuous flows of probability measures, with the same Lipschitz constant $K$ and initial point $m_{0}$ . This space is easily seen to be convex and compact with respect to the uniform norm, thanks to the Ascoli-Arzelà theorem. The following lemma allows us to restrict attention to flows of probability measures in $\mathcal{L}$ .

Lemma 5.

Let $\alpha\in\mathcal{A}$ , or $\rho\in\mathcal{R}$ , and let $m:[0,T]\longrightarrow S$ be any deterministic flow of probability measures. Then the flow of the solution process $Flow(X_{\alpha,m})$ , or $Flow(X_{\rho,m})$ , is in $\mathcal{L}$ .

Proof.

We prove the claim for relaxed controls, so the conclusion follows also when considering the subset of ordinary controls. Let $g:\Sigma\longrightarrow\mathbb{R}$ be a function, which is then Lipschitz and bounded and can be viewed as a vector in $\mathbb{R}^{d}$ . Denote $|g|_{\infty}:=\max_{x\in\Sigma}|g(x)|$ . Let $\rho\in\mathcal{R}$ and $m$ be fixed, and set $X=X_{\rho,m}$ . The function $m:[0,T]\rightarrow S$ has a priori no regularity, except for being measurable. By the Dynkin formula (2.25) we have, for any $0\leq s\leq t\leq T$ ,

[TABLE]

Hence

[TABLE]

thanks to the fact that $\rho_{r}$ is a probability measure on $A$ for any $r$ . Clearly, $E[g(X(t))]=g\cdot Law(X(t))$ . Thus, for any $t$ and $s$ ,

[TABLE]

which gives the claim. ∎
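The Lipschitz bound of Lemma 5 can be observed on a toy model. The sketch below is an illustration under assumptions that are not part of the paper: states are labelled $0,\ldots,d-1$ , the chain has hypothetical time-dependent rates bounded by $M$ , the intensity measure is taken as in the example of Section 2.5 so that $\nu(U)=dM$ , and the flow is computed by an explicit Euler scheme for the forward equation.

```python
import math


def flow(m0, rate, T, steps):
    """Explicit Euler scheme for the forward equation dm/dt = m Q(t)."""
    d, h = len(m0), T / steps
    path = [list(m0)]
    for k in range(steps):
        m, t = path[-1], k * h
        new = list(m)
        for x in range(d):
            for y in range(d):
                if x != y:
                    mass = h * m[x] * rate(t, x, y)   # mass moved from x to y
                    new[x] -= mass
                    new[y] += mass
        path.append(new)
    return path


d, M, T, steps = 3, 2.0, 1.0, 2000


def rate(t, x, y):
    # Hypothetical time-dependent rates with values in [0, M].
    return M * (0.5 + 0.5 * math.sin(5.0 * t + x + 2 * y))


path = flow([1.0, 0.0, 0.0], rate, T, steps)
K = 2.0 * (d * M) * math.sqrt(d)      # K = 2 nu(U) sqrt(d), with nu(U) = d * M assumed
h = T / steps
# Largest observed difference quotient |m(t) - m(s)| / |t - s| on a coarse sub-grid.
worst = max(
    math.dist(path[i], path[j]) / ((j - i) * h)
    for i in range(0, steps, 50) for j in range(i + 1, steps + 1, 50)
)
print(worst, K)
```

The observed difference quotients stay below $K$ whatever the time dependence of the rates, in line with the fact that the bound in the lemma does not depend on the control or on $m$ .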

3.2. Existence of relaxed mean field game solutions

3.2.1. Tightness and continuity for $m$ fixed

Consider a sequence of random variables

[TABLE]

where $\rho^{n}$ is a relaxed control, $\mathcal{N}_{\rho^{n}}$ is the related relaxed Poisson measure and $X^{n}=X_{\rho^{n},m}$ , $m\in\mathcal{L}$ is fixed. The state space of these random variables is $D([0,T],\Sigma)\times\mathcal{D}\times\mathcal{M}$ , where $\mathcal{M}=\mathcal{M}([0,T]\times U\times A)$ denotes the set of finite positive measures on $[0,T]\times U\times A$ endowed with the topology of weak convergence.

The following is of fundamental importance, and is similar to Theorem 13.2.1 in Kushner and Dupuis (2001, p. 363).

Theorem 1.

Assume (A) and (B). Then

(1) any sequence of the form (3.2) is tight;

(2) the limit in distribution $(X,\rho,\widetilde{\mathcal{N}})$ of any converging subsequence is such that $\widetilde{\mathcal{N}}$ is the relaxed Poisson measure related to the relaxed control $\rho$ and $X=X_{\rho,m}$ in distribution;

(3) $J(\rho,m)$ is continuous in $\rho$ .

Proof.

(1) The sequence of relaxed controls is tight as $\mathcal{D}$ is compact. For any $\varepsilon>0$ , the set

[TABLE]

is compact in $\mathcal{M}$ , since $[0,T]\times U\times A$ is compact. From (2.13) and the martingale property (2.17), it follows that $\mathcal{N}_{\rho^{n}}(t,U,A)-t\nu(U)$ is a martingale for any $n$ and so $E[\mathcal{N}_{\rho^{n}}(T,U,A)]=T\nu(U)$ . Therefore, by Chebychev’s inequality,

[TABLE]

for any $n$ , which shows that the sequence of relaxed Poisson measures is tight. The properties of the stochastic integral give

[TABLE]

for any $\mathbb{F}$ -stopping time $\tau$ , uniformly in $n$ , which yields the tightness of the processes in $D([0,T],\Sigma)$ by Aldous’s criterion (Aldous, 1978).

(2) By abuse of notation, denote by $(X^{n},\rho^{n},\mathcal{N}_{\rho^{n}})$ the subsequence which converges in distribution to $(X,\rho,\widetilde{\mathcal{N}})$ . From the martingale property (2.17), it follows that $\widetilde{\mathcal{N}}(t,U_{0},A_{0})-\nu(U_{0})\rho(t,A_{0})$ is a martingale for any Borel sets $A_{0}\subset A$ and $U_{0}\subset U$ , where the limiting measure is defined on the canonical space and the filtration is the canonical filtration (both defined in Appendix A). The limit random measure $\widetilde{\mathcal{N}}$ is integer valued (Theorem 15.7.4 in Kallenberg, 1986), so the uniqueness property says that $\widetilde{\mathcal{N}}=\mathcal{N}_{\rho}$ in distribution. The claim $X=X_{\rho,m}$ in distribution will be shown also in the proof of Theorem 2, where $m$ is not fixed, using the controlled martingale problem, so we do not repeat the argument here.

(3) $\lim_{n\rightarrow\infty}J(\rho_{n},m)=J(\rho,m)$ since $c$ and $\psi$ are bounded and continuous by assumption (B). ∎

By the chattering lemma, which we will present later as Lemma 8, we have

[TABLE]

The minimum on the left hand side exists by the above Theorem 1. The infimum on the right hand side is actually a minimum, too; see Theorem 4 below, where the existence of optimal feedback controls will be shown. However, there might exist more optima among relaxed open-loop controls than among ordinary feedback controls.

3.2.2. Fixed point argument

Let $2^{\mathcal{L}}$ be the set of subsets of $\mathcal{L}$ and define the point-to-set map $\Phi:\mathcal{L}\longrightarrow 2^{\mathcal{L}}$ by

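$$\Phi(m):=\left\{Flow(X_{\rho,m})\;:\;\rho\in\mathcal{R},\ J(\rho,m)\leq J(\sigma,m)\ \text{for every }\sigma\in\mathcal{R}\right\}\tag{3.3}$$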

A flow $m\in\mathcal{L}$ is called a fixed point of this point-to-set map if $m\in\Phi(m)$ . We need this map since the optimal control is not necessarily unique.

By construction, $\Phi$ has a fixed point if and only if there exists a relaxed solution to the mean field game, in the sense of Definition 7. In order to prove the existence of a fixed point, we are going to apply Theorem 1 in Fan (1952), which requires the following definition.

Definition 8.

Let $\mathcal{L}$ be a metric space. A map $\Phi:\mathcal{L}\longrightarrow 2^{\mathcal{L}}$ is said to have closed graph if $m_{n}\in\mathcal{L}$ , $y_{n}\in\mathcal{L}$ , $y_{n}\in\Phi(m_{n})$ for any $n\in\mathbb{N}$ and $m_{n}\rightarrow m$ , $y_{n}\rightarrow y$ in $\mathcal{L}$ implies $y\in\Phi(m)$ .

Proposition 1 (Ky Fan).

Let $\mathcal{L}$ be a non empty, compact and convex subset of a locally convex metric topological vector space. Let $\Phi:\mathcal{L}\longrightarrow 2^{\mathcal{L}}$ have closed graph and assume that $\Phi(m)$ is non empty and convex for any $m\in\mathcal{L}$ . Then the set of fixed points of $\Phi$ is non empty and compact.

By means of this proposition we are now able to state and prove the following main theorem concerning existence of relaxed solutions, while uniqueness is not guaranteed.

Theorem 2.

Under assumptions (A) and (B) there exists at least one relaxed solution of the mean field game (2.20).

Proof.

We want to show the existence of a fixed point for the map $\Phi:\mathcal{L}\longrightarrow 2^{\mathcal{L}}$ defined in (3.3), applying Proposition 1. Recall that any element of $\Phi(m)$ is in $\mathcal{L}$ by Lemma 5, and the set $\mathcal{L}$ defined in (3.1) is a compact and convex subset of $\mathcal{C}([0,T],S)$ endowed with the uniform norm. By Theorem 1, $\Phi(m)$ is non empty for any $m$ . It remains to prove that $\Phi(m)$ is convex and $\Phi$ has closed graph.

$\Phi(m)$ is convex. Let $m$ be fixed and let $\rho_{1},\rho_{2}\in\mathcal{R}$ be such that $Flow(X_{\rho_{1},m})$ and $Flow(X_{\rho_{2},m})$ belong to $\Phi(m)$ , i.e. $\rho_{1}$ and $\rho_{2}$ are optimal controls for $m$ , and take $\theta\in[0,1]$ . Let $\zeta$ be a Bernoulli random variable with parameter $\theta$ , $\mathcal{F}_{0}$ -measurable and independent of $\rho_{1}$ and $\rho_{2}$ . Define $\rho_{3}\in\mathcal{R}$ by

[TABLE]

for any $E\in\mathcal{B}(A)$ and $t\in[0,T]$ . We have

[TABLE]

for every $G\in\mathcal{C}_{b}(D([0,T],\Sigma),\mathbb{R})$ . This implies that

[TABLE]

and then in particular

[TABLE]

Since $\rho_{1}$ and $\rho_{2}$ are optimal for $m$ we have, thanks to (3.4),

[TABLE]

for any $\sigma\in\mathcal{R}$ , which means that also $\rho_{3}$ is optimal for $m$ and hence (3.5) says that $\Phi(m)$ is convex.

$\Phi$ has closed graph. Let $m_{n},y_{n},m,y\in\mathcal{L}$ be such that $m_{n}\rightarrow m$ , $y_{n}\rightarrow y$ in $\mathcal{L}$ and $y_{n}\in\Phi(m_{n})$ for every $n\in\mathbb{N}$ . We have to prove that $y\in\Phi(m)$ . Let $\rho_{n}\in\mathcal{R}$ be optimal for $m_{n}$ and such that $y_{n}=Flow(X_{\rho_{n},m_{n}})$ . Set $X_{n}:=X_{\rho_{n},m_{n}}$ and let $\mathcal{N}_{n}:=\mathcal{N}_{\rho_{n}}$ be the relaxed Poisson measure related to $\rho_{n}$ .

The tightness of the sequence $(X_{n},\rho_{n},\mathcal{N}_{n})$ is proved as in Theorem 1. Let $(X_{n_{k}},\rho_{n_{k}},\mathcal{N}_{n_{k}})$ be a subsequence which converges in distribution to $(X,\rho,\widetilde{\mathcal{N}})$ . We have $\widetilde{\mathcal{N}}=\mathcal{N}_{\rho}$ in distribution, i.e. it is the relaxed Poisson measure related to $\rho$ . In order to prove that $X=X_{\rho,m}$ in distribution, we use the controlled martingale problem formulation stated in Lemma 2, and hence let us assume that the processes are defined in the canonical space.

Property (2.27) holds for $X_{n_{k}}$ , $\rho_{n_{k}}$ and $m_{n_{k}}$ , any $k\in\mathbb{N}$ . Let $M^{n_{k}}_{g}$ denote the process defined by

[TABLE]

for any $g\in\mathbb{R}^{d}$ . Property (2.27) and the convergence in distribution of the sequence $(X_{n_{k}},\rho_{n_{k}},\mathcal{N}_{n_{k}})$ imply that

[TABLE]

thanks to continuity assumption (A), uniform convergence of $m_{n}$ and (2.16). Therefore we have proved that $X=X_{\rho,m}$ in distribution.

Thus we obtain

[TABLE]

which implies the convergence

[TABLE]

that is, $Flow(X_{n_{k}})\rightarrow Flow(X)$ uniformly. This establishes convergence only along a subsequence; however, by hypothesis the whole sequence $Flow(X_{n})=y_{n}$ converges to $y$ in $\mathcal{L}$ , hence $y=Flow(X)=Flow(X_{\rho,m})$ .

It remains to prove that $\rho$ is optimal for $m$ . Again the convergence in distribution of the sequence $(X_{n_{k}},\rho_{n_{k}},\mathcal{N}_{n_{k}})$ implies that $\lim_{k}J(\rho_{n_{k}},m_{n_{k}})=J(\rho,m)$ thanks to continuity assumption (B), uniform convergence of $m_{n}$ and (2.16). Then from the optimality of $\rho_{n}$ for $m_{n}$ , i.e. $J(\rho_{n_{k}},m_{n_{k}})\leq J(\sigma,m_{n_{k}})$ for every $\sigma\in\mathcal{R}$ , taking the limit as $k\rightarrow\infty$ we get $J(\rho,m)\leq J(\sigma,m)$ for every $\sigma\in\mathcal{R}$ , which means that $\rho$ is optimal for $m$ and thus $y=Flow(X_{\rho,m})\in\Phi(m)$ as required. ∎

3.3. Relaxed feedback mean field game solutions

Theorem 2 provides a relaxed (open-loop) solution of the mean field game (2.20). Under the same assumptions we obtain here a relaxed feedback mean field game solution which has the same cost and flow as the open-loop one. This result is similar to Theorem 3.7 in Lacker (2015) and will provide approximate feedback Nash equilibria for the $N$ -player game.

Theorem 3.

Assume (A) and (B) and let $\left(((\Omega,\mathcal{F},P;\mathbb{F}),\rho,\xi,\mathcal{N}),m,X_{\rho,m}\right)$ be a relaxed mean field game solution. Then there exists a relaxed feedback control $\widehat{\gamma}\in\widehat{\mathbb{A}}$ such that the tuple $\left(((\Omega,\mathcal{F},P;\mathbb{F}),\widehat{\gamma},\xi,\mathcal{N}),m,X_{\widehat{\gamma},m}\right)$ is a relaxed feedback mean field game solution; namely

[TABLE]

Proof.

Fix the flow $m\in\mathcal{L}$ and set $X=X_{\rho,m}$ . We claim that there exists a measurable function $\widehat{\gamma}:[0,T]\times\Sigma\longrightarrow\mathcal{P}(A)$ such that

[TABLE]

This holds if and only if

[TABLE]

for any bounded and measurable $\varphi:[0,T]\times\Sigma\times A\longrightarrow\mathbb{R}$ . In order to construct $\widehat{\gamma}$ , define the probability measure $\Theta$ on $[0,T]\times\Sigma\times A$ by

[TABLE]

Then build $\widehat{\gamma}$ by disintegration of $\Theta$ :

[TABLE]

where $\Theta_{1}$ denotes the $[0,T]\times\Sigma$ marginal of $\Theta$ and $\widehat{\gamma}:[0,T]\times\Sigma\longrightarrow\mathcal{P}(A)$ is measurable. Following Lacker (2015), we show that such $\widehat{\gamma}$ satisfies (3.8): for every bounded and measurable $h:[0,T]\times\Sigma\longrightarrow\mathbb{R}$ we get

[TABLE]

which provides (3.8) thanks to Lemma 5.2 in Brunick and Shreve (2013).

Having $\widehat{\gamma}$ , (3.8) yields

[TABLE]

$\ell\otimes P$ -almost everywhere.

Then we solve equation (2.22) on the same probability space as $X$ , under the relaxed feedback control $\widehat{\gamma}$ , and denote by $Y=X_{\widehat{\gamma},m}$ its solution. By the Dynkin formula (2.25), we have for any $g\in\mathbb{R}^{d}$ ,

[TABLE]

and then thanks to (3.8)

[TABLE]

while Dynkin’s formula for $Y$ yields

[TABLE]

Comparing (3.9) and (3.10) we obtain that $Law(X(t))$ and $Law(Y(t))$ , which are vectors in $S\subset\mathbb{R}^{d}$ , satisfy the same ODE in integral form, namely

[TABLE]

for any $g\in\mathbb{R}^{d}$ , the unknown being denoted by $\pi:[0,T]\rightarrow S$ . Taking $g=e_{j}$ , $j=1,\ldots,d$ , the corresponding system of ODEs, which is clearly linear in $\pi$ , has a unique absolutely continuous solution $\pi\in\mathcal{L}$ , hence (3.6) is proved.

Similarly, (3.8) gives

[TABLE]

and then we use (3.6) to conclude that

[TABLE]

∎
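The disintegration step used in the proof above to build $\widehat{\gamma}$ has an elementary finite analogue. The sketch below is only an illustration under assumptions: it freezes a single time $t$ , takes a hypothetical joint law of the pair (state, action) on a finite set, and splits it into the state marginal and a kernel from states to probability measures on actions. The names `disintegrate`, `theta` and `phi` are not from the paper.

```python
from collections import defaultdict


def disintegrate(theta):
    """Split a joint pmf theta[(x, a)] into its x-marginal and a kernel x -> P(A)."""
    marginal = defaultdict(float)
    for (x, _a), p in theta.items():
        marginal[x] += p
    kernel = defaultdict(dict)
    for (x, a), p in theta.items():
        if marginal[x] > 0.0:
            kernel[x][a] = p / marginal[x]
    return dict(marginal), dict(kernel)


# Hypothetical joint law of (X(t), action) at one fixed time t.
theta = {(1, "a0"): 0.10, (1, "a1"): 0.30, (2, "a0"): 0.45, (2, "a1"): 0.15}
theta_1, gamma_hat = disintegrate(theta)


def phi(x, a):
    # Hypothetical bounded test function on Sigma x A.
    return x * (2.0 if a == "a1" else -1.0)


# Integrating phi against theta or against marginal-times-kernel gives the same value.
lhs = sum(p * phi(x, a) for (x, a), p in theta.items())
rhs = sum(theta_1[x] * q * phi(x, a)
          for x in theta_1 for a, q in gamma_hat[x].items())
print(gamma_hat, lhs, rhs)
```

The kernel plays the role of the conditional law of the action given the current state, which is what allows the open-loop relaxed control to be replaced by a relaxed feedback control without changing the one-dimensional marginals.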

4. Feedback Mean Field Game Solutions

4.1. Feedback optimal control for $m$ fixed

We show the existence of an optimal non-relaxed feedback control $\gamma_{m}$ for $J(\alpha,m)$ for any $m$ , using the verification theorem for the related Hamilton-Jacobi-Bellman equation. Let $m\in\mathcal{L}$ be fixed.

For any $t\in[0,T]$ , $x\in\Sigma$ and $\alpha\in\mathcal{A}$ let $X_{\alpha}^{t,x}$ be the solution to

[TABLE]

and set

[TABLE]

Next, define the value function by

[TABLE]

Recall that the generator was defined in (2.7) by

[TABLE]

for any $t,x,a,p$ and $g\in\mathbb{R}^{d}$ . For a function $v=v(t,x)$ the generator will be applied to the space variable, i.e. denote $\Lambda^{a,p}_{t}v(t,x)=\Lambda^{a,p}_{t}v(t,\cdot)(x)$ .

Thanks to Theorem D.5 in Hernández-Lerma and Lasserre (1996) on measurable selectors, there exists a feedback control $\gamma_{m}\in\mathbb{A}$ (i.e. measurable) such that

[TABLE]

where $V_{m}$ is the value function (4.2). Let us remark that the above minimum exists for any $t$ and $x$ if (A) and (B) hold, as the right hand side turns out to be a continuous function of the variable $a$ , since the value function is trivially Lipschitz continuous in $x$ .

Theorem 4.

Assume (A) and (B). Let $m\in\mathcal{L}$ . Then any feedback control $\gamma_{m}$ defined by (4.3) is optimal, that is, $J(\gamma_{m},m)\leq J(\alpha,m)$ for any $\alpha\in\mathcal{A}$ .

In order to prove Theorem 4, we use the Hamilton-Jacobi-Bellman equation of the problem (see, for instance, Chapter 3 in Fleming and Soner (2006)):

[TABLE]

for a function $v\!:[0,T]\times\Sigma\rightarrow\mathbb{R}$ . Let us define, for $g\in\mathbb{R}^{d}\equiv\{\Sigma\rightarrow\mathbb{R}\}$ ,

[TABLE]

Since $\Sigma$ is finite, we shall denote $W_{x}(t):=v(t,x)$ , $W(t):=(W_{1}(t),\ldots,W_{d}(t))$ , $F_{x}(t,g):=G(t,x,g)$ , $F(t,g):=(F_{1}(t,g),\ldots,F_{d}(t,g))$ , $\Psi_{x}:=\psi(x,m(T))$ and $\Psi:=(\Psi_{1},\ldots,\Psi_{d})$ . Therefore (4.4) can be written as

[TABLE]

which is in fact an ODE.

Define a classical solution to (4.5) as an absolutely continuous function $W$ from $[0,T]$ to $\mathbb{R}^{d}$ such that $W(t)=\Psi+\int_{t}^{T}F(s,W(s))ds$ for every $t\in[0,T]$ . We apply to our problem the following verification theorem, which is a version of Theorem 3.8.1 in Fleming and Soner (2006, p. 135):

Proposition 2 (Verification).

Let $v$ be a classical solution to (4.5), and let $\gamma_{m}$ be any feedback control such that (4.3) holds for Lebesgue almost every $t$ . Then

[TABLE]

for any $t\in[0,T]$ and $x\in\Sigma$ , where $V_{m}$ is the value function (4.2).

We are now in the position to prove Theorem 4.

Proof of Theorem 4.

In view of Proposition 2, we only have to show that there exists a classical solution to (4.5). Hence it is enough to prove that $F=F(t,w)$ is globally Lipschitz continuous in $w\in\mathbb{R}^{d}$ , uniformly in $t\in[0,T]$ . So let $t$ be fixed and take $w,z\in\mathbb{R}^{d}$ and $x\in\Sigma$ . Recall that

[TABLE]

and let $b$ be a minimizer for $F_{x}(t,z)$ . Then

[TABLE]

Exchanging the roles of $w$ and $z$ we obtain the converse inequality, hence

[TABLE]

for any $x$ , which implies

[TABLE]

Therefore $F$ is Lipschitz continuous in $w$ in the norm $||w||=\max_{y\in\Sigma}\left|w_{y}\right|$ , which is equivalent to the Euclidean norm in $\mathbb{R}^{d}$ . ∎
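Since (4.5) is an ordinary differential equation with Lipschitz right hand side, it can be integrated backward in time by standard schemes. The following is a minimal sketch under assumptions that go beyond the text: the generator is taken in the rate form of Section 2.5, that is $\Lambda g(x)=\sum_{y}\lambda(t,x,y,a)\,[g(y)-g(x)]$ , the flow $m$ is absorbed into the hypothetical callables `rate` and `cost`, states are labelled $0,\ldots,d-1$ , and the minimum over $A$ is replaced by a minimum over a finite grid of actions.

```python
def solve_hjb(d, actions, rate, cost, terminal, T, steps):
    """Explicit backward Euler for W(t) = Psi + int_t^T F(s, W(s)) ds.

    F_x(t, g) = min_a [ sum_y rate(t, x, y, a) * (g[y] - g[x]) + cost(t, x, a) ].
    Returns grid values W[k][x] approximating v(k h, x) and a table of minimizing actions.
    """
    h = T / steps
    W = [None] * (steps + 1)
    gamma = [None] * steps
    W[steps] = [terminal(x) for x in range(d)]
    for k in range(steps - 1, -1, -1):
        t, g = (k + 1) * h, W[k + 1]
        W[k], gamma[k] = [], []
        for x in range(d):
            def hamiltonian(a):
                jump = sum(rate(t, x, y, a) * (g[y] - g[x])
                           for y in range(d) if y != x)
                return jump + cost(t, x, a)
            best = min(actions, key=hamiltonian)   # minimizer over the action grid
            gamma[k].append(best)
            W[k].append(g[x] + h * hamiltonian(best))
    return W, gamma


# Hypothetical two-state instance: the action is the jump rate, state 1 is costly.
actions = [i / 10.0 for i in range(21)]            # grid on A = [0, 2]
rate = lambda t, x, y, a: a
cost = lambda t, x, a: float(x) + 0.5 * a * a
W, gamma = solve_hjb(2, actions, rate, cost, lambda x: 0.0, T=1.0, steps=1000)
print(W[0], gamma[0])
```

The table `gamma` is a discrete stand-in for a feedback control selected as in (4.3); in this instance the player in the costly state pays a quadratic cost to jump, while the player in the cheap state does not move.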

4.2. Uniqueness of the feedback control for $m$ fixed

Consider the pre-Hamiltonian, as defined in (2.8),

[TABLE]

for $(t,x,a,p)\in[0,T]\times\Sigma\times A\times S$ and $g\in\mathbb{R}^{d}$ . We make the additional assumption (C); so let us recall that $a^{\ast}(t,x,p,g)$ is the unique minimizer of $H(t,x,a,p,g)$ in $a\in A$ . Define for $m\in\mathcal{L}$ the feedback control

[TABLE]

where $V_{m}$ is the value function (4.2).

Theorem 5.

Assume (A), (B) and (C). Given $m\in\mathcal{L}$ , let $\sigma\in\mathcal{R}$ be any optimal relaxed control for $m$ and let $X_{\sigma,m}$ be the corresponding solution to (2.20). Then $\sigma_{t}=\gamma_{m}(t,X_{\sigma,m}(t))$ for $\ell\otimes P$ -almost every $(t,\omega)$ , that is, $\sigma$ corresponds to the feedback control $\gamma_{m}$ .

This result and the proof of Theorem 2 imply that any relaxed solution of the mean field game must correspond to a feedback solution:

Corollary 1.

Assume (A), (B) and (C). Then there exists a feedback solution $(\gamma,m,X)$ of the mean field game, and any solution is such that its control coincides with $\gamma_{m}$ .

Let $Q\in\mathcal{P}(A)$ , and define

[TABLE]

Lemma 6.

If $H$ is continuous in $a$ , then

[TABLE]

for any $t,x,p$ and $g$ . Moreover, if (C) holds, then there exists a unique $Q^{\ast}\in\mathcal{P}(A)$ such that

[TABLE]

and $Q^{\ast}=\delta_{a^{\ast}}$ , where $a^{\ast}=a^{\ast}(t,x,p,g)$ .

Proof.

If $H$ is continuous in $a$ , then $\widetilde{H}$ is continuous in $Q\in\mathcal{P}(A)$ in the weak topology. Since $\mathcal{P}(A)$ is compact, there exists a minimum: let $Q^{\ast}$ be a minimizer. For fixed $t,x,p$ and $g$ we have

[TABLE]

and

[TABLE]

which means that $\widetilde{H}(t,x,Q^{\ast},p,g)=H(t,x,a^{\ast},p,g)$ .

Consider $H(t,x,a,p,g)-H(t,x,a^{\ast},p,g)$ as a function of $a$ : it is non-negative and, if (C) holds, it equals zero if and only if $a=a^{\ast}$ . Therefore,

[TABLE]

which implies the claim, namely that $Q^{\ast}(\left\{a^{\ast}\right\})=1$ . ∎

Remark 1.

Note that if (C) does not hold, then $Q^{\ast}$ is supported on the set of all minimizers of $H$ . Thus it might not be a Dirac measure. This implies that there may exist an optimal relaxed control which is not an ordinary control (not even open-loop).
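The content of Lemma 6 and Remark 1 can be checked on a finite action set. The sketch below assumes that $\widetilde{H}(t,x,Q,p,g)=\int_{A}H(t,x,a,p,g)Q(da)$, so that for fixed $(t,x,p,g)$ the relaxed pre-Hamiltonian is the average of the values of $H$ under $Q$. The grid `actions` and the two value vectors `H_unique` and `H_tied` are hypothetical.

```python
import random


def relaxed_value(H_values, Q):
    """Integral of H against a probability vector Q on a finite action set."""
    return sum(h * q for h, q in zip(H_values, Q))


def random_simplex_point(k, rng):
    """Random probability vector: normalised exponential variables."""
    w = [rng.expovariate(1.0) for _ in range(k)]
    s = sum(w)
    return [x / s for x in w]


rng = random.Random(0)
actions = [0.0, 0.25, 0.5, 0.75, 1.0]
H_unique = [(a - 0.5) ** 2 for a in actions]  # unique minimiser a* = 0.5
H_tied = [min(a, 1.0 - a) for a in actions]   # two minimisers: 0.0 and 1.0

# Fully supported mixtures never reach the minimum when the minimiser is unique.
samples = [random_simplex_point(len(actions), rng) for _ in range(1000)]
gap_unique = min(relaxed_value(H_unique, Q) - min(H_unique) for Q in samples)

# The Dirac mass at a* attains the minimum.
dirac = [1.0 if a == 0.5 else 0.0 for a in actions]

# Without uniqueness, a genuinely mixed measure on the minimisers is optimal too.
mixed_tie = [0.5, 0.0, 0.0, 0.0, 0.5]
```

With `H_tied`, the measure `mixed_tie` is optimal but is not a Dirac mass, which is the situation described in Remark 1.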

Proof of Theorem 5.

Fix $m\in\mathcal{L}$ . Let $\sigma\in\mathcal{R}$ be an optimal relaxed control and denote by $X_{\sigma}=X_{\sigma,m}$ the corresponding optimal trajectory. By the chattering lemma, which we will state later as Lemma 8 (only the open-loop part of the chattering lemma is needed here, which is well known, and so we postpone the proof of the lemma to Section 5, where we also give the feedback part),

[TABLE]

where $V=V_{m}$ is the value function defined in (4.2). Thanks to (4.4), the Hamilton-Jacobi-Bellman equation, and (4.7), we have

[TABLE]

By the Dynkin formula (2.25) and the terminal condition for $V$ ,

[TABLE]

It follows that

[TABLE]

hence, in view of (4.8),

[TABLE]

for $\ell\otimes P$ -almost every $(t,\omega)$ , which means that

[TABLE]

for $\ell\otimes P$ -almost every $(t,\omega)$ . If (C) holds, then, by Lemma 6, the unique minimizer of $Q\mapsto\widetilde{H}(t,x,Q,m(t),V(t,\cdot))$ is the measure $Q^{\ast}=\delta_{a^{\ast}}\in\mathcal{P}(A)$ with $a^{\ast}=a^{\ast}(t,x,m(t),V(t,\cdot))$ . It follows that $\sigma_{t}=\gamma_{m}(t,X_{\sigma}(t))$ for $\ell\otimes P$ -almost every $(t,\omega)$ .

∎

4.3. Uniqueness of the feedback MFG solution for small time

In this subsection, we focus only on the dynamics for $f$ in (2.33),

[TABLE]

and $\nu$ defined in (2.29) with $U:=[0,M]^{d}$ . Moreover, we assume that $A=U$ and, for $x\neq y$ ,

[TABLE]

where $\zeta:S\rightarrow\mathbb{R}$ is some Lipschitz continuous function with Lipschitz constant $K_{\zeta}$ such that $\zeta(p)\geq\kappa$ for some $\kappa>0$ . Since $\lambda$ determines the transition rates, we set $\lambda(t,x,x,a,p):=-\sum_{y\neq x}\lambda(t,x,y,a,p)$ , $x\in\Sigma$ .
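The way transition rates bounded by $M$ can be realised through a Poisson random measure on $[0,T]\times U$ may be illustrated by a thinning scheme. The sketch below is not formula (2.33) itself: it assumes that candidate points arrive with intensity $M$ for each target state $y$, carry a mark $u$ uniform on $[0,M]$, and trigger a jump from $x$ to $y$ exactly when $u<\lambda(t,x,y)$. The function names, the constant rates and the two-state example are hypothetical.

```python
import math
import random


def simulate_path(x0, rate, states, M, T, rng):
    """Thinning: a candidate (t, y, u) moves the chain from x to y iff u < rate(t, x, y)."""
    d = len(states)
    t, x = 0.0, x0
    while True:
        # candidates arrive at total rate d * M (mass M for each target state)
        t += rng.expovariate(d * M)
        if t > T:
            return x
        y = rng.choice(states)       # target state of the candidate
        u = rng.uniform(0.0, M)      # mark used for acceptance
        if y != x and u < rate(t, x, y):
            x = y


def estimate_law(n_paths=20000, seed=0):
    """Monte Carlo estimate of P(X_T = 1 | X_0 = 0) for constant rates."""
    rng = random.Random(seed)
    rates = {(0, 1): 1.5, (1, 0): 0.5}   # hypothetical rates, both below M = 2

    def rate(t, x, y):
        return rates[(x, y)]

    hits = sum(simulate_path(0, rate, [0, 1], 2.0, 1.0, rng) for _ in range(n_paths))
    return hits / n_paths


def exact_law(a=1.5, b=0.5, T=1.0):
    """Closed form for the two-state chain started in state 0."""
    return a / (a + b) * (1.0 - math.exp(-(a + b) * T))
```

A candidate is accepted with probability $\lambda/M$, so the accepted jumps from $x$ to $y$ occur at rate $\lambda(t,x,y)$, and `estimate_law()` can be compared with `exact_law()`.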

We assume that the cost $c$ in the variable $a$ is in $\mathcal{C}^{1}(A)$ , $\nabla_{a}c$ is Lipschitz continuous in the variable $p$ with Lipschitz constant $K_{a}$ and $c$ is uniformly convex, that is, there exists $\theta>0$ such that

[TABLE]

for all $t,x,a,b,p$ .

This setup is analogous to the one considered in Gomes et al. (2013). The assumptions of Lemma 4 are satisfied and thus for any $g\in\mathbb{R}^{d}$ there exists a unique minimizer $a^{\ast}(t,x,p,g)$ of $H(t,x,a,p,g)$ , which in this setting becomes

[TABLE]

The assumptions of Lemma 1 are satisfied so that (A”) and (B”) hold. We need $a^{\ast}$ to be Lipschitz continuous in $p$ and $g$ ; this fact is proved in Proposition 1 in Gomes et al. (2013). We state the result in the following lemma.

Lemma 7.

Under the above assumptions (in this subsection), the function $a^{\ast}$ is Lipschitz continuous in $p$ and $g$ :

[TABLE]

for any $t,x,p,q,g,h$ .

Let us fix here the filtered probability space, the initial condition and the Poisson random measure. Define $\gamma_{m}(t,x)=a^{\ast}(t,x,m(t),V_{m}(t,\cdot))$ as in (4.6): it is the unique feedback control for a given flow of measures $m\in\mathcal{L}$ , where $V_{m}(t,x)$ is the value function defined in (4.2) with respect to $m$ . The cost functions $c$ and $\psi$ are uniformly bounded and so is the value function; let us denote by $M_{V}$ the maximum of its absolute value. Denote by $M_{\zeta}$ the maximum of $\zeta$ and fix the constants

[TABLE]

Let $T^{\ast}>0$ be such that

[TABLE]

Theorem 6.

Under the assumptions of this subsection, for any $0<T<T^{\ast}$ there exists a unique feedback solution $(\gamma,m,X)$ of the mean field game. It is such that $\gamma$ is the feedback control $\gamma_{m}$ .

Proof.

In the notation of Theorem 2, the map $\Phi:\mathcal{L}\rightarrow\mathcal{L}$ is defined by $\Phi(m)=\left\{Flow(X_{\gamma_{m},m})\right\}$ , a singleton. If we prove that this map is a contraction for small time horizon $T$ , then the assertion follows by the Banach-Caccioppoli theorem. So let $m,n\in\mathcal{L}$ and set $X:=X_{\gamma_{m},m}$ and $Y:=X_{\gamma_{n},n}$ . For a vector $v\in\mathbb{R}^{d}$ denote $|v|_{\infty}=\max_{x\in\Sigma}|v_{x}|$ .

First we prove that the value function $V_{m}$ is Lipschitz continuous with respect to $m$ . Thanks to the HJB equation (4.4) we have

[TABLE]

The Hamiltonian $H$ is Lipschitz in $(a,p,g)$ ; in fact, by (2.6) and (4.9) we have

[TABLE]

Then using (4.10) and (4.11) we obtain

[TABLE]

for any $x$ , hence Gronwall’s lemma implies that

[TABLE]

for any $0\leq t\leq T$ .

Therefore, by applying again (4.10) and (4.11), we obtain

[TABLE]

and thus, again by Gronwall’s lemma,

[TABLE]

for any $0\leq t\leq T$ . Since $|Law(X(t))-Law(Y(t))|\leq 2\sqrt{d}E|X(t)-Y(t)|$ we have

[TABLE]

and then the claim holds for $T^{\ast}$ satisfying (4.12). ∎
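The fixed point iteration behind this proof can be sketched numerically. The Python code below is a simplified two-state instance and not the setting of the theorem in full: it assumes $\zeta\equiv 1$, a single control $a\in[0,M]$ equal to the rate of leaving the current state, the running cost $c(x,a,p)=\tfrac{1}{2}a^{2}+p_{x}$, zero terminal cost and explicit Euler discretisations. With these assumptions the minimizer is $a^{\ast}=\min(\max(g_{x}-g_{y},0),M)$. The names `best_response_flow` and `picard` are hypothetical.

```python
def clip(v, lo, hi):
    return max(lo, min(hi, v))


def best_response_flow(m0_flow, T, M):
    """One application of Phi: solve the HJB equation backward for the given
    flow, then push the initial law forward under the optimal feedback."""
    n = len(m0_flow) - 1
    dt = T / n
    V = [0.0, 0.0]            # terminal cost psi = 0 (assumption)
    a0 = [0.0] * n            # optimal rate 0 -> 1 on time step k
    a1 = [0.0] * n            # optimal rate 1 -> 0 on time step k
    for k in range(n - 1, -1, -1):
        p = (m0_flow[k], 1.0 - m0_flow[k])
        a0[k] = clip(V[0] - V[1], 0.0, M)
        a1[k] = clip(V[1] - V[0], 0.0, M)
        F0 = p[0] + 0.5 * a0[k] ** 2 + a0[k] * (V[1] - V[0])
        F1 = p[1] + 0.5 * a1[k] ** 2 + a1[k] * (V[0] - V[1])
        V = [V[0] + dt * F0, V[1] + dt * F1]
    new_flow = [m0_flow[0]]   # mass of state 0 at time 0 is kept fixed
    for k in range(n):
        m = new_flow[-1]
        new_flow.append(m + dt * (-a0[k] * m + a1[k] * (1.0 - m)))
    return new_flow


def picard(T=0.5, M=2.0, n=50, m_init=0.8, iters=40):
    """Iterate m -> Phi(m) starting from the constant flow; return the last
    flow (mass of state 0 on the time grid) and the successive sup distances."""
    flow = [m_init] * (n + 1)
    diffs = []
    for _ in range(iters):
        nxt = best_response_flow(flow, T, M)
        diffs.append(max(abs(a - b) for a, b in zip(nxt, flow)))
        flow = nxt
    return flow, diffs
```

For this discretised toy model one can check by hand that the map is a contraction with factor at most $2T^{2}$, so the successive distances in `diffs` decay geometrically for $T=0.5$; this constant is specific to the example and is not the $T^{\ast}$ of (4.12).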

4.4. Uniqueness under monotonicity

Uniqueness of mean field game solutions was shown in Theorem 2 in Gomes et al. (2015) for arbitrary time horizon under the Lasry-Lions monotonicity assumptions. Here, we give a different proof of this result, which relies on the probabilistic representation of the mean field game, and allows for less restrictive assumptions on the data.

Specifically, we suppose that the function $f$ in the dynamics (2.10) does not depend on $p\in S$ and that the running cost $c$ splits in $c(t,x,a,p)=c_{0}(t,x,a)+c_{1}(x,p)$ . Moreover we assume that $c_{1}$ and $\psi$ satisfy the following monotonicity property:

[TABLE]

for any $p\neq p^{\prime}\in S$ . For example, $c_{1}$ and $\psi$ could be the gradients of convex functions in $\mathbb{R}^{d}$ .
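As a concrete instance, suppose the monotonicity condition takes the usual Lasry-Lions form, that is, the sum over $x\in\Sigma$ of $(c_{1}(x,p)-c_{1}(x,p^{\prime}))(p_{x}-p^{\prime}_{x})$ is positive for $p\neq p^{\prime}$ (an assumption on the precise statement). Then the choice $c_{1}(x,p)=p_{x}$, which is the gradient of the convex function $p\mapsto\tfrac{1}{2}|p|^{2}$, satisfies it:

```latex
\sum_{x\in\Sigma}\bigl(c_{1}(x,p)-c_{1}(x,p^{\prime})\bigr)\,(p_{x}-p^{\prime}_{x})
  =\sum_{x\in\Sigma}(p_{x}-p^{\prime}_{x})^{2}>0
  \qquad\text{for } p\neq p^{\prime}\in S .
```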

Theorem 7.

Suppose that (A), (B) and the assumptions above hold. Let $(\gamma,m,X)$ and $(\gamma^{\prime},m^{\prime},X^{\prime})$ be two feedback mean field game solutions. Then $m(t)=m^{\prime}(t)$ for any $t$ . Also the corresponding value functions $V_{m}$ and $V_{m^{\prime}}$ are the same. Moreover, if (C) holds, then $\gamma(t,x)=\gamma^{\prime}(t,x)$ for any $t,x$ .

Proof.

Since the dynamics does not depend on $p\in S$ , we have $X=X^{\gamma}=X^{\gamma,m}=X^{\gamma,m^{\prime}}$ and $X^{\prime}=X^{\gamma^{\prime}}=X^{\gamma^{\prime},m}=X^{\gamma^{\prime},m^{\prime}}$ . The optimality of $\gamma$ yields $J(\gamma,m)\leq J(\gamma^{\prime},m)$ and similarly $J(\gamma^{\prime},m^{\prime})\leq J(\gamma,m^{\prime})$ , hence

[TABLE]

Summing these two inequalities and using the fact that $Law(X(t))=m(t)$ for any $t$ , we obtain

[TABLE]

If $m(t)\neq m^{\prime}(t)$ for some $t$ , then the latter expression is $<0$ , thanks to (4.13), (4.14) and the continuity of $m$ ; a contradiction. Therefore $m(t)=m^{\prime}(t)$ for all $t$ .

The fact that $V_{m}=V_{m^{\prime}}$ is implied by the uniqueness of solutions to the HJB equation (4.4). Assuming (A) and (B), the optimal feedback $\gamma$ satisfies (4.3). Thus, if (C) holds, then $\gamma=\gamma^{\prime}$ . ∎

5. Approximation of $N$ -player game

5.1. Approximation of relaxed controls

In order to get an $\varepsilon$ -Nash equilibrium for the $N$ -player game in open-loop strategies, respectively in feedback strategies, we first have to find an approximation of the optimal relaxed control, respectively relaxed feedback control, for the mean field game. To this end, we will make use of the following version of the chattering lemma.

Lemma 8 (Chattering).

For any relaxed control $\rho\in\mathcal{R}$ , there exists a sequence of stochastic open-loop controls $\alpha_{n}\in\mathcal{A}$ such that, denoting by $\rho^{\alpha_{n}}(dt,da)=\delta_{\alpha_{n}(t)}(da)dt$ their relaxed control representation,

[TABLE]

where the limit is in the weak topology in $\mathcal{M}([0,T]\times A)$ . Moreover, any $\alpha_{n}$ takes values in a finite subset of $A$ .

For any relaxed feedback control $\widehat{\gamma}\in\widehat{\mathbb{A}}$ , there exists a sequence of feedback controls $\gamma_{n}\in\mathbb{A}$ such that

[TABLE]

uniformly in $x\in\Sigma$ and

[TABLE]

where $\rho^{\gamma_{n}}$ denotes the relaxed control representation of the open-loop control $\alpha^{\gamma_{n}}$ corresponding to $\gamma_{n}$ , as in (2.12), and $\rho^{\widehat{\gamma}}$ is defined in (2.23); i.e. $\rho^{\gamma_{n}}_{t}(da)=\delta_{\gamma_{n}(t,X_{\gamma_{n}}(t^{-}))}(da)$ and $\rho^{\widehat{\gamma}}_{t}(da)=[\widehat{\gamma}(t,X_{\widehat{\gamma}}(t^{-}))](da)$ .

Proof.

The first part is proved as Theorem 3.5.2 in Kushner (1990, p. 59), and the construction of the approximating sequence in the proof gives the $\gamma_{n}$ for the second part; let us show how to build them. Let $\widehat{\gamma}\in\widehat{\mathbb{A}}$ , cover $A$ by $M_{r}$ disjoint sets $C_{i}^{r}$ which contain a point $a_{i}^{r}$ and set $A^{r}:=\left\{a_{i}^{r}:i\leq M_{r}\right\}$ , a finite subset of $A$ . For any $\Delta>0$ and $i,j$ define the function

[TABLE]

Divide any interval $[(i+1)\Delta,(i+2)\Delta[$ into $M_{r}$ subintervals $I_{ij}^{\Delta r}(x)$ of length $\tau_{ij}^{\Delta r}(x)$ and define the feedback control $\gamma^{\Delta r}$ , which is piecewise constant, by

[TABLE]

where $a_{0}$ is an arbitrary value in $A$ . The proof in Kushner (1990) shows that

[TABLE]

weakly, for any $x\in\Sigma$ . Since $\Sigma$ is finite we obtain that there exists a sequence of ordinary feedback controls $(\gamma_{n})$ such that (5.1) holds uniformly in $x$ . Let $m\in\mathcal{L}$ be fixed and $X_{n}$ be the solution to (2.11) corresponding to the feedback control $\gamma_{n}$ . By Theorem 1, the sequence $X_{n}$ is tight and there are a subsequence, which we still denote as $(X_{n})$ , and a process $X$ such that $\lim_{n\rightarrow\infty}X_{n}=X$ in distribution. Possibly applying the Skorokhod representation (Theorem 4.30 in Kallenberg, 2001, p. 79), we may assume that this convergence is with probability one in the space of càdlàg functions $D([0,T],\Sigma)$ equipped with the Skorokhod metric. This implies in particular that

[TABLE]

where $\overline{\eta}$ is the finite random set of discontinuity points (the jumps) of $X$ .

Let now $\varphi\in\mathcal{C}([0,T]\times A)$ be any continuous function, which is also bounded as $A$ is compact. We have to show the convergence to zero, almost surely, of

[TABLE]

where

[TABLE]

Any feedback control is Lipschitz in $x$ , i.e. $dist(\gamma_{n}(t,x),\gamma_{n}(t,y))\leq Diam(A)|x-y|$ , and so $Y_{n}$ tends to zero thanks to (5.3), the continuity of $\varphi$ and dominated convergence. As to $Z_{n}$ , write $Z_{n}=\sum_{x\in\Sigma}Z_{n}^{x}$ where

[TABLE]

and $\eta_{x}$ is the random set in $[0,T]$ where $X(t)=x$ . For each $x$ , the random set $D_{x}$ of discontinuity points of the function $\mathbbm{1}_{\eta_{x}}(t)\varphi(t,a)$ is a subset of $\overline{\eta}_{x}\times A$ for some finite random set $\overline{\eta}_{x}\subset[0,T]$ . Thus $D_{x}$ has null measure with respect to the limiting control $\widehat{\gamma}(t,x)(da)dt$ with probability one, for each $x$ , thanks to Definition 5. Hence by (5.1) we get that $Z_{n}^{x}$ tends to zero for each $x$ and so does $Z_{n}$ since $\Sigma$ is finite.

Let $\alpha_{n}(t)=\gamma_{n}(t,X_{n}(t^{-}))$ be the open-loop control corresponding to $\gamma_{n}$ and $\rho^{n}$ its relaxed control representation. We have just proved that $\lim_{n\rightarrow\infty}\rho^{n}=[\widehat{\gamma}(t,X(t^{-}))](da)dt$ $P$ -almost surely and thus Theorem 1 says that $X$ must have the same law as the solution to (2.22) under the relaxed feedback control $\widehat{\gamma}$ . That solution is unique by Lemma 1, meaning that $X=X_{\widehat{\gamma}}$ in distribution. Therefore (5.2) follows since $\rho_{t}^{\widehat{\gamma}}=\widehat{\gamma}(t,X_{\widehat{\gamma}}(t^{-}))$ by (2.23). ∎
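The chattering construction can be sketched for a deterministic relaxed control on a finite action set. The Python code below assumes that the length of the $j$-th subinterval of $[(i+1)\Delta,(i+2)\Delta[$ equals the mass that the relaxed control puts on the action $a_{j}$ over the previous interval $[i\Delta,(i+1)\Delta[$, and that the arbitrary value $a_{0}$ is used on $[0,\Delta[$. The weights `q`, the test function `phi` and the helper names are hypothetical.

```python
def chattering_control(q, actions, delta, a0, quad=20):
    """Return alpha(t), a piecewise-constant control built from the relaxed
    weights q(t) (a probability vector over `actions`)."""

    def weights_on(i):
        # tau_j = integral of q_j over [i * delta, (i + 1) * delta), midpoint rule
        h = delta / quad
        tau = [0.0] * len(actions)
        for k in range(quad):
            for j, w in enumerate(q(i * delta + (k + 0.5) * h)):
                tau[j] += w * h
        return tau

    cache = {}

    def alpha(t):
        i = int(t / delta)
        if i == 0:
            return a0                      # arbitrary action on the first interval
        if i not in cache:
            cache[i] = weights_on(i - 1)   # use the mass of the previous interval
        offset = t - i * delta
        acc = 0.0
        for a, tau in zip(actions, cache[i]):
            acc += tau
            if offset < acc:
                return a
        return actions[-1]

    return alpha


def integrate(fn, T, n=20000):
    """Midpoint rule on [0, T]."""
    h = T / n
    return sum(fn((k + 0.5) * h) for k in range(n)) * h


def chattering_error(delta, T=1.0):
    """Gap between the relaxed integral of phi and its chattered version."""
    actions = [0.0, 1.0]
    q = lambda t: [1.0 - t, t]             # relaxed control rho_t = (1 - t, t)
    phi = lambda t, a: (1.0 + t) * a * a   # continuous test function
    alpha = chattering_control(q, actions, delta, a0=0.0)
    relaxed = integrate(lambda t: sum(w * phi(t, a) for w, a in zip(q(t), actions)), T)
    ordinary = integrate(lambda t: phi(t, alpha(t)), T)
    return abs(ordinary - relaxed)
```

In this example the gap shrinks roughly linearly in $\Delta$, consistent with weak convergence of the relaxed representations of the chattered controls as $\Delta\rightarrow 0$.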

Remark 2.

In the above proof we strongly used the finiteness of $\Sigma$ to get the approximation in feedback controls. While the result in the open-loop setting holds for general state space $\Sigma$ , when considering feedback controls it is not clear whether the above lemma can be generalized to uncountably infinite state spaces.

We are now able to state the approximation result:

Proposition 3.

Let $m\in\mathcal{L}$ , $\rho\in\mathcal{R}$ and $\widehat{\gamma}\in\widehat{\mathbb{A}}$ . Then for every $\varepsilon>0$ there exist $\alpha\in\mathcal{A}$ and $\gamma\in\mathbb{A}$ such that

[TABLE]

Proof.

Let $(\alpha_{n})$ be a sequence in $\mathcal{A}$ that approximates $\rho$ as in Lemma 8. Then we apply Theorem 1 to the sequence $(X_{\alpha_{n},m},\alpha_{n},m)$ : it is tight, a subsequence $(X_{\alpha_{n_{k}},m},\alpha_{n_{k}},m)$ converges in distribution to $(X_{\rho,m},\rho,m)$ and $\lim_{k\rightarrow\infty}J(\alpha_{n_{k}},m)=J(\rho,m)$ . Thus there exists $\alpha_{n_{k}}=:\alpha$ for which (5.4) and (5.6) hold. In a similar way, one proves (5.5) and (5.7) for feedback controls. ∎

5.2. $\varepsilon_{N}$ -Nash equilibria

We can now define the approximate Nash equilibrium for the $N$ -player game, first in open-loop form.

Notation 2.

Let $\left(\left((\Omega,\mathcal{F},P;(\mathcal{F}_{t})_{t\in[0,T]}),\rho,\xi,\mathcal{N}\right),m,X_{\rho,m}\right)$ be a relaxed solution of the mean field game (2.20), which exists assuming (A) and (B) by Theorem 2. Fix $N\in\mathbb{N}$ and let $\alpha\in\mathcal{A}$ be as in Proposition 3, satisfying (5.4) and (5.6) with $\varepsilon=\frac{1}{\sqrt{N}}$ . Then $\left((\Omega,\mathcal{F},P;(\mathcal{F}_{t})_{t\in[0,T]}),\alpha^{N},\xi^{N},\mathcal{N}^{N}\right)$ denotes the strategy vector where $\alpha^{N}=(\alpha^{N}_{1},\ldots,\alpha^{N}_{N})$ , $\xi^{N}=(\xi^{N}_{1},\ldots,\xi^{N}_{N})$ , $\mathcal{N}^{N}=(\mathcal{N}_{1}^{N},\ldots,\mathcal{N}_{N}^{N})$ , such that

[TABLE]

Equation (5.8) says that this control is symmetric. The following is our main result, whose proof is carried out in the next subsection. In addition to (A) and (B), we make the Lipschitz assumptions (A’) and (B’).

Theorem 8.

Assume (A’) and (B’). Then the vector strategy defined in Notation 2 is an $\varepsilon_{N}$ -Nash equilibrium for the $N$ -player game for any $N$ where $\varepsilon_{N}\leq\frac{C}{\sqrt{N}}$ and $C=C(T,d,\nu(U),K_{1},K_{2})$ is a constant.

An analogous result holds when considering feedback strategies, but we state it separately.

Notation 3.

Let $\left(\left((\Omega,\mathcal{F},P;(\mathcal{F}_{t})_{t\in[0,T]}),\widehat{\gamma},\xi,\mathcal{N}\right),m,X_{\widehat{\gamma},m}\right)$ be a relaxed feedback solution of the mean field game (2.22), which exists assuming (A) and (B) by Theorem 3. Fix $N\in\mathbb{N}$ and let $\gamma\in\mathbb{A}$ be as in Proposition 3, satisfying (5.5) and (5.7) with $\varepsilon=\frac{1}{\sqrt{N}}$ . Then the tuple $\left((\Omega,\mathcal{F},P;(\mathcal{F}_{t})_{t\in[0,T]}),\gamma^{N},\xi^{N},\mathcal{N}^{N}\right)$ denotes the feedback strategy vector where $\xi^{N}=(\xi^{N}_{1},\ldots,\xi^{N}_{N})$ , $\mathcal{N}^{N}=(\mathcal{N}_{1}^{N},\ldots,\mathcal{N}_{N}^{N})$ , $\gamma^{N}=(\gamma^{N}_{1},\ldots,\gamma^{N}_{N})$ such that

[TABLE]

for any $t$ , $i$ and $x^{N}=(x^{N}_{1},\ldots,x^{N}_{N})\in\Sigma^{N}$ , and the $(\xi^{N}_{i},\mathcal{N}_{i}^{N})$ are $N$ i.i.d. copies of $(\xi,\mathcal{N})$ .

Equation (5.9) says that this feedback strategy vector is symmetric and decentralized. In order to obtain feedback $\varepsilon$ -Nash equilibria from a mean field game solution, we need the Lipschitz assumptions (A”) and (B”).

Theorem 9.

Assume (A”), (B”). Then the feedback strategy vector defined in Notation 3 is a feedback $\varepsilon_{N}$ -Nash equilibrium for the $N$ -player game for any $N$ where $\varepsilon_{N}\leq\frac{C}{\sqrt{N}}$ and $C=C(T,d,\nu(U),K_{1},K_{2})$ is a constant.

5.3. Proofs of the results

In the following $C$ will denote any constant which depends on $T$ , $d$ , $\nu(U)$ and the Lipschitz constants $K_{1}$ and $K_{2}$ , but not on $N$ , and is allowed to change from line to line. We focus first on open-loop controls. Fix $N\in\mathbb{N}$ and let the strategy vector $\alpha^{N}$ be as in Notation 2. We play this strategy in the $N$ -player game:

[TABLE]

This will be coupled with $Y^{N}$ defined by

[TABLE]

Let $\mu^{N}(t):=\frac{1}{N}\sum_{i=1}^{N}\delta_{X^{N}_{i}(t)}$ be the empirical measure of the system (5.10) and $\overline{\mu}^{N}$ be the empirical measure of (5.11). Denote $\overline{m}(t):=Law(X_{\alpha,m}(t))$ . By (5.4) we have

[TABLE]

for any $t\geq 0$ , since $Flow(X_{\rho,m})=m$ . From (5.8) it follows that

[TABLE]

This implies, thanks to Theorem 1 in Fournier and Guillin (2015), that

[TABLE]

for any $t\in[0,T]$ and $N\in\mathbb{N}$ , where $C$ is a constant. This upper bound in $N^{-\frac{1}{2}}$ cannot be improved, since for these discrete measures a lower bound still in $N^{-\frac{1}{2}}$ can be found, see again Fournier and Guillin (2015).
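The $N^{-\frac{1}{2}}$ rate for empirical measures on a finite set can be observed in a small Monte Carlo experiment. The Python sketch below draws $N$ i.i.d. samples from a hypothetical law `p` on three states and estimates the expected $\ell^{1}$ distance between the empirical measure and `p`; the use of the $\ell^{1}$ distance (equivalent to the metrics used in the paper on a finite set), the sample sizes and the number of trials are illustrative assumptions.

```python
import math
import random


def mean_l1_error(p, N, trials, rng):
    """Monte Carlo estimate of E | mu^N - p |_1 for N i.i.d. samples from p."""
    states = list(range(len(p)))
    total = 0.0
    for _ in range(trials):
        sample = rng.choices(states, weights=p, k=N)
        counts = [0] * len(p)
        for s in sample:
            counts[s] += 1
        total += sum(abs(c / N - px) for c, px in zip(counts, p))
    return total / trials


rng = random.Random(1)
p = [0.2, 0.3, 0.5]  # hypothetical law on three states

# If the error decays like C / sqrt(N), the rescaled values stay of order one.
scaled = {N: mean_l1_error(p, N, 200, rng) * math.sqrt(N) for N in (100, 400, 1600)}
```

The rescaled errors in `scaled` remain of the same order as $N$ grows, which reflects both the upper and the lower bound in $N^{-\frac{1}{2}}$.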

Lemma 9.

Under assumption (A’), for every $t\geq 0$ and $i=1,\ldots,N$

[TABLE]

Proof.

From (5.12) and (5.13) it follows that

[TABLE]

We estimate $|\mu^{N}(t)-\overline{\mu}^{N}(t)|$ using the 1-Wasserstein metric (which is equivalent to the Euclidean metric in $\mathbb{R}^{d}$ ) and (5.10), (5.11) and the Lipschitz assumption (2.3):

[TABLE]

Hence applying (5.16)

[TABLE]

Then we obtain, by Gronwall’s lemma,

[TABLE]

Similarly we show (5.15): using (5.10), (5.11) and (5.14) we get, for any $i$ ,

[TABLE]

and hence $E|X^{N}_{i}(t)-Y^{N}_{i}(t)|\leq\frac{C}{\sqrt{N}}$ by Gronwall’s lemma. ∎
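For reference, the integral form of Gronwall's lemma invoked in these estimates can be stated as follows; the symbols $\varphi$, $a$ and $b$ are generic and not part of the paper's notation. Applied with $a=\frac{C}{\sqrt{N}}$ and $b$ independent of $N$, it yields a bound of order $\frac{1}{\sqrt{N}}$, since the factor $e^{bT}$ is absorbed into the constant $C$.

```latex
\varphi(t)\;\leq\; a+b\int_{0}^{t}\varphi(s)\,ds
\quad\text{for all } t\in[0,T]
\qquad\Longrightarrow\qquad
\varphi(t)\;\leq\; a\,e^{bt}\;\leq\; a\,e^{bT},
% for a bounded measurable \varphi\geq 0 and constants a,b\geq 0.
```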

We are now in the position to state the result about the costs. Because of the symmetry of the problem, for the prelimit we shall consider only player one ( $i=1$ ).

Lemma 10.

Under assumptions (A’) and (B’)

[TABLE]

Proof.

Inequality (5.6), together with Notation 2, yields

[TABLE]

Moreover, from (2.5), (5.14) and (5.15) we have

[TABLE]

which, combined with (5.18), gives the claim. ∎

We consider then any $\beta\in\mathcal{A}$ and the perturbed strategy vector $[\alpha^{N,-1},\beta]$ . We denote by $\widetilde{X}^{N}$ the solution to

[TABLE]

for each $i=1,\ldots,N$ . Set also $\widetilde{Y}_{1}:=X_{\beta,m}$ and $\widetilde{\mu}_{N}(t):=\frac{1}{N}\sum_{i=1}^{N}\delta_{\widetilde{X}^{N}_{i}(t)}.$

Lemma 11.

Under assumption (A’), for any $t\geq 0$ and $\beta\in\mathcal{A}$

[TABLE]

Proof.

We make the rough estimate

[TABLE]

Hence

[TABLE]

and then, by Gronwall’s lemma,

[TABLE]

Therefore (5.20) is proved. Estimate (5.21) follows from (5.20) and (5.14) and the fact that $\frac{1}{N}\leq\frac{1}{\sqrt{N}}$ for any $N\in\mathbb{N}$ . Finally, (5.22) is a consequence of (5.21):

[TABLE]

and we conclude by Gronwall’s lemma. ∎

Lemma 12.

Under assumptions (A’) and (B’)

[TABLE]

Proof.

Inequalities (2.5), (5.21) and (5.22) give

[TABLE]

∎

Theorem 8 is now a consequence of Lemmata 10 and 12:

Proof of Theorem 8.

Inequalities (5.17), (5.23), and the optimality of $\rho$ yield

[TABLE]

∎

Remark 3.

We observe that $\alpha^{N}$ is still an $\varepsilon_{N}$ -Nash equilibrium if we assume only (B) instead of (B’), but without the estimate of the order of convergence $\varepsilon_{N}\leq\frac{C}{\sqrt{N}}$ . Namely, there exists a sequence $(\varepsilon_{N})$ such that $\lim_{N\rightarrow\infty}\varepsilon_{N}=0$ .

Proof of Theorem 9.

The argument is the same as in the proof of Theorem 8. The difference is that equations (5.10), (5.11) and (5.19) become respectively, for each $i=1,\ldots,N$ ,

[TABLE]

where the latter means that

[TABLE]

and

[TABLE]

for $i=2,\ldots,N$ , thanks to Notation 1. The estimates we need to apply Gronwall’s lemma, in particular in the proof of Lemma 11, are found using also (2.4) and the fact that $dist(\gamma(s,x),\gamma(s,y))\leq Diam(A)|x-y|$ for every $s$ and each $x$ and $y$ in the finite $\Sigma$ . ∎

6. Conclusions

We summarize here the results we have obtained. The assumptions are given in Section 2.1 and verified for a natural shape of the dynamics in Lemmata 3 and 4.

(1) Under assumptions (A) and (B), there exist a relaxed mean field game solution and a relaxed feedback mean field game solution (in the sense of Definition 7), see Theorems 2 and 3, respectively.

(2) Assuming (A), (B) and (C), there exists a feedback solution of the mean field game (Definition 4), see Corollary 1. The feedback mean field game solution is unique for small $T$ under the additional assumptions of Section 4.3 by Theorem 6; uniqueness for arbitrary time horizon holds under the Lasry-Lions monotonicity assumptions, see Theorem 7.

(3) The relaxed mean field game solutions provide $\varepsilon_{N}$ -Nash equilibria for the $N$ -player game (cf. Definition 2), both in open-loop and in feedback form (not relaxed), with $\varepsilon_{N}\leq\frac{C}{\sqrt{N}}$ . If (A’) and (B’) hold, then the symmetric open-loop strategy vector defined in Notation 2 is an $\varepsilon_{N}$ -Nash equilibrium by Theorem 8. Assuming (A”) and (B”), the feedback strategy vector defined in Notation 3, which is symmetric and decentralized, is a feedback $\varepsilon_{N}$ -Nash equilibrium thanks to Theorem 9.

Appendix A Relaxed Poisson measures

In order to state the definition of the relaxed Poisson random measure we first need to define the canonical space of integer valued random measures on a metric space $E$ . Following Jacod (1979), the setting is:

• $\overline{\Omega}$ is the set of sequences $(t_{n},y_{n})\subset[0,+\infty]\times E$ such that $(t_{n})$ is increasing and $t_{n}<t_{n+1}$ if $t_{n}<+\infty$ ; set $t_{0}:=0$ and $t_{\infty}:=\lim_{n}t_{n}$ ;

• if $\overline{\omega}=(t_{n},y_{n})_{n\in\mathbb{N}}$ write $T_{n}(\overline{\omega}):=t_{n}$ and $Y_{n}(\overline{\omega}):=y_{n}$ ;

• the canonical random measure is

[TABLE]

for any $B\in\mathcal{B}([0,+\infty[\times E)$ ;

• $\overline{\mathcal{G}}_{t}:=\sigma\left(\overline{\mathcal{N}}(\cdot,B):B\in\mathcal{B}([0,t]\times E)\right)$ , $\overline{\mathcal{F}}_{0}$ is given, $\overline{\mathcal{F}}_{t}=\overline{\mathcal{F}_{0}}\vee\left(\cap_{s<t}\overline{\mathcal{G}}_{s}\right)$ , $\overline{\mathcal{F}}=\overline{\mathcal{F}}_{\infty}$ and $\overline{\mathbb{F}}=(\overline{\mathcal{F}}_{t})_{t\geq 0}$ .

The filtered space $(\overline{\Omega},\overline{\mathcal{F}},\overline{\mathbb{F}})$ is then the canonical space of integer valued random measures on $E$ . A probability measure on it is the law of an integer valued random measure on $E$ , given an initial condition on $\overline{\mathcal{F}_{0}}$ . Note that the canonical measure $\overline{\mathcal{N}}$ is not the identity: for this reason we can work with $\mathcal{M}=\mathcal{M}([0,+\infty[\times E)$ as the state space of a random measure. Moreover, the set of integer valued random measures is vaguely closed in $\mathcal{M}$ : see Theorem 15.7.4 in Kallenberg (1986) and the references therein.

Let now $\Theta$ be any integer valued random measure defined on a filtered probability space $(\Omega,\mathcal{F},\mathbb{F},P)$ . It is determined by a sequence of stopping times $T_{n}$ and random variables $X_{n}$ which are $\mathcal{F}_{T_{n}}$ -measurable. To any $\Theta$ is associated its compensator, that is, a positive random measure $\eta$ on $E$ such that

(1) $\eta([0,t]\times B)_{t\geq 0}$ is predictable for any $B\in\mathcal{B}(E)$ ;

(2) $(\Theta([0,t\wedge T_{n}]\times B)-\eta([0,t\wedge T_{n}]\times B))_{t\geq 0}$ is an $\mathbb{F}$ -martingale for each $n$ and $B$ ;

(3) $\eta(\left\{t\right\}\times E)\leq 1$ for each $t$ and $\eta([T_{\infty},\infty[\times E)=0$ .

The compensator exists and is unique (up to a modification on a $P$ -null set) for any $\Theta$ . The proof can be found in Jacod (1975), where the author also shows that a process with the above properties uniquely determines an integer valued random measure.

Consider then an arbitrary measurable space $(\Omega^{\prime},\mathcal{F}^{\prime})$ and define $\Omega:=\overline{\Omega}\times\Omega^{\prime}$ . Set $\overline{\mathcal{F}_{0}}:=\left\{\varnothing,\overline{\Omega}\right\}$ and $\mathcal{F}_{0}:=\overline{\mathcal{F}_{0}}\otimes\mathcal{F}^{\prime}$ . The canonical random measure $\overline{\mathcal{N}}$ on $\overline{\Omega}$ is extended to $\Omega$ via $(T_{n},Y_{n}).(\overline{\omega},\omega^{\prime}):=(T_{n},Y_{n}).(\overline{\omega})$ . Set $\mathcal{F}_{t}:=\overline{\mathcal{F}}_{t}\vee\mathcal{F}_{0}$ .

Theorem 10 (Jacod (1975)).

Let $P_{0}$ be a probability measure on $(\Omega,\mathcal{F}_{0})$ and $\eta$ a predictable random measure satisfying (1) and (3). Then there exists a unique probability measure $P$ on $(\Omega,\mathcal{F}_{\infty})$ whose restriction to $\mathcal{F}_{0}$ is $P_{0}$ and for which $\eta$ is the compensator of $\overline{\mathcal{N}}$ .

By means of this theorem, we are able to define properly a relaxed Poisson measure. Consider a relaxed control $((\Omega^{\prime\prime},\mathcal{F}^{\prime\prime},P^{\prime\prime};\mathbb{F}^{\prime\prime}),\rho,\xi,\mathcal{N})\in\mathcal{R}$ and let $\Omega^{\prime}=\mathcal{D}\times\Sigma\times\overline{\Omega}$ be the state space of the process $\rho$ , the initial distribution $\xi$ and the Poisson random measure $\mathcal{N}$ . The $\sigma$ -algebra $\mathcal{F}^{\prime}$ is generated by the processes and $P_{0}$ is the joint law of $(\rho,\xi,\mathcal{N})$ . So a relaxed Poisson measure $\mathcal{N}_{\rho}$ , related to the relaxed control $\rho$ , is an integer valued random measure on $[0,T]\times U\times A$ whose compensator $\eta$ , calculated on $[0,t]$ , $U_{0}$ , $A_{0}$ , is $\nu(U_{0})\rho([0,t]\times A_{0})$ . Its law is uniquely determined on $\overline{\Omega}$ and thus has the martingale properties (2.17) and (2.18). Moreover, the joint law of $(\mathcal{N}_{\rho},\rho,\xi,\mathcal{N})$ is uniquely determined.
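For a deterministic relaxed control on a finite action set, one standard way to realise a random measure with compensator $\nu(U_{0})\rho([0,t]\times A_{0})$ is by marking: each atom of a Poisson process of intensity $\nu(U)$ receives an independent action drawn from $\rho_{t}$. The Python sketch below follows this marking construction, which is an illustration and not necessarily the construction used in the paper; it drops the coordinate $u$ (that is, it takes $U_{0}=U$), and the values `rate=3.0` for $\nu(U)$ and $\rho_{t}=(1-t,t)$ are hypothetical.

```python
import random


def marked_points(rate, rho, actions, T, rng):
    """Atoms (t, a): Poisson times of intensity `rate` on [0, T], each marked
    by an action a drawn from the probability vector rho(t)."""
    pts = []
    t = rng.expovariate(rate)
    while t <= T:
        a = rng.choices(actions, weights=rho(t), k=1)[0]
        pts.append((t, a))
        t += rng.expovariate(rate)
    return pts


def mean_marked_count(runs=20000, seed=0, rate=3.0, T=1.0):
    """Average number of atoms carrying the action 1."""
    rng = random.Random(seed)
    rho = lambda t: [1.0 - t, t]   # relaxed control on the actions {0, 1}
    total = 0
    for _ in range(runs):
        total += sum(1 for _, a in marked_points(rate, rho, [0, 1], T, rng) if a == 1)
    return total / runs


def compensator_value(rate=3.0, T=1.0):
    """nu(U) * rho([0, T] x {1}) = rate * int_0^T t dt for this example."""
    return rate * T * T / 2.0
```

The mean number of atoms marked with the action $1$ agrees with the value of the compensator on $[0,T]\times U\times\{1\}$, in line with the martingale property.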

We can give an explicit construction of $\mathcal{N}_{\rho}$ . Let $\rho\in\mathcal{R}$ and $(\alpha_{n})$ be a sequence in $\mathcal{A}$ which tends to $\rho$ in the sense of Lemma 8, the chattering lemma. Denote by $\rho^{\alpha_{n}}$ the relaxed control representation of $\alpha_{n}$ and construct $\mathcal{N}_{\alpha_{n}}$ as in (2.19): $\mathcal{N}_{\alpha_{n}}(t,U_{0},A_{0}):=\int_{0}^{t}\int_{U_{0}}\mathbbm{1}_{A_{0}}(\alpha_{n}(s))\mathcal{N}(ds,du)$ . Then, by Theorem 1, the sequence $(X_{\alpha_{n}},\rho^{\alpha_{n}},\mathcal{N}_{\alpha_{n}})$ is tight and any subsequence converges in distribution to $(X_{\rho},\rho,\mathcal{N}_{\rho})$ . The marginals are uniquely defined in this way, while to show that the joint law of $(\rho,\mathcal{N}_{\rho})$ is unique we need to invoke the above Theorem 10.

A.1. Proof of Lemma 1

Let $m\in\mathcal{L}$ be fixed, which we shall omit. Let $\mathcal{Z}$ be the space of stochastic processes with paths in $D([0,T],\Sigma)$ and equip it with the norm $||X||=E\left[\sup_{0\leq t\leq T}|X(t)|\right]$ . Let $\rho\in\mathcal{R}$ and define the map $G:\mathcal{Z}\longrightarrow\mathcal{Z}$ by

[TABLE]

for any $X\in\mathcal{Z}$ . If we prove that this map is a contraction in the norm $||\cdot||$ , then pathwise existence and uniqueness of solutions to equation (2.20) follow. We have, for any $X,Y\in\mathcal{Z}$ ,

[TABLE]

hence

[TABLE]

thanks to (2.1) and the fact that $\rho_{s}$ is a probability measure. Therefore $G$ is a contraction if $T<\frac{1}{K_{1}}$ , and so uniqueness is proved for small time horizon; but then iterating the same argument, we have uniqueness for any $T$ .

Consider now $\widehat{\gamma}\in\widehat{\mathbb{A}}$ and define $\widehat{G}:\mathcal{Z}\longrightarrow\mathcal{Z}$ by

[TABLE]

for any process $X\in\mathcal{Z}$ . Then for any $X$ and $Y$ we have $||\widehat{G}(X)-\widehat{G}(Y)||\leq||Z_{1}||+||Z_{2}||$ where

[TABLE]

and

[TABLE]

where $|\Theta|$ denotes the total variation of the signed measure $\Theta$ defined for any $C\in\mathcal{B}([0,T]\times U\times A)$ by $|\Theta|(C):=\sup_{E\subseteq C}|\Theta(E)|$ ; while the total variation norm is $||\Theta||_{TV}=|\Theta|([0,T]\times U\times A)$ . The first term $Z_{1}$ is bounded as above yielding $||Z_{1}||\leq K_{1}T||X-Y||$ . For the second term, we use $|f|\leq d$ to obtain

[TABLE]

Thanks to (2.17) and (2.13), we have $E||\mathcal{N}_{\rho^{\widehat{\gamma},X}}-\mathcal{N}_{\rho^{\widehat{\gamma},Y}}||_{TV}\leq 2T\nu(U)$ , saying that the right-hand side above is finite $P$ -a.s. Since the measure $\mathcal{N}_{\rho^{\widehat{\gamma},X}}-\mathcal{N}_{\rho^{\widehat{\gamma},Y}}$ is integer valued, we can assume that the above supremum is attained on a set $C(\omega)$ for $P$ -a.e. $\omega$ , giving thus a random set $C$ . Moreover, we may assume that on such a set the random measure considered is positive. The martingale property (2.18) now gives

[TABLE]

where in the last line above we have used the fact that $\widehat{\gamma}$ is a probability measure and $|x-y|\geq 1$ for each $x\neq y\in\Sigma$ . Therefore, for $T<\frac{1}{2K_{1}}$ , the map $\widehat{G}$ is a contraction; the claim follows iterating the above procedure.

Bibliography

Aldous (1978) D. Aldous. Stopping times and tightness. Ann. Probab., 6(2):335–340, 1978.
Basna et al. (2014) R. Basna, A. Hilbert, and V. N. Kolokoltsov. An epsilon-Nash equilibrium for non-linear Markov games of mean-field-type on finite spaces. Comm. Stoch. Anal., 8(4):449–468, 2014.
Bayraktar and Cohen (2017) E. Bayraktar and A. Cohen. Analysis of a finite state many player game using its master equation. arXiv:1707.02648 [math.AP], July 2017.
Benazzoli et al. (2017) C. Benazzoli, L. Campi, and L. Di Persio. Mean-field games with controlled jumps. arXiv:1703.01919 [math.PR], March 2017.
Bensoussan et al. (2013) A. Bensoussan, J. Frehse, and P. Yam. Mean Field Games and Mean Field Type Control Theory. Springer Briefs in Mathematics. Springer, New York, 2013.
Bensoussan et al. (2016) A. Bensoussan, K. Sung, S. Yam, and S. Yung. Linear-quadratic mean field games. J. Optim. Theory Appl., 169(2):496–529, 2016.
Brunick and Shreve (2013) G. Brunick and S. E. Shreve. Mimicking an Itô process by a solution of a stochastic differential equation. Ann. Appl. Probab., 23(4):1584–1628, 2013.
Cardaliaguet (2013) P. Cardaliaguet. Notes on mean field games. Technical report, Université de Paris - Dauphine, September 2013.
