Probabilistic approach to finite state mean field games
Alekos Cecchin, Markus Fischer

TL;DR
This paper introduces a probabilistic framework for finite state mean field games using stochastic differential equations driven by Poisson measures, establishing existence, approximation, and uniqueness results.
Contribution
It develops a probabilistic representation for finite state mean field games, proving existence of solutions and their role as approximate Nash equilibria for large N-player games.
Findings
Existence of solutions in relaxed controls
Mean field solutions form approximate Nash equilibria with error rate 1/√N
Uniqueness under small time horizon or monotonicity
Alekos Cecchin
Department of Mathematics “Tullio Levi Civita”
University of Padua
Via Trieste 63, 35121 Padova, Italy
and
Markus Fischer
http://www.math.unipd.it/ fischer
(Date: April 2, 2017; revised December 1, 2017)
Abstract.
We study mean field games and corresponding $N$-player games in continuous time over a finite time horizon where the position of each agent belongs to a finite state space. As opposed to previous works on finite state mean field games, we use a probabilistic representation of the system dynamics in terms of stochastic differential equations driven by Poisson random measures. Under mild assumptions, we prove existence of solutions to the mean field game in relaxed open-loop as well as relaxed feedback controls. Relying on the probabilistic representation and a coupling argument, we show that mean field game solutions provide symmetric $\epsilon_N$-Nash equilibria for the $N$-player game, both in open-loop and in feedback strategies (not relaxed), with $\epsilon_N$ of order $1/\sqrt{N}$. Under stronger assumptions, we also find solutions of the mean field game in ordinary feedback controls and prove uniqueness either in case of a small time horizon or under monotonicity.
Key words and phrases:
Mean field games, finite state space, relaxed controls, relaxed Poisson measures, $N$-person games, approximate Nash equilibria, chattering lemma
1991 Mathematics Subject Classification:
60J27, 60K35, 91A10, 93E20
The first author is supported by the PhD program in Mathematical Sciences, Department of Mathematics, University of Padua (Italy) and Progetto Dottorati - Fondazione Cassa di Risparmio di Padova e Rovigo (CaRiPaRo). The second author acknowledges partial support through the research projects “Mean Field Games and Nonlinear PDEs” (CPDA157835) of the University of Padua and “Nonlinear Partial Differential Equations: Asymptotic Problems and Mean-Field Games” of the Fondazione CaRiPaRo. Both authors thank an anonymous Referee for her/his helpful critique and detailed comments and suggestions.
1. Introduction
Mean field games, as independently introduced by Lasry and Lions (2007) and by Huang et al. (2006), represent limit models for symmetric non-zero-sum non-cooperative $N$-player dynamic games with mean field interactions when the number $N$ of players tends to infinity. For an introduction to mean field games see Cardaliaguet (2013), Carmona et al. (2013) and Bensoussan et al. (2013); the latter two works also deal with optimal control problems of McKean-Vlasov type. There is by now a wealth of works dealing with different classes of mean field games; for a partial overview see Gomes et al. (2015) and the references therein. Here, we restrict attention to a class of finite time horizon problems with continuous time dynamics and fully symmetric cost structure, where the position of each agent belongs to a finite state space.
The relation between the limit model (the mean field game) and the corresponding prelimit models (the $N$-player games) can be understood in two different directions: approximation and convergence. By approximation we mean that a solution of the mean field game allows one to construct approximate Nash equilibria for the $N$-player games, where the approximation error is arbitrarily small for $N$ big enough. By convergence we mean that Nash equilibria for the $N$-player games may be expected to converge to a solution of the mean field game as $N$ tends to infinity.
Results in the approximation direction are more common and usually provide the justification for the definition of the mean field game. When the underlying dynamics is of Itô type without jumps, such results were established by Huang et al. (2006) and, more recently, by for instance Carmona and Delarue (2013), Carmona and Lacker (2015) and Bensoussan et al. (2016). When the dynamics is driven by generators of Lévy type, but with the control appearing only in the drift, an approximation result is found in Kolokoltsov et al. (2011).
Rigorous results on convergence to the mean field game limit in the non-stationary case (finite time horizon) are even more recent. While the limits of $N$-player Nash equilibria in stochastic open-loop strategies can be completely characterized (see Lacker (2016) and Fischer (2017) for general systems of Itô type), the convergence problem is more difficult for Nash equilibria in Markov feedback strategies with global state information. A breakthrough was achieved by Cardaliaguet et al. (2015). Their proof of convergence relies on having a regular solution to the so-called master equation. This is a kind of transport equation on the space of probability measures associated with the mean field game; its solution yields a solution to the mean field game for any initial time and initial distribution. If the mean field game is such that its master equation possesses a unique regular solution, then that solution can be used to prove convergence of the costs associated with the $N$-player Nash equilibria, as well as a weak form of convergence of the corresponding feedback strategies. An important ingredient in the proof is a coupling argument similar to the one employed in deriving the propagation of chaos property for uncontrolled mean field systems (cf. Sznitman, 1991). This kind of coupling argument, in which independent copies of the limit process are compared to their prelimit counterparts, is useful also for obtaining approximation results; cf. for instance the above cited works by Huang et al. (2006) and Carmona and Delarue (2013).
In this paper, we focus on games where the position of each agent belongs to a given finite state space $\Sigma$. Such games have been studied by Gomes et al. (2013), and also by Basna et al. (2014). Their approach to the problem is based on PDE / ODE methods and the infinitesimal generator ($Q$-matrix) of the system dynamics; we will return to this shortly.
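Here, we adopt a different approach based on a probabilistic representation. We write the dynamics of the $N$-player game as a system of stochastic differential equations driven by independent stationary Poisson random measures $\mathcal{N}_1, \ldots, \mathcal{N}_N$ with the same intensity measure $\nu$, weakly coupled through the empirical measure of the system states:
$$X_i(t) = \xi_i + \int_0^t \int_U f\big(X_i(s^-), \theta, \alpha_i(s), \mu^N(s^-)\big)\, \mathcal{N}_i(ds, d\theta), \qquad i = 1, \ldots, N, \tag{1.1}$$
where $\xi_i$ is the initial state, $\alpha_i$ is the control of player $i$ (here in open loop form) with values in a compact set $A$ and $\mu^N(s^-)$ is the empirical measure of the system immediately before time $s$. The dynamics for the one representative player of the mean field limit is analogously written as
$$X(t) = \xi + \int_0^t \int_U f\big(X(s^-), \theta, \alpha(s), m(s)\big)\, \mathcal{N}(ds, d\theta), \tag{1.2}$$
where $\alpha$ is the control and $m$ a deterministic flow of probability measures, which takes the place of $\mu^N$.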
Representations (1.1) and (1.2) of the system dynamics allow one to obtain approximation results with error bounds of order $1/\sqrt{N}$ for the approximate $N$-player Nash equilibria via the aforementioned coupling argument. This is what we will do here. The probabilistic representation is useful also for the problem of convergence to the mean field limit; see below.
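The function $f$ appearing in (1.1) and (1.2) can be chosen so that the corresponding state processes have prescribed transition rates when the control and measure variable are held constant. Following an idea of Graham (1992), we write the state space as $\{1, \ldots, d\}$, choose $U \subset \mathbb{R}^d$, let the intensity measure $\nu$ be given by $d$ copies of Lebesgue measure on the line (cf. (2.29) below), and set
$$f(x, \theta, a, p) = \sum_{y=1}^{d} (y - x)\, \mathbf{1}_{]0, \Gamma_{x,y}(a,p)[}(\theta_y). \tag{1.3}$$
With this we have, as $h \downarrow 0$,
$$P\big[X(t+h) = y \,\big|\, X(t) = x\big] = \Gamma_{x,y}(a,p)\, h + o(h)$$
if $x \neq y$, for any constant control $a$ and probability measure $p$. Thus, $\Gamma_{x,y}(a,p)$ is the transition rate from state $x$ to state $y$.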
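The construction can be made concrete with a small simulation. The following Python sketch is an illustration, not the authors' code. It assumes states labelled 0, ..., d-1, constant rates stored in a hypothetical matrix `rates` whose entries are bounded by `bound`, and a Poisson random measure whose marks live on d coordinate axes, each carrying Lebesgue measure on [0, bound]. An atom on axis y moves the chain from x to y exactly when its mark falls below the rate from x to y, so jumps to y occur at that rate.

```python
import random


def jump(x, axis, mark, rates):
    """Jump height: move from x to `axis` if the mark lies below the rate x -> axis."""
    if axis != x and 0.0 < mark < rates[x][axis]:
        return axis - x
    return 0


def simulate_chain(x0, rates, horizon, bound, rng):
    """Drive the chain with the atoms of a Poisson random measure on [0, horizon].

    Assumption: the mark space is the union of d axes, each with Lebesgue
    measure on [0, bound], so atoms arrive at total rate d * bound.
    Returns the list of (jump time, new state), starting with (0.0, x0).
    """
    d = len(rates)
    path = [(0.0, x0)]
    x, t = x0, 0.0
    while True:
        t += rng.expovariate(d * bound)   # waiting time until the next atom
        if t > horizon:
            return path
        axis = rng.randrange(d)           # axis on which the atom sits
        mark = rng.uniform(0.0, bound)    # position of the atom along that axis
        dx = jump(x, axis, mark, rates)
        if dx != 0:
            x += dx
            path.append((t, x))
```

With two states and `rates = [[0.0, 2.0], [3.0, 0.0]]`, the first jump time out of state 0 should be approximately exponential with rate 2, which can be checked by averaging over many simulated paths.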
We will consider several types of controls: open-loop, feedback, relaxed open-loop and relaxed feedback. Each player wants to optimize his cost functional over a finite time horizon $T$. The coefficients representing running and terminal costs may depend on the measure variable and are the same for all players.
We first study the mean field game and show that it admits a solution in relaxed controls. The solution of the mean field game can be seen as a fixed point. For a given flow of measures $m$, find a strategy $\alpha^m$ that is optimal and let $X^m$ be the corresponding solution of Eq. (1.2). Now find $m$ such that $\mathrm{Law}(X^m(t)) = m(t)$ for all $t \in [0,T]$. Under mild hypotheses, we prove existence of solutions in relaxed open-loop controls using the Ky Fan fixed point theorem for point-to-set maps. This is analogous to the existence result obtained by Lacker (2015) for general dynamics driven by Wiener processes. As there, we will characterize solutions to Eq. (1.2) through the associated controlled martingale problem. In order to write the dynamics when using a relaxed control, we need to work with relaxed Poisson measures in the sense of Kushner and Dupuis (2001); also see Appendix A below. The same assumptions that give existence in relaxed open-loop controls also yield existence of solutions in relaxed feedback controls. Relaxed controls are used only for the limit model.
Then we show that those relaxed mean field game solutions provide $\epsilon_N$-Nash equilibria for the $N$-player game both in ordinary open-loop and ordinary feedback strategies. To this end, we approximate a limiting optimal relaxed control by an ordinary one, using a version of the chattering lemma that also works for feedback controls, at least in our finite setting. The approximating control is then shown to provide a symmetric $\epsilon_N$-Nash equilibrium, with $\epsilon_N$ of order $1/\sqrt{N}$, decentralized when considering feedback strategies. As explained above, our proof relies on the probabilistic representation of the system and a coupling argument.
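The idea behind a chattering approximation can be illustrated in the simplest possible case. The sketch below is a generic textbook-style illustration and does not reproduce the paper's lemma or its proof. It assumes a finite action set labelled 0, 1, ..., and a relaxed control that puts constant weights `weights` on these actions over a time interval. The ordinary control splits the interval into `pieces` subintervals and, within each, plays every action for a fraction of time equal to its weight. The names `chatter` and `time_share` are hypothetical.

```python
def chatter(weights, t0, t1, pieces):
    """Ordinary control approximating constant relaxed weights on [t0, t1].

    Returns a list of (start, end, action): within each of the `pieces`
    subintervals, action k is played for a share weights[k] of the time.
    """
    schedule = []
    width = (t1 - t0) / pieces
    for k in range(pieces):
        start = t0 + k * width
        for action, w in enumerate(weights):
            if w > 0:
                end = start + w * width
                schedule.append((start, end, action))
                start = end
    return schedule


def time_share(schedule, action):
    """Total time during which `action` is played."""
    return sum(end - start for start, end, a in schedule if a == action)
```

Whatever the number of pieces, each action is used for exactly its weight's share of the interval. As `pieces` grows, integrals of continuous functions of time and action against the ordinary control approach the corresponding integrals against the relaxed control, which is the sense of convergence relevant for relaxed controls.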
We also study the problem of finding solutions of the mean field game in ordinary feedback controls. There, we need stronger assumptions in order to guarantee the uniqueness of an optimal feedback control for any fixed flow of measures (existence always holds). Moreover, we prove that the feedback mean field game solution is unique either if the time horizon $T$ is small enough or if the cost coefficients satisfy the monotonicity conditions of Lasry and Lions (cf. below).
Roughly speaking, we need to assume only the continuity of the transition rates in order to have relaxed or relaxed feedback mean field game solutions and to obtain $\epsilon_N$-Nash equilibria for the $N$-player game, both open-loop and feedback. Under stronger assumptions, namely affine dependence of the rates on the control and strict convexity of the cost, we have uniqueness of the optimal feedback control for any flow of measures through the uniqueness of the minimizer of the associated Hamiltonian. Under assumptions similar to the latter, Basna et al. (2014) study the problem in the framework of non-linear Markov processes and find approximate Nash equilibria for the $N$-player game. In Gomes et al. (2013), the transition rates coincide with the control, in analogy with the original works of Lasry and Lions, and approximate Nash equilibria are obtained. Both these works consider ordinary feedback controls only, hence feedback solutions of the mean field game.
The work by Gomes et al. (2013) also contains a result in the convergence direction. More precisely, convergence of $N$-player Nash equilibria in feedback controls to the mean field limit is established, but only if the time horizon is sufficiently small. Moreover, the authors prove a result about the uniqueness of feedback mean field game solutions for arbitrary time horizon in case the Lasry-Lions monotonicity conditions hold.
Lastly, let us mention several recent preprints. In Doncel et al. (2016), continuous time mean field games with finite state space and finite action space are studied. The authors prove existence of solutions to the mean field game, corresponding to what we call solutions in relaxed feedback controls. Their prelimit models (the $N$-player games) are different and difficult to compare to ours since they are set in discrete time. The second work we mention is Benazzoli et al. (2017). There, the authors study a class of mean field games with jump diffusion dynamics. An existence result for the mean field game in the spirit of Lacker (2015) is given. The authors also obtain a convergence result in a special situation where Nash equilibria for the $N$-player games can be found explicitly. In their model, the jump heights are directly (and linearly) controlled, not the jump intensities.
The last two preprints, which appeared nearly simultaneously, after submission of the present paper, concern the convergence problem for finite state mean field games. In Cecchin and Pelino (2017), a joint work of the first author, the convergence of feedback Nash equilibria to solutions of the mean field game is studied following the ideas of Cardaliaguet et al. (2015) sketched above. The Master Equation, which in this case is a first order PDE stated in the simplex of probability measures on the state space, is employed to obtain convergence of the feedback Nash equilibria, the value functions and a propagation of chaos property for the $N$-player optimal trajectories. Provided that the Master Equation possesses a (unique) classical solution, convergence is established through a coupling argument, which relies on the probabilistic representation of the dynamics introduced here. Existence of a unique classical solution to the Master Equation is verified under the Lasry-Lions monotonicity conditions. In addition, a central limit theorem and a large deviation principle for the $N$-player empirical measure processes are proved. In the independent work by Bayraktar and Cohen (2017), the authors again use the Master Equation in the spirit of Cardaliaguet et al. (2015) to find the same convergence result as in Cecchin and Pelino (2017), but using a slightly different probabilistic representation of the dynamics. They also obtain a central limit theorem for the fluctuations of the $N$-player empirical measure processes.
Structure of the paper
In Section 2, we introduce the notation and various assumptions to be used in the sequel. Then we describe the $N$-player games as well as the corresponding mean field game, giving the relevant definitions of Nash equilibrium and solution of the mean field game. Relaxed controls (open-loop and feedback) are introduced there as well, while a proper definition of relaxed Poisson measures is given in Appendix A. All main assumptions are verified to hold for the natural shape of $f$ in (1.3).
In Section 3, we establish existence of solutions to the mean field game in relaxed open-loop as well as relaxed feedback controls.
In Section 4, we find, under additional assumptions, mean field game solutions in non-relaxed feedback controls by proving the uniqueness of the optimal control for any flow of measures. Moreover, uniqueness of solutions is proved either for a small time horizon $T$ or under the Lasry-Lions monotonicity conditions.
In Section 5, we first establish a version of the chattering lemma that works also for feedback controls. Then we turn to the construction of approximate Nash equilibria coming from a solution of the mean field game, and derive the error bound mentioned above for feedback as well as open-loop strategies.
Section 6 contains a summary of the main results.
2. Description of the model
2.1. Notations and assumptions
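Throughout the paper, we fix $\Sigma = \{1, \ldots, d\}$ to be the finite state space of any player. Let $T > 0$ be the finite time horizon and $A$ be a compact metric space, the space of control values. Let $U$ be a compact set in $\mathbb{R}^d$ and let $\nu$ be a Radon measure on $U$. Let
$$\mathcal{P}(\Sigma) = \Big\{ p \in \mathbb{R}^d : p_j \geq 0 \text{ for } j = 1, \ldots, d, \ \sum_{j=1}^{d} p_j = 1 \Big\}$$
be the space of probability measures on $\Sigma$, which is the probability simplex in $\mathbb{R}^d$. Let $f : \Sigma \times U \times A \times \mathcal{P}(\Sigma) \to \mathbb{Z}$ be a measurable function (the one appearing in the dynamics (1.2) and (1.1)) such that $x + f(x, \theta, a, p) \in \Sigma$. Let $c : [0,T] \times \Sigma \times A \times \mathcal{P}(\Sigma) \to \mathbb{R}$, $\psi : \Sigma \times \mathcal{P}(\Sigma) \to \mathbb{R}$ be measurable functions, representing the running and the terminal costs, respectively, which will be the same for all players.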
We will denote by $\mathcal{N}$ any stationary Poisson random measure on $[0,T] \times U$ with intensity measure $\nu$ on $U$, and by $(\mathcal{N}_1, \ldots, \mathcal{N}_N)$ a vector of $N$ i.i.d. stationary Poisson random measures, each with the same law as $\mathcal{N}$. The initial point of the $N$-player game will be represented by $N$ i.i.d. random variables $\xi_1, \ldots, \xi_N$ with values in the state space $\Sigma$ and common distribution $m_0$, which will be fixed throughout. Similarly, the initial point of the limiting system will be represented by a random variable $\xi$ with law $m_0$.
The state of player $i$ at time $t$ is denoted by $X_i(t)$. The trajectories of any process are assumed to be in $D([0,T], \Sigma)$, which denotes the space of càdlàg functions from $[0,T]$ to $\Sigma$, endowed with the Skorokhod $J_1$-topology. Let $\mu^N(t) = \frac{1}{N} \sum_{i=1}^{N} \delta_{X_i(t)}$ be the empirical measure of the system of $N$ players. In the limiting dynamics, the empirical measure is replaced by a deterministic flow of probability measures $m : [0,T] \to \mathcal{P}(\Sigma)$.
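On a finite state space the empirical measure of the players' states is simply the vector of state frequencies, a point of the probability simplex. A minimal sketch, assuming states labelled 0, ..., d-1 and a hypothetical helper name `empirical_measure`:

```python
def empirical_measure(states, d):
    """Fraction of players in each of the d states (a point of the simplex)."""
    counts = [0] * d
    for x in states:
        counts[x] += 1
    n = len(states)
    return [c / n for c in counts]
```

For instance, four players in states 0, 0, 1, 2 with d = 3 give the vector [0.5, 0.25, 0.25]. The result does not depend on the order of the players, which reflects the symmetry of the interaction.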
The space of measures can be equipped with any norm in , as they are all equivalent, so we choose the Euclidean norm . We observe that is a compact and convex subset of . Denote by the space of continuous functions from to , endowed with the uniform norm. The space of flows of probability measures on will be denoted by , which will be shown in Subsection 3.1 to be
[TABLE]
where the constant is given by .
We will study several types of controls. Pathwise existence and uniqueness of solutions to the controlled dynamics (1.2), with trajectories that remain in , is guaranteed by the following Lipschitz condition:
[TABLE]
for every and , where is a constant. The above condition is always satisfied in our model since for each and ; thus we may take .
Let us summarize here the various sets of assumptions we will make use of:
- (A)
The function defined by is continuous in (uniformly, and is bounded), that is, there exists a function such that and
[TABLE]
for every , , , ;
- (A’)
Assumption (A) holds and is Lipschitz in :
[TABLE]
- (A”)
Assumption (A’) holds and is Lipschitz also in :
[TABLE]
- (B)
The running cost is continuous (and bounded) in and the terminal cost is continuous (and bounded) in ;
- (B’)
Assumption (B) holds and the costs and are Lipschitz in :
[TABLE]
- (B”)
Assumption (B’) holds and the running cost is Lipschitz also in :
[TABLE]
The above assumptions will be used in Sections 3 and 5 to find solutions of the mean field game and then approximate Nash equilibria for the $N$-player game, both in open-loop and in feedback form.
Our last assumption will be more implicit. We identify the set of functions with and observe that any is bounded and Lipschitz. For any , , , and define the generator
[TABLE]
and the pre-Hamiltonian
[TABLE]
In order to obtain existence and uniqueness of feedback mean field game solutions, in Section 4, we will make the additional hypothesis:
- (C)
For any , , and there exists a unique minimizer of in ;
We observe that for any fixed and the function is measurable, thanks to Theorem D.5 in Hernández-Lerma and Lasserre (1996). We remark also that the limiting dynamics (1.2) always admits a pathwise unique solution thanks to (2.1).
2.2. N-player game
In the prelimit, we consider a system of $N$ symmetric players governed by the dynamics
[TABLE]
where and . Here, the controls are in open-loop form. Let us specify the controls to be used in the $N$-player game.
Definition 1.
Define the set of strategy vectors as
[TABLE]
where is a filtered probability space, is a vector of i.i.d. -measurable random variables with law , the initial points, is a vector of i.i.d. stationary Poisson random measures with respect to the filtration with intensity measure on , , and is a vector of -valued -predictable processes . We will often write to indicate the process .
Define the set of feedback strategy vectors as
[TABLE]
where is measurable and the filtered probability space and the and are as above. We will often write to indicate the function .
We observe that the above definition of feedback strategy vector is not standard, as it is given together with the probability space and the noise. We give such a definition because in this way any strategy gives a unique pathwise solution to dynamics (2.9). Indeed, provided that is Lipschitz in , we have pathwise existence and uniqueness of solutions to the system (2.9), for any .
Given a feedback strategy vector , equation (2.9) is written as
[TABLE]
for each . The same assumption as above provides existence and uniqueness of solutions to this equation, so we can define the related open-loop control by
[TABLE]
In view of Definition 1, the open-loop control has to be given together with a filtered probability space, a vector of initial conditions and a vector of Poisson random measures, which we impose to be the same as those given with the feedback control .
Next, we define the object of the minimization. Let be a strategy vector and be the solution to dynamics (2.9). For set
[TABLE]
Define also for any .
We look for approximate Nash equilibria for the $N$-player game. We therefore specify the perturbed strategy vectors that we consider.
Notation 1.
Let be an -valued -predictable process. For a strategy vector in denote by the strategy vector such that
[TABLE]
For a feedback strategy vector , let be the solution to
[TABLE]
Denote then by the strategy vector such that
[TABLE]
Definition 2.
Let . A strategy vector is said to be an $\epsilon$-Nash equilibrium if for each
[TABLE]
for every such that is a strategy vector.
A vector is called a feedback $\epsilon$-Nash equilibrium if
[TABLE]
for every such that is a strategy vector.
We remark that the above definition of feedback $\epsilon$-Nash equilibrium is not standard. Indeed, the perturbed strategy vector is usually required to be in feedback form. In our definition, a slightly more restrictive (or stronger) condition is used since the perturbing strategy is allowed to be in open-loop form. As a consequence, the approximation result of Section 5 will be slightly stronger than with the standard definition.
2.3. Mean field game
The mean field limiting system consists of a single player whose state evolves according to the dynamics
[TABLE]
Here the empirical measure appearing in (2.9) is replaced by a deterministic flow of probability measures .
Definition 3.
The set of open-loop controls is the set
[TABLE]
where is a filtered probability space, is an -measurable random variable with law , the initial condition, is a stationary Poisson random measure with respect to the filtration with intensity measure on , , and is an -valued -predictable process. We will often write to indicate the process .
Define the set of feedback controls as
[TABLE]
where is measurable and the filtered probability space, the initial condition and the Poisson random measure are as above. We will often write to indicate the function .
We remark that the feedback control is given with the probability space and the noise, in analogy with Definition 1 for the prelimit system.
Thanks to the Lipschitz condition (2.1), the limiting dynamics is well defined. More precisely, given any open-loop control and flow of measures , there exists a pathwise unique solution of Eq. (2.10), which we will denote by . Similarly, given any feedback control and flow of measures , there exists a pathwise unique process solving
[TABLE]
The corresponding open-loop control is then defined as
[TABLE]
In view of Definition 3, the open-loop control has to be given together with a filtered probability space, an initial condition and a Poisson random measure, which we impose to be the same as those given with the feedback control .
We define the object of the minimization for the mean field game. For any and set
[TABLE]
Define also for any .
The notion of solution for the limiting mean field game, which will provide approximate Nash equilibria for the $N$-player game, is the following.
Definition 4.
An open-loop solution of the mean field game (2.10) is a triple
[TABLE]
such that
- (1)
, , is adapted to the filtration and ;
- (2)
Optimality: for every ;
- (3)
Mean Field Condition: for every .
We say that is a feedback solution of the mean field game if and is an open-loop solution of the mean field game, where is defined in (2.12).
In our writing, we will often drop the filtered probability space and the Poisson random measure from the notation.
In condition (3) of the above definition, as usual. Let us denote by the flow of the process , that is, . Then the mean field condition can be written as .
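For a finite state chain with given transition rates, the flow of laws solves a linear system of ordinary differential equations (the Kolmogorov forward equation), so the mean field condition can be checked numerically once a control and a candidate flow have been frozen. The sketch below is an illustration under stated assumptions, not the paper's method: states are labelled 0, ..., d-1, the hypothetical function `rate(t, x, y)` returns the transition rate from x to y at time t with control and measure flow already plugged in, and the equation is discretised by an explicit Euler scheme.

```python
def law_flow(p0, rate, horizon, steps):
    """Explicit Euler scheme for the flow of laws of a finite state chain.

    p0 is the initial distribution; rate(t, x, y) is the transition rate
    from x to y at time t. Returns the list of distributions on the grid.
    """
    d = len(p0)
    dt = horizon / steps
    p = list(p0)
    flow = [list(p)]
    for k in range(steps):
        t = k * dt
        new = list(p)
        for x in range(d):
            for y in range(d):
                if y != x:
                    mass = dt * rate(t, x, y) * p[x]  # mass moving from x to y
                    new[x] -= mass
                    new[y] += mass
        p = new
        flow.append(list(p))
    return flow


def mean_field_gap(flow_a, flow_b):
    """Largest componentwise distance between two flows on the same grid."""
    return max(abs(a - b) for p, q in zip(flow_a, flow_b) for a, b in zip(p, q))
```

A candidate flow satisfies the mean field condition, up to discretisation error, when `mean_field_gap` between the candidate and the flow computed from the rates it induces is close to zero.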
2.4. Relaxed controls
The space is not itself compact. In order to always have convergence along subsequences, we need to enlarge the space of controls, considering relaxed controls and related relaxed Poisson measures. They are used only for the limiting system.
Definition 5.
A deterministic relaxed control is a measure on the Borel sets such that
[TABLE]
The space of deterministic relaxed controls will be denoted by .
Given , the time derivative exists for Lebesgue-almost every ; it is the probability measure on given by
[TABLE]
As a consequence, can be factorized according to
[TABLE]
The space is endowed with the topology of weak convergence of measures, i.e. if and only if
[TABLE]
for every continuous on . Moreover there exists a metric which makes a compact metric space (for instance, Kushner and Dupuis, 2001).
Definition 6.
The space of (stochastic) relaxed controls is
[TABLE]
where is a filtered probability space, is a -valued random variable such that is -adapted for every , and is a stationary Poisson random measure with respect to the filtration with intensity measure on . We will often write to denote the process .
The space of relaxed feedback controls is the set
[TABLE]
where is measurable, is endowed with the topology of weak convergence, and the filtered probability space, the initial condition and the Poisson random measure are as above. We will often write to denote the process .
The relaxed feedback control is given with the probability space and the noise, in analogy with Definition 3. Because of (2.14), the derivative is an -predictable process for any . An ordinary open-loop control can be viewed as a relaxed control in which the derivative in time is a Dirac measure:
[TABLE]
We also have to introduce the corresponding relaxed Poisson measure in order to have well-defined dynamics. This will be done properly in Appendix A. Given any , Borel sets , , the relaxed Poisson measure related to the relaxed control has the property that the processes
[TABLE]
are -martingales, and are orthogonal for disjoint . This martingale property and the fact that is a counting measure valued process define the distribution of and the joint law of uniquely (see Appendix A). The martingale property (2.17) also implies that the process
[TABLE]
is an -martingale, for any bounded and measurable . For an ordinary control (or the relaxed control it induces), the corresponding relaxed Poisson measure is explicitly given by
[TABLE]
The stochastic differential equation (2.10) in this more general framework with a relaxed Poisson measure is written as
[TABLE]
for any relaxed control and .
Given a relaxed feedback control and a process , define the corresponding relaxed open-loop control through
[TABLE]
Let be the relaxed Poisson measure corresponding to . Equation (2.20) then becomes
[TABLE]
where the solution process appears also in the relaxed Poisson measure.
The proof of the following lemma is given in Appendix A.1.
Lemma 1.
For any and , respectively , there exists a pathwise unique solution to the stochastic differential equation (2.20), respectively (2.22).
The solutions to (2.20) and (2.22) will be denoted by and respectively. For , let denote the corresponding relaxed control defined by (2.21), that is, is the relaxed open-loop control such that
[TABLE]
In view of Definition 6, the relaxed open-loop control has to be given together with a filtered probability space, an initial condition and a Poisson random measure, which we impose to be the same as those coming with the relaxed feedback control .
Let and . Let . Thanks to the martingale property (2.18), we obtain that the process
[TABLE]
is an -martingale, for any . This yields the Dynkin formula
[TABLE]
The cost to be minimized is
[TABLE]
Define also for . The definitions of relaxed solution of the mean field game (2.20) and relaxed feedback solution are analogous to Definition 4, where ordinary controls are replaced by relaxed controls.
Definition 7.
A relaxed solution of the mean field game (2.10) is a triple
[TABLE]
such that
- (1)
, , is adapted to the filtration and ;
- (2)
Optimality: for every ;
- (3)
Mean Field Condition: for every .
We say that is a relaxed feedback solution of the mean field game if and is a relaxed solution of the mean field game, where is defined in (2.23).
In our writing, we will often drop the filtered probability space and the Poisson random measure from the notation.
In Section 3.2 we will show the existence of relaxed mean field game solutions via a fixed point argument, while existence of a relaxed feedback mean field game solution is established in Section 3.3.
We will use the characterization of solutions to (2.20) via the controlled martingale problem. The proof of the following lemma is omitted; it can be derived by mimicking the proof of Theorem 2.8.1 in Kushner (1990, p. 42).
Lemma 2.
Let and . Then solves equation (2.20) in distribution if and only if the process defined in (2.24) is an -martingale for any . The underlying filtered probability space can always be assumed to be , where is the canonical space for defined in Appendix A, the canonical filtration, and is the canonical process.
The martingale property holds if and only if
[TABLE]
for every and every choice of , , , , such that .
In Section 4, under additional assumptions, we will prove existence of feedback mean field game solutions (not relaxed); such solutions will be shown to be unique either if the time horizon is small or if the Lasry-Lions monotonicity assumptions apply.
2.5. Example
We show how our assumptions are satisfied for a natural shape of the function for which, when considering and constants, the transition rates of the Markov chain solution of the dynamics (2.10) appear explicitly. Consider then defined by
[TABLE]
and the intensity measure on defined by
[TABLE]
where , which is viewed as a subset of , and is the Lebesgue measure on .
The function appearing in (2.28) yields the transition rates of the Markov chain solution of (2.10), that is, for , as ,
[TABLE]
Moreover, the measure defined in (2.29) has the property that
[TABLE]
for any bounded and measurable . In particular,
[TABLE]
for any function such that , .
If we want to depend also on a control and a flow of measures, we may consider the rate to depend also on and , so that (2.28) is rewritten as
[TABLE]
We also assume that is bounded by a constant (which holds a posteriori by the assumptions of the next lemma) and . With this , (2.30) becomes
[TABLE]
where is the solution of (2.10) under the control and flow of measures and denotes expectation with respect to the conditional probability provided . In particular, if , then the transition rate is . A proof of (2.30), (2.31) and (2.34) can be found in Turchi (2015), where the examples (2.28) and (2.33) were treated.
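To illustrate how an indicator-type jump function combined with a Lebesgue-type intensity produces prescribed transition rates, the following Python sketch simulates a finite-state chain by thinning a Poisson stream. It is only a sketch under stated assumptions: the states are labelled 0, ..., d-1, the rate table `rates`, the bound `bound` and the function name `simulate_chain` are hypothetical, and controls and measure flows are frozen so that the rates are constants. Each target state carries a Poisson stream of intensity `bound` with marks uniform on [0, bound], and a jump from x to y is accepted exactly when the mark falls below `rates[x][y]`.

```python
import random


def simulate_chain(rates, x0, horizon, bound, rng):
    """Simulate a chain on {0, ..., d-1} up to time `horizon` by thinning.

    rates[x][y] is the (assumed constant) transition rate from x to y and
    must satisfy 0 <= rates[x][y] <= bound. Returns the list of (time, state)
    pairs recorded at time 0 and at each accepted jump.
    """
    d = len(rates)
    t, x = 0.0, x0
    path = [(0.0, x0)]
    while True:
        # Superposition of d independent Poisson streams of intensity `bound`.
        t += rng.expovariate(bound * d)
        if t > horizon:
            return path
        y = rng.randrange(d)             # axis (target state) carrying the atom
        theta = rng.uniform(0.0, bound)  # mark of the atom on that axis
        # Indicator-type acceptance: jump only if the mark lies below the rate.
        if y != x and theta < rates[x][y]:
            x = y
            path.append((t, x))
```

With this acceptance rule a jump from x to y occurs at rate (bound · d) · (1/d) · (rates[x][y] / bound) = rates[x][y], which is the mechanism behind the explicit transition rates discussed above. The last entry of the returned path gives the state at the horizon.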
Let us check whether our assumptions on the model are satisfied for the above choice of and .
Lemma 3.
Let be defined by (2.33) and by (2.29).
- If the rate appearing in (2.33) is continuous in and , then (A) holds;
- If in addition is Lipschitz in , then (A’) holds;
- If in addition is Lipschitz also in , then (A”) holds.
Proof.
Let , , and fix . Then
[TABLE]
Applying (2.32), the last expression above is equal to
[TABLE]
which gives the claims. ∎
In order to verify assumption (C), we need additional hypotheses on the structure of the model.
Lemma 4.
Let be defined by (2.33) and by (2.29). Assume that is a compact and convex subset of a metric topological vector space. Let the running cost be strictly convex in and the rate appearing in (2.33) be affine (in the sense of being both convex and concave) in . Then assumption (C) is satisfied.
Proof.
We have where
[TABLE]
Applying formula (2.31) we obtain
[TABLE]
which is an affine function of if is affine in . Therefore, is a strictly convex function of if is strictly convex, and thus it has a unique minimum in . ∎
3. Relaxed Mean Field Game Solutions
3.1. The space
In order to prove the existence of solutions we use a fixed point theorem. First of all, we want to find a suitable space where all the flows of probability measures lie. Set and denote by
[TABLE]
the space of Lipschitz continuous flows of probability measures, with the same Lipschitz constant and initial point . This space is easily seen to be convex and compact with respect to the uniform norm, thanks to the Ascoli-Arzelà theorem. The following lemma allows us to restrict attention to flows of probability measures in .
Lemma 5.
Let , or , and let be any deterministic flow of probability measures. Then the flow of the solution process , or , is in .
Proof.
We prove the claim for relaxed controls, so the conclusion follows also when considering the subset of ordinary controls. Let be a function, which is then Lipschitz and bounded and can be viewed as a vector in . Denote . Let and be fixed, and set . The function has a priori no regularity, except for being measurable. By the Dynkin formula (2.25) we have, for any ,
[TABLE]
Hence
[TABLE]
thanks to the fact that is a probability measure on for any . Clearly, . Thus, for any and ,
[TABLE]
which gives the claim. ∎
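The Lipschitz property of the flow can be checked numerically on a toy model. The sketch below is an illustration under assumptions, not part of the proof: it integrates the forward (Kolmogorov) equation dm/dt = m Q(t) with an explicit Euler scheme, where the time-dependent rate table returned by the hypothetical function `rate_at` plays the role of the rates obtained after fixing a control and a flow of measures. The names `forward_flow` and `l1` are also hypothetical.

```python
def forward_flow(m0, rate_at, horizon, steps):
    """Explicit Euler scheme for dm/dt = m Q(t) on a finite state space.

    rate_at(t)[x][y] is the transition rate from x to y at time t (y != x).
    Returns the list of distributions at the grid times k * horizon / steps.
    """
    d = len(m0)
    dt = horizon / steps
    flow = [list(m0)]
    for k in range(steps):
        q = rate_at(k * dt)
        m = flow[-1]
        new = list(m)
        for x in range(d):
            for y in range(d):
                if y != x:
                    mass = dt * m[x] * q[x][y]  # mass moved from x to y
                    new[x] -= mass
                    new[y] += mass
        flow.append(new)
    return flow


def l1(u, v):
    """l1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

If all rates are bounded by M on a space with d states, each Euler increment satisfies l1(m(t + dt), m(t)) <= 2 M (d - 1) dt, so the discrete flow is Lipschitz with a constant that depends only on the bound of the rates and not on their regularity in time.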
3.2. Existence of relaxed mean field game solutions
3.2.1. Tightness and continuity for fixed
Consider a sequence of random variables
[TABLE]
where is a relaxed control, is the related relaxed Poisson measure and , is fixed. The state space of these random variables is , where denotes the set of finite positive measures on endowed with the topology of weak convergence.
The following is of fundamental importance, and is similar to Theorem 13.2.1 in Kushner and Dupuis (2001, p. 363).
Theorem 1.
Assume (A) and (B). Then
- (1) any sequence of the form (3.2) is tight;
- (2) the limit in distribution of any converging subsequence is such that is the relaxed Poisson measure related to the relaxed control and in distribution;
- (3) is continuous in .
Proof.
(1) The sequence of relaxed controls is tight as is compact. For any , the set
[TABLE]
is compact in , since is compact. From (2.13) and the martingale property (2.17), it follows that is a martingale for any and so . Therefore, by Chebychev’s inequality,
[TABLE]
for any , saying that the sequence of relaxed Poisson measures is tight. The properties of the stochastic integral give
[TABLE]
for any -stopping time , uniformly in , which yields the tightness of the processes in by Aldous’s criterion (Aldous, 1978).
(2) By abuse of notation, denote by the subsequence which converges in distribution to . From the martingale property (2.17), it follows that is a martingale for any Borel sets and , where the limiting measure is defined on the canonical space and the filtration is the canonical filtration (both defined in Appendix A). The limit random measure is integer valued (Theorem 15.7.4 in Kallenberg, 1986), so the uniqueness property says that in distribution. The claim in distribution will also be shown in the proof of Theorem 2, where is not fixed, using the controlled martingale problem, so we do not repeat the argument here.
(3) This follows since and are bounded and continuous by assumption (B). ∎
By the chattering lemma, which we will present later as Lemma 8, we have
[TABLE]
The minimum on the left hand side exists by the above Theorem 1. The infimum on the right hand side is actually a minimum, too; see Theorem 4 below, where the existence of optimal feedback controls will be shown. However, there might exist more optima among relaxed open-loop controls than among ordinary feedback controls.
3.2.2. Fixed point argument
Let be the set of subsets of and define the point-to-set map by
[TABLE]
A flow is called a fixed point of this point-to-set map if . We need this map since the optimal control is not necessarily unique.
By construction, has a fixed point if and only if there exists a relaxed solution to the mean field game, in the sense of Definition 7. In order to prove the existence of a fixed point, we are going to apply Theorem 1 in Fan (1952), which requires the following definition.
Definition 8.
Let be a metric space. A map is said to have closed graph if , , for any and , in implies .
Proposition 1 (Ky Fan).
Let be a non empty, compact and convex subset of a locally convex metric topological vector space. Let have closed graph and assume that is non empty and convex for any . Then the set of fixed points of is non empty and compact.
By means of this proposition we are now able to state and prove the following main theorem concerning existence of relaxed solutions, while uniqueness is not guaranteed.
Theorem 2.
Under assumptions (A) and (B) there exists at least one relaxed solution of the mean field game (2.20).
Proof.
We want to show the existence of a fixed point for the map defined in (3.3), applying Proposition 1. Recall that any element of is in by Lemma 5, and the set defined in (3.1) is a compact and convex subset of endowed with the uniform norm. By Theorem 1, is non empty for any . It remains to prove that is convex and has closed graph.
**Convexity.** Let be fixed and let be such that and belong to , i.e. and are optimal controls for , and take . Let be a Bernoulli random variable with parameter , measurable and independent of and . Define by
[TABLE]
for any and . We have
[TABLE]
for every . This implies that
[TABLE]
and then in particular
[TABLE]
Since and are optimal for we have, thanks to (3.4),
[TABLE]
for any , which means that also is optimal for and hence (3.5) says that is convex.
**Closed graph.** Let be such that , in and for every . We have to prove that . Let be optimal for and such that . Set and let be the relaxed Poisson measure related to .
The tightness of the sequence is proved as in Theorem 1. Let be a subsequence which converges in distribution to . We have in distribution, i.e. it is the relaxed Poisson measure related to . In order to prove that in distribution, we use the controlled martingale problem formulation stated in Lemma 2, and hence let us assume that the processes are defined in the canonical space.
Property (2.27) holds for , and , any . Let denote the process defined by
[TABLE]
for any . Property (2.27) and the convergence in distribution of the sequence imply that
[TABLE]
thanks to continuity assumption (A), uniform convergence of and (2.16). Therefore we have proved that in distribution.
Thus we obtain
[TABLE]
which implies the convergence
[TABLE]
that is, uniformly. The convergence is then proved along a subsequence, but by hypothesis the limit exists in , hence .
It remains to prove that is optimal for . Again the convergence in distribution of the sequence implies that thanks to continuity assumption (B), uniform convergence of and (2.16). Then from the optimality of for , i.e. for every , taking the limit as we get for every , which means that is optimal for and thus as required. ∎
3.3. Relaxed feedback mean field game solutions
Theorem 2 provides a relaxed (open-loop) solution of the mean field game (2.20). Under the same assumptions we obtain here a relaxed feedback mean field game solution which has the same cost and flow of the open-loop one. This result is similar to Theorem 3.7 in Lacker (2015) and will provide approximate feedback Nash equilibria for the -player game.
Theorem 3.
Assume (A) and (B) and let be a relaxed mean field game solution. Then there exists a relaxed feedback control such that the tuple is a relaxed feedback mean field game solution; namely
[TABLE]
Proof.
The flow is fixed and set . We claim that there exists a measurable function such that
[TABLE]
This holds if and only if
[TABLE]
for any bounded and measurable . In order to construct , define the probability measure on by
[TABLE]
Then build by disintegration of :
[TABLE]
where denotes the marginal of and is measurable. Following Lacker (2015), we show that such satisfies (3.8): for every bounded and measurable we get
[TABLE]
which provides (3.8) thanks to Lemma 5.2 in Brunick and Shreve (2013).
Having , (3.8) yields
[TABLE]
-almost everywhere.
Then we solve equation (2.22) in the same probability space of , under the relaxed feedback control , and denote by its solution. By the Dynkin formula (2.25), we have for any ,
[TABLE]
and then thanks to (3.8)
[TABLE]
while Dynkin’s formula for yields
[TABLE]
Comparing (3.9) and (3.10) we obtain that and , which are vectors in , satisfy the same ODE in integral form, namely
[TABLE]
for any , the unknown being denoted by . Taking , , the corresponding system of ODEs, which is clearly linear in , has a unique absolutely continuous solution , hence (3.6) is proved.
Similarly, (3.8) gives
[TABLE]
and then we use (3.6) to conclude that
[TABLE]
∎
4. Feedback Mean Field Game Solutions
4.1. Feedback optimal control for fixed
We show the existence of an optimal non-relaxed feedback control for for any , using the verification theorem for the related Hamilton-Jacobi-Bellman equation. Let be fixed.
For any , and let be the solution to
[TABLE]
and set
[TABLE]
Next, define the value function by
[TABLE]
Recall that the generator was defined in (2.7) by
[TABLE]
for any and . For a function the generator will be applied to the space variable, i.e. denote .
Thanks to Theorem D.5 in Hernández-Lerma and Lasserre (1996) on measurable selectors, there exists a feedback control (i.e. measurable) such that
[TABLE]
where is the value function (4.2). Let us remark that the above minimum exists for any and if (A) and (B) hold, as the right hand side turns out to be a continuous function of the variable , since the value function is trivially Lipschitz continuous in .
Theorem 4.
Assume (A) and (B). Let . Then any feedback control defined by (4.3) is optimal, that is, for any .
In order to prove Theorem 4, we use the Hamilton-Jacobi-Bellman equation of the problem (see, for instance, Chapter 3 in Fleming and Soner (2006)):
[TABLE]
for a function . Let us define, for ,
[TABLE]
Since is finite, we shall denote , , , , and . Therefore (4.4) can be written as
[TABLE]
which is in fact an ODE.
Define a classical solution to (4.5) as an absolutely continuous function from to such that for every . We apply to our problem the following verification theorem, which is a version of Theorem 3.8.1 in Fleming and Soner (2006, p. 135):
Proposition 2 (Verification).
Let be a classical solution to (4.5), and let be any feedback control such that (4.3) holds for Lebesgue almost every . Then
[TABLE]
for any and , where is the value function (4.2).
We are now in the position to prove Theorem 4.
Proof of Theorem 4.
In view of Proposition 2, we have just to show that there exists a classical solution to (4.5). Hence it is enough to prove that is globally Lipschitz continuous in , uniformly in . So let be fixed and take and . Recall that
[TABLE]
and let be a minimizer for . Then
[TABLE]
Changing the role of and we obtain the converse, hence
[TABLE]
for any , which implies
[TABLE]
Therefore is Lipschitz continuous in in the norm , which is equivalent to the Euclidean norm in . ∎
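Since the Hamilton-Jacobi-Bellman equation on a finite state space is an ODE, it can be integrated backward in time. The following Python sketch does so with an explicit Euler scheme and records a minimizing action at each grid point, which yields a feedback control. It rests on assumptions that are not taken from the text: the equation is taken in the standard finite-state form -dV/dt (t, x) = min over a of { cost(t, x, a) + sum over y of rate(t, x, y, a) (V(t, y) - V(t, x)) } with V(T, x) = terminal(x), the flow of measures is frozen inside `cost`, `rate` and `terminal`, and the compact action space is replaced by a finite grid `actions`. All function and parameter names are hypothetical.

```python
def solve_hjb(cost, rate, terminal, actions, d, horizon, steps):
    """Backward explicit Euler scheme for a finite-state HJB equation.

    Returns (value, policy): value[k][x] approximates V(k * dt, x) and
    policy[k][x] is a minimizing action on the grid at time k * dt.
    """
    dt = horizon / steps
    value = [None] * (steps + 1)
    policy = [None] * steps
    value[steps] = [terminal(x) for x in range(d)]
    for k in range(steps - 1, -1, -1):
        t = k * dt
        nxt = value[k + 1]
        v_row, a_row = [], []
        for x in range(d):
            best = None
            for a in actions:
                # Pre-Hamiltonian: running cost plus generator applied to V.
                h = cost(t, x, a) + sum(
                    rate(t, x, y, a) * (nxt[y] - nxt[x])
                    for y in range(d) if y != x
                )
                if best is None or h < best[0]:
                    best = (h, a)
            v_row.append(nxt[x] + dt * best[0])
            a_row.append(best[1])
        value[k] = v_row
        policy[k] = a_row
    return value, policy
```

The array `policy` is a discrete analogue of a measurable selector of the minimizers, in the spirit of (4.3); when the minimizer is not unique the loop simply keeps the first one found on the grid.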
4.2. Uniqueness of the feedback control for fixed
Consider the pre-Hamiltonian, as defined in (2.8),
[TABLE]
for and . We make the additional assumption (C); so let us recall that is the unique minimizer of in . Define for the feedback control
[TABLE]
where is the value function (4.2).
Theorem 5.
Assume (A), (B) and (C). Given , let be any optimal relaxed control for and let be the corresponding solution to (2.20). Then for -almost every , that is, corresponds to the feedback control .
This result and the proof of Theorem 2 imply that any relaxed solution of the mean field game must correspond to a feedback solution:
Corollary 1.
Assume (A), (B) and (C). Then there exists a feedback solution of the mean field game, and any solution is such that its control coincides with .
Let , and define
[TABLE]
Lemma 6.
If is continuous in , then
[TABLE]
for any and . Moreover, if (C) holds, then there exists a unique such that
[TABLE]
and , where .
Proof.
If is continuous in , then is continuous in in the weak topology. Since is compact, there exists a minimum: let be a minimizer. For fixed and we have
[TABLE]
and
[TABLE]
which means that .
Consider as a function of : it is non-negative and, if (C) holds, it equals zero if and only if . Therefore,
[TABLE]
which implies the claim, namely that . ∎
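The content of Lemma 6 is easy to see when the action set is finite: a relaxed control is then a probability vector over the actions, the relaxed pre-Hamiltonian is linear in that vector, and its minimum is attained at a Dirac mass on a minimizing action. The Python sketch below illustrates this special case only; the finite action set and the names `relaxed_value` and `dirac_minimizer` are assumptions made for the example.

```python
def relaxed_value(weights, values):
    """Integral of the pre-Hamiltonian values against a probability vector."""
    return sum(w * v for w, v in zip(weights, values))


def dirac_minimizer(values):
    """Probability vector concentrated on the first index minimizing `values`."""
    i = min(range(len(values)), key=values.__getitem__)
    return [1.0 if j == i else 0.0 for j in range(len(values))]
```

If the minimizing index is unique, as under assumption (C), any probability vector that puts mass elsewhere gives a strictly larger value, so the Dirac mass is the only minimizer.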
Remark 1.
Note that if (C) does not hold, then is supported on the set of all minimizers of . Thus it might not be a Dirac measure. This implies that there may exist an optimal relaxed control which is not an ordinary control (not even open-loop).
Proof of Theorem 5.
Fix . Let be an optimal relaxed control and denote by the corresponding optimal trajectory. By the chattering lemma, which we will state later as Lemma 8 (only the open-loop part of the lemma is needed here, which is well known, so we postpone the proof to Section 5, where we also give the feedback part),
[TABLE]
where is the value function defined in (4.2). Thanks to (4.4), the Hamilton-Jacobi-Bellman equation, and (4.7), we have
[TABLE]
By the Dynkin formula (2.25) and the terminal condition for ,
[TABLE]
It follows that
[TABLE]
hence, in view of (4.8),
[TABLE]
for -almost every , which means that
[TABLE]
for -almost every . If (C) holds, then, by Lemma 6, the unique minimizer of is the measure with . It follows that for -almost every .
∎
4.3. Uniqueness of the feedback MFG solution for small time
In this subsection, we focus only on the dynamics for in (2.33),
[TABLE]
and defined in (2.29) with . Moreover, we assume that and, for ,
[TABLE]
where is some Lipschitz continuous function with Lipschitz constant such that for some . Since determines the transition rates, we set , .
We assume that the cost in the variable is in , is Lipschitz continuous in the variable with Lipschitz constant and is uniformly convex, that is, there exists such that
[TABLE]
for all .
This setup is analogous to the one considered in Gomes et al. (2013). The assumptions of Lemma 4 are satisfied and thus for any there exists a unique minimizer of , which in this setting becomes
[TABLE]
The assumptions of Lemma 1 are satisfied so that (A”) and (B”) hold. We need to be Lipschitz continuous in and ; this fact is proved in Proposition 1 in Gomes et al. (2013). We state the result in the following
Lemma 7.
Under the above assumptions (in this subsection), the function is Lipschitz continuous in and :
[TABLE]
for any .
Let us fix here the filtered probability space, the initial condition and the Poisson random measure. Define as in (4.6): it is the unique feedback control for given flow of measures , where is the value function defined in (4.2) with respect to . The cost functions and are uniformly bounded and so is the value function: Let us denote by the maximum of its absolute value. Denote by the maximum of and fix the constants
[TABLE]
Let be such that
[TABLE]
Theorem 6.
Under the assumptions of this subsection, for any there exists a unique feedback solution of the mean field game. It is such that is the feedback control .
Proof.
In the notation of Theorem 2, the map is defined by , a singleton. If we prove that this map is a contraction for small time horizon , then the assertion follows by the Banach-Caccioppoli theorem. So let and set and . For a vector denote .
First we prove that the value function is Lipschitz continuous with respect to . Thanks to the HJB equation (4.4) we have
[TABLE]
The Hamiltonian is Lipschitz in ; in fact, by (2.6) and (4.9) we have
[TABLE]
Then using (4.10) and (4.11) we obtain
[TABLE]
for any , hence Gronwall’s lemma implies that
[TABLE]
for any .
Therefore, by applying again (4.10) and (4.11), we obtain
[TABLE]
and thus, again by Gronwall’s lemma,
[TABLE]
for any . Since we have
[TABLE]
and then the claim holds for satisfying (4.12). ∎
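The contraction argument suggests a Picard iteration: freeze a flow, solve the HJB equation backward, read off the optimal feedback, push the initial distribution forward, and repeat. The Python sketch below runs this loop on a hypothetical two-state model chosen only for illustration: the action at state x is the jump rate to the other state, constrained to [lo, hi]; the running cost is a^2/2 + m[x]; the terminal cost is zero. With this cost the minimizer of the pre-Hamiltonian is the clipped value difference, which is an assumption of the toy model and not the formula of the text. Both passes use explicit Euler steps.

```python
def clip(v, lo, hi):
    """Project v onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def best_response_flow(flow, m0, horizon, lo, hi):
    """Map a flow to the flow generated by the optimal feedback against it."""
    steps = len(flow) - 1
    dt = horizon / steps
    v = [0.0, 0.0]  # zero terminal cost (toy assumption)
    rates = [None] * steps
    # Backward pass: HJB ODE with running cost a^2/2 + m[x].
    for k in range(steps - 1, -1, -1):
        a = [clip(v[0] - v[1], lo, hi), clip(v[1] - v[0], lo, hi)]
        h = [0.5 * a[x] ** 2 + a[x] * (v[1 - x] - v[x]) + flow[k][x]
             for x in (0, 1)]
        v = [v[x] + dt * h[x] for x in (0, 1)]
        rates[k] = a
    # Forward pass: Kolmogorov equation under the feedback rates.
    new_flow = [list(m0)]
    for k in range(steps):
        m = new_flow[-1]
        out0 = dt * rates[k][0] * m[0]
        out1 = dt * rates[k][1] * m[1]
        new_flow.append([m[0] - out0 + out1, m[1] - out1 + out0])
    return new_flow


def picard(m0, horizon, steps, lo, hi, iterations):
    """Iterate the best-response map; return the last flow and the last gap."""
    flow = [list(m0) for _ in range(steps + 1)]
    gap = None
    for _ in range(iterations):
        new_flow = best_response_flow(flow, m0, horizon, lo, hi)
        gap = max(abs(p[0] - q[0]) for p, q in zip(flow, new_flow))
        flow = new_flow
    return flow, gap
```

For a short horizon the gap between successive iterates shrinks geometrically, in line with the contraction estimate; for longer horizons the iteration need not converge, which is why uniqueness for arbitrary horizons requires a different argument.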
4.4. Uniqueness under monotonicity
Uniqueness of mean field game solutions was shown in Theorem 2 in Gomes et al. (2015) for arbitrary time horizon under the Lasry-Lions monotonicity assumptions. Here, we give a different proof of this result, which relies on the probabilistic representation of the mean field game, and allows for less restrictive assumptions on the data.
Specifically, we suppose that the function in the dynamics (2.10) does not depend on and that the running cost splits in . Moreover we assume that and satisfy the following monotonicity property:
[TABLE]
for any . For example, and could be the gradient of convex functions in .
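A quick numerical check of monotonicity can be useful when experimenting with cost functions. The sketch below assumes that the condition has the standard Lasry-Lions form on a finite state space, namely that the sum over x of (F(x, m) - F(x, m')) (m(x) - m'(x)) is non-negative for all probability vectors m and m'. The function name `monotonicity_gap` and the argument `coupling` (standing for the measure-dependent part of the cost) are hypothetical.

```python
def monotonicity_gap(coupling, m1, m2):
    """Return sum_x (F(x, m1) - F(x, m2)) * (m1[x] - m2[x]) for F = coupling.

    A non-negative value for every pair (m1, m2) is the (assumed) finite-state
    Lasry-Lions monotonicity condition.
    """
    return sum(
        (coupling(x, m1) - coupling(x, m2)) * (m1[x] - m2[x])
        for x in range(len(m1))
    )
```

For instance, `coupling = lambda x, m: m[x]` is the gradient of the convex function m -> |m|^2 / 2, and its gap equals the squared Euclidean distance between the two vectors, while `lambda x, m: -m[x]` violates the condition.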
Theorem 7.
Suppose that (A), (B) and the assumptions above hold. Let and be two feedback mean field game solutions. Then for any . Also the corresponding value functions and are the same. Moreover, if (C) holds, then for any .
Proof.
Since the dynamics does not depend on , we have and . The optimality of yields and similarly , hence
[TABLE]
Summing these two inequalities and using the fact that for any , we obtain
[TABLE]
If for some , then the latter expression is , thanks to (4.13), (4.14) and the continuity of ; a contradiction. Therefore for all .
The fact that is implied by the uniqueness of solutions to the HJB equation (4.4). Assuming (A) and (B), the optimal feedback satisfies (4.3). Thus, if (C) holds, then . ∎
5. Approximation of the N-player game
5.1. Approximation of relaxed controls
In order to get an -Nash equilibrium for the -player game in open-loop strategies, respectively in feedback strategies, we have first to find an approximation of the optimal relaxed control, respectively relaxed feedback control, for the mean field game. To this end, we will make use of the following version of the chattering lemma.
Lemma 8 (Chattering).
For any relaxed control , there exists a sequence of stochastic open-loop controls such that, denoting by their relaxed control representation,
[TABLE]
where the limit is in the weak topology in . Moreover, any takes values in a finite subset of .
For any relaxed feedback control , there exists a sequence of feedback controls such that
[TABLE]
uniformly in and
[TABLE]
where denotes the relaxed control representation of the open-loop control corresponding to , as in (2.12), and is defined in (2.23); i.e. and .
Proof.
The first part is proved as Theorem 3.5.2 in Kushner (1990, p. 59), and the construction of the approximating sequence in the proof gives the for the second part; let us show how to build them. Let , cover by disjoint sets which contain a point and set , a finite subset of . For any and define the function
[TABLE]
Divide any interval into subintervals of length and define the feedback control , which is piecewise constant, by
[TABLE]
where is an arbitrary value in . The proof in Kushner (1990) shows that
[TABLE]
weakly, for any . Since is finite we obtain that there exists a sequence of ordinary feedback controls such that (5.1) holds uniformly in . Let be fixed and be the solution to (2.11) corresponding to the feedback control . By Theorem 1, the sequence is tight and there are a subsequence, which we still denote as , and a process such that in distribution. Possibly applying the Skorokhod representation (Theorem 4.30 in Kallenberg, 2001, p. 79), we may assume that this convergence is with probability one in the space of càdlàg functions equipped with the Skorokhod metric. This implies in particular that
[TABLE]
where is the finite random set of discontinuity points (the jumps) of .
Let now be any continuous function, which is also bounded as is compact. We have to show the convergence to zero, almost surely, of
[TABLE]
where
[TABLE]
Any feedback control is Lipschitz in , i.e. , and so tends to zero thanks to (5.3), the continuity of and dominated convergence. As to , write where
[TABLE]
and is the random set in where . For each , the random set of discontinuity points of the function is a subset of for some finite random set . Thus has null measure with respect to the limiting control with probability one, for each , thanks to Definition 5. Hence by (5.1) we get that tends to zero for each and so does since is finite.
Let be the open-loop control corresponding to and its relaxed control representation. We have just proved that -almost surely and thus Theorem 1 says that must have the same law as the solution to (2.22) under the relaxed feedback control . That solution is unique by Lemma 1, meaning that in distribution. Therefore (5.2) follows since by (2.23). ∎
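The idea behind the chattering construction can be seen in the simplest open-loop situation: a relaxed control that is constant in time and charges finitely many actions is approximated by an ordinary control that switches rapidly between those actions, spending in each short block a fraction of time equal to the corresponding weight. The Python sketch below implements this special case; it is not the construction of the proof, which works state by state for feedback controls, and the names `chattering_control`, `integrate_ordinary` and `integrate_relaxed` are hypothetical.

```python
def chattering_control(weights, actions, horizon, blocks):
    """Piecewise-constant control given as a list of (start, end, action).

    Each of the `blocks` subintervals of [0, horizon] is split among the
    actions proportionally to `weights` (a probability vector).
    """
    segments = []
    block = horizon / blocks
    for k in range(blocks):
        start = k * block
        for w, a in zip(weights, actions):
            if w > 0.0:
                end = start + w * block
                segments.append((start, end, a))
                start = end
    return segments


def integrate_ordinary(segments, phi):
    """Integral of phi(t, control(t)); midpoint rule on each segment."""
    return sum((e - s) * phi(0.5 * (s + e), a) for s, e, a in segments)


def integrate_relaxed(weights, actions, horizon, phi, grid=10000):
    """Integral of phi(t, a) against dt times the relaxed control (midpoint rule)."""
    dt = horizon / grid
    return sum(
        dt * w * phi((j + 0.5) * dt, a)
        for j in range(grid)
        for w, a in zip(weights, actions)
    )
```

For a test function such as phi(t, a) = a t, the two integrals differ by a term of order 1/blocks, which is the weak convergence of the relaxed control representations in this elementary setting.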
Remark 2.
In the above proof we strongly used the finiteness of to get the approximation in feedback controls. While the result in the open-loop setting holds for general state space , when considering feedback controls it is not clear whether the above lemma can be generalized to uncountably infinite state spaces.
We are now able to state the approximation result:
Proposition 3.
Let , and . Then for every there exist and such that
[TABLE]
Proof.
Let be a sequence in that approximates as in Lemma 8. Then we apply Theorem 1 to the sequence : it is tight, a subsequence converges in distribution to and . Thus there exist for which (5.4) and (5.6) hold. In a similar way, one proves (5.5) and (5.7) for feedback controls. ∎
5.2. ε-Nash equilibria
We can now define the approximate Nash equilibrium for the -player game, first in open-loop form.
Notation 2.
Let be a relaxed solution of the mean field game (2.20), which exists assuming (A) and (B) by Theorem 2. Fix and let be as in Proposition 3, satisfying (5.4) and (5.6) with . Then denotes the strategy vector where , , , such that
[TABLE]
Equation (5.8) says that this control is symmetric. The following is our main result, whose proof is carried out in the next subsection. In addition to (A) and (B), we make the Lipschitz assumptions (A’) and (B’).
Theorem 8.
Assume (A’) and (B’). Then the vector strategy defined in Notation 2 is an -Nash equilibrium for the -player game for any where and is a constant.
An analogous result holds when considering feedback strategies, but we state it separately.
Notation 3.
Let be a relaxed feedback solution of the mean field game (2.22), which exists assuming (A) and (B) by Theorem 3. Fix and let be as in Proposition 3, satisfying (5.5) and (5.7) with . Then the tuple denotes the feedback strategy vector where , , such that
[TABLE]
for any , and , and the are i.i.d. copies of .
Equation (5.9) says that this feedback strategy vector is symmetric and decentralized. In order to obtain feedback -Nash equilibria from a mean field game solution, we need the Lipschitz assumptions (A”) and (B”).
Theorem 9.
Assume (A”), (B”). Then the feedback strategy vector defined in Notation 3 is a feedback -Nash equilibrium for the -player game for any where and is a constant.
5.3. Proofs of the results
In the following will denote any constant which depends on , , and the Lipschitz constants and , but not on , and is allowed to change from line to line. We focus first on open-loop controls. Fix and let the strategy vector be as in Notation 2. We play this strategy in the -player game:
[TABLE]
This will be coupled with defined by
[TABLE]
Let be the empirical measure of the system (5.10) and be the empirical measure of (5.11). Denote . By (5.4) we have
[TABLE]
for any , since . From (5.8) it follows that
[TABLE]
This implies, thanks to Theorem 1 in Fournier and Guillin (2015), that
[TABLE]
for any and , where is a constant. This upper bound in cannot be improved, since for these discrete measures a lower bound still in can be found, see again Fournier and Guillin (2015).
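The order of convergence of the empirical measure can be observed in a small Monte Carlo experiment. The Python sketch below is an illustration under assumptions, not the estimate used in the proof: it draws i.i.d. samples from a fixed distribution `m` on a finite set and estimates the expected l1 distance between the empirical measure and `m`; on a finite state space this distance is comparable to the Wasserstein distance. The function name `mean_l1_error` and its parameters are hypothetical.

```python
import random


def mean_l1_error(m, n, reps, rng):
    """Monte Carlo estimate of E[ l1 distance between empirical measure and m ].

    m    : probability vector on {0, ..., d-1}
    n    : number of i.i.d. samples per empirical measure
    reps : number of independent repetitions
    """
    d = len(m)
    total = 0.0
    for _ in range(reps):
        counts = [0] * d
        for x in rng.choices(range(d), weights=m, k=n):
            counts[x] += 1
        total += sum(abs(counts[x] / n - m[x]) for x in range(d))
    return total / reps
```

Multiplying the estimate by the square root of `n` should give values of the same order for different sample sizes, consistent with a decay of order one over the square root of the number of players.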
Lemma 9.
Under assumption (A’), for every and
[TABLE]
Proof.
From (5.12) and (5.13) it follows that
[TABLE]
We estimate using the 1-Wasserstein metric (which is equivalent to the Euclidean metric in ) and (5.10), (5.11) and the Lipschitz assumption (2.3):
[TABLE]
Hence applying (5.16)
[TABLE]
Then we obtain, by Gronwall’s lemma,
[TABLE]
Similarly we show (5.15): using (5.10), (5.11) and (5.14) we get, for any ,
[TABLE]
and hence by Gronwall’s lemma. ∎
We are now in the position to state the result about the costs. Because of the symmetry of the problem, for the prelimit we shall consider only player one ().
Lemma 10.
Under assumptions (A’) and (B’)
[TABLE]
Proof.
Inequality (5.6), together with Notation 2, yields
[TABLE]
While from (2.5), (5.14) and (5.15) we have
[TABLE]
which, combined with (5.18), gives the claim. ∎
We consider then any and the perturbed strategy vector . We denote by the solution to
[TABLE]
for each . Set also and
Lemma 11.
Under assumption (A’), for any and
[TABLE]
Proof.
We make the rough estimate
[TABLE]
Hence
[TABLE]
and then, by Gronwall’s lemma,
[TABLE]
Therefore (5.20) is proved. Estimate (5.21) follows from (5.20) and (5.14) and the fact that for any . While (5.22) is a consequence of (5.21):
[TABLE]
and we conclude by Gronwall’s lemma. ∎
Lemma 12.
Under assumptions (A’) and (B’)
[TABLE]
Proof.
Inequalities (2.5), (5.21) and (5.22) give
[TABLE]
∎
Theorem 8 is now a consequence of Lemmata 10 and 12:
Proof of Theorem 8.
Inequalities (5.17), (5.23), and the optimality of yield
[TABLE]
∎
Remark 3.
We observe that is still an -Nash equilibrium if we assume only (B) instead of (B’), but without the estimate of the order of convergence . Namely, there exists a sequence such that .
Proof of Theorem 9.
The argument is the same as in the proof of Theorem 8. The difference is that equations (5.10), (5.11) and (5.19) become respectively, for each ,
[TABLE]
where the latter means that
[TABLE]
and
[TABLE]
for , thanks to Notation 1. The estimates we need to apply Gronwall’s lemma, in particular in the proof of Lemma 11, are found using also (2.4) and the fact that for every and each and in the finite . ∎
6. Conclusions
We summarize here the results we have obtained. The assumptions are given in Section 2.1 and verified for a natural shape of the dynamics in Lemmata 3 and 4.
- (1) Under assumptions (A) and (B), there exist a relaxed mean field game solution and a relaxed feedback mean field game solution (in the sense of Definition 7), see Theorems 2 and 3, respectively.
- (2) Assuming (A), (B) and (C), there exists a feedback solution of the mean field game (Definition 4), see Corollary 1. The feedback mean field game solution is unique for small under the additional assumptions of Section 4.3 by Theorem 6; uniqueness for arbitrary time horizon holds under the Lasry-Lions monotonicity assumptions, see Theorem 7.
- (3) The relaxed mean field game solutions provide -Nash equilibria for the -player game (cf. Definition 2), both in open-loop and in feedback form (not relaxed), with . If (A’) and (B’) hold, then the symmetric open-loop strategy vector defined in Notation 2 is an -Nash equilibrium by Theorem 8. Assuming (A”) and (B”), the feedback strategy vector defined in Notation 3, which is symmetric and decentralized, is a feedback -Nash equilibrium thanks to Theorem 9.
Appendix A Relaxed Poisson measures
In order to state the definition of the relaxed Poisson random measure we first need to define the canonical space of integer valued random measures on a metric space . Following Jacod (1979), the setting is:
- is the set of sequences such that is increasing and if ; set and ;
- if write and ;
- the canonical random measure is
[TABLE]
for any ;
- , is given, , and .
The filtered space is then the canonical space of integer valued random measures on . A probability measure on it is the law of an integer valued random measure on , given an initial condition on . Note that the canonical measure is not the identity: for this reason we can work with as the state space of a random measure. Moreover, the set of integer valued random measures is vaguely closed in : see Theorem 15.7.4 in Kallenberg (1986) and the references therein.
Let now be any integer valued random measure defined on a filtered probability space . It is determined by a sequence of stopping times and random variables which are -measurable. To any is associated its compensator, that is, a positive random measure on such that
- (1) is predictable for any ;
- (2) is an -martingale for each and ;
- (3) for each and .
The compensator exists and is unique (up to a modification on a -null set) for any . The proof can be found in Jacod (1975), where the author also shows that a process with the above properties uniquely determines an integer valued random measure.
Consider then an arbitrary measurable space and define . Set and . The canonical random measure on is extended to via . Set .
Theorem 10 (Jacod (1975)).
Let be a probability measure on and a predictable random measure satisfying (1) and (3). Then there exists a unique probability measure on whose restriction to is and for which is the compensator of .
By means of this theorem, we are able to define properly a relaxed Poisson measure. Consider a relaxed control and let be the state space of the process , the initial distribution and the Poisson random measure . The -algebra is generated by the processes and is the joint law of . So a relaxed Poisson measure , related to the relaxed control , is an integer valued random measure on whose compensator , calculated on , , , is . Its law is uniquely determined on and thus has the martingale properties (2.17) and (2.18). Moreover, the joint law of is uniquely determined.
We can give an explicit construction of . Let and be a sequence in which tends to in the sense of Lemma 8, the chattering lemma. Denote by the relaxed control representation of and construct as in (2.19): . Then, by Theorem 1, the sequence is tight and any subsequence converges in distribution to . The marginals are uniquely defined in this way, while to show that the joint law of is unique we need to invoke the above Theorem 10.
A.1. Proof of Lemma 1
Let be fixed, which we shall omit. Let be the space of stochastic processes with paths in and equip it with the norm . Let and define the map by
[TABLE]
for any . If we prove that this map is a contraction in the norm , then pathwise existence and uniqueness of solutions to equation (2.20) follow. We have, for any ,
[TABLE]
hence
[TABLE]
thanks to (2.1) and the fact that is a probability measure. Therefore is a contraction if , and so uniqueness is proved for small time horizon; but then iterating the same argument, we have uniqueness for any .
Consider now and define by
[TABLE]
for any process . Then for any and we have where
[TABLE]
and
[TABLE]
where denotes the total variation of the signed measure defined for any by ; while the total variation norm is . The first term is bounded as above yielding . For the second term, we use to obtain
[TABLE]
Thanks to (2.17) and (2.13), we have , saying that the right-hand side above is finite -a.s. Since the measure is integer valued, we can assume that the above supremum is attained on a set for -a.e. , giving thus a random set . Moreover, we may assume that on such a set the random measure considered is positive. The martingale property (2.18) now gives
[TABLE]
where in the last line above we have used the fact that is a probability measure and for each . Therefore, for , the map is a contraction; the claim follows by iterating the above procedure.
- Aldous (1978): D. Aldous. Stopping times and tightness. Ann. Probab., 6(2):335–340, 1978.
- Basna et al. (2014): R. Basna, A. Hilbert, and V. N. Kolokoltsov. An epsilon-Nash equilibrium for non-linear Markov games of mean-field-type on finite spaces. Comm. Stoch. Anal., 8(4):449–468, 2014.
- Bayraktar and Cohen (2017): E. Bayraktar and A. Cohen. Analysis of a finite state many player game using its master equation. arXiv:1707.02648 [math.AP], July 2017.
- Benazzoli et al. (2017): C. Benazzoli, L. Campi, and L. Di Persio. Mean-field games with controlled jumps. arXiv:1703.01919 [math.PR], March 2017.
- Bensoussan et al. (2013): A. Bensoussan, J. Frehse, and P. Yam. Mean Field Games and Mean Field Type Control Theory. Springer Briefs in Mathematics. Springer, New York, 2013.
- Bensoussan et al. (2016): A. Bensoussan, K. Sung, S. Yam, and S. Yung. Linear-quadratic mean field games. J. Optim. Theory Appl., 169(2):496–529, 2016.
- Brunick and Shreve (2013): G. Brunick and S. E. Shreve. Mimicking an Itô process by a solution of a stochastic differential equation. Ann. Appl. Probab., 23(4):1584–1628, 2013.
- Cardaliaguet (2013): P. Cardaliaguet. Notes on mean field games. Technical report, Université de Paris - Dauphine, September 2013.
