Mean-field games of optimal stopping: a relaxed solution approach
Géraldine Bouveret, Roxana Dumitrescu, Peter Tankov

TL;DR
This paper introduces a relaxed solution framework for mean-field games involving optimal stopping, establishing existence, uniqueness, and a numerical method for equilibrium computation.
Contribution
It develops a relaxed optimal stopping approach for mean-field games, proving equilibrium existence, uniqueness, and connecting relaxed solutions to pure strategies.
Findings
Proved existence of relaxed Nash equilibrium.
Established conditions for pure strategy optimality.
Presented a convergent numerical method for potential games.
Géraldine Bouveret, Smith School, University of Oxford, South Parks Road, Oxford, OX1 3QY, United Kingdom
Roxana Dumitrescu, Department of Mathematics, King’s College London, Strand, London, WC2R 2LS, United Kingdom
Peter Tankov, ENSAE Paris, 5 avenue Henry Le Chatelier, 91120 Palaiseau, France
Abstract
We consider the mean-field game where each agent determines the optimal time to exit the game by solving an optimal stopping problem with reward function depending on the density of the state processes of agents still present in the game. We place ourselves in the framework of relaxed optimal stopping, which amounts to looking for the optimal occupation measure of the stopper rather than the optimal stopping time. This framework allows us to prove the existence of a relaxed Nash equilibrium and the uniqueness of the associated value of the representative agent under mild assumptions. Further, we prove a rigorous relation between relaxed Nash equilibria and the notion of mixed solutions introduced in earlier works on the subject, and provide a criterion, under which the optimal strategies are pure strategies, that is, behave in a similar way to stopping times. Finally, we present a numerical method for computing the equilibrium in the case of potential games and show its convergence.
Keywords: Mean-field games, optimal stopping, relaxed solutions, infinite-dimensional linear programming
AMS: 91A55, 91A13, 60G40
1 Introduction
The purpose of this paper is to study a large-population stochastic differential game of optimal stopping, where each agent finds the optimal time to exit the game by solving an optimal stopping problem with instantaneous reward function depending on the density of the state processes of agents still present in the game. To motivate the mean-field game (MFG) framework, we first provide a formulation with a finite number of agents. Assume that each agent has a private state process , whose dynamics is given by the stochastic differential equation (SDE),
[TABLE]
where the Brownian motions , are independent.
The objective of each agent is to maximize over all possible stopping times the reward functional
[TABLE]
with
[TABLE]
where represents the optimal stopping time of the agent . Agents have the same state process coefficients and objective functions, and the optimal stopping problems are coupled only through the empirical measure . Since the objective functions are coupled, it is natural to look for Nash equilibria.
Stochastic differential games with a large number of players are rarely tractable. The MFG approach amounts to looking for a Nash equilibrium in the limiting regime, when the number of players goes to infinity. Following this approach, we study the MFG of optimal stopping, which can be seen as an infinite-agent version of the above game. In this approach, we first solve for a fixed flow of sub-probability measures the optimal stopping problem
[TABLE]
with
[TABLE]
Then, given the optimal stopping time for the agent with initial condition , and the initial measure , we look for the flow of measures such that
[TABLE]
Note that in (1.1), the probability is a joint probability, not a conditional one. A solution (Nash equilibrium) of the MFG problem is a flow of measures , which is a fixed point of the mapping defined by the right-hand side of (1.1).
In this paper, we prove the existence of the Nash equilibrium for the MFG problem and the uniqueness of the associated value of the representative agent. To this aim, we use the relaxed solution approach, which converts the stochastic optimal stopping problem into a linear programming problem over a space of measures. The decision variable is no longer the optimal stopping time, but rather the distribution of the killed state process.
Introducing relaxed solutions facilitates existence proofs: the existence is proven by using Fan-Glicksberg’s fixed-point theorem. The relaxed solutions are related to the mixed strategies introduced in [bertucci2017optimal], and we establish a rigorous relation between the two. Finally, we propose an implementable numerical scheme for computing a Nash equilibrium in the case of potential games, and show its convergence. An application of these results to a resource-sharing problem will be developed in a companion paper.
MFG theory was introduced by P.-L. Lions and J.-M. Lasry in a series of papers [lasry2006jeux, lasry2006jeux1, lasry2007mean] using an analytic approach, and was studied independently at about the same time by [huang2006large]. Later on, a probabilistic approach was developed in a series of papers by Carmona, Delarue, and their co-authors [carmona2013probabilistic, carmona2013mean, carmona2018probabilistic, carmona2016mean, lacker2015mean].
The analytic method consists in finding the Nash equilibria through a coupled system of nonlinear partial differential equations: a Hamilton-Jacobi-Bellman equation (backward in time), which describes the optimal control problem of the representative agent when the distribution is given, and a Kolmogorov-type equation (forward in time) which describes the evolution of the density under the optimal control. In the probabilistic approach, the system of PDEs is replaced by a coupled system of forward-backward stochastic differential equations of McKean-Vlasov type.
MFGs of optimal stopping have been considered in the literature only very recently, and our understanding of this type of game remains limited. [nutz2018mean] considers an MFG problem where the agents interact through the proportion of players that have already stopped and each agent solves a specific optimal stopping problem of the form
[TABLE]
There, the process creates an incentive for the agent to stay in the game, while the possibility of default at a random time creates an incentive to leave. The distribution of depends on the proportion of players who have already stopped in such a way that the departure of other agents creates an incentive for the agent under consideration to leave as well (this type of game is known as a preemption game). In a similar spirit but with greater generality, [carmona2017mean] consider MFGs of timing, whose formulation is motivated by a dynamic model of a bank run in a continuous-time setting. As in [nutz2018mean], the payoff of each agent depends on the proportion of players who have already stopped, and the departure of players creates an additional incentive for the players still in the game to leave as well. Both papers ([nutz2018mean] and [carmona2017mean]) adopt a purely probabilistic approach.
In contrast to these two references, [bertucci2017optimal] studies an MFG of optimal stopping, which is similar to the one considered in this paper, i.e. where the interaction takes place through the density of states of agents remaining in the game, rather than the proportion of players that have already stopped. In this reference and in our paper, the departure of players creates an incentive for the players still in the game to stay, a type of behavior known as “war of attrition”, which is characteristic of resource-sharing problems. In [bertucci2017optimal] the state process has constant coefficients and evolves in a bounded domain, and the MFG of optimal stopping is solved through a coupled system of a Hamilton-Jacobi-Bellman variational inequality and a Fokker-Planck equation.
[bertucci2017optimal] makes a number of significant contributions to the literature. In particular, the author provides an example of non-existence of Nash equilibrium with pure strategies in optimal stopping MFG, and introduces the notion of mixed strategies in this context, for which existence may be recovered. However, the existence proofs in that paper are not fully clear to us. (Footnote 1: To be precise, the weak convergence of the flow established in the proof of existence of a mixed solution in both the stationary and parabolic cases (Theorem 1.6 for the stationary case and Theorem 2.1 for the parabolic case) is not sufficient to conclude that converges.)
To clarify the existence question and solve the MFG of optimal stopping problem in greater generality (with variable coefficients and in unbounded domains), we adopt, in this paper, a completely different approach, based on the relaxed solution technique.
The approach of relaxed solutions/controls is a relatively popular method of compactification of stochastic control problems to establish existence of solutions, which comes in several different flavors. In, e.g., [el1987compactification] and a number of other papers, the authors reformulate the control problem as a relaxed controlled martingale problem. A similar approach is used by [lacker2015mean] in the context of (standard) MFG. In the second approach, especially popular for infinite-horizon and ergodic control problems, the control problem is reformulated as a linear programming problem on the space of measures, and one looks for the joint occupation density of the state process and the control. We refer the reader to, e.g., [buckdahn2011stochastic] and [stockbridge1990time], for a link between these two formulations. The literature on relaxed solutions for individual optimal stopping problems is quite limited. [SC2002] propose a linear programming formulation for the infinite-horizon optimal stopping of a Markov diffusion process, using two measures: the occupation measure of the process and the joint distribution of the stopping time and the stopped process. [HS] extend this result to processes with singular components such as reflected diffusions. In contrast to these two references, in our paper we propose a different formulation based only on the occupation measure of the process killed at the stopping time. To the best of our knowledge, ours is the first paper which uses relaxed solutions in order to solve optimal stopping problems of mean-field type.
The literature on numerical schemes for MFG is well developed in the case of MFG with regular controls (see e.g. [BC2015]), but very little is known in the case of MFG with optimal stopping. In the latter case [B2018] proposes an algorithm, which works only under the assumption that the instantaneous reward function is strictly monotonic with respect to the measure, which is quite restrictive for applications. We propose instead a different algorithm, which also covers the case of a non-strictly monotonic reward function.
The structure of the paper is the following. In Section 2, we present the model and give the mean-field formulation of the problem. In Section 3, we introduce the relaxed formulation of the single-agent optimal stopping problem and establish the existence of a relaxed solution. In Section 4, we study the relaxed optimal stopping problem in the MFG context and give conditions for the existence of a Nash equilibrium and uniqueness of the Nash equilibrium value. In Section 5, we establish the relation between the relaxed and strong formulation of both single-agent and MFG optimal stopping problems. Finally, in Section 6, we present the numerical algorithm and provide convergence results.
2 The model
We fix a terminal time horizon $T$, and introduce a possibly unbounded open domain $\mathcal{O}\subseteq\mathbb{R}^{d}$ on which the state processes of the agents will evolve. The space of bounded positive measures on will be denoted by , and the space of probability measures on will be denoted by . In the sequel, any element will be identified with a column vector with -th component and Euclidean norm . Similarly, for any matrix we denote by its Euclidean norm.
N-players game formulation
Consider agents whose states , follow the diffusion-type dynamics
[TABLE]
where the -dimensional Brownian motions , are independent and the coefficients and satisfy the following assumption.
Assumption 1 (X-SDE).
The coefficients and are assumed to be Lipschitz continuous in the second variable, uniformly in and bounded.
By classical results on SDEs, this assumption guarantees the existence of a strong solution to (2.1) satisfying
[TABLE]
We denote by the infinitesimal generator of this process
[TABLE]
with , the Hessian matrix of with respect to and the trace operator.
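To make the dynamics concrete, the following is a minimal simulation sketch, assuming that (2.1) has the standard diffusion form dX = mu(t, X) dt + sigma(t, X) dW. The names `mu` and `sigma`, the specific bounded Lipschitz coefficients, the interval domain (-1, 1), and the uniform initial states are illustrative assumptions, not taken from the paper. The sketch uses an Euler-Maruyama discretization and tracks the fraction of agents that have not yet left the domain.

```python
import math
import random

def simulate_survivors(n_agents=500, n_steps=100, horizon=1.0, seed=0):
    """Euler-Maruyama for dX = mu(t, X) dt + sigma(t, X) dW on the domain (-1, 1).

    mu, sigma, the domain and the initial law are hypothetical choices
    (bounded and Lipschitz), not taken from the paper. Returns the fraction
    of agents still inside the domain at each grid time.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    mu = lambda t, x: -math.tanh(x)                 # bounded, Lipschitz drift
    sigma = lambda t, x: 0.5 + 0.25 * math.cos(x)   # bounded, Lipschitz, positive
    states = [rng.uniform(-0.5, 0.5) for _ in range(n_agents)]
    alive = [True] * n_agents
    survival = [1.0]
    for k in range(n_steps):
        t = k * dt
        for i in range(n_agents):
            if not alive[i]:
                continue  # agents that left the domain stay out
            dw = rng.gauss(0.0, math.sqrt(dt))
            states[i] += mu(t, states[i]) * dt + sigma(t, states[i]) * dw
            if abs(states[i]) >= 1.0:
                alive[i] = False  # agent i has exited the domain
        survival.append(sum(alive) / n_agents)
    return survival

survival = simulate_survivors()
```

The list `survival` is non-increasing by construction, which mirrors the fact that the measure of agents still present in the game can only lose mass over time.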
Each agent aims to determine the optimal stopping time valued in by solving the optimal stopping problem
[TABLE]
where is a discount factor, is the running reward function, is the terminal reward, is defined by
[TABLE]
with a stopping time with respect to the filtration generated by the Brownian motions of all agents, corresponding to agent and the exit time from the domain of agent . The assumptions on will be specified later, and is assumed to belong to and has derivatives of order in and of orders and in of polynomial growth in uniformly in . Letting the optimal stopping problem becomes (up to a constant),
[TABLE]
We now formulate the notion of Nash equilibrium for the optimal stopping game with $N$ players. To this end, let $\mathcal{T}$ be the set of stopping times with respect to the filtration generated by the Brownian motions of all agents, taking values between $0$ and $T$. Given a strategy vector $\tau:=(\tau^{1},\tau^{2},\ldots,\tau^{N})\in\mathcal{T}^{N}$ and an individual strategy $\sigma\in\mathcal{T}$, let indicate the strategy vector that is obtained from $\tau$ by replacing , the strategy of player , with $\sigma$.
Definition 2.1 (Nash equilibrium for the $N$-player game).
A strategy vector $\tau:=(\tau^{1},\tau^{2},\ldots,\tau^{N})\in\mathcal{T}^{N}$ is called a Nash equilibrium for the $N$-player game if, for every and every $\sigma\in\mathcal{T}$, we have
[TABLE]
where, for each $\theta\in\mathcal{T}^{N}$,
[TABLE]
where is given by with replaced by , for each .
MFG formulation
In the limit of a large number of agents, we expect, from the law of large numbers, that the empirical measure converges to a deterministic limiting distribution for each . The problem of each agent therefore consists in finding the optimal stopping time in the filtration generated by the individual noise of this agent only, and it is sufficient to work on a probability space supporting a single Brownian motion.
Let be a probability space supporting a standard -dimensional Brownian motion . We denote by the natural filtration of completed with the sets of measure zero. In the MFG formulation, the state of the representative agent with initial value follows the dynamics
[TABLE]
where we write as a shorthand for . As intimated in the introduction, the first step of the MFG approach consists in solving the following optimal stopping problem for the agent
[TABLE]
where is the set of -stopping times with values in and is the exit time from the domain of this agent with initial value . Then, given the optimal stopping time (solution of the problem (2.4)) for the agent with initial condition , , and the initial measure , the second step consists in finding the flow of measures such that
[TABLE]
In other words, the solution of the optimal stopping MFG problem is the flow of measures , which is the fixed point of the mapping defined by the right-hand side of (2.5). In the sequel, such a solution will be called a pure solution. As shown in [bertucci2017optimal], pure solutions for optimal stopping MFG problems do not always exist, and for this reason we shall consider relaxed solutions. A relaxed solution is close in spirit to the mixed solution introduced in [bertucci2017optimal]; the precise relationship between the two notions will be established later in the paper.
3 Relaxed formulation of the single-agent optimal stopping problem
The relaxed formulation of the optimal stopping problem consists in finding the occupation measure of the representative agent rather than the stopping time. We first provide a relaxed formulation of the standard optimal stopping problem in this section and then move to the relaxed formulation of the MFG problem in the following one. First, we introduce the necessary notations.
Let be the space of flows of (signed) bounded measures on : is such that: for every , is a (signed) bounded measure on , for every , the mapping is measurable, and . To each flow , we associate a signed measure on defined by . The space , endowed with the topology of weak convergence (that is, for every function continuous and bounded) is a locally convex Hausdorff topological space (see e.g. [V61]).
Consider the optimal stopping problem
[TABLE]
In this section we study a relaxed version of this optimal stopping problem, where the process starts with an initial distribution instead of a fixed value, and which is formulated in terms of flows of measures rather than stopping times. We let denote the distribution of the process , started with the initial distribution and killed at the first exit time from . In other words,
[TABLE]
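As a discrete illustration of a process killed at the first exit time, the sketch below propagates the law of a symmetric random walk on a finite grid and removes the mass that reaches the two end points. The grid, the walk, and the starting point are hypothetical stand-ins for the diffusion and the domain of the paper; the point is only that the resulting flow consists of sub-probability measures whose total mass decreases over time.

```python
def killed_flow(n_states=11, n_steps=20, start=5):
    """Sub-probability flow of a symmetric random walk killed at the boundary.

    States 0 and n_states - 1 play the role of the boundary of the domain;
    mass reaching them is removed. All parameters are illustrative.
    """
    m = [0.0] * n_states
    m[start] = 1.0
    flow = [m]
    for _ in range(n_steps):
        new = [0.0] * n_states
        for x in range(1, n_states - 1):
            # interior mass moves one step left or right with probability 1/2
            new[x - 1] += 0.5 * m[x]
            new[x + 1] += 0.5 * m[x]
        # kill the mass that has just reached the boundary
        new[0] = 0.0
        new[-1] = 0.0
        m = new
        flow.append(m)
    return flow

flow = killed_flow()
masses = [sum(m) for m in flow]
```

With the default parameters the walk needs five steps to reach either end point, so `masses` stays equal to 1 up to step 4 and drops to 1 - 2 * (1/2)**5 = 0.9375 at step 5.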
We impose the following minimal assumption on the reward function . We shall see below in Corollary 3.4 that this assumption is sufficient for the problem to be well defined, but stronger assumptions will be imposed for existence of solution.
Assumption 2 (-min).
The map is measurable and satisfies
[TABLE]
where denotes the negative part.
The previous assumption is sufficient to guarantee that the integral in (3.2) is well defined.
Definition 3.1 (Relaxed optimal stopping problem).
For a given initial distribution , the relaxed formulation of the optimal stopping problem (3.1) consists in finding the flow of measures , which maximizes the cost functional
[TABLE]
over , where the set contains all flows of positive bounded measures satisfying
[TABLE]
for all such that and is bounded on .
The rest of this section is devoted to the solution of the relaxed optimal stopping problem. A precise connection with the strong (classical) formulation of the optimal stopping problem will be established in Section 5. To gain some intuition about this definition right away, remark that for a stopping time , we can introduce the occupation measure . Then the objective function of the optimal stopping problem writes
[TABLE]
On the other hand, by Itô’s formula, for a positive and regular test function , one has
[TABLE]
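The Itô-formula identity invoked here can be checked numerically in a toy case. The sketch below is a Monte Carlo check under assumptions that are not in the paper: the state is a one-dimensional Brownian motion on a time grid, the generator is L = (1/2) d^2/dx^2, the test function is u(t, x) = 1 + x^2 (positive, with (d/dt + L)u = 1), and the stopping time is the first grid time at which the path reaches a barrier, capped at the horizon. In that setting the identity reads E[u(tau, X_tau)] = u(0, x0) + E[tau].

```python
import math
import random

def dynkin_check(n_paths=5000, n_steps=50, horizon=1.0, x0=0.0, barrier=1.0, seed=1):
    """Monte Carlo check of E[u(tau, X_tau)] = u(0, x0) + E[int_0^tau (d_t + L)u dt].

    Toy setting (all choices hypothetical): X is a Brownian motion on a time
    grid, L = (1/2) d^2/dx^2, u(t, x) = 1 + x^2 so that (d_t + L)u = 1, and
    tau is the first grid time at which X >= barrier, capped at the horizon.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    lhs_sum = 0.0   # accumulates u(tau, X_tau)
    tau_sum = 0.0   # accumulates tau, i.e. the time integral of (d_t + L)u = 1
    for _ in range(n_paths):
        x, t = x0, 0.0
        for _step in range(n_steps):
            if x >= barrier:
                break  # the path has been stopped
            x += sqrt_dt * rng.gauss(0.0, 1.0)
            t += dt
        lhs_sum += 1.0 + x * x
        tau_sum += t
    lhs = lhs_sum / n_paths
    rhs = (1.0 + x0 * x0) + tau_sum / n_paths
    return lhs, rhs

lhs, rhs = dynkin_check()
```

The two returned numbers should agree up to Monte Carlo error. Since u is positive, the left-hand side is positive, which is the kind of sign information that a constraint stated for positive test functions can encode.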
In Lemmas 3.3 and 3.5, we study the properties of the set . First note that this set is clearly nonempty since it contains the flow . To proceed, we need a regularity assumption on the coefficients and . We distinguish two cases depending on the type of boundary of .
Assumption 3 (X-PDE).
The coefficients and are such that for every bounded function with bounded derivatives of all orders, the equation
[TABLE]
has a solution on such that has a polynomial growth in , uniformly in , and such that one of the following two conditions holds:
- i. The boundary of is unattainable: for all , a.s.
- ii. The solution belongs to and satisfies for .
Remark 3.2.
Assumption (X-PDE) holds in a variety of different settings. Below, while not aiming to give the sharpest possible conditions, we present some examples of such settings.
- Let and assume that the operator is uniformly parabolic: there exists such that for all and , the matrix satisfies
[TABLE]
Furthermore, suppose that the coefficients are bounded, uniformly Hölder continuous in and uniformly continuous in , and the coefficients are Hölder continuous in uniformly on compacts and continuous in . Then, by Theorem 4.4.6 in [F75], equation (3.4) admits a solution, and the polynomial growth of follows from the estimate (4.4.12) in the above reference.
- Let be a bounded domain with boundary and assume that (3.5) is satisfied and the coefficients and are uniformly Hölder continuous in on . Then, by Theorem 4.3.6 in [F75] equation (3.4) admits a solution.
- As our last example we consider a situation where the condition (3.5) need not be satisfied. For simplicity, we restrict ourselves to the setting of homogeneous equations, that is, the coefficients and do not depend on , but the argument may be extended to the general case. Suppose that the boundary of is unattainable and that , , and are bounded and locally Lipschitz. This ensures that equation (2.1) admits a unique strong solution,
[TABLE]
and, applying Theorem V.39 in [protter] twice (first to the process and then to its first order tangent flow), we conclude that the mapping is twice continuously differentiable, and the derivatives and are given by the solutions of the following system of equations (where we use the Einstein convention of summing over repeated indices and denotes the Kronecker symbol):
[TABLE]
Moreover, by standard arguments (e.g., Theorem V.66 in [protter] and Gronwall’s lemma), from boundedness of derivatives of and it follows that for some constant ,
[TABLE]
Let us define
[TABLE]
Then, by dominated convergence, the derivatives , and exist, are bounded, continuous, and given by the following expressions.
[TABLE]
Furthermore, by the Markov property, for ,
[TABLE]
and an application of the Itô formula yields:
[TABLE]
where we removed the superscript to save space. Dividing both sides by and passing to the limit , we get (3.4).
Lemma 3.3.
Let Assumptions (X-SDE) and (X-PDE) be satisfied. Fix .
- i. Let be a continuous function with polynomial growth. Then almost everywhere on , and for ,
[TABLE]
- ii. Let such that , and are bounded. Then, for and for every ,
[TABLE]
for some .
Proof.
Part i. Assume that and are bounded positive functions with bounded derivatives of all orders, and let be the solution of
[TABLE]
described in Assumption (X-PDE). By Itô’s formula, for ,
[TABLE]
Taking the expectation and using the equation satisfied by , the fact that has polynomial growth and the a priori estimates on the strong solution of the SDE (i.e. , for all ), we get
[TABLE]
which means that is an admissible test function in the sense of Definition 3.1. Substituting the above expression for into the constraint (3.3), we have
[TABLE]
Since is arbitrary, this implies that
[TABLE]
-almost everywhere on . The result may be extended to a positive continuous function with polynomial growth by considering a sequence of functions , where converges uniformly on compact sets to (see Prop. 4.21 in [B2010]), converges pointwise to , where , are two sequences of mollifiers and , with a sequence of increasing compact sets approximating the open set (exhaustion by compact sets of the set ). Note that all elements of the sequence of functions admit bounded derivatives of all orders (since they are smooth and have compact support). The result follows by applying first Lebesgue’s dominated convergence theorem, when taking the limit with respect to and , and then the monotone convergence theorem when letting .
Part ii. First remark that
[TABLE]
is bounded on . This implies that it is enough to prove the result for , because for , the derivative may be approximated by smooth functions in the uniform norm.
By Itô’s formula, for ,
[TABLE]
Taking the expectation and integrating by parts we obtain
[TABLE]
for some constant , due to the bounds on , , , and . Then we can define the function
[TABLE]
which is an admissible test function by the same argument as the one used in the first part. This proves that
[TABLE]
and since for all , we get the statement of the lemma. ∎
Corollary 3.4.
Under the assumptions of Lemma 3.3, let , and let be the distribution of the process started with initial distribution and killed at the first exit time from . Then for every , , -almost everywhere on . In particular, if has a density then does as well.
Proof.
Approximating the indicator function with a sequence of continuous bounded functions and using the dominated convergence theorem, the first part of the above lemma yields for all with (where the inequality is interpreted componentwise),
[TABLE]
where is the transition distribution of the process killed at . ∎
In the following lemma we continue the study of the properties of the set . The compactness of this set is established under the following assumption.
Assumption 4 (-Compact).
The initial distribution satisfies
[TABLE]
Lemma 3.5.
Let Assumptions (X-SDE) and (-Compact) be satisfied. Then the set is sequentially compact.
Proof.
Let us first show the tightness of the associated set of measures on . For , define the function
[TABLE]
with . Remark that
[TABLE]
and , for all , from which it is easy to see that is twice continuously differentiable on its entire domain, and that the expressions , , and are bounded on by a constant independent of . In addition, as , converges in a monotone fashion to the limiting function . Now, consider the test function . It follows that
[TABLE]
for . From the boundedness of and and the above observations, we deduce that the expression within the brackets in the last term is bounded uniformly on . The limits of the first two terms, on the other hand, are computed by monotone convergence. Letting , we conclude that there exists a constant such that
[TABLE]
from which the tightness follows. (Footnote 2: For the sake of clarity, we state the tightness criterion precisely. Let be a topological space equipped with its Borel sigma-field. Let be a flow of measures on . If there exists a measurable function with compact level sets such that , then is tight; the proof follows immediately from the measure version of the Markov inequality.) Moreover, taking in Lemma 3.3 we see that is uniformly bounded. Therefore, by Prokhorov’s theorem (Theorem 8.6.2 in [bogachev2007measure]), from any sequence of flows of measures , one can extract a subsequence, also denoted by , such that the sequence of associated measures on , converges weakly to a limiting measure . By weak convergence, the measure also satisfies the constraints of i.e., for every test function ,
[TABLE]
Taking the test function with a positive continuous function, we have
[TABLE]
We conclude that is a bounded measure and the measure on is absolutely continuous with respect to the Lebesgue measure, which means that we can write for some . The positivity of the limiting measure flow follows from weak convergence and absolute continuity. ∎
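The Markov-inequality step behind the tightness criterion used in this proof can be written out as follows. The symbols $\nu_n$, $E$, $\psi$, $C$ and $K$ are notation chosen here for illustration, and the criterion is read with compact sublevel sets; this is a sketch of the standard argument, not a quotation from the paper.

```latex
% Notation assumed for illustration: $\nu_n$ are positive measures on a space $E$,
% $\psi \ge 0$ is measurable with compact sublevel sets $\{\psi \le K\}$,
% and $C := \sup_n \int_E \psi \, d\nu_n < \infty$.
\[
  \nu_n\bigl(\{\psi > K\}\bigr)
  \;\le\; \frac{1}{K} \int_E \psi \, d\nu_n
  \;\le\; \frac{C}{K}
  \qquad \text{for all } n \text{ and all } K > 0 .
\]
% Given $\varepsilon > 0$, choosing $K \ge C/\varepsilon$ makes the compact set
% $\{\psi \le K\}$ carry all but $\varepsilon$ of the mass of every $\nu_n$,
% which is the definition of tightness.
```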
The following proposition is an existence result for the relaxed optimal stopping problem. We need the following assumption on .
Assumption 5 (-Exist).
One of the following alternative conditions holds true:
- i. The mapping is continuous on and satisfies
[TABLE]
where is the distribution at time of the process started with initial distribution .
- ii. The function is of the form
[TABLE]
where and for each , is such that , and are bounded, and is bounded measurable.
Proposition 3.6.
Let Assumptions (X-SDE), (X-PDE), (-Compact) and (-Exist) be satisfied. Then there exists which maximizes the functional
[TABLE]
over all .
Proof.
Choose a maximizing sequence of flows of measures . By Lemma 3.5, it has a subsequence, also denoted by , which converges weakly to a limit . To show that is a maximizer of (3.2), we consider separately the two alternative assumptions of the proposition.
Suppose that Assumption i. holds true. Fix . By the continuity of and the integrability assumption, there exists such that
[TABLE]
Then, by weak convergence and by Corollary 3.4,
[TABLE]
Since is arbitrary, is a maximizing sequence and , this finishes the proof.
Suppose now that Assumption ii. holds true instead. Without loss of generality it is enough to consider the case where , and we omit the index . Consider the mapping defined by . By Lemma 3.3 and Proposition 3.6 in [ambrosio2000functions], is then of bounded variation on . Then, by Theorem 3.23 in the above reference, up to taking a subsequence, we may assume that the sequence of mappings converges in to some mapping . On the other hand, in view of the weak convergence, for any continuous function ,
[TABLE]
This shows that . We conclude that
[TABLE]
as . ∎
4 Relaxed formulation of the optimal stopping MFG problem
We now give the definition of Nash equilibrium for the relaxed MFG optimal stopping problem. For the problem to be well-defined, we impose the following minimal assumption on the reward function :
Assumption 6 (-min-MFG).
For every , the map
[TABLE]
is measurable and satisfies
[TABLE]
Definition 4.1.
Given the initial distribution , a flow of measures is a Nash equilibrium for the relaxed MFG optimal stopping problem (or “relaxed Nash equilibrium”) if
[TABLE]
and
[TABLE]
for all .
In other words, the set of Nash equilibria coincides with the set of fixed points of the set-valued mapping , defined by
[TABLE]
which is well defined whenever the function satisfies the conditions of Proposition 3.6.
The next theorem establishes existence of the MFG equilibrium under the following assumption.
Assumption 7 (-Exist-MFG).
Let the reward function be of the form
[TABLE]
where, for each , are such that , , , , are bounded, and is bounded measurable and continuous with respect to its second argument.
Theorem 4.2.
Let Assumptions (X-SDE), (X-PDE), (-Compact) and (-Exist-MFG) be satisfied. Then there exists a Nash equilibrium for the relaxed MFG problem.
Proof.
We shall use the Fan-Glicksberg fixed-point theorem (Theorem 7.1 in [mclennan2018advanced]). We have seen that is a locally convex space; moreover, the subset is compact (by Lemma 3.5 and since is included in the space of positive and finite measures on a separable metric space, which is metrizable), convex and nonempty. The set-valued mapping clearly has convex values, and these values are nonempty by Proposition 3.6. Therefore, to prove that it has a fixed point it suffices to check that it is upper semicontinuous. In other words, we check that it has a closed graph (see Proposition 5.1.3 in [mclennan2018advanced]), where the graph is defined by
[TABLE]
To show that is closed it suffices to check that for any two sequences and which converge weakly to and respectively, and such that
[TABLE]
for every , we have
[TABLE]
for every . To prove this, it is enough to show that, up to taking a subsequence,
[TABLE]
and
[TABLE]
We will only show that holds true, since the convergence given by (4.2) follows by the same arguments. It is enough to consider the case and we drop the index . We therefore need to prove
[TABLE]
where we write as a shorthand for . As in the proof of Proposition 3.6, we may show that converges to in . Similarly, we may show that converges to in . Since is continuous, converges almost everywhere to . Further, by Corollary 3.4, is uniformly bounded, and (4.3) follows from the dominated convergence theorem. ∎
Uniqueness of the Nash value for the relaxed MFG problem
We prove here the uniqueness result of the Nash equilibrium value for the relaxed problem, which holds under the following assumption on the map .
Assumption 8 (-Uniq-MFG).
The function takes the following form
[TABLE]
where is such that , and are bounded, is continuous, with polynomial growth in and is bounded measurable, continuous and decreasing in the second argument.
Remark 4.3.
Note that under Assumption (-Uniq-MFG), the function satisfies for each and all and the following antimonotonicity condition
[TABLE]
Theorem 4.4 (Uniqueness of the Nash value).
Let and be two Nash equilibria for the relaxed problem and let Assumption (-Uniq-MFG) be satisfied. Then,
[TABLE]
almost everywhere on , and in particular they lead to the same value of the relaxed fixed point problem, that is .
Proof.
Since is a Nash equilibrium, we get that
[TABLE]
Since is also a Nash equilibrium, we obtain
[TABLE]
From the two above inequalities, we derive that
[TABLE]
The antimonotonicity property of the map then implies that
[TABLE]
almost everywhere on , or in other words that
[TABLE]
almost everywhere on , which implies that
[TABLE]
almost everywhere on . Integrating over we see that the two equilibria lead to the same value. ∎
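The chain of inequalities used in this proof can be summarised as follows. This is only a sketch in assumed notation: a relaxed equilibrium is written as a pair $(\mu,m)$ of an exit measure and an occupation measure, the terminal reward $g$ does not depend on the measure, and the running reward is taken of the form $f(t,x,m_t)=\bar f(t,a_t)\,\bar g(x)$ with $a_t=\int \bar g\,dm_t$. The symbols $\mu$, $a_t$ and $\bar g$ are illustrative choices and may differ from the paper's own.

```latex
% Assumed notation: (\mu, m) and (\mu', m') are two relaxed Nash equilibria,
% a_t = \int \bar g \, dm_t and a'_t = \int \bar g \, dm'_t.
\begin{align*}
&\int_0^T\!\!\int_{\mathcal{O}} f(t,x,m_t)\,(m_t-m'_t)(dx)\,dt + \int g\,d(\mu-\mu') \ge 0
  && \text{($(\mu,m)$ is optimal for the reward frozen at $m$)}\\
&\int_0^T\!\!\int_{\mathcal{O}} f(t,x,m'_t)\,(m'_t-m_t)(dx)\,dt + \int g\,d(\mu'-\mu) \ge 0
  && \text{($(\mu',m')$ is optimal for the reward frozen at $m'$)}\\
&\int_0^T \bigl(\bar f(t,a_t)-\bar f(t,a'_t)\bigr)\,(a_t-a'_t)\,dt \ge 0
  && \text{(sum of the two lines; the $g$ terms cancel)}
\end{align*}
% Since \bar f(t,\cdot) is decreasing, the integrand is \le 0 for every t, hence it
% vanishes for a.e. t. This forces \bar f(t,a_t) = \bar f(t,a'_t), and therefore
% f(t,x,m_t) = f(t,x,m'_t) for a.e. t.
```

Under these assumptions both equilibria are optimal for the same frozen reward, which is why they lead to the same value.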
Remark 4.5.
A natural question is whether a relaxed Nash equilibrium of the MFG problem can be used to construct an ε-Nash equilibrium for the N-player game. A possible way to do so is to show that, given a relaxed MFG equilibrium, the empirical measures
[TABLE]
correspond to an ε-Nash equilibrium, where, for every , maximizes
[TABLE]
This problem is left for further research.
5 Relation between the relaxed and the strong formulation of the single-agent optimal stopping and of the MFG problem and relation with mixed solutions
In this section we provide the relation between the relaxed and the strong formulation of the single-agent optimal stopping problem and of the MFG problem, as well as with the mixed solutions introduced in [bertucci2017optimal]. We make here the following additional assumption.
Assumption 9 (X-Reg).
- i. The domain is an open bounded domain of , with boundary of class and the process started with initial distribution and killed at the first exit time of has a distribution , which, for each , has a square integrable density with respect to the Lebesgue measure.
- ii. satisfies the uniform ellipticity condition.
Remark 5.1.
Let be as in Assumption (X-Reg), assume that satisfies the uniform ellipticity condition, that the coefficients and are uniformly Lipschitz continuous on and that the initial distribution admits a bounded density with respect to the Lebesgue measure. Then, by Theorem 3.16 in [friedman83], the operator admits a Green function , which is continuous in for all . Moreover, the Green function admits an Aronson-type estimate of the form
[TABLE]
see Equation (16.16) in [ladyzh]. This means that the solution to the equation
[TABLE]
with boundary condition and terminal condition is given by
[TABLE]
On the other hand, by Theorem 5.2 in [F75], this solution is given by
[TABLE]
We conclude that the Green function coincides with the density of the process started at and killed at the first exit time from . The density of the process started with the initial distribution is therefore given by
[TABLE]
Since is bounded by assumption, we conclude using the bound (5.1) that the density is uniformly bounded on . Note that the process satisfying the conditions given in this remark also satisfies the assumptions (X-SDE) and (X-PDE) (see Remark 3.2).
Note that, by Corollary 3.4, we derive that admits a square integrable density with respect to the Lebesgue measure, for each and for a.e. .
Let be a standard -dimensional Brownian motion and be a random variable with distribution , independent from . We suppose that is valued in and that admits a square integrable density with respect to the Lebesgue measure. In the sequel, we denote by the filtration given by , where denotes the sets of zero measure. Moreover, denotes the set of stopping times with respect to this filtration with values in . We also denote by the set of stopping times with respect to the (completed) filtration generated by the translated Brownian motion , , with values in .
We address first the case of the single-agent optimal stopping problem.
Theorem 5.2 (Single-agent optimal stopping problem).
Let Assumptions (X-SDE), (X-PDE), (X-Reg) and (-Exist)(ii) be satisfied. Let be the value function of the following optimal stopping problem
[TABLE]
with and . We have
- i.
- ii. Let and define . Then the measure given by for all is a maximizer of the map
- iii. Let be a maximizer of the map . Then it satisfies:
  - a. , with
  - b. For all functions such that , the following holds
[TABLE]
Proof.
Part i. By Theorem 4.7, Chapter 3, in [bensoussan1982applications], the value function defined by (5.2) is a solution belonging to , with (the Sobolev space represents the set of functions such that , with , where the derivatives are understood in the sense of distributions), which satisfies the following variational inequality
[TABLE]
First note that, by Lemma A.1, we have
[TABLE]
By classical results on optimal stopping and associated reflected Backward SDEs with random terminal time (see e.g. Proposition 2.3 in [el1997reflected]), we get
[TABLE]
where
[TABLE]
Note that, by definition of the value function , we have a.s.
Taking now the expectation in , we derive that
[TABLE]
Remark that the occupation measure associated with the diffusion process killed at the stopping time , that is , belongs to . Therefore, we have
[TABLE]
We now show the converse inequality. Fix . Using a classical method of regularisation by convolution with a standard mollifier, with respect to both time and space (see, e.g., an extension of Meyers-Serrin’s result - Theorem 3, p. 252, in [evans]), the value function can be approximated by a sequence of functions such that in as and is bounded. Since are admissible test functions, they verify the constraint (3.3). Therefore, using the assumptions on and passing to the limit, we derive that the value function satisfies
[TABLE]
From the above inequality, we derive that
[TABLE]
Since satisfies the variational inequality and due to the positivity of and Assumption (-Reg), we get
[TABLE]
Combining the two above relations and by arbitrariness of , we get
[TABLE]
Part ii. Since the stopping time given by is optimal for the stopping problem , we derive that
[TABLE]
with defined by for all .
Using part i. and the fact that , the result follows.
Part iii. Let be defined in part ii. Since by the results above it is a maximizer, we have . Therefore
[TABLE]
where the last relation follows since satisfies the variational inequality . Now, since a.e. on and satisfies , we get
[TABLE]
Using the above relation, the inequality and the fact that a.e. on , we finally obtain that
[TABLE]
and
[TABLE]
Let us now show that (5.11) implies that
[TABLE]
for all functions such that $\operatorname{supp}\phi\subseteq\{(t,x)\in[0,T]\times\mathcal{O}:\,v>0\}$.
First note that, by the same approximation procedure as the one used for the value function in Part i. (using an extension of Meyers-Serrin’s result), any non-negative function in satisfies the constraint (5.8).
Let be a non-negative function such that . Up to an appropriate scale factor, one can assume that . Suppose that
[TABLE]
Subtracting from (5.11), we obtain that
[TABLE]
Since is a non-negative function belonging to , we get a contradiction. This implies that for all non-negative functions such that we have
[TABLE]
The result can be extended to an arbitrary function (which also takes negative values) such that . Using appropriate scaling factors and similar arguments as above, one can show that and cannot be satisfied. Hence, for all functions such that , we have
[TABLE]
∎
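Parts i. and ii. of Theorem 5.2 can be illustrated on a toy discrete analogue. The sketch below assumes a finite-state Markov chain in discrete time with no killing at the boundary, a running reward `f[x]` while the agent stays and an exit reward `g[x]`. The names `solve_stopping` and `relaxed_reward`, and all numerical values, are hypothetical. It compares the dynamic-programming value with the reward of the occupation and exit measures generated by the optimal rule, and with randomly mixed stopping rules.

```python
import random

def solve_stopping(f, g, P, T):
    """Backward induction: u[t][x] = max(g[x], f[x] + sum_y P[x][y] * u[t+1][y])."""
    S = len(g)
    u = [[0.0] * S for _ in range(T)] + [list(g)]   # u[T] = g: forced exit at the horizon
    for t in range(T - 1, -1, -1):
        for x in range(S):
            cont = f[x] + sum(P[x][y] * u[t + 1][y] for y in range(S))
            u[t][x] = max(g[x], cont)
    return u

def relaxed_reward(stay, f, g, P, mu0, T):
    """Reward of the measures generated by stay-probabilities stay[t][x] in [0, 1]."""
    S = len(g)
    arriving, total = list(mu0), 0.0
    for t in range(T):
        m_t = [arriving[x] * stay[t][x] for x in range(S)]     # mass still in the game
        exit_t = [arriving[x] - m_t[x] for x in range(S)]      # mass stopped at time t
        total += sum(f[x] * m_t[x] + g[x] * exit_t[x] for x in range(S))
        arriving = [sum(m_t[x] * P[x][y] for x in range(S)) for y in range(S)]
    return total + sum(g[x] * arriving[x] for x in range(S))   # everyone exits at T

# Hypothetical toy data.
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
f = [0.3, -0.1, -0.4]
g = [0.0, 0.2, 1.0]
mu0 = [0.5, 0.3, 0.2]
T = 6

u = solve_stopping(f, g, P, T)
strong_value = sum(mu0[x] * u[0][x] for x in range(3))

# Pure rule: stay exactly where the value is strictly above the obstacle.
pure = [[1.0 if u[t][x] > g[x] else 0.0 for x in range(3)] for t in range(T)]
pure_value = relaxed_reward(pure, f, g, P, mu0, T)

# Mixed rules: arbitrary stay-probabilities, which are admissible in the relaxed problem.
rng = random.Random(0)
mixed_values = [
    relaxed_reward([[rng.random() for _ in range(3)] for _ in range(T)], f, g, P, mu0, T)
    for _ in range(200)
]
```

In this toy setting `pure_value` agrees with `strong_value` up to rounding, and no mixed rule exceeds it. This is the discrete counterpart of the statement that the measure induced by the optimal stopping time maximizes the relaxed functional.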
We now illustrate the relation between the relaxed and strong formulation of the optimization problem in the MFG context, as well as the relation with the mixed solutions introduced in [bertucci2017optimal].
Theorem 5.3 (MFG optimal stopping problem).
Let Assumptions (X-SDE), (X-PDE), (X-Reg) and (-Exist-MFG) be satisfied. Let be a Nash equilibrium of the relaxed MFG problem and let be the value function of the optimal stopping problem
[TABLE]
with and .
We have
- i. Relation with the strong formulation
- ii. Relation with mixed solutions
  satisfies
  - a. , with
  - b. For all functions such that , the following holds
[TABLE]
Proof.
The proof follows by using the results obtained in Theorem 5.2 applied to the instantaneous reward function (which satisfies Assumption (-Exist)(ii), so that Theorem 5.2 can be applied), together with the Nash equilibrium property of . ∎
Remark 5.4.
It follows from the variational inequality that on . Therefore, if on , . Such a solution is called a pure solution in [bertucci2017optimal], meaning that the agent will exit the game immediately upon entering the exercise region.
6 Fixed-point algorithm and convergence in the case of potential games
We first show that, in the case of potential games, the search for MFG equilibrium reduces to the maximization of a functional. The reward function of a potential game satisfies the following assumption.
Assumption 10 (-Pot).
The reward function is of the form
[TABLE]
where for each , is bounded, measurable in , and continuous and decreasing in the second argument, and such that , and are bounded. Moreover, for each , there exists such that and .
Proposition 6.1.
Let Assumption (-Pot) be satisfied. Then is a Nash equilibrium of the relaxed optimal stopping problem if and only if
[TABLE]
where
[TABLE]
Proof.
Assume that is a Nash equilibrium. By definition we then have
[TABLE]
Since is decreasing in the second argument, is concave in the second argument, and by concavity this implies that . Conversely, assume that is a maximizer of . For every and every , then,
[TABLE]
which implies that
[TABLE]
where . Making tend to 0 and using the dominated convergence theorem, we conclude that
[TABLE]
∎
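The concavity step of this proof can be checked numerically on a toy discretisation. The sketch assumes a scalar, cell-wise reward `f_bar(z) = 1 / (1 + z)`, which is decreasing for non-negative `z`, with primitive `log(1 + z)`, and a potential obtained by summing the primitive over cells. These choices, and the names `f_bar`, `potential` and `first_order_gap`, are illustrative and are not the paper's reward or functional.

```python
import math
import random

def f_bar(z):
    """Hypothetical decreasing reward density."""
    return 1.0 / (1.0 + z)

def potential(m):
    """Sum over cells of the primitive of f_bar, here log(1 + z)."""
    return sum(math.log1p(z) for z in m)

def first_order_gap(m, m_prime):
    """<f_bar(m), m' - m> - (F(m') - F(m)); non-negative when F is concave."""
    linear = sum(f_bar(z) * (zp - z) for z, zp in zip(m, m_prime))
    return linear - (potential(m_prime) - potential(m))

rng = random.Random(1)
gaps = []
for _ in range(1000):
    m = [rng.uniform(0.0, 2.0) for _ in range(20)]
    m_prime = [rng.uniform(0.0, 2.0) for _ in range(20)]
    gaps.append(first_order_gap(m, m_prime))
```

Every entry of `gaps` is non-negative up to rounding. Hence, if the linear term is non-positive for all admissible `m_prime` (the Nash property), then `potential(m_prime)` cannot exceed `potential(m)`, which is the direction of the proposition obtained by concavity.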
We propose now a fixed-point algorithm for potential games. We use the notations of Proposition 6.1.
Algorithm
- Fix
- For to
  - Compute the solution of the obstacle problem (5) associated with ;
  - Let be such that , for all , where (note that, for each from [math] to , we extend such that for all and ; therefore, we have a.s.);
  - Let be a maximizer of
  - Set ;
  - Set .
In the above algorithm, represents the number of iterations.
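The steps above can be sketched on a toy discrete analogue. The sketch assumes a finite-state Markov chain in discrete time, an exit reward equal to zero, and a linear-quadratic potential game with reward `a[x] - c * m[t][x]`, for which the line search has a closed form. The function names, the reward and the data are hypothetical and stand in for the paper's obstacle problem and functional. In this form the iteration resembles a conditional-gradient scheme with exact line search.

```python
def best_response(f, P, mu0):
    """Discrete obstacle problem for a frozen reward f[t][x], and the induced occupation measure."""
    T, S = len(f), len(mu0)
    u = [[0.0] * S for _ in range(T + 1)]            # u[T] = 0: no reward after the horizon
    cont = [[0.0] * S for _ in range(T)]
    for t in range(T - 1, -1, -1):
        for x in range(S):
            cont[t][x] = f[t][x] + sum(P[x][y] * u[t + 1][y] for y in range(S))
            u[t][x] = max(0.0, cont[t][x])           # exit (reward 0) or stay
    m, arriving = [], list(mu0)
    for t in range(T):
        m_t = [arriving[x] if cont[t][x] > 0.0 else 0.0 for x in range(S)]
        m.append(m_t)
        arriving = [sum(m_t[x] * P[x][y] for x in range(S)) for y in range(S)]
    return u, m

def reward(m, a, c):
    return [[a[x] - c * m_t[x] for x in range(len(a))] for m_t in m]

def potential(m, a, c):
    return sum(a[x] * z - 0.5 * c * z * z for m_t in m for x, z in enumerate(m_t))

def inner(p, q):
    return sum(pv * qv for p_t, q_t in zip(p, q) for pv, qv in zip(p_t, q_t))

def fixed_point(a, c, P, mu0, T, n_iter):
    m = [[0.0] * len(a) for _ in range(T)]           # m^0: every agent exits at time 0
    values = [potential(m, a, c)]
    for _ in range(n_iter):
        f = reward(m, a, c)
        _, m_tilde = best_response(f, P, mu0)        # occupation measure of the best response
        d = [[mt - mm for mt, mm in zip(r1, r2)] for r1, r2 in zip(m_tilde, m)]
        curvature = c * inner(d, d)
        # Exact maximizer over [0, 1] of rho -> F(m + rho * d) for the quadratic potential.
        rho = 0.0 if curvature == 0.0 else min(1.0, max(0.0, inner(f, d) / curvature))
        m = [[mm + rho * dd for mm, dd in zip(r1, r2)] for r1, r2 in zip(m, d)]
        values.append(potential(m, a, c))
    f = reward(m, a, c)
    u, m_tilde = best_response(f, P, mu0)
    gap = inner(f, m_tilde) - inner(f, m)            # what a deviating agent could still gain
    return m, u, values, gap

# Hypothetical toy data.
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
a = [1.0, 0.2, -0.5]
mu0 = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
m_eq, u_eq, values, gap = fixed_point(a, 4.0, P, mu0, T=5, n_iter=200)
```

In this toy model the recorded `values` are non-decreasing, mirroring the monotonicity of the potential along the iterates, and `gap` measures how far the final iterate is from a relaxed equilibrium.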
For each , define
[TABLE]
Lemma 6.2.
Let Assumptions (X-SDE), (X-PDE), (X-Reg) and (-Pot) be satisfied. The set-valued map has a closed graph and is a relaxed Nash equilibrium if and only if it satisfies .
Proof.
Let be a sequence converging weakly to some and such that converges weakly to some . Let us prove that . Taking subsequences if necessary, we can assume that converges to some and weakly converges to some .
Since maximizes the map , we get that
[TABLE]
for all . For simplicity, we consider here the case and drop the index . Using the same arguments as those in the proof of Theorem 4.2, we may say that, up to taking subsequences, the sequence $(g*m^{n})_{n\geq 1}$ (resp. $(g*m^{n^{*}})_{n\geq 1}$) converges in to (resp. ). Due to the continuity of , we derive that $\bar{f}(t,g*m^{n}_{t})\,g*m^{n^{*}}_{t}$ (resp. $\bar{f}(t,g*m^{n}_{t})\,g*m_{t}$) converges for a.e. to $\bar{f}(t,g*\hat{m}_{t})\,g*m^{*}_{t}$ (resp. $\bar{f}(t,g*\hat{m}_{t})\,g*m_{t}$). By Corollary 3.4, $g*m^{n^{*}}$ is uniformly bounded; therefore, by the dominated convergence theorem, we derive
[TABLE]
for all , that is
[TABLE]
Now it remains to show that is a maximizer of . For each , we have $F(m^{n}+\hat{\rho}^{n}(m^{n^{*}}-m^{n}))\geq F(m^{n}+\rho(m^{n^{*}}-m^{n}))$, for all , for all . Taking the limit and using similar arguments as above, as well as the assumptions on , we get
[TABLE]
To conclude, we have .
It is clear that, if is a relaxed Nash equilibrium, then it satisfies . Conversely, one can show that if , then corresponds to a relaxed Nash equilibrium. Indeed, if , then we have or . If , then $\int_{0}^{T}\int_{\mathcal{O}}f(m_{t})(m_{t}^{*}(dx)-m_{t}(dx))dt\leq 0$. Since is a maximizer of the map $m^{\prime}\mapsto\int_{0}^{T}\int_{\mathcal{O}}f(m_{t})m_{t}^{\prime}(dx)dt$, we derive that $\int_{0}^{T}\int_{\mathcal{O}}f(m_{t})(m_{t}^{*}(dx)-m_{t}(dx))dt=0$, which implies that corresponds to a relaxed Nash equilibrium. If , the conclusion is clear. ∎
We now give the following convergence result.
Theorem 6.3.
Let Assumptions (X-SDE), (X-PDE), (X-Reg) and (-Pot) be satisfied. Then the cluster points of the sequence $(m^{n})_{n\geq 1}$ generated by the previous algorithm belong to the set of relaxed Nash equilibria and the sequence $(u^{n}(0,x))_{n\geq 1}$ converges for all to , the value function of the obstacle problem associated with cost functional , where is a relaxed Nash equilibrium.
Proof.
First note that, by using the definition of and Theorem 5.2 part ii., we get that $\tilde{m}^{n}\in\underset{m^{\prime}\in\mathcal{A}(m^{*}_{0})}{\arg\max}\int_{0}^{T}\int_{\mathcal{O}}f(m^{n}_{t})m_{t}^{\prime}(dx)dt$. We thus have $m^{n+1}\in\mathcal{C}(m^{n})$, for all .
Let $(m^{k_{n}})_{n\geq 1}$ be a sequence converging weakly to some , and taking a subsequence again if necessary, we may also assume that converges to some . Since, by Lemma 6.2, the set-valued map has a closed graph, we have , that is , with $m^{*}\in\underset{m^{\prime}\in\mathcal{A}(m^{*}_{0})}{\arg\max}\int_{0}^{T}\int_{\mathcal{O}}f(m_{t})m_{t}^{\prime}(dx)dt$ and
Now, since the sequence $(F(m^{n}))_{n\geq 1}$ is increasing, one has . Assume now that is not a Nash equilibrium, that is $m\notin\underset{m^{\prime}\in\mathcal{A}(m^{*}_{0})}{\arg\max}\int_{0}^{T}\int_{\mathcal{O}}f(m_{t})m_{t}^{\prime}(dx)dt$. Therefore, $\int_{0}^{T}\int_{\mathcal{O}}f(m_{t})(m_{t}^{*}(dx)-m_{t}(dx))dt>0$. Moreover, using Lemma 6.2, we have which implies that . Hence, we conclude that , which is a contradiction.
Let us now prove the convergence of the sequence $(u^{n}(0,x))_{n\geq 1}$ for all .
Since all Nash equilibria lead to the same value (see Theorem 4.4), we can define as being the solution of the obstacle problem associated with , with a Nash equilibrium.
Let be a given subsequence. Up to extracting a subsequence again, one can assume that converges weakly to some , which, by the results above, is a relaxed Nash equilibrium.
Fix . We have:
[TABLE]
Using again the convergence in of to , the assumptions on together with Lebesgue's dominated convergence theorem, we get that the last term of the above inequality converges to 0. We can conclude that from every subsequence of , we can extract a further subsequence which converges to . The result follows. ∎
7 Acknowledgement
Peter Tankov gratefully acknowledges financial support from the LABEX ECODEC (ANR-11-IDEX-0003/LabexEcodec/ANR-11-LABX-0047) and from the FIME Research Initiative.
Appendix A
We show here that the representation (5.2) remains true when the initial condition is random. More precisely, we have the following result.
Lemma A.1.
Let $\xi\in L^{2}(\mathcal{O},\mathcal{F}_{0})$. Then we have
[TABLE]
Proof.
The proof is based on quite classical arguments and we give it here for the reader’s convenience. Let us first consider a simple random variable $\xi^{n}\in L^{2}(\mathcal{O},\mathcal{F}_{0})$, for which there exist , and such that
[TABLE]
By using the definitions of and , we obtain
[TABLE]
Now, in the general case, we approximate by a sequence of simple random variables of the form given by (A.2). The continuity of with respect to implies that
[TABLE]
We have
[TABLE]
Since $\xi^{n}\rightarrow\xi$ a.s. as $n\rightarrow\infty$, we get that $\tau_{\mathcal{O}}^{\xi^{n}}\rightarrow\tau_{\mathcal{O}}^{\xi}$ a.s. as $n\rightarrow\infty$ due to the continuity property of the first passage time for elliptic diffusions (see Proposition 4.4 in [pardoux1998backward]). Using the continuity property of the solution of the SDE with respect to the initial condition, together with the assumptions on and Lebesgue's dominated convergence theorem, it follows that
[TABLE]
By (A.3) and (A.4) and the uniqueness of the limit, we get (A.1). ∎
References
- [1] Ambrosio, L., Fusco, N. and Pallara, D. (2000), Functions of bounded variation and free discontinuity problems, Vol. 254, Clarendon Press Oxford.
- [2] Benamou, J. and Carlier, G. (2015), ‘Augmented Lagrangian methods for transport optimization, mean field games and degenerate elliptic equations’, Journal of Optimization Theory and Applications 167(1), 1–26.
- [3] Bensoussan, A. and Lions, J.-L. (1982), Applications of variational inequalities in stochastic control, North Holland Publishing Company.
- [4] Bertucci, C. (2017), ‘Optimal stopping in mean field games, an obstacle problem approach’, Journal de Mathématiques Pures et Appliquées.
- [5] Bertucci, C. (2018), ‘A remark on Uzawa’s algorithm and an application to mean-field games systems’, https://arxiv.org/pdf/1810.01181.pdf.
- [6] Bogachev, V. I. (2007), Measure theory, Springer Science & Business Media.
- [7] Brezis, H. (2010), Functional analysis, Sobolev spaces and partial differential equations, Springer.
- [8] Buckdahn, R., Goreac, D. and Quincampoix, M. (2011), ‘Stochastic optimal control and linear programming approach’, Applied Mathematics & Optimization 63(2), 257–276.
