Linear quadratic mean field games with a major player: The multi-scale approach
Yan Ma, Minyi Huang

TL;DR
This paper investigates linear quadratic mean field games with a major player, using a multi-scale approach to analyze asymptotic solvability, derive Riccati equations, and interpret strategies as best responses in an infinite population.
Contribution
It introduces a re-scaling technique to reduce coupled equations, providing necessary and sufficient conditions for asymptotic solvability and linking strategies to mean field approximations.
Findings
Derived Riccati equations in lower dimensions for solvability
Established conditions for asymptotic solvability
Interpreted strategies as best responses in an infinite population
Abstract
This paper considers linear quadratic (LQ) mean field games with a major player and analyzes an asymptotic solvability problem. It starts with a large-scale system of coupled dynamic programming equations and applies a re-scaling technique introduced in Huang and Zhou (2018a, 2018b) to derive a set of Riccati equations in lower dimensions, the solvability of which determines the necessary and sufficient condition for asymptotic solvability. We next derive the mean field limit of the strategies and the value functions. Finally, we show that the two decentralized strategies can be interpreted as the best responses of a major player and a representative minor player embedded in an infinite population, which have the property of consistent mean field approximations.
Yan Ma: School of Mathematics and Statistics, Zhengzhou University, Zhengzhou, 450001, China
Minyi Huang: School of Mathematics and Statistics, Carleton University, Ottawa, ON K1S 5B6, Canada
keywords:
asymptotic solvability, linear quadratic, mean field game, major and minor players, re-scaling, Riccati differential equation
This paper was not presented at any IFAC meeting. This work was supported by the National Science Foundation of China (No. 11601489), the Startup Research Fund of Zhengzhou University (No. 129-51090091), the Outstanding Young Talent Research Fund of Zhengzhou University (No. 129-32210453), and the Natural Sciences and Engineering Research Council (NSERC) of Canada. Submitted to Automatica, Jan 2019; revised Aug 2019. This version contains a more detailed Sec. 5 than the revised journal submission. Corresponding author: M. Huang.
1 Introduction
Mean field game theory has undergone phenomenal growth. It provides a powerful methodology for handling complexity in noncooperative mean field decision problems. Readers are referred to (Caines, Huang, and Malhamé, 2017) for an overview. Most existing analysis has been developed along two routes, called the direct approach and the fixed point approach. In the direct approach, one starts by formally solving an N-player game to obtain a large coupled system of solution equations, and next derives a simple limiting equation system by taking N → ∞; see (Lasry and Lions, 2007) for the limit consisting of a Hamilton-Jacobi-Bellman (HJB) equation and a Fokker-Planck-Kolmogorov (FPK) equation. In the fixed point approach, one determines the best response of a representative agent to a mean field of an infinite population, and then requires that all the agents’ best responses regenerate that mean field (Huang, Malhamé, and Caines, 2006). This procedure formalizes a fixed point problem, which can be solved and further used to design decentralized strategies. For LQ mean field games, the recent work (Huang and Zhou, 2018b) shows the exact relationship between the two approaches. In general, the fixed point approach is more flexible and can be implemented in diverse models (Huang, Caines, and Malhamé, 2007; Li and Zhang, 2008; Bensoussan et al, 2013; Huang and Ma, 2016; Carmona and Delarue, 2018). Further convergence analysis in the direct approach can be found in (Cardaliaguet et al, 2015; Lacker, 2016; Fischer, 2017). Mean field games have found applications in traffic routing (Bauso, Zhang, and Papachristodoulou, 2017), smart grids (Couillet et al, 2012; Ma, Callaway, and Hiskens, 2013; Kizilkale, Salhab, and Malhamé, 2019) and production planning (Wang and Huang, 2019), among others. A notable feature of the early literature on mean field games is that all players in the model are comparably small, and can be called peers.
Huang (2010) introduces an LQ mean field game model with a major player which has strong influence. A motivating example is the interaction between a large corporation and many much smaller firms. The literature on mean field games with a major player and many minor players has grown rapidly. In the setting of LQ models, Nguyen and Huang (2012a) consider continuum parametrized minor players, and Nguyen and Huang (2012b) extend the analysis to mass behavior directly impacted by the major player. Kordonis and Papavassilopoulos (2015) analyze minor players with random entrance. Major players with leadership are studied by Bensoussan et al (2017) and Moon and Basar (2018). Partial state observation is considered by Caines and Kizilkale (2017) and Firoozi and Caines (2015). Huang, Wang and Wu (2016) use linear backward stochastic differential equations to model the dynamics of the players. Huang, Jaimungal, and Nourian (2015) present an application of the major player mean field game theory to an optimal execution model with an institutional trader and a large number of small traders.
Major-minor player games with nonlinear diffusion dynamics are an important class of models; see Nourian and Caines (2013), Buckdahn, Li and Peng (2014), Bensoussan, Chau and Yam (2016), Carmona and Zhu (2016). Leader-follower interaction is adopted by Bensoussan et al (2015) and Fu and Horst (2018). To deal with this nonlinear modelling, forward-backward stochastic differential equations provide a vital analytical tool. Sen and Caines (2016) apply nonlinear filtering when the major player’s state is partially observed. More recently, Lasry and Lions (2018) introduce master equations for mean field games with major and minor players. They may be viewed as a pair of abstract dynamic programming equations. Cardaliaguet, Cirant, and Porretta (2018) use the master equations to prove the convergence of the Nash equilibria when the number of minor players tends to infinity. A mean field principal-agent model is formulated by Elie, Mastrolia, and Possamai (2019). For major player models with discrete states, see (Huang 2012; Carmona and Wang, 2017; Kolokoltsov, 2017).
Huang (2010) applies a state space augmentation approach by adding the mean field dynamics into the two decision problems, one for the major player and one for a representative minor player. This Markovianizes the problem and enables the use of dynamic programming. The procedure of Huang (2010) is based on the fixed point approach and the associated consistent mean field approximations, and that work only assumes existence of the solution.
This paper analyzes the LQ mean field game with a major player and homogeneous (or symmetric) minor players and takes the direct approach by starting with the solution for N+1 players. Specifically, we will extend an asymptotic solvability notion introduced in a recent work Huang and Zhou (2018a) for LQ mean field games without a major player. With or without a major player, asymptotic solvability can be informally stated as the existence of Nash equilibria with complete state information for all sufficiently large population sizes, in addition to some boundedness property of the solution. We exploit the multi-scale nature of the optimization problem and use a re-scaling method in Huang and Zhou (2018a, 2018b) so that the key information in some higher order terms, as components in the solution matrices of coupled Riccati equations, can be captured. We derive the necessary and sufficient condition for asymptotic solvability and evaluate the value function. The re-scaling method gives a set of ordinary differential equations (ODEs) for nine matrix functions. To reveal the special structure underlying these functions, we will further relate them to the best responses of the major player and a representative minor player embedded in an infinite population, where consistent mean field approximations hold. The latter is a key feature of the fixed point approach in mean field games. Our mean field limit analysis is similar to (Cardaliaguet, Cirant, and Porretta, 2018), which performs convergence analysis in a nonlinear system via the master equation. But we explicitly exploit the multi-scale phenomena in our model to identify a lower dimensional object which governs the asymptotic behavior of the system when the number of minor players tends to infinity. Similar methods appear in the statistical physics literature on mean field models (Ott and Antonsen, 2008; Pazo and Montbrio, 2014).
We mention other related LQ models in which mean field limits are found by analyzing large-scale equations. Papavassilopoulos (2014) uses large algebraic Riccati equations in mean field games and analyzes existence by an implicit function theorem. Priuli (2015) considers coupled HJB and FPK equations with decentralized information. Mean field social optimal control is analyzed in (Huang 2003, Chap. 6; Herty, Pareschi, and Steffensen, 2015) via large Riccati equations.
The paper is organized as follows. Section 2 describes the LQ Nash game with N+1 players together with its solution via dynamic programming and Riccati equations. Section 3 extends the formulation of asymptotic solvability in Huang and Zhou (2018a, 2018b) to the LQ model with a major player. Section 4 presents further mean field limits and the performance. Section 5 formulates two optimal control problems under a mean field generated by an infinite number of minor players and addresses the relation to the asymptotic solvability problem. Numerical examples are presented in Section 6. Section 7 concludes the paper.
Notation: For symmetric matrix , we may write . For a matrix , denote the -norm . Let the function be defined for in a subset of a Euclidean space and parameter for some . We say is compactly of if for each compact subset , there exists a constant depending on such that .
2 The LQ game with major and minor players
We consider the LQ game with a major player and minor players , . At time , the states of and are, respectively, denoted by and , . The dynamics of the players are given by a system of linear stochastic differential equations (SDEs):
[TABLE]
where we have state , control , and . The initial states are independent with and finite second moment. The standard -dimensional Brownian motions are independent and also independent of the initial states. The deterministic constant matrices , , , , , , , , have compatible dimensions. Denote . The costs of players , , are given by
[TABLE]
The deterministic constant matrices (or vectors) , , , , , , , , , , , , , , , above have compatible dimensions, and , , , , , . For notational simplicity, we only consider constant parameters. Our analysis can be easily extended to the case of time-dependent parameters. Define
[TABLE]
We denote by a matrix with all entries equal to 1, and by the column vectors the canonical basis of . For instance, . For matrices , , the Kronecker product . We may use a subscript to indicate the identity matrix to be .
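The matrix notation above (all-ones matrices, canonical basis vectors, Kronecker products) can be made concrete with a small sketch. The helper names `kron`, `ones`, `identity` and `basis_vector` are hypothetical and chosen only for illustration; matrices are plain lists of rows so that the block needs no external library.

```python
def kron(a, b):
    """Kronecker product of matrices a and b (lists of rows): block (i, j) is a[i][j] * b."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def ones(rows, cols):
    """Matrix with all entries equal to 1."""
    return [[1] * cols for _ in range(rows)]


def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def basis_vector(k, n):
    """k-th canonical basis vector of R^n (1-indexed), stored as an n x 1 column."""
    return [[1 if i == k - 1 else 0] for i in range(n)]


# The all-ones matrix Kronecker the identity repeats the identity block in every position.
repeated_identity = kron(ones(2, 2), identity(2))

# A basis vector Kronecker the identity places the identity in one block row and zeros elsewhere.
block_selector = kron(basis_vector(2, 3), identity(2))
```

With NumPy available, `numpy.kron` computes the same product; the list-based version is used here only to keep the sketch dependency-free.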
Now we write (1) and (2) in the form
[TABLE]
where . We consider closed-loop perfect state (CLPS) information so that is observed by each player, and look for Nash strategies in this section. Let denote the strategies of all players other than . A set of strategies is a Nash equilibrium if for any , we have
[TABLE]
for any state feedback based strategy which together with ensures a unique solution of on . Denote the value function of by , , which corresponds to the initial time-state pair in (5), i.e., at the initial time , and can be interpreted as evaluated on the time interval under the set of Nash strategies. The set of value functions is determined by the system of HJB equations
[TABLE]
and
[TABLE]
where and the minimizers are
[TABLE]
Next we substitute and into (2) and (2):
[TABLE]
and
[TABLE]
Suppose has the following form
[TABLE]
where is symmetric. Then
[TABLE]
Denote
[TABLE]
where is the th submatrix in (14). We have and . We write
[TABLE]
We may write , , in a similar form.
We substitute (13) into (2) and derive the equation systems:
[TABLE]
[TABLE]
[TABLE]
By (2) and (13), we derive the equation systems:
[TABLE]
[TABLE]
[TABLE]
Remark 1.
If (17) and (20) have a solution on , such a solution is unique due to the local Lipschitz continuity of the vector field; see (Hale, 1969). The ODE guarantees each , , to be symmetric. If (17) and (20) have a unique solution on , then we can uniquely solve and .
Lemma 1.
Suppose that (17) and (20) have a unique solution on . Then we can uniquely solve (18), (19), (21), (22), and the Nash game of players has a set of feedback Nash strategies given by
[TABLE]
Proof. This lemma follows from the standard results in (Basar and Olsder, 1999, Theorem 6.16, Corollary 6.5).
By Lemma 1 and Remark 1, the solution of the feedback Nash strategies completely reduces to the study of (17) and (20).
3 Asymptotic solvability
Define the identity matrix
[TABLE]
For , exchanging the th and th rows of submatrices in , let denote the resulting matrix. For instance, we have
[TABLE]
It is easy to check that .
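As a minimal sketch of the block-row exchange described above, the following code builds an identity matrix partitioned into square blocks, swaps two block rows, and checks two standard properties of such an exchange matrix: it is symmetric and it is its own inverse. The function names, the block count and the block size are assumptions made for illustration and are not the paper's notation.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def block_exchange(num_blocks, block_size, i, j):
    """Identity of size num_blocks * block_size with block rows i and j swapped (1-indexed)."""
    order = list(range(num_blocks))
    order[i - 1], order[j - 1] = order[j - 1], order[i - 1]
    size = num_blocks * block_size
    mat = [[0] * size for _ in range(size)]
    for new_block, old_block in enumerate(order):
        for k in range(block_size):
            # Row k of the new block carries the identity block taken from old_block.
            mat[new_block * block_size + k][old_block * block_size + k] = 1
    return mat


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Three blocks of size 2, exchanging block rows 1 and 3.
J = block_exchange(3, 2, 1, 3)
is_symmetric = transpose(J) == J
is_involution = matmul(J, J) == identity(6)
```

Multiplying a block-partitioned matrix by `J` on the left and by its transpose on the right swaps the corresponding block rows and block columns, which is the kind of relabeling of minor players that symmetry arguments rely on.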
Theorem 2.
and have the representation:
[TABLE]
and
[TABLE]
where each submatrix depends on and is . Moreover, for .
Proof. See Appendix A.
Definition 3.
The sequence of Nash games (1)-(4) has asymptotic solvability if there exists such that for all , has a solution on and
[TABLE]
Note that (27) is equivalent to
[TABLE]
Denote
[TABLE]
Define
[TABLE]
For (17) and (20) we write the ODE system for the set of variables in (33); see Appendix B. The following ODE system is obtained as the limit of the above ODE system with respect to :
[TABLE]
where the terminal conditions are
[TABLE]
Theorem 4.
The sequence of games in (1)-(4) has asymptotic solvability if and only if (56) has a solution on .
Proof. See Appendix B.
Due to the quadratic terms in its right hand sides, we call (56) a system of Riccati ODEs. As shown in Section 5, this set of solution functions can be interpreted in terms of two optimal control problems.
4 Equilibrium costs and decentralized control
For this section, we assume (56) has a solution on . Therefore there exists such that for all , (17) and (20) have a solution on .
Proposition 5.
Let be the solution of (18) and (21). We have the representation
[TABLE]
where each vector of is in and is the th component of .
Proof. The method is similar to that used to prove Theorem 2, and we omit the details.
Define
[TABLE]
We derive a set of ODEs for ; see Appendix C. By taking the limit form of these equations with respect to , we introduce the ODE system:
[TABLE]
where
[TABLE]
After (56) is solved, satisfies a linear ODE system and can be uniquely solved on .
Proposition 6.
We have
[TABLE]
Proof. We consider the first ODE system for and , and the second ODE system for and . By (Huang and Zhou, 2018b, Theorem 4), we obtain the error bound.
In view of (19) and (22), we obtain
[TABLE]
where and . For , we can uniquely solve and . It is clear that does not depend on . We rewrite
[TABLE]
As the approximation of (63) and (64), we introduce the ODE system
[TABLE]
where and , and we solve on .
Proposition 7.
We have
[TABLE]
Proof. The proof is similar to that of Proposition 6.
Assumption (H): The initial states , are i.i.d. and has mean , and covariance . In addition, has mean and covariance
Denote the set of Nash strategies given by (23)-(24).
Proposition 8.
Under Assumption (H), the costs under the set of strategies have the asymptotic form
[TABLE]
[TABLE]
Proof. Note that
[TABLE]
and
[TABLE]
where . Similarly we have
[TABLE]
We complete the proof by elementary computations and taking limits.
Substituting and into (1) and (2), we have
[TABLE]
When , we obtain a limit form of the strategies
[TABLE]
where is the infinite population limit of the state average of the minor players in (69). For the player game, we replace in the strategies (70)-(71) by and write the closed-loop system of equations:
[TABLE]
where is generated by the players instead of an infinite population. Denote the strategies in (72)-(73) by . Following the standard mean square error estimate of for (72)-(73), under assumption (H) we can show that is an -Nash equilibrium for the player game, where and each player may use centralized state information ; see related methods in (Huang, 2010).
By the re-scaling technique, we derive the mean field limits of the costs and strategies. The feasibility condition is determined by (56) directly based on the model parameters in (1)-(4). This is different from (Huang, 2010), where the existence condition is described in an augmented state space in dimensions and imposes consistency requirements on matrices.
5 The limiting control problems and best responses
For this section, we assume (56) has a solution on .
An interesting question is whether the above two limit strategies in (70)-(71) have the interpretation as best responses in appropriately constructed optimal control problems. Finding the best response of a single agent in an infinite population model has been a key step in the fixed point approach in mean field games; see (Huang, Caines, and Malhamé, 2007; Huang, 2010). We introduce two optimal control problems.
Problem (P0): The dynamics are given by
[TABLE]
where and are given. Equation (75) may be viewed as the limit of (69) but now is indirectly controlled by in (74). The cost is
[TABLE]
Problem (P1): The dynamics are given by
[TABLE]
where and are given. The notation is reused in Problem (P1), where has taken a specific form in (76). Equation (76) can be viewed as a limit form of (72) when . Since the two problems will be solved separately, this should cause no risk of confusion. The cost is
[TABLE]
Since , both Problem (P0) and Problem (P1) can be solved. The resulting optimal control laws will also be called best responses.
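For orientation, the dynamic programming step used for both problems follows the pattern of a generic textbook stochastic LQ problem. The display below is a sketch of that generic pattern only: the symbols $A$, $B$, $D$, $Q$, $R$, $Q_f$, $P$ and $r$ are local to this illustration and are assumptions, not the coefficients of Problems (P0) and (P1), which additionally involve an augmented state and linear terms.

```latex
% Generic LQ problem (illustrative symbols, not the paper's):
%   dx_t = (A x_t + B u_t)\,dt + D\,dW_t,
%   J(u) = E\Big[\int_0^T \big(x_t^T Q x_t + u_t^T R u_t\big)\,dt + x_T^T Q_f x_T\Big],
%   with Q \ge 0, Q_f \ge 0, R > 0.
\begin{align*}
  V(t,x) &= x^T P(t)\, x + r(t), \\
  -\dot P(t) &= A^T P(t) + P(t) A - P(t) B R^{-1} B^T P(t) + Q, & P(T) &= Q_f, \\
  -\dot r(t) &= \operatorname{tr}\big(D^T P(t) D\big), & r(T) &= 0, \\
  u^*(t,x) &= -R^{-1} B^T P(t)\, x .
\end{align*}
```

The positive definiteness of the control weight is what makes the minimization in the HJB equation well posed and gives the feedback law in closed form; the same mechanism yields the best responses in the two limiting problems.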
Below, we start with the solution of Problem (P0). Denote
[TABLE]
Then we have
[TABLE]
We further write
[TABLE]
By dynamic programming for the optimal control problem (P0), we introduce
[TABLE]
where the terminal condition can be determined as
[TABLE]
We uniquely solve and on . Note that is a matrix. The optimal control law is
[TABLE]
Denote
[TABLE]
where and are symmetric. Then by (97), we derive the ODE system:
[TABLE]
where
[TABLE]
And by (102),
[TABLE]
where
[TABLE]
Finally, we rewrite as
[TABLE]
Now we give the solution of Problem (P1). Denote
[TABLE]
The dynamics can be given as
[TABLE]
By dynamic programming, we introduce the two ODEs:
[TABLE]
where
[TABLE]
We uniquely solve and on . Denote
[TABLE]
where , and are symmetric. By (122), we derive the ODE system:
[TABLE]
where
[TABLE]
Now by (129), we derive
[TABLE]
where
[TABLE]
The optimal control law is given by
[TABLE]
Theorem 9.
We have
[TABLE]
and
[TABLE]
Proof. We can directly show that
[TABLE]
is a solution of (56). Similarly, satisfies the ODE of . Therefore, we obtain the representation of .
By Theorem 9, after appropriate arrangement the matrix functions in the solution of (56) have the interpretation as the solutions of two Riccati-like equations.
It is now clear that agrees with given by (70)-(71). Then we have the following interpretation on within an infinite population of minor players. First, is the best response with respect to ; second, is the best response with respect to ; and finally, is generated by the infinite number of minor players applying their best responses. This suggests a consistent mean field approximation, which is well known in the fixed point approach of mean field games (Caines, Huang, and Malhamé, 2017). In our mean field limit here, the consistent mean field approximation is a derived property. This is in contrast to the major player model in (Huang, 2010; Carmona and Zhu, 2016), where consistent mean field approximations are imposed as a requirement at the beginning so that individual strategies can be determined.
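The consistency property discussed above can be illustrated in a deliberately simplified static setting, which is not the dynamic model of this paper. Assume, hypothetically, that each minor agent's best response to a conjectured population mean `m` is the affine map `a + gamma * m`. Since all agents are symmetric, the regenerated mean equals that best response, and a consistent mean field is a fixed point of the map. The function name and the parameters `a` and `gamma` are assumptions made for this sketch.

```python
def consistent_mean_field(best_response, m0=0.0, tol=1e-12, max_iter=1000):
    """Iterate m <- best_response(m) until the conjectured and regenerated means agree."""
    m = m0
    for _ in range(max_iter):
        # With identical agents, the population mean regenerated by the best
        # responses equals the best response to the conjectured mean m.
        m_next = best_response(m)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical affine best response with a = 1 and gamma = 0.5 (a contraction).
a, gamma = 1.0, 0.5
m_star = consistent_mean_field(lambda m: a + gamma * m)
```

For this toy map the fixed point is `a / (1 - gamma)` whenever `abs(gamma) < 1`, and the iteration is a contraction. In the fixed point approach such consistency is imposed from the outset, whereas in the direct approach taken here it emerges from the limit.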
6 Numerical examples
We have seen that testing asymptotic solvability reduces to checking the solution of (56) on . It is generally infeasible to solve (56) analytically. Its numerical solution provides a practical means to check asymptotic solvability.
Example 1.
The parameters in (1)-(4) are given by , , , , , , , , , , , , , , , and . We use the ode45 solver in MATLAB to numerically solve (56) on ; see Fig. 1. The existence of the solution suggests that asymptotic solvability holds.
Example 2.
, , , , , , , , , , , , , , , and . The numerical solution of (56) has a finite escape time between and as shown in Fig. 2, suggesting no asymptotic solvability.
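The kind of numerical check used in these examples can be sketched on a toy scalar Riccati equation rather than on system (56), whose coefficients are not reproduced here. The sketch integrates a terminal value problem backward in time with a fixed-step classical Runge-Kutta scheme (swapped in for MATLAB's adaptive ode45) and reports either the value at the initial time or an approximate escape time. The function name, the step size and the blow-up threshold are assumptions.

```python
import math


def riccati_backward(rhs, terminal_value, horizon, step=1e-3, bound=1e6):
    """Solve -dp/dt = rhs(p), p(T) = terminal_value, backward from T to 0 with RK4.

    Returns (p(0), None) if |p| stays below `bound` on [0, T]; otherwise returns
    (None, tau), where tau approximates the time-to-go T - t at which |p| first
    exceeds `bound` (a numerical indicator of a finite escape time).
    """
    n_steps = round(horizon / step)
    p = terminal_value
    for k in range(n_steps):
        # In reversed time s = T - t the equation reads dp/ds = rhs(p).
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * step * k1)
        k3 = rhs(p + 0.5 * step * k2)
        k4 = rhs(p + step * k3)
        p += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        if not abs(p) < bound:  # also triggers on inf or NaN
            return None, (k + 1) * step
    return p, None


def toy_rhs(p):
    # Toy equation -dp/dt = p^2 + 1 with p(T) = 0, whose exact solution is tan(T - t).
    return p * p + 1.0


value_short, escape_short = riccati_backward(toy_rhs, 0.0, horizon=1.0)
value_long, escape_long = riccati_backward(toy_rhs, 0.0, horizon=2.0)
```

The toy solution exists on a horizon of length 1 but blows up once the time-to-go reaches pi/2, so the second call reports an escape time. For the matrix system (56) the same test would be applied to the stacked entries of all matrix functions, ideally with an adaptive solver.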
7 Concluding remarks
We study an asymptotic solvability problem for LQ Nash games involving a major player and N minor players, where N tends to infinity. We obtain the necessary and sufficient condition for asymptotic solvability via a system of Riccati ODEs and evaluate the equilibrium costs. The system of Riccati ODEs is closely related to a limiting control model of two players: the major player and a representative minor player. For future work, it is of interest to generalize our analysis to deal with leadership (Bensoussan et al, 2017), noisy measurements (Firoozi and Caines, 2015), and control constraints (Hu, Huang and Li, 2018).
Appendix A: Proof of Theorem 2
Lemma A.1.
Assume that (17) and (20) have a solution on . Then the following holds.
i) has the representation
[TABLE]
where , and are symmetric matrix functions. The matrix appears for times.
ii) has the representation
[TABLE]
where , , and are symmetric matrix functions. The matrix appears for times.
iii) For , .
Proof. i) For , denote where is an matrix. Let . By the method in (Huang and Zhou, 2018b, Lemma A.1) and elementary matrix computations, we can verify that satisfies (17) and (20). Hence,
[TABLE]
Then and , and we obtain .
Taking in place of and following the method in (Huang and Zhou, 2018b), we obtain the representation of .
ii) Now denote , and we can verify that
[TABLE]
is a solution of (17) and (20). Hence,
[TABLE]
This yields , and . In addition, . Since is symmetric, . So is symmetric. Similarly, by the relation we obtain . Now, repeatedly using the relation for all , we obtain the representation of . Note that is symmetric.
iii) This equality can be shown as in the case in the proof of part i).
Proof of Theorem 2: By Lemma A.1, we have
[TABLE]
and
[TABLE]
and
[TABLE]
and
[TABLE]
We have , and
[TABLE]
It follows that
[TABLE]
By Lemma A.1, we have
[TABLE]
and
[TABLE]
and
[TABLE]
and
[TABLE]
and
[TABLE]
and
[TABLE]
and
[TABLE]
We have , and
[TABLE]
Therefore, for all . This completes the proof.
Appendix B: Proof of Theorem 4
Step 1. By (A.1), (Appendix A: Proof of Theorem 2) and (A.5), we determine
[TABLE]
[TABLE]
[TABLE]
where , , , and
[TABLE]
For reasons of space, the expression of is not displayed. We further obtain
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
where are not displayed and are compactly of .
Step 2. Proof of Theorem 4.
Denote for (33), and we view each of , , , , , , , as a function of with parameter . They are all compactly of . For some , we further have
[TABLE]
Subsequently, we view the ODE system (B.1)-(B.9) as a slightly perturbed form of (56). The remaining proof is similar to that of (Huang and Zhou, 2018b, Theorem 5) and we only give its sketch. If asymptotic solvability holds, we solve (B.1)-(B.9) for all sufficiently large . By taking some increasing subsequence of population sizes , we can ensure that as , their solutions have a limit as a vector function on which satisfies the limit ODE (56) on . Conversely, if (56) has a solution on , there exists such that (B.1)-(B.9) has a solution on for all ; all these solutions are uniformly bounded. Accordingly we obtain to satisfy (27) for all . So asymptotic solvability holds.
Appendix C
In view of (18) and (21), by Proposition 5 we have
[TABLE]
where
[TABLE]
and
[TABLE]
where
[TABLE]
By (C.5), we have the relation
[TABLE]
where , and By (C.15), we have
[TABLE]
where the terminal condition is
[TABLE]
and , , are compactly of .
References
[1] Basar, T., & Olsder, G. J. (1999). Dynamic Noncooperative Game Theory, 2nd ed. SIAM, Philadelphia.
[2] Bauso, D., Zhang, X., & Papachristodoulou, A. (2017). Density flow in dynamical networks via mean-field games. IEEE Transactions on Automatic Control, 62(3), 1342-1355.
[3] Bensoussan, A., Chau, M., Lai, Y., & Yam, P. (2017). Linear-quadratic mean field Stackelberg games with state and control delays. SIAM Journal on Control and Optimization, 55(4), 2748-2781.
[4] Bensoussan, A., Chau, M. H. M., & Yam, S. C. P. (2015). Mean field Stackelberg games: Aggregation of delayed instructions. SIAM Journal on Control and Optimization, 53(4), 2237-2266.
[5] Bensoussan, A., Chau, M. H. M., & Yam, S. C. P. (2016). Mean field games with a dominating player. Applied Mathematics & Optimization, 74(1), 91-128.
[6] Bensoussan, A., Frehse, J., & Yam, P. (2013). Mean Field Games and Mean Field Type Control Theory. New York: Springer.
[7] Buckdahn, R., Li, J., & Peng, S. (2014). Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents. SIAM Journal on Control and Optimization, 52(1), 451-492.
[8] Caines, P. E., Huang, M., & Malhamé, R. P. (2017). Mean field games. In Handbook of Dynamic Game Theory, T. Basar and G. Zaccour, Eds., 345-372. Berlin: Springer.
