Mean Field Linear Quadratic Control: FBSDE and Riccati Equation Approaches
Bingchang Wang and Huanshui Zhang

TL;DR
This paper develops a comprehensive framework for mean field linear quadratic control problems, deriving decentralized control laws via FBSDE and Riccati equations, and establishing their social optimality and Nash equilibrium properties.
Contribution
It introduces a novel approach combining FBSDE and Riccati equations to design decentralized controls for mean field LQ control and game problems, linking open-loop and feedback solutions.
Findings
Decentralized controls are asymptotically social optimal.
Decentralized controls form an asymptotic Nash equilibrium.
Proposed controls are equivalent to previous feedback strategies.
Bingchang Wang, Member, IEEE, and Huanshui Zhang, Senior Member, IEEE
This work was supported by the National Natural Science Foundation of China under Grants 61773241, 61573221 and 61633014. Bingchang Wang is with the School of Control Science and Engineering, Shandong University, Jinan 250061, P. R. China. Huanshui Zhang is with the School of Control Science and Engineering, Shandong University, Jinan 250061, P. R. China.
Abstract
This paper studies social optima and Nash games for mean field linear quadratic control systems, where subsystems are coupled via dynamics and individual costs. For the social control problem, we first obtain a set of forward-backward stochastic differential equations (FBSDE) from variational analysis, and construct a feedback-type control by decoupling the FBSDE. By using solutions of two Riccati equations, we design a set of decentralized control laws, which is further proved to be asymptotically socially optimal. Two equivalent conditions are given for uniform stabilization of the systems in different cases. For the game problem, we first design a set of decentralized control laws from variational analysis, and then show that this set of control laws constitutes an asymptotic Nash equilibrium by exploiting the stabilizing solution of a nonsymmetric Riccati equation.
It is verified that the proposed decentralized control laws are equivalent to the feedback strategies of mean field control in previous works. This may illustrate the relationship between open-loop and feedback solutions of mean field control (games).
Index Terms:
Mean field game, variational analysis, social optimality, forward-backward stochastic differential equation, Riccati equation
I Introduction
Mean field games have drawn increasing attention in many fields including system control, applied mathematics and economics [7, 8, 12]. A mean field game involves a very large population of small interacting players, with the feature that while the influence of each one is negligible, the impact of the overall population is significant. By combining mean field approximations with each individual’s best response, the dimensionality difficulty is overcome. Mean field games and control have found wide applications, including smart grids [27, 10], finance, economics [13, 9, 32], and social sciences [5].
By now, mean field games have been intensively studied in the LQ (linear-quadratic) framework [18, 19, 25, 33, 6, 29]. Huang et al. developed the Nash certainty equivalence (NCE) based on the fixed-point method and designed an $\varepsilon$-Nash equilibrium for mean field LQ games with discounted costs by the NCE approach [18, 19]. The NCE approach was then applied to the cases with long-run average costs [25] and with Markov jump parameters [33], respectively. Bensoussan et al. employed the adjoint equation approach and the fixed-point theorem to obtain a sufficient condition for the existence and uniqueness of the equilibrium strategy over a finite horizon [6]. For other aspects of mean field games, readers are referred to [21, 23, 39, 11] for nonlinear mean field games, [37] for oblivious equilibrium in dynamic games, [17, 34, 35] for mean field games with major players, and [16, 29] for robust mean field games.
Besides noncooperative games, social optima in mean field models have also attracted much interest. Social optimal control refers to the setting in which all the players cooperate to optimize the common social cost, that is, the sum of individual costs; it is usually regarded as a type of team decision problem [30, 14]. Huang et al. considered social optima in mean field LQ control, and provided an asymptotic team-optimal solution [20]. Wang and Zhang [36] investigated a mean field social optimal problem where the Markov jump parameter appears as a common source of randomness. For further literature, see [22] for social optima in mixed games, and [3] for team-optimal control with finite population and partial information.
Most previous results on mean field games and control were given by virtue of the fixed-point analysis. However, the fixed-point method is sometimes conservative, particularly for general systems. In this paper, we break away from the fixed-point method and solve the problem by tackling forward-backward stochastic differential equations (FBSDE). In recent years, substantial progress on optimal LQ control has been made by solving FBSDE; see [40, 42, 43, 31] for details.
This paper investigates social optima and Nash games for linear quadratic mean field systems, where subsystems (agents) are coupled via dynamics and individual costs. For the finite-horizon social control problem, we first obtain a set of forward-backward stochastic differential equations (FBSDE) by examining the variation of the social cost, and give centralized feedback-type control laws by decoupling the FBSDE. With mean field approximations, we design a set of decentralized control laws, which is further shown to have asymptotic social optimality. For the infinite-horizon case, we design a set of decentralized control laws by using solutions of two Riccati equations, which is shown to be asymptotically socially optimal. Some equivalent conditions are further given for uniform stabilization of the multiagent systems when the state weight is semi-positive definite or only symmetric. For the problem of mean field games, we first design a set of decentralized control laws by variational analysis, whose control gain satisfies a nonsymmetric Riccati equation. With the help of the stabilizing solution of the nonsymmetric Riccati equation, we show that the set of decentralized control laws is an asymptotic Nash equilibrium. It is verified that the proposed decentralized control laws are an equivalent representation of the feedback strategies in previous works on mean field control and games. Finally, some numerical examples are given to illustrate the effectiveness of the proposed control laws.
The main contributions of the paper are summarized as follows.
(i) For the social control problem, we first obtain necessary and sufficient existence conditions of finite-horizon centralized optimal control by variational analysis, and then design a feedback-type decentralized control by tackling FBSDE with mean field approximations.
(ii) In the case that the state weight is semi-positive definite, necessary and sufficient conditions are given for uniform stabilization of the systems with the help of the system’s observability and detectability.
(iii) In the case that the state weight is only symmetric, necessary and sufficient conditions are given for uniform stabilization of the systems using the Hamiltonian matrices.
(iv) For the game problem, we show that the decentralized control laws constitute an $\varepsilon$-Nash equilibrium by exploiting the stabilizing solution of a nonsymmetric Riccati equation.
(v) The asymptotically optimal decentralized control is obtained under nonconservative assumptions, and such control laws are shown to be equivalent to the feedback strategies given by the fixed-point method in previous works [19, 20].
The organization of the paper is as follows. In Section II, the socially optimal control problem is investigated. We first construct asymptotically optimal decentralized control laws by tackling FBSDE for the finite-horizon case, then design asymptotically optimal control for the infinite-horizon case and further give two equivalent conditions of uniform stabilization for different cases. In Section III, we design a decentralized $\varepsilon$-Nash equilibrium for the finite-horizon and infinite-horizon cases, respectively. The proposed decentralized control laws are compared with the feedback strategies of previous works in Section IV. In Section V, some numerical examples are given to show the effectiveness of the proposed control laws. Section VI concludes the paper.
The following notation will be used throughout this paper. denotes the Euclidean vector norm or matrix spectral norm. For a vector and a matrix , , and () means that is positive definite (semi-positive definite). For two vectors , . is the space of all -valued continuous functions defined on , and is a subspace of which is given by is the space of all -adapted -valued processes such that . For two sequences and , denotes that , and denotes . For convenience of presentation, we use to denote generic positive constants, which may vary from place to place.
II Mean Field LQ Social Control
Consider a large-population system with $N$ agents. Agent $i$ evolves by the following stochastic differential equation:
[TABLE]
where and are the state and input of the th agent. , . are a sequence of independent -dimensional Brownian motions on a complete filtered probability space . The cost function of agent is given by
[TABLE]
where , are symmetric matrices with appropriate dimensions, and . Denote . The decentralized control set is given by
[TABLE]
For comparison, define the centralized control sets as
[TABLE]
and $\mathcal{U}_{c}=\{(u_{1},\cdots,u_{N})\mid u_{i}\ \text{is adapted to}\ \mathcal{U}_{c,i}\}$, where and .
In this section, we mainly study the following problem.
(PS). Seek a set of decentralized control laws to optimize social cost for the system (1)-(2), i.e., where
[TABLE]
Assume
A1) are mutually independent and have the same mathematical expectation. , , . There exists a constant (independent of ) such that . Furthermore, and are independent of each other.
II-A The finite-horizon problem
For the convenience of design, we first consider the following finite-horizon problem.
[TABLE]
where and
[TABLE]
We first give an equivalent condition for the convexity of Problem (P1).
Proposition II.1
Problem (P1) is convex in if and only if for any , ,
[TABLE]
where and satisfies
[TABLE]
Proof. Let and be the state processes of agent with the control and , respectively. Take any and let . Then
[TABLE]
Denote , and . Thus, satisfies (4). By the definition of convexity, the proposition follows.
By examining the variation of , we obtain the necessary and sufficient conditions for the existence of centralized optimal control of (P1).
Theorem II.1
Suppose . Then (P1) has a set of optimal control laws if and only if Problem (P1) is convex in and the following equation system admits a set of solutions :
[TABLE]
where , , , and furthermore the optimal control is given by .
Proof. Suppose that where are a set of solutions to the equation system
[TABLE]
where , are to be determined. Denote by the state of agent under the control . For any and , let . Denote by the solution of the following perturbed state equation:
[TABLE]
Let . It can be verified that satisfies (4). Then by Itô’s formula, for any ,
[TABLE]
which implies
[TABLE]
From (3), we have
[TABLE]
where , and
[TABLE]
Note that
[TABLE]
From (7), one can obtain that
[TABLE]
From (11), is a minimizer to Problem (P1) if and only if and . By Proposition II.1, if and only if (P1) is convex. is equivalent to
[TABLE]
Thus, we have the following optimality system:
[TABLE]
such that . This implies that the equation system (5) admits a solution.
On the other hand, suppose that the equation system (5) admits a solution . Let . If (P1) is convex, then is a minimizer to Problem (P1).
It follows from (5) that
[TABLE]
Let . Then by (5), (18) and Itô’s formula,
[TABLE]
This implies that , ,
[TABLE]
Then
Theorem II.2
Assume that A1) holds and . Then Problem (P1) has an optimal control
[TABLE]
where and are determined by (19)-(22).
Proof. Denote . Then from (20) and (22), satisfies
[TABLE]
where . Note that and . By [2, 41], (19) and (23) admit unique solutions and , respectively, which implies that (20) and (22) have unique solutions and , respectively. Then by [26, 42], the FBSDE (5) admits a unique solution. By Theorem II.1, Problem (P1) has an optimal control given by where and are determined by (19)-(22).
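The role of the Riccati differential equations in Theorem II.2 can be illustrated numerically in the scalar case. The sketch below is only a hedged illustration and not the construction used in the proof. It assumes a discounted scalar Riccati equation of the standard form rho*p = dp/dt + 2*a*p - (b^2/r)*p^2 + q with terminal value p(T) = 0; this form is an assumption, since (19) is not reproduced in this text. The function names and all parameter values are hypothetical.

```python
def riccati_backward(a, b, q, r, rho, horizon, steps):
    """Explicit Euler, backward in time, for the ASSUMED scalar Riccati ODE
        rho*p = dp/dt + 2*a*p - (b**2/r)*p**2 + q,   p(horizon) = 0.
    Returns the approximation of p(0)."""
    h = horizon / steps
    c = b * b / r
    p = 0.0  # terminal condition p(T) = 0
    for _ in range(steps):
        # One step from t to t - h: p(t - h) = p(t) - h * dp/dt(t).
        dp_dt = rho * p - 2.0 * a * p + c * p * p - q
        p -= h * dp_dt
    return p


def algebraic_residual(p, a, b, q, r, rho):
    """Residual of the stationary equation rho*p = 2*a*p - (b**2/r)*p**2 + q."""
    return 2.0 * a * p - (b * b / r) * p * p + q - rho * p


# Hypothetical parameters, chosen only for illustration.
params = dict(a=0.2, b=1.0, q=1.0, r=1.0, rho=0.1)
for horizon in (1.0, 3.0, 10.0):
    p0 = riccati_backward(horizon=horizon, steps=int(1000 * horizon), **params)
    print(horizon, p0, algebraic_residual(p0, **params))
```

Under these assumptions, the printed residual should shrink as the horizon grows, which is the kind of link between finite-horizon and stationary Riccati solutions that the infinite-horizon design relies on.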
As an approximation to in (18), we obtain
[TABLE]
Then, by Theorem II.2, the decentralized control law for agent may be taken as
[TABLE]
where , and are determined by (19)-(22), and and satisfy (24) and
[TABLE]
Remark II.1
In previous works [20, 36], the mean field term in cost functions (dynamics) is first substituted by a deterministic function . By solving an optimal tracking problem subject to consistency requirements, a fixed-point equation is obtained. The decentralized control is constructed by handling the fixed-point equation. Here, we first obtain the centralized open-loop solution by variational analysis. The decentralized control laws are then designed by tackling the coupled FBSDEs combined with mean field approximations. Note that in this case and are fully decoupled and no fixed-point equation is needed.
Theorem II.3
Let A1) hold and . The set of decentralized control laws given by (25) has asymptotic social optimality, i.e.,
[TABLE]
Proof. See Appendix A.
II-B The infinite-horizon problem
Based on the analysis in Section II-A, we may design the following decentralized control laws for Problem (PS):
[TABLE]
where and are determined by
[TABLE]
and are determined by
[TABLE]
Here the existence conditions of and need to be investigated further.
We introduce some assumptions:
A2) The system is stabilizable, and is stabilizable.
A3) , ) is observable, and is observable.
Assumptions A2) and A3) are basic in the study of the LQ optimal control problem. We will show that under some conditions, A2) is also necessary for uniform stabilization of multiagent systems. In many cases, A3) may be weakened to the following assumption.
A3′) , ) is detectable, and is detectable.
Lemma II.1
Under A2)-A3), (28) and (29) admit unique solutions , respectively, and (30)-(31) admits a set of unique solutions .
Proof. From A2)-A3) and [2], (28) and (29) admit unique solutions such that and are Hurwitz, respectively. From an argument in [34, Appendix A], we obtain if and only if
[TABLE]
Under this initial condition, we have
[TABLE]
It is straightforward that .
We further introduce the following assumption.
A4) is Hurwitz, where .
Lemma II.2
Let A1)-A4) hold. Then for (PS),
[TABLE]
Proof. See Appendix B.
We now show that the decentralized control laws (27) uniformly stabilize the systems (1).
Theorem II.4
Let A1)-A4) hold. Then for any ,
[TABLE]
Proof. See Appendix B.
We now give two equivalent conditions for uniform stabilization of multiagent systems.
Theorem II.5
Let A3) hold. Then for (PS) the following statements are equivalent:
(i) For any initial condition satisfying A1),
[TABLE]
(ii) (28) and (29) admit unique solutions , respectively, and is Hurwitz.
(iii) A2) and A4) hold.
Proof. See Appendix C.
For the case , we have a simplified version of Theorem II.5.
Corollary II.1
Assume that A3) holds and . Then for (PS) the following statements are equivalent:
(i) For any satisfying A1),
[TABLE]
(ii) (28) and (29) admit unique solutions , respectively.
(iii) A2) holds.
When A3) is weakened to A3′), we have the following equivalent conditions of uniform stabilization of the systems.
Theorem II.6
Let A3′) hold. Then for (PS) the following statements are equivalent:
(i) For any initial condition satisfying A1),
[TABLE]
(ii) (28) and (29) admit unique solutions , respectively, and is Hurwitz.
(iii) A2) and A4) hold.
Proof. See Appendix C.
Remark II.2
In [43], some similar results were given for the stabilization of mean field systems. However, only the limiting problem was considered in their work and the mean field term in dynamics and costs is instead of . Here we study large-population multiagent systems and the number of agents is large but not infinite. The errors of mean field approximations are further analyzed. To obtain asymptotic optimality, an additional assumption A4) is needed later.
For the more general case that are only symmetric, we have the following equivalent conditions for uniform stabilization of multiagent systems.
Denote
[TABLE]
Theorem II.7
Assume that both and have no eigenvalues on the imaginary axis. Then for (PS) the following statements are equivalent:
(i) For any satisfying A1),
[TABLE]
(ii) (28) and (29) admit -stabilizing solutions, respectively, and is Hurwitz. (Footnote 1: For a Riccati equation (28), is called a -stabilizing solution if satisfies (28) and all the eigenvalues of are in the left half-plane.)
(iii) A2) and A4) hold.
Remark II.3
and are Hamiltonian matrices. The Hamiltonian matrix plays a significant role in studying general algebraic Riccati equations. See [1, 28] for more details on the properties of Hamiltonian matrices.
To show Theorem II.7, we need two lemmas. The first lemma is a result from [28, Theorem 6].
Lemma II.3
Equations (28) and (29) admit -stabilizing solutions if and only if A2) holds and both and have no eigenvalues on the imaginary axis.
Lemma II.4
Let A1) hold. Assume that (28) and (29) admit -stabilizing solutions, respectively, and is Hurwitz. Then
[TABLE]
Proof. From the definition of -stabilizing solutions, and are Hurwitz. By the argument in the proof of Theorem II.4, the lemma follows.
Proof of Theorem II.7. By using Lemmas II.3 and II.4 together with an argument similar to that in the proof of Theorem II.4, the theorem follows.
Example II.1
Consider a scalar system with , , , , , . Then
[TABLE]
By direct computations, neither nor has eigenvalues on the imaginary axis if and only if
[TABLE]
Note that if (or , ), i.e., is observable (detectable), then (35) holds, and if ( ), i.e., is observable (detectable), then (36) holds.
For this model, the Riccati equation (28) is written as
[TABLE]
Let . If (35) holds then , which implies (37) admits two solutions. If then (37) has a unique positive solution such that . If and then (37) has a unique non-negative solution such that .
Assume that (35) and (36) hold. By Theorem II.7, the system is uniformly stable if and only if is stabilizable (i.e., or ), and . Note that . When , we have .
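The root structure discussed in Example II.1 can be checked numerically. The following is a minimal sketch under stated assumptions: it takes the scalar Riccati equation to have the standard discounted form rho*p = 2*a*p - (b^2/r)*p^2 + q, which is an assumption because (37) is not reproduced in this text, and it uses the discriminant of that quadratic as a stand-in for the quantity denoted by Delta in the example. The function names and the numerical values are hypothetical.

```python
import math


def scalar_riccati_roots(a, b, q, r, rho):
    """Roots of the ASSUMED scalar equation rho*p = 2*a*p - (b**2/r)*p**2 + q.

    Returns (delta, roots); roots is empty when delta < 0."""
    c = b * b / r
    alpha = 2.0 * a - rho
    delta = alpha * alpha + 4.0 * c * q  # discriminant of c*p**2 - alpha*p - q = 0
    if delta < 0.0:
        return delta, ()
    s = math.sqrt(delta)
    return delta, ((alpha - s) / (2.0 * c), (alpha + s) / (2.0 * c))


def closed_loop_rate(p, a, b, r, rho):
    """Discounted closed-loop coefficient a - rho/2 - (b**2/r)*p."""
    return a - rho / 2.0 - (b * b / r) * p


def stabilizing_root(a, b, q, r, rho):
    """Return the root with a negative discounted closed-loop coefficient, if any."""
    _, roots = scalar_riccati_roots(a, b, q, r, rho)
    for p in roots:
        if closed_loop_rate(p, a, b, r, rho) < 0.0:
            return p
    return None


# Hypothetical data with an indefinite (negative) state weight q.
delta, roots = scalar_riccati_roots(a=0.5, b=1.0, q=-0.1, r=1.0, rho=0.2)
p_stab = stabilizing_root(a=0.5, b=1.0, q=-0.1, r=1.0, rho=0.2)
print(delta, roots, p_stab)
```

In this sketch a positive discriminant yields two real roots, and only the larger one gives a negative closed-loop coefficient, which matches the qualitative statement of the example even when q is not positive.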
Example II.2
We further consider the model in Example II.1 for the case that and (i.e., (36) does not hold). In this case, the Riccati equation (29) admits a unique solution . (30) becomes and has a unique solution in . Thus, satisfies
[TABLE]
Assume that is a constant. Then (38) does not admit a solution in unless .
We are in a position to state the asymptotic optimality of the decentralized control.
Theorem II.8
Let A1)-A4) hold. For Problem (PS), the set of decentralized control laws given by (27) has asymptotic social optimality, i.e.,
[TABLE]
Proof. We first prove that for , implies that
[TABLE]
for all . From , we have and
[TABLE]
which further implies that
[TABLE]
By (1) we have
[TABLE]
which leads to for any ,
[TABLE]
By and basic SDE estimates, we can find a constant such that
[TABLE]
[TABLE]
which together with A3) implies that
[TABLE]
This and (40) lead to
[TABLE]
By (1), we have
[TABLE]
It follows from (43) that
[TABLE]
From (44) and (45), we obtain that
[TABLE]
This together with A3) implies that
[TABLE]
which gives (39). By Theorem II.4,
[TABLE]
By a similar argument to the proof of Theorem II.3 combined with Lemma II.2, the conclusion follows.
If A3) is replaced by A3′), the decentralized control (27) still has asymptotic social optimality.
Corollary II.2
Assume that A1)-A2), A3′), A4) hold. The set of decentralized control laws given by (27) is asymptotically socially optimal.
Proof. Without loss of generality, we simply assume , where is Hurwitz, and is Hurwitz (If necessary, we may apply a nonsingular linear transformation as in the proof of Theorem II.6). Write and such that
[TABLE]
and is observable which is due to the detectability of . By the proof of Theorem II.4 or [17], implies , which together with (41) gives . This and the observability of leads to . Thus, . The other parts of the proof are similar to that of Theorem II.8.
III Mean Field LQ Games
In this section, we investigate the game problem for LQ mean field systems.
(PG). Seek a set of decentralized control laws to minimize individual cost for each agent in the system (1)-(2).
III-A The finite-horizon problem
We first consider the finite-horizon problem. Suppose that is given for approximation of . Replacing in (1) and (3) by , we have the following auxiliary optimal control problem.
[TABLE]
where
[TABLE]
By examining the variation of , we obtain the unique optimal control of (P2).
Theorem III.1
Assume . Then the FBSDE
[TABLE]
admits a unique solution , and the optimal control .
Proof. Since and , by [41], (P2) is uniformly convex, and hence admits a unique optimal control. By an argument similar to that in the proof of Theorem II.1, the conclusion follows.
It follows from (46) that
[TABLE]
Replacing by , we have
[TABLE]
Let . By Itô’s formula, we obtain
[TABLE]
This implies
[TABLE]
Denote , and . Then by (46) and (47) we have
[TABLE]
Let . By Itô’s formula,
[TABLE]
which implies that , and
[TABLE]
Assume
A5) Equation (48) admits a solution in .
By the local Lipschitz-continuous property of the quadratic function, (48) admits a unique local solution on a small time interval . See [1] for some sufficient conditions for the existence of the solution in . We now provide a necessary and sufficient condition to guarantee the global solvability of (48).
Proposition III.1
(48) admits a solution in if and only if for any
[TABLE]
where
[TABLE]
Proof. Sufficiency is given by [26, Theorem 4.3, p.48]. Necessity is implied from Proposition 4.2 and Theorem 3.2 of [26, Chapter 2].
Let
[TABLE]
where and are determined by (50), (48) and (49), respectively, and and satisfy
[TABLE]
Denote .
Theorem III.2
Let A1), A5) hold and . The set of decentralized strategies given by (51) is an -Nash equilibrium, i.e.,
[TABLE]
where .
Proof. See Appendix D.
III-B The infinite-horizon problem
For simplicity, we consider the case .
Based on the analysis in Section III-A, we may design the following decentralized control for (PG):
[TABLE]
where and are determined by
[TABLE]
respectively, and are determined by
[TABLE]
and satisfies
[TABLE]
Here the existence conditions of and need to be investigated further.
We introduce the following assumptions.
A6) is stabilizable, and is detectable.
A7) (58) admits a stabilizing solution.
Lemma III.1
Assume that has stable eigenvalues (with negative real parts) and unstable eigenvalues, where
[TABLE]
Suppose that
[TABLE]
where is Hurwitz and is invertible. Then A7) holds.
Proof. Let . It follows from (63) that
[TABLE]
By pre-multiplying by on both sides, we obtain
[TABLE]
which leads to (58). By (64), we have is Hurwitz. It is straightforward that .
Remark III.1
The above lemma provides a convenient method to compute the stabilizing solutions of algebraic Riccati equations. Assume there exists an invertible matrix $V=\begin{bmatrix}V_{11}&V_{12}\\ V_{21}&V_{22}\end{bmatrix}$ such that $V^{-1}M_{3}V=\begin{bmatrix}H_{11}&H_{12}\\ 0&H_{22}\end{bmatrix}$, where is invertible, and are Hurwitz. Then is the stabilizing solution of (58). comprises independent vectors, which are called Schur vectors [24].
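The invariant-subspace idea behind Lemma III.1 and Remark III.1 can be sketched in the smallest possible setting. The code below is a hedged illustration on a generic real 2x2 matrix and is not the matrix M_3 of the paper, whose entries are not reproduced here. It assumes exactly one stable and one unstable eigenvalue and a nonzero upper-right entry; the function name and the test matrix are hypothetical.

```python
import math


def stable_subspace_solution(m11, m12, m21, m22):
    """For a real 2x2 matrix M with one stable and one unstable eigenvalue,
    return (p, lam): lam is the stable eigenvalue and p = v2 / v1 for the
    corresponding eigenvector (v1, v2). Then p solves the quadratic equation
        m21 + m22*p - p*m11 - p*m12*p = 0   with   m11 + m12*p = lam < 0."""
    trace = m11 + m22
    det = m11 * m22 - m12 * m21
    if det >= 0.0:
        raise ValueError("need exactly one eigenvalue in each open half-plane")
    if m12 == 0.0:
        raise ValueError("this sketch assumes m12 != 0")
    # det < 0 guarantees two real eigenvalues of opposite sign.
    lam = 0.5 * (trace - math.sqrt(trace * trace - 4.0 * det))
    # (M - lam*I) v = 0 has the solution v = (m12, lam - m11).
    v1, v2 = m12, lam - m11
    return v2 / v1, lam


# Hypothetical example: Hamiltonian matrix of a scalar LQ problem
# with a = 0.5, b = 1, q = 1, r = 1 and no discount.
p, lam = stable_subspace_solution(0.5, -1.0, -1.0, -0.5)
print(p, lam)
```

Here the scalar ratio v2 / v1 plays the role of the product of the lower-left block and the inverse of the upper-left block of the Schur vectors in the matrix case, and the relation m11 + m12*p = lam is the scalar counterpart of the closed-loop matrix being Hurwitz.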
Lemma III.2
Assume that A1), A6), A7) hold. Then (59)-(60) admit a set of unique solutions , and
[TABLE]
Proof. By a similar argument in the proof of Theorem II.6, the lemma follows.
Theorem III.3
Let A1), A6), A7) hold. For Problem (PG), the set of decentralized strategies given by (56) is an -Nash equilibrium, i.e.,
[TABLE]
where
Proof. See Appendix D.
IV Comparison of Different Solutions
In this section, we compare the proposed decentralized control laws with the feedback decentralized strategies in previous works [19, 20].
We first introduce a definition from [4].
Definition IV.1
For a control problem with an admissible control set , a control law is said to be a representation of another control if
(i) they both generate the same unique state trajectory, and
(ii) they both have the same open-loop value on this trajectory.
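Definition IV.1 can be made concrete with a deterministic scalar toy problem. The sketch below is an illustration under stated assumptions rather than the setting of Problems (PS) or (PG): it takes dx/dt = a*x + b*u with a hypothetical feedback gain k, and compares the feedback law u = -k*x with the open-loop signal obtained by substituting the closed-loop solution into that law. All names and values are hypothetical.

```python
import math


def simulate(control, a, b, x0, horizon, steps):
    """Explicit Euler for dx/dt = a*x + b*u with u = control(t, x)."""
    h = horizon / steps
    x, path = x0, [x0]
    for n in range(steps):
        x += h * (a * x + b * control(n * h, x))
        path.append(x)
    return path


a, b, k, x0 = 0.5, 1.0, 2.0, 1.0  # hypothetical scalar data


def feedback(t, x):
    # Feedback form: depends on the current state only.
    return -k * x


def open_loop(t, x):
    # Open-loop form: a function of time only, obtained by substituting the
    # closed-loop solution x(t) = x0 * exp((a - b*k) * t) into the feedback law.
    return -k * x0 * math.exp((a - b * k) * t)


path_fb = simulate(feedback, a, b, x0, horizon=2.0, steps=2000)
path_ol = simulate(open_loop, a, b, x0, horizon=2.0, steps=2000)
gap = max(abs(u - v) for u, v in zip(path_fb, path_ol))
print(gap)
```

Up to discretization error, the two controls generate the same state trajectory and take the same values along it, which is what conditions (i) and (ii) of the definition require.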
For Problem (PS), let , and . In [20, Theorem 4.3], the decentralized control laws are given by
[TABLE]
where is the semi-positive definite solution of (57), and Here satisfies
[TABLE]
and are determined by
[TABLE]
in which . By comparing this with (29)-(31), one can obtain that , and . From the above discussion, we have the equivalence of the two sets of decentralized control laws.
Proposition IV.1
The set of decentralized control laws in (27) is a representation of given by (65).
For Problem (PG), let , and . In [19], the decentralized strategies are given by
[TABLE]
where is the positive definite solution of (28), is determined by the fixed-point equation
[TABLE]
We now show the equivalence of the decentralized open-loop and feedback solutions to mean field games.
Proposition IV.2
The set of decentralized control laws in (56) is a representation of given by (66).
Proof. Let . From (67), we have
[TABLE]
which gives
[TABLE]
By comparing this with (57)-(59), one can obtain , and . Thus, we have , which implies that is a representation of in (56).
V Numerical Examples
In this section, some numerical examples are given to illustrate the effectiveness of the proposed decentralized control laws.
We first consider a scalar system with agents in Problem (PS). Take in (1)-(2). The initial states of agents are taken independently from a normal distribution . Then, under the control law (27), the state trajectories of agents for the cases with and are shown in Figs. 1 and 2, respectively. After the transient phase, the states of agents behave similarly and achieve agreement roughly.
Next, we simulate the scalar case of Problem (PG), where the parameters are the same as above, except . After the control laws (56) are applied, the state trajectories of 50 agents with and are shown in Figs. 3 and 4, respectively.
For the case and , the trajectories of and in Problems (PS) and (PG) are shown in Fig. 5. It can be seen that and coincide well, which illustrates the consistency of mean field approximations. Clearly, the state average of agents has a significantly lower value in Problem (PS) than in (PG).
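The consistency of the mean field approximation can be explored with a small simulation. The sketch below is a hedged illustration and does not reproduce the experiments of this section: the scalar dynamics dx_i = (a*x_i + b*u_i + g*x_avg) dt + sigma dW_i are an assumed form of (1), the linear gain k is a hypothetical stand-in for the Riccati-based law (27), and every numerical value except the population size of 50 is made up for the example.

```python
import random


def simulate_population(n_agents, a, b, g, k, sigma, mean0, std0,
                        horizon, steps, seed=0):
    """Euler-Maruyama for the ASSUMED scalar dynamics
        dx_i = (a*x_i + b*u_i + g*x_avg) dt + sigma dW_i,   u_i = -k*x_i,
    together with the deterministic mean field limit
        d(x_bar)/dt = (a - b*k + g) * x_bar,   x_bar(0) = mean0.
    Returns (final_states, max_gap), where max_gap is the largest value of
    |x_avg(t) - x_bar(t)| over the time grid."""
    rng = random.Random(seed)
    h = horizon / steps
    x = [rng.gauss(mean0, std0) for _ in range(n_agents)]
    x_bar = mean0
    max_gap = abs(sum(x) / n_agents - x_bar)
    for _ in range(steps):
        x_avg = sum(x) / n_agents
        # Each agent uses only its own state in the hypothetical control law.
        x = [xi + h * (a * xi - b * k * xi + g * x_avg)
             + sigma * rng.gauss(0.0, h ** 0.5) for xi in x]
        x_bar += h * (a - b * k + g) * x_bar
        max_gap = max(max_gap, abs(sum(x) / n_agents - x_bar))
    return x, max_gap


final_states, max_gap = simulate_population(
    n_agents=50, a=0.2, b=1.0, g=0.3, k=1.5, sigma=0.2,
    mean0=1.0, std0=0.5, horizon=5.0, steps=500)
print(max_gap, min(final_states), max(final_states))
```

With independent noises, the gap between the empirical average and the deterministic limit is expected to shrink roughly like the inverse square root of the number of agents, so rerunning the sketch with a larger `n_agents` should give a smaller `max_gap`.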
Finally, we consider the 2-dimensional case of Problem (PS). Take parameters as follows: $A=\begin{bmatrix}0.1&0\\ -1&0.2\end{bmatrix}$, $B=\begin{bmatrix}1&0\\ 0&1\end{bmatrix}$, $G=\begin{bmatrix}-0.5&0\\ 0&-0.3\end{bmatrix}$, $B=\begin{bmatrix}1\\ 1\end{bmatrix}$, $Q=\begin{bmatrix}1&0\\ 0&1\end{bmatrix}$, $\Gamma=\begin{bmatrix}1&0\\ 1&1\end{bmatrix}$, $R=\begin{bmatrix}1&0\\ 0&1\end{bmatrix}$, $\eta=\begin{bmatrix}0\\ 0.5\end{bmatrix}$, and . Denote . Both of and are taken independently from a normal distribution . Under the control laws (27), the trajectories of and , are shown in Figs. 6 and 7, respectively.
VI Concluding Remarks
In this paper, we have considered uniform stabilization and asymptotic optimality for mean field LQ multiagent systems. For both the social control and Nash game problems, we design decentralized open-loop control laws by variational analysis, which are further shown to be asymptotically optimal. Two equivalent conditions are further given for uniform stabilization of the systems in different cases. Finally, we show that such decentralized control laws are equivalent to the feedback strategies in previous works.
An interesting generalization is to consider mean field LQ control systems with partial measurements by using variational analysis. Also, the variational analysis may be applied to general nonlinear models to construct decentralized control laws for social control and Nash games.
Appendix A Proof of Theorem II.3
To prove Theorem II.3, we need a lemma.
Lemma A.1
Let A1) hold and . Under the control (25), we have
[TABLE]
Proof. It follows by (26) that
[TABLE]
From this and (24), we have
[TABLE]
which leads to
[TABLE]
By A1), one can obtain
[TABLE]
which completes the proof.
Proof of Theorem II.3. We first prove that for , implies that for all . By , we have This leads to
[TABLE]
where By (1),
[TABLE]
which with A1) implies that
[TABLE]
Note that
[TABLE]
We have
[TABLE]
By (24) and (26), we obtain that
[TABLE]
Let , and . Then by (1) and (26),
[TABLE]
From (3), we have
[TABLE]
where
[TABLE]
By (A.4), . We now prove .
[TABLE]
By (19)-(22), (A.6) and Itô’s formula,
[TABLE]
From this and (A.8), we obtain
[TABLE]
By Lemma A.1, (A.4) and (A.5), we obtain
[TABLE]
which implies .
Appendix B Proofs of Lemma II.2 and Theorem II.4
Proof of Lemma II.2. From (A.2), we have
[TABLE]
Thus,
[TABLE]
Proof of Theorem II.4. By A1)-A4), Lemmas II.1 and II.2, we obtain that and
[TABLE]
which further gives that
[TABLE]
Denote . Then and
[TABLE]
Note that is Hurwitz. By Schwarz’s inequality,
[TABLE]
This with (27) completes the proof.
Appendix C Proofs of Theorems II.5 and II.6
Proof of Theorem II.5. (i)$\Rightarrow$(ii). By (26),
[TABLE]
It follows from A1) that
[TABLE]
By comparing (31) and (C.1), we obtain . Note that $\|\bar{x}\|^{2}=\|\mathbb{E}\hat{x}_{i}\|^{2}\leq\mathbb{E}\|\hat{x}_{i}\|^{2}$. It follows from (34) that
[TABLE]
By (31), we have
[TABLE]
where . By the arbitrariness of with (C.2) we obtain that is Hurwitz. That is, is stabilizable. By [2], (29) admits a unique solution such that . Note that . Then from (34) we have
[TABLE]
This leads to , where . By (B.1), we obtain
[TABLE]
By (34) and the arbitrariness of we obtain that is Hurwitz, i.e., is stabilizable. By [2], (28) admits a unique solution such that .
[TABLE]
On the other hand, (A.2) gives
[TABLE]
By (C.4) and the arbitrariness of , we obtain that is Hurwitz.
(ii)$\Rightarrow$(iii). Define , where satisfies
[TABLE]
Denote by when . By (29) we have
[TABLE]
Note that . Then exists, which implies
[TABLE]
Rewrite in (23) by . Then we have . By (23),
[TABLE]
This with (C.5) implies
[TABLE]
By A3), one can obtain that there exists such that (see, e.g., [43, 44]). Thus, we have $\lim_{t\to\infty}e^{-\rho t}\|\bar{y}(t)\|^{2}=0$, which is stabilizable. Similarly, we can show is stabilizable.
(iii)$\Rightarrow$(i). This part has been proved in Theorem II.4.
Proof of Theorem II.6. (iii)$\Rightarrow$(i). From [2], (28) and (29) admit unique solutions such that and are Hurwitz, respectively. Thus, there exists a unique such that . It is straightforward that . By the argument in the proof of Theorem II.4, (i) follows.
(i)$\Rightarrow$(ii). The proof of this part is similar to that of (i)$\Rightarrow$(ii) in Theorem II.5.
(ii)$\Rightarrow$(iii). Since , there exists an orthogonal such that
[TABLE]
where . From (28),
[TABLE]
where . Denote
[TABLE]
By pre- and post-multiplying by and where , it follows that
[TABLE]
From the arbitrariness of , we obtain . Since is semi-positive definite, then , and . By comparing each block matrix of both sides of (C.6), we obtain . It follows from (C.6) that
[TABLE]
Let , where satisfies . Then we have
[TABLE]
By Lemma 4.1 of [38], the detectability of implies the detectability of . Take . Then , which together with the detectability of implies and is Hurwitz. Denote . By (C.7),
[TABLE]
which implies exists. By an argument similar to that in the proof of Theorem II.5, we obtain $\lim_{t_{0}\to\infty}e^{-\rho t_{0}}\|\zeta_{2}(t_{0})\|^{2}_{\Pi_{2,T}(0)}=0$ and , which gives and is Hurwitz. This with the fact that is Hurwitz gives that is stable, which leads to (iii).
Appendix D Proofs of Theorems III.2 and III.3
Proof of Theorem III.2. From (52) and (53), we have
[TABLE]
where . This implies that
[TABLE]
By Schwarz’s inequality,
[TABLE]
To prove (55), it suffices to only consider such that . By (3),
[TABLE]
After the set of strategies is applied, the corresponding dynamics of agents can be written as
[TABLE]
This with (47) implies
[TABLE]
By (D.1), (D.5) and elementary SDE estimates, one can obtain
[TABLE]
We have
[TABLE]
which together with (D.6) gives that
[TABLE]
Note that
[TABLE]
and . By Schwarz’s inequality, (D.6) and (D.7), we obtain
[TABLE]
From this and (D.2), the theorem follows.
Proof of Theorem III.3. Note that are mutually independent processes with the expectation . By Lemma III.2,
[TABLE]
We only need to show for all satisfying
[TABLE]
From (D.8), we obtain
[TABLE]
which with Lemma III.2 implies
[TABLE]
where is independent of . The rest of the proof follows by that of Theorem III.2.
References
- [1] H. Abou-Kandil, G. Freiling, V. Ionescu, and G. Jank, Matrix Riccati Equations in Control and Systems Theory. Birkhäuser Verlag, 2003.
- [2] B. D. O. Anderson and J. B. Moore, Optimal Control: Linear Quadratic Methods. Englewood Cliffs, NJ: Prentice Hall, 1990.
- [3] J. Arabneydi and A. Mahajan, “Team-optimal solution of finite number of mean-field coupled LQG subsystems,” in Proc. 54th IEEE CDC, Osaka, Japan, 2015, pp. 5308-5313.
- [4] T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory. Academic Press, London, 1982.
- [5] D. Bauso, H. Tembine, and T. Basar, “Opinion dynamics in social networks through mean-field games,” SIAM J. Control Optim., vol. 54, no. 6, pp. 3225-3257, 2016.
- [6] A. Bensoussan, K.C. Sung, S.C. Yam, and S. P. Yung, “Linear-quadratic mean field games,” J. Optimization Theory & Applications, vol. 169, no. 2, pp. 496-529, 2016.
- [7] A. Bensoussan, J. Frehse, and P. Yam, Mean Field Games and Mean Field Type Control Theory. Springer, New York, 2013.
- [8] P. E. Caines, M. Huang, and R. P. Malhame, “Mean field games,” in Handbook of Dynamic Game Theory, T. Basar and G. Zaccour, Eds., Springer, Berlin, 2017.
