Mean Field Game for Linear Quadratic Stochastic Recursive Systems

Liangquan Zhang; Xun Li

arXiv:1908.05063·math.OC·April 9, 2021

TL;DR

This paper develops a framework for linear-quadratic mean-field games involving forward-backward stochastic differential equations, establishing well-posedness and epsilon-Nash equilibrium properties for decentralized strategies.

Contribution

It introduces a coupled mean-field FBSDE approach with projection operators for LQ mean-field games and proves well-posedness using monotonicity conditions.

Findings

01. Established well-posedness of the consistency system.

02. Proved the epsilon-Nash equilibrium property.

03. Provided a decentralized strategy framework.

Abstract

This paper focuses on linear-quadratic (LQ for short) mean-field games described by forward-backward stochastic differential equations (FBSDEs for short), in which the individual control region is postulated to be convex. The decentralized strategies and the consistency condition are represented by a kind of coupled mean-field FBSDEs with projection operators. The well-posedness of the consistency condition system is obtained using the monotonicity condition method. The $\epsilon$-Nash equilibrium property is discussed as well.

Equations

\left\{\begin{array}[]{lll}\mathrm{d}x_{t}^{i}&=&\left(A_{t}x_{t}^{i}+B_{t}u_{t}^{i}+F_{t}x_{t}^{(N)}+b_{t}\right)\mathrm{d}t+\left(D_{t}u_{t}^{i}+\sigma_{t}\right)\mathrm{d}W_{t}^{i},\\ \mathrm{d}y_{t}^{i}&=&-\left(M_{t}x_{t}^{i}+U_{t}y_{t}^{i}+H_{t}x_{t}^{(N)}+V_{t}y_{t}^{(N)}+K_{t}u_{t}^{i}+f_{t}\right)\mathrm{d}t+z_{t}^{i}\mathrm{d}W_{t}^{i},\\ x_{0}^{i}&=&x\in\mathbb{R}^{n},\text{ }y_{T}^{i}=\Phi x_{T}^{i},\qquad 0\leq t\leq T,\end{array}\right.

\mathcal{U}_{ad}^{c} := \left\{u^{i}(\cdot) \mid u^{i}(\cdot) \in L_{F}^{2}(0,T;U),\ 1 \leq i \leq N\right\},

\mathcal{U}_{ad}^{d,i} := \left\{u^{i}(\cdot) \mid u^{i}(\cdot) \in L_{F^{i}}^{2}(0,T;U),\ 1 \leq i \leq N\right\}.

J_{i}(u^{i}, u_{-i})

J_{i}(\bar{u}^{i}(\cdot), \bar{u}_{-i}(\cdot)) = \inf_{u^{i}(\cdot) \in \mathcal{U}_{ad}^{c}} J_{i}(u^{i}(\cdot), \bar{u}_{-i}(\cdot)),

\left\{\begin{array}[]{lll}\mathrm{d}x_{t}^{i,\diamond}&=&\left(A_{t}x_{t}^{i,\diamond}+B_{t}u_{t}^{i}+F_{t}\phi_{t}^{1}+b_{t}\right)\mathrm{d}t+\left(D_{t}u_{t}^{i}+\sigma_{t}\right)\mathrm{d}W_{t}^{i},\\ \mathrm{d}y_{t}^{i,\diamond}&=&-\left(M_{t}x_{t}^{i,\diamond}+U_{t}y_{t}^{i,\diamond}+H_{t}\phi_{t}^{1}+V_{t}\phi_{t}^{2}+K_{t}u_{t}^{i}+f_{t}\right)\mathrm{d}t+z_{t}^{i,\diamond}\mathrm{d}W_{t}^{i},\\ x_{0}^{i,\diamond}&=&x\in\mathbb{R}^{n},\text{ }y_{T}^{i,\diamond}=\Phi x_{T}^{i,\diamond},\qquad 0\leq t\leq T,\end{array}\right.

J_{i}(u^{i})

J_{i}(u^{i,\ast}(\cdot)) = \inf_{u^{i}(\cdot) \in \mathcal{U}_{ad}^{d,i}} J_{i}(u^{i}(\cdot)).

\left\{\begin{array}[]{rcl}\mathrm{d}p_{t}^{i}&=&\left[U_{t}p_{t}^{i}-L_{t}\left(y_{t}^{i,\ast}-\phi_{t}^{2}\right)\right]\mathrm{d}t,\\ \mathrm{d}q_{t}^{i}&=&\left[-M_{t}p_{t}^{i}-A_{t}q_{t}^{i}+Q_{t}\left(x_{t}^{i,\ast}-\phi_{t}^{1}\right)\right]\mathrm{d}t+k_{t}^{i}\mathrm{d}W_{t}^{i},\\ p_{0}^{i}&=&0,\text{ }q_{T}^{i}=\Phi^{T}p_{T}^{i}-G\left(x_{T}^{i,\ast}-\phi_{T}^{1}\right).\end{array}\right.

H^{i}

\left\langle \frac{\partial H^{i}\left(t, p^{i,\ast}, q^{i,\ast}, k^{i,\ast}, x^{i,\ast}, y^{i,\ast}, z^{i,\ast}, u^{i,\ast}\right)}{\partial u^{i}}, u - u^{i,\ast} \right\rangle \leq 0, \quad \text{for } u \in U,\ t \in [0,T],\ P\text{-a.s.}

\left\langle B^{T} q^{i,\ast} + K^{T} p^{i,\ast} + D^{T} k^{i,\ast} - R u^{i,\ast}, u - u^{i,\ast} \right\rangle, \quad \text{for all } u \in U,\ \text{a.e. } t \in [0,T],\ P\text{-a.s.}

\left\langle R^{\frac{1}{2}}\left[R^{-1}\left(B^{T} q^{i,\ast} + K^{T} p^{i,\ast} + D^{T} k^{i,\ast}\right) - u^{i,\ast}\right], R^{\frac{1}{2}}\left(u - u^{i,\ast}\right) \right\rangle,

u_{t}^{i,\ast} = P_{U}\left[R_{t}^{-1}\left(B_{t}^{T} q_{t}^{i,\ast} + K_{t}^{T} p_{t}^{i,\ast} + D_{t}^{T} k_{t}^{i,\ast}\right)\right],

\varphi(t,p,q,k) = P_{U}\left[R_{t}^{-1}\left(B_{t}^{T} q + K_{t}^{T} p + D_{t}^{T} k\right)\right].

\left\{\begin{array}[]{rcl}\mathrm{d}x_{t}^{i,\ast}&=&\left[A_{t}x_{t}^{i,\ast}+B_{t}\varphi\left(p_{t}^{i,\ast},q_{t}^{i,\ast},k_{t}^{i,\ast}\right)+F_{t}\phi_{t}^{1}+b_{t}\right]\mathrm{d}t\\ &&+\left[D_{t}\varphi\left(p_{t}^{i,\ast},q_{t}^{i,\ast},k_{t}^{i,\ast}\right)+\sigma_{t}\right]\mathrm{d}W_{t}^{i},\\ \mathrm{d}y_{t}^{i,\ast}&=&-\left[M_{t}x_{t}^{i,\ast}+U_{t}y_{t}^{i,\ast}+H_{t}\phi_{t}^{1}+V_{t}\phi_{t}^{2}+K_{t}\varphi\left(p_{t}^{i,\ast},q_{t}^{i,\ast},k_{t}^{i,\ast}\right)+f_{t}\right]\mathrm{d}t+z_{t}^{i,\ast}\mathrm{d}W_{t}^{i}\\ \mathrm{d}p_{t}^{i,\ast}&=&\left[U_{t}p_{t}^{i,\ast}-L_{t}\left(y_{t}^{i,\ast}-\phi_{t}^{2}\right)\right]\mathrm{d}t,\\ \mathrm{d}q_{t}^{i,\ast}&=&\left[-M_{t}p_{t}^{i,\ast}-A_{t}q_{t}^{i,\ast}+Q_{t}\left(x_{t}^{i,\ast}-\phi_{t}^{1}\right)\right]\mathrm{d}t+k_{t}^{i,\ast}\mathrm{d}W_{t}^{i},\\ x_{0}^{i,\ast}&=&x\in\mathbb{R}^{n},\text{ }y_{T}^{i,\ast}=\Phi x_{T}^{i,\ast},\text{ }p_{0}^{i,\ast}=0,\text{ }q_{T}^{i,\ast}=\Phi^{T}p_{T}^{i,\ast}-G\left(x_{T}^{i,\ast}-\phi_{T}^{1}\right).\end{array}\right.

\phi_{\cdot}^{1}

\phi_{\cdot}^{2}

\left\{\begin{array}[]{rcl}\mathrm{d}x_{t}^{i,\ast}&=&\left[A_{t}x_{t}^{i,\ast}+B_{t}\varphi\left(p_{t}^{i,\ast},q_{t}^{i,\ast},k_{t}^{i,\ast}\right)+F_{t}\mathbb{E}x_{t}^{i,\ast}+b_{t}\right]\mathrm{d}t\\ &&+\left[D_{t}\varphi\left(p_{t}^{i,\ast},q_{t}^{i,\ast},k_{t}^{i,\ast}\right)+\sigma_{t}\right]\mathrm{d}W_{t}^{i},\\ \mathrm{d}y_{t}^{i,\ast}&=&-\left[M_{t}x_{t}^{i,\ast}+U_{t}y_{t}^{i,\ast}+H_{t}\mathbb{E}x_{t}^{i,\ast}+V_{t}\mathbb{E}y_{t}^{i,\ast}+K_{t}\varphi\left(p_{t}^{i,\ast},q_{t}^{i,\ast},k_{t}^{i,\ast}\right)+f_{t}\right]\mathrm{d}t+z_{t}^{i,\ast}\mathrm{d}W_{t}^{i},\\ \mathrm{d}p_{t}^{i,\ast}&=&\left[U_{t}p_{t}^{i,\ast}-L_{t}\left(y_{t}^{i,\ast}-\mathbb{E}y_{t}^{i,\ast}\right)\right]\mathrm{d}t,\\ \mathrm{d}q_{t}^{i,\ast}&=&\left[-M_{t}p_{t}^{i,\ast}-A_{t}q_{t}^{i,\ast}+Q_{t}\left(x_{t}^{i,\ast}-\mathbb{E}x_{t}^{i,\ast}\right)\right]\mathrm{d}t+k_{t}^{i,\ast}\mathrm{d}W_{t}^{i},\\ x_{0}^{i,\ast}&=&x\in\mathbb{R}^{n},\text{ }y_{T}^{i,\ast}=\Phi x_{T}^{i,\ast},\text{ }p_{0}^{i,\ast}=0,\\ q_{T}^{i,\ast}&=&\Phi^{T}p_{T}^{i,\ast}-G\left(x_{T}^{i,\ast}-\mathbb{E}x_{T}^{i,\ast}\right).\end{array}\right.

\left\{\begin{array}[]{rcl}\mathrm{d}x&=&\left[Ax+B\varphi\left(p,q,k\right)+F\mathbb{E}x+b\right]\mathrm{d}t+\left[D\varphi\left(p,q,k\right)+\sigma\right]\mathrm{d}W_{t},\\ \mathrm{d}y&=&-\left[Mx+Uy+H\mathbb{E}x+V\mathbb{E}y+K\varphi\left(p,q,k\right)+f\right]\mathrm{d}t+z\mathrm{d}W_{t},\\ \mathrm{d}p&=&\left[Up-L\left(y-\mathbb{E}y\right)\right]\mathrm{d}t,\\ \mathrm{d}q&=&\left[-Mp-Aq+Q\left(x-\mathbb{E}x\right)\right]\mathrm{d}t+k\mathrm{d}W_{t},\\ x_{0}&=&x\in\mathbb{R}^{n},\text{ }y_{T}=\Phi x_{T},\text{ }p_{0}=0,\\ q_{T}&=&\Phi^{T}p_{T}-G\left(x_{T}-\mathbb{E}x_{T}\right).\end{array}\right.

\left\{\begin{array}[]{rcl}\mathrm{d}\alpha^{i}&=&\left[A\alpha^{i}+B\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+F\mathbb{E}\alpha^{i}+b\right]\mathrm{d}t+\left[D\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+\sigma\right]\mathrm{d}W_{t}^{i},\\ \mathrm{d}\theta^{i}&=&-\left[M\alpha^{i}+U\theta^{i}+H\mathbb{E}\alpha^{i}+V\mathbb{E}\theta^{i}+K\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+f\right]\mathrm{d}t+\kappa^{i}\mathrm{d}W_{t}^{i},\\ \mathrm{d}\chi^{i}&=&\left[U\chi^{i}-L\left(\theta^{i}-\mathbb{E}\theta^{i}\right)\right]\mathrm{d}t,\\ \mathrm{d}\beta^{i}&=&\left[-M\chi^{i}-A\beta^{i}+Q\left(\alpha^{i}-\mathbb{E}\alpha^{i}\right)\right]\mathrm{d}t+\gamma^{i}\mathrm{d}W_{t}^{i},\\ \alpha_{0}^{i}&=&x\in\mathbb{R}^{n},\text{ }\theta_{T}^{i}=\Phi\alpha_{T}^{i},\text{ }\chi_{0}^{i}=0,\\ \beta_{T}^{i}&=&\Phi^{T}\chi_{T}^{i}-G\left(\alpha_{T}^{i}-\mathbb{E}\alpha_{T}^{i}\right).\end{array}\right.

J^{i}(\bar{u}_{t}^{i}, \bar{u}_{t}^{-i}) \leq J^{i}(u_{t}^{i}, \bar{u}_{t}^{-i}) + \epsilon,

\left\{\begin{array}[]{rcl}\mathrm{d}\breve{x}^{i}&=&\left[A\breve{x}^{i}+B\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+F\breve{x}^{(N)}+b\right]\mathrm{d}t+\left[D\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+\sigma\right]\mathrm{d}W_{t}^{i},\\ \mathrm{d}\breve{y}^{i}&=&-\left[M\breve{x}^{i}+U\breve{y}^{i}+H\breve{x}^{(N)}+V\breve{y}^{(N)}+K\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+f\right]\mathrm{d}t+\breve{z}^{i}\mathrm{d}W_{t}^{i},\\ \mathrm{d}\alpha^{i}&=&\left[A\alpha^{i}+B\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+F\mathbb{E}\alpha^{i}+b\right]\mathrm{d}t+\left[D\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+\sigma\right]\mathrm{d}W_{t}^{i},\\ \mathrm{d}\theta^{i}&=&-\left[M\alpha^{i}+U\theta^{i}+H\mathbb{E}\alpha^{i}+V\mathbb{E}\theta^{i}+K\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+f\right]\mathrm{d}t+\kappa^{i}\mathrm{d}W_{t}^{i},\\ \mathrm{d}\chi^{i}&=&\left[U\chi^{i}-L\left(\theta^{i}-\mathbb{E}\theta^{i}\right)\right]\mathrm{d}t,\\ \mathrm{d}\beta^{i}&=&\left[-M\chi^{i}-A\beta^{i}+Q\left(\alpha^{i}-\mathbb{E}\alpha^{i}\right)\right]\mathrm{d}t+\gamma^{i}\mathrm{d}W_{t}^{i},\\ \breve{x}_{0}^{i}&=&\alpha_{0}^{i}=x,\text{ }\breve{y}_{T}^{i}=\Phi\breve{x}_{T}^{i},\\ \theta_{T}^{i}&=&\Phi\alpha_{T}^{i},\text{ }\chi_{0}^{i}=-\Psi\left(\theta_{0}^{i}-\mathbb{E}\theta_{0}^{i}\right),\\ \beta_{T}^{i}&=&\Phi^{T}\chi_{T}^{i}-G\left(\alpha_{T}^{i}-\mathbb{E}\alpha_{T}^{i}\right),\end{array}\right.

\displaystyle\mathbb{E}\sup_{0\leq t\leq T}\Big{|}\breve{x}^{(N)}(t)\Big{|}^{2}

\displaystyle\mathbb{E}\sup_{0\leq t\leq T}\Big{|}\breve{y}^{(N)}(t)\Big{|}^{2}

\displaystyle\mathbb{E}\sup_{0\leq t\leq T}\Big{|}\breve{x}^{(N)}(t)-\mathbb{E}\alpha^{i}(t)\Big{|}^{2}

\displaystyle\mathbb{E}\sup_{0\leq t\leq T}\Big{|}\breve{y}^{(N)}(t)-\mathbb{E}\theta^{i}(t)\Big{|}^{2}

\left\{\begin{array}[]{rl}\mathrm{d}\breve{x}^{(N)}=&\left[A\breve{x}^{(N)}+\frac{1}{N}\sum_{i=1}^{N}B\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+F\breve{x}^{(N)}+b\right]\mathrm{d}t\\ &+\displaystyle\frac{1}{N}\sum_{i=1}^{N}\left[D\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+\sigma\right]\mathrm{d}W_{t}^{i},\\ \mathrm{d}\breve{y}^{(N)}=&-\left[M\breve{x}^{(N)}+U\breve{y}^{(N)}+H\breve{x}^{(N)}+V\breve{y}^{(N)}+\frac{1}{N}\sum_{i=1}^{N}K\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+f\right]\mathrm{d}t\\ &+\displaystyle\frac{1}{N}\sum_{i=1}^{N}\breve{z}^{i}\mathrm{d}W_{t}^{i},\\ \breve{x}_{0}^{(N)}=&x,\quad\breve{y}_{T}^{(N)}=\Phi\breve{x}_{T}^{(N)}.\end{array}\right.

\left\{\begin{array}[]{l}\mathrm{d}\left(\mathbb{E}\alpha^{i}\right)=\left[A\mathbb{E}\alpha^{i}+B\mathbb{E}\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+F\mathbb{E}\alpha^{i}+b\right]\mathrm{d}t,\\ \mathrm{d}\left(\mathbb{E}\theta^{i}\right)=-\left[M\mathbb{E}\alpha^{i}+U\mathbb{E}\theta^{i}+H\mathbb{E}\alpha^{i}+V\mathbb{E}\theta^{i}+K\mathbb{E}\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+f\right]\mathrm{d}t,\\ \mathbb{E}\alpha_{0}^{i}=x,\quad\mathbb{E}\theta_{T}^{i}=\Phi\mathbb{E}\alpha_{T}^{i}.\end{array}\right.

\Delta_{t}^{1}

\Delta_{t}^{2}

\left\{\begin{array}[]{l}\mathrm{d}\Delta^{1}=\left[A\Delta^{1}+\frac{1}{N}\sum_{i=1}^{N}B\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)-B\mathbb{E}\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+F\Delta^{1}\right]\mathrm{d}t\\ \qquad\qquad+\displaystyle\frac{1}{N}\sum_{i=1}^{N}\left[D\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)+\sigma\right]\mathrm{d}W_{t}^{i},\\ \mathrm{d}\Delta^{2}=-\Big{[}M\Delta^{1}+U\Delta^{2}+H\Delta^{1}+V\Delta^{2}\\ \qquad\qquad+\displaystyle\frac{1}{N}\sum_{i=1}^{N}K\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)-K\mathbb{E}\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)\Big{]}\mathrm{d}t+\frac{1}{N}\sum_{i=1}^{N}\breve{z}^{i}\mathrm{d}W_{t}^{i},\\ \Delta_{0}^{1}=0,\quad\Delta_{T}^{2}=\Phi\Delta_{T}^{1},\end{array}\right.

Full text

Mean Field Game for Linear Quadratic Stochastic Recursive Systems

Liangquan Zhang (1), Xun Li (2)

(1) School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China

(2) Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong

L. Zhang acknowledges partial financial support from the National Natural Science Foundation of China (Grant No. 11701040, 61871058, 11871010 & 61603049) and the Fundamental Research Funds for the Central Universities (No. 500417024 & 505018304). E-mail: [email protected]. X. Li acknowledges partial financial support from PolyU G-UA4N and the Hong Kong RGC under grants 15224215 and 15255416. E-mail: [email protected].

Abstract

This paper focuses on linear-quadratic (LQ for short) mean-field games described by forward-backward stochastic differential equations (FBSDEs for short), in which the individual control region is postulated to be convex. The decentralized strategies and the consistency condition are represented by a kind of coupled mean-field FBSDEs with projection operators. The well-posedness of the consistency condition system is obtained using the monotonicity condition method. The $\epsilon$-Nash equilibrium property is discussed as well.

AMS subject classifications: 93E20, 60H15, 60H30.

Key words: $\epsilon$-Nash equilibrium, Mean-field forward-backward stochastic differential equation (MF-FBSDE), Linear-quadratic constrained control, Projection, Monotonic condition.

1 Introduction

The control of stochastic multi-agent systems has attracted considerable attention from many researchers. As is well known, large population systems arise naturally in a variety of fields (e.g., biology, engineering, social science, economics and finance, operational research and management). Interested readers may refer to [27, 28, 29, 30] for more details on their background and applications. The agents (or players) in a large population system are individually negligible, but their collective behavior has a significant impact on all agents. This trait is captured by the weakly coupled structure of the individual dynamics and cost functionals through the state-average. In this way, the individual behaviors of all agents at the micro scale produce a mass effect at the macro scale.

In a controlled large population system, it is intractable for a given agent to collect the information of all agents, owing to the highly complex interactions among its peers. Consequently, centralized controls, which are built upon the full information of all agents, are neither implementable nor efficient in the large population framework. It is more reasonable and effective to study decentralized strategies, which depend on local information only. (Here local information means that the optimal control regulator for a given agent is designed on its own individual state and some quantity which can be obtained off-line.) The mean-field type stochastic control problem is of great interest and importance in various fields such as science, engineering, economics, management, and particularly financial investment. In contrast with standard stochastic control problems, the underlying dynamic system and the cost functional involve the state processes as well as their expected values (hence the name mean-field). In financial investment, one frequently encounters problems that are closely related to money managers' performance evaluation and incentive compensation mechanisms. Together with MF-FSDEs, research is naturally required on optimal control problems based on mean-field forward-backward stochastic differential equations (hereafter MF-FBSDEs). One powerful tool for this purpose is the so-called mean-field game (see [35]). The basic idea is to approximate the initial large population control problem by its limiting problem via some mean-field term (i.e., the asymptotic limit of the state-average). There is an extensive literature on these topics: see [3, 11, 12, 13, 18, 31, 34, 35] for the study of mean-field games; [29] for cooperative social optimization; [27], [36] and [37] and the references therein for models with a major player; and [1, 4, 46] for optimal control with a mean term in the dynamics and cost.
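The approximation of the state-average by a mean-field term can be illustrated numerically. The following is only a rough sketch under assumptions that are not taken from the paper: the toy dynamics are uncontrolled and scalar, the coefficients `a`, `f`, `sigma` and the helper names `state_average_error` and `rms_error` are hypothetical, and a plain Euler scheme is used. It compares the terminal state-average of $N$ weakly coupled agents with the deterministic limit and estimates the root-mean-square gap.

```python
import math
import random


def state_average_error(n_agents, n_steps=20, horizon=1.0, a=-0.5, f=0.3,
                        sigma=0.4, x0=1.0, rng=None):
    """Simulate N weakly coupled agents and return |x^(N)_T - m_T|.

    Toy uncontrolled dynamics (an assumption, not the paper's full system):
        dx^i = (a x^i + f x^(N)) dt + sigma dW^i,   x^i_0 = x0,
    where x^(N) is the state-average. The frozen mean-field limit m solves
        dm = (a + f) m dt,   m_0 = x0.
    Both are discretized with the same Euler scheme.
    """
    if rng is None:
        rng = random.Random(0)
    dt = horizon / n_steps
    states = [x0] * n_agents
    mean_field = x0
    for _ in range(n_steps):
        average = sum(states) / n_agents
        # Each agent sees the others only through the state-average.
        states = [
            x + (a * x + f * average) * dt
            + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for x in states
        ]
        mean_field += (a + f) * mean_field * dt
    return abs(sum(states) / n_agents - mean_field)


def rms_error(n_agents, n_runs=50, seed=2021):
    """Root-mean-square gap between state-average and mean-field limit."""
    rng = random.Random(seed)
    squared = [state_average_error(n_agents, rng=rng) ** 2
               for _ in range(n_runs)]
    return math.sqrt(sum(squared) / n_runs)


if __name__ == "__main__":
    for n in (4, 400):
        print(n, rms_error(n))
```

In this toy model the gap is driven by the averaged noise, so it is expected to shrink roughly like $1/\sqrt{N}$; the controlled forward-backward system studied in the paper requires the estimates developed there.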

The main contribution of this paper is to study forward-backward mean-field LQG games of large population systems in which the individual states follow forward-backward stochastic differential equations (FBSDEs for short). This framework makes our setting very different from existing works on mean-field LQG games, wherein the individual states evolve according to forward stochastic differential equations. In contrast to a classical stochastic differential equation, the terminal condition of a BSDE is specified a priori as a random variable. As a result, the BSDE admits a pair of adapted solutions, in which the second component (the diffusion term) arises naturally from the martingale representation theorem and the requirement of adaptedness to the filtration. Linear BSDEs were first introduced in [2] for studying optimal control problems, and general nonlinear BSDEs were developed by Pardoux and Peng in 1992 [38]. Since then, the study of BSDEs has attracted sustained and intense interest, and BSDEs have been used in many applications in diverse areas. For instance, BSDEs play an important role in characterizing nonlinear expectations ($g$-expectation, see [40]) and stochastic differential recursive utility (see [14]). Subsequently, El Karoui, Peng, and Quenez [32] presented many applications of BSDEs in mathematical finance and optimal control theory. Pardoux and Peng studied a class of stochastic partial differential equations by means of backward doubly SDEs (see [39]). Therefore, it is very natural to study the dynamic optimization of such systems in the large-population setting. Indeed, the dynamic optimization of backward large population systems is inspired by a variety of scenarios. Examples include dynamic economic models in which the participants have recursive utilities or nonlinear expectations, and production planning problems with tracking terminal objectives that are affected by the market price via the production average.
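The remark that a BSDE is solved by a pair of adapted processes can be made concrete with a standard textbook illustration, which is not taken from this paper. The generator $g$, the terminal value $\xi$, and the scalar Brownian motion $W$ with filtration $(\mathcal{F}_t)$ are notation assumed only for this sketch.

```latex
% Generic BSDE on [0,T]: the terminal value \xi is prescribed,
% and the pair (y, z) is the unknown.
y_t = \xi + \int_t^T g(s, y_s, z_s)\,\mathrm{d}s - \int_t^T z_s\,\mathrm{d}W_s,
\qquad 0 \le t \le T.

% Worked example with g \equiv 0 and \xi = W_T^2:
y_t = \mathbb{E}\left[ W_T^2 \mid \mathcal{F}_t \right] = W_t^2 + (T - t),
\qquad z_t = 2 W_t,
% since, by It\^o's formula,
\mathrm{d}\left( W_t^2 + T - t \right) = 2 W_t\,\mathrm{d}W_t .
```

Without the second component $z$, the only candidate would be $y_t \equiv W_T^2$, which is not adapted; the martingale representation theorem supplies $z$ so that $y$ is adapted and still reaches $\xi$ at time $T$.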

Another example arises from risk management when one considers relative or comparable criteria based on the average performance of all other peers across the whole sector. This is the case when a given pension fund evaluates its own performance by setting the average performance (say, average hedging cost or initial deposit, surplus) as its benchmark. In addition, controlled forward large population systems that are subject to terminal constraints can be reformulated as backward large population systems, as motivated by [33]. Applications to the performance evaluation and incentive compensation of fund managers in financial engineering are of both academic and practical importance. Findings on FBSDEs and mean-field stochastic optimal control will not only contribute to the academic literature by shedding light on performance evaluation and incentive compensation schemes, but also provide practical applications to fund management and risk control, especially under the current circumstances of ongoing economic recession. More importantly, research outcomes in this field are expected to add to our knowledge, based on economic theory, about providing appropriate incentives for managers in an agency framework. They can also be generalized to various industries and economic regions to provide policy makers with a theoretical basis for their decision-making. Motivated by the above, this paper studies forward-backward mean-field linear-quadratic-Gaussian (BMFLQG) games.

We focus on the linear-quadratic (LQ) mean-field game in which the individual control domain is a convex subset of $\mathbb{R}^{m}$. LQ problems with a convex control domain arise naturally in various practical applications. For instance, the no-shorting constraint in portfolio selection leads to LQ control with positive control ($\mathbb{R}_{+}^{m}$, the positive orthant). Moreover, due to general market accessibility constraints, it is also interesting to study LQ control with a more general closed convex cone constraint (see [19]). In response, this paper investigates the LQ dynamic game of a large-population system with a general closed convex control constraint.

The control constraint brings some new features to our study: (1) The related consistency condition (CC) system is no longer linear; it becomes a class of nonlinear FBSDEs with a projection operator. (2) Due to the nonlinearity in (1), the standard Riccati equation with feedback control is no longer valid for representing the consistency condition of the limiting state-average process. Instead, the consistency condition is embedded into a class of mean-field coupled FBSDEs driven by a generic Brownian motion.
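To see why the projection makes the system nonlinear, the following is a minimal sketch of the feedback map $\varphi(t,p,q,k)=P_{U}[R_{t}^{-1}(B_{t}^{T}q+K_{t}^{T}p+D_{t}^{T}k)]$ from the equation listing, under simplifying assumptions that are not from the paper: $U=\mathbb{R}_{+}^{m}$, $R$ is diagonal with positive entries (so the projection reduces to clipping at zero), and all matrices, dimensions, and helper names (`mat_t_vec`, `gradient_term`, `phi`, `variational_gap`) are hypothetical.

```python
def mat_t_vec(matrix, vector):
    """Return matrix^T @ vector, with the matrix stored as a list of rows."""
    return [
        sum(row[j] * v for row, v in zip(matrix, vector))
        for j in range(len(matrix[0]))
    ]


def gradient_term(B, K, D, p, q, k):
    """Return B^T q + K^T p + D^T k as a list of length m."""
    return [
        bq + kp + dk
        for bq, kp, dk in zip(mat_t_vec(B, q), mat_t_vec(K, p), mat_t_vec(D, k))
    ]


def phi(r_diag, B, K, D, p, q, k):
    """Projection feedback P_U[R^{-1}(B^T q + K^T p + D^T k)] for U = R_+^m.

    Assumption: R = diag(r_diag) with positive entries, so the projection
    in the R-weighted norm reduces to clipping each component at zero.
    """
    g = gradient_term(B, K, D, p, q, k)
    return [max(g_j / r_j, 0.0) for g_j, r_j in zip(g, r_diag)]


def variational_gap(r_diag, g, u_star, u):
    """Return <g - R u_star, u - u_star>, which should be <= 0 on U."""
    return sum(
        (g_j - r_j * s_j) * (u_j - s_j)
        for g_j, r_j, s_j, u_j in zip(g, r_diag, u_star, u)
    )


# Illustrative data (hypothetical): n = 2 states, m = 2 controls, scalar y.
R_DIAG = [2.0, 1.0]
B = [[1.0, 0.0], [0.0, 1.0]]
K = [[0.5, 0.0]]
D = [[0.2, 0.0], [0.0, 0.2]]

if __name__ == "__main__":
    p, q, k = [1.0], [1.0, -3.0], [0.5, 0.5]
    u_star = phi(R_DIAG, B, K, D, p, q, k)
    print("u* =", u_star)
    g = gradient_term(B, K, D, p, q, k)
    for u in ([0.0, 0.0], [1.0, 2.0], [5.0, 0.1]):
        print(u, variational_gap(R_DIAG, g, u_star, u))
    # Nonlinearity: phi is not additive in the adjoint variables.
    zero_p, zero_k = [0.0], [0.0, 0.0]
    a = phi(R_DIAG, B, K, D, zero_p, [1.0, -3.0], zero_k)
    b = phi(R_DIAG, B, K, D, zero_p, [-1.0, 3.0], zero_k)
    both = phi(R_DIAG, B, K, D, zero_p, [0.0, 0.0], zero_k)
    print([x + y for x, y in zip(a, b)], "vs", both)
```

The last comparison shows that $\varphi$ applied to a sum of adjoint inputs differs from the sum of the individual outputs, which is why a linear Riccati-type feedback cannot represent it. For a non-diagonal $R$ or a general convex set $U$, the clipping step would have to be replaced by a genuine weighted projection.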

As in Hu, Huang, and Li [24], we first apply the stochastic maximum principle for a convex control domain to characterize the optimal decentralized response through a Hamiltonian system with a projection operator onto the control set $U$ for forward-backward systems. Then, the consistency condition system is connected to the well-posedness of a mean-field coupled forward-backward stochastic differential equation (MF-FBSDE). Next, we state a monotonicity condition for this MF-FBSDE to obtain existence and uniqueness of its solution. Finally, the related approximate Nash equilibrium property is established. The MFG strategy derived is of open-loop form. Consequently, the approximate Nash equilibrium property is verified under perturbations of the open-loop strategies, and some estimates for forward-backward SDEs are involved. In addition, all agents are assumed to be statistically identical, so the limiting control problem and fixed-point arguments are given for a representative agent.
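The passage from the maximum condition to a projection operator can be sketched as follows, using the inner-product expressions from the equation listing. This is a reading of those formulas rather than a quotation of the paper's proof; it assumes that $R$ is symmetric positive definite and that $U$ is closed and convex, and the auxiliary symbol $v$ is introduced only for this sketch. Time indices and the superscripts $i,\ast$ on the adjoint processes are suppressed.

```latex
% Maximum condition for the Hamiltonian on the convex set U:
\left\langle B^{T} q + K^{T} p + D^{T} k - R u^{\ast},\; u - u^{\ast} \right\rangle \le 0
\quad \text{for all } u \in U.

% Set v := R^{-1}\left( B^{T} q + K^{T} p + D^{T} k \right). Since R = R^{1/2} R^{1/2},
\left\langle R^{\frac{1}{2}} \left( v - u^{\ast} \right),\;
             R^{\frac{1}{2}} \left( u - u^{\ast} \right) \right\rangle \le 0
\quad \text{for all } u \in U.

% This is the variational characterization of the point of U closest to v
% in the weighted norm \| w \|_{R} := | R^{1/2} w |, hence
u^{\ast} = P_{U}[v]
         = \arg\min_{u \in U} \left| R^{\frac{1}{2}} (u - v) \right|^{2}.
```

When $U=\mathbb{R}^{m}$ the projection is the identity and the familiar unconstrained LQ feedback $u^{\ast}=R^{-1}(B^{T}q+K^{T}p+D^{T}k)$ is recovered.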

To make the paper more accessible, we recall the standard procedure of MFG and describe our results, which mainly consist of the following steps:

Step 1: Fix the state-average limits $\lim_{N\rightarrow+\infty}x^{(N)}$ and $\lim_{N\rightarrow+\infty}y^{(N)}$ by frozen processes $\mathbb{E}x$ and $\mathbb{E}y$ (the law of large numbers), and formulate an auxiliary stochastic control problem for $\mathcal{A}_{i}$ which is parameterized by $\mathbb{E}x$ and $\mathbb{E}y$ . Note that the coupled Hamiltonian system admits a unique strong adapted solution.

Step 2: Solve the above auxiliary stochastic control problem via Pontryagin’s maximum principle to obtain the decentralized optimal state $\left(x_{t}^{i,\ast},y_{t}^{i,\ast},z_{t}^{i,\ast}\right)$ (which depends on the undetermined processes $\mathbb{E}x$ and $\mathbb{E}y$ ). By means of convex analysis, we are able to construct the unique feedback control using a so-called projection mapping $\varphi\left(t,p,q,k\right).$

Step 3: Characterize the decentralized strategies $\{\bar{u}_{t}^{i},1\leq i\leq N\}$ of Problem (CC) through the auxiliary (LCC) and consistency condition system. Namely, show $\{\bar{u}_{t}^{i},1\leq i\leq N\}$ is an $\epsilon$ -Nash equilibrium. This step actually can be further divided into:

*Step 3-1:* Introduce the decentralized state $\left(\breve{x}_{t}^{i},\breve{y}^{i},\breve{z}^{i}\right)$ with its decentralized open-loop optimal strategy $\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)$ and the consistency condition system. We obtain two estimates relating them in Lemma 7 and Lemma 8, respectively;

*Step 3-2:* For any fixed $i$ , $1\leq i\leq N$ , we consider a group of state equations $\left(\tilde{x}^{i},\tilde{y}^{i},\tilde{z}^{i}\right)$ driven by a perturbation control $u^{i}\in\mathcal{U}_{ad}^{d,i}$ and the systems $\left(\mathring{x}^{i},\mathring{y}^{i},\mathring{z}^{i}\right)$ of the decentralized limiting state with perturbation control. Similarly, we have estimates between the perturbation systems and the consistency condition system in Lemma 10, plus estimates between the perturbation systems and the decentralized limiting state with perturbation control in Lemma 11;

*Step 3-3:* Finally, based on Lemma 7-Lemma 11, and employing the relation between the limiting cost functional $J_{i}$ and the cost functional $\mathcal{J}_{i}$ of $\mathcal{A}_{i}$ with the help of the perturbation control, we obtain the desired result.
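The freeze, optimize and match structure of Steps 1-2 and the consistency condition can be illustrated on a deliberately simple static analogue. The sketch below is not the FBSDE system studied in this paper: the quadratic cost, the coupling constant and the shifts (which stand in for idiosyncratic noise) are all hypothetical choices, and the control set is taken to be $U=[0,+\infty)$ so that the projection is a plain truncation at zero.

```python
# Toy static analogue of Steps 1-2 and the consistency condition.
# NOT the paper's FBSDE system: every name and number here is hypothetical.

def best_response(frozen_mean, shift, coupling):
    """Minimise (u - coupling * frozen_mean - shift)**2 over u >= 0.

    The unconstrained minimiser is projected onto U = [0, +inf)."""
    return max(0.0, coupling * frozen_mean + shift)


def consistency_fixed_point(shifts, coupling, tol=1e-12, max_iter=10_000):
    """Iterate: freeze the mean, best-respond, then average the responses."""
    mean = 0.0
    for _ in range(max_iter):
        responses = [best_response(mean, s, coupling) for s in shifts]
        new_mean = sum(responses) / len(responses)
        if abs(new_mean - mean) < tol:
            return new_mean, responses
        mean = new_mean
    raise RuntimeError("fixed-point iteration did not converge")


shifts = [-1.0, 0.5, 2.0]   # stand-ins for idiosyncratic noise (hypothetical)
mean, responses = consistency_fixed_point(shifts, coupling=0.5)
print(mean, responses)
```

Since the coupling constant is smaller than one in absolute value, the map from the frozen mean to the realized average is a contraction, so the iteration converges. In the paper itself the fixed point is characterized through the MF-FBSDE rather than by such an iteration.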

This paper is organized as follows: Section 2 formulates the LQ MFGs of BSDE type with convex control domain. The decentralized strategies are derived with the help of mean-field forward-backward SDEs with projection operators. The consistency condition is also established. Section 3 verifies the $\epsilon$ -Nash equilibrium property of the decentralized strategies. Some proofs are deferred to Appendix A. Related results on properties of projections in convex analysis can be found in Appendix B.

2 Preliminaries

Throughout this paper, we denote the $k$ -dimensional Euclidean space by $\mathbb{R}^{k}$ with standard Euclidean norm $|\cdot|$ and standard Euclidean inner product $\langle\cdot,\cdot\rangle$ . The transpose of a vector (or matrix) $x$ is denoted by $x^{\top}$ . $\text{Tr}(A)$ denotes the trace of a square matrix $A$ . Let $\mathbb{R}^{m\times n}$ be the Hilbert space consisting of all ( $m\times n$ )-matrices with the inner product $\langle A,B\rangle:=\text{Tr}(AB^{\top})$ and the norm $|A|:=\langle A,A\rangle^{\frac{1}{2}}$ . Denote the set of symmetric $k\times k$ matrices with real elements by $S^{k}$ . If $M\in S^{k}$ is positive (semi)definite, we write $M>\ (\geq)\ 0$ . $L^{\infty}(0,T;\mathbb{R}^{k})$ is the space of uniformly bounded $\mathbb{R}^{k}$ -valued functions. If $M(\cdot)\in L^{\infty}(0,T;S^{k})$ and $M(t)>\ (\geq)\ 0$ for all $t\in[0,T]$ , we say that $M(\cdot)$ is positive (semi)definite, which is denoted by $M(\cdot)>\ (\geq)\ 0$ . $L^{2}(0,T;\mathbb{R}^{k})$ is the space of all $\mathbb{R}^{k}$ -valued functions satisfying $\int_{0}^{T}|x(t)|^{2}dt<\infty.$

Consider a finite time horizon $[0,T]$ for fixed $T>0$ . We assume $(\Omega,\mathcal{F},\{\mathcal{F}_{t}\}_{0\leq t\leq T},P)$ is a complete, filtered probability space on which a standard $N$ -dimensional Brownian motion $\{W_{i}(t),\ 1\leq i\leq N\}_{0\leq t\leq T}$ is defined. For a given filtration $\mathbb{F}=\{\mathcal{F}_{t}\}_{0\leq t\leq T},$ let $L_{\mathbb{F}}^{2}(0,T;\mathbb{R}^{k})$ denote the space of all $\mathcal{F}_{t}$ -progressively measurable $\mathbb{R}^{k}$ -valued processes satisfying $\mathbb{E}\int_{0}^{T}|x(t)|^{2}dt<\infty.$ Let $L_{\mathbb{F}}^{2,\mathcal{E}_{0}}(0,T;\mathbb{R}^{k})\subset L_{\mathbb{F}}^{2}(0,T;\mathbb{R}^{k})$ be the subspace satisfying $\mathbb{E}x_{t}\equiv 0$ for $x\in L_{\mathbb{F}}^{2,\mathcal{E}_{0}}(0,T;\mathbb{R}^{k}).$ Let $L_{\mathbb{F}_{T}}^{2}(\mathbb{R}^{k})$ denote the space of all $\mathcal{F}_{T}$ -measurable $\mathbb{R}^{k}$ -valued random variables $\xi$ satisfying $\mathbb{E}|\xi|^{2}<\infty.$

Now let us consider a large-population system with $N$ weakly-coupled negligible agents $\{\mathcal{A}_{i}\}_{1\leq i\leq N}$ . The states $x^{i}$ and $y^{i}$ of each $\mathcal{A}_{i}$ satisfy the following controlled linear stochastic system:

[TABLE]

where $x^{(N)}(\cdot)=\displaystyle\frac{1}{N}\sum_{i=1}^{N}x^{i}(\cdot)$ and $y^{(N)}(\cdot)=\displaystyle\frac{1}{N}\sum_{i=1}^{N}y^{i}(\cdot)$ are the state-averages, and $(A,B,F,b,D,\sigma,M,U,H,V,K,f,\Phi)$ are matrix-valued functions with appropriate dimensions to be identified. For the sake of presentation, we assume that all agents are homogeneous, or statistically symmetric, with the same coefficients $(A,B,F,b,D,\sigma,M,U,H,V,K,f,\Phi)$ and the same deterministic initial state $x$ .
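To make the role of the state-average $x^{(N)}$ concrete, the following sketch simulates only the forward equation $\mathrm{d}x^{i}=(Ax^{i}+Bu^{i}+Fx^{(N)}+b)\mathrm{d}t+(Du^{i}+\sigma)\mathrm{d}W^{i}$ by an Euler-Maruyama scheme in the scalar case and compares $x^{(N)}_{T}$ with the Euler approximation of the mean equation $\mathrm{d}m=((A+F)m+Bu+b)\mathrm{d}t$ obtained by replacing $x^{(N)}$ with its expectation. The constant coefficients, the constant control $u\equiv 1\in U=[0,+\infty)$ and the discretization are illustrative assumptions; the backward component $(y^{i},z^{i})$ is not simulated.

```python
import math
import random


def simulate_state_average(n_agents, n_steps=200, horizon=1.0, seed=0):
    """Euler-Maruyama for dx^i = (A x^i + B u + F x^(N) + b) dt + (D u + sigma) dW^i.

    Scalar case, constant coefficients, constant control u >= 0.
    All numerical values are illustrative assumptions."""
    A, B, F, b, D, sigma = -1.0, 1.0, 0.5, 0.0, 0.2, 0.3
    u, x0 = 1.0, 1.0
    dt = horizon / n_steps
    rng = random.Random(seed)
    states = [x0] * n_agents
    limit = x0  # Euler scheme for the mean equation dm = ((A + F) m + B u + b) dt
    for _ in range(n_steps):
        average = sum(states) / n_agents  # x^(N) at the current time
        states = [
            x + (A * x + B * u + F * average + b) * dt
            + (D * u + sigma) * rng.gauss(0.0, math.sqrt(dt))
            for x in states
        ]
        limit += ((A + F) * limit + B * u + b) * dt
    return sum(states) / n_agents, limit


average, limit = simulate_state_average(n_agents=1000)
print(average, limit)
```

For large $N$ the independent noises average out, so the two printed numbers should be close; this is the law-of-large-numbers heuristic behind freezing the state-average.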

Now we identify the information structure of large population system: $\mathbb{F}^{i}=\{\mathcal{F}^{i}_{t}\}_{0\leq t\leq T}$ is the natural filtration generated by $\{W_{i}(t),0\leq t\leq T\}$ and augmented by all $P-$ null sets in $\mathcal{F}.$ $\mathbb{F}=\{\mathcal{F}_{t}\}_{0\leq t\leq T}$ is the natural filtration generated by $\{W_{i}(t),1\leq i\leq N,0\leq t\leq T\}$ and augmented by all $P-$ null sets in $\mathcal{F}.$ Thus, $\mathbb{F}^{i}$ is the individual decentralized information of $i^{th}$ Brownian motion while $\mathbb{F}$ is the centralized information driven by all Brownian motion components. Note that the heterogeneous noise $W_{i}$ is specific for individual agent $\mathcal{A}_{i}$ but $x^{i}(t)$ is adapted to $\mathcal{F}_{t}$ instead of $\mathcal{F}^{i}_{t}$ due to the coupling state-average $x^{(N)}.$

The (centralized) admissible control satisfies $u^{i}\in\mathcal{U}_{ad}^{c}$ , where the (centralized) admissible control set $\mathcal{U}_{ad}^{c}$ is defined as

[TABLE]

where $U\subset\mathbb{R}^{m}$ is a closed convex set. A typical example of such a set is $U=\mathbb{R}_{+}^{m}$ , which represents positive control. By “centralized”, we mean that $\mathbb{F}$ is the centralized information generated by all Brownian motion components. Moreover, we also define the decentralized control as $u^{i}\in\mathcal{U}_{ad}^{d,i}$ , where the decentralized admissible control set $\mathcal{U}_{ad}^{d,i}$ is defined as

[TABLE]

Note that both $\mathcal{U}_{ad}^{d,i}$ and $\mathcal{U}_{ad}^{c}$ are defined in open-loop sense, and $\mathcal{U}_{ad}^{d,i}\subset\mathcal{U}_{ad}^{c}$ . Let $u=(u^{1},\cdots,u^{i}$ , $\cdots,u^{N})$ denote the set of control strategies of all $N$ agents and $u_{-i}=(u^{1},\cdots,u^{i-1},$ $u^{i+1},\cdots,u^{N})$ denote the control strategies set except the $i^{th}$ agent $\mathcal{A}_{i}.$ Introduce the cost functional of $\mathcal{A}_{i}$ as

[TABLE]

We assume the following conditions:

(A1)

Assume $A(\cdot),$ $F(\cdot),$ $M(\cdot),$ $U(\cdot),$ $H(\cdot),$ $V(\cdot)\in L^{\infty}(0,T;S^{n}),$ and $B(\cdot),$ $D(\cdot),$ $K\left(\cdot\right)\in L^{\infty}(0,T;\mathbb{R}^{n\times m}),$ $b(\cdot),$ $\sigma(\cdot),$ $f\left(\cdot\right)\in L^{\infty}(0,T;\mathbb{R}^{n});$

(A2)

$Q(\cdot),$ $L\left(\cdot\right)\in L^{\infty}(0,T;S^{n}),$ $Q(\cdot),$ $L\left(\cdot\right)\geq 0,$ $R(\cdot)\in L^{\infty}(0,T;S^{m}),$ $R(\cdot)>0$ and $R^{-1}(\cdot)\in L^{\infty}(0,T;S^{m})$ , $G\in S^{n}$ , $G>0$ .

By virtue of the theory of mean-field BSDEs (see Lemma 3.1 in [5]), under the assumptions (A1)-(A2), Eq. $(\ref{FB1})$ admits a unique solution $\left(x^{i},y^{i},z^{i}\right)\in L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n})$ $\times L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n})\times L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n})$ for any admissible control $u^{i}\in\mathcal{U}_{ad}^{c}$ . We now formulate the large-population LQG problem with control constraint (CC).

Problem (CC). Find an open-loop Nash equilibrium strategies set $\bar{u}=(\bar{u}^{1},\bar{u}^{2},\cdots,\bar{u}^{N})$ satisfying

[TABLE]

where $\bar{u}_{-i}$ represents $(\bar{u}^{1},\cdots,\bar{u}^{i-1},\bar{u}^{i+1},\cdots,\bar{u}^{N})$ , the strategies of all agents except $\mathcal{A}_{i}$ .

Observe that Problem (CC) is computationally demanding because of the highly complicated coupling structure among the agents. Alternatively, mean-field game theory is employed to search for an approximate Nash equilibrium, which bridges the “centralized” LQG problem to the limiting LQG control problem as the number of agents tends to infinity. As in [24], we need to construct an auxiliary control problem using the frozen state-average limit. Based on it, we can find the decentralized strategies via the consistency condition.

Let us introduce the following auxiliary problem for $\mathcal{A}_{i}:$

[TABLE]

with the limiting cost functional given by

[TABLE]

where $\phi^{i},$ $i=1,2$ are the average limits of the realized states, which will be determined by the consistency condition (CC) in our later analysis (see (10)). Note that the auxiliary state $\left(x_{t}^{i,\diamond},y_{t}^{i,\diamond},z_{t}^{i,\diamond}\right)$ is different from the true state $\left(x^{i},y^{i},z^{i}\right)$ . Also, the admissible control $u^{i}$ in (3), (4) belongs to $\mathcal{U}_{ad}^{d,i}$ , whereas in (1), (2) the admissible control belongs to $\mathcal{U}_{ad}^{c}$ (for the sake of simplicity, we still denote them with the same notation).

Now we formulate the following limiting stochastic optimal control problem with control constraint (LCC).

Problem (LCC). For the $i^{th}$ agent, $i=1,2,\cdots,N,$ find $u^{i,\ast}(\cdot)\in\mathcal{U}_{ad}^{d,i}$ satisfying

[TABLE]

Then $u^{i,\ast}(\cdot)$ is called a decentralized optimal control for Problem (LCC). Now we apply the well known maximum principle (Theorem 3.3 in [43]) to characterize $u^{i,\ast}$ with the optimal state ${x}^{i,\ast}.$ To this end, let us introduce the following adjoint process

[TABLE]

The Hamiltonian function can be expressed by

[TABLE]

Since $U$ is a closed convex set, the maximum principle takes the following local form

[TABLE]

Hereafter, the time argument is suppressed when no confusion occurs. Noticing (5), we see that (6) yields

[TABLE]

or equivalently (noticing $R>0$ ),

[TABLE]

for all $u\in U,$ a.e. $t\in\left[0,T\right],$ $P$ -a.s. As $R>0$ , we take the following norm on $U\subset\mathbb{R}^{m}$ (which is equivalent to its Euclidean norm) $\|x\|_{R}^{2}=\left\langle\left\langle x,x\right\rangle\right\rangle:=\left\langle R^{\frac{1}{2}}x,R^{\frac{1}{2}}x\right\rangle,$ and by the well-known results of convex analysis, we obtain that (7) is equivalent to

[TABLE]

where $\mathbf{P}_{U}(\cdot)$ is the projection mapping from $\mathbb{R}^{m}$ onto its closed convex subset $U$ under the norm $\|\cdot\|_{R}$ . For more details, see Appendix B. Hereafter, denote

[TABLE]

Here, for simplicity, the dependence of $\varphi$ on time variable $t$ is suppressed. The related Hamiltonian system becomes

[TABLE]
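The projection entering $\varphi$ is taken under the weighted norm $\|\cdot\|_{R}$ and therefore differs, in general, from the Euclidean projection when $R$ is not diagonal. The following sketch illustrates this for the hypothetical case $m=2$ and $U=\mathbb{R}_{+}^{2}$ : it minimizes $\langle R(u-v),u-v\rangle$ over $u\in U$ by enumerating the four possible active sets. The function name and the test data are assumptions made for illustration only; the sketch does not evaluate the specific argument of $\mathbf{P}_{U}$ appearing in $\varphi$ .

```python
def project_orthant_weighted(v, R):
    """Projection of v in R^2 onto U = R_+^2 under ||x||_R^2 = <R x, x>.

    R is a symmetric positive definite 2x2 matrix given as nested lists.
    The minimiser is found by enumerating the four possible active sets
    and keeping the feasible candidate with the smallest weighted distance."""
    (r11, r12), (_, r22) = R
    v1, v2 = v

    def cost(u):
        d1, d2 = u[0] - v1, u[1] - v2
        return r11 * d1 * d1 + 2.0 * r12 * d1 * d2 + r22 * d2 * d2

    candidates = [
        (v1, v2),                      # no constraint active
        (0.0, v2 + r12 * v1 / r22),    # u1 = 0, stationarity in u2
        (v1 + r12 * v2 / r11, 0.0),    # u2 = 0, stationarity in u1
        (0.0, 0.0),                    # both constraints active
    ]
    feasible = [u for u in candidates if u[0] >= 0.0 and u[1] >= 0.0]
    return min(feasible, key=cost)


print(project_orthant_weighted((1.0, -1.0), [[1.0, 0.0], [0.0, 1.0]]))  # (1.0, 0.0)
print(project_orthant_weighted((1.0, -1.0), [[2.0, 1.0], [1.0, 2.0]]))  # (0.5, 0.0)
```

With the identity weight the result is the componentwise positive part, whereas the off-diagonal weight moves the projection of the same point to $(0.5,0)$ .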

After the above preparations, it follows that

[TABLE]

Here, the first equality in (8) and (9) is due to the consistency condition: the frozen terms $\phi^{1}$ and $\phi^{2}$ should equal the average limits of all realized states $\left(x^{i,\ast},y^{i,\ast}\right);$ the second equality is due to the law of large numbers. Thus, by replacing $\left(\phi^{1},\phi^{2}\right)$ by $\left(\mathbb{E}x^{i,\ast},\mathbb{E}y^{i,\ast}\right)$ in the above Hamiltonian system, we get the following system

[TABLE]

Clearly, all agents are statistically identical; therefore we can suppress the superscript “ $i$ ”, and the following consistency condition system arises for a generic agent:

[TABLE]

Here, $W$ stands for a generic Brownian motion on $(\Omega,\mathcal{F},P),$ and $\mathbb{F}^{W}$ denotes the natural filtration generated by it and augmented by all null sets. $L_{\mathbb{F}^{W}}^{2},L_{\mathbb{F}^{W}}^{2,\mathcal{E}_{0}}$ are defined in the same way as $L_{\mathbb{F}}^{2},L_{\mathbb{F}}^{2,\mathcal{E}_{0}}$ before. The system (10) is a nonlinear mean-field forward-backward SDE (MF-FBSDE) with projection operator. It characterizes the state-average limit $\phi^{1}=\mathbb{E}{x},\phi^{2}=\mathbb{E}{y}$ and the MFG strategies $\bar{u}_{i}=\varphi(p,q,k)$ for a generic agent in a combined manner, which is totally different from [24, 25]. We need to prove that the above consistency condition system admits a unique solution. We have the following existence and uniqueness result.

Remark 1.

It is necessary to point out that a term $y_{0}^{i,\ast}-y_{0}^{(N),\ast}$ should be included in $(\ref{cost1})$ . But we claim that, after taking expectation, it disappears. Indeed, according to (7), one has $\lim_{N\rightarrow+\infty}y_{0}^{(N),\ast}=\lim_{N\rightarrow+\infty}\frac{1}{N}\sum_{i=1}^{N}y_{0}^{i,\ast}=\mathbb{E}y_{0}^{i,\ast}=y_{0}^{i,\ast}$ . The first equality is just the definition of $y_{0}^{(N),\ast},$ the second one follows from the law of large numbers, whilst the last one is due to the fact that $y_{0}^{i,\ast}$ is an $\mathbb{F}_{0}^{i}$ -measurable random vector and is therefore deterministic. In contrast to Huang et al. [24, 25], our framework involves the state $\left(x^{i},y^{i}\right).$

Theorem 2.

Assume that (A1) and (A2) are in force. There exists a unique adapted solution $(x,y,z,p,q,k)\in L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n})\times L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n})\times L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n})\times L_{\mathbb{F}^{W}}^{2,\mathcal{E}_{0}}(0,T;\mathbb{R}^{n})\times L_{\mathbb{F}^{W}}^{2,\mathcal{E}_{0}}(0,T;\mathbb{R}^{n})\times L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n})$ to system (10).

For simplicity, we put the proof of Theorem 2 in Appendix A.

3 Main result

In the above sections, we characterized the decentralized strategies $\{\bar{u}_{t}^{i},1\leq i\leq N\}$ of Problem (CC) through the auxiliary Problem (LCC) and the consistency condition system. For the sake of presentation, we change the notation of the consistency condition system to $(\alpha^{i},\beta^{i},\gamma^{i},\theta^{i},\kappa^{i},\gamma^{i})$ :

[TABLE]

Now, we are in a position to verify the $\epsilon$ -Nash equilibrium property of these strategies. To this end, let us first present the definition of $\epsilon$ -Nash equilibrium.

Definition 3.

A set of strategies $\bar{u}_{t}^{i}\in\mathcal{U}^{c}_{ad}$ , $1\leq i\leq N$ , for $N$ agents is called an $\epsilon$ -Nash equilibrium with respect to the costs $\mathcal{J}^{i},\ 1\leq i\leq N,$ if there exists $\epsilon=\epsilon(N)\geq 0$ with $\displaystyle\lim_{N\rightarrow+\infty}\epsilon(N)=0$ such that for any $1\leq i\leq N$ , we have

[TABLE]

when any alternative strategy $u^{i}\in\mathcal{U}^{c}_{ad}$ is applied by $\mathcal{A}_{i}$ .

Remark 4.

If $\epsilon=0$ , then Definition 3 reduces to the usual exact Nash equilibrium.
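The definition can be made tangible on a finite-dimensional toy game. In the sketch below, which is purely illustrative and unrelated to the cost functionals $\mathcal{J}_{i}$ of this paper, each agent pays $(u_{i}-1)^{2}+(\frac{1}{N}\sum_{j}u_{j})^{2}$ with $u_{i}\in[0,+\infty)$ . Freezing the average leads every agent to play $u_{i}=1$ ; since a single agent still moves the average by $1/N$ , this profile is not an exact Nash equilibrium, and a direct computation shows that the best unilateral gain equals $1/(N^{2}+1)$ . The function `nash_gap` (a hypothetical helper) measures the largest gain over a finite grid of deviations, which serves as a stand-in for the full admissible set.

```python
def nash_gap(cost, profile, deviations):
    """Largest cost reduction any single agent obtains by deviating unilaterally.

    cost(i, profile) is the cost of agent i under the strategy profile.
    A profile is an epsilon-Nash equilibrium (w.r.t. the tested deviations)
    exactly when the returned gap is at most epsilon."""
    gap = 0.0
    for i in range(len(profile)):
        base = cost(i, profile)
        for alternative in deviations:
            trial = list(profile)
            trial[i] = alternative
            gap = max(gap, base - cost(i, trial))
    return gap


def toy_cost(i, profile):
    """Hypothetical cost: own tracking error plus squared population average."""
    average = sum(profile) / len(profile)
    return (profile[i] - 1.0) ** 2 + average ** 2


grid = [k / 100.0 for k in range(201)]  # deviations in [0, 2]
for n in (10, 100):
    print(n, nash_gap(toy_cost, [1.0] * n, grid), 1.0 / (n * n + 1))
```

The measured gap stays below $1/(N^{2}+1)$ and shrinks as $N$ grows, which is the qualitative behaviour required by Definition 3.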

Now, we give the main result of this paper; its proof will be carried out step by step.

Theorem 5.

Assume that (A1) and (A2) are in force. Then, $(\bar{u}_{1},\bar{u}_{2},\cdots,\bar{u}_{N})$ is an $\epsilon$ -Nash equilibrium of Problem (CC).

In order to prove Theorem 5, we need several lemmas, which are presented below. For agent $\mathcal{A}_{i},$ recall that its decentralized open-loop optimal strategy is $\bar{u}_{i}=\varphi\left(\chi^{i},\beta^{i},\gamma^{i}\right)$ . The decentralized state $\left(\breve{x}_{t}^{i},\breve{y}^{i},\breve{z}^{i}\right)$ is

[TABLE]

where $\breve{x}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}\breve{x}^{i}$ and $\breve{y}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}\breve{y}^{i}$ . Note that $(\alpha^{i},\beta^{i},\gamma^{i},\theta^{i},\kappa^{i},\gamma^{i})$ satisfies (11).

For each $1\leq i\leq N$ , the monotonic fully coupled FBSDE (11) has a unique solution $(\alpha^{i},\beta^{i},\gamma^{i})\in L_{\mathbb{F}^{i}}^{2}(0,T;\mathbb{R}^{n})\times L_{\mathbb{F}^{i}}^{2}(0,T;\mathbb{R}^{n})\times L_{\mathbb{F}^{i}}^{2}(0,T;\mathbb{R}^{n})$ . Thus, the system consisting of the first equations of (13), $1\leq i\leq N$ , also has a unique solution $\left((\breve{x}^{i})_{i},(\breve{y}^{i})_{i},(\breve{\kappa}^{i})_{i}\right)\in(L_{\mathbb{F}}^{2}(0,T;\mathbb{R}^{n}))^{\otimes N}\times(L_{\mathbb{F}}^{2}(0,T;\mathbb{R}^{n}))^{\otimes N}\times(L_{\mathbb{F}}^{2}(0,T;\mathbb{R}^{n}))^{\otimes N}$ , where $\otimes N$ denotes the $N$ -fold Cartesian product. Moreover, since $\{W_{i}\}_{i=1}^{N}$ is an $N$ -dimensional Brownian motion whose components are independent and identically distributed, the processes $(\alpha^{i},\beta^{i},\gamma^{i},\theta^{i},\kappa^{i},\gamma^{i}),1\leq i\leq N$ are independent and identically distributed.

Lemma 6.

If (A1) and (A2) hold, then

[TABLE]

The proof of Lemma 6 is classical by virtue of B-D-G inequality and Schwarz inequality, so we omit it.

Lemma 7.

If (A1) and (A2) hold, then

[TABLE]

Proof.

Adding up both sides of the first and second equations of (13) over all $1\leq i\leq N$ and multiplying by $\frac{1}{N}$ , we obtain (recall that $\breve{x}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}\breve{x}^{i}$ , $\breve{y}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}\breve{y}^{i}$ and $\breve{z}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}\breve{z}^{i}$ )

[TABLE]

On the other hand, by taking the expectation on both sides of the second equation of (13), it follows from Fubini’s theorem that $\mathbb{E}\alpha^{i}$ satisfies the following equation:

[TABLE]

From (16) and (17), by denoting

[TABLE]

we have

[TABLE]

and the inequality $(x+y)^{2}\leq 2x^{2}+2y^{2}$ yields that, for any $t\in[0,T]$ ,

[TABLE]

From the well-known Cauchy-Schwarz inequality and the B-D-G inequality, we obtain that there exists a constant $C_{0}$ independent of $N$ (which may vary line by line) such that

[TABLE]

Since $(\chi^{i},\beta^{i},\gamma^{i}),1\leq i\leq N$ are independent and identically distributed, for each fixed $s\in[0,T]$ , denoting $\mu(s)=\mathbb{E}\varphi(\chi^{i},\beta^{i},\gamma^{i})$ (note that $\mu$ does not depend on $i$ ), we have

[TABLE]

Since $(\chi^{i},\beta^{i},\gamma^{i}),1\leq i\leq N$ are independent, we have

[TABLE]

Then, due to the fact that $(\chi^{i},\beta^{i},\gamma^{i}),1\leq i\leq N$ are identically distributed, there exists a constant $C_{0}$ independent of $N$ such that

[TABLE]

where the last equality comes from the fact that $\varphi(\chi^{i},\beta^{i},\gamma^{i})\in L_{\mathbb{F}^{i}}^{2}(0,T;U)$ .

We now treat the second term of (20), using the fact that $(\chi^{i},\beta^{i},\gamma^{i})$ are identically distributed, as follows:
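The mechanism behind this estimate is the elementary fact that, for independent and identically distributed square integrable $\xi_{i}$ with mean $\mu$ , the cross terms vanish and $\mathbb{E}|\frac{1}{N}\sum_{i=1}^{N}\xi_{i}-\mu|^{2}=\frac{1}{N}\mathbb{E}|\xi_{1}-\mu|^{2}$ . The Monte Carlo sketch below checks this $O(1/N)$ rate for the illustrative choice $\xi_{i}=\max(0,Z_{i})$ with $Z_{i}$ standard normal, i.e. a Gaussian sample projected onto $[0,+\infty)$ ; this distribution, the sample sizes and the function name are assumptions made for the example and are not taken from the paper.

```python
import math
import random


def mean_square_error_of_average(n_agents, n_trials, seed=0):
    """Monte Carlo estimate of E|(1/N) sum_i xi_i - mu|^2 for i.i.d. xi_i.

    Here xi_i = max(0, Z_i) with Z_i standard normal (illustrative choice),
    so that mu = E max(0, Z) = 1 / sqrt(2 pi)."""
    rng = random.Random(seed)
    mu = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for _ in range(n_trials):
        average = sum(max(0.0, rng.gauss(0.0, 1.0)) for _ in range(n_agents)) / n_agents
        total += (average - mu) ** 2
    return total / n_trials


variance = 0.5 - 1.0 / (2.0 * math.pi)  # Var max(0, Z)
for n in (10, 100, 1000):
    # N times the mean square error should stay close to the variance.
    print(n, n * mean_square_error_of_average(n, n_trials=500), variance)
```

Multiplying the estimated error by $N$ gives values that fluctuate around the variance of $\xi_{1}$ , consistent with the $1/N$ decay used in the proof.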

[TABLE]

Moreover, we obtain from (20) that

[TABLE]

Consequently, by virtue of Gronwall’s inequality, we get the first estimate (18).

We now handle the estimates (19). Applying Itô’s formula again, we have

[TABLE]

Using B-D-G inequalities, we show that there exists a constant $C_{1},$ modifying $C_{1}$ if necessary,

[TABLE]

Employing the classical Cauchy-Schwarz inequality and Gronwall’s inequality with estimation (14), we get (15). $\Box$

Lemma 8.

Assume that (A1) and (A2) are in force. Then, we have

[TABLE]

Proof.

From (13) and (11), we have that

[TABLE]

where $(\beta^{i},\chi^{i},\gamma^{i})$ is the unique solution to the following FBSDEs:

[TABLE]

From (23), we have

[TABLE]

The classical estimate for the SDE yields that

[TABLE]

where $C_{0}$ is a constant independent of $N$ . Noticing (14) of Lemma 7, we obtain (21). We consider

[TABLE]

By classical estimation for BSDE, we have

[TABLE]

where $C_{0}$ is a constant independent of $N$ . By Gronwall’s inequality, we get the desired result. $\Box$

Lemma 9.

For all $1\leq i\leq N$ , we have

[TABLE]

Proof.

From the definition of (2), (4) and (13), we have

[TABLE]

and

[TABLE]

then

[TABLE]

We will use the following

[TABLE]

and Lemma 7, Lemma 8 as well as $\mathbb{E}\sup_{0\leq t\leq T}\left|\alpha^{i}(t)\right|^{2}\leq C_{0}$ , for some constant $C_{0}$ independent of $N$ which may vary line by line in the following, we have

[TABLE]

With similar argument, using (15) and (22), one can show that

[TABLE]

The proof is completed by noticing (24). $\Box$

We will prove the control strategies set $(\bar{u}^{1},\bar{u}^{2},\ldots,\bar{u}^{N})$ is an $\epsilon$ -Nash equilibrium for Problem (CC). For any fixed $i$ , $1\leq i\leq N$ , we consider the perturbation control $u^{i}\in\mathcal{U}_{ad}^{d,i}$ and we have the following state dynamics ( $j\neq i$ ):

[TABLE]

where $\tilde{x}^{(N)}=\displaystyle\frac{1}{N}\sum_{i=1}^{N}\tilde{x}^{i},$ $\tilde{y}^{(N)}=\displaystyle\frac{1}{N}\sum_{i=1}^{N}\tilde{y}^{i}$ . The well-posedness of the above system is easy to obtain. To prove that $(\bar{u}^{1},\bar{u}^{2},\ldots,\bar{u}^{N})$ is an $\epsilon$ -Nash equilibrium, we can show that for $1\leq i\leq N$ ,

[TABLE]

Then we only need to consider the perturbation $u^{i}\in\mathcal{U}_{ad}^{d,i}$ such that $\mathcal{J}_{i}(u^{i},\bar{u}_{-i})\leq\mathcal{J}_{i}(\bar{u}^{i},\bar{u}_{-i})$ . Thus we have

[TABLE]

which implies that

[TABLE]

where $C_{0}$ is a constant independent of $N$ .

Now, for the $i^{th}$ agent, we consider the perturbation in the Problem (LCC). We introduce the following system of the decentralized limiting state with perturbation control ( $j\neq i$ ):

[TABLE]

We have the following results:

Lemma 10.

Let (A1) and (A2) hold, then

[TABLE]

Proof.

By (25), we get

[TABLE]

Let us denote

[TABLE]

and recall (17) which is

[TABLE]

we have

[TABLE]

By the Cauchy-Schwarz inequality as well as the B-D-G inequality, we obtain that there exists a constant $C_{0}$ independent of $N$ , which may vary line by line, such that, for any $t\in[0,T]$ ,

[TABLE]

On the one hand, by denoting $\mu(s):=\mathbb{E}\varphi(\chi^{j},\beta^{j},\gamma^{j})$ (note that since $(\chi^{j},\beta^{j},\gamma^{j})$ , $1\leq j\leq N$ , $j\neq i$ , are independent and identically distributed, $\mu$ is independent of $j$ ), we have

[TABLE]

Then, due to the fact that $(\chi^{i},\beta^{i},\gamma^{i}),1\leq i\leq N$ are identically distributed and $\varphi(\chi^{i},\beta^{i},\gamma^{i})\in L_{\mathbb{F}^{i}}^{2}(0,T;U)$ , similarly to Lemma 7 we can obtain that there exists a constant $C_{0}$ independent of $N$ such that

[TABLE]

In addition, due to (26), we get

[TABLE]

and similarly, since $(\chi^{j},\beta^{j},\gamma^{j})$ , $1\leq j\leq N$ , $j\neq i$ , are identically distributed, we have

[TABLE]

Therefore, from above estimates, we get from (31) that, for any $t\in[0,T]$ ,

[TABLE]

Finally, by using Gronwall’s inequality, we get (28). We now proceed to the second inequality. Applying Itô’s formula again, we have

[TABLE]

Still using B-D-G inequalities, we show that there exists a constant $C_{2},$ modifying $C_{2}$ if necessary,

[TABLE]

By employing the classical Cauchy-Schwarz inequality and Gronwall’s inequality, we obtain (29). $\Box$

Lemma 11.

[TABLE]

Proof.

From respectively the first equation of (25) and (27), we obtain

[TABLE]

With the help of classical estimates for SDEs and BSDEs, Gronwall’s inequality, and (28) and (29) of Lemma 10, it is easy to obtain (32) and (33). The proof is completed. $\Box$

Lemma 12.

For all $1\leq i\leq N,$ for the perturbation control $u^{i}$ , we have

[TABLE]

Proof.

Recalling (2), (4), (8), and (9), we have

[TABLE]

Using Lemma 10 and Lemma 11 as well as $\mathbb{E}\sup_{0\leq t\leq T}\left(|\bar{y}^{i}(t)|^{2}+|\alpha^{i}(t)|^{2}\right)\leq C_{0}$ , for some constant $C_{0}$ independent of $N$ which may vary line by line in the following, we have

[TABLE]

With similar argument, we can show that

[TABLE]

Hence, we get the desired result. $\Box$

Proof of Theorem 5: Now, we consider the $\epsilon$ -Nash equilibrium for $\mathcal{A}_{i}$ for Problem (CC). Combining Lemma 9 and Lemma 12, we have

[TABLE]

Consequently, Theorem 5 holds with $\epsilon=O\Big{(}\frac{1}{\sqrt{N}}\Big{)}$ . $\Box$

Appendix A Proof of theorem

Proof of Theorem 2.

(Uniqueness) Suppose that there exist two solutions $(x^{1},y^{1},z^{1},p^{1},q^{1},k^{1})$ and $(x^{2},y^{2},z^{2},p^{2},q^{2},k^{2}),$ and denote

[TABLE]

Then, we have

[TABLE]

with

[TABLE]

Taking the expectation in the second equation of (34) yields $\mathbb{E}\hat{p}=0$ . Applying Itô’s formula to $\big{<}\hat{q},\hat{x}\big{>}-\big{<}\hat{p},\hat{y}\big{>}$ and taking expectations on both sides (also noting that $\mathbb{E}\hat{p}=0,$ which implies $\mathbb{E}\hat{q}=0,$ and the monotonicity property of $\widehat{\varphi}$ ), we arrive at

[TABLE]

Thus, $G\big{(}\hat{x}_{T}-\mathbb{E}\hat{x}_{T}\big{)}=0$ , $Q\big{(}\hat{x}-\mathbb{E}\hat{x}\big{)}=0,$ and $L\left(\hat{y}_{s}-\mathbb{E}\hat{y}_{s}\right)=0,$ which, by the existence and uniqueness results of classical BSDE theory, implies $\hat{p}_{s}\equiv 0,$ $\hat{q}_{s}\equiv 0.$ Next, we have $\widehat{\varphi}(\hat{p},\hat{q},\hat{k})\equiv 0,$ which further implies $\mathbb{E}\hat{x}_{s}\equiv 0,$ hence $\hat{x}_{s}\equiv 0.$ Moreover, $\hat{y}_{T}=0$ yields, by Theorem 3.1 in [6], $\hat{y}=0.$ Hence the uniqueness follows. $\Box$

In order to prove the existence for FBSDE (10), we need the following result. It involves a priori estimates of solutions of the following family of mean-field FBSDEs parameterized by $\alpha\in[0,1].$

Before that, we denote

[TABLE]

Consider the following family of FBSDEs with parameter $\alpha\in\mathbb{R},$

[TABLE]

where $\left(b_{0},\sigma_{0},\gamma_{0},\lambda_{0},\mu_{0},\psi_{0}\right)\in L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}),$ and $\upsilon$ ( $\xi$ ) is an $\mathbb{R}$ -valued ( $\mathbb{R}^{n}$ -valued) square integrable random variable which is $\mathbb{F}_{T}^{W}$ -measurable. Note the coefficient $\Xi\triangleq Up-L\left(y-\mathbb{E}y\right).$ It is easy to check that $\mathbb{E}p^{\alpha}=0;$ then, by uniqueness for BSDEs, $\mathbb{E}q^{\alpha}=0.$ Specifically, letting $\alpha=0,$ one immediately has

[TABLE]

Obviously, (36) is a decoupled FBSDE whose solvability is trivial.

Lemma 13.

Assume that (A1) and (A2) are in force. Then there exists a positive constant $\delta_{0}\in[0,1]$ such that if, a priori, for some $\alpha_{0}\in[0,1)$ , for each $x_{0}\in\mathbb{R}^{n},$ $\left(b_{0},\sigma_{0},\gamma_{0},\lambda_{0},\mu_{0},\psi_{0}\right)\in L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}),$ the mean-field FBSDE (35) has a unique adapted solution in $L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n})$ , then for each $\alpha\in[\alpha_{0},\alpha_{0}+\delta_{0}]$ , for each $x_{0}\in\mathbb{R}^{n},$ $\left(b_{0},\sigma_{0},\gamma_{0},\lambda_{0},\mu_{0},\psi_{0}\right)\in L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}),$ Eq. (35) also has a unique solution in $L_{\mathbb{F}^{W}}^{2}(0,T;\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n})$ .

Proof.

Define

[TABLE]

We set $\left(x^{0},\mathbb{E}x^{0},y^{0},\mathbb{E}y^{0},z^{0},p^{0},q^{0},k^{0}\right)=0,$ and solve iteratively the following equations:

[TABLE]

We set

[TABLE]

with

[TABLE]

Now introduce a map $I_{\alpha_{0}}:\left(x^{i},y^{i},z^{i},p^{i},q^{i},k^{i}\right)\rightarrow\left(x^{i+1},y^{i+1},z^{i+1},p^{i+1},q^{i+1},k^{i+1}\right)\in\mathcal{M}\left(0,T\right)$ by the following mean-field FBSDEs:

[TABLE]

Applying Itô’s formula to $\left\langle\hat{x}^{i+1},\hat{q}^{i+1}\right\rangle-\left\langle\hat{y}^{i+1},\hat{p}^{i+1}\right\rangle$ on $\left[0,T\right],$ we have

[TABLE]

After simple computation, we have

[TABLE]

By using the monotonicity property of $\varphi\left(p,q,k\right)$ (Proposition 17 below), the classical geometric inequality, and the Lipschitz property of the projection operator (Proposition 16), it follows that

[TABLE]

Next, by basic techniques for SDEs and BSDEs, we have

[TABLE]

and

[TABLE]

Moreover,

[TABLE]

Observe that inequality (40) does not contain $\hat{x}^{i+1}$ and $\hat{y}^{i+1}.$ Combining (39)-(43), by similar method used in [20], we have, for some $\delta_{0}\in\left(0,1\right),$

[TABLE]

which means that the map $I_{\alpha_{0}+\delta_{0}}$ is a contraction. $\Box$

(Existence) We can solve Eq. (35) successively for the case $\alpha\in\left[0,\delta_{0}\right],$ $\left[\delta_{0},2\delta_{0}\right],\cdots$ When $\alpha=1,$ we deduce immediately that the solution to Eq. (10) exists. $\Box$

Appendix B Properties of projection

We recall the following properties of projection $\mathbf{P}_{U}$ onto a closed convex set $U$ , see [9], Chapter 5.

Theorem 14.

For a nonempty closed convex set $U\subset\mathbb{R}^{m}$ , for every $x\in\mathbb{R}^{m}$ , there exists a unique $x^{\ast}\in U$ , such that

[TABLE]

Moreover, $x^{\ast}$ is characterized by the property

[TABLE]

The above element $x^{\ast}$ is called the projection of $x$ onto $U$ and is denoted by $\mathbf{P}_{U}[x]$ .

One can immediately obtain the following

Proposition 15.

Let $U\subset\mathbb{R}^{m}$ be a nonempty closed convex set, then we have

[TABLE]

Proposition 16.

Let $U\subset\mathbb{R}^{m}$ be a nonempty closed convex set, then the projection $\mathbf{P}_{U}$ does not increase the distance, i.e.

[TABLE]

Now let us consider $\mathbb{R}^{m}$ and the projection $\mathbf{P}_{U}$ both with the norm $\|\cdot\|_{R_{0}}^{2}:=\langle R_{0}^{\frac{1}{2}}\cdot,R_{0}^{\frac{1}{2}}\cdot\rangle$ . From (45), we have

Proposition 17.

Let $U\subset\mathbb{R}^{m}$ be a nonempty closed convex set, then

[TABLE]

The proofs of Proposition 15-Proposition 17 can be found in [7, 9].
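The properties of the projection used in the proofs can be checked numerically on a simple example. The sketch below uses the standard forms of three of them for the Euclidean norm: the variational characterization $\langle x-\mathbf{P}_{U}[x],u-\mathbf{P}_{U}[x]\rangle\leq 0$ for all $u\in U$ , the non-expansiveness $|\mathbf{P}_{U}[x]-\mathbf{P}_{U}[y]|\leq|x-y|$ , and the monotonicity $\langle\mathbf{P}_{U}[x]-\mathbf{P}_{U}[y],x-y\rangle\geq 0$ . The set $U=[0,1]^{3}$ , the sampling ranges and the helper names are assumptions made for illustration.

```python
import random


def project_box(x, lower, upper):
    """Euclidean projection onto the box [lower, upper]^m (closed and convex)."""
    return [min(max(xi, lower), upper) for xi in x]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sub(a, b):
    return [ai - bi for ai, bi in zip(a, b)]


def check_projection_properties(n_samples=1000, dim=3, seed=0):
    """Verify three standard projection properties on random points."""
    rng = random.Random(seed)

    def draw():
        return [rng.uniform(-3.0, 3.0) for _ in range(dim)]

    for _ in range(n_samples):
        x, y = draw(), draw()
        u = project_box(draw(), 0.0, 1.0)  # an arbitrary point of U
        px, py = project_box(x, 0.0, 1.0), project_box(y, 0.0, 1.0)
        # Variational characterisation: <x - P(x), u - P(x)> <= 0 for u in U.
        assert dot(sub(x, px), sub(u, px)) <= 1e-12
        # Non-expansiveness: |P(x) - P(y)|^2 <= |x - y|^2.
        d = sub(px, py)
        assert dot(d, d) <= dot(sub(x, y), sub(x, y)) + 1e-12
        # Monotonicity: <P(x) - P(y), x - y> >= 0.
        assert dot(d, sub(x, y)) >= -1e-12
    return True


print(check_projection_properties())
```

The same three inequalities hold for the weighted norm $\|\cdot\|_{R_{0}}$ once the inner product is replaced by $\langle R_{0}\cdot,\cdot\rangle$ and the projection is taken with respect to that norm.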

Conflict of Interest: The authors declare that they have no conflict of interest.

Acknowledgements. The authors wish to thank the editors and two referees for their valuable comments and constructive suggestions which improved the presentation of this manuscript.

Bibliography

[1] D. Andersson and B. Djehiche, A maximum principle for SDEs of mean-field type, Appl. Math. Optim., vol. 63, 341–356, 2011.
[2] J. Bismut, An introductory approach to duality in optimal stochastic control, SIAM Rev., vol. 20, 62–78, 1978.
[3] M. Bardi, Explicit solutions of some linear-quadratic mean field games, Netw. Heterog. Media, vol. 7, 243–261, 2012.
[4] R. Buckdahn, B. Djehiche, and J. Li, A general stochastic maximum principle for SDEs of mean-field type, Appl. Math. Optim., vol. 64, 197–216, 2011.
[5] R. Buckdahn, B. Djehiche, J. Li, and S. Peng, Mean-field backward stochastic differential equations. A limit approach, Ann. Probab., vol. 37, no. 4, 1524–1565, 2009.
[6] R. Buckdahn, J. Li, and S. Peng, Mean-field backward stochastic differential equations and related partial differential equations, Stoch. Process. Appl., vol. 119, no. 10, 3133–3154, 2009.
[7] A.V. Balakrishnan, Applied Functional Analysis, Springer-Verlag, New York, 1976.
[8] A. Bensoussan, J. Frehse, and S. Yam, Mean Field Games and Mean Field Type Control Theory, Springer, New York, 2013.
