Maximum principle for stochastic optimal control problem of forward-backward stochastic difference systems

Shaolin Ji; Haodong Liu

arXiv:1812.11283·math.OC·January 1, 2019·Int. J. Control

TL;DR

This paper establishes a maximum principle for stochastic optimal control problems involving forward-backward stochastic difference systems, covering both partially and fully coupled equations, with a focus on convex control domains.

Contribution

It introduces a novel maximum principle for FBS$\Delta$Ss, including new adjoint difference equations and applicable to both partially and fully coupled systems.

Findings

01

Derived the adjoint difference equation using a product rule representation.

02

Established the maximum principle for convex control domains.

03

Applied the framework to both partially and fully coupled FBS$\Delta$Ss.

Abstract

In this paper, we study the maximum principle for stochastic optimal control problems of forward-backward stochastic difference systems (FBS$\Delta$Ss). Two types of FBS$\Delta$Ss are investigated. The first one is described by a partially coupled forward-backward stochastic difference equation (FBS$\Delta$E) and the second one is described by a fully coupled FBS$\Delta$E. By adopting an appropriate representation of the product rule and an appropriate formulation of the backward stochastic difference equation (BS$\Delta$E), we deduce the adjoint difference equation. Finally, the maximum principle for this optimal control problem with the control domain being convex is established.

Equations

\left\{\begin{array}[c]{rcl}\Delta X_{t}&=&b\left(t,X_{t},u_{t}\right)+\sum_{i=1}^{d}\sigma_{i}\left(t,X_{t},u_{t}\right)\Delta W_{t}^{i},\\ X_{0}&=&x_{0},\\ \Delta Y_{t}&=&-f\left(t+1,X_{t+1},Y_{t+1},Z_{t+1},u_{t+1}\right)+Z_{t}\Delta W_{t}+\Delta N_{t},\\ Y_{T}&=&y_{T},\end{array}\right.

J\left(u\left(\cdot\right)\right)=\mathbb{E}\left[\sum_{t=0}^{T-1}l\left(t,X_{t},Y_{t},Z_{t},u_{t}\right)+h\left(X_{T}\right)\right].

\left\{\begin{array}[c]{rcl}\Delta X_{t}&=&b\left(t,X_{t},Y_{t},Z_{t},u_{t}\right)+\sum_{i=1}^{d}\sigma_{i}\left(t,X_{t},Y_{t},Z_{t},u_{t}\right)\Delta W_{t}^{i},\\ X_{0}&=&x_{0},\\ \Delta Y_{t}&=&-f\left(t+1,X_{t+1},Y_{t+1},Z_{t+1},u_{t+1}\right)+Z_{t}\Delta W_{t}+\Delta N_{t},\\ Y_{T}&=&y_{T},\end{array}\right.

J\left(u\left(\cdot\right)\right)=\mathbb{E}\left[\sum_{t=0}^{T-1}l\left(t,X_{t},Y_{t},Z_{t},u_{t}\right)+h\left(X_{T}\right)\right].

\Delta\left\langle X_{t},Y_{t}\right\rangle=\left\langle X_{t+1},\Delta Y_{t}\right\rangle+\left\langle\Delta X_{t},Y_{t}\right\rangle

\left\{\begin{array}[c]{rcl}\Delta Y_{t}&=&-f\left(t+1,Y_{t+1},Z_{t+1}\right)+Z_{t}\Delta W_{t}+\Delta N_{t},\\ Y_{T}&=&\eta,\end{array}\right.

\left|f\left(T,y_{1},z_{1}\right)-f\left(T,y_{2},z_{2}\right)\right|

\left|f\left(t,y_{1},z_{1}\right)-f\left(t,y_{2},z_{2}\right)\right|

\eta+f\left(T,\eta\right)-\mathbb{E}\left[\eta+f\left(T,\eta\right)|\mathcal{F}_{T-1}\right]=Z_{T-1}\Delta W_{T-1}+\Delta N_{T-1}.

\mathbb{E}\left[e_{i}^{\ast}\left(\eta+f\left(T,\eta\right)\right)\left(\Delta W_{T-1}\right)^{\ast}|\mathcal{F}_{T-1}\right]=e_{i}^{\ast}Z_{T-1}

Z_{T-1}=\mathbb{E}\left[\left(\eta+f\left(T,\eta\right)\right)\left(\Delta W_{T-1}\right)^{\ast}|\mathcal{F}_{T-1}\right]

\begin{array}[c]{ccl}\mathbb{E}\left[\left\|Z_{T-1}\right\|^{2}\right]&\leq&\mathbb{E}\left[\mathbb{E}\left[\left|\eta+f\left(T,\eta\right)\right|^{2}|\mathcal{F}_{T-1}\right]\mathbb{E}\left[\left|\left(\Delta W_{T-1}\right)\right|^{2}|\mathcal{F}_{T-1}\right]\right]<\infty.\end{array}

Z_{t}

Y_{t}

\begin{array}[c]{rcl}\mathbb{E}\left[e_{i}^{\ast}N_{t}\left(W_{t}\right)^{\ast}|\mathcal{F}_{t-1}\right]&=&e_{i}^{\ast}\sum_{s=0}^{t-2}\Delta N_{s}\mathbb{E}\left[\left(W_{t}\right)^{\ast}|\mathcal{F}_{t-1}\right]+\mathbb{E}\left[e_{i}^{\ast}\Delta N_{t-1}\left(W_{t-1}+\Delta W_{t-1}\right)^{\ast}|\mathcal{F}_{t-1}\right]\\ &=&e_{i}^{\ast}N_{t-1}\left(W_{t-1}\right)^{\ast},\end{array}

b\left(\omega,t,x,u\right)

\sigma_{i}\left(\omega,t,x,u\right)

f\left(\omega,t,x,y,z,u\right)

l\left(\omega,t,x,y,z,u\right)

h\left(\omega,x\right)

b\left(\omega,t,x,y,z,u\right)

\sigma_{i}\left(\omega,t,x,y,z,u\right)

f\left(\omega,t,x,y,z,u\right)

l\left(\omega,t,x,y,z,u\right)

h\left(\omega,x\right)

\lambda=\left(\begin{array}[c]{c}x\\ y\\ z\end{array}\right),A\left(t,\lambda;u\right)=\left(\begin{array}[c]{c}-f\\ b\\ \sigma\end{array}\right)\left(t,\lambda;u\right).

\left\langle A\left(t,\lambda_{1};u\right)-A\left(t,\lambda_{2};u\right),\lambda_{1}-\lambda_{2}\right\rangle\leq-\alpha\left|\lambda_{1}-\lambda_{2}\right|^{2},\quad P-a.s.,\quad\forall\lambda_{1},\lambda_{2}\in\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n};

\left\langle-f\left(T,x_{1},y,z,u\right)+f\left(T,x_{2},y,z,u\right),x_{1}-x_{2}\right\rangle\leq-\alpha\left|x_{1}-x_{2}\right|^{2},\quad P-a.s.;

\left\langle b\left(0,\lambda_{1};u\right)-b\left(0,\lambda_{2};u\right),y_{1}-y_{2}\right\rangle+\left\langle\sigma\left(0,\lambda_{1};u\right)-\sigma\left(0,\lambda_{2};u\right),z_{1}-z_{2}\right\rangle\leq-\alpha\left[\left|y_{1}-y_{2}\right|^{2}+\left\|z_{1}-z_{2}\right\|^{2}\right],

u_{t}^{\varepsilon}=\left(1-\delta_{ts}\right)\bar{u}_{t}+\delta_{ts}\left(\bar{u}_{s}+\varepsilon\Delta v\right)=\bar{u}_{t}+\delta_{ts}\varepsilon\Delta v,

\begin{array}[c]{rclrcl}\bar{\varphi}\left(t\right)&=&\varphi\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t},\bar{u}_{t}\right),&\varphi^{\varepsilon}\left(t\right)&=&\varphi\left(t,X_{t}^{\varepsilon},Y_{t}^{\varepsilon},Z_{t}^{\varepsilon},u_{t}^{\varepsilon}\right),\\ \widetilde{\varphi}^{\varepsilon}\left(t\right)&=&\varphi\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t},u_{t}^{\varepsilon}\right),&\varphi_{\mu}\left(t\right)&=&\varphi_{\mu}\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t},\bar{u}_{t}\right),\end{array}

\sup_{0\leq t\leq T}\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\leq C\varepsilon^{2}\mathbb{E}\left|\Delta v\right|^{2}.

X_{s+1}^{\varepsilon}-\bar{X}_{s+1}=b^{\varepsilon}\left(s\right)-\bar{b}\left(s\right)+\sum_{i=1}^{d}\left[\sigma_{i}^{\varepsilon}\left(s\right)-\bar{\sigma}_{i}\left(s\right)\right]\Delta W_{s}^{i}.

Taxonomy

Topics: Stochastic processes and financial applications · Climate Change Policy and Economics · Insurance, Mortality, Demography, Risk Management

Full text

Maximum principle for stochastic optimal control problem of forward-backward stochastic difference systems

Shaolin Ji Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, PR China. [email protected]. Research supported by NSF (No. 11571203).

Haodong Liu Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, PR China. (Corresponding author).

Abstract: In this paper, we study the maximum principle for stochastic optimal control problems of forward-backward stochastic difference systems (FBS $\Delta$ Ss). Two types of FBS $\Delta$ Ss are investigated. The first one is described by a partially coupled forward-backward stochastic difference equation (FBS $\Delta$ E) and the second one is described by a fully coupled FBS $\Delta$ E. By adopting an appropriate representation of the product rule and an appropriate formulation of the backward stochastic difference equation (BS $\Delta$ E), we deduce the adjoint difference equation. Finally, the maximum principle for this optimal control problem with the control domain being convex is established.

Keywords: backward stochastic difference equations; forward-backward stochastic difference equations; monotone condition; stochastic optimal control; maximum principle

1 Introduction

The Maximum Principle is one of the principal approaches to solving optimal control problems. A lot of work has been done on the Maximum Principle for forward stochastic systems; see, for example, Bensoussan [2], Bismut [4], Kushner [12], Peng [16]. Peng was also the first to study one kind of forward-backward stochastic control system (FBSCS) in [17], where he obtained the maximum principle for this kind of control system when the control domain is convex. FBSCSs have wide applications in many fields. For instance, the stochastic differential recursive utility, which is a generalization of the standard additive utility, can be regarded as the solution of a backward stochastic differential equation (BSDE), so the recursive utility optimization problem can be described as an optimization problem for an FBSCS (see [19]). Besides, in the dynamic principal-agent problem with unobservable states and actions, the principal’s problem can be formulated as a partial information optimal control problem of an FBSCS (see [22]). We refer to [8], [11], [13], [21], [24], [25] for other works on optimization problems for FBSCSs.

In this paper, we discuss the Maximum Principle for optimal control of discrete time systems described by forward-backward stochastic difference equations (FBS $\Delta$ Es). To the best of our knowledge, there are few results on such optimal control problems, although discrete time control systems are of great value in practice. For example, digital control can be formulated as a discrete time control problem in which the sampled data are obtained at discrete instants of time. Besides, the forward-backward stochastic difference system (FBS $\Delta$ S) can be used for modeling in financial markets. For example, the solution to the backward stochastic difference equation (BS $\Delta$ E) can be used to construct time-consistent nonlinear expectations (see [5], [6]) and for pricing in financial markets (see [3]). However, the formulation of the BS $\Delta$ E is quite different from that of its continuous time counterpart. Many works are devoted to the study of BS $\Delta$ Es (see, e.g., [3], [5], [6], [20]). Based on the driving process, there are mainly two types of formulations of BS $\Delta$ Es. One is driven by a finite state process which takes values in the set of basis vectors (as in [5]), and the other is driven by a martingale with independent increments (as in [3]). In the latter case, the solution of the BS $\Delta$ E is a triple of processes, which is due to the discrete time version of the Kunita–Watanabe decomposition. In this paper, we adopt the second type of formulation to investigate the optimization problems for FBS $\Delta$ Ss.

Let $\left(\Omega,\mathcal{F},\left\{\mathcal{F}_{t}\right\}_{0\leq t\leq T},P\right)$ be a probability space, and $W_{t}$ be a martingale process with independent increments. Define the difference operator $\Delta$ as $\Delta V_{t}=V_{t+1}-V_{t}$ . Here we consider two types of controlled FBS $\Delta$ Ss.

Problem 1 (partially coupled system):

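The controlled system is

$$\left\{\begin{array}{rcl}\Delta X_{t}&=&b\left(t,X_{t},u_{t}\right)+\sum_{i=1}^{d}\sigma_{i}\left(t,X_{t},u_{t}\right)\Delta W_{t}^{i},\\ X_{0}&=&x_{0},\\ \Delta Y_{t}&=&-f\left(t+1,X_{t+1},Y_{t+1},Z_{t+1},u_{t+1}\right)+Z_{t}\Delta W_{t}+\Delta N_{t},\\ Y_{T}&=&y_{T},\end{array}\right.\tag{1.1}$$

and the cost functional is

$$J\left(u\left(\cdot\right)\right)=\mathbb{E}\left[\sum_{t=0}^{T-1}l\left(t,X_{t},Y_{t},Z_{t},u_{t}\right)+h\left(X_{T}\right)\right].\tag{1.2}$$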

Problem 2 (fully coupled system):

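The controlled system is

$$\left\{\begin{array}{rcl}\Delta X_{t}&=&b\left(t,X_{t},Y_{t},Z_{t},u_{t}\right)+\sum_{i=1}^{d}\sigma_{i}\left(t,X_{t},Y_{t},Z_{t},u_{t}\right)\Delta W_{t}^{i},\\ X_{0}&=&x_{0},\\ \Delta Y_{t}&=&-f\left(t+1,X_{t+1},Y_{t+1},Z_{t+1},u_{t+1}\right)+Z_{t}\Delta W_{t}+\Delta N_{t},\\ Y_{T}&=&y_{T},\end{array}\right.\tag{1.3}$$

and the cost functional is

$$J\left(u\left(\cdot\right)\right)=\mathbb{E}\left[\sum_{t=0}^{T-1}l\left(t,X_{t},Y_{t},Z_{t},u_{t}\right)+h\left(X_{T}\right)\right].\tag{1.4}$$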

Let $\left\{U_{t}\right\}_{t\in\left\{0,1,...,T-1\right\}}$ be a sequence of nonempty convex subsets of $\mathbb{R}^{r}$ . We denote the set of admissible controls $\mathcal{U}$ by $\mathcal{U}=\left\{u\left(\cdot\right)\in\mathcal{M}^{2}\left(0,T-1;\mathbb{R}^{r}\right)|u\left(t\right)\in U_{t}\right\}.$ In Problem 1, $b$ and $\sigma$ do not depend on the solution $(Y,Z)$ of the backward equation. This kind of FBS $\Delta$ E is called a partially coupled FBS $\Delta$ E, while the system in Problem 2 is called a fully coupled FBS $\Delta$ E.

The optimal control problem is to find an admissible control $u\in\mathcal{U}$ which, together with the corresponding state trajectory, minimizes the cost functional $J\left(u\left(\cdot\right)\right)$ . In this paper, we assume the control domain is convex. By perturbing the optimal control at a fixed time point, we obtain the maximum principle for Problems 1 and 2.

To build the maximum principle, the key step is to find the adjoint variables which can be applied to deduce the variational inequality. In [14], the authors studied the maximum principle for a discrete time stochastic optimal control problem in which the state equation is governed only by a forward stochastic difference equation. By applying the Riesz representation theorem, they explicitly obtained the adjoint variables and established the maximum principle. To solve our problems, however, we need to construct the adjoint difference equations, since in general the adjoint variables cannot be obtained explicitly in our case. To construct the adjoint equations in our discrete time framework, the techniques adopted for the continuous time framework in [16, 17] are not applicable. In this paper, we propose two techniques to deduce the adjoint difference equations. The first one is that we choose the following product rule:

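$$\Delta\left\langle X_{t},Y_{t}\right\rangle=\left\langle X_{t+1},\Delta Y_{t}\right\rangle+\left\langle\Delta X_{t},Y_{t}\right\rangle,$$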

where $X_{t}$ (resp. $Y_{t}$ ) satisfies a forward (resp. backward) stochastic difference equation. The second one is that the BS $\Delta$ E should be formulated as in (2.1). In other words, the generator $f$ of the BS $\Delta$ E (2.1) is evaluated at time $t+1$ . It is worth pointing out that this is exactly the form taken by the adjoint equations for stochastic optimal control problems (see [14] for the classical case). Based on these two techniques, we can deduce the adjoint difference equations. The readers may refer to Remark 3.6 for more details.
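The product rule $\Delta\langle X_{t},Y_{t}\rangle=\langle X_{t+1},\Delta Y_{t}\rangle+\langle\Delta X_{t},Y_{t}\rangle$ is a pathwise algebraic identity, so it can be checked numerically on arbitrary sequences. The following is a minimal Python sketch of such a check; the helper names `inner`, `sub` and `product_rule_gap`, as well as the horizon and dimension, are illustrative choices and do not come from the paper.

```python
import random

def inner(a, b):
    """Euclidean inner product <a, b> of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Componentwise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def product_rule_gap(X, Y):
    """Largest |D<X_t,Y_t> - (<X_{t+1},DY_t> + <DX_t,Y_t>)| over t = 0..T-1,
    where D denotes the forward difference DV_t = V_{t+1} - V_t."""
    gap = 0.0
    for t in range(len(X) - 1):
        lhs = inner(X[t + 1], Y[t + 1]) - inner(X[t], Y[t])
        rhs = inner(X[t + 1], sub(Y[t + 1], Y[t])) + inner(sub(X[t + 1], X[t]), Y[t])
        gap = max(gap, abs(lhs - rhs))
    return gap

random.seed(0)
T, n = 5, 3  # arbitrary horizon and dimension, chosen only for illustration
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(T + 1)]
Y = [[random.gauss(0, 1) for _ in range(n)] for _ in range(T + 1)]
print(product_rule_gap(X, Y))  # zero up to floating-point rounding
```

Note that the first factor is taken at time $t+1$ and the second at time $t$, which is what makes the identity exact without a separate cross-variation term.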

The remainder of this paper is organized as follows. In section 2, two types of the controlled FBS $\Delta$ Ss are formulated. We deduce the maximum principle for the partially coupled controlled FBS $\Delta$ S in section 3. Finally, we establish the maximum principle for the fully coupled controlled FBS $\Delta$ S in section 4.

2 Preliminaries and model formulation

Let $T$ be a deterministic terminal time, and let $\mathcal{T}:=\left\{0,1,...,T\right\}$ . Consider a filtered probability space $\left(\Omega,\mathcal{F},\left\{\mathcal{F}_{t}\right\}_{0\leq t\leq T},P\right)$ , with $\mathcal{F}_{0}=\left\{\emptyset,\Omega\right\}$ and $\mathcal{F=F}_{T}$ . Here we define the difference operator $\Delta$ as $\Delta U_{t}=U_{t+1}-U_{t}$ . Let $W$ be a fixed $\mathbb{R}^{d}$ -valued square integrable martingale process with independent increments, i.e. $\mathbb{E}\left[\Delta W_{t}|\mathcal{F}_{t}\right]=\mathbb{E}\left[\Delta W_{t}\right]=0$ for any $t\in\left\{0,...,T-1\right\}$ . Also we suppose that $\mathbb{E}\left[\Delta W_{t}\left(\Delta W_{t}\right)^{\ast}\right]=I_{d}$ for any $t\in\left\{0,...,T-1\right\}$ . Here $\left(\cdot\right)^{\ast}$ denotes vector transposition. We assume that $\mathcal{F}_{t}$ is the completion of the $\sigma$ -algebra generated by the process $W$ up to time $t$ .

Denote by $L^{2}\left(\mathcal{F}_{t};\mathbb{R}^{n}\right)$ the set of all $\mathcal{F}_{t}-$ measurable square integrable random variables $X_{t}$ taking values in $\mathbb{R}^{n}$ and by $\mathcal{M}^{2}\left(0,t;\mathbb{R}^{n}\right)$ the set of all $\left\{\mathcal{F}_{s}\right\}_{0\leq s\leq t}$ -adapted square integrable processes $X$ taking values in $\mathbb{R}^{n}$ . Moreover, we define $e_{i}=\left(0,0,...,0,1,0,...,0\right)^{\ast}\in\mathbb{R}^{n}$ and mention that an inequality on a vector quantity is to hold componentwise.

Consider the following backward stochastic difference equation (BS $\Delta$ E):

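$$\left\{\begin{array}{rcl}\Delta Y_{t}&=&-f\left(t+1,Y_{t+1},Z_{t+1}\right)+Z_{t}\Delta W_{t}+\Delta N_{t},\\ Y_{T}&=&\eta,\end{array}\right.\tag{2.1}$$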

where $\eta\in L^{2}\left(\mathcal{F}_{T};\mathbb{R}^{n}\right)$ , $f:\Omega\times\left\{1,2,...,T\right\}\times\mathbb{R}^{n}\times\mathbb{R}^{n\times d}\longmapsto\mathbb{R}^{n}$ .

Assumption 2.1

A1. The function $f\left(t,y,z\right)$ is uniformly Lipschitz continuous and independent of $z$ at $t=T$ , i.e. there exist constants $c_{1},c_{2}>0$ , such that for any $t\in\left\{1,2,...,T-1\right\}$ , $y_{1},y_{2}\in\mathbb{R}^{n}$ , $z_{1},z_{2}\in\mathbb{R}^{n\times d}$ ,

[TABLE]

A2. $f\left(t,0,0\right)\in L^{2}\left(\mathcal{F}_{t};\mathbb{R}^{n}\right)$ for any $t\in\left\{1,2,...,T\right\}$ .

Remark 2.2

The BS $\Delta$ E (2.1) is analogous to the continuous time BSDE driven by a general martingale (cf. [9]), and the solution is a triple of processes.

Definition 2.3

A solution to BS $\Delta$ E (2.1) is a triple of processes $\left(Y,Z,N\right)\in\mathcal{M}^{2}\left(0,T;\mathbb{R}^{n}\right)\times\mathcal{M}^{2}\left(0,T-1;\mathbb{R}^{n\times d}\right)\times\mathcal{M}^{2}\left(0,T;\mathbb{R}^{n}\right)$ which satisfies equality (2.1) for all $t\in\left\{0,1,...,T-1\right\}$ , and $N$ is a martingale process strongly orthogonal to $W$ .

By using the Galtchouk-Kunita-Watanabe decomposition in [3], we can obtain the existence and uniqueness result of BS $\Delta$ E (2.1):

Theorem 2.4

Suppose that Assumption (2.1) holds. Then for any terminal condition $\eta\in L^{2}\left(\mathcal{F}_{T};\mathbb{R}^{n}\right)$ , the BS $\Delta$ E (2.1) has a unique adapted solution $\left(Y,Z,N\right)$ .

Proof. We first prove the existence and uniqueness of $\left(Y_{T-1},Z_{T-1},\Delta N_{T-1}\right)$ . Due to Assumption (2.1) and $\eta\in L^{2}\left(\mathcal{F}_{T};\mathbb{R}^{n}\right)$ , we get $f\left(T,\eta\right)\in L^{2}\left(\mathcal{F}_{T};\mathbb{R}^{n}\right)$ . Here we omit the variable $Z$ since $f$ is independent of $Z$ at time $T$ . Then we have $\mathbb{E}\left[\left|\mathbb{E}\left[\eta+f\left(T,\eta\right)|\mathcal{F}_{T-1}\right]\right|^{2}\right]<\infty.$ Hence, $\eta+f\left(T,\eta\right)-\mathbb{E}\left[\eta+f\left(T,\eta\right)|\mathcal{F}_{T-1}\right]$ is a square integrable martingale difference. So it admits the Galtchouk-Kunita-Watanabe decomposition, which implies that there exists $Z_{T-1}\in\mathcal{F}_{T-1}$ , $Z_{T-1}\Delta W_{T-1}\in L^{2}\left(\mathcal{F}_{T};\mathbb{R}^{n}\right)$ , $\Delta N_{T-1}\in L^{2}\left(\mathcal{F}_{T};\mathbb{R}^{n}\right)$ such that $\mathbb{E}\left[\Delta N_{T-1}|\mathcal{F}_{T-1}\right]=0$ , $\mathbb{E}\left[e_{i}^{\ast}\Delta N_{T-1}\left(\Delta W_{T-1}\right)^{\ast}|\mathcal{F}_{T-1}\right]=0$ and

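$$\eta+f\left(T,\eta\right)-\mathbb{E}\left[\eta+f\left(T,\eta\right)|\mathcal{F}_{T-1}\right]=Z_{T-1}\Delta W_{T-1}+\Delta N_{T-1}.$$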

Moreover, $\Delta N_{T-1}$ is uniquely determined in this decomposition. For fixed $i\in\left\{1,2,...,n\right\}$ , premultiply the equation by $e_{i}^{\ast}$ , postmultiply the equation by $\left(\Delta W_{T-1}\right)^{\ast}$ and then take the $\mathcal{F}_{T-1}$ conditional expectation. This yields that

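$$\mathbb{E}\left[e_{i}^{\ast}\left(\eta+f\left(T,\eta\right)\right)\left(\Delta W_{T-1}\right)^{\ast}|\mathcal{F}_{T-1}\right]=e_{i}^{\ast}Z_{T-1}$$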

since $\mathbb{E}\left[\Delta W_{T-1}\left(\Delta W_{T-1}\right)^{\ast}|\mathcal{F}_{T-1}\right]=I$ . Therefore, we get the unique $Z_{T-1}$ by

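$$Z_{T-1}=\mathbb{E}\left[\left(\eta+f\left(T,\eta\right)\right)\left(\Delta W_{T-1}\right)^{\ast}|\mathcal{F}_{T-1}\right]$$

and

$$\mathbb{E}\left[\left\|Z_{T-1}\right\|^{2}\right]\leq\mathbb{E}\left[\mathbb{E}\left[\left|\eta+f\left(T,\eta\right)\right|^{2}|\mathcal{F}_{T-1}\right]\mathbb{E}\left[\left|\Delta W_{T-1}\right|^{2}|\mathcal{F}_{T-1}\right]\right]<\infty.$$

It follows that $Y_{T-1}=\mathbb{E}\left[\eta+f\left(T,\eta\right)|\mathcal{F}_{T-1}\right]$ and $Y_{T-1}\in L^{2}\left(\mathcal{F}_{T-1};\mathbb{R}^{n}\right)$ .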

Then, by similar arguments as above, we can obtain the unique solution $\left(Y_{t},Z_{t},\Delta N_{t}\right)\in L^{2}\left(\mathcal{F}_{t};\mathbb{R}^{n}\right)\times L^{2}\left(\mathcal{F}_{t};\mathbb{R}^{n\times d}\right)\times L^{2}\left(\mathcal{F}_{t};\mathbb{R}^{n}\right)$ for $t\in\left\{0,1,...,T-2\right\}.$ Moreover,

[TABLE]

By taking the convention $N_{0}=0$ and letting $N_{t}=N_{0}+\sum_{s=0}^{t-1}\Delta N_{s}$ , we have that (2.1) holds true for all $t\in\left\{0,1,...,T-1\right\}$ . Finally, since

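$$\begin{array}{rcl}\mathbb{E}\left[e_{i}^{\ast}N_{t}\left(W_{t}\right)^{\ast}|\mathcal{F}_{t-1}\right]&=&e_{i}^{\ast}\sum_{s=0}^{t-2}\Delta N_{s}\mathbb{E}\left[\left(W_{t}\right)^{\ast}|\mathcal{F}_{t-1}\right]+\mathbb{E}\left[e_{i}^{\ast}\Delta N_{t-1}\left(W_{t-1}+\Delta W_{t-1}\right)^{\ast}|\mathcal{F}_{t-1}\right]\\ &=&e_{i}^{\ast}N_{t-1}\left(W_{t-1}\right)^{\ast},\end{array}$$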

we conclude that $N$ is strongly orthogonal to $W$ .
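The first step of the proof, which produces $\left(Y_{T-1},Z_{T-1},\Delta N_{T-1}\right)$ from $\eta+f\left(T,\eta\right)$ by conditional expectation and projection onto $\Delta W_{T-1}$, can be illustrated on a small finite sample space. The sketch below is only an illustration under assumptions that are not in the paper: one period ($T=1$), scalar processes ($n=d=1$), a trivial $\mathcal{F}_{0}$, a three-point distribution for $\Delta W_{0}$ with mean 0 and variance 1, and hypothetical choices of the terminal condition `eta` and the generator `f_T`.

```python
import math

# Assumed driving noise: dW_0 takes three values with equal probability,
# scaled so that E[dW_0] = 0 and E[dW_0^2] = 1.
a = math.sqrt(1.5)
outcomes = [(-a, 1 / 3), (0.0, 1 / 3), (a, 1 / 3)]  # (value of dW_0, probability)

def expect(g):
    """Expectation of g(dW_0); F_0 is trivial, so this is also E[. | F_0]."""
    return sum(p * g(w) for w, p in outcomes)

def solve_last_step(eta, f_T):
    """Return (Y_0, Z_0, dN_0), with dN_0 given as a function of the outcome w."""
    def xi(w):
        return eta(w) + f_T(eta(w))              # eta + f(T, eta)
    y0 = expect(xi)                              # Y_{T-1} = E[eta + f(T, eta) | F_{T-1}]
    z0 = expect(lambda w: xi(w) * w)             # Z_{T-1} = E[(eta + f(T, eta)) dW | F_{T-1}]
    def dn(w):
        return xi(w) - y0 - z0 * w               # remainder, orthogonal to dW
    return y0, z0, dn

def eta(w):
    return w * w + w      # hypothetical terminal condition

def f_T(y):
    return 0.5 * y        # hypothetical generator at time T (Lipschitz, no z)

y0, z0, dn = solve_last_step(eta, f_T)
print(y0, z0, [dn(w) for w, _ in outcomes])
```

Because the noise takes three values, the remainder `dn` is not identically zero here, which shows why the solution is a triple: with a two-point noise the orthogonal martingale part would vanish in this scalar setting.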

Now we consider the control systems (1.1)-(1.2) and (1.3)-(1.4).

Let the coefficients in system (1.1)-(1.2) be such that:

[TABLE]

And the coefficients in system (1.3)-(1.4) be such that:

[TABLE]

Remark 2.5

The cost functional in [17] consists of three parts: the running cost functional, the terminal cost functional of $X_{T}$ , the initial cost functional of $Y_{0}$ . In our formulation, if we take $l\left(\omega,0,X_{0},Y_{0},Z_{0},u_{0}\right)=\gamma\left(\omega,Y_{0}\right)$ , then the cost functional (1.4) for our discrete time framework can be reduced to the cost functional in [17] formally.

For system (1.1)-(1.2), we assume that:

Assumption 2.6

For $\varphi=b$ , $\sigma_{i}$ , $f$ , $l$ , $h$ , we assume that

1. $\varphi$ is an adapted map, i.e. for any $\left(x,y,z,u\right)\in\mathbb{R}^{m}\times\mathbb{R}^{n}\times\mathbb{R}^{n\times d}\times\mathbb{R}^{r}$ , $\varphi\left(\cdot,\cdot,x,y,z,u\right)$ is an $\left\{\mathcal{F}_{t}\right\}$ -adapted process. Moreover, $\varphi\left(\cdot,t,0,0,0,0\right)\in L^{2}\left(\mathcal{F}_{t}\right).$

2. For all $t\in\left\{0,1,...,T\right\}$ , $\varphi\left(\cdot,t,\cdot,\cdot,\cdot,\cdot\right)$ is continuously differentiable with respect to $x,y,z,u$ , and $\varphi_{x},\varphi_{y},\varphi_{z_{i}},\varphi_{u}$ are uniformly bounded $P-a.s.$ Also, for $t=T$ , $f_{z_{i}}\equiv 0$ , i.e. $f$ is independent of $z$ at time $T$ . Here we use $z_{i}$ to represent the $i$ -th column of the matrix $z$ .

Let

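$$\lambda=\left(\begin{array}{c}x\\ y\\ z\end{array}\right),\quad A\left(t,\lambda;u\right)=\left(\begin{array}{c}-f\\ b\\ \sigma\end{array}\right)\left(t,\lambda;u\right).$$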

For control system (1.3)-(1.4), we additionally assume that:

Assumption 2.7

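For any $u\in\mathcal{U}$ , the coefficients in (1.3) satisfy the following monotone conditions, i.e. when $t\in\left\{1,...,T-1\right\}$ ,

$$\left\langle A\left(t,\lambda_{1};u\right)-A\left(t,\lambda_{2};u\right),\lambda_{1}-\lambda_{2}\right\rangle\leq-\alpha\left|\lambda_{1}-\lambda_{2}\right|^{2},\quad P-a.s.,\quad\forall\lambda_{1},\lambda_{2}\in\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathbb{R}^{n};$$

when $t=T$ ,

$$\left\langle-f\left(T,x_{1},y,z,u\right)+f\left(T,x_{2},y,z,u\right),x_{1}-x_{2}\right\rangle\leq-\alpha\left|x_{1}-x_{2}\right|^{2},\quad P-a.s.;$$

when $t=0$ ,

$$\left\langle b\left(0,\lambda_{1};u\right)-b\left(0,\lambda_{2};u\right),y_{1}-y_{2}\right\rangle+\left\langle\sigma\left(0,\lambda_{1};u\right)-\sigma\left(0,\lambda_{2};u\right),z_{1}-z_{2}\right\rangle\leq-\alpha\left[\left|y_{1}-y_{2}\right|^{2}+\left\|z_{1}-z_{2}\right\|^{2}\right],$$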

where $\alpha$ is a given positive constant.

Besides, in the following, we formally denote $b\left(T,x,y,z,u\right)\equiv 0$ , $\sigma\left(T,x,y,z,u\right)\equiv 0$ , $l\left(T,x,y,z,u\right)\equiv 0$ , $f\left(0,x,y,z,u\right)\equiv 0$ .

3 Maximum principle for the partially coupled FBS $\Delta$ E system

For any $u\in\mathcal{U}$ , it is obvious that there exists a unique solution $\left\{X_{t}\right\}_{t=0}^{T}\in\mathcal{M}^{2}\left(0,T;\mathbb{R}^{m}\right)$ to the forward stochastic difference equation in the system (1.1). Then, by Theorem 2.4, the backward equation in the system (1.1) has a unique solution $\left(Y,Z,N\right)$ where $Y=\left\{Y_{t}\right\}_{t=0}^{T}$ , $Z=\left\{Z_{t}\right\}_{t=0}^{T-1}$ and $N=\left\{N_{t}\right\}_{t=0}^{T}$ .

Suppose that $\bar{u}=\left\{\bar{u}_{t}\right\}_{t=0}^{T}$ is the optimal control of problem (1.1)-(1.2) and $\left(\bar{X},\bar{Y},\bar{Z}\right)$ is the corresponding optimal trajectory. For a fixed time $0\leq s\leq T$ , choose any $\Delta v\in L^{2}\left(\mathcal{F}_{s};\mathbb{R}^{r}\right)$ such that $\bar{u}_{s}+\Delta v$ takes values in $U_{s}$ . For any $\varepsilon\in\left[0,1\right]$ , construct the perturbed admissible control

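$$u_{t}^{\varepsilon}=\left(1-\delta_{ts}\right)\bar{u}_{t}+\delta_{ts}\left(\bar{u}_{s}+\varepsilon\Delta v\right)=\bar{u}_{t}+\delta_{ts}\varepsilon\Delta v,\tag{3.1}$$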

where $\delta_{ts}=1$ for $t=s$ , $\delta_{ts}=0$ for $t\neq s$ and $t\in\left\{0,1,...,T\right\}$ . Since $U_{s}$ is a convex set, $\left\{u_{t}^{\varepsilon}\right\}_{t=0}^{T}\in\mathcal{U}$ is an admissible control. Let $\left(X^{\varepsilon},Y^{\varepsilon},Z^{\varepsilon},N^{\varepsilon}\right)$ be the solution of (1.1) corresponding to the control $u^{\varepsilon}$ .
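The construction of the perturbed control, $u_{t}^{\varepsilon}=\bar{u}_{t}+\delta_{ts}\varepsilon\Delta v$, and the role of convexity can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the paper: the control is deterministic and scalar, every $U_{t}$ is the same interval $[-1,1]$, and the names `perturb` and `admissible` are hypothetical.

```python
def perturb(u_bar, s, eps, dv):
    """u^eps_t = u_bar_t + delta_{ts} * eps * dv: change the control at time s only."""
    return [u + (eps * dv if t == s else 0.0) for t, u in enumerate(u_bar)]

def admissible(u, lo, hi):
    """Check u_t in U_t = [lo, hi] for every t (assumed interval control domain)."""
    return all(lo <= x <= hi for x in u)

lo, hi = -1.0, 1.0              # assumed convex control domain
u_bar = [0.2, -0.5, 1.0, 0.0]   # hypothetical reference control
s, dv = 2, -1.5                 # u_bar[s] + dv = -0.5 still lies in [lo, hi]
u_eps = perturb(u_bar, s, 0.25, dv)
print(u_eps)
```

Since $\bar{u}_{s}+\varepsilon\Delta v=\left(1-\varepsilon\right)\bar{u}_{s}+\varepsilon\left(\bar{u}_{s}+\Delta v\right)$ is a convex combination of two points of $U_{s}$, the perturbed value stays in $U_{s}$ for every $\varepsilon\in\left[0,1\right]$, which is what `admissible` confirms in this toy setting.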

Set

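$$\begin{array}{rclrcl}\bar{\varphi}\left(t\right)&=&\varphi\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t},\bar{u}_{t}\right),&\varphi^{\varepsilon}\left(t\right)&=&\varphi\left(t,X_{t}^{\varepsilon},Y_{t}^{\varepsilon},Z_{t}^{\varepsilon},u_{t}^{\varepsilon}\right),\\ \widetilde{\varphi}^{\varepsilon}\left(t\right)&=&\varphi\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t},u_{t}^{\varepsilon}\right),&\varphi_{\mu}\left(t\right)&=&\varphi_{\mu}\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t},\bar{u}_{t}\right),\end{array}\tag{3.2}$$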

where $\varphi=b$ , $\sigma_{i}$ , $g$ , $f$ , $l$ , $h$ and $\mu=x$ , $y$ , $z_{i}$ and $u$ .

Then, we have the following estimates.

Lemma 3.1

Under Assumption (2.6), we have

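$$\sup_{0\leq t\leq T}\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\leq C\varepsilon^{2}\mathbb{E}\left|\Delta v\right|^{2}.$$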

Proof. In the following, the positive constant $C$ may change from line to line.

When $t=0,...,s$ , $X_{t}^{\varepsilon}=\bar{X}_{t}$ .

When $t=s+1$ ,

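$$X_{s+1}^{\varepsilon}-\bar{X}_{s+1}=b^{\varepsilon}\left(s\right)-\bar{b}\left(s\right)+\sum_{i=1}^{d}\left[\sigma_{i}^{\varepsilon}\left(s\right)-\bar{\sigma}_{i}\left(s\right)\right]\Delta W_{s}^{i}.$$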

Then,

[TABLE]

By the boundedness of $b_{u}$ , we have

[TABLE]

By the boundedness of $\sigma_{iu}$ , we have

[TABLE]

which leads to

[TABLE]

When $t=s+2,...,T$ ,

[TABLE]

Due to the boundedness of $b_{x}$ , $\sigma_{ix}$ , we obtain $\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\leq C\mathbb{E}\left[\left|X_{t-1}^{\varepsilon}-\bar{X}_{t-1}\right|^{2}\right]$ . Thus, by induction we prove the result.
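The estimate of Lemma 3.1, $\sup_{t}\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\leq C\varepsilon^{2}\mathbb{E}\left|\Delta v\right|^{2}$, can be illustrated on a toy example by exact enumeration. Everything specific in the sketch below is an assumption made for illustration and is not taken from the paper: a scalar state, increments $\Delta W_{t}\in\{-1,+1\}$ with equal probability, a deterministic control and perturbation, and the coefficients $b(x,u)=\sin x+u$ and $\sigma(x,u)=0.5\cos x+0.5u$, whose derivatives are bounded as Assumption 2.6 requires.

```python
import math
from itertools import product

def path(u, x0, dW):
    """Simulate X_{t+1} = X_t + b(X_t, u_t) + sigma(X_t, u_t) * dW_t."""
    xs = [x0]
    for t, w in enumerate(dW):
        x = xs[-1]
        b = math.sin(x) + u[t]                  # assumed drift: |b_x| <= 1, b_u = 1
        sigma = 0.5 * math.cos(x) + 0.5 * u[t]  # assumed diffusion: |sigma_x| <= 0.5, sigma_u = 0.5
        xs.append(x + b + sigma * w)
    return xs

def sup_mean_square_gap(u_bar, s, eps, dv, x0=0.0):
    """max over t of E|X^eps_t - Xbar_t|^2, enumerating all dW in {-1,+1}^T."""
    T = len(u_bar)
    u_eps = [u + (eps * dv if t == s else 0.0) for t, u in enumerate(u_bar)]
    total = [0.0] * (T + 1)
    for dW in product((-1.0, 1.0), repeat=T):
        xb, xe = path(u_bar, x0, dW), path(u_eps, x0, dW)
        for t in range(T + 1):
            total[t] += (xe[t] - xb[t]) ** 2
    return max(total) / 2 ** T  # each noise path has probability 2^{-T}

u_bar, s, dv = [0.3, -0.2, 0.1, 0.0], 1, 1.0  # hypothetical control, spike time, direction
ratios = [sup_mean_square_gap(u_bar, s, eps, dv) / eps ** 2 for eps in (1.0, 0.1, 0.01)]
print(ratios)
```

For these particular coefficients the induction in the proof gives an explicit constant: the gap at time $s+1$ has second moment $1.25\,\varepsilon^{2}\left|\Delta v\right|^{2}$, and each later step multiplies the second moment by at most $(1+1)^{2}+0.5^{2}=4.25$, so every entry of `ratios` lies between $1.25$ and $1.25\cdot 4.25^{2}$.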

Let $\xi=\left\{\xi_{t}\right\}_{t=0}^{T}$ be the solution to the following difference equation,

[TABLE]

It is easy to check that

[TABLE]

and we have the following result:

Lemma 3.2

Under Assumption 2.6, we have

[TABLE]

Proof. When $t=0,...,s$ , $X_{t}^{\varepsilon}=\bar{X}_{t}$ and $\xi_{t}=0$ which lead to $X_{t}^{\varepsilon}-\bar{X}_{t}-\xi_{t}=0.$

When $t=s+1$ ,

[TABLE]

where

[TABLE]

Then

[TABLE]

Since $\left\|\widetilde{b}_{u}\left(s\right)-b_{u}\left(s\right)\right\|\rightarrow 0$ and $\left\|\widetilde{\sigma}_{iu}\left(s\right)-\sigma_{iu}\left(s\right)\right\|\rightarrow 0$ as $\varepsilon\rightarrow 0$ , we have

[TABLE]

When $t=s+2,...,T$ ,

[TABLE]

where

[TABLE]

Then

[TABLE]

$\left\|\widetilde{b}_{x}\left(t-1\right)-b_{x}\left(t-1\right)\right\|\rightarrow 0$ and $\left\|\widetilde{\sigma}_{ix}\left(t-1\right)-\sigma_{ix}\left(t-1\right)\right\|\rightarrow 0$ as $\varepsilon\rightarrow 0$ . Since $\widetilde{b}_{x}\left(t-1\right)$ and $\widetilde{\sigma}_{ix}\left(t-1\right)$ are bounded, by the estimation (3.5), we have

[TABLE]

This completes the proof.

Lemma 3.3

Under Assumption 2.6, we have

[TABLE]

Proof. It is obvious that $Y_{T}^{\varepsilon}-\bar{Y}_{T}=0$ at time $T$ .

When $t=s,...,T-1$ (if $s=T$ , skip this part), we have

[TABLE]

It yields that

[TABLE]

Similarly, we have

[TABLE]

When $t=s-1$ , by similar analysis,

[TABLE]

If $s=T$ , we have

[TABLE]

When $t=0,...,s-2$ , we have

[TABLE]

Thus, there exists $C>0$ , such that for any $t\in\left\{0,1,...,T\right\}$ ,

[TABLE]

This completes the proof.

Let $\left(\eta,\zeta,V\right)$ be the solution to the following BS $\Delta$ E,

[TABLE]

Notice that $f_{x}\left(T\right)=f_{x}\left(T,\bar{X}_{T},\bar{Y}_{T},\bar{u}_{T}\right)$ since $f$ is independent of $Z$ at time $T$ ; the same holds for $f_{y}\left(T\right)$ and $f_{u}\left(T\right)$ .

It is easy to check that

[TABLE]

and we have the following result:

Lemma 3.4

Under Assumption 2.6, we have

[TABLE]

Proof. When $t=T$ , $Y_{T}^{\varepsilon}-\bar{Y}_{T}-\eta_{T}=0$ .

When $t\in\left\{0,1,...,T-1\right\}$ , we have

[TABLE]

where

[TABLE]

for $\mu=x$ , $y$ , $z_{i}$ and $u$ . Then,

[TABLE]

and

[TABLE]

Notice that $\widetilde{f}_{x}\left(t\right)-f_{x}\left(t\right)\rightarrow 0,$ $\widetilde{f}_{y}\left(t\right)-f_{y}\left(t\right)\rightarrow 0,$ $\widetilde{f}_{z_{i}}\left(t\right)-f_{z_{i}}\left(t\right)\rightarrow 0,$ $\widetilde{f}_{u}\left(t\right)-f_{u}\left(t\right)\rightarrow 0$ as $\varepsilon\rightarrow 0$ . We obtain that

[TABLE]

This completes the proof.

By Lemma 3.2 and Lemma 3.4, we have

[TABLE]

We introduce the following adjoint equation:

[TABLE]

where $W$ and $Q$ are square integrable martingale processes and $Q$ is strongly orthogonal to $W$ .

Obviously the forward equation in (3.8) admits a unique solution $k\in\mathcal{M}^{2}\left(0,T;\mathbb{R}^{n}\right)$ . Then, based on the solution $k$ , according to Theorem 2.4, the backward equation in (3.8) has a unique solution $\left(p,q,Q\right)\in\mathcal{M}^{2}\left(0,T;\mathbb{R}^{m}\right)\times\mathcal{M}^{2}\left(0,T-1;\mathbb{R}^{m\times d}\right)\times\mathcal{M}^{2}\left(0,T;\mathbb{R}^{m}\right)$ . So the FBS $\Delta$ E (3.8) has a unique solution $\left(p,q,Q,k\right)$ .

We obtain the following maximum principle for the optimal control problem (1.1)-(1.2).

Define the Hamiltonian function

[TABLE]

Theorem 3.5

Suppose that Assumption (2.6) holds. Let $\bar{u}$ be an optimal control of the problem (1.1)-(1.2), $\left(\bar{X},\bar{Y},\bar{Z}\right)$ be the corresponding optimal trajectory and $\left(p,q,k\right)$ be the solution to the adjoint equation (3.8). Then for any $t\in\left\{0,1,...,T\right\}$ , for any $v\in U_{t}$ , we have

[TABLE]

Proof. For $t\in\left\{0,1,...,T-1\right\}$ , we have

[TABLE]

where

[TABLE]

It is obvious that $\mathbb{E}\left[\Phi_{t}\right]=0$ . We have

[TABLE]

and

[TABLE]

Similarly, it can be shown that for $t\in\left\{0,1,...,T-1\right\}$ , we have

[TABLE]

where

[TABLE]

It is easy to check that

[TABLE]

Then we have

[TABLE]

Therefore,

[TABLE]

Since $\xi_{0}=0$ and $k_{0}=0$ , we deduce

[TABLE]

By $\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}\left[J\left(u^{\varepsilon}\left(\cdot\right)\right)-J\left(\bar{u}\left(\cdot\right)\right)\right]\geq 0$ , we obtain

[TABLE]

Thus, equation (3.9) follows since $s$ is arbitrary. This completes the proof.

Remark 3.6

In the introduction we pointed out that we need a reasonable representation of the product rule. When we calculate $\Delta\left\langle\xi_{t},p_{t}\right\rangle$ in (3.10), $\Delta\left\langle\xi_{t},p_{t}\right\rangle$ is represented as $\left\langle\xi_{t+1},\cdot\cdot\cdot\right\rangle+\cdot\cdot\cdot$ . Combined with the formulation of the BS $\Delta$ E mentioned in the introduction, this representation leads to terms such as $\left\langle\square_{t},\Diamond_{t}\right\rangle-\left\langle\square_{t+1},\Diamond_{t+1}\right\rangle$ in (3.11). By summing and rearranging these terms in (3.12), we obtain the dual relation (3.13).

When $g\equiv 0$ and $f\equiv 0$ , our control system (1.1)-(1.2) degenerates to the classical discrete control system which only contains a forward stochastic difference equation as in [14]. For this special case, the adjoint equation becomes

[TABLE]

and the Hamiltonian function becomes

[TABLE]

The adjoint equation has the following explicit solution

[TABLE]

which coincides with the results in [14].

4 Maximum principle for the fully coupled FBS $\Delta$ E system

In this section we suppose that the driving process $W$ is one-dimensional. Let $\bar{u}=\left\{\bar{u}_{t}\right\}_{t=0}^{T}$ be the optimal control for the control problem (1.3)-(1.4) and $\left(\bar{X},\bar{Y},\bar{Z}\right)$ be the corresponding optimal trajectory. Note that the existence and uniqueness of $\left(\bar{X},\bar{Y},\bar{Z}\right)$ are guaranteed by the results in [15]. The perturbed control $u^{\varepsilon}$ is the same as (3.1) and we denote by $\left(X^{\varepsilon},Y^{\varepsilon},Z^{\varepsilon}\right)$ the corresponding trajectory.

Let

[TABLE]

Using notation similar to (3.2) in Section 3, we have

[TABLE]

Lemma 4.1

Under Assumption 2.6 and Assumption 2.7, we have

[TABLE]

Proof. By (4.1),

[TABLE]

By the monotone condition, we obtain

[TABLE]

On the other hand,

[TABLE]

and similarly,

[TABLE]

Combining (4.3) and (4.4), we have

[TABLE]

This completes the proof.

Next we introduce the following variational equation:

[TABLE]

By Assumption 2.6 and Assumption 2.7, when $t\in\left\{1,...,T-1\right\}$ ,

[TABLE]

when $t=0$ ,

[TABLE]

when $t=T$ ,

[TABLE]

Thus, the coefficients of (4.5) satisfy the monotone condition, and there exists a unique solution $\left(\xi,\eta,\zeta,V\right)$ to (4.5). As in the proof of Lemma 4.1, we have

[TABLE]

Define

[TABLE]

where $\varphi=b$ , $\sigma_{i}$ , $g$ , $f$ , $l$ , $h$ and $\mu=x$ , $y$ , $z_{i}$ and $u$ .

Lemma 4.2

Under Assumption 2.6 and Assumption 2.7, we have

[TABLE]

Proof. Note that

[TABLE]

Set

[TABLE]

Then,

[TABLE]

where

[TABLE]

According to (4.10),

[TABLE]

where

[TABLE]

Combining (4.6), (4.7) and (4.8), we have

[TABLE]

Note that

[TABLE]

When $\varepsilon\rightarrow 0$ , $\left\|\widetilde{f}_{\mu}\left(t\right)-f_{\mu}\left(t\right)\right\|\rightarrow 0$ for $\mu=x$ , $y$ , $z$ and $u$ . Then, by Lemma 4.1,

[TABLE]

Similar results hold for the other terms in (4.11). Finally, we have

[TABLE]

This completes the proof.

By Lemma 4.2, we obtain

[TABLE]

Introduce the following adjoint equation:

[TABLE]

Define the Hamiltonian function as follows:

[TABLE]

Theorem 4.3

Suppose that Assumption 2.6 and Assumption 2.7 hold. Let $\bar{u}$ be an optimal control for (1.3)-(1.4), $\left(\bar{X},\bar{Y},\bar{Z}\right)$ be the corresponding optimal trajectory and $\left(p,q,k\right)$ be the solution to the adjoint equation (4.12). Then, for any $t\in\left\{0,1,...,T\right\}$ and any $v\in U_{t}$ , we have

[TABLE]

Proof. From the expressions of $\xi_{t}$ and $p_{t}$ , for $t\in\left\{0,1,...,T-1\right\}$ we have

[TABLE]

where

[TABLE]

Since $W$ and $Q$ are square-integrable martingales and $Q$ is strongly orthogonal to $W$ , we have $\mathbb{E}\left[\Phi_{t}\right]=0$ . Similarly,

[TABLE]

where

[TABLE]

Furthermore,

[TABLE]

and

[TABLE]

Then, we obtain

[TABLE]

Therefore,

[TABLE]

Notice that $\xi_{0}=0$ , $k_{0}=0$ . So

[TABLE]

Since $\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}\left[J\left(u^{\varepsilon}\left(\cdot\right)\right)-J\left(\bar{u}\left(\cdot\right)\right)\right]\geq 0$ , we obtain

[TABLE]

Then, (4.13) holds since $s$ is arbitrary. This completes the proof.

Bibliography


[1] Bender, C., & Zhang, J. (2008). Time discretization and Markovian iteration for coupled FBSDEs. The Annals of Applied Probability, 18(1), 143-177.
[2] Bensoussan, A. (1982). Lectures on stochastic control. In Nonlinear filtering and stochastic control (pp. 1-62). Springer, Berlin, Heidelberg.
[3] Bielecki, T. R., Cialenco, I., & Chen, T. (2015). Dynamic conic finance via backward stochastic difference equations. SIAM Journal on Financial Mathematics, 6(1), 1068-1122.
[4] Bismut, J. M. (1978). An introductory approach to duality in optimal stochastic control. SIAM Review, 20(1), 62-78.
[5] Cohen, S. N., & Elliott, R. J. (2010). A general theory of finite state backward stochastic difference equations. Stochastic Processes and their Applications, 120(4), 442-466.
[6] Cohen, S. N., & Elliott, R. J. (2011). Backward stochastic difference equations and nearly time-consistent nonlinear expectations. SIAM Journal on Control and Optimization, 49(1), 125-139.
[7] Delarue, F., & Menozzi, S. (2006). A forward-backward stochastic algorithm for quasi-linear PDEs. The Annals of Applied Probability, 16(1), 140-184.
[8] Dokuchaev, N., & Zhou, X. Y. (1999). Stochastic controls with terminal contingent conditions. Journal of Mathematical Analysis and Applications, 238(1), 143-165.
