Maximum principle for stochastic optimal control problem of finite state forward-backward stochastic difference systems

Shaolin Ji; Haodong Liu

arXiv:1907.04209·math.OC·July 10, 2019

TL;DR

This paper develops a maximum principle for stochastic optimal control problems involving finite state forward-backward stochastic difference systems, extending control theory to discrete-time, finite state models with new adjoint equations.

Contribution

It introduces a maximum principle for finite state FBS$\Delta$Ss, including both partially and fully coupled systems, with new adjoint difference equations and control domain considerations.

Findings

01

Derived the adjoint difference equation for the systems.

02

Established the maximum principle for convex control domains.

03

Extended stochastic control theory to finite state, discrete-time systems.

Abstract

In this paper, we study the maximum principle for stochastic optimal control problems of forward-backward stochastic difference systems (FBS$\Delta$Ss) where the uncertainty is modeled by a discrete time, finite state process, rather than white noises. Two types of FBS$\Delta$Ss are investigated. The first one is described by a partially coupled forward-backward stochastic difference equation (FBS$\Delta$E) and the second one is described by a fully coupled FBS$\Delta$E. By adopting an appropriate representation of the product rule and an appropriate formulation of the backward stochastic difference equation (BS$\Delta$E), we deduce the adjoint difference equation. Finally, the maximum principle for this optimal control problem with the control domain being convex is established.

Equations

Δ ⟨ X_{t}, Y_{t} ⟩ = ⟨ X_{t + 1}, Δ Y_{t} ⟩ + ⟨ Δ X_{t}, Y_{t} ⟩

M_{t} = W_{t} - E [W_{t} ∣ F_{t - 1}], t = 1, ..., T .

\left\{\begin{array}[c]{rcl}\Delta Y_{t}&=&-f\left(\omega,t+1,Y_{t+1},Z_{t+1}\right)+Z_{t}M_{t+1},\\ Y_{T}&=&\eta.\end{array}\right.

f (ω, t, y, Z_{t}^{1}) = f (ω, t, y, Z_{t}^{2}) .

\left\{\begin{array}[c]{rcl}\Delta X_{t}&=&b\left(\omega,t,X_{t},u_{t}\right)+\sum\limits_{i=1}^{m}e_{i}\cdot\sigma_{i}\left(\omega,t,X_{t},u_{t}\right)M_{t+1},\\ \Delta Y_{t}&=&-f\left(\omega,t+1,X_{t+1},Y_{t+1},Z_{t+1}\widetilde{I},u_{t+1}\right)+Z_{t}M_{t+1},\\ X_{0}&=&x_{0},\\ Y_{T}&=&y_{T}\end{array}\right.

J\left(u\left(\cdot\right)\right)=\mathbb{E}\left[\sum\limits_{t=0}^{T-1}l\left(\omega,t,X_{t},Y_{t},Z_{t}\widetilde{I},u_{t}\right)+h\left(\omega,X_{T}\right)\right]

b (ω, t, x, u)

σ_{i} (ω, t, x, u)

f (ω, t, x, y, z, u)

l (ω, t, x, y, z, u)

h (ω, x)

\left\{\begin{array}[c]{rcl}\Delta X_{t}&=&b\left(\omega,t,X_{t},Y_{t},Z_{t}\widetilde{I},u_{t}\right)+\sigma\left(\omega,t,X_{t},Y_{t},Z_{t}\widetilde{I},u_{t}\right)M_{t+1},\\ \Delta Y_{t}&=&-f\left(\omega,t+1,X_{t+1},Y_{t+1},Z_{t+1}\widetilde{I},u_{t+1}\right)+Z_{t}M_{t+1},\\ X_{0}&=&x_{0},\\ Y_{T}&=&y_{T},\end{array}\right.

J\left(u\left(\cdot\right)\right)=\mathbb{E}\left[\sum\limits_{t=0}^{T-1}l\left(\omega,t,X_{t},Y_{t},Z_{t}\widetilde{I},u_{t}\right)+h\left(\omega,X_{T}\right)\right]

b (ω, t, x, y, z, u)

σ (ω, t, x, y, z, u)

f (ω, t, x, y, z, u)

l (ω, t, x, y, z, u)

h (ω, x)

λ

A (t, λ; u)

∣ λ ∣

∣ A (t, λ) ∣

⟨ A (t, λ_{1}; u) - A (t, λ_{2}; u), λ_{1} - λ_{2} ⟩ \leq - α ∣ λ_{1} - λ_{2} ∣^{2},

\forall λ_{1}, λ_{2} \in R \times R \times R^{1 \times d};

\left\langle-f\left(T,x_{1},y,z\widetilde{I},u\right)+f\left(T,x_{2},y,z\widetilde{I},u\right),x_{1}-x_{2}\right\rangle\leq-\alpha\left|x_{1}-x_{2}\right|^{2};

\begin{array}[c]{cl}&\left\langle b\left(0,\lambda_{1};u\right)-b\left(0,\lambda_{2};u\right),y_{1}-y_{2}\right\rangle\\ &+\left\langle\left(\sigma\left(0,\lambda_{1};u\right)-\sigma\left(0,\lambda_{2};u\right)\right)\mathbb{E}\left[M_{1}M_{1}^{\ast}|\mathcal{F}_{0}\right],z_{1}-z_{2}\right\rangle\\ \leq&-\alpha\left[\left|y_{1}-y_{2}\right|^{2}+\left|\left(z_{1}-z_{2}\right)\widetilde{I}\right|^{2}\right].\end{array}

u_{t}^{\varepsilon}=\left(1-\delta_{ts}\right)\bar{u}_{t}+\delta_{ts}\left(\bar{u}_{s}+\varepsilon\Delta v\right)=\bar{u}_{t}+\delta_{ts}\varepsilon\Delta v,

\begin{array}[c]{rclrcl}\bar{\varphi}\left(t\right)&=&\varphi\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t}\widetilde{I},\bar{u}_{t}\right),&\varphi^{\varepsilon}\left(t\right)&=&\varphi\left(t,X_{t}^{\varepsilon},Y_{t}^{\varepsilon},Z_{t}^{\varepsilon}\widetilde{I},u_{t}^{\varepsilon}\right),\\ \widetilde{\varphi}^{\varepsilon}\left(t\right)&=&\varphi\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t}\widetilde{I},u_{t}^{\varepsilon}\right),&\varphi_{\mu}\left(t\right)&=&\varphi_{\mu}\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t}\widetilde{I},\bar{u}_{t}\right),\end{array}

\sup_{0\leq t\leq T}\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\leq C\varepsilon^{2}\mathbb{E}\left|\Delta v\right|^{2}.

X_{s+1}^{\varepsilon}-\bar{X}_{s+1}=b^{\varepsilon}\left(s\right)-\overline{b}\left(s\right)+\sum\limits_{i=1}^{m}e_{i}\cdot\left[\sigma_{i}^{\varepsilon}\left(s\right)-\overline{\sigma}_{i}\left(s\right)\right]M_{s+1}.

\mathbb{E}\left|X_{s+1}^{\varepsilon}-\bar{X}_{s+1}\right|^{2}\leq 2\mathbb{E}\left[\left|b^{\varepsilon}\left(s\right)-\overline{b}\left(s\right)\right|^{2}+\sum\limits_{i=1}^{m}\left|\left[\sigma_{i}^{\varepsilon}\left(s\right)-\overline{\sigma}_{i}\left(s\right)\right]M_{s+1}\right|^{2}\right].

\mathbb{E}\left[\left|b^{\varepsilon}\left(s\right)-\overline{b}\left(s\right)\right|^{2}\right]\leq C\mathbb{E}\left[\left|u_{s}^{\varepsilon}-\bar{u}_{s}\right|^{2}\right]=C\varepsilon^{2}\mathbb{E}\left[\left|\Delta v\right|^{2}\right].

\begin{array}[c]{cl}&\mathbb{E}\left|\left[\widetilde{\sigma_{i}}^{\varepsilon}\left(s\right)-\overline{\sigma}_{i}\left(s\right)\right]M_{s+1}\right|^{2}\\ \leq&C\mathbb{E}\left[\left|\left[\widetilde{\sigma_{i}}^{\varepsilon}\left(s\right)-\overline{\sigma}_{i}\left(s\right)\right]\widetilde{I}\right|^{2}\right]\\ \leq&C\varepsilon^{2}\mathbb{E}\left[\left|\Delta v\right|^{2}\right]\end{array}

\mathbb{E}\left|X_{s+1}^{\varepsilon}-\bar{X}_{s+1}\right|^{2}\leq C\varepsilon^{2}\mathbb{E}\left[\left|\Delta v\right|^{2}\right].

\begin{array}[c]{cl}&\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\\ \leq&2\mathbb{E}\left[\left|b\left(t-1,X_{t-1}^{\varepsilon},\bar{u}_{t-1}\right)-b\left(t-1,\bar{X}_{t-1},\bar{u}_{t-1}\right)\right|^{2}\right.\\ &+\left.\sum\limits_{i=1}^{m}\left|\left[\sigma_{i}\left(t-1,X_{t-1}^{\varepsilon},\bar{u}_{t-1}\right)-\sigma_{i}\left(t-1,\bar{X}_{t-1},\bar{u}_{t-1}\right)\right]M_{t}\right|^{2}\right].\end{array}

\left\{\begin{array}[c]{rcl}\Delta\xi_{t}&=&b_{x}\left(t\right)\xi_{t}+\delta_{ts}b_{u}\left(t\right)\varepsilon\Delta v+\sum\limits_{i=1}^{m}e_{i}\cdot\left[\xi_{t}^{\ast}\sigma_{ix}\left(t\right)+\delta_{ts}\varepsilon\Delta v^{\ast}\sigma_{iu}\left(t\right)\right]M_{t+1},\\ \xi_{0}&=&0.\end{array}\right.

Taxonomy

Topics: Stochastic processes and financial applications · Climate Change Policy and Economics · Insurance, Mortality, Demography, Risk Management

Full text

Maximum principle for stochastic optimal control problem of finite state forward-backward stochastic difference systems

Shaolin Ji Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, PR China. [email protected]. Research supported by NSF (No. 11571203).

Haodong Liu Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, PR China. (Corresponding author).

Abstract: In this paper, we study the maximum principle for stochastic optimal control problems of forward-backward stochastic difference systems (FBS $\Delta$ Ss) where the uncertainty is modeled by a discrete time, finite state process, rather than white noises. Two types of FBS $\Delta$ Ss are investigated. The first one is described by a partially coupled forward-backward stochastic difference equation (FBS $\Delta$ E) and the second one is described by a fully coupled FBS $\Delta$ E. By adopting an appropriate representation of the product rule and an appropriate formulation of the backward stochastic difference equation (BS $\Delta$ E), we deduce the adjoint difference equation. Finally, the maximum principle for this optimal control problem with the control domain being convex is established.

Keywords: backward stochastic difference equations; forward-backward stochastic difference equations; monotone condition; stochastic optimal control; maximum principle

1 Introduction

The Maximum Principle is one of the main approaches to solving optimal control problems. A great deal of work has been done on the Maximum Principle for stochastic systems; see, for example, Bensoussan [1], Bismut [3], Kushner [13], Peng [18]. Peng was also the first to study one kind of forward-backward stochastic control system (FBSCS) in [19], where he obtained the maximum principle for this kind of control system with a convex control domain. FBSCSs have wide applications in many fields. For instance, the stochastic differential recursive utility, which is a generalization of the standard additive utility, can be regarded as a solution of a backward stochastic differential equation (BSDE), so the recursive utility optimization problem can be described as an optimization problem for an FBSCS (see [21]). Besides, in the dynamic principal-agent problem with unobservable states and actions, the principal’s problem can be formulated as a partial information optimal control problem of an FBSCS (see [24]). We refer to [7], [10], [11], [14], [22], [26], [28] for other works on optimization problems for FBSCSs.

In this paper, we discuss the Maximum Principle for optimal control of discrete time systems described by forward-backward stochastic difference equations (FBS $\Delta$ Es). To the best of our knowledge, there are few results on such optimal control problems. In fact, discrete time control systems are of great value in practice. For example, digital control can be formulated as a discrete time control problem, where the sampled data is obtained at discrete instants of time. Besides, the forward-backward stochastic difference system (FBS $\Delta$ S) can be used for modeling in financial markets. For example, the solution to the backward stochastic difference equation (BS $\Delta$ E) can be used to construct time-consistent nonlinear expectations (see [5], [6]) and for pricing in financial markets (see [2]). However, the formulation of a BS $\Delta$ E is quite different from that of its continuous time counterpart. Many works are devoted to the study of BS $\Delta$ Es (see, e.g. [2], [5], [6], [23]). Based on the driving process, there are mainly two types of formulations of BS $\Delta$ Es. One is driven by a finite state process which takes values in the basis vectors (as in [5]) and the other is driven by a martingale with independent increments (as in [2]). For the former framework, the authors of [5] obtained a discrete time version of the martingale representation theorem and established the solvability of the BS $\Delta$ E, with uniqueness of $Z$ under a new kind of equivalence relation. Further works on applications of the finite state framework can be found in [8], [17], [15]. In this paper, we adopt the first type of formulation to investigate optimization problems for FBS $\Delta$ Ss.

In this paper, we study two stochastic optimal control problems. Problem 1 involves a partially coupled FBS $\Delta$ E (2.2); in more detail, the coefficients $b$ and $\sigma$ of the forward equation do not contain the solution $(Y,Z)$ of the backward equation. The state equation of Problem 2 is described by a fully coupled FBS $\Delta$ E (2.4).

The optimal control problem is to find the optimal control $u\in\mathcal{U}$ , such that the optimal control and the corresponding state trajectory minimize the cost functional $J\left(u\left(\cdot\right)\right)$ . In this paper, we assume the control domain is convex. By perturbing the optimal control at a fixed time point, we obtain the maximum principle for Problems 1 and 2.

To build the maximum principle, the key step is to find the adjoint variables which can be applied to deduce the variational inequality. In [16], the authors studied the maximum principle for a discrete time stochastic optimal control problem in which the state equation is governed only by a forward stochastic difference equation. By applying the Riesz representation theorem, they obtained the adjoint variables explicitly and established the maximum principle. To solve our problems, however, we need to construct the adjoint difference equations, since in general the adjoint variables cannot be obtained explicitly in our case. The techniques adopted for the continuous time framework, as in [18, 19], are not applicable for constructing the adjoint equations in our discrete time framework. In this paper, we propose two techniques to deduce the adjoint difference equations. The first one is that we choose the following product rule:

$$\Delta\left\langle X_{t},Y_{t}\right\rangle=\left\langle X_{t+1},\Delta Y_{t}\right\rangle+\left\langle\Delta X_{t},Y_{t}\right\rangle,$$

where $X_{t}$ (resp. $Y_{t}$ ) satisfies a forward (resp. backward) stochastic difference equation. The second one is that the BS $\Delta$ E should be formulated as in (2.1). In other words, the generator $f$ of the BS $\Delta$ E (2.1) depends on time $t+1$ . It is worth pointing out that this kind of formulation is exactly the formulation of the adjoint equations for stochastic optimal control problems (see [16] for the classical case). Based on these two techniques, we can deduce the adjoint difference equations. The reader may refer to Remark 3.6 for more details.
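The product rule $\Delta\langle X_{t},Y_{t}\rangle=\langle X_{t+1},\Delta Y_{t}\rangle+\langle\Delta X_{t},Y_{t}\rangle$ is a purely algebraic identity, so it can be checked on arbitrary sequences. The following is a minimal sketch, not part of the paper: the helper names `inner` and `diff` and the random test data are illustrative assumptions. It also checks that summing the rule over $t$ telescopes, which is the mechanism the duality argument relies on.

```python
import random

def inner(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def diff(path, t):
    """Forward difference: (Delta path)_t = path[t+1] - path[t], componentwise."""
    return [u - v for u, v in zip(path[t + 1], path[t])]

random.seed(0)
T, n = 5, 3
# Arbitrary (hypothetical) sequences standing in for X and Y along one path.
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(T + 1)]
Y = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(T + 1)]

# Largest violation of  Delta<X_t, Y_t> = <X_{t+1}, Delta Y_t> + <Delta X_t, Y_t>.
max_gap = max(
    abs((inner(X[t + 1], Y[t + 1]) - inner(X[t], Y[t]))
        - (inner(X[t + 1], diff(Y, t)) + inner(diff(X, t), Y[t])))
    for t in range(T)
)

# Summing the right-hand side over t telescopes to <X_T, Y_T> - <X_0, Y_0>.
telescoped = sum(
    inner(X[t + 1], diff(Y, t)) + inner(diff(X, t), Y[t]) for t in range(T)
)
```

Note that the rule evaluates $X$ at $t+1$ and $Y$ at $t$; the symmetric variant with the roles swapped is also an identity, and the choice made here is the one that matches a generator evaluated at $t+1$.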

A second difficulty arises from the finite state space. Since the uniqueness of the variable $Z$ does not hold in the usual sense, the norm of this variable has to be redefined. In [5], Cohen and Elliott defined a seminorm of $Z_{t}$ through the term $Z_{t}M_{t+1}$ . However, since the Itô isometry is not available in the discrete time case and the martingale difference process $M_{t}$ depends on the past, the relation between the norm defined by $Z_{t}$ itself and the norm defined by $Z_{t}M_{t+1}$ is not clear. This makes estimating the diffusion term of the variational equations quite difficult. In this paper, we propose a new definition of the norm for the variable $Z_{t}$ in the diffusion term and prove the relation between this norm of $Z_{t}$ and the seminorm defined by $Z_{t}M_{t+1}$ . With this relation, we can derive estimates for the solutions to the stochastic difference equations in the discrete time, finite state space framework.

The remainder of this paper is organized as follows. In section 2, two types of the controlled FBS $\Delta$ Ss are formulated. We deduce the maximum principle for the partially coupled controlled FBS $\Delta$ S in section 3. Finally, we establish the maximum principle for the fully coupled controlled FBS $\Delta$ S in section 4.

2 Preliminaries and model formulation

Let $T$ be a deterministic terminal time and $\mathcal{T}:=\left\{0,1,...,T\right\}$ . Following [5], we consider an underlying discrete time, finite state process $W$ which takes values in the standard basis vectors of $\mathbb{R}^{d}$ , where $d$ is the number of states of the process $W$ . In more detail, for each $t\in\mathcal{T}$ , $W_{t}\in\left\{e_{1},e_{2},...e_{d}\right\}$ where $e_{i}=\left(0,0,...,0,1,0,...,0\right)^{\ast}\in\mathbb{R}^{d}$ and $\left[\cdot\right]^{\ast}$ denotes vector transposition.

Consider a filtered probability space $\left(\Omega,\mathcal{F},\left\{\mathcal{F}_{t}\right\}_{0\leq t\leq T},P\right)$ , where $\mathcal{F}_{t}$ is the completion of the $\sigma$ -algebra generated by the process $W$ up to time $t$ and $\mathcal{F}=\mathcal{F}_{T}$ . Denote by $L\left(\mathcal{F}_{t};\mathbb{R}^{n\times d}\right)$ the set of all $\mathcal{F}_{t}$ -measurable random variables $X_{t}$ taking values in $\mathbb{R}^{n\times d}$ and by $\mathcal{M}\left(0,t;\mathbb{R}^{n\times d}\right)$ the set of all $\left\{\mathcal{F}_{t}\right\}$ -adapted processes $X$ taking values in $\mathbb{R}^{n\times d}$ with the norm $\left\|\cdot\right\|$ defined by $\left\|X\right\|=\left(\mathbb{E}\left[\sum_{s=0}^{t}\left|X_{s}\right|^{2}\right]\right)^{\frac{1}{2}}$ .

For simplicity, we suppose the process $W$ satisfies the following assumption. Note that in the following, an inequality on a vector quantity is to hold componentwise.

Assumption 2.1

For any $t\in\left\{0,1,2,...,T-1\right\}$ , any $\omega\in\Omega$ , $\mathbb{E}\left[W_{t+1}|\mathcal{F}_{t}\right]\left(\omega\right)>0.$

The above assumption means that the probability of every possible path of $W$ on $\left\{0,1,2,...,T\right\}$ is strictly positive. Hence, under this assumption, the phrase "$P$-almost surely" in the following statements can be replaced by "for every $\omega$". In fact, this assumption is made only to simplify the statements. Without it, the proof ideas are the same, but the statements are more involved. We set $\mathbb{E}\left[W_{t+1}|\mathcal{F}_{t}\right]=\left(P_{t}^{1},P_{t}^{2},...,P_{t}^{N}\right)^{\ast}$ .

Define

$$M_{t}=W_{t}-\mathbb{E}\left[W_{t}|\mathcal{F}_{t-1}\right],\quad t=1,...,T.$$

$M$ is a martingale difference process taking values in $\mathbb{R}^{d}$ . The following equivalence relations given in [5] will be used in the following.

Definition 2.2

For two $\mathcal{F}_{t}$ -measurable random variables $Z_{t}$ and $\widetilde{Z}_{t}$ , we define $Z_{t}\thicksim_{M_{t+1}}\widetilde{Z}_{t}$ , if $Z_{t}M_{t+1}=\widetilde{Z}_{t}M_{t+1},$ $P-a.s.;$

For two adapted processes $Z$ and $\widetilde{Z}$ , we define $Z\thicksim_{M}\widetilde{Z}$ , if $Z_{t}M_{t+1}=\widetilde{Z}_{t}M_{t+1},$ $P-a.s.$ for any $t\in\left\{0,1,2,...,T-1\right\}.$
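To see why $Z$ can only be unique up to this equivalence, recall that $M_{t+1}=W_{t+1}-\mathbb{E}\left[W_{t+1}|\mathcal{F}_{t}\right]$ and that the components of both $W_{t+1}$ and its conditional expectation sum to one. Adding the same constant to every entry of a row of $Z_{t}$ therefore leaves $Z_{t}M_{t+1}$ unchanged. The sketch below illustrates this for $d=3$; the probabilities, the row `z` and the helper names are hypothetical choices, not taken from the paper.

```python
def martingale_difference(p, j):
    """M_{t+1} on the event W_{t+1} = e_j, given conditional probabilities p."""
    return [(1.0 if i == j else 0.0) - p[i] for i in range(len(p))]

def apply_row(z, m):
    """Row vector z times column vector m."""
    return sum(zi * mi for zi, mi in zip(z, m))

p = [0.2, 0.3, 0.5]                  # hypothetical E[W_{t+1} | F_t], strictly positive
z = [1.0, -2.0, 0.5]                 # an arbitrary row Z_t
z_shifted = [zi + 7.0 for zi in z]   # Z_t + c * (1, ..., 1) with c = 7

# Conditional mean of M_{t+1}: sum_j p_j * M(e_j), componentwise (should vanish).
mean_M = [
    sum(p[j] * martingale_difference(p, j)[i] for j in range(3)) for i in range(3)
]

# Z_t M_{t+1} and (Z_t + c 1^*) M_{t+1} on each of the d possible next states.
values = [apply_row(z, martingale_difference(p, j)) for j in range(3)]
values_shifted = [apply_row(z_shifted, martingale_difference(p, j)) for j in range(3)]
```

Here `values` and `values_shifted` agree state by state, so `z` and `z_shifted` are equivalent in the sense of Definition 2.2 even though they differ as vectors.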

For a $\mathcal{F}_{t}$ -adapted process $X$ , define the difference operator $\Delta$ as $\Delta X_{t}=X_{t+1}-X_{t}$ . Consider the following backward stochastic difference equation (BS $\Delta$ E):

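$$\left\{\begin{array}[c]{rcl}\Delta Y_{t}&=&-f\left(\omega,t+1,Y_{t+1},Z_{t+1}\right)+Z_{t}M_{t+1},\\ Y_{T}&=&\eta,\end{array}\right.\tag{2.1}$$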

where $\eta\in L\left(\mathcal{F}_{T};\mathbb{R}^{n}\right)$ and $f:\Omega\times\left\{1,2,...,T\right\}\times\mathbb{R}^{n}\times\mathbb{R}^{n\times d}\longmapsto\mathbb{R}^{n}$ is an $\mathcal{F}_{t}$ -adapted mapping.

Assumption 2.3

A1. For any $y\in\mathbb{R}^{n}$ , $t\in\left\{1,2,...,T-1\right\}$ , $\omega\in\Omega$ , and $Z^{1},$ $Z^{2}\in\mathcal{M}\left(0,T-1;\mathbb{R}^{n\times d}\right)$ , if $Z^{1}\thicksim_{M}Z^{2}$ , then

$$f\left(\omega,t,y,Z_{t}^{1}\right)=f\left(\omega,t,y,Z_{t}^{2}\right).$$

A2. The function $f\left(t,y,z\right)$ is independent of $z$ at $t=T$ .

We have the following existence and uniqueness theorem of BS $\Delta$ E (2.1) in [12].

Theorem 2.4

Suppose that Assumption 2.3 holds. Then for any terminal condition $\eta\in L\left(\mathcal{F}_{T};\mathbb{R}^{n}\right)$ , BS $\Delta$ E (2.1) has a unique adapted solution $\left(Y,Z\right)$ . Here the uniqueness for $Y$ is in the sense of indistinguishability and for $Z$ is in the sense of $\thicksim_{M}$ equivalence.
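Because the generator in (2.1) is evaluated at time $t+1$, the equation can be solved by an explicit backward recursion: taking conditional expectations gives $Y_{t}=\mathbb{E}\left[Y_{t+1}+f\left(t+1,Y_{t+1},Z_{t+1}\right)|\mathcal{F}_{t}\right]$, and a representative of $Z_{t}$ is the row collecting the values of $Y_{t+1}+f\left(t+1,Y_{t+1},Z_{t+1}\right)$ over the $d$ possible next states. The following is a sketch of that recursion for scalar $Y$ on a small path tree, not the construction used in [12]; `solve_bsde`, `prob`, `eta` and the linear generator are hypothetical.

```python
from itertools import product

def solve_bsde(T, d, prob, f, eta):
    """Backward recursion for  Delta Y_t = -f(t+1, Y_{t+1}, Z_{t+1}) + Z_t M_{t+1}.

    Nodes are tuples of the states visited at times 1..t (the root is ()).
    prob(path) returns the d conditional probabilities of the next state.
    """
    Y = {path: eta(path) for path in product(range(d), repeat=T)}
    Z = {}
    for t in range(T - 1, -1, -1):
        for path in product(range(d), repeat=t):
            # g[j] = Y_{t+1} + f(t+1, Y_{t+1}, Z_{t+1}) on the event W_{t+1} = e_j.
            g = []
            for j in range(d):
                child = path + (j,)
                g.append(Y[child] + f(t + 1, Y[child], Z.get(child)))
            p = prob(path)
            Y[path] = sum(pj * gj for pj, gj in zip(p, g))  # conditional expectation
            Z[path] = g  # one representative of the equivalence class of Z_t
    return Y, Z

# Hypothetical two-state example (d = 2), scalar Y.
T, d = 3, 2
prob = lambda path: [0.4, 0.6]
eta = lambda path: float(sum(path))            # terminal value Y_T

def f(t, y, z):
    # At t = T there is no Z_T, so f must not use z there (Assumption A2).
    if z is None:
        return 0.1 * y
    return 0.1 * y + 0.05 * (z[0] - z[1])      # depends on z only through z_1 - z_2

Y, Z = solve_bsde(T, d, prob, f, eta)

# Largest violation of the difference equation over all nodes and branches.
max_residual = max(
    abs((Y[path + (j,)] - Y[path])
        - (-f(len(path) + 1, Y[path + (j,)], Z.get(path + (j,)))
           + Z[path][j] - sum(pj * zj for pj, zj in zip(prob(path), Z[path]))))
    for t in range(T) for path in product(range(d), repeat=t) for j in range(d)
)
```

The generator depends on `z` only through the difference of its entries, so it takes the same value on equivalent representatives, in line with A1.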

We define the $d\times\left(d-1\right)$ matrix $\widetilde{I}=\begin{pmatrix}I_{d-1}&-\mathbf{1}_{d-1}\end{pmatrix}^{\ast},$ where $I_{d-1}$ is $\left(d-1\right)$ -dimensional identity matrix, $\mathbf{1}_{d-1}\mathbf{=}\left(1,1,...,1\right)^{\ast}$ is $\left(d-1\right)$ -dimensional vector with every element being equal to $1$ . Then, we consider two types of controlled systems.
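As a quick check of this definition: the columns of $\widetilde{I}$ are $e_{i}-e_{d}$, so $z\widetilde{I}$ collects the differences $z_{i}-z_{d}$ and does not change when a common constant is added to every entry of $z$. The sketch below builds $\widetilde{I}$ for $d=4$; the function names and the sample row are illustrative assumptions.

```python
def i_tilde(d):
    """d x (d-1) matrix: identity I_{d-1} stacked on top of the row (-1, ..., -1)."""
    rows = [[1.0 if i == j else 0.0 for j in range(d - 1)] for i in range(d - 1)]
    rows.append([-1.0] * (d - 1))
    return rows

def row_times(z, A):
    """Row vector z (length d) times the d x k matrix A."""
    return [sum(z[i] * A[i][j] for i in range(len(z))) for j in range(len(A[0]))]

d = 4
I_t = i_tilde(d)
z = [3.0, -1.0, 0.5, 2.0]                    # hypothetical row of Z_t
z_shifted = [zi + 10.0 for zi in z]          # same row plus a common constant

reduced = row_times(z, I_t)                  # (z_1 - z_d, ..., z_{d-1} - z_d)
reduced_shifted = row_times(z_shifted, I_t)  # unchanged by the common shift
```

This is one way to read why the coefficients below take $Z_{t}\widetilde{I}$ rather than $Z_{t}$ as an argument: the product removes the component of $Z_{t}$ that the term $Z_{t}M_{t+1}$ cannot see.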

Problem 1 (partially coupled system):

The controlled system is

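$$\left\{\begin{array}[c]{rcl}\Delta X_{t}&=&b\left(\omega,t,X_{t},u_{t}\right)+\sum\limits_{i=1}^{m}e_{i}\cdot\sigma_{i}\left(\omega,t,X_{t},u_{t}\right)M_{t+1},\\ \Delta Y_{t}&=&-f\left(\omega,t+1,X_{t+1},Y_{t+1},Z_{t+1}\widetilde{I},u_{t+1}\right)+Z_{t}M_{t+1},\\ X_{0}&=&x_{0},\\ Y_{T}&=&y_{T}\end{array}\right.\tag{2.2}$$

and the cost functional is

$$J\left(u\left(\cdot\right)\right)=\mathbb{E}\left[\sum\limits_{t=0}^{T-1}l\left(\omega,t,X_{t},Y_{t},Z_{t}\widetilde{I},u_{t}\right)+h\left(\omega,X_{T}\right)\right]\tag{2.3}$$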

where

[TABLE]

Problem 2 (fully coupled system):

The controlled system is:

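$$\left\{\begin{array}[c]{rcl}\Delta X_{t}&=&b\left(\omega,t,X_{t},Y_{t},Z_{t}\widetilde{I},u_{t}\right)+\sigma\left(\omega,t,X_{t},Y_{t},Z_{t}\widetilde{I},u_{t}\right)M_{t+1},\\ \Delta Y_{t}&=&-f\left(\omega,t+1,X_{t+1},Y_{t+1},Z_{t+1}\widetilde{I},u_{t+1}\right)+Z_{t}M_{t+1},\\ X_{0}&=&x_{0},\\ Y_{T}&=&y_{T},\end{array}\right.\tag{2.4}$$

and the cost functional is

$$J\left(u\left(\cdot\right)\right)=\mathbb{E}\left[\sum\limits_{t=0}^{T-1}l\left(\omega,t,X_{t},Y_{t},Z_{t}\widetilde{I},u_{t}\right)+h\left(\omega,X_{T}\right)\right]\tag{2.5}$$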

where

[TABLE]

Let $\left\{U_{t}\right\}_{t\in\left\{0,1,...,T\right\}}$ be a sequence of nonempty convex subsets of $\mathbb{R}^{r}$ . We denote the set of admissible controls $\mathcal{U}$ by $\mathcal{U}=\left\{u\left(\cdot\right)\in\mathcal{M}\left(0,T;\mathbb{R}^{r}\right)|u\left(t\right)\in U_{t}\right\}.$ It can be seen that in Problem 1, $b$ and $\sigma$ do not contain the solution $(Y,Z)$ of the backward equation. This kind of FBS $\Delta$ E is called a partially coupled FBS $\Delta$ E. Meanwhile, the system in Problem 2 is called a fully coupled FBS $\Delta$ E.

The optimal control problem is to find the optimal control $u\in\mathcal{U}$ , such that the optimal control and the corresponding state trajectory can minimize the cost functional $J\left(u\left(\cdot\right)\right)$ . In this paper, we assume the control domain is convex.

Remark 2.5

The cost functional in [19] consists of three parts: the running cost functional, the terminal cost functional of $X_{T}$ , the initial cost functional of $Y_{0}$ . In our formulation, if we take $l\left(\omega,0,X_{0},Y_{0},Z_{0},u_{0}\right)=\gamma\left(\omega,Y_{0}\right)$ , then the cost functional (2.5) for our discrete time framework can be reduced to the cost functional in [19] formally.

For controlled system (2.2)-(2.3), we assume that:

Assumption 2.6

For $\varphi=b$ , $\sigma_{i}\widetilde{I}$ , $f$ , $l$ , $h$ ,

1. $\varphi$ is an adapted map, i.e. for any $\left(x,y,\widetilde{z},u\right)\in\mathbb{R}^{m}\times\mathbb{R}^{n}\times\mathbb{R}^{n\times\left(d-1\right)}\times\mathbb{R}^{r}$ , $\varphi\left(\cdot,\cdot,x,y,\widetilde{z},u\right)$ is an $\left\{\mathcal{F}_{t}\right\}$ -adapted process.

2. for any $t\in\left\{0,1,...,T\right\}$ and $\omega\in\Omega$ , $\varphi\left(\omega,t,\cdot,\cdot,\cdot,\cdot\right)$ is continuously differentiable with respect to $x,y,\widetilde{z},u$ , and $\varphi_{x},\varphi_{y},\varphi_{\widetilde{z}_{i}},\varphi_{u}$ are uniformly bounded. Also, $f$ is independent of $\widetilde{z}$ at time $t=T$ .

Set

[TABLE]

and

[TABLE]

For controlled system (2.4)-(2.5), we additionally assume that:

Assumption 2.7

For any $u\in\mathcal{U}$ , the coefficients in (2.4) satisfy the following monotone conditions, i.e. when $t\in\left\{1,...,T-1\right\}$ ,

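$$\left\langle A\left(t,\lambda_{1};u\right)-A\left(t,\lambda_{2};u\right),\lambda_{1}-\lambda_{2}\right\rangle\leq-\alpha\left|\lambda_{1}-\lambda_{2}\right|^{2},\quad\forall\lambda_{1},\lambda_{2}\in\mathbb{R}\times\mathbb{R}\times\mathbb{R}^{1\times d};$$

when $t=T$ ,

$$\left\langle-f\left(T,x_{1},y,z\widetilde{I},u\right)+f\left(T,x_{2},y,z\widetilde{I},u\right),x_{1}-x_{2}\right\rangle\leq-\alpha\left|x_{1}-x_{2}\right|^{2};$$

when $t=0$ ,

$$\begin{array}[c]{cl}&\left\langle b\left(0,\lambda_{1};u\right)-b\left(0,\lambda_{2};u\right),y_{1}-y_{2}\right\rangle\\ &+\left\langle\left(\sigma\left(0,\lambda_{1};u\right)-\sigma\left(0,\lambda_{2};u\right)\right)\mathbb{E}\left[M_{1}M_{1}^{\ast}|\mathcal{F}_{0}\right],z_{1}-z_{2}\right\rangle\\ \leq&-\alpha\left[\left|y_{1}-y_{2}\right|^{2}+\left|\left(z_{1}-z_{2}\right)\widetilde{I}\right|^{2}\right],\end{array}$$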

where $\alpha$ is a given positive constant.

Besides, in the following, we formally denote $b\left(T,x,y,z\widetilde{I},u\right)\equiv 0$ , $\sigma\left(T,x,y,z\widetilde{I},u\right)\equiv 0$ , $l\left(T,x,y,z\widetilde{I},u\right)\equiv 0$ , $f\left(0,x,y,z\widetilde{I},u\right)\equiv 0$ .

3 Maximum principle for the partially coupled FBS $\Delta$ E system

For any $u\in\mathcal{U}$ , it is obvious that there exists a unique solution $\left\{X_{t}\right\}_{t=0}^{T}\in\mathcal{M}\left(0,T;\mathbb{R}^{m}\right)$ to the forward stochastic difference equation in the system (2.2). According to Lemma 2.3 in [12], it can be seen that $f$ satisfies Assumption (2.3). So given $X$ , by Theorem 2.4, the backward equation in the system (2.2) has a unique solution $\left(Y,Z\right)$ .

Suppose that $\bar{u}=\left\{\bar{u}_{t}\right\}_{t=0}^{T}$ is the optimal control of problem (2.2)-(2.3) and $\left(\bar{X},\bar{Y},\bar{Z}\right)$ is the corresponding optimal trajectory. For a fixed time $0\leq s\leq T$ , choose any $\Delta v\in L\left(\mathcal{F}_{s};\mathbb{R}^{r}\right)$ such that $\bar{u}_{s}+\Delta v$ takes values in $U_{s}$ . For any $\varepsilon\in\left[0,1\right]$ , construct the perturbed admissible control

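$$u_{t}^{\varepsilon}=\left(1-\delta_{ts}\right)\bar{u}_{t}+\delta_{ts}\left(\bar{u}_{s}+\varepsilon\Delta v\right)=\bar{u}_{t}+\delta_{ts}\varepsilon\Delta v,\tag{3.1}$$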

where $\delta_{ts}=1$ for $t=s$ , $\delta_{ts}=0$ for $t\neq s$ and $t\in\left\{0,1,...,T\right\}$ . Since $U_{s}$ is a convex set, $\left\{u_{t}^{\varepsilon}\right\}_{t=0}^{T}\in\mathcal{U}$ is an admissible control. Let $\left(X^{\varepsilon},Y^{\varepsilon},Z^{\varepsilon}\right)$ be the solution of (2.2) corresponding to the control $u^{\varepsilon}$ .
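The perturbation changes the control only at time $s$, where $u_{s}^{\varepsilon}=\left(1-\varepsilon\right)\bar{u}_{s}+\varepsilon\left(\bar{u}_{s}+\Delta v\right)$ is a convex combination of two points of $U_{s}$; this is where convexity of the control domain enters. A small sketch with a deterministic scalar control follows. The interval $U_{t}=\left[-1,1\right]$, the reference control and the function name `perturb` are hypothetical.

```python
def perturb(u_bar, s, eps, dv):
    """u^eps_t = u_bar_t + delta_{ts} * eps * dv (a change at time s only)."""
    return [u + (eps * dv if t == s else 0.0) for t, u in enumerate(u_bar)]

lo, hi = -1.0, 1.0                 # hypothetical control domain U_t = [-1, 1]
u_bar = [0.5, -0.25, 0.0, 1.0]     # hypothetical reference control, T = 3
s, dv = 1, 1.25                    # u_bar[s] + dv = 1.0 lies in U_s

# Perturbed controls for several values of eps in [0, 1].
u_eps = {eps: perturb(u_bar, s, eps, dv) for eps in (0.0, 0.5, 1.0)}

# Admissibility: every perturbed value stays inside [lo, hi].
admissible = all(lo <= u <= hi for ctrl in u_eps.values() for u in ctrl)
```

For a non-convex domain, such as the two-point set $\left\{-1,1\right\}$, the intermediate value at $\varepsilon=0.5$ would leave the domain, which is why this form of perturbation is tied to the convexity assumption.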

Set

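$$\begin{array}[c]{rclrcl}\bar{\varphi}\left(t\right)&=&\varphi\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t}\widetilde{I},\bar{u}_{t}\right),&\varphi^{\varepsilon}\left(t\right)&=&\varphi\left(t,X_{t}^{\varepsilon},Y_{t}^{\varepsilon},Z_{t}^{\varepsilon}\widetilde{I},u_{t}^{\varepsilon}\right),\\ \widetilde{\varphi}^{\varepsilon}\left(t\right)&=&\varphi\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t}\widetilde{I},u_{t}^{\varepsilon}\right),&\varphi_{\mu}\left(t\right)&=&\varphi_{\mu}\left(t,\bar{X}_{t},\bar{Y}_{t},\bar{Z}_{t}\widetilde{I},\bar{u}_{t}\right),\end{array}$$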

where $\varphi=b$ , $\sigma_{i}$ , $f$ , $l$ , $h$ and $\mu=x$ , $y$ , $z_{i}$ and $u$ .

Then, we have the following estimates.

Lemma 3.1

Under Assumption 2.6, we have

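$$\sup_{0\leq t\leq T}\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\leq C\varepsilon^{2}\mathbb{E}\left|\Delta v\right|^{2}.$$

Proof. In the following, the positive constant $C$ may change from line to line.

When $t=0,...,s$ , $X_{t}^{\varepsilon}=\bar{X}_{t}$ .

When $t=s+1$ ,

$$X_{s+1}^{\varepsilon}-\bar{X}_{s+1}=b^{\varepsilon}\left(s\right)-\overline{b}\left(s\right)+\sum\limits_{i=1}^{m}e_{i}\cdot\left[\sigma_{i}^{\varepsilon}\left(s\right)-\overline{\sigma}_{i}\left(s\right)\right]M_{s+1}.$$

Then,

$$\mathbb{E}\left|X_{s+1}^{\varepsilon}-\bar{X}_{s+1}\right|^{2}\leq 2\mathbb{E}\left[\left|b^{\varepsilon}\left(s\right)-\overline{b}\left(s\right)\right|^{2}+\sum\limits_{i=1}^{m}\left|\left[\sigma_{i}^{\varepsilon}\left(s\right)-\overline{\sigma}_{i}\left(s\right)\right]M_{s+1}\right|^{2}\right].$$

By the boundedness of $b_{u}$ , we have

$$\mathbb{E}\left[\left|b^{\varepsilon}\left(s\right)-\overline{b}\left(s\right)\right|^{2}\right]\leq C\mathbb{E}\left[\left|u_{s}^{\varepsilon}-\bar{u}_{s}\right|^{2}\right]=C\varepsilon^{2}\mathbb{E}\left[\left|\Delta v\right|^{2}\right].$$

By Proposition 2.4 in [12] and the boundedness of $\sigma_{iu}\widetilde{I}$ , we have

$$\begin{array}[c]{cl}&\mathbb{E}\left|\left[\widetilde{\sigma_{i}}^{\varepsilon}\left(s\right)-\overline{\sigma}_{i}\left(s\right)\right]M_{s+1}\right|^{2}\\ \leq&C\mathbb{E}\left[\left|\left[\widetilde{\sigma_{i}}^{\varepsilon}\left(s\right)-\overline{\sigma}_{i}\left(s\right)\right]\widetilde{I}\right|^{2}\right]\\ \leq&C\varepsilon^{2}\mathbb{E}\left[\left|\Delta v\right|^{2}\right]\end{array}$$

which leads to

$$\mathbb{E}\left|X_{s+1}^{\varepsilon}-\bar{X}_{s+1}\right|^{2}\leq C\varepsilon^{2}\mathbb{E}\left[\left|\Delta v\right|^{2}\right].$$

When $t=s+2,...,T$ ,

$$\begin{array}[c]{cl}&\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\\ \leq&2\mathbb{E}\left[\left|b\left(t-1,X_{t-1}^{\varepsilon},\bar{u}_{t-1}\right)-b\left(t-1,\bar{X}_{t-1},\bar{u}_{t-1}\right)\right|^{2}\right.\\ &+\left.\sum\limits_{i=1}^{m}\left|\left[\sigma_{i}\left(t-1,X_{t-1}^{\varepsilon},\bar{u}_{t-1}\right)-\sigma_{i}\left(t-1,\bar{X}_{t-1},\bar{u}_{t-1}\right)\right]M_{t}\right|^{2}\right].\end{array}$$

Due to the boundedness of $b_{x}$ and $\sigma_{ix}\widetilde{I}$ , combined with Proposition 2.4 in [12], we obtain $\mathbb{E}\left|X_{t}^{\varepsilon}-\bar{X}_{t}\right|^{2}\leq C\mathbb{E}\left[\left|X_{t-1}^{\varepsilon}-\bar{X}_{t-1}\right|^{2}\right]$ . Thus, the result follows by induction.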
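The $\varepsilon^{2}$ scaling in Lemma 3.1 can be observed numerically on a toy forward equation. The sketch below is an illustration under stated assumptions, not the paper's setting in full: scalar $X$, $d=2$ states with constant transition probabilities, reference control $\bar{u}\equiv 0$, $\Delta v=1$, and hypothetical coefficients $b=\sin x+u$ and $\sigma=\left(\varsigma,-\varsigma\right)$ with $\varsigma=0.3\cos x+0.2u$, all of which have bounded derivatives. Expectations are computed exactly by enumerating all $d^{T}$ paths.

```python
import math
from itertools import product

P = [0.4, 0.6]                      # hypothetical conditional law of W_{t+1}, d = 2
T, S = 4, 1                         # horizon and perturbation time s

def step(x, u, j):
    """One step X_{t+1} = X_t + b + sigma * M_{t+1} on the event W_{t+1} = e_j."""
    b = math.sin(x) + u
    sig = 0.3 * math.cos(x) + 0.2 * u          # sigma = (sig, -sig), a row in R^{1x2}
    m = [(1.0 if i == j else 0.0) - P[i] for i in range(2)]
    return x + b + sig * (m[0] - m[1])

def trajectory(path, eps):
    """States X_0..X_T along one path, with control u_t = eps at t = S, else 0."""
    xs = [0.5]
    for t, j in enumerate(path):
        xs.append(step(xs[-1], eps if t == S else 0.0, j))
    return xs

def sup_mean_square_gap(eps):
    """sup_t E|X^eps_t - X_bar_t|^2, by exact enumeration of all d^T paths."""
    gaps = [0.0] * (T + 1)
    for path in product(range(2), repeat=T):
        weight = math.prod(P[j] for j in path)
        x_eps, x_bar = trajectory(path, eps), trajectory(path, 0.0)
        for t in range(T + 1):
            gaps[t] += weight * (x_eps[t] - x_bar[t]) ** 2
    return max(gaps)

# Ratio of the left-hand side of the estimate to eps^2 for shrinking eps.
ratios = {eps: sup_mean_square_gap(eps) / eps ** 2 for eps in (0.1, 0.01, 0.001)}
```

In this example the ratios stay bounded as $\varepsilon$ decreases, consistent with an estimate of the form $C\varepsilon^{2}\mathbb{E}\left|\Delta v\right|^{2}$; the constant depends on the bounds of the derivatives and on $T$.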

Let $\xi=\left\{\xi_{t}\right\}_{t=0}^{T}$ be the solution to the following difference equation,

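$$\left\{\begin{array}[c]{rcl}\Delta\xi_{t}&=&b_{x}\left(t\right)\xi_{t}+\delta_{ts}b_{u}\left(t\right)\varepsilon\Delta v+\sum\limits_{i=1}^{m}e_{i}\cdot\left[\xi_{t}^{\ast}\sigma_{ix}\left(t\right)+\delta_{ts}\varepsilon\Delta v^{\ast}\sigma_{iu}\left(t\right)\right]M_{t+1},\\ \xi_{0}&=&0.\end{array}\right.$$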

It is easy to check that

[TABLE]

and we have the following result:

Lemma 3.2

Under Assumption 2.6, we have

[TABLE]

Proof. When $t=0,...,s$ , $X_{t}^{\varepsilon}=\bar{X}_{t}$ and $\xi_{t}=0$ which lead to $X_{t}^{\varepsilon}-\bar{X}_{t}-\xi_{t}=0.$

When $t=s+1$ ,

[TABLE]

where

[TABLE]

Then

[TABLE]

Since $\left\|\widetilde{b}_{u}\left(s\right)-b_{u}\left(s\right)\right\|\rightarrow 0$ and $\left\|\left[\widetilde{\sigma}_{iu}\left(s\right)-\sigma_{iu}\left(s\right)\right]\widetilde{I}\right\|\rightarrow 0$ as $\varepsilon\rightarrow 0$ , we have

[TABLE]

When $t=s+2,...,T$ ,

[TABLE]

where

[TABLE]

Then

[TABLE]

It is easy to check that $\left\|\widetilde{b}_{x}\left(t-1\right)-b_{x}\left(t-1\right)\right\|\rightarrow 0$ and $\left\|\left[\widetilde{\sigma}_{ix}\left(t-1\right)-\sigma_{ix}\left(t-1\right)\right]\widetilde{I}\right\|\rightarrow 0$ as $\varepsilon\rightarrow 0$ . Since $\widetilde{b}_{x}\left(t-1\right)$ and $\widetilde{\sigma}_{ix}\left(t-1\right)$ are bounded, by the estimation (3.5), we have

[TABLE]

This completes the proof.

Lemma 3.3

Under Assumption 2.6, we have

[TABLE]

Proof. It is obvious that $Y_{T}^{\varepsilon}-\bar{Y}_{T}=0$ at time $T$ .

When $t=s,...,T-1$ (if $s=T$ , skip this part), we have

[TABLE]

It yields that

[TABLE]

Similarly, we have

[TABLE]

Combined with Proposition 2.4, we have

[TABLE]

When $t=s-1$ , by similar analysis,

[TABLE]

If $s=T$ ,

[TABLE]

When $t=0,...,s-2$ , we have

[TABLE]

Thus, there exists $C>0$ , such that for any $t\in\left\{0,1,...,T\right\}$ ,

[TABLE]

This completes the proof.

Let $\left(\eta,\zeta\right)$ be the solution to the following BS $\Delta$ E,

[TABLE]

It is easy to check that

[TABLE]

and we have the following result:

Lemma 3.4

Under Assumption 2.6, we have

[TABLE]

Proof. When $t=T$ , $Y_{T}^{\varepsilon}-\bar{Y}_{T}-\eta_{T}=0$ .

When $t\in\left\{0,1,...,T-1\right\}$ , we have

[TABLE]

where

[TABLE]

for $\mu=x$ , $y$ , $z_{i}$ and $u$ . Then,

[TABLE]

and

[TABLE]

Notice that $\widetilde{f}_{x}\left(t\right)-f_{x}\left(t\right)\rightarrow 0,$ $\widetilde{f}_{y}\left(t\right)-f_{y}\left(t\right)\rightarrow 0,$ $\widetilde{f}_{z_{i}}\left(t\right)-f_{z_{i}}\left(t\right)\rightarrow 0,$ $\widetilde{f}_{u}\left(t\right)-f_{u}\left(t\right)\rightarrow 0$ as $\varepsilon\rightarrow 0$ . We obtain that

[TABLE]

This completes the proof.

By Lemma 3.2 and Lemma 3.4, we have

[TABLE]

We introduce the following adjoint equation:

[TABLE]

where $\left(\cdot\right)^{{\dagger}}$ denotes the pseudoinverse of a matrix.

Obviously the forward equation in (3.8) admits a unique solution $k\in\mathcal{M}\left(0,T;\mathbb{R}^{n}\right)$ . Then, based on the solution $k$ , according to Theorem 2.4, it is easy to check that the backward equation in (3.8) has a unique solution $\left(p,q\right)\in\mathcal{M}\left(0,T;\mathbb{R}^{m}\right)\times\mathcal{M}\left(0,T-1;\mathbb{R}^{m\times d}\right)$ . So FBS $\Delta$ E has a unique solution $\left(p,q,k\right)$ .

We obtain the following maximum principle for the optimal control problem (2.2)-(2.3).

Define the Hamiltonian function

[TABLE]

Theorem 3.5

Suppose that Assumption 2.6 holds. Let $\bar{u}$ be an optimal control of the problem (2.2)-(2.3), $\left(\bar{X},\bar{Y},\bar{Z}\right)$ be the corresponding optimal trajectory and $\left(p,q,k\right)$ be the solution to the adjoint equation (3.8). Then for any $t\in\left\{0,1,...,T\right\}$ , $v\in U_{t}$ and $\omega\in\Omega$ , we have

[TABLE]

Proof. For $t\in\left\{0,1,...,T-1\right\}$ , we have

[TABLE]

where

[TABLE]

It is obvious that $\mathbb{E}\left[\Phi_{t}\right]=0$ . We have

[TABLE]

and

[TABLE]

Similarly, it can be shown that for $t\in\left\{0,1,...,T-1\right\}$ ,

[TABLE]

where

[TABLE]

According to the result in [4], we know that $\forall\omega\in\Omega$ ,

[TABLE]

Then we can obtain

[TABLE]

Similarly,

[TABLE]

Thus

[TABLE]

Therefore,

[TABLE]

Since $\xi_{0}=0$ and $k_{0}=0$ , we deduce

[TABLE]

By $\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}\left[J\left(u^{\varepsilon}\left(\cdot\right)\right)-J\left(\bar{u}\left(\cdot\right)\right)\right]\geq 0$ , we obtain

[TABLE]

Since $s$ is arbitrary, (3.9) follows. This completes the proof.
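The final step of the proof is a first-order condition over a convex control domain: if $\bar{u}$ is optimal, then the directional derivative of the cost along any admissible direction $v-\bar{u}$ is nonnegative. The following Python sketch checks this on a deterministic toy problem that is not taken from the paper: the cost `J`, the target `c` and the box $\left[0,1\right]^{3}$ are all hypothetical choices made only to show the shape of the inequality.

```python
import numpy as np

def J(u, c):
    # Hypothetical quadratic cost (assumption, not the cost in (2.3)).
    return 0.5 * np.sum((u - c) ** 2)

def grad_J(u, c):
    # Gradient of the quadratic cost above.
    return u - c

# Unconstrained minimizer c lies partly outside the box [0, 1]^3.
c = np.array([1.7, -0.4, 0.3])
lo, hi = 0.0, 1.0

# For this separable quadratic cost the constrained minimizer is the projection.
u_bar = np.clip(c, lo, hi)

# Sample admissible controls v and evaluate <grad J(u_bar), v - u_bar>.
rng = np.random.default_rng(1)
V = rng.uniform(lo, hi, size=(1000, 3))
directional = (V - u_bar) @ grad_J(u_bar, c)
min_directional = directional.min()

# Finite-difference version of (1/eps) [J(u_bar + eps (v - u_bar)) - J(u_bar)].
eps = 1e-6
v = V[0]
fd = (J(u_bar + eps * (v - u_bar), c) - J(u_bar, c)) / eps

print("smallest directional derivative:", min_directional)
print("finite difference vs. gradient formula:", fd, directional[0])
```

The sampled directional derivatives are all nonnegative, which is the deterministic analogue of the inequality obtained from $\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}\left[J\left(u^{\varepsilon}\left(\cdot\right)\right)-J\left(\bar{u}\left(\cdot\right)\right)\right]\geq 0$ .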

Remark 3.6

In the introduction we pointed out that a suitable representation of the product rule is needed. When we calculate $\Delta\left\langle\xi_{t},p_{t}\right\rangle$ in (3.10), it is represented as $\left\langle\xi_{t+1},\cdots\right\rangle+\cdots$ . Combined with the formulation of the BS $\Delta$ E mentioned in the introduction, this representation leads to terms such as $\left\langle\square_{t},\Diamond_{t}\right\rangle-\left\langle\square_{t+1},\Diamond_{t+1}\right\rangle$ in (3.11). Summing and rearranging these terms in (3.12) yields the dual relation (3.13).
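The product rule $\Delta\left\langle X_{t},Y_{t}\right\rangle=\left\langle X_{t+1},\Delta Y_{t}\right\rangle+\left\langle\Delta X_{t},Y_{t}\right\rangle$ and the telescoping of the resulting differences can be checked numerically. The Python sketch below does so for two arbitrary randomly generated paths; the horizon `T`, the dimension `n` and the paths themselves are illustrative assumptions and are unrelated to the processes $\xi$ and $p$ of the proof.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 6, 3  # illustrative horizon and dimension (assumptions)

# Arbitrary paths X_0, ..., X_T and Y_0, ..., Y_T in R^n.
X = rng.normal(size=(T + 1, n))
Y = rng.normal(size=(T + 1, n))

# Forward differences: dX[t] = X_{t+1} - X_t, dY[t] = Y_{t+1} - Y_t.
dX = X[1:] - X[:-1]
dY = Y[1:] - Y[:-1]

# inner[t] = <X_t, Y_t>.
inner = np.einsum("ti,ti->t", X, Y)

# Left-hand side: Delta <X_t, Y_t> for t = 0, ..., T-1.
lhs = inner[1:] - inner[:-1]

# Right-hand side: <X_{t+1}, Delta Y_t> + <Delta X_t, Y_t>.
rhs = np.einsum("ti,ti->t", X[1:], dY) + np.einsum("ti,ti->t", dX, Y[:-1])

# Summing over t telescopes to the boundary term <X_T, Y_T> - <X_0, Y_0>.
telescoped = rhs.sum()
boundary = inner[-1] - inner[0]

print("product rule holds:", np.allclose(lhs, rhs))
print("telescoped sum equals boundary term:", np.isclose(telescoped, boundary))
```

Note that the first term pairs $X_{t+1}$ , not $X_{t}$ , with $\Delta Y_{t}$ ; this is the asymmetric form of the rule referred to in the remark.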

4 Maximum principle for the fully coupled FBS $\Delta$ E system

In this section we consider the control problem (2.4)-(2.5). Without loss of generality, we only consider the one-dimensional case for $X$ and $Y$ . Let $\bar{u}=\left\{\bar{u}_{t}\right\}_{t=0}^{T}$ be an optimal control for the control problem (2.4)-(2.5) and $\left(\bar{X},\bar{Y},\bar{Z}\right)$ be the corresponding optimal trajectory. The existence and uniqueness of $\left(\bar{X},\bar{Y},\bar{Z}\right)$ are guaranteed by the results in [12]. The perturbed control $u^{\varepsilon}$ is defined as in (3.1), and we denote by $\left(X^{\varepsilon},Y^{\varepsilon},Z^{\varepsilon}\right)$ the corresponding trajectory.

Let

[TABLE]

Using analysis and notation similar to those in Section 3, we have

[TABLE]

Lemma 4.1

Under Assumption 2.6 and Assumption 2.7, we have

[TABLE]

Proof. By (4.1),

[TABLE]

By the monotone condition, we obtain

[TABLE]

On the other hand,

[TABLE]

and

[TABLE]

Thus

[TABLE]

Combining (4.3) and (4.4), we have

[TABLE]

This completes the proof.

Next we introduce the following variational equation:

[TABLE]

By Assumption 2.6 and Assumption 2.7, when $t\in\left\{1,...,T-1\right\}$ ,

[TABLE]

when $t=0$ ,

[TABLE]

when $t=T$ ,

[TABLE]

Thus, the coefficients of (4.5) satisfy the monotone condition and there exists a unique solution $\left(\xi,\eta,\zeta\right)$ to (4.5). Similar to the proof of Lemma 4.1, we have

[TABLE]

Define

[TABLE]

where $\varphi=b$ , $\sigma_{i}$ , $f$ , $l$ , $h$ and $\mu=x$ , $y$ , $z$ and $u$ .

Lemma 4.2

Under Assumption 2.6 and Assumption 2.7, we have

[TABLE]

Proof. Note that

[TABLE]

Set

[TABLE]

Then,

[TABLE]

where

[TABLE]

According to (4.10),

[TABLE]

where

[TABLE]

Combining (4.6), (4.7) and (4.8), we have

[TABLE]

Note that

[TABLE]

When $\varepsilon\rightarrow 0$ , $\left\|\widetilde{f}_{\mu}\left(t\right)-f_{\mu}\left(t\right)\right\|\rightarrow 0$ for $\mu=x$ , $y$ , $\widetilde{z}$ and $u$ . Then, by Lemma 4.1,

[TABLE]

Similar results hold for the other terms in (4.11). Finally, we have

[TABLE]

This completes the proof.

By Lemma 4.2, we obtain

[TABLE]

Introduce the following adjoint equation:

[TABLE]

Define the Hamiltonian function as follows:

[TABLE]

Theorem 4.3

Suppose that Assumption 2.6 and Assumption 2.7 hold. Let $\bar{u}$ be an optimal control for (2.4)-(2.5), $\left(\bar{X},\bar{Y},\bar{Z}\right)$ be the corresponding optimal trajectory and $\left(p,q,k\right)$ be the solution to the adjoint equation (4.12). Then, for any $t\in\left\{0,1,...,T\right\}$ , $\omega\in\Omega$ and $v\in U_{t}$ , we have

[TABLE]

Proof. From the expressions of $\xi_{t}$ and $p_{t}$ , for $t\in\left\{0,1,...,T-1\right\}$ we have

[TABLE]

where

[TABLE]

We have $\mathbb{E}\left[\Phi_{t}\right]=0$ . Besides,

[TABLE]

Similarly,

[TABLE]

where

[TABLE]

Furthermore,

[TABLE]

Then, we obtain

[TABLE]

Therefore,

[TABLE]

Notice that $\xi_{0}=0$ , $k_{0}=0$ . So

[TABLE]

Since $\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}\left[J\left(u^{\varepsilon}\left(\cdot\right)\right)-J\left(\bar{u}\left(\cdot\right)\right)\right]\geq 0$ , we obtain

[TABLE]

Since $s$ is arbitrary, (4.13) holds. This completes the proof.

Bibliography

[1] Bensoussan, A. (1982). Lectures on stochastic control. In Nonlinear filtering and stochastic control (pp. 1-62). Springer, Berlin, Heidelberg.
[2] Bielecki, T. R., Cialenco, I., & Chen, T. (2015). Dynamic conic finance via backward stochastic difference equations. SIAM Journal on Financial Mathematics, 6(1), 1068-1122.
[3] Bismut, J. M. (1978). An introductory approach to duality in optimal stochastic control. SIAM Review, 20(1), 62-78.
[4] Cohen, S. N., & Elliott, R. J. (2008). Solutions of backward stochastic differential equations on Markov chains. Communications on Stochastic Analysis, 2(2), 251-262.
[5] Cohen, S. N., & Elliott, R. J. (2010). A general theory of finite state backward stochastic difference equations. Stochastic Processes and their Applications, 120(4), 442-466.
[6] Cohen, S. N., & Elliott, R. J. (2011). Backward stochastic difference equations and nearly time-consistent nonlinear expectations. SIAM Journal on Control and Optimization, 49(1), 125-139.
[7] Dokuchaev, N., & Zhou, X. Y. (1999). Stochastic controls with terminal contingent conditions. Journal of Mathematical Analysis and Applications, 238(1), 143-165.
[8] Eberlein, E., Gehrig, T., & Madan, D. B. (2011). Pricing to acceptability: With applications to valuing one’s own credit risk.
