Maximum principle for stochastic optimal control problem of forward-backward stochastic difference systems
Shaolin Ji, Haodong Liu

TL;DR
This paper establishes a maximum principle for stochastic optimal control problems involving forward-backward stochastic difference systems, covering both partially and fully coupled equations, with a focus on convex control domains.
Contribution
It introduces a novel maximum principle for FBS{\Delta}Ss, together with new adjoint difference equations, that applies to both partially and fully coupled systems.
Findings
Derived the adjoint difference equation using a product rule representation.
Established the maximum principle for convex control domains.
Applied the framework to both partially and fully coupled FBS{\Delta}Ss.
Abstract
In this paper, we study the maximum principle for stochastic optimal control problems of forward-backward stochastic difference systems (FBS{\Delta}Ss). Two types of FBS{\Delta}Ss are investigated. The first one is described by a partially coupled forward-backward stochastic difference equation (FBS{\Delta}E) and the second one is described by a fully coupled FBS{\Delta}E. By adopting an appropriate representation of the product rule and an appropriate formulation of the backward stochastic difference equation (BS{\Delta}E), we deduce the adjoint difference equation. Finally, the maximum principle for this optimal control problem with the control domain being convex is established.
Shaolin Ji Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, PR China. [email protected]. Research supported by NSF (No. 11571203).
Haodong Liu Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, PR China. (Corresponding author).
Keywords: backward stochastic difference equations; forward-backward stochastic difference equations; monotone condition; stochastic optimal control; maximum principle
1 Introduction
The maximum principle is one of the principal approaches to solving optimal control problems. Much work has been done on the maximum principle for forward stochastic systems; see, for example, Bensoussan [2], Bismut [4], Kushner [12] and Peng [16]. Peng was also the first to study a forward-backward stochastic control system (FBSCS), in [17], and obtained the maximum principle for this kind of control system when the control domain is convex. FBSCSs have wide applications in many fields. For instance, the stochastic differential recursive utility, which generalizes the standard additive utility, can be regarded as the solution of a backward stochastic differential equation (BSDE), so the recursive utility optimization problem can be described as an optimization problem for an FBSCS (see [19]). Besides, in the dynamic principal-agent problem with unobservable states and actions, the principal’s problem can be formulated as a partial information optimal control problem of an FBSCS (see [22]). We refer to [8], [11], [13], [21], [24], [25] for other works on optimization problems for FBSCSs.
In this paper, we discuss the maximum principle for optimal control of discrete time systems described by forward-backward stochastic difference equations (FBSΔEs). To the best of our knowledge, there are few results on such optimal control problems, although discrete time control systems are of great practical value. For example, digital control can be formulated as a discrete time control problem in which the sampled data are obtained at discrete instants of time. Besides, the forward-backward stochastic difference system (FBSΔS) can be used for modeling in financial markets: the solution to a backward stochastic difference equation (BSΔE) can be used to construct time-consistent nonlinear expectations (see [5], [6]) and for pricing in financial markets (see [3]). However, the formulation of a BSΔE is quite different from that of its continuous time counterpart. Many works are devoted to the study of BSΔEs (see, e.g., [3], [5], [6], [20]). Based on the driving process, there are mainly two types of formulations of BSΔEs. One is driven by a finite state process which takes values in the basis vectors (as in [5]), and the other is driven by a martingale with independent increments (as in [3]). In the latter case, the solution of the BSΔE is a triple of processes, owing to the discrete time version of the Kunita–Watanabe decomposition. In this paper, we adopt the second type of formulation to investigate optimization problems for FBSΔSs.
Let be a probability space, and be a martingale process with independent increments. Define the difference operator as . Here we consider two types of controlled FBSSs.
Problem 1 (partially coupled system):
The controlled system is
[TABLE]
and the cost functional is
[TABLE]
Problem 2 (fully coupled system):
The controlled system is:
[TABLE]
and the cost functional is
[TABLE]
Let be a sequence of nonempty convex subsets of . We denote the set of admissible controls by . It can be seen that in Problem 1, the coefficients of the forward equation do not contain the solution of the backward equation. This kind of FBSΔE is called a partially coupled FBSΔE. Meanwhile, the system in Problem 2 is called a fully coupled FBSΔE.
The optimal control problem is to find an admissible control such that this control and the corresponding state trajectory minimize the cost functional. In this paper, we assume that the control domain is convex. By perturbing the optimal control at a fixed time point, we obtain the maximum principle for Problems 1 and 2.
To build the maximum principle, the key step is to find the adjoint variables which can be applied to deduce the variational inequality. In [14], the authors studied the maximum principle for a discrete time stochastic optimal control problem in which the state equation is governed only by a forward stochastic difference equation. By applying the Riesz representation theorem, they obtained the adjoint variables explicitly and established the maximum principle. To solve our problems, however, we need to construct adjoint difference equations, since in general the adjoint variables cannot be obtained explicitly in our case. The techniques adopted in the continuous time framework, as in [16, 17], are not applicable to constructing the adjoint equations in our discrete time framework. In this paper, we propose two techniques to deduce the adjoint difference equations. The first one is that we choose the following product rule:
[TABLE]
where (resp. ) satisfies a forward (resp. backward) stochastic difference equation. The second one is that the BSΔE should be formulated as in (2.1). In other words, the generator of the BSΔE (2.1) depends on time . It is worth pointing out that this formulation is exactly the formulation of the adjoint equations for stochastic optimal control problems (see [14] for the classical case). Based on these two techniques, we can deduce the adjoint difference equations. The reader may refer to Remark 3.6 for more details.
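The displayed product rule is not reproduced in this text, so the following is only a sketch of one algebraically valid discrete product rule that fits the description above: Δ⟨X_t, Y_t⟩ = ⟨X_{t+1}, ΔY_t⟩ + ⟨ΔX_t, Y_t⟩, with ΔU_t = U_{t+1} − U_t. The specific form, the forward-difference convention and the helper names `delta` and `product_rule_rhs` are assumptions made for illustration.

```python
import numpy as np

def delta(path):
    """Forward difference along time (axis 0): (delta U)_t = U_{t+1} - U_t."""
    return path[1:] - path[:-1]

def product_rule_rhs(X, Y):
    """<X_{t+1}, dY_t> + <dX_t, Y_t> for t = 0, ..., T-1 (assumed form)."""
    return np.sum(X[1:] * delta(Y), axis=1) + np.sum(delta(X) * Y[:-1], axis=1)

# Two arbitrary vector-valued paths with 6 time points in R^3.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
Y = rng.normal(size=(6, 3))

# Left-hand side: the increment of the inner product <X_t, Y_t>.
lhs = delta(np.sum(X * Y, axis=1))
print(np.allclose(lhs, product_rule_rhs(X, Y)))
```

The identity holds path by path for any two sequences. What distinguishes this form from the symmetric alternative ⟨X_t, ΔY_t⟩ + ⟨ΔX_t, Y_t⟩ + ⟨ΔX_t, ΔY_t⟩ is that the backward increment is paired with the forward process evaluated one step later.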
The remainder of this paper is organized as follows. In Section 2, two types of controlled FBSΔSs are formulated. We deduce the maximum principle for the partially coupled controlled FBSΔS in Section 3. Finally, we establish the maximum principle for the fully coupled controlled FBSΔS in Section 4.
2 Preliminaries and model formulation
Let be a deterministic terminal time, and let . Consider a filtered probability space , with and . Here we define the difference operator as . Let be a fixed -valued square integrable martingale process with independent increments, i.e. for any . Also we suppose that for any . Here denotes vector transposition. We assume that is the completion of the -algebra generated by the process up to time .
Denote by the set of all measurable square integrable random variables taking values in and by the set of all -adapted square integrable processes taking values in . Moreover, we define and mention that an inequality on a vector quantity is to hold componentwise.
Consider the following backward stochastic difference equation (BSΔE):
[TABLE]
where , .
Assumption 2.1
A1. The function is uniformly Lipschitz continuous and independent of at , i.e. there exist constants , such that for any , , ,
[TABLE]
A2. for any .
Remark 2.2
The BSΔE (2.1) is analogous to the continuous time BSDE driven by a general martingale (cf. [9]), and the solution is a triple of processes.
Definition 2.3
A solution to BSΔE (2.1) is a triple of processes which satisfies equality (2.1) for all , and is a martingale process strongly orthogonal to .
By using the Galtchouk-Kunita-Watanabe decomposition in [3], we can obtain the following existence and uniqueness result for BSΔE (2.1):
Theorem 2.4
Suppose that Assumption 2.1 holds. Then for any terminal condition , the BSΔE (2.1) has a unique adapted solution .
Proof. We first prove the existence and uniqueness of . Due to Assumption (2.1) and , we get . Here we omit the variable since is independent of at time . Then we have Hence, is a square integrable martingale difference. So it admits the Galtchouk-Kunita-Watanabe decomposition, which implies that there exists , , such that , and
[TABLE]
Moreover, is uniquely determined in this decomposition. For fixed , premultiply the equation by , postmultiply the equation by and then take the conditional expectation. This yields that
[TABLE]
since . Therefore, we get the unique by
[TABLE]
and
[TABLE]
It follows that and .
Then, by similar arguments as above, we can obtain the unique solution for Moreover,
[TABLE]
By taking the convention and letting , we have that (2.1) holds true for all . Finally, since
[TABLE]
we conclude that is strongly orthogonal to .
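The backward induction in this proof can be sketched numerically. Equation (2.1) is not reproduced in this text, so the sketch assumes a scalar equation of the form ΔY_t = −f(t+1, Y_{t+1}, Z_{t+1}) + Z_t ΔW_t + ΔN_t, driven by a hypothetical binomial martingale with ΔW_t = ±1 with probability 1/2 each (so that E[ΔW_t | F_t] = 0 and E[(ΔW_t)² | F_t] = 1). For this driver the orthogonal martingale N vanishes identically. The function name `solve_bsde_binomial`, the recombining-tree layout and the convention Z_T = 0 are illustrative choices.

```python
import numpy as np

def solve_bsde_binomial(T, terminal, f):
    """Solve dY_t = -f(t+1, Y_{t+1}, Z_{t+1}) + Z_t dW_t on a binomial tree.

    Node (t, k) means k up-moves in t steps, i.e. W_t = 2k - t.
    Returns lists Y, Z where Y[t][k], Z[t][k] are the values at node (t, k).
    """
    w_T = 2 * np.arange(T + 1) - T
    Y = [None] * (T + 1)
    Z = [None] * (T + 1)
    Y[T] = np.asarray(terminal(w_T), dtype=float)
    Z[T] = np.zeros(T + 1)  # convention: the generator ignores z at time T
    for t in range(T - 1, -1, -1):
        # xi = Y_{t+1} + f(t+1, Y_{t+1}, Z_{t+1}), evaluated at every node of time t+1
        xi = Y[t + 1] + f(t + 1, Y[t + 1], Z[t + 1])
        up, down = xi[1:], xi[:-1]
        Y[t] = 0.5 * (up + down)  # E[xi | F_t]
        Z[t] = 0.5 * (up - down)  # E[xi * dW_t | F_t]
    return Y, Z

# Usage: zero generator, terminal value W_T^2, so Y_0 is the plain expectation E[W_T^2].
Y, Z = solve_bsde_binomial(3, lambda w: w ** 2, lambda t, y, z: 0.0 * y)
print(Y[0][0])
```

Each step mirrors the proof: the conditional expectation gives Y_t, and multiplying by the increment before taking the conditional expectation isolates Z_t. With a driver that has more than two outcomes per step, the residual ξ − Y_t − Z_t ΔW_t would be nonzero and would define ΔN_t.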
Now we consider the control systems (1.1)-(1.2) and (1.3)-(1.4).
Let the coefficients in system (1.1)-(1.2) be such that:
[TABLE]
Let the coefficients in system (1.3)-(1.4) be such that:
[TABLE]
Remark 2.5
The cost functional in [17] consists of three parts: the running cost functional, the terminal cost functional of , the initial cost functional of . In our formulation, if we take , then the cost functional (1.4) for our discrete time framework can be reduced to the cost functional in [17] formally.
For system (1.1)-(1.2), we assume that:
Assumption 2.6
For , , , , , we assume that
1. is an adapted map, i.e. for any , is an -adapted process. Moreover,
2. is continuously differentiable with respect to , and are uniformly bounded. Also, for , , i.e. is independent of at time . Here we use to represent the -th column of the matrix .
Let
[TABLE]
For control system (1.3)-(1.4), we additionally assume that:
Assumption 2.7
For any , the coefficients in (1.3) satisfy the following monotone conditions, i.e. when ,
[TABLE]
when ,
[TABLE]
when ,
[TABLE]
where is a given positive constant.
Besides, in the following, we formally denote , , , .
3 Maximum principle for the partially coupled FBSΔE system
For any , it is obvious that there exists a unique solution to the forward stochastic difference equation in the system (1.1). Then, by Theorem 2.4, the backward equation in the system (1.1) has a unique solution where , and .
Suppose that is the optimal control of problem (1.1)-(1.2) and is the corresponding optimal trajectory. For a fixed time , choose any such that takes values in . For any , construct the perturbed admissible control
[TABLE]
where for , for and . Since is a convex set, is an admissible control. Let be the solution of (1.1) corresponding to the control .
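The single-time convex perturbation can be sketched as follows. The display (3.1) is not reproduced in this text, so the sketch assumes the usual form: the control is left unchanged at every time except the fixed time s, where it becomes u_s + εΔv with ε in [0, 1]. The function name `perturb_control`, the scalar controls and the box control domain [−1, 1] are hypothetical.

```python
import numpy as np

def perturb_control(u, s, dv, eps):
    """Return u^eps with u^eps_t = u_t for t != s and u^eps_s = u_s + eps * dv."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    u_eps = np.array(u, dtype=float, copy=True)  # leave the original control untouched
    u_eps[s] = u_eps[s] + eps * np.asarray(dv, dtype=float)
    return u_eps

# Hypothetical control domain [-1, 1] at every time; u_s + dv = -1.0 lies in it.
u = np.array([0.0, 0.5, -0.2])
u_eps = perturb_control(u, s=1, dv=-1.5, eps=0.25)
print(u_eps)
```

Since u_s + εΔv = (1 − ε)u_s + ε(u_s + Δv) is a convex combination of two points of the control domain, convexity is exactly what keeps the perturbed control admissible.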
Set
[TABLE]
where , , , , , and , , and .
Then, we have the following estimates.
Lemma 3.1
Under Assumption 2.6, we have
[TABLE]
Proof. In the following, the positive constant may change from line to line.
When , .
When ,
[TABLE]
Then,
[TABLE]
By the boundedness of , we have
[TABLE]
By the boundedness of , we have
[TABLE]
which leads to
[TABLE]
When ,
[TABLE]
Due to the boundedness of , , we obtain . Thus, by induction we prove the result.
Let be the solution to the following difference equation,
[TABLE]
It is easy to check that
[TABLE]
and we have the following result:
Lemma 3.2
Under Assumption 2.6, we have
[TABLE]
Proof. When , and which lead to
When ,
[TABLE]
where
[TABLE]
Then
[TABLE]
Since and as , we have
[TABLE]
When ,
[TABLE]
where
[TABLE]
Then
[TABLE]
and as . Since and are bounded, by the estimation (3.5), we have
[TABLE]
This completes the proof.
Lemma 3.3
Under Assumption 2.6, we have
[TABLE]
Proof. It is obvious that at time .
When (if , skip this part), we have
[TABLE]
It yields that
[TABLE]
Similarly, we have
[TABLE]
When , by similar analysis,
[TABLE]
If , this reads
[TABLE]
When , we have
[TABLE]
Thus, there exists , such that for any ,
[TABLE]
This completes the proof.
Let be the solution to the following BSΔE,
[TABLE]
Notice that since is independent of , also as , .
It is easy to check that
[TABLE]
and we have the following result:
Lemma 3.4
Under Assumption 2.6, we have
[TABLE]
Proof. When , .
When , we have
[TABLE]
where
[TABLE]
for , , and . Then,
[TABLE]
and
[TABLE]
Notice that as . We obtain that
[TABLE]
This completes the proof.
By Lemma 3.2 and Lemma 3.4, we have
[TABLE]
Introduce the following adjoint equation:
[TABLE]
where and are square integrable martingale processes and is strongly orthogonal to .
Obviously, the forward equation in (3.8) admits a unique solution . Then, given this solution, Theorem 2.4 implies that the backward equation in (3.8) has a unique solution . So the FBSΔE (3.8) has a unique solution .
We obtain the following maximum principle for the optimal control problem (1.1)-(1.2).
Define the Hamiltonian function
[TABLE]
Theorem 3.5
Suppose that Assumption 2.6 holds. Let be an optimal control of the problem (1.1)-(1.2), be the corresponding optimal trajectory and be the solution to the adjoint equation (3.8). Then for any and any , we have
[TABLE]
Proof. For , we have
[TABLE]
where
[TABLE]
It is obvious that . We have
[TABLE]
and
[TABLE]
Similarly, it can be shown that for , we have
[TABLE]
where
[TABLE]
It is easy to check that
[TABLE]
Then we have
[TABLE]
Therefore,
[TABLE]
Since and , we deduce
[TABLE]
By , we obtain
[TABLE]
Thus, equation (3.9) follows since is arbitrary. This completes the proof.
Remark 3.6
In the introduction we pointed out that we need a suitable representation of the product rule. When we calculate in (3.10), is represented as . Combined with the formulation of the BSΔE mentioned in the introduction, this representation leads to terms such as in (3.11). By summing and rearranging these terms in (3.12), we obtain the dual relation (3.13).
When and , our control system (1.1)-(1.2) degenerates to the classical discrete control system which contains only a forward stochastic difference equation, as in [14]. For this special case, the adjoint equation becomes
[TABLE]
and the Hamiltonian function becomes
[TABLE]
The adjoint equation has the following explicit solution
[TABLE]
which coincides with the results in [14].
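To illustrate how an adjoint recursion delivers the first-order variation of the cost in the forward-only case, here is a minimal sketch under strong simplifications that are not taken from the paper: the noise is switched off, the state is scalar, and the model is a hypothetical linear-quadratic one with ΔX_t = aX_t + bu_t and cost J = Σ_{t<T} ½(qX_t² + ru_t²) + ½gX_T². The sign convention and the names `simulate`, `cost` and `adjoint_gradient` are assumptions.

```python
import numpy as np

def simulate(x0, u, a, b):
    """Forward difference equation: X_{t+1} = X_t + a*X_t + b*u_t."""
    x = np.empty(len(u) + 1)
    x[0] = x0
    for t in range(len(u)):
        x[t + 1] = x[t] + a * x[t] + b * u[t]
    return x

def cost(x0, u, a, b, q, r, g):
    """Running cost over t = 0..T-1 plus terminal cost at T."""
    x = simulate(x0, u, a, b)
    return 0.5 * np.sum(q * x[:-1] ** 2 + r * u ** 2) + 0.5 * g * x[-1] ** 2

def adjoint_gradient(x0, u, a, b, q, r, g):
    """dJ/du_t = b*p_{t+1} + r*u_t, where p solves the backward adjoint recursion."""
    x = simulate(x0, u, a, b)
    T = len(u)
    p = np.empty(T + 1)
    p[T] = g * x[T]  # terminal condition from the terminal cost
    for t in range(T - 1, -1, -1):
        p[t] = p[t + 1] + a * p[t + 1] + q * x[t]  # backward difference equation
    return b * p[1:] + r * u

u = np.array([0.3, -0.1, 0.2, 0.0])
grad = adjoint_gradient(1.0, u, a=0.1, b=0.5, q=1.0, r=0.2, g=2.0)
print(grad)
```

In this convention the quantity b·p_{t+1} + r·u_t plays the role of the derivative of a Hamiltonian with respect to the control, and for a convex control domain the necessary condition reads (b·p_{t+1} + r·u_t)(v − u_t) ≥ 0 for every admissible v. The adjoint at time t+1 is paired with the control at time t, the same time shift that appears in the product rule.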
4 Maximum principle for the fully coupled FBSΔE system
In this section we suppose that the driving process is one-dimensional. Let be the optimal control for the control problem (1.3)-(1.4) and be the corresponding optimal trajectory. Note that the existence and uniqueness of is guaranteed by the results in [15]. The perturbed control is the same as (3.1) and we denote by the corresponding trajectory.
Let
[TABLE]
Using notation similar to (3.2) in Section 3, we have
[TABLE]
Lemma 4.1
Under Assumption 2.6 and Assumption 2.7, we have
[TABLE]
Proof. By (4.1),
[TABLE]
By the monotone condition, we obtain
[TABLE]
On the other hand,
[TABLE]
and similarly,
[TABLE]
Combining (4.3) and (4.4), we have
[TABLE]
This completes the proof.
Next we introduce the following variational equation:
[TABLE]
By Assumption 2.6 and Assumption 2.7, when ,
[TABLE]
when ,
[TABLE]
when ,
[TABLE]
Thus, the coefficients of (4.5) satisfy the monotone condition and there exists a unique solution to (4.5). Similar to the proof of Lemma 4.1, we have
[TABLE]
Define
[TABLE]
where , , , , , and , , and .
Lemma 4.2
Under Assumption 2.6 and Assumption 2.7, we have
[TABLE]
Proof. Note that
[TABLE]
Set
[TABLE]
Then,
[TABLE]
where
[TABLE]
According to (4.10),
[TABLE]
where
[TABLE]
Combining (4.6), (4.7) and (4.8), we have
[TABLE]
Note that
[TABLE]
When , for , , and . Then, by Lemma 4.1,
[TABLE]
Similar results hold for the other terms in (4.11). Finally, we have
[TABLE]
This completes the proof.
By Lemma 4.2, we obtain
[TABLE]
Introduce the following adjoint equation:
[TABLE]
Define the Hamiltonian function as follows:
[TABLE]
Theorem 4.3
Suppose that Assumption 2.6 and Assumption 2.7 hold. Let be an optimal control for (1.3)-(1.4), be the corresponding optimal trajectory and be the solution to the adjoint equation (4.12). Then, for any and any , we have
[TABLE]
Proof. From the expression of , for , we have
[TABLE]
where
[TABLE]
Since and are square integrable martingale processes and is strongly orthogonal to , we have . Similarly,
[TABLE]
where
[TABLE]
Furthermore,
[TABLE]
and
[TABLE]
Then, we obtain
[TABLE]
Therefore,
[TABLE]
Notice that , . So
[TABLE]
Since , we obtain
[TABLE]
Then (4.13) holds since is arbitrary. This completes the proof.
References
- [1] Bender, C., & Zhang, J. (2008). Time discretization and Markovian iteration for coupled FBSDEs. The Annals of Applied Probability, 18(1), 143-177.
- [2] Bensoussan, A. (1982). Lectures on stochastic control. In Nonlinear filtering and stochastic control (pp. 1-62). Springer, Berlin, Heidelberg.
- [3] Bielecki, T. R., Cialenco, I., & Chen, T. (2015). Dynamic conic finance via backward stochastic difference equations. SIAM Journal on Financial Mathematics, 6(1), 1068-1122.
- [4] Bismut, J. M. (1978). An introductory approach to duality in optimal stochastic control. SIAM Review, 20(1), 62-78.
- [5] Cohen, S. N., & Elliott, R. J. (2010). A general theory of finite state backward stochastic difference equations. Stochastic Processes and their Applications, 120(4), 442-466.
- [6] Cohen, S. N., & Elliott, R. J. (2011). Backward stochastic difference equations and nearly time-consistent nonlinear expectations. SIAM Journal on Control and Optimization, 49(1), 125-139.
- [7] Delarue, F., & Menozzi, S. (2006). A forward–backward stochastic algorithm for quasi-linear PDEs. The Annals of Applied Probability, 16(1), 140-184.
- [8] Dokuchaev, N., & Zhou, X. Y. (1999). Stochastic controls with terminal contingent conditions. Journal of Mathematical Analysis and Applications, 238(1), 143-165.
