Maximum principle for stochastic optimal control problem of finite state forward-backward stochastic difference systems
Shaolin Ji, Haodong Liu

TL;DR
This paper develops a maximum principle for stochastic optimal control problems involving finite state forward-backward stochastic difference systems, extending control theory to discrete-time, finite state models with new adjoint equations.
Contribution
It introduces a maximum principle for finite state FBS{\Delta}Ss, including both partially and fully coupled systems, with new adjoint difference equations and control domain considerations.
Findings
Derived the adjoint difference equation for the systems.
Established the maximum principle for convex control domains.
Extended stochastic control theory to finite state, discrete-time systems.
Abstract
In this paper, we study the maximum principle for stochastic optimal control problems of forward-backward stochastic difference systems (FBS{\Delta}Ss) where the uncertainty is modeled by a discrete time, finite state process, rather than white noises. Two types of FBS{\Delta}Ss are investigated. The first one is described by a partially coupled forward-backward stochastic difference equation (FBS{\Delta}E) and the second one is described by a fully coupled FBS{\Delta}E. By adopting an appropriate representation of the product rule and an appropriate formulation of the backward stochastic difference equation (BS{\Delta}E), we deduce the adjoint difference equation. Finally, the maximum principle for this optimal control problem with the control domain being convex is established.
Shaolin Ji, Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, PR China. Research supported by NSF (No. 11571203).
Haodong Liu, Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong 250100, PR China. (Corresponding author).
Keywords: backward stochastic difference equations; forward-backward stochastic difference equations; monotone condition; stochastic optimal control; maximum principle
1 Introduction
The maximum principle is one of the main approaches to solving optimal control problems. Much work has been done on the maximum principle for stochastic systems; see, for example, Bensoussan [1], Bismut [3], Kushner [13] and Peng [18]. Peng was also the first to study a class of forward-backward stochastic control systems (FBSCSs) in [19], where he obtained the maximum principle for such systems when the control domain is convex. FBSCSs have wide applications in many fields. For instance, the stochastic differential recursive utility, which generalizes the standard additive utility, can be regarded as the solution of a backward stochastic differential equation (BSDE), so the recursive utility optimization problem can be described as an optimization problem for an FBSCS (see [21]). Besides, in the dynamic principal-agent problem with unobservable states and actions, the principal's problem can be formulated as a partial information optimal control problem of an FBSCS (see [24]). We refer to [7], [10], [11], [14], [22], [26], [28] for other works on optimization problems for FBSCSs.
In this paper, we discuss the maximum principle for the optimal control of discrete time systems described by forward-backward stochastic difference equations (FBS{\Delta}Es). To the best of our knowledge, there are few results on such optimal control problems, although discrete time control systems are of great value in practice. For example, digital control can be formulated as a discrete time control problem in which the sampled data are obtained at discrete instants of time. Besides, forward-backward stochastic difference systems (FBS{\Delta}Ss) can be used for modeling in financial markets: the solution to a backward stochastic difference equation (BS{\Delta}E) can be used to construct time-consistent nonlinear expectations (see [5], [6]) and for pricing in financial markets (see [2]). However, the formulation of a BS{\Delta}E is quite different from that of its continuous time counterpart. Many works are devoted to the study of BS{\Delta}Es (see, e.g., [2], [5], [6], [23]). Depending on the driving process, there are mainly two types of formulations. One is driven by a finite state process taking values in the basis vectors (as in [5]), and the other is driven by a martingale with independent increments (as in [2]). In the former framework, the authors of [5] obtained a discrete time version of the martingale representation theorem and established the solvability of the BS{\Delta}E, with uniqueness understood under a new kind of equivalence relation. Further work on applications of the finite state framework can be found in [8], [17], [15]. In this paper, we adopt the first type of formulation to investigate optimization problems for FBS{\Delta}Ss.
In this paper, we study two stochastic optimal control problems. Problem 1 involves a partially coupled FBS{\Delta}E (2.2); that is, the coefficients of the forward equation do not contain the solution of the backward equation. The state equation of Problem 2 is a fully coupled FBS{\Delta}E (2.4).
The optimal control problem is to find an optimal control which, together with the corresponding state trajectory, minimizes the cost functional. In this paper, we assume that the control domain is convex. By perturbing the optimal control at a fixed time point, we obtain the maximum principle for Problems 1 and 2.
To build the maximum principle, the key step is to find adjoint variables that can be used to deduce the variational inequality. In [16], the authors studied the maximum principle for a discrete time stochastic optimal control problem in which the state equation is governed only by a forward stochastic difference equation. By applying the Riesz representation theorem, they obtained the adjoint variables explicitly and established the maximum principle. To solve our problems, however, we need to construct adjoint difference equations, since in our case the adjoint variables generally cannot be obtained explicitly. The techniques adopted in the continuous time framework, as in [18, 19], are not applicable to constructing the adjoint equations in our discrete time framework. In this paper, we propose two techniques to deduce the adjoint difference equations. The first is to choose the following product rule:
[TABLE]
where (resp. ) satisfies a forward (resp. backward) stochastic difference equation. The second is that the BS{\Delta}E should be formulated as in (2.1). In other words, the generator of the BS{\Delta}E (2.1) depends on time . It is worth pointing out that this is precisely the formulation of the adjoint equations for stochastic optimal control problems (see [16] for the classical case). Based on these two techniques, we can deduce the adjoint difference equations. The reader may refer to Remark 3.6 for more details.
A second difficulty arises in the finite state space case. Since the uniqueness of the variable is not defined in the usual sense, the norm of the variable must be redefined. In [5], Cohen and Elliott defined a seminorm of through the term . However, since the Itô isometry does not hold in the discrete time case and the martingale difference process depends on the past, the relation between the norm defined by itself and the norm defined by is not clear, which makes estimating the diffusion term of the variational equations quite difficult. In this paper, we propose a new definition of the norm for the variable in the diffusion term and prove the relation between this norm of and the seminorm defined by . With this relation, we can derive estimates for the solutions to the stochastic difference equations in the discrete time, finite state space framework.
The remainder of this paper is organized as follows. In Section 2, two types of controlled FBS{\Delta}Ss are formulated. We deduce the maximum principle for the partially coupled controlled FBS{\Delta}S in Section 3. Finally, we establish the maximum principle for the fully coupled controlled FBS{\Delta}S in Section 4.
2 Preliminaries and model formulation
Let be a deterministic terminal time and . Following [5], we consider an underlying discrete time, finite state process which takes values in the standard basis vectors of , where is the number of states of the process . In more detail, for each , where and denotes vector transposition.
Consider a filtered probability space , where is the completion of the -algebra generated by the process up to time and . Denote by the set of all adapted random variables taking values in and by the set of all -adapted processes taking values in with the norm defined by .
For simplicity, we suppose the process satisfies the following assumption. Note that in the following, an inequality on a vector quantity is to hold componentwise.
Assumption 2.1
For any , any ,
The above assumption means that the probability of every possible path of on is strictly positive. Hence, under this assumption, the notion "almost surely" in the following statements can be replaced by "for every ". This assumption is made only to simplify the statements; without it, the proof ideas are the same, but the statements are more involved. We set .
Define
[TABLE]
is a martingale difference process taking values in . The following equivalence relations, introduced in [5], will be used below.
Definition 2.2
For two -measurable random variables and , we define , if
For two adapted processes and , we define , if for any
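The martingale difference process and the equivalence relation can be illustrated numerically. The displays defining them are not reproduced in this text, so the sketch below assumes the formulation of [5]: the increment is M_{t+1} = X_{t+1} - E[X_{t+1} | F_t], and two integrands are equivalent when they give the same product with M_{t+1}. The conditional law `p`, the matrices `Z1`, `Z2`, `Z3` and all function names are hypothetical choices made for illustration only.

```python
def basis(i, n):
    """Return the i-th standard basis vector of R^n as a list."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def increments(p):
    """All possible values of M_{t+1} = e_j - p, where p is the
    conditional law of X_{t+1} given F_t (assumed form, following [5])."""
    n = len(p)
    return [[e - q for e, q in zip(basis(j, n), p)] for j in range(n)]

def conditional_mean(p):
    """E[M_{t+1} | F_t], computed exactly by summing over the next state."""
    n = len(p)
    incs = increments(p)
    return [sum(p[j] * incs[j][k] for j in range(n)) for k in range(n)]

def apply(Z, m):
    """Matrix-vector product Z m for a matrix stored as a list of rows."""
    return [sum(z * x for z, x in zip(row, m)) for row in Z]

def equivalent(Z1, Z2, p, tol=1e-12):
    """Check that Z1 M_{t+1} = Z2 M_{t+1} for every possible increment."""
    return all(
        abs(a - b) <= tol
        for m in increments(p)
        for a, b in zip(apply(Z1, m), apply(Z2, m))
    )

# Hypothetical conditional law with strictly positive entries.
p = [0.5, 0.3, 0.2]

Z1 = [[1.0, -2.0, 0.5], [0.0, 3.0, 1.0]]
# Adding a constant to every entry of a row does not change Z M,
# because the components of M always sum to zero.
c = [4.0, -1.5]
Z2 = [[Z1[i][j] + c[i] for j in range(3)] for i in range(2)]
# Changing a single entry does change Z M.
Z3 = [[1.0, -2.0, 0.6], [0.0, 3.0, 1.0]]

print(conditional_mean(p))
print(equivalent(Z1, Z2, p), equivalent(Z1, Z3, p))
```

The example shows why uniqueness of the integrand can only hold up to equivalence in this setting: two different matrices may be indistinguishable once multiplied by the increment.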
For a -adapted process , define the difference operator as . Consider the following backward stochastic difference equation (BS{\Delta}E):
[TABLE]
where and is -adapted mapping.
Assumption 2.3
A1. For any , , , and , if , then
[TABLE]
A2. The function is independent of at .
The following existence and uniqueness theorem for BS{\Delta}E (2.1) is taken from [12].
Theorem 2.4
Suppose that Assumption 2.3 holds. Then for any terminal condition , BS{\Delta}E (2.1) has a unique adapted solution . Here the uniqueness for is in the sense of indistinguishability and for is in the sense of equivalence.
We define the matrix where is -dimensional identity matrix, is -dimensional vector with every element being equal to . Then, we consider two types of controlled systems.
Problem 1 (partially coupled system):
The controlled system is
[TABLE]
and the cost functional is
[TABLE]
where
[TABLE]
Problem 2 (fully coupled system):
The controlled system is:
[TABLE]
and the cost functional is
[TABLE]
where
[TABLE]
Let be a sequence of nonempty convex subsets of . We denote the set of admissible controls by . In Problem 1, and do not contain the solution of the backward equation; this kind of FBS{\Delta}E is called a partially coupled FBS{\Delta}E. The system in Problem 2 is called a fully coupled FBS{\Delta}E.
The optimal control problem is to find an optimal control which, together with the corresponding state trajectory, minimizes the cost functional. In this paper, we assume that the control domain is convex.
Remark 2.5
The cost functional in [19] consists of three parts: the running cost functional, the terminal cost functional of , the initial cost functional of . In our formulation, if we take , then the cost functional (2.5) for our discrete time framework can be reduced to the cost functional in [19] formally.
For controlled system (2.2)-(2.3), we assume that:
Assumption 2.6
For , , , , ,
1. is an adapted map, i.e. for any , is an -adapted process.
2. For any and , is continuously differentiable with respect to , and are uniformly bounded. Also, for , is independent of at time .
Set
[TABLE]
and
[TABLE]
For controlled system (2.4)-(2.5), we additionally assume that:
Assumption 2.7
For any , the coefficients in (2.4) satisfy the following monotone conditions, i.e. when ,
[TABLE]
when ,
[TABLE]
when ,
[TABLE]
where is a given positive constant.
Besides, in the following, we formally denote , , , .
3 Maximum principle for the partially coupled FBS{\Delta}E system
For any , it is obvious that there exists a unique solution to the forward stochastic difference equation in the system (2.2). According to Lemma 2.3 in [12], it can be seen that satisfies Assumption 2.3. So given , by Theorem 2.4, the backward equation in the system (2.2) has a unique solution .
Suppose that is the optimal control of problem (2.2)-(2.3) and is the corresponding optimal trajectory. For a fixed time , choose any such that takes values in . For any , construct the perturbed admissible control
[TABLE]
where for , for and . Since is a convex set, is an admissible control. Let be the solution of (2.2) corresponding to the control .
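The admissibility claim can be checked on a toy case. The sketch below assumes a scalar control with the hypothetical domain [-1, 1] at every time, and assumes the perturbation has the usual convex form, moving the control at a single time `s` toward a point `v` of the domain by a factor `eps`. Display (3.1) is not reproduced in this text, so the names `perturb`, `admissible`, `u_bar`, `s`, `v` and `eps` are illustrative rather than the authors' notation.

```python
def perturb(u_bar, s, v, eps):
    """Return a copy of u_bar whose entry at time s is replaced by the
    convex combination (1 - eps) * u_bar[s] + eps * v."""
    u_eps = list(u_bar)
    u_eps[s] = (1.0 - eps) * u_bar[s] + eps * v
    return u_eps

def admissible(u, lo=-1.0, hi=1.0, tol=1e-12):
    """True if every entry of u lies in [lo, hi], up to rounding tolerance."""
    return all(lo - tol <= x <= hi + tol for x in u)

# Hypothetical reference control and perturbation target, both in [-1, 1].
u_bar = [0.3, -1.0, 0.8, 0.0]
for eps in (0.0, 0.25, 0.5, 1.0):
    u_eps = perturb(u_bar, s=2, v=-0.6, eps=eps)
    print(eps, u_eps, admissible(u_eps))
```

Only the entry at the chosen time changes, and convexity of the domain is exactly what keeps the perturbed entry admissible for every `eps` in [0, 1].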
Set
[TABLE]
where , , , , and , , and .
Then, we have the following estimates.
Lemma 3.1
Under Assumption 2.6, we have
[TABLE]
Proof. In the following, the positive constant may change from line to line.
When , .
When ,
[TABLE]
Then,
[TABLE]
By the boundedness of , we have
[TABLE]
By Proposition 2.4 in [12] and the boundedness of , we have
[TABLE]
which leads to
[TABLE]
When ,
[TABLE]
Due to the boundedness of , , combined with the Proposition 2.4, we obtain . Thus, by induction we prove the result.
Let be the solution to the following difference equation,
[TABLE]
It is easy to check that
[TABLE]
and we have the following result:
Lemma 3.2
Under Assumption 2.6, we have
[TABLE]
Proof. When , and which lead to
When ,
[TABLE]
where
[TABLE]
Then
[TABLE]
Since and as , we have
[TABLE]
When ,
[TABLE]
where
[TABLE]
Then
[TABLE]
It is easy to check that and as . Since and are bounded, by the estimation (3.5), we have
[TABLE]
This completes the proof.
Lemma 3.3
Under Assumption 2.6, we have
[TABLE]
Proof. It is obvious that at time .
When (if , skip this part), we have
[TABLE]
It yields that
[TABLE]
Similarly, we have
[TABLE]
Combined with Proposition 2.4, we have
[TABLE]
When , by similar analysis,
[TABLE]
If ,
[TABLE]
When , we have
[TABLE]
Thus, there exists , such that for any ,
[TABLE]
This completes the proof.
Let be the solution to the following BS{\Delta}E,
[TABLE]
It is easy to check that
[TABLE]
and we have the following result:
Lemma 3.4
Under Assumption 2.6, we have
[TABLE]
Proof. When , .
When , we have
[TABLE]
where
[TABLE]
for , , and . Then,
[TABLE]
and
[TABLE]
Notice that as . We obtain that
[TABLE]
This completes the proof.
By Lemma 3.2 and Lemma 3.4, we have
[TABLE]
We introduce the following adjoint equation:
[TABLE]
where denotes the pseudoinverse of a matrix.
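A plausible reason for a pseudoinverse rather than an inverse in this framework is that the conditional second-moment matrix of the martingale difference is always singular. The adjoint equation (3.8) is not reproduced in this text, so the sketch below does not claim to show which matrix the authors invert. It only assumes the increment has the form M_{t+1} = X_{t+1} - p, with `p` a hypothetical conditional law of X_{t+1}, in which case E[M M^T | F_t] = diag(p) - p p^T.

```python
def conditional_covariance(p):
    """E[M M^T | F_t] = diag(p) - p p^T for M = X - p, where X is a random
    standard basis vector with law p (assumed form, following [5])."""
    n = len(p)
    return [
        [(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(n)]
        for i in range(n)
    ]

def times_ones(A):
    """Product of the matrix A with the vector of ones (the row sums)."""
    return [sum(row) for row in A]

# Hypothetical conditional law with strictly positive entries.
p = [0.5, 0.3, 0.2]
C = conditional_covariance(p)

# The vector of ones lies in the null space, so C has no ordinary inverse.
print(times_ones(C))
```

Each row sum equals p_i (1 - sum of p), which vanishes because the entries of `p` sum to one. A library routine such as `numpy.linalg.pinv` could then be used to form the Moore-Penrose pseudoinverse of such a matrix.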
Obviously the forward equation in (3.8) admits a unique solution . Then, based on the solution , according to Theorem 2.4, it is easy to check that the backward equation in (3.8) has a unique solution . So the FBS{\Delta}E (3.8) has a unique solution .
We obtain the following maximum principle for the optimal control problem (2.2)-(2.3).
Define the Hamiltonian function
[TABLE]
Theorem 3.5
Suppose that Assumption 2.6 holds. Let be an optimal control of the problem (2.2)-(2.3), be the corresponding optimal trajectory and be the solution to the adjoint equation (3.8). Then for any , and , we have
[TABLE]
Proof. For , we have
[TABLE]
where
[TABLE]
It is obvious that . We have
[TABLE]
and
[TABLE]
Similarly, it can be shown that for ,
[TABLE]
where
[TABLE]
According to the result in [4], we know that ,
[TABLE]
Then we can obtain
[TABLE]
Similarly,
[TABLE]
Thus
[TABLE]
Therefore,
[TABLE]
Since and , we deduce
[TABLE]
By , we obtain
[TABLE]
Equation (3.9) follows since is arbitrary. This completes the proof.
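The shape of a maximum principle over a convex control domain can be seen on a much simpler problem. The sketch below is not the system of Theorem 3.5: it is a deterministic, scalar, linear-quadratic toy problem with no backward equation and no noise, and every name and parameter in it is hypothetical. It assumes dynamics x_{t+1} = a x_t + b u_t, cost equal to the sum of (q x_t^2 + r u_t^2)/2 plus x_T^2/2, and control domain [lo, hi]. The adjoint recursion gives the gradient of the Hamiltonian in u, a projected gradient iteration finds the minimizer, and the variational inequality H_u(t) (v - u_t) >= 0 is then checked for sample points v of the domain.

```python
def simulate(u, x0, a, b):
    """Forward pass: x_{t+1} = a x_t + b u_t."""
    xs = [x0]
    for ut in u:
        xs.append(a * xs[-1] + b * ut)
    return xs

def adjoint(xs, a, q):
    """Backward pass: p_T = x_T and p_t = a p_{t+1} + q x_t."""
    T = len(xs) - 1
    p = [0.0] * (T + 1)
    p[T] = xs[T]
    for t in range(T - 1, -1, -1):
        p[t] = a * p[t + 1] + q * xs[t]
    return p

def h_u(u, x0, a, b, q, r):
    """H_u(t) = b p_{t+1} + r u_t, the derivative of the cost in u_t."""
    xs = simulate(u, x0, a, b)
    p = adjoint(xs, a, q)
    return [b * p[t + 1] + r * u[t] for t in range(len(u))]

def solve(x0, a, b, q, r, T, lo, hi, step=0.05, iters=5000):
    """Projected gradient descent on the controls over the box [lo, hi]."""
    u = [0.0] * T
    for _ in range(iters):
        g = h_u(u, x0, a, b, q, r)
        u = [min(hi, max(lo, ut - step * gt)) for ut, gt in zip(u, g)]
    return u

def vi_gap(u, x0, a, b, q, r, lo, hi):
    """Smallest value of H_u(t) * (v - u_t) over all t and three test points v."""
    g = h_u(u, x0, a, b, q, r)
    points = (lo, 0.5 * (lo + hi), hi)
    return min(gt * (v - ut) for ut, gt in zip(u, g) for v in points)

params = dict(x0=5.0, a=0.9, b=1.0, q=1.0, r=0.5)
u_star = solve(T=5, lo=-1.0, hi=1.0, **params)
print(u_star)
print(vi_gap(u_star, lo=-1.0, hi=1.0, **params))
```

Where the computed control sits on the boundary of the domain, the gradient of the Hamiltonian need not vanish; the inequality only requires that it points into the domain, which is the feature a convex control domain adds to the unconstrained stationarity condition.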
Remark 3.6
In the introduction we pointed out that we need a suitable representation of the product rule. When we calculate in (3.10), is represented as . Combined with the formulation of the BS{\Delta}E mentioned in the introduction, this representation leads to terms such as in (3.11). By summing and rearranging these terms in (3.12), we obtain the dual relation (3.13).
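For orientation, the discrete product rule admits several algebraically equivalent forms. The specific representation chosen by the authors is not recoverable from this text, so the block below only lists the standard identities, assuming the forward difference \Delta x_t = x_{t+1} - x_t and writing x_t, y_t for two generic scalar sequences (hypothetical names).

```latex
\begin{align*}
\Delta (x_t y_t) &= x_{t+1} y_{t+1} - x_t y_t \\
                 &= x_t \,\Delta y_t + y_{t+1} \,\Delta x_t \\
                 &= x_{t+1} \,\Delta y_t + y_t \,\Delta x_t \\
                 &= x_t \,\Delta y_t + y_t \,\Delta x_t + \Delta x_t \,\Delta y_t .
\end{align*}
```

The three forms differ in which factor is evaluated at time t+1. In a stochastic setting that choice determines which terms are measurable at time t when conditional expectations are taken, which is presumably why the representation matters in the duality argument.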
4 Maximum principle for the fully coupled FBS{\Delta}E system
In this section we consider the control problem (2.4)-(2.5). Without loss of generality, we only consider the one-dimensional case for and . Let be the optimal control for the control problem (2.4)-(2.5) and be the corresponding optimal trajectory. Note that the existence and uniqueness of is guaranteed by the results in [12]. The perturbed control is the same as (3.1) and we denote by the corresponding trajectory.
Let
[TABLE]
Using an analysis and notation similar to those of Section 3, we have
[TABLE]
Lemma 4.1
Under Assumption 2.6 and Assumption 2.7, we have
[TABLE]
Proof. By (4.1),
[TABLE]
By the monotone condition, we obtain
[TABLE]
On the other hand,
[TABLE]
and
[TABLE]
Thus
[TABLE]
Combining (4.3) and (4.4), we have
[TABLE]
This completes the proof.
Next we introduce the following variational equation:
[TABLE]
By Assumption 2.6 and Assumption 2.7, when ,
[TABLE]
when ,
[TABLE]
when ,
[TABLE]
Thus, the coefficients of (4.5) satisfy the monotone condition and there exists a unique solution to (4.5). Similar to the proof of Lemma 4.1, we have
[TABLE]
Define
[TABLE]
where , , , , and , , and .
Lemma 4.2
Under Assumption 2.6 and Assumption 2.7, we have
[TABLE]
Proof. Note that
[TABLE]
Set
[TABLE]
Then,
[TABLE]
where
[TABLE]
According to (4.10),
[TABLE]
where
[TABLE]
Combining (4.6), (4.7) and (4.8), we have
[TABLE]
Note that
[TABLE]
When , for , , and . Then, by Lemma 4.1,
[TABLE]
Similar results hold for the other terms in (4.11). Finally, we have
[TABLE]
This completes the proof.
By Lemma 4.2, we obtain
[TABLE]
Introduce the following adjoint equation:
[TABLE]
Define the Hamiltonian function as follows:
[TABLE]
Theorem 4.3
Suppose that Assumption 2.6 and Assumption 2.7 hold. Let be an optimal control for (2.4)-(2.5), be the corresponding optimal trajectory and be the solution to the adjoint equation (4.12). Then, for any , and , we have
[TABLE]
Proof. From the expression of , for , we have
[TABLE]
where
[TABLE]
We have . Besides,
[TABLE]
Similarly,
[TABLE]
where
[TABLE]
Furthermore,
[TABLE]
Then, we obtain
[TABLE]
Therefore,
[TABLE]
Notice that , . So
[TABLE]
Since , we obtain
[TABLE]
Then (4.13) holds since is arbitrary. This completes the proof.
References
[1] Bensoussan, A. (1982). Lectures on stochastic control. In Nonlinear filtering and stochastic control (pp. 1-62). Springer, Berlin, Heidelberg.
[2] Bielecki, T. R., Cialenco, I., & Chen, T. (2015). Dynamic conic finance via backward stochastic difference equations. SIAM Journal on Financial Mathematics, 6(1), 1068-1122.
[3] Bismut, J. M. (1978). An introductory approach to duality in optimal stochastic control. SIAM Review, 20(1), 62-78.
[4] Cohen, S. N., & Elliott, R. J. (2008). Solutions of backward stochastic differential equations on Markov chains. Communications on Stochastic Analysis, 2(2), 251-262.
[5] Cohen, S. N., & Elliott, R. J. (2010). A general theory of finite state backward stochastic difference equations. Stochastic Processes and their Applications, 120(4), 442-466.
[6] Cohen, S. N., & Elliott, R. J. (2011). Backward stochastic difference equations and nearly time-consistent nonlinear expectations. SIAM Journal on Control and Optimization, 49(1), 125-139.
[7] Dokuchaev, N., & Zhou, X. Y. (1999). Stochastic controls with terminal contingent conditions. Journal of Mathematical Analysis and Applications, 238(1), 143-165.
[8] Eberlein, E., Gehrig, T., & Madan, D. B. (2011). Pricing to acceptability: With applications to valuing one's own credit risk.
