Mild solutions to the dynamic programming equation for stochastic optimal control problems
Viorel Barbu, Chiara Benazzoli, Luca Di Persio

TL;DR
This paper establishes the existence and uniqueness of mild solutions to the 1-D dynamic programming equation in stochastic optimal control with multiplicative noise, using nonlinear semigroup theory, and extends results to higher dimensions.
Contribution
It introduces a novel approach using nonlinear semigroup theory to analyze the dynamic programming equation in stochastic control, including multidimensional cases.
Findings
Unique mild solution in 1D for the dynamic programming equation
Solution regularity in $C([0,T];W^{1,\infty})$, with second derivative in $C([0,T];L^1)$
Extension of results to n-dimensional stochastic control problems
Viorel Barbu A.I. Cuza University, Iasi, Romania
Chiara Benazzoli Dept. of Mathematics, University of Trento, Italy
Luca Di Persio Dept. of Computer Science, University of Verona, Italy
Abstract
We show via the nonlinear semigroup theory in $L^{1}(\mathbb{R})$ that the 1-D dynamic programming equation associated with a stochastic optimal control problem with multiplicative noise has a unique mild solution $\varphi\in C([0,T];W^{1,\infty}(\mathbb{R}))$ with $\varphi_{xx}\in C([0,T];L^{1}(\mathbb{R}))$. The $n$-dimensional case is also investigated.
Keywords: stochastic process; optimal control; $m$-accretive operator; Cauchy problem.
1 Introduction
Consider the following stochastic optimal control problem
[TABLE]
subject to and to state equation
[TABLE]
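where $\mathcal{U}$ is the set of all $(\mathcal{F}_t)_{t\geq 0}$-adapted processes $u$ and $W$ is a 1-D Wiener process in a probability space $(\Omega,\mathcal{F},\mathbb{P})$, endowed with the natural filtration $(\mathcal{F}_t)_{t\geq 0}$. Here , while $X$ is the strong solution to (2).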
We underline that the optimization problem studied here is related to the so-called stochastic volatility models used in finance, whose relevance has grown considerably in recent years. Unlike constant volatility models such as the standard Black-Scholes approach, the Vasicek interest rate model, or the Cox-Ross-Rubinstein model, these models capture the more realistic situation in which volatility levels change in time. This is the case of the Heston model, see [9], where the variance is assumed to be a stochastic process following Cox-Ingersoll-Ross (CIR) dynamics, see [10] or [4] and references therein for more recent related techniques, as well as of the Constant Elasticity of Variance (CEV) model, see [5], where the volatility is expressed as a power of the underlying level, and which is often referred to as a local stochastic volatility model. Other interesting examples, which are the object of our ongoing research, particularly from the numerical point of view, include the Stochastic Alpha, Beta, Rho (SABR) model, see, e.g., [8], and models used to estimate the stochastic volatility directly from market data, as in the GARCH approach and its variants.
Within these frameworks, and following several macroeconomic crises that have affected different types of financial markets worldwide, governments have decided to become active players. A recent example is the Volatility Control Mechanism (VCM) established within the Hong Kong Stock Exchange (HKEX) framework for the securities market in August 2016 and for the derivatives market in January 2017, see, e.g., [12, 13] and references therein for other applications and examples.
Hypotheses:
1. is convex, continuous and , , for some .
2. , , .
3. , and
[TABLE]
We set
[TABLE]
and we denote by the Legendre conjugate of , namely,
[TABLE]
We have , where is the subdifferential of , and is the normal cone to . This yields
[TABLE]
We denote also by the potential of , that is
[TABLE]
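As an illustration of the Legendre conjugate and its derivative, the following worked example computes $H^{*}$ explicitly in a special case. It assumes, purely for illustration, the quadratic cost $h(u)=\tfrac12 u^{2}$ and $H=h+I_{[0,\infty)}$, where $I_{[0,\infty)}$ is the indicator function of the nonnegative half-line; neither choice is taken from the text above, and the symbols $p$ and $p^{+}$ are introduced here.

```latex
% Assumption (illustrative only): h(u) = u^2/2, controls constrained to u >= 0.
\begin{aligned}
H(u) &= \tfrac12 u^{2} + I_{[0,\infty)}(u), \\
H^{*}(p) &= \sup_{u \ge 0}\bigl\{ p\,u - \tfrac12 u^{2} \bigr\}
          = \tfrac12 (p^{+})^{2}, \qquad p^{+} = \max\{p, 0\}, \\
(H^{*})'(p) &= p^{+}
          = \operatorname*{arg\,max}_{u \ge 0}\bigl\{ p\,u - \tfrac12 u^{2} \bigr\}.
\end{aligned}
```

In this special case $H^{*}$ is convex, continuously differentiable and non-decreasing, and its derivative returns the maximizing control.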
The dynamic programming equation corresponding to the stochastic optimal control problem (1) is given by (see, e.g., [7],[11]),
[TABLE]
or equivalently
[TABLE]
Moreover, if is a smooth solution to (6) the associated feedback controller
[TABLE]
is optimal for problem (1).
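The passage from a minimized Hamiltonian to an expression written with $H^{*}$ rests on a general identity of convex analysis, sketched below. The scalar $a$ stands for the coefficient multiplying the control; taking $a=\tfrac{\sigma^{2}}{2}\varphi_{xx}$ is an assumption made here, chosen to match the term $H^{*}\bigl(-\tfrac{\sigma^{2}}{2}\varphi_{xx}\bigr)$ that appears in Theorem 2.3, and $H$ is assumed convex and lower semicontinuous.

```latex
% Legendre conjugate: H^*(p) = \sup_u \{ p u - H(u) \}.
\inf_{u}\bigl\{ a\,u + H(u) \bigr\}
  = -\sup_{u}\bigl\{ (-a)\,u - H(u) \bigr\}
  = -H^{*}(-a).
% If H^* is differentiable at -a, the infimum is attained at
u^{*} = (H^{*})'(-a).
```

Under these assumptions, a feedback law is obtained by evaluating $(H^{*})'$ at $-a$, which is why the regularity of the second derivative of the solution matters for the controller to be well defined.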
To the best of our knowledge, a rigorous existence theory for equation (6) has so far been developed only within the framework of viscosity solutions (see, e.g., [6]). Here we take a different approach: a suitable transformation reduces (6) to a one-dimensional Fokker-Planck equation, which is then treated as a nonlinear Cauchy problem in $L^{1}(\mathbb{R})$. The $n$-dimensional case is studied in Section 4. The non-degeneracy hypothesis (3) will later be dispensed with by assuming more regularity on function . (See Section 5 below.)
1.1 Notation and basic results
We shall use the standard notation for functional spaces on . In particular is the space of functions , differentiable of order and with bounded derivatives until order . By , , we denote the classical space of Lebesgue-measurable -integrable functions on with the norm and by , , , the standard Sobolev spaces on , . We set also , , , for and , for . By we denote the space of Schwartz distributions on .
Definition 1.1 (Accretive operator)
Given a Banach space $X$, a nonlinear operator $A$ from $X$ to itself, with domain $D(A)$, is said to be accretive if, for all $u_i\in D(A)$ and $v_i\in Au_i$, $i=1,2$, there exists $\eta\in J(u_1-u_2)$ such that
[TABLE]
where $X'$ is the dual space of $X$, $\langle\cdot,\cdot\rangle$ is the duality pairing and $J$ is the duality mapping of $X$. (See, e.g., [1].)
An accretive operator $A$ is said to be $m$-accretive if $R(I+\lambda A)=X$ for all (equivalently some) $\lambda>0$, while it is said to be quasi-$m$-accretive if there is $\omega\in\mathbb{R}$ such that $A+\omega I$ is $m$-accretive.
We refer to [1] for basic results on $m$-accretive operators in Banach spaces and the corresponding associated Cauchy problem.
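For reference, the standard definitions from [1] and their main consequence for the resolvent can be written out as follows. This is a sketch in generic notation: the symbols $X$, $A$, $J$, $\lambda$, $\omega$, $f_1$, $f_2$ are the usual ones and may differ from the notation used in the displayed formula of Definition 1.1.

```latex
% Accretivity of A in X (J is the duality mapping of X):
\forall\, u_i \in D(A),\ v_i \in A u_i,\ i=1,2,\quad
\exists\, \eta \in J(u_1 - u_2):\quad
\langle v_1 - v_2,\ \eta \rangle \ \ge\ 0 .
% Equivalent metric form:
\| u_1 - u_2 \|_X \ \le\ \| u_1 - u_2 + \lambda (v_1 - v_2) \|_X ,
\qquad \forall\, \lambda > 0 .
% m-accretivity adds the range condition R(I + \lambda A) = X, so the resolvent
% (I + \lambda A)^{-1} is defined on all of X and is nonexpansive:
\| (I+\lambda A)^{-1} f_1 - (I+\lambda A)^{-1} f_2 \|_X \ \le\ \| f_1 - f_2 \|_X .
% Quasi-m-accretive case (A + \omega I is m-accretive, \omega > 0):
\| (I+\lambda A)^{-1} f_1 - (I+\lambda A)^{-1} f_2 \|_X
  \ \le\ (1 - \lambda\omega)^{-1} \| f_1 - f_2 \|_X ,
\qquad 0 < \lambda < 1/\omega .
```

The last estimate follows from the factorization $I+\lambda A=(1-\lambda\omega)\bigl[I+\tfrac{\lambda}{1-\lambda\omega}(A+\omega I)\bigr]$ and the nonexpansivity of the resolvent of $A+\omega I$.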
2 Existence results
We set
[TABLE]
and we rewrite eq. (7) as
[TABLE]
We recall (see [3] for details), that, for , the equation
[TABLE]
has a unique solution and . Then by (10) we have
[TABLE]
Setting
[TABLE]
and taking into account that , , and , we obtain for operator the estimate
[TABLE]
Therefore eq. (11) can be rewritten as follows
[TABLE]
where and in .
Definition 2.1
The function is said to be a mild solution to equation (16) if and
[TABLE]
[TABLE]
[TABLE]
[TABLE]
We have
Theorem 2.2
Under hypotheses (1)-(3) eq. (11) has a unique mild solution . Assume further that . Then and .
Theorem 2.2 will be proven by using the standard existence theory for the Cauchy problem in Banach spaces with nonlinear quasi-$m$-accretive operators. Now, taking into account that for equation (12) uniquely defines the function , by Theorem 2.2 we obtain the following existence result for the dynamic programming equation (6).
Theorem 2.3
Under hypotheses (1)-(3) there is a unique mild solution
[TABLE]
to equation (6). Moreover, if , and , then $H^{*}\bigl(-\frac{\sigma^{2}}{2}\,\varphi_{xx}(T-t,x)\bigr)\in L^{2}([0,T]\times\mathbb{R})$.
According to the Definition 2.1 and (13), by mild solution to equation (6), we mean a function defined by
[TABLE]
[TABLE]
for and is the solution to (19).
In particular, the mild solution to equation (6) is in . Therefore, the feedback controller (8) is well defined on .
Remark 2.4
The principal advantage of Theorem 2.2 compared with standard existence results expressed in terms of viscosity solutions is the regularity of and the fact that the optimal feedback controller can be computed explicitly by the finite difference scheme (21)-(22). This will be treated in a forthcoming paper.
3 Proof of Theorem 2.2
The idea is to write equation (16) as a Cauchy problem of the form
[TABLE]
in the space , where is a suitable nonlinear quasi-$m$-accretive operator. The operator is defined as follows
[TABLE]
[TABLE]
where the derivatives are taken in the sense.
Lemma 3.1
For each and there exists a unique solution to equation
[TABLE]
Moreover, it holds
[TABLE]
, hence turns out to be quasi-$m$-accretive in .
Proof of Lemma 3.1. Assume first that . For each consider the equation
[TABLE]
in . Equivalently,
[TABLE]
where $z=\bigl(\nu\,I-\frac{d^{2}}{dx^{2}}\bigr)^{-1}\,y$ is defined by equation
[TABLE]
Note that by Hypothesis (2) the operator $\Gamma\,y=(\lambda-\nu^{2})\bigl(\nu I-\frac{d^{2}}{dx^{2}}\bigr)^{-1}y-\bigl(\nu I-\frac{d^{2}}{dx^{2}}\bigr)^{-1}(fy^{\prime})+\nu y$ is linear and continuous in and by (29) we have that
[TABLE]
[TABLE]
Here and are the norm and the scalar product in , respectively, and by , we denote the norm of . We note that Hypothesis (1) and (4) imply that the function is continuous, monotonically non–decreasing, and
[TABLE]
Furthermore, by (29)-(31), we have
[TABLE]
The latter yields
[TABLE]
where is dependent on . By assumption (3) we have that the operator $y\rightarrow\mathcal{H}(y)\equiv H^{*}\bigl(\frac{\sigma^{2}}{2}\,y\bigr)$ is maximal monotone in , hence, by (33), is maximal monotone and coercive, i.e. positive definite, therefore we have
[TABLE]
for . Consequently, for each and , eq. (28) (equivalently eq. (27)) has a unique solution , with $H^{*}\bigl(\frac{\sigma^{2}}{2}\,y_{\lambda,\nu}\bigr)\in L^{2}(\mathbb{R})$.
We have also
[TABLE]
so that .
Since by assumption (3) the operator $z\rightarrow\nu z+H^{*}\bigl(\frac{\sigma^{2}}{2}\,z\bigr)$ is invertible in , and its inverse maps into itself, we infer that .
It is worth mentioning that by (27), we have
[TABLE]
, so that
[TABLE]
, for and where . To get (34), we simply multiply the equation
[TABLE]
by
[TABLE]
where for , and we integrate on , taking into account that
[TABLE]
[TABLE]
For a rigorous proof of these relations we replace by , where is a smooth approximation of the signum function, while , see, e.g., [1], p. 115. If and is strongly convergent to , we can proceed as above to obtain for the corresponding solution to (27) the estimate (34), namely,
[TABLE]
Hence there exists such that
[TABLE]
By (28), we have
[TABLE]
[TABLE]
Let ,
that is in . Equivalently
[TABLE]
This yields
[TABLE]
and then
[TABLE]
On the other hand, by (38), we have
[TABLE]
Hence
[TABLE]
This yields
[TABLE]
and therefore, by (36), we derive the estimate
[TABLE]
Since, by hypothesis (1) , the latter implies that
[TABLE]
where is still independent of as well as on .
By (35) and (42), it follows that
[TABLE]
strongly in , and therefore solves (27). Furthermore, by (34) and (42), we have
[TABLE]
, where is independent of . We also obtain that inequality (34) holds for solution to (27), with only. Now we are going to extend the solution to (27) for all . To this end we set , rewriting (27) as follows . For every , we can equivalently write this as
[TABLE]
By (34) we also have
[TABLE]
then, by contraction principle, (45) has a unique solution , for all . Estimate (44) extends for all . In order to complete the proof of Lemma 3.1, we are going to let in equation (27), or, more precisely, in (28) which holds for all . As noted before, for all , we have
[TABLE]
and
[TABLE]
consequently
[TABLE]
and
[TABLE]
We set . Then, for , we have in and
[TABLE]
Hence
[TABLE]
strongly in , and
[TABLE]
strongly in , where , and
[TABLE]
for . Moreover, by (34), the map is Lipschitz in , with Lipschitz constant , then solves (25), and (26) follows. This completes the proof of Lemma 3.1.
Proof of Theorem 2.2 (continued). Coming back to equation (23), by Lemma 3.1 and (14), it follows that the operator is quasi-$m$-accretive in . Then by the Crandall-Liggett theorem, see [1], p. 147, the Cauchy problem (23) has a unique mild solution , that is
[TABLE]
[TABLE]
The function is a mild solution to (16) in the sense of Definition 2.1.
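In generic form, the mild solution provided by the Crandall-Liggett theorem is the limit of an implicit Euler scheme. The sketch below uses placeholder names that are assumptions of this illustration: $A$ for a quasi-$m$-accretive operator in a Banach space $X$, $y_0\in\overline{D(A)}$ for the initial datum, and step size $h=T/N$. It is written for the homogeneous problem $\frac{dy}{dt}+Ay=0$, $y(0)=y_0$; a source term is handled analogously by adding its local average to the right-hand side of each step.

```latex
% Implicit Euler (finite difference) scheme with step h = T/N:
y_h(t) = y_h^{i}, \quad t \in [ih,\,(i+1)h), \qquad
y_h^{i+1} + h\,A\,y_h^{i+1} = y_h^{i}, \quad i = 0,\dots,N-1, \qquad
y_h^{0} = y_0 .
% Each step is a resolvent evaluation, y_h^{i+1} = (I + hA)^{-1} y_h^{i}, hence
y(t) = \lim_{h \to 0} y_h(t)
     = \lim_{n \to \infty} \Bigl( I + \tfrac{t}{n} A \Bigr)^{-n} y_0
\quad \text{in } X,\ \text{uniformly on } [0,T].
```

Each step requires solving one stationary equation of the resolvent type, which is exactly what a range condition such as the one in Lemma 3.1 guarantees for small enough $h$.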
Assume now that and . Taking into account that it is easily seen that this implies that
[TABLE]
Assume also that . Then, if we take in (19), and get
[TABLE]
Multiplying by and integrating on we get
[TABLE]
Integrating by parts in , summing up, after some calculation involving (14) and (47), we get the estimate
[TABLE]
which implies the desired conclusion
[TABLE]
4 A multi-dimensional case
Consider the problem (1) in with the drift , namely
[TABLE]
subject to , and to stochastic differential equation
[TABLE]
Here is a Wiener process, satisfies assumption (1) and
(i)
(ii)
, where satisfies condition (3), while the matrix is such that is positive definite.
Let be the elliptic second order operator
[TABLE]
where . The corresponding dynamic programming equation for (48) reads as follows
[TABLE]
If
[TABLE]
equation (52) reduces to
[TABLE]
see (11), where , . By [3], for the elliptic equation in has a unique solution which satisfies if , if and if , where here is the Marcinkiewicz space. The latter implies that any solution to (53) leads to a unique solution for , , for , and, respectively, for . Concerning the existence of a solution to eq. (53), we have a result similar to the one stated in Theorem 2.2, namely
Theorem 4.1
Under assumption (i)-(ii)-(iii) there is a unique mild solution , in the sense of Definition 2.1.
Proof. We shall proceed as in the proof of Theorem 2.2. In particular, we consider the operator
[TABLE]
[TABLE]
and we write equation (53) as
[TABLE]
Lemma 4.1
The operator is m-accretive in .
Proof. Since the operator is $m$-accretive in , see, e.g., [2, 3], the same holds for the operator . Moreover, taking into account that , the $m$-accretivity of the operator follows, as claimed. Indeed, equation
[TABLE]
is equivalent to
[TABLE]
where and this implies the conclusion.
Invoking again the Crandall-Liggett theorem, we find that eq. (55) has a unique mild solution , which is given by
[TABLE]
[TABLE]
hence completing the proof of Theorem 4.1.
Theorem 4.1 yields the existence and uniqueness of a solution .
Remark 4.2
In the general -dimensional case, where , the dynamic programming equation corresponding to (1) reduces to
[TABLE]
where
[TABLE]
therefore eq. (56) can be treated analogously to what we have seen in the 1-dimensional case, at least if the operator is continuous in , which happens under some additional conditions on . We note that, for , the linear Fokker-Planck equation (56), has been treated in [2].
5 The degenerate 1-D case
Consider here equation (16), that is
[TABLE]
where is assumed to satisfy the condition only. Moreover, if we consider, as above, the operator , such that
[TABLE]
[TABLE]
the following holds
Lemma 5.1
 is quasi-$m$-accretive in .
Proof. For each we consider the operator
[TABLE]
which is quasi-$m$-accretive, see Lemma 3.1. Hence, for each and the equation
[TABLE]
has a unique solution , with $H^{*}\Bigl(\frac{\sigma^{2}+\epsilon}{2}\,y_{\epsilon}\Bigr)\in L^{\infty}(\mathbb{R})$.
Dynamic estimates. As in the proof of Lemma 3.1, we have
[TABLE]
that is for
[TABLE]
Assume now that , then, by (60) we see that for each
[TABLE]
Moreover, by (5), we also have
[TABLE]
for and large enough (independently of ). This yields
[TABLE]
Hence in for . Similarly, it follows that
[TABLE]
if is large enough, but independent of . Therefore, if we multiply the equation by and integrate on , we get which implies in .
By (60), we see that $\Bigl\{\Bigl(H^{*}\Bigl(\frac{\sigma^{2}+\epsilon}{2}\,y_{\epsilon}\Bigr)\Bigr)^{\prime}+f\,y_{\epsilon}^{\prime}\Bigr\}_{\epsilon>0}$ is bounded in .
Hence $\Bigl(H^{*}\Bigl(\frac{\sigma^{2}+\epsilon}{2}\,y_{\epsilon}\Bigr)\Bigr)^{\prime}$ is bounded in , so that $\Bigl\{\eta_{\epsilon}=H^{*}\Bigl(\frac{\sigma^{2}+\epsilon}{2}\,y_{\epsilon}\Bigr)\Bigr\}$ is compact in . It follows that, on a subsequence , we have
[TABLE]
where $\zeta=H^{*}\Bigl(\frac{\sigma^{2}}{2}\,y\Bigr)$ in . Letting in (60), we get
[TABLE]
Next for we choose , in and we have
[TABLE]
getting
[TABLE]
Hence, for we have for
[TABLE]
This yields
[TABLE]
Hence for , is the solution to equation as claimed. As seen earlier, this implies that the operator is quasi-$m$-accretive in .
Then by the existence theorem for the equation
[TABLE]
we get
Theorem 5.1
There is a unique mild solution to equation (57).
As in the previous case, Theorem 5.1 implies, via (13), the existence of a mild solution to equation (1) satisfying (20). We omit the details.
6 Conclusions
In this paper we have shown, via nonlinear semigroup theory in $L^{1}$, both the existence and the uniqueness of a mild solution to the dynamic programming equation for a stochastic optimal control problem with control in the volatility term. This problem is related to the analysis of controlled stochastic volatility models in finance, whose computational study is the subject of our ongoing research.
References
- [1] Viorel Barbu. Nonlinear differential equations of monotone types in Banach spaces. Springer Science & Business Media, 2010.
- [2] Viorel Barbu. Generalized solutions to nonlinear Fokker–Planck equations. Journal of Differential Equations, 261(4):2446–2471, 2016.
- [3] Philippe Benilan, Haim Brezis, and Michael G. Crandall. A semilinear equation in $L^{1}(\mathbb{R}^{n})$. Annali della Scuola Normale Superiore di Pisa - Classe di Scienze, 2(4):523–555, 1975.
- [4] Francesco Cordoni and Luca Di Persio. Transition density for CIR process by Lie symmetries and application to ZCB pricing. International Journal of Pure and Applied Mathematics, 88(2):239–246, 2013.
- [5] John C. Cox. Notes on option pricing I: Constant elasticity of diffusions. Stanford University, Unpublished draft(2), 1975.
- [6] Michael G. Crandall, Hitoshi Ishii, and Pierre-Louis Lions. User's guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society, 27(1):1–67, 1992.
- [7] Wendell H. Fleming and Raymond W. Rishel. Deterministic and stochastic optimal control, volume 1. Springer Science & Business Media, 2012.
- [8] P. Hagan, A. Lesniewski, and D. Woodward. Probability distribution in the SABR model of stochastic volatility. volume 110, pages 1–35, 2015.
