A Hida-Malliavin white noise calculus approach to optimal control

Nacira Agram; Bernt Øksendal

arXiv:1704.08899·math.OC·November 12, 2018


TL;DR

This paper introduces a novel approach using Hida-Malliavin calculus and white noise theory to derive optimal control conditions for systems with jumps, diffusion, and control-dependent coefficients, simplifying previous methods.

Contribution

It provides an alternative framework that handles jumps and control-dependent coefficients without requiring second order BSDEs, extending the classical maximum principle.

Findings

1. Handles systems with jumps and control-dependent coefficients

2. Avoids the need for second order BSDEs in the maximum principle

3. Illustrated with a constrained linear-quadratic optimal control example

Abstract

The classical maximum principle for optimal stochastic control states that if a control $\hat{u}$ is optimal, then the corresponding Hamiltonian has a maximum at $u=\hat{u}$. The first proofs for this result assumed that the control did not enter the diffusion coefficient. Moreover, it was assumed that there were no jumps in the system. Subsequently it was discovered by Shige Peng (still assuming no jumps) that one could also allow the diffusion coefficient to depend on the control, provided that the corresponding adjoint backward stochastic differential equation (BSDE) for the first order derivative was extended to include an extra BSDE for the second order derivatives. In this paper we present an alternative approach based on Hida-Malliavin calculus and white noise theory. This enables us to handle the general case with jumps, allowing both the diffusion coefficient and the jump…

Equations

$\begin{cases}dX(t)=b(t,X(t),u(t))\,dt+\sigma(t,X(t),u(t))\,dB(t)+\int_{\mathbb{R}_{0}}\gamma(t,X(t),u(t),\zeta)\,\tilde{N}(dt,d\zeta); & 0\leq t\leq T,\\ X(0)=x_{0}\in\mathbb{R} & \mbox{(constant).}\end{cases}$

$\int_{\mathbb{R}_{0}}\zeta^{2}\nu(d\zeta)<\infty.$

$\sup_{u\in\mathcal{A}}J(u)=J(\hat{u}),$

$J(u):=\mathbb{E}\left[\int_{0}^{T}f(t,X^{u}(t),u(t))\,dt+g(X^{u}(T))\right]$

$\mathbb{E}[Y^{2}(t)]<\infty\;\mbox{for all $t$}$

$Y(t)=at+bB(t)+\int_{0}^{t}\int_{\mathbb{R}_{0}}\zeta\tilde{N}(ds,d\zeta)$

$\eta(\cdot):=\int_{0}^{\cdot}\int_{\mathbb{R}_{0}}\zeta\tilde{N}(ds,d\zeta)$

$F=\sum_{n=0}^{\infty}I_{n}(f_{n})$

$I_{n}(f_{n})=n!\int_{0}^{T}\int_{0}^{t_{n}}\cdots\int_{0}^{t_{2}}f_{n}(t_{1},\ldots,t_{n})\,dB(t_{1})\,dB(t_{2})\cdots dB(t_{n})$

$\mathbb{E}[F^{2}]=\|F\|_{\mathbb{L}^{2}(P)}^{2}=\sum_{n=0}^{\infty}n!\,\|f_{n}\|_{\mathbb{L}^{2}(\lambda^{n})}^{2}.$

$\|F\|_{\mathbb{D}_{1,2}}^{2}:=\sum_{n=1}^{\infty}n\,n!\,\|f_{n}\|_{\mathbb{L}^{2}(\lambda^{n})}^{2}<\infty.$

$D_{t}F=\sum_{n=1}^{\infty}nI_{n-1}(f_{n}(\cdot,t)),$

$\mathbb{E}\left[\int_{0}^{T}(D_{t}F)^{2}dt\right]=\sum_{n=1}^{\infty}n\,n!\,\|f_{n}\|_{\mathbb{L}^{2}(\lambda^{n})}^{2}=\|F\|_{\mathbb{D}_{1,2}}^{2},$

$D_{t}F=f(t)\;\mbox{for a.a. $t\in[0,T]$}.$

$D_{t}[{\textstyle\int_{0}^{T}}\psi(s)dB(s)]={\textstyle\int_{0}^{T}}D_{t}\psi(s)dB(s)+\psi(t)\;\mbox{for a.a. $(t,\omega)$}.$

$D_{t}\Phi(F_{1},\ldots,F_{m})=\sum_{i=1}^{m}\frac{\partial\Phi}{\partial x_{i}}(F_{1},\ldots,F_{m})D_{t}F_{i}.$

$\mathbb{E}\left[F\int_{0}^{T}\psi(t)dB(t)\right]=\mathbb{E}\left[\int_{0}^{T}\psi(t)D_{t}F\,dt\right].$

$D_{s}\varphi(t)=0\;\mbox{for $s>t$}.$

$D_{t}F\in(\mathcal{S})^{\ast}$ and $(t,\omega)\mapsto\mathbb{E}[D_{t}F|\mathcal{F}_{t}]$ belongs to $\mathbb{L}^{2}(\lambda\times P)$.

$F=\mathbb{E}[F]+\int_{0}^{T}\mathbb{E}[D_{t}F|\mathcal{F}_{t}]dB(t)$

$\mathbb{E}\left[F\int_{0}^{T}\varphi(t)dB(t)\right]=\mathbb{E}\left[\int_{0}^{T}\mathbb{E}[D_{t}F|\mathcal{F}_{t}]\varphi(t)dt\right].$

$\mathbb{E}\left[F\int_{0}^{T}\varphi(t)dB(t)\right]=\mathbb{E}\left[\left(\mathbb{E}[F]+\int_{0}^{T}\mathbb{E}[D_{t}F|\mathcal{F}_{t}]dB(t)\right)\left(\int_{0}^{T}\varphi(t)dB(t)\right)\right]=\mathbb{E}\left[\int_{0}^{T}\mathbb{E}[D_{t}F|\mathcal{F}_{t}]\varphi(t)dt\right].$

$F=\sum_{n=0}^{\infty}I_{n}(f_{n});\quad f_{n}\in\hat{L}^{2}((\lambda\times\nu)^{n}),$

$I_{n}(f_{n}):=n!\int_{0}^{T}\int_{\mathbb{R}_{0}}\int_{0}^{t_{n}}\int_{\mathbb{R}_{0}}\cdots\int_{0}^{t_{2}}\int_{\mathbb{R}_{0}}f_{n}(t_{1},\zeta_{1},\ldots,t_{n},\zeta_{n})\,\tilde{N}(dt_{1},d\zeta_{1})\cdots\tilde{N}(dt_{n},d\zeta_{n}),$

$\|F\|_{\mathbb{L}^{2}(P)}^{2}=\sum_{n=0}^{\infty}n!\,\|f_{n}\|_{\mathbb{L}^{2}((\lambda\times\nu)^{n})}^{2}.$

$\|F\|_{\mathbb{D}_{1,2}^{(\tilde{N})}}^{2}:=\sum_{n=1}^{\infty}n\,n!\,\|f_{n}\|_{\mathbb{L}^{2}((\lambda\times\nu)^{n})}^{2}<\infty.$

$D_{t,\zeta}F:=\sum_{n=1}^{\infty}nI_{n-1}(f_{n}(\cdot,t,\zeta)),$

$\mathbb{E}\left[\int_{0}^{T}\int_{\mathbb{R}_{0}}(D_{t,\zeta}F)^{2}\nu(d\zeta)dt\right]=\sum_{n=0}^{\infty}n\,n!\,\|f_{n}\|_{\mathbb{L}^{2}((\lambda\times\nu)^{n})}^{2}=\|F\|_{\mathbb{D}_{1,2}^{(\tilde{N})}}^{2}.$

$D_{t,\zeta}F=f(t,\zeta)\;\mbox{for a.a. $(t,\zeta)$}.$

$D_{t,\zeta}\left(\int_{0}^{T}\int_{\mathbb{R}_{0}}\psi(s,\zeta)\tilde{N}(ds,d\zeta)\right)=\int_{0}^{T}\int_{\mathbb{R}_{0}}D_{t,\zeta}\psi(s,\zeta)\tilde{N}(ds,d\zeta)+\psi(t,\zeta)\;\mbox{for a.a. $t,\zeta$}.$


Full text


Nacira Agram$^{1,2}$ and Bernt Øksendal$^{1}$

(8 November 2018)

Abstract

The classical maximum principle for optimal stochastic control states that if a control $\hat{u}$ is optimal, then the corresponding Hamiltonian has a maximum at $u=\hat{u}$ . The first proofs for this result assumed that the control did not enter the diffusion coefficient. Moreover, it was assumed that there were no jumps in the system. Subsequently it was discovered by Shige Peng (still assuming no jumps) that one could also allow the diffusion coefficient to depend on the control, provided that the corresponding adjoint backward stochastic differential equation (BSDE) for the first order derivative was extended to include an extra BSDE for the second order derivatives.

In this paper we present an alternative approach based on Hida-Malliavin calculus and white noise theory. This enables us to handle the general case with jumps, allowing both the diffusion coefficient and the jump coefficient to depend on the control, and we do not need the extra BSDE with second order derivatives.

The result is illustrated by an example of a constrained linear-quadratic optimal control.

$^{1}$ Department of Mathematics, University of Oslo, P.O. Box 1053 Blindern, N–0316 Oslo, Norway. Email: [email protected], [email protected]. This research was carried out with support of the Norwegian Research Council, within the research project Challenges in Stochastic Control, Information and Applications (STOCONINF), project number 250768/F20.

$^{2}$ University of Biskra, Algeria.

MSC(2010):

60H05, 60H20, 60J75, 93E20, 91G80, 91B70.

Keywords:

Stochastic maximum principle; spike perturbation; backward stochastic differential equation (BSDE); white noise theory; Hida-Malliavin calculus.

1 Introduction

Let $X^{u}(t)=X(t)$ be a solution of a controlled stochastic jump diffusion of the form

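$\begin{cases}dX(t)=b(t,X(t),u(t))\,dt+\sigma(t,X(t),u(t))\,dB(t)+\int_{\mathbb{R}_{0}}\gamma(t,X(t),u(t),\zeta)\,\tilde{N}(dt,d\zeta); & 0\leq t\leq T,\\ X(0)=x_{0}\in\mathbb{R} & \mbox{(constant).}\end{cases}$

Here $B(t)$ and $\tilde{N}(dt,d\zeta):=N(dt,d\zeta)-\nu(d\zeta)dt$ are a Brownian motion and an independent compensated Poisson random measure, respectively, jointly defined on a filtered probability space $(\Omega,\mathcal{F},\mathbb{F}=\{\mathcal{F}_{t}\}_{t\geq 0},P)$ satisfying the usual conditions. The measure $\nu$ is the Lévy measure of $N$, $T>0$ is a given constant and $u=u(t)$ is our control process. We assume that

$\int_{\mathbb{R}_{0}}\zeta^{2}\nu(d\zeta)<\infty.$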

Now for $u$ to be admissible, we require that $u$ is $\mathbb{F}$ -adapted and that $u(t)\in V$ for all $t$ for some given Borel set $V\subset\mathbb{R}$ . The given coefficients $b(t,x,u)=b(t,x,u,\omega),\sigma(t,x,u)=\sigma(t,x,u,\omega)$ and $\gamma(t,x,u,\zeta)=\gamma(t,x,u,\zeta,\omega)$ are assumed to be $\mathbb{F}$ -predictable for each given $x,u$ and $\zeta$ .

Problem 1.1

We want to find $\hat{u}$ such that

$\sup_{u\in\mathcal{A}}J(u)=J(\hat{u}),$

where $\mathcal{A}$ denotes the set of admissible controls, and

$J(u):=\mathbb{E}\left[\int_{0}^{T}f(t,X^{u}(t),u(t))\,dt+g(X^{u}(T))\right]$

is our performance functional, with a given $\mathbb{F}$-adapted profit rate $f(t,x,u)=f(t,x,u,\omega)$ and a given $\mathcal{F}_{T}$-measurable terminal payoff $g(x)=g(x,\omega)$. Such a control $\hat{u}$ (if it exists) is called an optimal control.
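To make the problem concrete, the following is a minimal numerical sketch (not part of the paper) that simulates the controlled jump diffusion with an Euler scheme and estimates $J(u)$ by Monte Carlo. It assumes a finite Lévy measure concentrated on a single jump size, $\nu=\lambda\delta_{z}$, and a control given in feedback form; the names `estimate_performance`, `sample_poisson` and `lq_demo` are hypothetical.

```python
import math
import random


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def estimate_performance(b, sigma, gamma, f, g, control, x0, T,
                         jump_rate, jump_size,
                         n_steps=40, n_paths=5000, seed=0):
    """Monte Carlo estimate of J(u) = E[int_0^T f dt + g(X(T))] via an Euler scheme.

    Assumption (not from the paper): nu = jump_rate * delta_{jump_size}, so the
    compensated increment of N~ over a step is Poisson(rate*dt) - rate*dt.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        x, running = x0, 0.0
        for k in range(n_steps):
            t = k * dt
            u = control(t, x)  # uses only the current state, hence adapted
            d_b = rng.gauss(0.0, math.sqrt(dt))
            d_n = sample_poisson(rng, jump_rate * dt) - jump_rate * dt
            running += f(t, x, u) * dt
            x += (b(t, x, u) * dt + sigma(t, x, u) * d_b
                  + gamma(t, x, u, jump_size) * d_n)
        total += running + g(x)
    return total / n_paths


def lq_demo():
    """Constant control c = -0.5 in dX = u dt + 0.2 dB + zeta N~(dt, dzeta)."""
    return estimate_performance(
        b=lambda t, x, u: u,
        sigma=lambda t, x, u: 0.2,
        gamma=lambda t, x, u, z: z,
        f=lambda t, x, u: -0.5 * u * u,
        g=lambda x: -0.5 * x * x,
        control=lambda t, x: -0.5,
        x0=1.0, T=1.0, jump_rate=2.0, jump_size=0.1)
```

For the coefficients in `lq_demo`, the exact value can be computed by hand: $J=-\tfrac12 c^{2}T-\tfrac12\big[(x_{0}+cT)^{2}+\sigma^{2}T+\lambda z^{2}T\big]=-0.28$, so the estimate should land close to that number up to Monte Carlo error.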

In the classical maximum principle for optimal control one associates to the system a Hamiltonian function and an adjoint BSDE, involving the first order derivatives of the coefficients of the system. The maximum principle states that if $\hat{u}$ is optimal, then the corresponding Hamiltonian has a maximum at $u=\hat{u}$. To prove this, one can perform a so-called spike perturbation of the optimal control, and study what happens in the limit when the spike perturbation converges to $0$. This was first done by Bensoussan [5], in the case when there are no jumps ($\gamma=0$) and when the diffusion coefficient $\sigma$ does not depend on $u$.

Subsequently it was discovered by Peng [15] (still in the case with no jumps) that the maximum principle could be extended to allow $\sigma$ to depend on $u$ provided that the original adjoint BSDE was accompanied by a second order BSDE and the Hamiltonian was extended accordingly. See e.g. Chapter 3 in Yong and Zhou [17] for a discussion of this.

The purpose of our paper is to show that if we use spike perturbation combined with white noise theory and the associated Hida-Malliavin calculus, we can obtain a maximum principle similar to the classical type, with the classical Hamiltonian and only the first order adjoint BSDE, allowing jumps and allowing both the diffusion coefficient $\sigma$ and the jump coefficient $\gamma$ to depend on $u$ .

We remark that if the set $\mathcal{A}$ of admissible control processes is convex, we can also use convex perturbation to obtain related (albeit weaker) versions of the maximum principle. See e.g. Bensoussan [5] and Øksendal and Sulem [13] and the references therein.

Also note that Rong proves in Chapter 12 in [16] that if we have jumps in the dynamics and the control domain is not convex, then the approach cannot allow the jump coefficient to depend on the control.

Our paper is organized as follows:

• In Section 2, we give a short survey of the Hida-Malliavin calculus.

• In Section 3, we prove our main result.

• In Section 4, we illustrate our result by an example of a constrained linear-quadratic optimal control.

2 A brief review of Hida-Malliavin calculus for Lévy processes

The Malliavin derivative was originally introduced by Malliavin in [10] as a stochastic calculus of variation used to prove results about smoothness of densities of solutions of stochastic differential equations in $\mathbb{R}^{n}$ driven by Brownian motion. The domain of definition of the Malliavin derivative is a subspace $\mathbb{D}_{1,2}$ of $\mathbb{L}^{2}(P)$ . Subsequently, in Aase et al [1] the Malliavin derivative was put into the context of the white noise theory of Hida and extended to an operator defined on the whole of $\mathbb{L}^{2}(P)$ and with values in the Hida space $(\mathcal{S})^{\ast}$ of stochastic distributions. This extension is called the Hida-Malliavin derivative.

There are several advantages to working with this extended Hida-Malliavin derivative:

• The Hida-Malliavin derivative is defined on all of $\mathbb{L}^{2}(P)$, and it coincides with the classical Malliavin derivative on the subspace $\mathbb{D}_{1,2}$.

• The Hida-Malliavin derivative combines well with the white noise calculus, including the Skorohod integral and calculus with the Wick product $\diamond$.

• Moreover, it extends easily to a Hida-Malliavin derivative with respect to a Poisson random measure.

These statements are made more precise in the following brief review, where we recall the basic definition and properties of Hida-Malliavin calculus for Lévy processes. The summary is partly based on Agram and Øksendal [2] and Agram et al [3], [4]. General references for this presentation are Aase et al [1], Benth [6], Lindstrøm et al [9], and the books Hida et al [8] and Di Nunno et al [7].

In a white noise context, the Hida-Malliavin derivative is simply a stochastic gradient. Equivalently, one can introduce this derivative by means of chaos expansions, as follows:

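First, recall the Lévy–Itô decomposition theorem, which states that any Lévy process $Y(t)$ with

$\mathbb{E}[Y^{2}(t)]<\infty\;\mbox{for all $t$}$

can be written

$Y(t)=at+bB(t)+\int_{0}^{t}\int_{\mathbb{R}_{0}}\zeta\tilde{N}(ds,d\zeta)$

with constants $a$ and $b$. In view of this we see that it suffices to deal with Hida-Malliavin calculus for $B(\cdot)$ and for

$\eta(\cdot):=\int_{0}^{\cdot}\int_{\mathbb{R}_{0}}\zeta\tilde{N}(ds,d\zeta)$

separately.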

2.1 Hida-Malliavin calculus for $B(\cdot)$

A natural starting point is the Wiener-Itô chaos expansion theorem, which states that any $F\in\mathbb{L}^{2}(\mathcal{F}_{T},P)$ can be written

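$F=\sum_{n=0}^{\infty}I_{n}(f_{n})$

for a unique sequence of symmetric deterministic functions $f_{n}\in\mathbb{L}^{2}(\lambda^{n})$, where $\lambda$ is Lebesgue measure on $[0,T]$ and

$I_{n}(f_{n})=n!\int_{0}^{T}\int_{0}^{t_{n}}\cdots\int_{0}^{t_{2}}f_{n}(t_{1},\ldots,t_{n})\,dB(t_{1})\,dB(t_{2})\cdots dB(t_{n})$

(the $n$-times iterated integral of $f_{n}$ with respect to $B(\cdot)$) for $n=1,2,\ldots$ and $I_{0}(f_{0})=f_{0}$ when $f_{0}$ is a constant.

Moreover, we have the isometry

$\mathbb{E}[F^{2}]=\|F\|_{\mathbb{L}^{2}(P)}^{2}=\sum_{n=0}^{\infty}n!\,\|f_{n}\|_{\mathbb{L}^{2}(\lambda^{n})}^{2}.$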

Definition 2.1 (Hida-Malliavin derivative $D_{t}$ with respect to $B(\cdot)$ )

Let $\mathbb{D}_{1,2}=\mathbb{D}_{1,2}^{(B)}$ be the space of all $F\in\mathbb{L}^{2}(\mathcal{F}_{T},P)$ such that its chaos expansion (2.1) satisfies

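$\|F\|_{\mathbb{D}_{1,2}}^{2}:=\sum_{n=1}^{\infty}n\,n!\,\|f_{n}\|_{\mathbb{L}^{2}(\lambda^{n})}^{2}<\infty.$

For $F\in\mathbb{D}_{1,2}$ and $t\in[0,T]$, we define the Hida-Malliavin derivative (or the stochastic gradient) of $F$ at $t$ (with respect to $B(\cdot)$), $D_{t}F$, by

$D_{t}F=\sum_{n=1}^{\infty}nI_{n-1}(f_{n}(\cdot,t)),$

where the notation $I_{n-1}(f_{n}(\cdot,t))$ means that we apply the $(n-1)$-times iterated integral to the first $n-1$ variables $t_{1},\cdots,t_{n-1}$ of $f_{n}(t_{1},t_{2},\cdots,t_{n})$ and keep the last variable $t_{n}=t$ as a parameter.

One can easily check that

$\mathbb{E}\left[\int_{0}^{T}(D_{t}F)^{2}dt\right]=\sum_{n=1}^{\infty}n\,n!\,\|f_{n}\|_{\mathbb{L}^{2}(\lambda^{n})}^{2}=\|F\|_{\mathbb{D}_{1,2}}^{2},$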

so $(t,\omega)\mapsto D_{t}F(\omega)$ belongs to $\mathbb{L}^{2}(\lambda\times P)$ .
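As a sanity check (a worked example that is not part of the original text), take $F=B(T)^{2}$. Its chaos expansion has only the terms $n=0$ and $n=2$, so both the isometry $\mathbb{E}[F^{2}]=\sum_{n}n!\,\|f_{n}\|^{2}$ and the identity $\mathbb{E}[\int_{0}^{T}(D_{t}F)^{2}dt]=\sum_{n}n\,n!\,\|f_{n}\|^{2}$ can be verified by hand:

```latex
\begin{align*}
F &= B(T)^{2} = T + 2\int_{0}^{T}\!\!\int_{0}^{t_{2}} dB(t_{1})\,dB(t_{2})
   = I_{0}(f_{0}) + I_{2}(f_{2}),
   \qquad f_{0}=T,\quad f_{2}\equiv 1 \text{ on } [0,T]^{2},\\
\mathbb{E}[F^{2}] &= f_{0}^{2} + 2!\,\|f_{2}\|_{L^{2}(\lambda^{2})}^{2}
   = T^{2} + 2T^{2} = 3T^{2} = \mathbb{E}[B(T)^{4}],\\
D_{t}F &= 2\,I_{1}(f_{2}(\cdot,t)) = 2B(T),\\
\mathbb{E}\Big[\int_{0}^{T}(D_{t}F)^{2}\,dt\Big] &= 4T^{2}
   = 2\cdot 2!\,\|f_{2}\|_{L^{2}(\lambda^{2})}^{2}.
\end{align*}
```

The value $D_{t}F=2B(T)$ also agrees with what the chain rule gives for $\Phi(x)=x^{2}$ applied to $B(T)$, at least formally, since this $\Phi$ does not have a bounded derivative.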

Example 2.1

If $F={\textstyle\int_{0}^{T}}f(t)dB(t)$ with $f\in\mathbb{L}^{2}(\lambda)$ deterministic, then

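$D_{t}F=f(t)\;\mbox{for a.a. $t\in[0,T]$}.$

More generally, if $\psi(s)$ is Itô integrable, $\psi(s)\in\mathbb{D}_{1,2}$ for $a.a.\;s$ and $D_{t}\psi(s)$ is Itô integrable for $a.a.\;t$, then

$D_{t}[{\textstyle\int_{0}^{T}}\psi(s)dB(s)]={\textstyle\int_{0}^{T}}D_{t}\psi(s)dB(s)+\psi(t)\;\mbox{for a.a. $(t,\omega)$}.$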

Some other basic properties of the Hida-Malliavin derivative $D_{t}$ are the following:

(i) Chain rule

Suppose $F_{1},\ldots,F_{m}\in\mathbb{D}_{1,2}$ and that $\Phi:\mathbb{R}^{m}\rightarrow\mathbb{R}$ is $C^{1}$ with bounded partial derivatives. Then, $\Phi(F_{1},\cdots,F_{m})\in\mathbb{D}_{1,2}$ and

$D_{t}\Phi(F_{1},\ldots,F_{m})=\sum_{i=1}^{m}\frac{\partial\Phi}{\partial x_{i}}(F_{1},\ldots,F_{m})D_{t}F_{i}.$

(ii) Duality formula

Suppose $\psi(t)$ is $\mathbb{F}$-adapted with $\mathbb{E}[{\textstyle\int_{0}^{T}}\psi^{2}(t)dt]<\infty$ and let $F\in\mathbb{D}_{1,2}$. Then,

$\mathbb{E}\left[F\int_{0}^{T}\psi(t)dB(t)\right]=\mathbb{E}\left[\int_{0}^{T}\psi(t)D_{t}F\,dt\right].$

(iii) Malliavin derivative and adapted processes

If $\varphi$ is an $\mathbb{F}$-adapted process, then

$D_{s}\varphi(t)=0\;\mbox{for $s>t$}.$

Remark 2.2

We put $D_{t}\varphi(t)=\underset{s\rightarrow t-}{\lim}D_{s}\varphi(t)$ (if the limit exists).

2.2 Extension to a white noise setting

In the following, we let $(\mathcal{S})^{\ast}$ denote the Hida space of stochastic distributions.

It was proved in Aase et al [1] that one can extend the Hida-Malliavin derivative operator $D_{t}$ from $\mathbb{D}_{1,2}$ to all of $\mathbb{L}^{2}(\mathcal{F}_{T},P)$ in such a way that, also denoting the extended operator by $D_{t}$ , for all $F\in\mathbb{L}^{2}(\mathcal{F}_{T},P)$ , we have

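$D_{t}F\in(\mathcal{S})^{\ast}$ and $(t,\omega)\mapsto\mathbb{E}[D_{t}F|\mathcal{F}_{t}]$ belongs to $\mathbb{L}^{2}(\lambda\times P)$.

Moreover, the following generalized Clark-Haussmann-Ocone formula was proved:

$F=\mathbb{E}[F]+\int_{0}^{T}\mathbb{E}[D_{t}F|\mathcal{F}_{t}]dB(t)$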

for all $F\in\mathbb{L}^{2}(\mathcal{F}_{T},P)$ . See Theorem 3.11 in Aase et al [1] and also Theorem 6.35 in Di Nunno et al [7].
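For illustration (an example added here, not taken from the paper), the Clark-Haussmann-Ocone representation can be written out explicitly for $F=B(T)^{2}$, using $D_{t}F=2B(T)$ and the martingale property of $B$:

```latex
\begin{align*}
\mathbb{E}[F] &= T, \qquad D_{t}F = 2B(T), \qquad
\mathbb{E}[D_{t}F \mid \mathcal{F}_{t}] = 2B(t),\\
F &= T + \int_{0}^{T} 2B(t)\,dB(t) = B(T)^{2}.
\end{align*}
```

The last identity is exactly what the Itô formula gives for $B(t)^{2}$, so in this simple case the representation can be checked independently.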

We can use this to get the following extension of the duality formula (ii) above:

Proposition 2.3 (The generalized duality formula)

Let $F\in\mathbb{L}^{2}(\mathcal{F}_{T},P)$ and let $\varphi(t,\omega)\in\mathbb{L}^{2}(\lambda\times P)$ be $\mathbb{F}$ -adapted. Then

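$\mathbb{E}\left[F\int_{0}^{T}\varphi(t)dB(t)\right]=\mathbb{E}\left[\int_{0}^{T}\mathbb{E}[D_{t}F|\mathcal{F}_{t}]\varphi(t)dt\right].$

Proof. By (2.5) and (2.6) and the Itô isometry, we get

$\mathbb{E}\left[F\int_{0}^{T}\varphi(t)dB(t)\right]=\mathbb{E}\left[\left(\mathbb{E}[F]+\int_{0}^{T}\mathbb{E}[D_{t}F|\mathcal{F}_{t}]dB(t)\right)\left(\int_{0}^{T}\varphi(t)dB(t)\right)\right]=\mathbb{E}\left[\int_{0}^{T}\mathbb{E}[D_{t}F|\mathcal{F}_{t}]\varphi(t)dt\right].$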

$\square$
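The duality formula can also be checked numerically. The sketch below (an illustration, not from the paper) takes $F=B(T)^{2}$, for which $D_{t}F=2B(T)$, and the adapted integrand $\varphi(t)=B(t)$; a short calculation gives $T^{2}$ for both sides. The function name `duality_check` and all parameter values are hypothetical, and the Itô integral is approximated by left-endpoint sums.

```python
import math
import random


def duality_check(T=1.0, n_steps=50, n_paths=20000, seed=1):
    """Estimate E[F int phi dB] and E[int phi D_t F dt]
    for F = B(T)^2 (so D_t F = 2 B(T)) and phi(t) = B(t)."""
    rng = random.Random(seed)
    dt = T / n_steps
    sd = math.sqrt(dt)
    lhs = rhs = 0.0
    for _ in range(n_paths):
        b = 0.0
        ito_sum = 0.0   # sum phi(t_i) * (B(t_{i+1}) - B(t_i)), left endpoint
        time_sum = 0.0  # sum phi(t_i) * dt
        for _step in range(n_steps):
            d_b = rng.gauss(0.0, sd)
            ito_sum += b * d_b
            time_sum += b * dt
            b += d_b
        lhs += b * b * ito_sum       # F * int phi dB
        rhs += 2.0 * b * time_sum    # int phi(t) * D_t F dt, with D_t F = 2 B(T)
    return lhs / n_paths, rhs / n_paths
```

Both returned values should be close to $T^{2}$, up to a discretization bias of order $T\,dt$ and the Monte Carlo error. Replacing $D_{t}F$ by $\mathbb{E}[D_{t}F|\mathcal{F}_{t}]=2B(t)$ on the right-hand side gives the same expectation, which is the content of the generalized formula.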

We will use this extension of the Hida-Malliavin derivative from now on.

2.3 Hida-Malliavin calculus for $\tilde{N}(\cdot)$

The construction of a stochastic derivative/Hida-Malliavin derivative in the pure jump martingale case follows the same lines as in the Brownian motion case. In this case, the corresponding Wiener-Itô Chaos Expansion Theorem states that any $F\in\mathbb{L}^{2}(\mathcal{F}_{T},P)$ (where, in this case, $\mathcal{F}_{t}=\mathcal{F}_{t}^{(\tilde{N})}$ is the $\sigma-$ algebra generated by $\eta(s):={\textstyle\int_{0}^{s}}{\textstyle\int_{\mathbb{R}_{0}}}\zeta\tilde{N}(dr,d\zeta);\;0\leq s\leq t$ ) can be written as

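$F=\sum_{n=0}^{\infty}I_{n}(f_{n});\quad f_{n}\in\hat{L}^{2}((\lambda\times\nu)^{n}),$

where $\hat{L}^{2}((\lambda\times\nu)^{n})$ is the space of functions $f_{n}(t_{1},\zeta_{1},\ldots,t_{n},\zeta_{n})$; $t_{i}\in[0,T]$, $\zeta_{i}\in\mathbb{R}_{0}$ for $i=1,\ldots,n$, such that $f_{n}\in\mathbb{L}^{2}((\lambda\times\nu)^{n})$ and $f_{n}$ is symmetric with respect to the pairs of variables $(t_{1},\zeta_{1}),\ldots,(t_{n},\zeta_{n})$.

It is important to note that in this case, the $n$-times iterated integral $I_{n}(f_{n})$ is taken with respect to $\tilde{N}(dt,d\zeta)$ and not with respect to $d\eta(t)$. Thus, we define

$I_{n}(f_{n}):=n!\int_{0}^{T}\int_{\mathbb{R}_{0}}\int_{0}^{t_{n}}\int_{\mathbb{R}_{0}}\cdots\int_{0}^{t_{2}}\int_{\mathbb{R}_{0}}f_{n}(t_{1},\zeta_{1},\ldots,t_{n},\zeta_{n})\,\tilde{N}(dt_{1},d\zeta_{1})\cdots\tilde{N}(dt_{n},d\zeta_{n}),$

for $f_{n}\in\hat{L}^{2}((\lambda\times\nu)^{n})$.

The Itô isometry for stochastic integrals with respect to $\tilde{N}(dt,d\zeta)$ then gives the following isometry for the chaos expansion:

$\|F\|_{\mathbb{L}^{2}(P)}^{2}=\sum_{n=0}^{\infty}n!\,\|f_{n}\|_{\mathbb{L}^{2}((\lambda\times\nu)^{n})}^{2}.$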
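The first-order case of this isometry, $\mathbb{E}[I_{1}(f)^{2}]=\int_{0}^{T}\int_{\mathbb{R}_{0}}f^{2}\,\nu(d\zeta)\,dt$, is easy to test by simulation. The sketch below is an illustration only: it assumes unit jumps, $\nu=\lambda\delta_{1}$, so that $N$ is a Poisson process with intensity $\lambda$ and $f$ depends on $t$ alone. The helper names `integrate`, `second_moment` and `isometry_rhs` are hypothetical.

```python
import random


def integrate(h, T, n=1000):
    """Midpoint rule for int_0^T h(t) dt."""
    dt = T / n
    return sum(h((k + 0.5) * dt) for k in range(n)) * dt


def second_moment(f, T=1.0, rate=2.0, n_paths=100000, seed=2):
    """Monte Carlo estimate of E[(int_0^T f(t) N~(dt))^2] with nu = rate * delta_1."""
    rng = random.Random(seed)
    compensator = rate * integrate(f, T)
    acc = 0.0
    for _ in range(n_paths):
        jumps = 0.0
        t = rng.expovariate(rate)       # first jump time
        while t <= T:
            jumps += f(t)               # integral against N is a sum over jump times
            t += rng.expovariate(rate)  # exponential inter-arrival times
        acc += (jumps - compensator) ** 2
    return acc / n_paths


def isometry_rhs(f, T=1.0, rate=2.0):
    """Right-hand side: rate * int_0^T f(t)^2 dt."""
    return rate * integrate(lambda t: f(t) ** 2, T)
```

With $f(t)=t$, $T=1$ and $\lambda=2$, the right-hand side is $\lambda T^{3}/3=2/3$, and `second_moment(lambda t: t)` should agree with it up to Monte Carlo error.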

As in the Brownian motion case, we use the chaos expansion to define the Malliavin derivative. Note that in this case, there are two parameters $t,\zeta,$ where $t$ represents time and $\zeta\neq 0$ represents a generic jump size.

Definition 2.4 (Hida-Malliavin derivative $D_{t,\zeta}$ with respect to $\tilde{N}(\cdot,\cdot)$ )

Let $\mathbb{D}_{1,2}^{(\tilde{N})}$ be the space of all $F\in\mathbb{L}^{2}(\mathcal{F}_{T},P)$ such that its chaos expansion (2.8) satisfies

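$\|F\|_{\mathbb{D}_{1,2}^{(\tilde{N})}}^{2}:=\sum_{n=1}^{\infty}n\,n!\,\|f_{n}\|_{\mathbb{L}^{2}((\lambda\times\nu)^{n})}^{2}<\infty.$

For $F\in\mathbb{D}_{1,2}^{(\tilde{N})}$, we define the Hida-Malliavin derivative of $F$ at $(t,\zeta)$ (with respect to $\tilde{N}(\cdot,\cdot)$), $D_{t,\zeta}F$, by

$D_{t,\zeta}F:=\sum_{n=1}^{\infty}nI_{n-1}(f_{n}(\cdot,t,\zeta)),$

where $I_{n-1}(f_{n}(\cdot,t,\zeta))$ means that we apply the $(n-1)$-times iterated integral with respect to $\tilde{N}$ to the first $n-1$ variable pairs $(t_{1},\zeta_{1}),\cdots,(t_{n-1},\zeta_{n-1})$, keeping $(t_{n},\zeta_{n})=(t,\zeta)$ as a parameter.

In this case, we get the isometry

$\mathbb{E}\left[\int_{0}^{T}\int_{\mathbb{R}_{0}}(D_{t,\zeta}F)^{2}\nu(d\zeta)dt\right]=\sum_{n=0}^{\infty}n\,n!\,\|f_{n}\|_{\mathbb{L}^{2}((\lambda\times\nu)^{n})}^{2}=\|F\|_{\mathbb{D}_{1,2}^{(\tilde{N})}}^{2}.$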

(Compare with (2.3).)

Example 2.2

If $F={\textstyle\int_{0}^{T}}{\textstyle\int_{\mathbb{R}_{0}}}f(t,\zeta)\tilde{N}(dt,d\zeta)$ for some deterministic $f(t,\zeta)\in\mathbb{L}^{2}(\lambda\times\nu)$ , then

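$D_{t,\zeta}F=f(t,\zeta)\;\mbox{for a.a. $(t,\zeta)$}.$

More generally, if $\psi(s,\zeta)$ is integrable with respect to $\tilde{N}(ds,d\zeta)$, $\psi(s,\zeta)\in\mathbb{D}_{1,2}^{(\tilde{N})}$ for $a.a.\,s,\zeta$ and $D_{t,\zeta}\psi(s,\zeta)$ is integrable for $a.a.\,(t,\zeta)$, then

$D_{t,\zeta}\left(\int_{0}^{T}\int_{\mathbb{R}_{0}}\psi(s,\zeta)\tilde{N}(ds,d\zeta)\right)=\int_{0}^{T}\int_{\mathbb{R}_{0}}D_{t,\zeta}\psi(s,\zeta)\tilde{N}(ds,d\zeta)+\psi(t,\zeta)\;\mbox{for a.a. $t,\zeta$}.$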

The properties of $D_{t,\zeta}$ corresponding to those of $D_{t}$ are the following:

(i) Chain rule

Suppose $F_{1},\cdots,F_{m}\in\mathbb{D}_{1,2}^{(\tilde{N})}$ and that $\phi:\mathbb{R}^{m}\rightarrow\mathbb{R}$ is continuous and bounded. Then, $\phi(F_{1},\cdots,F_{m})\in\mathbb{D}_{1,2}^{(\tilde{N})}$ and

[TABLE]

(ii) Duality formula

Suppose $\Psi(t,\zeta)$ is $\mathbb{F}$-adapted and $\mathbb{E}[{\textstyle\int_{0}^{T}}{\textstyle\int_{\mathbb{R}_{0}}}\Psi^{2}(t,\zeta)\nu(d\zeta)dt]<\infty$ and let $F\in\mathbb{D}_{1,2}^{(\tilde{N})}$. Then,

[TABLE]

(iii) Hida-Malliavin derivative and adapted processes

If $\varphi$ is an $\mathbb{F}$-adapted process, then

[TABLE]

Remark 2.5

We put $D_{t,\zeta}\varphi(t)=\underset{s\rightarrow t-}{\lim}D_{s,\zeta}\varphi(t)$ (if the limit exists).

2.4 Extension to a white noise setting

As in section 2.2, we note that there is an extension of the Hida-Malliavin derivative $D_{t,\zeta}$ from $\mathbb{D}_{1,2}^{(\tilde{N})}$ to $\mathbb{L}^{2}(\lambda\times P)$ such that the following extension of the duality theorem holds:

Proposition 2.6 (Generalized duality formula)

Suppose $\Psi(t,\zeta)$ is $\mathbb{F}$ -adapted and

[TABLE]

and let $F\in\mathbb{L}^{2}(\lambda\times P)$ . Then,

[TABLE]

Accordingly, note that from now on we are working with this generalized version of the Malliavin derivative. We emphasize that this generalized Hida-Malliavin derivative $DX$ (where $D$ stands for $D_{t}$ or $D_{t,\zeta}$ , depending on the setting) exists for all $X\in\mathbb{L}^{2}(P)$ as an element of the Hida stochastic distribution space $(\mathcal{S})^{\ast}$ , and it has the property that the conditional expectation $\mathbb{E}[DX|\mathcal{F}_{t}]$ belongs to $\mathbb{L}^{2}(\lambda\times P)$ , where $\lambda$ is Lebesgue measure on $[0,T]$ . Therefore, when using the Hida-Malliavin derivative, combined with conditional expectation, no assumptions on Hida-Malliavin differentiability in the classical sense are needed; we can work on the whole space of random variables in $\mathbb{L}^{2}(P)$ .

2.5 Representation of solutions of BSDE

The following result, due to Øksendal and Røse [12], is crucial for our method:

Theorem 2.7

Suppose that $f,p,q$ and $r$ are given càdlàg adapted processes in $\mathbb{L}^{2}(\lambda\times P),\mathbb{L}^{2}(\lambda\times P),\mathbb{L}^{2}(\lambda\times P)$ and $\mathbb{L}^{2}(\lambda\times\nu\times P)$ respectively, and they satisfy a BSDE of the form

[TABLE]

Then for a.a. $t$ and $\zeta$ the following holds:

[TABLE]

and

[TABLE]

3 The spike variation stochastic maximum principle

Throughout this work, we will use the following spaces:

•

$\mathcal{S}^{2}$ is the set of ${\mathbb{R}}$ -valued $\mathbb{F}$ -adapted càdlàg processes $(X(t))_{t\in[0,T]}$ such that

[TABLE]

•

$\mathbb{L}^{2}$ is the set of ${\mathbb{R}}$ -valued $\mathbb{F}$ -predictable processes $(Q(t))_{t\in[0,T]}$ such that

[TABLE]

•

$\mathbb{L}_{\nu}^{2}$ is the set of $\mathbb{F}$ -predictable processes $r:[0,T]\times\mathbb{R}_{0}\rightarrow\mathbb{R}$ such that

[TABLE]

•

$\mathcal{A}$ is the set of all $\mathbb{F}$-predictable processes $u$ required to have values in a Borel set $V\subset\mathbb{R}$. We call $\mathcal{A}$ the set of admissible control processes $u(\cdot)$.

The state of our system $X^{u}(t)=X(t)$ satisfies the following SDE

[TABLE]

where $b(t,x,u)=b(t,x,u,\omega):\left[0,T\right]\times\mathbb{R}\times U\times\Omega\rightarrow\mathbb{R}$, $\sigma(t,x,u)=\sigma(t,x,u,\omega):\left[0,T\right]\times\mathbb{R}\times U\times\Omega\rightarrow\mathbb{R}$ and $\gamma(t,x,u,\zeta)=\gamma(t,x,u,\zeta,\omega):\left[0,T\right]\times\mathbb{R}\times U\times\mathbb{R}_{0}\times\Omega\rightarrow\mathbb{R}$.

From now on we fix an open convex set $U$ such that $V\subset U$ and we assume that $b$, $\sigma$ and $\gamma$ are continuously differentiable and admit uniformly bounded partial derivatives in $U$ with respect to $x$ and $u$.

Moreover, we assume that the coefficients $b$ , $\sigma$ and $\gamma$ are $\mathbb{F}$ -adapted, and uniformly Lipschitz continuous with respect to $x$ , in the sense that there is a constant $C$ such that, for all $t\in[0,T],u\in V,\zeta\in\mathbb{R}_{0},\,x,x^{\prime}\in\mathbb{R}$ we have

[TABLE]

Under this assumption, there is a unique solution $X\in\mathcal{S}^{2}$ to the equation $\left(\ref{sde}\right)$ , such that

[TABLE]

The performance functional has the form

[TABLE]

with given functions $f:\left[0,T\right]\times\mathbb{R}\times U\times\Omega\rightarrow\mathbb{R}$ and $g:\Omega\times\mathbb{R}\rightarrow\mathbb{R},$ assumed to be $\mathbb{F}$ -adapted and $\mathcal{F}_{T}$ -measurable, respectively, and continuously differentiable with respect to $x$ and $u$ with bounded partial derivatives in $U$ .

Suppose that $\hat{u}$ is an optimal control. Fix $\tau\in[0,T),0<\epsilon<T-\tau$ and a bounded $\mathcal{F}_{\tau}$ -measurable $v$ and define the spike perturbed $u^{\epsilon}$ of the optimal control $\hat{u}$ by

[TABLE]

Let $X^{\epsilon}(t):=X^{u^{\epsilon}}(t)$ and $\hat{X}(t):=X^{\hat{u}}(t)$ be the solutions of $\left(\ref{sde}\right)$ corresponding to $u=u^{\epsilon}$ and $u=\hat{u}$ , respectively.
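The display defining $u^{\epsilon}$ is not reproduced in this copy. As a minimal sketch, assuming the usual form of a spike perturbation ($u^{\epsilon}=v$ on $[\tau,\tau+\epsilon]$ and $u^{\epsilon}=\hat{u}$ elsewhere), the construction can be written for a single scenario $\omega$, with controls represented as functions of time. The name `spike_perturbation` is hypothetical, and $v$ is treated as a plain number here rather than an $\mathcal{F}_{\tau}$-measurable random variable.

```python
def spike_perturbation(u_hat, v, tau, eps):
    """Return u_eps with u_eps(t) = v on [tau, tau + eps] and u_hat(t) elsewhere.

    Assumed form of the perturbation; the paper's own display is missing here.
    """
    def u_eps(t):
        return v if tau <= t <= tau + eps else u_hat(t)
    return u_eps


# Usage: perturb the constant control 1.0 on the window [0.25, 0.35].
u_eps = spike_perturbation(lambda t: 1.0, v=3.0, tau=0.25, eps=0.1)
values = [u_eps(t) for t in (0.2, 0.25, 0.3, 0.5)]
```

The two controls differ only on a time set of Lebesgue measure $\epsilon$, which is why the effect on the state and on the performance is studied in the limit $\epsilon\rightarrow 0^{+}$.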

Define

[TABLE]

Recall the mean value theorem: if a function $f$ is continuously differentiable on an open convex set $U\subset\mathbb{R}^{n}$ and continuous on the closure $\bar{U}$, then for all $x,y\in\bar{U}$ there exists a point $\tilde{x}$ on the straight line connecting $x$ and $y$ such that

$f(y)-f(x)=f^{\prime}(\tilde{x})(y-x):=\sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}}(\tilde{x})(y_{i}-x_{i})$ (3.5)

Then, by the mean value theorem, we can write

[TABLE]

where

[TABLE]

and

[TABLE]

and

[TABLE]

Here $(\tilde{u}(t),\tilde{X}(t))$ is a point on the straight line between $(\hat{u}(t),\hat{X}(t))$ and $(u^{\epsilon}(t),X^{\epsilon}(t))$ . With a similar notation for $\sigma$ and $\gamma$ , we get

[TABLE]

and

[TABLE]

In other words,

[TABLE]

and

[TABLE]

Remark 3.1

Note that since the process

[TABLE]

is a Lévy process, we know that for every given (deterministic) time $t\geq 0$ the probability that $\eta$ jumps at $t$ is $0$. Hence, for each $t$, the probability that $X$ makes a jump at $t$ is also $0$. Therefore we have

[TABLE]

We remark that the equations $\left(\ref{var1}\right)-\left(\ref{var2}\right)$ are linear SDEs, and hence, by our assumptions on the coefficients, they admit a unique solution.

Let $\mathcal{R}$ denote the set of (Borel) measurable functions $r:\mathbb{R}_{0}\rightarrow\mathbb{R}$ and define the Hamiltonian $H:\left[0,T\right]\times\mathbb{R}\times U\times\mathbb{R}\times\mathbb{R}\times\mathcal{R}\times\Omega\rightarrow\mathbb{R}$ , to be

[TABLE]

Let $(p^{\epsilon},q^{\epsilon},r^{\epsilon})\in\mathcal{S}^{2}\times\mathbb{L}^{2}\times\mathbb{L}_{\nu}^{2}$ be the solution of the following associated adjoint BSDE:

[TABLE]

where

[TABLE]

Lemma 3.2

The following holds,

[TABLE]

where $(\hat{p},\hat{q},\hat{r})$ is the solution of the BSDE

[TABLE]

Proof. By the Itô formula, we see that the solutions of the equations $\left(\ref{var1}\right)-\left(\ref{var2}\right),$ are

[TABLE]

and

[TABLE]

where

[TABLE]

For more details see the Appendix.

From (3.15) we see that $Z^{\epsilon}(\tau+\epsilon)\rightarrow 0$ as $\epsilon\rightarrow 0^{+}$ , and then from (3.14) we deduce that $Z^{\epsilon}(t)\rightarrow 0$ as $\epsilon\rightarrow 0^{+}$ , for all $t$ .

The BSDE $\left(\ref{p}\right)$ is linear, and we can write the solution explicitly as follows (see e.g. Theorem 2.7 in Øksendal and Sulem [14]):

[TABLE]

where $\Gamma(t)\in\mathcal{S}^{2}$ is the solution of the linear SDE

[TABLE]

From this, we deduce that $p^{\epsilon}(t)\rightarrow\hat{p}(t)$, $q^{\epsilon}(t)\rightarrow\hat{q}(t)$ and $r^{\epsilon}(t,\zeta)\rightarrow\hat{r}(t,\zeta)$ as $\epsilon\rightarrow 0^{+}$.

$\square$

We now state and prove the main result of this paper.

Theorem 3.3 (Necessary maximum principle)

Suppose $\hat{u}\in\mathcal{A}$ is maximizing the performance $\left(\ref{perf}\right)$ . Then for all $t\in[0,T)$ and all bounded $\mathcal{F}_{t}$ -measurable $v\in V$ , we have

[TABLE]

Proof. Consider

[TABLE]

where

[TABLE]

and

[TABLE]

By the mean value theorem, we can write

[TABLE]

and, applying the Itô formula to $p^{\epsilon}(t)Z^{\epsilon}(t)$ and by $\left(\ref{p}\right),\left(\ref{var1}\right)$ and $\left(\ref{var2}\right)$ , we have

[TABLE]

Using the generalized duality formula $\left(\ref{geduB}\right)$ and $\left(\ref{geduN}\right)$ , we get

[TABLE]

where by the definition of $H$ $\left(\ref{h}\right)$

[TABLE]

Summing $\left(\ref{I1}\right)$ and $\left(\ref{I2}\right)$ , we obtain

[TABLE]

By the estimate of $Z^{\epsilon}$ $\left(\ref{esz}\right)$ , we get

[TABLE]

and by $\left(\ref{esa}\right)$ we have

[TABLE]

where $(\hat{p},\hat{q},\hat{r})$ solves the BSDE

[TABLE]

Using the above and the assumption that $\hat{u}$ is optimal, we get

[TABLE]

where, by Theorem 2.7,

[TABLE]

Hence

[TABLE]

Since this holds for all bounded $\mathcal{F}_{\tau}$ -measurable $v$ , we conclude that

[TABLE]

$\square$

4 Linear-Quadratic Optimal Control with Constraints

We now illustrate our main theorem by applying it to a linear-quadratic stochastic control problem with a constraint, as follows:

Consider a controlled SDE of the form

[TABLE]

Here $u\in\mathcal{A}$ is our control process (see below) and $\sigma$ and $\gamma$ are a given constant in $\mathbb{R}$ and a given function from $\mathbb{R}_{0}$ into $\mathbb{R}$, respectively, with

[TABLE]

We want to control this system in such a way that we minimize its value at the terminal time $T$ with a minimal average use of energy, measured by the integral $\mathbb{E}[{\textstyle\int_{0}^{T}}u^{2}(t)dt]$ and we are only allowed to use nonnegative controls. Thus we consider the following constrained optimal control problem:

Problem 4.1

Find $\hat{u}\in\mathcal{A}$ (the set of admissible controls) such that

[TABLE]

where

[TABLE]

and $\mathcal{A}$ is the set of predictable processes $u$ such that $u(t)\geq 0$ for all $t\in[0,T]$ and

[TABLE]

Thus in this case the set $V$ of admissible control values is given by $V=[0,\infty)$ and we can use $U=V$ . The Hamiltonian is given by

[TABLE]

the adjoint BSDE for the optimal adjoint variables $\hat{p},\hat{q},\hat{r}$ is given by

[TABLE]

Hence

[TABLE]

Theorem 3.3 states that if $\hat{u}$ is optimal, then

[TABLE]

From this we deduce that

[TABLE]

Thus we see that we always have $\hat{u}(t)\geq\max\{\hat{p}(t),0\}$ . We claim that in fact we have equality, i.e. that

[TABLE]

To see this, suppose the opposite, namely that

[TABLE]

Then in particular $\hat{u}(t)>0$, which by (ii) above implies that $\hat{u}(t)=\hat{p}(t)$. This contradicts the assumed strict inequality. We summarize what we have proved as follows:

Theorem 4.2

Suppose there is an optimal control $\hat{u}\in\mathcal{A}$ for Problem 4.1. Then

[TABLE]

where $(\hat{p},\hat{X})$ is the solution of the coupled forward-backward SDE system given by

[TABLE]
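The conclusion $\hat{u}(t)=\max\{\hat{p}(t),0\}$ can be read as the projection of $\hat{p}(t)$ onto the admissible set $V=[0,\infty)$. The following is a minimal numerical sketch of that pointwise statement, not part of the proof. It assumes that the control-dependent part of the Hamiltonian is the concave map $v\mapsto-\tfrac{1}{2}v^{2}+pv$; this form is an assumption, chosen because it is consistent with $\hat{u}(t)=\hat{p}(t)$ whenever $\hat{u}(t)>0$. The function names `constrained_maximizer` and `grid_maximizer` are hypothetical.

```python
import numpy as np


def constrained_maximizer(p: float) -> float:
    """Closed-form maximizer over v >= 0 of the ASSUMED map v -> -0.5*v**2 + p*v."""
    return max(p, 0.0)


def grid_maximizer(p: float, v_max: float = 10.0, n: int = 100_001) -> float:
    """Brute-force check: maximize the same assumed map on a grid over [0, v_max]."""
    v = np.linspace(0.0, v_max, n)
    h = -0.5 * v**2 + p * v  # assumed control-dependent part of the Hamiltonian
    return float(v[np.argmax(h)])


# Compare the closed form with the grid search for a few values of p.
for p in (-2.0, 0.0, 1.5):
    print(p, constrained_maximizer(p), grid_maximizer(p))
```

For negative $p$ the map is decreasing on $[0,\infty)$, so the maximizer sits at the boundary $v=0$; for positive $p$ the unconstrained stationary point $v=p$ is admissible and is therefore the maximizer.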

Remark 4.3

For comparison, in the case when there are no constraints on the control $u$, we get from the well-known solution of the classical linear-quadratic control problem (see e.g. Øksendal [11], Example 11.2.4) that the optimal control $u^{\ast}$ is given in feedback form by

[TABLE]

5 Appendix

In this section, we derive the solution of a general linear SDE with jumps. Let $X(t)$ satisfy the equation

[TABLE]

for given $\mathbb{F}$-predictable processes $b_{0}(t),b_{1}(t),\sigma_{0}(t),\sigma_{1}(t),\gamma_{0}\left(t,\zeta\right),\gamma_{1}\left(t,\zeta\right)$ with $\gamma_{i}\left(t,\zeta\right)\geq-1$ for $i=0,1$.

Now suppose

[TABLE]

Then $\Upsilon(t)=\exp(\Pi(t))$, where

[TABLE]

By the Itô formula as in Theorem 1.14 in Øksendal and Sulem [13], we have

[TABLE]

Now put

[TABLE]

Then, again by the Itô formula, we obtain

[TABLE]

Rearranging terms, we end up with

[TABLE]

Consequently,

[TABLE]

Hence

[TABLE]

Thus the unique solution $X(t)$ is given by

[TABLE]
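For orientation, the continuous special case can be written out explicitly. The display below is a sketch under two assumptions that are not taken from the text above: that the equation is linear, with drift $b_{0}(t)+b_{1}(t)X(t)$ and diffusion coefficient $\sigma_{0}(t)+\sigma_{1}(t)X(t)$, as the coefficient names suggest, and that $X(0)=x_{0}$. Setting $\gamma_{0}=\gamma_{1}=0$, the classical integrating-factor formula for linear SDEs driven by Brownian motion gives

$$
\Upsilon(t)=\exp\Big(\int_{0}^{t}\big(b_{1}(s)-\tfrac{1}{2}\sigma_{1}^{2}(s)\big)ds+\int_{0}^{t}\sigma_{1}(s)dB(s)\Big),
$$

$$
X(t)=\Upsilon(t)\Big[x_{0}+\int_{0}^{t}\Upsilon^{-1}(s)\big(b_{0}(s)-\sigma_{0}(s)\sigma_{1}(s)\big)ds+\int_{0}^{t}\Upsilon^{-1}(s)\sigma_{0}(s)dB(s)\Big].
$$

This can be checked by applying the Itô formula to $X(t)\Upsilon^{-1}(t)$: the terms containing $X(t)$ cancel, leaving $d\big(X(t)\Upsilon^{-1}(t)\big)=\Upsilon^{-1}(t)\big[(b_{0}(t)-\sigma_{0}(t)\sigma_{1}(t))dt+\sigma_{0}(t)dB(t)\big]$.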

Bibliography


[1] Aase, K., Øksendal, B., Privault, N., & Ubøe, J. (2000). White noise generalizations of the Clark-Haussmann-Ocone theorem with application to mathematical finance. Finance and Stochastics, 4(4), 465-496.
[2] Agram, N., & Øksendal, B. (2015). Malliavin calculus and optimal control of stochastic Volterra equations. Journal of Optimization Theory and Applications, 167(3), 1070-1094.
[3] Agram, N., Øksendal, B., & Yakhlef, S. (2018). Optimal control of forward-backward stochastic Volterra equations. In F. Gesztesy et al. (editors): Partial Differential Equations, Mathematical Physics, and Stochastic Analysis. A Volume in Honor of Helge Holden’s 60th Birthday. EMS Congress Reports.
[4] Agram, N., Øksendal, B., & Yakhlef, S. (2017). New approach to optimal control of stochastic Volterra integral equations. arXiv:1709.05463.
[5] Bensoussan, A. (1982). Lectures on stochastic control. In Nonlinear filtering and stochastic control (pp. 1-62). Springer, Berlin, Heidelberg.
[6] Benth, F. E. (1993). Integrals in the Hida distribution space (S)*. In Lindstrøm, T., Øksendal, B. & Üstünel, A. S. (editors), Stochastic Analysis and Related Topics, 8, 89-99. Gordon and Breach.
[7] Di Nunno, G., Øksendal, B. K., & Proske, F. (2009). Malliavin Calculus for Lévy Processes with Applications to Finance. Second Edition. Springer.
[8] Hida, T., Kuo, H. H., Potthoff, J., & Streit, L. (1993). White Noise: An Infinite Dimensional Calculus. Springer.
