A Finite Horizon Optimal Switching Problem with Memory and Application to Controlled SDDEs
Magnus Perninge

TL;DR
This paper addresses a finite horizon optimal switching problem with memory, establishing existence of optimal controls, and applies it to controlled stochastic delay differential equations, including practical hydro-power revenue maximization.
Contribution
It introduces a probabilistic approach to prove existence of optimal controls in switching problems with memory and applies it to stochastic delay differential equations with real-world relevance.
Findings
Existence of optimal control established using Snell envelopes.
Application to impulse control problems for SDDEs with jump processes.
Relevance demonstrated in hydro-power revenue maximization.
Abstract
We consider an optimal switching problem where the terminal reward depends on the entire control trajectory. We show existence of an optimal control by applying a probabilistic technique based on the concept of Snell envelopes. We then apply this result to solve an impulse control problem for stochastic delay differential equations driven by a Brownian motion and an independent compound Poisson process. Furthermore, we show that the studied problem arises naturally when maximizing the revenue from operation of a group of hydro-power plants with hydrological coupling.
This work was supported by the Swedish Energy Authorities through grant number 42982-1.
M. Perninge is with the Department of Physics and Electrical Engineering, Linnaeus University, Växjö, Sweden.
1 Introduction
The standard optimal switching problem (sometimes referred to as the starting and stopping problem) is a stochastic optimal control problem of impulse type that arises when an operator controls a dynamical system by switching between the members of a set of operation modes $\mathcal{I} := \{1,\ldots,m\}$. In the two-modes setting ($m=2$) the modes may represent, for example, “operating” and “closed” when maximizing the revenue from mineral extraction in a mine, as in [6]. In the multi-modes setting the operating modes may represent different levels of power production in a power plant when the owner seeks to maximize her total revenue from producing electricity [7], or the states “operating” and “closed” of single units in a multi-unit production facility, as in [5].
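In optimal switching the control takes the form $u = (\tau_1,\ldots,\tau_N;\beta_1,\ldots,\beta_N)$, where $\tau_1 \leq \tau_2 \leq \cdots \leq \tau_N$ is a sequence of times when the operator intervenes on the system and $\beta_j$ is the mode in which the system is operated during $[\tau_j,\tau_{j+1})$. The standard multi-modes optimal switching problem in finite horizon ($T<\infty$) can be formulated as finding the control that maximizes
$$J(u) := \mathbb{E}\Big[\int_0^T \phi_{\xi_t}(t)\,dt + \psi_{\xi_T} - \sum_{j=1}^{N} c_{\beta_{j-1},\beta_j}(\tau_j)\Big],$$
where $\xi_t := \beta_0\mathbb{1}_{[0,\tau_1)}(t) + \sum_{j=1}^{N}\beta_j\mathbb{1}_{[\tau_j,\tau_{j+1})}(t)$ is the operation mode (when starting in a predefined mode $\beta_0$, and with $\tau_{N+1} := \infty$), $\phi_i$ and $\psi_i$ are the running and terminal reward in mode $i$, respectively, and $c_{i,j}(t)$ is the cost incurred by switching from mode $i$ to mode $j$ at time $t$.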
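As a minimal illustration of the reward structure just described (running reward in the current mode, terminal reward, minus switching costs), the following Python sketch evaluates the reward of one switching control along a single scenario. It is not taken from the paper: the function names `mode_at` and `pathwise_reward`, the left-endpoint Riemann sum, and the callables `running`, `terminal` and `cost` are assumptions made for illustration, and the expectation over scenarios is omitted.

```python
from bisect import bisect_right

def mode_at(t, taus, betas, b0):
    """Mode in force at time t: b0 before the first switch,
    betas[j] on the interval [taus[j], taus[j+1])."""
    j = bisect_right(taus, t)  # number of switching times <= t
    return b0 if j == 0 else betas[j - 1]

def pathwise_reward(taus, betas, b0, running, terminal, cost, T, n_steps=1000):
    """Running reward (left-endpoint Riemann sum) plus terminal reward
    minus the accumulated switching costs, for one scenario.
    All argument names are hypothetical."""
    dt = T / n_steps
    total = 0.0
    for i in range(n_steps):
        t = i * dt
        total += running(mode_at(t, taus, betas, b0), t) * dt
    total += terminal(mode_at(T, taus, betas, b0))
    prev = b0
    for tau, beta in zip(taus, betas):
        total -= cost(prev, beta, tau)  # cost of switching prev -> beta at tau
        prev = beta
    return total

# Usage: start idle (mode 0), switch to mode 1 at t = 0.5 at a cost of 0.1.
reward = pathwise_reward(
    taus=[0.5], betas=[1], b0=0,
    running=lambda i, t: float(i),   # mode 1 earns 1 per unit time
    terminal=lambda i: 0.0,
    cost=lambda i, j, t: 0.1,
    T=1.0,
)
```

In this toy scenario the reward is approximately 0.5 - 0.1 = 0.4, up to discretization error.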
The standard optimal switching problem has been thoroughly investigated over the last decades after being popularised in [6]. In [16] a solution to the two-modes problem was found by rewriting the problem as an existence and uniqueness problem for a doubly reflected backward stochastic differential equation. In [11] existence of an optimal control for the multi-modes optimal switching problem was shown by a probabilistic method based on the concept of Snell envelopes. Furthermore, existence and uniqueness of viscosity solutions to the related Bellman equation was shown for the case when the switching costs are constant and the underlying uncertainty is modeled by a stochastic differential equation (SDE) driven by a Brownian motion. In [12] the existence and uniqueness results for viscosity solutions were extended to the case when the switching costs depend on the state variable. Since then, results have been extended to Knightian uncertainty [18, 17, 8] and to non-Brownian filtrations and signed switching costs [24]. For the case when the underlying uncertainty can be modeled by a diffusion process, the generalization to the case when the control enters the drift and volatility terms was treated in [14]. This was further developed to include state constraints in [20]. Another important generalization is to the case when the operator only has partial information about the present state of the diffusion process, as treated in [23].
In the present work we consider the setting with running and terminal rewards that depend on the entire history of the control. We also show that a special case of the type of switching problems that we consider is that of a controlled stochastic delay differential equation (SDDE), driven by a finite intensity Lévy process.
To motivate our problem formulation we consider the situation when an operator of two hydro-power plants, located in the same river, wants to maximize her revenue from producing electricity during a fixed operation period. We assume that each plant has its own water reservoir. The power production in a hydropower plant depends on the drop height from the water level of the reservoir to the outlet and thus on the amount of water in the reservoir. As water that passes through the upstream plant will eventually reach the reservoir of the downstream plant we need to consider part of the control history in the upstream plant when optimizing operation of the downstream plant.
In this setting our cost functional can be written
[TABLE]
where . The contribution of the present work is twofold. First, we show that the problem of maximizing can be solved under certain assumptions on , and the switching costs by finding an optimal control in terms of a family of interconnected value processes, that we refer to as a verification family. We then show that the revenue maximization problem of the hydro-power producer can be formulated as an impulse control problem where the uncertainty is modeled by a controlled SDDE and use our initial result to find an optimal control for this problem.
The remainder of the article is organized as follows. In the next section we state the problem, set the notation used throughout the article and detail the set of assumptions that are made. Then, in Section 3 a verification theorem is derived. This verification theorem is an extension of the original verification theorem for the multi-modes optimal switching problem developed in [11] and presumes the existence of a verification family. In Section 4 we show that, under the assumptions made, there exists a verification family, thus proving existence of an optimal control for the switching problem with cost functional . In Section 5 we more carefully investigate the example of the hydro-power producer and show that the case of a controlled SDDE fits into the problem description investigated in Sections 3 and 4.
2 Preliminaries
We consider a finite horizon problem and thus assume that the terminal time $T$ is fixed with $T<\infty$.
We let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space, with a filtration $\mathbb{F} := (\mathcal{F}_t)_{0\leq t\leq T}$ satisfying the usual conditions in addition to being quasi-left continuous.
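Remark 2.1.
Recall here the concept of quasi-left continuity: A càdlàg process $(X_t : 0\leq t\leq T)$ is quasi-left continuous if for each predictable stopping time $\theta$ and every announcing sequence of stopping times $\theta_k \nearrow \theta$ we have $\lim_{k\to\infty} X_{\theta_k} = X_{\theta}$, $\mathbb{P}$-a.s. A filtration is quasi-left continuous if $\mathcal{F}_{\theta} = \mathcal{F}_{\theta-}$ for every predictable stopping time $\theta$.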
Throughout we will use the following notation:
- is the $\sigma$-algebra of $\mathbb{F}$-progressively measurable subsets of .
- For , we let be the set of all -valued, -measurable, càdlàg processes such that , $\mathbb{P}$-a.s., and let be the subset of processes that are quasi-left continuous.
- We let be the set of all $\mathbb{F}$-stopping times and for each we let be the corresponding subsets of stopping times such that , $\mathbb{P}$-a.s.
- We let be the set of all , where is a non-decreasing sequence of $\mathbb{F}$-stopping times (such that , $\mathbb{P}$-a.s.) and is -measurable (with , the initial operation mode).
- We let denote the subset of for which is finite $\mathbb{P}$-a.s. (i.e. ) and for all we let . For we let (and resp. ) be the subset of (and resp. ) with .
- We define the set and let be the corresponding subset of all finite sequences.
- For all , we let and .
- For , we let and define the map as for all .
To make notation more efficient we introduce the -measurable function:
[TABLE]
2.1 Problem formulation
In the above notation, our problem can be characterized by two objects:
- A -measurable map .
- A collection, , of -measurable processes.
We will make the following preliminary assumptions on these objects:
Assumption 2.2.
- (i) The function is $\mathbb{P}$-a.s. right-continuous in the intervention times and bounded in the sense that:
  - a) .
  - b) For all and any we have .
  [Footnote 3: Throughout we will use and to denote the last element in the vector and , respectively, whenever .]
- (ii) For each and any we have , $\mathbb{P}$-a.s.
- (iii) We assume that are such that:
  - a) , $\mathbb{P}$-a.s.
  - b) There is an such that for each with and , and for , we have
[TABLE]
$\mathbb{P}$-a.s.
The above assumptions are mainly standard assumptions for optimal switching problems translated to our setting. Assumptions (i.a) and (iii.a) together imply that the expected maximal reward is finite. Assumption (ii) implies that it is never optimal to switch at the terminal time. We show below that the “no-free-loop” condition (iii.b) together with (i.a) implies that, with probability one, the optimal control (whenever it exists) can only make a finite number of switches.
We consider the following problem:
Problem 1. Find , such that
[TABLE]
∎
As a step in solving Problem 1 we need the following proposition which is a standard result for optimal switching problems and is due to the “no-free-loop” condition.
Proposition 2.3.
Suppose that there is a such that for all . Then .
Proof. Pick and let , then . Furthermore, if holds then the switching mode must make an infinite number of loops and
[TABLE]
for all , by Assumptions 2.2.(iii.b) and 2.2.(i.a). However, again by Assumption 2.2.(i.a) we have . Hence, is dominated by the strategy of doing nothing and the assertion follows.∎ [Footnote 4: Throughout, $C$ will denote a generic positive constant that may change value from line to line.]
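To make the “no-free-loop” idea concrete, here is a small Python sketch for the simplified case of constant, deterministic switching costs. This is an assumption made for illustration only; the condition in the text concerns costs that may depend on time and on the scenario. The hypothetical helper `min_loop_cost` computes the cheapest closed loop of switches with a Floyd-Warshall pass, and `satisfies_no_free_loop` checks that every loop costs at least a given `eps > 0`.

```python
import math

def min_loop_cost(c):
    """Cheapest total cost of a closed loop of switches i1 -> i2 -> ... -> i1,
    for a constant cost matrix c (c[i][j] = cost of switching i -> j).
    Assumes at least two modes; diagonal entries are ignored."""
    m = len(c)
    # d[i][j]: cheapest cost of moving from i to j using at least one switch.
    d = [[c[i][j] if i != j else math.inf for j in range(m)] for i in range(m)]
    for k in range(m):
        for i in range(m):
            for j in range(m):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return min(d[i][i] for i in range(m))

def satisfies_no_free_loop(c, eps):
    """True if every loop of switches costs at least eps (hypothetical helper)."""
    return min_loop_cost(c) >= eps

# Individual costs may be negative as long as every loop is costly.
costs = [[0.0, 1.0, 2.0],
         [0.5, 0.0, 1.0],
         [-0.2, 1.0, 0.0]]
ok = satisfies_no_free_loop(costs, eps=1.0)
```

If some loop had non-positive total cost, a controller could traverse it indefinitely without penalty, which is the situation the condition rules out.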
2.2 The Snell envelope
In this section we gather the main results concerning the Snell envelope that will be useful later on. Recall that a progressively measurable process is of class [D] if the set of random variables is uniformly integrable.
Theorem 2.4 (The Snell envelope).
Let be an $\mathbb{F}$-adapted, $\mathbb{R}$-valued, càdlàg process of class [D]. Then there exists a unique (up to indistinguishability), $\mathbb{R}$-valued càdlàg process called the Snell envelope, such that is the smallest supermartingale that dominates . Moreover, the following holds (with ):
- (i) For any stopping time ,
[TABLE]
- (ii) The Doob-Meyer decomposition of the supermartingale implies the existence of a triple where is a uniformly integrable right-continuous martingale, is a non-decreasing, predictable, continuous process with and is non-decreasing purely discontinuous predictable with , such that
[TABLE]
Furthermore, for all .
- (iii) Let be given and assume that for any predictable and any increasing sequence with and , $\mathbb{P}$-a.s., we have , $\mathbb{P}$-a.s. Then, the stopping time defined by is optimal after , i.e.
[TABLE]
Furthermore, in this setting the Snell envelope, , is quasi-left continuous, i.e. .
- (iv) Let be a sequence of càdlàg processes converging pointwise to a càdlàg process and let be the Snell envelope of . Then the sequence converges pointwise to a process and is the Snell envelope of .
In the above theorem (i)-(iii) are standard. Proofs can be found in [13] (see [22] for an English version), Appendix D in [19], [15] and in the appendix of [9]. Statement (iv) was proved in [11].
The Snell envelope will be the main tool in showing that Problem 1 has a solution.
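For intuition, the discrete-time counterpart of the Snell envelope can be computed by backward induction: at the final step it equals the reward, and earlier it is the maximum of the reward for stopping now and the conditional expectation of its own next value. The sketch below does this on a recombining binomial tree. The tree, the function name `snell_envelope` and the signature `payoff(n, k)` are illustrative assumptions, not constructions from the paper, which works in continuous time.

```python
def snell_envelope(payoff, n_steps, p=0.5):
    """Backward induction for the discrete-time Snell envelope on a
    recombining binomial tree.

    payoff(n, k): reward for stopping at step n after k up-moves.
    p: probability of an up-move.
    Returns Y with Y[n][k] = smallest supermartingale value dominating payoff.
    """
    Y = [[0.0] * (n + 1) for n in range(n_steps + 1)]
    for k in range(n_steps + 1):
        Y[n_steps][k] = payoff(n_steps, k)   # at the horizon we must stop
    for n in range(n_steps - 1, -1, -1):
        for k in range(n + 1):
            # Conditional expectation of the envelope one step ahead.
            continuation = p * Y[n + 1][k + 1] + (1 - p) * Y[n + 1][k]
            Y[n][k] = max(payoff(n, k), continuation)
    return Y

# Usage: reward S_n^2 for a symmetric random walk S_n = 2k - n.
# S_n^2 is a submartingale, so waiting until the horizon is optimal.
Y = snell_envelope(lambda n, k: float((2 * k - n) ** 2), n_steps=4)
```

An optimal stopping rule in this setting is the first step at which the envelope equals the reward, which mirrors statement (iii) of the theorem.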
2.3 Additional assumptions on regularity
From the definition of the Snell envelope it is clear that we need to make some further assumptions on the regularity of the involved processes. To facilitate this we define, for each , the value process corresponding to the control as
[TABLE]
with .
We make the following additional assumptions:
Assumption 2.5.
- (i) For each and each and there is a sequence of maps such that
[TABLE]
Furthermore, we have
[TABLE]
- (ii) For all and all , the process is in for
3 A verification theorem
The method for solving Problem 1 will be based on deriving an optimal control under the assumption that a specific family of processes exists, and then showing that the family indeed does exist. We will refer to any such family of processes as a verification family.
Definition 3.1.
We define a verification family to be a family of càdlàg supermartingales such that:
- a) The family satisfies the recursion
[TABLE]
- b) The family is bounded in the sense that .
- c) For all we have that for every and ,
[TABLE]
and for all we have
[TABLE]
- d) For every and every , the process is in .
The purpose of the present section is to reduce the solution of Problem 1 to showing existence of a verification family. This is done in the following verification theorem:
Theorem 3.2.
Assume that there exists a verification family . Then the family is unique (i.e. there is at most one verification family, up to indistinguishability) and:
- (i) Satisfies (where ).
- (ii) Defines the optimal control, , for Problem 1, where is a sequence of $\mathbb{F}$-stopping times given by
[TABLE]
is defined as a measurable selection of
[TABLE]
and , with .
Proof. The proof is divided into three steps where we first, in steps 1 and 2, show that for any we have
[TABLE]
$\mathbb{P}$-a.s. for . Then in Step 3 we show that is the optimal control, establishing (i) and (ii). A straightforward generalization to arbitrary initial conditions then gives that
[TABLE]
by which uniqueness follows.
Step 1 We start by showing that for each the recursion (3.1) can be written in terms of a -stopping time. From (3.1) we note that, by definition, is the smallest supermartingale that dominates
[TABLE]
Now, by Assumption 2.2.(iii) and property d) in the definition of a verification family (Definition 3.1) we note that is a càdlàg process of class [D] that is quasi-left continuous on . Furthermore, by Assumption 2.2.(ii) and property d) we get that for any sequence such that , -a.s. we have , -a.s. By Theorem 2.4.(iii) it thus follows that for any , there is a stopping time such that:
[TABLE]
Step 2 We now show that . We start by noting that is the Snell envelope of
[TABLE]
where , and by step 1 we thus have
[TABLE]
Moving on we pick . For , let and for . Furthermore, we define the processes and by
[TABLE]
and
[TABLE]
for all , where . Now, for each we have that
[TABLE]
is the product of an –measurable positive r.v. and a càdlàg supermartingale, thus, it is a càdlàg supermartingale for . Hence, is the sum of a finite number of càdlàg supermartingales and thus a càdlàg supermartingale itself. By definition we find that dominates which is of class [D] by Assumption 2.5.(i) and property b). To show that is in fact the Snell envelope of assume that is another càdlàg supermartingale that dominates for all . Then for each and , we have
[TABLE]
-a.s. which by (3.1) gives that
[TABLE]
Summing over all we get , -a.s.
Noting that and using (3.2) of property c) we find that
in probability, as . Hence, there is a subsequence such that the limit taken over the subsequence is 0, -a.s. Furthermore, as the convergence is uniform the limit process is càdlàg.
By right-continuity of the switching costs and and (3.3) of property c) we have that as , where for notational simplicity we abuse the notation in (3.6) and let
[TABLE]
Hence, has a subsequence such that , -a.s. as . This implies that is a càdlàg process which is of class [D] by Assumption 2.5.(i) and property b).
We thus have that is a sequence of càdlàg processes of class [D] that converges pointwise to the càdlàg process of class [D] and that is the Snell envelope of , for all . Then by Theorem 2.4.(iv) we find that converges pointwise to the Snell envelope of . Hence, $\big(Y^{\tau^{*}_{1},\ldots,\tau^{*}_{j};\beta^{*}_{1},\ldots,\beta^{*}_{j}}_{s} : \tau^{*}_{j}\leq s\leq T\big)$ is the Snell envelope of .
To arrive at the second equality in (3.4) we note that the results we obtained in Step 1 imply that for any sequence with we have for all . Now, for all this gives
[TABLE]
where the last term can be made arbitrarily small and we, thus, have that and by Theorem 2.4.(iii) we get (3.4).
By induction we get that for each ,
[TABLE]
Now, arguing as in the proof of Proposition 2.3 and using property b) we find that . Letting and using dominated convergence we conclude that .
Step 3 It remains to show that the strategy is optimal. To do this we pick any other strategy . By the definition of in (3.1) we have
[TABLE]
but in the same way
[TABLE]
–a.s. By repeating this argument and using the dominated convergence theorem we find that which proves that is in fact optimal. Repeating the above procedure with as initial condition (3.5) follows.∎
The main difference between the above proof and the proof of Theorem 1 in the original work by Djehiche, Hamadène and Popier [11] is that, because the future reward at any time depends on the entire history of the control, we are forced to consider a family of processes indexed by an uncountable set rather than a -tuple for some finite positive . Hence, we cannot simply write as the sum of a finite number of Snell envelopes. To arrive at the above verification theorem we therefore impose the right-continuity constraint assumed in Assumption 2.5.(i). This effectively allowed us to find the two sequences of processes that approach on the one hand the value process corresponding to the optimal control and on the other hand the dominated process, in .
4 Existence
Theorem 3.2 presumes existence of the verification family . To obtain a satisfactory solution to Problem 1, we thus need to establish that a verification family exists. This is the topic of the present section. We will follow the standard existence proof which goes by applying a Picard iteration (see [7, 11, 17]). We thus define a sequence of families of processes as
[TABLE]
and
[TABLE]
for .
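The iteration over the number of allowed switches can be illustrated in a much simpler setting than the one treated here. The Python sketch below is a deterministic, discrete-time, memoryless analogue, and every modelling choice in it is an assumption for illustration: `V[k][n][i]` is the best reward from step `n` onward when starting in mode `i` with at most `k` switches left. With randomness, the "stay" branch would become a conditional expectation, giving a Snell-envelope recursion; with memory, the value would also be indexed by the switching history.

```python
def switching_values(running, terminal, cost, n_steps, max_switches):
    """Iterate over the number of allowed switches.

    running(i, n): reward earned in mode i during step n (hypothetical).
    terminal[i]:   terminal reward in mode i.
    cost[i][j]:    constant cost of switching from mode i to mode j.
    Returns V with V[k][n][i] as described in the lead-in.
    """
    m = len(terminal)
    V = []
    for k in range(max_switches + 1):
        Vk = [[0.0] * m for _ in range(n_steps + 1)]
        Vk[n_steps] = list(terminal)          # no switching at the horizon
        for n in range(n_steps - 1, -1, -1):
            for i in range(m):
                best = running(i, n) + Vk[n + 1][i]       # stay in mode i
                if k > 0:
                    for j in range(m):
                        if j != i:
                            # Switch now and continue with k - 1 switches left.
                            best = max(best, -cost[i][j] + V[k - 1][n][j])
                Vk[n][i] = best
        V.append(Vk)
    return V

# Usage: mode 0 is idle, mode 1 loses money early and earns money late.
rewards = [-1.0, -1.0, 2.0, 2.0]
V = switching_values(
    running=lambda i, n: rewards[n] if i == 1 else 0.0,
    terminal=[0.0, 0.0],
    cost=[[0.0, 0.5], [0.5, 0.0]],
    n_steps=4,
    max_switches=2,
)
```

In this example `V[0][0][1]`, `V[1][0][1]` and `V[2][0][1]` equal 2.0, 2.0 and 3.0: the values are non-decreasing in the number of allowed switches, which is the monotonicity the existence argument relies on.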
Proposition 4.1.
The sequence is uniformly bounded in the sense that there is a such that,
[TABLE]
and for all and , we have
[TABLE]
for all .
Proof. By the definition of we have that for any ,
[TABLE]
By Doob’s maximal inequality we have that for any
[TABLE]
Taking the supremum over all on both sides and using that the right hand side is uniformly bounded by Assumption 2.2.(i.a) the first bound follows.
Concerning the second claim, note that
[TABLE]
Now, arguing as above we find that
[TABLE]
where the right hand side is bounded by Assumption 2.2.(i.b). ∎
Proposition 4.2.
The family of processes satisfies:
- i) For every and every and we have
[TABLE]
and
[TABLE]
as uniformly in .
- ii) For every and every , the process is in for
Proof. The proof will follow by induction and we use (i’) to denote the first statement without the uniformity.
For , we have by Assumption 2.5.(ii) and (i’) follows from Assumption 2.5.(i). Now, assume that there is a such that (i’) and (ii) hold for all . Applying a reasoning similar to that in the proof of Theorem 3.2 we find that
[TABLE]
But then by Assumption 2.5 we find that (i’) and (ii) hold for . By induction (i’) and (ii) hold for all .
It remains to show that (i) holds. By the above reasoning we find that, for each we have
[TABLE]
where the right hand side of the last inequality does not depend on and tends to zero as by Assumption 2.5.(i). The second statement in (i) follows by an identical argument.∎
Corollary 4.3.
For each and each there is a , such that
[TABLE]
with .
Proof. Follows from the definition of and Propositions 4.1 and 4.2 by applying the same argument as in the proof of the verification theorem (Theorem 3.2).
Proposition 4.4.
For each , the limit , exists as an increasing pointwise limit, -a.s. Furthermore, the process is càdlàg for each .
Proof. Since we have that, -a.s.,
[TABLE]
where the right hand side is bounded -a.s. by Proposition 4.1. Hence, the sequence is increasing and -a.s. bounded, thus, it converges -a.s. for all .
Concerning the second claim, note that for , we have
[TABLE]
for all (where the inequalities hold -a.s.). Now, arguing as in the proof of Proposition 4.1 we have
[TABLE]
We thus conclude that there is a -null set such that for each we have .
By the “no-free-loop” condition (Assumption 2.2.(iii.b)) and the finiteness of we get that for any control ,
[TABLE]
-a.s. For (in the remainder of the proof denotes a generic -null set), we thus have
[TABLE]
where is a control corresponding to . This implies that for we have,
[TABLE]
Now, for all we have,
[TABLE]
where we introduced the process corresponding to the truncation of the optimal control. As the truncation only affects the performance of the controller when we have
[TABLE]
Applying Hölder’s inequality we get that for ,
[TABLE]
with , there is thus a constant such that
[TABLE]
for all . We conclude that for all , the sequence is a sequence of càdlàg functions that converges uniformly, which implies that the limit is a càdlàg function.∎
Proposition 4.5.
The family is a verification family.
Proof. As is the pointwise limit of an increasing sequence of càdlàg supermartingales it is a càdlàg supermartingale (see p. 86 in [10]). We treat each remaining property in the definition of a verification family separately:
a) Applying the convergence result to the right hand side of (4.2) and using the fact that, by Proposition 4.4,
[TABLE]
is a càdlàg process, (iv) of Theorem 2.4 gives
[TABLE]
b) Uniform boundedness was shown in Proposition 4.1.
c) We have
[TABLE]
where taking limits is interchangeable due to the uniform convergence property shown in Proposition 4.2.(i). The second statement in c), that is equation (3.3), follows by an identical argument.
d) We know from Proposition 4.4 that is càdlàg and by Proposition 4.1 it follows that . It remains to show that is quasi-left continuous. Using the notation from the proof of Proposition 4.4 we have for ,
[TABLE]
for all with . By Proposition 4.2.(ii) the first part tends to zero -a.s. as . Since was arbitrary and is -a.s. bounded the desired result follows. This finishes the proof.∎
5 Application to SDDEs with controlled volatility
We now move to the case of impulse control of SDDEs. However, we start by formalizing the hydro-power production problem proposed as a motivating example in the introduction.
5.1 Continuous time hydro-power planning
The increasing competitiveness of electricity markets calls for new operational standards in electric power production facilities. It has previously been acknowledged that optimal switching can be useful in deriving production schedules that maximize the revenue from electricity production [7, 11, 20]. Here we will extend the applicability of optimal switching by introducing a new example, the coordinated operation of hydropower plants interconnected by hydrological coupling.
We consider the situation where a central operator controls the output of two hydropower stations located in the same river (but note that the model is easily extended to consider an entire system of power stations).
We assume that Plant $i$, for $i=1,2$, has:
- A reservoir containing a volume m$^3$ of water at time .
- A stochastic inflow m$^3$/s to the reservoir that is modeled by a jump diffusion process.
- turbines that can be either “in operation”, producing MW by releasing m$^3$/s of water through the turbine or “idle”.
We assume that the power plants are hydrologically connected in such a way that the water that passes through Plant 1 will reach the reservoir of Plant 2 after seconds.
We assume that we control the number of turbines in operation in each of the two plants. We thus let . The dynamics of the involved processes is then given by
[TABLE]
and an appropriate reward functional is
[TABLE]
where is the (stochastic) electricity price at time and is the value of water (per m$^3$) stored in the reservoirs at the end of the operation period. [Footnote 5: Note that we expect the water in Reservoir 1 to have a higher value as it can be used in both plants.]
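The hydrological coupling described above, where water released from Plant 1 reaches the reservoir of Plant 2 after a fixed delay, can be sketched as a simple discrete-time water balance. Everything in the block below is an illustrative assumption rather than the model equations of this section: the function name `simulate_cascade`, the explicit Euler step, deterministic inflows given as lists, and the omission of spill and of capacity or non-negativity constraints on the reservoirs.

```python
def simulate_cascade(n1, n2, inflow1, inflow2, q1, q2, delay_steps, v1_0, v2_0, dt):
    """Water balance for two reservoirs in series with a transport delay.

    n1[k], n2[k]: number of turbines running in Plant 1 and Plant 2 at step k.
    q1, q2:       release per running turbine (m^3/s).
    delay_steps:  number of time steps before water released at Plant 1
                  arrives at the reservoir of Plant 2.
    Returns the two volume paths (lists of length len(n1) + 1).
    """
    v1, v2 = [v1_0], [v2_0]
    for k in range(len(n1)):
        release1 = q1 * n1[k]
        # Water arriving now at Plant 2 was released delay_steps steps ago.
        arriving = q1 * n1[k - delay_steps] if k >= delay_steps else 0.0
        v1.append(v1[-1] + (inflow1[k] - release1) * dt)
        v2.append(v2[-1] + (inflow2[k] + arriving - q2 * n2[k]) * dt)
    return v1, v2

# Usage: run one upstream turbine for two steps, keep the downstream plant idle.
v1, v2 = simulate_cascade(
    n1=[1, 1, 0, 0], n2=[0, 0, 0, 0],
    inflow1=[0.0] * 4, inflow2=[0.0] * 4,
    q1=10.0, q2=5.0, delay_steps=2,
    v1_0=100.0, v2_0=50.0, dt=1.0,
)
```

The downstream volume at any step depends on upstream decisions taken `delay_steps` earlier, which is why part of the control history enters the downstream optimization.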
5.2 A general SDDE model
Motivated by the above example we assume that is the completed filtration generated by an -dimensional Brownian motion and an -dimensional, independent, finite activity, Poisson random measure with intensity measure , where is the Lévy measure on of and is called the compensated jump martingale random measure of . For , we let solve
[TABLE]
where is a constant and is a deterministic càdlàg function with , and define recursively
[TABLE]
Finally we let be our controlled process. [Footnote 6: Whenever it exists, we refer to the limit process as a solution to the SDDE (5.3)-(5.5).]
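To give a feel for how paths of a delay equation driven by a Brownian motion and finite-activity jumps can be generated numerically, the sketch below applies an Euler-Maruyama step to a generic scalar SDDE whose coefficients depend on the current state and on the state one delay earlier. It is only a sketch under stated assumptions and not the scheme or the equation of this section: the coefficient signatures `drift(x, x_delayed)` and `vol(x, x_delayed)`, the function name `euler_sdde`, the absence of a control, and the approximation of the compound Poisson part by at most one jump per step (with probability `jump_rate * dt`) are all illustrative choices.

```python
import math
import random

def euler_sdde(drift, vol, jump_rate, jump_size, history, delay, T, dt, rng):
    """Euler-Maruyama path of a scalar SDDE with one fixed delay.

    history(s): deterministic initial segment on [-delay, 0].
    jump_size(rng): draws one jump size; at most one jump per step is
    simulated, which approximates a compound Poisson process for small dt.
    Returns the path on the grid 0, dt, ..., T.
    """
    n_delay = round(delay / dt)
    n_steps = round(T / dt)
    # path[0 .. n_delay] holds the initial segment on [-delay, 0].
    path = [history(-delay + i * dt) for i in range(n_delay + 1)]
    for _ in range(n_steps):
        x, x_delayed = path[-1], path[-1 - n_delay]
        dw = rng.gauss(0.0, math.sqrt(dt))            # Brownian increment
        jump = jump_size(rng) if rng.random() < jump_rate * dt else 0.0
        path.append(x + drift(x, x_delayed) * dt + vol(x, x_delayed) * dw + jump)
    return path[n_delay:]

# Usage: mean reversion towards the delayed state, with occasional upward jumps.
rng = random.Random(0)
path = euler_sdde(
    drift=lambda x, xd: xd - x,
    vol=lambda x, xd: 0.2,
    jump_rate=1.0,
    jump_size=lambda r: r.expovariate(2.0),
    history=lambda s: 1.0,
    delay=0.5, T=1.0, dt=0.01, rng=rng,
)
```

A controlled version would let the coefficients additionally depend on the current mode and restart the recursion at each intervention time.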
Remark 5.1.
Note that by letting and taking and letting the first rows of , and equal zeros we get which implies that the control enters all terms in the SDDE for .
We consider the situation when the functional is given by
[TABLE]
We assume that the parameters of the SDDE satisfy the following conditions:
Assumption 5.2.
- i) The functions and are continuous in and satisfy
[TABLE]
for all .
- ii) There is a , with such that satisfies
[TABLE]
- iii) For all and all , the map satisfies
[TABLE]
Furthermore,
[TABLE]
for all and .
Remark 5.3.
Note in particular that since and are continuous in , and are uniformly bounded and Lipschitz continuity implies that
[TABLE]
We have the following result:
Proposition 5.4.
Under Assumption 5.2 the SDDE (5.3)-(5.5) admits a unique solution for each . Furthermore, the solution has moments of order $4q$, i.e. $\sup_{u\in\mathcal{U}}\mathbb{E}\big[\sup_{t\in[0,T]}|X^{u}_{t}|^{4q}\big]<\infty$.
Proof. We first note that existence of a unique solution to the SDDE follows by repeated use of Theorem 3.2 in [1] (where existence of a unique solution to a more general controlled SDDE is shown). It remains to show that the moment estimate holds. We have on and
[TABLE]
on . By Assumption 5.2.(iii) we get, for , using integration by parts, that
[TABLE]
By repeated application we find that
[TABLE]
with . Now, since and coincide on we have
[TABLE]
and
[TABLE]
Finally, using the Burkholder-Davis-Gundy inequality in combination with (5.6) we get
[TABLE]
where the constant does not depend on and it follows by Grönwall’s lemma that $\mathbb{E}\big[\sup_{t\in[0,T]}|X^{u,j}_{t}|^{4q}\big]$ is bounded uniformly in . Now, the result follows since , $\mathbb{P}$-a.s., as .∎
For each and each we let
[TABLE]
and
[TABLE]
Proposition 5.5.
For all we have
[TABLE]
Proof. For we have, for ,
[TABLE]
Arguing as in the proof of Proposition 5.4 we find that for ,
[TABLE]
We thus find that, for each ,
[TABLE]
and the assertion again follows by applying Grönwall’s lemma and using Proposition 5.4.∎
To show that switching does not cause solutions to diverge, we have the following useful lemma:
Lemma 5.6.
For and each , let and be processes in (with uniformly bounded) that solve the SDDE (5.3)-(5.5) on with control and such that
[TABLE]
as . Then,
[TABLE]
and for all we have
[TABLE]
Proof. By the contraction property of we have that . Using integration by parts we get, for ,
[TABLE]
Repeated application implies that
[TABLE]
Now, for we have
[TABLE]
Using Lipschitz continuity of and and the Burkholder-Davis-Gundy inequality we get
[TABLE]
where the constant does not depend on the control , and by Grönwall’s inequality we have
[TABLE]
Now, applying Jensen’s inequality gives (5.8). Furthermore, we have
[TABLE]
and (5.9) follows by an identical argument.∎
We add the following assumptions on the components of the cost functional and the functions .
Assumption 5.7.
- (i) The functions and are both locally Lipschitz in . Furthermore, there are constants and such that
[TABLE]
for all .
- (ii) For all we have
[TABLE]
for all .
- (iii) There is a constant such that for any sequence with there is a subsequence with and for which
[TABLE]
It is straightforward to see that with the above assumptions the defined by
[TABLE]
satisfies Assumption 2.2. The remainder of this section is devoted to showing that also satisfies Assumption 2.5, guaranteeing the existence of an optimal control to the problem of maximizing .
Proposition 5.8.
For each and each and there is a map such that
[TABLE]
and
[TABLE]
Furthermore, we have
[TABLE]
and
[TABLE]
Proof. To simplify notation we let denote and let and (resp. and ) denote resp. (resp. and ). Furthermore, we let be the running maximum of the process .
We have:
i) , for all , -a.s.
ii) On we have .
iii) If , then .
Letting we get
[TABLE]
Hence,
[TABLE]
But and by induction it follows that
[TABLE]
If we iteratively define , for with and , then we get, in the same manner,
[TABLE]
Now on we have
[TABLE]
Put together we find that for we have
[TABLE]
Applying Thm 66, p. 339 in [26] and Lipschitz continuity iteratively gives
[TABLE]
By Grönwall’s inequality and point ii) above we find that
[TABLE]
Moving on we consider the possibility of interventions in the period . Let and note that if , then there is a subsequence with with and such that, for all ,
[TABLE]
We then let . [Footnote 7: For we denote by the vector of ones.] Arguing as above, we find that
[TABLE]
We now turn to the total revenue and let
[TABLE]
By right continuity of the switching costs, we find that
[TABLE]
-a.s. The difference in revenue can then be written
[TABLE]
By local Lipschitz continuity of and we get that, for each there is a such that and on . This gives us the relation
[TABLE]
where . Doob’s maximal inequality then gives that
[TABLE]
where we have used Hölder’s inequality and the moment estimate in Proposition 5.4 to arrive at the last inequality. For any we thus have
[TABLE]
Concerning the first term, we have that , where and on . On we let with
[TABLE]
and where is obtained from as was obtained from . Now, we proceed as above and get for each , that
[TABLE]
By (5.14) and (5.8) of Lemma 5.6 we then find that for each , the first term on the right hand side in (5.17) goes to 0 as . Concerning the second term we have, again by Hölder’s inequality and Proposition 5.4, that
[TABLE]
Now, , where does not depend on . For sufficiently large we thus see, by (5.16) and Chebyshev’s inequality, that the probability on the right hand side can be made arbitrarily small by choosing sufficiently large. For the third term we note that
[TABLE]
where the right hand side goes to 0 as by right-continuity of the switching costs. Finally, the last term of (5.17) can be made arbitrarily small by choosing large.
Concerning the second claim we note that with and the relation in (5.15) is replaced by
[TABLE]
Hence, appealing to (5.9) of Lemma 5.6, right-continuity and the result in Proposition 5.5, the first, second and last terms in the equivalent of (5.17) tend to 0 as and (5.12) follows.
The last two statements given in equations (5.12)-(5.13) follow by a similar reasoning while noting that in this case which implies that , -a.s.∎
**Lemma 5.9.**
For all and we have
[TABLE]
-a.s. as .
Proof. Starting with we note that for we have
[TABLE]
which gives
[TABLE]
For and we have, for
[TABLE]
and
[TABLE]
which gives
[TABLE]
Repeated application renders
[TABLE]
Furthermore, we have
[TABLE]
where the right hand side tends to zero -a.s. as by -a.s. boundedness of
. Arguing as in the proof of Lemma 5.6 we find that
[TABLE]
and the assertion follows by right continuity of .∎
**Lemma 5.10.**
For all and all we have whenever , with , that
[TABLE]
for all .
Proof. Arguing as in the proof of the previous lemma we find that
[TABLE]
Furthermore, by Hölder’s inequality we have
[TABLE]
where . Now, by definition is a predictable stopping time and the jump part of our SDDE is -a.s. constant at predictable stopping times. We can thus apply Lemma 5.6, and the assertion follows.∎
**Proposition 5.11.**
*For all and all , the process
is in for all .*
Proof. Let . To show that has a càdlàg version we consider
[TABLE]
where the second term on the right hand side goes to zero -a.s. as by uniform integrability and right continuity of the filtration. Concerning the first term we have
[TABLE]
for each , by the local Lipschitz property of and . Concerning the last term Doob’s maximal inequality gives, for fixed ,
[TABLE]
Applying Hölder’s inequality to the right hand side and taking the supremum over , we get
[TABLE]
Now, by Chebyshev’s inequality and Proposition 5.5,
can be made arbitrarily small by choosing large. By monotonicity, it follows that the last term in (5.18) tends to zero, -a.s. as . We conclude that tends to , -a.s. when by right continuity of the switching costs in combination with Lemma 5.9 and it follows that has a càdlàg version.
Arguing as above we have that
[TABLE]
Letting the last term tends to zero -a.s. by uniform integrability and quasi-left continuity of the filtration. Concerning the first term we have (where, for notational convenience, we assume that )
[TABLE]
where the right hand side can be made arbitrarily small by Lemma 5.10 and quasi-left continuity of the switching costs. We conclude that
[TABLE]
which implies that in probability. Now since has left limits it follows that , -a.s. and we conclude that .∎
By the above results we conclude that an optimal control for the hydropower planning problem exists (under the assumptions detailed in this section). With a few notable exceptions (see, e.g., [3, 4] in the case of singular control problems and Chapter 7 in [25] for examples of solvable impulse control problems), finding explicit solutions to impulse control problems is difficult. Instead, we often have to resort to numerical methods to approximate the optimal control. A plausible direction for obtaining numerical approximations of solutions to the hydropower operator's problem would be to further develop the Monte Carlo technique originally proposed for optimal switching problems in [7] (and later analyzed in [2]) to obtain polynomial approximations of . Another possibility would be to apply the Markov chain approximations for stochastic control problems of delay systems developed in [21]. However, a thorough investigation of either direction is beyond the scope of the present work and is left as a topic for future research.
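To give a rough idea of what a regression-based Monte Carlo scheme with polynomial approximations looks like, the following is a minimal sketch on a toy problem. It is not the method of [7] or [2] and not the hydropower model of this paper: it assumes a hypothetical two-mode switching problem without delay, memory or jumps, a discretized mean-reverting price with made-up parameters, and a simple value-regression (Tsitsiklis–Van Roy style) backward induction. The function name `switching_value_mc` and all parameter values are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def switching_value_mc(n_paths=20000, n_steps=20, T=1.0,
                       switch_cost=0.1, degree=3, seed=0):
    """Estimate time-0 values of a toy two-mode switching problem.

    Mode 0 is idle (zero reward), mode 1 earns (price - marginal_cost) per
    unit time. All model parameters below are hypothetical.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    kappa, mu, sigma, x0, marginal_cost = 2.0, 1.0, 0.5, 1.0, 1.0

    # Simulate the price on a time grid (Euler scheme for a mean-reverting SDE).
    X = np.empty((n_steps + 1, n_paths))
    X[0] = x0
    for n in range(n_steps):
        noise = rng.standard_normal(n_paths)
        X[n + 1] = X[n] + kappa * (mu - X[n]) * dt + sigma * np.sqrt(dt) * noise

    def reward(mode, x):
        # Running reward rate in the given mode (works for scalars and arrays).
        return mode * (x - marginal_cost)

    # V[i] holds pathwise value estimates when the current mode is i.
    V = np.zeros((2, n_paths))  # terminal value is zero in both modes

    # Backward induction over the interior grid points.
    for n in range(n_steps - 1, 0, -1):
        cont = np.empty((2, n_paths))
        for j in range(2):
            # Polynomial least-squares estimate of E[V_{n+1}(., j) | X_n].
            coeffs = P.polyfit(X[n], V[j], degree)
            cont[j] = P.polyval(X[n], coeffs)
        new_V = np.empty_like(V)
        for i in range(2):
            # Either stay in mode i or pay switch_cost and move to the other mode.
            candidates = [reward(j, X[n]) * dt - switch_cost * (i != j) + cont[j]
                          for j in range(2)]
            new_V[i] = np.maximum(candidates[0], candidates[1])
        V = new_V

    # At time 0 the state is deterministic, so the conditional expectation
    # reduces to a plain sample mean.
    cont0 = V.mean(axis=1)
    v0 = np.empty(2)
    for i in range(2):
        v0[i] = max(reward(j, x0) * dt - switch_cost * (i != j) + cont0[j]
                    for j in range(2))
    return v0


if __name__ == "__main__":
    print(switching_value_mc())
```

By construction, the two returned values differ by at most the switching cost, which is a quick sanity check on any implementation. Extending such a scheme to the setting of this paper would require regressing on a state that also encodes the relevant control history and delayed dynamics, which is where the difficulty lies.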
References
- [1] N. Agram and B. Øksendal. Stochastic control of memory mean-field processes. Appl. Math. Optim., 79:181–204, 2019.
- [2] R. Aïd, L. Campi, N. Langrené, and H. Pham. A probabilistic numerical method for optimal multiple switching problems in high dimension. SIAM J. Financial Math., 5(1):191–231, 2014.
- [3] I. Aslaksen, O. Bjerkholt, K. A. Brekke, T. Lindstrøm, and B. Øksendal. The choice between hydro and thermal power generation under uncertainty. In O. Bjerkholt, Ø. Olsen, and J. Vislie, editors, Recent Modelling Approaches in Applied Energy Economics, pages 187–205. Chapman and Hall, 1990.
- [4] I. Aslaksen, O. Bjerkholt, K. A. Brekke, T. Lindstrøm, and B. Øksendal. A class of solvable stochastic investment problems involving singular controls. Stochastics, 43:29–63, 1993.
- [5] K. A. Brekke and B. Øksendal. Optimal switching in an economic activity under uncertainty. SIAM J. Control Optim., 32(4):1021–1036, 1994.
- [6] M. J. Brennan and E. S. Schwartz. Evaluating natural resource investments. J. Bus., 58:135–157, 1985.
- [7] R. Carmona and M. Ludkovski. Pricing asset scheduling flexibility using optimal switching. Appl. Math. Finance, 15:405–447, 2008.
- [8] J. F. Chassagneux, R. Elie, and I. Kharroubi. A note on existence and uniqueness for solutions of multidimensional reflected BSDEs. Electron. Commun. Probab., 16:120–128, 2011.
