Optimal steering for non-Markovian Gaussian processes
Daniele Alpago, Yongxin Chen, Tryphon Georgiou, Michele Pavon

TL;DR
This paper derives a closed-form optimal control law for steering a non-Markovian Gaussian process with a finite-dimensional Markov realization to a desired terminal distribution, minimizing energy over a finite horizon.
Contribution
It provides the first closed-form solution for finite-energy steering of a non-Markovian process with a Markov realization, advancing control of partially observable stochastic systems.
Findings
Closed-form optimal control law derived.
Applicable to non-Markovian processes with Markov realizations.
Progress towards controlling systems with partial, noisy observations.
Abstract
At present, the problem to steer a non-Markovian process with minimum energy between specified end-point marginal distributions remains unsolved. Herein, we consider the special case for a non-Markovian process y(t) which, however, assumes a finite-dimensional stochastic realization with a Markov state process that is fully observable. In this setting, and over a finite time horizon [0,T], we determine an optimal (least) finite-energy control law that steers the stochastic system to a final distribution that is compatible with a specified distribution for the terminal output process y(T); the solution is given in closed-form. This work provides a key step towards the important problem to steer a stochastic system based on partial observations of the state (i.e., an output process) corrupted by noise, which will be the subject of forthcoming work.
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Stochastic processes and financial applications · Advanced Thermodynamics and Statistical Mechanics
D. Alpago is with the Dipartimento di Ingegneria dell’Informazione, Università di Padova, 35131 Padova, Italy. Y. Chen is with the School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA 30332. T. Georgiou is with the Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA 92697. M. Pavon is with the Dipartimento di Matematica “Tullio Levi-Civita”, Università di Padova, 35121 Padova, Italy. Supported in part by the NSF under grants 1509387 and 1901599, the AFOSR under grants FA9550-15-1-0045 and FA9550-17-1-0435, and by the University of Padova Research Project CPDA 140897.
I Introduction
Throughout we will be considering a controlled evolution of the vector Gauss-Markov process that obeys the linear stochastic differential equation
[TABLE]
For this setting, in recent years, there has been considerable interest in the problem of minimum-energy steering of the (Gaussian) distribution of the state to a target distribution at the terminal time T [6, 7, 19, 17, 2]. Important extensions include [7] the more challenging case when the control process and the noise enter through different channels (i.e., having different “input matrices” in (1a)), and the infinite-horizon case where the goal is to achieve with minimum power a specified stationary state [7]; the latter generalizes the classical work on covariance control of Skelton et al. [20, 18]. Motivation for such problems is manifold: they represent a most natural relaxation of classical LQR steering problems and have important applications in quality control and industrial manufacturing, vehicle path planning [27], and statistical physics, as in cooling and control of nano-to-meter scale resonators, atomic force microscopy and so forth; see e.g. [14, 8].
Historically, the origin of the steering problem stems from a Gedankenexperiment formulated by Schrödinger in the early thirties [31, 32], seeking the most likely flow of particle distributions between observed end-point marginals. Schrödinger’s problem amounted to a problem in the theory of large deviations (which was unavailable at that time). Indeed, thanks to Sanov’s theorem [30], Schrödinger’s problem amounts to seeking a probability distribution on particle trajectories having maximum entropy and which is in agreement with the specified end-point marginal distributions [16, 3, 21, 15, 35]. Then, in the late eighties and early nineties, following work of Jamison, Föllmer, Nagasawa, Wakolbinger, Fleming, Holland, Mitter and others, a clear connection was made with stochastic control [12, 13, 28]. The distribution on paths, corresponding to the uncontrolled evolution, plays the role of the “prior” measure in the maximum entropy problem which generalizes Schrödinger’s original one. At about the same time, Blaquière [4] studied the control of the Fokker-Planck equation and later Brockett studied the Liouville equation [5] in a similar spirit, to steer distributions to a target one. This circle of control problems for uncertain systems has recently been linked to yet another fast developing topic, the Optimal Mass Transport (OMT) problem [34], when it was realized that Schrödinger’s bridge problem (SBP, as it seeks to “bridge” the two end-point marginals) may be viewed as a regularization of OMT and provides an effective computational approach to the latter [24, 25, 26, 22, 10].
Extending the Schrödinger problem to the case of non-Markov processes is a tantalizing problem and a natural next step. While the general case is currently wide open, in the present paper we work out the special case of steering the output of a Gauss-Markov model. More specifically, in conjunction with (1a), we consider the output process
[TABLE]
where the output matrix is a continuous, matrix-valued function of time on [0,T]. For instance, this case arises when we consider steering only some components of the state to a prescribed terminal distribution (see Section V). Clearly, y(t) by itself is not a Markov process. Thus, this seemingly innocuous problem falls into the category of Schrödinger bridge problems with non-Markov prior for which the form of the optimal control is, in general, unknown (see [29] for a considerably simpler “half-bridge” problem where only the final distribution is prescribed). Problems where only a portion of the state needs to be specified arise, for instance, in thickness control (film extrusion) [1, 2] where the remaining components of the state vector might either not be of interest or may be difficult/expensive to measure. In Section V we discuss a case where it is of interest to regulate only the distribution of the momentum of stochastic oscillators.
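For concreteness, the setting can be sketched as follows. This is a sketch under stated assumptions: it takes the form used in [6], where the control and the noise share the same input matrix, and the symbols A, B, C, w, x_0 and Σ_0 are assumed notation introduced here for illustration.

```latex
\begin{align*}
dx(t) &= A(t)\,x(t)\,dt + B(t)\,u(t)\,dt + B(t)\,dw(t), \\
x(0)  &= x_0 \sim \mathcal{N}(0, \Sigma_0), \\
y(t)  &= C(t)\,x(t), \qquad t \in [0, T].
\end{align*}
```

Here w is a standard Wiener process and u is the control. Since y(t) is a linear function of a Gaussian state, it is Gaussian, but in general it does not inherit the Markov property of x(t).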
The outline of the paper is as follows. In Section II, we recall some central results from [6] in the case of a Markovian prior. In Section III, we give a precise formulation of our stochastic control problem. In Section IV, we provide a closed-form solution to our problem by finding the terminal time state covariance which can be reached with minimum energy among those complying with the assigned covariance of y(T). Finally, Section V illustrates the results in a problem of steering the momentum distribution of a stochastic oscillator to a desired one.
II Background
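Consider the family of adapted, finite-energy control functions such that (1a) has a strong solution on [0,T] and the terminal state x(T) has the prescribed target distribution. Here, adapted means that the control at time t only depends on t and on the state trajectory up to time t. The optimal steering problem reads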
Problem 1
Determine
[TABLE]
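The quantity being minimized is the expected control energy over the horizon. A sketch of the objective, in assumed notation (the symbol 𝒰 for the admissible family and J for the cost are introduced here for illustration):

```latex
\[
\min_{u \in \mathcal{U}} \; J(u),
\qquad
J(u) := \mathbb{E}\left\{ \int_0^T u(t)^{\top} u(t)\, dt \right\}.
\]
```

The terminal distribution constraint is built into the admissible family, so the minimization itself is unconstrained over that family.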
In [6, Theorem 8], it was shown that, under controllability of the pair formed by the drift and input matrices of (1a) on the given time interval, this family is nonempty and the (unique) optimal control is a linear feedback of the state given by
[TABLE]
where and , taking values in the set of symmetric, matrices, are the unique nonsingular solutions on of the system of linear matrix equations
[TABLE]
nonlinearly coupled through the boundary conditions
[TABLE]
The solutions to these equations can actually be provided in closed form as a function of , see [6, Section III] for further details.
Let and be the probability measures on , the -dimensional continuous functions corresponding to the solutions of (1a) with control [math], and , respectively. Also let and be their initial-final joint density, respectively. In [6, Section IV], a well-known decomposition of the relative entropy [15] was extended to the case of degenerate diffusions, to show that the Schrödinger bridge problem with marginal densities and can be reduced to the following maximum entropy problem for distributions on a finite-dimensional space:
Problem 2
Minimize over densities on the Kullback-Leibler index
[TABLE]
subject to the (linear) constraints
[TABLE]
Let be the covariance of . Since , has necessarily the structure
[TABLE]
for some . Let instead be the covariance corresponding to . Then, it has the form
[TABLE]
where
[TABLE]
with denoting the state-transition of determined by
[TABLE]
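The ingredients above can be computed numerically. The following is a minimal sketch, assuming time-invariant matrices A and B, an uncontrolled model dx = A x dt + B dw with x(0) distributed as N(0, Sigma0), and the function names chosen here for illustration. It integrates the state-transition matrix and the controllability Gramian with a classical Runge-Kutta scheme and assembles the joint covariance of (x(0), x(T)).

```python
import numpy as np

def rk4(f, X, h, steps):
    """Classical RK4 for the autonomous matrix ODE dX/dt = f(X)."""
    for _ in range(steps):
        k1 = f(X)
        k2 = f(X + 0.5 * h * k1)
        k3 = f(X + 0.5 * h * k2)
        k4 = f(X + h * k3)
        X = X + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return X

def transition_and_gramian(A, B, T, steps=2000):
    """Return Phi(T, 0) and M(T, 0) for constant A, B (an assumption).

    Phi solves dPhi/dt = A Phi with Phi(0) = I.
    M solves dM/dt = A M + M A' + B B' with M(0) = 0, which equals the
    integral of Phi(T, s) B B' Phi(T, s)' over [0, T].
    """
    n = A.shape[0]
    h = T / steps
    Phi = rk4(lambda P: A @ P, np.eye(n), h, steps)
    M = rk4(lambda P: A @ P + P @ A.T + B @ B.T, np.zeros((n, n)), h, steps)
    return Phi, M

def prior_joint_covariance(A, B, Sigma0, T):
    """Covariance of the stacked vector (x(0), x(T)) with zero control."""
    Phi, M = transition_and_gramian(A, B, T)
    top = np.hstack([Sigma0, Sigma0 @ Phi.T])
    bottom = np.hstack([Phi @ Sigma0, Phi @ Sigma0 @ Phi.T + M])
    return np.vstack([top, bottom])

# Double integrator as a small check case.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Phi, M = transition_and_gramian(A, B, 1.0)
S = prior_joint_covariance(A, B, np.eye(2), 1.0)
```

The Schur complement of the upper-left block of the joint covariance is exactly the Gramian, so the joint covariance is positive definite whenever the initial covariance is and the pair (A, B) is controllable.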
Thanks to the explicit form of relative entropy (Kullback-Leibler index) for Gaussian distributions [11], Problem 2 can be expressed in terms of covariances as follows:
[TABLE]
where is as in (7) and
[TABLE]
see [6, Section IV] for the details.
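The explicit Gaussian form of the relative entropy used in this reduction can be sketched in code. The function below is an illustration for zero-mean Gaussians with positive definite covariances; the function name and argument order are choices made here, not the paper's notation.

```python
import numpy as np

def gaussian_kl(S1, S0):
    """Relative entropy KL( N(0, S1) || N(0, S0) ).

    Uses 0.5 * ( tr(S0^{-1} S1) - n + log det S0 - log det S1 ),
    valid for symmetric positive definite S1 and S0.
    """
    S1 = np.atleast_2d(np.asarray(S1, dtype=float))
    S0 = np.atleast_2d(np.asarray(S0, dtype=float))
    n = S1.shape[0]
    sign1, logdet1 = np.linalg.slogdet(S1)
    sign0, logdet0 = np.linalg.slogdet(S0)
    if sign1 <= 0 or sign0 <= 0:
        raise ValueError("both covariances must be positive definite")
    trace_term = np.trace(np.linalg.solve(S0, S1))
    return 0.5 * (trace_term - n + logdet0 - logdet1)

# The index vanishes when the two covariances coincide.
same = gaussian_kl(np.eye(3), np.eye(3))
# Scalar case: variance 2 against variance 1.
scalar = gaussian_kl([[2.0]], [[1.0]])
```

Because the expression depends on the covariances only through a trace and two log-determinants, minimizing it over a structured family of covariances becomes a finite-dimensional matrix optimization.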
III Problem formulation
We consider the output process y in (1c) and assume that the state is fully observable. The finite-dimensional Markovian representation (stochastic realization) for y provided by (1a)-(1c) is available. Such a representation, as is well-known, constitutes the starting point of Kalman filtering and much of optimal control theory, and the construction of such a model with minimal state vector dimension has been the subject of intense study [23]. This too is our starting point.
Consider now the family of adapted control functions such that (1a) has a strong solution on [0,T] and the terminal output y(T) has the specified distribution. We formulate the following Schrödinger Bridge Problem with non-Markov prior:
Problem 3
Determine
[TABLE]
Notice that on one side, at t = 0, the boundary constraint requires matching the covariance of the state vector (which can be relaxed), while on the other end, at t = T, it requires matching the covariance of the output
[TABLE]
The terminal state covariance is a free parameter, and there are in general several values for it such that (10) is satisfied (the case where only the initial and terminal output covariances are prescribed can be treated in a similar fashion by optimizing also with respect to the initial state covariance). Corresponding to each one of them, there is a feedback control in the admissible family of Problem 1 optimally performing the transfer of distributions according to [6]. Thus, the problem may also be viewed as that of determining the one final state covariance, among those compatible with the prescribed covariance of y(T), whose corresponding optimal control (2) has minimum energy.
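The non-uniqueness of the terminal state covariance can be illustrated with a small numerical example. This is a sketch under assumptions: the output is taken to be a linear map y = C x, the matrix name C and the numerical values are hypothetical, and C is chosen to select the second state component only.

```python
import numpy as np

def output_covariance(C, Sigma_T):
    """Covariance of y(T) = C x(T) when x(T) has covariance Sigma_T."""
    return C @ Sigma_T @ C.T

# Hypothetical output map: observe only the second state component.
C = np.array([[0.0, 1.0]])

# Two different positive definite terminal state covariances...
Sigma_a = np.array([[1.0, 0.0], [0.0, 0.25]])
Sigma_b = np.array([[2.0, 0.3], [0.3, 0.25]])

# ...that induce the same output covariance.
Sy_a = output_covariance(C, Sigma_a)
Sy_b = output_covariance(C, Sigma_b)
```

Both candidates satisfy the same output constraint, so the constraint alone does not determine the terminal state covariance. The remaining freedom (here the first diagonal entry and the cross term) is what the minimum-energy criterion resolves.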
Inspired by the reduction of the classical case leading to Problem 2, we proceed in the next section to derive a closed-form solution of Problem 3.
IV Solution to the non-Markovian steering problem
In view of (9) in Section II, Problem 3 can be rewritten as
[TABLE]
where , constitute the given data while is a parameter. This can be further recast as
Problem 4
Given , , and as in (8), determine
[TABLE]
subject to and .
Now, let
[TABLE]
and
[TABLE]
where is the set of symmetric positive definite matrices.
We construct below the Lagrangian introducing a Lagrange multiplier and consider the unconstrained minimization
[TABLE]
The Lagrangian is given by (we write, for simplicity, instead of )
[TABLE]
where is a Lagrange multiplier and is a constant term. We first check the convexity of with respect to .
Proposition 1
* is jointly convex in over .*
Proof:
Let denote the first variation of in the direction . Applying the chain rule,
[TABLE]
To check the convexity it is sufficient to look at the diagonal of the “Hessian” of
[TABLE]
We have
[TABLE]
which is clearly non-negative on . ∎
To find the minimum of in , it is therefore sufficient to solve
[TABLE]
from which we get the two equations
[TABLE]
To compute the optimal , we use these equations in the Lagrangian and then proceed to maximize the resulting (concave) functional with respect to . Accordingly, the last equation we need is given by
[TABLE]
Let and note that . We immediately get and
[TABLE]
Therefore, and
[TABLE]
At this point we only need to find from equations (17), (18). Since we can always find a state space transformation such that (or a change of basis in the outputs’ space), without loss of generality, we can always assume that . Let
[TABLE]
Equation (18) becomes
[TABLE]
while equation (17) can be equivalently written as
[TABLE]
which reduces to the system of equations
[TABLE]
Plugging , and into (19), we get
[TABLE]
where
[TABLE]
Equation (22) is a quadratic equation with two solutions
[TABLE]
Clearly, by Schur complement, which implies . This singles out the solution . We can now recover from (21) and then and . Finally, from (20), one can find the multiplier :
[TABLE]
The above results can be summarized as follows.
Theorem 5
Let be as in (23) and be derived accordingly, then solves Problem 4. Furthermore, the solution to Problem 3 coincides with the solution to Problem 1 with .
V Example
Consider controlling the Ornstein-Uhlenbeck model of physical Brownian motion
[TABLE]
corresponding to a given quadratic potential with symmetric, positive-definite, and is the control force. By setting
[TABLE]
model (24) becomes
[TABLE]
where is zero-mean Gaussian with , and the pair is controllable. We consider a state dimension of and we assume for simplicity that the units are such that and .
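As an uncontrolled baseline for a model of this type, the state covariance can be propagated through the Lyapunov differential equation. The sketch below rests on several assumptions that are not taken from the example itself: scalar position and momentum, the dynamics dx = v dt, dv = -kappa x dt - beta v dt + sigma dw with zero control, unit default parameters, and a band of three standard deviations around the zero mean. All names are hypothetical.

```python
import numpy as np

def covariance_path(P0, T, steps=1000, beta=1.0, kappa=1.0, sigma=1.0):
    """Propagate dP/dt = A P + P A' + Q with RK4 and return all iterates.

    Assumed model (zero control): dx = v dt,
    dv = -kappa * x dt - beta * v dt + sigma dw.
    """
    A = np.array([[0.0, 1.0], [-kappa, -beta]])
    Q = np.array([[0.0, 0.0], [0.0, sigma ** 2]])
    f = lambda P: A @ P + P @ A.T + Q
    h = T / steps
    P = np.array(P0, dtype=float)
    path = [P.copy()]
    for _ in range(steps):
        k1 = f(P)
        k2 = f(P + 0.5 * h * k1)
        k3 = f(P + 0.5 * h * k2)
        k4 = f(P + h * k3)
        P = P + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        path.append(P.copy())
    return np.array(path)

# Start from unit covariance and run long enough to reach stationarity.
path = covariance_path(np.eye(2), T=40.0, steps=4000)

# Half-width of an assumed three-standard-deviation band for the momentum.
momentum_band = 3.0 * np.sqrt(path[:, 1, 1])
```

With these unit parameters the stationary covariance is half the identity, so without control the momentum variance settles at 0.5 regardless of the target. Reaching a different momentum variance in finite time is what requires the steering control.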
We would like to steer the Gaussian distribution of the momentum to a prescribed final distribution at the terminal time, minimizing the quadratic control energy under the controlled dynamics (24). In other words, we are prescribing only the final covariance matrix of the momentum. Figure 1 shows the trajectories of the state variables in the phase space (left) and the corresponding control efforts (right), i.e. the intersections of the phase plot with the slice planes and respectively.
Figure 2 highlights instead the trajectories of position (left) and momentum (right) with the corresponding confidence interval.
In all the figures, the transparent blue tube represents the confidence interval, i.e. its intersection with the slice plane is given by
[TABLE]
The figures highlight the reduction of the variance of the momentum process as time increases to T.
References
- [1] K. J. Åström, Introduction to Stochastic Control Theory, Academic Press, 1970.
- [2] Efstathios Bakolas, Finite-horizon covariance control for discrete-time stochastic linear systems subject to input constraints, Automatica, 91, pp. 61-68, 2018.
- [3] A. Beurling, An automorphism of product measures, Ann. Math. 72 (1960), 189-200.
- [4] A. Blaquière, “Controllability of a Fokker-Planck equation, the Schrödinger system, and a related stochastic optimal control (revised version),” Dynamics and Control, vol. 2, no. 3, pp. 235–253, 1992.
- [5] R. Brockett, “Notes on the control of the Liouville equation,” in Control of Partial Differential Equations. Springer, 2012, pp. 101–129.
- [6] Y. Chen, T.T. Georgiou and M. Pavon, “Optimal steering of a linear stochastic system to a final probability distribution, Part I”, IEEE Trans. Aut. Control, 61, Issue 5, 1158-1169, 2016.
- [7] Y. Chen, T.T. Georgiou and M. Pavon, “Optimal steering of a linear stochastic system to a final probability distribution, Part II”, IEEE Trans. Aut. Control, 61, Issue 5, 1170-1180, 2016.
- [8] Y. Chen, T.T. Georgiou and M. Pavon, “Fast cooling for a system of stochastic oscillators”, J. Math. Phys., 56, n.11, 113302, 2015.
