A Parallel Decomposition Scheme for Solving Long-Horizon Optimal Control Problems
Sungho Shin, Timm Faulwasser, Mario Zanon, and Victor M. Zavala

TL;DR
This paper introduces a parallel temporal decomposition method for long-horizon optimal control problems, ensuring convergence under specific conditions related to boundary perturbation decay and overlap size, with demonstrated effectiveness on quadratic and non-convex problems.
Contribution
It proposes a novel parallel decomposition scheme with convergence guarantees based on boundary perturbation decay, applicable to both convex and non-convex problems.
Findings
Scheme converges if boundary perturbations decay asymptotically.
LQ problems satisfy the decay condition, ensuring convergence.
Overlap size influences convergence rate and scheme performance.
Abstract
We present a temporal decomposition scheme for solving long-horizon optimal control problems. In the proposed scheme, the time domain is decomposed into a set of subdomains with partially overlapping regions. Subproblems associated with the subdomains are solved in parallel to obtain local primal-dual trajectories that are assembled to obtain the global trajectories. We provide a sufficient condition that guarantees convergence of the proposed scheme. This condition states that the effect of perturbations on the boundary conditions (i.e., initial state and terminal dual/adjoint variable) should decay asymptotically as one moves away from the boundaries. This condition also reveals that the scheme converges if the size of the overlap is sufficiently large and that the convergence rate improves with the size of the overlap. We prove that linear quadratic problems satisfy the asymptotic…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
A Parallel Decomposition Scheme for Solving
Long-Horizon Optimal Control Problems
Sungho Shin1, Timm Faulwasser2, Mario Zanon3, and Victor M. Zavala1 1S. Shin and V. M. Zavala are with the Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI 53706 USA (e-mail: [email protected]; [email protected]).2Timm Faulwasser is with Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology, 76021 Karlsruhe, Germany. (e-mail: [email protected])3Mario Zanon is with IMT Lucca, 55100 Lucca, Italy.(e-mail: [email protected])TF acknowledges support by the Bundesministerium für Bildung und Forschung (BMBF), Grant 05M18CKA.
Abstract
We present a temporal decomposition scheme for solving long-horizon optimal control problems. In the proposed scheme, the time domain is decomposed into a set of subdomains with partially overlapping regions. Subproblems associated with the subdomains are solved in parallel to obtain local primal-dual trajectories that are assembled to obtain the global trajectories. We provide a sufficient condition that guarantees convergence of the proposed scheme. This condition states that the effect of perturbations on the boundary conditions (i.e., the initial state and terminal dual/adjoint variable) should decay asymptotically as one moves away from the boundaries. This condition also reveals that the scheme converges if the size of the overlap is sufficiently large and that the convergence rate improves with the size of the overlap. We prove that linear quadratic problems satisfy the asymptotic decay condition, and we discuss numerical strategies to determine if the condition holds in more general cases. We draw upon a non-convex optimal control problem to illustrate the performance of the proposed scheme.
I Introduction
Long-horizon optimal control problems (OCPs) arise in model predictive control (MPC) applications such as chemical process systems [1], autonomous vehicle steering [2], and battery systems [3]. They also appear in other application domains such as chemical production planning [4] and electricity production planning [5]. Different decomposition techniques have been reported in the literature to improve computational tractability of these problems including dual decomposition [6], alternating direction method of multipliers [7], dual dynamic programming [8], Gauss-Seidel schemes [9, 10], and parallel Newton schemes [11]. Such decomposition techniques allow scalable solutions of long-horizon OCPs by the use of parallel computers.
In this work, we study the convergence properties of a new decomposition paradigm that uses overlapping time domains. This approach is motivated by overlapping Schwartz schemes used for the solution of partial differential equations (PDEs) [12, 13]. In the proposed scheme, the time domain is decomposed into a set of partially overlapping subdomains. Subproblems associated with subdomains are solved in parallel to obtain local primal-dual trajectories by using current primal-dual information from the neighboring subdomains. The subdomain trajectories are then assembled (this can be interpreted as a projection operator) to update the primal-dual trajectories over the entire domain, and the procedure is repeated. We provide a sufficient condition that guarantees convergence of the proposed decomposition algorithm when applied to OCPs. This condition applies to linear and nonlinear problems. Specifically, the condition indicates that convergence of the decomposition scheme is guaranteed provided that the sensitivity of the primal-dual (state-adjoint) trajectories to perturbations in initial states and terminal cost gradients decay asymptotically as one moves away from the boundaries. We call this condition asymptotic decay of sensitivity (ADS). The condition also reveals that the algorithm converges provided that the size of the overlap is sufficiently large and that the convergence rate improves as the size of the overlapping regions increases.
ADS-like properties have been recently explored in the literature. Xu et. al. recently showed that an ADS condition (in the primal space) holds under a time-varying and inequality-constrained linear quadratic (LQ) control setting [15, 16]. To prove this, the authors assumed uniformly complete controllability and exploited the algebraic structure of the Riccati equation. The authors used the primal ADS property to show that trajectories of an overlapping temporal decomposition scheme approximate those of the long-horizon problem and that the approximation error decays as the size of the overlap increases. The authors also showed that receding horizon control provides approximate trajectories and that the error converges as the size of the overlapping regions increase. These works do not provide an algorithmic scheme that delivers optimal trajectories (for a given size of the overlapping region), as we do in this work. Shin et. al. recently established a primal ADS condition for general graph-structured, unconstrained quadratic programs and showed that this condition guarantees the convergence of an iterative overlapping decomposition algorithm [14]. The condition established in this work is specialized to constrained OCPs and operates in the primal-dual space. A primal ADS property has also been established in [17, Lemma 5]. This condition is exploited by the authors to establish the stability of economic MPC.
The paper is organized as follows. In Section II, we present the basic setting and describe the proposed decomposition scheme. In Section III, we propose a primal-dual ADS condition that guarantees convergence of the algorithm. In Section IV, we show that ADS holds for a simplified LQ setting. In Section V, we demonstrate the proposed scheme using a nonlinear economic MPC problem.
II Basic Definitions and Setting
We consider an OCP with a time domain set , an initial state , and a terminal cost gradient of the form:
[TABLE]
We denote this problem as . Here, and are the state and input variables at time ; and are the dual variables associated with (1b)-(1c) and (1d), respectively. Symbol is the dynamic mapping, is the stage cost mapping, and is the constraint mapping. We denote the set of real numbers and the set of integers as and , respectively, and we define . We use the syntax and . We denote primal-dual pairs as .
Remark 1
From the KKT conditions of (1), it follows that . Therefore, incorporating the terminal penalty term in the objective function as (1a) essentially constitute the terminal constraint of the dual variable at .
We now describe the proposed decomposition scheme for solving . We partition the time domain into a collection of non-overlapping (i.e., disjoint) sets of the form . We also define a collection of overlapping sets satisfying
[TABLE]
for any . We call the size of the overlap. The non-overlapping and overlapping time domains with are illustrated in Figure 1.
The scheme starts with an initial guess of the primal and dual trajectories with and . For each subdomain index and iteration counter , subproblems are solved in parallel. Each subproblem requires current primal-dual information from the neighboring subproblems (this information enters in the initial condition and terminal penalty). The solution of subproblem yields the local primal-dual trajectories . These trajectories are restricted to the non-overlapping subdomains by using the restriction . The solutions at are discarded. This restriction procedure over can be seen as a projection that assembles the entire primal-dual trajectory that is in turn used as the next guess in the algorithm.
The trajectory assembling procedure is sketched in Figure 1. The proposed decomposition scheme is summarized in Algorithm 1 and can be stated compactly as follows:
[TABLE]
III Convergence Results
We now provide a sufficient condition for (1) that guarantees convergence of the decomposition algorithm (2). We begin by making the following existence and uniqueness assumption on solutions of problem (1).
Assumption 1** (Existence and uniqueness of solution)**
There exist sets and such that, for any with and , there exists a unique primal-dual solution of with and for any .
Remark 2
The solution of always satisfies (by (1b)) and (see Remark 1), and thus we can write the solution of as . Because the scheme (2) is mostly concerned with the state and the adjoints (this is the information exchanged between domains), we will not explicitly indicate the controls and multipliers in the nomenclature.
We now state the following principle of optimality result:
Lemma 1
Assume that problem (1) satisfies Assumption 1 and consider with and . Let be the solution of , then is the solution of .
Proof:
We define the Lagrangian of as:
[TABLE]
From Assumption 1, the solution of exists and is unique and this implies that the KKT conditions hold at , , , ; that is,
[TABLE]
holds for any , where , , , .
From (4), we observe that the KKT conditions of with imply that the KKT conditions of hold with . This consistency is the result of using the duals as a terminal penalty in (1).
By Assumption 1, we have that . This implies that the solution of exists and is unique, and thus the KKT point is the unique solution. Thus, , , , and form the solutions of . ∎
Lemma 1 justifies the structure of (1); specifically, primal and dual information of the neighboring subproblems should be incorporated as initial states and terminal penalties, respectively. Furthermore, it implies that the primal-dual trajectory of the subproblems can be assembled to obtain the optimal primal-dual trajectory of the entire problem if the boundary conditions are set to the optimal values. From this we also see that a receding-horizon control scheme delivers the solution of the long-horizon problem if the terminal cost gradients are obtained from the dual solution of the long-horizon problem. Recently the effect of terminal cost gradient on the solution trajectory of optimal control problems and its connection with dual variables has been investigated in [17], [18]. In particular, it is shown that economic MPC can achieve asymptotic stability without terminal constraints by incorporating the dual of the underlying steady-state problem as a terminal cost gradient. Furthermore, it is shown that LQ economic MPC can be stabilized by adaptively tuning the terminal cost gradient. Such observations align with our observations of Lemma 1. Specifically, the quality of the solutions of OCPs (which in general improves with the horizon length) can also be improved by choosing a suitable terminal cost gradient.
We now state our sufficiency condition for convergence, that we call asymptotic decay of sensitivity (ADS).
Property 1** (Asymptotic decay of sensitivity)**
Assume that (1) satisfies Assumption 1. For given , with and , , let and be the solutions of and , respectively. There exist with as such that:
[TABLE]
holds for any .
Property 1 implies that the solution of at time becomes less sensitive to perturbations in the initial state and the terminal cost gradient as the time index moves away from the boundary.
Remark 3
It is important that the sequence is a uniform parameter that does not depend on and and that does not depend on the choice of the boundary conditions (i.e., and ). This enables a uniform bound that holds for different subproblems.
Remark 4** (Relation with turnpike properties)**
Property 1 is related but not equivalent to so-called turnpike properties of OCPs (see [19, 20]). Property 1 establishes a relationship between two solution trajectories, while a classical turnpike property compares a single solution trajectory with the steady-state solution. In addition, Property 1 requires asymptotic convergence of the difference between the solutions, while the steady-state turnpike requires a bound on the number of time indexes with . An in-depth investigation of the relation between Property 1 and time-varying turnpikes is subject to future work.
We now prove that the ADS property provides a sufficient condition guaranteeing convergence of the proposed decomposition scheme.
Theorem 1** (Convergence)**
Assume that (1) satisfies Assumption 1 and that Property 1 holds. Furthermore, consider scheme (2) with overlap and let be the solution of . We have that:
[TABLE]
Proof:
From Lemma 1 we have that the solution of on can be obtained from . For each , applying Property 1 to and yields
[TABLE]
for any and . For , we have that and yield
[TABLE]
From (7), we have that
[TABLE]
Equation (8) establishes (5). ∎
Theorem 1 indicates that the recursion (2) converges to the solution of the full problem if the size of the overlap is sufficiently large. Furthermore, the convergence rate converges asymptotically to zero with the size of the overlap (i.e., with a maximal overlap, the iteration converges in one iteration). This reveals a powerful feature of the proposed scheme: one can control the convergence rate by choosing the size of the overlap . However, as we increase , we also increase the complexity of the subproblem and thus a trade-off exists. Accordingly, the selection of a suitable should consider the convergence rate and the subproblem complexity.
We derive termination criteria for the parallel scheme based on the violation of the KKT conditions. We have that (4d) holds at each iteration. We also have that (4a)-(4c) hold except for , which are the boundary indices. For those, the residuals can be evaluated as follows.
[TABLE]
Residuals (9) can be derived by using the fact that (4a)-(4c) hold when and are replaced with and . Finally, we define the primal-dual residuals:
[TABLE]
and establish the termination criteria:
[TABLE]
IV Linear Quadratic Problems
In this section we show that Property 1 holds for problems with linear and controllable dynamics, convex quadratic objectives, no inequalities, and a single-variable input. Specifically, we make the following assumption. The analysis reveals connections with well-known optimization sensitivity results.
Assumption 2
Consider problem (1) with:
* with and * 2. 2.
* with controllable.* 3. 3.
There are no inequality constraints.
Assumption 2 guarantees that Assumption 1 holds with , . Now we state the main theorem of this section.
Theorem 2
Property 1 holds for OCPs (1) satisfying Assumption 2.
Proof:
Without loss of generality we assume that . For any , , , the solution of exists and is unique and can be obtained from
[TABLE]
where we define
[TABLE]
Since is controllable, the controllability matrix has full row rank. This implies that has a unique solution. Using this fact, we can construct and with: for any , , , and . We construct as:
[TABLE]
One can show that and that has full column rank by using the lower-triangular structure of . Observe that has full row rank. By the fundamental theorem of linear algebra, the null space of has dimension ; consequently, columns of span the null space of . Similarly, there exists a unique solution of for any where is the th standard unit vector of . Using this observation, we construct and with , , and for any . Now consider
[TABLE]
and observe that holds. We now apply a null-space projection to (12) using and . This way we obtain the equivalent unconstrained QP
[TABLE]
The solution of (15) is given by
[TABLE]
We have that and . Using (16), we can write
[TABLE]
where and . Since and are independent of the choice of , we obtain the solution of the form (17) with different boundary conditions . Thus, we have that
[TABLE]
where and . Considering
[TABLE]
and we can see that , , , , , , and . Quantities defined in (19) are all uniform (i.e., does not depend on the length of problem and boundary conditions).
The bandwidth of a matrix is defined as the smallest integer such that for any . Note that the bandwidth of is not greater than . The following is a modification of [14, Corollary 1].
Proposition 1
Consider positive definite . Suppose that for some . Then the following holds.
[TABLE]
Proof:
The proof is given in [14]. ∎
The following lemma establishes that there exist uniform upper and lower bounds for the eigenvalues of .
Lemma 2
Consider (1), such that Assumption 2 and let be from (13). Then there exist uniform parameters (independent of the choice ) such the eigenvalues of satisfy .
Proof:
The upper bound comes from
[TABLE]
and thus we define . We can find from the lower bound of for with . We have that
[TABLE]
where for for convenience and
[TABLE]
Now we show that are linearly independent. To establish a contradiction, suppose that are linearly dependent; then there exist not all-zero such that holds. This implies that
[TABLE]
By rearranging, we obtain
[TABLE]
Since and are not all-zero, a non-zero coefficient exists in (21), and thus are linearly dependent. This contradicts the assumption that is controllable. Consequently, are linearly independent.
From the linear independence of , one can show that has full column rank and thus is positive definite. From (20) one can show that
[TABLE]
Observe that is independent of the length of the problem. Thus, we can set uniform parameter . ∎
By Proposition 1 and Lemma 2, we have that
[TABLE]
where and are the uniform upper and lower bound of the eigenvalues of established in Lemma 2.
Finally, inspecting (18) we observe that and are sparse. In particular, there exists uniform parameter such that the following holds for .
[TABLE]
where and (these index sets correspond to the index of and among the entries of and , respectively). Using the sparsity structure identified in (24), one can show the following quantities are less than or equal to
[TABLE]
where we use the syntax and define if for convenience. From (18) and (25), we have
[TABLE]
By (23) and , we have that
[TABLE]
for and . We define
[TABLE]
Note that is a uniform parameter and (27)-(28) establish Property (1). This concludes the proof. ∎
Theorem 2 implies that the reduced Hessian is positive definite–a key requirement in optimization sensitivity results as the solution must be locally unique and bounded perturbations yield bounded differences in solutions. We thus expect that Theorem 2 can be generalized to LQ OCPs with multiple inputs, time-variant objectives and dynamics, and inequality constraints by exploiting algebraic properties. Here, we focus on a simple setting due to space limitations and to keep the presentation clear. For an even more general setting (nonlinear, inequality-constrained) setting, one can seek to validate the sensitivity property numerically by using simulations. We show how to do this in the next section.
V Numerical Examples
We use a nonlinear OCP to illustrate that Property 1 guarantees convergence of the decomposition algorithm and to highlight that the approach achieves faster solutions than off-the-shelf solvers. We consider a nonlinear economic MPC problem for a chemical reactor [20, 21]. The dynamics of the reactor are given by:
[TABLE]
The control step length is sec and the differential equations are discretized using an implicit Euler scheme with a step length of sec. We implemented a parallel version of the scheme in Julia. Problems were formulated in the modeling language JuMP [22], and were solved with the nonlinear programming solver IPOPT [23]. The scheme was executed on an Intel Xeon CPU E5-2698 v3 processor running at 2.30GHz.
We use numerical simulations to verify that Property 1 holds. Here, we assess the sensitivity of the primal-dual trajectories against perturbations to the initial state and the terminal cost gradient. The reference problem is formulated with (i.e., mins) and boundary conditions with . The boundary conditions are perturbed as , where is sampled from a normal random variable. It is known that the OCP with (29) has a periodic optimal solution when and a steady-state solution with [20]. The solutions of the reference (unperturbed) problem and 30 samples of perturbed problems are shown in Figure 2 () and Figure 3 (). When we see that, for both primal and dual solutions, the distances from the reference trajectories are small in the center of the time domain and grow as we approach the boundary. On the other hand, when , such convergence is not observed. This indicates that Property 1 holds for but does not hold for . This implies that the ADS property strongly depends on the stage cost . For high-dimensional systems where it is difficult to graphically assess the ADS property, one may consider assessing the error trajectory based on the norm of deviation from the reference trajectory.
We next verify that the decomposition scheme converges and that the convergence rate improves with the size of the overlap. A problem with (i.e., mins) was decomposed into subdomains and solved with , , and and . Figure 4 (left) shows the evolution of the residuals and confirms that the rate improves with the overlap. The algorithm did not converge when . These observations reinforce the key role of Property 1.
We also explore the computational efficiency achieved via parallel decomposition. A problem with was solved using IPOPT and we compared the solution time against that of the decomposition scheme with an overlap of and . The solution found by the decomposition scheme was equal to the solution from IPOPT. Figure 4 (right) clearly illustrates that the solution time decreases as the number of computing cores increases (this also increases the number of subdomains). We can see that, if a sufficient number of cores are used, the proposed scheme becomes faster than IPOPT.
VI Conclusions
We have presented a temporal decomposition scheme for long-horizon OCPs that solves subproblems on overlapping time domains. We have proposed an asymptotic sensitivity decay property that guarantees convergence of the algorithm. This property indicates that the sensitivity of the primal and dual trajectories to perturbations on the boundary conditions (initial state and terminal cost gradient) decays asymptotically as one move away from the boundaries. This property also indicates that the scheme converges if the overlap is sufficiently large and the convergence rate improves with the size of the overlap. We have demonstrated that the solution time of long-horizon OCPs can be improved with the proposed decomposition method.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] M. Baldea and P. Daoutidis, “Control of integrated process networks-a multi-time scale perspective,” Computers & chemical engineering , vol. 31, no. 5-6, pp. 426–444, 2007.
- 2[2] P. Falcone, F. Borrelli, J. Asgari, H. E. Tseng, and D. Hrovat, “Predictive active steering control for autonomous vehicle systems,” IEEE Transactions on control systems technology , vol. 15, no. 3, pp. 566–580, 2007.
- 3[3] R. Kumar, M. J. Wenzel, M. J. Ellis, M. N. El Bsat, K. H. Drees, and V. M. Zavala, “Hierarchical mpc schemes for periodic systems using stochastic programming,” ar Xiv preprint ar Xiv:1804.10866 , 2018.
- 4[4] J. R. Jackson and I. E. Grossmann, “Temporal decomposition scheme for nonlinear multisite production planning and distribution models,” Industrial & engineering chemistry research , vol. 42, no. 13, pp. 3045–3055, 2003.
- 5[5] C. Barrows, M. Hummon, W. Jones, and E. Hale, “Time domain partitioning of electricity production cost simulations,” National Renewable Energy Lab.(NREL), Golden, CO (United States), Tech. Rep., 2014.
- 6[6] P. Giselsson, M. D. Doan, T. Keviczky, B. De Schutter, and A. Rantzer, “Accelerated gradient methods and dual decomposition in distributed model predictive control,” Automatica , vol. 49, no. 3, pp. 829–833, 2013.
- 7[7] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, et al. , “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends® in Machine learning , vol. 3, no. 1, pp. 1–122, 2011.
- 8[8] A. M. Geoffrion, “Generalized benders decomposition,” Journal of optimization theory and applications , vol. 10, no. 4, pp. 237–260, 1972.
