Sparsity Invariance for Convex Design of Distributed Controllers
Luca Furieri, Yang Zheng, Antonis Papachristodoulou, Maryam Kamgarpour

TL;DR
This paper introduces a novel convex framework called Sparsity Invariance (SI) for designing optimal distributed LTI controllers with sparsity constraints, extending beyond quadratic invariance and ensuring global optimality in many cases.
Contribution
The paper develops the concept of Sparsity Invariance (SI), enabling convex design of distributed controllers that surpasses quadratic invariance limitations and guarantees global optimality when applicable.
Findings
SI always produces convex restrictions for sparsity-constrained control design.
SI guarantees global optimality when quadratic invariance holds.
Numerical examples demonstrate SI's superior performance and optimality in non-QI cases.
Abstract
We address the problem of designing optimal linear time-invariant (LTI) sparse controllers for LTI systems, which corresponds to minimizing a norm of the closed-loop system subject to sparsity constraints on the controller structure. This problem is NP-hard in general and motivates the development of tractable approximations. We characterize a class of convex restrictions based on a new notion of Sparsity Invariance (SI). The underlying idea of SI is to design sparsity patterns for transfer matrices Y(s) and X(s) such that any corresponding controller K(s)=Y(s)X(s)^-1 exhibits the desired sparsity pattern. For sparsity constraints, the approach of SI goes beyond the notion of Quadratic Invariance (QI): 1) the SI approach always yields a convex restriction; 2) the solution via the SI approach is guaranteed to be globally optimal when QI holds and performs at least as well as considering…
Luca Furieri, Yang Zheng, Antonis Papachristodoulou, and Maryam Kamgarpour
This research was gratefully funded by the European Union ERC Starting Grant CONENE. Antonis Papachristodoulou was supported in part by the EPSRC project EP/M002454/1. Luca Furieri and Maryam Kamgarpour are with the Automatic Control Laboratory, Department of Information Technology and Electrical Engineering, ETH Zürich, Switzerland. E-mails: {furieril, mkamgar}@control.ee.ethz.ch. Yang Zheng is with the School of Engineering and Applied Sciences, Harvard Center for Green Buildings and Cities, Harvard University. E-mail: [email protected]. Antonis Papachristodoulou is with the Department of Engineering Science, University of Oxford, United Kingdom. E-mail: [email protected].
Abstract
We address the problem of designing optimal linear time-invariant (LTI) sparse controllers for LTI systems, which corresponds to minimizing a norm of the closed-loop system subject to sparsity constraints on the controller structure. This problem is NP-hard in general and motivates the development of tractable approximations. We characterize a class of convex restrictions based on a new notion of Sparsity Invariance (SI). The underlying idea of SI is to design sparsity patterns for transfer matrices $Y(s)$ and $X(s)$ such that any corresponding controller $K(s) = Y(s)X(s)^{-1}$ exhibits the desired sparsity pattern. For sparsity constraints, the approach of SI goes beyond the notion of Quadratic Invariance (QI): 1) the SI approach always yields a convex restriction; 2) the solution via the SI approach is guaranteed to be globally optimal when QI holds and performs at least as well as considering a nearest QI subset. Moreover, the notion of SI naturally applies to designing structured static controllers, while QI cannot be used in that setting. Numerical examples show that even for non-QI cases, SI can recover solutions that are 1) globally optimal and 2) strictly better performing than those of previous methods.
1 Introduction
The safe and efficient operation of several large-scale systems, such as the smart grid [1], biological networks [2], and automated highways [3], relies on the decision making of multiple interacting agents. Coordinating the decisions of these agents is challenged by a lack of complete information of the systems’ internal variables. Such limited information arises due to privacy concerns, geographic distance or the challenges of implementing a reliable communication network.
The celebrated work [4] highlighted that lacking full information can enormously complicate the design of optimal control inputs. Indeed, the optimal feedback control policies may not even be linear for the Linear Quadratic Gaussian (LQG) control problem without full output information. The intractability inherent to lack of full information was investigated in the works [5, 6]. The core challenges discussed therein motivated identifying special cases of optimal control problems with partial information for which efficient algorithms can be used.
Optimally controlling a linear time-invariant (LTI) system with distributed sensor measurements amounts to computing a linear controller that has a desired sparsity pattern and minimizes a norm of the closed-loop system. For this generally intractable problem, the notion of Quadratic Invariance (QI) was shown to be sufficient [7] and necessary [8] for an exact convex reformulation. A related problem of sensor-actuator architecture co-design was addressed in [9, 10] by exploiting QI and using sparsity-inducing norm penalties.
1.1 Previous work on non-QI cases
Given the importance and intricacy of computing optimal distributed controllers, a variety of approximation methods have been proposed for general systems and information structures that are not QI. For example, the authors in [11] developed semidefinite programs that are relaxations of this generally NP-hard problem. However, these relaxations might fail to recover a sparse controller that is stabilizing, as confirmed experimentally in [12]. To address this issue, polynomial optimization has been used in [12] to obtain a sequence of convex relaxations which converges to a stabilizing distributed controller. Nevertheless, performance of the recovered solution is not directly addressed in [12]. For the finite-horizon control problem, the authors in [13] derived convex upper bounds to the non-convex cost function to obtain conservative feasible solutions. However, the theoretical sub-optimality bounds were shown to be loose. Alternatively, the system level approach [14] proposed an implementation where controllers are required to share locally estimated disturbances in the state-feedback case and internal controller states in the output-feedback case. We note that classical distributed control only requires subsystems to share output measurements, not intermediate computations. The need to share this additional information in [14] might raise concerns of system security and vulnerability in safety-critical applications [15], where each subsystem can only rely on its own sensor measurements.
A different approach to sparse output-feedback controller synthesis is to develop a convex restriction: the unstructured problem is reformulated as an equivalent convex program and convex constraints are added to guarantee the desired sparsity pattern of the recovered controllers. Convex restrictions exhibit specific advantages: 1) their optimal solutions can be readily computed with standard convex optimization techniques, and 2) all their feasible solutions are structured and stabilizing by design. A disadvantage is that a restriction may be infeasible even when the original problem is feasible. This motivates developing convex restrictions that are as tight as possible for improved feasibility and performance. In the literature, convex restrictions have mostly been developed for the special case of computing static controllers [16, 17, 18]. Within this setting, the problem of optimal sensor and actuator selection was addressed in [19, 20] with an ADMM approach. For the general case of dynamic controllers given non-QI information structures, the work [21] suggested restricting the desired sparsity pattern to a subset that is QI to obtain upper bounds on the minimum cost. However, to the best of the authors’ knowledge, a method for convex restrictions that can outperform [21] and goes beyond the notion of QI for sparsity constraints is not known.
1.2 Contributions
This paper proposes a generalized framework for the convex design of optimal and near-optimal LTI dynamic output-feedback controllers with a pre-determined sparsity pattern. Our underlying idea is to identify appropriate sparsity patterns for two transfer matrices $Y(s)$ and $X(s)$ such that any corresponding feedback controller in the form $K(s) = Y(s)X(s)^{-1}$ exhibits the desired structure. This fundamental property is denoted as Sparsity Invariance (SI).
Our first contribution is to develop algebraic conditions on the binary matrices associated with the sparsities of $Y(s)$ and $X(s)$ that are necessary and sufficient for SI. Among all such sparsities, we suggest a polynomial-time algorithm to design sparsities that lead to better performance for the distributed control problem at hand. Second, we show that the SI notion steps beyond that of QI in several ways. Indeed, SI can be applied to general systems subject to arbitrary sparsity constraints, regardless of whether QI holds. Furthermore, SI recovers a controller that is provably globally optimal when QI holds and performs at least as well as that obtained by considering a nearest QI sparsity subset [21] when QI does not hold. Third, we provide examples to show that, even if QI does not hold, controllers obtained through the SI approach can be 1) globally optimal and 2) in general strictly better performing than those obtained using the nearest QI subset approach of [21]. Finally, we remark that the SI concept is applicable to distributed static controller design, as studied in our preliminary work [18], whereas the Youla parametrization and thus the QI notion cannot be used. For brevity, our theoretical discussion focuses on continuous-time systems, but our results also naturally hold for discrete-time systems with sparsity constraints, as we will discuss in the numerical results.
The rest of this paper is structured as follows. Section 2 states necessary background and presents the problem formulation. Section 3 introduces the class of convex restrictions under investigation and fully characterizes our notion of Sparsity Invariance (SI). We describe how SI can be utilized in an optimized way. In Section 4, we show that 1) SI encompasses the previous approaches based on the QI notion, and 2) that strictly better performing sparse controllers can be computed efficiently with the SI approach. We present numerical results in Section 5 and conclude the paper in Section 6.
2 Background and Problem Statement
Here, we first introduce some notation on sparsity structures and transfer functions. Then, we state the problem of distributed optimal control, and introduce the necessary background on the Youla parametrization of internally stabilizing controllers.
2.1 Notation and sparsity structures
We use $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{N}$ to denote real numbers, complex numbers and positive integers, respectively. The $(i,j)$-th element of a matrix $Y \in \mathbb{R}^{m \times n}$ is referred to as $Y_{ij}$. We use $I_n$ to denote the identity matrix of size $n \times n$, $0_{m \times n}$ to denote the zero matrix of size $m \times n$ and $1_{m \times n}$ to denote the matrix of size $m \times n$ with all entries set to $1$.
Transfer functions: We denote the imaginary axis as $j\mathbb{R} := \{z \in \mathbb{C} \mid \Re(z) = 0\}$ and consider continuous-time transfer functions $g : j\mathbb{R} \to \mathbb{C}$. An $m \times n$ transfer matrix is an $m \times n$ matrix whose entries are transfer functions. We denote the set of $m \times n$ causal transfer matrices as $\mathcal{R}_c^{m \times n}$. A transfer function is called proper (resp. strictly proper) if it is rational and the degree of the numerator polynomial does not exceed (resp. is strictly lower than) the degree of the denominator polynomial. Similar to [7], we denote by $\mathcal{R}_{sp}^{m \times n}$ the set of $m \times n$ strictly proper transfer matrices. Finally, we let $\mathcal{RH}_\infty^{m \times n}$ be the set of $m \times n$ causal and stable transfer matrices.
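Sparsity structures of transfer matrices can be conveniently represented by binary matrices. A binary matrix is a matrix with entries from the set $\{0,1\}$, and we use $\{0,1\}^{m \times n}$ to denote the set of $m \times n$ binary matrices. Given a binary matrix $X \in \{0,1\}^{m \times n}$, we define the associated sparsity subspace of causal transfer matrices as
$$\text{Sparse}(X) := \{ Y \in \mathcal{R}_c^{m \times n} \mid Y_{ij}(j\omega) = 0 \text{ for all } i,j \text{ such that } X_{ij} = 0, \text{ for almost all } \omega \in \mathbb{R} \}.$$
Similarly, given a transfer matrix $Y \in \mathcal{R}_c^{m \times n}$, we define $\text{Struct}(Y)$ as the binary matrix given by
$$\text{Struct}(Y)_{ij} := \begin{cases} 0 & \text{if } Y_{ij}(j\omega) = 0 \text{ for almost all } \omega \in \mathbb{R}, \\ 1 & \text{otherwise.} \end{cases}$$
We say that the transfer matrix $Y \in \mathcal{R}_c^{n \times n}$ is invertible if $Y(j\omega)$ is invertible for almost all $\omega \in \mathbb{R}$.
Let $X, \hat{X} \in \{0,1\}^{m \times n}$ and $Z \in \{0,1\}^{n \times p}$ be binary matrices. Throughout the paper, we adopt the following conventions: $X + \hat{X} := \text{Struct}(X + \hat{X})$, and $XZ := \text{Struct}(XZ)$. We say $X \leq \hat{X}$ if and only if $X_{ij} \leq \hat{X}_{ij}$ for all $i,j$, and $X < \hat{X}$ if and only if $X \leq \hat{X}$ and there exist indices $i,j$ such that $X_{ij} < \hat{X}_{ij}$. Also, we denote $X \nleq \hat{X}$ if and only if there exist indices $i,j$ such that $X_{ij} > \hat{X}_{ij}$. Given a binary matrix $X \in \{0,1\}^{m \times n}$ we denote its cardinality, i.e., the total number of nonzero entries, as
$$|X| := \sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij}.$$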
Considering the following binary matrices
[TABLE]
we have and . Their cardinalities are and , respectively. For the following transfer matrix,
[TABLE]
if we consider the binary matrix in the example above, we have and .
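The binary-matrix conventions of this subsection can be sketched in a few lines of Python. This is a minimal illustration, assuming that the sum and product of binary matrices are read as the sparsity structure of the ordinary sum and product; the helper names (`bin_sum`, `bin_prod`, `leq`, `card`) and the two small matrices are hypothetical and are not the matrices of the example above.

```python
# Binary matrices are stored as lists of 0/1 rows (a representation chosen for this sketch).

def bin_sum(X, Xh):
    """Structure of X + Xh: an entry is 1 if it is 1 in either operand."""
    return [[a | b for a, b in zip(rx, rh)] for rx, rh in zip(X, Xh)]

def bin_prod(X, Z):
    """Structure of X Z: entry (i, j) is 1 if X[i][k] = Z[k][j] = 1 for some k."""
    return [[int(any(X[i][k] and Z[k][j] for k in range(len(Z))))
             for j in range(len(Z[0]))] for i in range(len(X))]

def leq(X, Xh):
    """Entrywise comparison X <= Xh."""
    return all(a <= b for rx, rh in zip(X, Xh) for a, b in zip(rx, rh))

def card(X):
    """Cardinality: total number of nonzero entries."""
    return sum(map(sum, X))

# Hypothetical patterns used only for illustration.
X = [[1, 0], [1, 1]]
Xh = [[1, 1], [1, 1]]

print(leq(X, Xh), leq(Xh, X))
print(card(X), card(Xh))
print(bin_prod(X, X))
```

Storing patterns as plain 0/1 lists keeps the comparison and product rules explicit; an array library would serve equally well for larger patterns.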
2.2 Problem statement
We consider LTI systems in continuous-time
[TABLE]
where $x(t)$, $u(t)$, $y(t)$, $z(t)$, and $w(t)$ are the state, control input, observed output, a performance signal defined based on our control objectives, and additive disturbance at time $t$, respectively. The input-output transfer function representation for (1) can be written as
[TABLE]
with
[TABLE]
where belongs to . Notice that are proper transfer functions and is strictly proper.
Consider the interconnection of Figure 1.
A dynamic output-feedback controller with is said to be internally stabilizing if and only if the nine transfer matrices from to are stable. We denote the set of all causal LTI internally stabilizing output-feedback controllers as . We say that is stabilizable if and only if and any stabilizes . Furthermore, we say that a controller stabilizes if and only if the four transfer matrices from to are all stable. For the rest of the paper we make the following assumption.
Assumption 1: The system is stabilizable.
A test for stabilizability of is offered in [22, Chapter 4]. It is well-known [22, Chapter 4], [7] that under Assumption 1 a controller stabilizes if and only if it stabilizes . The control problem is to compute a dynamic output-feedback controller which minimizes a given norm of
[TABLE]
which is the closed-loop transfer function from to .
In distributed control, it is common to add the requirement that $K$ only uses partial output measurements. This requirement can be captured by adding the constraint $K \in \text{Sparse}(S)$ for a given binary matrix $S$, where $S_{ij} = 0$ encodes the fact that the $i$-th scalar control input cannot measure the $j$-th measurement output. We formulate this distributed, sparsity-constrained control problem as follows [7]:
[TABLE]
where is any norm of interest. It was shown that a necessary and sufficient condition for a feasible solution to to exist is that all the distributed fixed modes associated with lie in the left half of the complex plane [23]. Even if is feasible, directly computing its optimal solution is intractable because the set is non-convex in general. This can be easily verified by checking that when , the controller does not lie in in general. Furthermore, the cost function is non-convex in .
2.3 The Youla parametrization of stabilizing controllers
The first step to convexify problem is to derive a convex formulation of the set and the function . This is achieved by using a doubly coprime factorization of .
Lemma 1 (Chapter 4 of [22])
For any , there exist eight proper and stable transfer matrices defining a doubly coprime factorization of , that is, they satisfy
[TABLE]
Then, the Youla parametrization of all internally stabilizing controllers [24] establishes the following equivalence [22, Chapter 4]:
[TABLE]
Furthermore, it was proved in [22, Chapter 4] that the set of all closed-loop transfer functions from to achievable by is
[TABLE]
where is defined in (2) and , and . To facilitate our problem formulation, we define
[TABLE]
It directly follows from (4) that
[TABLE]
We notice that (3) implies and (5) implies . Hence, we have
[TABLE]
Now we can equivalently reformulate into the following optimization problem.
[TABLE]
Without the sparsity constraint , problem would be convex, as (5), (6) and the cost function are affine in . The primary source of non-convexity is the requirement that . We conclude that the complexity of distributed control is ultimately linked to the non-convex sparsity requirement on the Youla parameter.
3 Sparsity Invariance
One approach to remove the non-convex sparsity requirement on the Youla parameter is as follows: replace with the convex constraint that and comply with appropriate sparsity patterns, in a way such that is guaranteed to lie in . In other words, we restrict our attention to distributed sparse controllers defined as the product of two structured matrix factors. We note that related ideas appeared for the specific case of row-column sparsities (e.g. [10, 20]), but the case of arbitrary sparsities was not addressed.
Following the general idea above, in this paper we investigate a notion of Sparsity Invariance (SI) for convex design of sparse controllers. As will be thoroughly discussed in Section 4, SI leads to the largest known class of convex restrictions of for general systems subject to sparsity constraints on the controller.
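Definition 1 (Sparsity Invariance (SI))
Given a binary matrix $S$, the pair of binary matrices $(T, R)$ satisfies a property of sparsity invariance (SI) with respect to $S$ if
$$Y \in \text{Sparse}(T) \text{ and } X \in \text{Sparse}(R) \;\Rightarrow\; Y X^{-1} \in \text{Sparse}(S), \quad \text{for all invertible } X.$$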
Motivated by the SI property, consider the following convex problem:
[TABLE]
where and , with invertible, are parameters to be designed before performing the optimization. For simplicity, one could select , but we illustrate in Example 1 of Section 4 that there are cases where a different choice of might lead to improved and even globally-optimal performance for non-QI problems. For any choice of and , the above program is convex. One fundamental question is when its feasible solutions lead to stabilizing controllers lying in the desired sparsity subspace . The notion of SI (1) defined above is a mathematical expression of this requirement.
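The SI requirement can be probed numerically on constant matrices, which stand in here for the value of a transfer matrix at a single frequency. The sketch below is only an illustration under stated assumptions: the patterns `S`, `T`, `R` are hypothetical, the helper names are not from the paper, and exact rational arithmetic is used so that structural zeros are exact rather than small floating-point numbers.

```python
from fractions import Fraction
import random

def inverse(X):
    """Gauss-Jordan inverse over exact fractions (StopIteration if X is singular)."""
    n = len(X)
    A = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # pivot row
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[n:] for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def random_with_pattern(P, rng):
    """Random matrix whose entries are nonzero exactly where P is 1."""
    return [[Fraction(rng.randint(1, 9)) if p else Fraction(0) for p in row]
            for row in P]

def in_sparse(M, P):
    """True if M is zero wherever the binary pattern P is zero."""
    return all(v == 0 for rm, rp in zip(M, P) for v, p in zip(rm, rp) if p == 0)

# Hypothetical patterns: S is the desired controller pattern, T = S,
# and R is a pattern for the second factor that keeps the product inside S.
S = T = [[1, 1, 0],
         [0, 1, 0]]
R = [[1, 1, 0],
     [0, 1, 0],
     [1, 1, 1]]

rng = random.Random(0)
Y = random_with_pattern(T, rng)   # plays the role of the first factor
X = random_with_pattern(R, rng)   # plays the role of the second (invertible) factor
K = matmul(Y, inverse(X))

print(in_sparse(K, S))
```

A single random draw does not prove the property; it only shows that, for this pair of patterns, the product of one structured factor with the inverse of the other stays inside the desired pattern.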
In the next subsection we establish necessary and sufficient conditions on the binary matrices and to satisfy the SI property (1).
Remark 1
Note that the notion of SI is an algebraic requirement for binary matrices and , given a binary matrix . This is independent of the parameterization of internally stabilizing controllers. In addition to the Youla parameterization, we recently observed that the SI idea (1) is equivalently applicable within the system-level [14] (SLP) and input-output [25] (IOP) parameterizations, in both continuous- and discrete-time. We refer to [26, Remark 4] for details. For brevity, in this paper we will develop our theoretical results within the Youla parameterization, and note that they can be straightforwardly applied to the SLP and the IOP.
Remark 2
We assume that . Since and is strictly proper, the assumption is without loss of generality for . For convenience, in the definition of problem we do not indicate explicitly as a parameter. This is because the SI property (1) only depends on the binary matrices and .
3.1 Characterization of SI
One immediate idea in designing the binary matrices and to guarantee is to simply select and similar to [16, 27, 17]. However, many other choices are available that lead to improved convex restrictions.
The next Theorem provides a full characterization of the SI property (1) in terms of the binary matrices and .
Theorem 1
Let and be such that . The following two statements are equivalent:
1. and .
2. SI as per (1) holds.
The proof of Theorem 1 is reported in Appendix A.1. The relevance of Theorem 1 to characterizing a class of convex restrictions of is stated in the following Corollary.
Corollary 1
Let and be such that , and . Then, problem is a convex restriction of for any invertible transfer matrix .
Proof
Problem is obviously convex. We only need to show that any solution to corresponds to a feasible solution of .
First, given any invertible we have
[TABLE]
Let and in (1). Since (1) holds by Theorem 1, by definition and thus every solution of is a solution of .
Second, since is equivalent to , we conclude that is a restriction of for every invertible .
Finally, since and we have that by transitive closure of the graph having as its adjacency matrix. Hence, is a convex restriction of for every invertible .
In summary, the algebraic conditions
[TABLE]
are equivalent to SI and yield a class of convex restrictions of . Clearly, our condition (10) includes the choice and is (block)-diagonal as per [27, 17, 16]. We will further show in Section 4 that the convex restrictions developed in [21] are a particular case of (10). Therefore, our notion of SI naturally encompasses and extends previous convex restrictions of .
Remark 3
For each and as per (10), it is always preferable to solve the convex restriction instead of . Indeed, notice that since and , then . Equivalently, when and satisfy sparsity invariance (10), so do and , and both and are convex restrictions of . Since requiring for some may be conservative in the case , we will focus on the convex restriction to avoid this possibility.
After determining all the matrices and for sparsity invariance, a natural follow-up question arises: how can we choose and as per Theorem 1 to obtain a convex restriction of that is as tight as possible?
3.2 Optimized design of SI
Here, we study how to choose the binary matrices and optimally for a fixed invertible .
In order to determine the best performing choice for and satisfying (10), one would need in general to solve with the chosen for each and such that (10) holds, and then select the problem minimizing the objective . Clearly, this approach is not tractable in general, as one needs to solve a large number of convex programs that is exponential in and , that is, one convex program for each binary matrices and such that . Even if we simplify the search above by fixing any and looking for the best performing choice of , we would still need to solve a large number of convex programs that is exponential in , that is, one convex program for each binary matrix such that . To deal with the above challenges, here we suggest a suboptimal, but computationally efficient algorithm that generates a locally optimized binary matrix tailored to any chosen .
Specifically, our proposed approach is to select and then compute that binary matrix which is the least sparse among those satisfying
[TABLE]
Clearly, both and above are simplifications of the general problem of finding the globally tightest convex restriction of for a fixed invertible ; indeed, we do not optimize over and we impose (11), a condition stronger than the SI requirement (10). The gain is that is unique and can be computed efficiently as per Algorithm 1, which has a polynomial complexity of .
The idea behind Algorithm 1 is to only set an entry of to 0 if the condition would be violated. We now formalize the main result about .
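The idea just described can be sketched in Python as follows. This is a hedged reading of Algorithm 1 rather than a transcription of it: `T` denotes the binary pattern chosen for the first factor and `R` the matrix being generated (symbols assumed for this sketch), the function names are hypothetical, and the three nested loops give a cost of order $mn^2$ for an $m \times n$ pattern.

```python
def generate_R(T):
    """Least sparse n x n binary matrix R such that the binary product T R stays within T.

    Start from the all-ones matrix and zero out R[j][k] only when keeping it
    would create a nonzero in position (i, k) of T R where T[i][k] is zero.
    """
    m, n = len(T), len(T[0])
    R = [[1] * n for _ in range(n)]
    for i in range(m):
        for k in range(n):
            if T[i][k] == 0:
                for j in range(n):
                    if T[i][j] == 1:
                        R[j][k] = 0
    return R

def bin_prod(X, Z):
    """Structure of the product of two binary matrices."""
    return [[int(any(X[i][k] and Z[k][j] for k in range(len(Z))))
             for j in range(len(Z[0]))] for i in range(len(X))]

def leq(X, Xh):
    return all(a <= b for rx, rh in zip(X, Xh) for a, b in zip(rx, rh))

# Hypothetical pattern used only for illustration.
T = [[1, 1, 0],
     [0, 1, 0]]
R = generate_R(T)

for row in R:
    print(row)
print(leq(bin_prod(T, R), T))
```

The diagonal of the result is never zeroed, because an entry `R[j][j]` would only be removed if some row of `T` had both a one and a zero in column `j`. Every off-diagonal entry that is removed is forced by the product condition, which is why the result is the least sparse admissible choice.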
Theorem 2
Consider a binary matrix , and define . Then,
1. There exists a unique such that
2. Such can be computed via Algorithm 1.
Proof
Let be the unique binary matrix generated by Algorithm 1. It is easy to check that by construction. Since , it follows and . We conclude .
Next, consider any binary matrix . By definition, we have that and so whenever and . Then, since is set to $0$ by Algorithm 1 if and only if and . Therefore, we have , .
The next corollary connects our result to characterizing tight convex restrictions of .
Corollary 2
Given a binary matrix , compute as per Algorithm 1. Then, for every fixed invertible , is the tightest convex restriction of among those in the form with .
Proof
Fix an invertible and consider the problems and , where and is generated by Algorithm 1. By Theorem 2, we have , meaning that .
The only difference between problem and problem is: requires while requires . Therefore, we conclude that admits the largest feasible region among all with . This completes our proof.
Our suggested procedure can find a tight convex restriction for by using the computationally efficient Algorithm 1, which makes the approach practical. However, optimally choosing and is also a non-trivial task which we leave for future work. We remark that in the absence of any further insight, one can always choose and and still obtain sparse controllers with tight sub-optimality gaps, as will be shown experimentally in Section 5. Furthermore, as shown in Section 4, the trivial choice and combined with Algorithm 1 for choosing is sufficient to recover and extend the optimality results of [7], [21] which are based on the Quadratic Invariance (QI) notion. We conclude this section by providing an example to illustrate the SI approach.
Example 1
Motivated by the numerical example in [7], let us consider the unstable plant
[TABLE]
with , in continuous-time or , in discrete-time, and define
[TABLE]
Our goal is to design a stabilizing controller which minimizes and satisfies the sparsity pattern below:
[TABLE]
This information structure is depicted in Figure 2.
Here, we apply the proposed SI approach and Algorithm 1 for sparsity design in order to obtain a convex restriction of . For this instance, we choose to fix and . According to Theorem 2 and Corollary 2, the tightest convex restriction of such that is , where
[TABLE]
is generated via Algorithm 1. Given a doubly coprime factorization of , any solution of is in the form , where , and .
Remark 4 (Performance improvement)
The classical immediate idea would be to require that is diagonal as per [16, 27, 17]; instead, SI allows the off-diagonal entries of to be non-zero through the optimized choice of , thus removing unnecessary constraints on the entries of . This additional freedom can be seen graphically on the right side of Figure 2; the information flow from outputs to control inputs remains the same as the one encoded by , but we allow for as many arrows as possible in the first stage from outputs to the rows of , thus maximizing the degrees of freedom in the optimization. In Section 5 we will numerically solve for this example and show that performance improvement over the method of [21] is obtained.
4 Beyond Quadratic Invariance
We start by recalling the well-known notion of Quadratic Invariance (QI) [7] in Subsection 4.1, and its application to the design of globally optimal [7] and sub-optimal [21] distributed dynamic output-feedback controllers in Subsection 4.2. In Subsections 4.3, 4.4 we show that the suggested SI notion strictly goes beyond that of QI for sparsity constraints: 1) the controllers obtained using the SI notion perform at least as well as those obtained by [7] and [21]; 2) we show through examples that using the SI notion we can recover globally optimal controllers even when QI does not hold, and that strict performance improvements over [21] can be obtained in general. Last, in Subsection 4.5, we discuss the applicability of SI to computing distributed static controllers, whereas the QI notion is not applicable.
4.1 Quadratic Invariance
The celebrated work of [7] characterized conditions on and under which admits an exact convex reformulation in the Youla parameter , denoted as quadratic invariance (QI).
Definition 2 (Quadratic invariance [7])
A subspace is QI with respect to if
[TABLE]
For the purpose of this paper, we will limit our focus to QI sparsity subspaces in the form . It is shown that given a controller that stabilizes and is itself stable, there exists a parametrization such that [7]. Accordingly, a convex optimization problem equivalent to is obtained. The requirement of a stable and stabilizing controller was removed in [28]. One main result from [28] is as follows:
Theorem 3 (Theorem IV.2 of [28])
Consider any doubly-coprime factorization of and let be QI with respect to . Then, the following two statements hold:
1. If is such that , then is a stabilizing controller in .
2. For any there exists for which and .
According to Theorem 3, if is QI with respect to , then can be equivalently reformulated as
[TABLE]
The optimal solution of (12) can be used to recover the globally optimal solution of via .
4.2 Convex restrictions for non-QI sparsity patterns
When is not QI with respect to , the authors of [21] proposed finding a binary matrix such that is QI with respect to . Then, the constraint of problem can be replaced by , and any feasible for this convex program will correspond to a feasible controller
[TABLE]
This inclusion (13) directly follows from Theorem 3 and the fact that .
A challenge of this approach is to compute such that is QI and as close as possible to in order to reduce conservatism, in the sense that is minimized. In general, there might be multiple choices of with the same cardinality. Furthermore, the QI condition of [7, Theorem 26], where , is nonlinear in . For these reasons, a procedure to compute a closest QI subset of in polynomial time was not provided in [21]. Instead, we have shown that the polynomial time Algorithm 1 can be combined with the SI notion to find a convex restriction for any given . In the next subsections, we show that the recovered controllers perform at least as well as those based on the notion of QI by choosing appropriately, and can perform strictly better in general even with the trivial choice .
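For sparsity constraints, the QI test of [7, Theorem 26] reduces to a comparison of binary matrices. The sketch below assumes the test takes the form $S \Delta S \leq S$, where $S$ is the controller pattern and $\Delta$ is the sparsity structure of the plant; the function names and the example patterns are hypothetical.

```python
def bin_prod(X, Z):
    """Structure of the product of two binary matrices."""
    return [[int(any(X[i][k] and Z[k][j] for k in range(len(Z))))
             for j in range(len(Z[0]))] for i in range(len(X))]

def leq(X, Xh):
    return all(a <= b for rx, rh in zip(X, Xh) for a, b in zip(rx, rh))

def is_qi(S, Delta):
    """Assumed form of the sparsity QI test: S * Delta * S <= S (binary products)."""
    return leq(bin_prod(bin_prod(S, Delta), S), S)

# Lower-triangular controller pattern with a lower-triangular plant structure.
lower = [[1, 0, 0],
         [1, 1, 0],
         [1, 1, 1]]
print(is_qi(lower, lower))

# Fully decentralized (diagonal) controller pattern with a dense plant structure.
diag = [[1, 0],
        [0, 1]]
dense = [[1, 1],
         [1, 1]]
print(is_qi(diag, dense))
```

The pattern appears twice in the product, which is the nonlinearity that makes searching for a closest QI subset difficult: flipping one entry of the pattern can change the test on both sides of the inequality.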
4.3 Connections of SI with QI
Here, we show that it is not necessary to check the QI property in order to obtain a globally optimal solution. Note that checking the property of QI before solving was proposed in [7] and required in many subsequent works. Indeed, the approach in [7] is guaranteed to yield feasible solutions for only if QI holds. Instead, our technique can be directly applied given without first checking QI. This result is summarized in the following theorem and corollary.
Theorem 4
Let and let be the binary matrix generated by Algorithm 1 with . The following statements are equivalent.
i) is QI with respect to .
ii) , where is generated by Algorithm 1 with .
Proof
i) $\Rightarrow$ ii): Suppose that is QI with respect to . We have that by [7, Theorem 26], implying that and ultimately
[TABLE]
In addition, we have that and by construction. It follows that . Also, according to Theorem 2, we have , such that . By posing , we have shown above that . Hence, .
ii) $\Rightarrow$ i): Suppose that , which implies . By definition of , we have observed that . It follows that
[TABLE]
Combining (14) with the fact that , we have
[TABLE]
This implies which is equivalent to QI by [7, Theorem 26].
Corollary 3
The following statements are equivalent.
i) is QI with respect to .
ii) is equivalent to with , where is the binary matrix generated by Algorithm 1 with .
Proof
It is well-known [28, 8] that (12) is equivalent to if and only if QI holds. It remains to show that is equivalent to (12) if and only if QI holds.
We first show that lies in for every such that . Indeed, by (8) we have for every and thus . We have shown in Theorem 4 that QI is equivalent to , where is generated by Algorithm 1. It follows that the constraint makes the constraint redundant and thus with is equivalent to (12). This concludes the proof.
Essentially, Theorem 4 shows that QI is equivalent to . Since by (8) when , the constraint becomes redundant if and only if QI holds and the convex program we obtain with SI, namely with , is equivalent to due to the results of [7].
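The equivalence summarized above can be checked exhaustively on small patterns. The sketch below rests on two assumptions that are made explicit: the QI test is taken as $S \Delta S \leq S$, and the condition of Theorem 4 is taken as $\Delta S + I \leq R_S$, with $R_S$ generated from $S$ by the zeroing rule described for Algorithm 1. All function names are hypothetical.

```python
from itertools import product

def bin_prod(X, Z):
    """Structure of the product of two binary matrices."""
    return [[int(any(X[i][k] and Z[k][j] for k in range(len(Z))))
             for j in range(len(Z[0]))] for i in range(len(X))]

def plus_identity(M):
    """Structure of M + I for a square binary matrix M."""
    return [[M[i][j] | int(i == j) for j in range(len(M))] for i in range(len(M))]

def leq(X, Xh):
    return all(a <= b for rx, rh in zip(X, Xh) for a, b in zip(rx, rh))

def generate_R(T):
    """Least sparse R with T R <= T (assumed reading of Algorithm 1)."""
    m, n = len(T), len(T[0])
    R = [[1] * n for _ in range(n)]
    for i in range(m):
        for k in range(n):
            if T[i][k] == 0:
                for j in range(n):
                    if T[i][j] == 1:
                        R[j][k] = 0
    return R

def all_binary(rows, cols):
    """Enumerate every rows x cols binary matrix."""
    for bits in product((0, 1), repeat=rows * cols):
        yield [list(bits[r * cols:(r + 1) * cols]) for r in range(rows)]

m, n = 2, 3          # controller patterns are m x n, plant structures are n x m
mismatches = 0
for S in all_binary(m, n):
    R_S = generate_R(S)
    for Delta in all_binary(n, m):
        qi = leq(bin_prod(bin_prod(S, Delta), S), S)
        si = leq(plus_identity(bin_prod(Delta, S)), R_S)
        mismatches += (qi != si)

print(mismatches)
```

The enumeration covers 4096 pattern pairs, so it is a sanity check on the stated reading of the two conditions rather than a substitute for the proof.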
Theorems 1, 2 and 4, and Corollaries 1–3 can be summarized as follows.
1. Given any distributed sparsity-constrained control problem , one can always cast and solve its convex restriction , where is generated by Algorithm 1.
2. If is feasible, its optimal solution is also feasible for , and is certified to be globally optimal if is QI with respect to .
We remark that verifying QI is optional and can be done a posteriori to check global optimality of the solution, but QI is not part of the controller design procedure in the SI approach. Hence, Theorem 4 expands the applicability of convex programming to compute distributed controllers for arbitrary systems and sparsity patterns, while maintaining previous global optimality results.
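As a rough illustration of the optional a posteriori check, the sketch below tests QI for a sparsity constraint using binary matrices only. It assumes the commonly stated binary form of the characterization in [7, Theorem 26]: with `S` the binary controller pattern and `Delta` the binary pattern of the plant, QI holds when the binary product `S * Delta * S` has no nonzero entry outside `S`. The function names and the two toy patterns are illustrative assumptions, not taken from the paper.

```python
def binary_product(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some a[i][k] and b[k][j] are both 1."""
    return [[int(any(a[i][k] and b[k][j] for k in range(len(b))))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def leq(a, b):
    """Entrywise comparison of two binary matrices of equal size."""
    return all(x <= y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def is_qi(S, Delta):
    """Assumed binary QI test: Struct(S * Delta * S) <= S."""
    return leq(binary_product(binary_product(S, Delta), S), S)


# Toy patterns (hypothetical, chosen only to exercise the test).
lower = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
full = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

qi_lower = is_qi(lower, lower)        # lower-triangular plant and controller
qi_diagonal = is_qi(identity, full)   # decentralized controller, dense plant
```

In this toy setting the lower-triangular pair passes the test, while a fully decentralized controller pattern with a dense plant does not.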
Example 2
Consider the unstable system and the sparsity pattern of Example 1. We can verify that , where , and hence is not QI with respect to . Instead, let us consider the new sparsity pattern
[TABLE]
We can verify that . Hence, is QI with respect to . By applying Algorithm 1 we obtain
[TABLE]
In accordance with Theorem 4 we have that , but (see the entries highlighted in red). By Corollary 3, we conclude that the convex program with is equivalent to with the sparsity constraint , while is a convex restriction of for every invertible .
Next, we show that SI generalizes the class of restrictions of [21], based on finding QI subsets of which are nearest to . The result is a straightforward corollary of Theorem 4.
Corollary 4
Let be QI with respect to and let be minimal as proposed in [21]. Then, there exists such that , where is the minimum cost of with , and is the minimum cost of problem (12) with the constraint replaced by .
Proof
Let . Since is QI with respect to , we have by Theorem 4. Hence, for every , the matrix belongs to for every and the constraint is redundant. It follows that the choice achieves . Therefore, there exists a choice of such that the optimal solution of with performs at least as well as that of the problem obtained by considering a nearest QI subset as suggested in [21]. This completes our proof.
Corollary 4 proves that the class of convex restrictions considered in [21] is a special case in the framework of SI, obtained by choosing and computing with our Algorithm 1. Furthermore, it is possible to choose to obtain strictly better-performing convex restrictions, as we will show numerically in Section 5.
4.4 Strictly Beyond QI
So far, we have shown that the SI approach naturally recovers the previous QI results of [7] and [21] as specific cases by using Algorithm 1. Here and in Section 5, we show through examples the stronger results that
1. SI can recover globally optimal solutions when QI does not hold,
2. strictly better performance than the approach of [21] can be obtained.
For point 2), we refer to the numerical results in Section 5. For point 1), we consider an example taken from [14].
Example 3
Consider the optimal control problem:
[TABLE]
where , , and denotes i.i.d. disturbances distributed according to a normal distribution . The discrete-time transfer function of this system is . This problem without the sparsity constraint on is known as the LQR problem. By adding the sparsity constraint, it is an instance of in discrete-time. Notice that QI does not hold whenever the graph defined by is strongly connected because is equal to in general, and so thus violating QI.
The reason to consider a discrete-time instance of is that one can solve analytically the corresponding problem where sparsity constraints are removed by computing a simple Riccati equation [29]. It so happens that the optimal solution for this problem is , which is also feasible and hence globally optimal for . Now, consider problem with , and . We can verify that a feasible solution for is , because
[TABLE]
This implies by (8). Hence, . Since by design (see Algorithm 1), we have as desired. It is immediate to verify that the resulting controller is . We conclude that, despite a lack of QI, a convex approximation which contains the global optimum of is found by using the proposed SI approach.
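The unconstrained step of Example 3, solving the LQR problem through a Riccati equation, can be sketched on a toy case. The code below is only an illustration under stated assumptions: it uses a hypothetical scalar system `x[k+1] = a*x[k] + b*u[k] + w[k]` with state weight `q` and input weight `r`, not the system of Example 3, and it solves the discrete-time Riccati equation by plain fixed-point iteration rather than by the method of [29].

```python
def solve_scalar_dare(a, b, q, r, iterations=500):
    """Fixed-point iteration on the scalar discrete-time Riccati equation
    p = q + a^2 p - (a p b)^2 / (r + b^2 p), started from p = q."""
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * p * b) ** 2 / (r + b * b * p)
    return p


def lqr_gain(a, b, q, r):
    """Return the static gain k of the policy u = k x and the Riccati solution p."""
    p = solve_scalar_dare(a, b, q, r)
    k = -(b * p * a) / (r + b * b * p)
    return k, p


# Hypothetical unstable scalar plant (values chosen for illustration only).
a, b, q, r = 1.2, 1.0, 1.0, 1.0
k, p = lqr_gain(a, b, q, r)
closed_loop_pole = a + b * k  # magnitude below 1 means the loop is stable
```

Once such an unconstrained optimum is available, one can check whether it happens to satisfy the sparsity constraint, which is the reasoning used in the example to certify global optimality.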
Remark 5
The global optimality result for this example was also obtained using the SLP in [14]. The sparsities for the system level parameters in [14] were chosen empirically, while we provide an explicit methodology based on the SI condition (10) and Algorithm 1. Furthermore, we wish to clarify that obtaining global optimality certificates for for systems with non-QI constraints is still an open problem, which is addressed neither by the system level approach [14] nor by our SI approach. Both our approach and that of [14] can certify optimality of the solution because the optimal solution of this simple instance is already known analytically.
4.5 SI for static controller design
We conclude this section by highlighting another advantage of the SI notion over the QI notion; the SI notion can be used to compute sparse static control policies in a convex way, that is policies in the form where is a real matrix in . This topic has been thoroughly studied in our earlier work [18], where we derived a notion of SI limited to the static controller case. Here, we highlight that in contrast to the QI notion, SI is useful both for static and dynamic sparse controller design.
The main observation is that the Youla parametrization cannot achieve a convexification of the static controller design problem in general, because enforcing to be a real matrix is a non-convex requirement on the transfer matrix . Consequently, a different parametrization should be used and the QI property, tightly linked to using a Youla-like parametrization, will not be relevant anymore. The most well-known techniques to convexify the and norm-optimal state-feedback static controller design problems are based on computing appropriate quadratic Lyapunov functions through Linear Matrix Inequalities (LMI); see [30, 31] for a comprehensive review. The more general case of static output-feedback is known to be NP-hard [5] and an exact convex formulation does not exist.
As we illustrated in [18], when the distributed static control problem is formulated through LMIs, the controller is recovered as , where and are real decision variables, is symmetric positive semidefinite and is a quadratic Lyapunov function for the closed-loop system. If the controller must lie in a sparsity subspace , the only source of non-convexity stems from requiring that . This expression for the static controller in terms of the decision variables matches that of , which is valid for dynamic controllers in terms of the Youla parameter. According to Theorem 1 and Corollary 1, convex restrictions can be obtained by choosing binary matrices and as per (10) that satisfy the SI condition (1), and requiring that and for any invertible real matrix . We refer the interested reader to [18] for details.
Based on the discussion above, SI is a framework-independent notion which deals with sparsity patterns. Specifically, the SI notion translates, separately, to generalizations of QI-based synthesis of sparse dynamic controllers and of block-diagonal quadratic Lyapunov functions for designing sparse static controllers.
5 Experiments
With the goal of providing insight into our proposed method and showing its potential benefits when combined with standard controller design techniques, we continue here our Example 1 and provide numerical results.
5.1 Finite-dimensional approximation
Since the convex programs we have cast are infinite-dimensional, due to the decision variables being transfer matrices whose order is not fixed, it is necessary to resort to finite-dimensional approximations. When using the Youla parametrization in continuous-time, one can adapt the semidefinite programming technique of [32] to the norm by exploiting standard results from [33, 31]; when using the SLP or IOP parametrizations in discrete-time, one can use the corresponding finite impulse response (FIR) approximations of [14, 25]. The key common idea behind these approximations is to express each decision variable , which is a general stable transfer matrix in continuous-time (resp. discrete-time), in the approximated form
[TABLE]
for some and with . The real matrices for all become the finitely many real decision variables to optimize over. The approximation (16) is based on the well-known idea of Ritz approximations [34] and we refer the reader to [14, 25] for details on SLP and IOP.
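To make the finite-dimensional idea concrete, the sketch below evaluates a truncated expansion with real matrix coefficients at a given frequency point. Since the exact form of (16) is not legible in this copy, the basis functions are an assumption: `z**(-i)` for the discrete-time FIR case and `1/(s + a)**i` for the continuous-time case. All names are illustrative. The helper `respects_pattern` shows why sparsity constraints on a decision variable translate into linear (zero) constraints on its coefficient matrices.

```python
def evaluate_expansion(coefficients, basis_value):
    """Return sum_i coefficients[i] * basis_value**i as a list-of-lists matrix."""
    rows, cols = len(coefficients[0]), len(coefficients[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for power, matrix in enumerate(coefficients):
        weight = basis_value ** power
        for r in range(rows):
            for c in range(cols):
                total[r][c] += matrix[r][c] * weight
    return total


def fir_at(coefficients, z):
    """Assumed discrete-time form: sum_i C_i z^{-i}."""
    return evaluate_expansion(coefficients, 1 / z)


def pole_basis_at(coefficients, s, a):
    """Assumed continuous-time form: sum_i C_i / (s + a)^i with a > 0."""
    return evaluate_expansion(coefficients, 1 / (s + a))


def respects_pattern(coefficients, pattern):
    """True if every coefficient matrix is zero wherever the binary pattern is zero."""
    return all(matrix[r][c] == 0 or pattern[r][c] == 1
               for matrix in coefficients
               for r in range(len(pattern))
               for c in range(len(pattern[0])))


# Two hypothetical coefficient matrices sharing a lower-triangular pattern.
coeffs = [[[1, 0], [0, 1]], [[0, 0], [2, 0]]]
pattern = [[1, 0], [1, 1]]
value_discrete = fir_at(coeffs, 2)
value_continuous = pole_basis_at(coeffs, 1, 1)
```

Because the expansion is linear in the coefficients, any entry that is zero in every coefficient matrix stays zero in the evaluated transfer matrix at every frequency.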
Example 1 (continued) We will address the distributed controller design problem formulated in Example 1 both in discrete- and continuous-time. We have observed in Example 2 that is not QI with respect to . As we have summarized in Section 4.2, [21] suggests identifying a binary matrix such that is QI with respect to and is minimized. In this case, we verify by inspection that in (15) is the only QI sparsity pattern such that . As suggested in [21], we can thus substitute the constraint with and the corresponding convex program is a restriction of . Our goal is to compare tightness of this convex restriction with that of obtained through SI.
5.2 Numerical Results
As outlined above, we solved finite-dimensional approximations of the convex restriction proposed in [21] and of our convex restriction with obtained through SI. All the numerical programs were solved with MOSEK [35], called through MATLAB via YALMIP [36], on a standard laptop computer.
5.2.1 IOP in discrete-time
In our first experiment we considered the discrete-time version of . Since the approach of [32] requires finding an initial stable and stabilizing controller in heuristically, which is no trivial task in general, we used the IOP parametrization [25] and the discrete-time finite-dimensional approximation (16) for all decision variables. Using the notation of [26], where and , are input-output parameters, the closest QI subset approach of [21] requires , while our SI approach translates to and . Within this setting, no feasible solution could be obtained using the closest QI subset approach; instead, upon convergence over , we obtained a cost of using the proposed SI approach. To evaluate the suboptimality, we additionally solved for the nearest QI superset of defined as the binary matrix such that is QI and is minimized [21]; the corresponding optimal cost serves as a lower bound for that of . The QI superset is unique and is computed with the algorithm (13)-(14) of [21]. It turns out that is the full lower-triangular matrix. By solving for we obtained the lower bound upon convergence over , and hence the SI solution has near-optimal performance.
5.2.2 Youla in continuous-time
In our second experiment we considered the continuous-time version of and used the finite-dimensional approximation technique of [32]. A doubly-coprime factorization of was computed as per [7, Theorem 17] using the stable and stabilizing controller suggested in [7, Page 1995]. In (16), we chose and increased the value of until the improvement on the cost was negligible, thus approaching convergence to the optimal cost of the infinite-dimensional program. Upon convergence over , the closest QI subset method of [21] led to a cost of while the SI method led to a cost of . To evaluate this improvement in performance, we additionally solved for and obtained a lower bound of . We conclude that our SI solution has a relative improvement over that of [21] based on QI subsets of at least .
6 Conclusions
We have proposed the framework of Sparsity Invariance (SI) for convex design of optimal and near-optimal sparse controllers. One main insight is that the proposed SI approach offers a direct generalization of previous design methods based on the notion of Quadratic Invariance (QI). Indeed, SI can be directly applied to any system and sparsity constraint. The recovered solution is globally optimal when QI holds and performs at least as well as the nearest QI subset when QI does not hold. We have shown the potential benefits of SI over previous methods through examples, and remarked that SI is naturally applicable to sparse static controller design.
Since the condition (10) is necessary and sufficient for the SI property (1), our results approach the limits in performance of convex restrictions of the sparsity constrained control problem based on structural conditions for the Youla parameter. This opens up the question of whether different and better-performing design methodologies can be developed for this challenging problem. Another direction for research is to further refine the SI approach, by developing tractable heuristics to optimally design the binary matrices and and the parameter simultaneously based on the knowledge of the system . This could potentially improve upon Algorithm 1. Finally, it would be relevant to extend the SI idea to the case of delay constraints; in discrete-time, this might be possible by refining the results of [37].
Appendix A
A.1 Proof of Theorem 1
The proof relies on two Lemmas, whose proofs are reported in Appendix A.2 and Appendix A.3.
Lemma A1
Let with . Then,
1. For any invertible transfer matrix in ,
[TABLE]
2. There exists an invertible transfer matrix such that
[TABLE]
Lemma A2
Let and , and . Then, there exists such that
[TABLE]
We are now ready to prove Theorem 1.
: Let be invertible. By Lemma A1 we know that . Now let . Since , we have .
: We prove by contrapositive. First, suppose that . By the second statement of Lemma A1 it is possible to select such that . By the latter and Lemma A2, we can select such that , or equivalently . Next, suppose that . Since by hypothesis, then and . Hence, the same reasoning applies.
A.2 Proof of Lemma A1
Suppose is invertible. By Cayley-Hamilton’s theorem where , for every are the coefficients of the characteristic polynomial of and . We remark that Cayley-Hamilton is valid over square matrices defined over a commutative ring, such as that of causal transfer functions [38]. By pre-multiplying by and rearranging the terms:
[TABLE]
Since we have that for every integer . Hence, for every and the first statement follows by (17).
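The Cayley-Hamilton argument can be checked on a small constant matrix. The sketch below rests on an assumption, because the symbols of Lemma A1 are not legible in this copy: it takes the first statement to bound the sparsity pattern of the inverse by the binary power `R^(n-1)`, where `R` is a binary pattern with ones on the diagonal. Exact rational arithmetic is used so that structural zeros are exact zeros. The matrices and function names are illustrative.

```python
from fractions import Fraction


def binary_product(a, b):
    """Boolean matrix product of two binary matrices."""
    return [[int(any(a[i][k] and b[k][j] for k in range(len(b))))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def binary_power(R, k):
    """k-th Boolean power of a square binary matrix (identity for k = 0)."""
    n = len(R)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = binary_product(result, R)
    return result


def inverse(X):
    """Exact Gauss-Jordan inverse; raises StopIteration if X is singular."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def structure(M):
    """Binary sparsity pattern of a matrix."""
    return [[int(v != 0) for v in row] for row in M]


def leq(a, b):
    """Entrywise comparison of two binary matrices of equal size."""
    return all(x <= y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical lower-bidiagonal pattern R and a matrix X with that pattern.
R = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
X = [[2, 0, 0], [1, 3, 0], [0, 1, 4]]
inverse_pattern = structure(inverse(X))
bound = binary_power(R, len(R) - 1)
```

In this toy case the inverse fills in the full lower triangle: its pattern is not contained in `R` itself, but it is contained in `R^(n-1)`, which is what the assumed bound predicts.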
For the second statement, we iteratively construct starting from . Let . Define . Let and be the -th column and the -th row of respectively, and let be the entry of . Using the Sherman-Morrison identity [39], if is invertible we obtain
[TABLE]
Recall that each entry of a transfer matrix is a transfer function defined over . Hence, by the definition of an invertible transfer matrix (see Section 2), (18) holds for almost every . From (18), for any and , if , then . It follows that by choosing such that
[TABLE]
we obtain that
[TABLE]
The condition (A.2) is derived by setting the right hand side of (18) to be different from [math] for every such that and are not both null for every . Observe that as per (A.2) always exists, because there is no such that and are both null for every , and hence always admits a solution in . The structural augmentation (20) is exploited in the algorithm below.
The algorithm returns a matrix such that . Specifically, by exploiting (20) we obtain that at the end of the -th iteration of the “repeat-until” cycle.
A.3 Proof of Lemma A2
Let be any transfer matrix in . Assume that . Then, for some we have that and . We know by hypothesis that . Since , it is sufficient to update with for any in to guarantee that . Furthermore, by choosing for all such that , we avoid that adding to brings to [math] when . Hence, it is always possible to choose and such that and . By iterating the procedure for all such that , we converge to .
References
[1] F. Dörfler, M. R. Jovanović, M. Chertkov, and F. Bullo, “Sparsity-promoting optimal wide-area control of power networks,” IEEE Transactions on Power Systems, vol. 29, no. 5, pp. 2281–2291, 2014.
[2] T. P. Prescott and A. Papachristodoulou, “Layered decomposition for the model order reduction of timescale separated biochemical reaction networks,” Journal of Theoretical Biology, vol. 356, pp. 113–122, 2014.
[3] Y. Zheng, S. E. Li, K. Li, F. Borrelli, and J. K. Hedrick, “Distributed model predictive control for heterogeneous vehicle platoons under unidirectional topologies,” IEEE Transactions on Control Systems Technology, vol. 25, no. 3, pp. 899–910, 2017.
[4] H. S. Witsenhausen, “A counterexample in stochastic optimum control,” SIAM Journal on Control, vol. 6, no. 1, pp. 131–147, 1968.
[5] V. D. Blondel and J. N. Tsitsiklis, “A survey of computational complexity results in systems and control,” Automatica, vol. 36, no. 9, pp. 1249–1274, 2000.
[6] C. H. Papadimitriou and J. Tsitsiklis, “Intractable problems in control theory,” SIAM Journal on Control and Optimization, vol. 24, no. 4, pp. 639–654, 1986.
[7] M. Rotkowitz and S. Lall, “A characterization of convex problems in decentralized control,” IEEE Transactions on Automatic Control, vol. 51, no. 2, pp. 274–286, 2006.
[8] L. Lessard and S. Lall, “Quadratic invariance is necessary and sufficient for convexity,” in American Control Conference (ACC), 2011. IEEE, 2011, pp. 5360–5362.
