Sparsity Invariance for Convex Design of Distributed Controllers

Luca Furieri; Yang Zheng; Antonis Papachristodoulou; Maryam Kamgarpour

arXiv:1906.06777·eess.SY·July 14, 2020

TL;DR

This paper introduces a novel convex framework called Sparsity Invariance (SI) for designing optimal distributed LTI controllers with sparsity constraints, extending beyond quadratic invariance and ensuring global optimality in many cases.

Contribution

The paper develops the concept of Sparsity Invariance (SI), enabling convex design of distributed controllers that surpasses quadratic invariance limitations and guarantees global optimality when applicable.

Findings

01

SI always produces convex restrictions for sparsity-constrained control design.

02

SI guarantees global optimality when quadratic invariance holds.

03

Numerical examples demonstrate SI's superior performance and optimality in non-QI cases.

Equations

$$\text{Sparse}(X) := \{\mathbf{Y}\in\mathcal{R}_{c}^{m\times n} \mid \mathbf{Y}_{ij}(j\omega)=0 \text{ for all } i,j \text{ such that } X_{ij}=0, \text{ for almost all } \omega\in\mathbb{R}\}.$$

$$X_{ij} := \begin{cases}0 & \text{if } \mathbf{Y}_{ij}(j\omega)=0 \text{ for almost all } \omega\in\mathbb{R},\\ 1 & \text{otherwise}.\end{cases}$$

$$\|X\|_{0} := \sum_{i=1}^{m}\sum_{j=1}^{n} X_{ij}.$$

$$X_{1}=\begin{bmatrix}0&1&0\\1&1&1\end{bmatrix},\quad X_{2}=\begin{bmatrix}0&1&0\\1&0&1\end{bmatrix},\quad X_{3}=\begin{bmatrix}1&1&0\\1&0&1\end{bmatrix},$$

$$\mathbf{Y}=\begin{bmatrix}0&\frac{1}{s+1}&0\\ \frac{1}{s+1}&\frac{1}{s+1}&\frac{1}{s+1}\end{bmatrix}\in\mathcal{RH}_{\infty}^{2\times 3},$$

$$\dot{x}(t)=Ax(t)+Bu(t)+H_{x}w(t),\qquad y(t)=C_{y}x(t)+H_{y}w(t),\qquad z(t)=C_{z}x(t)+D_{z}u(t)+H_{z}w(t),$$

$$\begin{bmatrix}z\\y\end{bmatrix}=\mathbf{P}\begin{bmatrix}w\\u\end{bmatrix}=\begin{bmatrix}\mathbf{P}_{11}&\mathbf{P}_{12}\\ \mathbf{P}_{21}&\mathbf{G}\end{bmatrix}\begin{bmatrix}w\\u\end{bmatrix},$$

$$\mathbf{P}_{11},\quad \mathbf{P}_{12},\quad \mathbf{P}_{21},\quad \mathbf{G}$$

$$f(\mathbf{K})=\mathbf{P}_{11}+\mathbf{P}_{12}\mathbf{K}(I_{p}-\mathbf{G}\mathbf{K})^{-1}\mathbf{P}_{21},$$

$$\text{Problem } \mathcal{P}_{K}:\quad \underset{\mathbf{K}\in\mathcal{C}_{\text{stab}}}{\text{minimize}}\ \|f(\mathbf{K})\|\quad \text{subject to}\quad \mathbf{K}\in\text{Sparse}(S)$$

$$\mathbf{G}=\mathbf{N}_{r}\mathbf{M}_{r}^{-1}=\mathbf{M}_{l}^{-1}\mathbf{N}_{l},\qquad \begin{bmatrix}\mathbf{U}_{l}&-\mathbf{V}_{l}\\-\mathbf{N}_{l}&\mathbf{M}_{l}\end{bmatrix}\begin{bmatrix}\mathbf{M}_{r}&\mathbf{V}_{r}\\ \mathbf{N}_{r}&\mathbf{U}_{r}\end{bmatrix}=I_{m+p}.$$

$$\mathcal{C}_{\text{stab}}=\{(\mathbf{V}_{r}-\mathbf{M}_{r}\mathbf{Q})(\mathbf{U}_{r}-\mathbf{N}_{r}\mathbf{Q})^{-1}\mid \mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}\}.$$

$$f(\mathcal{C}_{\text{stab}})=\{\mathbf{T}_{1}-\mathbf{T}_{2}\mathbf{Q}\mathbf{T}_{3}\mid \mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}\},$$

$$\mathbf{Y}_{Q}=(\mathbf{V}_{r}-\mathbf{M}_{r}\mathbf{Q})\mathbf{M}_{l},\qquad \mathbf{X}_{Q}=(\mathbf{U}_{r}-\mathbf{N}_{r}\mathbf{Q})\mathbf{M}_{l}.$$

$$\mathcal{C}_{\text{stab}}=\{\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\mid (5),\ (6),\ \mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}\}.$$

$$\mathbf{X}_{Q}=I_{p}+\mathbf{G}(\mathbf{Y}_{Q}+\mathbf{M}_{r}\mathbf{Q}\mathbf{M}_{l})-\mathbf{N}_{r}\mathbf{Q}\mathbf{M}_{l}=I_{p}+\mathbf{G}\mathbf{Y}_{Q}.$$

$$\text{Problem } \mathcal{P}_{Q}:\quad \underset{\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}}{\text{minimize}}\ \|\mathbf{T}_{1}-\mathbf{T}_{2}\mathbf{Q}\mathbf{T}_{3}\|\quad \text{subject to}\quad (5),\ (6),\ \mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\in\text{Sparse}(S)$$

$$\mathbf{Y}\in\text{Sparse}(T)\ \text{and}\ \mathbf{X}\in\text{Sparse}(R)\ \Rightarrow\ \mathbf{Y}\mathbf{X}^{-1}\in\text{Sparse}(S).$$

$$\text{Problem } \mathcal{P}_{T,R}:\quad \underset{\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}}{\text{minimize}}\ \|\mathbf{T}_{1}-\mathbf{T}_{2}\mathbf{Q}\mathbf{T}_{3}\|\quad \text{subject to}\quad (5),\ (6),\ \mathbf{Y}_{Q}\mathbf{\Gamma}\in\text{Sparse}(T),\ \mathbf{X}_{Q}\mathbf{\Gamma}\in\text{Sparse}(R),$$

$$(\mathbf{Y}_{Q}\mathbf{\Gamma})(\mathbf{X}_{Q}\mathbf{\Gamma})^{-1}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}.$$

$$T\leq S\ \text{and}\ TR^{p-1}\leq S,$$

Full text

Luca Furieri, Yang Zheng, Antonis Papachristodoulou, and Maryam Kamgarpour

This research was gratefully funded by the European Union ERC Starting Grant CONENE. Antonis Papachristodoulou was supported in part by the EPSRC project EP/M002454/1. Luca Furieri and Maryam Kamgarpour are with the Automatic Control Laboratory, Department of Information Technology and Electrical Engineering, ETH Zürich, Switzerland. E-mails: {furieril, mkamgar}@control.ee.ethz.ch. Yang Zheng is with the School of Engineering and Applied Sciences, Harvard Center for Green Buildings and Cities, Harvard University. E-mail: [email protected]. Antonis Papachristodoulou is with the Department of Engineering Science, University of Oxford, United Kingdom. E-mail: [email protected].

Abstract

We address the problem of designing optimal linear time-invariant (LTI) sparse controllers for LTI systems, which corresponds to minimizing a norm of the closed-loop system subject to sparsity constraints on the controller structure. This problem is NP-hard in general and motivates the development of tractable approximations. We characterize a class of convex restrictions based on a new notion of Sparsity Invariance (SI). The underlying idea of SI is to design sparsity patterns for transfer matrices $\mathbf{Y}(s)$ and $\mathbf{X}(s)$ such that any corresponding controller $\mathbf{K}(s)=\mathbf{Y}(s)\mathbf{X}(s)^{-1}$ exhibits the desired sparsity pattern. For sparsity constraints, the approach of SI goes beyond the notion of Quadratic Invariance (QI): 1) the SI approach always yields a convex restriction; 2) the solution via the SI approach is guaranteed to be globally optimal when QI holds and performs at least as well as considering a nearest QI subset. Moreover, the notion of SI naturally applies to designing structured static controllers, while QI is not utilizable. Numerical examples show that even for non-QI cases, SI can recover solutions that are 1) globally optimal and 2) strictly more performing than previous methods.

1 Introduction

The safe and efficient operation of several large-scale systems, such as the smart grid [1], biological networks [2], and automated highways [3], relies on the decision making of multiple interacting agents. Coordinating the decisions of these agents is challenged by a lack of complete information of the systems’ internal variables. Such limited information arises due to privacy concerns, geographic distance or the challenges of implementing a reliable communication network.

The celebrated work [4] highlighted that lacking full information can enormously complicate the design of optimal control inputs. Indeed, the optimal feedback control policies may not even be linear for the Linear Quadratic Gaussian (LQG) control problem without full output information. The intractability inherent to lack of full information was investigated in the works [5, 6]. The core challenges discussed therein motivated identifying special cases of optimal control problems with partial information for which efficient algorithms can be used.

Optimally controlling a linear time-invariant (LTI) system with distributed sensor measurements amounts to computing a linear controller that has a desired sparsity pattern and minimizes a norm of the closed-loop system. For this generally intractable problem, the notion of Quadratic Invariance (QI) was shown to be sufficient [7] and necessary [8] for an exact convex reformulation. A related problem of sensor-actuator architecture co-design was addressed in [9, 10] by exploiting QI and using sparsity-inducing norm penalties.

1.1 Previous work on non-QI cases

Given the importance and intricacy of computing optimal distributed controllers, a variety of approximation methods have been proposed for general systems and information structures that are not QI. For example, the authors in [11] developed semidefinite programs that are relaxations of this generally NP-hard problem. However, these relaxations might fail to recover a sparse controller that is stabilizing, as confirmed experimentally in [12]. To address this issue, polynomial optimization has been used in [12] to obtain a sequence of convex relaxations which converges to a stabilizing distributed controller. Nevertheless, performance of the recovered solution is not directly addressed in [12]. For the finite-horizon control problem, the authors in [13] derived convex upper bounds to the non-convex cost function to obtain conservative feasible solutions. However, the theoretical sub-optimality bounds were shown to be loose. Alternatively, the system level approach [14] proposed an implementation where controllers are required to share locally estimated disturbances in the state-feedback case and internal controller states in the output-feedback case. We note that the classical distributed control only requires to share output measurements, but no intermediate computations, among subsystems. The need to share this additional information in [14] might raise concerns of system security and vulnerability in safety critical applications [15], where each subsystem can only rely on its own sensor measurements.

A different approach to sparse output-feedback controller synthesis is to develop a convex restriction: the unstructured problem is reformulated as an equivalent convex program and convex constraints are added to guarantee the desired sparsity pattern of the recovered controllers. Convex restrictions exhibit specific advantages: 1) their optimal solutions can be readily computed with standard convex optimization techniques, and 2) all their feasible solutions are structured and stabilizing by design. A disadvantage is that a restriction may be infeasible even when the original problem is feasible. This motivates developing convex restrictions that are as tight as possible for improved feasibility and performance. In the literature, convex restrictions have mostly been developed for the special case of computing static controllers [16, 17, 18]. Within this setting, the problem of optimal sensor and actuator selection was addressed in [19, 20] with an ADMM approach. For the general case of dynamic controllers given non-QI information structures, the work [21] suggested restricting the desired sparsity pattern to a subset that is QI to obtain upper bounds on the minimum cost. However, to the best of the authors’ knowledge, a method for convex restrictions that can outperform [21] and goes beyond the notion of QI for sparsity constraints is not known.

1.2 Contributions

This paper proposes a generalized framework for the convex design of optimal and near-optimal LTI dynamic output-feedback controllers with a pre-determined sparsity pattern. Our underlying idea is to identify appropriate sparsity patterns for two transfer matrices $\mathbf{Y}(s)$ and $\mathbf{X}(s)$ such that any corresponding feedback controller in the form $\mathbf{K}(s)=\mathbf{Y}(s)\mathbf{X}(s)^{-1}$ exhibits the desired structure. This fundamental property is denoted as Sparsity Invariance (SI).

Our first contribution is to develop algebraic conditions on the binary matrices associated with the sparsities of $\mathbf{Y}(s)$ and $\mathbf{X}(s)$ that are necessary and sufficient for SI. Among all such sparsities, we suggest a polynomial-time algorithm to design sparsities that lead to better performance for the distributed control problem at hand. Second, we show that the SI notion steps beyond that of QI in several ways. Indeed, SI can be applied to general systems subject to arbitrary sparsity constraints, regardless of whether QI holds. Furthermore, SI recovers a controller that is provably globally optimal when QI holds and performs at least as well as that obtained by considering a nearest QI sparsity subset [21] when QI does not hold. Third, we provide examples to show that, even if QI does not hold, controllers obtained through the SI approach can be 1) globally optimal and 2) in general strictly more performing than those obtained using the nearest QI subset approach of [21]. Finally, we remark that the SI concept is applicable to distributed static controller design, as studied in our preliminary work [18], whereas the Youla parametrization and thus the QI notion is not utilizable. For brevity, our theoretical discussion focuses on continuous-time systems, but our results also naturally hold for discrete-time systems with sparsity constraints, as we will discuss in the numerical results.

The rest of this paper is structured as follows. Section 2 states necessary background and presents the problem formulation. Section 3 introduces the class of convex restrictions under investigation and fully characterizes our notion of Sparsity Invariance (SI). We describe how SI can be utilized in an optimized way. In Section 4, we show that 1) SI encompasses the previous approaches based on the QI notion, and 2) that strictly better performing sparse controllers can be computed efficiently with the SI approach. We present numerical results in Section 5 and conclude the paper in Section 6.

2 Background and Problem Statement

Here, we first introduce some notation on sparsity structures and transfer functions. Then, we state the problem of distributed optimal control, and introduce the necessary background on the Youla parametrization of internally stabilizing controllers.

2.1 Notation and sparsity structures

We use $\mathbb{R},\,\mathbb{C}$ and $\mathbb{N}$ to denote real numbers, complex numbers and positive integers, respectively. The $(i,j)$ -th element in a matrix $Y\in\mathbb{R}^{m\times n}$ is referred to as $Y_{ij}$ . We use $I_{n}$ to denote the identity matrix of size $n\times n$ , $0_{m\times n}$ to denote the zero matrix of size $m\times n$ and $1_{m\times n}$ to denote the matrix of size $m\times n$ with all entries set to $1$ .

Transfer functions: We denote the imaginary axis as $j\mathbb{R}:=\{z\in\mathbb{C}\mid\Re(z)=0\}$ and consider continuous-time transfer functions $\mathbf{f}:j\mathbb{R}\rightarrow\mathbb{C}$. An $m\times n$ transfer matrix is an $m\times n$ matrix whose entries are transfer functions. We denote the set of $m\times n$ causal transfer matrices as $\mathcal{R}_{c}^{m\times n}$. A transfer function is called proper (resp. strictly proper) if it is rational and the degree of the numerator polynomial does not exceed (resp. is strictly lower than) the degree of the denominator polynomial. Similar to [7], we denote by $\mathcal{R}_{sp}^{m\times n}$ the set of $m\times n$ strictly proper transfer matrices. Finally, we let $\mathcal{RH}_{\infty}^{m\times n}$ be the set of $m\times n$ causal and stable transfer matrices.

Sparsity structures of transfer matrices can be conveniently represented by binary matrices. A binary matrix is a matrix with entries from the set $\{0,1\}$ , and we use $\{0,1\}^{m\times n}$ to denote the set of $m\times n$ binary matrices. Given a binary matrix $X\in\{0,1\}^{m\times n}$ , we define the associated sparsity subspace of causal transfer matrices as

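$$\text{Sparse}(X) := \{\mathbf{Y}\in\mathcal{R}_{c}^{m\times n} \mid \mathbf{Y}_{ij}(j\omega)=0 \text{ for all } i,j \text{ such that } X_{ij}=0, \text{ for almost all } \omega\in\mathbb{R}\}.$$

Similarly, given a transfer matrix $\mathbf{Y}\in\mathcal{R}_{c}^{m\times n}$, we define $X=\text{Struct}(\mathbf{Y})$ as the binary matrix given by

$$X_{ij} := \begin{cases}0 & \text{if } \mathbf{Y}_{ij}(j\omega)=0 \text{ for almost all } \omega\in\mathbb{R},\\ 1 & \text{otherwise}.\end{cases}$$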

We say that the transfer matrix $\mathbf{X}\in\mathcal{R}_{c}^{n\times n}$ is invertible if $\mathbf{X}(j\omega)\in\mathbb{C}^{n\times n}$ is invertible for almost all $\omega\in\mathbb{R}$ .

Let $X,\hat{X}\in\{0,1\}^{m\times n}$ and $Z\in\{0,1\}^{n\times p}$ be binary matrices. Throughout the paper, we adopt the following conventions: $X+\hat{X}:=\text{Struct}(X+\hat{X})$ , and $XZ:=\text{Struct}(XZ)$ . We say $X\leq\hat{X}$ if and only if $X_{ij}\leq\hat{X}_{ij}\;\forall i,j$ , and $X<\hat{X}$ if and only if $X\leq\hat{X}$ and there exist indices $i,j$ such that $X_{ij}<\hat{X}_{ij}$ . Also, we denote $X\nleq\hat{X}$ if and only if there exist indices $i,j$ such that $X_{ij}>\hat{X}_{ij}$ . Given a binary matrix $X\in\{0,1\}^{m\times n}$ we denote its cardinality, i.e., the total number of nonzero entries, as

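$$\|X\|_{0} := \sum_{i=1}^{m}\sum_{j=1}^{n} X_{ij}.$$

Considering the following binary matrices

$$X_{1}=\begin{bmatrix}0&1&0\\1&1&1\end{bmatrix},\quad X_{2}=\begin{bmatrix}0&1&0\\1&0&1\end{bmatrix},\quad X_{3}=\begin{bmatrix}1&1&0\\1&0&1\end{bmatrix},$$

we have $X_{2}<X_{1}$, $X_{3}\nleq X_{1}$ and $X_{2}+X_{1}=X_{1}$. Their cardinalities are $\|X_{1}\|_{0}=4$, $\|X_{2}\|_{0}=3$ and $\|X_{3}\|_{0}=4$, respectively. For the following transfer matrix,

$$\mathbf{Y}=\begin{bmatrix}0&\frac{1}{s+1}&0\\ \frac{1}{s+1}&\frac{1}{s+1}&\frac{1}{s+1}\end{bmatrix}\in\mathcal{RH}_{\infty}^{2\times 3},$$

if we consider the binary matrix $X_{1}$ in the example above, we have $\mathbf{Y}\in\text{Sparse}(X_{1})$ and $X_{1}=\text{Struct}(\mathbf{Y})$.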
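The conventions on binary matrices can be checked mechanically. The following is a minimal Python sketch, not code from the paper: the helper names (`leq`, `lt`, `struct_sum`, `struct_prod`, `cardinality`) are chosen here for illustration, and the entries of `X1`, `X2`, `X3` are an assumption about the layout of the example matrices, chosen so that every relation stated in the example holds.

```python
# Binary matrices are stored as nested lists of 0/1 entries.
# The entries below are an assumed reading of the example matrices.
X1 = [[0, 1, 0], [1, 1, 1]]
X2 = [[0, 1, 0], [1, 0, 1]]
X3 = [[1, 1, 0], [1, 0, 1]]

def leq(X, Xh):
    """X <= Xh entrywise; `not leq(X, Xh)` is the relation written X (not<=) Xh."""
    return all(a <= b for rx, rh in zip(X, Xh) for a, b in zip(rx, rh))

def lt(X, Xh):
    """X < Xh: X <= Xh with at least one strictly smaller entry."""
    return leq(X, Xh) and X != Xh

def struct_sum(X, Xh):
    """Struct(X + Xh): an entry is 1 if it is nonzero in either matrix."""
    return [[1 if a + b > 0 else 0 for a, b in zip(rx, rh)]
            for rx, rh in zip(X, Xh)]

def struct_prod(X, Z):
    """Struct(XZ): entry (i, k) is 1 if X[i][j] = Z[j][k] = 1 for some j."""
    return [[1 if any(X[i][j] and Z[j][k] for j in range(len(Z))) else 0
             for k in range(len(Z[0]))] for i in range(len(X))]

def cardinality(X):
    """Number of nonzero entries of a binary matrix."""
    return sum(sum(row) for row in X)

print(lt(X2, X1), leq(X3, X1), struct_sum(X2, X1) == X1)   # True False True
print(cardinality(X1), cardinality(X2), cardinality(X3))   # 4 3 4
```

Representing patterns as 0/1 lists keeps the order relation and the structural sum and product as direct transcriptions of the definitions, which is all the later algebraic conditions on $T$, $R$ and $S$ require.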

2.2 Problem statement

We consider LTI systems in continuous-time

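$$\begin{aligned}\dot{x}(t)&=Ax(t)+Bu(t)+H_{x}w(t),\\ y(t)&=C_{y}x(t)+H_{y}w(t),\\ z(t)&=C_{z}x(t)+D_{z}u(t)+H_{z}w(t),\end{aligned}\tag{1}$$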

where $x(t)\in\mathbb{R}^{n}$ , $u(t)\in\mathbb{R}^{m}$ , $y(t)\in\mathbb{R}^{p}$ , $z(t)\in\mathbb{R}^{q}$ , and $w(t)\in\mathbb{R}^{r}$ are the state, control input, observed output, a performance signal defined based on our control objectives, and additive disturbance at time $t\in\mathbb{R}$ , respectively. The input-output transfer function representation for (1) can be written as

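$$\begin{bmatrix}z\\y\end{bmatrix}=\mathbf{P}\begin{bmatrix}w\\u\end{bmatrix}=\begin{bmatrix}\mathbf{P}_{11}&\mathbf{P}_{12}\\ \mathbf{P}_{21}&\mathbf{G}\end{bmatrix}\begin{bmatrix}w\\u\end{bmatrix},$$

with

$$\begin{aligned}\mathbf{P}_{11}&=C_{z}(sI-A)^{-1}H_{x}+H_{z}, & \mathbf{P}_{12}&=C_{z}(sI-A)^{-1}B+D_{z},\\ \mathbf{P}_{21}&=C_{y}(sI-A)^{-1}H_{x}+H_{y}, & \mathbf{G}&=C_{y}(sI-A)^{-1}B,\end{aligned}$$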

where $s$ belongs to $j\mathbb{R}$ . Notice that $\mathbf{P}_{11},\mathbf{P}_{12},\mathbf{P}_{21}$ are proper transfer functions and $\mathbf{G}$ is strictly proper.

Consider the interconnection of Figure 1.

A dynamic output-feedback controller $u=\mathbf{K}y$ with $\mathbf{K}\in\mathcal{R}_{c}^{m\times p}$ is said to be internally stabilizing if and only if the nine transfer matrices from $w,\ \nu_{1},\ \nu_{2}$ to $z,\ y,\ u$ are stable. We denote the set of all causal LTI internally stabilizing output-feedback controllers as $\mathcal{C}_{\text{stab}}$. We say that $\mathbf{P}$ is stabilizable if and only if $\mathcal{C}_{\text{stab}}\neq\emptyset$, and that any $\mathbf{K}\in\mathcal{C}_{\text{stab}}$ stabilizes $\mathbf{P}$. Furthermore, we say that a controller $\mathbf{K}$ stabilizes $\mathbf{G}$ if and only if the four transfer matrices from $\nu_{1},\ \nu_{2}$ to $y,\ u$ are all stable. For the rest of the paper we make the following assumption.

Assumption 1: The system $\mathbf{P}$ is stabilizable.

A test for stabilizability of $\mathbf{P}$ is offered in [22, Chapter 4]. It is well-known [22, Chapter 4], [7] that under Assumption 1 a controller $\mathbf{K}$ stabilizes $\mathbf{P}$ if and only if it stabilizes $\mathbf{G}$ . The control problem is to compute a dynamic output-feedback controller $\mathbf{K}\in\mathcal{C}_{\text{stab}}$ which minimizes a given norm $\|\cdot\|$ of

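$$f(\mathbf{K})=\mathbf{P}_{11}+\mathbf{P}_{12}\mathbf{K}(I_{p}-\mathbf{G}\mathbf{K})^{-1}\mathbf{P}_{21},\tag{2}$$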

which is the closed-loop transfer function from $w$ to $z$ .

In distributed control, it is common to add the requirement that $\mathbf{K}$ only uses partial output measurements. This requirement can be captured by adding the constraint $\mathbf{K}\in\text{Sparse}(S)$ for a given binary matrix $S\in\{0,1\}^{m\times p}$ , where $S_{ij}=0$ encodes the fact that the $i$ -th scalar control input cannot measure the $j$ -th measurement output. We formulate this distributed, sparsity-constrained control problem as follows [7]:

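$$\text{Problem } \mathcal{P}_{K}:\qquad \begin{aligned}&\underset{\mathbf{K}\in\mathcal{C}_{\text{stab}}}{\text{minimize}} && \|f(\mathbf{K})\|\\ &\text{subject to} && \mathbf{K}\in\text{Sparse}(S),\end{aligned}$$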

where $\|\cdot\|$ is any norm of interest. It was shown that a necessary and sufficient condition for a feasible solution to $\mathcal{P}_{K}$ to exist is that all the distributed fixed modes associated with $S$ lie in the left half of the complex plane [23]. Even if $\mathcal{P}_{K}$ is feasible, directly computing its optimal solution is intractable because the set $\mathcal{C}_{\text{stab}}$ is non-convex in general. This can be easily verified by checking that when $\mathbf{K}_{1},\mathbf{K}_{2}\in\mathcal{C}_{\text{stab}}$ , the controller $\mathbf{K}=\frac{1}{2}(\mathbf{K}_{1}+\mathbf{K}_{2})$ does not lie in $\mathcal{C}_{\text{stab}}$ in general. Furthermore, the cost function $\|f(\mathbf{K})\|$ is non-convex in $\mathbf{K}$ .

2.3 The Youla parametrization of stabilizing controllers

The first step to convexify problem $\mathcal{P}_{K}$ is to derive a convex formulation of the set $\mathcal{C}_{\text{stab}}$ and the function $f(\mathbf{K})$ . This is achieved by using a doubly coprime factorization of $\mathbf{G}$ .

Lemma 1 (Chapter 4 of [22])

For any $\mathbf{G}\in\mathcal{R}_{sp}^{p\times m}$ , there exist eight proper and stable transfer matrices defining a doubly coprime factorization of $\mathbf{G}$ , that is, they satisfy

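$$\mathbf{G}=\mathbf{N}_{r}\mathbf{M}_{r}^{-1}=\mathbf{M}_{l}^{-1}\mathbf{N}_{l},\qquad \begin{bmatrix}\mathbf{U}_{l}&-\mathbf{V}_{l}\\-\mathbf{N}_{l}&\mathbf{M}_{l}\end{bmatrix}\begin{bmatrix}\mathbf{M}_{r}&\mathbf{V}_{r}\\ \mathbf{N}_{r}&\mathbf{U}_{r}\end{bmatrix}=I_{m+p}.\tag{3}$$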

Then, the Youla parametrization of all internally stabilizing controllers [24] establishes the following equivalence [22, Chapter 4]:

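$$\mathcal{C}_{\text{stab}}=\{(\mathbf{V}_{r}-\mathbf{M}_{r}\mathbf{Q})(\mathbf{U}_{r}-\mathbf{N}_{r}\mathbf{Q})^{-1}\mid \mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}\}.\tag{4}$$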

Furthermore, it was proved in [22, Chapter 4] that the set of all closed-loop transfer functions from $w$ to $z$ achievable by $\mathbf{K}\in\mathcal{C}_{\text{stab}}$ is

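$$f(\mathcal{C}_{\text{stab}})=\{\mathbf{T}_{1}-\mathbf{T}_{2}\mathbf{Q}\mathbf{T}_{3}\mid \mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}\},$$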

where $f(\cdot)$ is defined in (2) and $\mathbf{T}_{1}=\mathbf{P}_{11}+\mathbf{P}_{12}\mathbf{V}_{r}\mathbf{M}_{l}\mathbf{P}_{21}$ , $\mathbf{T}_{2}=\mathbf{P}_{12}\mathbf{M}_{r}$ and $\mathbf{T}_{3}=\mathbf{M}_{l}\mathbf{P}_{21}$ . To facilitate our problem formulation, we define

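$$\mathbf{Y}_{Q}=(\mathbf{V}_{r}-\mathbf{M}_{r}\mathbf{Q})\mathbf{M}_{l},\tag{5}$$

$$\mathbf{X}_{Q}=(\mathbf{U}_{r}-\mathbf{N}_{r}\mathbf{Q})\mathbf{M}_{l}.\tag{6}$$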

It directly follows from (4) that

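$$\mathcal{C}_{\text{stab}}=\{\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\mid (5),\ (6),\ \mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}\}.$$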

We notice that (3) implies $\mathbf{U}_{r}=\mathbf{M}_{l}^{-1}+\mathbf{GV}_{r}$ and (5) implies $\mathbf{V}_{r}\mathbf{M}_{l}=\mathbf{Y}_{Q}+\mathbf{M}_{r}\mathbf{Q}\mathbf{M}_{l}$ . Hence, we have

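$$\mathbf{X}_{Q}=I_{p}+\mathbf{G}(\mathbf{Y}_{Q}+\mathbf{M}_{r}\mathbf{Q}\mathbf{M}_{l})-\mathbf{N}_{r}\mathbf{Q}\mathbf{M}_{l}=I_{p}+\mathbf{G}\mathbf{Y}_{Q}.$$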

Now we can equivalently reformulate $\mathcal{P}_{K}$ into the following optimization problem.

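$$\text{Problem } \mathcal{P}_{Q}:\qquad \begin{aligned}&\underset{\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}}{\text{minimize}} && \|\mathbf{T}_{1}-\mathbf{T}_{2}\mathbf{Q}\mathbf{T}_{3}\|\\ &\text{subject to} && (5),\ (6),\ \mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\in\text{Sparse}(S).\end{aligned}$$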

Without the sparsity constraint $\text{Sparse}(S)$ , problem $\mathcal{P}_{Q}$ would be convex, as (5), (6) and the cost function are affine in $\mathbf{Q}$ . The primary source of non-convexity is the requirement that $\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\in\text{Sparse}(S)$ . We conclude that the complexity of distributed control is ultimately linked to the non-convex sparsity requirement on the Youla parameter.
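As a sanity check of the identities in this subsection, the following Python sketch evaluates them at a single frequency for a scalar (SISO) example. Everything numerical here is an assumption made for illustration: the plant $\mathbf{G}=1/(s+1)$, the Youla parameter, the entries of $\mathbf{P}$, and the particular factorization, which uses the fact that for a stable $\mathbf{G}$ one valid doubly coprime factorization is $\mathbf{M}_r=\mathbf{M}_l=1$, $\mathbf{N}_r=\mathbf{N}_l=\mathbf{G}$, $\mathbf{U}_r=\mathbf{U}_l=1$, $\mathbf{V}_r=\mathbf{V}_l=0$.

```python
# Evaluate all transfer functions at one point s = j*omega (SISO case).
s = 1j * 0.7

G = 1 / (s + 1)            # stable, strictly proper plant (hypothetical)
Q = (s + 3) / (s + 2)      # arbitrary stable, proper Youla parameter
P11, P12, P21 = 0.5, 2.0 / (s + 4), 1.0 / (s + 5)   # made-up proper entries

# Doubly coprime factorization available when G itself is stable.
Mr = Ml = 1.0
Nr = Nl = G
Ur = Ul = 1.0
Vr = Vl = 0.0

# Bezout identity, written entrywise for the 2x2 block product
# [Ul -Vl; -Nl Ml] [Mr Vr; Nr Ur], which should equal the identity.
bezout = [[Ul * Mr - Vl * Nr, Ul * Vr - Vl * Ur],
          [-Nl * Mr + Ml * Nr, -Nl * Vr + Ml * Ur]]

YQ = (Vr - Mr * Q) * Ml    # Y_Q
XQ = (Ur - Nr * Q) * Ml    # X_Q
K = YQ / XQ                # controller K = Y_Q X_Q^{-1}

closed_loop = P11 + P12 * K / (1 - G * K) * P21      # f(K)
T1, T2, T3 = P11 + P12 * Vr * Ml * P21, P12 * Mr, Ml * P21
affine = T1 - T2 * Q * T3                            # T1 - T2 Q T3

print(abs(XQ - (1 + G * YQ)) < 1e-12)     # X_Q = I + G Y_Q
print(abs(closed_loop - affine) < 1e-12)  # f(K) is affine in Q
```

The point of the check is that the closed loop is nonlinear in $\mathbf{K}$ but affine in $\mathbf{Q}$, so the only remaining obstacle to convexity is the sparsity requirement on $\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}$.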

3 Sparsity Invariance

One approach to remove the non-convex sparsity requirement on the Youla parameter is as follows: replace $\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\in\text{Sparse}(S)$ with the convex constraint that $\mathbf{Y}_{Q}$ and $\mathbf{X}_{Q}$ comply with appropriate sparsity patterns, in a way such that $\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}$ is guaranteed to lie in $\text{Sparse}(S)$ . In other words, we restrict our attention to distributed sparse controllers $\mathbf{K}\in\text{Sparse}(S)$ defined as the product of two structured matrix factors. We note that related ideas appeared for the specific case of row-column sparsities (e.g. [10, 20]), but the case of arbitrary sparsities was not addressed.

Following the general idea above, in this paper we investigate a notion of Sparsity Invariance (SI) for convex design of sparse controllers. As will be thoroughly discussed in Section 4, SI leads to the largest known class of convex restrictions of $\mathcal{P}_{K}$ for general systems subject to sparsity constraints on the controller.

Definition 1 (Sparsity Invariance (SI))

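Given a binary matrix $S$, the pair of binary matrices $T,R$ satisfies a property of sparsity invariance (SI) with respect to $S$ if

$$\mathbf{Y}\in\text{Sparse}(T)\ \text{and}\ \mathbf{X}\in\text{Sparse}(R)\quad\Rightarrow\quad \mathbf{Y}\mathbf{X}^{-1}\in\text{Sparse}(S).$$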

Motivated by the SI property, consider the following convex problem:

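$$\text{Problem } \mathcal{P}_{T,R}:\qquad \begin{aligned}&\underset{\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}}{\text{minimize}} && \|\mathbf{T}_{1}-\mathbf{T}_{2}\mathbf{Q}\mathbf{T}_{3}\|\\ &\text{subject to} && (5),\ (6),\ \mathbf{Y}_{Q}\mathbf{\Gamma}\in\text{Sparse}(T),\ \mathbf{X}_{Q}\mathbf{\Gamma}\in\text{Sparse}(R),\end{aligned}$$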

where $T\in\{0,1\}^{m\times p}$, $R\in\{0,1\}^{p\times p}$ and $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$, with $\mathbf{\Gamma}$ invertible, are parameters to be designed before performing the optimization. For simplicity, one could select $\mathbf{\Gamma}=I_{p}$, but we illustrate in Example 1 of Section 4 that there are cases where a different choice of $\mathbf{\Gamma}$ might lead to improved and even globally optimal performance for non-QI problems. For any choice of $T$, $R$ and $\mathbf{\Gamma}$, the above program is convex. One fundamental question is when its feasible solutions lead to stabilizing controllers $\mathbf{K}=(\mathbf{Y}_{Q}\mathbf{\Gamma})(\mathbf{X}_{Q}\mathbf{\Gamma})^{-1}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}$ lying in the desired sparsity subspace $\text{Sparse}(S)$. The SI property of Definition 1 is a mathematical expression of this requirement.

In the next subsection we establish necessary and sufficient conditions on the binary matrices $T$ and $R$ to satisfy the SI property of Definition 1.

Remark 1

Note that the notion of SI is an algebraic requirement for binary matrices $R$ and $T$, given a binary matrix $S$. This is independent of the parameterization of internally stabilizing controllers. In addition to the Youla parameterization, we recently observed that the SI idea of Definition 1 is equivalently applicable within the system-level [14] (SLP) and input-output [25] (IOP) parameterizations, in both continuous and discrete time. We refer to [26, Remark 4] for details. For brevity, in this paper we will develop our theoretical results within the Youla parameterization, and note that they can be straightforwardly applied to the SLP and the IOP.

Remark 2

We assume that $R\geq I_{p}$. Since $\mathbf{X}_{Q}=I_{p}+\mathbf{GY}_{Q}\in\text{Sparse}(R)$ and $\mathbf{G}$ is strictly proper, the assumption is without loss of generality for $\mathbf{\Gamma}=I_{p}$. For convenience, in the definition of problem $\mathcal{P}_{T,R}$ we do not indicate $\mathbf{\Gamma}$ explicitly as a parameter. This is because the SI property of Definition 1 only depends on the binary matrices $T$ and $R$.

3.1 Characterization of SI

One immediate idea in designing the binary matrices $T$ and $R$ to guarantee $\mathbf{K}=(\mathbf{Y}_{\mathbf{Q}}\mathbf{\Gamma})(\mathbf{X}_{\mathbf{Q}}\mathbf{\Gamma})^{-1}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\in\text{Sparse}(S)$ is to simply select $T=S$ and $R=I_{p}$ similar to [16, 27, 17]. However, many other choices are available that lead to improved convex restrictions.

The next theorem provides a full characterization of the SI property of Definition 1 in terms of the binary matrices $T$ and $R$.

Theorem 1

Let $T\in\{0,1\}^{m\times p}$ and $R\in\{0,1\}^{p\times p}$ be such that $R\geq I_{p}$ . The following two statements are equivalent:

1. $T\leq S$ and $TR^{p-1}\leq S$.

2. SI as per Definition 1 holds.
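The algebraic condition in the first statement of Theorem 1 is easy to test numerically. The sketch below is an illustration under assumptions, not the authors' code: the function names are hypothetical, products and powers of binary matrices are taken structurally (as in the conventions of Section 2.1), and the two $3\times 3$ patterns are made-up examples.

```python
def struct_prod(X, Z):
    """Struct(XZ): entry (i, k) is 1 if X[i][j] = Z[j][k] = 1 for some j."""
    return [[1 if any(X[i][j] and Z[j][k] for j in range(len(Z))) else 0
             for k in range(len(Z[0]))] for i in range(len(X))]

def struct_power(R, k):
    """Struct(R^k) for k >= 0, by repeated structural multiplication."""
    p = len(R)
    out = [[1 if i == j else 0 for j in range(p)] for i in range(p)]
    for _ in range(k):
        out = struct_prod(out, R)
    return out

def leq(X, Xh):
    """Entrywise comparison X <= Xh of two binary matrices."""
    return all(a <= b for rx, rh in zip(X, Xh) for a, b in zip(rx, rh))

def satisfies_theorem1(T, R, S):
    """Check T <= S and T R^(p-1) <= S, assuming R >= I_p."""
    p = len(R)
    if not all(R[i][i] == 1 for i in range(p)):
        raise ValueError("Theorem 1 is stated for R >= I_p")
    return leq(T, S) and leq(struct_prod(T, struct_power(R, p - 1)), S)

# Hypothetical patterns: lower triangular, and a lower bidiagonal "chain".
S_tri = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
S_chain = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(satisfies_theorem1(S_tri, S_tri, S_tri))        # True
print(satisfies_theorem1(S_chain, S_chain, S_chain))  # False
print(satisfies_theorem1(S_chain, I3, S_chain))       # True
```

In the chain example, choosing $R=S$ fails because the product $TR$ already fills entry $(3,1)$, which $S$ forbids, whereas the conservative choice $R=I_{p}$ always passes when $T\leq S$.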

The proof of Theorem 1 is reported in Appendix A.1. The relevance of Theorem 1 to characterizing a class of convex restrictions of $\mathcal{P}_{K}$ is stated in the following Corollary.

Corollary 1

Let $T\in\{0,1\}^{m\times p}$ and $R\in\{0,1\}^{p\times p}$ be such that $R\geq I_{p}$ , $T\leq S$ and $TR^{p-1}\leq S$ . Then, problem $\mathcal{P}_{T,R^{p-1}}$ is a convex restriction of $\mathcal{P}_{K}$ for any invertible transfer matrix $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$ .

Proof

Problem $\mathcal{P}_{T,R^{p-1}}$ is obviously convex. We only need to show that any solution to $\mathcal{P}_{T,R^{p-1}}$ corresponds to a feasible solution of $\mathcal{P}_{Q}$.

First, given any invertible $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$ we have

$$(\mathbf{Y}_{Q}\mathbf{\Gamma})(\mathbf{X}_{Q}\mathbf{\Gamma})^{-1}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}.$$

Let $\mathbf{Y}=\mathbf{Y}_{Q}\mathbf{\Gamma}$ and $\mathbf{X}=\mathbf{X}_{Q}\mathbf{\Gamma}$ in Definition 1. Since the SI property holds by Theorem 1, by definition $\mathbf{YX}^{-1}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\in\text{Sparse}(S)$ and thus every solution of $\mathcal{P}_{T,R}$ is a solution of $\mathcal{P}_{Q}$.

Second, since $\mathcal{P}_{Q}$ is equivalent to $\mathcal{P}_{K}$, we conclude that $\mathcal{P}_{T,R}$ is a restriction of $\mathcal{P}_{K}$ for every invertible $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$.

Finally, since $TR^{p-1}\leq S$ and $R\geq I_{p}$ we have that $T(R^{p-1})^{p-1}\leq S$ by transitive closure of the graph having $R$ as its adjacency matrix. Hence, $\mathcal{P}_{T,R^{p-1}}$ is a convex restriction of $\mathcal{P}_{K}$ for every invertible $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$.

In summary, the algebraic conditions

$$T\leq S\quad\text{and}\quad TR^{p-1}\leq S,\tag{10}$$

are equivalent to SI and yield a class of convex restrictions of $\mathcal{P}_{K}$. Clearly, our condition (10) includes the choice of $T=S$ with $R$ (block-)diagonal as per [27, 17, 16]. We will further show in Section 4 that the convex restrictions developed in [21] are a particular case of (10). Therefore, our notion of SI naturally encompasses and extends previous convex restrictions of $\mathcal{P}_{K}$.

Remark 3

For each $T$ and $R$ as per (10), it is always preferable to solve the convex restriction $\mathcal{P}_{T,R^{p-1}}$ instead of $\mathcal{P}_{T,R}$. Indeed, notice that since $TR^{p-1}\leq S$ and $R\geq I_{p}$, then $T(R^{p-1})^{p-1}\leq S$. Equivalently, when $T$ and $R$ satisfy sparsity invariance (10), so do $T$ and $R^{p-1}$, and both $\mathcal{P}_{T,R}$ and $\mathcal{P}_{T,R^{p-1}}$ are convex restrictions of $\mathcal{P}_{K}$. Since requiring $\mathbf{X}_{Q}\in\text{Sparse}(R^{\prime})$ for some $R^{\prime}<R^{p-1}$ may be conservative in the case $\text{Sparse}(R^{\prime})\subset\text{Sparse}(R^{p-1})$, we will focus on the convex restriction $\mathcal{P}_{T,R^{p-1}}$ to avoid this possibility.

After determining all the matrices $T$ and $R$ for sparsity invariance, a natural follow-up question arises: how can we choose $T$ and $R$ as per Theorem 1 to obtain a convex restriction of $\mathcal{P}_{K}$ that is as tight as possible?

3.2 Optimized design of SI

Here, we study how to choose the binary matrices $T$ and $R$ optimally for a fixed invertible $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$ .

In order to determine the best performing choice for $T$ and $R$ satisfying (10), one would need in general to solve $\mathcal{P}_{T,R^{p-1}}$ with the chosen $\mathbf{\Gamma}$ for each $T$ and $R$ such that (10) holds, and then select the problem minimizing the objective $\|\mathbf{T}_{1}-\mathbf{T}_{2}\mathbf{Q}\mathbf{T}_{3}\|$. Clearly, this approach is not tractable in general, as the number of convex programs to solve is exponential in $m$ and $p$: one convex program for each pair of binary matrices $T$ and $R$ such that $TR^{p-1}\leq S$. Even if we simplify the search above by fixing any $T\leq S$ and looking for the best performing choice of $R$, the number of convex programs is still exponential in $p$: one convex program for each binary matrix $R$ such that $TR^{p-1}\leq S$. To deal with the above challenges, here we suggest a suboptimal, but computationally efficient algorithm that generates a locally optimized binary matrix $R$ tailored to any chosen $T\leq S$.

Specifically, our proposed approach is to 1) select $T\leq S$ and then 2) compute the binary matrix $R^{\star}_{T}$ which is the least sparse among those satisfying

$$TR^{p-1}\leq T.\tag{11}$$

Clearly, both $1)$ and $2)$ above are simplifications of the general problem of finding the globally tightest convex restriction $\mathcal{P}_{T,R}$ of $\mathcal{P}_{K}$ for a fixed invertible $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$ ; indeed, we do not optimize over $T$ and we impose (11), a condition stronger than the SI requirement (10). The gain is that $R^{\star}_{T}$ is unique and can be computed efficiently as per Algorithm 1, which has a polynomial complexity of $\mathcal{O}(mp^{2})$ .

The idea behind Algorithm 1 is to only set an entry of $R_{T}^{\star}$ to 0 if the condition $TR_{T}^{\star}\leq T$ would be violated. We now formalize the main result about $R_{T}^{\star}$ .

Theorem 2

Consider a binary matrix $T\in\{0,1\}^{m\times p}$ , and define $\mathcal{R}_{T}:=\{R\in\{0,1\}^{p\times p}\mid R\geq I_{p},(\ref{eq:restriction})\text{ holds}\}$ . Then,

1. There exists a unique $R^{\star}_{T}\in\mathcal{R}_{T}$ such that $R^{\star}_{T}\geq R^{p-1}$, $\forall R\in\mathcal{R}_{T}$.

2. Such $R^{\star}_{T}$ can be computed via Algorithm 1.

Proof

Let $R^{\star}_{T}$ be the unique binary matrix generated by Algorithm 1. It is easy to check that $TR^{\star}_{T}\leq T$ by construction. Since $R^{\star}_{T}\geq I_{p}$, it follows that $(TR^{\star}_{T})R^{\star}_{T}\leq TR^{\star}_{T}\leq T$ and $T(R^{\star}_{T})^{p-1}\leq\cdots\leq TR^{\star}_{T}\leq T$. We conclude that $R^{\star}_{T}\in\mathcal{R}_{T}$.

Next, consider any binary matrix $R\in\mathcal{R}_{T}$. By definition, we have that $TR^{p-1}\leq T$ and so $(R^{p-1})_{jk}=0$ whenever $T_{ij}=1$ and $T_{ik}=0$. Then, $R^{p-1}\leq R^{\star}_{T}$ since $(R^{\star}_{T})_{jk}$ is set to $0$ by Algorithm 1 if and only if $T_{ik}=0$ and $T_{ij}=1$ for some $i$. Therefore, we have $R^{p-1}\leq R^{\star}_{T}$, $\forall R\in\mathcal{R}_{T}$.
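The rule used in the proof, namely that $(R^{\star}_{T})_{jk}=0$ exactly when some row $i$ has $T_{ij}=1$ and $T_{ik}=0$, can be sketched in Python as follows. This is a minimal illustration of that rule only; Algorithm 1 as stated in the paper may be organized differently, and the function name `r_star` and the test pattern `T` are hypothetical. The three nested loops match the stated $\mathcal{O}(mp^{2})$ complexity.

```python
import numpy as np

def r_star(T):
    """Least sparse binary R with T R <= T, following the rule in the proof:
    R[j, k] = 0 iff some row i has T[i, j] = 1 and T[i, k] = 0."""
    m, p = T.shape
    R = np.ones((p, p), dtype=int)
    for i in range(m):
        for k in range(p):
            if T[i, k] == 0:
                for j in range(p):
                    if T[i, j] == 1:
                        R[j, k] = 0
    return R

def pattern(M):
    """0/1 sparsity pattern of a nonnegative integer matrix."""
    return (M > 0).astype(int)

# Hypothetical pattern, used only for illustration.
T = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1]])
R = r_star(T)
print(R)
print("T R <= T holds:", bool(np.all(pattern(T @ R) <= T)))
```

The diagonal of the result is always one, because $T_{ij}=1$ and $T_{ij}=0$ cannot hold simultaneously, so $R^{\star}_{T}\geq I_{p}$ is automatic.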

The next corollary connects our result to characterizing tight convex restrictions of $\mathcal{P}_{K}$ .

Corollary 2

Given a binary matrix $T\leq S$ , compute $R^{\star}_{T}$ as per Algorithm 1. Then, for every fixed invertible $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$ , $\mathcal{P}_{T,R^{\star}_{T}}$ is the tightest convex restriction of $\mathcal{P}_{K}$ among those in the form $\mathcal{P}_{T,R^{p-1}}$ with $R\in\mathcal{R}_{T}$ .

Proof

Fix an invertible $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$ and consider the problems $\mathcal{P}_{T,R^{p-1}}$ and $\mathcal{P}_{T,R^{\star}_{T}}$ , where $R\in\mathcal{R}_{T}$ and $R^{\star}_{T}$ is generated by Algorithm 1. By Theorem 2, we have $R^{p-1}\leq R^{\star}_{T}$ , meaning that $\text{Sparse}(R^{p-1})\subset\text{Sparse}(R^{\star}_{T})$ .

The only difference between problem $\mathcal{P}_{T,R^{p-1}}$ and problem $\mathcal{P}_{T,R^{\star}_{T}}$ is: $\mathcal{P}_{T,R^{p-1}}$ requires $\mathbf{X}_{Q}\mathbf{\Gamma}\in\text{Sparse}(R^{p-1})$ while $\mathcal{P}_{T,R^{\star}_{T}}$ requires $\mathbf{X}_{Q}\mathbf{\Gamma}\in\text{Sparse}(R^{\star}_{T})$ . Therefore, we conclude that $\mathcal{P}_{T,R^{\star}_{T}}$ admits the largest feasible region among all $\mathcal{P}_{T,R^{p-1}}$ with $R\in\mathcal{R}_{T}$ . This completes our proof.

Our suggested procedure finds a tight convex restriction of $\mathcal{P}_{K}$ by using the computationally efficient Algorithm 1, which makes the approach practical. However, optimally choosing $\mathbf{\Gamma}$ and $T$ is a non-trivial task, which we leave for future work. We remark that, in the absence of further insight, one can always choose $T=S$ and $\mathbf{\Gamma}=I_{p}$ and still obtain sparse controllers with tight sub-optimality gaps, as will be shown experimentally in Section 5. Furthermore, as shown in Section 4, the trivial choice $T=S$ and $\mathbf{\Gamma}=I_{p}$ combined with Algorithm 1 for choosing $R$ is sufficient to recover and extend the optimality results of [7], [21], which are based on the notion of Quadratic Invariance (QI). We conclude this section by providing an example to illustrate the SI approach.

Example 1

Motivated by the numerical example in [7], let us consider the unstable plant

[TABLE]

with $u(\sigma)=u(s)=\frac{1}{s+1}$ , $v(\sigma)=v(s)=\frac{1}{s-1}$ in continuous-time or $u(\sigma)=u(z)=\frac{0.1}{z-0.5}$ , $v(\sigma)=v(z)=\frac{1}{z-2}$ in discrete-time, and define

[TABLE]

Our goal is to design a stabilizing controller $\mathbf{K}$ which minimizes $\|f(\mathbf{K})\|_{\mathcal{H}_{2}}$ and satisfies the sparsity pattern below:

[TABLE]

This information structure is depicted in Figure 2.

Here, we apply the proposed SI approach and Algorithm 1 for sparsity design in order to obtain a convex restriction of $\mathcal{P}_{K}$ . For this instance, we choose to fix $T=S$ and $\mathbf{\Gamma}=I_{p}$ . According to Theorem 2 and Corollary 2, the tightest convex restriction of $\mathcal{P}_{K}$ such that $TR^{p-1}=SR^{p-1}\leq S$ is $\mathcal{P}_{S,R^{\star}_{S}}$ , where $R^{\star}_{S}$

[TABLE]

is generated via Algorithm 1. Given a doubly coprime factorization of $\mathbf{G}$ , any solution of $\mathcal{P}_{S,R^{\star}_{S}}$ is in the form $\mathbf{K}=\mathbf{Y}_{Q}(\mathbf{X}_{Q})^{-1}\in\mathcal{C}_{\text{stab}}\cap\text{Sparse}(S)$ , where $\mathbf{Y}_{Q}\in\text{Sparse}(S)$ , $\mathbf{X}_{Q}\in\text{Sparse}(R^{\star}_{S})$ and $(\mathbf{X}_{Q})^{-1}\in\text{Sparse}(R^{\star}_{S})$ .

Remark 4 (Performance improvement)

The classical idea would be to require that $\mathbf{X}_{Q}$ is diagonal as per [16, 27, 17]; instead, SI allows the off-diagonal entries of $\mathbf{X}_{Q}=I_{p}+\mathbf{GY}_{Q}$ to be non-zero through the optimized choice of $R^{\star}_{S}$ , thus removing unnecessary constraints on the entries of $\mathbf{GY}_{Q}$ . This additional freedom can be seen graphically on the right side of Figure 2: the information flow from outputs to control inputs remains the same as the one encoded by $S$ , but we allow for as many arrows as possible in the first stage from outputs to the rows of $\mathbf{X}^{-1}$ , thus maximizing the degrees of freedom in the optimization. In Section 5 we numerically solve $\mathcal{P}_{S,R^{\star}_{S}}$ for this example and show that a performance improvement over the method of [21] is obtained.

4 Beyond Quadratic Invariance

We start by recalling the well-known notion of Quadratic Invariance (QI) [7] in Subsection 4.1, and its application to the design of globally optimal [7] and sub-optimal [21] distributed dynamic output-feedback controllers in Subsection 4.2. In Subsections 4.3 and 4.4 we show that the suggested SI notion strictly goes beyond that of QI for sparsity constraints: 1) the controllers obtained using the SI notion perform at least as well as those obtained by [7] and [21]; 2) we show through examples that, using the SI notion, we can recover globally optimal controllers even when QI does not hold, and that strict performance improvements over [21] can be obtained in general. Last, in Subsection 4.5, we discuss the applicability of SI to computing distributed static controllers, a setting where the QI notion is not applicable.

4.1 Quadratic Invariance

The celebrated work of [7] characterized a condition on $\mathbf{G}$ and $\text{Sparse}(S)$ , termed quadratic invariance (QI), under which $\mathcal{P}_{K}$ admits an exact convex reformulation in the Youla parameter $\mathbf{Q}$ .

Definition 2 (Quadratic invariance [7])

A subspace $\mathcal{K}\subseteq\mathcal{R}_{c}^{m\times p}$ is QI with respect to $\mathbf{G}$ if

$$\mathbf{K}\mathbf{G}\mathbf{K}\in\mathcal{K},\quad\forall\mathbf{K}\in\mathcal{K}.$$

For the purpose of this paper, we will limit our focus to QI sparsity subspaces in the form $\text{Sparse}(S)$ . It is shown that given a controller $\mathbf{K}_{\text{nom}}\in\text{Sparse}(S)$ that stabilizes $\mathbf{G}$ and is itself stable, there exists a parametrization such that $\mathbf{K}\in\text{Sparse}(S)\Leftrightarrow\mathbf{Q}\in\text{Sparse}(S)$ [7]. Accordingly, a convex optimization problem equivalent to $\mathcal{P}_{K}$ is obtained. The requirement of a stable and stabilizing controller $\mathbf{K}_{\text{nom}}$ was removed in [28]. One main result from [28] is as follows:

Theorem 3 (Theorem IV.2 of [28])

Consider any doubly-coprime factorization of $\mathbf{G}$ and let $\text{Sparse}(S)$ be QI with respect to $\mathbf{G}$ . Then, the following two statements hold:

1. If $\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}$ is such that $\mathbf{Y}_{Q}\in\text{Sparse}(S)$ , then $\mathbf{K}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}$ is a stabilizing controller in $\text{Sparse}(S)$ .

2. For any $\mathbf{K}\in\mathcal{C}_{\text{stab}}\cap\text{Sparse}(S)$ there exists $\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}$ for which $\mathbf{Y}_{Q}\in\text{Sparse}(S)$ and $\mathbf{K}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}$ .

According to Theorem 3, if $\text{Sparse}(S)$ is QI with respect to $\mathbf{G}$ , then $\mathcal{P}_{K}$ can be equivalently reformulated as

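$$\min_{\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}}\ \|\mathbf{T}_{1}-\mathbf{T}_{2}\mathbf{Q}\mathbf{T}_{3}\|\quad\text{subject to}\quad\mathbf{Y}_{Q}\in\text{Sparse}(S). \qquad (12)$$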

The optimal solution $\mathbf{Q}^{\star}$ of (12) can be used to recover the globally optimal solution $\mathbf{K}^{\star}$ of $\mathcal{P}_{K}$ via $\mathbf{K}^{\star}=\mathbf{Y}_{Q^{\star}}\mathbf{X}_{Q^{\star}}^{-1}$ .

4.2 Convex restrictions for non-QI sparsity patterns

When $\text{Sparse}(S)$ is not QI with respect to $\mathbf{G}$ , the authors of [21] proposed finding a binary matrix $T_{\text{QI}}<S$ such that $\text{Sparse}(T_{\text{QI}})$ is QI with respect to $\mathbf{G}$ . Then, the constraint $\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}\in\text{Sparse}(S)$ of problem $\mathcal{P}_{Q}$ can be replaced by $\mathbf{Y}_{Q}\in\text{Sparse}(T_{\text{QI}})$ , and any feasible $\mathbf{Q}$ for this convex program will correspond to a feasible controller

[TABLE]

This inclusion (13) directly follows from Theorem 3 and the fact that $\text{Sparse}(T_{\text{QI}})\subset\text{Sparse}(S)$ .

A challenge of this approach is to compute $T_{\text{QI}}$ such that $\text{Sparse}(T_{\text{QI}})$ is QI and as close as possible to $S$ in order to reduce conservatism, in the sense that $\|S\|_{0}-\|T_{\text{QI}}\|_{0}$ is minimized. In general, there might be multiple choices of $T_{\text{QI}}$ with the same cardinality. Furthermore, the QI condition $T_{\text{QI}}\Delta T_{\text{QI}}\leq T_{\text{QI}}$ of [7, Theorem 26], where $\Delta=\text{Struct}(\mathbf{G})$ , is nonlinear in $T_{\text{QI}}$ . For these reasons, a procedure to compute a closest QI subset of $S$ in polynomial time was not provided in [21]. In contrast, we have shown that the polynomial-time Algorithm 1 can be combined with the SI notion to find a convex restriction for any given $T\leq S$ . In the next subsections, we show that, by choosing $T\leq S$ appropriately, the recovered controllers perform at least as well as those based on the notion of QI, and can perform strictly better in general even with the trivial choice $T=S$ .
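The binary QI test $S\Delta S\leq S$ quoted above from [7, Theorem 26] is easy to evaluate for a given pattern, even though searching for the closest QI subset is hard. The Python sketch below checks the condition on 0/1 matrices; the function name `is_qi` and the two example patterns are hypothetical and chosen only for illustration.

```python
import numpy as np

def is_qi(S, Delta):
    """Check the binary condition S * Delta * S <= S.
    S is the m x p controller pattern, Delta the p x m plant pattern."""
    SDS = ((S @ Delta @ S) > 0).astype(int)
    return bool(np.all(SDS <= S))

# Hypothetical plant pattern: full lower-triangular 3x3.
Delta = np.tril(np.ones((3, 3), dtype=int))

lower = np.tril(np.ones((3, 3), dtype=int))   # lower-triangular controller
diagonal = np.eye(3, dtype=int)               # fully decentralized controller

print("lower-triangular pattern is QI:", is_qi(lower, Delta))
print("diagonal pattern is QI:", is_qi(diagonal, Delta))
```

The test is nonlinear in the pattern because $S$ appears twice in the product, which is the difficulty noted above when $T_{\text{QI}}$ is itself the unknown.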

4.3 Connections of SI with QI

Here, we show that it is not necessary to check the QI property in order to obtain a globally optimal solution. Note that checking the property of QI before solving $\mathcal{P}_{K}$ was proposed in [7] and required in many subsequent works. Indeed, the approach in [7] is guaranteed to yield feasible solutions for $\mathcal{P}_{K}$ only if QI holds. Instead, our technique can be directly applied given $S$ without first checking QI. This result is summarized in the following theorem and corollary.

Theorem 4

Let $\Delta=\text{Struct}(\mathbf{G})$ and let $R^{\star}_{S}$ be the binary matrix generated by Algorithm 1 with $T=S$ . The following statements are equivalent.

i) $\text{Sparse}(S)$ is QI with respect to $\mathbf{G}$ .

ii) $R^{\star}_{S}\geq I_{p}+\Delta S$ , where $R^{\star}_{S}$ is generated by Algorithm 1 with $T=S$ .

Proof

i) $\Rightarrow$ ii): Suppose that $\text{Sparse}(S)$ is QI with respect to $\mathbf{G}$ . We have that $S\Delta S\leq S$ by [7, Theorem 26], implying that $S(I_{p}+\Delta S)\leq S$ and ultimately

[TABLE]

In addition, we have that $R^{\star}_{S}\geq I_{p}$ and $SR^{\star}_{S}\leq S$ by construction. It follows that $S(R^{\star}_{S})^{p-1}\leq\ldots\leq SR^{\star}_{S}\leq S$ . Also, according to Theorem 2, we have $R^{\star}_{S}\geq R$ , $\forall R\geq I_{p}$ such that $SR^{p-1}\leq S$ . Setting $R=I_{p}+\Delta S$ , we have shown above that $SR^{p-1}\leq S$ . Hence, $R^{\star}_{S}\geq R=I_{p}+\Delta S$ .

ii) $\Rightarrow$ i): Suppose that $R^{\star}_{S}\geq I_{p}+\Delta S$ , which implies $(R^{\star}_{S})^{p-1}\geq(I_{p}+\Delta S)^{p-1}$ . By definition of $R^{\star}_{S}$ , we have observed that $S(R^{\star}_{S})^{p-1}\leq S$ . It follows that

[TABLE]

Combining (14) with the fact that $(I_{p}+\Delta S)\geq I_{p}$ , we have

[TABLE]

This implies $S\Delta S\leq S$ which is equivalent to QI by [7, Theorem 26].
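The equivalence of Theorem 4 can be checked exhaustively on small instances. The following Python sketch does so for every $3\times 3$ binary pattern $S$ and one hypothetical plant pattern $\Delta$; it uses the entry rule stated in the proof of Theorem 2 to build $R^{\star}_{S}$ rather than Algorithm 1 verbatim, and all function names are illustrative assumptions.

```python
import itertools
import numpy as np

def pattern(M):
    """0/1 sparsity pattern of a nonnegative integer matrix."""
    return (M > 0).astype(int)

def r_star(S):
    """R[j, k] = 0 iff some row i has S[i, j] = 1 and S[i, k] = 0."""
    m, p = S.shape
    R = np.ones((p, p), dtype=int)
    for i in range(m):
        for j in range(p):
            for k in range(p):
                if S[i, j] == 1 and S[i, k] == 0:
                    R[j, k] = 0
    return R

def is_qi(S, Delta):
    """Binary QI test S * Delta * S <= S."""
    return bool(np.all(pattern(S @ Delta @ S) <= S))

# Hypothetical plant pattern: full lower-triangular 3x3.
Delta = np.tril(np.ones((3, 3), dtype=int))

mismatches = 0
num_qi = 0
for bits in itertools.product([0, 1], repeat=9):
    S = np.array(bits, dtype=int).reshape(3, 3)
    qi = is_qi(S, Delta)
    bound = pattern(np.eye(3, dtype=int) + Delta @ S)
    dominated = bool(np.all(r_star(S) >= bound))
    mismatches += int(qi != dominated)
    num_qi += int(qi)

print("patterns violating the equivalence:", mismatches)
```

A check of this kind does not replace the proof, but it is a cheap sanity test when experimenting with other plant patterns.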

Corollary 3

The following statements are equivalent.

i) $\text{Sparse}(S)$ is QI with respect to $\mathbf{G}$ .

ii) $\mathcal{P}_{K}$ is equivalent to $\mathcal{P}_{S,R^{\star}_{S}}$ with $\mathbf{\Gamma}=I_{p}$ , where $R^{\star}_{S}$ is the binary matrix generated by Algorithm 1 with $T=S$ .

Proof

It is well-known [28, 8] that (12) is equivalent to $\mathcal{P}_{K}$ if and only if QI holds. It remains to show that $\mathcal{P}_{S,R^{\star}_{S}}$ is equivalent to (12) if and only if QI holds.

We first show that $\mathbf{X}_{Q}$ lies in $\text{Sparse}(I_{p}+\Delta S)$ for every $\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}$ such that $\mathbf{Y}_{Q}\in\text{Sparse}(S)$ . Indeed, by (8) we have $\mathbf{X}_{Q}=I_{p}+\mathbf{GY}_{Q}$ for every $\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}$ and thus $\mathbf{X}_{Q}\in\text{Sparse}(I_{p}+\Delta S)$ . We have shown in Theorem 4 that QI is equivalent to $R^{\star}_{S}\geq I_{p}+\Delta S$ , where $R^{\star}_{S}$ is generated by Algorithm 1. It follows that the constraint $\mathbf{Y}_{Q}\mathbf{\Gamma}=\mathbf{Y}_{Q}\in\text{Sparse}(S)$ makes the constraint $\mathbf{X}_{Q}\mathbf{\Gamma}=\mathbf{X}_{Q}\in\text{Sparse}(R^{\star}_{S})$ redundant and thus $\mathcal{P}_{S,R^{\star}_{S}}$ with $\mathbf{\Gamma}=I_{p}$ is equivalent to (12). This concludes the proof.

Essentially, Theorem 4 shows that QI is equivalent to $R^{\star}_{S}\geq I_{p}+\Delta S$ . Since $\mathbf{X}_{Q}\in\text{Sparse}(I_{p}+\Delta S)$ by (8) when $\mathbf{Y}_{Q}\in\text{Sparse}(S)$ , the constraint $\mathbf{X}_{Q}\in\text{Sparse}(R^{\star}_{S})$ becomes redundant if and only if QI holds and the convex program we obtain with SI, namely $\mathcal{P}_{S,R^{\star}_{S}}$ with $\mathbf{\Gamma}=I_{p}$ , is equivalent to $\mathcal{P}_{K}$ due to the results of [7].

Theorems 1, 2 and 4, and Corollaries 1–3 can be summarized as follows.

1. Given any distributed sparsity-constrained control problem $\mathcal{P}_{K}$ , one can always cast and solve its convex restriction $\mathcal{P}_{S,R^{\star}_{S}}$ , where $R^{\star}_{S}$ is generated by Algorithm 1.

2. If $\mathcal{P}_{S,R^{\star}_{S}}$ is feasible, its optimal solution is also feasible for $\mathcal{P}_{K}$ , and is certified to be globally optimal if $\text{Sparse}(S)$ is QI with respect to $\mathbf{G}$ .

We remark that verifying QI is optional and can be done a posteriori to check global optimality of the solution; QI is not part of the controller design procedure in the SI approach. Hence, Theorem 4 expands the applicability of convex programming to compute distributed controllers for arbitrary systems and sparsity patterns, while maintaining previous global optimality results.

Example 2

Consider the unstable system and the sparsity pattern $S$ of Example 1. We can verify that $S\Delta S\not\leq S$ , where $\Delta=\text{Struct}(\mathbf{G})$ , and hence $\text{Sparse}(S)$ is not QI with respect to $\mathbf{G}$ . Instead, let us consider the new sparsity pattern

[TABLE]

We can verify that $S_{2}\Delta S_{2}\leq S_{2}$ . Hence, $\text{Sparse}(S_{2})$ is QI with respect to $\mathbf{G}$ . By applying Algorithm 1 we obtain

[TABLE]

In accordance with Theorem 4 we have that $R^{\star}_{S_{2}}\geq I_{p}+\Delta S_{2}$ , but $R^{\star}_{S}\not\geq I_{p}+\Delta S$ (see the entries highlighted in red). By Corollary 3, we conclude that the convex program $\mathcal{P}_{S_{2},R^{\star}_{S_{2}}}$ with $\mathbf{\Gamma}=I_{p}$ is equivalent to $\mathcal{P}_{K}$ with the sparsity constraint $\mathbf{K}\in\text{Sparse}(S_{2})$ , while $\mathcal{P}_{S,R^{\star}_{S}}$ is a convex restriction of $\mathcal{P}_{K}$ for every invertible $\mathbf{\Gamma}\in\mathcal{R}_{c}^{p\times p}$ .

Next, we show that SI generalizes the class of restrictions of [21], based on finding QI subsets of $\text{Sparse}(S)$ which are nearest to $\text{Sparse}(S)$ . The result is a straightforward corollary of Theorem 4.

Corollary 4

Let $\text{Sparse}(T_{\text{QI}})\subseteq\text{Sparse}(S)$ be QI with respect to $\mathbf{G}$ and let $\|S\|_{0}-\|T_{\text{QI}}\|_{0}$ be minimal as proposed in [21]. Then, there exists $T\leq S$ such that $J^{\star}\leq J_{\text{QI}}$ , where $J^{\star}$ is the minimum cost of $\mathcal{P}_{T,R^{\star}_{T}}$ with $\mathbf{\Gamma}=I_{p}$ , and $J_{\text{QI}}$ is the minimum cost of problem (12) with the constraint $\mathbf{Y}_{Q}\in\text{Sparse}(S)$ replaced by $\mathbf{Y}_{Q}\in\text{Sparse}(T_{\text{QI}})$ .

Proof

Let $T=T_{\text{QI}}$ . Since $\text{Sparse}(T_{\text{QI}})$ is QI with respect to $\mathbf{G}$ , we have $R^{\star}_{T}\geq I_{p}+\Delta T$ by Theorem 4. Hence, for every $\mathbf{Q}\in\mathcal{RH}_{\infty}^{m\times p}$ such that $\mathbf{Y}_{Q}\mathbf{\Gamma}=\mathbf{Y}_{Q}\in\text{Sparse}(T)$ , the matrix $\mathbf{X}_{Q}=I_{p}+\mathbf{GY}_{Q}$ belongs to $\text{Sparse}(I_{p}+\Delta T)$ and the constraint $\mathbf{X}_{Q}\mathbf{\Gamma}=\mathbf{X}_{Q}\in\text{Sparse}(R^{\star}_{T})$ is redundant. It follows that the choice $T=T_{\text{QI}}$ achieves $J^{\star}=J_{\text{QI}}$ . Therefore, there exists a choice of $T$ such that the optimal solution of $\mathcal{P}_{T,R_{T}^{\star}}$ with $\mathbf{\Gamma}=I_{p}$ performs at least as well as that of the problem obtained by considering a nearest QI subset as suggested in [21]. This completes our proof.

Corollary 4 proves that the class of convex restrictions considered in [21] is a special case in the framework of SI, obtained by choosing $T=T_{\text{QI}}$ and computing $R^{\star}_{T_{\text{QI}}}$ with our Algorithm 1. Furthermore, it is possible to choose $T\leq S$ to obtain convex restrictions that perform strictly better, as we will show numerically in Section 5.

4.4 Strictly Beyond QI

So far, we have shown that the SI approach naturally recovers the previous QI results of [7] and [21] as specific cases by using Algorithm 1. Here and in Section 5, we show through examples the stronger results that

1. SI can recover globally optimal solutions when QI does not hold,

2. strictly better performance than the approach of [21] can be obtained.

For point 2), we refer to the numerical results in Section 5. For point 1), we consider an example taken from [14].

Example 3

Consider the optimal control problem:

[TABLE]

where $z\in e^{j\mathbb{R}}$ , $A\in\mathbb{R}^{n\times n}$ , $A^{\text{bin}}=\text{Struct}(A)$ and $w(t)$ denotes i.i.d. disturbances distributed according to a normal distribution $\mathcal{N}(0_{n\times 1},I_{n})$ . The discrete-time transfer function of this system is $\mathbf{G}(z)=(zI_{n}-A)^{-1}$ . This problem without the sparsity constraint on $\mathbf{K}$ is known as the LQR problem. By adding the sparsity constraint, it is an instance of $\mathcal{P}_{K}$ in discrete-time. Notice that QI does not hold whenever the graph defined by $A$ is strongly connected because $\Delta=\text{Struct}(\mathbf{G}(z))=\text{Struct}\left((zI_{n}-A)^{-1}\right)$ is equal to $1_{n\times n}$ in general, and so $A^{\text{bin}}\Delta A^{\text{bin}}\not\leq A^{\text{bin}}$ , thus violating QI.

The reason to consider a discrete-time instance of $\mathcal{P}_{K}$ is that one can solve analytically the corresponding problem where sparsity constraints are removed by computing a simple Riccati equation [29]. It so happens that the optimal solution for this problem is $\mathbf{K}(z)=-A$ , which is also feasible and hence globally optimal for $\mathcal{P}_{K}$ . Now, consider problem $\mathcal{P}_{T,R}$ with $\mathbf{\Gamma}(z)=\mathbf{G}(z)$ , $T=A^{\text{bin}}$ and $R=R^{\star}_{A^{\text{bin}}}$ . We can verify that a feasible solution for $\mathcal{P}_{T,R}$ is $\mathbf{Y}_{Q}(z)=-\frac{A}{z}(zI_{n}-A)$ , because

[TABLE]

This implies $\mathbf{X}_{Q}(z)=I_{n}-\frac{A}{z}$ by (8). Hence, $\mathbf{X}_{Q}(z)\mathbf{\Gamma}(z)=\mathbf{X}_{Q}(z)(zI_{n}-A)^{-1}=\frac{I_{n}}{z}$ . Since $R^{\star}_{A^{\text{bin}}}\geq I_{n}$ by design (see Algorithm 1), we have $\mathbf{X}_{Q}(z)\mathbf{\Gamma}(z)\in\text{Sparse}(R^{\star}_{A^{\text{bin}}})$ as desired. It is immediate to verify that the resulting controller is $\mathbf{K}(z)=\mathbf{Y}_{Q}(z)\mathbf{X}_{Q}(z)^{-1}=-A$ . We conclude that, despite a lack of QI, a convex restriction which contains the global optimum of $\mathcal{P}_{K}$ is found by using the proposed SI approach.
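The algebra of this example can be verified numerically at a single frequency. The Python sketch below evaluates $\mathbf{Y}_{Q}$, $\mathbf{X}_{Q}=I_{n}+\mathbf{G}\mathbf{Y}_{Q}$ and $\mathbf{K}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}$ at one point on the unit circle. The matrix `A` and the frequency are hypothetical values chosen for illustration; they are not taken from the paper or from [14].

```python
import numpy as np

# Hypothetical state matrix and evaluation point z = e^{j 0.7}.
A = np.array([[0.5, 0.2, 0.0],
              [0.0, 0.3, 0.1],
              [0.3, 0.0, 0.6]])
n = A.shape[0]
I = np.eye(n)
z = np.exp(1j * 0.7)

G = np.linalg.inv(z * I - A)      # plant G(z), also used as Gamma(z)
Y = -(A / z) @ (z * I - A)        # candidate Y_Q(z)
X = I + G @ Y                     # X_Q = I + G Y_Q, as in (8)
K = Y @ np.linalg.inv(X)          # recovered controller

print("Y Gamma = -A/z:", np.allclose(Y @ G, -A / z))
print("X = I - A/z:   ", np.allclose(X, I - A / z))
print("X Gamma = I/z: ", np.allclose(X @ G, I / z))
print("K = -A:        ", np.allclose(K, -A))
```

Since $\mathbf{Y}_{Q}\mathbf{\Gamma}=-A/z$ has the pattern of $A$ and $\mathbf{X}_{Q}\mathbf{\Gamma}=I_{n}/z$ is diagonal, both sparsity constraints of $\mathcal{P}_{T,R}$ hold at the sampled frequency.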

Remark 5

The global optimality result for this example was also obtained using the SLP in [14]. The sparsities for the system level parameters in [14] were chosen empirically, while we provide an explicit methodology based on the SI condition (10) and Algorithm 1. Furthermore, we wish to clarify that obtaining global optimality certificates for $\mathcal{P}_{K}$ for systems with non-QI constraints is still an open problem, which is addressed neither by the system level approach [14] nor by our SI approach. Both our approach and that of [14] can certify optimality of the solution because the optimal solution of this simple instance is already known analytically.

4.5 SI for static controller design

We conclude this section by highlighting another advantage of the SI notion over the QI notion: the SI notion can be used to compute sparse static control policies in a convex way, that is, policies in the form $u(t)=Ky(t)$ where $K$ is a real matrix in $\text{Sparse}(S)$ . This topic has been thoroughly studied in our earlier work [18], where we derived a notion of SI limited to the static controller case. Here, we highlight that, in contrast to the QI notion, SI is useful for both static and dynamic sparse controller design.

The main observation is that the Youla parametrization cannot achieve a convexification of the static controller design problem in general, because enforcing $\mathbf{K}=(\mathbf{V}_{r}-\mathbf{M}_{r}\mathbf{Q})(\mathbf{U}_{r}-\mathbf{N}_{r}\mathbf{Q})^{-1}$ to be a real matrix is a non-convex requirement on the transfer matrix $\mathbf{Q}$ . Consequently, a different parametrization should be used and the QI property, tightly linked to using a Youla-like parametrization, will not be relevant anymore. The most well-known techniques to convexify the $\mathcal{H}_{2}$ and $\mathcal{H}_{\infty}$ norm-optimal state-feedback static controller design problems are based on computing appropriate quadratic Lyapunov functions through Linear Matrix Inequalities (LMI); see [30, 31] for a comprehensive review. The more general case of static output-feedback is known to be NP-hard [5] and an exact convex formulation does not exist.

As we illustrated in [18], when the distributed static control problem is formulated through LMIs, the controller is recovered as $K=YX^{-1}$ , where $Y$ and $X$ are real decision variables, $X$ is symmetric positive definite and $V(x)=x^{\mathsf{T}}X^{-1}x$ is a quadratic Lyapunov function for the closed-loop system. If the controller must lie in a sparsity subspace $\text{Sparse}(S)$ , the only source of non-convexity stems from requiring that $YX^{-1}\in\text{Sparse}(S)$ . This expression for the static controller in terms of the decision variables matches that of $\mathbf{K}=\mathbf{Y}_{Q}\mathbf{X}_{Q}^{-1}$ , which is valid for dynamic controllers in terms of the Youla parameter. According to Theorem 1 and Corollary 1, convex restrictions can be obtained by choosing binary matrices $T$ and $R$ as per (10) that satisfy the SI condition (1), and requiring that $Y\Gamma\in\text{Sparse}(T)$ and $X\Gamma\in\text{Sparse}(R)$ for any invertible real matrix $\Gamma\in\mathbb{R}^{n\times n}$ . We refer the interested reader to [18] for details.

Based on the discussion above, SI is a framework-independent notion which deals with sparsity patterns. Specifically, the SI notion translates, separately, to generalizations of QI-based synthesis of sparse dynamic controllers and of block-diagonal quadratic Lyapunov functions for designing sparse static controllers.

5 Experiments

With the goal of providing insight into our proposed method and showing its potential benefits when combined with standard controller design techniques, we continue here our Example 1 and provide numerical results.

5.1 Finite-dimensional approximation

Since the convex programs we have cast are infinite-dimensional, due to the decision variables being transfer matrices whose order is not fixed, it is necessary to resort to finite-dimensional approximations. When using the Youla parametrization in continuous-time, one can adapt the semidefinite programming technique of [32] to the $\mathcal{H}_{2}$ norm by exploiting standard results from [33, 31]; when using the SLP or IOP parametrizations in discrete-time, one can use the corresponding finite impulse response (FIR) approximations of [14, 25]. The key common idea behind these approximations is to express each decision variable $\mathbf{U}$ , which is a general stable transfer matrix in continuous-time (resp. discrete-time), in the approximated form

[TABLE]

for some $N\in\mathbb{N}$ and $a\in\mathbb{R}$ with $a>0$ . The real matrices $U[i]$ for all $i$ become the finitely many real decision variables to optimize over. The approximation (16) is based on the well-known idea of Ritz approximations [34] and we refer the reader to [14, 25] for details on SLP and IOP.
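For the discrete-time FIR case, the practical consequence is that a sparsity constraint on a transfer matrix becomes an entrywise zero constraint on each real coefficient matrix. The Python sketch below illustrates this, assuming the usual FIR form $\mathbf{U}(z)=\sum_{i=0}^{N}U[i]z^{-i}$; the exact expression (16) is not reproduced here, so this form, the helper `fir_eval`, the pattern `S` and the random coefficients are all assumptions made for illustration.

```python
import numpy as np

def fir_eval(coeffs, z):
    """Evaluate U(z) = sum_i U[i] z^{-i} for a list of real coefficient matrices."""
    return sum(Ui * z ** (-i) for i, Ui in enumerate(coeffs))

rng = np.random.default_rng(0)

# Hypothetical sparsity pattern and FIR order.
S = np.array([[1, 0],
              [1, 1]])
N = 4

# Masking every coefficient with S enforces U(z) in Sparse(S) for all z.
coeffs = [rng.standard_normal(S.shape) * S for _ in range(N + 1)]

U = fir_eval(coeffs, np.exp(1j * 0.3))
print("entry outside the pattern:", U[0, 1])
```

In an optimization setting the masked entries would simply be omitted from the decision variables, so the constraint is linear in the coefficients.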

Example 1 (continued) We will address the distributed controller design problem formulated in Example 1 both in discrete- and continuous-time. We have observed in Example 2 that $\text{Sparse}(S)$ is not QI with respect to $\mathbf{G}$ . As we have summarized in Section 4.2, [21] suggests identifying a binary matrix $T_{\text{QI}}<S$ such that $\text{Sparse}(T_{\text{QI}})$ is QI with respect to $\mathbf{G}$ and $\|S\|_{0}-\|T_{\text{QI}}\|_{0}$ is minimized. In this case, we verify by inspection that $S_{2}$ in (15) is the only QI sparsity pattern $T_{\text{QI}}$ such that $\|S\|_{0}-\|T_{\text{QI}}\|_{0}\leq 2$ . As suggested in [21], we can thus substitute the constraint $\mathbf{Y}_{Q}(\mathbf{X}_{Q})^{-1}\in\text{Sparse}(S)$ with $\mathbf{Y}_{Q}\in\text{Sparse}(S_{2})$ and the corresponding convex program is a restriction of $\mathcal{P}_{K}$ . Our goal is to compare tightness of this convex restriction with that of $\mathcal{P}_{S,R^{\star}_{S}}$ obtained through SI.

5.2 Numerical Results

As outlined above, we solved finite-dimensional approximations of the convex restriction proposed in [21] and of our convex restriction $\mathcal{P}_{S,R^{\star}_{S}}$ with $\mathbf{\Gamma}=I_{p}$ obtained through SI. All the numerical programs were solved with MOSEK [35], called through MATLAB via YALMIP [36], on a standard laptop computer.

5.2.1 IOP in discrete-time

In our first experiment we considered the discrete-time version of $\mathbf{G}$ . Since the approach of [32] requires finding an initial stable and stabilizing controller in $\text{Sparse}(S)$ heuristically, which is no trivial task in general, we used the IOP parametrization [25] and the discrete-time finite-dimensional approximation (16) for all decision variables. Using the notation of [26], where $\mathbf{K}=\mathbf{UY}^{-1}$ and $\mathbf{U}$ , $\mathbf{Y}$ are input-output parameters, the closest QI subset approach of [21] requires $\mathbf{U}\in\text{Sparse}(S_{2})$ , while our SI approach translates to $\mathbf{U}\in\text{Sparse}(S)$ and $\mathbf{Y}\in\text{Sparse}(R_{S}^{\star})$ . Within this setting, no feasible solution could be obtained using the closest QI subset approach; instead, upon convergence over $N$ , we obtained a cost of $6.7278$ using the proposed SI approach. To evaluate the suboptimality, we additionally solved for the nearest QI superset of $S$ defined as the binary matrix $S_{3}\geq S$ such that $S_{3}$ is QI and $\|S_{3}\|_{0}-\|S\|_{0}$ is minimized [21]; the corresponding optimal cost serves as a lower bound for that of $\mathcal{P}_{K}$ . The QI superset is unique and is computed with the algorithm (13)-(14) of [21]. It turns out that $S_{3}$ is the full lower-triangular matrix. By solving for $S_{3}$ we obtained the lower bound $6.7268$ upon convergence over $N$ , and hence the SI solution has near-optimal performance.

5.2.2 Youla in continuous-time

In our second experiment we considered the continuous-time version of $\mathbf{G}$ and used the finite-dimensional approximation technique of [32]. A doubly-coprime factorization of $\mathbf{G}$ was computed as per [7, Theorem 17] using the stable and stabilizing controller $\mathbf{K}_{\text{nom}}$ suggested in [7, Page 1995]. In (16), we chose $a>0$ and increased the value of $N$ until the improvement on the cost was negligible, thus approaching convergence to the optimal cost of the infinite-dimensional program. Upon convergence over $N$ , the closest QI subset method of [21] led to a cost of $7.3367$ while the SI method led to a cost of $7.3098$ . To evaluate this improvement in performance, we additionally solved for $S_{3}$ and obtained a lower bound of $7.2163$ . We conclude that our SI solution has a relative improvement over that of [21] based on QI subsets of at least $\frac{7.3367-7.3098}{7.3367-7.2163}=22.3\%$ .
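The relative improvement quoted above, and the corresponding gap for the discrete-time experiment of Section 5.2.1, follow directly from the reported costs. The short Python sketch below reproduces the arithmetic; the variable names are illustrative, and the numbers are those stated in the text.

```python
# Continuous-time costs reported in Section 5.2.2.
j_qi, j_si, j_lb = 7.3367, 7.3098, 7.2163

# Fraction of the gap between the QI-subset cost and the lower bound
# that is closed by the SI solution.
relative_improvement = (j_qi - j_si) / (j_qi - j_lb)
print(f"relative improvement: {100 * relative_improvement:.1f}%")

# Discrete-time costs reported in Section 5.2.1.
j_si_dt, j_lb_dt = 6.7278, 6.7268

# Relative gap between the SI cost and the lower bound.
relative_gap_dt = (j_si_dt - j_lb_dt) / j_lb_dt
print(f"discrete-time gap to lower bound: {100 * relative_gap_dt:.3f}%")
```

The improvement is "at least" this large because the lower bound obtained from the QI superset may itself be below the true optimal cost of $\mathcal{P}_{K}$.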

6 Conclusions

We have proposed the framework of Sparsity Invariance (SI) for convex design of optimal and near-optimal sparse controllers. One main insight is that the proposed SI approach offers a direct generalization of previous design methods based on the notion of Quadratic Invariance (QI). Indeed, SI can be directly applied to any system and sparsity constraint. The recovered solution is globally optimal when QI holds and performs at least as well as the nearest QI subset when QI does not hold. We have shown the potential benefits of SI over previous methods through examples, and remarked that SI is naturally applicable to sparse static controller design.

Since the condition (10) is necessary and sufficient for the SI property (1), our results approach the limits in performance of convex restrictions of the sparsity constrained control problem based on structural conditions for the Youla parameter. This opens up the question of whether different and better-performing design methodologies can be developed for this challenging problem. Another direction for research is to further refine the SI approach by developing tractable heuristics to optimally design the binary matrices $T$ and $R$ and the parameter $\mathbf{\Gamma}$ simultaneously, based on the knowledge of the system $\mathbf{P}$ . This could potentially improve upon Algorithm 1. Finally, it would be relevant to extend the SI idea to the case of delay constraints; in discrete-time, this might be possible by refining the results of [37].

Appendix A

A.1 Proof of Theorem 1

The proof relies on two Lemmas, whose proofs are reported in Appendix A.2 and Appendix A.3.

Lemma A1

Let $R\in\{0,1\}^{p\times p}$ with $R\geq I_{p}$ . Then,

1. For any invertible transfer matrix $\mathbf{X}$ in $\text{Sparse}\left(R\right)$ ,

$$\mathbf{X}^{-1}\in\text{Sparse}\left(R^{p-1}\right).$$

2. There exists an invertible transfer matrix $\mathbf{X}\in\text{Sparse}(R)$ such that

$$\text{Struct}\left(\mathbf{X}^{-1}\right)=R^{p-1}.$$

Lemma A2

Let $T\in\{0,1\}^{m\times p}$ and $R\in\{0,1\}^{p\times p}$ , and let $\mathbf{W}$ be a transfer matrix with $\text{Struct}(\mathbf{W})=R$ . Then, there exists $\mathbf{Z}\in\text{Sparse}(T)$ such that

$$\text{Struct}(\mathbf{Z}\mathbf{W})=TR.$$

We are now ready to prove Theorem 1.

$1)\Rightarrow 2)$ : Let $\mathbf{X}\in\text{Sparse}(R)$ be invertible. By Lemma A1 we know that $\mathbf{X}^{-1}\in\text{Sparse}(R^{p-1})$ . Now let $\mathbf{Y}\in\text{Sparse}(T)$ . Since $TR^{p-1}\leq S$ , we have $\mathbf{YX}^{-1}\in\text{Sparse}(S)$ .

$2)\Rightarrow 1)$ : We prove the contrapositive. First, suppose that $TR^{p-1}\not\leq S$ . By the second statement of Lemma A1 it is possible to select $\mathbf{X}\in\text{Sparse}(R)$ such that $\text{Struct}(\mathbf{X}^{-1})=R^{p-1}$ . By the latter and Lemma A2, we can select $\mathbf{Y}\in\text{Sparse}(T)$ such that $\text{Struct}\left(\mathbf{YX}^{-1}\right)=TR^{p-1}$ , and therefore $\mathbf{YX}^{-1}\not\in\text{Sparse}(S)$ . Next, suppose that $T\not\leq S$ . Since $R\geq I_{p}$ by hypothesis, then $TR\not\leq S$ and $TR^{p-1}\not\leq S$ . Hence, the same reasoning applies.
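The sufficiency direction $1)\Rightarrow 2)$ can be illustrated numerically on constant real matrices, which are a special case of transfer matrices. In the Python sketch below, the patterns `R` and `T`, the random entries and the tolerance are hypothetical choices; `S` is taken as the smallest pattern allowed by $TR^{p-1}\leq S$.

```python
import numpy as np

def pattern(M, tol=1e-9):
    """0/1 pattern of the entries of M that are nonzero up to a tolerance."""
    return (np.abs(M) > tol).astype(int)

def binary_power(R, k):
    """0/1 sparsity pattern of R**k for a binary matrix R."""
    P = np.eye(R.shape[0], dtype=int)
    for _ in range(k):
        P = ((P @ R) > 0).astype(int)
    return P

rng = np.random.default_rng(1)

# Hypothetical patterns with m = 2 and p = 3.
R = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1]])
T = np.array([[1, 0, 0],
              [0, 0, 1]])
p = R.shape[0]
S = pattern(T @ binary_power(R, p - 1))   # smallest S with T R^{p-1} <= S

# Random real matrices with the prescribed patterns; X is lower triangular
# with nonzero diagonal, hence invertible.
X = R * rng.uniform(1.0, 2.0, size=R.shape)
Y = T * rng.standard_normal(T.shape)
K = Y @ np.linalg.inv(X)

print("pattern of K:\n", pattern(K))
print("K lies in Sparse(S):", bool(np.all(pattern(K) <= S)))
```

The tolerance in `pattern` absorbs floating-point round-off in the computed inverse; it is a numerical convenience and has no counterpart in the structural argument.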

A.2 Proof of Lemma A1

Suppose $\mathbf{X}\in\text{Sparse}(R)$ is invertible. By the Cayley-Hamilton theorem, $\sum_{i=0}^{p}\lambda_{i}\mathbf{X}^{i}=0$ where $\{\lambda_{i}\}_{i=0}^{p}$ , $\lambda_{i}\in\mathcal{R}_{c}$ for every $i=1,\ldots,p$ are the coefficients of the characteristic polynomial of $\mathbf{X}$ and $\lambda_{0}=\det{\mathbf{X}}\neq 0$ . We remark that Cayley-Hamilton is valid for square matrices defined over a commutative ring, such as that of causal transfer functions [38]. By pre-multiplying by $\mathbf{X}^{-1}$ and rearranging the terms:

$$\mathbf{X}^{-1}=-\frac{1}{\lambda_{0}}\sum_{i=1}^{p}\lambda_{i}\mathbf{X}^{i-1}. \tag{17}$$

Since $R\geq I_{p}$ we have that $R^{a}\geq R^{b}$ for all integers $a\geq b\geq 0$. Hence, $\lambda_{i}\mathbf{X}^{i-1}\in\text{Sparse}\left(R^{p-1}\right)$ for every $i=1,\ldots,p$ and the first statement follows by (17).
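The inverse formula obtained from Cayley-Hamilton can be verified on a numeric example. The following sketch assumes a static real matrix rather than a transfer matrix, and relies on `numpy.poly`, which returns the coefficients of $\det(sI-\mathbf{X})$ ordered from the highest degree down; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
# Generic, well-conditioned invertible matrix (an arbitrary test case).
X = rng.standard_normal((p, p)) + p * np.eye(p)

# np.poly gives [c_p, ..., c_1, c_0] for det(s*I - X); reverse it so that
# lam[i] multiplies X**i, with lam[p] == 1 and lam[0] == (-1)**p * det(X).
lam = np.poly(X)[::-1]

# X^{-1} = -(1 / lam_0) * sum_{i=1}^{p} lam_i * X^{i-1}
X_inv = -sum(lam[i] * np.linalg.matrix_power(X, i - 1)
             for i in range(1, p + 1)) / lam[0]

print(np.allclose(X_inv, np.linalg.inv(X)))
```

Because every term in the sum is a power $\mathbf{X}^{i-1}$ with $i-1\leq p-1$, the sparsity pattern of the result can never exceed that of $R^{p-1}$, which is exactly the first statement of the lemma.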

For the second statement, we iteratively construct $\mathbf{X}$ starting from $\mathbf{X}=I_{p}$. Let $\mathbf{\alpha}\in\mathcal{R}_{c}$ and define $\tilde{\mathbf{X}}=\mathbf{X}+\mathbf{\alpha}e_{i}e_{j}^{\mathsf{T}}$. Let $\mathbf{X}^{-1}_{:,i}\in\mathcal{R}_{c}^{p\times 1}$ and $\mathbf{X}^{-1}_{j,:}\in\mathcal{R}_{c}^{1\times p}$ be the $i$-th column and the $j$-th row of $\mathbf{X}^{-1}$, respectively, and let $\mathbf{X}^{-1}_{ij}$ be the $(i,j)$ entry of $\mathbf{X}^{-1}$. Using the Sherman-Morrison identity [39], if $\tilde{\mathbf{X}}$ is invertible we obtain

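$$\tilde{\mathbf{X}}^{-1}=\mathbf{X}^{-1}-\frac{\mathbf{\alpha}\,\mathbf{X}^{-1}_{:,i}\mathbf{X}^{-1}_{j,:}}{1+\mathbf{\alpha}\mathbf{X}^{-1}_{ji}}. \tag{18}$$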
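As a quick numerical illustration of the Sherman-Morrison rank-one update used here, the sketch below compares the updated inverse with a direct inversion. It assumes static real matrices and an arbitrary scalar `alpha` standing in for $\mathbf{\alpha}\in\mathcal{R}_{c}$; the indices are zero-based and chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(2)
p, i, j = 4, 0, 2          # zero-based indices, arbitrary choice
alpha = 0.7                # arbitrary scalar standing in for alpha
X = rng.standard_normal((p, p)) + p * np.eye(p)
X_inv = np.linalg.inv(X)

e = np.eye(p)
X_tilde = X + alpha * np.outer(e[i], e[j])       # X + alpha * e_i e_j^T

# Sherman-Morrison: subtract (i-th column of X^{-1}) times (j-th row of
# X^{-1}), scaled by alpha / (1 + alpha * X^{-1}_{ji}).
denom = 1.0 + alpha * X_inv[j, i]
X_tilde_inv = X_inv - alpha * np.outer(X_inv[:, i], X_inv[j, :]) / denom

print(np.allclose(X_tilde_inv, np.linalg.inv(X_tilde)))
```

The update only requires the $i$-th column and the $j$-th row of the current inverse, which is why the effect of adding a single entry to $\mathbf{X}$ on the pattern of the inverse can be tracked explicitly.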

Recall that each entry of a transfer matrix is a transfer function defined over $s=j\omega$ . Hence, by the definition of an invertible transfer matrix (see Section 2), (18) holds for almost every $\omega\in\mathbb{R}$ . From (18), for any $i$ and $\mathbf{\alpha}\in\mathcal{R}_{c}$ , if $\mathbf{X}^{-1}_{ii}\neq 0$ , then $\tilde{\mathbf{X}}^{-1}_{ii}\neq 0$ . It follows that by choosing $\mathbf{\alpha}$ such that

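$$\mathbf{\alpha}\left(\mathbf{X}^{-1}_{ii}\mathbf{X}^{-1}_{jk}-\mathbf{X}^{-1}_{ji}\mathbf{X}^{-1}_{ik}\right)\neq\mathbf{X}^{-1}_{ik}\quad\text{for every }k\text{ such that }\mathbf{X}^{-1}_{ik}\text{ and }\mathbf{X}^{-1}_{jk}\text{ are not both null}, \tag{19}$$

we obtain that

$$\tilde{\mathbf{X}}^{-1}_{ik}\neq 0\quad\text{for every }k\text{ such that }\mathbf{X}^{-1}_{ik}\neq 0\text{ or }\mathbf{X}^{-1}_{jk}\neq 0. \tag{20}$$

The condition (19) is derived by requiring the $(i,k)$ entry of the right-hand side of (18) to be different from $0$ for every $k$ such that $\mathbf{X}^{-1}_{ik}$ and $\mathbf{X}^{-1}_{jk}$ are not both null for every $\omega\in\mathbb{R}$. Observe that an $\mathbf{\alpha}$ satisfying (19) always exists: for each $k$ under consideration, $\mathbf{X}^{-1}_{ik}$ and $\mathbf{X}^{-1}_{jk}$ are not both null and $\mathbf{X}^{-1}_{ii}\neq 0$, so the inequality $\mathbf{\alpha}\left(\mathbf{X}^{-1}_{ii}\mathbf{X}^{-1}_{jk}-\mathbf{X}^{-1}_{ji}\mathbf{X}^{-1}_{ik}\right)\neq\mathbf{X}^{-1}_{ik}$ excludes at most one value of $\mathbf{\alpha}$ and hence always admits a solution $\mathbf{\alpha}\in\mathcal{R}_{c}$. The structural augmentation (20) is exploited by the iterative algorithm that constructs $\mathbf{X}$.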

The algorithm returns a matrix $\mathbf{X}$ such that $\text{Struct}(\mathbf{X}^{-1})=R^{p-1}$. Specifically, by exploiting (20) we obtain that $\text{Struct}(\mathbf{X}^{-1})\geq R^{s}$ at the end of the $s$-th iteration of the “repeat-until” cycle.
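The algorithm listing itself is not reproduced in this text, so the following is only a rough sketch of an iterative construction in the same spirit, not the authors' algorithm. It assumes static real matrices instead of transfer matrices, draws $\mathbf{\alpha}$ at random and resamples it when a cancellation check on row $i$ fails, and uses numerical thresholds in place of exact zero tests. The names `struct`, `alpha_ok`, `build_X` and all thresholds are hypothetical.

```python
import numpy as np

TOL = 1e-9

def struct(M):
    """Binary sparsity pattern of a numeric matrix."""
    return (np.abs(M) > TOL).astype(int)

def alpha_ok(Xi, i, j, a):
    """Check that alpha = a causes no cancellation in row i of the new inverse."""
    if abs(1.0 + a * Xi[j, i]) < 0.1:          # keep the update well conditioned
        return False
    for k in range(Xi.shape[0]):
        if abs(Xi[i, k]) > TOL or abs(Xi[j, k]) > TOL:
            lhs = a * (Xi[i, i] * Xi[j, k] - Xi[j, i] * Xi[i, k])
            if abs(lhs - Xi[i, k]) < 1e-6:     # entry (i, k) would (nearly) vanish
                return False
    return True

def build_X(R, rng):
    """Build X in Sparse(R) by rank-one updates of the identity."""
    p = R.shape[0]
    target = struct(np.linalg.matrix_power(R, p - 1))
    X, Xi = np.eye(p), np.eye(p)               # X and its inverse
    for _ in range(p - 1):                     # at most p-1 sweeps
        # Visit every off-diagonal position allowed by R.
        for i, j in zip(*np.nonzero(R - np.eye(p, dtype=int))):
            a = rng.uniform(0.5, 1.5)
            while not alpha_ok(Xi, i, j, a):
                a = rng.uniform(0.5, 1.5)
            # Sherman-Morrison update for the inverse of X + a * e_i e_j^T
            Xi = Xi - a * np.outer(Xi[:, i], Xi[j, :]) / (1.0 + a * Xi[j, i])
            X[i, j] += a
        if np.array_equal(struct(Xi), target):
            break
    return X, Xi

# Hypothetical pattern with R >= I (lower bidiagonal).
R = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]])
X, Xi = build_X(R, np.random.default_rng(3))
print(struct(Xi))
```

For this bidiagonal pattern, each update copies the pattern of row $j$ of the inverse into row $i$, so the pattern of the inverse fills in the lower triangle, which coincides with the pattern of $R^{p-1}$.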

A.3 Proof of Lemma A2

Let $\mathbf{Z}$ be any transfer matrix in $\text{Sparse}(T)$ and assume that $\text{Struct}(\mathbf{ZW})<TR$. Then, for some $(i,j,k)$ we have that $(\mathbf{ZW})_{ij}=0$ and $T_{ik}=R_{kj}=1$, and by hypothesis $\mathbf{W}_{kj}\neq 0$. Since $\sum_{l=1}^{p}\mathbf{Z}_{il}\mathbf{W}_{lj}=0$, replacing $\mathbf{Z}_{ik}$ with $\mathbf{Z}_{ik}+\mathbf{\alpha}$ for any $\mathbf{\alpha}\neq 0$ in $\mathcal{R}_{c}$ yields $(\mathbf{ZW})_{ij}=\mathbf{\alpha}\mathbf{W}_{kj}\neq 0$. Furthermore, by choosing $\mathbf{\alpha}\neq-\frac{(\mathbf{ZW})_{it}}{\mathbf{W}_{kt}}$ for all $t$ such that $(\mathbf{ZW})_{it}\neq 0$ and $\mathbf{W}_{kt}\neq 0$, we ensure that adding $\mathbf{\alpha}$ to $\mathbf{Z}_{ik}$ does not bring any nonzero entry $(\mathbf{ZW})_{it}$ to $0$. The update preserves membership in $\text{Sparse}(T)$ because $T_{ik}=1$. Hence, it is always possible to choose $k$ and $\mathbf{\alpha}$ such that $\text{Struct}\left((\mathbf{Z}+\mathbf{\alpha}e_{i}e_{k}^{\mathsf{T}})\mathbf{W}\right)>\text{Struct}(\mathbf{ZW})$ with $\mathbf{Z}+\mathbf{\alpha}e_{i}e_{k}^{\mathsf{T}}\in\text{Sparse}(T)$. By iterating the procedure over all $(i,j)$ such that $\text{Struct}(\mathbf{ZW})_{ij}<(TR)_{ij}$, we obtain $\text{Struct}(\mathbf{ZW})=TR$ after finitely many steps.
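The entry-by-entry procedure of this proof can be mimicked numerically. The sketch below is an illustration under simplifying assumptions: static real matrices, a tolerance in place of exact zero tests, a random $\mathbf{\alpha}$ that is resampled if it would cancel an existing nonzero entry, and hypothetical patterns `T` and `R`. The function name `build_Z` is made up for this example.

```python
import numpy as np

TOL = 1e-9

def struct(M):
    """Binary sparsity pattern of a numeric matrix."""
    return (np.abs(M) > TOL).astype(int)

def build_Z(T, W, rng):
    """Return Z in Sparse(T) such that Struct(Z @ W) equals the pattern of T R."""
    R = struct(W)
    target = struct(T @ R)
    Z = np.zeros(T.shape)
    while True:
        # Entries that TR allows but that are still zero in ZW.
        missing = np.argwhere((struct(Z @ W) == 0) & (target == 1))
        if len(missing) == 0:
            return Z
        i, j = missing[0]
        k = np.nonzero(T[i] * R[:, j])[0][0]   # some k with T_ik = R_kj = 1
        row = Z[i] @ W                         # current row i of ZW
        a = rng.uniform(0.5, 1.5)
        # Resample alpha if it would cancel an already nonzero entry of row i.
        while np.any((np.abs(row) > TOL) & (np.abs(row + a * W[k]) < 1e-6)):
            a = rng.uniform(0.5, 1.5)
        Z[i, k] += a                           # T_ik = 1, so Z stays in Sparse(T)

rng = np.random.default_rng(4)
R = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]])
T = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 1]])
W = R * rng.uniform(1.0, 2.0, size=R.shape)    # Struct(W) = R
Z = build_Z(T, W, rng)
print(struct(Z @ W))
```

Each pass of the loop turns at least one missing entry into a nonzero one without zeroing any existing entry of the same row, so the loop terminates after at most as many passes as there are ones in $TR$.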

Bibliography


[1] F. Dörfler, M. R. Jovanović, M. Chertkov, and F. Bullo, “Sparsity-promoting optimal wide-area control of power networks,” IEEE Transactions on Power Systems, vol. 29, no. 5, pp. 2281–2291, 2014.
[2] T. P. Prescott and A. Papachristodoulou, “Layered decomposition for the model order reduction of timescale separated biochemical reaction networks,” Journal of Theoretical Biology, vol. 356, pp. 113–122, 2014.
[3] Y. Zheng, S. E. Li, K. Li, F. Borrelli, and J. K. Hedrick, “Distributed model predictive control for heterogeneous vehicle platoons under unidirectional topologies,” IEEE Transactions on Control Systems Technology, vol. 25, no. 3, pp. 899–910, 2017.
[4] H. S. Witsenhausen, “A counterexample in stochastic optimum control,” SIAM Journal on Control, vol. 6, no. 1, pp. 131–147, 1968.
[5] V. D. Blondel and J. N. Tsitsiklis, “A survey of computational complexity results in systems and control,” Automatica, vol. 36, no. 9, pp. 1249–1274, 2000.
[6] C. H. Papadimitriou and J. Tsitsiklis, “Intractable problems in control theory,” SIAM Journal on Control and Optimization, vol. 24, no. 4, pp. 639–654, 1986.
[7] M. Rotkowitz and S. Lall, “A characterization of convex problems in decentralized control,” IEEE Transactions on Automatic Control, vol. 51, no. 2, pp. 274–286, 2006.
[8] L. Lessard and S. Lall, “Quadratic invariance is necessary and sufficient for convexity,” in American Control Conference (ACC), 2011. IEEE, 2011, pp. 5360–5362.
