Block Factor-Width-Two Matrices in Semidefinite Programming

Aivar Sootla, Yang Zheng, and Antonis Papachristodoulou

arXiv:1903.04938·math.OC·March 13, 2019

TL;DR

This paper introduces block factor-width-two matrices, a new subset of positive semidefinite matrices, enabling decomposition of large semidefinite constraints for more efficient large-scale semidefinite programming.

Contribution

It defines block factor-width-two matrices, derives their dual cones, and develops hierarchies of approximations to improve large-scale semidefinite programming.

Findings

01

Enables decomposition of large semidefinite constraints into smaller ones

02

Provides closed-form dual cone expressions

03

Demonstrates effectiveness in sum-of-squares optimization

Abstract

In this paper, we introduce a set of block factor-width-two matrices, which is a generalisation of factor-width-two matrices and is a subset of positive semidefinite matrices. The set of block factor-width-two matrices is a proper cone and we compute a closed-form expression for its dual cone. We use these cones to build hierarchies of inner and outer approximations of the cone of positive semidefinite matrices. The main feature of these cones is that they enable a decomposition of a large semidefinite constraint into a number of smaller semidefinite constraints. As the main application of these classes of matrices, we envision large-scale semidefinite feasibility optimisation programs including sum-of-squares (SOS) programs. We present numerical examples from SOS optimisation showcasing the properties of this decomposition.

Tables

Table I. Computational results for an SDP partition in Section IV-A

Computational time (seconds); the columns $4$ to $50$ give the number of partitions in the SDP variables
$n$	Full SDP	$4$	$10$	$20$	$50$
$10$	$2.38$	$1.43$	$1.29$	$1.28$	$1.49$
$15$	$27.3$	$23.3$	$15.6$	$10.1$	$5.36$
$20$	$489$	$252$	$98.1$	$66.8$	$28.1$
$25$	$\infty$	$1.97 \cdot 10^{3}$	$7.83 \cdot 10^{2}$	$5.71 \cdot 10^{2}$	$1.32 \cdot 10^{2}$
$30$	$\infty$	$\infty$	$5.68 \cdot 10^{3}$	$3.71 \cdot 10^{3}$	$8.4 \cdot 10^{2}$
Objective values
$10$	$- 0.9$	$- 0.45$	$134$	$483$	$2.12 \cdot 10^{3}$
$15$	$- 0.92$	$- 0.75$	$80.1$	$459$	$2.24 \cdot 10^{3}$
$20$	$- 0.87$	$- 0.87$	$- 0.11$	$251$	$1.91 \cdot 10^{3}$
$25$	$\infty$	$- 1.07$	$- 0.21$	$231$	$1.36 \cdot 10^{3}$
$30$	$\infty$	$\infty$	$- 0.37$	$177$	$1.77 \cdot 10^{3}$
Sizes of SDP Constraints
$10$	$66$	$32 - 34$	$12 - 14$	$6 - 8$	$2 - 4$
$15$	$136$	$68$	$26 - 28$	$12 - 14$	$4 - 6$
$20$	$231$	$114 - 116$	$46 - 48$	$22 - 24$	$8 - 10$
$25$	$351$	$174 - 176$	$70 - 72$	$34 - 36$	$14 - 16$
$30$	$496$	$248$	$98 - 100$	$48 - 50$	$18 - 20$

Table II. Computational results for Section IV-B

Computational time
$n$	$20$	$25$	$30$	$35$	$40$	$45$	$50$
SOS	$5.28$	$14.4$	$35.9$	$87.2$	$175.0$	$316.0$	$487.8$
${\mathcal{FW}}_{\alpha,2}$	$7.90$	$10.8$	$16.6$	$25.3$	$36.0$	$57.4$	$71.4$
${\mathcal{FW}}_{2}$	$1.04$	$1.1$	$1.3$	$1.6$	$2.1$	$2.6$	$3.3$
Objective value
SOS	$149.0$	$266.5$	$316.2$	$460.8$	$562.0$	$746.9$	$919.8$
${\mathcal{FW}}_{\alpha,2}$	$149.0$	$266.5$	$316.2$	$460.8$	$562.0$	$746.9$	$919.8$
${\mathcal{FW}}_{2}$	$154.4$	$270.3$	$324.8$	$477.7$	$570.9$	$762.2$	$961.7$

Equations

$A=\begin{bmatrix}A_{11}&A_{12}&\dots&A_{1p}\\ A_{21}&A_{22}&\dots&A_{2p}\\ \vdots&\vdots&\ddots&\vdots\\ A_{p1}&A_{p2}&\dots&A_{pp}\end{bmatrix},$

$E_{ij}=\begin{bmatrix}E_{i}^{T}&E_{j}^{T}\end{bmatrix}^{T}\in{\mathbb{R}}^{(k_{i}+k_{j})\times N},\quad i\neq j,\qquad(1)$

$I=\begin{bmatrix}I_{k_{1}}&&&\\ &I_{k_{2}}&&\\ &&\ddots&\\ &&&I_{k_{p}}\end{bmatrix}=\begin{bmatrix}E_{1}\\ E_{2}\\ \vdots\\ E_{p}\end{bmatrix},\qquad E_{i}=\begin{bmatrix}0&\dots&I_{k_{i}}&\dots&0\end{bmatrix}\in{\mathbb{R}}^{k_{i}\times N}.$

$\min_{X}\ \langle C,X\rangle\quad\text{subject to}\quad\langle A_{i},X\rangle=b_{i},\ i=1,\dots,m,\quad X\in{\mathbb{S}}_{+}^{N},\qquad(2)$

$\min_{y}\ b^{T}y\quad\text{subject to}\quad f_{0}(x)+\sum_{i=1}^{m}y_{i}f_{i}(x)\geq 0,\ \forall x\in{\mathbb{R}}^{n}.\qquad(3)$

$p(x)=v_{d}(x)^{T}Qv_{d}(x),$

$\min_{y}\ b^{T}y\quad\text{subject to}\quad v_{d}(x)^{T}Qv_{d}(x)=f_{0}(x)+\sum_{i=1}^{m}y_{i}f_{i}(x),\ \forall x\in{\mathbb{R}}^{n},\quad Q\in{\mathbb{S}}_{+}^{N},$

$X=\sum_{i=1}^{s}e_{i}^{T}X_{i}e_{i},\quad\text{with }X_{i}\in{\mathbb{S}}_{+}^{k},\ e_{i}\in{\mathcal{T}}_{k},$

$({\mathcal{FW}}_{k}^{N})^{\ast}=\{Z\in{\mathbb{S}}^{N}\mid e_{i}Ze_{i}^{T}\in{\mathbb{S}}_{+}^{k},\ \forall e_{i}\in{\mathcal{T}}_{k}\}.$

${\mathcal{FW}}_{1}^{N}\subset{\mathcal{FW}}_{2}^{N}\subset\dots\subset{\mathcal{FW}}_{N}^{N}={\mathbb{S}}_{+}^{N}=({\mathcal{FW}}_{N}^{N})^{\ast}\subset\dots\subset({\mathcal{FW}}_{2}^{N})^{\ast}\subset({\mathcal{FW}}_{1}^{N})^{\ast}.$

$X=\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}E_{ij}^{T}X_{ij}E_{ij},\quad\text{with }X_{ij}\in{\mathbb{S}}_{+}^{2},$

$X=\sum_{i=1}^{p-1}\sum_{j=i+1}^{p}E_{ij}^{T}X_{ij}E_{ij},$

${\mathcal{FW}}_{2}^{N}={\mathcal{FW}}_{\mathbf{1},2}^{N}\subset{\mathcal{FW}}_{\beta,2}^{N}\subset{\mathcal{FW}}_{\alpha,2}^{N}\subset{\mathcal{FW}}_{\mathbf{1},\max_{i\neq j}\{k_{i}+k_{j}\}}^{N}\subset{\mathcal{FW}}_{\{K_{1},K_{2}\},2}^{N}={\mathbb{S}}_{+}^{N},$

$X=\sum_{i=1}^{p}\sum_{j=i+1}^{p+1}E_{\beta ij}^{T}X_{ij}E_{\beta ij}=\sum_{i=1}^{p-1}\sum_{j=i+1}^{p-1}E_{\beta ij}^{T}X_{ij}E_{\beta ij}+\sum_{i=1}^{p-1}E_{\beta ip}^{T}X_{ip}E_{\beta ip}+\sum_{i=1}^{p}E_{\beta i(p+1)}^{T}X_{i(p+1)}E_{\beta i(p+1)}.$

$X=\sum_{i=1}^{p-1}\sum_{j=i+1}^{p}E_{\alpha ij}^{T}\widetilde{X}_{ij}E_{\alpha ij}.\qquad(6)$

$\sum_{i=1}^{p-1}E_{\alpha ip}^{T}\widetilde{X}_{ip}E_{\alpha ip}=\sum_{i=1}^{p-1}E_{\beta ip}^{T}X_{ip}E_{\beta ip}+\sum_{i=1}^{p}E_{\beta i(p+1)}^{T}X_{i(p+1)}E_{\beta i(p+1)}.\qquad(7)$

$X_{ij}=\begin{bmatrix}X_{ij}^{11}&X_{ij}^{12}\\ (X_{ij}^{12})^{T}&X_{ij}^{22}\end{bmatrix},$

$\widetilde{X}_{ip}=\begin{bmatrix}X_{ip}^{11}&X_{ip}^{12}&0\\ (X_{ip}^{12})^{T}&X_{ip}^{22}&0\\ 0&0&0\end{bmatrix}+\begin{bmatrix}X_{i(p+1)}^{11}&0&X_{i(p+1)}^{12}\\ 0&0&0\\ (X_{i(p+1)}^{12})^{T}&0&X_{i(p+1)}^{22}\end{bmatrix}+\frac{1}{p-1}\begin{bmatrix}0&0&0\\ 0&X_{p(p+1)}^{11}&X_{p(p+1)}^{12}\\ 0&(X_{p(p+1)}^{12})^{T}&X_{p(p+1)}^{22}\end{bmatrix}.$

$X=\begin{bmatrix}6&8&-2&-2\\ 8&16&1&1\\ -2&1&10&-1\\ -2&1&-1&24\end{bmatrix}.$

$X_{12}=\begin{bmatrix}4.5&8\\ 8&14.5\end{bmatrix},\ X_{13}=\begin{bmatrix}1&-2\\ -2&6\end{bmatrix},\ X_{23}=\begin{bmatrix}1&1\\ 1&2\end{bmatrix},\ X_{14}=\begin{bmatrix}0.5&-2\\ -2&12\end{bmatrix},\ X_{24}=\begin{bmatrix}0.5&1\\ 1&6\end{bmatrix},\ X_{34}=\begin{bmatrix}2&-1\\ -1&6\end{bmatrix}.$

$\widetilde{X}_{13}=\begin{bmatrix}1.5&-2&-2\\ -2&7&-0.5\\ -2&-0.5&15\end{bmatrix},\quad\widetilde{X}_{23}=\begin{bmatrix}1.5&1&1\\ 1&3&-0.5\\ 1&-0.5&9\end{bmatrix}.$

$({\mathcal{FW}}_{\alpha,2}^{N})^{\ast}=\{Z\in{\mathbb{S}}^{N}\mid E_{ij}ZE_{ij}^{T}\in{\mathbb{S}}_{+}^{k_{i}+k_{j}},\ \forall\,1\leq i<j\leq p\}.$

$({\mathcal{FW}}_{\mathbf{1},2}^{N})^{\ast}\supset({\mathcal{FW}}_{\beta,2}^{N})^{\ast}\supset({\mathcal{FW}}_{\alpha,2}^{N})^{\ast}\supset({\mathcal{FW}}_{\mathbf{1},\max_{i\neq j}\{k_{i}+k_{j}\}}^{N})^{\ast}\supset({\mathcal{FW}}_{\{K_{1},K_{2}\},2}^{N})^{\ast}={\mathbb{S}}_{+}^{N},$

$\min_{X}\ \langle C,X\rangle\quad\text{subject to}\quad\langle A_{i},X\rangle=b_{i},\ i=1,\dots,m,\quad X\in{\mathcal{FW}}_{\alpha,2}^{N}.$

$\min_{X_{lj}}\ \dots$

Full text

Block Factor-Width-Two Matrices in Semidefinite Programming

Aivar Sootla, Yang Zheng, and Antonis Papachristodoulou

The authors are with the Department of Engineering Science, University of Oxford, Parks Road, Oxford, OX1 3PJ, U.K. E-mail: {aivar.sootla, yang.zheng, antonis}@eng.ox.ac.uk. AS and AP are supported by EPSRC Grant EP/M002454/1. YZ is supported in part by the Clarendon Scholarship and in part by the Jason Hu Scholarship.

I Introduction

Optimisation programs with positive semidefinite (PSD) constraints (or semidefinite programs — SDPs) are one of the major computational tools in linear systems theory [1, 2]. The introduction of sum-of-squares polynomial optimisation (or SOS programming) [3, 4] (and the dual moment approach [5]) extended the use of SDPs to polynomial optimisation and thus allowed addressing many nonlinear control problems in polynomial time.

Modern SDPs (and especially SOS programs) are often large-scale, that is, the PSD constraints have large dimensions. Consequently, developing fast SDP solvers has received considerable attention in the literature. Solvers for sparse programs were developed in [6, 7, 8] (ADMM-based) and in [9, 10] (interior-point solver) and a general purpose ADMM-based solver was developed in [11]. The sparsity of the PSD constraint was also exploited in the context of SOS programming [12, 13, 14]. The key idea in these sparsity-exploiting approaches is to decompose large PSD constraints into a number of smaller PSD constraints, while the optimal objective of the program remains the same for a special class of sparsity patterns [15]. Since the PSD constraint typically induces the largest computational burden, the computational time can be significantly reduced by using these techniques. These sparsity exploiting techniques can also be used for linear control applications [16].

A related approach to speed up SOS programming was taken in [17], where the authors replaced the PSD cone with the cone of factor-width-two matrices (which we denote ${\mathcal{FW}}_{2}^{N}$, where $N$ stands for the dimension of the matrix). A matrix has factor width two if it can be represented as a sum of rank-two PSD matrices [18], and hence it is also PSD. A certificate for ${\mathcal{FW}}_{2}^{N}$ matrices can be written as a number of second-order cone constraints, which can reduce the computational and memory burden as demonstrated in [17]. We note that ${\mathcal{FW}}_{2}^{N}$ matrices are also scaled diagonally dominant (SDD), as discussed in [18]. The reader unfamiliar with SDD matrices is referred to [19] for details. We only highlight that the individual entries of SDD matrices satisfy a particular set of constraints.

As discussed in [17], the cone ${\mathcal{FW}}_{2}^{N}$ is significantly smaller than the PSD cone. Therefore, the restricted problem may be infeasible, or the optimal solution of the ${\mathcal{FW}}_{2}^{N}$ program may differ significantly from the optimal solution of the original SDP. There are several approaches to bridge this restriction gap, cf. [20]. For example, one can employ factor-width-$k$ matrices, which can be decomposed into a sum of PSD matrices of rank $k$. Enforcing this constraint, however, is problematic due to the large number of $k\times k$ PSD constraints, which is $N$ choose $k$, i.e., ${N\choose k}$. Therefore, the computational burden can actually increase in comparison to the original SDP.

In this paper, we take a different route in order to enrich the cone of factor-width-two matrices: We draw inspiration from SDD matrices and consider their block extension. The key idea of this extension is to partition a matrix into a set of non-intersecting blocks of entries and enforce the SDD constraints on these blocks instead of the individual entries [21]. We introduce the class of block factor-width-two matrices based on the block SDD definitions from [22, 23]. A block factor-width-two matrix is also PSD and the constraint “the matrix is block factor-width-two” can be enforced using a number of PSD constraints whose size is determined by the size of the blocks. We proceed by deriving a hierarchy of inner and outer approximations of the PSD cone based on the block partition. We propose to use this approximation in SDPs by replacing the PSD cone constraint with a block “factor-width-two” constraint. The optimal objective value of the SDPs typically cannot be achieved using this technique, however, the computational cost is reduced. Striking the balance between the accuracy of the solution and the speed can be delicate in general, therefore, we envision the feasibility of SDPs without a specific sparsity structure as the main application. For example, finding a Lyapunov function certifying stability of a nonlinear system often results in a feasibility SDP without a particular sparsity structure. Therefore, in this paper, we mainly focus on SOS programs as an application.

In Section II we cover preliminaries. In Section III we introduce block factor-width-two matrices, a hierarchy of inner and outer approximations of the PSD cone and their SDP and SOS applications. We present numerical examples in Section IV and conclude the paper in Section V.

Notation. The matrix $A^{T}$ denotes the transpose of $A\in{\mathbb{R}}^{n\times n}$. We denote the sets of $n$ by $n$ symmetric, positive semidefinite, and positive definite matrices as ${\mathbb{S}}^{n}$, ${\mathbb{S}}_{+}^{n}$, ${\mathbb{S}}_{++}^{n}$, respectively. We use $I_{k}$ to denote an identity matrix of dimension $k\times k$.

II Preliminaries

II-A Partitioned Matrices

We say that a matrix $A\in{\mathbb{R}}^{N\times N}$ has $\alpha=\{k_{1},\dots,k_{p}\}$ -partition with $N=\sum\limits_{i=1}^{p}k_{i}$ , if $A$ can be written as

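$A=\begin{bmatrix}A_{11}&A_{12}&\dots&A_{1p}\\ A_{21}&A_{22}&\dots&A_{2p}\\ \vdots&\vdots&\ddots&\vdots\\ A_{p1}&A_{p2}&\dots&A_{pp}\end{bmatrix},$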

where $A_{ij}\in{\mathbb{R}}^{k_{i}\times k_{j}}$ . For a partition $\alpha=\{k_{1},\dots,k_{p}\}$ we define block basis matrices

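$E_{ij}=\begin{bmatrix}E_{i}^{T}&E_{j}^{T}\end{bmatrix}^{T}\in{\mathbb{R}}^{(k_{i}+k_{j})\times N},\quad i\neq j,\qquad(1)$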

where

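$I=\begin{bmatrix}I_{k_{1}}&&&\\ &I_{k_{2}}&&\\ &&\ddots&\\ &&&I_{k_{p}}\end{bmatrix}=\begin{bmatrix}E_{1}\\ E_{2}\\ \vdots\\ E_{p}\end{bmatrix},\qquad E_{i}=\begin{bmatrix}0&\dots&I_{k_{i}}&\dots&0\end{bmatrix}\in{\mathbb{R}}^{k_{i}\times N}.$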

We also define a relation between a partition $\beta$ of the matrix $A$ and a coarser partition $\alpha$ of the same matrix.

Definition 1

Let $\alpha=\{k_{1},\dots,k_{p_{1}}\}$ and $\beta=\{l_{1},\dots,l_{p_{2}}\}$, where $p_{1}<p_{2}$ and $\sum_{i=1}^{p_{1}}k_{i}=\sum_{i=1}^{p_{2}}l_{i}$. We say that $\beta$ is a sub-partition of $\alpha$ and write $\alpha\sqsupseteq\beta$, if there exist integers $\{m_{i}\}_{i=1}^{p_{1}+1}$ such that $k_{i}=\sum_{j=m_{i}}^{m_{i+1}-1}l_{j}$ for $i=1,\dots,p_{1}$, with $m_{1}=1$, $m_{p_{1}+1}=p_{2}+1$ and $m_{i}<m_{i+1}$ for all $i$.

For example, given $\alpha=\{4,2\}$ , $\beta=\{2,2,2\}$ and $\gamma=\{1,1,1,1,1,1\}$ , we have $\alpha\sqsupseteq\beta\sqsupseteq\gamma$ .
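The sub-partition relation amounts to a comparison of block boundaries, which the following minimal sketch checks. The function name `is_subpartition` is ours rather than the paper's, and for convenience the sketch also accepts identical partitions, whereas Definition 1 asks for strictly more blocks in $\beta$.

```python
from itertools import accumulate


def is_subpartition(alpha, beta):
    """Return True if every block of alpha is a run of consecutive blocks of beta."""
    if sum(alpha) != sum(beta):
        return False
    # Cumulative block sizes are the block boundaries of each partition.
    beta_cuts = set(accumulate(beta))
    # beta refines alpha exactly when every boundary of alpha is a boundary of beta.
    return all(cut in beta_cuts for cut in accumulate(alpha))


# The example from the text: {4,2} is coarser than {2,2,2}, which is coarser than {1,...,1}.
print(is_subpartition([4, 2], [2, 2, 2]), is_subpartition([2, 2, 2], [1] * 6))
```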

II-B Semidefinite and sum-of-squares programming

The standard primal-form semidefinite program (SDP) is an optimisation problem of the form:

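$\min_{X}\ \langle C,X\rangle\quad\text{subject to}\quad\langle A_{i},X\rangle=b_{i},\ i=1,\dots,m,\quad X\in{\mathbb{S}}_{+}^{N},\qquad(2)$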

where $C,A_{i}\in\mathbb{S}^{N},i=1,\ldots,m$ and $b\in\mathbb{R}^{m}$ are given problem data.

SDPs have found many applications in linear systems theory, such as stabilization and $\mathcal{H}_{2}/\mathcal{H}_{\infty}$ control [2]. Also, nonlinear control problems in a polynomial field can often be written as polynomial optimisation programs: Given a set of polynomials $f_{0}(x),f_{1}(x),\ldots,f_{m}(x)$ (their coefficients are given) with $x\in{\mathbb{R}}^{n}$ and the vector $b\in\mathbb{R}^{m}$ we aim to solve

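$\min_{y}\ b^{T}y\quad\text{subject to}\quad f_{0}(x)+\sum_{i=1}^{m}y_{i}f_{i}(x)\geq 0,\ \forall x\in{\mathbb{R}}^{n}.\qquad(3)$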

Even though the nonnegativity constraint in (3) is convex, the program is infinite dimensional due to the dependence on $x$. Therefore, a tractable sum-of-squares relaxation of the nonnegativity constraint is typically used. Given $x\in\mathbb{R}^{n}$, a polynomial $p(x)$ of degree $2d$ is called a sum-of-squares (SOS) polynomial if it can be written as a sum of squares of other polynomials of degree no greater than $d$. It is known ([24, 25]) that $p(x)$ admits an SOS decomposition if and only if there exists $Q\in\mathbb{S}^{N}_{+}$ with $N={n+d\choose d}$ such that

$p(x)=v_{d}(x)^{T}Qv_{d}(x),$

where $v_{d}(x)$ is a vector of monomials of degree no greater than $d$ . Replacing the nonnegative constraint with an SOS constraint yields the following optimisation program:

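$\min_{y}\ b^{T}y\quad\text{subject to}\quad v_{d}(x)^{T}Qv_{d}(x)=f_{0}(x)+\sum_{i=1}^{m}y_{i}f_{i}(x),\ \forall x\in{\mathbb{R}}^{n},\quad Q\in{\mathbb{S}}_{+}^{N},$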

where the constraints imply that $f_{0}(x)+\sum_{i=1}^{m}y_{i}f_{i}(x)$ is an SOS polynomial. Matching the coefficients on both sides of the polynomial equality leads to a set of linear equality constraints on $Q$ and $y$, and we obtain an SDP of the form (2) with additional free variables.
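The coefficient matching can be illustrated on a small example of our own (it does not appear in the paper): a univariate polynomial with $n=1$, $d=2$, hence $N={3\choose 2}=3$ and $v_{2}(x)=[1\ x\ x^{2}]^{T}$. The parameter $t$ below is a label we introduce for the one free entry of $Q$.

```latex
\begin{aligned}
p(x) &= x^{4}+2x^{2}+1, \\
v_{2}(x)^{T}Qv_{2}(x) &= Q_{11}+2Q_{12}x+(2Q_{13}+Q_{22})x^{2}+2Q_{23}x^{3}+Q_{33}x^{4}, \\
\text{matching coefficients:}\quad & Q_{11}=1,\quad Q_{12}=0,\quad 2Q_{13}+Q_{22}=2,\quad Q_{23}=0,\quad Q_{33}=1, \\
Q(t) &= \begin{bmatrix}1&0&t\\ 0&2-2t&0\\ t&0&1\end{bmatrix}\succeq 0
\quad\Longleftrightarrow\quad -1\leq t\leq 1 .
\end{aligned}
```

The equalities are linear in the entries of $Q$, and the Gram matrix is not unique: $t=1$ gives the rank-one certificate $p(x)=(1+x^{2})^{2}$, while $t=0$ gives $p(x)=1+(\sqrt{2}x)^{2}+(x^{2})^{2}$.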

II-C Factor Width of Positive Semidefinite Matrices

It is well-known that small and medium-sized SDPs can be solved up to an arbitrary accuracy in polynomial time via interior point methods [1]. However, as the size of the PSD cone $N$ in (2) increases, the current state-of-the-art interior point algorithms become impractical in terms of memory requirements, computational burden or numerical accuracy. In the recent work [17] it was proposed to speed up semidefinite and SOS optimisation by replacing the PSD cone by a cone of factor-width-two matrices. This work is based on the following definitions from [18].

Definition 2

A matrix $X\in{\mathbb{S}}_{+}^{N}$ belongs to the class of factor-width- $k$ matrices (denoted as ${\mathcal{FW}}_{k}^{N}$ ) if and only if

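$X=\sum_{i=1}^{s}e_{i}^{T}X_{i}e_{i},\quad\text{with }X_{i}\in{\mathbb{S}}_{+}^{k},\ e_{i}\in{\mathcal{T}}_{k},$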

where ${\mathcal{T}}_{k}$ is a collection of matrices $e_{i}\in{\mathbb{R}}^{k\times N}$ with every row having only one non-zero element equal to one, the rows being orthonormal, and $s={N\choose k}$.

The matrices $e_{i}$ can be seen as a decomposition basis for the matrix $X$ . It can be shown that a dual (with respect to the trace inner product) set to ${\mathcal{FW}}_{k}^{N}$ can be characterised as follows

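$({\mathcal{FW}}_{k}^{N})^{\ast}=\{Z\in{\mathbb{S}}^{N}\mid e_{i}Ze_{i}^{T}\in{\mathbb{S}}_{+}^{k},\ \forall e_{i}\in{\mathcal{T}}_{k}\}.$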

One can also show that the following hierarchy of inner and outer approximations of ${\mathbb{S}}_{+}^{N}$ holds:

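${\mathcal{FW}}_{1}^{N}\subset{\mathcal{FW}}_{2}^{N}\subset\dots\subset{\mathcal{FW}}_{N}^{N}={\mathbb{S}}_{+}^{N}=({\mathcal{FW}}_{N}^{N})^{\ast}\subset\dots\subset({\mathcal{FW}}_{2}^{N})^{\ast}\subset({\mathcal{FW}}_{1}^{N})^{\ast}.$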

Replacing $\mathbb{S}^{N}_{+}$ in (2) with ${\mathcal{FW}}_{k}^{N}$ leads to a restriction with multiple $k\times k$ PSD cones. In particular, the factor-width-two matrices can be written as

$X=\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}E_{ij}^{T}X_{ij}E_{ij},\quad\text{with }X_{ij}\in{\mathbb{S}}_{+}^{2},$

where the matrices $E_{ij}$ are defined as in Section II-A with $\alpha=\{1,\dots,1\}$ and $p=N$ . The constraints $X_{ij}\in{\mathbb{S}}^{2}_{+}$ can be equivalently written as second-order cone constraints, which can be solved much faster compared to solving SDPs. This fact has been used to solve large-scale SOS programs in [17]. However, the solution from ${\mathcal{FW}}_{2}^{N}$ might be very conservative. As was pointed out in [17], increasing the factor width may reduce the degree of conservatism, but this requires working with a combinatorial number ${N\choose k}$ of PSD cones of size $k\times k$ , which is not practical.
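The equivalence between a $2\times 2$ PSD constraint and a second-order cone constraint can be checked numerically. The sketch below uses the standard identity that $\begin{bmatrix}a&b\\ b&c\end{bmatrix}\succeq 0$ holds exactly when $\|(2b,\,a-c)\|_{2}\leq a+c$; the function names and the tolerance `tol` are our own choices.

```python
import math


def psd_2x2(a, b, c, tol=1e-12):
    """PSD test for [[a, b], [b, c]]: nonnegative diagonal and determinant."""
    return a >= -tol and c >= -tol and a * c - b * b >= -tol


def soc_2x2(a, b, c, tol=1e-12):
    """The same test written as the second-order cone constraint ||(2b, a - c)|| <= a + c."""
    return math.hypot(2 * b, a - c) <= a + c + tol


# [[4.5, 8], [8, 14.5]] passes both tests, [[1, 2], [2, 1]] fails both.
print(psd_2x2(4.5, 8, 14.5), soc_2x2(4.5, 8, 14.5))
print(psd_2x2(1, 2, 1), soc_2x2(1, 2, 1))
```

The two tests agree because $(a+c)^{2}-(a-c)^{2}=4ac$, so the cone constraint encodes $ac\geq b^{2}$ together with $a+c\geq 0$.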

III Block Factor-width-two Matrices

III-A Block factor-width-two matrices

In this section, we introduce the class of block factor-width-two matrices, which is less conservative than ${\mathcal{FW}}_{2}^{N}$ and more scalable than ${\mathcal{FW}}_{k}^{N}$ ( $k\geq 3$ ).

Definition 3

We say that $\alpha=\{k_{1},\dots,k_{p}\}$ -partitioned matrix $X\in{\mathbb{S}}^{N}$ belongs to the class ${\mathcal{FW}}^{N}_{\alpha,2}$ if and only if

$X=\sum_{i=1}^{p-1}\sum_{j=i+1}^{p}E_{ij}^{T}X_{ij}E_{ij},$

where $X_{ij}\in{\mathbb{S}}^{k_{i}+k_{j}}_{+}$ and $E_{ij}$ are defined in (1).
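Multiplying by $E_{ij}^{T}$ and $E_{ij}$ simply places $X_{ij}$ on the rows and columns that belong to blocks $i$ and $j$. The following sketch assembles $X$ from given blocks without forming $E_{ij}$ explicitly; the helper names `block_offsets` and `assemble` and the 0-based block indices are our own conventions.

```python
def block_offsets(alpha):
    """Start index of every block of the partition alpha."""
    offsets, start = [], 0
    for k in alpha:
        offsets.append(start)
        start += k
    return offsets


def assemble(alpha, blocks):
    """Return X = sum_{i<j} E_ij^T X_ij E_ij as a list of rows.

    blocks maps a 0-based pair (i, j), i < j, to a square matrix of size k_i + k_j.
    """
    n = sum(alpha)
    offsets = block_offsets(alpha)
    X = [[0.0] * n for _ in range(n)]
    for (i, j), Xij in blocks.items():
        # Rows selected by E_ij: the indices of block i followed by those of block j.
        idx = list(range(offsets[i], offsets[i] + alpha[i]))
        idx += list(range(offsets[j], offsets[j] + alpha[j]))
        for r, gr in enumerate(idx):
            for c, gc in enumerate(idx):
                X[gr][gc] += Xij[r][c]
    return X


# Three scalar blocks, each pair contributing the PSD matrix [[1, 1], [1, 1]].
ones = [[1, 1], [1, 1]]
print(assemble([1, 1, 1], {(0, 1): ones, (0, 2): ones, (1, 2): ones}))
```

Membership in ${\mathcal{FW}}_{\alpha,2}^{N}$ additionally requires every `Xij` to be PSD, which this helper does not check.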

It is straightforward to show that the set ${\mathcal{FW}}_{\alpha,2}^{N}$ is a cone with a non-empty interior, which is also:

• convex: for any $X,Y\in{\mathcal{FW}}_{\alpha,2}^{N}$, $0\leq\theta\leq 1$, we have that $\theta X+(1-\theta)Y\in{\mathcal{FW}}_{\alpha,2}^{N}$,

• salient: for any nonzero $X\in{\mathcal{FW}}_{\alpha,2}^{N}$, $-X\not\in{\mathcal{FW}}_{\alpha,2}^{N}$,

• pointed: the zero matrix is in ${\mathcal{FW}}_{\alpha,2}^{N}$.

We will show in what follows that this cone is additionally closed, which makes it a proper cone (closed, convex, pointed, salient cone with non-empty interior).

The main difference with the definition of factor-width-two matrices comes in the partition $\alpha$ , which dictates the sizes of $X_{ij}$ ’s and $E_{ij}$ ’s, as well as their number. The number of basis matrices $E_{ij}$ is the same as in the case when we treat every block $X_{ij}$ as a scalar and apply the factor width decomposition to it. In our definition, we have a fixed partition $\alpha$ and a fixed “block factor-width”, which is equal to two. In order to create a hierarchy of approximations of ${\mathbb{S}}^{N}_{+}$ we can increase the “block factor-width”, which means increasing the number of basis matrices $E_{ij}$ . However, we can also build a hierarchy based on the partition coarsening, which reduces the number of basis matrices $E_{ij}$ .

Theorem 1

Given $\alpha=\{k_{1},\dots,k_{p}\}$ , $\beta=\{\widetilde{k}_{1},\dots,\widetilde{k}_{q}\}$ and $\alpha\sqsupseteq\beta$ , we have the following inclusion:

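${\mathcal{FW}}_{2}^{N}={\mathcal{FW}}_{\mathbf{1},2}^{N}\subset{\mathcal{FW}}_{\beta,2}^{N}\subset{\mathcal{FW}}_{\alpha,2}^{N}\subset{\mathcal{FW}}_{\mathbf{1},\max_{i\neq j}\{k_{i}+k_{j}\}}^{N}\subset{\mathcal{FW}}_{\{K_{1},K_{2}\},2}^{N}={\mathbb{S}}_{+}^{N},$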

where $\mathbf{1}=\{1,1,\ldots,1\}$ , $K_{1}$ , $K_{2}$ are positive integers and $K_{1}+K_{2}=N$ .

Proof:

First, ${\mathcal{FW}}_{2}^{N}={\mathcal{FW}}_{\mathbf{1},2}^{N}$ and ${\mathcal{FW}}_{\mathbf{1},\max_{i\neq j}\{k_{i}+k_{j}\}}^{N}\subset{\mathcal{FW}}_{\{K_{1},K_{2}\},2}^{N}={\mathbb{S}}_{+}^{N}$ hold by definition. Furthermore, ${\mathcal{FW}}_{\alpha,2}^{N}\subset{\mathcal{FW}}_{\mathbf{1},\max_{i\neq j}\{k_{i}+k_{j}\}}^{N}$ is true since in the decomposition for ${\mathcal{FW}}_{\alpha,2}^{N}$ we use PSD matrices of dimension at most $\max_{i\neq j}\{k_{i}+k_{j}\}$.

In order to prove ${\mathcal{FW}}_{\beta,2}^{N}\subset{\mathcal{FW}}_{\alpha,2}^{N}$ it suffices to consider the case $\beta=\{k_{1},\dots,k_{p-1},k_{p},k_{p+1}\}$, $\alpha=\{k_{1},\dots,k_{p-1},k_{p}+k_{p+1}\}$. Let $E_{\beta ij}$ for $i,j=1,\dots,p+1$ be the decomposition basis for the $\beta$-partition and $E_{\alpha ij}$ for $i,j=1,\dots,p$ be the decomposition basis for the $\alpha$-partition. By the premise, there exist $X_{ij}\in{\mathbb{S}}_{+}^{k_{i}+k_{j}}$ such that:

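$X=\sum_{i=1}^{p}\sum_{j=i+1}^{p+1}E_{\beta ij}^{T}X_{ij}E_{\beta ij}=\sum_{i=1}^{p-1}\sum_{j=i+1}^{p-1}E_{\beta ij}^{T}X_{ij}E_{\beta ij}+\sum_{i=1}^{p-1}E_{\beta ip}^{T}X_{ip}E_{\beta ip}+\sum_{i=1}^{p}E_{\beta i(p+1)}^{T}X_{i(p+1)}E_{\beta i(p+1)}.$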

We need to construct $\widetilde{X}_{ij}$ so that $X$ is decomposed as:

$X=\sum_{i=1}^{p-1}\sum_{j=i+1}^{p}E_{\alpha ij}^{T}\widetilde{X}_{ij}E_{\alpha ij}.\qquad(6)$

Since the first $p-1$ blocks in both partitions are the same, we have that $E_{\alpha ij}=E_{\beta ij}$ and $\widetilde{X}_{ij}=X_{ij}$ for all $i,j<p$ . Therefore, in order to obtain the decomposition (6), it remains to construct $\widetilde{X}_{ip}$ for $i=1,\dots,p-1$ such that

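$\sum_{i=1}^{p-1}E_{\alpha ip}^{T}\widetilde{X}_{ip}E_{\alpha ip}=\sum_{i=1}^{p-1}E_{\beta ip}^{T}X_{ip}E_{\beta ip}+\sum_{i=1}^{p}E_{\beta i(p+1)}^{T}X_{i(p+1)}E_{\beta i(p+1)}.\qquad(7)$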

Consider the matrices $X_{ij}$ for $i<p$ and $j=p,p+1$ and split them according to the partition

$X_{ij}=\begin{bmatrix}X_{ij}^{11}&X_{ij}^{12}\\ (X_{ij}^{12})^{T}&X_{ij}^{22}\end{bmatrix},$

where $X_{ij}^{11}\in{\mathbb{S}}_{+}^{k_{i}}$ , $X_{ij}^{12}\in{\mathbb{R}}^{k_{i}\times k_{j}}$ , $X_{ij}^{22}\in{\mathbb{S}}_{+}^{k_{j}}$ . It can be verified by direct computation that the identity (7) holds if $\widetilde{X}_{ip}$ for $i<p$ are chosen as follows:

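$\widetilde{X}_{ip}=\begin{bmatrix}X_{ip}^{11}&X_{ip}^{12}&0\\ (X_{ip}^{12})^{T}&X_{ip}^{22}&0\\ 0&0&0\end{bmatrix}+\begin{bmatrix}X_{i(p+1)}^{11}&0&X_{i(p+1)}^{12}\\ 0&0&0\\ (X_{i(p+1)}^{12})^{T}&0&X_{i(p+1)}^{22}\end{bmatrix}+\frac{1}{p-1}\begin{bmatrix}0&0&0\\ 0&X_{p(p+1)}^{11}&X_{p(p+1)}^{12}\\ 0&(X_{p(p+1)}^{12})^{T}&X_{p(p+1)}^{22}\end{bmatrix}.$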

Thus we complete the proof. ∎

Example 1

Consider the following PSD matrix

$X=\begin{bmatrix}6&8&-2&-2\\ 8&16&1&1\\ -2&1&10&-1\\ -2&1&-1&24\end{bmatrix}.$

It can be verified that $X\in{\mathcal{FW}}_{\beta,2}^{4}$ for the partition $\beta=\{1,1,1,1\}$ and the matrices in the decomposition can be chosen as follows:

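$X_{12}=\begin{bmatrix}4.5&8\\ 8&14.5\end{bmatrix},\ X_{13}=\begin{bmatrix}1&-2\\ -2&6\end{bmatrix},\ X_{23}=\begin{bmatrix}1&1\\ 1&2\end{bmatrix},\ X_{14}=\begin{bmatrix}0.5&-2\\ -2&12\end{bmatrix},\ X_{24}=\begin{bmatrix}0.5&1\\ 1&6\end{bmatrix},\ X_{34}=\begin{bmatrix}2&-1\\ -1&6\end{bmatrix}.$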

If we collapse the last two entries into a block and obtain the partition $\alpha=\{1,1,2\}$ , then we can use the constructions in Theorem 1 in order to obtain the matrices $\widetilde{X}_{12}=X_{12}$ ,

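$\widetilde{X}_{13}=\begin{bmatrix}1.5&-2&-2\\ -2&7&-0.5\\ -2&-0.5&15\end{bmatrix},\quad\widetilde{X}_{23}=\begin{bmatrix}1.5&1&1\\ 1&3&-0.5\\ 1&-0.5&9\end{bmatrix}.$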

The matrices $\widetilde{X}_{12}$ , $\widetilde{X}_{13}$ , $\widetilde{X}_{23}$ are PSD, which shows that $X\in{\mathcal{FW}}_{\alpha,2}^{4}$ .
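The numbers in Example 1 can be reproduced with a short script. The sketch below rebuilds $X$ from the six $2\times 2$ blocks, applies the merging construction from the proof of Theorem 1 to the last two scalar blocks, and checks that the coarser decomposition gives the same $X$. The helpers `assemble` and `is_pd` are our own, block indices are 0-based, and the merging loop is written for this example only, not for general partitions.

```python
def assemble(alpha, blocks):
    """X = sum_{i<j} E_ij^T X_ij E_ij for 0-based block pairs (i, j)."""
    offs = [sum(alpha[:i]) for i in range(len(alpha))]
    n = sum(alpha)
    X = [[0.0] * n for _ in range(n)]
    for (i, j), B in blocks.items():
        idx = [offs[i] + t for t in range(alpha[i])] + [offs[j] + t for t in range(alpha[j])]
        for r, gr in enumerate(idx):
            for c, gc in enumerate(idx):
                X[gr][gc] += B[r][c]
    return X


def is_pd(M):
    """Positive definiteness via Gaussian elimination: all pivots must be positive."""
    A = [row[:] for row in M]
    for k in range(len(A)):
        if A[k][k] <= 0:
            return False
        for r in range(k + 1, len(A)):
            f = A[r][k] / A[k][k]
            for c in range(k, len(A)):
                A[r][c] -= f * A[k][c]
    return True


# Example 1 data for beta = {1, 1, 1, 1}.
beta = [1, 1, 1, 1]
Xb = {
    (0, 1): [[4.5, 8], [8, 14.5]], (0, 2): [[1, -2], [-2, 6]], (1, 2): [[1, 1], [1, 2]],
    (0, 3): [[0.5, -2], [-2, 12]], (1, 3): [[0.5, 1], [1, 6]], (2, 3): [[2, -1], [-1, 6]],
}
X = assemble(beta, Xb)

# Coarsen to alpha = {1, 1, 2} by merging the last two blocks.
alpha = [1, 1, 2]
p = len(alpha)
Xa = {(0, 1): Xb[(0, 1)]}
for i in range(p - 1):
    # Local 3x3 frame: entry 0 is block i, entries 1 and 2 are the two merged blocks.
    local = assemble([1, 1, 1], {(0, 1): Xb[(i, 2)], (0, 2): Xb[(i, 3)]})
    share = assemble([1, 1, 1], {(1, 2): Xb[(2, 3)]})
    Xa[(i, 2)] = [[local[r][c] + share[r][c] / (p - 1) for c in range(3)] for r in range(3)]

print(Xa[(0, 2)])
print(assemble(alpha, Xa) == X, all(is_pd(B) for B in Xa.values()))
```

The block shared by the two merged entries is split evenly over the $p-1$ new blocks, which is where the factor $\frac{1}{p-1}$ in the construction comes from.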

We can also describe a dual set of matrices to ${\mathcal{FW}}_{\alpha,2}^{N}$ matrices (with respect to the trace inner product), which creates an outer approximation hierarchy for the cone ${\mathbb{S}}^{N}_{+}$ .

Corollary 1

The dual to ${\mathcal{FW}}_{\alpha,2}^{N}$ with respect to the trace inner product is defined as:

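$({\mathcal{FW}}_{\alpha,2}^{N})^{\ast}=\{Z\in{\mathbb{S}}^{N}\mid E_{ij}ZE_{ij}^{T}\in{\mathbb{S}}_{+}^{k_{i}+k_{j}},\ \forall\,1\leq i<j\leq p\}.$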

Furthermore, let $\alpha=\{k_{1},\dots,k_{p}\}$ and $\beta=\{\widetilde{k}_{1},\dots,\widetilde{k}_{q}\}$ , $\alpha\sqsupseteq\beta$ , then we have the following inclusions:

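$({\mathcal{FW}}_{\mathbf{1},2}^{N})^{\ast}\supset({\mathcal{FW}}_{\beta,2}^{N})^{\ast}\supset({\mathcal{FW}}_{\alpha,2}^{N})^{\ast}\supset({\mathcal{FW}}_{\mathbf{1},\max_{i\neq j}\{k_{i}+k_{j}\}}^{N})^{\ast}\supset({\mathcal{FW}}_{\{K_{1},K_{2}\},2}^{N})^{\ast}={\mathbb{S}}_{+}^{N},$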

where $K_{1}$, $K_{2}$ are positive integers and $K_{1}+K_{2}=N$.

Proof:

The proof of the first part follows after noticing that for any matrix $Z\in{\mathbb{S}}^{N}$ such that $E_{ij}ZE_{ij}^{T}\in{\mathbb{S}}_{+}^{k_{i}+k_{j}}$ for all $1\leq i<j\leq p$, and for any matrix $X\in{\mathcal{FW}}_{\alpha,2}^{N}$, we have $\textrm{trace}(XZ)\geq 0$. The proof of the second part is straightforward. ∎
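The trace inequality used in this proof can be spelled out as follows. This is a sketch in the notation of Corollary 1: the first line uses the cyclic property of the trace and the fact that the trace of a product of two PSD matrices is nonnegative, and the second line, which covers the converse inclusion, is our own filling-in with the test matrix $X=E_{ij}^{T}vv^{T}E_{ij}\in{\mathcal{FW}}_{\alpha,2}^{N}$.

```latex
\begin{aligned}
\operatorname{trace}(XZ)
  &= \sum_{i=1}^{p-1}\sum_{j=i+1}^{p}\operatorname{trace}\bigl(E_{ij}^{T}X_{ij}E_{ij}Z\bigr)
   = \sum_{i=1}^{p-1}\sum_{j=i+1}^{p}\operatorname{trace}\bigl(X_{ij}\,E_{ij}ZE_{ij}^{T}\bigr)\;\geq\;0, \\
v^{T}E_{ij}ZE_{ij}^{T}v
  &= \operatorname{trace}\bigl((E_{ij}^{T}vv^{T}E_{ij})\,Z\bigr)\;\geq\;0
  \quad\text{for all } v\in\mathbb{R}^{k_{i}+k_{j}} \text{ whenever } Z\in(\mathcal{FW}_{\alpha,2}^{N})^{\ast}.
\end{aligned}
```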

Remark 1

Using the terminology in [26] the cone $({\mathcal{FW}}_{\alpha,2}^{N})^{\ast}$ is a partially separable cone, which ensures that its dual is ${\mathcal{FW}}_{\alpha,2}^{N}$ and $({\mathcal{FW}}_{\alpha,2}^{N})^{\ast}$ is a proper cone.

The major difference between our hierarchy of ${\mathcal{FW}}_{\alpha,2}^{N}$ and the hierarchy of ${\mathcal{FW}}_{k}^{N}$ is the number of basis matrices, which in our case is substantially lower due to two reasons: We use factor-width-two generalisations and we coarsen the partitions. Therefore the number of basis matrices is equal to $p(p-1)/2$ for $\alpha=\{k_{1},\dots,k_{p}\}$ , and as we make a partition coarser the number $p$ and hence the number of basis matrices decreases. We note however that the set ${\mathcal{FW}}_{3}^{N}$ is not contained in ${\mathcal{FW}}_{\alpha,2}^{N}$ for $p>2$ . This is because ${\mathcal{FW}}_{3}^{N}$ contains all possible combinations of $e_{i}$ ’s as the basis vectors (read all possible partitions). In contrast, ${\mathcal{FW}}_{\alpha,2}^{N}$ will not consider certain choices of partitions. Therefore, our approach has a particular advantage in applications where partitions come as a natural property of the problem.
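The gap between the two hierarchies is easy to quantify. The sketch below counts the PSD constraints in each case; the function names are ours, and $N=66$ is taken from the $n=10$ row of Table I.

```python
from math import comb


def num_block_constraints(p):
    """Number of basis matrices E_ij, i < j, for a partition with p blocks."""
    return p * (p - 1) // 2


def num_factor_width_constraints(N, k):
    """Number of k x k PSD cones in the factor-width-k decomposition."""
    return comb(N, k)


N = 66
print(num_block_constraints(10))           # block factor-width-two, p = 10
print(num_factor_width_constraints(N, 2))  # factor-width-two, one cone per pair of entries
print(num_factor_width_constraints(N, 3))  # factor-width-three
```

For $N=66$ a partition with $p=10$ blocks needs 45 PSD constraints, whereas ${\mathcal{FW}}_{3}^{66}$ already needs 45760 constraints of size $3\times 3$.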

III-B Applications to SDPs in the Standard Primal Form

The main idea is to replace the cone ${\mathbb{S}}_{+}^{N}$ with ${\mathcal{FW}}_{\alpha,2}^{N}$ or its dual in order to obtain a restriction of the original program. Consider a restriction of (2), where we assume the matrix $X$ is partitioned according to $\alpha=\{k_{1},\dots,k_{p}\}$ ,

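$\min_{X}\ \langle C,X\rangle\quad\text{subject to}\quad\langle A_{i},X\rangle=b_{i},\ i=1,\dots,m,\quad X\in{\mathcal{FW}}_{\alpha,2}^{N}.$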

This program can be cast in the SDP form as follows:

[TABLE]

which is amenable for a straightforward implementation in standard SDP solvers such as SeDuMi [27], MOSEK [28] or SCS [11]. This program has the same number of equality constraints as (2), but the number and the dimensions of PSD constraints are different. We can also perform a relaxation of (2) by replacing $X\in{\mathbb{S}}_{+}^{N}$ by $X\in({\mathcal{FW}}_{\alpha,2}^{N})^{\ast}$ . We will not discuss the relaxation in detail since we focus on the restriction of the primal SDP.
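One way to write the SDP form explicitly is to substitute the decomposition from Definition 3 into the objective and the equality constraints and to use $\langle C,E_{lj}^{T}X_{lj}E_{lj}\rangle=\langle E_{lj}CE_{lj}^{T},X_{lj}\rangle$. The program below is our own sketch of that substitution; the paper's formulation may be arranged differently.

```latex
\begin{aligned}
\min_{X_{lj}}\quad & \sum_{l=1}^{p-1}\sum_{j=l+1}^{p}\bigl\langle E_{lj}CE_{lj}^{T},\,X_{lj}\bigr\rangle \\
\text{subject to}\quad & \sum_{l=1}^{p-1}\sum_{j=l+1}^{p}\bigl\langle E_{lj}A_{i}E_{lj}^{T},\,X_{lj}\bigr\rangle=b_{i},\quad i=1,\dots,m, \\
& X_{lj}\in\mathbb{S}_{+}^{k_{l}+k_{j}},\quad 1\leq l<j\leq p .
\end{aligned}
```

Written this way, the $m$ equality constraints are retained and the single $N\times N$ PSD constraint is replaced by $p(p-1)/2$ constraints of size $k_{l}+k_{j}$.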

IV Numerical Examples

In our numerical examples, we used YALMIP [29] to reformulate the polynomial optimisation program into a standard SDP, and we solved the SDPs using MOSEK [28]. Code is available via https://github.com/zhengy09/SDPfw.

IV-A Polynomial Optimisation

We consider the polynomial optimisation problem:

[TABLE]

where

[TABLE]

We added the last term, so that the problem does not enjoy the structure exploited by the methods in [12, 13]. We vary $n$ and obtain different semidefinite optimisation problems in the standard primal form with constraints of different sizes listed in Table I.

We partition the SDP as discussed in Section III-B. We fix the number of blocks $p$ and choose the block sizes as the closest integers to $N/p$, where $N$ is the size of the SDP constraint. In particular, if $k_{1}\leq N/p\leq k_{2}$, then we pick the maximum number of blocks of size $k_{1}$ and the rest of size $k_{2}$. The number of SDP constraints, as discussed above, is equal to $p(p-1)/2$. Note that the number of linear constraints remains the same as in the full SDP.
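The block-size rule can be sketched as follows. This is our reading of the rule, not the authors' code: `block_sizes` returns blocks of size $\lfloor N/p\rfloor$ or $\lceil N/p\rceil$ that sum to $N$, and `psd_size_bounds` returns the bounds $2\lfloor N/p\rfloor$ and $2\lceil N/p\rceil$ on $k_{i}+k_{j}$, which coincide with the ranges listed under "Sizes of SDP Constraints" in Table I.

```python
def block_sizes(N, p):
    """Split N into p blocks of size floor(N/p) or ceil(N/p), smaller blocks first."""
    k1, r = divmod(N, p)
    return [k1] * (p - r) + [k1 + 1] * r


def psd_size_bounds(N, p):
    """Lower and upper bounds 2*floor(N/p), 2*ceil(N/p) on the sizes k_i + k_j."""
    k1, r = divmod(N, p)
    k2 = k1 if r == 0 else k1 + 1
    return 2 * k1, 2 * k2


# N = 66 corresponds to n = 10 in Table I.
print(block_sizes(66, 4), psd_size_bounds(66, 4))
print(psd_size_bounds(66, 50))
```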

We present the computational times and the objective values in Table I. It is noticeable that with a finer partition we obtain faster solutions, which are, however, conservative in terms of the objective function. Fine partitions may be very useful for feasibility programs, while coarse partitions are competitive for large SDPs, where the value of the objective function is important. Note that for large-scale instances $n\geq 25$, MOSEK ran out of memory on our machine. On the other hand, our strategy of using block factor-width-two matrices can still provide a useful upper bound for (10).

IV-B Matrix Sum-of-Squares Programming

In our second example, we show that there exists a natural partition $\alpha$ in the case of the matrix version of SOS programs. Indeed, consider a polynomial matrix constraint:

[TABLE]

Treating this constraint directly is intractable and the usual technique is the SOS-relaxation, which results in the following reformulation [30]

[TABLE]

where the constraint (11) is actually a linear constraint linking the coefficients of $P(x)$ with the matrix $Q$ . The SOS programs are known to suffer from the curse of dimensionality, in particular, the size of $Q$ grows combinatorially when we vary both $m$ and $d$ . Therefore, another technique was proposed in [17], which replaces the constraint (12) with

[TABLE]

Now instead of the large semidefinite constraint we are dealing with a large number of $2\times 2$ PSD constraints, which can actually be cast as second order cone constraints.

In addition, we can address this problem by replacing the constraint (12) with $Q\in{\mathcal{FW}}_{\alpha,2}^{N}$ with a natural choice of $\alpha$ . In particular, we assume that $P(x)\in{\mathcal{FW}}_{2}^{n}$ for every $x\in{\mathbb{R}}^{n}$ resulting in the decomposition

[TABLE]

and only then use the SOS relaxation on the polynomial matrices $P_{ij}(x)$ of the dimension $2\times 2$ . In order to avoid the question of existence of such decompositions, we restrict the search of $P(x)$ to the following set of constraints:

[TABLE]

Some rudimentary linear algebra results in the following reformulation of the PSD constraints:

[TABLE]

where $\alpha$ is pre-determined.

Using the ${\mathcal{FW}}_{\alpha,2}^{N}$ restriction provides a larger set of solutions than ${\mathcal{FW}}_{\mathbf{1},2}^{N}$. For example, the polynomial matrix:

[TABLE]

satisfies the constraints (15, 16), but does not satisfy the constraints (13, 14).

We further test our approach on the following program:

[TABLE]

where every entry of $P(x)$ is a random polynomial of degree two in three variables, and we vary the dimension of $P(x)$ . The computational results are depicted in Table II. Our restriction offers faster computational solutions with almost the same optimal objectives compared to the standard SOS technique, while the technique [17] provides even faster solutions, but their quality is worse.

V Conclusion and Discussion

We introduced a novel class of matrices and presented a hierarchy of inner and outer approximations of the cone of positive semidefinite (PSD) matrices. Both inner and outer approximations are proper cones and enjoy useful duality relations. Furthermore, the inclusion certificates for these cones are sets of PSD constraints whose sizes are smaller than the size of the matrix. This allows deriving a hierarchy of scalable relaxations and restrictions of semidefinite programs (SDPs). The inner approximations (cones ${\mathcal{FW}}_{\alpha,2}^{N}$) are built by partitioning the matrix into a non-intersecting set of entries. It is not entirely clear at the moment how to build “the best” partition in terms of the solution of the particular SDP. However, in some problems, the partition comes naturally from the problem formulation, e.g., the matrix version of SOS programs discussed in Section IV-B. Our numerical experiments suggest that these hierarchies can be used for dense large-scale SDPs, which arise in SOS programming.

Our future work will investigate the consequences of block factor-width-two matrices in relevant control applications that involve SDPs. Also, it would be interesting to incorporate the properties of block factor-width-two matrices in the development of first-order algorithms (e.g., the solvers [11, 6]) for solving general SDPs.

Bibliography

[1] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.
[2] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, Linear matrix inequalities in system and control theory. SIAM, 1994, vol. 15.
[3] V. Powers and T. Wörmann, “An algorithm for sums of squares of real polynomials,” J Pure Applied Algebra, vol. 127, no. 1, pp. 99–104, 1998.
[4] P. A. Parrilo, “Semidefinite programming relaxations for semialgebraic problems,” Math Prog, vol. 96, no. 2, pp. 293–320, 2003.
[5] J.-B. Lasserre, Moments, positive polynomials and their applications. World Scientific, 2010, vol. 1.
[6] Y. Zheng, G. Fantuzzi, A. Papachristodoulou, P. Goulart, and A. Wynn, “Fast ADMM for semidefinite programs with chordal sparsity,” in Proc Am Control Conf, 2017, pp. 3335–3340.
[7] Y. Zheng, G. Fantuzzi, A. Papachristodoulou, P. Goulart, and A. Wynn, “Fast ADMM for homogeneous self-dual embeddings of sparse SDPs,” Proc IFAC-PapersOnLine, vol. 50, no. 1, pp. 8411–8416, 2017.
[8] Y. Zheng, G. Fantuzzi, A. Papachristodoulou, P. Goulart, and A. Wynn, “Chordal decomposition in operator-splitting methods for sparse semidefinite programs,” Math Prog A, 2019.