Exploring the bounds on the positive semidefinite rank
Andrii Riazanov, Mikhail Vyalyi

TL;DR
This paper investigates the limitations of existing bounds on the positive semidefinite rank of matrices related to polytopes, showing they cannot produce exponential lower bounds on extension complexity, and relates these bounds to the matrix's regular rank.
Contribution
It proves that current bounds on PSD-rank are polynomially bounded by the regular rank, providing new insights into extension complexity limitations.
Findings
Existing bounds are upper bounded by polynomial functions of regular rank.
No exponential lower bounds on PSD-rank can be derived from current bounds.
An upper bound on mutual information based on regular rank is established.
Andrii Riazanov
Skolkovo Institute of Science and Technology; Moscow Institute of Physics and Technology (State University).
Mikhail Vyalyi
Dorodnicyn Computing Centre, FRC CSC RAS; Moscow Institute of Physics and Technology (State University); National Research University Higher School of Economics. The study has been funded by the Russian Academic Excellence Project ’5-100’.
Abstract
The nonnegative and positive semidefinite (PSD-) ranks are closely connected to the nonnegative and positive semidefinite extension complexities of a polytope, which are the minimal dimensions of linear and semidefinite programs that represent this polytope. Though some exponential lower bounds on the nonnegative [FMP+12] and PSD- [LRS15] ranks have recently been proved for the slack matrices of some particular polytopes, there are still no tight bounds for these quantities. We explore some existing bounds on the PSD-rank and prove that they cannot give exponential lower bounds on the extension complexity. Our approach consists in proving that the existing bounds are upper bounded by polynomials in the regular rank of the matrix, which is equal to the dimension of the polytope (up to an additive constant). As one of the implications, we also retrieve an upper bound on the mutual information of an arbitrary matrix of a joint distribution, based on its regular rank.
1 Introduction
Linear optimization plays an important role in computer science and mathematics. Though there exist efficient algorithms for linear optimization over convex sets, for polytopes with an exponential number of facets they are still too slow in the general case. That is why one may want to represent such a “hard” convex set as a projection (linear map) of some “easier” convex set, for example an affine slice of the nonnegative orthant or of the cone of positive semidefinite matrices, since linear optimization over slices of both these cones admits efficient algorithms. Such representations are called nonnegative and positive semidefinite (PSD-) extensions, respectively.
Since many problems of combinatorial optimization can be represented as linear programs over a polytope, studying extensions of convex polytopes is an important and challenging problem. The natural question is to find the minimal dimension for which there exists an extension of the given polytope. It can also be formulated as determining the smallest sizes of linear programs (LPs) or semidefinite programs (SDPs) that represent optimization over the given polytope; these sizes are called the nonnegative and the semidefinite extension complexities, respectively.
In the context of P ≠ NP we do not expect to find small nonnegative or PSD- extension complexities for NP-hard problems, since that would mean that there exist polynomial algorithms for solving these problems. However, there is still no general approach for proving lower bounds on these quantities, and only a few exponential lower bounds for some particular problems have recently been proved. All such results use the connection between extension complexity and matrix factorizations, which was first discovered in [Yan91] for the nonnegative extension complexity and nonnegative matrix factorizations. This approach was later extended in [GPT13] to the general case of cone factorizations, and the same result for PSD-factorizations was also obtained in [FMP+12]. This tool made it possible to explore the nonnegative and PSD- extension complexities of polytopes by studying characteristics of their slack matrices called the nonnegative and the PSD- ranks. For example, in the 1980s there were attempts to prove P = NP by providing a polynomial-size linear program for the NP-hard travelling salesman problem (TSP). However, using the described approach, Yannakakis proved in [Yan91] that any symmetric LP which solves TSP has exponential size, which invalidated all such attempts, since all the presented LPs were symmetric. The extension of this result to any (not only symmetric) LP for TSP was first presented in [FMP+12], where the authors used the connection between the nonnegative rank of a matrix and the nondeterministic communication complexity of its support. In that work, exponential lower bounds on the nonnegative rank were also proved for the CUT and Stable Set polytopes. The first analogous bounds for the PSD-extension complexity were presented in [LRS15] using the sum-of-squares SDP hierarchy.
Since exponential lower bounds have been obtained for some particular cases only, it is still a challenging problem to obtain reasonable estimates and bounds for the nonnegative and PSD- ranks. This problem has been widely discussed over the last decade. For instance, exponential bounds on the nonnegative rank, and thus on the nonnegative extension complexity, were proved in [Rot14] for the matching polytope, where the author used an extension of Razborov’s result [Raz90]. We refer the reader to the review [FGP+15] for more details about recent research on the PSD-rank.
There is also the problem of determining the computational complexity of computing the nonnegative and PSD- ranks. Both problems are known to be NP-hard, and recent research [Shi16] shows that the problem of computing the PSD-rank is complete for $\exists\mathbb{R}$, the existential theory of the reals.
Contribution
In this paper we explore the lower bounds on the PSD-rank introduced in [LWdW16], which we will refer to as bounding functionals (of a matrix). We show that these functionals cannot give exponential bounds on the PSD-rank, and thus on the positive semidefinite extension complexity. Our approach consists in proving that the bounding functionals of the slack matrix are bounded above by a polynomial in the regular rank of this matrix and the logarithm of the matrix size. Since for any polytope $P$ with slack matrix $S$ we have $\operatorname{rank} S = \dim P + 1$, this means that the bounds are polynomial in the dimension of the polytope.
As one of the implications of our approach, we obtain an upper bound on the mutual information for an arbitrary matrix of a joint distribution. More precisely, we show that the mutual information is bounded above by the logarithm of the rank of the matrix.
Outline of the paper
This paper is organized as follows. In Sect. 2 we introduce all the necessary notations and explain some connections between the PSD-rank and the quantum communication complexity. In Sect. 3 we present the bounding functionals from [LWdW16] and explain how the lower bound on the PSD-rank can be obtained via the mutual information. Finally, in Sect. 4 the upper bounds on the bounding functionals are proved. In particular, Theorem 4.1 shows that the mutual information of two discrete random variables is bounded above by the logarithm of the regular rank of the matrix of their joint distribution.
2 Preliminaries
2.1 Nonnegative and PSD- matrix factorizations
A nonnegative matrix factorization of a nonnegative matrix $M \in \mathbb{R}^{n \times m}$ is a decomposition $M = AB$, where $A \in \mathbb{R}^{n \times r}$, $B \in \mathbb{R}^{r \times m}$, and $A$, $B$ are nonnegative matrices. Alternatively, such a factorization can be thought of as two sets of nonnegative vectors $\{a_i\}_{i=1}^{n}, \{b_j\}_{j=1}^{m} \subset \mathbb{R}_+^r$ such that $M_{ij} = \langle a_i, b_j \rangle$. Then the nonnegative rank of $M$, denoted $\operatorname{rank}_+(M)$, is the smallest $r$ for which such a nonnegative factorization of $M$ exists.
Similarly, the positive semidefinite rank $\operatorname{rank}_{psd}(M)$ is the minimal integer $r$ for which there exist two sets of $r \times r$ complex Hermitian positive semidefinite matrices $\{A_i\}_{i=1}^{n}$, $\{B_j\}_{j=1}^{m}$ such that $M_{ij} = \operatorname{Tr}(A_i B_j)$. Such a factorization is called a positive semidefinite factorization, and it has many applications in combinatorial optimization and communication complexity. If we restrict the matrices in the factorization to be real symmetric positive semidefinite, we obtain the definition of the real PSD-rank $\operatorname{rank}_{psd}^{\mathbb{R}}(M)$. It can be shown ([LWdW16]) that restricting the matrices to be real can increase the PSD-rank by at most a factor of 2, i.e. $\operatorname{rank}_{psd}(M) \le \operatorname{rank}_{psd}^{\mathbb{R}}(M) \le 2 \operatorname{rank}_{psd}(M)$. Since in our context we only study asymptotic bounds on the ranks, there is no difference between considering $\operatorname{rank}_{psd}(M)$ or $\operatorname{rank}_{psd}^{\mathbb{R}}(M)$.
We would like to emphasize that rescaling the nonnegative matrix $M$ by multiplying its rows or columns by any positive factors does not change its nonnegative and PSD- ranks. Indeed, multiplication of the $i$-th row of $M$ by $c > 0$ corresponds to the multiplication of $a_i$ by the same factor in the nonnegative factorization. Similarly, it corresponds to the multiplication of $A_i$ by $c$ in the PSD-factorization. The situation with the columns of $M$ is the same.
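The scaling remark can be checked numerically. The following is a minimal sketch, not taken from the paper: the factors `A`, `B`, the vectors used to build the PSD matrices, and the scaling constant `c` are all hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical nonnegative factors: M = A @ B has nonnegative rank at most 2.
A = np.array([[1.0, 0.0],
              [0.5, 2.0],
              [0.0, 1.0]])
B = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 3.0, 1.0]])
M = A @ B

# Scaling row 1 of M by c > 0 is matched by scaling row 1 of A,
# so the inner dimension of the factorization does not grow.
c = 7.0
M_scaled = M.copy()
M_scaled[1, :] *= c
A_scaled = A.copy()
A_scaled[1, :] *= c
nonneg_ok = np.allclose(M_scaled, A_scaled @ B)

# PSD analogue: entries Tr(A_i B_j) built from rank-one PSD matrices v v^T.
def psd_from_vector(v):
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    return v @ v.T

As = [psd_from_vector(v) for v in ([1, 0], [1, 1], [0, 2])]
Bs = [psd_from_vector(v) for v in ([1, 0], [0, 1], [1, -1], [1, 1])]
M_psd = np.array([[np.trace(Ai @ Bj) for Bj in Bs] for Ai in As])

# Scaling A_1 by c scales row 1 of the factorized matrix by c.
As_scaled = [c * Ai if i == 1 else Ai for i, Ai in enumerate(As)]
M_psd_scaled = np.array([[np.trace(Ai @ Bj) for Bj in Bs] for Ai in As_scaled])
psd_ok = np.allclose(M_psd_scaled[1], c * M_psd[1])
```

The same argument applies to columns, with `B` (or the matrices `Bs`) rescaled instead.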
2.2 Extension complexity
The nonnegative extension complexity of the polytope $P$ is the smallest number $r$ such that $P$ can be expressed as a projection of an affine slice of the nonnegative $r$-dimensional orthant $\mathbb{R}_+^r$. Similarly, the semidefinite (PSD-) extension complexity of $P$ is the minimum number $r$ for which there exists an affine slice of the cone $\mathbb{S}_+^r$ of $r \times r$ complex Hermitian positive semidefinite matrices that projects onto $P$.
In other words, for optimizing over some polytope $P$ one may want to represent it as $P = \pi(K \cap L)$, where $K$ is some closed convex cone, $L$ is some affine subspace of the space containing $K$, and $\pi$ is a linear map (projection). Such representations are called $K$-lifts ([GPT13]), or $K$-extensions. If we choose $K$ from the families of the cones of nonnegative orthants $\mathbb{R}_+^r$ or positive semidefinite matrices $\mathbb{S}_+^r$, the nonnegative and PSD- extension complexities of the given polytope correspond to the minimal $r$ for which such representations exist.
2.3 Factorization theorem
As discussed in the Introduction, [Yan91], [GPT13], and [FMP+12] proved that the extension complexities and matrix factorizations are interconnected. Here we present the Factorization theorem, which explains the relations between these two notions.
Let $P$ be a polytope in $\mathbb{R}^d$ with $v$ vertices and $f$ facets, thus $P = \operatorname{conv}\{v_1, \dots, v_v\} = \{x \in \mathbb{R}^d : a_i^T x \le b_i,\ i = 1, \dots, f\}$. Then the slack matrix of the polytope is defined as the nonnegative matrix $S$ with $S_{ij} = b_i - a_i^T v_j$, where $v_j$ is the $j$-th vertex of $P$. Then the Factorization theorem can be formulated as follows:
Factorization Theorem.
The nonnegative extension complexity of $P$ is equal to $\operatorname{rank}_+(S)$. Similarly, the PSD-extension complexity of $P$ is equal to $\operatorname{rank}_{psd}(S)$.
This approach allows applying techniques for estimating or bounding such algebraic notions as sizes of matrix factorizations to answer geometrical questions about the complexities of the polytopes.
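As a small worked example of a slack matrix, the sketch below builds it for the unit square in the plane and computes its regular rank. The choice of polytope, the facet description `A x <= b`, and the convention that rows index facets and columns index vertices are assumptions made for illustration.

```python
import numpy as np

# Unit square [0, 1]^2 described by facets a_i^T x <= b_i (assumed example).
A = np.array([[ 1.0,  0.0],   #  x <= 1
              [-1.0,  0.0],   # -x <= 0
              [ 0.0,  1.0],   #  y <= 1
              [ 0.0, -1.0]])  # -y <= 0
b = np.array([1.0, 0.0, 1.0, 0.0])

# Vertices of the square, one per row.
V = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

# Slack matrix: S[i, j] = b_i - a_i^T v_j (facet i, vertex j).
S = b[:, None] - A @ V.T

rank_S = np.linalg.matrix_rank(S)
dim_P = 2
```

Here `S` is a nonnegative 0/1 matrix and `rank_S` equals `dim_P + 1`, in line with the relation between the regular rank of a slack matrix and the dimension of the polytope used later in the paper.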
2.4 Quantum communication complexity
In this section, we describe the connection between the quantum communication complexity and $\operatorname{rank}_{psd}$. First, we will consider a one-way quantum communication protocol.
A quantum state $\rho$ is a positive semidefinite matrix with $\operatorname{Tr} \rho = 1$. A measurement is a set of positive semidefinite matrices $\{E_\theta\}_{\theta \in \Theta}$, indexed by a finite set $\Theta$ of nonnegative real numbers, with the condition $\sum_{\theta \in \Theta} E_\theta = I$. Measurements are also called POVMs (“Positive Operator-Valued Measures”) in the literature. POVMs work in the following way: when we apply the measurement $\{E_\theta\}$ to the state $\rho$, the outcome is $\theta$ with probability $\operatorname{Tr}(E_\theta \rho)$.
Then the process of communication is set as follows: initially, Alice has the integer $i$, and Bob has $j$. Then Alice sends an $r$-dimensional quantum state $\rho_i$ to Bob, who measures it with a POVM $\{E_\theta^{j}\}$ and outputs the result. We say that such a protocol computes the nonnegative matrix $M$ in expectation if the expected value of Bob’s output on the input $(i, j)$ is equal to $M_{ij}$ (the entry of the matrix $M$ in row $i$ and column $j$). Then the quantum communication complexity of the matrix $M$ is the logarithm of the minimal dimension $r$ for which there exists a one-way quantum protocol which computes $M$ in expectation.
Fiorini et al. [FMP+12] and Jain et al. [JSWZ13] proved that the minimal amount of quantum information needed for Alice and Bob to generate the nonnegative matrix $M$ is completely determined by the PSD-rank of this matrix. More precisely, they showed that the quantum communication complexity of $M$ is equal to $\log \operatorname{rank}_{psd}(M)$.
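To make the link between the protocol and PSD-factorizations concrete, here is a minimal sketch of the expected output of a single measurement. The state `rho`, the POVM `povm`, and the helper name `expected_output` are hypothetical. The point is that the expectation equals the trace of `rho` against the PSD matrix obtained by weighting each POVM element by its outcome, which has exactly the form of one entry of a PSD-factorization.

```python
import numpy as np

def expected_output(rho, povm):
    """Expected outcome when measuring state rho with a POVM.

    povm maps each nonnegative outcome theta to a PSD matrix E_theta;
    outcome theta occurs with probability Tr(E_theta rho).
    """
    return sum(theta * np.trace(E @ rho).real for theta, E in povm.items())

# Hypothetical 2-dimensional state (PSD, trace 1).
rho = np.array([[0.75, 0.25],
                [0.25, 0.25]])

# Hypothetical POVM with outcomes 0 and 2; the elements sum to the identity.
povm = {
    0.0: np.array([[0.5, 0.0], [0.0, 0.0]]),
    2.0: np.array([[0.5, 0.0], [0.0, 1.0]]),
}

value = expected_output(rho, povm)

# The same number written as Tr(rho B) with B = sum_theta theta * E_theta (PSD).
B = sum(theta * E for theta, E in povm.items())
value_as_trace = np.trace(rho @ B)
```

In a full protocol Alice's state would depend on her input and Bob's POVM on his, so each matrix entry takes this trace form with an input-dependent pair of PSD matrices.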
3 Bounding functionals on the PSD-rank
In this section, we present some existing general lower bounds on $\operatorname{rank}_{psd}$ from [LWdW16], which we refer to as bounding functionals. Except for the bound via mutual information, the bounding functionals are introduced here without justification. We refer the reader to the original article for more details on the bounds. For convenience, we preserve the notation for the bounding functionals from the original article.
3.1 Bound via Mutual Information
If $X$ and $Y$ are two random variables, then the mutual information is defined as follows:
$$I(X:Y) = H(X) + H(Y) - H(X, Y),$$
where $H$ is the Shannon entropy. The mutual information can be interpreted as the number of bits of information about $X$ that are revealed by the value of $Y$. We will now use Holevo’s theorem [Wat11] to bound the mutual information. It states that the number of classical bits of information that Alice can communicate to Bob by sending $q$ qubits does not exceed $q$. From the previous section we know that we need exactly $\log \operatorname{rank}_{psd}(M)$ qubits of information to compute the matrix $M$. Normalizing $M$ and considering it as a matrix of joint distribution $P$, we then have:
Fact 3.1.
Let $P$ be a matrix of a joint distribution of two discrete random variables $X$, $Y$ with finite support, $P_{ij} = \Pr[X = i, Y = j]$. Then
$$\operatorname{rank}_{psd}(P) \ge 2^{I(X:Y)}.$$
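The mutual information of a joint distribution matrix is easy to compute directly. The sketch below is illustrative only: the function name `mutual_information` and both example matrices are assumptions, and the formula used is the standard one, $\sum_{i,j} P_{ij} \log_2 \frac{P_{ij}}{p_i q_j}$ with the convention $0 \log 0 = 0$.

```python
import numpy as np

def mutual_information(P):
    """Mutual information (in bits) of a joint distribution matrix P."""
    P = np.asarray(P, dtype=float)
    p = P.sum(axis=1, keepdims=True)   # marginal of the row variable
    q = P.sum(axis=0, keepdims=True)   # marginal of the column variable
    mask = P > 0                       # convention: 0 * log 0 = 0
    return float(np.sum(P[mask] * np.log2(P[mask] / (p @ q)[mask])))

# Perfectly correlated uniform bits: one full bit of mutual information.
P_correlated = np.eye(2) / 2
I_correlated = mutual_information(P_correlated)

# Independent variables: the joint matrix is an outer product.
P_independent = np.outer([0.3, 0.7], [0.4, 0.6])
I_independent = mutual_information(P_independent)

# Lower bound on the PSD-rank suggested by the Holevo argument above.
psd_rank_lower_bound = 2.0 ** I_correlated
```

For the correlated example the resulting bound is 2, which matches the regular rank of the 2 × 2 identity pattern. For the independent example the bound is trivial, as expected for a rank-one matrix.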
3.2 Bounding functionals from [LWdW16]
For two probability distributions $p$ and $q$, the fidelity is defined as $F(p, q) = \sum_i \sqrt{p_i q_i}$.
Recall that a left stochastic matrix is a matrix with nonnegative entries, with each column summing to $1$. Further in the text we will omit “left” and just use the term “stochastic matrix” instead.
Then we have the following lower bounds:
Fact 3.2.
Let $M$ be a stochastic matrix. Then
[TABLE]
where the maximum is taken over all probability distributions $q$, and $M_i$ is the $i$-th column of $M$.
Fact 3.3.
Let $M$ be a stochastic matrix. Then
[TABLE]
Fact 3.4.
Let $M$ be a stochastic matrix. Then
[TABLE]
where the maximum is taken over all probability distributions $q$, and $M_i$ is the $i$-th column of $M$.
4 Upper bounds on the bounding functionals
All the bounds from Section 3 were explored and compared in [LWdW16]. It turned out that in different cases different functionals give better bounds on $\operatorname{rank}_{psd}$ than the others, and some of them are tight in particular cases. However, the key question of whether these functionals can give exponential lower bounds on the PSD-rank with respect to the regular rank was not addressed. In this section we answer this question negatively.
In the context of combinatorial optimization, we would like to show that for the polytope of some NP-hard problem the semidefinite extension complexity is exponential in the dimension. Following the arguments from Section 2.3, it suffices to show that the PSD-rank of the corresponding slack matrix is exponential. It is easy to show ([GGK+13]) that the regular rank of the slack matrix $S$ of a polytope $P$ equals the dimension of the polytope plus one: $\operatorname{rank} S = \dim P + 1$. For all the presented bounding functionals we provide upper bounds polynomial in the regular rank of the matrix and the logarithm of the matrix size, which means that they cannot be exponential in the dimension.
4.1 Row elimination transformation
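We will now describe the row elimination transformation, which will be used for proving the required bounds.
Let $M$ be a nonnegative matrix with $\operatorname{rank} M = r$. Without loss of generality, assume that the first $r+1$ rows are non-zero. They are linearly dependent, so there exists a nontrivial set of real numbers $\lambda_1, \dots, \lambda_{r+1}$ such that $\sum_{i=1}^{r+1} \lambda_i M_{ij} = 0$ for every column $j$. Since all entries of $M$ are nonnegative, there are both negative and positive numbers among $\lambda_i$. For such a set of real numbers we denote by $\Delta$ the closed interval $\left[-1/\max_i \lambda_i,\ -1/\min_i \lambda_i\right]$, which is properly defined due to the last remark.
Then we define the matrix $M^{(t)}$ as follows: for $i \le r+1$ the $i$-th row of $M^{(t)}$ equals the $i$-th row of $M$ multiplied by $1 + t\lambda_i$; for $i > r+1$ the $i$-th row of $M^{(t)}$ coincides with the $i$-th row of $M$. We call the matrix $M^{(t)}$ the $t$-transformation of $M$.
First of all, note that $M^{(0)} = M$. Moreover, when $t$ is equal to one of the ends of $\Delta$, at least one of the coefficients $1 + t\lambda_i$ is equal to zero. It means that for $t \in \Delta$ the matrix $M^{(t)}$ is a nonnegative matrix, and when $t$ is either the left or the right end of $\Delta$, $M^{(t)}$ has more zero rows than $M$.
Next, we prove that the sums of columns do not change after the row elimination transformation. Indeed,
$$\sum_{i} M^{(t)}_{ij} = \sum_{i \le r+1} (1 + t\lambda_i) M_{ij} + \sum_{i > r+1} M_{ij} = \sum_{i} M_{ij} + t \sum_{i \le r+1} \lambda_i M_{ij} = \sum_{i} M_{ij}.$$
In particular, it means that if $M$ is stochastic, then $M^{(t)}$ is also stochastic for $t \in \Delta$. Similarly, if $M$ is a matrix of a joint distribution, then $M^{(t)}$ is also a matrix of some joint distribution for $t$ from $\Delta$.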
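A single step of the row elimination transformation can be sketched numerically. The code below is an illustration under stated assumptions rather than the authors' implementation: it assumes the rows involved are rescaled by the factors $1 + t\lambda_i$ (one reading consistent with the properties listed above), finds the dependency coefficients from an SVD, and uses hypothetical helper names `find_dependency` and `t_transform`.

```python
import numpy as np

def find_dependency(M, tol=1e-12):
    """Pick rank(M)+1 non-zero rows and coefficients lam with lam @ M[rows] = 0."""
    rows = np.flatnonzero(M.sum(axis=1) > tol)   # indices of non-zero rows
    r = np.linalg.matrix_rank(M)
    if len(rows) <= r:
        return None                               # nothing left to eliminate
    rows = rows[: r + 1]                          # these rows must be dependent
    # The last right singular vector of the transposed block spans its null space.
    _, _, vh = np.linalg.svd(M[rows].T)
    lam = vh[-1]
    interval = (-1.0 / lam.max(), -1.0 / lam.min())
    return rows, lam, interval

def t_transform(M, rows, lam, t):
    """Rescale the chosen rows by 1 + t * lam_i; other rows are unchanged."""
    coeff = 1.0 + t * lam
    coeff[np.abs(coeff) < 1e-12] = 0.0            # clean up rounding at interval ends
    Mt = np.array(M, dtype=float)
    Mt[rows] *= coeff[:, None]
    return Mt

# Hypothetical column-stochastic matrix of rank 2 with three non-zero rows.
M = np.array([[0.5, 0.1],
              [0.3, 0.3],
              [0.2, 0.6]])
rows, lam, (t_lo, t_hi) = find_dependency(M)
M_lo = t_transform(M, rows, lam, t_lo)
M_hi = t_transform(M, rows, lam, t_hi)
```

At both ends of the interval the result stays nonnegative and column-stochastic, while at least one of the selected rows becomes zero.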
4.2 Upper bound on (Mutual Information)
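Let $P$ be the matrix of a joint distribution of two discrete random variables $X$ and $Y$:
$$P_{ij} = \Pr[X = i,\ Y = j].$$
Let $p_i$ and $q_j$ be the marginal probabilities of $X$ and $Y$ respectively:
$$p_i = \sum_{j} P_{ij}, \qquad q_j = \sum_{i} P_{ij}.$$
Then the mutual information between $X$ and $Y$ can also be defined as:
$$I(X:Y) = \sum_{i,j} P_{ij} \log \frac{P_{ij}}{p_i q_j},$$
where we set $0 \log 0 = 0$ (the logarithm here and further is to the base 2). We also denote $I(P) = I(X:Y)$.
Theorem 4.1.
Let $P$ be the matrix of a joint distribution of $X$ and $Y$. Then
$$I(X:Y) \le \log \operatorname{rank} P.$$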
Proof.
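Denote $r = \operatorname{rank} P$. We will now transform the original matrix in such a way that the mutual information does not decrease, but the new matrix has at most $r$ non-zero rows.
Suppose $P$ has more than $r$ non-zero rows. Then we apply the row elimination transformation and consider the $t$-transformation $P^{(t)}$ of the original matrix, in which the $i$-th of the first $r+1$ rows is multiplied by $1 + t\lambda_i$. Since we have already shown that it is also a matrix of some joint distribution, we explore how the mutual information changes after such transformations.
First, since the $t$-transformation does not change the sums in the columns of $P$, we have $q_j^{(t)} = q_j$. Then, since $p_i$ is the sum of entries in the $i$-th row, we obtain $p_i^{(t)} = (1 + t\lambda_i) p_i$ for $i \le r+1$.
Note that since $P^{(t)}$ and $P$ coincide on rows with indices larger than $r+1$, we may omit the summation over these rows:
$$I(P^{(t)}) - I(P) = \sum_{i \le r+1} \sum_{j} \left[ (1 + t\lambda_i) P_{ij} \log \frac{(1 + t\lambda_i) P_{ij}}{(1 + t\lambda_i) p_i q_j} - P_{ij} \log \frac{P_{ij}}{p_i q_j} \right] = t \sum_{i \le r+1} \lambda_i \sum_{j} P_{ij} \log \frac{P_{ij}}{p_i q_j},$$
where $I(P^{(t)})$ denotes the mutual information of the joint distribution with matrix $P^{(t)}$. So the difference is linear in $t$.
Now recall that the $t$-transformation is valid for $t$ in a closed interval whose left end is negative and whose right end is positive. It means that we can choose an end $t^*$ of this interval such that $I(P^{(t^*)}) \ge I(P)$. It only remains to note that with the chosen value of $t$ at least one of the first $r+1$ rows in $P^{(t^*)}$ becomes zero.
To get an upper bound on the mutual information, we apply $t$-transformations with such suitable $t$’s that the number of non-zero rows strictly decreases and the mutual information does not decrease. At the end of such a procedure we obtain a matrix $P'$ with at most $r$ non-zero rows for which $I(P') \ge I(P)$. Since $P'$ is a matrix of a joint distribution, we have $I(P') = I(X':Y')$, where the support of $X'$ has cardinality at most $r$. Using the equality $I(X':Y') = H(X') - H(X' \mid Y')$ and the non-negativity of the conditional entropy, we finally have:
$$I(X:Y) \le I(X':Y') = H(X') - H(X' \mid Y') \le H(X') \le \log r.$$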
∎
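The argument of this proof can be traced numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the row rescaling factors $1 + t\lambda_i$, picks at each step the interval end at which the (linear) change of mutual information is nonnegative, and uses a randomly generated low-rank joint distribution. The names `mutual_information` and `eliminate_rows` are hypothetical.

```python
import numpy as np

def mutual_information(P):
    """Mutual information in bits, with the convention 0 * log 0 = 0."""
    p = P.sum(axis=1, keepdims=True)
    q = P.sum(axis=0, keepdims=True)
    mask = P > 0
    return float(np.sum(P[mask] * np.log2(P[mask] / (p @ q)[mask])))

def eliminate_rows(P, tol=1e-12):
    """Zero out rows until at most rank(P) non-zero rows remain."""
    P = np.array(P, dtype=float)
    r = np.linalg.matrix_rank(P)
    while True:
        rows = np.flatnonzero(P.sum(axis=1) > tol)
        if len(rows) <= r:
            return P
        rows = rows[: r + 1]
        block = P[rows]
        _, _, vh = np.linalg.svd(block.T)
        lam = vh[-1]                                # lam @ block is (numerically) zero
        # Per-row contributions to the mutual information.
        p = block.sum(axis=1, keepdims=True)
        q = P.sum(axis=0, keepdims=True)
        terms = np.zeros_like(block)
        mask = block > 0
        terms[mask] = block[mask] * np.log2(block[mask] / (p @ q)[mask])
        slope = float(lam @ terms.sum(axis=1))      # I changes by t * slope
        t = -1.0 / lam.min() if slope >= 0 else -1.0 / lam.max()
        coeff = 1.0 + t * lam
        coeff[np.abs(coeff) < 1e-9] = 0.0           # the eliminated row(s)
        P[rows] *= coeff[:, None]

# Hypothetical joint distribution of rank 2 on a 6 x 5 grid.
rng = np.random.default_rng(0)
P = rng.random((6, 2)) @ rng.random((2, 5))
P /= P.sum()

P_reduced = eliminate_rows(P)
I_before = mutual_information(P)
I_after = mutual_information(P_reduced)
rank_P = np.linalg.matrix_rank(P)
```

After the procedure `P_reduced` has at most `rank_P` non-zero rows, `I_after` is at least `I_before`, and both values stay below `log2(rank_P)`, which is the content of Theorem 4.1 on this example.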
4.3 Upper bound on
We will show that is upper bounded by :
Theorem 4.2.
Let be a stochastic matrix, . Then
[TABLE]
We start by proving the following well-known fact:
Lemma 4.1.
For distributions $p, q$ it holds that $F(p, q) \ge 1 - \frac{1}{2}\|p - q\|_1$, where $\|x\|_1$ is the $\ell_1$ norm of the vector $x$, and thus $\frac{1}{2}\|p - q\|_1$ is the statistical distance between the distributions.
Proof.
[TABLE]
[TABLE]
∎
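The relation between fidelity and statistical distance can be sanity-checked on random distributions. The sketch below assumes that Lemma 4.1 is the standard inequality $F(p, q) \ge 1 - \frac{1}{2}\|p - q\|_1$ with $F(p, q) = \sum_i \sqrt{p_i q_i}$. The sampling scheme and the helper names are hypothetical.

```python
import numpy as np

def fidelity(p, q):
    """Classical fidelity: sum_i sqrt(p_i * q_i)."""
    return float(np.sum(np.sqrt(p * q)))

def statistical_distance(p, q):
    """Half of the l1 distance between two distributions."""
    return 0.5 * float(np.sum(np.abs(p - q)))

rng = np.random.default_rng(1)
gaps = []
for _ in range(1000):
    p = rng.dirichlet(np.ones(5))
    q = rng.dirichlet(np.ones(5))
    # gap = F(p, q) - (1 - d(p, q)); the inequality says gap >= 0.
    gaps.append(fidelity(p, q) - (1.0 - statistical_distance(p, q)))

smallest_gap = min(gaps)
```

The inequality follows from $1 - F(p, q) = \frac{1}{2} \sum_i (\sqrt{p_i} - \sqrt{q_i})^2 \le \frac{1}{2} \sum_i |p_i - q_i|$, so `smallest_gap` should never be negative beyond rounding error.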
Now, we have
[TABLE]
Then we need to prove the lower bound on .
We will find the lower bound on this quadratic form for an arbitrary distribution . Without loss of generality, assume .
Lemma 4.2.
There exists such that .
Proof.
Suppose the opposite: . Then
[TABLE]
[TABLE]
∎
Then we have
[TABLE]
Now, using the RMS-AM inequality and Lemma 4.1, we get:
[TABLE]
For any stochastic matrix denote – the arithmetic mean of statistical distances between columns of . It now suffices to show the upper bound on .
Lemma 4.3.
Let be a stochastic matrix with . Then there exists a stochastic matrix such that .
Proof.
We apply the row elimination algorithm. Suppose $M$ has more than $\operatorname{rank} M$ non-zero rows. Consider then the $t$-transformation of the original matrix. Since the $t$-transformation does not change the sums of entries in every column of the matrix, the transformed matrix is also stochastic. We now explore how the mean statistical distance between columns changes after the $t$-transformation:
[TABLE]
So, the difference is linear in $t$. Recall that the $t$-transformation is valid for $t$ in a closed interval whose left end is negative and whose right end is positive. It means that we can choose an end of this interval such that the mean statistical distance does not decrease, and with the chosen value of $t$ at least one of the rows involved in the transformation becomes zero. When we apply such $t$-transformations with suitable $t$’s, the number of non-zero rows strictly decreases, and the mean statistical distance does not decrease. At the end of this procedure we obtain a matrix with at most $\operatorname{rank} M$ non-zero rows whose mean statistical distance is at least that of $M$.
∎
Lemma 4.4.
Let be a stochastic matrix. Then
[TABLE]
Proof.
If , then , where we just used .
Now suppose . Denote .
We now construct the matrix by sorting every row of . Obviously, , since it is just a permutation of terms. Then
[TABLE]
For each in this sum it occurs times with the sign and times with the sign . Hence,
[TABLE]
Clearly, takes its maximal value when the sum in the first columns of is maximal. Since and the sums of all the entries in and coincide and are equal to , to maximize we need to have ones in total in the first columns of . Denote . If , then the matrix consists of ones only (since it is stochastic), then and the inequality in the lemma is obvious. If , then it is easy to show that . Note that exactly first summands are nonnegative in (4), so to maximize first columns of should be filled with ones:
Such matrix would correspond to the following matrix :
[TABLE]
[TABLE]
[TABLE]
Then
[TABLE]
∎
Proof of Theorem 4.2.
The first columns of form the matrix with . Using Lemma 4.3, we conclude that there exists such that . Applying Lemma 4.4 we get . Then from (3):
[TABLE]
Then from (2) for every distribution we obtain:
[TABLE]
And finally, using (1),
[TABLE]
∎
4.4 Upper bound on
Theorem 4.3.
Let be a stochastic matrix, . Then
[TABLE]
Proof.
Again, we apply the row elimination transformation. Note that since every row of the matrix after this transformation is either multiplied by some nonnegative factor or remains unchanged, the maximal element in this row is multiplied by the same factor or remains unchanged as well.
Suppose $M$ has at least $r+1$ non-zero rows, where $r = \operatorname{rank} M$, and without loss of generality, suppose that these are the first $r+1$ rows of $M$. Now consider the $t$-transformation of $M$, and explore how the functional changes after such a transformation, taking the last remark into consideration:
[TABLE]
[TABLE]
Similarly to the previous proofs, the change of the functional is linear in $t$, and therefore for one of the two ends of the admissible interval the change is nonnegative, while the transformed matrix has strictly fewer non-zero rows than $M$. Again, applying such transformations with suitable $t$’s, at the end we obtain a matrix with at most $r$ non-zero rows on which the functional is at least as large as on $M$. It only remains to note that in the formula for the functional there are at most $r$ non-zero summands, each less than or equal to $1$ (since the final matrix is also stochastic). Therefore, the functional of $M$ is at most $r$.
∎
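The statement can be illustrated numerically under one explicit assumption: that the functional bounded in this section is the sum of row maxima, $\sum_i \max_j M_{ij}$, which is how we read the proof above (it speaks of the maximal element of each row and of at most $r$ summands not exceeding 1). The construction of random low-rank stochastic matrices and the helper names are hypothetical.

```python
import numpy as np

def row_max_sum(M):
    """Sum over rows of the largest entry in the row (assumed form of the functional)."""
    return float(M.max(axis=1).sum())

def random_stochastic(n, m, r, rng):
    """Column-stochastic n x m matrix of rank at most r, built as a product."""
    A = rng.dirichlet(np.ones(n), size=r).T   # n x r, every column sums to 1
    B = rng.dirichlet(np.ones(r), size=m).T   # r x m, every column sums to 1
    return A @ B

rng = np.random.default_rng(2)
results = []
for r in range(1, 5):
    M = random_stochastic(8, 10, r, rng)
    results.append((r, np.linalg.matrix_rank(M), row_max_sum(M)))

# The n x n identity is stochastic with rank n and attains the bound.
identity_value = row_max_sum(np.eye(4))
```

On these examples the sum of row maxima never exceeds the rank, and the identity matrix shows that a bound of this form cannot be improved in general.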
4.5 Upper bound on
Theorem 4.4.
Let be a stochastic matrix, . Then
[TABLE]
Proof.
Simply applying (5) and (6), we get:
[TABLE]
The last inequality is due to Theorem 4.3. ∎
References
- [FGP+15] Hamza Fawzi, João Gouveia, Pablo A. Parrilo, Richard Z. Robinson, and Rekha R. Thomas. Positive semidefinite rank. Mathematical Programming, 153(1):133–177, Jul 2015.
- [FMP+12] Samuel Fiorini, Serge Massar, Sebastian Pokutta, Hans Raj Tiwary, and Ronald de Wolf. Linear vs. semidefinite extended formulations. In Proceedings of the 44th Symposium on Theory of Computing - STOC’12. Association for Computing Machinery (ACM), 2012.
- [GGK+13] João Gouveia, Roland Grappe, Volker Kaibel, Kanstantsin Pashkovich, Richard Z. Robinson, and Rekha R. Thomas. Which nonnegative matrices are slack matrices? Linear Algebra and its Applications, 439(10):2921–2933, Nov 2013.
- [GPT13] João Gouveia, Pablo A. Parrilo, and Rekha R. Thomas. Lifts of convex sets and cone factorizations. Mathematics of Operations Research, 38(2):248–264, May 2013.
- [JSWZ13] Rahul Jain, Yaoyun Shi, Zhaohui Wei, and Shengyu Zhang. Efficient protocols for generating bipartite classical distributions and quantum states. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1503–1512. Society for Industrial & Applied Mathematics (SIAM), Jan 2013.
- [LRS15] James R. Lee, Prasad Raghavendra, and David Steurer. Lower bounds on the size of semidefinite programming relaxations. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC’15. Association for Computing Machinery (ACM), 2015.
- [LWdW16] Troy Lee, Zhaohui Wei, and Ronald de Wolf. Some upper and lower bounds on PSD-rank. Mathematical Programming, 162(1-2):495–521, Jul 2016.
- [Raz90] A. A. Razborov. On the distributional complexity of disjointness. In Automata, Languages and Programming, pages 249–253. Springer Nature, 1990.
