The Lovász Theta Function for Random Regular Graphs and Community Detection in the Hard Regime

Jess Banks; Robert Kleinberg; Cristopher Moore

arXiv:1705.01194·cs.CC·August 29, 2017


TL;DR

This paper investigates the limitations of the Lovász theta function and sum-of-squares proofs in refuting k-colorability in random regular graphs, revealing computational hardness in certain regimes and providing bounds related to graph girth.

Contribution

It establishes bounds on the degree for which the Lovász theta function can refute k-colorability, showing failure above the phase transition and linking to community detection hardness.

Findings

01

Refutation fails above the k-colorability transition.

02

Refutation fails below the Kesten-Stigum threshold.

03

Provides explicit bounds on theta for regular graphs with given girth.

Abstract

We derive upper and lower bounds on the degree $d$ for which the Lovász $\vartheta$ function, or equivalently sum-of-squares proofs with degree two, can refute the existence of a $k$ -coloring in random regular graphs $G_{n, d}$ . We show that this type of refutation fails well above the $k$ -colorability transition, and in particular everywhere below the Kesten-Stigum threshold. This is consistent with the conjecture that refuting $k$ -colorability, or distinguishing $G_{n, d}$ from the planted coloring model, is hard in this region. Our results also apply to the disassortative case of the stochastic block model, adding evidence to the conjecture that there is a regime where community detection is computationally hard even though it is information-theoretically possible. Using orthogonal polynomials, we also provide explicit upper bounds on $\vartheta (\overline{G})$ for regular graphs of a…

Equations

$d \geq d_{\mathrm{first}} = 2k\ln k - \ln k.$

$d_{c} = d_{\mathrm{first}} - O(1).$

$d_{c} \sim \frac{k\log k}{(\tau-1)^{2}},$

$d_{\mathrm{KS}} = \left(\frac{k-1}{\tau-1}\right)^{2}.$

$d < d_{\mathrm{KS}} = \left(\frac{k-1}{\tau-1}\right)^{2} + 1.$

$\frac{d}{2\sqrt{d-1}} + 1 - \epsilon \leq \vartheta(\overline{G_{n,d}}) \leq \frac{d}{2\sqrt{d-1}} + 2 + \epsilon.$

$k > 2 + \frac{d}{2\sqrt{d-1}},$

$d < 2(k-2)\left((k-2) + \sqrt{(k-2)^{2}-1}\right) = (4 - o_{k}(1))\, d_{\mathrm{KS}}.$

$\frac{k-\tau}{1-\tau} > 2 + \frac{d}{2\sqrt{d-1}}.$

$\sum_{j=1}^{m} g_{j}(\boldsymbol{x}) f_{j}(\boldsymbol{x}) = S + \epsilon$ where $S = \sum_{\ell=1}^{t} h_{\ell}(\boldsymbol{x})^{2}.$

$S(\boldsymbol{x}) = \sum_{\alpha,\alpha'} \mathcal{S}(\alpha,\alpha')\, x^{(\alpha)} x^{(\alpha')},$

$\tilde{\mathbb{E}}[1]$

$\tilde{\mathbb{E}}[f_{j}]$

$\tilde{\mathbb{E}}[p^{2}]$

$x_{i,c}$

$p^{\textup{sing}}_{i}$

$p^{\textup{col}}_{ij}$

$p^{\textup{cut}}$

$\sum_{i,c} b_{i,c}\, p^{\textup{bool}}_{i,c} + \sum_{i} s_{i}\, p^{\textup{sing}}_{i} + \sum_{(i,j)\in E} g_{ij}\, p^{\textup{col}}_{ij} = S + \epsilon$

$\vartheta(\overline{G}) = \min_{P} \kappa > 0$ such that $\begin{pmatrix} 1 & \boldsymbol{1}^{T}/\sqrt{\kappa} \\ \boldsymbol{1}/\sqrt{\kappa} & P \end{pmatrix}$

$P_{ii}$

$P_{ij}$

$\vartheta(\overline{G}) = \max_{D} \langle D, \mathbb{J} \rangle$ such that $D$

$\operatorname{tr} D$

$D_{ij}$

$\hat{\vartheta}(\overline{G}) = \min_{P} \kappa > 0$ such that $\begin{pmatrix} 1 & \boldsymbol{1}^{T}/\sqrt{\kappa} \\ \boldsymbol{1}/\sqrt{\kappa} & P \end{pmatrix}$

$P_{ii}$

$\langle P, A \rangle$

$\hat{\vartheta}(\overline{G}) = \max_{\eta, \boldsymbol{b}} \langle D, \mathbb{J} \rangle$ such that $D \triangleq \eta A + \operatorname{diag} \boldsymbol{b}$

$\operatorname{tr} D = \langle \boldsymbol{b}, \boldsymbol{1} \rangle$

$\hat{\vartheta}(\overline{G}) > \frac{k-\tau}{1-\tau}.$

$\vartheta(\overline{G}) \geq \hat{\vartheta}(\overline{G}) \geq 1 + \frac{d}{|\lambda_{\min}|}.$

$D \triangleq \frac{1}{n}\left(\mathbbm{1} + \frac{1}{|\lambda_{\min}|} A\right),$

$\vartheta(\overline{G}) \geq \hat{\vartheta}(\overline{G}) > 1 + \frac{d}{2\sqrt{d-1}} - \epsilon.$

$\frac{k-\tau}{1-\tau} < 1 + \frac{d}{2\sqrt{d-1}}.$

$k < 1 + \frac{d}{2\sqrt{d-1}}.$

$\hat{\vartheta}(\overline{G}) \leq \vartheta(\overline{G}) < 1 + \frac{d}{2(1-\epsilon_{\gamma})\sqrt{d-1}}.$


Full text

The Lovász Theta Function for Random Regular Graphs and Community Detection in the Hard Regime

Jess Banks, Robert Kleinberg, and Cristopher Moore

Abstract.

We derive upper and lower bounds on the degree $d$ for which the Lovász $\vartheta$ function, or equivalently sum-of-squares proofs with degree two, can refute the existence of a $k$ -coloring in random regular graphs $G_{n,d}$ . We show that this type of refutation fails well above the $k$ -colorability transition, and in particular everywhere below the Kesten-Stigum threshold. This is consistent with the conjecture that refuting $k$ -colorability, or distinguishing $G_{n,d}$ from the planted coloring model, is hard in this region. Our results also apply to the disassortative case of the stochastic block model, adding evidence to the conjecture that there is a regime where community detection is computationally hard even though it is information-theoretically possible. Using orthogonal polynomials, we also provide explicit upper bounds on $\vartheta(\overline{G})$ for regular graphs of a given girth, which may be of independent interest.

Dept. of Mathematics, University of California-Berkeley, Berkeley CA

Dept. of Computer Science, Cornell University, Ithaca NY

Santa Fe Institute, Santa Fe NM

1. Introduction

Many constraint satisfaction problems have phase transitions in the random case: as the ratio between the number of constraints and the number of variables increases, there is a critical value at which the probability that a solution exists, in the limit $n\to\infty$ , suddenly drops from one to zero. Above this transition, most instances are too constrained and hence unsatisfiable. But how many constraints do we need before it becomes easy to prove that a typical instance is unsatisfiable? When is there likely to be a short refutation, which we can find in polynomial time, proving that no solution exists?

For a closely related problem, suppose that a constraint satisfaction problem is generated randomly, but with a particular solution “planted” in it. Given the instance, can we recover the planted solution, at least approximately? For that matter, can we tell whether the instance was generated from this planted model, as opposed to an un-planted model with no built-in solution? We can think of this as a statistical inference problem. If there is an underlying pattern in a dataset (the planted solution) but also some noise (the probabilistic process by which the instance is generated) the question is how much data (how many constraints) we need before we can find the pattern, or confirm that one exists.

Here we focus on the $k$ -colorability of random graphs, and more generally the community detection problem. Let $G=G(n,p=d/n)$ denote the Erdős-Rényi graph with $n$ vertices and average degree $d$ . A simple first moment argument shows that with high probability $G$ is not $k$ -colorable if

$$d \geq d_{\mathrm{first}} = 2k\ln k - \ln k.$$

(We say that an event $E_{n}$ on graphs of size $n$ holds with high probability if $\lim_{n\to\infty}\Pr[E_{n}]=1$ , and with positive probability if $\liminf_{n\to\infty}\Pr[E_{n}]>0$ .) Sophisticated uses of the second moment method [8, 24] show that this is essentially tight, and that the $k$ -colorability transition occurs at

$$d_{c} = d_{\mathrm{first}} - O(1).$$

Now consider the planted coloring model, where we choose a coloring $\sigma$ uniformly at random and condition $G$ on the event that $\sigma$ is proper. If $d>d_{c}$ , then $G(n,d/n)$ is probably not $k$ -colorable, while graphs drawn from the planted model are $k$ -colorable by construction. Thus, above the $k$ -colorability transition, we can tell with high probability whether $G$ was drawn from the planted or un-planted model by checking to see if $G$ is $k$ -colorable. However, searching exhaustively for $k$ -colorings would take exponential time.

A similar situation holds for the stochastic block model, a model of graphs with community structure also known as the planted partition problem (see [47, 2] for reviews). For our purposes, we will define it as follows: fix a constant $\tau$ , and say a partition $\sigma$ of the vertices into $k$ groups is “good” if a fraction $\tau/k$ of the edges connect vertices within groups. Equivalently, if $G$ has $m$ edges, $\sigma$ is a multiway cut with $(1-\tau/k)m$ edges crossing between groups. Generalizing the planted coloring model where $\tau=0$ , the block model chooses $\sigma$ uniformly, and conditions $G$ on the event that $\sigma$ is good. The cases $\tau>1$ and $\tau<1$ , where vertices are more or less likely to be connected to others in the same group, are called assortative (or ferromagnetic) and disassortative (or antiferromagnetic) respectively.

Two natural problems related to the block model are detection, i.e., telling with high probability whether $G$ was drawn from the block model or from $G(n,d/n)$ , and reconstruction, finding a partition which is significantly correlated with the planted partition $\sigma$ . (This is sometimes called weak reconstruction to distinguish it from finding $\sigma$ exactly, which becomes possible when $d=\Theta(\log n)$ [16, 1, 3, 30, 31, 9, 50].) Both problems become information-theoretically possible at a point called the condensation transition [39, 22, 19], and the first and second moment methods [12] show that this scales as

$$d_{c} \sim \frac{k\log k}{(\tau-1)^{2}}, \tag{2}$$

where $\sim$ hides a multiplicative constant. As in $k$ -coloring this is roughly the first-moment bound above which, with high probability, no good partitions exist in $G(n,d/n)$ . However, the obvious algorithms for detection and reconstruction, such as searching exhaustively for good partitions or sampling from an appropriate Gibbs distribution [6, 4], require exponential time.

In fact, conjectures from statistical physics [40, 25, 26] suggest this exponential difficulty is sometimes unavoidable. Specifically, these conjectures state that polynomial-time algorithms for detection and reconstruction exist if and only if $d$ is above the Kesten-Stigum threshold [34, 35],

$$d_{\mathrm{KS}} = \left(\frac{k-1}{\tau-1}\right)^{2}. \tag{3}$$

Several polynomial-time algorithms are now known to succeed whenever $d>d_{\mathrm{KS}}$ , including variants of belief propagation [49, 5] and spectral algorithms based on non-backtracking walks [48, 38, 43, 17]. Moreover, for $k=2$ we know that the information-theoretic and Kesten-Stigum thresholds coincide [51]. Comparing (2) and (3) we see that for any $\tau\neq 1$ we have $d_{c}<d_{\mathrm{KS}}$ for sufficiently large $k$ , and in fact this occurs for some $\tau<1$ when $k=4$ and more generally when $k\geq 5$ [6, 4, 12].

Thus in the regime $d_{c}<d<d_{\mathrm{KS}}$ , detection and reconstruction are information-theoretically possible, but are conjectured to be computationally hard. In particular, this conjecture implies that there is no way to refute the existence of a coloring, or of a good partition, whenever $d<d_{\mathrm{KS}}$ , even when $d$ is large enough so that a coloring or partition probably does not exist. Our goal in this paper is to rule out spectral refutations based on the Lovász theta function, or equivalently sum-of-squares proofs of degree two.

For technical reasons, we focus on random $d$ -regular graphs, which we denote $G_{n,d}$ . A series of papers applying the first and second moment methods in this setting [46, 7, 33, 21] have determined the likely chromatic number of $G_{n,d}$ for almost all $d$ , showing that the critical $d$ for $k$ -colorability is $d_{c}=d_{\mathrm{first}}-O(1)$ just as for $G(n,d/n)$ . (There are a few values of $d$ and $k$ where $G_{n,d}$ could be $k$ -colorable with probability strictly between $0$ and $1$ , so this transition might not be completely sharp.)

We define the $d$ -regular block model by choosing a planted partition $\sigma$ uniformly at random and conditioning $G_{n,d}$ on the event that $\sigma$ is good. Equivalently, we choose $G$ uniformly from all $d$ -regular graphs such that a fraction $\tau/k$ of their $m=dn/2$ edges connect vertices within groups. We claim that our results also apply to the regular block model proposed in [51] where $d$ -regular graphs are chosen with probability proportional to $\tau^{\textrm{\# within-group edges}}((k-\tau)/(k-1))^{\textrm{\# between-group edges}}$ : in that case, the fraction of within-group edges fluctuates, but is $\tau/k+o(1)$ with high probability. (Footnote 1: These models are not to be confused with a stricter model, where for some constants $q_{rs}$ each vertex in group $r$ has exactly $q_{rs}$ neighbors in group $s$ [18, 23, 53, 15]. Our model only constrains the total number of edges within or between groups.) We again conjecture that refuting the existence of a coloring or a good partition is exponentially hard below the Kesten-Stigum bound. Since the branching ratio of a $d$ -regular tree is $d-1$ , in the regular case this becomes

$$d < d_{\mathrm{KS}} = \left(\frac{k-1}{\tau-1}\right)^{2} + 1.$$
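To make the comparison between these thresholds concrete, the following Python sketch evaluates the first-moment bound $2k\ln k-\ln k$ and the Kesten-Stigum thresholds $((k-1)/(\tau-1))^{2}$ and $((k-1)/(\tau-1))^{2}+1$ for a few values of $k$. The function names are ours, chosen for illustration, and the $O(1)$ correction in $d_{c}$ is ignored.

```python
import math

def d_first(k):
    """First-moment bound 2 k ln k - ln k for k-colorability of G(n, d/n)."""
    return 2 * k * math.log(k) - math.log(k)

def d_ks(k, tau=0.0):
    """Kesten-Stigum threshold ((k-1)/(tau-1))^2 for G(n, d/n)."""
    return ((k - 1) / (tau - 1)) ** 2

def d_ks_regular(k, tau=0.0):
    """Regular-graph version: the branching ratio d-1 shifts the threshold by one."""
    return d_ks(k, tau) + 1

# tau = 0 is the planted coloring model.
for k in (3, 4, 5, 10):
    print(k, round(d_first(k), 3), d_ks(k), d_ks_regular(k))
```

For $\tau=0$ the first-moment bound drops below the Kesten-Stigum threshold once $k\geq 5$, which is the gap in which the hard regime is conjectured to lie.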

Main results. The Lovász $\vartheta$ function, which we review below, gives a lower bound on the chromatic number which can be computed in polynomial time. In particular, if $\vartheta(\overline{G})>k$ , this provides a polynomial-time refutation of $G$ ’s $k$ -colorability. We first prove that this type of refutation exactly corresponds to sum-of-squares proofs of degree two in a natural encoding of $k$ -colorability as a system of polynomials; this is intuitive, but it does not seem to have appeared in the literature. We then show the following bounds on the likely value of $\vartheta(\overline{G})$ when $G$ is a random $d$ -regular graph.

Theorem 1.

Let $d$ be constant. For any constant $\epsilon>0$ , with high probability

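$$\frac{d}{2\sqrt{d-1}} + 1 - \epsilon \leq \vartheta(\overline{G_{n,d}}) \leq \frac{d}{2\sqrt{d-1}} + 2 + \epsilon.$$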

As a consequence, the Lovász $\vartheta$ function cannot refute $k$ -colorability with high probability if

$$k > 2 + \frac{d}{2\sqrt{d-1}},$$

and in particular if $d$ is below the Kesten-Stigum threshold.

Rearranging, no refutation of this kind can exist when

$$d < 2(k-2)\left((k-2) + \sqrt{(k-2)^{2}-1}\right) = (4 - o_{k}(1))\, d_{\mathrm{KS}}.$$
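The rearrangement can be checked numerically. The sketch below compares the condition $k>2+d/(2\sqrt{d-1})$ with the closed form $d<2(k-2)\big((k-2)+\sqrt{(k-2)^{2}-1}\big)$ obtained by solving the corresponding quadratic in $d$, and prints the ratio of that bound to the regular Kesten-Stigum threshold at $\tau=0$. The helper names are hypothetical, and the $\epsilon$ slack of Theorem 1 is ignored.

```python
import math

def theta_upper_allows_k(d, k):
    """True if k > 2 + d / (2 sqrt(d-1)), the regime where refutation fails."""
    return k > 2 + d / (2 * math.sqrt(d - 1))

def max_degree_bound(k):
    """Closed form 2(k-2)((k-2) + sqrt((k-2)^2 - 1)) from the quadratic in d."""
    a = k - 2
    return 2 * a * (a + math.sqrt(a * a - 1))

for k in (3, 5, 10, 50):
    bound = max_degree_bound(k)
    d_ks = (k - 1) ** 2 + 1          # regular Kesten-Stigum threshold, tau = 0
    # Both formulations should agree on every integer degree d >= 3 scanned here.
    agree = all(theta_upper_allows_k(d, k) == (d < bound)
                for d in range(3, 4 * k * k))
    print(k, round(bound, 2), round(bound / d_ks, 3), agree)
```

The printed ratio approaches $4$ as $k$ grows, matching the $(4-o_{k}(1))$ factor.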

Our lower bound on $\vartheta(\overline{G_{n,d}})$ follows easily from Friedman’s theorem [29] on the spectrum of $G_{n,d}$ . For the upper bound, we first use orthogonal polynomials to derive explicit bounds on $\vartheta(\overline{G})$ for arbitrary regular graphs of a given girth—which may be of independent interest—and then employ a concentration argument for $G_{n,d}$ .

We also relate the Lovász $\vartheta$ function to the existence of a good partition in the disassortative case of the block model, giving

Theorem 2.

Fix $\tau<1$ and say a partition is good if a fraction $\tau/k$ of its edges connect endpoints in the same group. Then sum-of-squares proofs of degree two cannot refute the existence of a good partition in $G_{n,d}$ if

$$\frac{k-\tau}{1-\tau} > 2 + \frac{d}{2\sqrt{d-1}}.$$

Thus degree-two sum-of-squares cannot distinguish the regular stochastic block model from $G_{n,d}$ until $d$ is roughly a factor of $4$ above the Kesten-Stigum threshold.

Related work. The distribution of $\vartheta(\overline{G})$ for the Erdős-Rényi graph $G=G(n,p)$ and the random $d$ -regular graph $G=G_{n,d}$ were studied in [20]. In particular, that work showed that when $d$ is sufficiently large, with high probability $\vartheta(\overline{G_{n,d}})>c\sqrt{d}$ for a constant $c>0$ . Our results tighten this lower bound, making the constant $c$ explicit, and provide a nearly-matching upper bound.

Our results on the power of degree-two sum-of-squares refutations for $k$ -colorability contribute to a recent line of work on refutations of random CSPs, which we briefly survey. If we define the density of a CSP as the ratio of constraints to variables (which for coloring equals half the average degree of the graph), then the conjectured hard regime for $k$ -coloring corresponds to a range of densities bounded below and above by constants (i.e., depending on $k$ but not $n$ ). For CSPs such as $k$ -SAT and $k$ -XOR, there is again a satisfiability transition at constant density, but with high probability sum-of-squares refutations with constant degree do not exist unless the density is much higher, namely $\Omega(n^{k/2-1})$ [55], a result which was recently extended to general CSPs whose constraint predicate supports a $(k-1)$ -wise uniform distribution [36]. Conversely, if a predicate does not support a $t$ -wise uniform distribution, then [10] shows that there is an efficient sum-of-squares refutation when the density is $\tilde{O}(n^{t/2-1})$ . For coloring, this gives refutations at roughly constant density; our contribution makes this a nearly-precise constant in the special case of degree-two sum-of-squares on random regular graphs.

The hidden clique problem also has a conjectured hard regime. It is well known that the random graph $G(n,1/2)$ has no cliques larger than $O(\log n)$ [28], but it is conjectured to be computationally hard to distinguish $G(n,1/2)$ from a graph with a planted clique of size $o(n^{1/2})$ . A sequence of progressively stronger sum-of-squares lower bounds for this problem [27, 32, 45] has culminated in the theorem that with high probability the degree- $d$ sum-of-squares proof system cannot refute the existence of a clique of size $n^{1/2-c(d/\log n)^{1/2}}$ in $G(n,1/2)$ for some constant $c>0$ [13].

In contrast to the aforementioned work on refuting random $k$ -CSPs and planted cliques, our result pertains to a much more specific pair of problems, namely $k$ -coloring and the stochastic block model, and only to degree-two sum-of-squares refutations; but it attains a sharp bound, within an additive constant, on the density at which these refutations become possible. We conjecture that sum-of-squares refutations of any constant degree do not exist below the Kesten-Stigum threshold, but it seems difficult to extend our current techniques to degree higher than two.

2. Colorings, Partitions, and the Lovász $\vartheta$ Function

2.1. Background on sum-of-squares

One type of refutation which has gained a great deal of interest recently is sum-of-squares proofs: see [14] for a review. Suppose we encode our variables and constraints as a system of $m$ polynomial equations on $n$ variables, $f_{j}(x_{1},x_{2},\ldots,x_{n})=0$ for all $j=1,\ldots,m$ . One way to prove that no solution $\boldsymbol{x}\in\mathbb{R}^{n}$ exists—in algebraic terms, that this variety is empty—is to find a linear combination of the $f_{j}$ which is greater than zero for all $\boldsymbol{x}$ . Moreover, the positivstellensatz of Krivine [37] and Stengle [57] shows that a polynomial is nonnegative over $\mathbb{R}^{n}$ if and only if it is a sum of squares of polynomials. Thus we need polynomials $g_{1},\ldots,g_{m}$ and $h_{1},\ldots,h_{t}$ and a constant $\epsilon>0$ (which we can always scale to $1$ if we like) such that

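$$\sum_{j=1}^{m} g_{j}(\boldsymbol{x}) f_{j}(\boldsymbol{x}) = S + \epsilon \quad\text{where}\quad S = \sum_{\ell=1}^{t} h_{\ell}(\boldsymbol{x})^{2}. \tag{5}$$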

This proof technique is complete as well as sound. That is, there is such a set of polynomials $\{g_{j}\}$ and $\{h_{\ell}\}$ if and only if no solution exists.

Even when the $f_{j}$ are of low degree, the polynomials $g_{j}$ and $h_{\ell}$ might be of high degree, making them difficult to find. However, we can ask when a refutation exists where both sides of (5) have degree $\delta$ or less. As we take $\delta=2,4,6,\ldots$ we obtain the SOS hierarchy. The case $\delta=2$ is typically equivalent to a familiar semidefinite relaxation of the problem. More generally, a degree- $\delta$ refutation exists if and only if a certain semidefinite program on $O(n^{\delta})$ variables is feasible: thus we can find degree- $\delta$ refutations, or confirm that they do not exist, in time $\mathrm{poly}(n^{\delta})$ [56, 52, 54, 41]. To see why, note that if we write a polynomial $S(\boldsymbol{x})$ as a bilinear form on monomials $x^{(\alpha)}=\prod_{i}x^{\alpha_{i}}$ of degree $\delta/2$ ,

$$S(\boldsymbol{x}) = \sum_{\alpha,\alpha'} \mathcal{S}(\alpha,\alpha')\, x^{(\alpha)} x^{(\alpha')},$$

then $S(\boldsymbol{x})$ is a sum of squares of degree $\delta/2$ polynomials if and only if the matrix $\mathcal{S}$ is positive semidefinite, or equivalently if $\mathcal{S}$ is the sum of positive symmetric rank-one matrices. These are outer products of vectors with themselves, so there are vectors $w_{1},\ldots,w_{t}$ such that $\mathcal{S}=\sum_{\ell=1}^{t}{w_{\ell}}\otimes{w_{\ell}}$ and $S=\sum_{\ell}h_{\ell}^{2}$ where $h_{\ell}(\boldsymbol{x})=\sum_{\alpha}w_{\ell}(\alpha)x^{(\alpha)}$ . Finally, the constraint that $S=\sum_{j}g_{j}f_{j}-\epsilon$ for some $\{g_{j}\}$ and some $\epsilon>0$ corresponds to a set of linear inequalities on the entries of $\mathcal{S}$ .
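The correspondence between positive semidefinite matrices and sums of squares can be illustrated on a toy case. The sketch below takes the single-variable polynomial $S(x)=2-2x+x^{2}$ (our own example, not from the paper), writes it as a bilinear form on the monomials $(1,x)$, and recovers vectors $w_{\ell}$ and polynomials $h_{\ell}$ from an eigendecomposition. It assumes NumPy is available.

```python
import numpy as np

# Gram matrix of S(x) = 2 - 2x + x^2 in the monomial basis (1, x).
S = np.array([[2.0, -1.0],
              [-1.0, 1.0]])

# S is positive semidefinite iff all eigenvalues are nonnegative.
eigvals, eigvecs = np.linalg.eigh(S)
assert np.all(eigvals >= -1e-12)

# Rank-one pieces: column l of W is w_l = sqrt(lambda_l) * v_l,
# so that S = sum_l w_l (outer) w_l.
W = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))

def h(x):
    """Values h_l(x) = sum_alpha w_l(alpha) x^(alpha) for the basis (1, x)."""
    return W.T @ np.array([1.0, x])

def S_poly(x):
    return 2 - 2 * x + x ** 2

# The two columns printed for each x should coincide up to rounding.
for x in (-1.5, 0.0, 0.7, 3.0):
    print(x, S_poly(x), float(np.sum(h(x) ** 2)))
```

Any other factorization $\mathcal{S}=WW^{T}$, such as a Cholesky factor, would give a different but equally valid set of squares.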

The dual object to a degree- $\delta$ refutation is a pseudoexpectation. This is a linear operator $\tilde{\mathbb{E}}$ on polynomials of degree at most $\delta$ with the properties that

[TABLE]

If we write $\tilde{\mathbb{E}}$ as a bilinear form on monomials $x^{(\alpha)}$ , then (6) and (7) are linear constraints on its entries, and (8) states that this matrix is positive semidefinite. The resulting SDP is dual to the SDP for refutations, so each of these SDPs is feasible precisely when the other is not. Thus there is a degree- $\delta$ refutation if and only if no degree- $\delta$ pseudoexpectation exists, and vice versa.

We can think of a pseudoexpectation as a way for an adversary to fool the SOS proof system. The adversary claims there are many solutions, even if in reality there are none, and offers to compute the expectation of any low-degree polynomial over the set of solutions. As long as (6) and (7) hold, this appears to be a distribution over valid solutions, and as long as (8) holds, the SOS prover cannot catch the adversary in an obvious lie like the claim that some quantity of degree $\delta/2$ has negative variance.

2.2. Colorings, partitions, and sum-of-squares

For a given graph $G$ with adjacency matrix $A$ , we can encode the problem of $k$ -colorability as the following system of polynomial equations in $kn$ variables $\boldsymbol{x}=\{x_{i,c}\}$ , where $i\in[n]$ indexes vertices and $c\in[k]$ indexes colors:

[TABLE]

Then $G$ is $k$ -colorable if and only if (9)–(11) has a solution in $\mathbb{R}^{kn}$ . We can encode the stochastic block model similarly: fix $\tau$ , and recall that a partition of $G$ into $k$ groups is good if a fraction $\tau/k$ of the edges have endpoints in the same group. If $G$ has $m$ edges, we can replace constraint (11) with

[TABLE]

A degree- $\delta$ sum-of-squares refutation of (9)–(11) is an equation of the form

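$$\sum_{i,c} b_{i,c}\, p^{\textup{bool}}_{i,c} + \sum_{i} s_{i}\, p^{\textup{sing}}_{i} + \sum_{(i,j)\in E} g_{ij}\, p^{\textup{col}}_{ij} = S + \epsilon$$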

where $b_{i,c},s_{i},g_{ij}$ are polynomials over $\boldsymbol{x}$ , $S$ is a sum of squares of polynomials, $\epsilon$ is a small positive constant which we will omit when clear, and the degree of each side is at most $\delta$ . Such an equation is a proof that no coloring exists. Replacing $\sum_{i,j}g_{ij}p^{\textup{col}}_{ij}$ with $g_{\textup{cut}}p^{\textup{cut}}$ gives a refutation of the system formed by (9), (10), and (12), proving that no good partition exists. We focus on refutations of degree two, which as we will see are related to a classic relaxation of graph coloring.

2.3. The Lovász $\vartheta$ function

An orthogonal representation of a graph $G$ with $n$ vertices is an assignment of a unit vector $u_{i}\in\mathbb{R}^{n}$ to each vertex $i$ such that $\big{\langle}u_{i},u_{j}\big{\rangle}=0$ for all $(i,j)\in E$ . The Lovász function, denoted $\vartheta(\overline{G})$ by convention, is the smallest $\kappa$ for which there is an orthogonal representation $\{u_{i}\}$ and an additional unit vector $\mathfrak{z}\in\mathbb{R}^{n}$ such that $\big{\langle}u_{i},\mathfrak{z}\big{\rangle}=1/\sqrt{\kappa}$ : that is, such that all the $u_{i}$ lie on a cone of width $\cos^{-1}(1/\sqrt{\kappa})$ . (Footnote 2: To see that this definition of $\vartheta$ is equivalent to the more common one that $\big{\langle}u_{i},\mathfrak{z}\big{\rangle}\leq 1/\sqrt{\kappa}$ for every $i$ , i.e., where the $u_{i}$ can be in the interior of this cone, simply rotate each $u_{i}$ in the subspace perpendicular to its neighbors until $\big{\langle}u_{i},\mathfrak{z}\big{\rangle}$ is exactly $1/\sqrt{\kappa}$ .)

The Gram matrix $P_{ij}=\big{\langle}u_{i},u_{j}\big{\rangle}$ of an orthogonal representation is positive semidefinite with $P_{ii}=1$ and $P_{ij}=0$ for $(i,j)\in E$ . Adding an auxiliary row and column for the inner products with $\mathfrak{z}$ , we can define $\vartheta$ in terms of a semidefinite program,

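$$\vartheta(\overline{G}) = \min_{P}\ \kappa>0 \quad\text{such that}\quad \begin{pmatrix} 1 & \boldsymbol{1}^{T}/\sqrt{\kappa} \\ \boldsymbol{1}/\sqrt{\kappa} & P \end{pmatrix} \succeq 0, \quad P_{ii}=1 \ \ \forall i, \quad P_{ij}=0 \ \ \forall (i,j)\in E. \tag{14}$$

$$\vartheta(\overline{G}) = \max_{D}\ \big{\langle}D,\mathbb{J}\big{\rangle} \quad\text{such that}\quad D\succeq 0, \quad \operatorname{tr}D=1, \quad D_{ij}=0 \ \ \forall (i,j)\notin E,\ i\neq j, \tag{15}$$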

where $\mathbb{J}$ is the matrix of all $1$ s and $\big{\langle}A,B\big{\rangle}=\operatorname{tr}(A^{\dagger}B)=\sum_{i,j}A_{ij}B_{ij}$ denotes the matrix inner product.

If $G$ is $k$ -colorable then $\vartheta(\overline{G})\leq k$ , since we can use the first $k$ basis vectors $e_{1},\ldots,e_{k}$ as an orthogonal representation and take $\mathfrak{z}=(1/\sqrt{k})\sum_{t=1}^{k}e_{t}$ . Thus if $\vartheta(\overline{G})>k$ , the Lovász function gives a polynomial-time refutation of $k$ -colorability. As stated above, degree-two sum-of-squares proofs typically correspond to well-known semidefinite relaxations, and the next theorem shows that this is indeed the case here.
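The construction in this argument is easy to verify directly. The following sketch builds the orthogonal representation $u_{i}=e_{c(i)}$ and the vector $\mathfrak{z}=(1/\sqrt{k})\sum_{t}e_{t}$ for a properly 3-colored 5-cycle; the graph and the coloring are illustrative choices of ours, and NumPy is assumed.

```python
import numpy as np

# 5-cycle with a proper 3-coloring (both chosen only for illustration).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
coloring = [0, 1, 0, 1, 2]
n, k = 5, 3

E = np.eye(n)                                   # standard basis of R^n
u = np.array([E[c] for c in coloring])          # u_i = e_{color(i)}
z = E[:k].sum(axis=0) / np.sqrt(k)              # z = (1/sqrt k) sum_t e_t

edge_products = [float(u[i] @ u[j]) for i, j in edges]
cone_products = u @ z

print(edge_products)   # all zero: adjacent vertices receive different colors
print(cone_products)   # every entry equals 1/sqrt(k)
```

Because the representation only uses $k$ distinct basis vectors, the same check works for any proper $k$-coloring of any graph.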

Theorem 3.

There is a degree-2 SOS refutation of $k$ -colorability for a graph $G$ if and only if $\vartheta(\overline{G})>k$ .

We prove this in the Appendix, where we show that any orthogonal representation of $G$ that lies on an appropriate cone lets us define a pseudoexpectation for the system (9)–(11). This will also allow us to modify the SDPs for refutations and pseudoexpectations, and work with simplified but equivalent versions.

2.4. Good partitions and a relaxed Lovász function

The reader may have noticed that while the coloring constraint (11) fixes the inner product $\sum_{c}x_{i,c}x_{j,c}=\big{\langle}x_{i},x_{j}\big{\rangle}$ to zero for each edge $(i,j)\in E$ , the “good partition” constraint (12) only fixes the sum of all these inner products. This suggests a slight relaxation of the Lovász $\vartheta$ function, where we weaken the SDP (14) by replacing the individual constraints on $P_{ij}$ for all $(i,j)\in E$ with a constraint on their sum. In other words, we allow a vector coloring where neighboring vectors are orthogonal on average. We denote the resulting function $\hat{\vartheta}$ :

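$$\hat{\vartheta}(\overline{G}) = \min_{P}\ \kappa>0 \quad\text{such that}\quad \begin{pmatrix} 1 & \boldsymbol{1}^{T}/\sqrt{\kappa} \\ \boldsymbol{1}/\sqrt{\kappa} & P \end{pmatrix} \succeq 0, \quad P_{ii}=1 \ \ \forall i, \quad \big{\langle}P,A\big{\rangle}=0. \tag{16}$$

$$\hat{\vartheta}(\overline{G}) = \max_{\eta,\boldsymbol{b}}\ \big{\langle}D,\mathbb{J}\big{\rangle} \quad\text{such that}\quad D\triangleq\eta A+\operatorname{diag}\boldsymbol{b}\succeq 0, \quad \operatorname{tr}D=\big{\langle}\boldsymbol{b},\boldsymbol{1}\big{\rangle}=1. \tag{17}$$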

Since $\hat{\vartheta}$ is a relaxation of $\vartheta$ , we always have $\hat{\vartheta}(\overline{G})\leq\vartheta(\overline{G})$ .

This modified Lovász function $\hat{\vartheta}$ is equivalent to degree-two SOS for good partitions in the disassortative case of the block model, in the following sense.

Theorem 4.

If $\tau<1$ , there exists a degree-two SOS refutation of a partition of $G$ where a fraction $\tau/k$ of the edges are within groups if and only if

$$\hat{\vartheta}(\overline{G}) > \frac{k-\tau}{1-\tau}.$$

Once again we leave the proof to the Appendix. Note that the SDP (16) for $\hat{\vartheta}$ contains no information about $k$ or $\tau$ : this relaxed orthogonal representation has the uncanny capacity to fool degree-two SOS about an entire family of related cuts of different sizes and qualities.

2.5. Upper and lower bounds

With these theorems in hand, we can set about producing degree-two sum-of-squares refutations and pseudoexpectations for our problems; throughout this section we will refer to these simply as ‘refutations’ and ‘pseudoexpectations’. In fact, the same construction will give us asymptotically optimal refutations and pseudoexpectations for both the coloring and partition problems.

To warm up, we have the following simple construction of a refutation, which we will phrase in terms of the Lovász theta function and its relaxed version.

Lemma 1.

Let $G$ be a $d$ -regular graph, and let $\lambda_{\min}$ be the smallest eigenvalue of its adjacency matrix $A$ . Then

$$\vartheta(\overline{G}) \geq \hat{\vartheta}(\overline{G}) \geq 1 + \frac{d}{|\lambda_{\min}|}.$$

Proof.

We construct a feasible solution $D$ to the dual SDP (17) by taking

$$D \triangleq \frac{1}{n}\left(\mathbbm{1} + \frac{1}{|\lambda_{\min}|} A\right),$$

and use the fact that $\big{\langle}A,\mathbb{J}\big{\rangle}=dn$ . ∎
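As a concrete check of this construction, the sketch below evaluates $D=\frac{1}{n}\big(\mathbbm{1}+A/|\lambda_{\min}|\big)$ on the Petersen graph, a 3-regular graph that we pick purely as an example. It confirms that $D$ is positive semidefinite and that $\big{\langle}D,\mathbb{J}\big{\rangle}=1+d/|\lambda_{\min}|$. We assume the usual normalization $\operatorname{tr}D=1$ for the dual SDP, and NumPy is required.

```python
import numpy as np

def petersen_adjacency():
    """Adjacency matrix of the Petersen graph (3-regular, 10 vertices)."""
    A = np.zeros((10, 10))
    for i in range(5):
        pairs = [(i, (i + 1) % 5),            # outer 5-cycle
                 (i, i + 5),                  # spokes
                 (5 + i, 5 + (i + 2) % 5)]    # inner pentagram
        for a, b in pairs:
            A[a, b] = A[b, a] = 1.0
    return A

A = petersen_adjacency()
n, d = A.shape[0], int(A[0].sum())
lam_min = np.linalg.eigvalsh(A)[0]            # eigvalsh sorts eigenvalues ascending

D = (np.eye(n) + A / abs(lam_min)) / n        # dual solution from the proof
J = np.ones((n, n))

print(np.trace(D))                            # normalization tr D = 1
print(np.linalg.eigvalsh(D)[0])               # smallest eigenvalue of D (PSD check)
print(np.sum(D * J), 1 + d / abs(lam_min))    # objective <D, J> vs. 1 + d/|lambda_min|
```

Dividing $A$ by $|\lambda_{\min}|$ is exactly what makes the smallest eigenvalue of $\mathbbm{1}+A/|\lambda_{\min}|$ equal to zero, so $D$ sits on the boundary of the positive semidefinite cone.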

By invoking Friedman’s theorem [29] that (as $n\to\infty$ ) the smallest eigenvalue of a random $d$ -regular graph is with high probability larger than $-2(1+\epsilon)\sqrt{d-1}$ for any $\epsilon>0$ , we obtain:

Corollary 1.

When $G=G_{n,d}$ , for any $\epsilon>0$ , with high probability

$$\vartheta(\overline{G}) \geq \hat{\vartheta}(\overline{G}) > 1 + \frac{d}{2\sqrt{d-1}} - \epsilon.$$

Putting this together with Theorems 3 and 4 gives

Corollary 2.

If $G=G_{n,d}$ and $\tau<1$ , with high probability there exists a refutation of a partition with a fraction $\tau/k$ of within-group edges when

$$\frac{k-\tau}{1-\tau} < 1 + \frac{d}{2\sqrt{d-1}}.$$

Setting $\tau=0$ , a refutation of $k$ -colorability exists with high probability when

$$k < 1 + \frac{d}{2\sqrt{d-1}}.$$

Note that for large $k$ , the minimum value of $d$ satisfying (21) is a factor of four above the Kesten-Stigum threshold in both the coloring and partition problems.
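The factor of four can be seen numerically. The sketch below searches for the smallest integer degree satisfying $k<1+d/(2\sqrt{d-1})$, the $\tau=0$ condition of Corollary 2 with the $\epsilon$ slack ignored, and compares it with the regular Kesten-Stigum threshold $(k-1)^{2}+1$. The function name is hypothetical.

```python
import math

def min_refutable_degree(k):
    """Smallest integer d >= 3 with k < 1 + d / (2 sqrt(d-1)), by direct search."""
    d = 3
    while not k < 1 + d / (2 * math.sqrt(d - 1)):
        d += 1
    return d

for k in (3, 5, 10, 30, 100):
    d_min = min_refutable_degree(k)
    d_ks = (k - 1) ** 2 + 1            # regular Kesten-Stigum threshold at tau = 0
    print(k, d_min, round(d_min / d_ks, 3))
```

A direct search is used instead of a closed form so that the check does not depend on any algebraic rearrangement.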

Our construction for this lower bound on $\vartheta$ is quite simple, but remarkably we find that for both the coloring and partition problems, it is asymptotically optimal in $d$ and $k$ . In particular,

Theorem 5.

For any $d$ -regular graph $G$ with girth at least $\gamma$ , we have

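$$\hat{\vartheta}(\overline{G}) \leq \vartheta(\overline{G}) < 1 + \frac{d}{2(1-\epsilon_{\gamma})\sqrt{d-1}}.$$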

where $\epsilon_{\gamma}$ is a sequence of constants which decrease to zero as $\gamma\to\infty$ .

Since for any constant $\gamma$ a random regular graph has girth at least $\gamma$ with positive probability [59, Theorem 2.12], we rely on the following result showing that $\vartheta(\overline{G_{n,d}})$ is concentrated in an interval of width one. The proof is essentially the same as that of [7] for the chromatic number, and is given in the Appendix.

Lemma 2.

Let $\theta\geq 3$ . If $\vartheta(\overline{G_{n,d}})\leq\theta$ with positive probability, then $\vartheta(\overline{G_{n,d}})\leq\theta+1$ with high probability.

Corollary 3.

If $G=G_{n,d}$ , with high probability there does not exist a refutation of a partition with a fraction $\tau/k$ of within-group edges when

$$\frac{k-\tau}{1-\tau} > 2 + \frac{d}{2\sqrt{d-1}}.$$

Setting $\tau=0$ , with high probability no refutation of $k$ -colorability exists when

$$k > 2 + \frac{d}{2\sqrt{d-1}}.$$

Thus for both problems, no degree-two sum-of-squares refutation exists until $d$ is roughly a factor of $4$ above the Kesten-Stigum threshold.

3. Constructing a Pseudoexpectation with Orthogonal Polynomials

We now prove Theorem 5 by constructing a feasible solution to the primal SDP (14): that is, unit vectors $\{u_{i}\}$ such that $\big{\langle}u_{i},u_{j}\big{\rangle}=0$ for every edge $(i,j)$ , and a unit vector $\mathfrak{z}$ so that $\big{\langle}u_{i},\mathfrak{z}\big{\rangle}=1/\sqrt{\kappa}$ for all $i$ . Recall that such a collection exists if and only if $\vartheta(\overline{G})\leq\kappa$ .

It is convenient to instead define a set of unit vectors $\{v_{i}\}$ such that $\big{\langle}v_{i},v_{j}\big{\rangle}=-1/(\kappa-1)$ for every edge $(i,j)$ . We claim that such a set exists if and only if $\vartheta(\overline{G})\leq\kappa$ . In one direction, given $\{u_{i}\}$ and $\mathfrak{z}$ with the above properties, if we define

[TABLE]

then the $v_{i}$ are unit vectors with $\big{\langle}v_{i},v_{j}\big{\rangle}=-1/(\kappa-1)$ for $(i,j)\in E$ . For instance, if the $u_{i}$ are $k$ orthogonal basis vectors, then the $v_{i}$ point to the corners of a $k$ -simplex. In the other direction, given $\{v_{i}\}$ we can take $\mathfrak{z}$ to be a unit vector perpendicular to all the $v_{i}$ , and define

[TABLE]

Then $\big{\langle}u_{i},u_{j}\big{\rangle}=0$ for $(i,j)\in E$ , and $\big{\langle}u_{i},\mathfrak{z}\big{\rangle}=1/\sqrt{\kappa}$ for all $i$ . This means that we can characterize the Lovász $\vartheta$ function with a slightly different SDP, which uses the Gram matrix of the $\{v_{i}\}$ :

[TABLE]
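The change of variables from $\{u_{i}\}$ to $\{v_{i}\}$ can be checked numerically on the simplex example. The sketch below assumes the map $v_{i}=\sqrt{\kappa/(\kappa-1)}\,(u_{i}-\mathfrak{z}/\sqrt{\kappa})$, which is the rescaling forced by the two stated properties; the displayed formula may be written differently, and the helper names `dot` and `u_to_v` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def u_to_v(us, z, kappa):
    """Assumed map: v_i = sqrt(kappa/(kappa-1)) * (u_i - z/sqrt(kappa))."""
    scale = math.sqrt(kappa / (kappa - 1.0))
    shift = 1.0 / math.sqrt(kappa)
    return [[scale * (ui - shift * zi) for ui, zi in zip(u, z)] for u in us]

# Simplex example: the u_i are k orthonormal basis vectors and z is the
# normalized all-ones vector, so <u_i, z> = 1/sqrt(k) and kappa = k.
k = 4
us = [[1.0 if a == b else 0.0 for b in range(k)] for a in range(k)]
z = [1.0 / math.sqrt(k)] * k
vs = u_to_v(us, z, kappa=k)

# Gram matrix of the v_i: ones on the diagonal, -1/(k-1) elsewhere.
gram = [[dot(a, b) for b in vs] for a in vs]
```

Here every pair of vertices plays the role of an edge, so all off-diagonal Gram entries take the edge value $-1/(\kappa-1)$.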

We will show that for any $d$ -regular graph $G$ with girth at least $\gamma$ , this SDP has a feasible solution with

[TABLE]

where $\epsilon_{\gamma}$ depends only on $\gamma$ and tends to zero as $\gamma\to\infty$ . Therefore, there is a pseudoexpectation that prevents degree-two SOS from refuting $k$ -colorability for any $k\geq\kappa$ . We will construct this pseudoexpectation by taking a linear combination of the “non-backtracking powers” of $G$ ’s adjacency matrix $A$ .

Denote by $A^{(t)}$ the matrix whose $i,j$ entry is the number of non-backtracking walks of length $t$ from $i$ to $j$ ; that is, walks which may freely wander the graph so long as they do not make adjacent pairs of steps $a\to b\to a$ for any vertices $a,b$ . There is a simple two-term recursion for these matrices: to count non-backtracking walks of length $t+1$ , we first extend each walk of length $t$ by one edge, and then subtract off those that backtracked on the last step. This gives

[TABLE]
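As a sanity check on this counting argument, the sketch below builds the non-backtracking powers of a small $d$-regular graph and compares them with brute-force enumeration. It assumes the recursion takes the standard form $A^{(2)}=A^{2}-d\mathbbm{1}$ and $A^{(t+1)}=AA^{(t)}-(d-1)A^{(t-1)}$ for $t\geq 2$; the Petersen graph and the function names are illustrative choices, not taken from the text.

```python
def petersen():
    """Adjacency matrix of the Petersen graph (3-regular, girth 5)."""
    n = 10
    A = [[0] * n for _ in range(n)]
    for i in range(5):
        for a, b in ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)):
            A[a][b] = A[b][a] = 1
    return A

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def nb_powers(A, d, t_max):
    """Return [A^(0), ..., A^(t_max)] via the assumed two-term recursion."""
    n = len(A)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    powers = [I, A]
    for t in range(1, t_max):
        AX = matmul(A, powers[t])
        # Extending a walk of length t and stepping straight back: there are
        # d ways to do this when t = 1 and d - 1 ways when t >= 2.
        back = d if t == 1 else d - 1
        powers.append([[AX[i][j] - back * powers[t - 1][i][j]
                        for j in range(n)] for i in range(n)])
    return powers[: t_max + 1]

def nb_count(A, i, j, t):
    """Brute-force count of non-backtracking walks of length t from i to j."""
    def extend(prev, cur, steps):
        if steps == 0:
            return int(cur == j)
        return sum(extend(cur, nxt, steps - 1)
                   for nxt in range(len(A)) if A[cur][nxt] and nxt != prev)
    return extend(None, i, t)

A = petersen()
P = nb_powers(A, d=3, t_max=4)
```

Because the Petersen graph has girth 5, the diagonals of `P[1]` through `P[4]` vanish, and `P[2]` and `P[3]` vanish on edges.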

Borrowing notation from [11], we can write $A^{(t)}$ in closed form as

[TABLE]

where $q_{t}(z)$ is a polynomial of degree $t$ . Specifically,

[TABLE]

and for $t>1$ the $q_{t}$ satisfy the Chebyshev recurrence

[TABLE]

We can write $q_{t}$ explicitly as

[TABLE]

and $U_{t}$ is the $t$ th Chebyshev polynomial of the second kind (note that $U_{-1}(z)=0$ ).

Let $\mu$ denote the Kesten-McKay measure on the interval $[-1,+1]$, which after scaling by $2\sqrt{d-1}$ describes the typical spectral density of a random regular graph [44]:

[TABLE]

Then the polynomials $q_{t}$ are orthonormal with respect to this measure. That is, if we define the inner product

[TABLE]

then

[TABLE]
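The orthonormality claim can be checked by numerical integration. The sketch below assumes the normalization $q_{0}=1$, $q_{1}(z)=2\sqrt{(d-1)/d}\,z$, $q_{2}=2zq_{1}-\sqrt{d/(d-1)}\,q_{0}$, and $q_{t+1}=2zq_{t}-q_{t-1}$ thereafter, together with the rescaled Kesten-McKay density $\mu(z)=\frac{2d(d-1)\sqrt{1-z^{2}}}{\pi\,(d^{2}-4(d-1)z^{2})}$. These forms are reconstructed from the standard Kesten-McKay law and the non-backtracking recursion rather than copied from the displayed equations, so treat them as assumptions.

```python
import math

def q(t, z, d):
    """q_t(z) under the assumed normalization q_0 = 1, q_1 = 2*sqrt((d-1)/d)*z."""
    prev, cur = 1.0, 2.0 * math.sqrt((d - 1) / d) * z
    if t == 0:
        return prev
    for s in range(1, t):
        # The step from q_1 to q_2 has a different constant, mirroring
        # A^(2) = A^2 - d*I versus the factor (d - 1) in later steps.
        c = math.sqrt(d / (d - 1)) if s == 1 else 1.0
        prev, cur = cur, 2.0 * z * cur - c * prev
    return cur

def inner(f, g, d, n=2000):
    """<f, g> against the assumed density, using z = cos(theta) and the
    midpoint rule (the substitution removes the square-root endpoint)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        th = h * (i + 0.5)
        z = math.cos(th)
        total += f(z) * g(z) * math.sin(th) ** 2 / (d * d - 4.0 * (d - 1) * z * z)
    return total * h * 2.0 * d * (d - 1) / math.pi

d = 3
gram = [[inner(lambda z: q(s, z, d), lambda z: q(t, z, d), d)
         for t in range(5)] for s in range(5)]
```

Under these assumptions `gram` should agree with the $5\times 5$ identity matrix up to quadrature error.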

If the girth of the graph is at least $\gamma$ , there is no way for a non-backtracking walk of length $\gamma-2$ or less to return to its starting point or to a neighbor of its starting point, so $\big{\langle}\mathbbm{1},A^{(t)}\big{\rangle}=\big{\langle}A,A^{(t)}\big{\rangle}=0$ for $1<t\leq\gamma-2$ . We can thus satisfy the diagonal and edge constraints of (24) by considering solutions of the form

[TABLE]

since the first two terms ensure that $P$ has $1$ s on its diagonal and $-1/(\kappa-1)$ on the edges. If we write

[TABLE]

our job is to optimize the coefficients $c_{t}$ for $1<t\leq\gamma-2$ so as to minimize $c_{1}$ , and hence $\kappa$ , while ensuring that $P\succeq 0$ .

The eigenvalues of the matrix $f(A/(2\sqrt{d-1}))$ are of the form $f(\lambda/(2\sqrt{d-1}))$ where $\lambda$ ranges over all of $A$’s eigenvalues. Therefore, $P\succeq 0$ if and only if $f(\lambda/(2\sqrt{d-1}))\geq 0$ for all eigenvalues $\lambda$ of $A$. Friedman’s celebrated theorem [29] shows that, with high probability, the eigenvalues of $A$ are contained in the set

[TABLE]

for any $\epsilon>0$ . Thus we require that

[TABLE]

We will relax this condition slightly by demanding just that $f$ is nonnegative on $[-1,+1]$ , although as we will see the resulting optimum is achieved by a function which is nonnegative on all of $\mathbb{R}$ . First we use orthonormality (29) to write the coefficients $c_{t}$ as inner products,

[TABLE]

Then we optimize the pseudoexpectation as follows,

[TABLE]

When the degree $\gamma-2$ of $f$ is even, we can solve this optimization problem explicitly. Set $m=\gamma/2$ , and let $r_{1}>\cdots>r_{m}$ be the roots of $q_{m}$ in decreasing order; it follows from standard arguments about orthogonal polynomials that these are all in the support of $\mu$ , i.e., in the interval $[-1,+1]$ . Consider the following polynomial of degree $2(m-1)=\gamma-2$ ,

[TABLE]

where

[TABLE]

is a normalizing factor to ensure that $\big{\langle}q_{0},s\big{\rangle}=1$ . We claim that $s(z)$ is the optimum of (33). To prove this, we begin with a general lemma on orthogonal polynomials and quadrature. The proof is standard (e.g. [58]) but we include it in the Appendix for completeness.

Lemma 3.

Let $\{p_{t}\}$ be a sequence of polynomials of degree $t$ which are orthogonal with respect to a measure $\rho$ supported on a compact interval $I$ . Then the roots $r_{1},\ldots,r_{t}$ of $p_{t}$ form a quadrature rule which is exact for any polynomial $u$ of degree less than $2t$ , in that

[TABLE]

for some positive weights $\{\omega_{1},\ldots,\omega_{t}\}$ independent of $u$ .

Now let $g(z)=z-r_{m}$ . In view of Lemma 3, for any polynomial $f(z)$ of degree at most $\gamma-2$ , the inner product $\langle g,f\rangle$ can be expressed using the roots $r_{1},\ldots,r_{m}$ of $q_{m}$ as a quadrature,

[TABLE]

Note that $\omega_{j}(r_{j}-r_{m})>0$ for every $1\leq j\leq m-1$, since $r_{m}$ is the left-most root. If we impose the constraints that $f(r_{j})\geq 0$ for all $j=1,\ldots,m-1$, then $\langle g,f\rangle\geq 0$. If we also impose the constraint $\langle f,q_{0}\rangle=1$, then

[TABLE]

with equality if and only if $f(r_{j})=0$ for all $j=1,\ldots,m-1$ . Since $s(z)$ obeys this equality condition, we have

[TABLE]

and this is the minimum possible value of $c_{1}=\langle q_{1},s\rangle$ subject to the constraints that $\langle q_{0},f\rangle=1$ and $f(r_{j})\geq 0$ for $j=1,\ldots,m-1$ . Moreover, $s(z)\geq 0$ on all of $\mathbb{R}$ , so $s(z)$ in fact obeys the stronger constraint (32).

Referring back to (31) gives

[TABLE]

and so

[TABLE]

Finally, we obtain (22) by defining $\epsilon_{\gamma}=r_{m}+1$. Since $r_{m}\to-1$ as $m$ tends to infinity, we have $\epsilon_{\gamma}\to 0$ as $\gamma\to\infty$, completing the proof. (Footnote 3: The fact that $r_{m}\to-1$ as $m\to\infty$ can be deduced, for example, by using the definition of $q_{m}$ in (27) to observe that $q_{m}(-1)$ and $q_{m}(-\cos(\frac{\pi}{m-1}))$ have opposite signs, and then applying the intermediate value theorem.)
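The optimization can be reproduced numerically for small girth. The sketch below is an illustration under stated assumptions rather than the paper's own computation: it assumes $q_{1}(z)=2\sqrt{(d-1)/d}\,z$ with the Chebyshev-type recurrence, the rescaled Kesten-McKay density $\mu(z)=\frac{2d(d-1)\sqrt{1-z^{2}}}{\pi(d^{2}-4(d-1)z^{2})}$, and the relation $c_{1}=-\sqrt{d}/(\kappa-1)$ obtained by matching the edge entries of $P$ (using $q_{1}(A/(2\sqrt{d-1}))=A/\sqrt{d}$). With these, the quadrature argument gives $c_{1}=2\sqrt{(d-1)/d}\,r_{m}$ and hence $\kappa=1+d/(2\sqrt{d-1}\,|r_{m}|)$.

```python
import math

def q(t, z, d):
    """q_t(z) under the assumed normalization (see lead-in)."""
    prev, cur = 1.0, 2.0 * math.sqrt((d - 1) / d) * z
    if t == 0:
        return prev
    for s in range(1, t):
        c = math.sqrt(d / (d - 1)) if s == 1 else 1.0
        prev, cur = cur, 2.0 * z * cur - c * prev
    return cur

def km_integral(f, d, n=2000):
    """Integral of f against the assumed rescaled Kesten-McKay density."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        th = h * (i + 0.5)
        z = math.cos(th)
        total += f(z) * math.sin(th) ** 2 / (d * d - 4.0 * (d - 1) * z * z)
    return total * h * 2.0 * d * (d - 1) / math.pi

def roots_of_q(m, d, grid=4001):
    """Roots of q_m in (-1, 1), decreasing, by sign scan plus bisection."""
    xs = [-1.0 + 2.0 * i / grid for i in range(grid + 1)]
    roots = []
    for a, b in zip(xs, xs[1:]):
        fa = q(m, a, d)
        if fa * q(m, b, d) < 0:
            for _ in range(80):
                mid = 0.5 * (a + b)
                fm = q(m, mid, d)
                if fa * fm <= 0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
    return sorted(roots, reverse=True)

def optimal_solution(d, girth):
    """Return (r_m, c_1, kappa) for s(z) proportional to prod_{j<m} (z - r_j)^2."""
    m = girth // 2
    r = roots_of_q(m, d)
    def s_raw(z):
        return math.prod((z - rj) ** 2 for rj in r[:-1])
    Z = km_integral(s_raw, d)                      # normalizer so <q_0, s> = 1
    c1 = km_integral(lambda z: q(1, z, d) * s_raw(z), d) / Z
    kappa = 1.0 - math.sqrt(d) / c1                # assumed edge-entry relation
    return r[-1], c1, kappa

r_m, c1, kappa = optimal_solution(d=3, girth=8)
```

For $d=3$ and girth $8$ the left-most root is $r_{4}=-\sqrt{3}/2$, so under these assumptions $\kappa\approx 2.22$, compared with the large-girth limit $1+d/(2\sqrt{d-1})\approx 2.06$.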

We end with a brief note on the above construction. Recall that our project for the last several pages has been to set the coefficients of non-backtracking paths of length $t$ in a feasible solution $P$ to the SDP (24),

[TABLE]

As discussed in the Appendix, this matrix can be translated into a degree-two pseudoexpectation $\tilde{\mathbb{E}}$ for the coloring problem: a linear operator that claims to give the joint distribution of colors at each pair of vertices $i$ and $j$. The reader will find there that $P_{ij}$ is related to the ‘pseudocorrelation’ between vertices $i$ and $j$, by

[TABLE]

Our expansion of $P$ in terms of non-backtracking paths means that, for most pairs $i,j$, this pseudoexpectation depends only on the shortest path distance $d(i,j)$. Specifically, whenever $d(i,j)=t\leq\gamma-2$ and the shortest path is unique, we have $P_{ij}=a_{t}$, and if $d(i,j)>\gamma-2$ then $P_{ij}=0$. One might think that in the limit of large $\gamma$, the optimal pseudoexpectation would make the natural choice that $a_{t}=(1-k)^{-t}$: in that case, the pseudocorrelation would decay just as if these shortest paths were colored uniformly at random, ignoring correlations with the remainder of the graph. However, a quick calculation shows that this choice is not optimal. Instead, the optimal coefficients we derive above cause the pseudocorrelation to decay more quickly with distance than this naïve guess, namely (in the limit of large $d$ and large girth) as $a_{t}\approx t(2(1-k))^{-t}$.

Acknowledgements

We are grateful to Charles Bordenave, Emmanuel Abbe, Amin Coja-Oghlan, Yash Deshpande, Marc Lelarge, and Alex Russell for helpful conversations. Part of this work was done while C.M. was visiting École Normale Supérieure. Part of this work was done while R.K. was a researcher at Microsoft Research New England. C.M. is supported by the John Templeton Foundation and the Army Research Office under grant W911NF-12-R-0012.

Appendix A Proof of Theorems 3 and 4

We prove Theorems 3 and 4 by directly simplifying the SDP that defines feasible degree-two pseudoexpectations. The first step is a broad result on the structure of these objects that applies to any set of constraints which includes the boolean (9) and single-color (10) constraints and is suitably symmetric; we then specialize to the coloring and partition problems.

Recall that a degree-two pseudoexpectation for a system of polynomials $f_{j}(\boldsymbol{x})=0$ is a linear operator $\tilde{\mathbb{E}}:\mathbb{R}[\boldsymbol{x}]_{\leq 2}\to\mathbb{R}$ which satisfies

•

$\tilde{\mathbb{E}}[1]=1$

•

$\tilde{\mathbb{E}}[f_{j}q]=0$ for any polynomials $f_{j}$ and $q$ such that $\deg f_{j}q\leq 2$

•

$\tilde{\mathbb{E}}[p^{2}]\geq 0$ for any polynomial $p$ with $\deg p^{2}\leq 2$

We can identify such objects with PSD $(nk+1)\times(nk+1)$ matrices of the form

[TABLE]

where $\ell_{i,c}=\tilde{\mathbb{E}}[x_{i,c}]$ and $\mathcal{E}_{(i,c),(j,c^{\prime})}=\tilde{\mathbb{E}}[x_{i,c}\,x_{j,c^{\prime}}]$. It is useful to think of $\mathcal{E}$ as a block matrix, with a $k\times k$ block $\mathcal{E}_{ij}$ corresponding to each pair of vertices $i,j$. Consistency with the boolean and single-color constraints (9), (10) then controls the diagonal elements and row and column sums of each of these blocks,

[TABLE]

Moreover, each of our constraints is fixed under permutations of the colors, and $\tilde{\mathbb{E}}$ inherits this symmetry. That is, the matrix carries with it a natural $S_{k}$ action that simultaneously permutes $\tilde{\mathbb{E}}[x_{i,c}]\to\tilde{\mathbb{E}}[x_{i,\sigma(c)}]$ and $\tilde{\mathbb{E}}[x_{i,c}\,x_{j,c^{\prime}}]\to\tilde{\mathbb{E}}[x_{i,\sigma(c)}\,x_{j,\sigma(c^{\prime})}]$. This action preserves the spectrum of $\tilde{\mathbb{E}}$ as a matrix, as well as every hard constraint. By convexity, we may assume that $\tilde{\mathbb{E}}$ is stabilized under it, by beginning with an arbitrary pseudoexpectation and averaging over its orbit.

This assumption substantially constrains and simplifies $\tilde{\mathbb{E}}$. In particular we are free to (i) assume that $\ell_{i,c}=\tilde{\mathbb{E}}[x_{i,c}]=1/k$ and (ii) assume that each $k\times k$ block in $\mathcal{E}$ has only two distinct values: one on the diagonal and the other off the diagonal. In other words, the pseudoexpectation claims that the marginal distribution of each vertex is uniform, and that the joint marginal of any two vertices depends only on the probability that they have the same or different colors. As a result, for each $i,j$ we can assume that $\mathcal{E}_{ij}$ is a linear combination of the identity matrix $\mathbbm{1}_{k}$ and the matrix $\mathbb{J}_{k}$ of all 1s, and that the row and column sums of $\mathcal{E}_{ij}$ are all $1/k$. In that case for each $i,j$ we can write

[TABLE]

for some $P_{ij}$ , or equivalently that

[TABLE]

for some $n\times n$ matrix $P$ . Note that

[TABLE]

so (37) requires that $P_{ii}=1$ for all $i$ .

Since the pseudoexpectation (36) consists of $\mathcal{E}$ with an additional row and column, we consider the following lemma. We leave its proof as an exercise for the reader.

Lemma 4.

For any matrix $X$ , vector $\boldsymbol{v}$ and scalar $b>0$ ,

[TABLE]

if and only if $X-(1/b){\boldsymbol{v}}\otimes{\boldsymbol{v}}\succeq 0$ .
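One way to see Lemma 4 is to complete the square in the quadratic form. The derivation below is a sketch that assumes the block matrix in the lemma has $b$ in the corner, $\boldsymbol{v}$ as the remaining first row and column, and a symmetric $X$ in the lower-right block; the test vector $(t,\boldsymbol{w})$ is notation introduced here.

```latex
\begin{aligned}
\begin{pmatrix} t \\ \boldsymbol{w} \end{pmatrix}^{\!\top}
\begin{pmatrix} b & \boldsymbol{v}^{\top} \\ \boldsymbol{v} & X \end{pmatrix}
\begin{pmatrix} t \\ \boldsymbol{w} \end{pmatrix}
&= b\,t^{2} + 2t\,\langle \boldsymbol{v}, \boldsymbol{w}\rangle
   + \boldsymbol{w}^{\top} X \boldsymbol{w} \\
&= b\Big(t + \tfrac{1}{b}\langle \boldsymbol{v}, \boldsymbol{w}\rangle\Big)^{2}
   + \boldsymbol{w}^{\top}\Big(X - \tfrac{1}{b}\,\boldsymbol{v}\otimes\boldsymbol{v}\Big)\boldsymbol{w} .
\end{aligned}
```

Since $b>0$, the first term is nonnegative and vanishes at $t=-\langle\boldsymbol{v},\boldsymbol{w}\rangle/b$, so the form is nonnegative for all $(t,\boldsymbol{w})$ exactly when the second term is nonnegative for all $\boldsymbol{w}$.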

Since $\boldsymbol{\ell}$ is the $nk$ -dimensional vector whose entries are all $1/k$ , we have $\boldsymbol{\ell}\otimes\boldsymbol{\ell}=\mathbb{J}_{nk}/k^{2}$ . Thus (40) and Lemma 4 imply that $\tilde{\mathbb{E}}\succeq 0$ if and only if

[TABLE]

Since $\mathbbm{1}_{k}-\mathbb{J}_{k}/k$ is a projection operator, this in turn occurs if and only if

[TABLE]

To summarize, finding a pseudoexpectation is equivalent to finding a PSD matrix $P\in\mathbb{R}^{n\times n}$ with $P_{ii}=1$ for all $i$ , such that $P$ remains PSD when we subtract the rank-one matrix $\mathbb{J}_{n}/k$ . However, we have thus far only reasoned about the boolean and single color constraints, and including either the coloring or cut constraint places an additional restriction on $P$ . In the case of coloring, we demanded that

[TABLE]

for every edge $(i,j)$ . This implies that $\operatorname{tr}\mathcal{E}_{ij}=0$ , and so $P_{ij}=0$ for each edge. Collecting these observations, a pseudoexpectation for coloring exists exactly when $k>\vartheta(\overline{G})$ , where

[TABLE]

Finally, note that $\mathbb{J}_{n}/\kappa=v\otimes v$ where $v=\boldsymbol{1}_{n}/\sqrt{\kappa}$. Applying Lemma 4 again then gives exactly the SDP (14) for the Lovász $\vartheta$ function, thus completing the proof of Theorem 3.

In the case of good partitions, we required that

[TABLE]

but this means that

[TABLE]

Following the path above, a degree-two pseudoexpectation exists for community detection when $k>\hat{\vartheta}_{\tau}(\overline{G})$ , where

[TABLE]

A priori, it seems that we may need to solve a different SDP for each value of $\tau$ , but a bit more work shows that this is not the case. Lemma 4 lets us transform the SDP (16) for $\hat{\vartheta}$ to the following problem,

[TABLE]

The following lemma then shows us how to relate optima of (45) to those of (44) for any $\tau$ in the disassortative range $\tau<1$ , thus completing the proof of Theorem 4.

Lemma 5.

For any $\tau<1$ ,

[TABLE]

Proof.

We show how to translate back and forth between solutions of (44) and (45). Given a matrix $P$ , define

[TABLE]

It is easy to check that $P_{ii}=1$ if and only if $(P_{\tau})_{ii}=1$ , and $\langle P_{\tau},A\rangle=(\tau/\kappa_{\tau})dn$ if and only if $\langle P,A\rangle=0$ . Finally, if we set

[TABLE]

then

[TABLE]

so $P_{\tau}-\mathbb{J}_{n}/\kappa_{\tau}\succeq 0$ if and only if $P-\mathbb{J}_{n}/\kappa\succeq 0$ . Thus (45) is feasible for $\kappa$ if and only if (44) is feasible for $\kappa_{\tau}$ . Since $\hat{\vartheta}(\overline{G})$ and $\hat{\vartheta}_{\tau}(\overline{G})$ are the smallest $\kappa$ and $\kappa_{\tau}$ respectively for which this is the case, (47) implies (46). ∎

Appendix B Proof of Lemma 3

It is immediate that there is such a quadrature rule for polynomials of degree strictly less than $t$, since the space of linear functionals on such polynomials has dimension $t$ and is thus spanned by the $t$ linearly independent functionals which evaluate at the roots $r_{i}$. Now let $\deg u<2t$. We can divide $u$ by $p_{t}$ to write $u(z)=a(z)p_{t}(z)+b(z)$ where $\deg a,\deg b<t$. We have

[TABLE]

since $p_{t}$ is orthogonal to all polynomials of degree less than $t$ and has roots $r_{i}$ . This verifies exactness of the quadrature rule for polynomials of degree smaller than $2t$ .

To show that the weights $\{\omega_{i}\}$ are positive, let $i\in\{1,\ldots,t\}$ and let $v_{i}(z)=(p_{t}(z)/(z-r_{i}))^{2}$ be the polynomial with double roots at every root of $p_{t}$ save $r_{i}$. Since $v_{i}$ is everywhere nonnegative and is a polynomial of degree $2t-2<2t$, we have

[TABLE]

but since $v_{i}(z)$ is nonnegative and not identically zero, $\omega_{i}$ must be positive.
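The positivity argument also gives a direct recipe for the weights: exactness on $v_{i}$ means $\omega_{i}$ equals the integral of $v_{i}$ divided by $v_{i}(r_{i})$. The sketch below illustrates this for the Chebyshev polynomials $U_{t}$ with the semicircle density $(2/\pi)\sqrt{1-z^{2}}$, which is the large-$d$ limit of the $q_{t}$ and the Kesten-McKay measure; this choice is made for illustration because the roots $\cos(i\pi/(t+1))$ are known in closed form, and the function names are hypothetical.

```python
import math

def semicircle_integral(f, n=2000):
    """Integral of f against (2/pi)*sqrt(1 - z^2) dz on [-1, 1], via z = cos(theta)."""
    h = math.pi / n
    return (2.0 / math.pi) * h * sum(
        f(math.cos(h * (i + 0.5))) * math.sin(h * (i + 0.5)) ** 2
        for i in range(n))

def quadrature_rule(t):
    """Nodes (roots of U_t) and weights computed from v_i = prod_{j != i} (z - r_j)^2."""
    nodes = [math.cos(math.pi * i / (t + 1)) for i in range(1, t + 1)]
    weights = []
    for i, ri in enumerate(nodes):
        others = nodes[:i] + nodes[i + 1:]
        def v(z, others=others):
            return math.prod((z - rj) ** 2 for rj in others)
        # v_i has degree 2t - 2 and vanishes at every node except r_i,
        # so exactness gives w_i * v_i(r_i) = integral of v_i.
        weights.append(semicircle_integral(v) / v(ri))
    return nodes, weights

nodes, weights = quadrature_rule(4)

def rule(u):
    return sum(w * u(r) for r, w in zip(nodes, weights))
```

With $t=4$, `rule` should reproduce `semicircle_integral` on every monomial $z^{k}$ with $k<8$, and all four weights are positive.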

Appendix C Proof of Lemma 2

The proof closely follows [7, Theorem 4] which shows that the chromatic number of $G_{n,d}$ is concentrated on two adjacent integers, and which is in turn based on the proof in [42] of two-point concentration for $G(n,p)$ with $p=O(n^{-5/6-\epsilon})$ . Recall the configuration model [59], where we make $d$ “copies” of each vertex corresponding to its half-edges, and then choose uniformly from all $(dn-1)!!=(dn)!/(2^{dn/2}(dn/2)!)$ perfect matchings of these copies. If we denote the set of such matchings by $\mathcal{P}_{n,d}$ and condition the corresponding multigraphs on having no self-loops or multiple edges, the resulting distribution is uniform on the set of $d$ -regular graphs, and occupies a constant fraction of the total probability of $\mathcal{P}_{n,d}$ . Thus any property which holds with high probability for $\mathcal{P}_{n,d}$ holds with high probability for $G_{n,d}$ as well.
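The configuration model is straightforward to simulate. The following is a minimal sketch, assuming that a uniform shuffle of the $dn$ half-edges followed by pairing consecutive entries yields a uniform perfect matching, and using rejection to condition on simplicity; the function names and the `max_tries` cap are hypothetical.

```python
import random

def sample_pairing(n, d, rng):
    """Uniform perfect matching on the d*n half-edges, as a list of vertex pairs."""
    stubs = [v for v in range(n) for _ in range(d)]   # d copies of each vertex
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

def is_simple(pairs):
    """True if the multigraph has no self-loops and no multiple edges."""
    seen = set()
    for a, b in pairs:
        e = (min(a, b), max(a, b))
        if a == b or e in seen:
            return False
        seen.add(e)
    return True

def sample_regular_graph(n, d, seed=0, max_tries=10000):
    """Rejection-sample pairings until the resulting multigraph is simple."""
    if (n * d) % 2:
        raise ValueError("n * d must be even")
    rng = random.Random(seed)
    for _ in range(max_tries):
        pairs = sample_pairing(n, d, rng)
        if is_simple(pairs):
            return sorted((min(a, b), max(a, b)) for a, b in pairs)
    raise RuntimeError("no simple pairing found")

edges = sample_regular_graph(n=10, d=3)
```

Rejection sampling is practical here precisely because, for constant $d$, simple graphs occupy a constant fraction of the pairing space.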

If $P,P^{\prime}$ are two perfect matchings in $\mathcal{P}_{n,d}$ , we write $P\sim P^{\prime}$ if they differ by a single swap, changing $\{(a,b),(c,d)\}$ to $\{(a,c),(b,d)\}$ or $\{(a,d),(b,c)\}$ . The following martingale inequality [59, Theorem 2.19] shows that a random variable which is Lipschitz with respect to these swaps is concentrated.

Lemma 6.

Let $c$ be a constant, and let $X$ be a random variable defined on $\mathcal{P}_{n,d}$ such that $|X(P)-X(P^{\prime})|\leq c$ whenever $P\sim P^{\prime}$ . Then

[TABLE]

Now fix $\theta$ , and define $X$ as the minimum number of edge constraints $P_{ij}=0$ in the SDP (14) violated by an otherwise feasible solution with $\kappa=\theta$ . This meets the Lipschitz condition with $c=2$ . By assumption $X=0$ with positive probability. Lemma 6 then implies that (say) $\mathbb{E}[X]\leq(1/2)\sqrt{n\log n}$ , in which case $X<\sqrt{n\log n}$ with high probability.

Let $S$ denote the set of endpoints of the violated edges. Then there is an orthogonal representation $\{u_{i}\}$ of the subgraph induced by $V\setminus S$ and a unit vector $\mathfrak{z}$ such that $\big{\langle}u_{i},\mathfrak{z}\big{\rangle}=1/\sqrt{\theta}$ and $\big{\langle}u_{i},u_{j}\big{\rangle}=0$ if $(i,j)\in E$ and $i,j\notin S$ . Our goal is to “fix” $\{u_{i}\}$ on the violated edges, and if necessary on some additional vertices, to give an orthogonal representation $\{v_{i}\}$ for all of $G$ .

As in [7, 42], we inductively build a set of vertices $S=U_{0},U_{1},\ldots,U_{T}=U$ as follows. Given $U_{t}$ , let $U_{t+1}=U_{t}\cup\{i,j\}$ where $i,j\notin U_{t}$ , $(i,j)\in E$ , and $i$ and $j$ each have at least one neighbor in $U_{t}$ . We define $T$ as the step at which there is no such pair $i,j$ and this process ends. Let $I$ denote $U$ ’s neighborhood, i.e., the set of vertices outside $U$ which have a neighbor in $U$ . Then $I$ is an independent set, since otherwise the process would have continued. We make the following claim:

Lemma 7.

With high probability, the subgraph induced by $U$ is 3-colorable.

Proof.

For all $0\leq t\leq T$ we have $|U_{t}|=2t+|S|$ . Moreover, the subgraph induced by $U_{t}$ has at least $3t+|S|/2=(3/2)|U_{t}|-|S|$ edges and thus average degree at least $3-2|S|/|U_{t}|$ . On the other hand, a crude union bound shows that for any $d$ and any $\beta>2$ , there is an $\alpha>0$ such that, with high probability, all induced subgraphs of $G$ containing $\alpha n$ or fewer vertices have average degree less than $\beta$ . Since $|S|=o(n)$ with high probability, this implies that $|U_{t}|\leq(2+o(1))|S|$ for all $t$ , and in particular that $|U|=o(n)$ .

The same union bound then implies that with high probability the subgraph induced by $U$, and all its subgraphs, have average degree less than $3$. But this means that this subgraph has no 3-core: that is, it has at least one vertex of degree less than 3, and so will the subgraph we get by deleting this vertex, and so on. Working backwards, we can 3-color the entire subgraph by starting with the empty set and adding these vertices back in, since at least one of the three colors will always be available to them. ∎
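Both steps of this argument are algorithmic, and can be sketched as follows. The first function grows $U$ from $S$ by the edge-adding process and returns its outer neighborhood $I$; the second 3-colors a graph with no 3-core by peeling low-degree vertices and coloring them in reverse order. Graphs are assumed to be given as adjacency dictionaries of simple graphs, and the function names are hypothetical.

```python
def close_set(adj, S):
    """Grow U from S by repeatedly adding an edge (i, j) outside U whose
    endpoints each have a neighbor in U; return U and its neighborhood I."""
    U = set(S)
    grew = True
    while grew:
        grew = False
        boundary = {v for u in U for v in adj[u]} - U
        for i in boundary:
            j = next((w for w in adj[i] if w in boundary), None)
            if j is not None:
                U |= {i, j}
                grew = True
                break
    I = {v for u in U for v in adj[u]} - U
    return U, I

def three_color_by_peeling(adj):
    """3-color a graph with no 3-core; return None if a 3-core is found."""
    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    order = []
    while alive:
        v = next((x for x in alive if deg[x] < 3), None)
        if v is None:
            return None                   # remaining vertices form a 3-core
        alive.remove(v)
        order.append(v)
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
    color = {}
    for v in reversed(order):
        # v had fewer than 3 neighbors still present when it was peeled,
        # and only those neighbors are colored at this point.
        used = {color[w] for w in adj[v] if w in color}
        color[v] = min({1, 2, 3} - used)
    return color

# A 6-cycle: starting from two antipodal vertices, the process absorbs everything.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
U, I = close_set(cycle, {0, 3})
coloring = three_color_by_peeling(cycle)
```

When the process stops, no two vertices of `I` are adjacent, which is the independence property used in the construction that follows.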

To define our orthogonal representation, let $w$ be a unit vector such that $\big{\langle}\mathfrak{z},w\big{\rangle}=\big{\langle}u_{i},w\big{\rangle}=0$ for all $i\notin S$ ; such a vector exists since $|S|\geq 2$ . Then define

[TABLE]

Then $|\mathfrak{z}^{\prime}|^{2}=1$ , and $\big{\langle}w,\mathfrak{z}^{\prime}\big{\rangle}=\big{\langle}u_{i},\mathfrak{z}^{\prime}\big{\rangle}=1/\sqrt{\theta+1}$ for all $i\notin S$ . Moreover, there exist three mutually orthogonal unit vectors $y_{1},y_{2},y_{3}$ such that $\big{\langle}y_{j},\mathfrak{z}^{\prime}\big{\rangle}=1/\sqrt{\theta+1}$ and $\big{\langle}y_{j},w\big{\rangle}=0$ for all $j\in\{1,2,3\}$ . This follows from the fact that the following matrix is PSD whenever $\theta\geq 3$ , in which case it can be realized as the Gram matrix of $\{y_{1},y_{2},y_{3},w,\mathfrak{z}^{\prime}\}$ :

[TABLE]

Finally, let $\sigma(i)\in\{1,2,3\}$ be a proper 3-coloring of the subgraph induced by $U$ . Then the following is an orthogonal representation of $G$ ,

[TABLE]

and $\big{\langle}v_{i},\mathfrak{z}^{\prime}\big{\rangle}=1/\sqrt{\theta+1}$ for all $i$ . This gives a feasible solution to the SDP (14) with $\kappa=\theta+1$ , implying that $\vartheta(\overline{G})\leq\theta+1$ .

Bibliography


[1] E. Abbe, A.S. Bandeira, and G. Hall. Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62(1):471–487, 2016.
[2] Emmanuel Abbe. Community detection and stochastic block models: recent developments. J. Machine Learning Research, 2017. To appear.
[3] Emmanuel Abbe and Colin Sandon. Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery. In Proc. 56th Annual Symposium on Foundations of Computer Science, FOCS, pages 670–688, 2015.
[4] Emmanuel Abbe and Colin Sandon. Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic BP, and the information-computation gap. arXiv preprint 1512.09080, 2015.
[5] Emmanuel Abbe and Colin Sandon. Achieving the KS threshold in the general stochastic block model with linearized acyclic belief propagation. In Proc. Neural Information Processing Systems (NIPS), pages 1334–1342, 2016.
[6] Emmanuel Abbe and Colin Sandon. Crossing the KS threshold in the stochastic block model with information theory. In IEEE Intl. Symp. on Information Theory, ISIT, pages 840–844, 2016.
[7] Dimitris Achlioptas and Cristopher Moore. The chromatic number of random regular graphs. In Proc. 8th International Workshop on Randomization and Computation (RANDOM), pages 219–228, 2004.
[8] Dimitris Achlioptas and Assaf Naor. The two possible values of the chromatic number of a random graph. Ann. Math., 162:1335–1351, 2005.
