Independent sets in the hypercube revisited

Matthew Jenssen; Will Perkins

arXiv:1907.00862·math.CO·February 10, 2022

TL;DR

This paper improves the asymptotic understanding of independent sets in the hypercube by combining combinatorial and statistical physics methods, providing sharper results and detailed structural insights.

Contribution

It introduces a novel combination of graph container methods with cluster expansion techniques to refine asymptotics and structural descriptions of independent sets in the hypercube.

Findings

1. Sharper asymptotic formulas for independent sets

2. Detailed probabilistic structure of typical independent sets

3. Answers to open questions posed by Galvin

Abstract

We revisit Sapozhenko's classic proof on the asymptotics of the number of independent sets in the discrete hypercube $\{0,1\}^{d}$ and Galvin's follow-up work on weighted independent sets. We combine Sapozhenko's graph container methods with the cluster expansion and abstract polymer models, two tools from statistical physics, to obtain considerably sharper asymptotics and detailed probabilistic information about the typical structure of (weighted) independent sets in the hypercube. These results refine those of Korshunov and Sapozhenko and Galvin, and answer several questions of Galvin.

Equations

$$i(Q_{d})=(1+o(1))\cdot 2\sqrt{e}\cdot 2^{2^{d-1}}$$

$$i(Q_{d})=2\sqrt{e}\cdot 2^{2^{d-1}}\left(1+\frac{3d^{2}-3d-2}{8\cdot 2^{d}}+\frac{243d^{4}-646d^{3}-33d^{2}+436d+76}{384\cdot 2^{2d}}+O\left(d^{6}\cdot 2^{-3d}\right)\right)$$

$$e^{-st2^{-1/t}}\,2^{2-2/t-t}\,(2^{1/t}-1)^{t}\sum_{T}|\mathrm{Aut}(T)|^{-1}$$

$$X_{T}\Rightarrow\mathrm{Pois}(\rho)$$

$$\tilde{X}_{T}=\frac{X_{T}-m_{T}}{\sigma_{T}}\Rightarrow N(0,1)$$

$$\frac{p(X)}{p(\emptyset)}=\frac{\lambda^{|X|}}{(1+\lambda)^{|N(X)|}}$$

$$\sum_{S^{\prime}\nsim S}|w(S^{\prime})|e^{f(S^{\prime})+g(S^{\prime})}$$

$$\sum_{\Gamma\in\mathcal{C},\ \Gamma\nsim S}|w(\Gamma)|e^{g(\Gamma)}\leq f(S)$$

$$\mathcal{G}(a,b)=\{A\subseteq\mathcal{E}:A\text{ is }2\text{-linked},\ |[A]|=a,\ |N(A)|=b\}$$

$$\sum_{A\in\mathcal{G}(a,b)}\frac{\lambda^{|A|}}{(1+\lambda)^{b}}\leq 2^{d}\exp\left(-\frac{C_{1}(b-a)\log d}{d^{2/3}}\right)$$

$$w(S)=\frac{\lambda^{|S|}}{(1+\lambda)^{|N(S)|}}$$

$$\log Z(\lambda)-\log\left[2(1+\lambda)^{2^{d-1}}\Xi\right]=O\left(\exp(-2^{d}/d^{4})\right)$$

$$\|\hat{\mu}-\mu\|_{TV}=O\left(\exp(-2^{d}/d^{4})\right)$$

Full text

Matthew Jenssen

School of Mathematics, University of Birmingham, Birmingham, UK

and

Will Perkins

Department of Mathematics, Statistics, and Computer Science

University of Illinois at Chicago

851 S. Morgan, Chicago, IL

Key words and phrases:

independent sets, hypercube, cluster expansion

1991 Mathematics Subject Classification:

05C30, 05C31, 82B20

1. Introduction

Let $Q_{d}$ denote the discrete hypercube of dimension $d$: the graph with vertex set $\{0,1\}^{d}$ and edges between vectors that differ in exactly one coordinate. An independent set in a graph $G$ is a set of vertices that induces no edges. Let $i(G)$ denote the number of independent sets of $G$.

Korshunov and Sapozhenko proved the following result on the number of independent sets of the hypercube.

Theorem 1 (Korshunov and Sapozhenko [17]).

$$i(Q_{d})=(1+o(1))\cdot 2\sqrt{e}\cdot 2^{2^{d-1}}$$

as $d\to\infty$ .

A beautiful and influential proof of Theorem 1 was later given by Sapozhenko in [22]. See [8] for an exposition of this proof.
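For small $d$ the count $i(Q_{d})$ can be checked by brute force. The sketch below sums over subsets $A$ of the even side; each such $A$ extends to an independent set by adding any subset of the odd vertices outside $N(A)$. The function names are illustrative and not taken from the paper, and the enumeration is only feasible for roughly $d\leq 5$. The comparison value is the leading term $2\sqrt{e}\cdot 2^{2^{d-1}}$ of Theorem 1, which is only expected to be accurate as $d$ grows.

```python
import math


def popcount(x):
    """Number of 1-bits, i.e. the coordinate sum of a hypercube vertex."""
    return bin(x).count("1")


def count_independent_sets(d):
    """Exact i(Q_d) by enumerating subsets of the even side (small d only)."""
    evens = [v for v in range(2 ** d) if popcount(v) % 2 == 0]
    n_odd = 2 ** (d - 1)
    # Bitmask of the odd neighbours of each even vertex (flip one coordinate).
    nbr_mask = [sum(1 << (v ^ (1 << i)) for i in range(d)) for v in evens]
    total = 0
    for subset in range(2 ** len(evens)):
        blocked = 0
        for j, mask in enumerate(nbr_mask):
            if subset >> j & 1:
                blocked |= mask
        # Any subset of the unblocked odd vertices completes an independent set.
        total += 2 ** (n_odd - popcount(blocked))
    return total


def leading_term(d):
    """Leading term 2 * sqrt(e) * 2^(2^(d-1)) from Theorem 1."""
    return 2 * math.sqrt(math.e) * 2.0 ** (2 ** (d - 1))


for d in range(1, 5):
    exact = count_independent_sets(d)
    print(d, exact, exact / leading_term(d))
```

The ratio printed in the last column shows how far the exact count is from the leading term at these very small dimensions.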

One of our main results in this paper will be to reinterpret Sapozhenko’s proof in terms of the cluster expansion from statistical physics. This allows us to compute additional terms in the asymptotic expansion of $i(Q_{d})$ among other things. For instance, we can compute the asymptotics to the third order in $2^{-d}$ .

Theorem 2.

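$$i(Q_{d})=2\sqrt{e}\cdot 2^{2^{d-1}}\left(1+\frac{3d^{2}-3d-2}{8\cdot 2^{d}}+\frac{243d^{4}-646d^{3}-33d^{2}+436d+76}{384\cdot 2^{2d}}+O\left(d^{6}\cdot 2^{-3d}\right)\right)$$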

as $d\to\infty$ .

More generally, we give a formula and an algorithm for computing the asymptotics to arbitrary order in $2^{-d}$ .
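To get a feel for the size of the correction terms in Theorem 2, the following sketch evaluates the two explicit terms $\frac{3d^{2}-3d-2}{8\cdot 2^{d}}$ and $\frac{243d^{4}-646d^{3}-33d^{2}+436d+76}{384\cdot 2^{2d}}$ exactly with rational arithmetic. The helper names are hypothetical, the $O(d^{6}2^{-3d})$ remainder is simply dropped, and the floating-point estimate is only usable for $d\leq 10$ before the power of two overflows.

```python
from fractions import Fraction
import math


def correction_terms(d):
    """The two explicit correction terms of Theorem 2 as exact fractions."""
    first = Fraction(3 * d**2 - 3 * d - 2, 8 * 2**d)
    second = Fraction(
        243 * d**4 - 646 * d**3 - 33 * d**2 + 436 * d + 76,
        384 * 2 ** (2 * d),
    )
    return first, second


def theorem2_estimate(d):
    """Truncated expansion; the O(d^6 2^(-3d)) remainder is ignored.

    Uses floats, so it overflows once 2^(d-1) exceeds about 1023 (d >= 11).
    """
    first, second = correction_terms(d)
    leading = 2 * math.sqrt(math.e) * 2.0 ** (2 ** (d - 1))
    return leading * float(1 + first + second)


for d in (5, 8, 10):
    first, second = correction_terms(d)
    print(d, float(first), float(second))
```

Both terms shrink quickly with $d$, which is consistent with the expansion being asymptotic in $2^{-d}$ rather than accurate at very small dimensions.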

Theorem 1 (along with Sapozhenko’s techniques) provided the first glimpse of a rich landscape of phenomena concerning independent sets in $Q_{d}$ . To describe the phenomena we take the perspective of statistical physics. The independence polynomial of the hypercube is

$$Z(\lambda)=\sum_{I\in\mathcal{I}(Q_{d})}\lambda^{|I|},$$

where $\mathcal{I}(Q_{d})$ is the set of all independent sets of $Q_{d}$ . In particular, $Z(1)=i(Q_{d})$ . The independence polynomial is the partition function of the hard-core model from statistical physics: a probability distribution on independent sets weighted by the fugacity parameter $\lambda$ . This distribution is defined by

$$\mu(I)=\frac{\lambda^{|I|}}{Z(\lambda)}.$$

The hard-core model (or hard-core lattice gas) is a simple model of a gas, and in statistical physics it is most commonly studied on the integer lattice $\mathbb{Z}^{d}$ . As is common in the literature, we will refer to vertices contained in an independent set drawn from the hard-core model as ‘occupied’.
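As a concrete illustration, the independence polynomial of $Q_{d}$ can be computed coefficient by coefficient for small $d$. The sketch below assumes the standard definitions $Z(\lambda)=\sum_{I}\lambda^{|I|}$ and $\mu(I)=\lambda^{|I|}/Z(\lambda)$; the function names are illustrative only, and the enumeration is feasible only for small $d$.

```python
from math import comb


def popcount(x):
    return bin(x).count("1")


def independence_polynomial(d):
    """coeffs[k] = number of independent sets of size k in Q_d (small d only)."""
    evens = [v for v in range(2 ** d) if popcount(v) % 2 == 0]
    n_odd = 2 ** (d - 1)
    nbr_mask = [sum(1 << (v ^ (1 << i)) for i in range(d)) for v in evens]
    coeffs = [0] * (2 ** d + 1)
    for subset in range(2 ** len(evens)):
        blocked = 0
        for j, mask in enumerate(nbr_mask):
            if subset >> j & 1:
                blocked |= mask
        a = popcount(subset)                 # occupied even vertices
        free = n_odd - popcount(blocked)     # odd vertices still available
        for j in range(free + 1):
            coeffs[a + j] += comb(free, j)
    return coeffs


def partition_function(d, lam):
    """Z(lambda) for the hard-core model on Q_d."""
    return sum(c * lam ** k for k, c in enumerate(independence_polynomial(d)))


def hard_core_probability(d, lam, size):
    """Probability mu(I) of one fixed independent set I with |I| = size."""
    return lam ** size / partition_function(d, lam)


print(independence_polynomial(2))    # coefficients for the 4-cycle Q_2
print(partition_function(3, 1))      # Z(1) = i(Q_3)
```

For $Q_{2}$, which is a $4$-cycle, the polynomial is $1+4\lambda+2\lambda^{2}$, and setting $\lambda=1$ recovers the count of independent sets.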

Let $\mathcal{E}\subset V(Q_{d})$ be the set of ‘even’ vertices of the hypercube whose coordinates sum to an even number and let $\mathcal{O}\subset V(Q_{d})$ be the ‘odd’ vertices whose coordinates sum to an odd number. We note that $Q_{d}$ is a bipartite graph with bipartition $(\mathcal{E},\mathcal{O})$ . Kahn [15] showed that for constant $\lambda$ , typical independent sets drawn from $\mu$ contain either mostly even vertices or mostly odd vertices, and thus the hard-core model on $Q_{d}$ exhibits a kind of ‘phase coexistence’ in the language of statistical physics.

By generalizing Sapozhenko’s techniques, Galvin [7] was able to describe the typical structure of independent sets drawn from $\mu$ in greater detail and for a wider range of parameters $\lambda$ . We need two definitions to describe these results.

Definition 3.

For an independent set $I\in\mathcal{I}(Q_{d})$ , we say $\mathcal{E}$ is the minority side of the bipartition if $|\mathcal{E}\cap I|<|\mathcal{O}\cap I|$ and the majority side otherwise. If $\mathcal{E}$ is the minority side, then $\mathcal{O}$ is the majority side and vice versa.

Definition 4.

A set $S\subseteq\mathcal{E}$ (or $\mathcal{O}$ ), is $2$ -linked if the subgraph of $Q_{d}$ induced by the vertex set $S\cup N(S)$ is connected; in other words, $S$ is connected in the graph $Q_{d}^{2}$ (the square of the graph $Q_{d}$ ).
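Definition 4 can be tested mechanically: for vertices on the same side of the bipartition, being adjacent in $Q_{d}^{2}$ means being at Hamming distance exactly $2$. The following sketch encodes vertices as integers and checks connectivity in $Q_{d}^{2}$ by a depth-first search; the encoding and function names are assumptions made for illustration.

```python
def popcount(x):
    return bin(x).count("1")


def is_two_linked(S):
    """True if the same-parity vertex set S is connected in Q_d^2.

    Vertices are integers whose binary digits are the coordinates.
    The empty set is treated as 2-linked by convention here.
    """
    S = set(S)
    if not S:
        return True
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in S:
            # Same-parity vertices at Hamming distance 2 share a neighbour.
            if v not in seen and popcount(u ^ v) <= 2:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(S)


print(is_two_linked({0b0000, 0b0011, 0b1111}))  # chain of distance-2 steps
print(is_two_linked({0b0000, 0b1111}))          # distance 4, no common neighbour
```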

Galvin showed that for the hard-core model on $Q_{d}$ at fugacity $\lambda=1+s/d$ with $s$ constant, the number of occupied vertices on the minority side is asymptotically distributed as a Poisson random variable with mean $e^{-s/2}/2$, and with high probability (whp) all $2$-linked components of occupied vertices on the minority side are of size $1$ [7, Theorem 1.4]. He conjectured that there is in fact a series of thresholds at which $2$-linked components of size $t$ emerge in a Poisson fashion, and posed as an open problem the determination of the distribution of occupied $2$-linked components of size $t$ on the minority side for all $t$.

Here we prove his conjecture and answer his question in a strong form. We show that the emergence of a $2$ -linked occupied component of size $t$ on the minority side has a sharp threshold at $\lambda_{t}=2^{1/t}-1$ and we identify precisely the scaling window about this threshold. We also essentially determine the asymptotic joint distribution of the number of such components (see Theorem 6).

Theorem 5.

For $t\geq 1$ fixed, let

$$\lambda_{t}(d)=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{s(d)}{d}.$$

Then for the hard-core model on $Q_{d}$ at fugacity $\lambda_{t}(d)$ :

• if $s(d)\to\infty$ as $d\to\infty$ then whp there are no $2$-linked occupied components of size $t$ on the minority side;

• if $s(d)\to-\infty$ then whp there are $2$-linked occupied components of size $t$ on the minority side;

• if $s(d)$ tends to a constant $s$ then the distribution of the number of $2$-linked occupied components of size $t$ on the minority side converges to a Poisson distribution with mean

$$e^{-st2^{-1/t}}\,2^{2-2/t-t}\,(2^{1/t}-1)^{t}\sum_{T}|\mathrm{Aut}(T)|^{-1},$$

where the sum is over all trees $T$ on $t$ vertices and $\text{Aut}(T)$ denotes the automorphism group of the tree $T$.
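The limiting Poisson mean $e^{-st2^{-1/t}}2^{2-2/t-t}(2^{1/t}-1)^{t}\sum_{T}|\mathrm{Aut}(T)|^{-1}$ is easy to evaluate numerically. The sketch below assumes that the sum runs over trees up to isomorphism; under that reading each tree $T$ has $t!/|\mathrm{Aut}(T)|$ labellings, so by Cayley's formula the sum equals $t^{t-2}/t!$. The function names are hypothetical.

```python
import math


def sum_inverse_automorphisms(t):
    """Sum of 1/|Aut(T)| over unlabelled trees on t vertices.

    Assumption: the sum is over isomorphism classes, so it equals the number
    of labelled trees, t^(t-2), divided by t!.
    """
    return t ** (t - 2) / math.factorial(t)


def poisson_mean(t, s):
    """Limiting mean for 2-linked components of size t when s(d) -> s."""
    return (
        math.exp(-s * t * 2 ** (-1 / t))
        * 2 ** (2 - 2 / t - t)
        * (2 ** (1 / t) - 1) ** t
        * sum_inverse_automorphisms(t)
    )


for t in (1, 2, 3):
    print(t, poisson_mean(t, 0.0))
```

For $t=1$ the expression reduces to $e^{-s/2}/2$, which matches the mean in Galvin's result for isolated occupied vertices at fugacity $1+s/d$.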

In fact we prove much more detailed probabilistic results. Define the defect type of a $2$ -linked component $S$ of $\mathcal{E}$ or $\mathcal{O}$ to be the isomorphism class of the induced subgraph $Q_{d}^{2}[S]$ . In particular there is a unique defect type of size $1$ (an isolated vertex), a unique defect type of size $2$ (two vertices at distance $2$ in $Q_{d}$ ), but two defect types of size $3$ : $3$ vertices whose distance- $2$ graph forms a clique and $3$ vertices whose distance- $2$ graph forms a path. For a given defect type $T$ , let $X_{T}$ be the random variable that counts the number of $2$ -linked occupied components of type $T$ on the minority side in the hard-core model on $Q_{d}$ . Let $m_{T}=\mathbb{E}X_{T}$ and $\sigma^{2}_{T}=\text{var}(X_{T})$ (for the asymptotics of $m_{T}$ and $\sigma^{2}_{T}$ see Lemma 19 and Corollary 21 below).
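The two defect types of size $3$ can be told apart by counting pairs at Hamming distance $2$. The sketch below handles only the size-$3$ case described above and is not a general isomorphism test; the function name and return labels are illustrative.

```python
def popcount(x):
    return bin(x).count("1")


def defect_type_size3(S):
    """Classify three same-parity vertices of Q_d by their distance-2 graph.

    Returns "clique" if all three pairs are at distance 2, "path" if exactly
    two pairs are, and None if the set is not 2-linked.
    """
    S = sorted(set(S))
    if len(S) != 3:
        raise ValueError("expected three distinct vertices")
    edges = sum(
        1
        for i in range(3)
        for j in range(i + 1, 3)
        if popcount(S[i] ^ S[j]) == 2
    )
    if edges == 3:
        return "clique"
    if edges == 2:
        return "path"
    return None


print(defect_type_size3({0b000, 0b011, 0b101}))    # pairwise distance 2
print(defect_type_size3({0b0000, 0b0011, 0b1111})) # end points at distance 4
```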

We determine the limiting distribution of the number of each type of defect and show that the number of defects of different types are asymptotically independent.

Theorem 6.

There is a constant $C_{0}>0$ such that if $\lambda\geq C_{0}\log d/d^{1/3}$ and $T$ is a defect type then the following holds. If $T$ and $\lambda$ are such that $m_{T}\to\rho$ as $d\to\infty$ for some constant $\rho>0$ , then

$$X_{T}\Rightarrow\mathrm{Pois}(\rho),$$

where ‘ $\Rightarrow$ ’ denotes convergence in distribution. If $T$ and $\lambda$ are such that $m_{T}\to\infty$ as $d\to\infty$ , then

$$\tilde{X}_{T}=\frac{X_{T}-m_{T}}{\sigma_{T}}\Rightarrow N(0,1).$$

Moreover, suppose we have two finite sets of defect types $\mathcal{T}_{1},\mathcal{T}_{2}$ so that for each $T\in\mathcal{T}_{1}$ , there exists $\rho_{T}>0$ so that $m_{T}\to\rho_{T}$ , and for each $T\in\mathcal{T}_{2}$ , $m_{T}\to\infty$ . Then the collection of random variables $\{X_{T}\}_{T\in\mathcal{T}_{1}}\cup\{\tilde{X}_{T}\}_{T\in\mathcal{T}_{2}}$ converges in distribution to a collection of independent Poisson and standard normal random variables.

We remark that the condition that $\lambda\geq C_{0}\log d/d^{1/3}$ is a technical requirement of a container lemma due to Galvin which is a key ingredient in our proofs (see Lemma 11 below). We expect that Theorem 6 in fact extends to the range $\lambda>(1+\Omega(1))\log d/d$ .

There is a close connection between computing accurate estimates of the partition function and deriving probabilistic information about the hard-core model. As a key step in proving his probabilistic results, Galvin gave a significant generalization of Theorem 1 to counting weighted independent sets in the hypercube; that is, computing the asymptotics of $Z(\lambda)$ for general $\lambda$ .

Theorem 7 (Galvin [7]).

For $\lambda\geq\sqrt{2}-1+\frac{(\sqrt{2}+\Omega(1))\log d}{d}$ ,

[TABLE]

Moreover, there is a constant $C_{0}>0$ so that for $\lambda\geq C_{0}\log d/d^{1/3}$ ,

[TABLE]

The formula (1) generalises Theorem 1, and determines the asymptotics of $Z(\lambda)$ for $\lambda>\sqrt{2}-1$ , while the formula (2) finds the asymptotics of $\log Z(\lambda)$ for $\lambda=\Omega(\log d/d^{1/3})$ .

Our techniques based on the cluster expansion will allow us to sharpen Theorem 7 considerably: we find a formula that can be used not only to determine the asymptotics of $Z(\lambda)$ for all constant $\lambda$ but also to give an expansion of $\log Z(\lambda)$ to arbitrary order in $2^{-d}$ .

To write the formula we need some notation that comes from polymer models in the statistical mechanics of lattice systems [18]. Before we introduce these notions formally, we describe some of the intuition underlying the proof of Theorem 2 and the results to come. An immediate lower bound on $Z(\lambda)$ of $2(1+\lambda)^{2^{d-1}}-1$ comes by considering the contribution from independent sets which lie entirely in one side of the bipartition of $Q_{d}$ . We call the collection of independent sets which lie entirely in $\mathcal{E}$ (or $\mathcal{O}$ ) the even (odd) ground state. Taking $\lambda=1$ , for example, there is a constant factor gap between this trivial lower bound $i(Q_{d})\geq 2\cdot 2^{2^{d-1}}-1$ and the correct asymptotics of Theorem 1. Therefore a constant proportion of independent sets do not belong to a ground state. However, almost all independent sets are very close to a ground state independent set. Thus it is natural to describe independent sets in terms of their deviations from a ground state: given a subset $X\subseteq\mathcal{E}$ , let $p(X)$ denote the probability that an independent set $I$ chosen according to $\mu$ satisfies $I\cap\mathcal{E}=X$ . When $X$ is small, we think of it as a deviation from the odd ground state and note that the relative ‘cost’ of such a deviation is

$$\frac{p(X)}{p(\emptyset)}=\frac{\lambda^{|X|}}{(1+\lambda)^{|N(X)|}}.\tag{3}$$

We denote this cost, or weight, of a deviation $X$ by $w(X)$ . Crucially, the weight $w(X)$ factorises over the $2$ -linked components of $X$ , and so we define an even polymer to be any $2$ -linked subset of $\mathcal{E}$ , and define its weight by (3). We define odd polymers similarly.

The language of polymer models allows us to relate the partition function $Z(\lambda)$ to the partition function of a (multivariate) hard-core model on an auxiliary graph whose vertices are polymers and each polymer $S$ has its associated weight $w(S)$ as its fugacity. A key feature of this transformation is that while at large $\lambda$ an independent set drawn from $\mu_{Q_{d},\lambda}$ is typically very structured, the corresponding deviations on the minority side are typically unstructured and behave almost independently. Using the cluster expansion, we are then able to extract almost complete probabilistic information from our model. In particular it allows us to precisely quantify the contribution to $Z(\lambda)$ from small deviations, and allows us to compute $\log Z(\lambda)$ to essentially arbitrary accuracy.

The cluster expansion is a powerful and classical tool in the rigorous study of statistical mechanics. In our context, it is the multivariate Taylor expansion of the logarithm of the partition function of our auxiliary hard-core model. Studying this infinite series naturally leads to the question of convergence. Verifying the convergence of the cluster expansion amounts to showing that the number of polymers of a given weight is not too large. This is where the container method of Sapozhenko comes in. In fact, all of the ingredients needed to show that this polymer model has a convergent cluster expansion are already present in Sapozhenko’s work and Galvin’s extensions. In some sense Sapozhenko rediscovered the concept of a polymer model and computed the smallest order terms of the cluster expansion by hand. Certainly the intuition behind the specific polymer model is clear in his work.

We now venture to make some of the above mentioned notions more concrete. Recall that an even/odd polymer is a $2$ -linked subset of $\mathcal{E}/\mathcal{O}$ respectively. The size of a polymer $S$ , $|S|$ , is the number of vertices in $S$ . Since $Q_{d}$ exhibits symmetry between $\mathcal{E}$ and $\mathcal{O}$ we will restrict our attention to even polymers. We say two even polymers $S_{1},S_{2}$ are compatible if $d_{G}(S_{1},S_{2})>2$ ; that is if $S_{1}\cup S_{2}$ is not $2$ -linked. Otherwise $S_{1}$ and $S_{2}$ are incompatible (and note that each polymer is incompatible with itself). For a tuple $\Gamma$ of even polymers, the incompatibility graph, $H(\Gamma)$ , is the graph with vertex set $\Gamma$ and an edge between any two incompatible polymers. An even cluster $\Gamma$ is an ordered tuple of even polymers so that $H(\Gamma)$ is connected. The size of a cluster $\Gamma$ is $\|\Gamma\|=\sum_{S\in\Gamma}|S|$ . Let $\mathcal{C}$ be the set of all even clusters and $\mathcal{C}_{k}$ the set of all even clusters of size $k$ .

Recall that for a polymer $S$ , we define its weight to be $w(S)=\lambda^{|S|}(1+\lambda)^{-|N(S)|}\,.$ For a cluster $\Gamma$ we define

$$w(\Gamma)=\phi(H(\Gamma))\prod_{S\in\Gamma}w(S),$$

where $\phi(H)$ is the Ursell function of a graph $H$ , defined by

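$$\phi(H)=\frac{1}{|V(H)|!}\sum_{\substack{A\subseteq E(H)\\ \text{spanning, connected}}}(-1)^{|A|}.\tag{4}$$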
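For small graphs the Ursell function can be evaluated directly. The sketch below assumes the standard normalisation $\phi(H)=\frac{1}{|V(H)|!}\sum_{A}(-1)^{|A|}$, where $A$ ranges over edge subsets of $H$ that form a connected spanning subgraph; the function names are illustrative and the enumeration is exponential in the number of edges.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def spans_connected(n, edges):
    """True if the edge set connects all of the vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


def ursell(n, edges):
    """Ursell function of a graph on vertices 0..n-1 (n >= 1), by brute force."""
    total = 0
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            if spans_connected(n, subset):
                total += (-1) ** r
    return Fraction(total, factorial(n))


print(ursell(1, []))                        # single vertex
print(ursell(2, [(0, 1)]))                  # single edge
print(ursell(3, [(0, 1), (1, 2), (0, 2)]))  # triangle
```

In a cluster, the vertices of $H(\Gamma)$ are the positions of the tuple $\Gamma$, so a polymer that appears twice contributes two vertices joined by an edge.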

Finally for $k\geq 1$ we define

$$L_{k}=\sum_{\Gamma\in\mathcal{C}_{k}}w(\Gamma).$$

Note that by symmetry $L_{k}$ would be identical if we had considered odd polymers and odd clusters instead.

We can now state our formula for $Z(\lambda)$ .

Theorem 8.

Suppose $\lambda\geq C_{0}\log d/d^{1/3}$ and $\lambda$ is bounded as $d\to\infty$ . Then for all fixed $k\geq 1$ ,

$$Z(\lambda)=2(1+\lambda)^{2^{d-1}}\exp\left(\sum_{j=1}^{k}L_{j}+\varepsilon_{k}\right),$$

where for each fixed $k$ , $|\varepsilon_{k}|=O\left(\frac{2^{d}\lambda^{k+1}d^{2k}}{(1+\lambda)^{d(k+1)}}\right)$ as $d\to\infty$ . Moreover, $L_{k}$ can be computed in time $e^{O(k\log k)}$ .

In fact, it is not essential that $\lambda$ remain bounded as $d\to\infty$: a similar formula holds for all values of $\lambda$ with an addition of $\exp(-2^{d}/d^{4})$ to $\varepsilon_{k}$, but for simplicity here we focus on the more interesting cases when $\lambda$ is bounded or tends to $0$.

As a quick check, note that at $\lambda=1$ , $L_{1}=1/2$ since there are $2^{d-1}$ polymers of size $1$ and each has weight $2^{-d}$ . Moreover, $\varepsilon_{2}=O(d^{4}2^{-2d})=o(1)$ and so Theorem 8 implies that $i(Q_{d})=2\cdot 2^{2^{d-1}}e^{1/2+o(1)}$ , recovering Theorem 1.
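The same computation works at general fugacity: there are $2^{d-1}$ polymers of size $1$, each with weight $\lambda/(1+\lambda)^{d}$, and a cluster consisting of a single polymer has Ursell function $1$, so $L_{1}=2^{d-1}\lambda/(1+\lambda)^{d}$. The sketch below evaluates this and the resulting first-order estimate $2(1+\lambda)^{2^{d-1}}e^{L_{1}}$; the function names are hypothetical, and the estimate is only the leading behaviour in the regime where the higher terms are negligible.

```python
import math


def L1(d, lam):
    """First cluster expansion term: 2^(d-1) single-vertex polymers."""
    return 2 ** (d - 1) * lam / (1 + lam) ** d


def first_order_estimate(d, lam):
    """2 (1 + lam)^(2^(d-1)) exp(L_1); higher-order terms are ignored."""
    return 2 * (1 + lam) ** (2 ** (d - 1)) * math.exp(L1(d, lam))


print(L1(10, 1))                   # independent of d when lam = 1
print(first_order_estimate(4, 1))  # compare with the exact count for Q_4
```

At $\lambda=1$ the value of $L_{1}$ does not depend on $d$, which is why the constant $\sqrt{e}$ appears in Theorem 1.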

More generally, Theorem 8 extends Theorem 7. For instance, we can give a closed-form formula for the asymptotics of $Z(\lambda)$ for any constant $\lambda$ .

Corollary 9.

For any fixed $t\geq 1$ and for $\lambda\geq 2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{\omega(1)}{d}$ ,

[TABLE]

For example, if $\lambda\geq 2^{1/3}-1+\frac{2^{7/3}\log d}{3d}+\frac{\omega(1)}{d}$ , then

[TABLE]

We find it rather remarkable how well the two tools from statistical physics, polymer models and the cluster expansion, work with the graph container method, and we expect many further applications of this combination of methods. See [21] for a survey of the graph container method. In forthcoming work, Keevash and the first author [13] apply this combination of methods to resolve conjectures of Galvin and Engbers [5] and Kahn and Park [16] on the number of $q$ -colourings of $Q_{d}$ . As a future research direction, we ask whether these statistical physics tools can be used in conjunction with the method of hypergraph containers [1, 23] to derive finer asymptotics and probabilistic information in some of the many extremal combinatorics problems in which hypergraph containers have been deployed.

The paper is organised as follows: We introduce abstract polymer models and the cluster expansion in Section 2, and then specialise to the hypercube and prove Theorem 8 in Section 3. We prove the probabilistic results of Theorems 5 and 6 in Section 4. We explicitly compute $L_{1},L_{2}$ , and $L_{3}$ and prove Theorem 2 in Section 5.

Related work

As Galvin remarked in [7], only a few properties of the hypercube $Q_{d}$ are needed in deriving Theorem 7; the same is true for Theorems 6 and 8. The essential properties are that the graph be bipartite and that some isoperimetric estimates hold (of the form of Lemma 12 below). In fact, using an approach to approximate counting based on the cluster expansion [12, 14], one could obtain efficient algorithms to approximate the partition function $Z_{G}(\lambda)$ and to sample from the hard-core model for a class of graphs with these properties. The polymer models used in [14, 20, 3] to sample from the hard-core model on random regular bipartite graphs are very similar to the ones used here. For a similar class of bipartite graphs Galvin and Tetali [10] showed that the Glauber dynamics Markov chain for sampling from the hard-core model exhibits slow mixing; that proof is also based on extending the ideas of Sapozhenko.

2. Polymer models and the cluster expansion

Here we introduce the main tools we will use, abstract polymer models [11, 18] and the cluster expansion, both tools from statistical physics that have been used extensively to study phase diagrams of lattice spin models. We have already encountered the terms ‘polymer’ and ‘cluster’ in the previous section. Indeed, the polymers from the introduction are concrete examples of a more general notion which we introduce now.

Let $\mathcal{P}$ be a finite set whose elements we call ‘polymers’. We equip $\mathcal{P}$ with a complex-valued weight $w(S)$ for each polymer $S$ as well as a symmetric and reflexive incompatibility relation between polymers. We write $S\nsim S^{\prime}$ if polymers $S$ and $S^{\prime}$ are incompatible. Let $\Omega$ be the collection of pairwise compatible sets of polymers from $\mathcal{P}$ , including the empty set of polymers. Then the polymer model partition function is

$$\Xi(\mathcal{P})=\sum_{\Gamma\in\Omega}\prod_{S\in\Gamma}w(S),$$

where the contribution from the empty set is $1$ .
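For a toy polymer model the partition function can be computed by listing all pairwise compatible sets of polymers. The sketch below does exactly that; the polymer labels, weights, and the `incompatible` predicate are made up for illustration and have nothing to do with the hypercube.

```python
from fractions import Fraction
from itertools import combinations


def polymer_partition_function(weights, incompatible):
    """Sum of weight products over pairwise compatible sets of polymers.

    weights: dict mapping each polymer to its weight.
    incompatible: function taking two distinct polymers, returning True
    if they are incompatible.
    """
    polymers = list(weights)
    total = 0
    for r in range(len(polymers) + 1):
        for config in combinations(polymers, r):
            if any(incompatible(a, b) for a, b in combinations(config, 2)):
                continue
            product = 1          # the empty configuration contributes 1
            for p in config:
                product *= weights[p]
            total += product
    return total


# Toy example: polymers "a" and "b" are incompatible, "c" is compatible with both.
toy_weights = {"a": Fraction(1, 10), "b": Fraction(2, 10), "c": Fraction(3, 10)}
toy_incompatible = lambda x, y: {x, y} == {"a", "b"}
print(polymer_partition_function(toy_weights, toy_incompatible))
```

In this toy example the partition function factorises as $(1+w_{a}+w_{b})(1+w_{c})$, because the polymer $c$ does not interact with the other two.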

A cluster is an ordered tuple of polymers whose incompatibility graph $H(\Gamma)$ is connected. Let $\mathcal{C}$ be the set of all clusters. The cluster expansion is the formal power series in the weights $w(S)$

$$\log\Xi(\mathcal{P})=\sum_{\Gamma\in\mathcal{C}}w(\Gamma),$$

where

$$w(\Gamma)=\phi(H(\Gamma))\prod_{S\in\Gamma}w(S)$$

and $\phi(H)$ is the Ursell function as defined in (4). In fact the cluster expansion is simply the multivariate Taylor series for $\log\Xi(\mathcal{P})$ in the variables $w(S)$ , as observed by Dobrushin [4]. See also Scott and Sokal [24] for a derivation of the cluster expansion and much more.

A sufficient condition for the convergence of the cluster expansion is given by a theorem of Kotecký and Preiss.

Theorem 10 ([18]).

Let $f:\mathcal{P}\to[0,\infty)$ and $g:\mathcal{P}\to[0,\infty)$ be two functions. Suppose that for all polymers $S\in\mathcal{P}$ ,

$$\sum_{S^{\prime}\nsim S}|w(S^{\prime})|e^{f(S^{\prime})+g(S^{\prime})}\leq f(S),\tag{8}$$

then the cluster expansion converges absolutely. Moreover, if we let $g(\Gamma)=\sum_{S\in\Gamma}g(S)$ and write $\Gamma\nsim S$ if there exists $S^{\prime}\in\Gamma$ so that $S\nsim S^{\prime}$ , then for all polymers $S$ ,

$$\sum_{\substack{\Gamma\in\mathcal{C}\\ \Gamma\nsim S}}|w(\Gamma)|e^{g(\Gamma)}\leq f(S).\tag{9}$$

We remark that one could simply take $g\equiv 0$ in (8) in order to establish convergence of the cluster expansion. However, allowing $g$ to take non-zero values (thus strengthening (8)) allows us to give strong tail bounds on the cluster expansion via (9). This will allow us to show that certain truncations of the cluster expansion serve as good approximations to the logarithm of the partition function.

3. Polymers in the hypercube

We now return to our specific setting with polymers derived from the hard-core model on $Q_{d}$ . These polymers will essentially be the same as those defined in Section 1. Here we will study the cluster expansion of this polymer model in depth.

3.1. Preliminaries

We begin with some notation and lemmas from [7].

For a set $A\subseteq\mathcal{E}$ (and analogously for $A\subseteq\mathcal{O}$ ), let $|A|$ denote the number of vertices of $A$ , $N(A)$ be the set of neighbours of $A$ , and $[A]=\{v\in\mathcal{E}:N(v)\subseteq N(A)\}$ the bipartite closure of $A$ . Clearly $|[A]|\geq|A|$ . Let

$$\mathcal{G}(a,b)=\{A\subseteq\mathcal{E}:A\text{ is }2\text{-linked},\ |[A]|=a,\ |N(A)|=b\}.$$
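The neighbourhood $N(A)$ and the bipartite closure $[A]=\{v\in\mathcal{E}:N(v)\subseteq N(A)\}$ are simple to compute for small $d$. The sketch below encodes vertices as integers; the function names and the `parity` parameter (0 for the even side, 1 for the odd side) are assumptions made for illustration.

```python
def popcount(x):
    return bin(x).count("1")


def neighbours(v, d):
    """Neighbours of v in Q_d: flip each of the d coordinates."""
    return {v ^ (1 << i) for i in range(d)}


def neighbourhood(A, d):
    """N(A): all vertices adjacent to some vertex of A."""
    return {u for v in A for u in neighbours(v, d)}


def closure(A, d, parity=0):
    """Bipartite closure [A]: same-side vertices whose neighbours lie in N(A)."""
    NA = neighbourhood(A, d)
    return {
        v
        for v in range(2 ** d)
        if popcount(v) % 2 == parity and neighbours(v, d) <= NA
    }


A = {0b000, 0b110, 0b101}
print(sorted(neighbourhood(A, 3)))
print(sorted(closure(A, 3)))
```

In $Q_{3}$ these three even vertices already cover all four odd vertices, so the closure picks up the fourth even vertex and $|[A]|>|A|$.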

The following lemma of Galvin is based on the graph container method of Sapozhenko [22]. This is a key technical ingredient in [22, 7] and in the results of this paper.

Lemma 11 ([7]).

There exist constants $C_{0},C_{1}>0$ , so that for all $\lambda\geq C_{0}\log d/d^{1/3}$ , all $a\leq 2^{d-2}$ ,

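$$\sum_{A\in\mathcal{G}(a,b)}\frac{\lambda^{|A|}}{(1+\lambda)^{b}}\leq 2^{d}\exp\left(-\frac{C_{1}(b-a)\log d}{d^{2/3}}\right).$$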

In what follows, we will always assume that $\lambda\geq C_{0}\log d/d^{1/3}$ to allow us to apply Lemma 11.

We will also use the following isoperimetric estimates, which come from [6, 17] but can also be found in [7].

Lemma 12.

Suppose $S\subseteq\mathcal{E}$ (or $S\subseteq\mathcal{O}$ ). Then

(1) If $|S|\leq d/10$, then $|N(S)|\geq d|S|-2|S|^{2}$.

(2) If $|S|\leq d^{4}$, then $|N(S)|\geq d|S|/10$.

(3) If $|S|\leq 2^{d-2}$, then $|N(S)|\geq\left(1+\frac{1}{2\sqrt{d}}\right)|S|$.

We also make use of the following bound; see, e.g., [9].

Lemma 13.

The number of $2$ -linked subsets $S\subseteq\mathcal{E}$ of size $t$ which contain a given vertex $v$ is at most $(ed^{2})^{t-1}$ .
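The bound of Lemma 13 can be compared with exact counts for small parameters. The sketch below enumerates all same-parity sets of size $t$ containing a fixed vertex and keeps the $2$-linked ones; it is a brute-force check with illustrative function names, feasible only for small $d$ and $t$.

```python
import math
from itertools import combinations


def popcount(x):
    return bin(x).count("1")


def is_two_linked(S):
    """True if the same-parity vertex set S is connected in Q_d^2."""
    S = set(S)
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in S:
            if v not in seen and popcount(u ^ v) <= 2:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(S)


def count_two_linked_containing(d, t, v=0):
    """Number of 2-linked sets of size t on the side of v that contain v."""
    parity = popcount(v) % 2
    others = [u for u in range(2 ** d) if u != v and popcount(u) % 2 == parity]
    return sum(
        1 for rest in combinations(others, t - 1) if is_two_linked((v,) + rest)
    )


for d, t in [(4, 2), (5, 2), (5, 3)]:
    bound = (math.e * d ** 2) ** (t - 1)
    print(d, t, count_two_linked_containing(d, t), bound)
```

For $t=2$ the exact count is $\binom{d}{2}$, the number of vertices at distance $2$ from $v$, which sits well below the bound $ed^{2}$.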

3.2. The defect polymer model

We begin by fixing a side of the bipartition which we call the defect side. Let us suppose this side is $\mathcal{E}$ (the case where $\mathcal{O}$ is the defect side will be identical).

We define a polymer to be a $2$ -linked subset $S$ of the defect side in $Q_{d}$ so that $|[S]|\leq 2^{d-2}$ . Let $\mathcal{P}$ be the set of all such polymers (we will make use of a subscript, as in $\mathcal{P}_{\mathcal{E}}$ or $\mathcal{P}_{\mathcal{O}}$ , if we want to indicate which is the defect side). Two polymers $S,S^{\prime}$ are compatible if $S\cup S^{\prime}$ is not $2$ -linked. Let $\Omega$ be the set of all pairwise compatible sets of polymers from $\mathcal{P}$ . The weight functions are defined as

$$w(S)=\frac{\lambda^{|S|}}{(1+\lambda)^{|N(S)|}}.$$

Let $\Xi=\Xi(\mathcal{P})$ denote the resulting polymer model partition function (and note that by symmetry $\Xi$ is the same regardless of the defect side).

The partition function $\Xi$ is the normalizing constant of a probability distribution $\nu$ on $\Omega$ defined by

$$\nu(\Gamma)=\frac{\prod_{S\in\Gamma}w(S)}{\Xi}.$$

Using $\nu$ we can define a probability measure $\hat{\mu}$ on $\mathcal{I}(Q_{d})$ as follows:

(1) With probability $1/2$ choose $\mathcal{D}=\mathcal{E}$ or $\mathcal{D}=\mathcal{O}$ to be the defect side.

(2) Choose a polymer configuration $\Gamma\in\Omega_{\mathcal{D}}$ from $\nu$ and assign all vertices of $\cup_{S\in\Gamma}S$ to be occupied on the defect side $\mathcal{D}$.

(3) For each vertex $v$ on the non-defect side that is not blocked by an occupied vertex on the defect side, include $v$ in the independent set independently with probability $\frac{\lambda}{1+\lambda}$.
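The three steps above can be sketched for very small $d$ by listing every admissible polymer configuration explicitly. The code below is a brute-force illustration only: it identifies a configuration with the union $X$ of its polymers, uses the weight $\lambda^{|X|}(1+\lambda)^{-|N(X)|}$ (the product of the polymer weights, since distinct $2$-linked components have disjoint neighbourhoods), and enumerates all $2^{2^{d-1}}$ subsets of the defect side. All function names are hypothetical.

```python
import random


def popcount(x):
    return bin(x).count("1")


def side(d, parity):
    """Vertices of Q_d with the given coordinate-sum parity."""
    return [v for v in range(2 ** d) if popcount(v) % 2 == parity]


def nbhd(S, d):
    return {v ^ (1 << i) for v in S for i in range(d)}


def components(S):
    """2-linked components of a same-parity vertex set."""
    remaining, comps = set(S), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            close = {v for v in remaining if popcount(u ^ v) <= 2}
            remaining -= close
            comp |= close
            stack.extend(close)
        comps.append(comp)
    return comps


def closure_size(S, d, parity):
    NS = nbhd(S, d)
    return sum(1 for v in side(d, parity) if nbhd({v}, d) <= NS)


def sample_mu_hat(d, lam, rng):
    # Step 1: pick the defect side uniformly at random.
    parity = rng.randrange(2)
    defect = side(d, parity)
    # Step 2: sample a configuration with probability proportional to its weight.
    configs, weights = [], []
    for mask in range(2 ** len(defect)):
        X = {v for j, v in enumerate(defect) if mask >> j & 1}
        # Every 2-linked component must be a polymer: closure at most 2^(d-2).
        if all(closure_size(S, d, parity) <= 2 ** (d - 2) for S in components(X)):
            configs.append(X)
            weights.append(lam ** len(X) / (1 + lam) ** len(nbhd(X, d)))
    X = rng.choices(configs, weights=weights)[0]
    # Step 3: fill the unblocked non-defect vertices independently.
    blocked = nbhd(X, d)
    p = lam / (1 + lam)
    Y = {v for v in side(d, 1 - parity) if v not in blocked and rng.random() < p}
    return X | Y


rng = random.Random(0)
sample = sample_mu_hat(4, 1.0, rng)
print(sorted(sample))
```

By construction every output is an independent set, since the occupied non-defect vertices avoid the neighbourhood of the occupied defect vertices.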

The resulting distribution $\hat{\mu}$ is not exactly the hard-core model $\mu$ on $Q_{d}$ , but we will show that the two distributions are very close in total variation distance. Moreover, we will show that a scaling of the partition function $\Xi$ is a very good approximation of the hard-core partition function $Z(\lambda)$ . Note that the defect side need not be the minority side: in step $3$ we may choose no vertices to be occupied opposite the defect side. Nevertheless, we will show below that with very high probability the defect side is in fact the minority side of an independent set sampled according to $\hat{\mu}$ (Lemma 17 below).

Lemma 14.

We have

$$\log Z(\lambda)-\log\left[2(1+\lambda)^{2^{d-1}}\Xi\right]=O\left(\exp(-2^{d}/d^{4})\right).$$

Moreover,

$$\|\hat{\mu}-\mu\|_{TV}=O\left(\exp(-2^{d}/d^{4})\right).$$

We will prove Lemma 14 after showing that the polymer model satisfies the Kotecký–Preiss condition. Lemma 14 allows us to work with $\Xi$ and $\nu$ to prove Theorems 6 and 8. In particular, to prove Theorem 8 we will approximate $\Xi$ by truncating the cluster expansion for $\log\Xi$ and exponentiating. To prove Theorem 6 we will prove the probabilistic statements for polymer configurations sampled from $\nu$ and then use Lemmas 14 and 17 to transfer these results to results about the minority side of an independent set drawn from $\mu$ .

We define the truncated cluster expansion of $\log\Xi$ as

[TABLE]

We now show that condition (8) holds for the defect polymer model with appropriate choices of functions $f(\cdot)$ and $g(\cdot)$ , and thus $T_{k}$ gives a good approximation to $\log\Xi$ .

Lemma 15.

For integers $d,k\geq 1$ , let

[TABLE]

Then for $d$ sufficiently large

[TABLE]

In particular, for $k$ fixed and $d$ sufficiently large,

[TABLE]

Proof.

Let $g:\mathcal{P}\to[0,\infty)$ be defined by $g(S)=\gamma(d,|S|)$ and define $f:\mathcal{P}\to[0,\infty)$ by $f(S)=|S|/d^{3/2}$ . We will show that the Kotecký–Preiss condition (8) holds. That is, for every $S\in\mathcal{P}$ ,

[TABLE]

To prove this we will show that for all $v\in\mathcal{E}$ ,

[TABLE]

and this will suffice since $S^{\prime}\nsim S$ if and only if $S^{\prime}\ni v$ for some $v\in N^{2}(S)$ and $|N^{2}(S)|\leq d^{2}|S|$ . We will break up the sum according to the different cases of $\gamma(d,k)$ .

First we sum over $S$ with $|S|\leq\frac{d}{10}$ . We use the fact that for such $S$ , $|N(S)|\geq d|S|-2|S|^{2}$ by Lemma 12, and that there are at most $\exp(3k\log d)$ $2$ -linked sets $S$ of size $k$ that contain a fixed vertex $v$ by Lemma 13.

[TABLE]

which is at most $\frac{1}{3d^{7/2}}$ for $d$ large enough.

We next sum over $S$ with $\frac{d}{10}<|S|\leq d^{4}$ . We use the fact that for such $S$ , $|N(S)|\geq d|S|/10$ by Lemma 12.

[TABLE]

and so if $\lambda\geq C_{0}\log d/d$ and $d$ is large enough, then this sum is at most $\frac{1}{3d^{7/2}}$ .

Now turning to $S$ with $d^{4}<|S|\leq 2^{d-2}$ , we have that $|N(S)|\geq|S|(1+1/(2\sqrt{d}))$ , and so

[TABLE]

where the last inequality comes from applying Lemma 11. In the sum, we have $(b-a)\geq a/(2\sqrt{d})$ and $a>d^{4}$ , and so

[TABLE]

for large enough $d$ , and so

[TABLE]

for $d$ large enough. Putting the three bounds together gives (14).

To prove the lemma we now apply Theorem 10, applying (9) for the polymer $S$ containing the single vertex $v$ to obtain:

[TABLE]

Summing over all $v$ gives

[TABLE]

Since $\gamma(d,k)/k$ is non-increasing in $k$ , we have, recalling that $g(\Gamma)=\sum_{S\in\Gamma}g(S)$ and $\|\Gamma\|=\sum_{S\in\Gamma}|S|$ ,

[TABLE]

Keeping only terms in inequality (16) corresponding to clusters of size at least $k$ , we have

[TABLE]

as desired. ∎

The Kotecký–Preiss condition also allows us to prove a simple large deviation result for the total size of all polymers in a random polymer configuration drawn from $\nu$ .

Suppose $X$ is a random variable whose moment generating function $\mathbb{E}e^{tX}$ is defined for $t$ in a neighbourhood of $0$. We will make extensive use of the cumulant generating function of $X$, defined as

$$h_{t}(X)=\log\mathbb{E}e^{tX},$$

that is, the logarithm of the moment generating function.

Lemma 16.

Let $\mathbf{\Gamma}$ be a random configuration drawn from the distribution $\nu$ . Then with probability at least $1-O\left(\exp(-2^{d}/d^{4})\right)$ , we have

[TABLE]

where $\|\mathbf{\Gamma}\|=\sum_{S\in\mathbf{\Gamma}}|S|$ .

Proof.

We introduce an auxiliary polymer model with modified polymer weights:

[TABLE]

Let $\tilde{\Xi}$ be the associated polymer model partition function. Then $\log\tilde{\Xi}-\log\Xi=h_{t}(\|\mathbf{\Gamma}\|)$ at $t=d^{-3/2}$ where $\mathbf{\Gamma}$ is a random polymer configuration from the original polymer model.

In the proof of Lemma 15, all of the estimates hold if we were to replace $f(S)=|S|/d^{3/2}$ by $\tilde{f}(S)=2|S|/d^{3/2}$ . Therefore the proof shows that the Kotecký–Preiss condition holds for the polymer weights $\tilde{w}(S)$ , and the functions $f(S),g(S)$ as above. Applying (9) and summing over all polymers of size $1$ gives

[TABLE]

Then since $\Xi\geq 1$ , we have

[TABLE]

By Markov’s inequality we have

[TABLE]

and setting $t=d^{-3/2}$ gives

[TABLE]

for large enough $d$ since for $\lambda\geq C_{0}\log d/d^{1/3}$ , $(1+\lambda)^{d}$ grows faster than any fixed polynomial in $d$ . ∎

This large deviation bound allows us to show that with very high probability over an independent set drawn from $\hat{\mu}$ , the defect side is the minority side.

Lemma 17.

With probability at least $1-O\left(\exp(-2^{d}/d^{4})\right)$ over the random independent set $\mathbf{I}$ drawn from $\hat{\mu}$ , the minority side of the bipartition is the defect side.

Proof.

Let $\mathcal{D}$ and $\mathcal{M}$ denote the defect and minority side respectively selected under $\hat{\mu}$ . By Lemma 16 we have

[TABLE]

Let us fix an element $\Gamma$ in the sample space of $\mathbf{\Gamma}$ with $\|\Gamma\|\leq 2^{d}/d^{2}$ and let $V(\Gamma):=\bigcup_{S\in\Gamma}S$ . Let $\mathbf{X}$ denote the size of the intersection of $\mathbf{I}$ with the non-defect side conditioned on the event that $\mathbf{\Gamma}=\Gamma$ . Note that $\mathbf{X}$ has a ${\rm Bin}(M,\lambda/(1+\lambda))$ distribution where $M=2^{d-1}-|N(V(\Gamma))|\geq(1-2/d)2^{d-1}$ . We then have

[TABLE]

for $d$ sufficiently large. For the penultimate inequality we applied the Chernoff bound. The result follows. ∎

Now we can prove Lemma 14.

Proof of Lemma 14.

We say an independent set $I$ is captured by the odd polymer model if every $2$ -linked component $S$ of $\mathcal{O}\cap I$ has $|[S]|\leq 2^{d-2}$ and captured by the even polymer model if every $2$ -linked component $S$ of $\mathcal{E}\cap I$ has $|[S]|\leq 2^{d-2}$ . If we view $2(1+\lambda)^{2^{d-1}}\Xi$ as the sum of $(1+\lambda)^{2^{d-1}}\Xi$ for $\Xi$ representing the odd polymer model and $(1+\lambda)^{2^{d-1}}\Xi$ for $\Xi$ representing the even polymer model, then each $I$ that is captured by the odd polymer model contributes $\lambda^{|I|}$ to the first summand and each $I$ that is captured by the even polymer model contributes $\lambda^{|I|}$ to the second summand.

Observe first that every $I\in\mathcal{I}(Q_{d})$ is captured by either the odd or the even polymer model. Indeed, suppose not. Then there exists $I\in\mathcal{I}(Q_{d})$ which contains a set $S\subseteq\mathcal{O}$ with $|[S]|>2^{d-2}$ and a set $S^{\prime}\subseteq\mathcal{E}$ with $|[S^{\prime}]|>2^{d-2}$. It follows that $|N(S)|=|N([S])|>2^{d-2}$ (since, for example, $Q_{d}$ contains a perfect matching). But then $N(S)\cap[S^{\prime}]\neq\emptyset$, since both sets lie in $\mathcal{E}$, which has only $2^{d-1}$ vertices, and so $S\cap N(S^{\prime})=S\cap N([S^{\prime}])\neq\emptyset$, contradicting the fact that $I$ is an independent set.

It remains to bound the contribution to $2(1+\lambda)^{2^{d-1}}\Xi$ from independent sets that are counted twice. That is, bound $\sum_{I\in B}\lambda^{|I|}$ where $B$ denotes the collection of independent sets that are captured by both the odd and even polymer models. However, any such independent set can be selected by $\hat{\mu}$ conditioned on the event that $\mathcal{M}\neq\mathcal{D}$ (using the notation of Lemma 17). Letting $\mathbf{I}$ denote the independent set selected by $\hat{\mu}$ we have by Lemma 17 that

[TABLE]

All together this gives the inequalities

[TABLE]

and so

[TABLE]

which gives (10). Recall one formula for the total variation distance between discrete probability measures:

$$\|\hat{\mu}-\mu\|_{TV}=\sum_{I:\,\hat{\mu}(I)>\mu(I)}\left(\hat{\mu}(I)-\mu(I)\right).$$

The total variation distance bound is then immediate from (18) as the only independent sets that have higher probability under $\hat{\mu}$ than $\mu$ are those that are counted twice. ∎

Now we can prove Theorem 8.

Proof of Theorem 8.

First we prove the estimate $|L_{r}|=O\left(\frac{2^{d}\lambda^{r}d^{2(r-1)}}{(1+\lambda)^{dr-r^{2}}}\right)$ for $r$ fixed. Let $\Gamma$ be a cluster with $\|\Gamma\|=r$ . Since $V(\Gamma):=\bigcup_{S\in\Gamma}S$ is a 2-linked set of size at most $r$ , there are $O(2^{d}d^{2(r-1)})$ possibilities for $V(\Gamma)$ by Lemma 13. Given a set $X\subseteq V(Q_{d})$ of size at most $r$ , there are at most a constant number of clusters $\Gamma$ of size $r$ such that $V(\Gamma)=X$ . It follows that the number of clusters of size $r$ is $O(2^{d}d^{2(r-1)})$ . By Lemma 12, the weight of any cluster of size $r$ is $O(\lambda^{r}/(1+\lambda)^{dr-r^{2}})$ (note that the Ursell function of a cluster of size $r$ is simply a constant). The claimed estimate on $|L_{r}|$ follows.

Let $k\geq 1$ be fixed. By (13) we have that

[TABLE]

for $d$ sufficiently large and so

[TABLE]

It follows from Lemma 14 that

[TABLE]

where $|\varepsilon_{k}|=O\left(\frac{2^{d}\lambda^{k+1}d^{2k}}{(1+\lambda)^{d(k+1)}}\right)$ (it is here we use that $\lambda$ is bounded as $d\to\infty$ ).

Finally we show that $L_{k}$ can be computed in time $e^{O(k\log k)}$ . Let $X$ be the family of all $2$ -linked subsets of $\mathcal{E}$ of size at most $k$ which contain the vertex $\underline{0}=(0,\ldots,0)$ . Given $S\in X$ , we call a coordinate $i$ active for $S$ if $x_{i}=1$ for some $x\in S$ . We note that every $S\in X$ has at most $2k$ active coordinates. For $A\subseteq[d]$ , we let $X_{A}$ denote the set of elements in $X$ whose set of active coordinates is precisely $A$ .

For $m\in[k]$, we will construct the list $\mathcal{L}_{m}$ of all the elements $S\in X$ with $|S|=m$ whose set of active coordinates is a subset of $[2k]$. We do so iteratively, starting from $\mathcal{L}_{1}=\{\{\underline{0}\}\}$. Suppose we have constructed the list $\mathcal{L}_{m}$. For a vertex $v\in V(Q_{d})$ and $\{i,j\}\subseteq[d]$, let $v_{ij}$ denote the vertex of $Q_{d}$ obtained by flipping the $i$th and $j$th coordinates of $v$. For each pair $\{i,j\}\subseteq[2k]$, $S\in\mathcal{L}_{m}$ and $v\in S$, add $S\cup\{v_{ij}\}$ to the list $\mathcal{L}_{m+1}$ if $v_{ij}\notin S$. This procedure generates the whole list $\mathcal{L}_{m+1}$ and shows that $|\mathcal{L}_{m+1}|\leq m\binom{2k}{2}|\mathcal{L}_{m}|$ and so $|\mathcal{L}_{k}|\leq k!\binom{2k}{2}^{k}=e^{O(k\log k)}$. For $m\in[k]$ and $a\in[2k]$, let $\mathcal{L}^{a}_{m}$ denote the subset of $\mathcal{L}_{m}$ consisting of those sets whose active coordinates are precisely $[a]$. Note that we can generate the list $\mathcal{L}^{a}_{m}$ in time $e^{O(k\log k)}$ by checking the elements of $\mathcal{L}_{m}$ one by one.
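The iterative construction of the lists $\mathcal{L}_{m}$ can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' implementation: vertices are encoded as bitmask integers, the parameter `n_coords` plays the role of $2k$, and the names `two_linked_sets` and `levels` are hypothetical. Storing each level as a set of frozensets removes the duplicates that the procedure produces when the same set is reached along different routes.

```python
from itertools import combinations

def two_linked_sets(k, n_coords):
    """Return {m: family of 2-linked sets of size m containing the all-zero vertex},
    using only the first n_coords coordinates. Vertices are bitmask integers."""
    origin = 0
    # Flipping coordinates i and j is an XOR with a mask that has exactly two bits set.
    flips = [(1 << i) | (1 << j) for i, j in combinations(range(n_coords), 2)]
    levels = {1: {frozenset([origin])}}
    for m in range(1, k):
        next_level = set()
        for S in levels[m]:
            for v in S:
                for mask in flips:
                    u = v ^ mask  # the vertex v_{ij}, at distance 2 from v
                    if u not in S:
                        next_level.add(S | {u})
        levels[m + 1] = next_level
    return levels

levels = two_linked_sets(3, 4)
print({m: len(family) for m, family in levels.items()})
```

With four coordinates the even vertices form the distance-2 graph $K_{2,2,2,2}$, so every set consisting of $\underline{0}$ and two further even vertices is 2-linked, which gives a hand-checkable count for size $3$.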

For a cluster $\Gamma$ , we define the active coordinates of $\Gamma$ to be the active coordinates of the set $V(\Gamma)=\bigcup_{S\in\Gamma}S$ . For fixed $a\in[2k]$ and $m\in[k]$ , we generate the list $\mathcal{G}_{m,k,a}$ of all clusters of size $k$ containing $\underline{0}$ with active coordinates $[a]$ and $|V(\Gamma)|=m$ . To do this we run through each $S\in\mathcal{L}^{a}_{m}$ and create the list of clusters $\Gamma$ of size $k$ with $V(\Gamma)=S$ . We claim that this can be done in time $e^{O(k\log k)}$ . Recall that a cluster of size $k$ is an ordered set of polymers $(\gamma_{1},\ldots,\gamma_{\ell})$ such that $\sum_{i=1}^{\ell}|\gamma_{i}|=k$ . Let us fix $S\in\mathcal{L}^{a}_{m}$ . Since there are at most $2^{k}$ ordered integer partitions of $k$ , it suffices to show that for a fixed such partition $(m_{1},\ldots,m_{\ell})$ (so that $\sum_{i}m_{i}=k$ ) we may find, in time $e^{O(k\log k)}$ , all clusters $(\gamma_{1},\ldots,\gamma_{\ell})$ for which $|\gamma_{i}|=m_{i}$ for all $i$ and $\bigcup_{i}\gamma_{i}=S$ . To do this we can simply check each element of $\binom{S}{m_{1}}\times\ldots\times\binom{S}{m_{\ell}}$ (a set of size at most $e^{O(k\log k)}$ ) to see if it constitutes a legitimate cluster.

By symmetry of coordinates and vertex transitivity of $Q_{d}$ we have

[TABLE]

Finally we note that by using an algorithm of Björklund, Husfeldt, Kaski, and Koivisto [2, Theorem 1], we may calculate the Ursell function of a cluster $\Gamma\in\mathcal{G}_{j,k,a}$ in time $e^{O(k)}$ . Moreover for a set $S\in\mathcal{L}_{j}^{a}$ where $j\in[k]$ , we can calculate $|N(S)|$ in time $O(k^{2})$ . We can therefore calculate $w(\Gamma)$ in time $e^{O(k)}$ . ∎
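For very small clusters the Ursell function can also be checked by brute force. The sketch below assumes the standard definition $\phi(H)=\frac{1}{|V(H)|!}\sum_{A}(-1)^{|A|}$, where the sum runs over edge sets $A\subseteq E(H)$ whose spanning subgraph is connected; it enumerates all edge subsets, so it is exponential in the number of edges and is not the algorithm of [2]. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def is_connected(n, edges):
    """Check whether the graph on vertices 0..n-1 with the given edges is connected."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for u in adj[v] - seen:
            seen.add(u)
            stack.append(u)
    return len(seen) == n

def ursell(n, edges):
    """Brute-force Ursell function of the incompatibility graph (n vertices, given edges)."""
    total = 0
    for r in range(len(edges) + 1):
        for subset in combinations(edges, r):
            if is_connected(n, subset):
                total += (-1) ** r
    return Fraction(total, factorial(n))

print(ursell(1, []), ursell(2, [(0, 1)]),
      ursell(3, [(0, 1), (1, 2), (0, 2)]), ursell(3, [(0, 1), (1, 2)]))
```

The four graphs evaluated here are the incompatibility graphs that arise for clusters of size at most $3$ in Section 5.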

4. Probabilistic properties via the cluster expansion

Here we use the cluster expansion to prove Theorems 5 and 6 and Corollary 9. Using Lemmas 14 and 17 we see that up to $O(\exp(-2^{d}/d^{4}))$ total variation error, we may replace the minority side of an independent set drawn from $\mu$ with the defect side of an independent set drawn from $\hat{\mu}$ ; or in other words, a polymer configuration drawn from $\nu$ . Thus in this section we will let $X_{T}$ denote the (random) number of polymers of type $T$ in a random polymer configuration $\mathbf{\Gamma}$ drawn from $\nu$ , and prove the conclusions of Theorems 6 and 5 for these random variables. We will also assume throughout this section that $C_{0}\log d/d^{1/3}\leq\lambda\leq 2$ . Theorem 6 is vacuous if $\lambda>2$ since $m_{T}\to 0$ for all types $T$ in that case; the formula (6) in Corollary 9 holds for $\lambda>2$ by Theorem 7.

We begin with some preliminaries on cumulants of random variables. Recall the cumulant generating function of a random variable $X$, $h_{t}(X)=\log\mathbb{E}e^{tX}$. The $k$th cumulant of $X$ is defined by taking derivatives of $h_{t}(X)$ and evaluating at $t=0$:

$$\kappa_{k}(X)=\frac{\partial^{k}h_{t}(X)}{\partial t^{k}}\bigg|_{t=0}.$$

In fact the cumulants of $X$ are related to the moments of $X$ by a non-linear change of basis (see e.g. [19]). In particular, $\kappa_{1}(X)=\mathbb{E}X$ and $\kappa_{2}(X)=\text{var}(X)$ . Moreover, if a random variable $X$ has a distribution determined by its moments, and if for a sequence of random variables $X_{n}$ we have $\lim_{n\to\infty}\kappa_{k}(X_{n})=\kappa_{k}(X)$ for all $k\geq 1$ , then $X_{n}$ converges to $X$ in distribution (denoted $X_{n}\Rightarrow X$ ). We will use this fact in conjunction with the following fact.

Fact 18.

If $X$ has a Poisson distribution with mean $m$, then $\kappa_{k}(X)=m$ for all $k$. If $X$ has a standard normal distribution (mean $0$, variance $1$) then $\kappa_{1}(X)=0$, $\kappa_{2}(X)=1$, and $\kappa_{k}(X)=0$ for all $k\geq 3$.
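As a sanity check of Fact 18, the following sketch converts raw moments into cumulants using the standard recursion $\kappa_{n}=m_{n}-\sum_{j=1}^{n-1}\binom{n-1}{j-1}\kappa_{j}m_{n-j}$, where $m_{n}=\mathbb{E}X^{n}$. The function names and the use of exact rational arithmetic are our own choices for illustration.

```python
from fractions import Fraction
from math import comb

def cumulants_from_moments(moments):
    """moments[n-1] is the raw moment E[X^n]; returns [kappa_1, ..., kappa_N]."""
    kappas = []
    for n in range(1, len(moments) + 1):
        correction = sum(
            comb(n - 1, j - 1) * kappas[j - 1] * moments[n - j - 1]
            for j in range(1, n)
        )
        kappas.append(moments[n - 1] - correction)
    return kappas

def poisson_moments(mean, order):
    """Raw moments of Pois(mean) via E[X^(n+1)] = mean * sum_j C(n, j) E[X^j]."""
    m = [Fraction(1)]  # m[0] = E[X^0] = 1
    for n in range(order):
        m.append(mean * sum(comb(n, j) * m[j] for j in range(n + 1)))
    return m[1:]

def standard_normal_moments(order):
    """E[Z^n] is 0 for odd n and (n-1)!! for even n."""
    out = []
    for n in range(1, order + 1):
        if n % 2 == 1:
            out.append(Fraction(0))
        else:
            value = 1
            for odd in range(1, n, 2):
                value *= odd
            out.append(Fraction(value))
    return out

print(cumulants_from_moments(poisson_moments(Fraction(3), 6)))
print(cumulants_from_moments(standard_normal_moments(6)))
```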

We also need a few preliminaries about defect types. First, for fixed $t$ the number of defect types of size $t$ is bounded independent of $d$ . Let $\tau(S)$ denote the type of a polymer $S$ . The weight of a polymer $S$ is determined by $\tau(S)$ , since $|N(S)|$ is determined by the number of edges of $S$ in the graph $Q_{d}^{2}[S]$ . Let $w_{T}$ denote $w(S)$ for $S$ of type $T$ . Using Lemma 12, we have the simple bounds

[TABLE]

for a type $T$ of size $t$ and $d$ large enough. Note that for any fixed $k\geq 1$ and any type $T$ , we have $d^{k}w_{T}\to 0$ as $d\to\infty$ ; that is, each polymer weight decays super-polynomially fast in $d$ . We denote by $n_{T}=n_{T}(d)$ the number of polymers of type $T$ .

Lemma 19.

Let $T$ be a defect type of a fixed size $t$ . Then

[TABLE]

Moreover, if $T$ is a tree defect type then

[TABLE]

where $c_{T}=2^{-t}|\text{Aut}(T)|^{-1}$ and if $T$ is not a tree then

[TABLE]

Proof.

By the vertex transitivity of $Q_{d}$ , every vertex of $\mathcal{E}$ (or $\mathcal{O}$ ) is contained in the same number of polymers of type $T$ . Let us denote this number by $n_{T,v}$ and note that $n_{T}=2^{d-1}n_{T,v}/t$ . The lower bound in (23) follows from the fact that if there exists a polymer with type $T$ , then certainly $n_{T,v}\geq 1$ . The upper bound follows from the fact that every vertex of $Q_{d}$ is contained in at most $(ed^{2})^{(t-1)}$ $2$ -linked sets of size $t$ by Lemma 13.

Since $T$ is a connected graph we may fix an ordering $(x_{1},\ldots,x_{t})$ of the vertices of $T$ so that $T_{i}:=T[\{x_{1},\ldots,x_{i}\}]$ is connected for all $i\in[t]$ . We let $d_{i}$ denote the degree of the vertex $x_{i}$ in the graph $T_{i}$ .

We will construct an injective graph homomorphism $\varphi:T\to Q_{d}^{2}[\mathcal{E}]$ recursively as follows. Suppose that we have constructed an injective graph homomorphism $\varphi_{i}:T_{i}\to Q_{d}^{2}[\mathcal{E}]$ for some $i\leq t-1$ and let $m_{i}$ denote the number of such homomorphisms. We now extend $\varphi_{i}$ to an injective graph homomorphism $\varphi_{i+1}:T_{i+1}\to Q_{d}^{2}[\mathcal{E}]$ . We consider two cases.

If $d_{i+1}>1$ , then $\varphi_{i+1}(x_{i+1})$ must lie in the joint neighbourhood of $\varphi_{i}(x)$ and $\varphi_{i}(y)$ for some $x,y\in V(T_{i})$ . For any pair of vertices $u,v\in\mathcal{E}$ their codegree in $Q_{d}^{2}[\mathcal{E}]$ is at most $2(d-2)$ and so there are at most $2(d-2)$ choices for $\varphi_{i+1}(x_{i+1})$ whence

[TABLE]

Suppose now that $d_{i+1}=1$ and let $R_{i}$ denote the set of possible choices for $\varphi_{i+1}(x_{i+1})$ . We note that $u\in R_{i}$ if and only if $u$ is adjacent to $\varphi_{i}(x_{i})$ and non-adjacent to $\varphi_{i}(x_{j})$ for $j<i$ in $Q_{d}^{2}[\mathcal{E}]$ . Again using the fact that the maximum codegree in $Q_{d}^{2}[\mathcal{E}]$ is $2(d-2)$ it follows that $\binom{d}{2}-2(d-2)\leq|R_{i}|\leq\binom{d}{2}$ . We then have that

[TABLE]

If $T$ is not a tree then $d_{i+1}>1$ for some $i\leq t-1$ . It follows by (24) and the upper bound of (25) that $m_{t}=O(2^{d}d^{2(t-1)-1})=O(2^{d}d^{2t-3})$ . The bound $n_{T}=O(2^{d}d^{2t-3})$ follows from the fact that $n_{T}=m_{t}/|\text{Aut}(T)|$ where $\text{Aut}(T)$ denotes the automorphism group of the graph $T$ (recall that $t$ is a constant).

If $T$ is a tree then $d_{i+1}=1$ for all $i\leq t-1$ and so by (25) $m_{t}=(1+o(1))2^{d-1}d^{2(t-1)}2^{-(t-1)}$ . The result follows. ∎

Now fix a defect type $T$ and let $X_{T}$ be the number of polymers of type $T$ in $\mathbf{\Gamma}$ . We introduce modified polymer weights $\tilde{w}$ , given by

[TABLE]

Let $\tilde{\Xi}$ be the corresponding polymer model partition function. Then we have

[TABLE]

and so

[TABLE]

If the cluster expansion for $\log\tilde{\Xi}$ converges absolutely, we can write

[TABLE]

where $Y_{T}(\Gamma)=\sum_{S\in\Gamma}\mathbf{1}_{\tau(S)=T}$ , the number of polymers of type $T$ in the cluster $\Gamma$ .
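For orientation, the chain of identities behind this cumulant formula can be sketched as follows. We assume here that the modified weights are the exponential tilt $\tilde{w}(S)=w(S)e^{t\mathbf{1}_{\tau(S)=T}}$ (an assumption about the form of the definition, chosen because it is the standard device), and we write $\mathcal{C}$ for the set of clusters.

```latex
\begin{align*}
\mathbb{E}\,e^{tX_{T}} &= \frac{\tilde{\Xi}}{\Xi},
& h_{t}(X_{T}) &= \log\tilde{\Xi}-\log\Xi, \\
\tilde{w}(\Gamma) &= w(\Gamma)\,e^{tY_{T}(\Gamma)},
& \kappa_{k}(X_{T}) &= \frac{\partial^{k}\log\tilde{\Xi}}{\partial t^{k}}\bigg|_{t=0}
  = \sum_{\Gamma\in\mathcal{C}} w(\Gamma)\,Y_{T}(\Gamma)^{k}.
\end{align*}
```

The last equality differentiates the cluster expansion $\log\tilde{\Xi}=\sum_{\Gamma\in\mathcal{C}}\tilde{w}(\Gamma)$ term by term, which is justified when the expansion converges absolutely.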

The following lemma gives bounds on cluster weights using the Kotecký–Preiss condition. Theorem 6 will then follow in a series of corollaries.

Lemma 20.

Consider a fixed defect type $T$ , and let $k\geq 1$ be a fixed integer. Then

$$\sum_{\Gamma\in\mathcal{C}}w(\Gamma)Y_{T}(\Gamma)^{k}=(1+o(1))\,n_{T}w_{T}\tag{27}$$

as $d\to\infty$ .

Moreover if $\{T_{1},\dots,T_{\ell}\}$ is a fixed set of distinct defect types, and $k_{1},\dots k_{\ell}$ are fixed positive integers, then

[TABLE]

Proof.

In the sum in (27), if we consider only clusters made up of a single polymer of type $T$ then we get a contribution of exactly $n_{T}w_{T}$ , and so it remains to show that the contribution of all other terms is $o(n_{T}w_{T})$ . Let $t$ denote the number of vertices in a graph of type $T$ . We first consider the contribution to the sum (27) from clusters $\Gamma$ with $Y_{T}(\Gamma)=1$ and $\|\Gamma\|>t$ . By (12), we may bound this contribution by

[TABLE]

since from (22) and (23)

[TABLE]

Consider now the contribution to the sum (27) from clusters $\Gamma$ with $Y_{T}(\Gamma)=y>1$ . For such a cluster, recalling that $g(S)=\gamma(d,|S|)$ for a polymer $S$ and $g(\Gamma)=\sum_{S\in\Gamma}g(S)$ , we have

[TABLE]

Thus, using (16) we may bound this contribution by

[TABLE]

where the above inequality holds for $d$ large enough (independent of $y$ ). The result follows since

[TABLE]

Next we turn to (28). Consider a cluster $\Gamma$ with $Y_{T_{1}}(\Gamma)=y_{1},\dots,Y_{T_{\ell}}(\Gamma)=y_{\ell}$ , where $y_{1},\dots,y_{\ell}\geq 1$ . Then we have

[TABLE]

where $t_{j}$ is the size of a polymer of type $T_{j}$ . By (16) and (22), the contribution of such clusters to the sum in (28) is therefore at most

[TABLE]

Finally, let $K=\max\{k_{1},\dots k_{\ell},t_{1},\dots t_{\ell}\}$ , so that summing over all positive integer vectors $\vec{y}=(y_{1},\dots,y_{\ell})$ , we have

[TABLE]

Putting these estimates together yields (28). ∎

An immediate corollary of Lemma 20 gives the asymptotics of $m_{T}$ , $\sigma^{2}_{T}$ for a given type $T$ .

Corollary 21.

Let $T$ be a defect type. Then

$$m_{T}=(1+o(1))\,n_{T}w_{T}\qquad\text{and}\qquad\sigma^{2}_{T}=(1+o(1))\,n_{T}w_{T}.$$

Proof.

These formulae follow from (26) and (27) by taking $k=1$ and $k=2$ respectively. ∎

We can also use Lemma 20 to prove Poisson convergence.

Corollary 22.

Suppose for a given type $T$ and fugacity $\lambda$ we have $m_{T}\to\rho>0$ as $d\to\infty$ . Then $X_{T}\Rightarrow\text{Pois}(\rho)$ .

Proof.

Using Fact 18, it is enough to show that $\kappa_{k}(X_{T})\to\rho$ for all $k\geq 1$ . By (26) and our assumption we have

[TABLE]

and therefore using (26) again,

[TABLE]

for all $k\geq 1$ . ∎

In a similar fashion, we obtain asymptotic normality if $m_{T}\to\infty$ .

Corollary 23.

Fix a type $T$ . If $\lambda$ is such that $m_{T}\to\infty$ as $d\to\infty$ , then $\tilde{X}_{T}=(X_{T}-m_{T})/\sigma_{T}\Rightarrow\text{N}(0,1)$ .

Proof.

By Fact 18, it suffices to show that $\kappa_{1}(\tilde{X}_{T})\to 0$ , $\kappa_{2}(\tilde{X}_{T})\to 1$ , and $\kappa_{k}(\tilde{X}_{T})\to 0$ for all $k\geq 3$ . By the definition of $\tilde{X}_{T}$ , we have $\kappa_{1}(\tilde{X}_{T})=0$ and $\kappa_{2}(\tilde{X}_{T})=1$ . By translation invariance and scaling of higher cumulants, for $k\geq 3$ we have

[TABLE]

by (26). By Lemma 20 and Corollary 21 we have $\sum_{\Gamma\in\mathcal{C}}w(\Gamma)Y_{T}(\Gamma)^{k}=(1+o(1))\sigma^{2}_{T}$, and so for $k\geq 3$,

[TABLE]

as $d\to\infty$ since our assumption on $m_{T}$ implies $\sigma_{T}\to\infty$ . ∎

To study the joint distribution of the counts of different defect types, it is convenient to work with the joint cumulants of a collection of random variables. Given a set of random variables $(X_{1},\dots,X_{\ell})$ and non-negative integers $k_{1},\dots,k_{\ell}$ , we define the joint cumulant

[TABLE]

In particular, with this notation

[TABLE]

We will use the fact that the joint cumulants of independent random variables vanish; that is, if $\ell\geq 2$ , $X_{1},\dots,X_{\ell}$ are independent random variables, and $k_{1},\dots,k_{\ell}$ are positive integers, then

[TABLE]

Generalizing formula (26) to collections of random variables, we can express the joint cumulants of defect type counts via a modified cluster expansion. Let $\{T_{1},\dots,T_{\ell}\}$ be a set of distinct defect types and let $k_{1},\dots,k_{\ell}$ be non-negative integers. Then

[TABLE]

Corollary 24.

Consider two fixed sets $\mathcal{T}_{1}$ and $\mathcal{T}_{2}$ of distinct defect types so that for each $T\in\mathcal{T}_{1}$ , $m_{T}\to\rho_{T}$ for some $\rho_{T}>0$ , and for each $T\in\mathcal{T}_{2}$ , $m_{T}\to\infty$ as $d\to\infty$ . Then the collection of random variables $\{X_{T}\}_{T\in\mathcal{T}_{1}}\cup\{\tilde{X}_{T}\}_{T\in\mathcal{T}_{2}}$ converges in distribution to a collection of independent Poisson and standard normal random variables.

Proof.

We will use the fact that the distribution of a collection of Poisson and normal random variables is determined by its joint moments, or equivalently, by its joint cumulants. Here, working with cumulants instead of moments will simplify calculations considerably. From Corollaries 22 and 23 we know that the cumulants of each of the individual random variables in the collection converge to the corresponding cumulants of the corresponding Poisson or normal random variable. Therefore it is enough to show convergence of the joint cumulants involving at least two of the random variables, and from (32), we must show that these converge to $0$. In particular, for $T_{1},\dots,T_{j}\in\mathcal{T}_{1}$, and $T_{j+1},\dots,T_{\ell}\in\mathcal{T}_{2}$, we will show

[TABLE]

as $d\to\infty$ as long as at least two of the $k_{i}$'s are positive. Since $\sigma^{2}_{T}\to\rho_{T}>0$ for $T\in\mathcal{T}_{1}$, it will suffice to show (33) when we center and normalise all of the random variables, that is, for $T_{1},\dots,T_{\ell}\in\mathcal{T}_{1}\cup\mathcal{T}_{2}$,

[TABLE]

as long as at least two of the $k_{i}$ ’s are positive. WLOG we can assume that $\ell\geq 2$ , $k_{i}\geq 1$ for all $i$ , and that $w_{T_{1}}\geq w_{T_{2}}\geq\cdots\geq w_{T_{\ell}}$ . By scaling and translation invariance, we have

[TABLE]

Then using (28) from Lemma 20 we have

[TABLE]

First suppose that $k_{1}\geq 2$ . Then since $\sigma_{T_{i}}=\Omega(1)$ for all $i$ and $2^{d}w_{T_{1}}=O(\sigma^{2}_{T_{1}})$ , we have

[TABLE]

since $w_{T}$ tends to $0$ faster than any inverse polynomial in $d$ for any type $T$. On the other hand, if $k_{1}=1$, then

[TABLE]

∎

Theorem 6 follows from Corollaries 22, 23, and 24. We now prove Theorem 5.

Proof of Theorem 5.

We can assume in what follows that $\lambda\leq 2$ , since if $\lambda>2$ whp there are no occupied vertices on the minority side (Theorem 1.2 of [7]).

First we show that if $\lambda=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{\omega(1)}{d}$ , then whp there are no $2$ -linked components of size $t$ on the defect side.

Let $T$ be the type of a polymer of size $t$. We have the bounds

[TABLE]

where the upper bound uses Lemma 12, and so $w_{T}=\Theta(\lambda^{t}(1+\lambda)^{-dt})=\Theta((1+\lambda)^{-dt})$ for this range of $\lambda$ . By Lemma 13, $n_{T}=O(2^{d}d^{2(t-1)})$ , and so by Corollary 21,

[TABLE]

Now plugging in $\lambda=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{s}{d}$ for some $s$ , we have

[TABLE]

and so as $s\to\infty$ , $m_{T}\to 0$ . This is true for any type $T$ of size $t$ , and since there are a constant number of such types, Markov’s inequality shows that whp there are no polymers of size $t$ in $\mathbf{\Gamma}$ if $\lambda=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{\omega(1)}{d}$ .
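The threshold calculation can be checked numerically. The sketch below evaluates the logarithm of $2^{d}d^{2(t-1)}(1+\lambda)^{-dt}$, the quantity that bounds $m_{T}$ up to constant factors by the estimates above, at $\lambda=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{s}{d}$. Writing $1+\lambda=2^{1/t}(1+y)$ and expanding $\log(1+y)$ to first order suggests the limit $-st2^{-1/t}$; the function names and the test value $d=10^{6}$ are illustrative assumptions.

```python
import math

def log_scale(d, t, s):
    """log of 2^d * d^(2(t-1)) * (1+lam)^(-d*t) at lam = threshold + s/d."""
    lam = (2 ** (1 / t) - 1
           + 2 ** (1 + 1 / t) * (t - 1) * math.log(d) / (t * d)
           + s / d)
    return d * math.log(2) + 2 * (t - 1) * math.log(d) - d * t * math.log1p(lam)

def predicted_limit(t, s):
    """First-order prediction -s * t * 2^(-1/t) for the limit as d grows."""
    return -s * t * 2 ** (-1 / t)

for t, s in [(1, 1.0), (2, 1.0), (3, -2.0)]:
    print(t, s, log_scale(10 ** 6, t, s), predicted_limit(t, s))
```

The computation works with logarithms throughout because $(1+\lambda)^{dt}$ itself overflows floating point long before $d$ is large enough for the asymptotics to be visible.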

Now suppose $\lambda=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}-\frac{\omega(1)}{d}$ . Consider a type $T$ where $T$ is isomorphic to a tree on $t$ vertices. In this case we have $n_{T}=\Theta(2^{d}d^{2(t-1)})$ by Lemma 19, and so for $\lambda=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{s}{d}$ the previous calculation gives

[TABLE]

In particular if $s\to-\infty$ as $d\to\infty$ then $m_{T}\to\infty$ . By Corollary 21, $\sigma^{2}_{T}\sim m_{T}$ , and so by the second-moment method (Paley-Zygmund inequality), $X_{T}\geq 1$ whp.

To prove the second part of Theorem 5, suppose that $\lambda=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{s(d)}{d}$ where $s(d)$ converges to a constant $s$ as $d\to\infty$ . Then for any type $T$ of size $t$ that is not a tree, by Lemma 19 we have $m_{T}=o(1)$ as $d\to\infty$ , and since there is a constant number of such types, we know that whp there are no non-tree $2$ -linked components of size $t$ on the minority side. Let $T_{1},\dots,T_{\ell}$ be the defect types of size $t$ that are trees. The proof of Lemma 19 shows that in fact every tree on $t$ vertices is a defect type. Note that for each $i$ we have that $w_{T_{i}}=\lambda^{t}(1+\lambda)^{-dt+2(t-1)}$ . Then by Lemma 19 and Corollary 21 we have that $m_{T_{i}}=(c_{T_{i}}+o(1))(\lambda^{t}(1+\lambda)^{-dt+2(t-1)}2^{d}d^{2(t-1)})$ for each $i$ , and so by a similar calculation as above we have that $m_{T_{i}}\to\rho_{i}$ as $d\to\infty$ where

[TABLE]

By Corollary 24, the collection of random variables $X_{T_{1}},\dots,X_{T_{\ell}}$ converges to a collection of independent Poisson random variables with means $\rho_{1},\dots,\rho_{\ell}$, and therefore their sum converges in distribution to a Poisson random variable with mean $\sum_{i=1}^{\ell}\rho_{i}$, completing the proof of Theorem 5. Calculating this mean explicitly amounts to calculating $|\text{Aut}(T)|$ for every tree $T$ on $t$ vertices, a task whose running time depends only on $t$ (a constant). ∎

The proof of Corollary 9 involves a similar calculation.

Proof of Corollary 9.

We may again assume that $\lambda\leq 2$ since for larger $\lambda$ , $Z(\lambda)=(2+o(1))(1+\lambda)^{2^{d-1}}$ by Theorem 7. Now fix $t\geq 1$ and take $\lambda=2^{1/t}-1+\frac{2^{1+1/t}(t-1)\log d}{td}+\frac{\omega(1)}{d}$ . We then can apply Theorem 8 with $k=t$ to obtain

[TABLE]

But by the same calculation as above in the proof of Theorem 5 we have

[TABLE]

and

[TABLE]

and so

[TABLE]

The example formula (7) follows from the computation of $L_{1},L_{2}$ given below in Section 5. ∎

5. Computation of the cluster weights

Here we compute $L_{1},L_{2},L_{3}$ explicitly to use in Theorem 2 and Corollary 9.

Proposition 25.

We have

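$$L_{1}=2^{d-1}\lambda(1+\lambda)^{-d},\qquad L_{2}=2^{d-3}\lambda^{2}(1+\lambda)^{-2d}\left[d(d-1)\left((1+\lambda)^{2}-1\right)-2\right],$$

$$L_{3}=2^{d}\lambda^{3}(1+\lambda)^{-3d}\Big[\tfrac{1}{12}d(d-1)(d-2)(1+\lambda)^{5}+\tfrac{1}{16}d(d-1)(d-2)(d-3)\left((1+\lambda)^{4}+1\right)+\tfrac{1}{6}+\tfrac{1}{4}d(d-1)+\tfrac{1}{6}d(d-1)(d-2)-\tfrac{1}{8}d(d-1)\left(d(d-1)-2(d-2)\right)(1+\lambda)^{2}\Big].$$

At $\lambda=1$, this is

$$L_{1}=\frac{1}{2},\qquad L_{2}=\frac{3d^{2}-3d-2}{8\cdot 2^{d}},\qquad L_{3}=\frac{27d^{4}-74d^{3}-3d^{2}+50d+8}{48\cdot 2^{2d}}.$$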

Polymers

There is a single type of polymer of size $1$ . There are $2^{d-1}$ of these, and each has weight $\lambda(1+\lambda)^{-d}$ .

There is a single type of polymer of size $2$ . There are $2^{d-3}d(d-1)$ of these and each has weight $\lambda^{2}(1+\lambda)^{-2d+2}$ .

There are two types of polymers of size $3$ : those that form a clique in the distance $2$ graph and those that form a path on $3$ vertices. There are $2^{d-2}d(d-1)(d-2)/3$ of the first type and each has weight $\lambda^{3}(1+\lambda)^{-3d+5}$ ; there are $2^{d-4}d(d-1)(d-2)(d-3)$ of the second type and each has weight $\lambda^{3}(1+\lambda)^{-3d+4}$ .
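The counts and neighbourhood sizes for polymers of size $3$ can be confirmed by exhaustive enumeration in a small cube. The sketch below is an independent check under our own encoding (vertices as bitmask integers, even side only); the function names are illustrative. It classifies each 2-linked triple as a triangle or a path in the distance-2 graph and records $|N(S)|$, which should equal $3d-5$ and $3d-4$ respectively.

```python
from itertools import combinations

def neighbours(v, d):
    """Neighbours of vertex v in Q_d, with vertices encoded as d-bit integers."""
    return {v ^ (1 << i) for i in range(d)}

def size3_polymers(d):
    """Count 2-linked 3-subsets of the even side of Q_d and record |N(S)| by type."""
    even = [v for v in range(1 << d) if bin(v).count("1") % 2 == 0]
    stats = {"triangle": {"count": 0, "nbhd_sizes": set()},
             "path": {"count": 0, "nbhd_sizes": set()}}
    for S in combinations(even, 3):
        # Edges of the distance-2 graph induced on S.
        edges = sum(1 for a, b in combinations(S, 2) if bin(a ^ b).count("1") == 2)
        if edges < 2:
            continue  # not 2-linked
        kind = "triangle" if edges == 3 else "path"
        nbhd = set().union(*(neighbours(v, d) for v in S))
        stats[kind]["count"] += 1
        stats[kind]["nbhd_sizes"].add(len(nbhd))
    return stats

print(size3_polymers(5))
```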

Clusters

There is a single cluster type of size $1$, consisting of a single polymer of size $1$, with Ursell function $1$. Thus

$$L_{1}=2^{d-1}\lambda(1+\lambda)^{-d}.$$

There are two types of clusters of size $2$ : an ordered pair of incompatible polymers of size $1$ , of which there are $2^{d-1}+2^{d-2}d(d-1)$ , with Ursell function $-1/2$ and weight $\lambda^{2}(1+\lambda)^{-2d}$ , and one polymer of size $2$ with Ursell function $1$ and count and weight given above.

All together this gives:

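$$L_{2}=-\frac{1}{2}\left(2^{d-1}+2^{d-2}d(d-1)\right)\lambda^{2}(1+\lambda)^{-2d}+2^{d-3}d(d-1)\lambda^{2}(1+\lambda)^{-2d+2}=2^{d-3}\lambda^{2}(1+\lambda)^{-2d}\left[d(d-1)\left((1+\lambda)^{2}-1\right)-2\right].$$

At $\lambda=1$ this is

$$L_{2}=\frac{3d^{2}-3d-2}{8\cdot 2^{d}}.$$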

There are five types of clusters of size $3$ :

(1) One polymer of size $3$, first type: $2^{d-2}d(d-1)(d-2)/3$ of weight $\lambda^{3}(1+\lambda)^{-3d+5}$, Ursell function $1$.

(2) One polymer of size $3$, second type: $2^{d-4}d(d-1)(d-2)(d-3)$ of weight $\lambda^{3}(1+\lambda)^{-3d+4}$, Ursell function $1$.

(3) Three polymers of size $1$, incompatibility graph is a triangle: $2^{d-1}+3\cdot 2^{d-2}d(d-1)+2^{d-1}d(d-1)(d-2)$ of weight $\lambda^{3}(1+\lambda)^{-3d}$, Ursell function $1/3$.

(4) Three polymers of size $1$, incompatibility graph is a path on $3$ vertices: $3\cdot 2^{d-3}d(d-1)(d-2)(d-3)$ of weight $\lambda^{3}(1+\lambda)^{-3d}$, Ursell function $1/6$.

(5) One polymer of size $2$, one of size $1$: $2^{d-2}d(d-1)[d(d-1)-2(d-2)]$ of weight $\lambda^{3}(1+\lambda)^{-3d+2}$, Ursell function $-1/2$.

All together this gives:

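$$L_{3}=\lambda^{3}(1+\lambda)^{-3d}\Big[\tfrac{1}{3}2^{d-2}d(d-1)(d-2)(1+\lambda)^{5}+2^{d-4}d(d-1)(d-2)(d-3)(1+\lambda)^{4}+\tfrac{1}{3}\left(2^{d-1}+3\cdot 2^{d-2}d(d-1)+2^{d-1}d(d-1)(d-2)\right)+\tfrac{1}{6}\cdot 3\cdot 2^{d-3}d(d-1)(d-2)(d-3)-\tfrac{1}{2}\,2^{d-2}d(d-1)\left[d(d-1)-2(d-2)\right](1+\lambda)^{2}\Big].$$

At $\lambda=1$ this is

$$L_{3}=\frac{27d^{4}-74d^{3}-3d^{2}+50d+8}{48\cdot 2^{2d}}.$$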

Proof of Theorem 2.

Theorem 8 tells us that

[TABLE]

since $L_{1}=1/2$ . If we write $L_{k}=a_{k-1}2^{-(k-1)d}$ , then we have

[TABLE]

Since the Taylor series for $\exp(a_{1}x+a_{2}x^{2}+a_{3}x^{3})$ around $x=0$ is

[TABLE]

we have

[TABLE]

which gives Theorem 2. ∎
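The arithmetic in this proof can be verified with exact rational arithmetic. The sketch below recomputes $L_{1},L_{2},L_{3}$ at $\lambda=1$ directly from the cluster counts, weights and Ursell functions listed in this section, then forms the coefficients of $2^{-d}$ and $2^{-2d}$ in $\exp(L_{2}+L_{3})$, namely $a_{1}$ and $a_{2}+a_{1}^{2}/2$, for comparison with the polynomials in Theorem 2. Function names are illustrative; since both sides are polynomials of degree at most $4$ in $d$, agreement at more than four values of $d$ confirms the identity.

```python
from fractions import Fraction

def cluster_sums_at_one(d):
    """Return (L1, L2, L3) at lambda = 1 from counts, weights and Ursell functions."""
    two = Fraction(2)
    p2 = d * (d - 1)
    p3 = p2 * (d - 2)
    p4 = p3 * (d - 3)
    L1 = two ** (d - 1) * two ** (-d)
    L2 = (Fraction(-1, 2) * (two ** (d - 1) + two ** (d - 2) * p2) * two ** (-2 * d)
          + two ** (d - 3) * p2 * two ** (-2 * d + 2))
    L3 = (two ** (d - 2) * p3 / 3 * two ** (-3 * d + 5)              # type (1)
          + two ** (d - 4) * p4 * two ** (-3 * d + 4)                # type (2)
          + Fraction(1, 3) * (two ** (d - 1) + 3 * two ** (d - 2) * p2
                              + two ** (d - 1) * p3) * two ** (-3 * d)  # type (3)
          + Fraction(1, 6) * 3 * two ** (d - 3) * p4 * two ** (-3 * d)  # type (4)
          - Fraction(1, 2) * two ** (d - 2) * p2 * (p2 - 2 * (d - 2))
          * two ** (-3 * d + 2))                                     # type (5)
    return L1, L2, L3

def theorem2_coefficients(d):
    """Coefficients of 2^(-d) and 2^(-2d) in exp(L2 + L3): a1 and a2 + a1^2 / 2."""
    _, L2, L3 = cluster_sums_at_one(d)
    a1 = L2 * 2 ** d
    a2 = L3 * 2 ** (2 * d)
    return a1, a2 + a1 ** 2 / 2

print(theorem2_coefficients(4))
```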

Acknowledgements

WP supported in part by NSF Career award 1847451. Part of this work was done while WP was visiting the Simons Institute for the Theory of Computing. We thank Lina Li for some very helpful comments on this paper.

Bibliography

[1] J. Balogh, R. Morris, and W. Samotij. Independent sets in hypergraphs. Journal of the American Mathematical Society, 28(3):669–709, 2015.
[2] A. Björklund, T. Husfeldt, P. Kaski, and M. Koivisto. Computing the Tutte polynomial in vertex-exponential time. In Proceedings of the Forty-ninth Annual Symposium on Foundations of Computer Science, FOCS 2008, pages 677–686. IEEE, 2008.
[3] S. Cannon and W. Perkins. Counting independent sets in unbalanced bipartite graphs. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1456–1466. SIAM, 2020.
[4] R. Dobrushin. Estimates of semi-invariants for the Ising model at low temperatures. Translations of the American Mathematical Society-Series 2, 177:59–82, 1996.
[5] J. Engbers and D. Galvin. H-coloring tori. Journal of Combinatorial Theory, Series B, 102(5):1110–1133, 2012.
[6] D. Galvin. On homomorphisms from the Hamming cube to $\mathbb{Z}$. Israel Journal of Mathematics, 138(1):189–213, 2003.
[7] D. Galvin. A threshold phenomenon for random independent sets in the discrete hypercube. Combinatorics, Probability and Computing, 20(1):27–51, 2011.
[8] D. Galvin. Independent sets in the discrete hypercube. arXiv preprint arXiv:1901.01991, 2019.
