Fast algorithms at low temperatures via Markov chains

Zongchen Chen; Andreas Galanis; Leslie Ann Goldberg; Will Perkins; James Stewart; Eric Vigoda

arXiv:1901.06653·cs.DS·April 14, 2021

TL;DR

This paper introduces a new Markov chain approach for polymer models that achieves rapid mixing at low temperatures, enabling efficient sampling and approximation algorithms for the Potts and hard-core models on bounded-degree graphs.

Contribution

The authors develop a Markov chain method that avoids the analysis of zero-free regions in the complex plane and yields faster sampling algorithms, with near-linear running time for the low-temperature ferromagnetic Potts model on bounded-degree expander graphs.

Findings

01

Achieves $O(n \, \log n)$ sampling time for the Potts model.

02

Achieves $O(n^2 \, \log n)$ sampling time for the hard-core model.

03

Proves polynomial mixing time for spin Glauber dynamics in restricted state spaces.

Abstract

We define a discrete-time Markov chain for abstract polymer models and show that under sufficient decay of the polymer weights, this chain mixes rapidly. We apply this Markov chain to polymer models derived from the hard-core and ferromagnetic Potts models on bounded-degree (bipartite) expander graphs. In this setting, Jenssen, Keevash and Perkins (2019) recently gave an FPTAS and an efficient sampling algorithm at sufficiently high fugacity and low temperature respectively. Their method is based on using the cluster expansion to obtain a complex zero-free region for the partition function of a polymer model, and then approximating this partition function using the polynomial interpolation method of Barvinok. Our approach via the polymer model Markov chain circumvents the zero-free analysis and the generalization to complex parameters, and leads to a sampling algorithm with a fast…

Equations

$$\mu_{G,\lambda}(I)$$

$$Z(G)$$

$$\mu_{G}(\Gamma)$$

$$\sum_{\gamma'\nsim\gamma}|\gamma'|\,w_{\gamma'}\leq\theta|\gamma|$$

$$w_{\gamma}$$

$$\mu_{G,\beta}(\sigma)$$

$$\sum_{\gamma'\nsim\gamma}e^{|\gamma'|}w_{\gamma'}\leq|\gamma|$$

$$\sum_{\gamma'\nsim\gamma}e^{|\gamma'|}w_{\gamma'}\leq\sum_{v\in\gamma\cup\partial\gamma}\sum_{k\geq 1}\sum_{\gamma'\in\mathcal{C}(G):\,|\gamma'|=k,\,v\in\gamma'}e^{k}e^{-\tau k}$$

$$\sum_{\gamma'\nsim\gamma}e^{|\gamma'|}w_{\gamma'}\leq\frac{|\gamma|(\Delta+1)}{e\Delta}\sum_{k\geq 1}e^{-3k}\leq|\gamma|$$

$$\frac{\mu_{G}(\Gamma')}{\mu_{G}(\Gamma)}=\frac{\prod_{\gamma'\in\Gamma'}w_{\gamma'}}{\prod_{\gamma'\in\Gamma}w_{\gamma'}}$$

$$T_{x}(\varepsilon)=\min\{t>0\mid\text{for all }t'\geq t,\ \|P^{t'}(x,\cdot)-\nu(\cdot)\|_{TV}\leq\varepsilon\}$$

$$E[D(X_{t+1},Y_{t+1})]$$

$$\Pr[\mathbf{k}=k]=(1-e^{-r})e^{-rk}$$

$$\sum_{\gamma\in\mathcal{A}(v)}w_{\gamma}e^{r|\gamma|}$$

$${}=O\Big(1+\sum_{k\geq 1}\Pr[\mathbf{k}=k]\big(k^{5}(e\Delta)^{2k}+k^{c}(e(q-1)\Delta)^{k}\big)\Big)$$

$${}=O\Big(1+\sum_{k\geq 1}e^{-rk}\,k^{c}(e(q-1)\Delta)^{2k}\Big)=O\Big(1+\sum_{k\geq 1}k^{c}\,e^{-(\tau'+1)k}\Big)=O(1)$$

$$w_{\gamma}(\rho)=w_{\gamma}e^{-\rho|\gamma|}$$

$$Z(\rho)=Z(G;\rho)=\sum_{\Gamma\in\Omega}\prod_{\gamma\in\Gamma}w_{\gamma}(\rho)=\sum_{\Gamma\in\Omega}\prod_{\gamma\in\Gamma}w_{\gamma}e^{-\rho|\gamma|}$$

$$\frac{1}{Z(0)}=\frac{1}{Z(\rho_{0})}=\frac{Z(\rho_{1})}{Z(\rho_{0})}\,\frac{Z(\rho_{2})}{Z(\rho_{1})}\cdots\frac{Z(\rho_{\ell})}{Z(\rho_{\ell-1})}\,\frac{1}{Z(\rho_{\ell})}$$

$$W_{i}=\prod_{\gamma\in\Gamma_{i}}\frac{w_{\gamma}(\rho_{i+1})}{w_{\gamma}(\rho_{i})},\quad\text{where }\Gamma_{i}\sim\mu_{\rho_{i}}$$

$$E[W_{i}]=\frac{Z(\rho_{i+1})}{Z(\rho_{i})}\quad\text{and}\quad E[W_{i}^{2}]=\frac{Z(\rho_{i+2})}{Z(\rho_{i})}$$

$$E[W]=\frac{Z(\rho_{\ell})}{Z(0)}\quad\text{and}\quad E[W^{2}]=\frac{Z(\rho_{\ell})Z(\rho_{\ell+1})}{Z(0)Z(\rho_{1})}$$

$$E[W_{i}]=\frac{1}{Z(\rho_{i})}\sum_{\Gamma_{i}\in\Omega}\prod_{\gamma\in\Gamma_{i}}w_{\gamma}e^{-(i+1)|\gamma|/n}=\frac{Z(\rho_{i+1})}{Z(\rho_{i})}$$

$$E[W_{i}^{2}]=\frac{1}{Z(\rho_{i})}\sum_{\Gamma_{i}\in\Omega}\prod_{\gamma\in\Gamma_{i}}w_{\gamma}e^{-(i+2)|\gamma|/n}=\frac{Z(\rho_{i+2})}{Z(\rho_{i})}$$

$$E[W]=\prod_{i=0}^{\ell-1}E[W_{i}]=\prod_{i=0}^{\ell-1}\frac{Z(\rho_{i+1})}{Z(\rho_{i})}=\frac{Z(\rho_{\ell})}{Z(\rho_{0})}$$

$$E[W^{2}]=\prod_{i=0}^{\ell-1}E[W_{i}^{2}]=\prod_{i=0}^{\ell-1}\frac{Z(\rho_{i+2})}{Z(\rho_{i})}=\frac{Z(\rho_{\ell})Z(\rho_{\ell+1})}{Z(\rho_{0})Z(\rho_{1})}$$

$$1\leq Z(\rho_{\ell})\leq e^{\varepsilon/2}$$

$$Z(\rho_{\ell})\leq\prod_{\gamma\in\mathcal{C}(G)}\big(1+w_{\gamma}e^{-\ell|\gamma|/n}\big)$$

Full text

Fast algorithms at low temperatures via Markov chains (these results were announced in preliminary form, without proofs, as a brief abstract in the proceedings of APPROX/RANDOM 2019)

Zongchen Chen∗ (School of Computer Science, Georgia Institute of Technology. Research supported in part by NSF grants CCF-1617306 and CCF-1563838.)

Andreas Galanis† (The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) ERC grant agreement no. 334828. The paper reflects only the authors’ views and not the views of the ERC or the European Commission. The European Union is not liable for any use that may be made of the information contained therein. Authors’ address: Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK.)

Leslie Ann Goldberg†

Will Perkins (Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago. Supported in part by NSF grants DMS-1847451 and CCF-1934915. Part of this work was done while WP was visiting the Simons Institute for the Theory of Computing.)

James Stewart†

Eric Vigoda∗

(April 13, 2021)

Abstract

Efficient algorithms for approximate counting and sampling in spin systems typically apply in the so-called high-temperature regime, where the interaction between neighboring spins is “weak”. In contrast, recent work of Jenssen, Keevash and Perkins yields polynomial-time algorithms in the low-temperature regime on bounded-degree (bipartite) expander graphs using polymer models and the cluster expansion.

In order to speed up these algorithms (so that the exponent in the running time does not depend on the degree bound), we present a Markov chain for polymer models and show that it is rapidly mixing under exponential decay of polymer weights. This yields, for example, an $O(n\log n)$-time sampling algorithm for the low-temperature ferromagnetic Potts model on bounded-degree expander graphs. Combining our results for the hard-core and Potts models with Markov chain comparison tools, we obtain polynomial mixing time for Glauber dynamics restricted to appropriate portions of the state space.

1 Introduction

The hard-core model from statistical physics is defined on the set of independent sets of a graph $G$, where the independent sets are weighted by a fugacity $\lambda>0$. The associated Gibbs distribution $\mu_{G,\lambda}$ is defined as follows, for an independent set $I$:

$$\mu_{G,\lambda}(I)=\frac{\lambda^{|I|}}{Z_{G,\lambda}},$$

where $Z_{G,\lambda}=\sum_{I\in\mathcal{I}(G)}\lambda^{|I|}$ is the hard-core partition function (also called the independence polynomial) and $\mathcal{I}(G)$ is the set of independent sets of $G$.

In applications, there are two important computational tasks associated with a spin model such as the hard-core model. Given an error parameter $\varepsilon\in(0,1)$, an $\varepsilon$-approximate counting algorithm outputs a number $\hat{Z}$ so that $e^{-\varepsilon}Z_{G,\lambda}\leq\hat{Z}\leq e^{\varepsilon}Z_{G,\lambda}$, and an $\varepsilon$-approximate sampling algorithm outputs a random sample $I$ with distribution $\hat{\mu}$ so that the total variation distance satisfies $\|\mu_{G,\lambda}-\hat{\mu}\|_{TV}<\varepsilon$.
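To make these definitions concrete, the following is a minimal brute-force sketch (exponential time, only for tiny graphs) that computes $Z_{G,\lambda}$ and $\mu_{G,\lambda}$ exactly. The function names and the 4-cycle example are illustrative assumptions, not part of the paper.

```python
from itertools import combinations

def independent_sets(n, edges):
    """Yield every independent set of the graph on vertices 0..n-1 as a frozenset."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if all(frozenset(pair) not in edge_set for pair in combinations(subset, 2)):
                yield frozenset(subset)

def hard_core_partition_function(n, edges, lam):
    """Z_{G,lambda}: sum over independent sets I of lambda^{|I|}."""
    return sum(lam ** len(I) for I in independent_sets(n, edges))

def hard_core_gibbs(n, edges, lam):
    """Gibbs distribution mu_{G,lambda} as a dict {independent set: probability}."""
    Z = hard_core_partition_function(n, edges, lam)
    return {I: lam ** len(I) / Z for I in independent_sets(n, edges)}

# Hypothetical example: the 4-cycle has the empty set, 4 singletons and
# 2 opposite pairs as independent sets, so Z = 1 + 4*lam + 2*lam**2.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
Z = hard_core_partition_function(4, cycle4, 2.0)
mu = hard_core_gibbs(4, cycle4, 2.0)
```

An $\varepsilon$-approximate counting algorithm would have to return a value within a factor $e^{\pm\varepsilon}$ of `Z` without enumerating all independent sets.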

While classical statistical physics is most interested in studying the hard-core model on the integer lattice $\mathbb{Z}^{d}$, the perspective of computer science is to consider wider families of graphs, such as the set of all graphs, all graphs of maximum degree $\Delta$, or all bipartite graphs of maximum degree $\Delta$.

Almost all proven efficient algorithms for approximate counting and sampling from the hard-core model work for low fugacities (the weak interaction regime, akin to the high temperature regime of the Potts model). In the high temperature regime there are at least three distinct algorithmic approaches to approximate counting and sampling: Markov chains, correlation decay, and polynomial interpolation. One striking advantage of the Markov chain approach is that the algorithms are much faster and simpler than the algorithms from the other approaches. In particular, it is common for a Markov chain sampling algorithm to run in time $O(n\log n)$ (see, e.g., [13, 14]), while typical running times for algorithms based on correlation decay [31, 25] and polynomial interpolation [1] are $n^{O(\log\Delta)}$, where $\Delta$ is the maximum degree of the graph.

In general there are no known efficient algorithms at low temperatures (high fugacities), but recently efficient algorithms have been developed for some special classes of graphs including subsets of $\mathbb{Z}^{d}$ [18], random regular bipartite graphs, and bipartite expander graphs in general [20, 24]. What these bipartite graphs have in common is that for large enough $\lambda$, typical independent sets drawn from $\mu_{G,\lambda}$ align closely with one side or the other of the bipartition (the two ground states). This phenomenon is related to phase transitions in infinite graphs, and implies exponentially slow mixing of local Markov chains [5, 16, 26]. The algorithms introduced in [18] exploit this phenomenon by expressing the partition function $Z_{G,\lambda}$ in terms of deviations from the two ground states, and then using a truncation of a convergent series expansion (the Taylor series or the cluster expansion) to approximate the log partition function. In statistical physics this is called a perturbative approach, and while in general it does not work in the largest possible range of parameter space, when it does work it gives a very detailed probabilistic understanding of the model [28, 7, 10].

To apply the perturbative approach at low temperatures, one rewrites the original spin model as a new model in which single spin interactions are replaced by the interaction of connected components representing deviations from a chosen ground state. Such models are known in general as abstract polymer models [22] (see Section 1.1 for the polymer models we consider here) and have long been used in statistical physics to understand phase transitions. In this paper, we show that once a low-temperature spin model has been transformed into a polymer model, Markov chains once again become an effective algorithmic tool. Using this approach we obtain nearly linear and quadratic time sampling algorithms for low temperature models on expander graphs in cases where only $n^{O(\log\Delta)}$-time algorithms were previously known.

1.1 Subset polymer models

Abstract polymer models, as defined by Kotecký and Preiss [22], are an important tool in studying the equilibrium phases of statistical physics models on lattices; see, e.g., [23, 7] among many others. (See also the relevant notion of ‘animal models’ by Dobrushin [10]. The paper [4] has a more detailed history of their use in statistical physics and combinatorics.) Recently, polymer models have been used to develop efficient deterministic algorithms for sampling and approximating the partition functions of statistical physics models on lattices [18] and expander graphs [20, 24] at low temperatures, the regime in which Markov chains like the Glauber dynamics are known to mix slowly.

We will study the following class of abstract polymer models, known as subset polymer models (defined by Gruber and Kunz [17]). We begin by describing the relevant polymers: start with a finite host graph $G$ and a set $[q]=\{0,\ldots,q-1\}$ of spins. For each vertex $v$, there is a ground-state spin $g_{v}$. A polymer $\gamma$ consists of a connected set of vertices together with an assignment $\sigma_{\gamma}$ of spins from $\{0,\ldots,q-1\}\setminus\{g_{v}\}$ to each vertex $v\in\gamma$ (we abuse notation and use $\gamma$ to denote both the polymer and the associated set of vertices). The size of a polymer, $|\gamma|$, is the number of vertices in $\gamma$. The set of all polymers is $\mathcal{P}(G)$.

A polymer model on $G$ consists of a set $\mathcal{C}(G)\subseteq\mathcal{P}(G)$ of ‘allowed’ polymers, and a non-negative weight $w_{\gamma}$ for each polymer $\gamma\in\mathcal{C}(G)$. We denote this model by $(\mathcal{C}(G),w)$. Two polymers $\gamma$ and $\gamma^{\prime}$ are called ‘compatible’ (written $\gamma\sim\gamma^{\prime}$) if their distance in the host graph is at least $2$; otherwise they are ‘incompatible’ (written $\gamma\nsim\gamma^{\prime}$). The state space of allowable configurations is $\Omega=\{\Gamma\subseteq\mathcal{C}(G)\mid\gamma\sim\gamma^{\prime}\text{ for all distinct }\gamma,\gamma^{\prime}\in\Gamma\}$.

The partition function of the polymer model is

$$Z(G)=\sum_{\Gamma\in\Omega}\prod_{\gamma\in\Gamma}w_{\gamma},$$

where the empty set of polymers contributes $1$ to the partition function. The Gibbs measure $\mu_{G}$ is the probability distribution on $\Omega$ given by

$$\mu_{G}(\Gamma)=\frac{\prod_{\gamma\in\Gamma}w_{\gamma}}{Z(G)}.$$

Note that the polymer model is in fact a hard-core model on the ‘incompatibility graph’ of $\mathcal{C}(G)$, where two polymers are joined by an edge if they are incompatible, with non-uniform fugacities given by the weights $w_{\gamma}$. The geometry inherited from the host graph $G$ and the sizes of the polymers add further structure to the model.

Example 1.

One instance of a polymer model is the hard-core model itself: polymers are single vertices of the graph $G$, labeled with ‘1’ (for occupied) against a ground state ‘0’ (for unoccupied). Each polymer (vertex) $v$ has weight $w_{v}=\lambda$. Then the set of allowable polymer configurations is exactly the set of independent sets of $G$, and so the polymer model partition function is exactly the partition function of the hard-core model on $G$.
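Example 1 can be checked numerically on a small graph. The sketch below enumerates all pairwise-compatible sets of polymers by brute force; representing a polymer as a `frozenset` of vertices (ignoring spin labels) and the names `compatible` and `polymer_partition_function` are assumptions made for illustration.

```python
from itertools import combinations
from math import prod

def compatible(p, q, adj):
    """Polymers are compatible when their vertex sets are at graph distance >= 2,
    i.e. they share no vertex and no edge joins them."""
    if p & q:
        return False
    return all(adj[u].isdisjoint(q) for u in p)

def polymer_partition_function(polymers, weights, adj):
    """Sum, over pairwise-compatible sets of polymers, of the product of weights.
    The empty configuration contributes 1 (the empty product)."""
    total = 0.0
    for size in range(len(polymers) + 1):
        for config in combinations(range(len(polymers)), size):
            if all(compatible(polymers[i], polymers[j], adj)
                   for i, j in combinations(config, 2)):
                total += prod(weights[i] for i in config)
    return total

# Example 1 on the 4-cycle: one polymer per vertex, each with weight lambda.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
lam = 2.0
polymers = [frozenset({v}) for v in adj]
weights = [lam] * len(polymers)
Z_polymer = polymer_partition_function(polymers, weights, adj)
```

For the 4-cycle the independence polynomial is $1+4\lambda+2\lambda^{2}$, which the polymer partition function reproduces.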

Example 2.

A second instance of a polymer model is related to the ferromagnetic $q$-color Potts model on a graph $G$ (see Definition 8 below). Fix a color $g\in[q]$ to be the ground state color, and define polymers to be connected subgraphs of $G$ of size at most $M$, with vertices labeled by the remaining colors $[q]\setminus\{g\}$. A polymer $\gamma$ has weight $w_{\gamma}=e^{-\beta B(\gamma)}$, where $B(\gamma)$ is the number of bichromatic edges in $\gamma$ plus the size of the edge boundary of $\gamma$ in $G$. A configuration of compatible polymers maps to a unique Potts configuration $\sigma$ in which all connected components of non-$g$-colored vertices have size at most $M$, and the weight of $\sigma$ in the Potts model is exactly the product of the weights of the polymers. The polymer model partition function $Z(G)$, with an appropriate choice of $M$, represents the contribution to the Potts model partition function of colorings where color $g$ ‘dominates’; see also Section 1.3 for more details.
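The weight in Example 2 is straightforward to compute. The following sketch assumes a polymer is given as a dict from its vertices to non-ground colors, that vertices are comparable (integers here), and that connectivity and the size bound $M$ are checked elsewhere; the function name is hypothetical.

```python
from math import exp

def potts_polymer_weight(coloring, adj, beta):
    """Return exp(-beta * B(gamma)), where B(gamma) counts bichromatic edges
    inside the polymer plus edges leaving the polymer."""
    inside = set(coloring)
    bichromatic = 0
    boundary = 0
    for u in inside:
        for v in adj[u]:
            if v in inside:
                # count each internal edge once (u < v) if its endpoints differ
                if u < v and coloring[u] != coloring[v]:
                    bichromatic += 1
            else:
                boundary += 1
    return exp(-beta * (bichromatic + boundary))

# 4-cycle with ground color 0; polymer on vertices {0, 1} with colors 1 and 2.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
w = potts_polymer_weight({0: 1, 1: 2}, adj, beta=1.0)
```

Here $B(\gamma)=3$ (one bichromatic internal edge and two boundary edges), which matches the three bichromatic edges of the Potts coloring $(1,2,0,0)$ on the 4-cycle.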

As with the hard-core model, there are two main computational problems associated with a polymer model: approximate sampling from $\mu_{G}$ and approximate counting of $Z(G)$. We will approach them both via Markov chain algorithms. In general we will be interested in families of polymer models defined on classes of graphs. We denote such a family $(\mathcal{C}(\cdot),w,\mathcal{G})$, where for each graph $G\in\mathcal{G}$, $(\mathcal{C}(G),w)$ is a polymer model. We will always use $n$ to denote the number of vertices of a graph $G$.

We consider two conditions on the weight functions $w_{\gamma}$ and give their algorithmic consequences.

Definition 1.

A polymer model $(\mathcal{C}(\cdot),w,\mathcal{G})$ satisfies the polymer mixing condition if there exists $\theta\in(0,1)$ such that

$$\sum_{\gamma^{\prime}\nsim\gamma}|\gamma^{\prime}|\,w_{\gamma^{\prime}}\leq\theta|\gamma|\tag{2}$$

for all $G\in\mathcal{G}$ and all $\gamma\in\mathcal{C}(G)$.
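For a small, explicitly listed polymer model, the polymer mixing condition can be tested directly. The sketch below computes the smallest admissible $\theta$, namely the maximum over $\gamma$ of $\sum_{\gamma'\nsim\gamma}|\gamma'|w_{\gamma'}/|\gamma|$; note that every polymer is incompatible with itself. Polymers are modeled as `frozenset`s of vertices, and the helper names are assumptions.

```python
def incompatible(p, q, adj):
    """True when two polymers share a vertex or are joined by an edge."""
    return bool(p & q) or any(not adj[u].isdisjoint(q) for u in p)

def mixing_ratio(polymers, weights, adj):
    """Smallest theta for which the mixing condition holds on this instance:
    max over gamma of (sum over incompatible gamma' of |gamma'| * w_gamma') / |gamma|."""
    worst = 0.0
    for g in polymers:
        s = sum(len(h) * w for h, w in zip(polymers, weights)
                if incompatible(g, h, adj))
        worst = max(worst, s / len(g))
    return worst

# Single-vertex polymers on the 4-cycle, all with the same weight lam.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
polymers = [frozenset({v}) for v in adj]
theta_low = mixing_ratio(polymers, [0.1] * 4, adj)   # 3 * 0.1: condition holds
theta_high = mixing_ratio(polymers, [0.5] * 4, adj)  # 3 * 0.5: condition fails
```

The condition holds on the instance exactly when the returned value is strictly below $1$. For single-vertex polymers of weight $\lambda$ on a $\Delta$-regular graph the value is $(\Delta+1)\lambda$.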

We postpone the formal definition of mixing time to Section 2.2 and state our first main result here.

Theorem 2.

Suppose that a polymer model $(\mathcal{C}(\cdot),w,\mathcal{G})$ satisfies the polymer mixing condition (2). Then for each $G\in\mathcal{G}$ there is a Markov chain making single polymer updates with stationary distribution $\mu_{G}$ and mixing time $T_{\mathrm{mix}}(\varepsilon)=O(n\log(n/\varepsilon))$.

Theorem 2 on its own does not guarantee an efficient algorithm for sampling from $\mu_{G}$ because the Markov chain only yields an efficient sampling algorithm if we can implement each step efficiently. We will show that under a stronger condition we can do this.

Definition 3.

A polymer model $(\mathcal{C}(\cdot),w,\mathcal{G})$ is said to be computationally feasible if, for each $G\in\mathcal{G}$ and each $\gamma\in\mathcal{P}(G)$, we can determine, in time polynomial in $|\gamma|$, whether $\gamma\in\mathcal{C}(G)$, and compute $w_{\gamma}$ if it is.

Definition 4.

A polymer model $(\mathcal{C}(\cdot),w,\mathcal{G})$ with $q$ spins on a class $\mathcal{G}$ of graphs of maximum degree $\Delta$ satisfies the polymer sampling condition with constant $\tau\geq 5+3\log((q-1)\Delta)$ if

$$w_{\gamma}\leq e^{-\tau|\gamma|}\tag{3}$$

for all $G\in\mathcal{G}$ and all $\gamma\in\mathcal{C}(G)$.

We have the following theorem.

Theorem 5.

If a computationally feasible polymer model $(\mathcal{C}(\cdot),w,\mathcal{G})$ satisfies the polymer sampling condition (3), then for all $G\in\mathcal{G}$ there is an $\varepsilon$-approximate sampling algorithm for $\mu_{G}$ with running time $O(n\log(n/\varepsilon)\log(1/\varepsilon))$.

Note that the polymer sampling condition required by Theorem 5 is in general more demanding than the zero-freeness required by cluster-expansion algorithms, but, as Theorem 5 demonstrates, it leads to faster algorithms. Finally, we can use the sampling algorithm and simulated annealing to give a fully polynomial time randomized approximation scheme (FPRAS) for computing the partition function of polymer models.

Theorem 6.

If a computationally feasible polymer model $(\mathcal{C}(\cdot),w,\mathcal{G})$ satisfies the polymer sampling condition (3), then for all $G\in\mathcal{G}$ there is a randomized $\varepsilon$-approximate counting algorithm for $Z(G)$ with running time $O((n/\varepsilon)^{2}\log^{3}(n/\varepsilon))$ and success probability at least $3/4$.

Fernández, Ferrari, and Garcia [15] introduced a condition very similar to the polymer mixing condition in the setting of polymer models on $\mathbb{Z}^{d}$. Their objective was to derive probabilistic properties of polymer models directly, without going through the combinatorics and complex analysis inherent in the cluster expansion for the log partition function. They introduced a continuous time stochastic process whose stationary distribution was the infinite volume Gibbs measure of their polymer model, and their version of condition (2) implied an exponentially fast rate of convergence of this process. They remarked that such an approach had the potential to be an efficient computational tool.

Here we take an algorithmic point of view, and use the polymer mixing and sampling conditions to show that a simple discrete time Markov chain mixes rapidly and can be used to design efficient sampling and approximation algorithms. Our approach differs from that of [15] in that while they are interested primarily in the probabilistic properties of spin models on $\mathbb{Z}^{d}$, we are interested in algorithmic problems involving spin models on general families of graphs. Our setting of discrete time processes on finite graphs is also more suitable for studying algorithmic questions. Our work confirms the central point of [15]: that complex analysis and absolute convergence of the cluster expansion are not necessary to derive many important properties of a polymer model.

1.2 Applications

We apply our results for subset polymer models to two specific examples: the ferromagnetic Potts model and the hard-core model on expander graphs. To state these results we need some definitions.

Definition 7.

Let $\alpha>0$. A graph $G$ is an $\alpha$-expander graph if for all $S\subset V(G)$ with $|S|\leq|V(G)|/2$, we have $e(S,S^{c})\geq\alpha|S|$, where $S^{c}=V(G)\setminus S$ and $e(S,S^{c})$ is the number of edges exiting the set $S$.
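On a small graph, the largest $\alpha$ for which Definition 7 holds can be found by checking every set $S$. This brute-force sketch (exponential in $n$, function name assumed) is meant only to illustrate the definition.

```python
from itertools import combinations

def edge_expansion(n, edges):
    """Minimum of e(S, S^c) / |S| over nonempty S with |S| <= n/2.
    The graph is an alpha-expander exactly when alpha <= this value."""
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            S = set(subset)
            cut = sum(1 for u, v in edges if (u in S) != (v in S))
            best = min(best, cut / size)
    return best

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
k4 = list(combinations(range(4), 2))
alpha_cycle = edge_expansion(4, cycle4)  # two adjacent vertices have 2 exiting edges
alpha_k4 = edge_expansion(4, k4)         # any 2 vertices of K4 have 4 exiting edges
```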

Definition 8.

The $q$-color ferromagnetic Potts model with parameter $\beta>0$ is a random assignment of $q$ colors to the vertices of a graph defined by

$$\mu_{G,\beta}(\sigma)=\frac{e^{-\beta m(G,\sigma)}}{Z_{G,\beta}},$$

where $m(G,\sigma)$ is the number of bichromatic edges of $G$ under the coloring $\sigma$ and $Z_{G,\beta}=\sum_{\sigma}e^{-\beta m(G,\sigma)}$ is the Potts model partition function. The parameter $\beta$ is known as the inverse temperature.
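As an illustration of Definition 8, the sketch below evaluates $Z_{G,\beta}$ by summing over all $q^{n}$ colorings, which is feasible only for very small graphs. The function name and the triangle example are assumptions.

```python
from itertools import product
from math import exp

def potts_partition_function(n, edges, q, beta):
    """Z_{G,beta}: sum over all q^n colorings sigma of exp(-beta * m(G, sigma)),
    where m counts bichromatic edges."""
    total = 0.0
    for sigma in product(range(q), repeat=n):
        m = sum(1 for u, v in edges if sigma[u] != sigma[v])
        total += exp(-beta * m)
    return total

# Triangle with q = 2: the 2 monochromatic colorings have m = 0 and the
# other 6 colorings have m = 2, so Z = 2 + 6 * exp(-2 * beta).
triangle = [(0, 1), (1, 2), (0, 2)]
beta = 1.0
Z_potts = potts_partition_function(3, triangle, 2, beta)
ground_state_mass = 2 / Z_potts  # total probability of the monochromatic colorings
```

As $\beta$ grows, `ground_state_mass` tends to $1$, which is the low-temperature behavior the polymer representation exploits.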

Jenssen, Keevash, and Perkins [20] gave an FPTAS and polynomial-time sampling algorithm for the Potts model on expander graphs, with an algorithm based on the cluster expansion and Barvinok’s method of polynomial interpolation. Under essentially the same conditions on the parameters we give a Markov chain based sampling algorithm with near linear running time.

Theorem 9.

Suppose $q\geq 2$, $\Delta\geq 3$ are integers and $\alpha>0$ is a real. Then for $\beta\geq\frac{5+3\log((q-1)\Delta)}{\alpha}$ and any $qe^{-n}\leq\varepsilon<1$, there is an $\varepsilon$-approximate sampling algorithm for the $q$-state ferromagnetic Potts model with parameter $\beta$ on all $n$-vertex $\alpha$-expander graphs of maximum degree $\Delta$ with running time $O(n\log(n/\varepsilon)\log(1/\varepsilon))$. There is also an $\varepsilon$-approximate counting algorithm with running time $O((n/\varepsilon)^{2}\log^{3}(n/\varepsilon))$ and success probability at least $3/4$.

Note that, if the desired error satisfies $\varepsilon<qe^{-n}$, then we can simply compute the partition function by brute force in $\mathrm{poly}(n/\varepsilon)$ time. This observation combined with the above result gives an FPRAS, but we can no longer guarantee a running time of $O((n/\varepsilon)^{2}\log^{3}(n/\varepsilon))$ for exponentially small values of $\varepsilon$. A similar point also applies to the algorithm that we give for the hard-core model.

Definition 10.

Let $\alpha\in(0,1)$. A bipartite graph $G=(V,E)$ with bipartition $V=V^{0}\cup V^{1}$ is a bipartite $\alpha$-expander if, for $i\in\{0,1\}$ and all $S\subseteq V^{i}$ with $|S|\leq|V^{i}|/2$, we have $|N_{G}(S)|\geq(1+\alpha)|S|$, where $N_{G}(S)$ denotes the set of vertices that are adjacent to some vertex in $S$.

Again we give a fast Markov chain based algorithm for sampling from the hard-core model for essentially the same range of parameters for which an FPTAS is given in [20].

Theorem 11.

Suppose $\Delta\geq 3$ is an integer and $\alpha\in(0,1)$ is a real. Then for any $\lambda\geq(3\Delta)^{6/\alpha}$ and $4e^{-n}\leq\varepsilon<1$, there is an $\varepsilon$-approximate sampling algorithm for the hard-core model with parameter $\lambda$ on all $n$-vertex bipartite $\alpha$-expander graphs of maximum degree $\Delta$. There is also an $\varepsilon$-approximate counting algorithm for the hard-core model with success probability at least $1-\varepsilon$. Both algorithms run in time $O((n/\varepsilon)^{2}\log^{3}(n/\varepsilon)\log(1/\varepsilon))$.

The extra factor in the running time of the sampling algorithm for the hard-core model as compared to the Potts model is due to the fact that the hard-core model on a bipartite graph does not in general exhibit exact symmetry between the ground states, and so we must approximate the partition functions of the even and odd dominant independent sets to sample.

We can extend these algorithms to obtain fast sampling algorithms in most situations in which a counting problem can be put in the framework of subset polymer models. For instance, we can use Theorems 5 and 6 to improve the running times of the algorithms given by [21, 24] for sampling and counting proper $q$-colorings in $\Delta$-regular bipartite graphs (for large $\Delta$). The two papers give slightly different polymer models for proper $q$-colorings on $\Delta$-regular bipartite graphs; see [21, Section 5] and [24, Section 5.2]. Section 5.2 of [24] shows that their polymer model is computationally feasible. Section 5.1 of [21] shows that their polymer model satisfies the Kotecký–Preiss condition; in fact, their proof establishes the polymer sampling condition (3). It is easy to see (by comparing the polymer weights) that the polymer model of [24] therefore also satisfies the polymer sampling condition. Thus, we get the following corollary of Theorems 5 and 6.

Corollary 12.

There is an absolute constant $C>0$ so that for all even $q\geq 3$, all $\Delta\geq Cq^{2}\log^{2}q$ and all $\varepsilon>e^{-n/(8q)}$, there is an $\varepsilon$-approximate sampling algorithm that samples a uniformly random proper $q$-coloring of a random $\Delta$-regular bipartite graph and runs in time $O(n\log(n/\varepsilon)\log(1/\varepsilon))$. Furthermore, there is a randomized $\varepsilon$-approximation algorithm for the number of proper $q$-colorings with running time $O((n/\varepsilon)^{2}\log^{3}(n/\varepsilon))$ and success probability at least $3/4$. For odd $q$, there are $\varepsilon$-approximate counting and sampling algorithms that both run in time $O((n/\varepsilon)^{2}\log^{3}(n/\varepsilon)\log(1/\varepsilon))$.

As with independent sets, the extra factor in the running time for odd $q$ comes from the fact that the ground states (colorings in which one side of the bipartition is assigned $\lceil q/2\rceil$ colors and the other side $\lfloor q/2\rfloor$ colors) are exactly symmetric only if $q$ is even.

Finally, we remark that the approximate counting algorithms for these applications based on truncating the cluster expansion can run faster than $n^{O(\log\Delta)}$ if the parameters (expansion, fugacity, inverse temperature) are high enough (see [21, Theorem 8]), but the sampling algorithms derived from this approach will not match the $\tilde{O}(n)$ or $\tilde{O}(n^{2})$ sampling algorithms we obtain here.

1.3 Comparison to spin Glauber dynamics

A very natural idea to sample at low temperatures (large $\beta$ for the Potts model, large $\lambda$ for the hard-core model) is to use a single-spin update Markov chain like the Glauber dynamics, but to start in one of the ground states of the model chosen at random. For example, pick one of the $q$ colors with equal probability, then start the Potts model Glauber dynamics in the monochromatic configuration with that color. The intuition is that the Glauber dynamics will mix well within the portion of the state space close to the chosen ground state, and the randomness in the choice of ground state will ensure that an accurate sample from the full measure is obtained. Analyzing this algorithm was suggested in [18] and [20].
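The procedure just described can be sketched as follows. This is an illustration of the heuristic only, not an algorithm with a proven guarantee; it assumes the standard heat-bath update, in which the chosen vertex receives color $c$ with probability proportional to $e^{-\beta\cdot\#\{\text{neighbors with color}\neq c\}}$. The function name and signature are hypothetical.

```python
import random
from math import exp

def potts_glauber_from_ground_state(adj, q, beta, steps, seed=0):
    """Run heat-bath Glauber dynamics for the ferromagnetic Potts model,
    started from a uniformly random monochromatic (ground state) coloring."""
    rng = random.Random(seed)
    g = rng.randrange(q)            # ground state color, chosen uniformly
    sigma = {v: g for v in adj}     # monochromatic starting configuration
    vertices = list(adj)
    for _ in range(steps):
        v = rng.choice(vertices)
        # conditional weight of each color given the colors of v's neighbors
        weights = [exp(-beta * sum(1 for u in adj[v] if sigma[u] != c))
                   for c in range(q)]
        sigma[v] = rng.choices(range(q), weights=weights)[0]
    return g, sigma

adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
g, sigma = potts_glauber_from_ground_state(adj, q=3, beta=2.0, steps=200, seed=1)
```

The number of steps needed for the output to be close to $\mu_{G,\beta}$ is exactly the open question; the sketch takes `steps` as a free parameter.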

While we are not yet able to show that this algorithm succeeds, we make partial progress. We show that Glauber dynamics, restricted to remain in a portion of the state space, mixes rapidly (in polynomial time). It is easiest to state our result for the ferromagnetic Potts model.

For a ground state color $g\in[q]$ and an integer $M$, let $\Omega^{g}_{M}(G)$ be the set of $q$-colorings of the vertices of $G$ so that every connected component of vertices colored from the palette $[q]\setminus\{g\}$ has size at most $M$. The set $\Omega^{g}_{M}(G)$ consists of colorings that come from the valid polymer configurations from Example 2 above. In [20] it is shown that for an appropriate choice of $M$, the set $\{\Omega^{g}_{M}(G)\}_{g\in[q]}$ forms an “almost partition” of the set of all colorings, in that the weight of both the overlap of the almost partition and the set of colorings uncovered by the almost partition is at most $\varepsilon$ under the conditions of Theorem 9. In particular, an $\varepsilon$-approximate sample from the Potts model restricted to $\Omega^{g}_{M}(G)$ for $M=O(\log(n/\varepsilon))$ is enough (by symmetry) to obtain a $(q\varepsilon)$-approximate sample from the Potts distribution $\mu_{G,\beta}$ (cf. Lemma 28). Using Markov chain comparison, we show in Section 5.3.1 that this can be done using the usual spin Glauber dynamics restricted to remain in $\Omega^{g}_{M}(G)$.
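Membership in $\Omega^{g}_{M}(G)$ is easy to test with a graph search over the non-$g$ vertices. The sketch below is one way to do this; the function name and the data layout (a dict `sigma` from vertices to colors, a dict `adj` of neighbor lists) are assumptions.

```python
def in_restricted_space(sigma, adj, g, M):
    """True when every connected component of vertices whose color differs
    from the ground color g has at most M vertices."""
    seen = set()
    for start in adj:
        if sigma[start] == g or start in seen:
            continue
        # depth-first search over the non-g component containing `start`
        stack, size = [start], 0
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if sigma[v] != g and v not in seen:
                    seen.add(v)
                    stack.append(v)
        if size > M:
            return False
    return True

# Path 0-1-2-3 with ground color 0: non-ground components are {0, 1} and {3}.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
coloring = {0: 1, 1: 2, 2: 0, 3: 1}
```

One natural way to restrict a single-spin chain to this set, assumed here for illustration, is to reject any proposed update whose result fails this test.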

Theorem 13.

Suppose $q\geq 2$, $\Delta\geq 3$ are integers and $\alpha>0$ is a real. Let $\beta\geq\frac{5+3\log((q-1)\Delta)}{\alpha}$ be a real number and $g\in[q]$. Then, for any $n$-vertex $\alpha$-expander graph $G$ of maximum degree $\Delta$ and any $\varepsilon\in(0,1)$, for $M=O(\log(n/\varepsilon))$ the Glauber dynamics restricted to $\Omega^{g}_{M}(G)$ has mixing time $T_{\mathrm{mix}}(\varepsilon)$ polynomial in $n$ and $1/\varepsilon$.

We remark that the polynomial bound in Theorem 13 depends exponentially on $q,\Delta,\alpha$; see the relevant Theorem 25 and Section 5.3.1 for details. Theorem 13 shows that despite exponentially slow mixing of the Glauber dynamics on the full state space [3], it can still be used to obtain a polynomial-time approximate sampling algorithm. We leave for future work two important extensions that would complete the picture: (1) showing that unrestricted Glauber dynamics starting from a well chosen configuration works, and (2) lowering the running time to $O(n\log n)$ from the large polynomial we obtain in the theorem.

In Section 5, we state a general theorem (Theorem 25) comparing the polymer model dynamics to spin model dynamics as well as a specific result for the hard-core model (Theorem 29).

2 Polymer models and Markov chains

Here we compare various conditions on the weight functions of a polymer model, namely the Kotecký–Preiss [22] condition and the polymer sampling condition, and show that the latter implies the former. Then, we define the polymer Markov chain which we use to prove Theorems 2 and 5.

2.1 A comparison of the conditions on the weights

Here we show that the polymer sampling condition (3) implies the well-known Kotecký–Preiss [22] condition:

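$$\sum_{\gamma^{\prime}\nsim\gamma}e^{|\gamma^{\prime}|}\,w_{\gamma^{\prime}}\leq|\gamma|\quad\text{for all }\gamma\in\mathcal{C}(G).$$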

To see the implication, we use a lemma of Borgs, Chayes, Kahn, and Lovász.

Lemma 14 ([6]).

Let $G$ have maximum degree $\Delta\geq 3$ and let $v\in V(G)$ . The number of connected induced subgraphs of $G$ of size $k$ containing $v$ is at most $(e\Delta)^{k-1}$ .

Now consider a polymer model satisfying (3) with constant $\tau\geq 5+3\log((q-1)\Delta)$ . Fix $\gamma\in\mathcal{C}(G)$ . We have that

[TABLE]

In order to account for all of the polymers that we sum over in the above, we consider the connected induced subgraphs of $G$ of size $k$ that contain $v$ , and the assignments to them of $q-1$ colours. Using Lemma 14, we therefore obtain that

[TABLE]

so the Kotecký–Preiss condition is satisfied.

The Kotecký–Preiss condition, in turn, implies the polymer mixing condition (2) with $\theta=1/e$ since $e\cdot x\leq e^{x}$ for $x\geq 1$ . For the same reason (since $e^{x}$ gets much bigger than $x$ ), it is easy to see that the polymer mixing condition is weaker than the Kotecký–Preiss condition.

2.2 The polymer Markov chain

For each $v\in V(G)$ , let $\mathcal{A}(v)=\{\gamma\in\mathcal{C}(G):v\in{\gamma}\}$ denote the collection of all polymers containing $v$ and let $a(v)=\sum_{\gamma\in\mathcal{A}(v)}w_{\gamma}$ . By applying (2) to the smallest $\gamma$ containing $v$ we have $a(v)\leq\theta<1$ for all $v\in V(G)$ . Define the probability distribution $\nu_{v}$ on $\mathcal{A}(v)\cup\{\emptyset\}$ by $\nu_{v}(\gamma)=w_{\gamma}$ for $\gamma\in\mathcal{A}(v)$ and $\nu_{v}(\emptyset)=1-a(v)$ .

The polymer dynamics on $\Omega$ are defined by the following transition rule from a configuration $\Gamma_{t}$ to a configuration $\Gamma_{t+1}$ :

Polymer Dynamics

1. Choose $v\in V(G)$ uniformly at random. Let $\gamma_{v}\in\Gamma_{t}\cap\mathcal{A}(v)$ if $\Gamma_{t}\cap\mathcal{A}(v)\neq\emptyset$ and let $\gamma_{v}=\emptyset$ otherwise. Note that $\gamma_{v}$ is well defined since $\Gamma_{t}$ can have at most one polymer containing $v$.

2. Mutually exclusively do the following:

• With probability $\frac{1}{2}$, let $\Gamma_{t+1}=\Gamma_{t}\setminus\gamma_{v}$.

• With probability $\frac{1}{2}$, sample $\boldsymbol{\gamma}$ from $\nu_{v}$, set $\Gamma_{t+1}=\Gamma_{t}\cup\boldsymbol{\gamma}$ if this is in $\Omega$ and set $\Gamma_{t+1}=\Gamma_{t}$ otherwise.

Note that the polymer dynamics are aperiodic, since there are self-loops, and irreducible since we can transition from any $\Gamma\in\Omega$ to any $\Gamma^{\prime}\in\Omega$ (e.g., via the empty set). Since the polymer dynamics are finite, irreducible, and aperiodic, they are also ergodic. Next, we observe that the stationary distribution of the polymer dynamics is $\mu_{G}$ by checking detailed balance. Note that each transition of the dynamics changes a configuration $\Gamma$ by at most one polymer $\gamma$ ; let $\Gamma^{\prime}=\Gamma\cup\gamma$ . Then

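$$\mu_{G}(\Gamma^{\prime})P(\Gamma^{\prime},\Gamma)=\mu_{G}(\Gamma^{\prime})\cdot\frac{|\gamma|}{2n}=\mu_{G}(\Gamma)\cdot\frac{|\gamma|}{2n}\,w_{\gamma}=\mu_{G}(\Gamma)P(\Gamma,\Gamma^{\prime}),$$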

where $P$ is the transition matrix of the polymer dynamics, and so $\mu_{G}$ is the stationary distribution.
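As a rough illustration of the transition rule, the following Python sketch performs one step of the polymer dynamics on a tiny graph. It is not the authors' implementation and rests on several assumptions: polymers are taken to be connected vertex subsets, two polymers are treated as compatible when they are disjoint and not joined by an edge, and `weight` is a user-supplied function with $a(v)<1$ for every $v$. All function names are hypothetical, and the brute-force enumeration is only sensible for very small graphs.

```python
import itertools
import math
import random


def is_connected(subset, adj):
    """Return True if `subset` induces a connected subgraph of `adj`."""
    subset = set(subset)
    start = next(iter(subset))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in subset and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == subset


def all_polymers(adj):
    """Brute-force list of connected vertex subsets (tiny graphs only)."""
    verts = sorted(adj)
    return [frozenset(c)
            for k in range(1, len(verts) + 1)
            for c in itertools.combinations(verts, k)
            if is_connected(c, adj)]


def compatible(g1, g2, adj):
    """Assumed compatibility relation: disjoint and not joined by an edge."""
    if g1 & g2:
        return False
    return not any(w in g2 for u in g1 for w in adj[u])


def polymer_step(config, adj, weight, rng):
    """One transition of the polymer dynamics; `config` is a frozenset of polymers."""
    v = rng.choice(sorted(adj))
    gamma_v = next((g for g in config if v in g), None)
    if rng.random() < 0.5:
        # Removal move: delete the polymer containing v, if there is one.
        return config - {gamma_v} if gamma_v is not None else config
    # Addition move: draw gamma from nu_v; the empty outcome has mass 1 - a(v).
    u, acc = rng.random(), 0.0
    for g in all_polymers(adj):
        if v in g:
            acc += weight(g)
            if u < acc:
                ok = all(compatible(g, h, adj) for h in config)
                return config | {g} if ok else config
    return config
```

For instance, on the path `adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}` with `weight = lambda g: math.exp(-2 * len(g))`, repeatedly calling `polymer_step` starting from `frozenset()` keeps every visited configuration pairwise compatible.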

We now formally define the mixing time. If $\mathcal{M}$ is an ergodic Markov chain with transition matrix $P$ and stationary distribution $\nu$ then the mixing time of $\mathcal{M}$ from a state $x$ is given by

$$T_{x}(\varepsilon)=\min\{t>0:\|P^{t}(x,\cdot)-\nu\|_{TV}\leq\varepsilon\},$$

where $\|\nu^{\prime}-\nu\|_{TV}$ denotes the total variation distance between distributions $\nu$ and $\nu^{\prime}$ . The mixing time of $\mathcal{M}$ is given by $T_{\mathrm{mix}}(\varepsilon)=\max_{x}T_{x}(\varepsilon)$ . We will write $T_{\mathrm{mix}}(\mathcal{M},\varepsilon)$ below if we need to emphasize which Markov chain we refer to.
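To make the definition concrete, here is a minimal Python sketch that evaluates $T_{x}(\varepsilon)$ and $T_{\mathrm{mix}}(\varepsilon)$ for a small chain given explicitly by its transition matrix. The function names and the cutoff `t_max` are illustrative assumptions, and the approach (iterating the distribution step by step) is only practical for tiny state spaces.

```python
def tv_distance(p, q):
    """Total variation distance between two distributions given as lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def mixing_time_from(P, x, nu, eps, t_max=10_000):
    """Smallest t > 0 with ||P^t(x, .) - nu||_TV <= eps, or None if t_max is hit."""
    n = len(P)
    row = [1.0 if y == x else 0.0 for y in range(n)]
    for t in range(1, t_max + 1):
        # Advance the distribution by one step: row <- row * P.
        row = [sum(row[y] * P[y][z] for y in range(n)) for z in range(n)]
        if tv_distance(row, nu) <= eps:
            return t
    return None


def mixing_time(P, nu, eps):
    """T_mix(eps): the maximum of T_x(eps) over all starting states x."""
    times = [mixing_time_from(P, x, nu, eps) for x in range(len(P))]
    return None if None in times else max(times)
```

For the lazy two-state chain `P = [[0.75, 0.25], [0.25, 0.75]]` with `nu = [0.5, 0.5]`, the distance after $t$ steps is $2^{-(t+1)}$, so `mixing_time(P, nu, 0.1)` returns 3.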

2.3 Proof of Theorems 2 and 5

We begin with the proof of Theorem 2.

Proof.

We will show that under condition (2) the mixing time of the polymer dynamics is $O(n\log(n/\varepsilon))$ by applying the path coupling technique. We define a metric $D(\cdot,\cdot)$ on $\Omega$ by setting $D(\Gamma,\Gamma^{\prime})=1$ if $\Gamma^{\prime}=\Gamma\cup\{\gamma\}$ or $\Gamma=\Gamma^{\prime}\cup\{\gamma\}$ for a polymer $\gamma$ and extending this as a shortest path metric; i.e., $D(\Gamma,\Gamma^{\prime})=|\Gamma\triangle\Gamma^{\prime}|$ for any $\Gamma,\Gamma^{\prime}\in\Omega$ where $\triangle$ denotes the symmetric difference of two sets.

Now suppose we couple two chains $X_{t}$ and $Y_{t}$ by attempting the same updates in both chains at each step. Suppose that $X_{t}=Y_{t}\cup\{\gamma\}$ for some polymer $\gamma$. With probability $\frac{|\gamma|}{n}\cdot\frac{1}{2}$ we pick $v\in{\gamma}$ and remove $\gamma_{v}$; this deletes $\gamma$ from $X_{t}$, while $Y_{t}$ contains no polymer through $v$ and is unchanged, which yields $X_{t+1}=Y_{t+1}=Y_{t}$. On the other hand, we may attempt to add a polymer $\gamma^{\prime}\nsim\gamma$ so that $Y_{t}\cup\{\gamma^{\prime}\}\in\Omega$. That is, $X_{t+1}=X_{t}=Y_{t}\cup\{\gamma\}$ and $Y_{t+1}=Y_{t}\cup\{\gamma^{\prime}\}$. This occurs with probability $\frac{|\gamma^{\prime}|}{n}\cdot\frac{1}{2}\cdot w_{\gamma^{\prime}}$ and in this case $D(X_{t+1},Y_{t+1})\leq 2$. Putting these together we can bound

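$$\mathbb{E}[D(X_{t+1},Y_{t+1})]\leq 1+\frac{1}{2n}\Big(-|\gamma|+\sum_{\gamma^{\prime}\nsim\gamma}|\gamma^{\prime}|\,w_{\gamma^{\prime}}\Big).$$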

Using (2) we have $\sum_{\gamma^{\prime}\nsim\gamma}|\gamma^{\prime}|w_{\gamma^{\prime}}\leq\theta|\gamma|$ and so

$$\mathbb{E}[D(X_{t+1},Y_{t+1})]\leq 1-\frac{(1-\theta)|\gamma|}{2n}\leq 1-\frac{1-\theta}{2n}.$$

By the path coupling lemma (see [12, Section 6]), and with $W$ denoting the diameter of $\Omega$ under $D(\cdot,\cdot)$ , we have that the mixing time is at most $\log(W/\varepsilon)2n/(1-\theta)=O(n\log(n/\varepsilon))$ , using that $W\leq 2n$ . This finishes the proof. ∎

To prove Theorem 5 we will show that a single update of the polymer dynamics can be computed in constant expected time. Assume that the polymer model is computationally feasible and that the polymer sampling condition (3) holds with constant $\tau\geq 5+3\log((q-1)\Delta)$ . We will use the following algorithm. Let $r=\tau-2-\log((q-1)\Delta)\geq 3+2\log((q-1)\Delta)$ and let $\mathcal{A}_{k}(v)=\{\gamma\in\mathcal{A}(v):|\gamma|\leq k\}$ .

Single polymer sampler

1. Choose $\mathbf{k}$ according to the following geometric distribution: for $k$ a non-negative integer,

$$\Pr[\mathbf{k}=k]=(1-e^{-r})\,e^{-rk}.$$

This gives $\Pr[\mathbf{k}\geq k]=e^{-rk}$.

2. Enumerate all polymers in $\mathcal{A}_{\mathbf{k}}(v)$ and compute their weight functions.

3. Mutually exclusively output $\gamma\in\mathcal{A}_{\mathbf{k}}(v)$ with probability $w_{\gamma}e^{r|\gamma|}$, and with all remaining probability output $\emptyset$. In particular if $\mathbf{k}=0$, then output $\emptyset$ with probability $1$.

In order to show that this algorithm has constant expected running time, we will require the following result on enumerating connected subgraphs of bounded degree graphs.

Lemma 15 ([27] Lemma 3.7).

Let $G$ have maximum degree $\Delta$ and let $v\in V(G)$ . There is an algorithm running in time $O(k^{5}(e\Delta)^{2k})$ that outputs a list of all connected subgraphs of $G$ of size at most $k$ containing $v$ .

We now proceed to prove the following lemma.

Lemma 16.

Under the polymer sampling condition (3) the output distribution of the single polymer sampler is $\nu_{v}$ . Further, assuming the polymer model is computationally feasible, the expected running time of the sampler is constant.

Proof.

We first show that the probabilities $w_{\gamma}e^{r|\gamma|}$ sum to less than $1$ , which shows the last step of the sampling algorithm is well defined. Since $\tau-r=2+\log((q-1)\Delta)$ ,

[TABLE]

We next show that the output of the algorithm has distribution $\nu_{v}$ . Given $\gamma\in\mathcal{A}(v)$ , to output $\gamma$ we must choose $\mathbf{k}\geq|\gamma|$ . This happens with probability $e^{-r|\gamma|}$ by the distribution of $\mathbf{k}$ . Conditioned on choosing such a $\mathbf{k}$ , the probability we output $\gamma$ is $w_{\gamma}e^{r|\gamma|}$ , and multiplying these probabilities together gives $w_{\gamma}$ as desired. Since this is true for all $\gamma\in\mathcal{A}(v)$ , the output distribution is exactly $\nu_{v}$ .

Finally we analyze the expected running time assuming that the model is computationally feasible. By Lemma 15, conditioned on the event that $\mathbf{k}=k$, the enumeration step of the algorithm takes time $O(k^{5}(e\Delta)^{2k})$, and the time to determine which polymers are allowed and to compute their weights is $O(k^{c}(q-1)^{k}(e\Delta)^{k-1}/2)$ for some $c>0$, since the polymer model is computationally feasible; here, the factor $k^{c}$ accounts for the time to determine whether a single polymer of size $k$ is ‘allowed’ and to compute its weight. Therefore, the expected running time is

[TABLE]

where $\tau^{\prime}=\tau-5-3\log((q-1)\Delta)\geq 0$ . ∎
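The single polymer sampler can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the paper's implementation: polymers are taken to be uncolored connected vertex sets, `weight` is a user-supplied function, the caller is responsible for choosing `r` so that the probabilities $w_{\gamma}e^{r|\gamma|}$ over $\mathcal{A}(v)$ sum to at most $1$, and the simple set-growing enumeration stands in for the algorithm of Lemma 15. The function names are hypothetical.

```python
import math
import random


def connected_sets_containing(v, adj, k):
    """All connected vertex sets of size <= k containing v (set-growing search)."""
    if k < 1:
        return []
    found = {frozenset([v])}
    frontier = [frozenset([v])]
    while frontier:
        s = frontier.pop()
        if len(s) == k:
            continue
        for u in s:
            for w in adj[u]:
                if w not in s:
                    t = s | {w}
                    if t not in found:
                        found.add(t)
                        frontier.append(t)
    return sorted(found, key=lambda s: (len(s), sorted(s)))


def sample_polymer(v, adj, weight, r, rng):
    """Draw from nu_v; returns a frozenset, or None for the empty outcome."""
    # floor(E / r) with E ~ Exp(1) satisfies Pr[k >= j] = exp(-r * j).
    k = int(rng.expovariate(1.0) / r)
    u, acc = rng.random(), 0.0
    for g in connected_sets_containing(v, adj, k):
        # Conditioned on k >= |g|, output g with probability w_g * exp(r * |g|).
        acc += weight(g) * math.exp(r * len(g))
        if u < acc:
            return g
    return None
```

The overall probability of returning a given $\gamma$ is $e^{-r|\gamma|}\cdot w_{\gamma}e^{r|\gamma|}=w_{\gamma}$, matching the calculation in the proof of Lemma 16. On a 6-cycle with the illustrative (not threshold-satisfying) choice `weight = lambda g: math.exp(-2.0 * len(g))` and `r = 1.0`, the empirical frequency of the singleton `{0}` should be close to $e^{-2}$.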

Finally we prove Theorem 5.

Proof.

By Theorem 2, there is an integer $C_{1}>1$ (independent of $n$) so that if we start with the empty configuration $\Gamma_{0}=\emptyset$ and run the polymer dynamics, then $\Gamma_{C_{1}\left\lceil n\log(n/\varepsilon)\right\rceil}$ has distribution within $\varepsilon/2$ total variation distance of $\mu_{G}$. By Lemma 16, there is an integer $C_{2}>1$ (independent of $n$) such that the expected number of steps required to perform one update of the polymer dynamics is at most $C_{2}$. To compute an $\varepsilon$-approximate sample from $\mu_{G}$, we make up to $\left\lceil\log(2/\varepsilon)\right\rceil$ independent attempts of the following procedure, and if no attempt returns a configuration we return the empty configuration. In each attempt, run the polymer dynamics for $3C_{1}C_{2}\left\lceil n\log(n/\varepsilon)\right\rceil$ steps starting from $\Gamma_{0}=\emptyset$, and if at least $C_{1}\left\lceil n\log(n/\varepsilon)\right\rceil$ updates of the polymer dynamics were executed, return $\Gamma_{C_{1}\left\lceil n\log(n/\varepsilon)\right\rceil}$.

We next show that, with probability at least $1-\varepsilon/2$, some attempt finishes within its step budget, so that the algorithm does not fall back to returning the empty configuration; this yields that the output distribution has total variation distance at most $\varepsilon$ from $\mu_{G}$. Let $X$ denote the total number of steps required to execute $C_{1}\left\lceil n\log(n/\varepsilon)\right\rceil$ updates of the polymer dynamics, and note that $\mathbb{E}[X]\leq C_{1}C_{2}\left\lceil n\log(n/\varepsilon)\right\rceil$. By Markov’s inequality, it follows that $\Pr(X\geq 3\mathbb{E}[X])\leq 1/3<1/e$. Thus, the probability that $X\geq 3\mathbb{E}[X]$ for each of $\left\lceil\log(2/\varepsilon)\right\rceil$ independent copies of $X$ is less than $(1/e)^{\log(2/\varepsilon)}=\varepsilon/2$. ∎

3 Approximate counting algorithm

In this section we show how to use a sampling oracle to approximately compute the partition function of the polymer model. One standard way is by self-reducibility. In [18] an efficient sampling algorithm for polymer models is derived from an efficient approximate counting algorithm by applying self-reducibility on the level of polymers. While we could apply polymer self-reducibility in the other direction to obtain counting algorithms from our sampling algorithm, here we use the simulated annealing method instead (see [2, 19, 30]) to obtain a faster implementation of counting from sampling.

Suppose that $(\mathcal{C}(G),w)$ is a computationally feasible polymer model. Let $\rho$ be a parameter and define a weight function

$$w_{\gamma}(\rho)=w_{\gamma}\,e^{-\rho|\gamma|}$$

for all $\gamma\in\mathcal{C}(G)$ . Then for each $\rho\geq 0$ this defines a computationally feasible polymer model $(\mathcal{C}(G),w(\rho))$ on $G$ , where setting $\rho=0$ recovers the original model $(\mathcal{C}(G),w)$ . If the original model $(\mathcal{C}(G),w)$ satisfies the polymer sampling condition (3), then so does $(\mathcal{C}(G),w(\rho))$ for every $\rho\geq 0$ as the weight function $w_{\gamma}(\rho)$ is monotone decreasing in $\rho$ .

Given the graph $G$ , we write the partition function of the polymer model $(\mathcal{C}(G),w(\rho))$ as a function of $\rho$ :

$$Z(\rho)=\sum_{\Gamma\in\Omega}\prod_{\gamma\in\Gamma}w_{\gamma}(\rho).$$

The associated Gibbs distribution is denoted by $\mu_{\rho}=\mu_{G;\rho}$. Since $\lim_{\rho\to\infty}w_{\gamma}(\rho)=0$, we have $\lim_{\rho\to\infty}Z(\rho)=1$ (only the empty configuration $\Gamma=\emptyset$ contributes to this limit), and so we will use simulated annealing to interpolate between $Z(\infty)=1$ and our goal $Z(0)$, assuming access to a sampling oracle for $(\mathcal{C}(G),w(\rho))$ for all $\rho\geq 0$. To apply the simulated annealing method, roughly speaking, we find a sequence of parameters $0=\rho_{0}<\rho_{1}<\dots<\rho_{\ell}<\infty$, called a cooling schedule, where $\ell\in\mathbb{N}^{+}$, and then estimate $Z(0)$ using the telescoping product

$$\frac{1}{Z(0)}=\frac{1}{Z(\rho_{\ell})}\prod_{i=0}^{\ell-1}\frac{Z(\rho_{i+1})}{Z(\rho_{i})}.$$

To estimate each term $Z(\rho_{i+1})/Z(\rho_{i})$ , we define independent random variables

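$$W_{i}=\prod_{\gamma\in\Gamma_{i}}\frac{w_{\gamma}(\rho_{i+1})}{w_{\gamma}(\rho_{i})},\quad\text{where }\Gamma_{i}\sim\mu_{\rho_{i}},\quad i=0,\dots,\ell-1.$$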

It is straightforward to see that $\mathbb{E}[W_{i}]=Z(\rho_{i+1})/Z(\rho_{i})$ (see Lemma 17). Using the sampling oracle for $\mu_{\rho_{i}}$ , we can sample $W_{i}$ for all $i$ , and by taking the product we get an estimate for $1/Z(0)$ .

The key ingredient of simulated annealing is finding a good cooling schedule. There are nonadaptive schedules [2] that depend only on $n$ , and adaptive schedules [19, 30] that also depend on the structure of $Z(\cdot)$ . Usually the latter leads to faster algorithms than the former. In this paper we will use a simple nonadaptive schedule: $\rho_{i}=i/n$ for $i=0,\dots,\ell$ where $\ell=O(n\log(n/\varepsilon))$ . We will show that this cooling schedule already gives us a fast algorithm for the polymer model. The reason behind it is that the weight function $w_{\gamma}(\rho)$ decays exponentially fast, and so (see Lemma 18) the partition function $Z(\rho_{\ell})$ is bounded by a constant when $\rho_{\ell}=O(\log n)$ , leading to a short cooling schedule.

Our algorithm is as follows.

Polymer approximate counting algorithm

1. Let $\rho_{i}=i/n$ for $i=0,1,\dots,\ell$ where $\ell=\left\lceil n\log(4e(q-1)\Delta n/\varepsilon)\right\rceil$;

2. For $j=1,\dots,m$ where $m=\left\lceil 64\varepsilon^{-2}\right\rceil$:

(a) For $0\leq i\leq\ell-1$:

(i) Sample $\Gamma_{i}^{(j)}$ from $\mu_{\rho_{i}}$;

(ii) Let $W_{i}^{(j)}=\prod_{\gamma\in\Gamma_{i}^{(j)}}e^{-|\gamma|/n}$;

(b) Let $W^{(j)}=\prod_{i=0}^{\ell-1}W_{i}^{(j)}$;

3. Let $\widehat{W}=\frac{1}{m}\sum_{j=1}^{m}W^{(j)}$ and output $\widehat{Z}=1/\widehat{W}$.
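The steps above can be sketched in Python for a toy model. This is a sketch under explicit assumptions, not the algorithm's intended implementation: the sampling oracle is replaced by exact sampling from a brute-force enumeration of all configurations (feasible only for tiny models), the annealed weights are taken to be $w_{\gamma}(\rho)=w_{\gamma}e^{-\rho|\gamma|}$ (inferred from $W_{i}^{(j)}=\prod_{\gamma}e^{-|\gamma|/n}$), and all function names are hypothetical.

```python
import itertools
import math
import random


def configurations(polymers, compatible):
    """All sets of pairwise compatible polymers (brute force; tiny models only)."""
    return [combo
            for k in range(len(polymers) + 1)
            for combo in itertools.combinations(polymers, k)
            if all(compatible(a, b) for a, b in itertools.combinations(combo, 2))]


def config_weight(cfg, w, rho):
    """Weight of a configuration under w_gamma(rho) = w_gamma * exp(-rho * |gamma|)."""
    return math.prod(w[g] * math.exp(-rho * len(g)) for g in cfg)


def partition_function(configs, w, rho):
    """Exact Z(rho) by summing over the enumerated configurations."""
    return sum(config_weight(cfg, w, rho) for cfg in configs)


def estimate_partition_function(polymers, compatible, w, n, ell, m, rng):
    """Annealing estimate of Z(0) with the cooling schedule rho_i = i / n."""
    configs = configurations(polymers, compatible)
    # Exact sampling tables stand in for the Markov chain sampler (assumption).
    tables = [[config_weight(cfg, w, i / n) for cfg in configs] for i in range(ell)]
    total = 0.0
    for _ in range(m):
        W = 1.0
        for i in range(ell):
            cfg = rng.choices(configs, weights=tables[i])[0]
            W *= math.exp(-sum(len(g) for g in cfg) / n)
        total += W
    return m / total  # Z_hat = 1 / W_hat, where W_hat = total / m
```

As a usage example, take the path on $n=4$ vertices with polymers the contiguous intervals, $w_{\gamma}=e^{-2|\gamma|}$, and $\varepsilon=1/2$, so that $\ell=21$ (with $q-1=1$, $\Delta=2$) and $m=256$; the estimate can then be compared against `partition_function(configurations(...), w, 0.0)`.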

Before proving Theorem 6, we first present a few useful lemmas. We shall use $\rho_{i}=i/n$ for $0\leq i\leq\ell$ as our cooling schedule and we further define $\rho_{\ell+1}=(\ell+1)/n$ though it does not appear in the algorithm. For $0\leq i\leq\ell-1$ independently we define $\Gamma_{i}$ to be a random sample from $\mu_{\rho_{i}}$ and $W_{i}=\prod_{\gamma\in\Gamma_{i}}e^{-|\gamma|/n}$ . Finally, we let $W=\prod_{i=0}^{\ell-1}W_{i}$ .

Lemma 17.

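For $0\leq i\leq\ell-1$,

$$\mathbb{E}[W_{i}]=\frac{Z(\rho_{i+1})}{Z(\rho_{i})}\quad\text{and}\quad\mathbb{E}[W_{i}^{2}]=\frac{Z(\rho_{i+2})}{Z(\rho_{i})}.$$

Therefore,

$$\mathbb{E}[W]=\frac{Z(\rho_{\ell})}{Z(\rho_{0})}\quad\text{and}\quad\mathbb{E}[W^{2}]=\frac{Z(\rho_{\ell})\,Z(\rho_{\ell+1})}{Z(\rho_{0})\,Z(\rho_{1})}.$$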

Proof.

In the proof, we use $W_{i}(\Gamma_{i})$ to denote $\prod_{\gamma\in\Gamma_{i}}\frac{w_{\gamma}(\rho_{i+1})}{w_{\gamma}(\rho_{i})}$ . We deduce from the definition of $W_{i}$ that

[TABLE]

and that

[TABLE]

Since $W_{0},\dots,W_{\ell-1}$ are mutually independent, we obtain

[TABLE]

and

[TABLE]

Lemma 18.

Suppose that $w_{\gamma}\leq 1$ for all $\gamma\in\mathcal{C}(G)$ . Then we have

[TABLE]

Proof.

It is trivial that $Z(\rho_{\ell})\geq 1$ since $\emptyset\in\Omega$ has weight $1$ . Meanwhile, we have the crude bound

[TABLE]

We then deduce that

[TABLE]

where (a) follows from Lemma 14 and (b) from $\ell\geq n\log(4e(q-1)\Delta n/\varepsilon)$ . ∎

Lemma 19.

We have

[TABLE]

Proof.

Since the weight function $w_{\gamma}(\rho)$ is decreasing in $\rho$ , the partition function $Z(\rho)$ is also decreasing, which implies $Z(\rho_{\ell+1})\leq Z(\rho_{\ell})$ . On the other hand, recalling Lemma 17, we have

[TABLE]

where $\Gamma_{0}$ is sampled from $\mu_{\rho_{0}}$ . Notice that for any $\Gamma_{0}\in\Omega$ we have

[TABLE]

Thus, the lemma follows. ∎

We are now ready to prove Theorem 6.

Proof.

We first assume that we have access to an exact sampler $\mathcal{S}_{\mathrm{exact}}$ that samples from $\mu_{\rho}$ for all $\rho\geq 0$ . Using this sampler in the Polymer approximate counting algorithm, we find that, for each $j$ and each $i$ , $\Gamma_{i}^{(j)}$ is an exact sample from the distribution $\mu_{\rho_{i}}$ and hence $W_{i}^{(j)}$ is an exact sample of $W_{i}$ , independently for every $j$ and $i$ . Thus, $W^{(j)}$ is a sample of $W$ independently for every $j$ , and $\widehat{W}$ is the sample mean of $W^{(j)}$ ’s. We deduce from Lemmas 17 and 18 that

[TABLE]

and

[TABLE]

where we use $1+\varepsilon/2\leq e^{\varepsilon/2}$ and $e^{-\varepsilon}\leq 1-\varepsilon/2$ for all $0<\varepsilon<1$ . Then

[TABLE]

By Chebyshev’s inequality we have

[TABLE]

where the second to last inequality follows from Lemmas 17 and 19:

[TABLE]

Thus, we deduce that

[TABLE]

so the error probability is at most $1/8$ . Note that the number of samples that we used is $\ell m$ .

Now we replace the exact sampling oracle $\mathcal{S}_{\mathrm{exact}}$ by an approximate one. For every $\rho\geq 0$ , the polymer model $(\mathcal{C}(G),w(\rho))$ is computationally feasible and satisfies the polymer sampling condition (3). Thus, for any $\rho\geq 0$ , Theorem 5 gives a randomized algorithm $\mathcal{S}$ that outputs a $1/(8\ell m)$ -approximate sample from $\mu_{\rho}$ . We then couple $\mathcal{S}$ and $\mathcal{S}_{\mathrm{exact}}$ optimally and run the algorithm with both $\mathcal{S}$ and $\mathcal{S}_{\mathrm{exact}}$ simultaneously, so that for any $\rho\geq 0$ samples from $\mathcal{S}$ and $\mathcal{S}_{\mathrm{exact}}$ for $\mu_{\rho}$ coincide with probability at least $1-1/(8\ell m)$ . Let $\mathcal{B}$ be the event that at least one of the $\ell m$ samples from $\mathcal{S}$ in the algorithm does not couple with that from $\mathcal{S}_{\mathrm{exact}}$ . Then a union bound yields $\Pr(\mathcal{B})\leq 1/8$ . Let $\mathcal{F}$ be the event that the algorithm using $\mathcal{S}_{\mathrm{exact}}$ fails. From our argument before we see that $\Pr(\mathcal{F})\leq 1/8$ . Note that if neither of $\mathcal{B}$ and $\mathcal{F}$ happens, then the algorithm with $\mathcal{S}$ will output a desired estimate. Hence, we conclude from the union bound that the algorithm with $\mathcal{S}$ fails with probability at most

[TABLE]

Finally, we consider the running time of our algorithm. By Theorem 5 the running time of step 2(a)(i) is $O(n\log(8\ell mn)\log(8\ell m))=O(n\log^{2}(n/\varepsilon))$ , and for step 2(a)(ii) the running time is $O(n)$ . Thus, the running time of the algorithm is upper bounded by $\ell m\cdot O(n\log^{2}(n/\varepsilon))=O((n/\varepsilon)^{2}\log^{3}(n/\varepsilon))$ . ∎

4 Applications

Here we apply our results on subset polymer models to several approximate counting and sampling problems at low temperatures.

4.1 Ferromagnetic Potts model

In this section, we prove Theorem 9 for the Potts model. Throughout this section, we will work under the assumptions/conditions of Theorem 9. That is, we fix a real number $\alpha>0$ , integers $q\geq 2$ and $\Delta\geq 3$ and a real number $\beta\geq\frac{5+3\log((q-1)\Delta)}{\alpha}$ . We let $\mathcal{G}$ be the class of $\alpha$ -expander graphs $G$ with maximum degree at most $\Delta$ .

Consider the polymer model defined in Example 2 on an $n$ -vertex graph $G\in\mathcal{G}$ with $M=n/2$ and ground state color $g\in[q]$ . We will use $\mathcal{C}^{g}=\mathcal{C}^{g}(G)$ to denote the polymers and $w_{\gamma}^{g}$ to denote the weight of a polymer $\gamma\in\mathcal{C}^{g}$ ; recall that $w_{\gamma}^{g}=e^{-\beta B(\gamma)}$ , where $B(\gamma)$ counts the number of external edges of $\gamma$ plus the number of bichromatic internal edges. Let $Z^{g}(G)$ be the partition function of the polymer model $(\mathcal{C}^{g}(G),w^{g})$ .
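For concreteness, the quantity $B(\gamma)$ and the weight $w_{\gamma}^{g}=e^{-\beta B(\gamma)}$ can be computed as in the following Python sketch. The representation is an assumption made for illustration: a polymer is a dictionary mapping each of its vertices to a non-ground color, the graph is an adjacency dictionary, and the function name is hypothetical.

```python
import math


def potts_polymer_weight(gamma_colors, adj, beta):
    """Return (exp(-beta * B), B) for a colored polymer.

    gamma_colors maps each vertex of the polymer to its (non-ground) color.
    B = number of edges leaving the polymer + number of internal edges whose
    endpoints have different colors.
    """
    external = 0
    bichromatic_twice = 0
    for u, color in gamma_colors.items():
        for w in adj[u]:
            if w not in gamma_colors:
                external += 1
            elif gamma_colors[w] != color:
                bichromatic_twice += 1  # each internal edge is seen from both ends
    b = external + bichromatic_twice // 2
    return math.exp(-beta * b), b
```

On the 4-cycle with the polymer `{0: 1, 1: 2}`, there are two external edges and one bichromatic internal edge, so $B(\gamma)=3$.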

Lemma 20.

Under the conditions of Theorem 9, the polymer model $(\mathcal{C}^{g}(\cdot),w^{g},\mathcal{G})$ satisfies the polymer sampling condition (3) with $\tau=\alpha\beta$ .

Proof.

Since every $G\in\mathcal{G}$ is an $\alpha$ -expander, for $\gamma\in\mathcal{C}^{g}$ we have $B(\gamma)\geq\alpha|\gamma|$ and hence $w_{\gamma}^{g}\leq e^{-\tau|\gamma|}$ . ∎

The following lemma is from [21].

Lemma 21 ([21, Lemma 12]).

For any $n$ -vertex $\alpha$ -expander graph $G$ and $\beta\geq 2\log(eq)/\alpha$ , $qZ^{g}(G)$ is an $e^{-n}$ -approximation of the Potts partition function $Z_{G,\beta}$ .

We are now ready to prove Theorem 9.

Proof.

Let $\mathcal{G}$ be the class of $\alpha$ -expander graphs of maximum degree at most $\Delta$ . Clearly, the polymer models $(\mathcal{C}^{g}(\cdot),w^{g},\mathcal{G})$ are computationally feasible. By Lemma 20, the models also satisfy the polymer sampling condition and therefore Theorems 5 and 6 apply. Consider any $n$ -vertex graph $G\in\mathcal{G}$ . Since $\beta\geq\frac{5+3\log((q-1)\Delta)}{\alpha}>\frac{2\log(eq)}{\alpha}$ , Lemma 21 applies to $G$ .

For the sampling algorithm, we pick a color $g\in[q]$ uniformly at random and generate an $(\varepsilon/q)$ -approximate sample from the Gibbs measure associated to $Z^{g}(G)$ using the algorithm of Theorem 5, in time $O(n\log(n/\varepsilon)\log(1/\varepsilon))$ . By Lemma 21, we conclude that the resulting output is an $\varepsilon$ -approximate sample for the Potts model.

For the counting algorithm, we pick an arbitrary $g\in[q]$ and produce using the algorithm of Theorem 6 a number $\hat{Z}$ in time $O((n/\varepsilon)^{2}\log^{3}(n/\varepsilon))$ , which is an $\varepsilon/(2q)$ -approximation to $Z^{g}(G)$ with probability $\geq 3/4$ . By Lemma 21, we conclude that $q\hat{Z}$ is an $\varepsilon$ -approximation for the partition function of the Potts model (with the same probability). ∎

4.2 Hard-core model

In this section, we prove Theorem 11 for the hard-core model.

Suppose $G=(V^{0},V^{1},E)$ is an $n$ -vertex bipartite $\alpha$ -expander graph of maximum degree $\Delta$ . We will consider the hard-core model on $G$ at sufficiently large fugacities $\lambda$ . There are two relevant ground states corresponding to the two parts of $G$ , one is the independent set given by $V^{0}$ and the other is given by $V^{1}$ . We will capture deviations from the two ground states using the “even” and “odd” polymer models of Jenssen, Keevash and Perkins [21]. We remark that similar models were considered independently by Liao, Lin, Lu, and Mao [24].

For $i\in\{0,1\}$ , we say a set $S\subseteq V^{i}$ is small if $|S|\leq|V^{i}|/2$ . In particular, Definition 10 requires that small sets expand.

Following [21], we will define a polymer model $(\mathcal{C}^{i}(G^{2}),w^{i})$; note that the host graph is $G^{2}$, rather than $G$. Here $G^{2}$ is the graph on vertex set $V(G)$ in which two vertices are connected if their distance in $G$ is at most 2. The set $\mathcal{C}^{i}=\mathcal{C}^{i}(G^{2})$ of allowed polymers consists of all small sets $\gamma\subseteq V^{i}$ which are connected subgraphs in $G^{2}$. The set of spins is $\{0,1\}$ and the ground state spin for a vertex $v$ is $g_{v}=0$ if $v\in V^{i}$, and $g_{v}=1$ if $v\in V^{1-i}$; the spin assignment $\sigma_{\gamma}$ for a polymer $\gamma$ gives the spin $1$ to each $v\in\gamma$. The weight of a polymer $\gamma\in\mathcal{C}^{i}$ is defined as

$$w^{i}_{\gamma}=\frac{\lambda^{|\gamma|}}{(1+\lambda)^{|N_{G}(\gamma)|}},\tag{4}$$

where we recall that $N_{G}(\gamma)$ denotes the set of vertices in $G$ which are adjacent to some vertex in $\gamma$ . The key observation behind the definition of the weights is that for a set $\Gamma$ of compatible polymers from $\mathcal{C}^{i}$ , the contribution to $Z_{G,\lambda}$ of all independent sets $I$ with $I\cap V^{i}=\bigcup_{\gamma\in\Gamma}\gamma$ is exactly

$$(1+\lambda)^{|V^{1-i}|}\prod_{\gamma\in\Gamma}w^{i}_{\gamma};$$

see [21, Proof of Lemma 19] for more details.
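The key observation can be checked numerically on a small bipartite graph. The Python sketch below assumes the weight $w^{i}_{\gamma}=\lambda^{|\gamma|}/(1+\lambda)^{|N_{G}(\gamma)|}$ from [21] and compares $(1+\lambda)^{|V^{1-i}|}\prod_{\gamma}w^{i}_{\gamma}$ against a brute-force sum of $\lambda^{|I|}$ over the relevant independent sets. The function names are hypothetical, and the enumeration is exponential in the size of the opposite side.

```python
import itertools


def neighborhood(gamma, adj):
    """N_G(gamma): vertices adjacent to at least one vertex of gamma."""
    return {w for u in gamma for w in adj[u]}


def hardcore_polymer_weight(gamma, adj, lam):
    """Assumed weight: lam^|gamma| / (1 + lam)^|N_G(gamma)|."""
    return lam ** len(gamma) / (1 + lam) ** len(neighborhood(gamma, adj))


def contribution_brute_force(defect, other_side, adj, lam):
    """Sum of lam^|I| over independent sets I whose part on side i is `defect`."""
    blocked = neighborhood(defect, adj)
    free = [u for u in other_side if u not in blocked]
    total = 0.0
    # On a bipartite graph any subset of the unblocked opposite side may be added.
    for k in range(len(free) + 1):
        for _extra in itertools.combinations(free, k):
            total += lam ** (len(defect) + k)
    return total
```

On the 8-cycle with $V^{0}=\{0,2,4,6\}$, the polymers $\{0\}$ and $\{4\}$ are compatible, and with $\lambda=2$ both sides of the identity evaluate to $4$.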

Let $Z^{i}(G)$ denote the partition function of the polymer model $(\mathcal{C}^{i},w^{i})$ (where two polymers are compatible if their distance in the host graph $G^{2}$ is at least 2). Using that $G$ is an $\alpha$ -expander, we have the following lemma from [21].

Lemma 22 ([21, Lemma 19]).

For any $\lambda\geq e^{11/\alpha}$ and any $n$ -vertex graph $G=(V^{0},V^{1},E)$ which is a bipartite $\alpha$ -expander, the number

$$(1+\lambda)^{|V^{1}|}Z^{0}(G)+(1+\lambda)^{|V^{0}|}Z^{1}(G)$$

is an $e^{-n}$ -approximation of the hard-core partition function $Z_{G,\lambda}$ .

In particular, [21, Lemma 17] shows that $(1+\lambda)^{|V^{1}|}Z^{0}(G)+(1+\lambda)^{|V^{0}|}Z^{1}(G)$ counts the contribution to $Z_{G,\lambda}$ of every independent set $I$ of $G$ , but some independent sets are double counted: those independent sets $I$ for which the $2$ -connected components of $V^{0}\cap I$ and $V^{1}\cap I$ are all small. We call these independent sets sparse. The proof of [21, Lemma 19] shows that the relative contribution to $Z_{G,\lambda}$ of sparse independent sets is at most $e^{-n}$ .

We are now ready to prove Theorem 11.

Proof.

First note that $\lambda\geq(3\Delta)^{6/\alpha}\geq 9^{6/\alpha}>e^{11/\alpha}$ , so Lemma 22 applies. Let $\mathcal{G}$ denote the set of host graphs $G^{2}$ corresponding to bipartite $\alpha$ -expanders $G$ of maximum degree $\Delta$ . Noting that the polymer models $(\mathcal{C}^{i}(\cdot),w^{i},\mathcal{G})$ are computationally feasible, we verify the polymer sampling condition (3) for them. Fix arbitrary $i\in\{0,1\}$ . As in [20, Section 4.2], we have the bound

[TABLE]

so, using that $\lambda\geq(3\Delta)^{6/\alpha}$ , we have that the models satisfy the polymer sampling condition with $\tau=\alpha\log\lambda\geq 6\log(3\Delta)\geq 5+3\log\Delta^{2}$ . Therefore, we may also apply Theorems 5 and 6.

For the counting algorithm, we apply Theorem 6. Namely, by taking the median of $O(\log(1/\varepsilon))$ trials, we can obtain $\hat{Z}^{0}$ and $\hat{Z}^{1}$ which are $(\varepsilon/32)$ -approximations to $Z^{0}(G)$ and $Z^{1}(G)$ , respectively, with probability at least $1-\varepsilon/32$ . Let $\mathcal{E}$ be the event that $\hat{Z}^{0}$ and $\hat{Z}^{1}$ are indeed $(\varepsilon/32)$ -approximations to $Z^{0}(G)$ and $Z^{1}(G)$ . Conditioned on $\mathcal{E}$ , the number

$$\hat{Z}=(1+\lambda)^{|V^{1}|}\hat{Z}^{0}+(1+\lambda)^{|V^{0}|}\hat{Z}^{1}$$

is an $(\varepsilon/32)$ -approximation to the number $A=(1+\lambda)^{|V^{1}|}Z^{0}(G)+(1+\lambda)^{|V^{0}|}Z^{1}(G)$ . By Lemma 22 and since $\varepsilon\geq 4e^{-n}$ , $A$ is an $(\varepsilon/4)$ -approximation to $Z_{G,\lambda}$ and hence $\hat{Z}$ is an $\varepsilon$ -approximation to $Z_{G,\lambda}$ . Since $\mathcal{E}$ occurs with probability at least $1-\varepsilon/16$ , we obtain that $\hat{Z}$ is the desired approximation for the counting algorithm.

For the sampling algorithm, let $\mathbf{i}$ be the random variable which takes the value 0 with probability $\frac{(1+\lambda)^{|V^{1}|}\hat{Z}^{0}}{\hat{Z}}$ and the value 1 otherwise, where $\hat{Z}^{0},\hat{Z}^{1},\hat{Z}$ are the quantities computed earlier. Then, use Theorem 5 to obtain an $(\varepsilon/8)$-approximate sample $\hat{\Gamma}^{\mathbf{i}}$ from the Gibbs distribution corresponding to the polymer model $(\mathcal{C}^{\mathbf{i}}(G),w^{\mathbf{i}})$. Next, obtain an independent set $\hat{I}$ by including each $v\in V^{1-\mathbf{i}}\backslash N_{G}(\bigcup_{\gamma\in\hat{\Gamma}^{\mathbf{i}}}\gamma)$ independently with probability $\frac{\lambda}{1+\lambda}$ and each vertex in $\bigcup_{\gamma\in\hat{\Gamma}^{\mathbf{i}}}\gamma$ with probability 1. We claim that the output distribution of $\hat{I}$ is $\varepsilon$-close to the hard-core distribution $\mu_{G,\lambda}$.

To prove this, consider the random independent set $I$ obtained by repeating the steps above with perfectly accurate computations: pick $i=0$ with probability $\frac{(1+\lambda)^{|V^{1}|}Z^{0}(G)}{A}$ and $i=1$ otherwise; sample $\Gamma^{i}$ exactly from the Gibbs distribution corresponding to the polymer model $(\mathcal{C}^{i}(G),w^{i})$; and obtain the independent set $I$ by including each $v\in V^{1-i}\backslash N_{G}(\bigcup_{\gamma\in{\Gamma}^{{i}}}\gamma)$ independently with probability $\frac{\lambda}{1+\lambda}$ and each vertex in $\bigcup_{\gamma\in{\Gamma}^{{i}}}\gamma$ with probability 1. If $I$ is not sparse, then $I$ is generated with probability $\lambda^{|I|}/A$ (cf. the observation below (4)). On the other hand, if $I$ is sparse, then $I$ is generated with probability $2\lambda^{|I|}/A$. Hence the total variation distance between the distribution of $I$ and the hard-core distribution $\mu_{G,\lambda}$ is bounded by the relative weight of the sparse independent sets, which, by Lemma 22 and the remark following it, is at most $e^{-n}\leq\varepsilon/4$.

We next observe that, conditioned on the event $\mathcal{E}$ (i.e., that $\hat{Z}^{0}$ and $\hat{Z}^{1}$ are $(\varepsilon/32)$ -approximations to $Z^{0}(G)$ and $Z^{1}(G)$ ), there is a coupling between $\hat{I}$ and $I$ such that $\hat{I}=I$ with probability at least $1-\varepsilon/4$ . Indeed, the total variation distance between $\mathbf{i}$ and $i$ is at most $e^{\varepsilon/16}-1\leq\varepsilon/8$ and hence there is a coupling of $\mathbf{i}$ with $i$ so that $\mathbf{i}=i$ with probability at least $1-\varepsilon/8$ . Analogously, there is a coupling of $\hat{\Gamma}^{\mathbf{i}}$ with $\Gamma^{i}$ so that $\hat{\Gamma}^{\mathbf{i}}=\Gamma^{i}$ with probability at least $1-\varepsilon/8$ . Since $\mathcal{E}$ occurs with probability at least $1-\varepsilon/16$ , it follows that the overall total variation distance between $\hat{I}$ and $I$ is at most $\varepsilon/2$ .

Hence, the output distribution of $\hat{I}$ is $\varepsilon$ -close to the hard-core distribution $\mu_{G,\lambda}$ , finishing the proof of Theorem 11. ∎
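The final step of the sampling algorithm, which turns a polymer configuration on side $V^{\mathbf{i}}$ into an independent set of $G$, can be sketched in Python as follows. The data representation (polymers as sets of vertices, the opposite side as a list, the graph as an adjacency dictionary) and the function name are assumptions made for illustration.

```python
import random


def lift_to_independent_set(polymers, other_side, adj, lam, rng):
    """Turn a polymer configuration on side i into an independent set of G.

    Every defect vertex is included; each vertex of the other side that has no
    defect neighbor is included independently with probability lam / (1 + lam).
    """
    defects = set().union(*polymers)
    blocked = {w for u in defects for w in adj[u]}
    chosen = {u for u in other_side
              if u not in blocked and rng.random() < lam / (1 + lam)}
    return defects | chosen
```

Since the graph is bipartite and every neighbor of a defect vertex is excluded, the returned set is always independent.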

5 Comparison to spin Glauber dynamics

In this section, we derive results for spin Glauber dynamics, restricted to appropriate sets in the state space, based on our results above (using fairly standard Markov chain comparison techniques). We start with the general framework of subset polymer models and obtain Theorem 25, which is then applied to the ferromagnetic Potts and hard-core models.

5.1 Restricted Glauber dynamics for polymer models

Here, we define the restricted Glauber dynamics for subset polymer models, and show the upcoming Theorem 25 which bounds its mixing time under some appropriate conditions.

Consider a subset polymer model as in Section 1.1. There is a natural map $f:\Omega\to\{0,1,\ldots,q-1\}^{V(G)}$ between allowed polymer configurations and spin configurations, given by $f(\Gamma)_{v}=\sigma_{\gamma}(v)$ if $\gamma\in\Gamma$ and $v\in\gamma$ and $f(\Gamma)_{v}=g_{v}$ if $v\notin\cup_{\gamma\in\Gamma}\gamma$ . Let $\Omega_{\mathrm{spin}}=f(\Omega)$ be the spin configurations obtainable as images of the map $f$ . It will be helpful to consider the inverse map $f^{-1}$ and extend its domain to all $\sigma\in\{0,1,\ldots,q-1\}^{V(G)}$ , so that $f^{-1}(\sigma)$ is the polymer configuration consisting of polymers that are connected components of vertices which do not receive their ground state spin; note that the range of the extended $f^{-1}$ is not limited to $\Omega$ anymore.

Restricted Glauber dynamics is defined as follows, starting from $\Gamma_{t}\in\Omega$ .

1. Choose $v\in V(G)$ and $s\in\{0,\ldots,q-1\}$ uniformly.

2. $\Gamma^{\prime}$ is formed from $\Gamma_{t}$ by assigning $v$ to spin $s$ (formally, by letting $\sigma=f(\Gamma_{t})$, forming $\sigma^{\prime}$ from $\sigma$ by assigning $v$ to spin $s$, and finally letting $\Gamma^{\prime}=f^{-1}(\sigma^{\prime})$).

3. If $\Gamma^{\prime}\in\Omega$ let $p=\min(1,w(\Gamma^{\prime})/w(\Gamma_{t}))$.

• With probability $p$, $\Gamma_{t+1}=\Gamma^{\prime}$.

• With probability $1-p$, $\Gamma_{t+1}=\Gamma_{t}$.

4. If $\Gamma^{\prime}\notin\Omega$ then $\Gamma_{t+1}=\Gamma_{t}$.
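The steps above can be sketched in Python for one concrete instance. The sketch assumes the Potts polymer model with ground color `ground`, where polymers are the connected components of non-ground vertices, a configuration is allowed when every component has size at most `max_size`, and the configuration weight equals $e^{-\beta\cdot\#\{\text{bichromatic edges}\}}$ (each external or bichromatic internal edge of a polymer is a bichromatic edge of the coloring). It works directly in the spin representation, and all function names are hypothetical.

```python
import math
import random


def nonground_components(sigma, adj, ground):
    """Connected components of vertices whose color differs from `ground`."""
    seen, comps = set(), []
    for s in adj:
        if sigma[s] == ground or s in seen:
            continue
        comp, stack = {s}, [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if sigma[w] != ground and w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def bichromatic_edges(sigma, adj):
    """Number of edges whose endpoints have different colors."""
    return sum(sigma[u] != sigma[w] for u in adj for w in adj[u]) // 2


def restricted_glauber_step(sigma, adj, q, beta, ground, max_size, rng):
    """One step of the restricted dynamics, written in the spin representation."""
    v = rng.choice(sorted(adj))
    s = rng.randrange(q)
    proposal = dict(sigma)
    proposal[v] = s
    # Reject proposals that leave the restricted state space.
    if any(len(c) > max_size for c in nonground_components(proposal, adj, ground)):
        return sigma
    # Metropolis acceptance with ratio w(Gamma') / w(Gamma_t).
    delta = bichromatic_edges(proposal, adj) - bichromatic_edges(sigma, adj)
    p = min(1.0, math.exp(-beta * delta))
    return proposal if rng.random() < p else sigma
```

Starting from the all-ground coloring, repeated calls never produce a non-ground component larger than `max_size`, which is the defining property of the restricted state space.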

We will use the Markov chain comparison technique to show that the restricted Glauber dynamics is rapidly mixing. To do this, we need a mild condition on the set of allowed polymers $\mathcal{C}(G)$ . A polymer model is said to be single-update-compatible if, for every size- $k$ polymer $\gamma\in\mathcal{C}(G)$ , there is an ordering $v_{1},\ldots,v_{k}$ of the vertices in ${\gamma}$ such that, for all $i\in[k]$ , the set $S_{i}=\{v_{1},\ldots,v_{i}\}$ induces a connected subgraph of $\gamma$ and we have that $(S_{i},\sigma_{\gamma}|_{S_{i}})$ is a valid polymer itself, i.e., $(S_{i},\sigma_{\gamma}|_{S_{i}})\in\mathcal{C}(G)$ .

We will use the comparison method of Diaconis and Saloff-Coste [8, 9] as applied to mixing times by Randall and Tetali [29]. In order to avoid discussion of eigenvalues here, we use the version from Observation 13 of the survey paper [11]. We first show that the restricted Glauber dynamics is a reversible ergodic Markov chain with stationary distribution $\mu_{G}$ , which is easy to see from its definition.

Lemma 23.

Let $G$ be a graph and let $(\mathcal{C}(G),w)$ be a single-update-compatible polymer model. The restricted Glauber dynamics is ergodic and reversible with stationary distribution $\mu_{G}$ .

Proof.

The restricted Glauber dynamics is aperiodic since we remain in the same state with positive probability after performing an update. It is irreducible since we can transition from any $\Gamma\in\Omega$ to any $\Gamma^{\prime}\in\Omega$ by adding and removing vertices one-by-one. This shows that the restricted Glauber dynamics is ergodic. To show that it is reversible and has stationary distribution $\mu_{G}$ , we check detailed balance. Suppose $\Gamma,\Gamma^{\prime}\in\Omega$ with $\Gamma\neq\Gamma^{\prime}$ and $P(\Gamma,\Gamma^{\prime})>0$ where $P$ is the transition matrix of the restricted Glauber dynamics. Then,

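$$\mu_{G}(\Gamma)P(\Gamma,\Gamma^{\prime})=\frac{w(\Gamma)}{Z(G)}\cdot\frac{1}{q|V(G)|}\min\Big\{1,\frac{w(\Gamma^{\prime})}{w(\Gamma)}\Big\}=\frac{\min\{w(\Gamma),w(\Gamma^{\prime})\}}{q|V(G)|\,Z(G)}=\mu_{G}(\Gamma^{\prime})P(\Gamma^{\prime},\Gamma).$$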

The lemma follows. ∎
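Detailed balance can also be confirmed numerically on a tiny instance by building the full transition matrix. The sketch below assumes a toy model that is not from the paper: a path on 4 vertices, $q=2$ with ground spin 0, polymers given by runs of at most `M` consecutive 1s, and weight $e^{-\tau|\gamma|}$ .

```python
import itertools
import math

n, q, M, tau = 4, 2, 2, 0.7  # toy parameters (assumed)

def polymer_sizes(sigma):
    """Sizes of maximal runs of non-ground spins on the path."""
    sizes, run = [], 0
    for s in sigma:
        if s != 0:
            run += 1
        elif run:
            sizes.append(run)
            run = 0
    if run:
        sizes.append(run)
    return sizes

def allowed(sigma):
    return all(k <= M for k in polymer_sizes(sigma))

def weight(sigma):
    return math.prod(math.exp(-tau * k) for k in polymer_sizes(sigma))

states = [s for s in itertools.product(range(q), repeat=n) if allowed(s)]
Z = sum(weight(s) for s in states)
mu = {s: weight(s) / Z for s in states}

# Transition matrix of the restricted Glauber dynamics.
P = {s: {t: 0.0 for t in states} for s in states}
for s in states:
    for v in range(n):
        for spin in range(q):
            t = s[:v] + (spin,) + s[v + 1:]
            if t in mu:
                p = min(1.0, weight(t) / weight(s))
                P[s][t] += p / (n * q)
                P[s][s] += (1 - p) / (n * q)
            else:
                P[s][s] += 1 / (n * q)  # proposal leaves Omega: rejected

balanced = all(abs(mu[s] * P[s][t] - mu[t] * P[t][s]) < 1e-12
               for s in states for t in states)
stationary = all(abs(sum(mu[s] * P[s][t] for s in states) - mu[t]) < 1e-12
                 for t in states)
```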

We next give some standard definitions that will be used in our comparison proof. Let $\mathcal{M}$ denote the restricted Glauber dynamics and $P$ be its transition matrix. Let $\mathcal{M}^{\prime}$ be the polymer dynamics and denote its transition matrix by $P^{\prime}$ . Define $E^{*}(\mathcal{M})$ to be the set of pairs of configurations $(\Gamma,\Gamma^{\prime})$ that can be achieved by one transition of the restricted Glauber dynamics; i.e., $E^{*}(\mathcal{M})=\{(\Gamma,\Gamma^{\prime})\in\Omega^{2}:P(\Gamma,\Gamma^{\prime})>0\}$ . Similarly, define $E^{*}(\mathcal{M}^{\prime})=\{(\Gamma,\Gamma^{\prime})\in\Omega^{2}:P^{\prime}(\Gamma,\Gamma^{\prime})>0\}$ for the polymer dynamics.

For every $(\Gamma,\Gamma^{\prime})\in E^{*}(\mathcal{M}^{\prime})$ , we define a path $\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ from $\Gamma$ to $\Gamma^{\prime}$ to be a sequence of configurations such that every adjacent pair is a transition of the restricted Glauber dynamics; i.e., every adjacent pair of configurations is in $E^{*}(\mathcal{M})$ . For this, we assume that the polymer model is single-update-compatible (see Section 1.3). If $\Gamma=\Gamma^{\prime}$ , then the choice is easy: we let $\mathcal{P}_{\Gamma,\Gamma^{\prime}}=(\Gamma,\Gamma^{\prime})$ . Suppose instead that $\Gamma^{\prime}=\Gamma\cup\gamma$ for some polymer $\gamma\in\mathcal{C}(G)$ . Recall that there is a natural one-to-one mapping $f:\Omega\to\Omega_{\mathrm{spin}}$ between the set of all (polymer) configurations $\Omega$ and the set of spin configurations $\Omega_{\mathrm{spin}}$ . Let $\sigma=f(\Gamma)$ and $\sigma^{\prime}=f(\Gamma^{\prime})$ be the corresponding spin configurations. If $\gamma$ has size $k$ , let $v_{1},\ldots,v_{k}$ be the ordering of vertices of $\gamma$ from the definition of single-update-compatible so that, for all $i\in[k]$ , the polymer induced by vertices $v_{1},\ldots,v_{i}$ is in $\mathcal{C}(G)$ . Let $(\sigma=\sigma_{0},\sigma_{1},\dots,\sigma_{k}=\sigma^{\prime})$ be the sequence of spin configurations such that each $\sigma_{j}$ is obtained from $\sigma_{j-1}$ by changing the spin of $v_{j}$ from $\sigma(v_{j})=g_{v_{j}}$ to $\sigma^{\prime}(v_{j})$ . The path $\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ is then defined to be $(f^{-1}(\sigma_{0}),\dots,f^{-1}(\sigma_{k}))$ . If $\Gamma^{\prime}=\Gamma\backslash\gamma$ for some $\gamma\in\Gamma$ , we can define the path $\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ in a similar manner. Note that in both cases the length of the path is $|\mathcal{P}_{\Gamma,\Gamma^{\prime}}|=k=|\gamma|$ .
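The path for adding a polymer can be written out directly on spin configurations. The following minimal sketch uses a hypothetical helper `canonical_path` and a toy example on five vertices with $q=3$ and ground spin 0; the path for removing a polymer is obtained by resetting the same vertices to their ground spins in reverse order.

```python
def canonical_path(sigma, order, polymer_spins):
    """Spin configurations sigma_0, ..., sigma_k: the vertices of the polymer are
    switched, one at a time in the given order, from ground spin to polymer spin."""
    path = [dict(sigma)]
    for v in order:
        nxt = dict(path[-1])
        nxt[v] = polymer_spins[v]
        path.append(nxt)
    return path

# Toy example (assumed): Gamma already contains the polymer {4} with spin 1;
# we add gamma = {0, 1, 2} with spins 1, 2, 1 using the ordering v_1, v_2, v_3 = 0, 1, 2.
sigma = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1}
polymer_spins = {0: 1, 1: 2, 2: 1}
path = canonical_path(sigma, [0, 1, 2], polymer_spins)
```

The list has $k+1$ entries, so the path consists of $k=|\gamma|$ single-vertex updates.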

For every $(\Gamma_{0},\Gamma_{0}^{\prime})\in E^{*}(\mathcal{M})$ , the congestion of the edge $(\Gamma_{0},\Gamma_{0}^{\prime})$ is defined to be

[TABLE]

The congestion of the choice of paths is the quantity

[TABLE]

The following comparison lemma gives an upper bound on the mixing time of the restricted Glauber dynamics by the mixing time of the polymer dynamics.

Lemma 24 ([11, Observation 13]).

Let $c_{1}=\min_{\Gamma\in\Omega}P(\Gamma,\Gamma)$ and $c_{2}=\min_{\Gamma\in\Omega}\mu_{G}(\Gamma)$ . Then, for any $0<\varepsilon<1$ we have

[TABLE]

We now proceed to establish the mixing-time of the restricted Glauber dynamics, which is the main result of this section. We will apply this to both the hard-core model (on bipartite $\alpha$ -expander graphs) and the ferromagnetic Potts model (on $\alpha$ -expander graphs), for which we will define appropriate single-update compatible polymer models. Furthermore, in both of these applications, $M$ below will be logarithmic in $n/\varepsilon$ , giving polynomial mixing time for the restricted Glauber dynamics.

Theorem 25.

Suppose that a polymer model $(\mathcal{C}(\cdot),w,\mathcal{G})$ satisfies the polymer mixing condition. Consider a graph $G\in\mathcal{G}$ such that $(\mathcal{C}(G),w)$ is single-update-compatible. Let $M=\max\{|\gamma|:\gamma\in\mathcal{C}(G)\}$ . Suppose that, for every pair of configurations $\Gamma,\Gamma^{\prime}\in\Omega$ whose corresponding spin configurations $f(\Gamma),f(\Gamma^{\prime})\in\Omega_{\mathrm{spin}}$ differ at exactly one vertex, we have

[TABLE]

for some constant $\eta>1$ . Then for any $0<\varepsilon<1$ , the restricted Glauber dynamics has mixing time

[TABLE]

Proof.

By Lemma 24, it suffices to upper bound the congestion $A(\Gamma_{0},\Gamma_{0}^{\prime})$ for every $(\Gamma_{0},\Gamma_{0}^{\prime})\in E^{*}(\mathcal{M})$ where

[TABLE]

If $\Gamma_{0}=\Gamma_{0}^{\prime}$ , then for our choices of paths to get $(\Gamma_{0},\Gamma_{0}^{\prime})\in\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ we must have $\Gamma=\Gamma^{\prime}=\Gamma_{0}=\Gamma_{0}^{\prime}$ . It follows that

[TABLE]

since $P(\Gamma_{0},\Gamma_{0})\geq 1/q$ by the update rule of the restricted Glauber dynamics.

Now suppose $\Gamma_{0}\neq\Gamma_{0}^{\prime}$ . Let $\sigma_{0}=f(\Gamma_{0})$ and $\sigma_{0}^{\prime}=f(\Gamma_{0}^{\prime})$ be the corresponding spin configurations. Notice that $\sigma_{0}$ and $\sigma_{0}^{\prime}$ differ at exactly one vertex, which we denote by $v$ . If $\sigma_{0}(v)\neq g_{v}$ and $\sigma_{0}^{\prime}(v)\neq g_{v}$ then no path $\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ would contain $(\Gamma_{0},\Gamma_{0}^{\prime})$ by our choice of paths, and thus $A(\Gamma_{0},\Gamma_{0}^{\prime})=0$ . Assume next that $\sigma_{0}(v)=g_{v}$ and $\sigma_{0}^{\prime}(v)\neq g_{v}$ . Then, if $(\Gamma_{0},\Gamma_{0}^{\prime})\in\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ for some $(\Gamma,\Gamma^{\prime})\in E^{*}(\mathcal{M}^{\prime})$ , we must have $\Gamma^{\prime}=\Gamma\cup\gamma$ for some polymer $\gamma\in\mathcal{C}(G)$ and also $v\in{\gamma}$ . Moreover, the spin configurations $f(\Gamma),f(\Gamma^{\prime}),\sigma_{0},\sigma_{0}^{\prime}$ are all the same outside ${\gamma}$ . This implies that the number of such paths $\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ is upper bounded by the number of polymers containing $v$ .

Now fix some $(\Gamma,\Gamma^{\prime})\in E^{*}(\mathcal{M}^{\prime})$ such that $(\Gamma_{0},\Gamma_{0}^{\prime})\in\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ and assume that $\Gamma^{\prime}=\Gamma\cup\gamma$ for some polymer $\gamma\in\mathcal{C}(G)$ with $v\in{\gamma}$ . Then,

[TABLE]

As the path $\mathcal{P}_{\Gamma,\Gamma^{\prime}}$ is obtained by changing the spins vertex by vertex in the corresponding spin configurations, $f(\Gamma)$ and $f(\Gamma_{0})$ differ at at most $|\gamma|$ vertices. The condition of the theorem implies that

[TABLE]

The update rule of the restricted Glauber dynamics gives

[TABLE]

and for the polymer dynamics we have

[TABLE]

Let $\gamma_{v}$ denote a polymer on $\{v\}$ with a spin from $\{0,\dots,q-1\}\setminus\{g_{v}\}$ . Then, the polymer mixing condition implies that $\sum_{\gamma\nsim\gamma_{v}}|\gamma|w_{\gamma}\leq\theta|\gamma_{v}|<1$ for some $\theta\in(0,1)$ . Combining this and inequalities (6), (7), (8) and (9), we get

[TABLE]

For the case where $\sigma_{0}(v)\neq g_{v}$ and $\sigma_{0}^{\prime}(v)=g_{v}$ , the proof is almost the same and we can get the same bound. Thus,

[TABLE]

The theorem then follows from Theorem 2 and Lemma 24 once we notice that $P(\Gamma,\Gamma)\geq 1/q$ and that $\mu_{G}(\Gamma)\geq 1/(\eta q)^{n}$ for all $\Gamma\in\Omega$ . ∎

5.2 Truncated polymer model

The bound on the mixing time of the restricted Glauber dynamics in Theorem 25 is exponential in the size of the largest polymer, which is in general undesirable. For example, in our applications in Section 4, $M$ was linear in the number of vertices of the host graph. Here, we show that, under the polymer sampling condition, we can restrict our attention to polymers of size $O(\log n)$ in the sense that the partition function as well as the Gibbs distribution of the truncated polymer model are close to those of the original polymer model.

Let $(\mathcal{C}(G),w)$ be a polymer model on a graph $G$ . For $k>0$ , define the truncated polymer model $(\mathcal{C}_{k}(G),w)$ by

$$\mathcal{C}_{k}(G)=\{\gamma\in\mathcal{C}(G):|\gamma|\leq k\}.$$

Also we let

$$\Omega_{k}=\{\Gamma\in\Omega:\Gamma\subseteq\mathcal{C}_{k}(G)\}$$

be the set of allowed configurations (note that $\Omega_{k}\subseteq\Omega$ ). The partition function of the truncated polymer model $(\mathcal{C}_{k}(G),w)$ is given by

$$Z_{k}(G)=\sum_{\Gamma\in\Omega_{k}}\prod_{\gamma\in\Gamma}w_{\gamma}.$$

The corresponding Gibbs distribution on $\Omega_{k}$ is defined by $\mu_{G,k}(\Gamma)=\frac{\prod_{\gamma\in\Gamma}w_{\gamma}}{Z_{k}(G)}$ . We remark that if the original polymer model satisfies the polymer sampling condition then so does the truncated polymer model, and thus Theorem 5 also applies to the truncated model.

The following lemma asserts that the Gibbs distribution and the partition function of the truncated polymer model $(\mathcal{C}_{k}(G),w)$ are close to those of the original model $(\mathcal{C}(G),w)$ , provided that the polymer sampling condition (3) holds.

Lemma 26.

Let $\mathcal{G}$ be a family of graphs of maximum degree at most $\Delta$ and let $(\mathcal{C}(\cdot),w,\mathcal{G})$ be a polymer model that satisfies the polymer sampling condition (3) with constant $\tau\geq 5+3\log((q-1)\Delta)$ . Let $G$ be an $n$ -vertex graph from $\mathcal{G}$ . Then for any $\varepsilon>0$ and $k=\frac{3\log(2n/\varepsilon)}{2\tau}$ , we have

$$Z_{k}(G)\leq Z(G)\leq e^{\varepsilon}Z_{k}(G).$$

Moreover, the total variation distance between $\mu_{G}$ and $\mu_{G,k}$ is at most $\varepsilon$ .

Proof.

Note that $Z_{k}(G)\leq Z(G)$ follows immediately from $\Omega_{k}\subseteq\Omega$ . For $\Gamma\in\Omega_{k}$ , let $\Omega(\Gamma)=\{\Gamma^{\prime}\in\Omega:\Gamma^{\prime}\cap\mathcal{C}_{k}(G)=\Gamma\}$ and let

[TABLE]

Let $\mathcal{C}_{k}^{+}(G)=\mathcal{C}(G)\setminus\mathcal{C}_{k}(G)=\{\gamma\in\mathcal{C}(G):|\gamma|>k\}$ be the collection of all polymers of size greater than $k$ . Notice that for each $\Gamma\in\Omega_{k}$ we have the crude bound

[TABLE]

Combining (10) and (11), we obtain that

[TABLE]

Since $(1+x)\leq e^{x}$ for all real $x$ , we have that

[TABLE]

The last inequality follows from the fact that $(\mathcal{C}(\cdot),w,\mathcal{G})$ satisfies the polymer sampling condition with constant $\tau$ . Then we deduce from Lemma 14 that

[TABLE]

and since $\tau\geq 5+3\log((q-1)\Delta)$ , we get $e(q-1)\Delta\leq e^{\tau/3}$ . It follows that

[TABLE]

Combining (12), (13), (14), and (15) yields $Z(G)\leq e^{\varepsilon}Z_{k}(G)$ , as needed. Finally, we bound the total variation distance between $\mu_{G}$ and $\mu_{G,k}$ :

[TABLE]

where the first equality is because $\mu_{G}(\Gamma)>\mu_{G,k}(\Gamma)$ if and only if $\Gamma\in\Omega\setminus\Omega_{k}$ , for which we have $\mu_{G,k}(\Gamma)=0$ . This finishes the proof. ∎
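Since $\mu_{G,k}$ is $\mu_{G}$ conditioned on $\Omega_{k}$ , the total variation distance equals $1-Z_{k}(G)/Z(G)$ , which is how the bound on the partition functions transfers to the distributions. The sketch below checks this identity by brute force on a toy model that is an assumption for illustration only (a path on `n` vertices, polymers given by runs of 1s, weight $e^{-\tau|\gamma|}$ ); it does not attempt to verify the polymer sampling condition.

```python
import itertools
import math

def run_lengths(sigma):
    """Sizes of maximal runs of 1s, i.e. the polymer sizes on the path."""
    sizes, run = [], 0
    for s in sigma:
        if s:
            run += 1
        elif run:
            sizes.append(run)
            run = 0
    if run:
        sizes.append(run)
    return sizes

def truncation_report(n, k, tau):
    """Return (Z, Z_k, TV distance between mu and mu_k) by full enumeration."""
    configs = list(itertools.product((0, 1), repeat=n))
    w = {s: math.exp(-tau * sum(run_lengths(s))) for s in configs}
    small = {s for s in configs if all(r <= k for r in run_lengths(s))}
    Z = sum(w.values())
    Zk = sum(w[s] for s in small)
    tv = 0.5 * sum(abs(w[s] / Z - (w[s] / Zk if s in small else 0.0))
                   for s in configs)
    return Z, Zk, tv

Z, Zk, tv = truncation_report(n=8, k=3, tau=1.5)
```

In this toy model the untruncated partition function factorises as $(1+e^{-\tau})^{n}$ , which gives an independent check on the enumeration.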

5.3 Applications

In this section, we apply the previous results to show that (spin) Glauber dynamics for the ferromagnetic Potts and hard-core models mix in polynomial time on expander graphs, when restricted to configurations close to the ground states (which, as we have already seen, constitute the main portion of the probability space at low temperatures).

5.3.1 Restricted Glauber for ferromagnetic Potts

In this section, we prove Theorem 13 for the $q$ -color ferromagnetic Potts model. Throughout this section, we will work under the assumptions/conditions of Theorem 9. That is, we fix a real number $\alpha>0$ , integers $q\geq 2$ and $\Delta\geq 3$ and a real number $\beta\geq\frac{5+3\log((q-1)\Delta)}{\alpha}$ . We let $\mathcal{G}$ be the class of $\alpha$ -expander graphs $G$ with maximum degree at most $\Delta$ .

Let $G$ be an $n$ -vertex graph in $\mathcal{G}$ and let $\varepsilon$ be a value in $(qe^{-n},1)$ . As in Section 4.1, we will consider the polymer model $(\mathcal{C}^{g},w^{g})$ whose polymers are connected subgraphs of $G$ with at most $n/2$ vertices, which are labeled by the remaining colors $[q]\setminus\{g\}$ . In fact, following Section 5.2, we will work with a truncation of this model. Namely, for $M>0$ , let $(\mathcal{C}^{g}_{M},w^{g})$ be the polymer model on $G$ restricted to polymers of size at most $M$ .

Observation 27.

For every $M>0$ , the set $\Omega^{g}_{M}(G)$ , as defined in Section 1.3, is precisely the set of allowable polymer configurations in the truncated polymer model $(\mathcal{C}^{g}_{M},w^{g})$ .

See Theorem 13.

Proof.

We let $\mathcal{G}$ be the class of $\alpha$ -expanders with maximum degree at most $\Delta$ . For the given $n$ -vertex graph $G\in\mathcal{G}$ , let $\mu^{g}_{G,M}$ be the Gibbs distribution of the polymer model $(\mathcal{C}^{g}_{M}(G),w^{g})$ .

By Lemma 20, we have that $(\mathcal{C}^{g}(\cdot),w^{g},\mathcal{G})$ satisfies the polymer sampling condition with $\tau=\alpha\beta$ and hence so does the truncated polymer model $(\mathcal{C}^{g}_{M}(\cdot),w^{g},\mathcal{G})$ . The result therefore follows by applying Theorem 25, after observing that (i) the polymer model $(\mathcal{C}^{g}_{M}(G),w^{g})$ is single-update-compatible (use DFS ordering), and (ii) for a pair of polymer configurations $\Gamma,\Gamma^{\prime}\in\Omega^{g}_{M}$ whose corresponding spin configurations $\sigma,\sigma^{\prime}$ differ at a vertex, we have

[TABLE]

where $\eta=\exp(\beta\Delta)$ (since $G$ has maximum degree $\Delta$ , changing the spin of a vertex can create at most $\Delta$ new monochromatic/bichromatic edges). This finishes the proof. ∎
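The bound $\eta=\exp(\beta\Delta)$ can be checked exhaustively on a small graph. The sketch assumes the standard ferromagnetic Potts weight $\exp(\beta\cdot\#\{\text{monochromatic edges}\})$ , so that a ratio of weights is determined by the change in the number of monochromatic edges; the graph (a 5-cycle with one chord, maximum degree 3) and the function names are illustrative choices.

```python
import itertools

def mono_edges(edges, sigma):
    """Number of monochromatic edges under the colouring sigma."""
    return sum(1 for u, v in edges if sigma[u] == sigma[v])

def max_abs_log_ratio(edges, n, q, beta):
    """Largest |log(weight ratio)| over all single-vertex recolourings."""
    worst = 0.0
    for sigma in itertools.product(range(q), repeat=n):
        base = mono_edges(edges, sigma)
        for v in range(n):
            for s in range(q):
                changed = sigma[:v] + (s,) + sigma[v + 1:]
                worst = max(worst, abs(beta * (mono_edges(edges, changed) - base)))
    return worst

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]  # max degree Delta = 3
beta, Delta = 2.0, 3
worst = max_abs_log_ratio(edges, n=5, q=3, beta=beta)
```

On this graph the bound is attained, for example by recolouring vertex 0 when all vertices share a colour.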

The following lemma justifies that the set $\Omega^{g}_{M}(G)$ with $M=O(\log(n/\varepsilon))$ accounts for all but an $\varepsilon$ fraction of the aggregate weight of colorings in the Potts distribution on $G$ .

Lemma 28.

Suppose $q\geq 2$ , $\Delta\geq 3$ are integers and $\alpha>0$ is a real. Let $\beta\geq\frac{5+3\log((q-1)\Delta)}{\alpha}$ be a real number and $g\in[q]$ . Then, for any $n$ -vertex $\alpha$ -expander graph $G$ of maximum degree $\Delta$ and any $qe^{-n}\leq\varepsilon<1$ , for $M=\frac{3\log(4n/\varepsilon)}{2\alpha\beta}$ , $qZ^{g}_{M}(G)$ is an $\varepsilon$ -approximation of the Potts partition function $Z_{G,\beta}$ .

Proof.

Let $Z^{g}(G)$ be the partition function of the polymer model $(\mathcal{C}^{g}(G),w^{g})$ . Since $\beta\geq\frac{5+3\log((q-1)\Delta)}{\alpha}>\frac{2\log(eq)}{\alpha}$ and $\varepsilon\geq qe^{-n}$ , by Lemma 21 we have that $qZ^{g}(G)$ is an $(\varepsilon/2)$ -approximation to $Z_{G,\beta}$ . If $\mathcal{G}$ is the class of $\alpha$ -expanders with maximum degree at most $\Delta$ then by Lemma 20, we have that $(\mathcal{C}^{g}(\cdot),w^{g},\mathcal{G})$ satisfies the polymer sampling condition with $\tau=\alpha\beta$ and hence so does the truncated polymer model $(\mathcal{C}^{g}_{M}(\cdot),w^{g},\mathcal{G})$ . It follows by Lemma 26 that, for $M=\frac{3\log(4n/\varepsilon)}{2\alpha\beta}$ , $qZ^{g}_{M}(G)$ is an $(\varepsilon/2)$ -approximation to $qZ^{g}(G)$ . Therefore, $qZ^{g}_{M}(G)$ is an $\varepsilon$ -approximation to $Z_{G,\beta}$ . ∎
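To get a feel for the sizes involved, the threshold on $\beta$ and the truncation size $M$ of Lemma 28 can be evaluated numerically. The helper names below are hypothetical, and logarithms are taken to be natural, which is an assumption consistent with the exponentials used throughout.

```python
import math

def potts_beta_threshold(q, Delta, alpha):
    """Smallest beta allowed by the condition beta >= (5 + 3 log((q-1) Delta)) / alpha."""
    return (5 + 3 * math.log((q - 1) * Delta)) / alpha

def potts_truncation_size(n, eps, q, alpha, beta):
    """M = 3 log(4n / eps) / (2 alpha beta), for q e^{-n} <= eps < 1."""
    if not (q * math.exp(-n) <= eps < 1):
        raise ValueError("Lemma 28 requires q*exp(-n) <= eps < 1")
    return 3 * math.log(4 * n / eps) / (2 * alpha * beta)

beta_min = potts_beta_threshold(q=3, Delta=3, alpha=0.5)
M = potts_truncation_size(n=10**6, eps=0.01, q=3, alpha=0.5, beta=beta_min)
```

For these illustrative parameters the threshold is about 20.75 and $M$ is below 3, so only very small polymers survive the truncation.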

5.3.2 Restricted Glauber dynamics for hard-core mixes in polynomial time

In this section, we state and prove the analogue of Theorem 13 for the hard-core model. In particular, let $G=(V^{0},V^{1},E)$ be an $n$ -vertex $\alpha$ -expanding bipartite graph of maximum degree $\Delta$ , and for $i\in\{0,1\}$ and $M>0$ , let $\Omega^{i}_{M}(G)$ denote the independent sets $I$ whose deviations from the ground state $V^{i}$ consist of small connected components; more precisely, $(V^{i}\backslash I)\cup(I\cap V^{1-i})$ consists of connected components of size at most $M$ . Using similar methods to Section 5.3.1, we will show the following.

Theorem 29.

Fix $\alpha>0$ , $\Delta\geq 3$ and $\lambda\geq(6\Delta)^{3+6/\alpha}$ . For any $n$ -vertex bipartite $\alpha$ -expander with maximum degree at most $\Delta$ and any $\varepsilon\in(0,1)$ and $i\in\{0,1\}$ , with $M=O(\log(n/\varepsilon))$ , the Glauber dynamics restricted to $\Omega^{i}_{M}(G)$ has mixing time $T_{\mathrm{mix}}(\varepsilon)$ polynomial in $n$ and $1/\varepsilon$ .

As we shall see in Lemma 32, for $\lambda$ large enough the set $\Omega^{0}_{M}(G)\cup\Omega^{1}_{M}(G)$ with $M=\Theta(\log(n/\varepsilon))$ captures all but an $\varepsilon$ fraction of the weight of the hard-core partition function, and hence Theorem 29 can be used to obtain another polynomial-time algorithm for the hard-core model on expanding graphs $G$ in that regime.

To prove Theorem 29, it will be simpler to work with somewhat different polymer models than those in Section 4.2. These models were originally used in [20] (the conference version of [21]). For $i\in\{0,1\}$ , and following [20], we will define a polymer model $(\mathcal{C}^{i}(G),w^{i})$ . The host graph will be $G$ and the model will capture deviations from the ground state $V^{i}$ : a polymer $\gamma$ will be a connected set of vertices in $G$ such that $(V^{i}\backslash I)\cup(I\cap V^{1-i})=\gamma$ for some independent set $I$ . Specifically, the set $\mathcal{C}^{i}=\mathcal{C}^{i}(G)$ of allowed polymers consists of all connected sets of vertices $\gamma$ (in $G$ ) such that $|\gamma\cap V^{i}|\leq|V^{i}|/4$ and for any $v\in\gamma\cap V^{1-{i}}$ , all of the neighbors of $v$ (in $G$ ) are also in $\gamma$ . The set of spins is $\{0,1\}$ and the ground state spin for a vertex $v$ is $g_{v}=1$ if $v\in V^{i}$ , and $g_{v}=0$ if $v\in V^{1-i}$ ; the spin assignment $\sigma_{\gamma}$ for a polymer $\gamma$ is given by $1-g_{v}$ for $v\in\gamma$ . The weight of a polymer $\gamma\in\mathcal{C}^{i}$ is defined as

$$w^{i}_{\gamma}=\frac{\lambda^{|\gamma\cap V^{1-i}|}}{\lambda^{|\gamma\cap V^{i}|}}.$$

The main observation behind the definition of the weight $w^{i}_{\gamma}$ is that the weight of an independent set $I$ such that $(V^{i}\backslash I)\cup(I\cap V^{1-i})=\gamma$ is $\lambda^{|V^{i}|}w^{i}_{\gamma}$ .
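The correspondence between an independent set and its deviation polymers can be made concrete on a small example. In the sketch below the function names are hypothetical, and the weight $\lambda^{|\gamma\cap V^{1-i}|-|\gamma\cap V^{i}|}$ is inferred from the identity stated above (weight of $I$ equals $\lambda^{|V^{i}|}w^{i}_{\gamma}$ ) rather than quoted.

```python
import math

def deviation_polymers(adj, ground_side, I):
    """Connected components of (V^i minus I) union (I intersect V^{1-i}),
    where V^i = ground_side and the graph is bipartite."""
    ground_side, I = set(ground_side), set(I)
    defect = (ground_side - I) | (I - ground_side)
    comps, seen = [], set()
    for v in sorted(defect):
        if v in seen:
            continue
        seen.add(v)
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            comp.add(u)
            for x in adj[u]:
                if x in defect and x not in seen:
                    seen.add(x)
                    stack.append(x)
        comps.append(frozenset(comp))
    return comps

def polymer_weight(gamma, ground_side, lam):
    """lam ** (|gamma ∩ V^{1-i}| - |gamma ∩ V^i|), inferred from the stated identity."""
    inside = len(gamma & set(ground_side))
    return lam ** ((len(gamma) - inside) - inside)

# Example (assumed): 6-cycle with V^0 = {0, 2, 4}, ground state i = 0, I = {1, 4}.
cycle = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
V0, I, lam = {0, 2, 4}, {1, 4}, 5.0
polymers = deviation_polymers(cycle, V0, I)
product = math.prod(polymer_weight(g, V0, lam) for g in polymers)
```

Here the single polymer is $\{0,1,2\}$ , and $\lambda^{|I|}=\lambda^{2}$ agrees with $\lambda^{|V^{0}|}\cdot\lambda^{-1}$ .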

Following again Section 5.2, it will be relevant to consider, for $M>0$ , the truncated polymer model $(\mathcal{C}^{i}_{M}(G),w^{i})$ whose polymers are of size at most $M$ ; observe that the set $\Omega^{i}_{M}$ defined above is precisely the set of allowable polymer configurations in the truncated polymer model. We next verify the polymer sampling condition (3) for these models and conclude the proof of Theorem 29.

Lemma 30.

Fix $\alpha>0$ and $\Delta\geq 3$ . Let $\mathcal{G}$ be the class of bipartite $\alpha$ -expanders with maximum degree at most $\Delta$ . For $\lambda\geq(6\Delta)^{3+6/\alpha}$ and $i\in\{0,1\}$ , the polymer model $(\mathcal{C}^{i}(\cdot),w^{i},\mathcal{G})$ satisfies the polymer sampling condition (3) with $\tau=\frac{\alpha}{2+\alpha}\log\lambda$ .

Proof.

We have $\tau\geq 5+3\log\Delta$ , so it suffices to show that for $G\in\mathcal{G}$ and $\gamma\in(\mathcal{C}^{i}(G),w^{i})$ it holds that $w^{i}_{\gamma}\leq e^{-\tau|\gamma|}$ . For $v\in\gamma\cap V^{1-{i}}$ we have that all of the neighbors of $v$ are also in $\gamma$ and hence, by the $\alpha$ -expansion of $G$ , we have that $|\gamma\cap V^{i}|\geq(1+\alpha)|\gamma\cap V^{1-i}|$ . This gives $(2+\alpha)|\gamma\cap V^{i}|\geq(1+\alpha)|\gamma|$ and therefore

[TABLE]
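For completeness, the two elementary steps used in the proof above can be spelled out as follows; this is a worked check, with $\log$ taken to be the natural logarithm. For $\lambda\geq(6\Delta)^{3+6/\alpha}$ ,

$$\tau=\frac{\alpha}{2+\alpha}\log\lambda\;\geq\;\frac{\alpha}{2+\alpha}\cdot\frac{3(\alpha+2)}{\alpha}\log(6\Delta)=3\log 6+3\log\Delta\;\geq\;5+3\log\Delta,$$

since $3\log 6\approx 5.375$ . Likewise, $|\gamma\cap V^{i}|\geq(1+\alpha)|\gamma\cap V^{1-i}|$ gives

$$|\gamma|=|\gamma\cap V^{i}|+|\gamma\cap V^{1-i}|\;\leq\;\Big(1+\frac{1}{1+\alpha}\Big)|\gamma\cap V^{i}|=\frac{2+\alpha}{1+\alpha}\,|\gamma\cap V^{i}|,$$

which rearranges to $(2+\alpha)|\gamma\cap V^{i}|\geq(1+\alpha)|\gamma|$ .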

Proof of Theorem 29.

Let $\mathcal{G}$ be the class of bipartite $\alpha$ -expanders with maximum degree at most $\Delta$ . Consider an $n$ -vertex graph $G\in\mathcal{G}$ and let $\mu^{i}_{G,M}$ be the Gibbs distribution of the model $(\mathcal{C}^{i}_{M}(G),w^{i})$ .

By Lemma 30, we have that $(\mathcal{C}^{i}(\cdot),w^{i},\mathcal{G})$ satisfies the polymer sampling condition with $\tau=\frac{\alpha}{2+\alpha}\log\lambda$ and hence so does the truncated polymer model $(\mathcal{C}^{i}_{M}(\cdot),w^{i},\mathcal{G})$ . The result therefore follows by applying Theorem 25, after observing that (i) the polymer model $(\mathcal{C}^{i}_{M}(G),w^{i})$ is single-update-compatible (use DFS ordering), and (ii) for a pair of polymer configurations $\Gamma,\Gamma^{\prime}\in\Omega^{i}_{M}$ whose corresponding independent sets $I,I^{\prime}$ differ in at most one vertex, we have

[TABLE]

Finally, we justify that, for large enough $\lambda$ and $M=\Theta(\log(n/\varepsilon))$ , the aggregate weight of independent sets in $\Omega^{0}_{M}(G)\cup\Omega^{1}_{M}(G)$ captures all but an $\varepsilon$ fraction of the hard-core partition function $Z_{G,\lambda}$ . Let $Z^{i}(G)$ denote the partition function of the polymer model $(\mathcal{C}^{i},w^{i})$ and $Z^{i}_{M}(G)$ denote the partition function of the truncated polymer model $(\mathcal{C}^{i}_{M},w^{i})$ . We will need the following lemma from [20].

Lemma 31 ([20, Lemmas 4.1 & 4.2]).

For $\lambda>\max\big{\{}(2e)^{\frac{8n}{\alpha n_{0}}},(2e)^{\frac{8n}{\alpha n_{1}}},(2e)^{(40/\alpha)}\big{\}}$ , the number $\lambda^{n_{0}}Z^{0}(G)+\lambda^{n_{1}}Z^{1}(G)$ is a $(2e^{-n})$ -approximation of the hard-core partition function $Z_{G,\lambda}$ , where $n_{i}=|V^{i}|$ for $i\in\{0,1\}$ .

Lemma 32.

Fix $\alpha\in(0,1)$ and $\Delta\geq 3$ . There exists a constant $C>0$ such that for $\lambda>(6C\Delta)^{3+6/\alpha}$ the following holds for all $n$ -vertex bipartite $\alpha$ -expander graphs $G=(V^{0},V^{1},E)$ of maximum degree at most $\Delta$ . For all $\varepsilon\in(4e^{-n},1)$ and $M=\frac{3(2+\alpha)\log(4n/\varepsilon)}{2\alpha\log\lambda}$ , the number

[TABLE]

is an $\varepsilon$ -approximation of the hard-core partition function $Z_{G,\lambda}$ , where $n_{i}=|V^{i}|$ for $i\in\{0,1\}$ .

Proof.

Let $n_{i}=|V^{i}|$ for $i\in\{0,1\}$ and observe that $n/n_{0},n/n_{1}\leq 3$ (using that $G$ is an $\alpha$ -expander for $\alpha\in(0,1)$ , see [20] for details). Therefore, by taking $C$ large enough, we have that for all $\lambda>(6C\Delta)^{3+6/\alpha}$ both Lemmas 30 and 31 apply. Let $\varepsilon\in(4e^{-n},1)$ .

Since $\varepsilon>4e^{-n}$ , we have $2e^{-n}<\varepsilon/2$ , so by Lemma 31 the number $\lambda^{n_{0}}Z^{0}(G)+\lambda^{n_{1}}Z^{1}(G)$ is an $(\varepsilon/2)$ -approximation to $Z_{G,\lambda}$ . By Lemma 30, we have that, for $i\in\{0,1\}$ , $(\mathcal{C}^{i}(\cdot),w^{i},\mathcal{G})$ satisfies the polymer sampling condition with $\tau=\frac{\alpha}{2+\alpha}\log\lambda$ and hence so does the truncated polymer model $(\mathcal{C}^{i}_{M}(\cdot),w^{i},\mathcal{G})$ . (Here, as usual, we take $\mathcal{G}$ to be the class of bipartite $\alpha$ -expanders with maximum degree at most $\Delta$ .) It follows by Lemma 26, applied with $\varepsilon/2$ in place of $\varepsilon$ , that, for $M=\frac{3(2+\alpha)\log(4n/\varepsilon)}{2\alpha\log\lambda}$ , $Z^{i}_{M}(G)$ is an $(\varepsilon/2)$ -approximation to $Z^{i}(G)$ . Therefore, $\hat{Z}$ is an $\varepsilon$ -approximation to $Z_{G,\lambda}$ . ∎

Bibliography


[1] A. Barvinok. Combinatorics and Complexity of Partition Functions. Algorithms and Combinatorics. Springer International Publishing, 2017.
[2] I. Bezáková, D. Štefankovič, V. V. Vazirani, and E. Vigoda. Accelerating simulated annealing for the permanent and combinatorial counting problems. SIAM Journal on Computing, 37(5):1429–1454, 2008.
[3] M. Bordewich, C. Greenhill, and V. Patel. Mixing of the Glauber dynamics for the ferromagnetic Potts model. Random Structures & Algorithms, 48(1):21–52, 2016.
[4] C. Borgs. Absence of zeros for the chromatic polynomial on bounded degree graphs. Combinatorics, Probability and Computing, 15(1-2):63–74, 2006.
[5] C. Borgs, J. T. Chayes, A. Frieze, J. H. Kim, P. Tetali, E. Vigoda, and V. H. Vu. Torpid mixing of some Monte Carlo Markov chain algorithms in statistical physics. In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 218–229, 1999.
[6] C. Borgs, J. T. Chayes, J. Kahn, and L. Lovász. Left and right convergence of graphs with bounded degree. Random Structures & Algorithms, 42(1):1–28, 2013.
[7] C. Borgs and J. Z. Imbrie. A unified approach to phase diagrams in field theory and statistical mechanics. Communications in Mathematical Physics, 123(2):305–328, 1989.
[8] P. Diaconis and L. Saloff-Coste. Comparison techniques for random walk on finite groups. Ann. Probab., 21(4):2131–2156, 1993.


These results were announced in preliminary form (without proofs) as a brief abstract in the proceedings of APPROX/RANDOM 2019.
