Exponentially slow mixing in the mean-field Swendsen-Wang dynamics

Reza Gheissari; Eyal Lubetzky; Yuval Peres

arXiv:1702.05797 · math.PR · May 3, 2017 · SODA

TL;DR

This paper proves that the Swendsen-Wang dynamics for the mean-field Potts model with three or more colors has mixing time exponential in the number of vertices throughout the critical window $(\beta_{s},\beta_{S})$, improving the previously known $\exp(c\sqrt{n})$ lower bounds.

Contribution

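It establishes an $\exp(cn)$ lower bound on the mixing time of Swendsen-Wang dynamics in the mean-field setting throughout the critical window, improving the previous $\exp(c\sqrt{n})$ lower bounds and matching the known $\exp(c^{\prime}n)$ upper bound up to the constant in the exponent.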

Findings

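01. The mixing time is at least $\exp(cn)$, that is, exponential in the number of vertices $n$.

02. The result applies to the mean-field Potts model with $q\geq 3$ and $\beta$ in the critical window $(\beta_{s},\beta_{S})$.

03. The same exponential bound holds for the related FK model samplers (Glauber and Chayes–Machta dynamics) when $q>2$.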

Abstract

Swendsen-Wang dynamics for the Potts model was proposed in the late 1980's as an alternative to single-site heat-bath dynamics, in which global updates allow this MCMC sampler to switch between metastable states and ideally mix faster. Gore and Jerrum (1999) found that this dynamics may in fact exhibit slow mixing: they showed that, for the Potts model with $q \geq 3$ colors on the complete graph on $n$ vertices at the critical point $\beta_{c}(q)$, Swendsen-Wang dynamics has $t_{\mathrm{mix}} \geq \exp(c\sqrt{n})$. The same lower bound was extended to the critical window $(\beta_{s}, \beta_{S})$ around $\beta_{c}$ by Galanis et al. (2015), as well as to the corresponding mean-field FK model by Blanca and Sinclair (2015). In both cases, an upper bound of $t_{\mathrm{mix}} \leq \exp(c^{\prime} n)$ was known. Here we show that the mixing time is truly exponential in $n$: namely, $t_{\mathrm{mix}} \geq \exp…

Equations

\lambda_{s}=\min_{z\geq 0}\Bigl\{z+\frac{qz}{e^{z}-1}\Bigr\},\qquad \lambda_{c}=\frac{2(q-1)\log(q-1)}{q-2},\qquad \lambda_{S}=q,

\lim_{n\to\infty}\mu_{n,\lambda,q}\biggl(\sigma:\max_{r=1,\ldots,q}\Bigl|\tfrac{1}{n}\sum_{i\leq n}\mathbf{1}\{\sigma_{i}=r\}-\tfrac{1}{q}\Bigr|<\varepsilon\biggr)=1,

\lim_{n\to\infty}\mu_{n,\lambda,q}\biggl(\sigma:\max_{r=2,\ldots,q}\Bigl\{\Bigl|\tfrac{1}{n}\sum_{i\leq n}\mathbf{1}\{\sigma_{i}=1\}-a\Bigr|,\Bigl|\tfrac{1}{n}\sum_{i\leq n}\mathbf{1}\{\sigma_{i}=r\}-\tfrac{1-a}{q-1}\Bigr|\Bigr\}<\varepsilon\biggr)=\frac{1}{q}.

\lim_{n\to\infty}\mu_{n,\lambda,q}

\pi_{n,\lambda,1}(|\mathcal{C}_{x}|\geq k)\leq e^{-\frac{(1-\lambda)^{2}k}{2}}.

\pi_{n,\lambda,1}(|{\mathscr{L}}_{1}-\theta_{\lambda}n|\geq\varepsilon n)\lesssim e^{-c\varepsilon^{2}n}.

F_{\lambda}(z)=\begin{cases}\theta_{\lambda z}z+\tfrac{1}{q}(1-\theta_{\lambda z}z)&\text{for }z>1/\lambda,\\ \tfrac{1}{q}&\text{for }z\leq 1/\lambda.\end{cases}

\log\frac{(q-1)a}{1-a}=\lambda\Bigl(a-\frac{1-a}{q-1}\Bigr).

f(\theta)=\tfrac{1+(q-1)\theta}{q}\,\theta_{\lambda(1+(q-1)\theta)/q},

\|\nu-\pi\|_{\textsc{tv}}=\sup_{A\subset\Omega}|\nu(A)-\pi(A)|=\frac{1}{2}\|\nu-\pi\|_{\ell^{1}}.

t_{\textsc{mix}}=\inf\Bigl\{t:\max_{X_{0}\in\Omega}\|P^{t}(X_{0},\cdot)-\pi\|_{\textsc{tv}}<1/(2e)\Bigr\}.

\text{\tt{gap}}^{-1}-1\leq t_{\textsc{mix}}\leq\log(2e/\pi_{\mathrm{min}})\,\text{\tt{gap}}^{-1}.

(1-p+p/q)\,\text{\tt{gap}}_{\textsc{rc}}

\text{\tt{gap}}_{\textsc{rc}}

S_{M}:=S_{M}(\omega)=\{x\in V:|\mathcal{C}_{x}|>M\}.

\pi_{n,\lambda}(|S_{M}|\geq\rho n)\lesssim e^{-c\rho n}.

\pi_{n,\lambda}\Bigl(|\mathcal{C}_{x}|\geq k\;\Big|\;\mathcal{C}_{y_{1}},\ldots,\mathcal{C}_{y_{\ell}},\;\mathcal{C}_{x}\cap\bigl(\textstyle\bigcup_{i=1}^{\ell}\mathcal{C}_{y_{i}}\bigr)=\emptyset\Bigr)\leq e^{-c_{1}k}.

\pi_{n,\lambda}(Y_{M}\geq ne^{-c_{1}M}+t)\leq e^{-t^{2}/(2n)}.

K=(e^{-c_{1}M}+M^{-1})n;

\pi_{n,\lambda}(|S_{M}|\geq\rho n)\leq\pi_{n,\lambda}(Y_{M}\geq K)+\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}{\mathscr{L}}_{i}\geq\rho n\biggr)\leq e^{-n/(2M^{2})}+\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}{\mathscr{L}}_{i}\geq\rho n\biggr).

\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}{\mathscr{L}}_{i}\geq\rho n\biggr)\leq\Bigl(\frac{en}{K}\Bigr)^{K}\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}Z_{i}\geq K\mathbb{E}[Z_{i}]+\rho n/2\biggr).

\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}{\mathscr{L}}_{i}\geq\rho n\biggr)\leq\left(\frac{e}{e^{-c_{1}M}+M^{-1}}\right)^{(e^{-c_{1}M}+M^{-1})n}e^{-\frac{\rho n}{4b}}\lesssim e^{-c_{2}\rho n}.

\mathbb{P}\bigl(\bigl||R|-\alpha n\bigr|\geq(\varepsilon+\delta)n\bigr)\leq 2\exp\Bigl(-\frac{\delta^{2}n}{2M^{2}}\Bigr).

\mathbb{P}\bigl(|R\cup S_{M}|\geq(\alpha+\varepsilon+\delta)n\bigr)\leq\mathbb{P}\bigl(|R-S_{M}|\geq(\alpha+\delta)n\bigr),

\mathbb{P}\bigl(|R-S_{M}|-\alpha(n-|S_{M}|)\geq\delta n+\alpha|S_{M}|\bigr)\leq\exp\Bigl(-\frac{(\delta n+\alpha|S_{M}|)^{2}}{2(n-|S_{M}|)M^{2}}\Bigr)\leq e^{-\delta^{2}n/(2M^{2})}.

A_{\rho}=\Bigl\{\sigma\in\{1,...,q\}^{n}:\max_{r=1,...,q}\Bigl|\sum_{i=1}^{n}\mathbf{1}\{\sigma_{i}=r\}-\frac{n}{q}\Bigr|<\rho n\Bigr\}.

\max_{X_{0}\in A_{\rho}}\mathbb{P}_{X_{0}}(X_{1}\notin A_{\rho})\lesssim Ce^{-cn}.

\mathbb{P}_{X_{0}}(|S_{M}(\omega_{1}^{i})|\geq\delta n)=\pi_{v_{0}^{i},\lambda}(|S_{M}|\geq\delta n)\lesssim e^{-c\delta n}.

\mathbb{P}_{X_{0}}\Bigl(\bigcup_{i=1}^{q}\{|S_{M}(\omega_{1}^{i})|\geq\delta n\}\Bigr)\lesssim e^{-c\delta n}.

Full text

Exponentially slow mixing in the mean-field Swendsen–Wang dynamics

Reza Gheissari, Eyal Lubetzky and Yuval Peres

R. Gheissari and E. Lubetzky: Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA.

Y. Peres: Microsoft Research, 1 Microsoft Way, Redmond, WA 98052, USA.

Abstract.

Swendsen–Wang dynamics for the Potts model was proposed in the late 1980’s as an alternative to single-site heat-bath dynamics, in which global updates allow this MCMC sampler to switch between metastable states and ideally mix faster. Gore and Jerrum (1999) found that this dynamics may in fact exhibit slow mixing: they showed that, for the Potts model with $q\geq 3$ colors on the complete graph on $n$ vertices at the critical point $\beta_{c}(q)$, Swendsen–Wang dynamics has $t_{\textsc{mix}}\geq\exp(c\sqrt{n})$. The same lower bound was extended to the critical window $(\beta_{s},\beta_{S})$ around $\beta_{c}$ by Galanis et al. (2015), as well as to the corresponding mean-field FK model by Blanca and Sinclair (2015). In both cases, an upper bound of $t_{\textsc{mix}}\leq\exp(c^{\prime}n)$ was known. Here we show that the mixing time is truly exponential in $n$: namely, $t_{\textsc{mix}}\geq\exp(cn)$ for Swendsen–Wang dynamics when $q\geq 3$ and $\beta\in(\beta_{s},\beta_{S})$, and the same bound holds for the related MCMC samplers for the mean-field FK model when $q>2$.

1. Introduction

The mean-field $q$ -state Potts model is a canonical statistical physics model extending the Curie–Weiss Ising model ( $q=2$ ) to $q\in\mathbb{N}$ possible states; for $q\geq 3$ , it is one of the simplest models to exhibit a discontinuous (first-order) phase transition. Formally, the mean-field $q$ -state Potts model with parameter $\beta$ is a probability distribution $\mu_{n,\beta,q}$ over $\{1,\ldots,q\}^{n}$ , given by $\mu_{n,\beta,q}(\sigma)\propto\exp(\frac{\beta}{n}H(\sigma))$ , where $H(\sigma)=\sum_{i<j}\mathbf{1}{\{\sigma_{i}=\sigma_{j}\}}$ . The model exhibits a phase transition at $\beta=\beta_{c}(q)$ from a disordered phase ( $\beta<\beta_{c}$ ), where the sizes of all $q$ color classes concentrate around $n/q$ , to an ordered phase ( $\beta>\beta_{c}$ ), where there is typically one color class of size $a_{\beta}n$ for $a_{\beta}>1/q$ (see §2).

As a means of overcoming low-temperature bottlenecks in the energy landscape (dominant color classes), Swendsen and Wang [19] introduced a non-local reversible Markov chain, relying on the random cluster (FK) representation of the Potts model. The mean-field FK model is the generalization of $\mathcal{G}(n,p)$ —the Erdős–Rényi random graph—parametrized by $(p=\frac{\lambda}{n},q)$ , in which the probability of a graph $G=(V,E)$ , identified with $\omega\in\Omega_{\textsc{rc}}:=\{0,1\}^{\binom{n}{2}}$ , is given by $\pi_{n,\lambda,q}(\omega)\propto p^{|E|}(1-p)^{\binom{n}{2}-|E|}q^{k(G)}$ , where $k(G)$ is the number of connected components of $G$ (clusters of $\omega$ ).

Via the Edwards–Sokal coupling [8] of the $q$-state Potts model at inverse temperature $\beta/n$ and the FK model with parameters $(p,q)$ with $p=1-e^{-\beta/n}$, the mean-field Swendsen–Wang dynamics can be formulated as follows: consider a mean-field Potts configuration $\sigma$ with $V_{1},...,V_{q}$ being the sets of vertices $V_{i}=\{x:\sigma_{x}=i\}$. An update of the dynamics, started from $\sigma$, first samples, independently for every $i=1,...,q$, a configuration $G_{i}\sim\mathcal{G}(|V_{i}|,p)$ on the subgraph of $V_{i}$, forming an FK configuration $\omega$ as the union of the $G_{i}$’s; then, it assigns an i.i.d. color $X_{C}\sim\mbox{Uni}(\{1,...,q\})$ to each cluster $C$ in $\omega$, and for every $x\in C$, sets $\sigma^{\prime}_{x}=X_{C}$ in the new state $\sigma^{\prime}$ of the Markov chain.

As apparent from the second (coloring) stage of the Swendsen–Wang algorithm, it can seamlessly jump between the $q$ ordered low-temperature metastable states where one color is dominant. It was thus expected that this MCMC sampler would converge quickly to equilibrium at all temperatures; e.g., its total variation mixing time $t_{\textsc{mix}}$ , formally defined in §2, would be at most polynomial in the system size for all $\beta>0$ .

Indeed, at $q=2$ (the Ising model) Cooper, Dyer, Frieze and Rue [6] proved that, on the complete graph, Swendsen–Wang has $t_{\textsc{mix}}=O(\sqrt{n})$ at all $\beta$ (it was later shown in [17] that $t_{\textsc{mix}}\asymp n^{1/4}$ at $\beta_{c}$ while $t_{\textsc{mix}}=O(\log n)$ at $\beta\neq\beta_{c}$ ), and Guo and Jerrum [14] recently showed that for any $n$ -vertex graph and all $\beta$ , Swendsen–Wang has $t_{\textsc{mix}}=n^{O(1)}$ (this is in contrast to single-site dynamics, where $t_{\textsc{mix}}\geq\exp(cn)$ at low temperature [7]).

Countering this intuition, however, Gore and Jerrum [13] found in 1999 that, for any $q\geq 3$ , the Swendsen–Wang dynamics for the mean-field $q$ -state Potts model has $t_{\textsc{mix}}\geq\exp(c\sqrt{n})$ for some $c(q)>0$ at its critical point $\beta_{c}(q)$ . This is a consequence of the discontinuity of the phase transition of the mean-field Potts model for $q\geq 3$ , where at $\beta_{c}(q)$ , both the $q$ ordered phases (with one dominant color class) and the disordered phase (with all color classes having roughly $n/q$ sites) are metastable.

On the lattice $(\mathbb{Z}/n\mathbb{Z})^{d}$ , the Potts model exhibits a discontinuous phase transition for some choices of $q$ (depending on $d$ ); there it was shown in [4], following [3], that Swendsen–Wang dynamics in fact has $t_{\textsc{mix}}\geq\exp(cn^{d-1})$ for all $q$ sufficiently large, suggesting that an exponential lower bound in $n$ should also hold in mean-field, believed to approximate high-dimensional tori. (The matching upper bound of [4] applies to general graphs and translates to $t_{\textsc{mix}}\leq\exp(c^{\prime}n)$ on the complete graph.) On $\mathbb{Z}^{2}$ , this lower bound was extended [11] to $q$ where the phase transition is first-order (all $q>4$ ).

For the Glauber dynamics of the mean-field Potts model, when $q\geq 3$, the mixing time for all $\beta$ was characterized in [7], where it was shown that, in discrete-time, $t_{\textsc{mix}}$ has order $n\log n$ at $\beta<\beta_{s}$, order $n^{4/3}$ at $\beta=\beta_{s}$, and finally $t_{\textsc{mix}}\geq\exp(cn)$ at $\beta>\beta_{s}$, where $\beta_{s}$ is the spinodal point corresponding to the onset of $q$ ordered metastable phases. Recently, Galanis, Štefankovič and Vigoda [9] analyzed the mixing time of the analogous mean-field Swendsen–Wang dynamics, finding it to mix in polynomial time both at high temperature and (unlike Glauber dynamics) at low temperatures, for all $\beta$ outside a critical window $(\beta_{s},\beta_{S})$ around $\beta_{c}$, where the critical point $\beta_{S}$ (mirroring the spinodal point $\beta_{s}$) marks the disappearance of metastability of the disordered phase. (It was shown in that work that $t_{\textsc{mix}}=O(\log n)$ for $\beta\notin[\beta_{s},\beta_{S})$, whereas $t_{\textsc{mix}}\asymp n^{1/3}$ at $\beta=\beta_{s}$.)

For $\beta\in(\beta_{s},\beta_{S})$ , Swendsen–Wang was shown in [9] to slow down to $t_{\textsc{mix}}\gtrsim\exp({c\sqrt{n}})$ (extending the lower bound at $\beta=\beta_{c}$ due to Gore and Jerrum). Analogously, for the related Glauber dynamics for the mean-field FK model (see §2 for precise definitions) with $q>2$ , Blanca and Sinclair [1] proved that $t_{\textsc{mix}}\geq\exp({c\sqrt{n}})$ whenever $\lambda=np$ is in the critical window $(\lambda_{s},\lambda_{S})$ . The fact that three significant papers, over a period of almost twenty years, all presented a lower bound of the form $\exp(c\sqrt{n})$ , left open the possibility that this is the true order of the mixing time inside the critical window.

Our main result is that the mixing time of the mean-field Swendsen–Wang dynamics is truly exponential in $n$ at criticality, similar to the single-site Glauber dynamics.

Theorem 1.

Let $q\geq 3$ be a fixed integer, and consider the Swendsen–Wang dynamics for the $q$ -state mean-field Potts model on $n$ vertices at inverse temperature $\beta\in(\beta_{s},\beta_{S})$ . There exists some $c(\beta,q)>0$ such that, for all $n$ large enough, $t_{\textsc{mix}}\geq\exp(cn)$ .

The case of non-integer $q$ (the mean-field FK model) is more delicate: the analogue of Swendsen–Wang in this setting is Chayes–Machta dynamics [5], which we analyze via a recursive application of the fundamental lemma of Bollobás, Grimmett and Janson [2]. As in [1], comparison results of [20] extend the result to heat-bath Glauber dynamics.

Theorem 2.

Fix $q>2$ , and consider Glauber dynamics for the mean-field FK model on $n$ vertices with parameters $(p=\frac{\lambda}{n},q)$ where $\lambda\in(\lambda_{s},\lambda_{S})$ . There exists $c(p,q)>0$ such that $t_{\textsc{mix}}\geq\exp(cn)$ for large enough $n$ . The same holds for Chayes–Machta dynamics.

To outline our approach for proving Theorems 1–2, we first sketch the argument of [13], thereafter adapted to $\beta\in[\beta_{s},\beta_{S})$ in [9] and to the FK model in [1]. Starting from a Potts configuration where each color class has $\frac{n}{q}\pm\varepsilon n$ vertices, since $\beta<\beta_{S}$ , for small enough $\varepsilon$ , this corresponds to a subcritical Erdős-Rényi random graph $\mathcal{G}(n,p)$ in the first stage of the Swendsen–Wang dynamics. The exponential tail of component sizes in this regime shows that, for a sequence $k=k(n)$ , with probability at least $1-n\exp(-ck)$ , no cluster in the edge configuration we obtain is larger than $k$ ; on this event, the component sizes ${\mathscr{L}}_{i}$ satisfy $\sum_{i}{\mathscr{L}}_{i}^{2}\leq k\sum{\mathscr{L}}_{i}=nk$ , thus by Hoeffding’s inequality, with probability $1-O(\exp[-\varepsilon^{2}n/(2k)])$ , every new color class will have $n/q\pm\varepsilon n$ vertices, and in particular no dominant color class would emerge. In this argument, choosing $k\asymp\sqrt{n}$ balances the two probability estimates to $1-\exp(-c\sqrt{n})$ . However, at $\beta\geq\beta_{c}$ , the Potts model does admit a dominant color class with positive (uniformly bounded away from 0) probability, thus the mixing time is at least $\exp(c\sqrt{n})$ .
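The balancing of the two estimates in the sketch above can be spelled out as a short heuristic computation. It ignores the polynomial prefactor $n$ in $n\exp(-ck)$ and treats $c$ and $\varepsilon$ as fixed constants; these simplifications are assumptions of this illustration rather than part of the original argument.

```latex
% Failure probabilities: n e^{-ck} (a cluster larger than k appears)
% and e^{-\varepsilon^2 n/(2k)} (Hoeffding bound in the coloring stage).
ck = \frac{\varepsilon^{2} n}{2k}
\quad\Longleftrightarrow\quad
k = \varepsilon\sqrt{\frac{n}{2c}},
\qquad\text{and then}\qquad
ck = \frac{\varepsilon^{2} n}{2k} = \varepsilon\sqrt{\frac{c\,n}{2}} \asymp \sqrt{n}.
```

Any other choice of $k$ makes one of the two exponents smaller than order $\sqrt{n}$, which is why tracking only the largest cluster cannot give more than $\exp(c\sqrt{n})$.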

In order to improve this lower bound into $\exp(cn)$ per Theorem 1, instead of looking at the size of the largest component after the $\mathcal{G}(n,p)$ stage of the dynamics, we consider $S_{M}$ , the set of vertices in connected components of size larger than $M$ . We show that, whenever the $\mathcal{G}(n,p)$ stage is subcritical and $M$ is sufficiently large, the probability that $|S_{M}|>\rho n$ is at most $\exp(-c\rho n)$ . Moreover, given $|S_{M}|\leq\rho n$ , Hoeffding’s inequality implies that, following the second stage of the dynamics, all the new color classes will have $n/q\pm\varepsilon n$ vertices except with probability $\exp[-2(\frac{\varepsilon-\rho}{M})^{2}n]$ , yielding $t_{\textsc{mix}}\geq\exp(cn)$ . The proof of Theorem 2 follows a similar path, yet involves additional equilibrium estimates on the conditional probabilities under $\pi_{n,\lambda,q}$ , as the Chayes–Machta dynamics resamples a strict subset of the configuration in each step.

2. Preliminaries

Throughout this paper, we use the notation $f\lesssim g$ for two sequences $f(n),g(n)$ to denote $f=O(g)$ , and let $f\asymp g$ denote $f\lesssim g\lesssim f$ . We re-parametrize the FK and Potts models by $\lambda$ instead of $p$ and $\beta$ via the relations $p=\lambda/n$ and $\lambda/n=1-e^{-\beta/n}$ , to allow us to treat the FK and Potts models in a unified manner. We will consider these models on the complete graph on $n$ vertices, $G=(V,E)=(\{1,...,n\},\{ij\}_{1\leq i<j\leq n})$ .

Denote by $\mu_{n,\lambda,q}$ the Potts measure (with $\beta$ such that $\lambda/n=1-e^{-\beta/n}$) and by $\pi_{n,\lambda,q}$ the corresponding FK measure with $p=\lambda/n$ on the complete graph on $n$ vertices. The FK model with $q=1$ corresponds precisely to the Erdős–Rényi random graph $\mathcal{G}(n,p)$, and we use the shortened notation $\pi_{n,\lambda}=\pi_{n,\lambda,1}$. We occasionally use $\mathcal{G}(n,p,q)$ to denote the mean-field FK model given by $\pi_{n,\lambda,q}$.

For any FK configuration $\omega\in\{0,1\}^{E}$, enumerate the clusters of $\omega$ in decreasing size $\mathcal{C}_{1},\mathcal{C}_{2},...$ and let ${\mathscr{L}}_{i}=|\mathcal{C}_{i}|$. For a vertex $x$, let $\mathcal{C}_{x}$ denote the cluster to which $x$ belongs.

For all $q\leq 2$ define the critical points $\lambda_{s}=\lambda_{c}=\lambda_{S}=q$ and for $q>2$ , define

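$$\lambda_{s}=\min_{z\geq 0}\Bigl\{z+\frac{qz}{e^{z}-1}\Bigr\},\qquad \lambda_{c}=\frac{2(q-1)\log(q-1)}{q-2},\qquad \lambda_{S}=q,$$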

so that for $q>2$ , we have $\lambda_{s}<\lambda_{c}<\lambda_{S}$ (see e.g., [9, 1]). The critical points $\lambda_{s},\lambda_{S}$ correspond to the parameters of emergence and disappearance of metastability, where at $\lambda=\lambda_{c}$ , the ordered and disordered metastable states have the same free energy. These two critical points can also have the following alternative interpreation [9]: $\lambda_{s}$ corresponds to the first uniqueness/non-uniqueness threshold of the $\Delta$ -regular infinite tree, and $\lambda_{S}$ should correspond to a second uniqueness/non-uniqueness threshold of the $\Delta$ -regular tree with periodic boundary conditions.
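As a numerical illustration of the three thresholds, the sketch below evaluates them from the formulas $\lambda_{s}=\min_{z\geq 0}\{z+\frac{qz}{e^{z}-1}\}$, $\lambda_{c}=\frac{2(q-1)\log(q-1)}{q-2}$ and $\lambda_{S}=q$. The function names, the use of ternary search, and the cap `z_max` are choices made for this example and are not taken from the paper.

```python
import math


def lambda_s(q: float, z_max: float = 50.0, iters: int = 200) -> float:
    """Minimum over z >= 0 of h(z) = z + q*z/(e^z - 1), by ternary search.

    h is convex on [0, inf) (a linear term plus q times the convex function
    z/(e^z - 1)), so ternary search on [0, z_max] is valid. The cap z_max
    is an arbitrary choice for this sketch.
    """
    def h(z: float) -> float:
        # z/(e^z - 1) -> 1 as z -> 0, so h extends continuously with h(0) = q.
        return q if z == 0.0 else z + q * z / math.expm1(z)

    lo, hi = 0.0, z_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) < h(m2):
            hi = m2
        else:
            lo = m1
    return h(0.5 * (lo + hi))


def lambda_c(q: float) -> float:
    """Critical point 2(q-1)log(q-1)/(q-2); only meaningful for q > 2."""
    return 2.0 * (q - 1.0) * math.log(q - 1.0) / (q - 2.0)


def lambda_S(q: float) -> float:
    """Upper end of the critical window."""
    return float(q)


if __name__ == "__main__":
    for q in (3, 4, 10):
        print(q, lambda_s(q), lambda_c(q), lambda_S(q))
```

For $q=3$ the middle threshold is $4\log 2$, and the computed values respect the ordering $\lambda_{s}<\lambda_{c}<\lambda_{S}$.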

The FK and Potts phase transitions

The following gives a description of the static phase transition undergone by the mean-field FK and Potts models, respectively. Let $\Theta_{r}=\Theta_{r}(\lambda,q)$ be the largest solution of $e^{-\lambda x}=1-\frac{qx}{1+(q-1)x}$, so that $\Theta_{r}=\frac{q-2}{q-1}$ when $\lambda=\lambda_{c}$.

Proposition 2.1 ([2, Thms. 2.1–2.2],[18, Thm. 19]).

Consider the $n$ -vertex mean-field FK model with parameters $(p,q)$ with $p=\lambda/n$ ; if $\lambda<\lambda_{c}(q)$ , for every $\varepsilon>0$ , we have $\lim_{n\to\infty}\pi_{n,\lambda,q}({\mathscr{L}}_{1}\leq\varepsilon n)=1$ whereas if $\lambda>\lambda_{c}(q)$ , for every $\varepsilon>0$ , we have $\lim_{n\to\infty}\pi_{n,\lambda,q}({\mathscr{L}}_{1}\geq(\Theta_{r}-\varepsilon)n)=1$ . If $\lambda=\lambda_{c}(q)$ , there exists $\gamma(q)\in(0,1)$ so that for all $\varepsilon>0$ , $\lim_{n\to\infty}\pi_{n,\lambda,q}({\mathscr{L}}_{1}\leq\varepsilon n)\geq\gamma$ and $\lim_{n\to\infty}\pi_{n,\lambda,q}({\mathscr{L}}_{1}\geq(\Theta_{r}-\varepsilon)n)\geq 1-\gamma$ .

Corollary 2.2.

Consider the mean-field Potts model parametrized by $\lambda=n(1-e^{-\beta/n})$ and $q$ . If $\lambda<\lambda_{c}(q)$ , for any $\varepsilon>0$ ,

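$$\lim_{n\to\infty}\mu_{n,\lambda,q}\biggl(\sigma:\max_{r=1,\ldots,q}\Bigl|\tfrac{1}{n}\sum_{i\leq n}\mathbf{1}\{\sigma_{i}=r\}-\tfrac{1}{q}\Bigr|<\varepsilon\biggr)=1,$$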

and if $\lambda>\lambda_{c}(q)$ , then there exists $a(\lambda,q)>q^{-1}$ such that for sufficiently small $\varepsilon>0$ ,

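$$\lim_{n\to\infty}\mu_{n,\lambda,q}\biggl(\sigma:\max_{r=2,\ldots,q}\Bigl\{\Bigl|\tfrac{1}{n}\sum_{i\leq n}\mathbf{1}\{\sigma_{i}=1\}-a\Bigr|,\Bigl|\tfrac{1}{n}\sum_{i\leq n}\mathbf{1}\{\sigma_{i}=r\}-\tfrac{1-a}{q-1}\Bigr|\Bigr\}<\varepsilon\biggr)=\frac{1}{q}.$$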

If $q>2$ and $\lambda=\lambda_{c}(q)$ , there exists $\gamma(q)\in(0,1)$ so that for all sufficiently small $\varepsilon>0$ ,

[TABLE]

Cluster dynamics

Swendsen–Wang dynamics for the $q$ -state Potts model on $G=(V,E)$ with parameter $\beta$ such that $p=1-e^{-\beta/n}$ is the following discrete-time reversible Markov chain. From a Potts configuration $\sigma$ on $G$ , generate a new state $\sigma^{\prime}$ as follows.

(1) Introduce auxiliary edge variables, and for $e=xy\in E$ set $\omega(e)=0$ if $\sigma_{x}\neq\sigma_{y}$; then, on each of the $q$ sets of vertices of $\sigma$ of the same color, $V_{1},...,V_{q}$, independently sample $\omega\mathord{\upharpoonright}_{\{xy:x,y\in V_{i}\}}\sim\mathcal{G}(|V_{i}|,p)$.

(2) For every connected component of the resulting $\omega$, reassign the cluster, collectively, an i.i.d. color in $1,...,q$, to obtain the new configuration $\sigma^{\prime}$.
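The two stages above can be sketched in a few lines of Python for the complete graph with $p=\lambda/n$. This is a minimal illustration, not the authors' code: the function name `swendsen_wang_step`, the union-find helper, and the representation of $\sigma$ as a list of colors in $1,...,q$ are all assumptions of this example, and the quadratic loop over pairs is chosen for clarity rather than speed.

```python
import random


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def swendsen_wang_step(sigma, lam, q, rng):
    """One Swendsen-Wang update on the complete graph with p = lam / n."""
    n = len(sigma)
    p = lam / n
    parent = list(range(n))

    # Stage (1): inside each color class V_i, sample G(|V_i|, p);
    # edges between different color classes stay closed.
    classes = {}
    for x, color in enumerate(sigma):
        classes.setdefault(color, []).append(x)
    for verts in classes.values():
        for a in range(len(verts)):
            for b in range(a + 1, len(verts)):
                if rng.random() < p:
                    ra, rb = find(parent, verts[a]), find(parent, verts[b])
                    if ra != rb:
                        parent[ra] = rb

    # Stage (2): give every cluster one uniform color in {1, ..., q}.
    color_of_root = {}
    new_sigma = []
    for x in range(n):
        r = find(parent, x)
        if r not in color_of_root:
            color_of_root[r] = rng.randrange(1, q + 1)
        new_sigma.append(color_of_root[r])
    return new_sigma


if __name__ == "__main__":
    rng = random.Random(0)
    sigma = [1] * 30 + [2] * 30 + [3] * 30
    print(swendsen_wang_step(sigma, 2.8, 3, rng))
```

Passing an explicit `random.Random` instance keeps runs reproducible, which is convenient when comparing trajectories started from balanced and from dominated configurations.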

Chayes–Machta dynamics for the FK model on $G=(V,E)$ with parameters $(p,q)$ , for $q\geq 1$ and $p=\lambda/n$ , is the following discrete-time reversible Markov chain: From an FK configuration $\omega\in\Omega_{{\textsc{rc}}}$ on $G$ , generate a new state $\omega^{\prime}\in\Omega_{{\textsc{rc}}}$ as follows.

(1) Assign each cluster $C$ of $\omega$ an auxiliary i.i.d. variable $X_{C}\sim\mathrm{Bernoulli}(1/q)$.

(2) Resample every $e=xy$ such that $x$ and $y$ belong to active clusters (those with $X_{C}=1$) via i.i.d. random variables $X_{e}\sim\mathrm{Bernoulli}(\lambda/n)$, yielding a new configuration $\omega^{\prime}$.
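A minimal sketch of one Chayes–Machta update, following the two steps above, is given below. The representation of $\omega$ as a set of pairs `(x, y)` with `x < y`, and the names `cluster_roots` and `chayes_machta_step`, are assumptions made for this illustration.

```python
import random


def cluster_roots(n, edges):
    """Return a cluster representative for each vertex (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in edges:
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[rx] = ry
    return [find(x) for x in range(n)]


def chayes_machta_step(n, edges, lam, q, rng):
    """One Chayes-Machta update; q may be any real number >= 1."""
    p = lam / n
    roots = cluster_roots(n, edges)

    # Step (1): activate each cluster independently with probability 1/q.
    is_active = {r: rng.random() < 1.0 / q for r in sorted(set(roots))}
    active = [x for x in range(n) if is_active[roots[x]]]
    active_set = set(active)

    # Edges with an inactive endpoint are left untouched.
    new_edges = {(x, y) for (x, y) in edges
                 if not (x in active_set and y in active_set)}

    # Step (2): resample every pair of active vertices as Bernoulli(p).
    for i in range(len(active)):
        for j in range(i + 1, len(active)):
            if rng.random() < p:
                new_edges.add((active[i], active[j]))
    return new_edges


if __name__ == "__main__":
    print(chayes_machta_step(6, {(0, 1), (2, 3)}, 1.5, 3.0, random.Random(2)))
```

With $q=1$ every cluster is active, so a single step resamples the whole graph as $\mathcal{G}(n,p)$, consistent with the $q=1$ FK model being the Erdős–Rényi random graph.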

Variants of Chayes–Machta dynamics with $1\leq k\leq\lfloor q\rfloor$ “active colors” have also been studied, with numerical evidence for $k=\lfloor q\rfloor$ being the most efficient choice; see [10].

Glauber dynamics for the FK model

Swendsen–Wang dynamics is closely related to the FK model; much of the analysis of Swendsen–Wang dynamics on general graphs has been via the Glauber dynamics for the corresponding FK model. Discrete-time Glauber dynamics [12] for the FK model on $G=(V,E)$ with $p=\lambda/n$ is as follows: select an edge $e=xy$ in $E$ uniformly at random and update $\omega(e)$ according to $\pi_{n,\lambda,q}(\cdot\mathord{\upharpoonright}_{\{e\}}\mid\omega\mathord{\upharpoonright}_{G-\{e\}})$ .
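The single-edge update can be sketched as follows. The conditional law used here is the standard one for the FK measure and is not spelled out in the text above: given the rest of the configuration, the edge $e=xy$ is open with probability $p$ if $x$ and $y$ are already connected without $e$, and with probability $p/(p+q(1-p))$ otherwise (opening $e$ then merges two clusters and costs a factor $q$). The function names and the edge-set representation are assumptions of this example.

```python
import random


def connected(n, edges, s, t):
    """Depth-first search: is t reachable from s using the given edges?"""
    adj = [[] for _ in range(n)]
    for x, y in edges:
        adj[x].append(y)
        adj[y].append(x)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def fk_glauber_step(n, edges, lam, q, rng):
    """One heat-bath update of a uniformly chosen edge of the complete graph."""
    p = lam / n
    x, y = sorted(rng.sample(range(n), 2))  # uniform unordered pair
    rest = set(edges) - {(x, y)}
    if connected(n, rest, x, y):
        prob_open = p                        # cluster count unchanged by e
    else:
        prob_open = p / (p + q * (1.0 - p))  # opening e merges two clusters
    if rng.random() < prob_open:
        rest.add((x, y))
    return rest


if __name__ == "__main__":
    print(fk_glauber_step(5, {(0, 1), (1, 2)}, 2.0, 3.0, random.Random(0)))
```

Each step touches only one of the $\binom{n}{2}$ edges, in contrast to the global updates of the two cluster dynamics described above.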

Size of largest component and drift functions

For $\lambda>1$ , let $\theta_{\lambda}$ be the unique positive root of $e^{-\lambda x}=1-x$ . Recall the following tail estimates for ${\mathscr{L}}_{1}$ in $\mathcal{G}(n,p)$ .

Fact 2.3 (e.g., cf. [15, p. 109]).

Consider $\mathcal{G}(n,p)$ with $pn=\lambda<1$ . Then for any $x$ ,

$$\pi_{n,\lambda,1}(|\mathcal{C}_{x}|\geq k)\leq e^{-\frac{(1-\lambda)^{2}k}{2}}.$$

In particular, $\pi_{n,\lambda,1}({\mathscr{L}}_{1}\geq k)\leq n\exp\big{(}-(1-\lambda)^{2}k/2\big{)}$ .

Proposition 2.4 ([17, Lemma 5.4]).

Consider $\mathcal{G}(n,p)$ with $np=\lambda>1$ . There exists $c(\lambda)>0$ such that for every $\varepsilon>0$ ,

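$$\pi_{n,\lambda,1}(|{\mathscr{L}}_{1}-\theta_{\lambda}n|\geq\varepsilon n)\lesssim e^{-c\varepsilon^{2}n}.$$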
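The giant-component density $\theta_{\lambda}$, the positive root of $e^{-\lambda x}=1-x$, has no closed form but is easy to compute. The sketch below uses bisection; the function name `theta` and the convention of returning $0$ for $\lambda\leq 1$ are assumptions of this example.

```python
import math


def theta(lam: float, iters: int = 200) -> float:
    """Positive root of exp(-lam * x) = 1 - x for lam > 1, by bisection.

    g(x) = 1 - x - exp(-lam * x) is concave with g(0) = 0, g'(0) = lam - 1 > 0
    and g(1) < 0, so g > 0 exactly on (0, theta) and bisection on [0, 1] works.
    """
    if lam <= 1.0:
        return 0.0  # convention for this sketch: no giant component
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.0 - mid - math.exp(-lam * mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for lam in (1.1, 1.5, 2.0, 3.0):
        print(lam, theta(lam))
```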

For the proof of Theorem 1, following [13, 9] define the drift function for the average size of the largest color class of the Swendsen–Wang dynamics

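$$F_{\lambda}(z)=\begin{cases}\theta_{\lambda z}z+\tfrac{1}{q}(1-\theta_{\lambda z}z)&\text{for }z>1/\lambda,\\ \tfrac{1}{q}&\text{for }z\leq 1/\lambda.\end{cases}$$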

For some values of $\lambda$, the function $F_{\lambda}(z)$ has a second fixed point besides $\frac{1}{q}$; we denote it by $a_{\lambda}>1/q$, and it solves

$$\log\frac{(q-1)a}{1-a}=\lambda\Bigl(a-\frac{1-a}{q-1}\Bigr).$$

Proposition 2.5 ([9, Lemma 5]).

If $\lambda>\lambda_{s}$, the fixed point $a_{\lambda}$ satisfies $\lambda a_{\lambda}>1$, and if $b_{\lambda}=\frac{1-a_{\lambda}}{q-1}$, then $\lambda b_{\lambda}<1$. Moreover, if $q>2$ and $\lambda>\lambda_{s}$, then $a_{\lambda}$ is a Jacobian attractive fixed point of $F_{\lambda}(z)$, so that $|F_{\lambda}^{\prime}(a_{\lambda})|<1$.
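The ordered fixed point can be located numerically by iterating the drift map from $z=1$. The sketch below takes $F_{\lambda}(z)=\theta_{\lambda z}z+\tfrac{1}{q}(1-\theta_{\lambda z}z)$ for $z>1/\lambda$ and $F_{\lambda}(z)=\tfrac{1}{q}$ otherwise, which is the form consistent with the fixed-point equation $\log\frac{(q-1)a}{1-a}=\lambda(a-\frac{1-a}{q-1})$. The function names and the iteration count are assumptions of this example.

```python
import math


def theta(lam: float, iters: int = 200) -> float:
    """Positive root of exp(-lam * x) = 1 - x (0 if lam <= 1), by bisection."""
    if lam <= 1.0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.0 - mid - math.exp(-lam * mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def F(z: float, lam: float, q: int) -> float:
    """Drift map for the density of the largest color class."""
    if z <= 1.0 / lam:
        return 1.0 / q
    giant = theta(lam * z) * z          # giant cluster inside the big class
    return giant + (1.0 - giant) / q    # giant keeps one color, rest splits evenly


def ordered_fixed_point(lam: float, q: int, iters: int = 2000) -> float:
    """Iterate F from z = 1; the iterates decrease to the largest fixed point."""
    z = 1.0
    for _ in range(iters):
        z = F(z, lam, q)
    return z


if __name__ == "__main__":
    q = 3
    lam = 4.0 * math.log(2.0)           # lambda_c for q = 3
    a = ordered_fixed_point(lam, q)
    b = (1.0 - a) / (q - 1)
    print(a, lam * a, lam * b)
```

At $q=3$ and $\lambda=\lambda_{c}=4\log 2$, the value $a=2/3$ satisfies the fixed-point equation exactly (both sides equal $\log 4$), and it gives $\lambda a_{\lambda}>1>\lambda b_{\lambda}$ as in Proposition 2.5.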

Similarly to the above, we can define the function $f$ given by

$$f(\theta)=\tfrac{1+(q-1)\theta}{q}\,\theta_{\lambda(1+(q-1)\theta)/q},$$

which governs the mean drift of the size of the giant component in Chayes–Machta dynamics. We can also define $\Theta_{r}$ to be the largest solution to $e^{-\lambda x}=1-\frac{qx}{1+(q-1)x}$ . Following [1], let $\Theta_{\mathrm{min}}(\lambda,q)=\max\{0,(q-\lambda)/(\lambda(q-1))\}$ , observe that if $\lambda<\lambda_{S}$ , $\lambda(\Theta_{\mathrm{min}}+q^{-1}(1-\Theta_{\mathrm{min}}))=1$ , and define the drift function $g(\theta)=f(\theta)-\theta$ .

Proposition 2.6 ([1, Lemma 2.14]).

When $q>2$ and $\lambda>\lambda_{s}$ , the drift function $g$ has two roots, $\Theta^{*}<\Theta_{r}$ in $(\Theta_{\mathrm{min}},1]$ ; moreover, $g$ is strictly positive on $(\Theta^{*},\Theta_{r})$ .

Mixing time and spectral gap

In this section, we introduce the quantities of interest regarding the time for the Swendsen–Wang and Glauber dynamics to reach equilibrium. Consider a Markov chain with finite state space $\Omega$ and transition matrix $P$ reversible with respect to $\pi$ . For two measures $\nu,\pi$ , define their total variation distance by

$$\|\nu-\pi\|_{\textsc{tv}}=\sup_{A\subset\Omega}|\nu(A)-\pi(A)|=\frac{1}{2}\|\nu-\pi\|_{\ell^{1}}.$$

Then the mixing time of $P$ is defined as

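$$t_{\textsc{mix}}=\inf\Bigl\{t:\max_{X_{0}\in\Omega}\|P^{t}(X_{0},\cdot)-\pi\|_{\textsc{tv}}<1/(2e)\Bigr\}.$$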

A related quantity that is sometimes easier to work with is the spectral gap of $P$: since $P$ is reversible with respect to $\pi$, we can enumerate its spectrum from largest to smallest as $1=\lambda_{1}>\lambda_{2}>...$; then the spectral gap of $P$ is defined as $\text{\tt{gap}}=1-\lambda_{2}$. The following is a standard comparison between the inverse spectral gap and the mixing time of a Markov chain with transition matrix $P$ (see e.g., [16]):

$$\text{\tt{gap}}^{-1}-1\leq t_{\textsc{mix}}\leq\log(2e/\pi_{\mathrm{min}})\,\text{\tt{gap}}^{-1}.$$
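To make these definitions concrete, the following sketch computes the total variation mixing time (with threshold $1/(2e)$) of a two-state chain by brute force and compares it with the two spectral bounds $\text{gap}^{-1}-1\leq t_{\textsc{mix}}\leq\log(2e/\pi_{\min})\,\text{gap}^{-1}$. The two-state chain with flip probabilities `a` and `b`, for which the second eigenvalue is $1-a-b$, is a toy example chosen for illustration and does not appear in the paper.

```python
import math


def tv(nu, pi):
    """Total variation distance, computed as half the l1 distance."""
    return 0.5 * sum(abs(x - y) for x, y in zip(nu, pi))


def step(dist, P):
    """One step of the chain: row vector dist times matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def mixing_time(P, pi, t_max=10_000):
    """Smallest t with max over starting states of TV(P^t(x, .), pi) < 1/(2e)."""
    n = len(P)
    dists = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for t in range(t_max + 1):
        if max(tv(d, pi) for d in dists) < 1.0 / (2.0 * math.e):
            return t
        dists = [step(d, P) for d in dists]
    raise ValueError("chain did not mix within t_max steps")


# Toy two-state chain: leave state 0 w.p. a, leave state 1 w.p. b.
a = b = 0.1
P = [[1 - a, a], [b, 1 - b]]
pi = [b / (a + b), a / (a + b)]   # stationary distribution
gap = a + b                        # 1 - lambda_2, since lambda_2 = 1 - a - b
t_mix = mixing_time(P, pi)
lower = 1.0 / gap - 1.0
upper = math.log(2.0 * math.e / min(pi)) / gap
```

For this chain the worst-case distance after $t$ steps is $\tfrac{1}{2}(0.8)^{t}$, so the computed mixing time is 5, which lies between the lower bound 4 and the upper bound of about 11.9.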

Spectral gap comparisons

The following comparison inequalities between the aforementioned Markov chains are due to Ullrich.

Proposition 2.7 ([20]).

Let $q\geq 2$ be an integer. Let $\text{\tt{gap}}_{{\textsc{rc}}}$ be the spectral gap of Glauber dynamics for the FK model on a graph $G=(V,E)$ and let $\text{\tt{gap}}_{\textsc{sw}}$ be the spectral gap of Swendsen–Wang dynamics. Then

[TABLE]

The proof of (2.2) further extends to all real $q>1$ , whence

[TABLE]

as was observed (and further generalized) by Blanca and Sinclair [1, §5], where $\text{\tt{gap}}_{\textsc{cm}}$ is the spectral gap of Chayes–Machta dynamics.

3. Slow mixing of Swendsen–Wang dynamics

Towards the proof of Theorem 1, we first establish some preliminary estimates. For $\omega\in\Omega_{{\textsc{rc}}}$ , we will frequently be interested in bounding the following quantity:

$$S_{M}:=S_{M}(\omega)=\{x\in V:|\mathcal{C}_{x}|>M\}.$$

The bottlenecks in the proofs of Theorems 1–2 both rely on the following estimate.

Lemma 3.1.

Consider $\omega\sim\mathcal{G}(n,p)$ with $np=\lambda<1$ fixed. There exists $c(\lambda)>0$ such that for every $\rho>0$ , there exists $M_{0}(\lambda,\rho)$ such that for every $M\geq M_{0}$ ,

$$\pi_{n,\lambda}(|S_{M}|\geq\rho n)\lesssim e^{-c\rho n}.$$

Proof.

Recall that by Fact 2.3, there exists $c_{1}(\lambda)>0$ such that $\pi_{n,\lambda}(|\mathcal{C}_{x}|\geq k)\leq e^{-c_{1}k}$ for all $k$. Moreover, conditioned on other clusters, the remaining graph is distributed as $\mathcal{G}(m,p)$ for some $m\leq n$, so that for any $\ell$ vertices $y_{1},...,y_{\ell}$,

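$$\pi_{n,\lambda}\Bigl(|\mathcal{C}_{x}|\geq k\;\Big|\;\mathcal{C}_{y_{1}},\ldots,\mathcal{C}_{y_{\ell}},\;\mathcal{C}_{x}\cap\bigl(\textstyle\bigcup_{i=1}^{\ell}\mathcal{C}_{y_{i}}\bigr)=\emptyset\Bigr)\leq e^{-c_{1}k}.$$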

Let $Y_{M}$ be the number of clusters with at least $M$ vertices; then $Y_{M}\preceq\operatorname{Bin}(n,e^{-c_{1}M})$, so that by the Azuma–Hoeffding inequality,

$$\pi_{n,\lambda}(Y_{M}\geq ne^{-c_{1}M}+t)\leq e^{-t^{2}/(2n)}.$$

Now let

$$K=(e^{-c_{1}M}+M^{-1})n;$$

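plugging into the Azuma–Hoeffding bound with $t=n/M$, and noting that if $Y_{M}<K$ then $S_{M}$ is contained in the $K$ largest clusters, we obtain

$$\pi_{n,\lambda}(|S_{M}|\geq\rho n)\leq\pi_{n,\lambda}(Y_{M}\geq K)+\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}{\mathscr{L}}_{i}\geq\rho n\biggr)\leq e^{-n/(2M^{2})}+\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}{\mathscr{L}}_{i}\geq\rho n\biggr).\qquad(3.1)$$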

In order to bound the right-hand side above, fix vertices $x_{1},...,x_{K}$; the sizes $|\mathcal{C}_{x_{1}}|,...,|\mathcal{C}_{x_{K}}|$ are jointly dominated by $K$ i.i.d. random variables $Z_{1},...,Z_{K}$, so that their sum is dominated by $\sum_{i=1}^{K}Z_{i}$, where, for some $a(\lambda),b(\lambda),\nu(\lambda)>0$ (independent of $M$ and $n$), $Z_{1}$ is sub-exponential with parameters $(\nu,b)$ and has mean $a$. By the definition of $K$, for any sufficiently large $M$ (depending on $\rho$), $K\mathbb{E}[Z_{i}]=Ka\leq\rho n/2$. By a union bound and symmetry, we have

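$$\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}{\mathscr{L}}_{i}\geq\rho n\biggr)\leq\Bigl(\frac{en}{K}\Bigr)^{K}\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}Z_{i}\geq K\mathbb{E}[Z_{i}]+\rho n/2\biggr).$$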

Moreover, $\sum_{i=1}^{K}Z_{i}$ is also sub-exponential with parameters $(K\nu,b)$ . Therefore, there exists $c_{2}(\lambda)>0$ so that for all $\rho>0$ , there exists $M_{0}(\lambda,\rho)$ such that for all $M\geq M_{0}$ ,

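$$\pi_{n,\lambda}\biggl(\sum_{i=1}^{K}{\mathscr{L}}_{i}\geq\rho n\biggr)\leq\left(\frac{e}{e^{-c_{1}M}+M^{-1}}\right)^{(e^{-c_{1}M}+M^{-1})n}e^{-\frac{\rho n}{4b}}\lesssim e^{-c_{2}\rho n}.$$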

Plugging this bound into (3.1) concludes the proof. ∎
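The quantity controlled by Lemma 3.1 is easy to examine empirically. The sketch below samples a subcritical $\mathcal{G}(n,\lambda/n)$ and counts $|S_{M}|$, the number of vertices lying in clusters of size larger than $M$. The function names, the small value of $n$, and the quadratic-time sampler are assumptions of this illustration; a simulation of this kind can only illustrate the lemma and does not verify the exponential rate.

```python
import random
from collections import Counter


def cluster_size_of_each_vertex(n, lam, rng):
    """Sample G(n, lam/n) and return |C_x| for every vertex x."""
    p = lam / n
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x in range(n):
        for y in range(x + 1, n):
            if rng.random() < p:
                rx, ry = find(x), find(y)
                if rx != ry:
                    parent[rx] = ry
    roots = [find(x) for x in range(n)]
    size = Counter(roots)
    return [size[r] for r in roots]


def s_m_size(sizes, M):
    """|S_M|: number of vertices whose cluster has more than M vertices."""
    return sum(1 for s in sizes if s > M)


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = cluster_size_of_each_vertex(400, 0.5, rng)
    for M in (1, 2, 5, 10, 20):
        print(M, s_m_size(sizes, M) / 400)
```

By definition $|S_{M}|$ is non-increasing in $M$, equals $n$ at $M=0$, and vanishes once $M\geq n$.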

In the coloring stage of the Swendsen–Wang dynamics, the following simple application of a Chernoff-Hoeffding inequality proves useful.

Lemma 3.2.

Consider an FK realization $\omega$ on $n$ vertices and suppose $|S_{M}(\omega)|\leq\varepsilon n$ for some $M>0$ . Independently color each cluster of $\omega$ collectively red with probability $\alpha\in[0,1]$ , and let $R$ be the set of all red vertices. For all $\delta>0$ ,

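$$\mathbb{P}\bigl(\bigl||R|-\alpha n\bigr|\geq(\varepsilon+\delta)n\bigr)\leq 2\exp\Bigl(-\frac{\delta^{2}n}{2M^{2}}\Bigr).$$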

Proof.

We consider $\mathbb{P}(|R|\geq(\alpha+\varepsilon+\delta)n)$ and $\mathbb{P}(|R|\leq(\alpha-\varepsilon-\delta)n)$ separately. To bound the former, it suffices to prove an upper bound on

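$$\mathbb{P}\bigl(|R\cup S_{M}|\geq(\alpha+\varepsilon+\delta)n\bigr)\leq\mathbb{P}\bigl(|R-S_{M}|\geq(\alpha+\delta)n\bigr),$$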

which by Hoeffding’s inequality satisfies

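$$\mathbb{P}\bigl(|R-S_{M}|-\alpha(n-|S_{M}|)\geq\delta n+\alpha|S_{M}|\bigr)\leq\exp\Bigl(-\frac{(\delta n+\alpha|S_{M}|)^{2}}{2(n-|S_{M}|)M^{2}}\Bigr)\leq e^{-\delta^{2}n/(2M^{2})}.$$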

Similarly bounding $\mathbb{P}(|R|\leq(\alpha-\varepsilon-\delta)n)\leq\mathbb{P}(|R-S_{M}|\leq(\alpha-\delta)n)$ by Hoeffding’s inequality and combining the two via a union bound concludes the proof. ∎
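The coloring experiment of Lemma 3.2 can be sketched as follows: each cluster is colored red independently with probability $\alpha$, and the deviation of $|R|$ from $\alpha n$ is compared with the bound $2\exp(-\delta^{2}n/(2M^{2}))$. The function names and the Monte Carlo estimator are assumptions of this example, which takes the list of cluster sizes as given input.

```python
import math
import random


def red_vertex_count(cluster_sizes, alpha, rng):
    """Color each cluster red independently w.p. alpha; return |R|."""
    return sum(size for size in cluster_sizes if rng.random() < alpha)


def lemma_3_2_bound(delta, n, M):
    """Right-hand side 2 * exp(-delta^2 * n / (2 * M^2)) of Lemma 3.2."""
    return 2.0 * math.exp(-(delta ** 2) * n / (2.0 * M ** 2))


def empirical_deviation_rate(cluster_sizes, alpha, eps_plus_delta, trials, rng):
    """Fraction of trials with ||R| - alpha*n| >= (eps + delta) * n."""
    n = sum(cluster_sizes)
    bad = 0
    for _ in range(trials):
        r = red_vertex_count(cluster_sizes, alpha, rng)
        if abs(r - alpha * n) >= eps_plus_delta * n:
            bad += 1
    return bad / trials


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = [2] * 5000            # all clusters of size 2, so S_M is empty for M = 2
    print(empirical_deviation_rate(sizes, 1.0 / 3.0, 0.05, 200, rng))
    print(lemma_3_2_bound(0.05, sum(sizes), 2))
```

The bound depends on the cluster structure only through $M$ and $n$, which is what allows it to be applied uniformly over starting configurations.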

We prove Theorem 1 for $q>2$ separately for $\lambda$ that is below, above and at $\lambda_{c}$ .

3.1. The supercritical regime: proof of Theorem 1 for the case $\lambda\in(\lambda_{c},\lambda_{S})$

To prove Theorem 1 for $\lambda\in(\lambda_{c},\lambda_{S})$ , let $\rho>0$ , and define the set of configurations,

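$$A_{\rho}=\Bigl\{\sigma\in\{1,...,q\}^{n}:\max_{r=1,...,q}\Bigl|\sum_{i=1}^{n}\mathbf{1}\{\sigma_{i}=r\}-\frac{n}{q}\Bigr|<\rho n\Bigr\}.$$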

Now consider the Markov chain $(X_{t})_{t\geq 0}$ and let $v_{t}=(v_{t}^{1},...,v_{t}^{q})$ be the corresponding vector counting the number of sites in each state in $X_{t}$ . We need the following claim.

Claim 3.3.

Consider Swendsen–Wang dynamics with $\lambda=n(1-e^{-\beta/n})$ for $\lambda<\lambda_{S}$; there exist $\rho_{0}(\lambda,q),c(\rho,\lambda,q),C(\lambda,q)>0$ such that for every $\rho<\rho_{0}$,

[TABLE]

Proof.

Consider a fixed $X_{0}\in A_{\rho}$ . In the $\mathcal{G}(n,p)$ step of the Swendsen–Wang dynamics, we consider the color components separately. For each of the $q$ colored components a new edge configuration is sampled according to $\pi_{v_{0}^{i},\lambda}$ where $i=1,...,q$ ; call the edge configuration we obtain $\omega_{1}^{i}$ and note that by definition of Swendsen–Wang dynamics, the clusters of $\{\omega_{1}^{i}\}_{i=1}^{q}$ will all be disconnected. Then since $\|v_{0}-(\frac{n}{q},...,\frac{n}{q})\|_{\infty}<\rho n$ and $\lambda<\lambda_{S}=q$ , if $\rho<1-\lambda/q=:\rho_{0}$ , every colored component is sub-critical in the $\mathcal{G}(n,p)$ step. Thus, for all $i=1,...,q$ , by Lemma 3.1, for some $c(\lambda)>0$ , if $\rho<1-\lambda/q$ , for every $M\geq M_{0}(\lambda,\rho)$ and every $\delta>0$ ,

[TABLE]

Union bounding over the $q$ different such components, we obtain

[TABLE]

In that case, if $\delta=\frac{\rho}{2q}$ and $\omega_{1}$ is the edge configuration induced on the whole graph after the $\mathcal{G}(n,p)$ step of the dynamics, there exists $c(\lambda,q)>0$ so that for $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

We can then split up

[TABLE]

and consider the coloring step of the Swendsen–Wang dynamics. Then we obtain

[TABLE]

By an application of Lemma 3.2 with $\varepsilon=\delta=\frac{\rho}{2}$ and a union bound, the above is, for $\rho<1-\lambda/q$ and $M\geq M_{0}(\lambda,\rho)$ , bounded above by

[TABLE]

Since all the above estimates were uniform in $X_{0}\in A_{\rho}$ , we obtain the desired bound. ∎
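The two stages used in the proof above (the $\mathcal{G}(n,p)$ step inside each color class, followed by the coloring step) can be sketched as one step of Swendsen–Wang dynamics on the complete graph. This is a minimal illustration with $p=\lambda/n$, not the authors' code; `swendsen_wang_step` and `color_counts` are hypothetical names, and the quadratic loop is intended for small $n$.

```python
import random


def swendsen_wang_step(colors, p, q, rng):
    """One Swendsen-Wang step on the complete graph; colors[x] in range(q)."""
    n = len(colors)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # G(n, p) step: keep each monochromatic edge independently w.p. p,
    # so clusters never join vertices of different colors.
    for u in range(n):
        for v in range(u + 1, n):
            if colors[u] == colors[v] and rng.random() < p:
                parent[find(u)] = find(v)

    # Coloring step: every cluster receives an independent uniform color.
    cluster_color = {}
    new_colors = []
    for x in range(n):
        root = find(x)
        if root not in cluster_color:
            cluster_color[root] = rng.randrange(q)
        new_colors.append(cluster_color[root])
    return new_colors


def color_counts(colors, q):
    """The vector (v^1, ..., v^q) of color class sizes."""
    counts = [0] * q
    for c in colors:
        counts[c] += 1
    return counts
```

Iterating `swendsen_wang_step` from a balanced start and tracking `color_counts` gives an empirical view of the trajectory $v_{t}$ considered in Claim 3.3.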

By Corollary 2.2, since $\beta$ is such that $\lambda>\lambda_{c}$ , for every small $\rho>0$ , we have $\mu_{n,\lambda,q}(A_{\rho}^{c})>\frac{1}{2}$ . If $X_{0}$ is such that $v_{0}=(\frac{n}{q},...,\frac{n}{q})$ , clearly $X_{0}\in A_{\rho}$ , and by Claim 3.3 and a union bound, since $\lambda<\lambda_{S}$ , there exists $c(\rho,\lambda,q)>0$ such that for every $\rho<\rho_{0}$ ,

[TABLE]

The definition of total variation mixing time then implies $t_{\textsc{mix}}\gtrsim e^{cn/2}$ as desired. ∎
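For completeness, the last implication can be spelled out. The display of Claim 3.3 is not reproduced here, so the following assumes its conclusion has the form $\max_{X_{0}\in A_{\rho}}\mathbb{P}(X_{1}\notin A_{\rho})\leq Ce^{-cn}$ ; under that assumption, for $X_{0}\in A_{\rho}$ and any $T\geq 1$ ,

```latex
\begin{aligned}
\mathbb{P}_{X_0}(X_T \notin A_\rho)
  &\le \sum_{t=1}^{T} \mathbb{P}_{X_0}\bigl(X_{t-1}\in A_\rho,\ X_t \notin A_\rho\bigr)
   \le T\,C e^{-cn}, \\
\bigl\| \mathbb{P}_{X_0}(X_T \in \cdot) - \mu_{n,\lambda,q} \bigr\|_{\mathrm{TV}}
  &\ge \mu_{n,\lambda,q}(A_\rho^{c}) - \mathbb{P}_{X_0}(X_T \in A_\rho^{c})
   > \tfrac{1}{2} - T\,C e^{-cn}.
\end{aligned}
```

Taking $T=e^{cn/2}$ keeps the right-hand side above $\frac{1}{4}$ for all large $n$, so with the usual threshold $\frac{1}{4}$ in the definition of the mixing time, the chain has not mixed by time $e^{cn/2}$.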

3.2. The subcritical regime: proof of Theorem 1 for $\lambda\in(\lambda_{s},\lambda_{c})$

We first prove the following consequence of Lemma 3.1.

Lemma 3.4.

Consider $\mathcal{G}(n,p)$ with $np=\lambda>1$ . There exist $c(\lambda),c^{\prime}(\lambda)>0$ such that for every $\rho>0$ and $\varepsilon>0$ sufficiently small and for every $M\geq M_{0}(\lambda,\rho)$ , we have

[TABLE]

Proof.

By a union bound, the left-hand side above is at most

[TABLE]

Since $\lambda>1$ , by Fact 2.4, we have that $\pi_{n,\lambda}(\omega:|{\mathscr{L}}_{1}-\theta_{\lambda}n|\geq\varepsilon n)\leq e^{-c\varepsilon^{2}n}$ for some $c(\lambda)>0$ . Now suppose ${\mathscr{L}}_{1}\geq(\theta_{\lambda}-\varepsilon)n$ and note that, conditionally on $\mathcal{C}_{1}$ , since the remaining graph is disconnected from $\mathcal{C}_{1}$ , it is distributed as $\mathcal{G}(n-{\mathscr{L}}_{1},p)$ which, since $n-{\mathscr{L}}_{1}\leq(1-\theta_{\lambda}+\varepsilon)n$ , is subcritical for small $\varepsilon$ . In that case, by Lemma 3.1, given $n-{\mathscr{L}}_{1}\leq(1-\theta_{\lambda}+\varepsilon)n$ , there exists $c(\lambda)>0$ such that for every $\rho>0$ , there exists $M_{0}(\lambda,\rho)>0$ so that for $M\geq M_{0}$ ,

[TABLE]

Combined with the union bound, this implies the desired estimate. ∎
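The subcriticality of the graph left after removing the giant component can be checked numerically. The sketch below assumes the standard characterization of $\theta_{\lambda}$ as the largest root of $1-x=e^{-\lambda x}$ (the definition in §2 is assumed to agree), and computes it by bisection; `giant_fraction` and `residual_mean_degree` are hypothetical names.

```python
import math


def giant_fraction(lam, tol=1e-12):
    """Largest root of 1 - x = exp(-lam * x); equals 0 when lam <= 1."""
    if lam <= 1.0:
        return 0.0

    def g(x):
        return 1.0 - math.exp(-lam * x) - x

    # g(1 - 1/lam) >= 0 because exp(lam - 1) >= lam, and g(1) < 0.
    lo, hi = 1.0 - 1.0 / lam, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def residual_mean_degree(lam):
    """Mean degree lam * (1 - theta) of the graph outside the giant component."""
    return lam * (1.0 - giant_fraction(lam))
```

For every $\lambda>1$ the value `residual_mean_degree(lam)` is strictly below 1, which leaves room for the $\varepsilon$ in $(1-\theta_{\lambda}+\varepsilon)n$.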

The proof of Theorem 1 for $\lambda\in(\lambda_{s},\lambda_{c})$ is a slight modification of the proof for $\lambda\in(\lambda_{c},\lambda_{S})$ . Recall the definitions of $\theta_{\lambda},a_{\lambda}$ and $b_{\lambda}$ from §2. Fix $\lambda>\lambda_{s}$ . In decreasing order, let the number of vertices in each color class of $\sigma$ be $v^{1},...,v^{q}$ and let

[TABLE]

By Corollary 2.2, since $\lambda<\lambda_{c}$ , for sufficiently small $\rho$ , we have $\mu_{n,\lambda,q}(A_{\rho}^{c})>\frac{1}{2}$ . Therefore, it suffices by definition of total variation mixing to prove the following.

Claim 3.5.

Consider Swendsen–Wang dynamics with $\lambda=n(1-e^{-\beta/n})$ for $\lambda>\lambda_{s}$ ; there exist $\rho_{0}(\lambda,q),c(\rho,\lambda,q),C(\lambda,q)>0$ such that for every $\rho<\rho_{0}$ ,

[TABLE]

Proof.

Fix any $X_{0}\in A_{\rho}$ and let $(v_{0}^{1},...,v_{0}^{q})$ be its corresponding color class vector. By definition of $a_{\lambda}$ , for some $\rho^{\prime}(\lambda,q)>0$ there exists $\gamma\in(F^{\prime}(a_{\lambda}),1)$ such that if $|v_{0}^{1}-a_{\lambda}n|\leq\rho^{\prime}n$ , we have $|F(v_{0}^{1}/n)-a_{\lambda}|<\gamma|v_{0}^{1}/n-a_{\lambda}|$ . From now on we take $\rho<\rho^{\prime}$ .

Consider the $\mathcal{G}(n,p)$ step of the Swendsen–Wang dynamics. Since $\lambda>\lambda_{s}$ , $\lambda a_{\lambda}>1$ and $\lambda b_{\lambda}<1$ , so that for $\rho>0$ sufficiently small, the first colored class of $X_{0}$ will be supercritical in the $\mathcal{G}(n,p)$ step and the other $q-1$ will all be subcritical; call the $q$ random graph configurations we obtain in this step $\omega_{1}^{i}$ for $i=1,...,q$ . Now fix such a $\rho>0$ and let $\varepsilon=\frac{(1-\gamma)\rho}{2(q+1)}$ . By Fact 2.4, we obtain that for some $c(\lambda)>0$ ,

[TABLE]

Moreover, by Lemma 3.1, we also have for some $c(\lambda)>0$ , for every $M\geq M_{0}(\lambda,\varepsilon)$ ,

[TABLE]

On the complement of the above event, $\omega_{1}^{1}$ has a single giant component of size $\theta n$ for $\theta n\in(v_{0}^{1}\theta_{\lambda v_{0}^{1}/n}-\varepsilon n,v_{0}^{1}\theta_{\lambda v_{0}^{1}/n}+\varepsilon n)$ , and $|S_{M}-\mathcal{C}_{1}|\leq q\varepsilon n$ . By Lemma 3.2, with probability $1-e^{-c\theta n}$ , the largest color class of $X_{1}$ will be the one containing $\mathcal{C}_{1}(\omega_{1}^{1})$ , so without loss of generality we also assume that this is the case.

At that stage, observe that $\mathbb{E}[v_{1}^{1}\mid\theta]=\theta n+\frac{1}{q}(1-\theta)n$ and $\mathbb{E}[v_{1}^{i}\mid\theta]=\frac{1}{q}(1-\theta)n$ for $i\neq 1$ . Then, first assigning the giant component a color, then using Lemma 3.2, we obtain that for some $c(M,\lambda)>0$ , for every $M\geq M_{0}(\lambda,\varepsilon)$ ,

[TABLE]

By a similar bound on the other $q-1$ coloring steps and the choice $\delta=(1-\gamma)\rho/2$ ,

[TABLE]

By the choice of $\gamma$ and the triangle inequality, this implies

[TABLE]

which by uniformity of the estimates over $X_{0}\in A_{\rho}$ , concludes the proof. ∎
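The conditional expectations used in the coloring step of the proof above are simple to tabulate: the class that receives the giant component gets $\theta n$ plus a $\frac{1}{q}$ share of the remaining $(1-\theta)n$ vertices, and every other class gets only that share. The helper name below is hypothetical.

```python
def expected_class_sizes(theta, n, q):
    """Expected color class sizes given a giant component of size theta * n.

    Index 0 is the class containing the giant component.
    """
    share = (1.0 - theta) * n / q
    return [theta * n + share] + [share] * (q - 1)
```

For example, with $\theta=\frac{1}{2}$ , $n=1000$ and $q=4$ the expected sizes are $625$ for the class containing the giant and $125$ for each of the other three.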

3.3. The critical point: proof of Theorem 1 for $\lambda=\lambda_{c}$

In Corollary 2.2, for every $q>2$ , either $\gamma(q)\geq\frac{1}{2}$ in which case Claim 3.5 concludes the proof, or $1-\gamma(q)\geq\frac{1}{2}$ in which case Claim 3.3 concludes the proof. ∎

4. Slow mixing of Glauber dynamics for the FK model

Since for $q$ noninteger, Chayes–Machta dynamics activates a strict subset of the vertices at a time, we will need to use a modified argument to prove Theorem 2. We instead construct a bottleneck set $S$ and bound its bottleneck ratio. For $A,B\subset\Omega$ , let

[TABLE]

for a chain with stationary distribution $\pi$ and kernel $P$ ; the Cheeger constant of $\Omega$ is

[TABLE]

In order to prove the lower bound of Theorem 2, we prove such a lower bound on the inverse spectral gap of the Chayes–Machta dynamics; then, using Proposition 2.7 and a standard comparison between the spectral gap and mixing time (2.1), we obtain the desired bound for the Glauber dynamics. Before the proof of Theorem 2, we prove some preliminary equilibrium bottleneck estimates for the mean-field FK model.
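To make the bottleneck quantities concrete, the sketch below computes them by brute force for a small finite chain. It assumes the standard definitions $Q(A,B)=\sum_{a\in A,b\in B}\pi(a)P(a,b)$ , $\Phi(S)=Q(S,S^{c})/\pi(S)$ and $\Phi_{\star}=\min_{S:\pi(S)\leq 1/2}\Phi(S)$ , which are presumed to match the displays above; the function names are hypothetical. For a reversible chain the spectral gap is at most $2\Phi_{\star}$ , so a single set $S$ with exponentially small $\Phi(S)$ already forces an exponentially large inverse gap.

```python
from itertools import combinations


def bottleneck_ratio(P, pi, S):
    """Phi(S) = Q(S, S^c) / pi(S) for a transition matrix P (list of rows)."""
    S = set(S)
    flow = sum(pi[x] * P[x][y] for x in S for y in range(len(pi)) if y not in S)
    return flow / sum(pi[x] for x in S)


def cheeger_constant(P, pi):
    """Minimum of Phi(S) over nonempty S with pi(S) <= 1/2 (exponential time)."""
    n = len(pi)
    best = float("inf")
    for k in range(1, n):
        for S in combinations(range(n), k):
            mass = sum(pi[x] for x in S)
            if 0.0 < mass <= 0.5:
                best = min(best, bottleneck_ratio(P, pi, S))
    return best
```

The enumeration over subsets is only feasible for a handful of states; in the proofs below the set $S$ is chosen by hand instead.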

The following lemma, which was fundamental to the understanding of the distribution $\pi_{n,\lambda,q}$ in [2], is very useful for the proof of Theorem 2.

Lemma 4.1 ([2, Lemma 3.1]).

Fix $\alpha\in[0,1]$ ; consider a mean-field FK realization $\omega\sim\pi_{n,\lambda,q}$ . Independently color each cluster of $\omega$ red with probability $\alpha$ and let $R$ be the collection of all red vertices. Conditional on $R$ , the subgraph $\omega\mathord{\upharpoonright}_{R}$ is distributed according to $\pi_{|R|,\lambda,\alpha q}$ and the subgraph $\omega\mathord{\upharpoonright}_{V-R}$ is distributed according to $\pi_{|V-R|,\lambda,(1-\alpha)q}$ .

The following corollary follows from iterating the process of Lemma 4.1 $\lfloor q\rfloor$ times.

Corollary 4.2.

Consider a mean-field FK realization $\omega\sim\pi_{n,\lambda,q}$ . Independently color each cluster of $\omega$ with color $r_{i}$ , for $i=1,...,\lfloor q\rfloor$ , with probability $q^{-1}$ each and with color $r_{0}$ otherwise. Then letting $R_{0},R_{1},...,R_{\lfloor q\rfloor}$ be the sets of vertices colored $r_{0},...,r_{\lfloor q\rfloor}$ , respectively, the subgraph restricted to $R_{i}$ for $i=1,...,\lfloor q\rfloor$ is distributed according to $\pi_{|R_{i}|,\lambda,1}$ . The subgraph restricted to $R_{0}:=V-\bigcup_{i=1}^{\lfloor q\rfloor}R_{i}$ is distributed according to $\pi_{|R_{0}|,\lambda,q-\lfloor q\rfloor}$ . Moreover, the distributions of the $\lceil q\rceil$ color classes are (conditionally on $R_{0},...,R_{\lfloor q\rfloor}$ ) independent.
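The coloring scheme of the corollary can be sketched as follows: each cluster independently receives one of the colors $r_{1},...,r_{\lfloor q\rfloor}$ with probability $q^{-1}$ each, and the leftover color $r_{0}$ with the remaining probability $1-\lfloor q\rfloor/q$ . The function name `color_class_sizes` is hypothetical, and only the class sizes $|R_{0}|,...,|R_{\lfloor q\rfloor}|$ are tracked.

```python
import math
import random


def color_class_sizes(cluster_sizes, q, rng):
    """Return [|R_0|, |R_1|, ..., |R_floor(q)|] for one random coloring."""
    k = math.floor(q)
    # Index 0 is the leftover color r_0; its weight vanishes for integer q.
    weights = [1.0 - k / q] + [1.0 / q] * k
    sizes = [0] * (k + 1)
    for s in cluster_sizes:
        color = rng.choices(range(k + 1), weights=weights)[0]
        sizes[color] += s
    return sizes
```

When `q` is an integer the weight of index 0 is exactly zero, so the first entry of the output is always 0.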

(Note that when $q$ is integer, the set $R_{0}$ is deterministically empty.) Via Lemma 4.1, we prove the following analogues of Lemmas 3.1 and 3.4 when $q<1$ .

Lemma 4.3.

Consider the mean-field FK model on $n$ vertices with parameters $(p,q)$ with $q<1$ and $np=\lambda<\lambda_{c}=q$ . There exists $c(\lambda,q)>0$ such that for all $\rho>0$ sufficiently small, there exists $M_{0}(\lambda,\rho)>0$ such that for all $M\geq M_{0}$ ,

[TABLE]

Proof.

We prove the desired bound using Lemma 4.1. Consider the random graph $\mathcal{G}(m,p)$ with the choice of $m=\lceil q^{-1}n\rceil$ ; applying Lemma 4.1 to $\mathcal{G}(m,p)$ with $\alpha=q$ , by [2, Lemma 9.1], for all $\lambda\neq q$ , we have $\mathbb{P}(|R|=n)\geq\frac{C}{\sqrt{m}}$ , for some $C(\lambda)>0$ . Then, we can write for any event $A\subset\Omega_{{\textsc{rc}}}$ ,

[TABLE]

where $\mathbb{P}_{\mathrm{col},m,\lambda}$ is the distribution over colorings of $\omega$ , averaged over realizations of $\omega\sim\pi_{m,\lambda}$ . Letting $A=A_{\rho,M}=\{|S_{M}|\geq\rho n\}$ , for every $R$ the probability on the right-hand side is bounded above by $\pi_{m,\lambda}(A_{\rho,M})$ which, by Lemma 3.1, satisfies

[TABLE]

for some $c(\lambda)>0$ and for every $\rho>0$ and every $M\geq M_{0}(\lambda,\rho)$ . But by Lemma 4.1,

[TABLE]

which combined with $\mathbb{P}_{\mathrm{col},m,\lambda}(|R|=n)\geq C/\sqrt{m}$ implies

[TABLE]

Lemma 4.4.

Consider the mean-field FK model on $n$ vertices with parameters $(p,q)$ with $q<1$ and $np=\lambda>\lambda_{c}=q$ . There exists $c(\lambda,q)>0$ such that for all $\rho>0$ sufficiently small, there exists $M_{0}(\lambda,\rho)>0$ such that for all $M\geq M_{0}$ ,

[TABLE]

Proof.

As before, consider $\mathcal{G}(m,p)$ with $m=\lceil q^{-1}n\rceil$ ; by Lemma 4.1 with $\alpha=q$ and [2, Lemma 9.1], $\mathbb{P}(|R|=n)\geq C/\sqrt{m}$ . Let $A=A_{\rho,M}=\{|S_{M}-\mathcal{C}_{1}|\geq\rho n\}$ in (4.2). Then observe that $\pi_{m,\lambda}(\omega\mathord{\upharpoonright}_{R}\in A_{\rho,M})\leq\pi_{m,\lambda}(A_{\rho,M})$ and by Lemma 3.4, $\pi_{m,\lambda}(A_{\rho,M})\lesssim e^{-c\rho n}$ . Altogether, plugging the above bounds into (4.2) implies that there exists $c(\lambda)>0$ such that for all $\rho>0$ and all $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

4.1. The supercritical/critical regime, $np=\lambda\in[\lambda_{c},\lambda_{S})$

We first prove the desired mixing time lower bound for $\lambda\in[\lambda_{c},\lambda_{S})$ , using the following bottleneck estimate.

Lemma 4.5.

Consider the mean-field FK model on $n$ vertices with parameters $(p,q)$ where $q>2$ and $np=\lambda<\lambda_{S}$ ; there exists $c(\rho,M,\lambda,q)>0$ such that for all sufficiently small $\rho>0$ , there exists $M_{0}(\lambda,\rho)$ such that for every $M\geq M_{0}$ ,

[TABLE]

Proof.

For $\rho,M>0$ define the sets

[TABLE]

In order to bound $\pi_{n,\lambda,q}(B_{\rho,M}\mid A_{\rho,M})$ , use the coloring scheme described in Corollary 4.2. Let $\mathcal{P}$ be the set of all possible partitions of $\{1,...,n\}$ into $\lceil q\rceil$ sets, i.e., the set of all possible colorings of FK configurations. Denote by $\mathbb{P}_{\mathrm{col}}$ the probability measure over colorings $(R_{0},...,R_{\lfloor q\rfloor})$ averaged over $\pi_{n,\lambda,q}$ , and $\mathbb{P}_{\mathrm{col}}(\cdot\mid\mathcal{F})$ the probability measure over such colorings, averaged over $\pi_{n,\lambda,q}(\cdot\mid\mathcal{F})$ . For every $\mathbf{R}\in\mathcal{P}$ ,

[TABLE]

Then we can write, by Corollary 4.2,

[TABLE]

By Lemma 3.2, since $A_{\rho,M}$ implies $|S_{M}|\leq\rho n$ , for every $i=1,...,\lfloor q\rfloor$ ,

[TABLE]

If $||R_{i}|-\frac{n}{q}|<2\rho n$ for all $i=1,...,\lfloor q\rfloor$ , we are left with a remainder set satisfying

[TABLE]

Define the event $\Gamma_{\rho}$ over colorings of the mean-field FK model as

[TABLE]

so that the above conclusion can be written as

[TABLE]

Combined with the expression for $\pi_{n,\lambda,q}(B_{\rho,M}\mid A_{\rho,M})$ , this implies that

[TABLE]

By a union bound, the first term on the right-hand side is bounded above by

[TABLE]

We lower bound the numerator and upper bound the denominator simultaneously as they entail similar estimates.

Since $\lambda<\lambda_{S}=q$ , there exists $\rho_{0}(\lambda,q)$ such that for all $\rho<\rho_{0}$ , the random graph $\mathcal{G}(\frac{n}{q}+2\lfloor q\rfloor\rho n,p)$ is subcritical and the FK model $\mathcal{G}((1-\tfrac{\lfloor q\rfloor}{q}+{2\rho\lfloor q\rfloor})n,p,q-\lfloor q\rfloor)$ is also subcritical. In other words, if $\rho<\rho_{0}(\lambda,q)$ , for every $\textbf{R}\in\Gamma_{\rho}$ , the distributions $\pi_{|R_{i}|,\lambda}$ for $i=1,...,\lfloor q\rfloor$ and $\pi_{|R_{0}|,\lambda,q-\lfloor q\rfloor}$ are all subcritical. As such, by Lemma 3.1, there exists $c(\lambda,q)>0$ such that for every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Similar bounds under $\pi_{|R_{0}|,\lambda,q-\lfloor q\rfloor}$ follow immediately for a different $c(\lambda,q)>0$ from Lemma 4.3. Altogether, this implies that for every $\rho<\rho_{0}$ and every $M\geq M_{0}(\lambda,\rho)$ , there exists $c(\rho,M,\lambda,q)>0$ such that

[TABLE]

Proof of Theorem 2: the case $np=\lambda\in[\lambda_{c},\lambda_{S})$ .

For $\rho,M>0$ , recall the definitions of $A_{\rho,M}$ and $B_{\rho,M}$ . By Proposition 2.1, for $\lambda\in[\lambda_{c},\lambda_{S})$ , for sufficiently small $\rho>0$ and large $M$ , there exists $c(\lambda,q)>0$ such that $\pi_{n,\lambda,q}(A_{\rho,M}^{c})\geq c$ . Then by (4.1) it suffices to prove an exponentially decaying upper bound on

[TABLE]

where $P,Q$ are the transition matrix and edge measure, respectively, of the Chayes–Machta dynamics. We first bound the first term in the right-hand side of (4.4).

Consider some $X_{0}\in A_{\rho,M}-B_{\rho,M}$ . In the activation stage of the Chayes–Machta dynamics, clusters are activated with probability $\frac{1}{q}$ ; denote by $\mathcal{A}_{1}$ the set of activated vertices in this stage of the dynamics. Since $X_{0}\in A_{\rho,M}-B_{\rho,M}$ , by Lemma 3.2 with the choice of $\varepsilon=\delta=\rho/2$ ,

[TABLE]

Since $\lambda<\lambda_{S}=q$ , for $\rho<1-\lambda/q$ , the random graph $\mathcal{G}((\frac{1}{q}+\rho)n,p)$ is subcritical. In that case, by Lemma 3.1, there exists $c(\lambda,\rho)>0$ such that for every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Union bounding over the event $||\mathcal{A}_{1}|-n/q|\geq\rho n$ and its complement, there exists $c(\rho,M,\lambda,q)>0$ such that for every $\rho<1-\lambda/q$ , for every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Lemma 4.5 yields a similar exponentially decaying upper bound on the second term on the right-hand side of (4.4), concluding the proof. ∎
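One step of the Chayes–Machta dynamics, as used in the proof above, can be sketched as follows: every cluster is activated independently with probability $\frac{1}{q}$ , the edges among activated vertices are discarded, and each pair of activated vertices is then joined independently with probability $p$ . This is an illustrative sketch under that description, not the authors' implementation; `chayes_machta_step` is a hypothetical name, and edges are assumed to be stored as pairs `(u, v)` with `u < v`.

```python
import random


def chayes_machta_step(n, edges, p, q, rng):
    """Return (new_edges, activated_vertices) after one Chayes-Machta step."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)

    # Activation stage: one independent coin of bias 1/q per cluster.
    decision = {}
    active = []
    for x in range(n):
        root = find(x)
        if root not in decision:
            decision[root] = rng.random() < 1.0 / q
        if decision[root]:
            active.append(x)
    active_set = set(active)

    # Edges of non-activated clusters are kept unchanged.
    new_edges = {(u, v) for (u, v) in edges if u not in active_set}

    # Resampling stage: G(|A_1|, p) on the activated vertices.
    for i, u in enumerate(active):
        for v in active[i + 1:]:
            if rng.random() < p:
                new_edges.add((u, v))
    return new_edges, active_set
```

Since clusters are activated as a whole, every original edge has either both endpoints activated or neither, which is why checking one endpoint suffices when copying the untouched edges.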

4.2. The subcritical/critical regime, $np=\lambda\in(\lambda_{s},\lambda_{c}]$

Recall the definitions of $\Theta^{*}(\lambda,q)$ and $\Theta_{r}(\lambda,q)$ corresponding to the drift function $g$ . When $\lambda\in(\lambda_{s},\lambda_{c}]$ , we will need the following intermediate lemma, before proceeding to the analogue of Lemma 4.5. This is a straightforward adaptation of an argument of [1].

Lemma 4.6.

Consider the mean-field FK model on $n$ vertices with parameters $(p,q)$ with $np=\lambda\in(\lambda_{s},\lambda_{S})$ ; let $\omega_{0}\in A_{\rho,\varepsilon,M}=\{\omega:{\mathscr{L}}_{1}\geq(\Theta^{*}+\varepsilon)n,|S_{M}-\mathcal{C}_{1}|<\rho n\}$ . Color $\mathcal{C}_{1}$ red and independently color each cluster in $\omega_{0}-\mathcal{C}_{1}$ red with probability $\frac{1}{q}$ ; let $R$ be the set of all red vertices. Resample $\omega_{0}\mathord{\upharpoonright}_{R}\sim\pi_{|R|,\lambda,1}$ and let $\omega_{1}$ be the resulting configuration on $n$ vertices; there exists $c(\rho,\varepsilon,M,\lambda)>0$ so that for sufficiently small $\rho,\varepsilon>0$ , for every $M\geq M_{0}(\lambda,\rho)$ , uniformly in $\omega_{0}\in A_{\rho,\varepsilon,M}$ ,

[TABLE]

Proof.

Fix any $\omega_{0}\in A_{\rho,\varepsilon,M}$ and let $n\theta_{0}={\mathscr{L}}_{1}(\omega_{0})$ for $\theta_{0}\geq\Theta^{*}+\varepsilon$ . Then

[TABLE]

so that by Lemma 3.2, for all $\rho>0$ ,

[TABLE]

Therefore, we can write for every $\delta>0$ ,

[TABLE]

For all $\theta_{0}\geq\Theta^{*}+\varepsilon$ , for sufficiently small $\rho>0$ , using $\theta_{0}>\Theta^{*}>\Theta_{\mathrm{min}}$ , since $\lambda<\lambda_{S}$ , the random graph $\mathcal{G}(\mu_{0}-\rho n,p)$ is supercritical. By continuity of $f$ , for any $\delta>0$ , there exists $\rho>0$ sufficiently small such that $\max_{a:|a-\mu_{0}|\leq\rho n}|f(\theta_{0})-\theta_{\lambda a/n}|<\delta$ ; moreover, by Fact 2.4, for every $\delta>0$

[TABLE]

for some $c(\lambda,\rho)>0$ . Thus, for sufficiently small $\rho>0$ , we have, for some $c(\rho,M,\lambda)>0$ ,

[TABLE]

It remains to argue that for $\varepsilon>0$ sufficiently small, there exists $\delta>0$ such that for all $\theta_{0}\geq\Theta^{*}+\varepsilon$ , we have $nf(\theta_{0})-2\delta n\geq(\Theta^{*}+\varepsilon)n$ . If $\theta_{0}>\Theta_{r}-\varepsilon$ , then by [1, Lemma 2.14], $f(\theta_{0})\geq\Theta_{r}-\varepsilon>\Theta^{*}+\varepsilon$ and for small enough $\varepsilon$ letting $\delta=\frac{1}{2}(\Theta_{r}-\Theta^{*}-2\varepsilon)>0$ yields the desired. If $\theta_{0}\leq\Theta_{r}-\varepsilon$ , since $g$ is positive on $(\Theta^{*},\Theta_{r})$ , for $\varepsilon$ small, $f(\theta_{0})>\theta_{0}\geq\Theta^{*}+\varepsilon$ . By continuity of $f$ , for $\varepsilon<\frac{1}{2}(\Theta_{r}-\Theta^{*})$ , letting $\delta=\frac{1}{2}\min_{[\Theta^{*}+\varepsilon,\Theta_{r}-\varepsilon]}g$ , we obtain

[TABLE]

Together, for $\varepsilon>0$ sufficiently small, there exists $c(\rho,\varepsilon,M,\lambda)>0$ such that

[TABLE]
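Membership in the set $A_{\rho,\varepsilon,M}$ of Lemma 4.6 depends on a configuration only through its cluster sizes, which the following sketch makes explicit. The value of $\Theta^{*}$ is taken as an input (it is defined in §2 and not computed here), $S_{M}$ is assumed to consist of the vertices in clusters of size at least $M$ , and the function name is hypothetical.

```python
def in_bottleneck_set(cluster_sizes, theta_star, eps, rho, M):
    """Check L_1 >= (Theta* + eps) n and |S_M - C_1| < rho n."""
    n = sum(cluster_sizes)
    ordered = sorted(cluster_sizes, reverse=True)
    largest = ordered[0]
    # Mass of the large clusters other than the largest one.
    large_non_giant = sum(s for s in ordered[1:] if s >= M)
    return largest >= (theta_star + eps) * n and large_non_giant < rho * n
```

A configuration with one cluster of size $0.6n$ and all other vertices isolated passes the check for $\Theta^{*}=0.5$ and $\varepsilon=0.05$ , whereas adding a second cluster of size $0.2n$ violates the second condition when $\rho=0.1$ .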

The following is the analogue of Lemma 4.5 in the presence of a giant component.

Lemma 4.7.

Consider the mean-field FK model on $n$ vertices with parameters $(p,q)$ with $q>2$ and $np=\lambda\in(\lambda_{s},\lambda_{S})$ ; for every $\rho,\varepsilon,M>0$ let

[TABLE]

There exists $c(\rho,M,\lambda,q)>0$ such that for sufficiently small $\rho,\varepsilon>0$ , for $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Proof.

Fix $np=\lambda>\lambda_{s}$ and for $\rho,\varepsilon,M>0$ , define the sets

[TABLE]

We prove the lemma similarly to Lemma 4.5, after treating the giant component separately. Using the coloring scheme of Corollary 4.2, with $\mathbb{P}_{\mathrm{col}}$ and $\pi_{\textbf{R}}$ defined as before, by considering the color class to which $\mathcal{C}_{1}$ belongs, and using symmetry, we obtain

[TABLE]

Call the two sums on the right-hand side I and II respectively and consider them separately. Conditional on $A_{\rho,\varepsilon,M}$ and $\mathcal{C}_{1}\subset R_{1}$ , if $\mu_{\textbf{I}}=(\Theta^{*}+\varepsilon)n+\tfrac{1}{q}(1-\Theta^{*}-\varepsilon)n$ ,

[TABLE]

where we used Lemma 3.2 with $\varepsilon=\delta=\rho$ . Following the proof of Lemma 4.5, let

[TABLE]

By Lemma 3.2 and a union bound, $\mathbb{P}_{\mathrm{col}}((\Gamma_{\rho}^{\textbf{I}})^{c}\mid\mathcal{C}_{1}\subset R_{1},A_{\rho,\varepsilon,M})\leq 2\lceil q\rceil e^{-\rho^{2}n/(2M^{2})}$ .

Using the fact that for every $\varepsilon>0$ , $E_{\rho,\varepsilon,M}\subset B_{\rho,M}$ , we can write

[TABLE]

If $\textbf{R}\in\Gamma_{\rho}^{\textbf{I}}$ , for sufficiently small $\rho>0$ , the definition of $\Theta^{*}$ and $\lambda>\lambda_{s}$ implies $\mathcal{G}(|R_{1}|,p)$ is supercritical, and both $\mathcal{G}(\frac{n-|R_{1}|}{q-1}+2\rho n,p)$ and $\mathcal{G}(|R_{0}|,p,q-\lfloor q\rfloor)$ are subcritical. By a union bound we can expand the numerator above as at most

[TABLE]

and analogously, the denominator as at least

[TABLE]

(In both of the above, we paid a cost of $e^{-c\Theta^{*}n}$ for the assumption ${\mathscr{L}}_{1}(\omega)={\mathscr{L}}_{1}(\omega\mathord{\upharpoonright}_{R_{1}})$ .) By Lemma 3.4, (for every ${\mathscr{L}}_{1}\geq(\Theta^{*}+\varepsilon)n$ and $\textbf{R}\in\Gamma_{\rho}^{\textbf{I}}$ , $\mathcal{G}(|R_{1}|-{\mathscr{L}}_{1},p)$ is subcritical) there exists $c(\lambda,q)>0$ such that for sufficiently small $\rho,\varepsilon>0$ and every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Moreover, as in the proof of Lemma 4.5, by Lemmas 3.1 and 4.3, for $i=2,...,\lfloor q\rfloor$ there exists $c(\lambda,q)>0$ such that for every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Clearly, analogous bounds hold for all of the above when replacing $\frac{\rho n}{\lceil q\rceil}$ with $\frac{\rho n}{2\lceil q\rceil}$ . Combining all of the above bounds and plugging them into the right-hand side of

[TABLE]

yields an exponentially decaying upper bound on the sum I. The bound on the sum II is very similar. Letting $\mu_{\textbf{II}}=(\Theta^{*}+\varepsilon)n+\frac{q-\lfloor q\rfloor}{q}(1-\Theta^{*}-\varepsilon)n$ , we define

[TABLE]

As before, by Lemma 3.2, we can write

[TABLE]

and observe that for every $\textbf{R}\in\Gamma_{\rho}^{\textbf{II}}$ , since $\lambda\in(\lambda_{s},\lambda_{S})$ , for sufficiently small $\rho>0$ , the FK model $\pi_{|R_{0}|,\lambda,q-\lfloor q\rfloor}$ is supercritical and the random graphs $\mathcal{G}(|R_{i}|,\lambda)$ are subcritical for all $i=1,...,\lfloor q\rfloor$ . By Lemmas 3.1 and 4.4, there exists $c(\lambda,q)>0$ such that for every $\rho>0$ sufficiently small and every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

and by the same reasoning, analogous bounds hold when replacing $\frac{\rho n}{\lceil q\rceil}$ with $\frac{\rho n}{2\lceil q\rceil}$ . Then expanding the fraction in the upper bound on II as done in the bound on I implies there exists $c(\lambda,q)>0$ such that for sufficiently small $\rho,\varepsilon>0$ and every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

We are now in a position to complete the proof of Theorem 2.

Proof of Theorem 2: the case $np=\lambda\in(\lambda_{s},\lambda_{c}]$ .

The proof when $\lambda\in(\lambda_{s},\lambda_{c}]$ parallels the way the slow mixing proof for the Swendsen–Wang dynamics was extended from $\lambda\in[\lambda_{c},\lambda_{S})$ to $\lambda\in(\lambda_{s},\lambda_{c}]$ . Recall that for fixed $\lambda>\lambda_{s}$ , the two zeros of $g(\theta)=f(\theta)-\theta$ were denoted $\Theta^{*}<\Theta_{r}$ so that $g$ is positive on $(\Theta^{*},\Theta_{r})$ . We again use a conductance estimate to lower bound the inverse gap of the Chayes–Machta dynamics. Define for every $\rho,\varepsilon,M>0$ ,

[TABLE]

As in (4.4), by (4.1) it suffices to show an exponentially decaying upper bound on

[TABLE]

for sufficiently small $\rho,\varepsilon>0$ and large $M$ ; this is because by Proposition 2.1, for all small enough $\varepsilon,\rho$ , we have $\pi_{n,\lambda,q}(A_{\rho,\varepsilon,M}^{c})\geq c>0$ . We bound the two terms above separately as in the proof for $\lambda\in[\lambda_{c},\lambda_{S})$ . First of all, note by Lemma 4.7 that the second term on the right-hand side is bounded above by $e^{-cn}$ for some $c(\rho,M,\lambda,q)>0$ for every sufficiently small $\varepsilon,\rho>0$ and every $M\geq M_{0}(\lambda,\rho)$ .

Now consider any $X_{0}\in A_{\rho,\varepsilon,M}-E_{\rho,\varepsilon,M}$ and bound $P(X_{0},A^{c}_{\rho,\varepsilon,M})$ under the Chayes–Machta dynamics. We split the transition probability of the Chayes–Machta dynamics into the case when $\mathcal{C}_{1}(X_{0})$ is activated and $\mathcal{C}_{1}(X_{0})$ is not activated; let $\mathcal{A}_{1}$ denote the set of activated vertices. If $\mathcal{C}_{1}(X_{0})\not\subset\mathcal{A}_{1}$ , we have $\mathbb{E}[|\mathcal{A}_{1}|\mid\mathcal{C}_{1}\not\subset\mathcal{A}_{1}]\leq\frac{1}{q}(1-\Theta^{*}-\varepsilon)n$ and since $X_{0}\in A_{\rho,\varepsilon,M}$ , by Lemma 3.2, if $\varepsilon>\rho$ , then

[TABLE]

If $|\mathcal{A}_{1}|\leq\frac{1}{q}(1-\Theta^{*}-\varepsilon)n+\varepsilon n$ , for sufficiently small $\varepsilon>0$ , since $\lambda<\lambda_{S}=q$ , the random graph $\mathcal{G}(|\mathcal{A}_{1}|,p)$ is subcritical, in which case with probability at least $1-e^{-c\Theta^{*}n}$ , $\mathcal{C}_{1}(X_{1})=\mathcal{C}_{1}(X_{0})$ . By Lemma 3.1, there exists $c(\rho,M,\lambda,q)>0$ such that for $0<\rho<\varepsilon$ sufficiently small and every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Thus, for some $c(\rho,\varepsilon,M,\lambda,q)>0$ , for small enough $0<\rho<\varepsilon$ , and every $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Now suppose that $\mathcal{C}_{1}(X_{0})\subset\mathcal{A}_{1}$ ; then one step of Chayes–Machta dynamics is described precisely by the setup of Lemma 4.6, with $\rho$ replaced by $\rho/2$ , yielding

[TABLE]

for some $c^{\prime}(\varepsilon,\rho,M,\lambda,q)>0$ for all sufficiently small $\varepsilon,\rho>0$ and $M\geq M_{0}(\lambda,\rho)$ . On the complement of that event, deterministically $\mathcal{C}_{1}(X_{1})=\mathcal{C}_{1}(X_{1}\mathord{\upharpoonright}_{\mathcal{A}_{1}})$ . By Lemma 3.4, for some $c(\lambda,q)>0$ , for small $\varepsilon,\rho>0$ and large $M\geq M_{0}(\lambda,\rho)$ ,

[TABLE]

Combining the above, we deduce that there exists $c(\rho,\varepsilon,M,\lambda,q)>0$ such that for all sufficiently small $0<\rho<\varepsilon$ , for every $M\geq M_{0}(\lambda,\rho)$ , we have $P(X_{0},A_{\rho,\varepsilon,M}^{c})\lesssim e^{-cn}$ , concluding the proof of Theorem 2 when $\lambda\in(\lambda_{s},\lambda_{c}]$ . ∎

Acknowledgment

R.G. thanks the theory group of Microsoft Research Redmond for its hospitality during the time some of this work was carried out. E.L. was supported in part by NSF grant DMS-1513403.

Bibliography

[1] A. Blanca and A. Sinclair. Dynamics for the mean-field random-cluster model. In Proc. of the 19th International Workshop on Randomization and Computation (RANDOM 2015), pages 528–543, 2015.
[2] B. Bollobás, G. Grimmett, and S. Janson. The random-cluster model on the complete graph. Probab. Theory Related Fields, 104(3):283–317, 1996.
[3] C. Borgs, J. Chayes, A. Frieze, J. H. Kim, P. Tetali, E. Vigoda, and V. H. Vu. Torpid mixing of some Monte Carlo Markov chain algorithms in statistical physics. In Proc. of the 40th Annual Symposium on Foundations of Computer Science (FOCS 1999), pages 218–229, 1999.
[4] C. Borgs, J. T. Chayes, and P. Tetali. Tight bounds for mixing of the Swendsen-Wang algorithm at the Potts transition point. Probab. Theory Related Fields, 152(3-4):509–557, 2012.
[5] L. Chayes and J. Machta. Graphical representations and cluster algorithms I. Discrete spin systems. Physica A: Statistical Mechanics and its Applications, 239(4):542–601, 1997.
[6] C. Cooper, M. E. Dyer, A. M. Frieze, and R. Rue. Mixing properties of the Swendsen-Wang process on the complete graph and narrow grids. J. Math. Phys., 41(3):1499–1527, 2000. Probabilistic techniques in equilibrium and nonequilibrium statistical physics.
[7] P. Cuff, J. Ding, O. Louidor, E. Lubetzky, Y. Peres, and A. Sly. Glauber dynamics for the mean-field Potts model. J. Stat. Phys., 149(3):432–477, 2012.
[8] R. G. Edwards and A. D. Sokal. Generalization of the Fortuin-Kasteleyn-Swendsen-Wang representation and Monte Carlo algorithm. Phys. Rev. D (3), 38(6):2009–2012, 1988.
