Slow Mixing of Glauber Dynamics for the Six-Vertex Model in the Ordered Phases

Matthew Fahrbach; Dana Randall

arXiv:1904.01495·cs.DS·December 23, 2020


TL;DR

This paper proves that Glauber dynamics for the six-vertex model in ordered phases can require exponential time to mix, revealing fundamental limitations of local Markov chains in these regimes.

Contribution

It provides the first rigorous bounds on the slow mixing of Glauber dynamics in the ferroelectric phase of the six-vertex model, extending understanding in ordered phases.

Findings

01

Glauber dynamics can mix exponentially slowly in the ferroelectric phase.

02

Boundary conditions can induce slow mixing in the ordered phases.

03

New techniques relate correlated random walks to lattice path models.


Equations

$$\Delta = \frac{a^{2} + b^{2} - c^{2}}{2ab}.$$

$$\lVert \mu - \nu \rVert_{\text{TV}} = \frac{1}{2} \sum_{x \in \Omega} \lvert \mu(x) - \nu(x) \rvert .$$

$$\Phi(S) = \frac{\sum_{x \in S,\, y \notin S} \pi(x) P(x,y)}{\pi(S)},$$

$$S \overset{\text{def}}{=} \{ x \in \Omega : \text{the deviation of each path in } x \text{ is less than } 8n^{3/4} \}.$$

$$\Pr(\gamma) \propto \mu^{(\#\text{ of straights in } \gamma)}.$$

$$\Pr(\gamma \text{ deviates by at least } 2m) \leq e^{-(1-\varepsilon)\frac{m^{2}}{\mu n}}.$$

$$X_{i+1} = \begin{cases} X_{i} & \text{with probability } p, \\ -X_{i} & \text{with probability } 1-p. \end{cases}$$

$$\Pr(S_{2n} = 2m) = \begin{cases} \frac{1}{2} p^{2n-1} & \text{if } 2m = 2n, \\ \sum_{k=1}^{n-m} \binom{n+m-1}{k-1} \binom{n-m-1}{k-1} (1-p)^{2k-1} p^{2n-1-2k} \left( \frac{n(1-p) + k(2p-1)}{k} \right) & \text{if } 2m < 2n. \end{cases}$$

$$\Pr(\gamma \text{ deviates by at least } 2m) = \Pr\left( \max_{i=0..2n} \lvert S_{i} \rvert \geq 2m \;\middle|\; S_{2n} = 0 \right).$$

$$\Pr(S_{2n} = 0) \sim \frac{1}{\sqrt{\mu \pi n}}.$$

$$\Pr(S_{2n} = 2m) \geq \Pr(S_{2n} = 2(m+1)).$$

$$\Pr(S_{2n} = 2m) = (1-p)\, p^{2n-3} \left( n(1-p) + 2p - 1 \right),$$

$$\Pr(S_{2n} = 2(m+1)) = \frac{1}{2} p^{2n-1}.$$

$$n \geq \frac{1}{1-p} \cdot \left( \frac{p^{2}}{2(1-p)} + 1 - 2p \right) > 0.$$

$$\binom{n+m-1}{k-1} \binom{n-m-1}{k-1} - \binom{n+(m+1)-1}{k-1} \binom{n-(m+1)-1}{k-1} \geq 0.$$

$$\binom{n+(m+1)-1}{k-1}$$

$$\binom{n-(m+1)-1}{k-1}$$

$$1 - \frac{n+m}{n+m-(k-1)} \cdot \frac{n-m-k}{n-m-1} \geq 0.$$

$$(n+m-(k-1))(n-m-1) \geq (n+m)(n-m-k).$$

$$(n+m-(k-1))(n-m-1) - (n+m)(n-m-k) = (2m+1)(k-1) \geq 0$$

$$\Pr(S_{2n} = 2m) \leq e^{-(1-\varepsilon)\frac{m^{2}}{\mu n}}.$$

$$\Pr(\gamma \text{ deviates by at least } 2m) \leq \frac{\Pr\left( \max_{i=0..2n} \lvert S_{i} \rvert \geq 2m \right)}{\Pr(S_{2n} = 0)} \leq 2\sqrt{\mu \pi n} \cdot \Pr\left( \max_{i=0..2n} \lvert S_{i} \rvert \geq 2m \right),$$

$$\Pr\left( \max_{i=0..2n} \lvert S_{i} \rvert \geq 2m \right)$$

$$= 2 \cdot \Pr\left( \max_{i=0..2n} S_{i} \geq 2m \right).$$

$$\Pr\left( \max_{i=0..2n} S_{i} \geq 2m \right)$$

$$\sum_{i=1}^{n} \Pr(S_{2i} \geq 2m)$$

$$\Pr(\gamma \text{ deviates by at least } 2m)$$

$$\pi(S)$$

$$2\ell \left( n - (2\ell d + \ell + c) \right)$$

$$2 \sum_{k=1}^{\ell} 2(kd \pm c)$$


Full text


Matthew Fahrbach

Email: [email protected]. Supported in part by an NSF Graduate Research Fellowship under grant DGE-1650044.

School of Computer Science, Georgia Institute of Technology

Dana Randall

Email: [email protected]. Supported in part by NSF grants CCF-1526900, CCF-1637031, and CCF-1733812.

School of Computer Science, Georgia Institute of Technology

Abstract

The six-vertex model in statistical physics is a weighted generalization of the ice model on $\mathbb{Z}^{2}$ (i.e., Eulerian orientations) and the zero-temperature three-state Potts model (i.e., proper three-colorings). The phase diagram of the model depicts its physical properties and suggests where local Markov chains will be efficient. In this paper, we analyze the mixing time of Glauber dynamics for the six-vertex model in the ordered phases. Specifically, we show that for all Boltzmann weights in the ferroelectric phase, there exist boundary conditions such that local Markov chains require exponential time to converge to equilibrium. This is the first rigorous result bounding the mixing time of Glauber dynamics in the ferroelectric phase. Our analysis demonstrates a fundamental connection between correlated random walks and the dynamics of intersecting lattice path models (or routings). We analyze the Glauber dynamics for the six-vertex model with free boundary conditions in the antiferroelectric phase and significantly extend the region for which local Markov chains are known to be slow mixing. This result relies on a Peierls argument and novel properties of weighted non-backtracking walks.

1 Introduction

The six-vertex model was first introduced by Pauling in 1935 [Pau35] to study the thermodynamics of crystalline solids with ferroelectric properties, and has since become one of the most compelling models in statistical mechanics. The prototypical instance of the model is the hydrogen-bonding pattern of two-dimensional ice—when water freezes, each oxygen atom must be surrounded by four hydrogen atoms such that two of the hydrogen atoms bond covalently with the oxygen atom and two are farther away. The state space of the six-vertex model consists of orientations of the edges in a finite region of the two-dimensional square lattice where every internal vertex has two incoming edges and two outgoing edges, also represented as Eulerian orientations of the underlying lattice graph. The model is most often studied on the $n\times n$ square lattice $\Lambda_{n}\subseteq\mathbb{Z}^{2}$ with $4n$ additional edges so that each internal vertex has degree 4. There are six possible edge orientations incident to a vertex (see Figure 1). We assign Boltzmann weights $w_{1},w_{2},w_{3},w_{4},w_{5},w_{6}\in\mathbb{R}_{>0}$ to the six vertex types and define the partition function as $Z=\sum_{x\in\Omega}\prod_{i=1}^{6}w_{i}^{n_{i}(x)}$ , where $\Omega$ is the set of Eulerian orientations of $\Lambda_{n}$ and $n_{i}(x)$ is the number of type- $i$ vertices in the configuration $x$ .
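As a quick sanity check on the count of six vertex types, the following sketch enumerates all orientations of the four edges at a degree-4 vertex and keeps those with two incoming and two outgoing edges. The encoding of edge directions is an illustrative assumption and not notation from the paper.

```python
from itertools import product

# Assumed encoding (not from the paper): the four edges at a vertex are
# listed as (north, east, south, west); 1 means the edge points into the
# vertex and 0 means it points out of it.
def ice_rule_orientations():
    """Return every local orientation with two incoming and two outgoing edges."""
    return [edges for edges in product((0, 1), repeat=4) if sum(edges) == 2]

valid = ice_rule_orientations()
print(len(valid), "of", 2 ** 4, "local orientations satisfy the two-in, two-out rule")
```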

In 1967, Lieb discovered exact solutions to the six-vertex model with periodic boundary conditions (i.e., on the torus) for three different parameter regimes [Lie67a, Lie67b, Lie67c]. In particular, he famously showed that if all six vertex weights are set to $w_{i}=1$ , the energy per vertex is $\lim_{n\rightarrow\infty}Z^{1/n^{2}}=\left(4/3\right)^{3/2}=1.5396007...$ , which is known as “Lieb’s square ice constant”. His results were immediately generalized to all parameter regimes and to account for external electric fields [Sut67, Yan67]. An equivalence between periodic and free boundary conditions in the limit was established in [BKW73], and since then the primary object of study has been the six-vertex model subject to domain wall boundary conditions, where the lower and upper boundary edges point into the square and the left and right boundary edges point outwards [ICK92, KZJ00, BPZ02, BF06, BL09, BL10]. The six-vertex model serves as an important “counterexample” in statistical physics because the surface free energy in the thermodynamic limit depends on the boundary conditions. In particular, it is different for periodic and domain wall boundary conditions.

There have been several surprisingly profound connections to combinatorics and probability in this line of work. For example, Zeilberger gave a sophisticated computer-assisted proof of the alternating sign matrix conjecture in 1995 [Zei96]. A year later, Kuperberg [Kup96] produced an elegant and significantly shorter proof using analysis of the partition function of the six-vertex model with domain wall boundary conditions. Other connections to combinatorics include the dimer model on the Aztec diamond and the arctic circle theorem [CEP96, FS06], sampling lozenge tilings [LRS01, Wil04, BCFR17], and counting 3-colorings of lattice graphs [RT00, CR16].

While there has been extraordinary progress in understanding properties of the six-vertex model with periodic or domain wall boundary conditions in mathematical physics, remarkably less is known when the model is subject to arbitrary boundary conditions. Sampling configurations using Markov chain Monte Carlo (MCMC) algorithms has been one of the primary means for discovering mathematical and physical properties of the six-vertex model [AR05, LKV17, LKRV18, KS18, BR20]. However, the model is empirically very sensitive to boundary conditions, and numerical studies have often observed slow convergence of local MCMC algorithms under certain parameter settings. For example, according to [LKRV18], “it must be stressed that the Metropolis algorithm might be impractical in the antiferromagnetic phase, where the system may be unable to thermalize.” There are very few rigorous results about natural Markov chains and the computational complexity of sampling from the six-vertex model when the Boltzmann distribution is nonuniform, thus motivating our study of Glauber dynamics for the six-vertex model, the most widely used MCMC sampling algorithm, in the ferroelectric and antiferroelectric phases.

At first glance, the model has six degrees of freedom. However, this conveniently reduces to a two-parameter family because of invariants that relate pairs of vertex types. To see this, it is useful to view the configurations of the six-vertex model as intersecting lattice paths by erasing all of the edges that are directed south or west and keeping the others (see Figure 2). Using this bijective “routing interpretation,” it is simple to see that the number of type-5 and type-6 vertices must be closely correlated. In addition to revealing invariants, the lattice path representation of configurations turns out to be exceptionally useful for analyzing Glauber dynamics. Moreover, the total weight of a configuration should remain unchanged if all the edge directions are reversed in the absence of an external electric field, so we let $w_{1}=w_{2}=a$ , $w_{3}=w_{4}=b$ , and $w_{5}=w_{6}=c$ . This complementary invariance is known as the zero field assumption, and it is often convenient to exploit the conservation laws of the model [BL09] to reparameterize the system so that $w_{1}=a^{2}$ and $w_{2}=1$ . This allows us to ignore empty sites and focus solely on weighted lattice paths. Furthermore, since our goal is to sample configurations from the Boltzmann distribution, we can normalize the partition function by a factor of $c^{-n^{2}}$ and consider the weight $(a/c,b/c,1)$ instead of the parameter $(a,b,c)$ . Collectively, we refer to these properties as the invariance of the Gibbs measure for the six-vertex model.

The single-site Glauber dynamics for the six-vertex model is the Markov chain that makes local moves by (1) choosing an internal cell of the lattice uniformly at random and (2) reversing the orientations of the edges that bound the chosen cell if they form a cycle. In the lattice path interpretation, these dynamics correspond to the “mountain-valley” Markov chain that flips corners. Transitions between states are made according to the Metropolis-Hastings acceptance probability [MRR+53] so that the Markov chain converges to the desired distribution.

The phase diagram of the six-vertex model represents distinct thermodynamic properties of the system and is partitioned into three regions: the disordered (DO) phase, the ferroelectric (FE) phase, and the antiferroelectric (AFE) phase. To establish these regions, we consider the parameter

$$\Delta = \frac{a^{2} + b^{2} - c^{2}}{2ab}.$$

The disordered phase is the set of parameters $(a,b,c)\in\mathbb{R}_{>0}^{3}$ that satisfy $|\Delta|<1$ , and Glauber dynamics is expected to be rapidly mixing in this region because there are no long-range correlations in the system. The ferroelectric phase is defined by $\Delta>1$ , or equivalently when we have $a>b+c$ or $b>a+c$ . The antiferroelectric phase is defined by $\Delta<-1$ , or equivalently when $a+b<c$ .
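The phase conditions can be sketched as a small classifier, assuming the parameter $\Delta = (a^{2}+b^{2}-c^{2})/(2ab)$. The function names and the label "critical" for the boundary case $|\Delta| = 1$ are illustrative choices, since the text assigns no phase to the boundary.

```python
def delta(a, b, c):
    """Phase parameter (a^2 + b^2 - c^2) / (2ab) for positive weights."""
    return (a * a + b * b - c * c) / (2 * a * b)

def phase(a, b, c):
    """Classify positive weights (a, b, c) by the value of delta."""
    if min(a, b, c) <= 0:
        raise ValueError("Boltzmann weights must be positive")
    d = delta(a, b, c)
    if d > 1:
        return "ferroelectric"      # equivalently a > b + c or b > a + c
    if d < -1:
        return "antiferroelectric"  # equivalently a + b < c
    if abs(d) < 1:
        return "disordered"
    return "critical"               # |delta| == 1: label chosen here, not in the text

print(phase(3, 1, 1), phase(1, 1, 3), phase(1, 1, 1))
```

The equivalences in the comments follow from $\Delta > 1 \iff (a-b)^{2} > c^{2}$ and $\Delta < -1 \iff (a+b)^{2} < c^{2}$.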

The phase diagram is symmetric over the positive diagonal, which follows from the fact that $a$ and $b$ are interchangeable under the automorphism that rotates each of the six vertex types by ninety degrees clockwise. This is equivalent to rotating the entire model under the zero field assumption. Therefore, we can assume that mixing results are symmetric over the main diagonal. Combinatorially, we show in Section 3 that configurations in the ferroelectric phase can be interpreted as intersecting lattice paths that prefer to adhere to each other. We carefully exploit this property to show that Glauber dynamics can be slow mixing. In the antiferroelectric phase, configurations prefer type-$c$ vertices and tend to be closely aligned with one of the two states of maximum probability, which are arrow reversals of each other.

1.1 Related Works

Cai, Liu, and Lu [CLL19] recently investigated the six-vertex model for 4-regular graphs and provided strong evidence that the complexity of approximating the partition function agrees with the phase diagram from statistical physics. In particular, they give a fully polynomial randomized approximation scheme (FPRAS) for all 4-regular graphs in the subregion of the disordered phase defined by the inequalities $a^{2}\leq b^{2}+c^{2}$, $b^{2}\leq a^{2}+c^{2}$, and $c^{2}\leq a^{2}+b^{2}$ (i.e., the blue region in Figure 3(a)). Their algorithm builds on the winding technique for Holant problems developed in [McQ13, HLZ16] and requires $O(n^{10})$ time to sample a six-vertex configuration from the Boltzmann distribution, where $n$ is the number of vertices in the graph. The Markov chain they use is not Glauber dynamics, but rather a directed loop algorithm whose state space is augmented with “near-perfect” configurations that slightly violate the Eulerian orientation constraint. This Markov chain can be understood as gradually reversing a large directed loop in a valid six-vertex configuration, whereas Glauber dynamics is restricted to reversing cycles that form the perimeter of a cell. Cai, Liu, and Lu also showed that an FPRAS for 4-regular graphs cannot exist in the ferroelectric or antiferroelectric regions unless $\textbf{RP}=\textbf{NP}$ (i.e., the gray regions in Figure 3(a)). Their hardness results use nonplanar 4-regular gadgets to reduce from 3-MIS, the NP-hard problem of computing the cardinality of a maximum independent set in a 3-regular graph [GJS74], and therefore do not directly reveal anything about the mixing time of Glauber dynamics for the six-vertex model on regions of $\mathbb{Z}^{2}$. A dichotomy theorem for the (exact) computability of the partition function of the six-vertex model on 4-regular graphs was also recently proven in [CFX18].

As for the positive results about the mixing time of Glauber dynamics, Luby, Randall, and Sinclair [LRS01] proved rapid mixing of a Markov chain that leads to a fully polynomial almost uniform sampler for Eulerian orientations on any region of the Cartesian lattice with fixed boundaries (i.e., the unweighted case when $a/c=b/c=1$ ). Randall and Tetali [RT00] then used a comparison technique to argue that Glauber dynamics for Eulerian orientations on lattice graphs is rapidly mixing by relating this Markov chain to the Luby-Randall-Sinclair chain. Goldberg, Martin, and Paterson [GMP04] extended their approach to show that Glauber dynamics is rapidly mixing on rectangular lattice regions with free boundary conditions.

Liu [Liu18] gave the first rigorous result showing that Glauber dynamics can be slowly mixing in a subregion of an ordered phase. In particular, Liu showed that local Markov chains subject to free boundary conditions require exponential time to converge to stationarity in the antiferroelectric subregion defined by $\max(a,b)<c/\mu$ (i.e., the red region in Figure 3(b)), where $\mu=2.6381585...$ is the connective constant for self-avoiding walks on the square lattice. We note that the connective constant is defined by the limit $\mu=\lim_{n\rightarrow\infty}\gamma_{n}^{1/n}$ , where $\gamma_{n}$ is the number of self-avoiding walks of length $n$ on the square lattice. Liu also showed that the directed loop algorithm used in [CLL19] mixes slowly in the same antiferroelectric subregion and for all of the ferroelectric region. This, however, has no bearing on the efficiency of Glauber dynamics in the ferroelectric region. As an aside, we also remark that the partition function is exactly computable for all boundary conditions at the free-fermion point when $\Delta=0$ , or equivalently $a^{2}+b^{2}=c^{2}$ , via a reduction to domino tilings and a Pfaffian computation [FS06].

1.2 Main Results

In this paper we show that there exist boundary conditions for which Glauber dynamics mixes slowly for the six-vertex model in the ferroelectric and antiferroelectric phases. We start by proving that there are boundary conditions that cause Glauber dynamics to be slow for all Boltzmann weights that lie in the ferroelectric region of the phase diagram, where the mixing time is exponential in the number of vertices in the lattice. This is the first rigorous result for the mixing time of Glauber dynamics in the ferroelectric phase and it gives a complete characterization.

Theorem 1.1 (Ferroelectric Phase).

For any $(a,b,c)\in\mathbb{R}^{3}_{>0}$ such that $a>b+c$ or $b>a+c$, there exist boundary conditions for which Glauber dynamics mixes exponentially slowly on $\Lambda_{n}$.

We note that our approach naturally breaks down at the critical line of the conjectured phase diagram for the mixing time in a way that reveals a trade-off between the energy and entropy of the system. Additionally, our analysis suggests an underlying combinatorial interpretation for the phase transition between the ferroelectric and disordered phases in terms of the adherence strength of intersecting lattice paths and the momentum parameter of correlated random walks.

Our second mixing result builds on the topological obstruction framework developed in [Ran06] to show that Glauber dynamics with free boundary conditions mixes slowly in most of the antiferroelectric region. Specifically, we generalize the recent antiferroelectric mixing result in [Liu18] with a Peierls argument that uses multivariate generating functions for weighted non-backtracking walks instead of the connective constant for (unweighted) self-avoiding walks to better account for the discrepancies in Boltzmann weights.

Theorem 1.2 (Antiferroelectric Phase).

For any $(a,b,c)\in\mathbb{R}^{3}_{>0}$ such that $ac+bc+3ab<c^{2}$, Glauber dynamics mixes exponentially slowly on $\Lambda_{n}$ with free boundary conditions.

We illustrate the new regions for which Glauber dynamics can be slowly mixing in Figure 3. Observe that our antiferroelectric subregion significantly extends Liu’s and pushes towards the conjectured threshold.
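The two antiferroelectric subregions mentioned in the text can be tested numerically with the sketch below. It assumes the inequalities exactly as stated ($\max(a,b)<c/\mu$ for [Liu18] and $ac+bc+3ab<c^{2}$ for Theorem 1.2) and uses the truncated value of the connective constant quoted earlier. The function names and the sample point are illustrative.

```python
# Truncated value of the connective constant quoted in the text.
MU_SAW = 2.6381585

def in_antiferroelectric_phase(a, b, c):
    return a + b < c

def in_liu_region(a, b, c):
    """Subregion from [Liu18] as stated in the text: max(a, b) < c / mu."""
    return max(a, b) < c / MU_SAW

def in_theorem_1_2_region(a, b, c):
    """Subregion of Theorem 1.2: ac + bc + 3ab < c^2."""
    return a * c + b * c + 3 * a * b < c * c

# Hypothetical sample point close to the a-axis of the phase diagram.
point = (0.9, 0.01, 1.0)
print(in_antiferroelectric_phase(*point),
      in_liu_region(*point),
      in_theorem_1_2_region(*point))
```

Since $ac+bc+3ab<c^{2}$ implies $(a+b)c<c^{2}$, every point accepted by `in_theorem_1_2_region` also satisfies $a+b<c$.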

1.3 Techniques

We take significantly different approaches for our analysis of the ferroelectric and antiferroelectric phases. In the ferroelectric phase, where $a>b+c$ and type- $a$ vertices are preferred to type- $b$ and type- $c$ vertices, we construct boundary conditions that induce polynomially-many paths separated by a critical distance that allows all of the paths to (1) behave independently and (2) simultaneously intersect with their neighbors maximally. (This analysis also covers the case $b>a+c$ by a standard invariant that shows symmetry in the phase diagram over the line $y=x$ .) From here, we analyze the dynamics of a single path in isolation as an escape probability, which eventually allows us to bound the conductance of the Markov chain. The dynamics of a single lattice path is equivalent to that of a correlated random walk. In Section 5 we present a new tail inequality for correlated random walks that accurately bounds the probability of large deviations from the starting position. We note that decomposing the dynamics of lattice models into one-dimensional random walks has recently been shown to achieve nearly tight bounds for escape probabilities in a different setting [DFGX18].

One of the key technical contributions in this paper is our analysis of the tail behavior of correlated random walks in Section 5. While there is a simple combinatorial expression for the position of a correlated random walk written as a sum of marginals, it is not immediately useful for bounding the displacement from the origin. To achieve an exponentially small tail bound for these walks, we first construct a smooth function that tightly upper bounds the marginals and then optimize this function to analyze the asymptotics of the log of the maximum marginal. Once we obtain an asymptotic equality for the maximum marginal, we can upper bound the deviation of a correlated random walk, and hence the deviation of a lattice path in a configuration. Ultimately, this allows us to show that there exists a balanced cut in the state space that has an exponentially small escape probability, which implies that the Glauber dynamics are slowly mixing.

In the antiferroelectric phase, on the other hand, the weights satisfy $a+b<c$ , so type- $c$ vertices are preferred. It follows that there are two (arrow-reversal) symmetric ground states of maximum probability containing only type- $c$ vertices. To move between configurations that agree predominantly with different ground states, the Markov chain must pass through configurations with a large number of type- $a$ or type- $b$ vertices. Using the idea of fault lines introduced in [Ran06], we use weighted non-backtracking walks to characterize such configurations and construct a cut set with exponentially small probability mass that separates the ground states.

2 Preliminaries

We start with some background on Markov chains and mixing times. Let $\mathcal{M}$ be an ergodic, reversible Markov chain with finite state space $\Omega$, transition probability matrix $P$, and stationary distribution $\pi$. The $t$-step transition probability from state $x$ to state $y$ is denoted as $P^{t}(x,y)$. The total variation distance between probability distributions $\mu$ and $\nu$ on $\Omega$ is

$$\lVert \mu - \nu \rVert_{\text{TV}} = \frac{1}{2} \sum_{x \in \Omega} \lvert \mu(x) - \nu(x) \rvert .$$

The mixing time of $\mathcal{M}$ is $\tau(1/4)=\min\{t\in\mathbb{Z}_{\geq 0}:\max_{x\in\Omega}\left\|P^{t}(x,\cdot)-\pi\right\|_{\text{TV}}\leq 1/4\}$ . We say that $\mathcal{M}$ is rapidly mixing if its mixing time is $O(\text{poly}(n))$ , where $n$ is the size of each configuration in the state space. Similarly, we say that $\mathcal{M}$ is slow mixing if its mixing time is $\Omega(\exp(n^{c}))$ for some constant $c>0$ .
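For a chain small enough to write down explicitly, the definition of $\tau(1/4)$ can be evaluated by brute force. The sketch below assumes the transition matrix is given as a list of rows and uses total variation distance equal to half the $\ell_{1}$ distance. The two-state lazy chain is a toy example chosen here, not one from the paper.

```python
def total_variation(mu, nu):
    """Half the l1 distance between two distributions on the same finite set."""
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

def step(dist, P):
    """Advance a distribution by one step of the chain: dist -> dist * P."""
    n = len(P)
    return [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]

def mixing_time(P, pi, eps=0.25, t_max=10_000):
    """Smallest t with max_x || P^t(x, .) - pi ||_TV <= eps."""
    n = len(P)
    # Row x holds the distribution of the chain started at x (initially P^0).
    rows = [[1.0 if y == x else 0.0 for y in range(n)] for x in range(n)]
    for t in range(t_max + 1):
        if max(total_variation(row, pi) for row in rows) <= eps:
            return t
        rows = [step(row, P) for row in rows]
    raise RuntimeError("chain did not mix within t_max steps")

# Toy example: lazy two-state chain with uniform stationary distribution.
P = [[0.75, 0.25], [0.25, 0.75]]
pi = [0.5, 0.5]
print(mixing_time(P, pi))
```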

The mixing time of a Markov chain is characterized by its conductance (up to polynomial factors). The conductance of a nonempty set $S\subseteq\Omega$ is

$$\Phi(S) = \frac{\sum_{x \in S,\, y \notin S} \pi(x) P(x,y)}{\pi(S)},$$

and the conductance of the entire Markov chain is $\Phi^{*}=\min_{S\subseteq\Omega:0<\pi(S)\leq 1/2}\Phi(S)$ . It is often useful to view the conductance of a set as an escape probability—starting from stationarity and conditioned on being in $S$ , the conductance $\Phi(S)$ is the probability that $\mathcal{M}$ leaves $S$ in one step.

Theorem 2.1 ([LPW17]).

For an ergodic, reversible Markov chain with conductance $\Phi^{*}$, $\tau(1/4)\geq 1/(4\Phi^{*})$.
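To make the conductance bound concrete, the following sketch computes $\Phi(S)$ and $\Phi^{*}$ by exhaustive search over subsets, which is feasible only for tiny chains. The four-state lazy walk on a path is a hypothetical example chosen for illustration; its transition matrix is symmetric, so the uniform distribution is stationary.

```python
from itertools import combinations

def conductance_of_set(S, P, pi):
    """Phi(S): stationary probability flow out of S, divided by pi(S)."""
    S = set(S)
    flow = sum(pi[x] * P[x][y] for x in S for y in range(len(P)) if y not in S)
    return flow / sum(pi[x] for x in S)

def chain_conductance(P, pi, tol=1e-12):
    """Phi*: minimum of Phi(S) over nonempty S with pi(S) <= 1/2 (brute force)."""
    n = len(P)
    best = None
    for size in range(1, n):
        for S in combinations(range(n), size):
            if sum(pi[x] for x in S) <= 0.5 + tol:
                phi = conductance_of_set(S, P, pi)
                best = phi if best is None else min(best, phi)
    return best

# Toy example: lazy walk on the path 0 - 1 - 2 - 3.
P = [
    [0.75, 0.25, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.25, 0.75],
]
pi = [0.25, 0.25, 0.25, 0.25]
phi_star = chain_conductance(P, pi)
print("conductance:", phi_star)
print("lower bound on tau(1/4):", 1 / (4 * phi_star))
```

In this example the minimizing cut separates $\{0,1\}$ from $\{2,3\}$, the natural bottleneck of the path.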

To show that a Markov chain is slow mixing, it suffices to show that the conductance is exponentially small.

3 Slow Mixing in the Ferroelectric Phase

We start with the ferroelectric phase where $a>b+c$ or $b>a+c$ , and we give a conductance-based argument to show that Glauber dynamics can be slowly mixing in the entire ferroelectric region. Specifically, we show that there exist boundary conditions that induce an exponentially small, asymmetric bottleneck in the state space, revealing a natural trade-off between the energy and entropy in the system. Viewing the six-vertex model in the intersecting lattice path interpretation suggests how to plant polynomially-many paths in the grid that can (1) be analyzed independently, while (2) being capable of intersecting maximally. This path independence makes our analysis tractable and allows us to interpret the dynamics of a path as a correlated random walk, for which we develop an exponentially small tail bound in Section 5. Since conductance governs mixing times, we show how to relate the expected maximum deviation of a correlated walk to the conductance of the Markov chain and prove slow mixing. In addition to showing slow mixing up to the conjectured threshold, a surprising feature of our argument is that it potentially gives a combinatorial explanation for the phase transition from the ferroelectric to disordered phase. In particular, Lemma 3.6 demonstrates how the parameters of the model delicately balance the probability mass of the Markov chain.

We start by leveraging the invariance of the Gibbs measure and the lattice path interpretation of the six-vertex model to conveniently reparameterize the Boltzmann weights. Recall that for a fixed boundary condition, the invariants of the model [BL09] imply that $a=\sqrt{w_{1}w_{2}}$ . Therefore, we set $w_{1}=\lambda^{2}$ and $w_{2}=1$ to ignore empty sites while letting $a=\lambda$ . We also set $b=w_{3}=w_{4}=\mu$ and $c=w_{5}=w_{6}=1$ so that the weight of a configuration only comes from straight segments and intersections of neighboring lattice paths.

3.1 Constructing the Boundary Conditions and Cut

We begin with a few colloquial definitions for lattice paths that allow us to easily construct the boundary conditions and make arguments about the conductance of the Markov chain. We call a $2n$ -step, north-east lattice path $\gamma$ starting from $(0,0)$ a path of length $2n$ , and if the path ends at $(n,n)$ we describe it as tethered. If $\gamma=((0,0),(x_{1},y_{1}),(x_{2},y_{2}),\dots,(x_{2n},y_{2n}))$ , we define the deviation of $\gamma$ to be $\max_{i=0..2n}\lVert(x_{i},y_{i})-(i/2,i/2)\rVert_{1}$ . Geometrically, path deviation captures the (normalized) maximum perpendicular distance of the path to the line $y=x$ . We refer to vertices $(x_{i},y_{i})$ along the path as corners or straights depending on whether or not the path turned. If two paths intersect at a vertex we call this site a cross. Note that this classifies all vertex types in the six-vertex model.
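The definitions of tethered paths and deviation can be sketched directly. The encoding of a path as a string of `"E"` and `"N"` steps is an assumption made here for illustration.

```python
def path_vertices(steps):
    """Vertices visited by a north-east lattice path starting at (0, 0)."""
    x = y = 0
    vertices = [(0, 0)]
    for s in steps:
        if s == "E":
            x += 1
        elif s == "N":
            y += 1
        else:
            raise ValueError("steps must be 'E' or 'N'")
        vertices.append((x, y))
    return vertices

def is_tethered(steps):
    """A path of length 2n is tethered if it ends at (n, n)."""
    return steps.count("E") == steps.count("N")

def deviation(steps):
    """Maximum l1 distance between (x_i, y_i) and (i/2, i/2) along the path."""
    return max(abs(x - i / 2) + abs(y - i / 2)
               for i, (x, y) in enumerate(path_vertices(steps)))

print(deviation("ENEN"), deviation("EENN"))
```

The staircase path `"ENEN"` stays as close to the diagonal as possible, while `"EENN"` runs along the boundary of the $2\times 2$ box and has a larger deviation.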

We consider the following independent paths boundary condition for an $n\times n$ six-vertex model for the rest of the section. To construct this boundary condition, we consider its lattice path interpretation. First, place a tethered path $\gamma_{0}$ that enters $(0,0)$ horizontally and exits $(n,n)$ horizontally. Next, place $2\ell=2\lfloor n^{1/8}\rfloor$ translated tethered paths of varying length above and below the main diagonal, each separated from its neighbors by distance $d=\lfloor 32n^{3/4}\rfloor$ . Specifically, the paths $\gamma_{1},\gamma_{2},\dots,\gamma_{\ell}$ below the main diagonal begin at the vertices $(d,0),(2d,0),\dots,(\ell d,0)$ and end at the vertices $(n,n-d),(n,n-2d),\dots,(n,n-\ell d)$ , respectively. The paths $\gamma_{-1},\gamma_{-2},\dots,\gamma_{-\ell}$ above the main diagonal begin at $(0,d),(0,2d),\dots,(0,\ell d)$ and end at $(n-d,n),(n-2d,n),\dots,(n-\ell d,n)$ . The deviation of a translated tethered path is the deviation of the same path starting at $(0,0)$ . To complete the boundary condition, we force the paths below the main diagonal to enter vertically and exit horizontally. Symmetrically, we force the paths above the main diagonal to enter horizontally and exit vertically. See Figure 4(a) for an illustration of the construction when all paths have small deviation.
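The endpoints of the planted paths follow mechanically from $n$. The sketch below computes them with exact integer roots to avoid floating-point error in $\lfloor n^{1/8}\rfloor$ and $\lfloor 32n^{3/4}\rfloor$. The helper names and the dictionary layout are illustrative assumptions, and the entry and exit directions of the paths are not modeled.

```python
def integer_root(value, k):
    """Largest integer r with r**k <= value, computed without floating point."""
    lo, hi = 0, 1
    while hi ** k <= value:
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** k <= value:
            lo = mid
        else:
            hi = mid
    return lo

def independent_paths_boundary(n):
    """Return (ell, d, paths), where paths maps index k to (start, end)."""
    ell = integer_root(n, 8)               # floor(n^(1/8))
    d = integer_root(32 ** 4 * n ** 3, 4)  # floor(32 * n^(3/4))
    paths = {0: ((0, 0), (n, n))}
    for k in range(1, ell + 1):
        paths[k] = ((k * d, 0), (n, n - k * d))    # below the main diagonal
        paths[-k] = ((0, k * d), (n - k * d, n))   # above the main diagonal
    return ell, d, paths

ell, d, paths = independent_paths_boundary(2 ** 48)
print(ell, d, len(paths), ell * d < 2 ** 48)
```

Because $\ell d \approx 32n^{7/8}$, the outermost start points lie inside the grid only when $n$ is very large (roughly $n > 2^{40}$), so the construction should be read asymptotically.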

Next, we construct an asymmetric cut in the state space induced by this boundary condition in terms of its internal lattice paths. In particular, we analyze a set $S$ of configurations such that every path in a configuration has small deviation. Formally, we let

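$$S \overset{\text{def}}{=} \{ x \in \Omega : \text{the deviation of each path in } x \text{ is less than } 8n^{3/4} \}.$$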

Observe that by our choice of separation distance $d=\lfloor 32n^{3/4}\rfloor$ and the deviation limit for $S$ , no paths in any configuration of $S$ intersect. It follows that the partition function for $S$ factors into a product of $2\ell+1$ partition functions, one for each path with bounded deviation. This intuition is useful when analyzing the conductance $\Phi(S)$ as an escape probability from stationarity.

3.2 Lattice Paths as Correlated Random Walks

Now we weight the internal paths according to the parameters of the six-vertex model defined in the beginning of Section 3. The main result in this subsection is Lemma 3.1, which states that random tethered paths are exponentially unlikely to deviate past $\omega(n^{1/2})$ , even if drawn from a Boltzmann distribution that favors straights. Start by defining $\Gamma(\mu,n)$ to be the distribution over tethered paths of length $2n$ with the property that

$$\Pr(\gamma) \propto \mu^{(\#\text{ of straights in } \gamma)}.$$
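For a single tethered path in isolation, a corner-flipping Metropolis chain whose stationary distribution is proportional to $\mu^{\#\text{straights}}$ can be sketched as follows. This is a simplified illustration and not the full Glauber dynamics of the paper: it ignores neighboring paths and the forced entry and exit directions, and it counts straights only at interior vertices of the path.

```python
import random

def count_straights(steps):
    """Number of interior vertices at which the path does not turn."""
    return sum(1 for s, t in zip(steps, steps[1:]) if s == t)

def weight(steps, mu):
    """Unnormalized weight mu^(# of straights)."""
    return mu ** count_straights(steps)

def corner_flip_step(steps, mu, rng):
    """One Metropolis move: pick an interior vertex and try to flip a corner."""
    i = rng.randrange(len(steps) - 1)
    if steps[i] == steps[i + 1]:
        return steps  # a straight vertex cannot be flipped
    proposal = steps[:i] + [steps[i + 1], steps[i]] + steps[i + 2:]
    accept = min(1.0, weight(proposal, mu) / weight(steps, mu))
    return proposal if rng.random() < accept else steps

rng = random.Random(0)
path = list("EN" * 8)  # staircase path of length 16
for _ in range(5000):
    path = corner_flip_step(path, 3.0, rng)
print("".join(path), count_straights(path))
```

Swapping two adjacent, distinct steps preserves the endpoints, so the path stays tethered. The proposal is symmetric, so the Metropolis acceptance rule yields the target distribution.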

Lemma 3.1.

Let $\mu,\varepsilon>0$ and $m=o(n)$. For $n$ sufficiently large and $\gamma\sim\Gamma(\mu,n)$, we have

$$\Pr(\gamma \text{ deviates by at least } 2m) \leq e^{-(1-\varepsilon)\frac{m^{2}}{\mu n}}.$$

Before giving the proof of Lemma 3.1, we first introduce the concept of correlated random walks. Then we present three prerequisite results about correlated random walks and briefly explain their connection to the deviation of biased tethered paths. Our goal here is to show how the supporting lemmas interact prior to the proof of Lemma 3.1.

A key idea in our analysis of the ferroelectric phase is the notion of a correlated random walk, which generalizes a simple symmetric random walk by accounting for momentum. A correlated random walk with momentum parameter $p\in[0,1]$ starts at the origin and is defined as follows. Let $X_{1}$ be a uniform random variable with support $\{-1,1\}$ . For all subsequent steps $i\geq 2$ , the direction of the process is correlated with the direction of the previous step and satisfies

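$$X_{i}=\begin{cases}X_{i-1}&\text{with probability }p,\\ -X_{i-1}&\text{with probability }1-p.\end{cases}$$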

We denote the position of the walk at time $t$ by $S_{t}=\sum_{i=1}^{t}X_{i}$ . It will often be useful to make the change of variables $p=\mu/(1+\mu)$ when analyzing the six-vertex model, where $\mu>0$ is the weight of a straight vertex. In many cases this also leads to cleaner expressions. We use the following probability mass function (PMF) for the position of a correlated random walk to develop our new tail inequality (Lemma 3.5), which holds for all values of $p$ .
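To make the definition concrete, here is a minimal simulation sketch. It assumes the natural reading of the definition, namely that each step after the first repeats the previous direction with probability $p$ and reverses it with probability $1-p$; the function names are hypothetical.

```python
import random


def momentum_from_straight_weight(mu: float) -> float:
    """Change of variables p = mu / (1 + mu) for a straight-vertex weight mu > 0."""
    return mu / (1 + mu)


def correlated_random_walk(steps: int, p: float, rng: random.Random):
    """Return the positions S_0, S_1, ..., S_steps of one sampled walk."""
    positions = [0]
    direction = rng.choice([-1, 1])        # X_1 is uniform on {-1, 1}
    for i in range(1, steps + 1):
        if i >= 2 and rng.random() >= p:   # reverse with probability 1 - p
            direction = -direction
        positions.append(positions[-1] + direction)
    return positions
```

The two extremes are a useful sanity check: with $p=1$ the walk never turns, so $|S_{t}|=t$, and with $p=0$ it alternates, so it returns to the origin at every even time.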

Lemma 3.2 ([HF98]).

For any $n\geq 1$ and $m\geq 0$ , the PMF of a correlated random walk is

[TABLE]

Now that we have defined correlated random walks, we proceed by observing that there is a natural measure-preserving bijection between biased tethered paths of length $2n$ and correlated random walks of length $2n$ that return to the origin. To see this, observe that every vertical edge in the tethered path corresponds to a step to the right in the correlated random walk (i.e., $X_{i}=1$ ), and every horizontal edge in the tethered path corresponds to a step to the left in the correlated random walk (i.e., $X_{i}=-1$ ). Concretely, for a correlated random walk $(S_{0},S_{1},\dots,S_{2n})$ parameterized by $p=\mu/(1+\mu)$ , we have

[TABLE]
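The edge-to-step correspondence can be sketched as follows. This is an illustration under stated assumptions: paths are encoded as strings over `V` and `H` (a hypothetical encoding), and only the interior steps are weighted, so any boundary factors of the tethered path are ignored. Under the change of variables $p=\mu/(1+\mu)$ , a walk with $s$ repeated directions among its $2n-1$ transitions has probability $\frac{1}{2}p^{s}(1-p)^{2n-1-s}$ , which is proportional to $\mu^{s}$ .

```python
from fractions import Fraction


def path_to_walk(edges: str):
    """Map each vertical edge 'V' to a +1 step and each horizontal edge 'H' to a -1 step."""
    step_of = {"V": 1, "H": -1}
    steps = [step_of[e] for e in edges]
    positions = [0]
    for s in steps:
        positions.append(positions[-1] + s)
    return steps, positions


def walk_probability(steps, p):
    """Probability of a fixed step sequence under a correlated random walk."""
    prob = Fraction(1, 2)                       # the first step is uniform
    for prev, cur in zip(steps, steps[1:]):
        prob *= p if cur == prev else 1 - p     # repeat w.p. p, reverse w.p. 1 - p
    return prob
```

For example, `HVVH` has one more repeated direction than `HVHV`, so the ratio of their probabilities is exactly $\mu$ .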

The first prerequisite lemma we present is an asymptotic equality that generalizes the return probability of simple symmetric random walks. This allows us to relax the condition in Equation 1 where the correlated random walk must return to the origin, and instead we bound $\Pr\left(\max_{i=0..2n}|S_{i}|\geq 2m\right)$ at the expense of a polynomial factor.

Lemma 3.3 ([Gil55]).

For any constant $\mu>0$ , the return probability of a correlated random walk is

[TABLE]

The second result that we need in order to prove Lemma 3.1 is that the PMF for correlated random walks is monotone.

Lemma 3.4.

For any momentum parameter $p\in(0,1)$ and $n$ sufficiently large, the probability of the position of a correlated random walk is monotone. Concretely, for $m\in\{0,1,\dots,n-1\}$ , we have

[TABLE]

Proof.

We consider the cases $m=n-1$ and $m\in\{0,1,2,\dots,n-2\}$ separately. Using Lemma 3.2, the probability mass function for the position of a correlated random walk is

[TABLE]

If $m=n-1$ , then we have the equations

[TABLE]

Therefore, we have $\Pr\left(S_{2n}=2m\right)\geq\Pr\left(S_{2n}=2(m+1)\right)$ for all

[TABLE]

Now we assume that $m\in\{0,1,2,\dots,n-2\}$ . Writing $\Pr\left(S_{2n}=2m\right)-\Pr\left(S_{2n}=2(m+1)\right)$ as a difference of sums and matching the corresponding terms, it is instead sufficient to show for all values of $k\in\{1,2,\dots,n-(m+1)\}$ , we have

[TABLE]

Next, rewrite the binomial coefficients as

[TABLE]

Therefore, it remains to show that

[TABLE]

Since all of the values in $\{n+m,n+m-(k-1),n-m-k,n-m-1\}$ are positive for any choice of $m$ and $k$ , it is equivalent to show that

[TABLE]

Observing that

[TABLE]

completes the proof. ∎
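The monotonicity claim can be checked numerically for small cases. The sketch below does not use the closed-form PMF of Lemma 3.2; instead it computes the exact distribution of $S_{2n}$ by dynamic programming over (position, last step), assuming each step after the first repeats the previous direction with probability $p$ . The function names are hypothetical.

```python
from fractions import Fraction


def position_pmf(steps: int, p: Fraction):
    """Exact distribution of S_steps (steps >= 1) as a dict {position: probability}."""
    # state[(position, last_step)] = probability after the first step
    state = {(1, 1): Fraction(1, 2), (-1, -1): Fraction(1, 2)}
    for _ in range(steps - 1):
        nxt = {}
        for (pos, last), prob in state.items():
            for step, q in ((last, p), (-last, 1 - p)):
                key = (pos + step, step)
                nxt[key] = nxt.get(key, 0) + prob * q
        state = nxt
    pmf = {}
    for (pos, _), prob in state.items():
        pmf[pos] = pmf.get(pos, 0) + prob
    return pmf


def is_monotone(n: int, p: Fraction) -> bool:
    """Check Pr(S_2n = 2m) >= Pr(S_2n = 2(m+1)) for m = 0, ..., n-1."""
    pmf = position_pmf(2 * n, p)
    return all(pmf.get(2 * m, 0) >= pmf.get(2 * (m + 1), 0) for m in range(n))
```

For instance, with $p=2/3$ and $2n=4$ steps the exact values are $\Pr(S_{4}=4)=4/27$ and $\Pr(S_{4}=2)=2/9$ .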

The third result we need is an upper bound for the position of a correlated random walk. We fully develop this inequality in Section 5 by analyzing the asymptotic behavior of the PMF in Lemma 3.2. We note that Lemma 3.5 shows exactly how the tail behavior of simple symmetric random walks generalizes to correlated random walks as a function of $\mu$ .

Lemma 3.5.

Let $\mu,\varepsilon>0$ and $m=o(n)$ . For $n$ sufficiently large, a correlated random walk satisfies

[TABLE]

Now that we have established these supporting lemmas, we are prepared to complete the proof of Lemma 3.1, which also heavily relies on union bounds and relaxing conditional probabilities.

Proof of Lemma 3.1.

Using the measure-preserving bijection between tethered paths of length $2n$ and correlated random walks of length $2n$ (Section 3.2) along with the definition of conditional probability and Lemma 3.3, we have

[TABLE]

where the last inequality uses the definition of asymptotic equality with $\varepsilon=1/2$ . Next, a union bound and the symmetry of correlated random walks imply that

[TABLE]

Now we focus on the probability that the maximum position of the walk is at least $2m$ . For this event to be true, the walk must reach $2m$ at some time $i\in\{0,1,2,\dots,2n\}$ , so by a union bound,

[TABLE]

The second inequality takes into account the parity of the random walk, the fact that if $i=0$ the walk can only be at position $0$ , and the relaxed condition that the final position is at least $2m$ . Lemma 3.4 implies that the distribution is unimodal on its support centered at the origin for sufficiently large $n$ . Moreover, for walks of the same parity with increasing length and a fixed tail threshold, the probability of the tail is nondecreasing. Combining these two observations, we have

[TABLE]

Using the chain of previous inequalities and the upper bound for $\Pr\left(S_{2n}=2m\right)$ in Lemma 3.5 with the smaller error $\varepsilon/2$ , it follows that

[TABLE]

which completes the proof. ∎

3.3 Bounding the Conductance and Mixing Time

Next, we bound the conductance of the Markov chain by viewing $\Phi(S)$ as an escape probability. We start by claiming that $\pi(S)\leq 1/2$ (as required by the definition of conductance) if and only if the parameters are in the ferroelectric phase. Then we use the correspondence between tethered paths and correlated random walks (i.e., Section 3.2) to prove that $\Phi(S)$ is exponentially small.

Lemma 3.6.

Let $\mu>0$ and $\lambda>1+\mu$ be constants. For $n$ sufficiently large, $\pi\left(S\right)\leq 1/2$ .

Proof.

We start by upper bounding $\pi(S)$ in terms of the partition function $Z$ . No paths in any state of $S$ deviate by more than $2n^{3/4}$ by the definition of $S$ . Moreover, since adjacent paths are separated by distance $d=\lfloor 32n^{3/4}\rfloor$ , no two can intersect (Figure 4(a)). Therefore, it follows that the paths are independent of each other, which is convenient because it allows us to implicitly factor the generating function for configurations in $S$ .

Next, observe that an upper bound for the generating function of any single path is $\left(1+\mu\right)^{2n+1}$ . This is true because all paths have length at most $2n$ , and we introduce an additional $(1+\mu)^{2}$ factor to account for boundary conditions. Since all the paths are independent and $\ell=\lfloor n^{1/8}\rfloor$ , we have

[TABLE]

Now we lower bound the partition function $Z$ of the entire model by considering the weight of the ferroelectric ground state (Figure 4(b)). Recall that we labeled the $\ell$ paths below the main diagonal path $\gamma_{1},\gamma_{2},\dots,\gamma_{\ell}$ such that $\gamma_{\ell}$ is farthest from the main diagonal. Let $c\leq 10$ be a constant that accounts for subtle misalignments between adjacent paths. It follows that each path $\gamma_{k}$ uniquely corresponds to at least $n-(2kd+k+c)$ intersections. Using the last path $\gamma_{\ell}$ as a lower bound for the number of intersections that each path contributes and accounting also for the paths above the main diagonal, it follows that there are at least

[TABLE]

intersections in the ground state.

Similarly, we bound the number of straights that each path $\gamma_{k}$ contributes. Note that we may also need an upper bound for this quantity in order to lower bound the partition function since it is possible that $0<\mu<1$ . The number of straights in $\gamma_{k}$ is $2(kd\pm c)$ , and $\gamma_{0}$ has two straights on the boundary. Therefore, the total number of straights in the ground configuration is

[TABLE]

Since intersections are weighted by $\lambda^{2}$ and straights by $\mu$ in our reparameterized model, by considering the ground state and using the previous enumerations, it follows that

[TABLE]

Combining these inequalities allows us to upper bound the probability mass of the cut $\pi(S)$ by

[TABLE]

Using the assumption that $\lambda>1+\mu$ , we have $\pi(S)\leq 1/2$ for $n$ sufficiently large, as desired. ∎

Our analysis of the escape probability from $S$ critically relies on the fact that paths in any state $x\in S$ are non-intersecting. Combinatorially, we exploit the factorization of the generating function for states in $S$ as a product of $2\ell+1$ independent path generating functions.

Lemma 3.7.

Let $\mu,\varepsilon>0$ be constants. For $n$ sufficiently large, $\Phi(S)\leq e^{-(1-\varepsilon)\mu^{-1}n^{1/2}}$ .

Proof.

The conductance $\Phi(S)$ can be understood as the following escape probability. Sample a state $x\in S$ from the stationary distribution $\pi$ conditioned on $x\in S$ , and run the Markov chain from $x$ for one step to get a neighboring state $y$ . The definition of conductance implies that $\Phi(S)$ is the probability that $y\not\in S$ . Using this interpretation, we can upper bound $\Phi(S)$ by the probability mass of states that are near the boundary of $S$ in the state space, since the process must escape in one step. Therefore, it follows from the independent paths boundary condition and the definition of $S$ that

[TABLE]

Next, we use a union bound over the $2\ell+1$ different paths in a configuration and consider the event that a particular path $\gamma_{k}$ deviates by at least $4n^{3/4}$ . Because all of the paths in $S$ are independent, we only need to consider the behavior of $\gamma_{k}$ in isolation. This allows us to rephrase the conditional event. Relaxing the conditional probability of each term in the sum gives

[TABLE]

For large enough $n$ , the length of every path $\gamma_{k}$ is in the range $[n,2n]$ since we eventually have the inequality $n-\ell d\geq n/2$ . Therefore, we can apply Lemma 3.1 with the error $\varepsilon/2$ to each term and use the universal upper bound

[TABLE]

It follows from the union bound and previous inequality that the conductance $\Phi(S)$ is bounded by

[TABLE]

which completes the proof. ∎
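The escape-probability view of conductance used in this proof can be written out for any small finite chain. The following is a generic sketch and is not specific to the six-vertex model: `P` is a row-stochastic transition matrix, `pi` its stationary distribution, and `S` a set of state indices, all of which are illustrative inputs.

```python
from fractions import Fraction


def conductance(P, pi, S):
    """Phi(S): probability of leaving S in one step, starting from pi conditioned on S."""
    S = set(S)
    mass = sum(pi[x] for x in S)                      # pi(S)
    flow = sum(pi[x] * P[x][y]
               for x in S
               for y in range(len(pi)) if y not in S)  # stationary flow out of S
    return flow / mass


# Usage: a two-state chain that leaves state 0 w.p. a and state 1 w.p. b.
a, b = Fraction(1, 10), Fraction(1, 5)
P = [[1 - a, a], [b, 1 - b]]
pi = [b / (a + b), a / (a + b)]
escape_from_zero = conductance(P, pi, {0})  # equals a
```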

Now that we have constructed a cut in the state space with exponentially small conductance, we can obtain a bound on the mixing time when the probability mass is properly distributed.

Theorem 3.8.

Let $\mu,\varepsilon>0$ and $\lambda>1+\mu$ . For $n$ sufficiently large, $\tau\left(1/4\right)\geq e^{(1-\varepsilon)\mu^{-1}n^{1/2}}$ .

Proof.

Since $\pi(S)\leq 1/2$ by Lemma 3.6, we have $\Phi^{*}\leq\Phi(S)$ . The proof follows from Theorem 2.1 and the conductance bound in Lemma 3.7 with a smaller error $\varepsilon/2$ . ∎

Last, we restate our main theorem and use Theorem 3.8 to show that Glauber dynamics for the six-vertex model can be slow mixing for all parameters in the ferroelectric phase.

See 1.1

Proof.

Without loss of generality, we have reparameterized the model so that $a=\lambda$ , $b=\mu$ , and $c=1$ . Therefore, Glauber dynamics with the independent paths boundary condition is slow mixing if $a>b+c$ by Theorem 3.8. Since the rotational invariance of the six-vertex model implies that $a$ and $b$ are interchangeable parameters, this mixing time result also holds in the case $b>a+c$ . ∎

4 Slow Mixing in the Antiferroelectric Phase

Glauber dynamics can also be slowly mixing in the antiferroelectric phase, but for substantially different reasons than in the ferroelectric phase. In the antiferroelectric phase, Boltzmann weights satisfy $a+b<c$ , so configurations tend to favor corner (i.e., type- $c$ ) vertices. The main insight behind our slow mixing proof is that when $c$ is sufficiently large, the six-vertex model can behave like the low-temperature hardcore model on $\mathbb{Z}^{2}$ where configurations predominantly agree with one of two ground states. Liu recently formalized this argument in [Liu18] and showed that Glauber dynamics for the six-vertex model with free boundary conditions requires exponential time when $\mu\max(a,b)<c$ , where $\mu\leq 2.639$ is the connective constant of self-avoiding walks on the square lattice [GC01]. His proof uses a Peierls argument based on topological obstructions introduced by Randall [Ran06] in the context of independent sets. In this section, we extend Liu’s result to the region depicted in Figure 3(c) by computing a closed-form multivariate generating function that upper bounds the number of self-avoiding walks and better accounts for disparities in their Boltzmann weights induced by the parameters of the six-vertex model.

4.1 Topological Obstruction Framework

We start with a recap of the definitions and framework laid out in [Liu18]. There are two ground states in the antiferroelectric phase such that every interior vertex is a corner: $x_{\textnormal{R}}$ (Figure 5(a)) and $x_{\textnormal{G}}$ (Figure 5(b)). These configurations are edge reversals of each other, so for any state $x\in\Omega$ we can color its edges red if they are oriented as in $x_{\textnormal{R}}$ or green if they are oriented as in $x_{\textnormal{G}}$ . See Figure 5(c) for an example of how a configuration is colored. It follows from case analysis of the six vertex types in Figure 1 that the number of red edges incident to any internal vertex is even, and if there are only two red edges then they must be rotationally adjacent to each other. The same property holds for green edges by symmetry. Note that the four edges bounding a cell of the lattice are monochromatic if and only if they are oriented cyclically, and thus reversible by Glauber dynamics. We say that a simple path from a horizontal edge on the left boundary of $\Lambda_{n}$ to a horizontal edge on the right boundary is a red horizontal bridge if it contains only red edges. We define green horizontal bridges and monochromatic vertical bridges similarly. A configuration has a red cross if it contains both a red horizontal bridge and a red vertical bridge. Likewise, we can define a green cross. Let $C_{\textnormal{R}}\subseteq\Omega$ be the set of all states with a red cross, and let $C_{\textnormal{G}}\subseteq\Omega$ be the set of all states with a green cross. It follows from Lemma 4.1 that $C_{\textnormal{R}}\cap C_{\textnormal{G}}=\emptyset$ .

Next, we define the dual lattice $L_{n}$ to describe configurations in $\Omega\setminus(C_{\textnormal{R}}\cup C_{\textnormal{G}})$ . The vertices of $L_{n}$ are the centers of the cells in $\Lambda_{n}$ , including the cells on the boundary that are partially enclosed, and we connect dual vertices by an edge if their corresponding cells are diagonally adjacent. Note that $L_{n}$ is a union of two disjoint graphs (Figure 6(a)). For any state $x\in\Omega$ there is a corresponding dual subgraph $L_{x}$ defined as follows: for each interior vertex $v$ in $\Lambda_{n}$ , if $v$ is incident to two red edges and two green edges, then $L_{x}$ contains the dual edge passing through $v$ that separates the two red edges from the two green edges. This construction is well-defined because the red edges are rotationally adjacent. See Figure 6(b) for an example of a dual configuration. For any $x\in\Omega$ , we say that $x$ has a horizontal fault line if $L_{x}$ contains a simple path from a left dual boundary vertex to a right dual boundary vertex. We define vertical fault lines similarly and let $C_{\textnormal{FL}}\subseteq\Omega$ be the set of all states containing a horizontal or vertical fault line. Fault lines completely separate red and green edges, and hence are topological obstructions that prohibit monochromatic bridges.

Last, we extend the notion of fault lines to almost fault lines. We say that $x\in\Omega$ has a horizontal almost fault line if there is a simple path in $L_{n}$ connecting a left dual boundary vertex to a right dual boundary vertex such that all edges except for one are in $L_{x}$ . We define vertical almost fault lines similarly and let the set $C_{\textnormal{AFL}}\subseteq\Omega$ denote all states containing an almost fault line. Finally, let $\partial C_{\textnormal{R}}\subseteq\Omega$ denote the set of states not in $C_{\textnormal{R}}$ that are one move away from $C_{\textnormal{R}}$ in the state space according to the Glauber dynamics.

Lemma 4.1 ([Liu18]).

We can partition the state space into $\Omega=C_{\textnormal{R}}\cup C_{\textnormal{FL}}\cup C_{\textnormal{G}}$ . Furthermore, we have $\partial C_{\textnormal{R}}\subseteq C_{\textnormal{FL}}\cup C_{\textnormal{AFL}}$ .

4.2 Bounding the Mixing Time with a Peierls Argument

In this subsection we show that $\pi(C_{\textnormal{FL}}\cup C_{\textnormal{AFL}})$ is an exponentially small bottleneck in the state space $\Omega$ . The analysis relies on Lemma 4.1 and a new multivariate upper bound for weighted self-avoiding walks (Lemma 4.2). Our key observation is that when a fault line changes direction, the vertices in its path change from type- $a$ to type- $b$ or vice versa. Therefore, our goal in this subsection is to generalize the trivial $3^{n-1}$ upper bound for the number of self-avoiding walks by accounting for their changes in direction in aggregate. We achieve this by using generating functions to solve a system of linear recurrence relations.

We start by encoding non-backtracking walks that start from the origin and take their first step northward using the characters in $\{\texttt{S},\texttt{L},\texttt{R}\}$ , representing straight, left, and right steps. For example, the walk SLRSSL corresponds to the sequence $((0,0),(0,1),(-1,1),(-1,2),(-1,3),(-1,4),(-2,4))$ . If a fault line is the same shape as SLRSSL up to a rotation about the origin, then there are only two possible sequences of vertex types through which it can pass: $abaaab$ and $babbba$ . This follows from the fact that once the first vertex type is determined, only turns in the self-avoiding walk (i.e., the L and R characters) cause the vertex type to switch. We define the weight of a fault line to be the product of the vertex types through which it passes. More generally, we define the weight of a non-backtracking walk that initially passes through a fixed vertex type to be the product of the induced vertex types according to the rule that turns toggle the current type. Formally, we let the function $g_{a}(\gamma):\{\texttt{S}\}\times\{\texttt{S},\texttt{L},\texttt{R}\}^{n-1}\rightarrow\mathbb{R}_{\geq 0}$ denote the weight of a non-backtracking walk $\gamma$ that starts by crossing a type- $a$ vertex. We define the function $g_{b}(\gamma)$ similarly and provide the examples $g_{a}(\texttt{SLRSSL})=a^{4}b^{2}$ and $g_{b}(\texttt{SLRSSL})=a^{2}b^{4}$ for clarity. Last, observe that a sequence of vertex types can have many different walks in its preimage. The non-backtracking walk SRRSSR also maps to $abaaab$ and $babbba$ —in fact, there are $2^{3}=8$ such walks in this example since we can interchange L and R characters.
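The encoding above is easy to mechanize. The following sketch (function names are hypothetical) decodes a word over $\{\texttt{S},\texttt{L},\texttt{R}\}$ into lattice points, derives the induced sequence of vertex types under the rule that turns toggle the current type, and evaluates the weight for given numeric values of $a$ and $b$ .

```python
def decode_walk(word: str):
    """Lattice points visited by a walk from the origin whose first step is northward."""
    assert word and word[0] == "S"
    x, y = 0, 0
    dx, dy = 0, 1                      # initially facing north
    points = [(x, y)]
    for ch in word:
        if ch == "L":
            dx, dy = -dy, dx           # rotate 90 degrees counterclockwise
        elif ch == "R":
            dx, dy = dy, -dx           # rotate 90 degrees clockwise
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def vertex_types(word: str, first: str) -> str:
    """Sequence of vertex types crossed, starting at type `first` ('a' or 'b')."""
    other = {"a": "b", "b": "a"}
    current, types = first, []
    for ch in word:
        if ch in "LR":                 # only turns toggle the vertex type
            current = other[current]
        types.append(current)
    return "".join(types)


def walk_weight(word: str, first: str, a, b):
    """Product of the vertex weights along the walk, i.e. g_a or g_b."""
    types = vertex_types(word, first)
    return a ** types.count("a") * b ** types.count("b")
```

With $a=2$ and $b=3$ , for instance, `walk_weight("SLRSSL", "a", 2, 3)` evaluates $a^{4}b^{2}=144$ .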

The idea of enumerating the preimages of a binary string corresponding to a sequence of vertex types suggests a recursive approach for computing the sum of weighted non-backtracking walks. This naturally leads to the use of generating functions, so we overload the variables $x$ and $y$ to also denote function arguments. For a nonempty binary string $s\in\{0,1\}^{n}$ , let $h(s)$ count the number of pairs of adjacent characters that are not equal and let $|s|$ denote the number of ones in $s$ (e.g., if $s=010001$ then $h(s)=3$ and $|s|=2$ ). The sum of weighted self-avoiding walks is upper bounded by the sum of weighted non-backtracking walks, so we proceed by analyzing the following function:

$$F_{n}(x,y)=\sum_{s\in\{0,1\}^{n}}2^{h(s)}x^{n-|s|}y^{|s|}.\tag{2}$$

Note that $F_{n}(1,1)=2\cdot 3^{n-1}$ recovers the number of non-backtracking walks that initially cross type- $a$ or type- $b$ vertices.
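The grouping of walks by their vertex-type sequence can be checked by brute force for small $n$ . The sketch below assumes that a $0$ bit stands for a type- $a$ vertex with weight $x$ and a $1$ bit for a type- $b$ vertex with weight $y$ , and that each unequal adjacent pair of bits has two preimages (an L or an R turn). It computes the sum once over binary strings and once over explicit walks; the function names are hypothetical.

```python
from itertools import product


def count_changes(s) -> int:
    """h(s): number of adjacent positions whose characters differ."""
    return sum(1 for u, v in zip(s, s[1:]) if u != v)


def F_by_strings(n: int, x, y):
    """Sum over binary strings s of 2^h(s) * x^(number of zeros) * y^(number of ones)."""
    total = 0
    for s in product((0, 1), repeat=n):
        ones = sum(s)
        total += 2 ** count_changes(s) * x ** (n - ones) * y ** ones
    return total


def F_by_walks(n: int, x, y):
    """Sum of walk weights over all words S{S,L,R}^(n-1) and both starting types."""
    total = 0
    for tail in product("SLR", repeat=n - 1):
        for first_bit in (0, 1):
            bit, weight = first_bit, 1
            for ch in ("S",) + tail:
                if ch in "LR":
                    bit ^= 1            # a turn toggles the vertex type
                weight *= y if bit else x
            total += weight
    return total
```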

In the next section, we compute the closed-form solution for $F_{n}(x,y)$ by diagonalizing a matrix corresponding to the system of recurrence relations, which allows us to accurately quantify the discrepancy between fault lines when the Boltzmann weights $a$ and $b$ differ. For now, we use the following upper bound for $F_{n}(x,y)$ in our Peierls argument and defer its proof to Section 4.3.

Lemma 4.2.

Let $F_{n}(x,y)$ be the generating function for weighted non-backtracking walks defined in Equation 2. For any integer $n\geq 1$ and $x,y\in\mathbb{R}_{>0}$ , we have

[TABLE]

The first step of our Peierls argument is to upper bound $\pi(C_{\textnormal{FL}}\cup C_{\textnormal{AFL}})$ , which then gives us a bound on the conductance and allows us to prove Theorem 1.2. We start by defining the subset of antiferroelectric parameters that cause $F_{n}(a/c,b/c)$ to decrease exponentially fast.

Lemma 4.3.

If $(a,b,c)\in\mathbb{R}_{>0}^{3}$ is antiferroelectric and $3ab+ac+bc<c^{2}$ , then

[TABLE]

Proof.

Let $x=a/c$ and $y=b/c$ , and observe that $0<x<1$ by the antiferroelectric assumption. It follows from our hypothesis that $y<(1-x)/(1+3x)$ . Therefore, we have

[TABLE]

which completes the proof. ∎

Lemma 4.4.

If $(a,b,c)\in\mathbb{R}_{>0}^{3}$ is antiferroelectric and $3ab+ac+bc<c^{2}$ , then for Glauber dynamics with free boundary conditions we have

[TABLE]

Proof.

For any self-avoiding walk $\gamma$ and dual vertices $s,t\in L_{n}$ on the boundary, let $\Omega_{\gamma,s,t}\subseteq\Omega$ be the set of states that contain $\gamma$ as a fault line or an almost fault line such that $\gamma$ starts at $s$ and ends at $t$ . Without loss of generality, assume that the (almost) fault line is vertical. Reversing the direction of all edges on the left side of $\gamma$ defines the injective map $f_{\gamma,s,t}:\Omega_{\gamma,s,t}\rightarrow\Omega\setminus\Omega_{\gamma,s,t}$ such that if $\gamma$ is a fault line of $x\in\Omega_{\gamma,s,t}$ , then the weight of its image $f_{\gamma,s,t}(x)$ is amplified by $c^{|\gamma|}/g_{a}(\gamma)$ or $c^{|\gamma|}/g_{b}(\gamma)$ . For an example of this injection, see Figure 6(c). Similarly, if $\gamma$ is an almost fault line, decompose $\gamma$ into subpaths $\gamma_{1}$ and $\gamma_{2}$ separated by a type- $c$ vertex such that $\gamma_{1}$ starts at $s$ and $\gamma_{2}$ ends at $t$ . In this case, the weight of the images of almost fault lines is amplified by a factor of $\min(a,b)/c\cdot c^{|\gamma_{1}|+|\gamma_{2}|}/(g_{\alpha}(\gamma_{1})g_{\beta}(\gamma_{2}))$ for some $(\alpha,\beta)\in\{a,b\}^{2}$ . Using the fact that $f_{\gamma,s,t}$ is injective and summing over the states containing $\gamma$ as a fault line and an almost fault line separately gives us

[TABLE]

where the sum is over all $\Theta(|\gamma|)$ decompositions of $\gamma$ into $\gamma_{1}$ and $\gamma_{2}$ .

Equipped with Equation 3 and Lemma 4.2, we use a union bound over all pairs of terminal vertices $(s,t)$ and fault line lengths $\ell$ to bound $\pi(C_{\textnormal{FL}}\cup C_{\textnormal{AFL}})$ in terms of the generating function for weighted non-backtracking walks $F_{\ell}(x,y)$ . Since the weights satisfy $3ab+ac+bc<c^{2}$ by hypothesis, it follows from Lemma 4.3 that

[TABLE]

Note that the convolutions in the first inequality generate all almost weighted non-backtracking walks. ∎

See 1.2

Proof of Theorem 1.2.

Let $\Omega_{\textnormal{MIDDLE}}=C_{\textnormal{FL}}\cup C_{\textnormal{AFL}}$ , $\Omega_{\textnormal{LEFT}}=C_{\textnormal{R}}\setminus\Omega_{\textnormal{MIDDLE}}$ , and $\Omega_{\textnormal{RIGHT}}=C_{\textnormal{G}}\setminus\Omega_{\textnormal{MIDDLE}}$ . It follows from Lemma 4.1 that $\Omega=\Omega_{\textnormal{LEFT}}\cup\Omega_{\textnormal{MIDDLE}}\cup\Omega_{\textnormal{RIGHT}}$ is a partition with the properties that $\partial\Omega_{\textnormal{LEFT}}\subseteq\Omega_{\textnormal{MIDDLE}}$ and $\pi(\Omega_{\textnormal{LEFT}})=\pi(\Omega_{\textnormal{RIGHT}})$ . Since the partition is symmetric, Lemma 4.4 implies that $1/4\leq\pi(\Omega_{\textnormal{LEFT}})\leq 1/2$ , for $n$ sufficiently large. Therefore, we can upper bound the conductance by $\Phi^{*}\leq\Phi\left(\Omega_{\textnormal{LEFT}}\right)\leq 4\pi\left(\Omega_{\textnormal{MIDDLE}}\right)$ . Using Theorem 2.1 along with Lemma 4.4 and Lemma 4.3 gives the desired mixing time bound. ∎

4.3 Weighted Non-Backtracking Walks

In this section we present a closed-form formula for the weighted non-backtracking walks generating function $F_{n}(x,y)$ , and we give the proof of Lemma 4.2. We start by decomposing the generating function $F_{n}(x,y)$ into two sums over disjoint sets of bit strings defined by their final character. Formally, for any $n\geq 1$ , let

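$$F_{n,0}(x,y)=\sum_{s\in\{0,1\}^{n}:\,s_{n}=0}2^{h(s)}x^{n-|s|}y^{|s|},\qquad F_{n,1}(x,y)=\sum_{s\in\{0,1\}^{n}:\,s_{n}=1}2^{h(s)}x^{n-|s|}y^{|s|}.$$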

First, note that $F_{n}(x,y)=F_{n,0}(x,y)+F_{n,1}(x,y)$ . Second, observe that by recording the final character of the bit strings, we can design a system of linear recurrences to account for the $2^{h(s)}$ term appearing in Equation 2, which counts the number of non-backtracking walks that map to a given sequence of vertex types.

Lemma 4.5.

For any integer $n\geq 1$ and $x,y\in\mathbb{R}_{>0}$ , we have the system of recurrence relations

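$$F_{n+1,0}(x,y)=x\,F_{n,0}(x,y)+2x\,F_{n,1}(x,y),\qquad F_{n+1,1}(x,y)=2y\,F_{n,0}(x,y)+y\,F_{n,1}(x,y),$$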

where the base cases are $F_{1,0}(x,y)=x$ and $F_{1,1}(x,y)=y$ .

Proof.

This immediately follows from the definitions of the functions $F_{n,0}(x,y)$ and $F_{n,1}(x,y)$ . ∎
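To see why the recurrences follow from the definitions, note that appending a character equal to the last one contributes no new factor of $2$ , while appending a different character contributes one. The sketch below iterates the recurrences in the form this reasoning suggests (an assumption about their exact form, namely $F_{n+1,0}=x(F_{n,0}+2F_{n,1})$ and $F_{n+1,1}=y(2F_{n,0}+F_{n,1})$ ) and compares the result against direct enumeration; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def F_recurrence(n: int, x, y):
    """F_n(x, y) via the linear recurrences with base cases F_{1,0} = x, F_{1,1} = y."""
    f0, f1 = x, y
    for _ in range(n - 1):
        # Appending 0 multiplies by x; appending 1 multiplies by y.
        # A change of character doubles the number of preimage walks.
        f0, f1 = x * (f0 + 2 * f1), y * (2 * f0 + f1)
    return f0 + f1


def F_bruteforce(n: int, x, y):
    """F_n(x, y) by summing 2^h(s) x^(zeros) y^(ones) over all binary strings."""
    total = 0
    for s in product((0, 1), repeat=n):
        changes = sum(1 for u, v in zip(s, s[1:]) if u != v)
        ones = sum(s)
        total += 2 ** changes * x ** (n - ones) * y ** ones
    return total
```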

Lemma 4.6.

For any integer $n\geq 1$ and $x,y\in\mathbb{R}_{>0}$ , define the values

$$m=\sqrt{x^{2}+14xy+y^{2}},\qquad\lambda_{1}=\frac{x+y-m}{2},\qquad\lambda_{2}=\frac{x+y+m}{2}.$$

The generating function $F_{n}(x,y)$ can be written in closed form as

[TABLE]

Proof.

For brevity, we let $F_{n,0}=F_{n,0}(x,y)$ and $F_{n,1}=F_{n,1}(x,y)$ . It follows from Lemma 4.5 that

[TABLE]

Next, observe that the recurrence matrix is diagonalizable. In particular, we have

[TABLE]

where

[TABLE]

Since the base cases are $F_{1,0}=x$ and $F_{1,1}=y$ , it follows that

[TABLE]

Using the fact $F_{n}(x,y)=F_{n,0}(x,y)+F_{n,1}(x,y)$ and simplifying the matrix equation above gives us

[TABLE]

as desired. ∎

See 4.2

Proof.

We start by using Lemma 4.6 to rewrite the closed-form solution of $F_{n}(x,y)$ as

[TABLE]

Next, we observe that the eigenvalue $\lambda_{1}$ satisfies $\lambda_{1}<0$ and $\lvert\lambda_{1}\rvert\leq\lambda_{2}$ . Since $(x+y)^{2}<m^{2}$ , it follows that $x+y-m=2\lambda_{1}<0$ . Furthermore, we have $2\lvert\lambda_{1}\rvert\leq\lvert x+y\rvert+\lvert-m\rvert=2\lambda_{2}$ by the triangle inequality. Together these two properties imply that

[TABLE]

Therefore, we can upper bound $F_{n}(x,y)$ by

[TABLE]

Since $x^{2}+6xy+y^{2}<m^{2}$ , we have the inequalities

[TABLE]

The result follows from the definition of $\lambda_{2}$ . ∎
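The eigenvalue properties used in this proof, and their link to the hypothesis $3ab+ac+bc<c^{2}$ of Lemma 4.3, can be checked numerically. The sketch below assumes the recurrence matrix is $\begin{pmatrix}x&2x\\ 2y&y\end{pmatrix}$ , so that $m^{2}=x^{2}+14xy+y^{2}$ ; this is consistent with the inequalities $(x+y)^{2}<m^{2}$ and $x^{2}+6xy+y^{2}<m^{2}$ used above, but it is an inference rather than a quotation. The function names are hypothetical.

```python
import math


def eigenvalues(x: float, y: float):
    """Eigenvalues (lambda_1, lambda_2) of [[x, 2x], [2y, y]] for x, y > 0."""
    m = math.sqrt(x * x + 14 * x * y + y * y)
    return (x + y - m) / 2, (x + y + m) / 2


def in_decay_region(a: float, b: float, c: float) -> bool:
    """Hypothesis of Lemma 4.3: 3ab + ac + bc < c^2."""
    return 3 * a * b + a * c + b * c < c * c
```

Solving $\lambda_{2}<1$ for $x=a/c$ and $y=b/c$ gives $m<2-(x+y)$ , and squaring yields $3xy+x+y<1$ , which is the same condition as $3ab+ac+bc<c^{2}$ .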

5 Tail Behavior of Correlated Random Walks

In this section we prove Lemma 3.5, which gives an exponentially small upper bound for the tail of a correlated random walk as a function of its momentum parameter $\mu$ . Our proof builds off of the PMF for the position of a correlated random walk restated below, which is combinatorial in nature and not readily amenable to tail inequalities. Specifically, the probability $\Pr\left(S_{2n}=2m\right)$ is a sum of marginals conditioned on the number of turns that the walk makes [RH81].

See 3.2

There are two main ideas in our approach to develop a more useful bound for the position of a correlated random walk $\Pr\left(S_{2n}=2m\right)$ . First, we construct a smooth function that upper bounds the marginals as a function of $x$ (a continuation of the number of turns in the walk $k$ ), and then we determine its maximum value. Next we show that the negative log of the maximum value is asymptotically equivalent to $m^{2}/(\mu n)$ for $m=o(n)$ , which gives us desirable bounds for sufficiently large values of $n$ . We note that our analysis illustrates precisely how correlated random walks generalize simple symmetric random walks and how the momentum parameter $\mu$ controls the exponential decay.

5.1 Upper Bounding the Marginal Probabilities

We start by using Stirling’s approximation to construct a smooth function that upper bounds the marginal terms in the sum of the PMF for correlated random walks. For $x\in(0,n-m)$ , let

[TABLE]

It can easily be checked that $f(x)$ is continuous on all of $[0,n-m]$ using the fact that $\lim_{x\rightarrow 0}x^{x}=1$ .

Lemma 5.1.

For any integer $m\geq 0$ , a correlated random walk satisfies

[TABLE]

Proof.

Consider the probability mass function for $\Pr\left(S_{2n}=2m\right)$ in Lemma 3.2. If $2m=2n$ the claim is clearly true, so we focus on the other case. We start by bounding the rightmost polynomial term in the sum. For all $n\geq 1$ , we have

[TABLE]

Next, we reparameterize the marginals in terms of $\mu$ , where $p=\mu/(1+\mu)$ , and use a more convenient upper bound for the binomial coefficients. Observe that

[TABLE]

Stirling’s approximation states that for all $n\geq 1$ we have

[TABLE]

so we can bound the products of binomial coefficients up to a polynomial factor by

[TABLE]

The proof follows from the definition of $f(x)$ given in Equation 4. ∎

There are polynomially-many marginal terms in the sum of the PMF, so if the maximum term is exponentially small, then the total probability is exponentially small. Since the marginal terms are bounded above by an expression involving $f(x)$ , we proceed by maximizing $f(x)$ on its support.

Lemma 5.2.

The function $f(x)$ is maximized at the critical point

[TABLE]

Proof.

We start by showing that $f(x)$ is log-concave on $(0,n-m)$ , which implies that it is unimodal. It follows that a local maximum of $f(x)$ is a global maximum. Since $n$ and $m$ are fixed as constants and because the numerator is positive, it is sufficient to show that

[TABLE]

is concave. Observe that the first derivative of $g(x)$ is

[TABLE]

and the second derivative is

[TABLE]

Because $g^{\prime\prime}(x)<0$ on $(0,n-m)$ , the function $f(x)$ is log-concave and hence unimodal.

To identify the critical points of $f(x)$ , it suffices to determine where $g^{\prime}(x)=0$ since $\log x$ is increasing. Using the previous expression for $g^{\prime}(x)$ , it follows that

[TABLE]

Therefore, the critical points are the solutions of $(n-x)^{2}-m^{2}=\mu^{2}x^{2}$ , so we have

[TABLE]

It remains and suffices to show that $x^{*}$ is a local maximum since $f(x)$ is unimodal. Observing that

[TABLE]

and differentiating $f(x)=\exp(\log f(x))$ using the chain rule, the definition of $x^{*}$ gives

[TABLE]

We know $f(x^{*})>0$ , so $f^{\prime\prime}(x^{*})$ has the same sign as $g^{\prime\prime}(x^{*})<0$ . Therefore, $x^{*}$ is a local maximum of $f(x)$ . Using the continuity of $f(x)$ on $[0,n-m]$ and log-concavity, $f(x^{*})$ is a global maximum. ∎

Remark 5.3.

It is worth noting that for $m=o(n)$ , the asymptotic behavior of the critical point is continuous as a function of $\mu>0$ . In particular, it follows from Lemma 5.2 that $x^{*}\sim n/(1+\mu)$ .
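The critical point can be computed from the equation $(n-x)^{2}-m^{2}=\mu^{2}x^{2}$ stated in the proof of Lemma 5.2. The sketch below is one way to do this and is not taken from the paper: it uses the algebraically equivalent form $x^{*}=(n^{2}-m^{2})/\bigl(n+\sqrt{\mu^{2}n^{2}+(1-\mu^{2})m^{2}}\bigr)$ , obtained by rationalizing the quadratic formula, which handles $\mu=1$ and $\mu\neq 1$ uniformly. The function name is hypothetical.

```python
import math


def critical_point(n: float, m: float, mu: float) -> float:
    """Root of (n - x)^2 - m^2 = mu^2 x^2 that lies in (0, n - m), for 0 <= m < n and mu > 0."""
    disc = math.sqrt(mu * mu * n * n + (1 - mu * mu) * m * m)
    # Equivalent to (n - disc) / (1 - mu^2) when mu != 1, and to
    # (n^2 - m^2) / (2n) when mu == 1, without dividing by 1 - mu^2.
    return (n * n - m * m) / (n + disc)
```

Setting $m=0$ gives $x^{*}=n/(1+\mu)$ exactly, which matches the asymptotic behavior noted in Remark 5.3.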

5.2 Asymptotic Behavior of the Maximum Log Marginal

Now that we have a formula for $x^{*}$ , and hence an expression for $f(x^{*})$ , we want to show that

[TABLE]

for some constant $c>0$ . Because there are polynomially-many marginals in the sum, this leads to an exponentially small upper bound for $\Pr\left(S_{2n}=2m\right)$ . Define the maximum log marginal to be

[TABLE]

Equivalently, we show that $h(n)\geq n^{c}$ for sufficiently large $n$ using asymptotic equivalences.

Lemma 5.4.

The maximum log marginal $h(n)$ can be symmetrically expressed as

[TABLE]

Proof.

Grouping the terms of $h(n)$ by factors of $n$ , $m$ and $x^{*}$ gives

[TABLE]

Using Equation 5, observe that the last term is

[TABLE]

The proof follows by grouping the terms of the desired expression by factors of $n$ and $m$ . ∎

The following lemma is the crux of our argument, as it presents an asymptotic equality for the maximum log marginal in the PMF for correlated random walks. We remark that we attempted to bound this quantity directly using Taylor expansions instead of an asymptotic equivalence, and while this seems possible, the expressions are unruly. Our asymptotic equivalence demonstrates that second derivative information is needed, which makes the earlier approach even more unmanageable.

Lemma 5.5.

For any $\mu>0$ and $m=o(n)$, the maximum log marginal satisfies $h(n)\sim m^{2}/(\mu n)$.

Proof.

The proof is by case analysis on $\mu$. In both cases we analyze $h(n)$ as expressed in Lemma 5.4, consider a change of variables, and use L’Hospital’s rule twice. In the first case, we assume $\mu=1$. The value of $x^{*}$ in Lemma 5.2 gives us

[TABLE]

It follows that $h(n)$ can be simplified as

[TABLE]

To show $h(n)\sim m^{2}/n$, by the definition of asymptotic equivalence we need to prove that

[TABLE]

Make the change of variables $y=m/n$. Since $m=o(n)$, this is equivalent to showing

[TABLE]

Using L’Hospital’s rule twice with the derivatives

[TABLE]

it follows that

[TABLE]

This completes the proof for $\mu=1$.
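As an illustration of the double application of L’Hospital’s rule, suppose that for $\mu=1$ the simplified expression has the form $h(n)=(n+m)\log(1+m/n)+(n-m)\log(1-m/n)$. This specific form, and the name $\varphi$ below, are assumptions made for the sketch; the form is chosen to be consistent with $h(n)\sim m^{2}/n$. Writing $h(n)=n\,\varphi(y)$ with $y=m/n$, the claim becomes $\varphi(y)/y^{2}\to 1$ as $y\to 0$.

```latex
\varphi(y) = (1+y)\log(1+y) + (1-y)\log(1-y), \qquad \varphi(0) = 0,
\\
\varphi'(y) = \log(1+y) - \log(1-y), \qquad \varphi'(0) = 0,
\\
\varphi''(y) = \frac{1}{1+y} + \frac{1}{1-y} = \frac{2}{1-y^{2}}, \qquad \varphi''(0) = 2,
\\
\lim_{y\to 0} \frac{\varphi(y)}{y^{2}}
  = \lim_{y\to 0} \frac{\varphi'(y)}{2y}
  = \lim_{y\to 0} \frac{\varphi''(y)}{2}
  = 1 .
```

Both the numerator and its first derivative vanish at $y=0$, which is why the rule has to be applied twice and why second-derivative information is what determines the limit.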

The case when $\mu\neq 1$ is analogous but messier. Making the same change of variables $y=m/n$, it is equivalent to show that

[TABLE]

because the value of $x^{*}$ for $\mu\neq 1$ in Lemma 5.2 gives us

[TABLE]

Denoting the left-hand side of the statement to be shown above by $g(y)$, one can verify that the first two derivatives of $g(y)$ are

[TABLE]

Observing that $g(0)=g^{\prime}(0)=0$ due to convenient cancellations and using L’Hospital’s rule twice,

[TABLE]

This completes the proof for all cases of $\mu$. ∎
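The asymptotic equivalence $h(n)\sim m^{2}/(\mu n)$ can also be checked numerically for several values of $\mu$. The sketch below assumes that $f(x)=\frac{(n+m)^{n+m}(n-m)^{n-m}}{x^{2x}(n+m-x)^{n+m-x}(n-m-x)^{n-m-x}}\,\mu^{-2x}\left(\frac{\mu}{1+\mu}\right)^{2n}$ and that $h(n)=-\log f(x^{*})$. This form of $f$ is an assumption chosen to be consistent with the critical-point equation $(n-x)^{2}-m^{2}=\mu^{2}x^{2}$ and with Lemma 5.5; it is not quoted from Lemma 5.1. All function names are hypothetical.

```python
import math


def critical_point(n: float, m: float, mu: float) -> float:
    """Root of (n - x)^2 - m^2 = mu^2 * x^2 in [0, n - m] (valid at mu = 1 too)."""
    disc = mu * mu * n * n + (1.0 - mu * mu) * m * m
    return (n * n - m * m) / (n + math.sqrt(disc))


def xlogx(t: float) -> float:
    """t * log(t), extended by continuity to 0 at t = 0."""
    return t * math.log(t) if t > 0 else 0.0


def max_log_marginal(n: float, m: float, mu: float) -> float:
    """h(n) = -log f(x*) under the ASSUMED form of f given in the lead-in."""
    x = critical_point(n, m, mu)
    log_f = (
        xlogx(n + m) + xlogx(n - m)
        - 2.0 * xlogx(x) - xlogx(n + m - x) - xlogx(n - m - x)
        - 2.0 * x * math.log(mu)
        + 2.0 * n * math.log(mu / (1.0 + mu))
    )
    return -log_f


def ratio(n: float, m: float, mu: float) -> float:
    """h(n) divided by the claimed asymptotic value m^2 / (mu * n)."""
    return max_log_marginal(n, m, mu) / (m * m / (mu * n))


if __name__ == "__main__":
    for mu in (0.5, 1.0, 2.0):
        # With m / n = 0.01 the ratio should already be close to 1.
        print(mu, ratio(10**6, 10**4, mu))
```

Working with logarithms throughout avoids overflow in the large powers, and the check is insensitive to small errors in $x^{*}$ because $x^{*}$ is a critical point of $\log f$.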

Proof of Lemma 3.5.

For $n$ sufficiently large, the asymptotic equality for $h(n)$ in Lemma 5.5 gives us

[TABLE]

It follows from our construction of $f(x)$ and the definition of the maximum log marginal that

[TABLE]

as desired. ∎

6 Conclusion

We have made significant progress towards rigorously establishing the conjectured slow regions of the phase diagram for the six-vertex model. In particular, we prove that there exist boundary conditions for which Glauber dynamics requires exponential time to converge throughout the entire ferroelectric region and most of the antiferroelectric region. Furthermore, our proofs demonstrate why sharp boundaries exist between the ferroelectric phase and the disordered phase, where Glauber dynamics is believed to transition to polynomial-time convergence. We have not fully characterized the antiferroelectric phase, but our improvement over the best previous bounds in [Liu18] covers a significantly larger part of the region.

Our arguments for the slow mixing of Glauber dynamics completely break down in the disordered phase, as expected, but there has not been any rigorous work showing fast convergence in this region of the phase diagram. The single exception is the unweighted case $a=b=c$, which corresponds to Eulerian orientations of the lattice region. This was shown to converge in polynomial time for all boundary conditions [RT00, LRS01, GMP04]. The approaches in these works are inherently combinatorial, and it seems that generalizing them to weighted cases will require significantly different ideas. Lastly, we emphasize that our proofs of slow mixing rely on new techniques for analyzing lattice models, which include the closed-form generating function for weighted non-backtracking walks derived in Section 4 and the exponentially small tail inequality for correlated random walks developed in Section 5.

Bibliography

[AR05] David Allison and Nicolai Reshetikhin. Numerical study of the 6-vertex model with domain wall boundary conditions. Annales de l’institut Fourier, 55(6):1847–1869, 2005.
[BCFR17] Prateek Bhakta, Ben Cousins, Matthew Fahrbach, and Dana Randall. Approximately sampling elements with fixed rank in graded posets. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1828–1838. SIAM, 2017.
[BF06] Pavel Bleher and Vladimir Fokin. Exact solution of the six-vertex model with domain wall boundary conditions. Disordered phase. Communications in Mathematical Physics, 268(1):223–284, 2006.
[BKW73] H. J. Brascamp, H. Kunz, and F. Y. Wu. Some rigorous results for the vertex model in statistical mechanics. Journal of Mathematical Physics, 14(12):1927–1932, 1973.
[BL09] Pavel Bleher and Karl Liechty. Exact solution of the six-vertex model with domain wall boundary conditions. Ferroelectric phase. Communications in Mathematical Physics, 286(2):777–801, 2009.
[BL10] Pavel Bleher and Karl Liechty. Exact solution of the six-vertex model with domain wall boundary conditions: Antiferroelectric phase. Communications on Pure and Applied Mathematics, 63(6):779–829, 2010.
[BPZ02] N. M. Bogoliubov, A. G. Pronko, and M. B. Zvonarev. Boundary correlation functions of the six-vertex model. Journal of Physics A: Mathematical and General, 35(27):5525, 2002.
[BR20] Pavel Belov and Nicolai Reshetikhin. The two-point correlation function in the six-vertex model. arXiv preprint arXiv:2012.05182, 2020.
