Slow Mixing of Glauber Dynamics for the Six-Vertex Model in the Ordered Phases
Matthew Fahrbach, Dana Randall

TL;DR
This paper proves that Glauber dynamics for the six-vertex model in ordered phases can require exponential time to mix, revealing fundamental limitations of local Markov chains in these regimes.
Contribution
It provides the first rigorous bounds on the slow mixing of Glauber dynamics in the ferroelectric phase of the six-vertex model, extending understanding in ordered phases.
Findings
Glauber dynamics can mix exponentially slowly in the ferroelectric phase for suitable boundary conditions.
Boundary conditions can induce slow mixing in the ordered phases.
New techniques relate correlated random walks to lattice path models.
Slow Mixing of Glauber Dynamics for the Six-Vertex Model in the Ordered Phases
Matthew Fahrbach
Supported in part by an NSF Graduate Research Fellowship under grant DGE-1650044.
School of Computer Science, Georgia Institute of Technology
Dana Randall
Supported in part by NSF grants CCF-1526900, CCF-1637031, and CCF-1733812.
School of Computer Science, Georgia Institute of Technology
Abstract
The six-vertex model in statistical physics is a weighted generalization of the ice model on $\mathbb{Z}^2$ (i.e., Eulerian orientations) and the zero-temperature three-state Potts model (i.e., proper three-colorings). The phase diagram of the model depicts its physical properties and suggests where local Markov chains will be efficient. In this paper, we analyze the mixing time of Glauber dynamics for the six-vertex model in the ordered phases. Specifically, we show that for all Boltzmann weights in the ferroelectric phase, there exist boundary conditions such that local Markov chains require exponential time to converge to equilibrium. This is the first rigorous result bounding the mixing time of Glauber dynamics in the ferroelectric phase. Our analysis demonstrates a fundamental connection between correlated random walks and the dynamics of intersecting lattice path models (or routings). We analyze the Glauber dynamics for the six-vertex model with free boundary conditions in the antiferroelectric phase and significantly extend the region for which local Markov chains are known to be slow mixing. This result relies on a Peierls argument and novel properties of weighted non-backtracking walks.
1 Introduction
The six-vertex model was first introduced by Pauling in 1935 [Pau35] to study the thermodynamics of crystalline solids with ferroelectric properties, and has since become one of the most compelling models in statistical mechanics. The prototypical instance of the model is the hydrogen-bonding pattern of two-dimensional ice: when water freezes, each oxygen atom must be surrounded by four hydrogen atoms such that two of the hydrogen atoms bond covalently with the oxygen atom and two are farther away. The state space of the six-vertex model consists of orientations of the edges in a finite region of the two-dimensional square lattice where every internal vertex has two incoming edges and two outgoing edges, also represented as Eulerian orientations of the underlying lattice graph. The model is most often studied on the $n \times n$ square lattice $\Lambda_n$ with $4n$ additional edges so that each internal vertex has degree 4. There are six possible edge orientations incident to a vertex (see Figure 1). We assign Boltzmann weights $w_1, w_2, \dots, w_6 \in \mathbb{R}_{>0}$ to the six vertex types and define the partition function as $Z = \sum_{x \in \Omega} \prod_{i=1}^{6} w_i^{n_i(x)}$, where $\Omega$ is the set of Eulerian orientations of $\Lambda_n$ and $n_i(x)$ is the number of type-$i$ vertices in the configuration $x$.
In 1967, Lieb discovered exact solutions to the six-vertex model with periodic boundary conditions (i.e., on the torus) for three different parameter regimes [Lie67a, Lie67b, Lie67c]. In particular, he famously showed that if all six vertex weights are set to $w_i = 1$, the energy per vertex is $\lim_{n \to \infty} Z^{1/n^2} = (4/3)^{3/2} \approx 1.5396$, which is known as “Lieb’s square ice constant”. His results were immediately generalized to all parameter regimes and to account for external electric fields [Sut67, Yan67]. An equivalence between periodic and free boundary conditions in the limit was established in [BKW73], and since then the primary object of study has been the six-vertex model subject to domain wall boundary conditions, where the lower and upper boundary edges point into the square and the left and right boundary edges point outwards [ICK92, KZJ00, BPZ02, BF06, BL09, BL10]. The six-vertex model serves as an important “counterexample” in statistical physics because the surface free energy in the thermodynamic limit depends on the boundary conditions. In particular, it is different for periodic and domain wall boundary conditions.
There have been several surprisingly profound connections to combinatorics and probability in this line of work. For example, Zeilberger gave a sophisticated computer-assisted proof of the alternating sign matrix conjecture in 1995 [Zei96]. A year later, Kuperberg [Kup96] produced an elegant and significantly shorter proof using analysis of the partition function of the six-vertex model with domain wall boundary conditions. Other connections to combinatorics include the dimer model on the Aztec diamond and the arctic circle theorem [CEP96, FS06], sampling lozenge tilings [LRS01, Wil04, BCFR17], and counting 3-colorings of lattice graphs [RT00, CR16].
While there has been extraordinary progress in understanding properties of the six-vertex model with periodic or domain wall boundary conditions in mathematical physics, remarkably less is known when the model is subject to arbitrary boundary conditions. Sampling configurations using Markov chain Monte Carlo (MCMC) algorithms has been one of the primary means for discovering mathematical and physical properties of the six-vertex model [AR05, LKV17, LKRV18, KS18, BR20]. However, the model is empirically very sensitive to boundary conditions, and numerical studies have often observed slow convergence of local MCMC algorithms under certain parameter settings. For example, according to [LKRV18], “it must be stressed that the Metropolis algorithm might be impractical in the antiferromagnetic phase, where the system may be unable to thermalize.” There are very few rigorous results about natural Markov chains and the computational complexity of sampling from the six-vertex model when the Boltzmann distribution is nonuniform, thus motivating our study of Glauber dynamics for the six-vertex model, the most widely used MCMC sampling algorithm, in the ferroelectric and antiferroelectric phases.
At first glance, the model has six degrees of freedom. However, this conveniently reduces to a two-parameter family because of invariants that relate pairs of vertex types. To see this, it is useful to view the configurations of the six-vertex model as intersecting lattice paths by erasing all of the edges that are directed south or west and keeping the others (see Figure 2). Using this bijective “routing interpretation,” it is simple to see that the number of type-5 and type-6 vertices must be closely correlated. In addition to revealing invariants, the lattice path representation of configurations turns out to be exceptionally useful for analyzing Glauber dynamics. Moreover, the total weight of a configuration should remain unchanged if all the edge directions are reversed in the absence of an external electric field, so we let $a = w_1 = w_2$, $b = w_3 = w_4$, and $c = w_5 = w_6$. This complementary invariance is known as the zero field assumption, and it is often convenient to exploit the conservation laws of the model [BL09] to reparameterize the system so that $w_1 = a^2$ and $w_2 = 1$. This allows us to ignore empty sites and focus solely on weighted lattice paths. Furthermore, since our goal is to sample configurations from the Boltzmann distribution, we can normalize the partition function by a factor of $c^{n^2}$ and consider the weights $(a/c, b/c, 1)$ instead of the parameters $(a, b, c)$. Collectively, we refer to these properties as the invariance of the Gibbs measure for the six-vertex model.
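The routing interpretation can be made concrete with a few lines of Python. The sketch below is only illustrative: the encoding of edge orientations as +1 (east or north) and -1 (west or south) and the function name `vertex_kind` are assumptions made for this example, not notation from the paper.

```python
def vertex_kind(west, east, south, north):
    """Classify a vertex in the lattice path picture.

    Each argument is the orientation of one incident edge:
    +1 if the edge points east (horizontal) or north (vertical), else -1.
    Only edges with orientation +1 survive as path edges.
    """
    # Ice rule: exactly two incoming and two outgoing edges.
    indegree = (west == 1) + (east == -1) + (south == 1) + (north == -1)
    if indegree != 2:
        raise ValueError("edge orientations violate the ice rule")
    kept = [west, east, south, north].count(1)
    if kept == 4:
        return "cross"      # two paths meet at this vertex
    if kept == 0:
        return "empty"      # no path passes through this vertex
    if west == east:
        return "straight"   # one path continues in the same direction
    return "corner"         # one path turns


if __name__ == "__main__":
    print(vertex_kind(1, 1, -1, -1))   # a horizontal straight
    print(vertex_kind(1, -1, -1, 1))   # a path entering from the west and leaving north
```

Under this encoding the sixteen possible orientations of four edges contain exactly six that satisfy the ice rule: one cross, one empty site, two straights, and two corners.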
The single-site Glauber dynamics for the six-vertex model is the Markov chain that makes local moves by (1) choosing an internal cell of the lattice uniformly at random and (2) reversing the orientations of the edges that bound the chosen cell if they form a cycle. In the lattice path interpretation, these dynamics correspond to the “mountain-valley” Markov chain that flips corners. Transitions between states are made according to the Metropolis-Hastings acceptance probability [MRR+53] so that the Markov chain converges to the desired distribution.
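The two-step move described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the array layout (`H`, `V`), the function names, the pairing of weights with vertex types (type a for crosses and empty sites, type b for straights, type c for corners), and the checkerboard starting state are all choices made for this example.

```python
import random


def afe_ground_state(n):
    """Checkerboard configuration on an n x n grid in which every vertex is a corner.

    H[r][c] is the edge entering vertex (r, c) from the west (+1 = east, -1 = west);
    H[r][n] is the east boundary edge. V[r][c] is the edge entering vertex (r, c)
    from the south (+1 = north, -1 = south); V[n][c] is the north boundary edge.
    """
    H = [[(-1) ** (r + c) for c in range(n + 1)] for r in range(n)]
    V = [[-((-1) ** (r + c)) for c in range(n)] for r in range(n + 1)]
    return H, V


def ice_rule_holds(H, V, n):
    """Check that every vertex has exactly two incoming edges."""
    for r in range(n):
        for c in range(n):
            indegree = ((H[r][c] == 1) + (H[r][c + 1] == -1)
                        + (V[r][c] == 1) + (V[r + 1][c] == -1))
            if indegree != 2:
                return False
    return True


def vertex_weight(H, V, r, c, a, b, w_c):
    """Weight of vertex (r, c), assuming the ice rule holds there."""
    west, east, south = H[r][c], H[r][c + 1], V[r][c]
    if west != east:
        return w_c                        # corner
    return a if west == south else b      # cross/empty versus straight


def glauber_step(H, V, n, a, b, w_c, rng):
    """One Metropolis move: pick a cell and try to reverse its boundary cycle."""
    r, c = rng.randrange(n - 1), rng.randrange(n - 1)
    bottom, top = H[r][c + 1], H[r + 1][c + 1]
    left, right = V[r + 1][c], V[r + 1][c + 1]
    # The four edges form a directed cycle iff they are oriented
    # counterclockwise (bottom east, right north, top west, left south)
    # or clockwise (all four reversed).
    if not (bottom == right and top == left and bottom == -top):
        return False
    corners = [(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)]

    def local_weight():
        w = 1.0
        for i, j in corners:
            w *= vertex_weight(H, V, i, j, a, b, w_c)
        return w

    def flip():
        H[r][c + 1] *= -1
        H[r + 1][c + 1] *= -1
        V[r + 1][c] *= -1
        V[r + 1][c + 1] *= -1

    old = local_weight()
    flip()
    new = local_weight()
    if rng.random() < min(1.0, new / old):
        return True
    flip()  # rejected: restore the previous orientation
    return False


if __name__ == "__main__":
    rng = random.Random(1)
    n = 8
    H, V = afe_ground_state(n)
    moves = sum(glauber_step(H, V, n, 1.0, 1.0, 3.0, rng) for _ in range(10000))
    print("accepted moves:", moves, "ice rule holds:", ice_rule_holds(H, V, n))
```

Only the four vertices on the chosen cell change type, so the acceptance ratio depends on a constant number of local weights. Boundary edges are never touched, which is how a fixed boundary condition is preserved throughout the run.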
The phase diagram of the six-vertex model represents distinct thermodynamic properties of the system and is partitioned into three regions: the disordered (DO) phase, the ferroelectric (FE) phase, and the antiferroelectric (AFE) phase. To establish these regions, we consider the parameter
$$\Delta = \frac{a^2 + b^2 - c^2}{2ab}.$$
The disordered phase is the set of parameters that satisfy $|\Delta| < 1$, and Glauber dynamics is expected to be rapidly mixing in this region because there are no long-range correlations in the system. The ferroelectric phase is defined by $\Delta > 1$, or equivalently when we have $a > b + c$ or $b > a + c$. The antiferroelectric phase is defined by $\Delta < -1$, or equivalently when $a + b < c$.
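As a quick illustration of the three regions, the following Python sketch classifies a weight triple, assuming the standard definition $\Delta = (a^2 + b^2 - c^2)/(2ab)$. The function names and the label "critical" for the boundary case $|\Delta| = 1$ are assumptions of this example.

```python
def delta(a, b, c):
    """Phase parameter of the six-vertex model with zero-field weights (a, b, c)."""
    return (a * a + b * b - c * c) / (2.0 * a * b)


def phase(a, b, c):
    """Return the phase containing the positive weight triple (a, b, c)."""
    if min(a, b, c) <= 0:
        raise ValueError("weights must be positive")
    d = delta(a, b, c)
    if d > 1:
        return "ferroelectric"       # equivalently a > b + c or b > a + c
    if d < -1:
        return "antiferroelectric"   # equivalently a + b < c
    if abs(d) < 1:
        return "disordered"
    return "critical"                # |delta| == 1, on a phase boundary


if __name__ == "__main__":
    for weights in [(1, 1, 1), (3, 1, 1), (1, 3, 1), (1, 1, 3)]:
        print(weights, phase(*weights))
```

The equivalences in the comments follow from rearranging the inequality: $\Delta > 1$ is the same as $(a - b)^2 > c^2$, and $\Delta < -1$ is the same as $(a + b)^2 < c^2$.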
The phase diagram is symmetric over the positive diagonal, which follows from the fact that $a$ and $b$ are interchangeable under the automorphism that rotates each of the six vertex types by ninety degrees clockwise. This is equivalent to rotating the entire model under the zero field assumption. Therefore, we can assume that mixing results are symmetric over the main diagonal. Combinatorially, we show in Section 3 that configurations in the ferroelectric phase can be interpreted as intersecting lattice paths that prefer to adhere to each other. We carefully exploit this property to show that Glauber dynamics can be slow mixing. In the antiferroelectric phase, configurations prefer vertices of type-$c$ and tend to be closely aligned with one of two states with maximum probability that are arrow reversals of each other.
1.1 Related Works
Cai, Liu, and Lu [CLL19] recently investigated the six-vertex model for 4-regular graphs and provided strong evidence that the complexity of approximating the partition function agrees with the phase diagram from statistical physics. In particular, they give a fully polynomial randomized approximation scheme (FPRAS) for all 4-regular graphs in the subregion of the disordered phase defined by the inequalities $a^2 \le b^2 + c^2$, $b^2 \le a^2 + c^2$, and $c^2 \le a^2 + b^2$ (i.e., the blue region in Figure 3(a)). Their algorithm builds on the winding technique for Holant problems developed in [McQ13, HLZ16] and requires time to sample a six-vertex configuration from the Boltzmann distribution, where is the number of vertices in the graph. The Markov chain they use is not Glauber dynamics, but rather a directed loop algorithm whose state space is augmented with “near-perfect” configurations that slightly violate the Eulerian orientation constraint. This Markov chain can be understood as gradually reversing a large directed loop in a valid six-vertex configuration, whereas Glauber dynamics is restricted to reversing cycles that form the perimeter of a cell. Cai, Liu, and Lu also showed that an FPRAS for 4-regular graphs cannot exist in the ferroelectric or antiferroelectric regions unless $\mathrm{RP} = \mathrm{NP}$ (i.e., the gray regions in Figure 3(a)). Their hardness results use nonplanar 4-regular gadgets to reduce from 3-MIS, the NP-hard problem of computing the cardinality of a maximum independent set in a 3-regular graph [GJS74], and therefore do not directly reveal anything about the mixing time of Glauber dynamics for the six-vertex model on regions of $\mathbb{Z}^2$. A dichotomy theorem for the (exact) computability of the partition function of the six-vertex model on 4-regular graphs was also recently proven in [CFX18].
As for the positive results about the mixing time of Glauber dynamics, Luby, Randall, and Sinclair [LRS01] proved rapid mixing of a Markov chain that leads to a fully polynomial almost uniform sampler for Eulerian orientations on any region of the Cartesian lattice with fixed boundaries (i.e., the unweighted case when $a = b = c$). Randall and Tetali [RT00] then used a comparison technique to argue that Glauber dynamics for Eulerian orientations on lattice graphs is rapidly mixing by relating this Markov chain to the Luby-Randall-Sinclair chain. Goldberg, Martin, and Paterson [GMP04] extended their approach to show that Glauber dynamics is rapidly mixing on rectangular lattice regions with free boundary conditions.
Liu [Liu18] gave the first rigorous result showing that Glauber dynamics can be slowly mixing in a subregion of an ordered phase. In particular, Liu showed that local Markov chains subject to free boundary conditions require exponential time to converge to stationarity in the antiferroelectric subregion defined by (i.e., the red region in Figure 3(b)), where $\mu \approx 2.638$ is the connective constant for self-avoiding walks on the square lattice. We note that the connective constant is defined by the limit $\mu = \lim_{n \to \infty} \gamma_n^{1/n}$, where $\gamma_n$ is the number of self-avoiding walks of length $n$ on the square lattice. Liu also showed that the directed loop algorithm used in [CLL19] mixes slowly in the same antiferroelectric subregion and for all of the ferroelectric region. This, however, has no bearing on the efficiency of Glauber dynamics in the ferroelectric region. As an aside, we also remark that the partition function is exactly computable for all boundary conditions at the free-fermion point when $\Delta = 0$, or equivalently $a^2 + b^2 = c^2$, via a reduction to domino tilings and a Pfaffian computation [FS06].
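The definition of the connective constant can be explored numerically for small lengths. The following Python sketch counts self-avoiding walks on the square lattice by exhaustive search; the function name `count_saws` is an assumption of this example, and the running time grows exponentially with the length, so it is only meant for small inputs.

```python
def count_saws(length):
    """Count self-avoiding walks of the given length on Z^2 starting at the origin."""

    def extend(x, y, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt[0], nxt[1], visited, remaining - 1)
                visited.remove(nxt)
        return total

    return extend(0, 0, {(0, 0)}, length)


if __name__ == "__main__":
    for n in range(1, 11):
        gamma_n = count_saws(n)
        print(n, gamma_n, gamma_n ** (1.0 / n))
```

Because the counts are submultiplicative, the quantities $\gamma_n^{1/n}$ approach the connective constant from above, and the convergence is slow at these small lengths.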
1.2 Main Results
In this paper we show that there exist boundary conditions for which Glauber dynamics mixes slowly for the six-vertex model in the ferroelectric and antiferroelectric phases. We start by proving that there are boundary conditions that cause Glauber dynamics to be slow for all Boltzmann weights that lie in the ferroelectric region of the phase diagram, where the mixing time is exponential in the number of vertices in the lattice. This is the first rigorous result for the mixing time of Glauber dynamics in the ferroelectric phase and it gives a complete characterization.
Theorem 1.1 (Ferroelectric Phase).
For any $(a, b, c) \in \mathbb{R}^3_{>0}$ such that $a > b + c$ or $b > a + c$, there exist boundary conditions for which Glauber dynamics mixes exponentially slowly on $\Lambda_n$.
We note that our approach naturally breaks down at the critical line of the conjectured phase diagram for the mixing time in a way that reveals a trade-off between the energy and entropy of the system. Additionally, our analysis suggests an underlying combinatorial interpretation for the phase transition between the ferroelectric and disordered phases in terms of the adherence strength of intersecting lattice paths and the momentum parameter of correlated random walks.
Our second mixing result builds on the topological obstruction framework developed in [Ran06] to show that Glauber dynamics with free boundary conditions mixes slowly in most of the antiferroelectric region. Specifically, we generalize the recent antiferroelectric mixing result in [Liu18] with a Peierls argument that uses multivariate generating functions for weighted non-backtracking walks instead of the connective constant for (unweighted) self-avoiding walks to better account for the discrepancies in Boltzmann weights.
Theorem 1.2 (Antiferroelectric Phase).
For any such that , Glauber dynamics mixes exponentially slowly on with free boundary conditions.
We illustrate the new regions for which Glauber dynamics can be slowly mixing in Figure 3. Observe that our antiferroelectric subregion significantly extends Liu’s and pushes towards the conjectured threshold.
1.3 Techniques
We take significantly different approaches for our analysis of the ferroelectric and antiferroelectric phases. In the ferroelectric phase, where $a > b + c$ and type-$a$ vertices are preferred to type-$b$ and type-$c$ vertices, we construct boundary conditions that induce polynomially-many paths separated by a critical distance that allows all of the paths to (1) behave independently and (2) simultaneously intersect with their neighbors maximally. (This analysis also covers the case $b > a + c$ by a standard invariant that shows symmetry in the phase diagram over the line $a = b$.) From here, we analyze the dynamics of a single path in isolation as an escape probability, which eventually allows us to bound the conductance of the Markov chain. The dynamics of a single lattice path is equivalent to that of a correlated random walk. In Section 5 we present a new tail inequality for correlated random walks that accurately bounds the probability of large deviations from the starting position. We note that decomposing the dynamics of lattice models into one-dimensional random walks has recently been shown to achieve nearly tight bounds for escape probabilities in a different setting [DFGX18].
One of the key technical contributions in this paper is our analysis of the tail behavior of correlated random walks in Section 5. While there is a simple combinatorial expression for the position of a correlated random walk written as a sum of marginals, it is not immediately useful for bounding the displacement from the origin. To achieve an exponentially small tail bound for these walks, we first construct a smooth function that tightly upper bounds the marginals and then optimize this function to analyze the asymptotics of the log of the maximum marginal. Once we obtain an asymptotic equality for the maximum marginal, we can upper bound the deviation of a correlated random walk, and hence the deviation of a lattice path in a configuration. Ultimately, this allows us to show that there exists a balanced cut in the state space that has an exponentially small escape probability, which implies that the Glauber dynamics are slowly mixing.
In the antiferroelectric phase, on the other hand, the weights satisfy $a + b < c$, so type-$c$ vertices are preferred. It follows that there are two (arrow-reversal) symmetric ground states of maximum probability containing only type-$c$ vertices. To move between configurations that agree predominantly with different ground states, the Markov chain must pass through configurations with a large number of type-$a$ or type-$b$ vertices. Using the idea of fault lines introduced in [Ran06], we use weighted non-backtracking walks to characterize such configurations and construct a cut set with exponentially small probability mass that separates the ground states.
2 Preliminaries
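We start with some background on Markov chains and mixing times. Let $\mathcal{M}$ be an ergodic, reversible Markov chain with finite state space $\Omega$, transition probability matrix $P$, and stationary distribution $\pi$. The $t$-step transition probability from state $x$ to state $y$ is denoted as $P^t(x, y)$. The total variation distance between probability distributions $\mu$ and $\nu$ on $\Omega$ is
$$\|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|.$$
The mixing time of $\mathcal{M}$ is $\tau(1/4) = \min\{t \ge 0 : \max_{x \in \Omega} \|P^t(x, \cdot) - \pi\|_{TV} \le 1/4\}$. We say that $\mathcal{M}$ is rapidly mixing if its mixing time is $O(\mathrm{poly}(n))$, where $n$ is the size of each configuration in the state space. Similarly, we say that $\mathcal{M}$ is slow mixing if its mixing time is $\Omega(\exp(n^c))$ for some constant $c > 0$.
The mixing time of a Markov chain is characterized by its conductance (up to polynomial factors). The conductance of a nonempty set $S \subseteq \Omega$ is
$$\Phi(S) = \frac{\sum_{x \in S,\, y \notin S} \pi(x) P(x, y)}{\pi(S)},$$
and the conductance of the entire Markov chain is $\Phi^* = \min_{S \subseteq \Omega : 0 < \pi(S) \le 1/2} \Phi(S)$. It is often useful to view the conductance of a set as an escape probability: starting from stationarity and conditioned on being in $S$, the conductance $\Phi(S)$ is the probability that $\mathcal{M}$ leaves $S$ in one step.
Theorem 2.1 ([LPW17]).
For an ergodic, reversible Markov chain with conductance $\Phi^*$, $\tau(1/4) \ge 1/(4\Phi^*)$.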
To show that a Markov chain is slow mixing, it suffices to show that the conductance is exponentially small.
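To make the conductance argument concrete, the following Python sketch computes the total variation distance and, by brute force over subsets, the conductance of a tiny chain with a planted bottleneck. It assumes the standard definitions: half the $\ell_1$ distance for total variation, and the stationary escape probability minimized over sets of stationary mass at most 1/2 for conductance. The function names and the four-state example chain are hypothetical.

```python
from itertools import combinations


def total_variation(mu, nu):
    """Total variation distance between two distributions given as lists."""
    return 0.5 * sum(abs(p - q) for p, q in zip(mu, nu))


def conductance_of_set(P, pi, S):
    """Probability of leaving S in one step, starting from pi conditioned on S."""
    S = set(S)
    flow = sum(pi[x] * P[x][y] for x in S for y in range(len(pi)) if y not in S)
    return flow / sum(pi[x] for x in S)


def chain_conductance(P, pi):
    """Minimum conductance over nonempty sets with stationary mass at most 1/2."""
    n = len(pi)
    best = float("inf")
    for size in range(1, n):
        for S in combinations(range(n), size):
            if sum(pi[x] for x in S) <= 0.5 + 1e-12:
                best = min(best, conductance_of_set(P, pi, S))
    return best


if __name__ == "__main__":
    eps = 0.01
    # Symmetric chain on {0, 1, 2, 3}; states {0, 1} and {2, 3} are joined
    # only by a transition of probability eps, so pi is uniform.
    P = [
        [0.5, 0.5, 0.0, 0.0],
        [0.5, 0.5 - eps, eps, 0.0],
        [0.0, eps, 0.5 - eps, 0.5],
        [0.0, 0.0, 0.5, 0.5],
    ]
    pi = [0.25, 0.25, 0.25, 0.25]
    phi = chain_conductance(P, pi)
    print("conductance:", phi)
    print("lower bound 1 / (4 * conductance):", 1.0 / (4.0 * phi))
```

The brute-force search is exponential in the number of states, so it serves only to illustrate the definitions. The slow mixing proofs instead exhibit one specific cut and bound its escape probability analytically.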
3 Slow Mixing in the Ferroelectric Phase
We start with the ferroelectric phase where $a > b + c$ or $b > a + c$, and we give a conductance-based argument to show that Glauber dynamics can be slowly mixing in the entire ferroelectric region. Specifically, we show that there exist boundary conditions that induce an exponentially small, asymmetric bottleneck in the state space, revealing a natural trade-off between the energy and entropy in the system. Viewing the six-vertex model in the intersecting lattice path interpretation suggests how to plant polynomially-many paths in the grid that can (1) be analyzed independently, while (2) being capable of intersecting maximally. This path independence makes our analysis tractable and allows us to interpret the dynamics of a path as a correlated random walk, for which we develop an exponentially small tail bound in Section 5. Since conductance governs mixing times, we show how to relate the expected maximum deviation of a correlated walk to the conductance of the Markov chain and prove slow mixing. In addition to showing slow mixing up to the conjectured threshold, a surprising feature of our argument is that it potentially gives a combinatorial explanation for the phase transition from the ferroelectric to disordered phase. In particular, Lemma 3.6 demonstrates how the parameters of the model delicately balance the probability mass of the Markov chain.
We start by leveraging the invariance of the Gibbs measure and the lattice path interpretation of the six-vertex model to conveniently reparameterize the Boltzmann weights. Recall that for a fixed boundary condition, the invariants of the model [BL09] imply that . Therefore, we set and to ignore empty sites while letting . We also set and so that the weight of a configuration only comes from straight segments and intersections of neighboring lattice paths.
3.1 Constructing the Boundary Conditions and Cut
We begin with a few colloquial definitions for lattice paths that allow us to easily construct the boundary conditions and make arguments about the conductance of the Markov chain. We call a -step, north-east lattice path starting from a path of length , and if the path ends at we describe it as tethered. If , we define the deviation of to be . Geometrically, path deviation captures the (normalized) maximum perpendicular distance of the path to the line . We refer to vertices along the path as corners or straights depending on whether or not the path turned. If two paths intersect at a vertex we call this site a cross. Note that this classifies all vertex types in the six-vertex model.
We consider the following independent paths boundary condition for an six-vertex model for the rest of the section. To construct this boundary condition, we consider its lattice path interpretation. First, place a tethered path that enters horizontally and exits horizontally. Next, place translated tethered paths of varying length above and below the main diagonal, each separated from its neighbors by distance . Specifically, the paths below the main diagonal begin at the vertices and end at the vertices , respectively. The paths above the main diagonal begin at and end at . The deviation of a translated tethered path is the deviation of the same path starting at . To complete the boundary condition, we force the paths below the main diagonal to enter vertically and exit horizontally. Symmetrically, we force the paths above the main diagonal to enter horizontally and exit vertically. See Figure 4(a) for an illustration of the construction when all paths have small deviation.
Next, we construct an asymmetric cut in the state space induced by this boundary condition in terms of its internal lattice paths. In particular, we analyze a set of configurations such that every path in a configuration has small deviation. Formally, we let
[TABLE]
Observe that by our choice of separation distance and the deviation limit for , no paths in any configuration of intersect. It follows that the partition function for factors into a product of partition functions, one for each path with bounded deviation. This intuition is useful when analyzing the conductance as an escape probability from stationarity.
3.2 Lattice Paths as Correlated Random Walks
Now we weight the internal paths according to the parameters of the six-vertex model defined in the beginning of Section 3. The main result in this subsection is Lemma 3.1, which states that random tethered paths are exponentially unlikely to deviate past , even if drawn from a Boltzmann distribution that favors straights. Start by defining to be the distribution over tethered paths of length with the property that
[TABLE]
Lemma 3.1.
Let and . For sufficiently large and , we have
[TABLE]
Before giving the proof of Lemma 3.1, we first introduce the concept of correlated random walks. Then we present three prerequisite results about correlated random walks and briefly explain their connection to the deviation of biased tethered paths. Our goal here is to show how the supporting lemmas interact prior to the proof of Lemma 3.1.
A key idea in our analysis of the ferroelectric phase is the notion of a correlated random walk, which generalizes a simple symmetric random walk by accounting for momentum. A correlated random walk with momentum parameter $p \in [0, 1]$ starts at the origin and is defined as follows. Let $X_1$ be a uniform random variable with support $\{-1, 1\}$. For all subsequent steps $i \ge 2$, the direction of the process is correlated with the direction of the previous step and satisfies
$$X_i = \begin{cases} X_{i-1} & \text{with probability } p, \\ -X_{i-1} & \text{with probability } 1 - p. \end{cases}$$
We denote the position of the walk at time $n$ by $S_n = X_1 + X_2 + \dots + X_n$. It will often be useful to make the change of variables $p = \mu/(1 + \mu)$ when analyzing the six-vertex model, where $\mu$ is the weight of a straight vertex. In many cases this also leads to cleaner expressions. We use the following probability mass function (PMF) for the position of a correlated random walk to develop our new tail inequality (Lemma 3.5), which holds for all values of $p$.
Lemma 3.2 ([HF98]).
For any and , the PMF of a correlated random walk is
[TABLE]
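For small walk lengths, the PMF of a correlated random walk can also be computed exactly by dynamic programming, which gives an independent check on any closed-form expression. The Python sketch below assumes that the first step is uniform on $\{-1, +1\}$ and that each later step repeats the previous direction with probability $p$ and reverses it otherwise. The function name `crw_pmf` is hypothetical.

```python
def crw_pmf(n, p):
    """Exact distribution of the position after n >= 1 steps.

    Returns a dict mapping each reachable position to its probability.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # State: (position, direction of the last step) -> probability.
    states = {(1, 1): 0.5, (-1, -1): 0.5}
    for _ in range(n - 1):
        nxt = {}
        for (pos, step), prob in states.items():
            # Keep the previous direction with probability p, reverse otherwise.
            for new_step, q in ((step, p), (-step, 1.0 - p)):
                key = (pos + new_step, new_step)
                nxt[key] = nxt.get(key, 0.0) + prob * q
        states = nxt
    pmf = {}
    for (pos, _step), prob in states.items():
        pmf[pos] = pmf.get(pos, 0.0) + prob
    return pmf


if __name__ == "__main__":
    for p in (0.5, 0.75, 0.9):
        pmf = crw_pmf(20, p)
        print(p, {k: round(v, 4) for k, v in sorted(pmf.items())})
```

With $p = 1/2$ this reduces to the simple symmetric random walk, and larger values of $p$ visibly spread mass away from the origin, which is the momentum effect described above.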
Now that we have defined correlated random walks, we proceed by observing that there is a natural measure-preserving bijection between biased tethered paths of length and correlated random walks of length that return to the origin. To see this, observe that every vertical edge in the tethered path corresponds to a step to the right in the correlated random walk (i.e., ), and every horizontal edge in the tethered path corresponds to a step to the left in the correlated random walk (i.e., ). Concretely, for a correlated random walk parameterized by , we have
[TABLE]
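The measure-preserving correspondence can be checked by brute force on small instances. The Python sketch below rests on assumptions made for this example: a tethered path with $n$ north and $n$ east steps has probability proportional to $\mu^{s}$, where $s$ counts interior straights (consecutive equal steps), and the momentum parameter is $p = \mu/(1 + \mu)$. The function names are hypothetical.

```python
from itertools import combinations


def balanced_step_sequences(n):
    """All sequences of 2n steps with n steps equal to +1 and n equal to -1."""
    for plus_positions in combinations(range(2 * n), n):
        plus = set(plus_positions)
        yield tuple(1 if i in plus else -1 for i in range(2 * n))


def tethered_path_distribution(n, mu):
    """Paths weighted by mu ** (number of straights), then normalized."""
    weights = {}
    for steps in balanced_step_sequences(n):
        straights = sum(steps[i] == steps[i + 1] for i in range(2 * n - 1))
        weights[steps] = mu ** straights
    total = sum(weights.values())
    return {steps: w / total for steps, w in weights.items()}


def conditioned_walk_distribution(n, p):
    """Correlated random walk of length 2n conditioned on returning to the origin."""
    probs = {}
    for steps in balanced_step_sequences(n):
        prob = 0.5  # uniform first step
        for i in range(2 * n - 1):
            prob *= p if steps[i] == steps[i + 1] else 1.0 - p
        probs[steps] = prob
    total = sum(probs.values())
    return {steps: q / total for steps, q in probs.items()}


if __name__ == "__main__":
    n, mu = 4, 3.0
    paths = tethered_path_distribution(n, mu)
    walks = conditioned_walk_distribution(n, mu / (1.0 + mu))
    print(max(abs(paths[s] - walks[s]) for s in paths))
```

The two distributions agree because a walk with $s$ repeated steps and $t$ reversals has probability $\frac{1}{2} p^{s} (1 - p)^{t} = \frac{1}{2} \mu^{s} / (1 + \mu)^{s + t}$, and $s + t = 2n - 1$ is the same for every path.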
The first prerequisite lemma we present is an asymptotic equality that generalizes the return probability of simple symmetric random walks. This allows us to relax the condition in Equation 1 where the correlated random walk must return to the origin, and instead we bound at the expense of an polynomial factor.
Lemma 3.3 ([Gil55]).
For any constant , the return probability of a correlated random walk is
[TABLE]
The second result that we need in order to prove Lemma 3.1 is that the PMF for correlated random walks is monotone.
Lemma 3.4.
For any momentum parameter and sufficiently large, the probability of the position of a correlated random walk is monotone. Concretely, for , we have
[TABLE]
Proof.
We consider the cases and separately. Using Lemma 3.2, the probability mass function for the position of a correlated random walk is
[TABLE]
If , then we have the equations
[TABLE]
Therefore, we have for all
[TABLE]
Now we assume that . Writing as a difference of sums and matching the corresponding terms, it is instead sufficient to show for all values of , we have
[TABLE]
Next, rewrite the binomial coefficients as
[TABLE]
Therefore, it remains to show that
[TABLE]
Since all of the values in are positive for any choice of and , it is equivalent to show that
[TABLE]
Observing that
[TABLE]
completes the proof. ∎
The third result we need is an upper bound for the position of a correlated random walk. We fully develop this inequality in Section 5 by analyzing the asymptotic behavior of the PMF in Lemma 3.2. We note that Lemma 3.5 shows exactly how the tail behavior of simple symmetric random walks generalizes to correlated random walks as a function of .
Lemma 3.5.
Let and . For sufficiently large, a correlated random walk satisfies
[TABLE]
Now that we have established these supporting lemmas, we are prepared to complete the proof of Lemma 3.1, which also heavily relies on union bounds and relaxing conditional probabilities.
Proof of Lemma 3.1.
Using the measure-preserving bijection between tethered paths of length and correlated random walks of length (Section 3.2) along with the definition of conditional probability and Lemma 3.3, we have
[TABLE]
where the last inequality uses the definition of asymptotic equality with . Next, a union bound and the symmetry of correlated random walks imply that
[TABLE]
Now we focus on the probability that the maximum position of the walk is at least . For this event to be true, the walk must reach at some time , so by a union bound,
[TABLE]
The second inequality takes into account the parity of the random walk, the fact that if the walk can only be at position [math], and the relaxed condition that the final position is at least . Lemma 3.4 implies that the distribution is unimodal on its support centered at the origin for sufficiently large . Moreover, for walks of the same parity with increasing length and a fixed tail threshold, the probability of the tail is nondecreasing. Combining these two observations, we have
[TABLE]
Using the chain of previous inequalities and the upper bound for in Lemma 3.5 with the smaller error , it follows that
[TABLE]
which completes the proof. ∎
3.3 Bounding the Conductance and Mixing Time
Next, we bound the conductance of the Markov chain by viewing as an escape probability. We start by claiming that (as required by the definition of conductance) if and only if the parameters are in the ferroelectric phase. Then we use the correspondence between tethered paths and correlated random walks (i.e., Section 3.2) to prove that is exponentially small.
Lemma 3.6.
Let and be constants. For sufficiently large, .
Proof.
We start by upper bounding in terms of the partition function . No paths in any state of deviate by more than by the definition of . Moreover, since adjacent paths are separated by distance , no two can intersect (Figure 4(a)). Therefore, it follows that the paths are independent of each other, which is convenient because it allows us to implicitly factor the generating function for configurations in .
Next, observe that an upper bound for the generating function of any single path is . This is true because all paths have length at most , and we introduce an additional factor to account for boundary conditions. Since all the paths are independent and , we have
[TABLE]
Now we lower bound the partition function of the entire model by considering the weight of the ferroelectric ground state (Figure 4(b)). Recall that we labeled the paths below the main diagonal path such that is farthest from the main diagonal. Let be a constant that accounts for subtle misalignments between adjacent paths. It follows that each path uniquely corresponds to at least intersections. Using the last path as a lower bound for the number of intersections that each path contributes and accounting also for the paths above the main diagonal, it follows that there are at least
[TABLE]
intersections in the ground state.
Similarly, we bound the number of straights that each path contributes. Note that we may also need an upper bound for this quantity in order to lower bound the partition function since it is possible that . The number of straights in is , and has two straights on the boundary. Therefore, the total number of straights in the ground configuration is
[TABLE]
Since intersections are weighted by and straights by in our reparameterized model, by considering the ground state and using the previous enumerations, it follows that
[TABLE]
Combining these inequalities allows us to upper bound the probability mass of the cut by
[TABLE]
Using the assumption that , we have for sufficiently large, as desired. ∎
Our analysis of the escape probability from critically relies on the fact that paths in any state are non-intersecting. Combinatorially, we exploit the factorization of the generating function for states in as a product of independent path generating functions.
Lemma 3.7.
Let be constants. For sufficiently large, .
Proof.
The conductance can be understood as the following escape probability. Sample a state from the stationary distribution conditioned on , and run the Markov chain from for one step to get a neighboring state . The definition of conductance implies that is the probability that . Using this interpretation, we can upper bound by the probability mass of states that are near the boundary of in the state space, since the process must escape in one step. Therefore, it follows from the independent paths boundary condition and the definition of that
[TABLE]
Next, we use a union bound over the different paths in a configuration and consider the event that a particular path deviates by at least . Because all of the paths in are independent, we only need to consider the behavior of in isolation. This allows us to rephrase the conditional event. Relaxing the conditional probability of each term in the sum gives
[TABLE]
For large enough , the length of every path is in the range since we eventually have the inequality . Therefore, we can apply Lemma 3.1 with the error to each term and use the universal upper bound
[TABLE]
It follows from the union bound and previous inequality that the conductance is bounded by
[TABLE]
which completes the proof. ∎
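The escape-probability view of conductance used in the proof of Lemma 3.7 can be illustrated on a toy chain. The following is a minimal sketch, not the six-vertex chain itself: the four-state transition matrix, the parameter `eps`, and the helper names `toy_chain` and `conductance` are all hypothetical and chosen only so that the cut has an obvious bottleneck.

```python
from fractions import Fraction


def toy_chain(eps):
    """Hypothetical symmetric chain on the path 0-1-2-3.

    The edge between states 1 and 2 is crossed with probability eps,
    so it acts as a bottleneck. Because the matrix is symmetric, the
    stationary distribution is uniform.
    """
    half = Fraction(1, 2)
    P = [
        [half, half, 0, 0],
        [half, half - eps, eps, 0],
        [0, eps, half - eps, half],
        [0, 0, half, half],
    ]
    pi = [Fraction(1, 4)] * 4
    return P, pi


def conductance(P, pi, S):
    """Probability of leaving S in one step when the start state is
    drawn from the stationary distribution conditioned on S."""
    S = set(S)
    mass = sum(pi[x] for x in S)
    flow = sum(pi[x] * P[x][y] for x in S for y in range(len(P)) if y not in S)
    return flow / mass


P, pi = toy_chain(Fraction(1, 100))
phi = conductance(P, pi, {0, 1})
print(phi)
```

In this toy example only state 1 can leave the set {0, 1}, so the conductance of the cut equals the stationary mass near the boundary times the crossing probability, which mirrors how the proof bounds the escape probability by the mass of states near the boundary of the cut.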
Now that we have constructed a cut in the state space with exponentially small conductance, we can obtain a bound on the mixing time when the probability mass is properly distributed.
Theorem 3.8.
Let and . For sufficiently large, .
Proof.
Since by Lemma 3.6, we have . The proof follows from Theorem 2.1 and the conductance bound in Lemma 3.7 with a smaller error . ∎
Last, we restate our main theorem and use Theorem 3.8 to show that Glauber dynamics for the six-vertex model can be slow mixing for all parameters in the ferroelectric phase.
See Theorem 1.1.
Proof.
Without loss of generality, we reparameterized the model so that , , and . Therefore, Glauber dynamics with the independent paths boundary condition is slow mixing if by Theorem 3.8. Since the rotational invariance of the six-vertex model implies that and are interchangeable parameters, this mixing time result also holds in the case . ∎
4 Slow Mixing in the Antiferroelectric Phase
Glauber dynamics can also be slow mixing in the antiferroelectric phase, but for substantially different reasons than in the ferroelectric phase. In the antiferroelectric phase, Boltzmann weights satisfy , so configurations tend to favor corner (i.e., type-) vertices. The main insight behind our slow mixing proof is that when is sufficiently large, the six-vertex model can behave like the low-temperature hardcore model on where configurations predominantly agree with one of two ground states. Liu recently formalized this argument in [Liu18] and showed that Glauber dynamics for the six-vertex model with free boundary conditions requires exponential time when , where is the connective constant of self-avoiding walks on the square lattice [GC01]. His proof uses a Peierls argument based on topological obstructions introduced by Randall [Ran06] in the context of independent sets. In this section, we extend Liu’s result to the region depicted in Figure 3(c) by computing a closed-form multivariate generating function that upper bounds the number of self-avoiding walks and better accounts for disparities in their Boltzmann weights induced by the parameters of the six-vertex model.
4.1 Topological Obstruction Framework
We start with a recap of the definitions and framework laid out in [Liu18]. There are two ground states in the antiferroelectric phase such that every interior vertex is a corner: (Figure 5(a)) and (Figure 5(b)). These configurations are edge reversals of each other, so for any state we can color its edges red if they are oriented as in or green if they are oriented as in . See Figure 5(c) for an example of how a configuration is colored. It follows from case analysis of the six vertex types in Figure 1 that the number of red edges incident to any internal vertex is even, and if there are only two red edges then they must be rotationally adjacent to each other. The same property holds for green edges by symmetry. Note that the four edges bounding a cell of the lattice are monochromatic if and only if they are oriented cyclically, and thus reversible by Glauber dynamics. We say that a simple path from a horizontal edge on the left boundary of to a horizontal edge on the right boundary is a red horizontal bridge if it contains only red edges. We define green horizontal bridges and monochromatic vertical bridges similarly. A configuration has a red cross if it contains both a red horizontal bridge and a red vertical bridge. Likewise, we can define a green cross. Let be the set of all states with a red cross, and let be the set of all states with a green cross. It follows from Lemma 4.1 that .
Next, we define the dual lattice to describe configurations in . The vertices of are the centers of the cells in , including the cells on the boundary that are partially enclosed, and we connect dual vertices by an edge if their corresponding cells are diagonally adjacent. Note that is a union of two disjoint graphs (Figure 6(a)). For any state there is a corresponding dual subgraph defined as follows: for each interior vertex in , if is incident to two red edges and two green edges, then contains the dual edge passing through that separates the two red edges from the two green edges. This construction is well-defined because the red edges are rotationally adjacent. See Figure 6(b) for an example of a dual configuration. For any , we say that has a horizontal fault line if contains a simple path from a left dual boundary vertex to a right dual boundary vertex. We define vertical fault lines similarly and let be the set of all states containing a horizontal or vertical fault line. Fault lines completely separate red and green edges, and hence are topological obstructions that prohibit monochromatic bridges.
Last, we extend the notion of fault lines to almost fault lines. We say that has a horizontal almost fault line if there is a simple path in connecting a left dual boundary vertex to a right dual boundary vertex such that all edges except for one are in . We define vertical almost fault lines similarly and let the set denote all states containing an almost fault line. Finally, let denote the set of states not in that are one move away from in the state space according to the Glauber dynamics.
Lemma 4.1 ([Liu18]).
We can partition the state space into . Furthermore, we have .
4.2 Bounding the Mixing Time with a Peierls Argument
In this subsection we show that is an exponentially small bottleneck in the state space . The analysis relies on Lemma 4.1 and a new multivariate upper bound for weighted self-avoiding walks (Lemma 4.2). Our key observation is that when a fault line changes direction, the vertices in its path change from type- to type- or vice versa. Therefore, our goal in this subsection is to generalize the trivial upper bound for the number of self-avoiding walks by accounting for their changes in direction in aggregate. We achieve this by using generating functions to solve a system of linear recurrence relations.
We start by encoding non-backtracking walks that start from the origin and take their first step northward using the characters in , representing straight, left, and right steps. For example, the walk SLRSSL corresponds to the sequence . If a fault line is the same shape as SLRSSL up to a rotation about the origin, then there are only two possible sequences of vertex types through which it can pass: and . This follows from the fact that once the first vertex type is determined, only turns in the self-avoiding walk (i.e., the L and R characters) cause the vertex type to switch. We define the weight of a fault line to be the product of the vertex types through which it passes. More generally, we define the weight of a non-backtracking walk that initially passes through a fixed vertex type to be the product of the induced vertex types according to the rule that turns toggle the current type. Formally, we let the function denote the weight of a non-backtracking walk that starts by crossing a type- vertex. We define the function similarly and provide the examples and for clarity. Last, observe that a sequence of vertex types can have many different walks in its preimage. The non-backtracking walk SRRSSR also maps to and —in fact, there are such walks in this example since we can interchange L and R characters.
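The toggling rule for vertex types can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's exact definition: the two vertex types a fault line passes through are called `a` and `b` here (the symbols are missing from this text), every walk is assumed to begin with the character S, and each later L or R is assumed to toggle the current type. The helper names `vertex_types`, `weight`, and `preimage` are hypothetical.

```python
from itertools import product


def vertex_types(walk, start):
    """Sequence of vertex types crossed by a walk over {S, L, R}.

    Assumption: the first character is S and fixes the starting type;
    every later turn (L or R) toggles between the two types.
    """
    other = {"a": "b", "b": "a"}
    types = [start]
    for step in walk[1:]:
        current = types[-1]
        types.append(other[current] if step in "LR" else current)
    return "".join(types)


def weight(walk, start, a, b):
    """Product of the numeric weights a and b over the crossed vertex types."""
    types = vertex_types(walk, start)
    return a ** types.count("a") * b ** types.count("b")


def preimage(types):
    """All walks beginning with S that induce the given type sequence."""
    walks = ("S" + "".join(rest) for rest in product("SLR", repeat=len(types) - 1))
    return [w for w in walks if vertex_types(w, types[0]) == types]


print(vertex_types("SLRSSL", "a"), vertex_types("SRRSSR", "a"))
print(len(preimage(vertex_types("SLRSSL", "a"))))
```

Under these assumptions SLRSSL and SRRSSR induce the same type sequence, and the preimage size is a power of two determined by the number of turns, since each turn can independently be an L or an R.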
The idea of enumerating the preimages of a binary string corresponding to a sequence of vertex types suggests a recursive approach for computing the sum of weighted non-backtracking walks. This naturally leads to the use of generating functions, so we overload the variables and to also denote function arguments. For nonempty binary string , let count the number of pairs of adjacent characters that are not equal and let denote the number of ones in (e.g., if then and ). The sum of weighted self-avoiding walks is upper bounded by the sum of weighted non-backtracking walks, so we proceed by analyzing the following function:
[TABLE]
Note that recovers the number of non-backtracking walks that initially cross type- or type- vertices.
In the next section, we compute the closed-form solution for by diagonalizing a matrix corresponding to the system of recurrence relations, which allows us to accurately quantify the discrepancy between fault lines when the Boltzmann weights and differ. For now, we use the following upper bound for in our Peierls argument and defer its proof to Section 4.3.
Lemma 4.2.
Let be the generating function for weighted non-backtracking walks defined in Equation 2. For any integer and , we have
[TABLE]
The first step of our Peierls argument is to upper bound , which then gives us a bound on the conductance and allows us to prove Theorem 1.2. We start by defining the subset of antiferroelectric parameters that cause to decrease exponentially fast.
Lemma 4.3.
If is antiferroelectric and , then
[TABLE]
Proof.
Let and , and observe that by the antiferroelectric assumption. It follows from our hypothesis that . Therefore, we have
[TABLE]
which completes the proof. ∎
Lemma 4.4.
If is antiferroelectric and , then for Glauber dynamics with free boundary conditions we have
[TABLE]
Proof.
For any self-avoiding walk and dual vertices on the boundary, let be the set of states that contain as a fault line or an almost fault line such that starts at and ends at . Without loss of generality, assume that the (almost) fault line is vertical. Reversing the direction of all edges on the left side of defines the injective map such that if is a fault line of , then the weight of its image is amplified by or . For an example of this injection, see Figure 6(c). Similarly, if is an almost fault line, decompose into subpaths and separated by a type- vertex such that starts at and ends at . In this case, the weight of the images of almost fault lines is amplified by a factor of for some . Using the fact that is injective and summing over the states containing as a fault line and an almost fault line separately gives us
[TABLE]
where the sum is over all decompositions of into and .
Equipped with Equation 3 and Lemma 4.2, we use a union bound over all pairs of terminal vertices and fault line lengths to bound in terms of the generating function for weighted non-backtracking walks . Since antiferroelectric weights satisfy , it follows from Lemma 4.3 that
[TABLE]
Note that the convolutions in the first inequality generate all almost weighted non-backtracking walks. ∎
See Theorem 1.2.
Proof of Theorem 1.2.
Let , , and . It follows from Lemma 4.1 that is a partition with the properties that and . Since the partition is symmetric, Lemma 4.4 implies that , for sufficiently large. Therefore, we can upper bound the conductance by . Using Theorem 2.1 along with Lemma 4.4 and Lemma 4.3 gives the desired mixing time bound. ∎
4.3 Weighted Non-Backtracking Walks
In this section we present a closed-form formula for the weighted non-backtracking walks generating function , and we give the proof of Lemma 4.2. We start by decomposing the generating function into two sums over disjoint sets of bit strings defined by their final character. Formally, for any , let
[TABLE]
First, note that . Second, observe that by recording the final character of the bit strings, we can design a system of linear recurrences to account for the term appearing in Equation 2, which counts the number of non-backtracking walks that map to a given sequence of vertex types.
Lemma 4.5.
For any integer and , we have the system of recurrence relations
[TABLE]
where the base cases are and .
Proof.
This immediately follows from the definitions of the functions and . ∎
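The idea of tracking the final character can be checked numerically. The sketch below assumes a specific form for the generating function, since the displayed equations are not available in this text: a sum over binary strings of length n in which each string contributes a factor of 2 per pair of unequal adjacent characters, a factor of x per zero, and a factor of y per one. The recurrence coefficients and the names `F_bruteforce` and `F_recurrence` follow from that assumed form and are not quoted from the paper.

```python
from itertools import product


def F_bruteforce(x, y, n):
    """Assumed generating function: sum over s in {0,1}^n of
    2^(number of adjacent unequal pairs) * x^(zeros) * y^(ones)."""
    total = 0
    for s in product((0, 1), repeat=n):
        changes = sum(s[i] != s[i + 1] for i in range(n - 1))
        ones = sum(s)
        total += 2 ** changes * x ** (n - ones) * y ** ones
    return total


def F_recurrence(x, y, n):
    """Same quantity via two sums split by the final character.

    f0 tracks strings ending in 0 and f1 tracks strings ending in 1.
    Appending a character that differs from the last one creates one new
    unequal adjacent pair, hence the factor of 2.
    """
    f0, f1 = x, y  # base cases for strings of length 1
    for _ in range(n - 1):
        f0, f1 = x * (f0 + 2 * f1), y * (2 * f0 + f1)
    return f0 + f1


for n in range(1, 7):
    print(n, F_bruteforce(2, 5, n), F_recurrence(2, 5, n))
```

Setting x = y = 1 in this assumed form gives 2 * 3^(n-1), which is the number of walks with a fixed first step, n - 1 further choices among three directions, and either of two starting vertex types.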
Lemma 4.6.
For any integer and , define the values
[TABLE]
The generating function can be written in closed form as
[TABLE]
Proof.
For brevity, we let and . It follows from Lemma 4.5 that
[TABLE]
Next, observe that the recurrence matrix is diagonalizable. In particular, we have
[TABLE]
where
[TABLE]
Since the base cases are and , it follows that
[TABLE]
Using the fact and simplifying the matrix equation above gives us
[TABLE]
as desired. ∎
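The diagonalization step can be sanity-checked numerically. This sketch assumes the 2-by-2 recurrence matrix with rows (x, 2x) and (2y, y), which is the matrix implied by the assumed form of the generating function (a factor of 2 per change of character, x per zero, y per one); the paper's own matrix and eigenvalue notation are not reproduced here. The names `F_recurrence` and `F_closed_form` are hypothetical.

```python
import math


def F_recurrence(x, y, n):
    """Iterate the assumed two-term recurrence split by final character."""
    f0, f1 = x, y
    for _ in range(n - 1):
        f0, f1 = x * (f0 + 2 * f1), y * (2 * f0 + f1)
    return f0 + f1


def F_closed_form(x, y, n):
    """Closed form from the eigenvalues of the assumed matrix [[x, 2x], [2y, y]].

    The matrix has trace x + y and determinant -3xy, so for x, y > 0 its
    eigenvalues are real and distinct and the matrix is diagonalizable.
    """
    disc = math.sqrt((x + y) ** 2 + 12 * x * y)
    lam_plus = (x + y + disc) / 2
    lam_minus = (x + y - disc) / 2
    F1 = x + y                      # value at n = 1
    F2 = x * x + 4 * x * y + y * y  # value at n = 2
    # Solve c_plus + c_minus = F1 and c_plus*lam_plus + c_minus*lam_minus = F2.
    c_plus = (F2 - lam_minus * F1) / (lam_plus - lam_minus)
    c_minus = F1 - c_plus
    return c_plus * lam_plus ** (n - 1) + c_minus * lam_minus ** (n - 1)


for n in range(1, 8):
    print(n, F_recurrence(0.3, 0.7, n), F_closed_form(0.3, 0.7, n))
```

The larger eigenvalue dominates as n grows, which is why an upper bound on the generating function in terms of that eigenvalue suffices for a Peierls argument.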
See Lemma 4.2.
Proof.
We start by using Lemma 4.6 to rewrite the closed-form solution of as
[TABLE]
Next, we observe that the eigenvalue satisfies and . Since , it follows that . Furthermore, we have by the triangle inequality. Together these two properties imply that
[TABLE]
Therefore, we can upper bound by
[TABLE]
Since , we have the inequalities
[TABLE]
The result follows from the definition of . ∎
5 Tail Behavior of Correlated Random Walks
In this section we prove Lemma 3.5, which gives an exponentially small upper bound for the tail of a correlated random walk as a function of its momentum parameter . Our proof builds off of the PMF for the position of a correlated random walk restated below, which is combinatorial in nature and not readily amenable to tail inequalities. Specifically, the probability is a sum of marginals conditioned on the number of turns that the walk makes [RH81].
See Lemma 3.2.
There are two main ideas in our approach to develop a more useful bound for the position of a correlated random walk . First, we construct a smooth function that upper bounds the marginals as a function of (a continuation of the number of turns in the walk ), and then we determine its maximum value. Next we show that the log of the maximum value is asymptotically equivalent to for , which gives us desirable bounds for sufficiently large values of . We note that our analysis illustrates precisely how correlated random walks generalize simple symmetric random walks and how the momentum parameter controls the exponential decay.
5.1 Upper Bounding the Marginal Probabilities
We start by using Stirling’s approximation to construct a smooth function that upper bounds the marginal terms in the sum of the PMF for correlated random walks. For , let
[TABLE]
It can easily be checked that is continuous on all of using the fact that .
Lemma 5.1.
For any integer , a correlated random walk satisfies
[TABLE]
Proof.
Consider the probability mass function for in Lemma 3.2. If the claim is clearly true, so we focus on the other case. We start by bounding the rightmost polynomial term in the sum. For all , we have
[TABLE]
Next, we reparameterize the marginals in terms of , where , and use a more convenient upper bound for the binomial coefficients. Observe that
[TABLE]
Stirling’s approximation states that for all we have
[TABLE]
so we can bound the products of binomial coefficients up to a polynomial factor by
[TABLE]
The proof follows from the definition of given in Equation 4. ∎
There are polynomially-many marginal terms in the sum of the PMF, so if the maximum term is exponentially small, then the total probability is exponentially small. Since the marginal terms are bounded above by an expression involving , we proceed by maximizing on its support.
Lemma 5.2.
The function is maximized at the critical point
[TABLE]
Proof.
We start by showing that is log-concave on , which implies that it is unimodal. It follows that a local maximum of is a global maximum. Since and are fixed as constants and because the numerator is positive, it is sufficient to show that
[TABLE]
is concave. Observe that the first derivative of is
[TABLE]
and the second derivative is
[TABLE]
Because on , the function is log-concave and hence unimodal.
To identify the critical points of , it suffices to determine where since is increasing. Using the previous expression for , it follows that
[TABLE]
Therefore, the critical points are the solutions of , so we have
[TABLE]
It remains to show that is a local maximum, which suffices since is unimodal. Observing that
[TABLE]
and differentiating using the chain rule, the definition of gives
[TABLE]
We know , so has the same sign as . Therefore, is a local maximum of . Using the continuity of on and log-concavity, is a global maximum. ∎
Remark 5.3.
It is worth noting that for , the asymptotic behavior of the critical point is continuous as a function of . In particular, it follows from Lemma 5.2 that .
5.2 Asymptotic Behavior of the Maximum Log Marginal
Now that we have a formula for , and hence an expression for , we want to show that
[TABLE]
for some constant . Because there are polynomially-many marginals in the sum, this leads to an exponentially small upper bound for . Define the maximum log marginal to be
[TABLE]
Equivalently, we show that for sufficiently large using asymptotic equivalences.
Lemma 5.4.
The maximum log marginal can be symmetrically expressed as
[TABLE]
Proof.
Grouping the terms of by factors of , and gives
[TABLE]
Using Equation 5, observe that the last term is
[TABLE]
The proof follows by grouping the terms of the desired expression by factors of and . ∎
The following lemma is the crux of our argument, as it presents an asymptotic equality for the maximum log marginal in the PMF for correlated random walks. We remark that we attempted to bound this quantity directly using Taylor expansions instead of an asymptotic equivalence, and while this seems possible, the expressions are unruly. Our asymptotic equivalence demonstrates that second derivative information is needed, which makes the earlier approach even more unmanageable.
Lemma 5.5.
For any and , the maximum log marginal satisfies .
Proof.
The proof is by case analysis for . In both cases we analyze as expressed in Lemma 5.4, consider a change of variables, and use L’Hospital’s rule twice. In the first case, we assume . The value of in Lemma 5.2 gives us
[TABLE]
It follows that can be simplified as
[TABLE]
To show , by the definition of asymptotic equivalence we need to prove that
[TABLE]
Make the change of variables . Since , this is equivalent to showing
[TABLE]
Using L’Hospital’s rule twice with the derivatives
[TABLE]
it follows that
[TABLE]
This completes the proof for .
The case when is analogous but messier. Making the same change of variables , it is equivalent to show that
[TABLE]
because the value of for in Lemma 5.2 gives us
[TABLE]
Denoting the left-hand side of the equation above by , one can verify that the first two derivatives of are
[TABLE]
Observing that due to convenient cancellations and using L’Hospital’s rule twice,
[TABLE]
This completes the proof for all cases of . ∎
See Lemma 3.5.
Proof.
For sufficiently large, the asymptotic equality for in Lemma 5.5 gives us
[TABLE]
It follows from our construction of and the definition of the maximum log marginal that
[TABLE]
as desired. ∎
6 Conclusion
We have made significant progress towards rigorously establishing the conjectured slow regions of the phase diagram for the six-vertex model. In particular, we prove that there exist boundary conditions for which Glauber dynamics requires exponential convergence time for the entire ferroelectric region and most of the antiferroelectric region. Furthermore, our proofs demonstrate why sharp boundaries exist between the ferroelectric phase and the disordered phase, where Glauber dynamics is believed to transition to polynomial-time convergence. We have not fully characterized the antiferroelectric phase, but our improvement over the best previous bounds in [Liu18] covers a significantly larger part of the region.
Our arguments for the slow mixing of Glauber dynamics completely break down in the disordered phase, as expected, but there has not been any rigorous work showing that in this region of the phase diagram we have fast convergence. The single exception is the unweighted case when we have , which corresponds to Eulerian orientations of the lattice region. This was shown to converge in polynomial time for all boundary conditions [RT00, LRS01, GMP04]. The approaches in these works are inherently combinatorial, and it seems that generalizing them to weighted cases will require significantly different ideas. Lastly, we emphasize that our proofs of slow mixing rely on new techniques for analyzing lattice models, which include the closed-form generating function for weighted non-backtracking walks derived in Section 4 and the exponentially small tail inequality for correlated random walks developed in Section 5.
References
- [AR05] David Allison and Nicolai Reshetikhin. Numerical study of the 6-vertex model with domain wall boundary conditions. Annales de l’institut Fourier, 55(6):1847–1869, 2005.
- [BCFR17] Prateek Bhakta, Ben Cousins, Matthew Fahrbach, and Dana Randall. Approximately sampling elements with fixed rank in graded posets. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1828–1838. SIAM, 2017.
- [BF06] Pavel Bleher and Vladimir Fokin. Exact solution of the six-vertex model with domain wall boundary conditions. Disordered phase. Communications in Mathematical Physics, 268(1):223–284, 2006.
- [BKW73] H. J. Brascamp, H. Kunz, and F. Y. Wu. Some rigorous results for the vertex model in statistical mechanics. Journal of Mathematical Physics, 14(12):1927–1932, 1973.
- [BL09] Pavel Bleher and Karl Liechty. Exact solution of the six-vertex model with domain wall boundary conditions. Ferroelectric phase. Communications in Mathematical Physics, 286(2):777–801, 2009.
- [BL10] Pavel Bleher and Karl Liechty. Exact solution of the six-vertex model with domain wall boundary conditions: Antiferroelectric phase. Communications on Pure and Applied Mathematics, 63(6):779–829, 2010.
- [BPZ02] N. M. Bogoliubov, A. G. Pronko, and M. B. Zvonarev. Boundary correlation functions of the six-vertex model. Journal of Physics A: Mathematical and General, 35(27):5525, 2002.
- [BR20] Pavel Belov and Nicolai Reshetikhin. The two-point correlation function in the six-vertex model. arXiv preprint arXiv:2012.05182, 2020.
