The interchange process on high-dimensional products

Jonathan Hermon; Justin Salez

arXiv:1905.02146·math.PR·January 29, 2021

TL;DR

This paper proves that the mixing time of the interchange process on the $n$-dimensional hypercube is of order $n$, shows that macroscopic cycles emerge in constant time, and determines the order of magnitude of the associated comparison and log-Sobolev constants.

Contribution

It resolves a long-standing conjecture about the mixing time on hypercubes and extends results to products of arbitrary fixed-size graphs.

Findings

01. Mixing time on hypercube is of order n

02. Macroscopic cycles emerge in constant time

03. Log-Sobolev constant is of order 1

Abstract

We resolve a long-standing conjecture of Wilson (2004), reiterated by Oliveira (2016), asserting that the mixing-time of the unit-rate Interchange Process on the $n$-dimensional hypercube is of order $n$. This follows from a sharp inequality established at the level of Dirichlet forms, from which we also deduce that macroscopic cycles emerge in constant time, and that the log-Sobolev constant of the exclusion process is of order $1$. Beyond the hypercube, our results apply to cartesian products of arbitrary graphs of fixed size, shedding light on a broad conjecture of Oliveira (2013).

Tables (1)

Table 1: Some classical orders of magnitude.

$G$	$|V_{G}|$	$t_{\textsc{rel}}^{\textsc{rw}}(G)$	$t_{\textsc{mix}}^{\textsc{rw}}(G)$	$t_{\textsc{mix}}^{\textsc{ip}}(G)$
$\mathcal{K}_{n}$	$n$	$1/n$	$1/n$	$(\log n)/n$
$\mathcal{P}_{n}$	$n$	$n^{2}$	$n^{2}$	$n^{2}\log n$
$\mathbb{Z}_{2}^{n}$	$2^{n}$	$1$	$\log n$	$n$

Equations (186)

$\mathcal{L}_{G}^{\textsc{ip}}f(\sigma)$

$t_{\textsc{mix}}^{\textsc{ip}}(G)$

$t_{\textsc{mix}}^{\textsc{ip}}(\mathcal{K}_{n})$

$t_{\textsc{mix}}^{\textsc{ip}}(\mathcal{P}_{n})$

$t_{\textsc{mix}}^{\textsc{ip}}(\mathbb{Z}_{2}^{n})$

$n\lesssim$

$\mathcal{L}_{G}^{\textsc{rw}}f(x)$

$t_{\textsc{rel}}^{\textsc{ip}}(G)$

$t_{\textsc{mix}}^{\textsc{rw}}(G)$

$t_{\textsc{mix}}^{\textsc{ip}}(G)$

$\mathcal{E}_{G}^{\textsc{ip}}(f)$

$\mathcal{E}_{\mathcal{K}}^{\textsc{ip}}(f)$

$\chi_{G}^{\textsc{ip}}$

$t_{\textsc{mix}}^{\textsc{ip}}(G)$

$\frac{t_{\textsc{rel}}^{\textsc{rw}}(G_{n})}{t_{\textsc{mix}}^{\textsc{rw}}(G_{n})}$

$V_{G_{1}}=\dots=V_{G_{n}}=\{1,\dots,\ell\}$

$\chi_{G}^{\textsc{ip}}$

$t_{\textsc{mix}}^{\textsc{ip}}(G)\asymp_{\ell}\log|V_{G}|.$

$t_{\textsc{rel}}^{\textsc{rw}}(G)$

$t_{\textsc{mix}}^{\textsc{rw}}(G)$

$\liminf_{n\to\infty}\left\{\frac{t_{\textsc{mix}}^{\textsc{ip}}(\mathcal{K}_{\ell}^{n})}{\ell^{n}}\right\}$

$\limsup_{n\to\infty}\left\{\frac{t_{\textsc{mix}}^{\textsc{ip}}(\mathcal{K}_{\ell}^{n})}{\ell^{n}}\right\}$

$\chi_{G}^{\textsc{ip}}$

$t_{\textsc{cyc}}^{\textsc{ip}}(G)$

$t_{\textsc{cyc}}^{\textsc{ip}}(G)$

$t_{\textsc{cyc}}^{\textsc{ip}}(G)$

$\mathcal{L}_{G}^{\textsc{ex-}k}f(S)$

$\frac{t_{\textsc{rel}}^{\textsc{rw}}(G)}{t_{\textsc{rel}}^{\textsc{rw}}(\mathcal{K})}\leq$

${\mathbb{E}}[f\log f]-{\mathbb{E}}[f]\log{\mathbb{E}}[f]$

$\frac{1}{2}t_{\textsc{rel}}^{\textsc{rw}}(G)\leq$

$\rho_{\mathcal{K}_{n}}^{\textsc{ex-}k}$

Full text

Abstract

We resolve a long-standing conjecture of Wilson (2004), reiterated by Oliveira (2016), asserting that the mixing time of the Interchange Process with unit edge rates on the $n$-dimensional hypercube is of order $n$. This follows from a sharp inequality established at the level of Dirichlet forms, from which we also deduce that macroscopic cycles emerge in constant time, and that the log-Sobolev constant of the exclusion process is of order $1$. Beyond the hypercube, our results apply to cartesian products of arbitrary graphs of fixed size, shedding light on a broad conjecture of Oliveira (2013).

1 Introduction

1.1 Interchange process

Let $G=(V_{G},E_{G})$ be a finite undirected connected graph. The interchange process (ip) on $G$ is the continuous-time random walk $(\xi_{t})_{t\geq 0}$ on the symmetric group $\mathfrak{S}(V_{G})$ with initial condition $\xi_{0}=\textrm{id}$ and the following Markov generator: for all observables $f\colon\mathfrak{S}(V_{G})\to\mathbb{R}$ ,

[TABLE]

where $\tau_{e}$ denotes the transposition of the endpoints of the edge $e$ . One may think of each vertex as carrying a labelled particle, and of the edges as being equipped with independent unit-rate Poisson clocks: whenever a clock rings, the particles sitting at the endpoints of the corresponding edge simply exchange their positions.
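The particle description above can be turned into a short simulation. The following is a minimal sketch, not taken from the paper: the function name `simulate_interchange` and its parameters are hypothetical, and it assumes the graph has at least one edge. It uses the standard fact that the superposition of $|E_{G}|$ independent unit-rate Poisson clocks is a single clock of rate $|E_{G}|$ whose rings are assigned to uniformly chosen edges.

```python
import random

def simulate_interchange(edges, vertices, t_max, seed=0):
    """Run the interchange process up to time t_max.

    Hypothetical helper: `edges` is a list of vertex pairs, `vertices` an iterable.
    Returns a dict mapping each vertex to the particle currently sitting on it
    (particle v starts at vertex v).
    """
    rng = random.Random(seed)
    particle_at = {v: v for v in vertices}
    rate = len(edges)  # total rate of the merged Poisson clocks (assumes rate > 0)
    t = 0.0
    while True:
        t += rng.expovariate(rate)  # waiting time until the next clock rings
        if t > t_max:
            break
        x, y = rng.choice(edges)    # the ringing edge is uniform among all edges
        # the two particles at the endpoints exchange their positions
        particle_at[x], particle_at[y] = particle_at[y], particle_at[x]
    return particle_at

# Example: the 4-cycle, run for 5 units of time.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
final_state = simulate_interchange(cycle_edges, range(4), t_max=5.0, seed=1)
```

Whatever the random outcome, `final_state` is always a bijection of the vertex set, which is the sense in which the process is a random walk on the symmetric group.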

Since $\mathcal{L}^{{\textsc{ip}}}_{G}$ is symmetric and irreducible, the law of $\xi_{t}$ converges to that of a uniform permutation $\xi_{\star}$ as $t\to\infty$ . We shall here be interested in the time-scale on which this convergence occurs, as traditionally measured by the total-variation mixing time:

[TABLE]

Understanding the relation between this fundamental quantity and the geometry of $G$ is a challenging problem, to which a remarkable variety of tools have been applied: representation theory [13, 9], couplings [25, 32, 2, 29], eigenvectors [32, 20], functional inequalities [24, 33, 8], comparison methods [11, 4], etc. The question is of course particularly meaningful when the number of states becomes large, and one is thus naturally led to study asymptotics along various growing sequences of graphs $(G_{n})_{n\geq 1}$ .

The case of the $n$-clique $G_{n}=\mathcal{K}_{n}$ has been extensively studied under the name random transposition shuffle. In particular, Diaconis and Shahshahani [13] proved that

[TABLE]

In fact, this was shown for any precision $\varepsilon\in(0,1)$ instead of $\frac{1}{e}$ in (2), thereby establishing the very first instance of what is now called a cutoff phenomenon [10]. Another well-understood case is the $n$-path $\mathcal{P}_{n}$, for which Lacoin [23] recently proved cutoff at time

[TABLE]

There are, however, many simple graph sequences along which even the order of magnitude of $t_{\textsc{mix}}^{{\textsc{ip}}}(G_{n})$ is unknown. An emblematic example (which was the initial motivation for our work) is the boolean hypercube $\mathbb{Z}_{2}^{n}$, for which Wilson [32] conjectured in 2004 that

[TABLE]

This was reiterated as Problem 4.2 of the AIM workshop Markov Chains Mixing Times [28]. Here and throughout the paper, $\asymp$ and $\lesssim$ denote equality and inequality up to universal positive multiplicative constants. The current best estimates are

[TABLE]

The lower bound is due to Wilson [32, Section 9.1], and the upper bound was recently obtained by Alon and Kozma [4, Corollary 10] as a special case of a much more general estimate which we will now discuss.

1.2 The big picture

An important observation about the ip is that the motion of a single particle is itself a Markov process. The generator is the usual graph Laplacian, which acts on functions $f\colon V_{G}\to\mathbb{R}$ by

[TABLE]

It is natural to expect the mixing properties of $\mathcal{L}^{{\textsc{ip}}}_{G}$ and $\mathcal{L}^{{\textsc{rw}}}_{G}$ to be intimately related. Indeed, a celebrated conjecture of Aldous, now resolved by Caputo, Liggett and Richthammer [8], asserts that the relaxation times (inverse spectral gaps) of these two operators coincide:

[TABLE]

Recall that $t_{\textsc{rel}}^{{\textsc{rw}}}(G)$ classically controls the mixing time $t_{\textsc{mix}}^{{\textsc{rw}}}(G)$ of the single-particle dynamics (7), up to a correction which is only logarithmic in the number of vertices:

[TABLE]

Inspired by the identity (8), Oliveira [29] conjectured that the same control applies to $t_{\textsc{mix}}^{{\textsc{ip}}}(G)$ . More precisely, he proposed the following simple-looking but far-reaching estimate, which is sharp in the three very different graph examples mentioned above (see Table 1).

Conjecture 1 (Oliveira [29]).

For any connected graph $G$ ,

[TABLE]

Some partial progress on Conjecture 1 can be found in [18, 17]. It is easy to prove that $t_{\textsc{rel}}^{{\textsc{rw}}}(G)\log|V_{G}|$ is comparable, up to universal constants, to the mixing time of $|V_{G}|$ independent particles [18, §2.1]. Oliveira’s conjecture thus has the following probabilistic interpretation: the mixing time of the interchange process is at most a universal constant multiple of the mixing time of $|V_{G}|$ independent particles (in fact, this is how it is phrased in [29]). We also note that, under a mild spectral condition, one has $t_{\textsc{mix}}^{{\textsc{ip}}}(G)\gtrsim t_{\textsc{rel}}^{{\textsc{rw}}}(G)\log|V_{G}|$ [18, Theorem 1.4], but that in general $t_{\textsc{mix}}^{{\textsc{ip}}}(G)$ can be of smaller order than $t_{\textsc{rel}}^{{\textsc{rw}}}(G)\log|V_{G}|$. (Footnote: One such example can be obtained by taking $G_{n}$ to be the graph obtained by attaching a path of some diverging length $L_{n}$ to a clique of size $n$, where $\log L_{n}=o(\log n)$. A similar example is analyzed in [16], where it is shown that the order of the total-variation mixing time of the interchange process may increase as a result of increasing some of the edge rates by a $1+o(1)$ multiplicative factor, or by adding a small number of edges to the base graph, in a manner that makes the original graph and the new graph quasi-isometric.)

One of the most powerful techniques to bound the mixing time of a complicated Markov chain consists in comparing its Dirichlet form with that of a better understood chain having the same state space and stationary law, see the seminal paper by Diaconis and Saloff-Coste [11]. In the case of ip, the Dirichlet form is given by

[TABLE]

and a natural candidate for the comparison is the mean-field version $\mathcal{E}^{{\textsc{ip}}}_{\mathcal{K}}$ , where $\mathcal{K}$ denotes the complete graph on the same vertex set as $G$ . Let us therefore define the comparison constant of the ip on $G$ as the smallest number $\chi_{G}^{\textsc{ip}}$ such that the inequality

[TABLE]

holds for all $f\colon\mathfrak{S}(V_{G})\to\mathbb{R}$ . This constant is the optimal price to pay in order to systematically transfer quantitative estimates from ip on $\mathcal{K}$ to ip on $G$ . In a recent breakthrough, Alon and Kozma [4, Theorem 1] established the following remarkably general estimate.

Theorem 1 (Alon and Kozma [4]).

For any regular connected graph $G$ ,

[TABLE]

In particular, they deduced the following bound on the mixing times.

Corollary 1 (Alon and Kozma [4]).

For any regular connected graph $G$ ,

[TABLE]

Note that this proves Conjecture 1 along sequences $(G_{n})_{n\geq 1}$ satisfying $t_{\textsc{mix}}^{{\textsc{rw}}}(G_{n})\asymp t_{\textsc{rel}}^{{\textsc{rw}}}(G_{n})$ . Examples include $\mathcal{K}_{n}$ , $\mathcal{P}_{n}$ , or the discrete tori $\mathbb{Z}_{n},\mathbb{Z}_{n}^{2},\mathbb{Z}_{n}^{3}$ , etc. On the other hand, for various other graphs such as the hypercube $\mathbb{Z}_{2}^{n}$ or bounded-degree expanders, one has

[TABLE]

and Theorem 1 fails at capturing the conjectured asymptotics. In light of this, the next step towards Conjecture 1 should naturally consist in understanding the mixing properties of the ip on graphs satisfying (15). This is precisely the program to which the present paper is intended to contribute.

2 Results

2.1 Comparison constant and mixing time

A natural and important class of graphs satisfying (15) are the “high-dimensional” graphs obtained by taking cartesian products of a large number of small graphs. Recall that the cartesian product $G=G_{1}\times\cdots\times G_{n}$ of $n$ graphs $G_{1},\ldots,G_{n}$ is the graph with vertex set $V_{G_{1}}\times\cdots\times V_{G_{n}},$ and where the neighbors of a vertex $x=(x_{1},\ldots,x_{n})$ are obtained by replacing an arbitrary coordinate $x_{i}$ ( $1\leq i\leq n$ ) with an arbitrary neighbor of $x_{i}$ in the graph $G_{i}$ . Note that $G$ is connected as soon as $G_{1},\ldots,G_{n}$ are. We will allow the dimension $n$ to grow arbitrarily but will keep the side-length fixed, meaning that

[TABLE]

for some fixed integer $\ell\geq 2$. Two simple examples are the $n$-dimensional torus $\mathbb{Z}_{\ell}^{n}=\mathbb{Z}_{\ell}\times\cdots\times\mathbb{Z}_{\ell}$, and the $n$-dimensional Hamming graph $\mathcal{K}_{\ell}^{n}=\mathcal{K}_{\ell}\times\cdots\times\mathcal{K}_{\ell}$. In particular, when $\ell=2$, we recover the hypercube. Our main result is the determination of the exact order of magnitude of $\chi^{{\textsc{ip}}}_{G}$ on all product graphs of fixed side-length.
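To make the definition of the cartesian product concrete, here is a small sketch that builds the vertex and edge sets of a product from its factors. The function names `cartesian_product_graph`, `hypercube` and `hamming_graph` are illustrative choices, not notation from the paper; edges are stored as unordered pairs.

```python
from itertools import combinations, product

def cartesian_product_graph(factors):
    """Build the cartesian product of small graphs.

    `factors` is a list of (vertex_list, edge_list) pairs, one per coordinate.
    Returns (vertices, edges), where vertices are tuples and edges are frozensets.
    """
    vertices = list(product(*[list(vs) for vs, _ in factors]))
    edges = set()
    for x in vertices:
        for i, (_, factor_edges) in enumerate(factors):
            for a, b in factor_edges:
                for u, w in ((a, b), (b, a)):
                    if x[i] == u:
                        # replace coordinate i by a neighbor of x[i] in the i-th factor
                        y = x[:i] + (w,) + x[i + 1:]
                        edges.add(frozenset((x, y)))
    return vertices, edges

def hamming_graph(ell, n):
    """n-fold product of the complete graph on ell vertices."""
    complete = (list(range(ell)), list(combinations(range(ell), 2)))
    return cartesian_product_graph([complete] * n)

def hypercube(n):
    """The case ell = 2 of the Hamming graph."""
    return hamming_graph(2, n)

cube_vertices, cube_edges = hypercube(3)
```

For the three-dimensional hypercube this produces 8 vertices and 12 edges, and each vertex of `hamming_graph(ell, n)` has degree $n(\ell-1)$.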

Theorem 2 (Comparison).

All connected product graphs of side-length $\ell\geq 2$ satisfy

[TABLE]

where $\asymp_{\ell}$ means equality up to multiplicative constants that depend only on $\ell$ .

Our estimate on $\chi^{{\textsc{ip}}}_{G}$ classically yields an upper bound on the mixing time, even in the strong $L^{2}$ sense (see [4, Lemma 6]). Moreover, a standard application of Wilson’s method (see [20, Proposition 1.2]) yields a matching lower bound. We thus obtain the following result, which confirms in particular Wilson’s long-standing prediction (5).

Corollary 2 (Mixing time).

All connected product graphs of side-length $\ell\geq 2$ satisfy

[TABLE]

Note that on product graphs, the single-particle dynamics (7) updates each coordinate independently. Consequently, any connected product graph of side-length $\ell\geq 2$ satisfies

[TABLE]

(The double logarithm comes from the fact that there are $\log_{\ell}|V_{G}|$ coordinates, and that the time it takes to update all of them a constant number of times is logarithmic in the number of coordinates). Thus, our Corollary 2 resolves Conjecture 1 for all product graphs of fixed side-length, in a regime where Theorem 1 always fails at doing so.

Remark 1 (Pre-cutoff).

Let us comment on the constants hidden in our results, at least for the Hamming graph $\mathcal{K}_{\ell}^{n}$ . Wilson’s eigenvector method [32] produces the precise lower bound

[TABLE]

with the constant $c_{\ell}>0$ being completely explicit (for example, we have $c_{2}=\frac{\ln 4}{2}$ ). On the other hand, our Corollary 2 guarantees that

[TABLE]

for some constant $C_{\ell}<\infty$ that can certainly be made explicit as well, by a careful examination of our proof. However, we did not try to optimize the value of $C_{\ell}$, nor even to extract its rough dependence on $\ell$, because we believe that our comparison-based approach is inherently too rough to produce sharp constants anyway. Nevertheless, we note that neither Wilson’s lower bound $c_{\ell}$ nor our upper bound $C_{\ell}$ changes if we replace $\frac{1}{e}$ by any other precision $\varepsilon\in(0,1)$ in the definition (2), thereby establishing what is known as a pre-cutoff. Improving this to a true cutoff (i.e. $C_{\ell}=c_{\ell}$) remains a fascinating open problem.

We would like to close this section with a plausible extension of Theorem 2, inspired by an analogous result that we recently obtained for the Zero-Range Process [19, Corollary 3].

Conjecture 2 (General comparison).

All finite connected graphs satisfy

[TABLE]

Note that a proof of this would immediately imply Conjecture 1.

2.2 Emergence of macroscopic cycles

One statistic of particular interest is the cycle structure of the random permutation $\xi_{t}$, as a function of the time $t$. On the infinite $d$-dimensional lattice $\mathbb{Z}^{d}$ with $d\geq 3$, a long-standing conjecture of Tóth [31] predicts a phase transition, indicated by the sudden emergence of infinite cycles at some critical time $t=t_{c}\in(0,\infty)$. This is related to a major open problem about the so-called quantum Heisenberg ferromagnet in statistical mechanics. To the best of our knowledge, the phase transition has only been proved on infinite regular trees [5, 15].

In the case of a large finite graph $G$ , the relative lengths of cycles in a uniform random permutation asymptotically follow the Poisson-Dirichlet distribution (see, e.g., [30]). In particular, $\xi_{t}$ is likely to contain a macroscopic cycle at time $t\geq t_{\textsc{mix}}^{{\textsc{ip}}}(G)$ . By analogy with Tóth’s conjecture, one should however expect macroscopic cycles to emerge much before the mixing time. This was established in a precise sense by Schramm [30] in the mean-field case where $G=\mathcal{K}_{n}$ , see also [6, 7]. Alas, results on other finite graphs are quite limited. In [3], Alon and Kozma obtained intriguing identities – involving the irreducible representations of the symmetric group – for the expected number of cycles of a given size in $\xi_{t}$ on any finite graph. Using these identities, they obtained a comparison-based estimate on the quantity

[TABLE]

Theorem 3 (Alon and Kozma [4]).

All finite graphs $G$ satisfy

[TABLE]

Thus, our main result implies that on high-dimensional graphs, macroscopic cycles do indeed emerge much before a single particle even mixes (recall (20)).

Corollary 3 (Giant cycles).

All connected product graphs $G$ of fixed side-length $\ell$ satisfy

[TABLE]

We note that Corollary 3 may not be sharp: it is actually quite possible that the macroscopic cycles already emerge at time $\Theta(1/n)$ (where $n$ is the number of terms in the product), although proving this would require new ideas beyond the Alon-Kozma estimate (25). When specialized to the hypercube $\mathbb{Z}_{2}^{n}$, Corollary 3 complements a result of Kotecký, Miłoś and Ueltschi [22] regarding the appearance of mesoscopic cycles. It also complements a recent result by Adamczak, Kotowski and Miłoś [1], who established a phase transition for the emergence of macroscopic cycles on the $2$-dimensional Hamming graph $\mathcal{K}_{n}^{2}$. Finally, we note that, by virtue of [4, Theorem 13], our main result also has direct implications on the magnetisation of the quantum Heisenberg ferromagnet.

2.3 Exclusion process

Another widely-studied interacting particle system is the exclusion process [14, 27, 26, 21]. For a finite graph $G$ and an integer $0<k<|V_{G}|$, the $k$-particle exclusion process (ex-k) on $G$ is a Markov chain on the set ${V_{G}\choose k}$ of $k$-element subsets of $V_{G}$, with generator given by

[TABLE]

where $\oplus$ denotes the symmetric difference and $\partial S$ the edge-boundary of $S$ in $G$ . This process describes the set occupied by $k$ fixed particles under the ip. More precisely, the ex-k $(\zeta_{t})_{t\geq 0}$ with initial condition $S\in{V_{G}\choose k}$ can be constructed from the ip $(\xi_{t})_{t\geq 0}$ by setting $\zeta_{t}:=\xi_{t}^{-1}(S)$ . This observation, together with (8), easily implies that

[TABLE]

where $\mathcal{K}$ denotes the complete graph on $V_{G}$ , and $\chi^{{\textsc{ex-}k}}_{G}$ the optimal constant in the functional inequality $\mathcal{E}^{{\textsc{ex-}k}}_{\mathcal{K}}\left(\cdot\right)\leq\chi_{G}^{{\textsc{ex-}k}}\,\mathcal{E}^{{\textsc{ex-}k}}_{G}\left(\cdot\right)$ . Recalling (19) and the fact that $t_{\textsc{rel}}^{\textsc{rw}}(\mathcal{K})=1/|V_{G}|$ , we obtain the following corollary.

Corollary 4 (Comparison constant for ex-k).

For all connected product graphs $G$ of side-length $\ell\geq 2$ , and all $0<k<|V_{G}|$ , we have $\chi^{{\textsc{ex-}k}}_{G}\asymp_{\ell}|V_{G}|.$

As a consequence, one can transfer many quantitative estimates from $\mathcal{K}$ to $G$ . This includes the inverse log-Sobolev constant $\rho^{{\textsc{ex-}k}}_{G}$ , defined as the smallest number such that

[TABLE]

for all $f\colon{V_{G}\choose k}\to(0,\infty)$ , where ${\mathbb{E}}[\cdot]$ is expectation under the uniform law. This constant provides powerful controls on the underlying Markov semi-group [12]. It is easy to see that

[TABLE]

On the other hand, the log-Sobolev constant of the exclusion process on the complete graph (Bernoulli-Laplace model) was determined by Lee and Yau [24, Theorem 5]:

[TABLE]

In particular, this allows us to pinpoint the exact order of $\rho^{{\textsc{ex-}k}}_{G}$ in the dense-particle regime.

Corollary 5 (Log-Sobolev constant of ex-k).

Fix $\varepsilon\in(0,1)$ , $\ell\geq 2$ . Then, for all connected product graphs $G$ of side-length $\ell$ and all $k\in\left[\varepsilon|V_{G}|,(1-\varepsilon){|V_{G}|}\right]$ , we have $\rho^{{\textsc{ex-}k}}_{G}\asymp_{\ell,\varepsilon}1.$

Finally, we note that our main result also implies an upper bound of order $n$ (uniformly in $1\leq k\leq 2^{n}$) on the $L^{2}$ mixing time of ex-k on the hypercube $\mathbb{Z}_{2}^{n}$, complementing a total-variation estimate recently obtained by Hermon and Pymar [18] (as part of a much more general result).

3 Proof of the main result

3.1 Proof outline

The lower bound in Theorem 2 is easy. Indeed, if $G$ is any finite graph and if $\mathcal{K}$ denotes the complete graph on $V_{G}$ , then the very definition of $\chi_{G}^{\textsc{ip}}$ implies

[TABLE]

where the first equality uses (8). For a graph product of side-length $\ell$ , we deduce

[TABLE]

The remainder of the paper is devoted to proving a matching upper bound. To do so, we combine four simple ideas, each one corresponding to a step of the proof.

1. Our first step consists in reducing the analysis of ip on a general $n$-dimensional graph-product $G$ of side-length $\ell$ to the special case of the Hamming graph $\mathcal{K}_{\ell}^{n}$. This reduction relies on the classical method of canonical paths. An important simplification is that, by a standard path-lifting procedure, it is actually enough to compare the single-particle dynamics on $G$ to that on $\mathcal{K}_{\ell}^{n}$. See Section 3.2 for details.

2. Our second step consists in re-interpreting the single-particle dynamics on $\mathcal{K}_{\ell}^{n}$ as a random walk on the additive group $\mathbb{Z}_{\ell}^{n}$, with the increment law $\mu$ being uniform over vectors with a single non-zero coordinate. This algebraic reformulation is performed in Section 3.3. It will allow us to use group-theoretical methods.

3. The third step consists in exploiting the celebrated octopus inequality [8, Theorem 2.3] to compare the ip with increment law $\mu$ to the ip with increment law $\mu^{\star t}=\mu\star\cdots\star\mu$ ($t$-fold convolution), at a cost of order $t$. This is directly inspired by what Alon and Kozma did in [4]. However, instead of taking $t=\Theta(n\log n)$ so as to ensure that $\mu^{\star t}$ is close to uniform (all coordinates being refreshed with high probability), we crucially take $t=\Theta(n)$ only, with the prefactor being carefully adjusted so that only roughly half of the coordinates get refreshed under $\mu^{\star t}$. This important point is made rigorous by an application of the de Moivre-Laplace Local Limit Theorem, see Section 3.4.

4. Finally, the last step consists in showing that, although the increment law $\mu^{\star t}$ is still very far from uniform (because of our choice of $t$), the associated Dirichlet form is actually comparable to the one with uniform increments. This is achieved by constructing canonical paths of length $2$, the underlying intuition being that randomizing all coordinates of a vector can be achieved by randomizing half the coordinates in one step, and the other half in a second step. This is described in Section 3.5.
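The contrast drawn in the third step between $t=\Theta(n\log n)$ and $t=\Theta(n)$ can be illustrated with a back-of-the-envelope computation. The sketch below assumes a simplified model in which each of the $t$ steps selects one of the $n$ coordinates uniformly at random; it ignores any laziness in the actual increment law used in the proof, and the helper name `touched_fraction` is hypothetical.

```python
import math

def touched_fraction(n, t):
    """Probability that a fixed coordinate is selected at least once
    among t independent uniform picks from n coordinates."""
    return 1.0 - (1.0 - 1.0 / n) ** t

n = 1000
t_linear = round(n * math.log(2))          # t of order n: about half the coordinates
t_loglinear = round(2 * n * math.log(n))   # t of order n log n: almost all coordinates

half = touched_fraction(n, t_linear)       # close to 1/2
nearly_all = touched_fraction(n, t_loglinear)  # close to 1
```

In this simplified model, a number of steps close to $n\ln 2$ leaves roughly half of the coordinates untouched, whereas order $n\log n$ steps are needed before every coordinate is likely to have been selected.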

3.2 Canonical paths

Our starting point is a powerful tool for comparing Dirichlet forms known as canonical paths, see e.g., [11]. As a warm-up, consider the single-particle dynamics (7) with Dirichlet form

[TABLE]

As usual, a path in $G$ will be a finite sequence of vertices $\gamma=(\gamma_{0},\ldots,\gamma_{k})$ $(k\geq 0)$ such that $e_{i}:=\{\gamma_{i-1},\gamma_{i}\}\in E_{G}$ for each $1\leq i\leq k$. We call $k$ the length of the path and denote it by $|\gamma|$. Also, we refer to $\gamma_{0},\gamma_{k}$ as the endpoints of the path $\gamma$, and to $e_{1},\ldots,e_{k}$ as the traversed edges. By a random path in $G$, we simply mean a random variable taking values in the (countable) set of all paths in $G$. We write ${\mathbb{E}}[\cdot]$ for the corresponding expectation.

Theorem 4 (Canonical paths, see e.g. [11]).

Let $G$ , $H$ be connected graphs on the same vertex set. For each edge $f\in E_{H}$ , let $\gamma_{f}$ be any random path in $G$ with the same endpoints as $f$ . Then, $\mathcal{E}_{H}^{{\textsc{rw}}}\leq\kappa\,\mathcal{E}_{G}^{{\textsc{rw}}}$ where $\kappa$ is the congestion, defined as follows:

[TABLE]

We now make three elementary but important remarks.

Remark 2 (Trivial choice).

Even in the worst-case situation where $H$ is the complete graph on $V_{G}$ , we can always achieve the poor bound

[TABLE]

by considering a spanning tree $T$ of $G$ and letting $\gamma_{f}$ be the unique simple path in $T$ connecting the endpoints of $f$ . Note that this path is actually non-random. Exploiting randomness and the particular structure of $H$ to design paths with a low congestion is a matter of art.

Remark 3 (Congestion behaves well under products).

If for $1\leq i\leq n$ , we can compare $G_{i}$ to $H_{i}$ with congestion $\kappa_{i}$ , then we can compare $G_{1}\times\cdots\times G_{n}$ to $H_{1}\times\cdots\times H_{n}$ with congestion

[TABLE]

by considering paths that only evolve along a single coordinate, in the obvious way.

Remark 4 (Cayley graphs).

Theorem 4 simplifies when $G=\textrm{Cay}({\mathbb{G}},A)$ and $H=\textrm{Cay}({\mathbb{G}},B)$ are Cayley graphs generated by subsets $A,B$ of a finite group ${\mathbb{G}}$ . Indeed, any word $\omega=(\omega_{1},\ldots,\omega_{k})\in A^{k}$ can be used to define a path

[TABLE]

in $G$ from $x\in{\mathbb{G}}$ to $xb$ , where $b=\omega_{1}\cdots\omega_{k}$ is the evaluation of $\omega$ in ${\mathbb{G}}$ . Consequently, we only have to specify, for each $b\in B$ , a random word $\omega_{b}$ over $A$ whose evaluation is $b$ . Moreover, a straightforward computation shows that the resulting congestion is simply

[TABLE]

where $|\omega|$ denotes the length of a word $\omega$ , and $N(a,\omega)$ the number of occurrences of $a$ in it.

Remark 4 applies in particular to the ip on any graph $G$ . Indeed, one has

[TABLE]

with ${\mathbb{G}}=\mathfrak{S}(V_{G})$ and $A=\{\tau_{e}\colon e\in E_{G}\}$ . Moreover, any path in $G$ with endpoints $f=\{x,y\}$ and traversed edges $e_{1},\ldots,e_{k}$ can be lifted to a word over $A$ that evaluates to $\tau_{f}$ , namely:

[TABLE]

Since the congestion is multiplied by at most $4$ ( $2$ for the length of the word, and $2$ for the number of occurrences of a letter in it), we obtain the following classical result.

Corollary 6 (From canonical paths for rw to canonical paths for ip).

Under the exact same assumptions (and notation) as in Theorem 4, we also have

[TABLE]
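The path-lifting step behind this corollary can be checked on a small example. The sketch below assumes the standard palindromic lift $e_{1},\ldots,e_{k-1},e_{k},e_{k-1},\ldots,e_{1}$ and a simple path (no repeated vertex); the displayed formula is not reproduced here, so this specific word is an assumption, although it matches the stated factors of $2$ for the word length and for the number of occurrences of each letter. The function names are hypothetical.

```python
def lift_path_to_word(path):
    """Lift a simple path (v0, ..., vk), k >= 1, to the palindromic word of edges
    e1, ..., e_{k-1}, e_k, e_{k-1}, ..., e1 (each edge given as a vertex pair)."""
    assert len(path) >= 2 and len(set(path)) == len(path), "expects a simple path"
    edges = list(zip(path[:-1], path[1:]))
    return edges[:-1] + [edges[-1]] + list(reversed(edges[:-1]))

def evaluate_word(word, vertices):
    """Multiply the transpositions of the word, starting from the identity.
    Returns the resulting permutation as a dict vertex -> image."""
    perm = {v: v for v in vertices}
    for a, b in word:
        # right-multiply by the transposition (a b): swap the images of a and b
        perm[a], perm[b] = perm[b], perm[a]
    return perm

path = (0, 1, 2, 3, 4)
word = lift_path_to_word(path)
lifted = evaluate_word(word, range(6))
```

Here `lifted` exchanges the endpoints `0` and `4` and fixes every other vertex, the word has length $2k-1$, and each edge is used at most twice.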

Combining this with Remarks 2 and 3, we obtain the following inequality, which reduces the upper bound of Theorem 2 to the extremal case where $G$ is the $(n,\ell)$-Hamming graph:

[TABLE]

Corollary 7.

For any $n$-dimensional connected product graph $G$ of side-length $\ell$,

[TABLE]

In light of this result, the upper bound in Theorem 2 now boils down to the claim

[TABLE]

for each $\ell\geq 2$ , to which the remainder of the paper is devoted.

3.3 The octopus inequality

From now on, we fix the side-length $\ell\geq 2$ and the dimension $n\geq 1$ . Writing $\mathcal{K}$ for the complete graph on $\{1,\ldots,\ell\}^{n}$ , our goal is to establish the comparison

[TABLE]

where $c$ does not depend on $n$ . We start by observing that the random walks on $\mathcal{K}_{\ell}^{n}$ and on $\mathcal{K}$ can both be conveniently viewed as random walks on the group

[TABLE]

equipped with coordinate-wise addition mod $\ell$ (which we will simply denote by $+$ ). Given a probability measure $\mu$ on ${\mathbb{G}}$ , we recall that the random walk with increment law $\mu$ has Dirichlet form

[TABLE]

for all $f\colon{\mathbb{G}}\to\mathbb{R}$ . In particular, we have the representation

[TABLE]

where $\pi$ and $\rho_{k}$ $(0\leq k\leq n)$ respectively denote the uniform distributions on ${\mathbb{G}}$ and on

[TABLE]

Here $\mathrm{supp}(x)=\{i\colon x_{i}\neq 0\}$ naturally denotes the support of $x=(x_{1},\ldots,x_{n})\in{\mathbb{G}}$ . Similarly, the ip on ${\mathbb{G}}$ with increment law $\mu$ has Dirichlet form

[TABLE]

for $f\colon\mathfrak{S}({\mathbb{G}})\to\mathbb{R}$ , with the interpretation $\tau_{\{x,x+z\}}=\textrm{id}$ when $z=0$ . In view of (49) (with ip instead of rw), our claim (46) rewrites as

[TABLE]

for some (possibly different) constant $c<\infty$ that is only allowed to depend on $\ell$ . The proof will crucially rely on the following elegant application of the octopus inequality [8, Theorem 2.3], which we borrow from Alon and Kozma [4]. We include a short proof as our setting is here slightly different. The convolution of two probability measures $\mu,\nu$ on ${\mathbb{G}}$ is defined by

[TABLE]

Also, we say that a measure $\mu$ on ${\mathbb{G}}$ is symmetric if $\mu(z)=\mu(-z)$ for all $z\in{\mathbb{G}}$ .
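For small parameters, convolution and symmetry on ${\mathbb{G}}$ can be computed by brute force. The sketch below assumes the usual group convolution $(\mu\star\nu)(z)=\sum_{x}\mu(x)\nu(z-x)$ with coordinate-wise arithmetic mod $\ell$, and represents a measure as a dictionary from tuples to masses; all function names are illustrative.

```python
def convolve(mu, nu, ell):
    """Convolution of two measures on Z_ell^n, given as dicts tuple -> mass."""
    out = {}
    for x, px in mu.items():
        for y, py in nu.items():
            z = tuple((a + b) % ell for a, b in zip(x, y))
            out[z] = out.get(z, 0.0) + px * py
    return out

def is_symmetric(mu, ell, tol=1e-12):
    """Check mu(z) == mu(-z) for every z carrying mass."""
    return all(
        abs(p - mu.get(tuple((-a) % ell for a in z), 0.0)) <= tol
        for z, p in mu.items()
    )

def single_coordinate_measure(n, ell):
    """Uniform law on vectors of Z_ell^n with exactly one non-zero coordinate."""
    support = []
    for i in range(n):
        for a in range(1, ell):
            support.append(tuple(a if j == i else 0 for j in range(n)))
    return {z: 1.0 / len(support) for z in support}

mu = single_coordinate_measure(n=2, ell=3)
mu2 = convolve(mu, mu, ell=3)
```

With $n=2$ and $\ell=3$, the measure `mu` is symmetric, `mu2` has total mass $1$, and `mu2` puts mass $1/4$ on the zero vector, since a step followed by its inverse returns to the origin.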

Lemma 3 (Comparison for convolutions).

For any symmetric probability measure $\mu$ on ${\mathbb{G}}$ ,

[TABLE]

Proof.

If $\mu(0)=0$ , the octopus inequality [8, Theorem 2.3] asserts that

[TABLE]

for all $f\colon\mathfrak{S}({\mathbb{G}})\to\mathbb{R}$ and $x\in{\mathbb{G}}$ (the factor $\frac{1}{2}$ on the right-hand side compensates for the fact that we are here summing over all ordered pairs $(u,v)\in{\mathbb{G}}^{2}$ ). Averaging over all $x\in{\mathbb{G}}$ , and applying the (bijective) change of variables $(x,u,v)\mapsto(x+u,-u,v-u)$ on the right-hand side, we obtain

[TABLE]

which is precisely $2\mathcal{E}_{\mu}^{\textsc{ip}}(f)\geq\mathcal{E}_{\mu\star\mu}^{\textsc{ip}}(f)$ by symmetry of $\mu$ . This proves the claim when $\mu(0)=0$ . In the general case, we write $\mu=(1-\theta)\rho_{0}+\theta\nu$ with ${\nu}(0)=0$ , and we observe that

[TABLE]

Thus, the claim $\mathcal{E}_{\mu\star\mu}^{\textsc{ip}}\leq 2\mathcal{E}_{\mu}^{\textsc{ip}}$ is equivalent to $\mathcal{E}_{\nu\star\nu}^{\textsc{ip}}\leq 2\mathcal{E}_{\nu}^{\textsc{ip}}$ . ∎

For reasons that will become clear later, we henceforth set

[TABLE]

Let us introduce the measure $\mu$ defined by

[TABLE]

Since $\mu$ is symmetric and $t$ is a power of $2$ , we may iterate Lemma 3 to get

[TABLE]

where $\mu^{\star t}=\mu\star\cdots\star\mu$ denotes the $t$-fold convolution of $\mu$. Thus, our goal (52) now boils down to showing that

[TABLE]

for some constant $c<\infty$ that only depends on $\ell$. To this end, we analyze the convolution $\mu^{\star t}$ accurately using the de Moivre-Laplace Local Limit Theorem.

3.4 Local Limit Theorem

As a warm-up, consider the binomial expansion of the uniform law: $\pi=\sum_{k=0}^{n}b_{k}\rho_{k}$ , where

[TABLE]

The classical de Moivre-Laplace Local Limit Theorem provides uniform estimates on the coefficients $b_{0},\ldots,b_{n}$ . Although a specific value of $p$ was chosen in (58), the statement is of course valid for any $p\in(0,1)$ .

Theorem 5 (de Moivre-Laplace).

There is $C<\infty$ depending only on $p$ such that

[TABLE]

for all $0\leq k\leq n$ , with $x=(k-np)/\sqrt{np(1-p)}$ .
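The following numerical sketch illustrates the kind of uniform estimate that the local limit theorem provides. It compares the binomial weights with the Gaussian proxy $e^{-x^{2}/2}/\sqrt{2\pi np(1-p)}$ and reports the largest discrepancy over $0\leq k\leq n$ . The precise normalisation of the omitted display is not reproduced here; the sketch only checks one standard form, in which the maximal error is of order $1/n$. The value $p=2/3$ and the function names are illustrative assumptions.

```python
import math

def binom_pmf(n, k, p):
    """Binomial(n, p) weight at k, computed in log-space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def gaussian_proxy(n, k, p):
    """Gaussian density at x = (k - np) / sqrt(np(1-p)), rescaled by the std."""
    var = n * p * (1 - p)
    x = (k - n * p) / math.sqrt(var)
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi * var)

def max_local_error(n, p):
    """Largest gap between the binomial weights and the Gaussian proxy."""
    return max(abs(binom_pmf(n, k, p) - gaussian_proxy(n, k, p))
               for k in range(n + 1))

p = 2 / 3  # illustrative value only
errors = {n: max_local_error(n, p) for n in (100, 400, 1600)}
scaled = {n: n * err for n, err in errors.items()}  # stays bounded as n grows
```

Inspecting `scaled` for increasing $n$ gives a quick sanity check that the error, once multiplied by $n$, does not grow.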

We can use this to approximate $\mathcal{E}_{\pi}$ with $\mathcal{E}_{\rho_{\mathcal{I}}}$ , where $\rho_{\mathcal{I}}$ is defined as follows:

[TABLE]

Lemma 4 (Plateau proxy for $\mathcal{E}_{\pi}^{\textsc{ip}}$ ).

There is $c<\infty$ depending on $\ell$ only, such that

[TABLE]

Proof.

If $\nu$ is a symmetric distribution on ${\mathbb{G}}$ , the Cauchy-Schwarz inequality yields

[TABLE]

for all $x\in{\mathbb{G}}$ , where $d_{\textsc{tv}}(\cdot,\cdot)$ denotes the total-variation distance. In particular, when $d_{\textsc{tv}}(\nu,\pi)\leq\frac{1}{4}$ , we obtain $(\nu\star\nu)(x)\geq\pi(x)/4$ for all $x\in{\mathbb{G}}$ and hence

[TABLE]

Let us now apply this general observation to the restriction of $\pi$ to $\bigcup_{k\in\mathcal{I}}{\mathbb{G}}_{k}$ :

[TABLE]

Note that $d_{\textsc{tv}}(\nu,\pi)=1-q$ , and that $q\geq 3/4$ thanks to our definition of $\mathcal{I}$ and Chebyshev’s inequality for the Binomial $(n,p)$ distribution. Thus, (67) applies and yields

[TABLE]

where the second inequality uses Lemma 3, and the third uses $q\geq 3/4$ . Finally, Theorem 5 ensures that $|\mathcal{I}|\max_{k\in\mathcal{I}}b_{k}$ is bounded by a quantity which only depends on $p$ . ∎

In order to establish (61), we will now approximate $\mu^{\star t}$ by the distribution

[TABLE]

Note that the center of $\mathcal{J}$ is half that of $\mathcal{I}$ .

Lemma 5 (Plateau proxy for $\mu^{\star t}$ ).

There is $c>0$ depending on $\ell$ only, such that

[TABLE]

Proof.

The convolution with $\mu$ describes the following transformation on ${\mathbb{G}}$-valued random variables: pick one of the $n$ coordinates uniformly at random and, with probability $\theta$ , replace it with a fresh uniform sample from $\mathbb{Z}_{\ell}$ . Consequently, we may construct a random variable $X=(X_{1},\ldots,X_{n})$ with law $\mu^{\star t}$ by setting

[TABLE]

where $N,U_{1},\ldots,U_{t},Z_{1},\ldots,Z_{n}$ are independent random variables with the following laws:

• $N$ is binomial with parameters $t$ and $\theta$ ;

• $U_{1},\ldots,U_{t}$ are uniform on $\{1,\ldots,n\}$ ;

• $Z_{1},\ldots,Z_{n}$ are uniform on $\mathbb{Z}_{\ell}$ .

In particular, setting $S:=|\mathrm{supp}(X)|$ , we have

[TABLE]

and our proof boils down to establishing that

[TABLE]

for some constant $c>0$ that only depends on $\ell$ . Now, conditionally on $N$ , the variable

[TABLE]

counts the number of distinct coupons collected by time $N$ in the standard coupon-collector problem of size $n$ . Thus,

[TABLE]
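For orientation, the conditional mean of the number of distinct coupons admits the standard closed form $n\bigl(1-(1-1/n)^{N}\bigr)$ , which may or may not be the exact form used in the omitted display. The sketch below computes the exact law of the number of distinct coupons by dynamic programming and compares its mean with this closed form. The function names and the parameter values in the usage lines are hypothetical.

```python
def distinct_coupon_law(n, draws):
    """Exact law of the number of distinct coupons seen after `draws`
    independent uniform draws from n coupons (law[r] = P(R = r))."""
    law = [0.0] * (n + 1)
    law[0] = 1.0
    for _ in range(draws):
        new = [0.0] * (n + 1)
        for r, prob in enumerate(law):
            if prob == 0.0:
                continue
            new[r] += prob * r / n                 # draw an already-seen coupon
            if r < n:
                new[r + 1] += prob * (n - r) / n   # draw a new coupon
        law = new
    return law

def mean_distinct(n, draws):
    """Mean of the exact law computed above."""
    return sum(r * prob for r, prob in enumerate(distinct_coupon_law(n, draws)))

def closed_form_mean(n, draws):
    """Standard formula: each coupon is missed with probability (1 - 1/n)^draws."""
    return n * (1 - (1 - 1 / n) ** draws)

exact = mean_distinct(20, 30)
formula = closed_form_mean(20, 30)
```

The dynamic program avoids random sampling, so the comparison is deterministic up to floating-point rounding.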

Recalling our definitions (57), and since $N$ is a Binomial $(t,\theta)$ variable, we easily deduce

[TABLE]

Consequently, Chebyshev’s inequality yields

[TABLE]

Now, conditionally on $R$ , the random variable $S$ is simply Binomial with parameters $R$ and $p$ . In particular, Theorem 5 with $R$ instead of $n$ ensures that

[TABLE]

where $c>0$ only depends on $p$ . Combining (80) and (81) establishes the claim. ∎

3.5 Final comparison

In view of Lemmas 4 and 5, our objective (61) now reduces to establishing the following.

Proposition 1 (Final comparison).

There exists $c<\infty$ depending only on $\ell$ , such that

[TABLE]

The crucial ingredient of the proof is the following lemma.

Lemma 6.

For any $i,j\in\{0,\ldots,n\}$ with $i+j\in\{0,\ldots,n\}$ , we have

[TABLE]

Proof.

Let $(X,Y)$ denote a random element from the set

[TABLE]

Then $\omega:=(X,Y)$ is a random word of length $2$ over ${\mathbb{G}}_{i}\cup{\mathbb{G}}_{j}$ , whose evaluation $X+Y$ is uniform over ${\mathbb{G}}_{i+j}$ . By Corollary 6 and Remark 4, we deduce that

[TABLE]

where the congestion $\kappa$ is given by

[TABLE]

The second line follows from the observation that $X$ and $Y$ are uniformly distributed on ${\mathbb{G}}_{i}$ and ${\mathbb{G}}_{j}$ , respectively. On the other hand, the definitions of $\rho_{i},\rho_{j},\rho_{i+j}$ imply

[TABLE]

The claim readily follows. ∎
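The uniformity of $X+Y$ over ${\mathbb{G}}_{i+j}$ can be checked by brute force on a small example. The sketch below assumes that the set in question consists of the pairs $(x,y)\in{\mathbb{G}}_{i}\times{\mathbb{G}}_{j}$ with $x+y\in{\mathbb{G}}_{i+j}$ (equivalently, pairs with disjoint supports), which is our reading of the omitted display, and that ${\mathbb{G}}_{k}$ denotes the elements of $\mathbb{Z}_{\ell}^{n}$ with exactly $k$ non-zero coordinates. The function names and the parameters $\ell=3$, $n=4$, $i=1$, $j=2$ are illustrative.

```python
from collections import Counter
from itertools import product
from math import comb

def support_size(z):
    """Number of non-zero coordinates of z."""
    return sum(c != 0 for c in z)

def shell(ell, n, k):
    """G_k: elements of Z_ell^n with exactly k non-zero coordinates."""
    return [z for z in product(range(ell), repeat=n) if support_size(z) == k]

def admissible_pairs(ell, n, i, j):
    """Triples (x, y, x + y) with x in G_i, y in G_j and x + y in G_{i+j}
    (assumed form of the set from which (X, Y) is drawn)."""
    triples = []
    for x in shell(ell, n, i):
        for y in shell(ell, n, j):
            z = tuple((a + b) % ell for a, b in zip(x, y))
            if support_size(z) == i + j:
                triples.append((x, y, z))
    return triples

ELL, N, I, J = 3, 4, 1, 2  # illustrative parameters
triples = admissible_pairs(ELL, N, I, J)
sum_counts = Counter(z for _, _, z in triples)  # multiplicity of each sum
x_counts = Counter(x for x, _, _ in triples)    # multiplicity of each first letter
y_counts = Counter(y for _, y, _ in triples)    # multiplicity of each second letter
```

In this example every element of ${\mathbb{G}}_{i+j}$ is hit the same number of times, namely ${i+j\choose i}$ , and each element of ${\mathbb{G}}_{i}$ (respectively ${\mathbb{G}}_{j}$ ) appears equally often as first (respectively second) letter, in line with the uniformity statements used in the proof.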

Proof of Proposition 1.

Our definitions of $\mathcal{I},\mathcal{J}$ ensure that $|\mathcal{I}|=|\mathcal{J}|$ and that

[TABLE]

In particular, we have

[TABLE]

Now, since $|{\mathbb{G}}_{i}|={n\choose i}(\ell-1)^{i}$ for all $i\in\{0,\ldots,n\}$ , we have

[TABLE]

As $i$ varies across $\mathcal{J}$ , this ratio remains bounded away from $0$ and $\infty$ uniformly in $n$ . Consequently, Lemma 6 ensures that for all $(i,j)\in\mathcal{P}$ ,

[TABLE]

where $c<\infty$ depends only on $\ell$ . Inserting this above, we obtain

[TABLE]

This concludes the proof. ∎

Bibliography


[1] Radosław Adamczak, Michał Kotowski, and Piotr Miłoś. Phase transition for the interchange and quantum Heisenberg models on the Hamming graph. arXiv preprint arXiv:1808.08902, 2018.
[2] David Aldous and Jim Fill. Reversible Markov chains and random walks on graphs, 2002. Unfinished manuscript. Available at http://www.stat.berkeley.edu/~aldous/RWG/book.html.
[3] Gil Alon and Gady Kozma. The probability of long cycles in interchange processes. Duke Math. J., 162(9):1567–1585, 2013. MR 3079255.
[4] Gil Alon and Gady Kozma. Comparing with octopi. Ann. Inst. Henri Poincaré Probab. Stat., 56(4):2672–2685, 2020. MR 4164852.
[5] Omer Angel. Random infinite permutations and the cyclic time random walk. In Discrete random walks (Paris, 2003), Discrete Math. Theor. Comput. Sci. Proc., AC, pages 9–16. Assoc. Discrete Math. Theor. Comput. Sci., Nancy, 2003. MR 2042369.
[6] Nathanaël Berestycki. Emergence of giant cycles and slowdown transition in random transpositions and $k$-cycles. Electron. J. Probab., 16:no. 5, 152–173, 2011. MR 2754801.
[7] Nathanaël Berestycki and Gady Kozma. Cycle structure of the interchange process and representation theory. Bull. Soc. Math. France, 143(2):265–280, 2015. MR 3351179.
[8] Pietro Caputo, Thomas M. Liggett, and Thomas Richthammer. Proof of Aldous’ spectral gap conjecture. J. Amer. Math. Soc., 23(3):831–851, 2010. MR 2629990.


