Weak-disorder limit at criticality for directed polymers on hierarchical graphs
Jeremy Clark

TL;DR
This paper proves a distributional limit theorem for directed polymer partition functions on hierarchical graphs at criticality, revealing new behavior in the marginally relevant disorder case with joint scaling of layers and temperature.
Contribution
It establishes the first distributional convergence result for the critical marginally relevant case of directed polymers on hierarchical graphs, using a novel Stein's method approach.
Findings
Distributional convergence of partition functions at criticality.
Limit theorem applies to models with edge and vertex disorder.
Analysis introduces a perturbative Stein's method at a critical scale.
Weak-disorder limit at criticality for directed polymers
on hierarchical graphs
Jeremy Thane Clark
University of Mississippi, Department of Mathematics
Abstract
We prove a distributional limit theorem conjectured in [Journal of Statistical Physics 174, No. 6, 1372-1403 (2019)] for partition functions defining models of directed polymers on diamond hierarchical graphs with disorder variables placed at the graphical edges. The limiting regime involves a joint scaling in which the number of hierarchical layers, $n$, of the graphs grows as the inverse temperature, $\beta$, vanishes with a fine-tuned dependence on $n$. The conjecture pertains to the marginally relevant disorder case of the model wherein the branching parameter $b$ and the segmenting parameter $s$ determining the hierarchical graphs are equal, which coincides with the diamond fractal embedding the graphs having Hausdorff dimension two. Unlike the analogous weak-disorder scaling limit for random polymer models on hierarchical graphs in the disorder relevant $b<s$ case (or for the (1+1)-dimensional polymer on the rectangular lattice), the distributional convergence of the partition function when $b=s$ cannot be approached through a term-by-term convergence to a Wiener chaos expansion, which does not exist for the continuum model emerging in the limit. The analysis proceeds by controlling the distributional convergence of the partition functions in terms of the Wasserstein distance through a perturbative generalization of Stein’s method at a critical step. In addition, we prove that a similar limit theorem holds for the analogous model with disorder variables placed at the vertices of the graphs.
1 Introduction
In probabilistic frameworks, a disordered system usually refers to a relatively simple and familiar random object whose “pure” probabilistic law is distorted through its coupling to a random “environment” formed by an array of random variables (local impurities) or a random field. If the size of the model depends on a parameter $L$, a central question for these disordered systems is whether typical realizations of the random environment create either a qualitative or only a quantitative change in the law of the random object as $L\to\infty$. For a given coupling strength $\beta$ of the system to the environment, these large-scale behaviors are respectively referred to as strongly disordered or weakly disordered. A disordered system is further classified as disorder relevant if it exhibits strong disorder for any fixed $\beta>0$ as the system size grows or as disorder irrelevant otherwise. Finally, models at the border between the disorder relevant and disorder irrelevant regimes are referred to as marginally relevant or marginally irrelevant, and these boundary models manifest anomalous finer scaling behavior as the coupling strength vanishes.
One of the most closely studied disorder models is the directed polymer in a random environment, which usually refers to a $d$-dimensional simple symmetric random walk (SSRW) whose trajectories are reweighted within a Gibbsian formalism that depends on an inverse temperature parameter, $\beta\geq 0$, and an array of centered i.i.d. random variables labeled by the time-space lattice for a polymer length $L$. The parameter $\beta$ effectively controls the strength of the polymer’s coupling to the environment, and $\beta=0$ corresponds to a pure SSRW. Established results in this field imply that the $(d+1)$-dimensional polymer model is disorder relevant when $d=1$, marginally relevant when $d=2$, and disorder irrelevant in all higher dimensions; see Comets’s recent book [14].
In this article, we prove a distributional limit theorem for partition functions defined from a hierarchical model for directed polymers in a random environment for which the disorder is marginally relevant. Our limiting regime, which involves a joint scaling wherein the number of hierarchical layers of the model grows while the disorder strength decays to zero, is similar to the critical weak-disorder scaling regime for (2+1)-dimensional polymers proposed by Caravenna, Sun, and Zygouras in [7, 9]. While [9] proves the existence of a subsequential distributional limit of the partition functions within this critical scaling regime and fully characterizes the correlation structure of any such limit, the uniqueness of the subsequential distributional limit currently remains open. Although the hierarchical symmetry of the model considered in this article makes a detailed limit analysis within the critical weak-disorder regime less difficult than for the rectangular lattice polymer model with marginally relevant disorder, the hierarchical setting provides some insights that are likely general for weak-disorder scaling limits at criticality for marginally relevant systems.
The continuum polymer model corresponding to the scaling limit of this article is studied in [12, 13]. We will return to a broader discussion of related work in Section 4 after defining our hierarchical model and presenting a first version of our main result.
2 The setup and a statement of the main result
This section begins by defining a family of random measures on directed paths crossing diamond hierarchical graphs and concludes with the statement of a limit theorem for the total masses of the measures (Theorem 2.7), which was conjectured in [10]. The models in this section have bond-disorder, i.e., disorder variables placed at the edges of the graphs, while the models discussed in the next section have disorder at the vertices.
2.1 Construction of the diamond hierarchical graphs
The diamond hierarchical graphs are recursively defined through a construction determined by a branching number $b$ and a segmenting number $s$. The zeroth graph is simply two root vertices with an edge between them. The first-generation graph is formed by $b$ parallel branches connecting the two roots, wherein each branch has $s$ edges running in sequence. For $n\geq 1$ the generation-$(n+1)$ graph is defined recursively from the generation-$n$ graph by embedding a copy of the first-generation graph in place of each edge. The set of edges on the generation-$n$ graph thus contains $(bs)^{n}$ elements.
The first three recursively-defined diamond graphs with and .
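A directed path on the generation-$n$ graph is a function $p$ from $\{1,\ldots,s^{n}\}$ into the edge set for which $p(1)$ is incident to the bottom root, $p(s^{n})$ is incident to the top root, and successive edges $p(j)$, $p(j+1)$ share a common vertex for $1\leq j<s^{n}$. In other terms, the path moves monotonically upwards from one root to the other, as seen in the figure. We refer to the collection of such functions as the set of directed paths on the generation-$n$ graph.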
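The recursive construction and the edge count $(bs)^n$ can be checked on small cases. The following is a minimal sketch, not taken from the article: the function names `diamond_edges` and `count_directed_paths`, the integer vertex ids, and the convention that the roots are labeled 0 and 1 are all assumptions made for illustration.

```python
from itertools import count

def diamond_edges(b, s, n):
    """Edges (u, v) of the generation-n graph, oriented from root 0 toward root 1.

    Hypothetical helper: vertex ids are arbitrary integers, roots are 0 and 1.
    """
    fresh = count(2)      # ids for interior (non-root) vertices
    edges = [(0, 1)]      # generation 0: a single edge between the roots
    for _ in range(n):
        refined = []
        for u, v in edges:
            # Replace the edge by b parallel branches of s edges in sequence.
            for _branch in range(b):
                prev = u
                for _step in range(s - 1):
                    w = next(fresh)
                    refined.append((prev, w))
                    prev = w
                refined.append((prev, v))
        edges = refined
    return edges

def count_directed_paths(edges, source=0, target=1):
    """Count root-to-root directed paths by memoized recursion on the DAG."""
    out = {}
    for u, v in edges:
        out.setdefault(u, []).append(v)
    memo = {target: 1}

    def paths_from(v):
        if v not in memo:
            memo[v] = sum(paths_from(w) for w in out.get(v, []))
        return memo[v]

    return paths_from(source)

edges = diamond_edges(b=2, s=3, n=2)
print(len(edges), count_directed_paths(edges))
```

Each generation multiplies the number of edges by $bs$, and the number of directed paths satisfies the recursion (paths at generation $n+1$) $= b\cdot$(paths at generation $n$)$^{s}$, since a path first picks one of $b$ branches and then a path through each of the $s$ embedded copies along it.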
2.2 Random Gibbsian measure on directed paths
Next we define a random Gibbs measure on the space of directed paths. Let $\{\omega_{h}\}$ be an i.i.d. family of random variables labeled by the edges $h$ of the generation-$n$ graph and having mean zero, variance one, and finite exponential moments, $\mathbb{E}\big[\exp\{\beta\omega_{h}\}\big]<\infty$ for $\beta\geq 0$. Given an inverse temperature value $\beta\in[0,\infty)$, we define a random path measure on directed paths such that the weight assigned to a path $p$ is given by
[TABLE]
where the relation under the product means that the edge $h$ lies along the path $p$. At infinite temperature ($\beta=0$), the measure is the uniform probability measure on the set of directed paths. We denote the total mass of the measure by
[TABLE]
in terms of the disorder variables $\{\omega_{h}\}$. The recursive construction of the diamond graphs implies the following distributional recursive relation for the partition functions $W_{n}^{\omega}(\beta)$:
$$W_{n+1}^{\omega}(\beta)\,\stackrel{d}{=}\,\frac{1}{b}\sum_{i=1}^{b}\prod_{j=1}^{s}W_{n}^{(i,j)}(\beta),$$
where the $W_{n}^{(i,j)}(\beta)$’s are independent copies of the random variable $W_{n}^{\omega}(\beta)$. The variances $\varrho_{n}(\beta):=\textup{Var}\big(W_{n}^{\omega}(\beta)\big)$ are recursively related as $\varrho_{n+1}(\beta)=M_{b,s}\big(\varrho_{n}(\beta)\big)$ with $M_{b,s}$ defined as
$$M_{b,s}(x)\,:=\,\frac{1}{b}\Big[(1+x)^{s}-1\Big].$$
Thus the fixed point $x=0$ is linearly attractive when $s<b$, linearly repelling when $s>b$, and marginally repelling when $s=b$.
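The three regimes of the fixed point can be seen by iterating the variance map numerically. This is a minimal sketch that assumes the map has the form $M_{b,s}(x)=\frac{1}{b}\big[(1+x)^{s}-1\big]$, which is what the distributional recursion gives for mean-one partition functions; the function names and the starting value 0.01 are illustrative choices, not the article's.

```python
def variance_map(x, b, s):
    """Assumed form of the variance map: M_{b,s}(x) = ((1 + x)^s - 1) / b."""
    return ((1.0 + x) ** s - 1.0) / b

def iterate(x0, b, s, steps):
    """Return the orbit x0, M(x0), M(M(x0)), ... of length steps + 1."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(variance_map(orbit[-1], b, s))
    return orbit

# The derivative of the map at x = 0 is s / b, which separates the three cases.
for b, s in [(3, 2), (2, 3), (2, 2)]:
    orbit = iterate(0.01, b, s, steps=10)
    print(f"b={b}, s={s}: x_0={orbit[0]:.4f}, x_10={orbit[-1]:.4f}")
```

For $s<b$ the orbit contracts geometrically toward zero, for $s>b$ it escapes geometrically, and for $s=b$ it drifts away only through the quadratic term, which is the marginal repulsion that forces the finer tuning of the inverse temperature studied in this article.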
2.3 High-temperature scaling limits for the Gibbs measure
Our focus is on high-temperature (i.e., weak-disorder) scaling limits in which the hierarchical level parameter, $n$, grows as the inverse temperature $\beta\equiv\beta_{n}$ decays under an appropriate tuning in $n$ such that the random path measures converge in distribution to a limiting random measure on paths. This article concerns only the total mass of the measures while [12] extends this limit analysis to the full measures and discusses some delicate properties of the limiting path measures. High-temperature scaling limits are only of interest in the cases $b<s$ and $b=s$ for which $x=0$ is a repelling fixed point of the variance map $M_{b,s}$. The article [1] contains a limit theorem for $W_{n}^{\omega}(\beta)$ in the case $b<s$, where for a fixed parameter value the inverse temperature has the large $n$ asymptotic form
[TABLE]
The sequences of random variables converge in distribution as to a family of limit laws supported on that satisfy the distributional recursion relation
[TABLE]
where are i.i.d. copies of . The variance, $R_{b,s}(r)$, of satisfies $M_{b,s}\big(R_{b,s}(r)\big)=R_{b,s}\big(\frac{s}{b}r\big)$. Of course, the exponential form of the inverse temperature scaling (2.4) corresponds to the linear repelling (2.2) of the map $M_{b,s}$ from $x=0$ that occurs in the $b<s$ case.
The main result of the current article is a proof of an analogous limit theorem for $W_{n}^{\omega}(\beta)$ in the $b=s$ case. An inverse temperature scaling (see below in (2.5)) was proposed in [10], although the results therein were confined to proving convergence of the positive integer moments. (The scaling (2.5) includes a correction pointed out by an anonymous referee that ensures consistency with the variance asymptotics (2.7) below; see Appendix A for an outline of the computation determining (2.5) from (2.7). The variance asymptotics is what plays a direct role in all subsequent analysis.) Although the convergence of the positive integer moments implies the existence of subsequential distributional limits, it does not imply convergence in law because the higher limiting moments increase super-factorially; see (III) of Theorem 2.4 below. For fixed $b$ and $r\in\mathbb{R}$, let the sequence $\big\{\beta_{n,r}^{(b)}\big\}_{n\in\mathbb{N}}$ have the large $n$ asymptotics
[TABLE]
where and are respectively the third and fourth cumulants of the disorder variables, , and the constants are defined as
[TABLE]
If we let $M_{b,b}^{n}$ denote the $n$-fold composition of $M_{b,b}$, the variance, $\varrho_{n}\big(\beta_{n,r}^{(b)}\big)$, of $W_{n}^{\omega}\big(\beta_{n,r}^{(b)}\big)$ can be written explicitly as
[TABLE]
The basic observations above combined with Lemma 2.3 below imply that $\varrho_{n}\big(\beta_{n,r}^{(b)}\big)$ converges as $n\to\infty$ to a limit $R_{b}(r)$ for any $r\in\mathbb{R}$.
Remark 2.1.
Let us set the skewness, $\tau$, of the disorder variables to zero for simplicity. Theorem 7.1 of [1] states that if $\beta_{n,r}^{(b)}$ is replaced by a coarser scaling of the form $\hat{\beta}/\sqrt{n}$ for a parameter $\hat{\beta}>0$, then $W_{n}^{\omega}\big(\hat{\beta}/\sqrt{n}\big)$ has the distributional behaviors listed below depending on $\hat{\beta}$ as $n\to\infty$.
[TABLE]
In the above, we use the notation heuristically to mean that the random variables are “close” in distribution. Thus there is a critical point for the parameter $\hat{\beta}$ in the moment behavior of $W_{n}^{\omega}\big(\hat{\beta}/\sqrt{n}\big)$ when $n\gg 1$, and $\beta_{n,r}^{(b)}$ falls within a critical window around it. The variance blowup beyond the critical point coincides with the transition to strong disorder as can be seen in the limit model emerging under the scaling (2.5) as $r\to\infty$; see Remark 2.9.
Remark 2.2.
The critical inverse temperature scaling for (2+1)-dimensional directed polymers considered in [9] has the form
$$\beta_{L,r}=\frac{\sqrt{\pi}}{(\log L)^{1/2}}-\frac{\pi\tau}{2\log L}+\frac{\sqrt{\pi}r+\pi^{3/2}\big(\frac{5}{4}\tau^{2}-\frac{7}{12}\tau^{\prime}-\frac{1}{2}\big)}{2(\log L)^{3/2}}+\mathit{o}\Big(\frac{1}{(\log L)^{3/2}}\Big)$$
for $L\gg 1$, where $L$ is the polymer length, $r\in\mathbb{R}$ is a parameter, and $\tau$, $\tau^{\prime}$ are the third and fourth cumulants of the disorder variables; see [9, Remark 1.1]. In terms of the length $L=b^{n}$ of the diamond graph polymers, the asymptotic form (2.5) is fairly similar except for the inclusion of an additional term.
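The role of the correction terms in this scaling can be illustrated numerically in the special case of standard Gaussian disorder, where the third and fourth cumulants vanish and the variance of $e^{\beta\omega}/\mathbb{E}[e^{\beta\omega}]$ equals $e^{\beta^{2}}-1$. The sketch below is an illustration under that Gaussian assumption; it drops the $o(\cdot)$ remainder, and the function names are hypothetical. It compares the resulting variance with the target form $\frac{\pi}{\log L}+\frac{\pi r}{\log^{2}L}$ stated in Section 4.

```python
import math

def beta_critical(log_L, r, tau=0.0, tau_prime=0.0):
    """Explicit terms of beta_{L,r} from Remark 2.2 (remainder term dropped)."""
    c = math.pi ** 1.5 * (1.25 * tau ** 2 - (7.0 / 12.0) * tau_prime - 0.5)
    return (math.sqrt(math.pi) / math.sqrt(log_L)
            - math.pi * tau / (2.0 * log_L)
            + (math.sqrt(math.pi) * r + c) / (2.0 * log_L ** 1.5))

def gaussian_weight_variance(beta):
    """Var(e^{beta*w} / E[e^{beta*w}]) = e^{beta^2} - 1 for standard Gaussian w."""
    return math.expm1(beta ** 2)

def scaled_residual(log_L, r):
    """(variance - pi/log L - pi*r/log^2 L) * log^2 L, expected to tend to 0."""
    var = gaussian_weight_variance(beta_critical(log_L, r))
    target = math.pi / log_L + math.pi * r / log_L ** 2
    return (var - target) * log_L ** 2

for log_L in (1e2, 1e3, 1e4):
    print(log_L, scaled_residual(log_L, r=1.0))
```

Working with `log_L` directly avoids overflow for astronomically large $L$. The printed residuals shrink roughly like $1/\log L$, consistent with the $-\frac{1}{2}\pi^{3/2}$ term in the scaling being exactly what cancels the second-order term of the exponential.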
2.4 Previous results on the centered moments
The lemma and theorem below are results from [10].
Lemma 2.3 (Variance function).
For any $b$, there exists a unique continuously differentiable increasing function $R_{b}$ on $\mathbb{R}$ satisfying the properties (I)-(III) below.
- (I) Composition of $R_{b}$ with the map $M_{b,b}$ translates the parameter $r\in\mathbb{R}$ by one: $M_{b,b}\big(R_{b}(r)\big)\,=\,R_{b}(r+1)$.
- (II) As $r\to\infty$, $R_{b}(r)$ diverges to $\infty$. As $r\to-\infty$, $R_{b}(r)$ has the vanishing asymptotics
[TABLE]
- (III) The derivative $R_{b}^{\prime}(r)$ admits the limiting form
[TABLE]
Moreover, if for some the sequence of positive real numbers has the large asymptotics
[TABLE]
then converges as to .
Appendix B contains an elementary but instructive calculation showing the consistency between properties (I) and (II) above. The higher centered moments of $W_{n}^{\omega}\big(\beta_{n,r}^{(b)}\big)$ converge to limits characterized as follows.
Theorem 2.4 (Limiting higher moments).
Fix and let . For each there is a continuous, increasing function such that for any
[TABLE]
The limit functions satisfy properties (I)-(III) below.
- (I) There are multivariate polynomials with nonnegative coefficients such that for all $r\in\mathbb{R}$
[TABLE]
- (II) $R^{(m)}_{b}(r)$ diverges to $\infty$ as $r\to\infty$ and vanishes as $r\to-\infty$ with the asymptotics for $m$ even and $R^{(m)}_{b}(r)=\mathit{O}\big(|r|^{-(m+1)/2}\big)$ for $m$ odd.
- (III) There is a such that holds for any fixed and large enough .
Remark 2.5.
The function $R_{b}(r)$ in the statement of Lemma 2.3 is equal to $R^{(2)}_{b}(r)$ in the statement of Theorem 2.4.
Remark 2.6.
The quantity in (II) for even agrees with the moment of a centered normal random variable with variance .
2.5 A first version of the main result
As mentioned above, Theorem 2.4 does not imply that $W_{n}^{\omega}\big(\beta_{n,r}^{(b)}\big)$ converges in law as $n\to\infty$ since $R^{(m)}_{b}(r)$ grows super-factorially with $m$ by (III) of Theorem 2.4. Thus the following theorem was left as a conjecture in [10].
Theorem 2.7.
Fix $b$ and $r\in\mathbb{R}$, and let the sequence $\big\{\beta_{n,r}^{(b)}\big\}_{n\in\mathbb{N}}$ have the form (2.5). When $b=s$ there is convergence in distribution as $n\to\infty$
[TABLE]
to a family of limit laws $\big\{L_{r}^{(b)}\big\}_{r\in\mathbb{R}}$ uniquely determined by (I)-(IV) below.
- (I) $L_{r}^{(b)}$ has mean one and variance $R_{b}(r)$.
- (II) For $m\in\{2,3,\ldots\}$, the $m$th centered moment of $L_{r}^{(b)}$ is equal to $R^{(m)}_{b}(r)$.
- (III) Let be a random variable with distribution $L_{r}^{(b)}$. The centered variables converge in law as $r\to-\infty$ to a centered normal with variance .
- (IV) If are independent variables with distribution $L_{r}^{(b)}$, then there is equality in distribution
[TABLE]
Remark 2.8.
The convergence in distribution of to as follows from the asymptotics for the centered moments in (II) of Theorem 2.4.
Remark 2.9.
The family of limit laws in Theorem 2.7 exhibits a transition from weak disorder to strong disorder as $r$ goes from $-\infty$ to $\infty$ in the sense that the random variables converge in probability to one as $r\to-\infty$ and to zero as $r\to\infty$, where the latter is proved in [13, Section 5] using a conditional Gaussian multiplicative chaos structure that we will describe at the end of Section 4.
3 A similar limit theorem for the site-disorder model
Next we will state an analogous result to Theorem 2.7 corresponding to when the environmental disorder is built into the partition function through the vertices of the diamond graphs rather than the edges.
For branching and segmenting numbers $b$, $s$ and $n\in\mathbb{N}$, let $V^{b,s}_{n}$ denote the set of vertices on the generation-$n$ diamond graph with the two roots excluded. Thus $V^{b,s}_{0}=\emptyset$, and for $n\geq 1$ the number of non-root vertices is given by $\big|V^{b,s}_{n}\big|=b(s-1)\frac{(bs)^{n}-1}{bs-1}$. The hierarchical construction of the sequence of diamond graphs in Section 2.1 implies that $V^{b,s}_{n-1}$ is canonically identifiable with a subset of $V^{b,s}_{n}$ for each $n$, and we refer to $V^{b,s}_{n}\backslash V^{b,s}_{n-1}$ as the set of generation-$n$ vertices.
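The closed form for the number of non-root vertices follows from summing a geometric series: passing from one generation to the next adds $b(s-1)$ interior vertices for every existing edge, and the number of edges is multiplied by $bs$. A short sketch of this bookkeeping, with hypothetical function names, compares the recursive count with the closed form.

```python
def nonroot_vertex_count(b, s, n):
    """Count non-root vertices generation by generation."""
    total, edges = 0, 1
    for _ in range(n):
        total += edges * b * (s - 1)  # each edge gains b branches with s - 1 interior vertices
        edges *= b * s                # each edge is replaced by b * s new edges
    return total

def closed_form(b, s, n):
    """b (s - 1) ((bs)^n - 1) / (bs - 1), computed in exact integer arithmetic."""
    return b * (s - 1) * ((b * s) ** n - 1) // (b * s - 1)

for n in range(5):
    print(n, nonroot_vertex_count(2, 2, n), closed_form(2, 2, n))
```

The generation-$n$ vertices are exactly the ones added in the last pass of the loop, so there are $b(s-1)(bs)^{n-1}$ of them.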
As before, let the disorder variables, now labeled by the vertices in $V^{b,s}_{n}$, be an i.i.d. family of centered random variables with variance one and finite exponential moments. We define the partition function $\widehat{W}_{n}^{\omega}(\beta)$ in analogy to $W_{n}^{\omega}(\beta)$ in (2.1) except with the product of random variables running over all vertices along the path:
[TABLE]
where the relation under the product is used for a vertex $v$ and a path $p$ to indicate that one of the edges $p(j)$ is incident to $v$. When $n=0$ the partition function is simply equal to one since $V^{b,s}_{0}=\emptyset$, and the hierarchical symmetry of the model implies the following distributional equality, which is similar to (2.2):
[TABLE]
where are i.i.d. copies of and are i.i.d. copies of the disorder variable. The terms correspond to the generation- vertices of the diamond graph .
The following theorem is the counterpart to Theorem 2.7 for the site-disorder model, and its proof is in Section 14.
Theorem 3.1.
Fix and , and assume . Define , and let and be defined as in (2.5). If the sequence \big{\{}\widehat{\beta}_{n,r}^{(b)}\big{\}}_{n\in\mathbb{N}} has the asymptotic form
[TABLE]
then $\widehat{W}_{n}^{\omega}\big(\widehat{\beta}_{n,r}^{(b)}\big)$ converges in distribution as $n\to\infty$ to the limit law $L_{r}^{(b)}$ of Theorem 2.7.
Remark 3.2.
Define $\upsilon_{b}:\big[0,\widehat{\kappa}_{b}\big)\rightarrow[0,\infty)$ by $\upsilon_{b}(\hat{\beta}):=\hat{\beta}\frac{\sqrt{2}}{\sqrt{b}}\tan\big(\frac{\pi}{2}\frac{\hat{\beta}}{\widehat{\kappa}_{b}}\big)$. In the case of $b=s$, [1, Thm. 2.5] states that the partition function $\widehat{W}_{n}^{\omega}\big(\hat{\beta}/n\big)$ has the large $n$ distributional behaviors listed below depending on the parameter $\hat{\beta}$.
[TABLE]
We use in the same heuristic sense as in Remark 2.1. Thus $\widehat{\kappa}_{b}$ is a critical point for the large $n$ behavior of $\widehat{W}_{n}^{\omega}\big(\hat{\beta}/n\big)$ that is analogous to the critical point for $W_{n}^{\omega}\big(\hat{\beta}/\sqrt{n}\big)$ described in Remark 2.1.
Remark 3.3.
Our proof of Theorem 3.1 proceeds by showing that $\widehat{W}_{n}^{\omega}\big(\widehat{\beta}_{n,r}^{(b)}\big)$ is close in norm to a similarly-defined partition function in which the disorder variables are only attached to vertices of generation greater than . This effectively reduces the generation-$n$ site-disorder model to a generation- bond-disorder model. The results developed to prove Theorem 2.7 can then be applied to prove Theorem 3.1.
4 Further discussion
As mentioned in Section 1, the $(d+1)$-dimensional polymer model is disorder relevant when $d=1$ and marginally relevant when $d=2$. In principle, disorder relevance opens up the possibility that there exists a continuum disorder model that emerges in a joint limit in which the polymer length, $L$, grows as the inverse temperature vanishes with an appropriate dependence on $L$. (The general relationship between disorder relevance and continuum limits is argued for in [8].) A rigorous mathematical result in this direction was developed by Alberts, Khanin, and Quastel in the article [2], which proved that the partition function for (1+1)-dimensional polymers converges in law to a nontrivial distributional limit as $L\to\infty$ when the inverse temperature has the asymptotic form $\beta=\big(\hat{\beta}+\mathit{o}(1)\big)L^{-1/4}$ for a fixed parameter value $\hat{\beta}$. This scaling limit is referred to as the intermediate disorder regime since it magnifies a parameter region between the weak ($\beta=0$) and the strong ($\beta>0$) domains of disorder behavior for the (1+1)-dimensional polymer, and it amounts to a continuum/weak-disorder limiting regime in which the polymers are diffusively rescaled towards Brownian motion trajectories while the environmental disorder variables are renormalized towards a white noise field. The authors construct the limiting partition functions in terms of Wiener chaos expansions of the field involving the one-dimensional heat kernel $\varrho(t^{\prime},x^{\prime};t,x)=\frac{1}{\sqrt{2\pi(t-t^{\prime})}}\exp\big\{-\frac{(x-x^{\prime})^{2}}{2(t-t^{\prime})}\big\}$.
A model of continuum directed polymers corresponding to the limiting partition function laws in [2] was discussed more explicitly in [3], where the limiting partition function is equal in distribution to the total mass of a random measure on $C([0,1])$, i.e., the space of Brownian trajectories. Moreover, the authors use the point-to-point form of these limiting partition function laws to construct a solution to the one-dimensional stochastic heat equation (SHE):
[TABLE]
In the case where corresponds to the limit of point-to-line partition functions for polymers starting at the origin, is equal in law to the total mass of a random measure on that can be formally expressed as
[TABLE]
where $\mathbf{P}$ is the Wiener measure on $C([0,1])$ for a standard Brownian motion and $\widehat{W}$ defines a Gaussian field over $C([0,1])$ with correlation kernel given by the intersection time between paths: $T(p,q)=\mathbb{E}\big[\widehat{W}(p)\widehat{W}(q)\big]=\int_{0}^{1}\delta(p_{t}-q_{t})dt$. (The field yields a Gaussian random variable when integrated against a test function $\psi\in L^{2}\big(C([0,1]),\mathbf{P}\big)$.) Random measures formally expressed in terms of exponentials of Gaussian fields as in (4.1) are the focus of the theory of Gaussian multiplicative chaos (GMC), and the random measure in (4.1) is a subcritical GMC for any $\hat{\beta}$ that can be understood through the general approach to GMC theory in [26]. The random measures are a.s. mutually singular to $\mathbf{P}$ and satisfy
[TABLE]
In particular, the expected product measure in (4.2) is absolutely continuous with respect to $\mathbf{P}\times\mathbf{P}$, which is a necessary feature of subcritical GMCs; see Lemma 34 of [26].
Weak-disorder limits analogous to [2] for the marginally relevant (2+1)-dimensional polymer involve fundamental new mathematical difficulties and are not as well understood as the weak-disorder regime for the (1+1)-dimensional polymer despite significant progress in a series of articles [5, 6, 7, 8, 9] by Caravenna, Sun, and Zygouras. In [6] the authors proved that the partition function for (2+1)-dimensional polymers has the following distributional limit behavior as $L\to\infty$ when the inverse temperature tends to zero as $\beta\equiv\beta_{L}=\frac{\sqrt{\pi}}{(\log L)^{1/2}}\big(\hat{\beta}+\mathit{o}(1)\big)$ for fixed $\hat{\beta}>0$:
[TABLE]
where the limit is expressed through a standard normal random variable and $\sigma_{\hat{\beta}}^{2}:=\log\big(\frac{1}{1-\hat{\beta}^{2}}\big)$. In other terms, for $\hat{\beta}<1$ the limit law is a mean-one lognormal that converges in probability to zero (while having exploding variance) as $\hat{\beta}\nearrow 1$. Thus a phase transition from weak disorder to strong disorder occurs at $\hat{\beta}=1$ within this weak-coupling limit regime.
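The exploding variance can be made explicit with a one-line computation. The display below is a worked sketch that assumes the mean-one lognormal limit can be written as $\exp\{\sigma_{\hat{\beta}}X-\frac{1}{2}\sigma_{\hat{\beta}}^{2}\}$ for a standard normal $X$; the symbol $X$ is introduced here only for illustration.

```latex
\begin{align*}
\mathbb{E}\Big[e^{\sigma_{\hat{\beta}}X-\frac{1}{2}\sigma_{\hat{\beta}}^{2}}\Big] &= 1, \\
\textup{Var}\Big(e^{\sigma_{\hat{\beta}}X-\frac{1}{2}\sigma_{\hat{\beta}}^{2}}\Big)
  &= e^{\sigma_{\hat{\beta}}^{2}}-1
   = \frac{1}{1-\hat{\beta}^{2}}-1
   = \frac{\hat{\beta}^{2}}{1-\hat{\beta}^{2}}
   \;\longrightarrow\;\infty
   \quad\text{as } \hat{\beta}\nearrow 1 .
\end{align*}
```

Since $\sigma_{\hat{\beta}}\to\infty$ in this limit, the exponent $\sigma_{\hat{\beta}}X-\frac{1}{2}\sigma_{\hat{\beta}}^{2}$ tends to $-\infty$ in probability, which is how the variable can vanish in probability while keeping mean one.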
A further study of the (2+1)-dimensional directed polymer around the critical point $\hat{\beta}=1$ within the weak-disorder limit is undertaken in [9] by choosing the more refined inverse temperature scaling $\beta_{L,r}$ in Remark 2.2, which depends on a fixed parameter value $r\in\mathbb{R}$. This scaling satisfies $\beta_{L,r}=\frac{\sqrt{\pi}}{(\log L)^{1/2}}\big(1+\mathit{o}(1)\big)$ for $L\gg 1$, i.e., falls within the critical window of the phase transition (4.3) and is determined by the requirement that the variance of $\exp\{\beta_{L,r}\omega\}/\mathbb{E}\big[\exp\{\beta_{L,r}\omega\}\big]$, where $\omega$ is a disorder variable, has the large $L$ asymptotic form $\frac{\pi}{\log L}+\frac{\pi r}{\log^{2}L}+\mathit{o}\big(\frac{1}{\log^{2}L}\big)$. (The parameter $r$ is related to the parameter used in [7, 9] through a relation involving a quantity defined below (4.5).) For a time parameter, the authors define the following random measures on $\mathbb{R}^{2}$:
[TABLE]
where is the partition function for length polymers starting from position . Using a tightness argument involving bounds for the third moments of the variables for , the authors prove the existence of subsequential limits as such that converges in law to a random measure on satisfying
[TABLE]
where for the Euler-Mascheroni constant $\gamma$, and is a correlation kernel with logarithmic blowup around its diagonal from Bertini and Cancrini’s article [4] on the two-dimensional SHE. The above is related to a recent breakthrough on the moments of the two-dimensional SHE at criticality by Gu, Quastel, and Tsai [20]. When the form (4.5) is consistent with the existence of a (2+1)-dimensional continuum random polymer measure on , with total mass equal in distribution to the random variable , that is analogous to the (1+1)-dimensional case in [3] when the starting point of the polymer has an appropriate probability density (i.e., diffuse initial position). If denotes Wiener measure on for trajectories starting with initial position density , then two independently chosen trajectories will a.s. not intersect. In other words, the product Wiener measure assigns probability zero to the set of pairs of trajectories that intersect. If a continuum disordered polymer measure exists, would not be absolutely continuous with respect to , unlike the continuum (1+1)-dimensional polymer case (4.2).
Next we outline the rough analogy between models for directed polymers in a random environment on diamond hierarchical graphs and on rectangular lattices. Hierarchical graphs (“lattices”) are a frequent setting for statistical mechanical toy models because they may retain key characteristics of interest from their non-hierarchical analogs while providing a decomposability in terms of renormalization transformations; see for instance [17, 18, 21, 22, 23, 25, 28] for recent mathematical work. By the nature of their recursive construction, hierarchical models embed copies of themselves after a change in the controlling parameters for the embedded copies. The articles [15, 16] were the first to study models of directed polymers in a random environment on diamond hierarchical graphs. (This assertion about the history of directed polymers on the diamond lattice is from [14, Page 73].) In [23], Lacoin and Moreno analyzed the phase diagram of polymers on diamond graphs when the disorder variables are placed on the vertices, showing that
- strong disorder holds for any $\beta>0$ when $b\leq s$, and
- when $b>s$ there is a critical inverse temperature $\beta_{c}>0$ for which weak disorder holds when $\beta\leq\beta_{c}$ and strong disorder holds for $\beta$ above $\beta_{c}$.
In terms of their disorder relevance, the cases $b<s$, $b=s$, and $b>s$ are analogous respectively to the $d=1$, $d=2$, and $d\geq 3$ cases of (d+1)-dimensional polymers on the rectangular lattice. In the disorder relevant case, [1] proves a limit theorem for the partition functions in an intermediate disorder regime analogous to [2], and [11] defines a continuum polymer model similar to [3], although using GMC for the construction rather than Wiener chaos.
When the model is altered by placing disorder variables on the edges of the graphs rather than the vertices (as in Section 2), the analysis in [23] goes through essentially unchanged when $b<s$ or $b>s$, but for the marginal case of $b=s$ there is a basic combinatorial difference: for two directed polymers $p$ and $q$ chosen independently and uniformly at random,
- the expected number of vertices shared by $p$ and $q$ has order $\log L$ for $L\gg 1$, where $L$ is the length of the polymers (in terms of the parameter $n$, the polymer length has the form $L=s^{n}$), and
- the expected number of edges shared by $p$ and $q$ is exactly one, independent of $L$. A closer look shows that when $L\gg 1$ the polymers will share no edges at all with a probability $1-\mathit{O}\big(\frac{1}{\log L}\big)$, and that the expected number of common edges will be of order $\log L$ in the complementary event.
Thus, when $b=s$, the diamond graph polymer model with edge disorder is similar to the polymer measures underlying the mollified partition functions in (4.4) in the sense that two independent two-dimensional SSRW trajectories of length $L$ with initial spatial probability densities spread out on the order of $\sqrt{L}$ have a probability of intersecting that vanishes with order $\frac{1}{\log L}$ and, when conditioned on the event that the paths intersect, an expected number of intersections on the order of $\log L$.
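The overlap statistics for two independent uniform paths on the diamond graphs can be computed from simple recursions over the generations. The sketch below is an illustration rather than the article's argument; the function names are hypothetical. It uses the fact that the two paths choose the same first-generation branch with probability $1/b$, after which they behave as independent uniform paths on each of the $s$ embedded copies along that branch.

```python
from fractions import Fraction

def shared_edge_mean(b, s, n):
    """E[#shared edges]: E_0 = 1 and E_{k+1} = (s / b) * E_k."""
    mean = Fraction(1)
    for _ in range(n):
        mean = Fraction(s, b) * mean
    return mean

def shared_vertex_mean(b, s, n):
    """E[#shared non-root vertices]: V_0 = 0, V_{k+1} = ((s - 1) + s * V_k) / b."""
    mean = Fraction(0)
    for _ in range(n):
        mean = ((s - 1) + s * mean) / b
    return mean

def shared_edge_probability(b, s, n):
    """P(at least one shared edge): p_0 = 1, p_{k+1} = (1 - (1 - p_k)^s) / b."""
    p = 1.0
    for _ in range(n):
        p = (1.0 - (1.0 - p) ** s) / b
    return p

b = s = 2
for n in (5, 10, 20, 40):
    p = shared_edge_probability(b, s, n)
    print(n, shared_edge_mean(b, s, n), shared_vertex_mean(b, s, n), n * p, 1.0 / p)
```

For $b=s$ the mean number of shared edges stays equal to one, the mean number of shared vertices grows linearly in $n$ (that is, logarithmically in the path length $s^{n}$), and the probability of sharing any edge decays like a constant over $n$, so the conditional mean number of shared edges, which is the reciprocal of that probability, grows linearly in $n$.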
We will briefly summarize the continuum polymer model defined in [12] and its conditional Gaussian multiplicative chaos structure [13]. The limiting partition function law, , derived in later sections is equal in distribution to the total mass of a random measure on the space of directed paths crossing a compact diamond fractal, , having Hausdorff dimension two. Each directed path is an isometric embedding of the unit interval into the fractal, and there is a natural “uniform” probability measure on (serving as the analog of Wiener measure for the continuum (1+1)-dimensional polymer) for which . For directed paths , the set of intersection times is , and two paths chosen uniformly at random, i.e., according to the product measure , have a finite (trivial) number of intersections with probability one. In contrast, the random product measures a.s. assign positive weight to the set of pairs for which is uncountable, albeit of Hausdorff dimension zero. The size of typical can be characterized through the exponent case of the generalized Hausdorff measure on of the form
[TABLE]
where , and the infimum is over all coverings of by intervals of length less than ; see the monograph [24] for a discussion of the general theory of Hausdorff measures.
The qualitative difference (trivial to nontrivial) between the typical behavior of the intersection-times set under the pure measure and realizations of the disordered product measure is a strong localization property that is not present in the subcritical continuum models [3, 11]. To compare with the (1+1)-dimensional continuum polymer measures discussed above, the set of intersection times is appropriately measured by —which is closely related to the dimension- Hausdorff measure of —for both the product Wiener measure and realizations of . Secondly, in contrast with (4.2), the expectation of has Lebesgue decomposition with respect to given by
[TABLE]
where the measure assigns full weight to the set of pairs such that for all and for all , in other terms, for which has log-Hausdorff exponent one. The fact that $\mathbb{E}\big[\mathbf{M}_{r}\times\mathbf{M}_{r}\big]$ is not absolutely continuous with respect to $\mu\times\mu$ implies that $\mathbf{M}_{r}$ is not a subcritical GMC.
The random measure $\mathbf{M}_{r}$ is also not a “critical” GMC since the expectation $\mathbb{E}\big[\mathbf{M}_{r}\big]=\mu$ is a probability measure and thus $\sigma$-finite. The family of random measure laws , however, has a conditional interrelational GMC structure wherein for any the law of the random measure can be constructed from as
[TABLE]
where is a field over that is Gaussian when conditioned on and has a correlation kernel T(p,q)=\mathbb{E}\big{[}\widehat{W}_{\mathbf{M}_{r}}(p)\widehat{W}_{\mathbf{M}_{r}}(q)\,|\,\mathbf{M}_{r}\big{]} roughly equivalent to the generalized Hausdorff measure with exponent , , of the set of intersection times. Because the random measures converge in law to the pure measure as , the above formally implies that an infinite field strength is required to generate as a GMC on .
5 Notation and organization
Notation: In the remainder of the article, we refer exclusively to the case when the branching parameter and the segmenting parameter of the diamond graphs are equal ($b=s$). The dependence of all previously defined expressions on the parameter $s$ will be suppressed as in the following list of notational identifications:
[TABLE]
denotes the positive integers and . In heuristic discussions, we write for random variables and that are “close” in distribution.
Article organization:
- Section 6 builds up to the statement of Theorem 6.23 (bond-disorder #2), which is a slightly strengthened version of Theorem 2.7 (bond-disorder #1) that is couched in the language used in the proofs. Theorem 7.3 (bond-disorder #3) is a third version of this type of distributional convergence result that leverages more stringent moment conditions for greater control of the rate of convergence.
- Taken together, Sections 8 & 9 complete the proof of Theorem 6.23 (bond-disorder #2) after stating the key technical results in Proposition 9.1 and Lemmas 9.7-9.9 that support the proof.
- Sections 10 & 11 contain the proofs of Proposition 9.1 & Lemmas 9.7-9.9 with some of the relatively routine elements delayed to Section 12.
- Theorem 7.3 (bond-disorder #3) is proved in Section 13.
- Theorem 3.1 (site-disorder) is proved in Section 14.
- Proofs of propositions that are technical variations of results from [10] are placed in Section 15.
- Appendix A derives the inverse temperature scaling (2.5) from the variance scaling (2.7), Appendix B carries through an instructive consistency check between (I) and (II) of Lemma 2.3, and Appendix C provides some background on the zero bias approach [19] to Stein’s method.
6 Reformulation in terms of arrays and Wasserstein distance
This section defines the notation and terminology needed for the statement of Theorem 6.23, which is a more flexible version of Theorem 2.7. The language defined here is used throughout the remainder of the article.
6.1 Edge-labeled array notation
The recursive construction of the diamond hierarchical graphs outlined in Section 2.1 implies a canonical one-to-one correspondence between the set of edges, , of the -generation diamond graph and the -fold product set \big{(}\{1,\ldots,b\}\times\{1,\ldots,b\}\big{)}^{k}; see the diagram below illustrating this correspondence in the first- and second-generation graphs when . The hierarchical structure of the graphs also implies that for with each element is canonically identifiable with a -element subset of .
Notation 6.1** (Arrays).**
Let be real numbers labeled by for some .
- The notation denotes an element of , which we refer to as an *array*.
- If for some with , then denotes an element in , where we have abused notation by identifying with its canonically corresponding subset of .
Next we define an operation on edge-labeled arrays that can be used (see Proposition 6.5) to express the partition function (2.1).
Definition 6.2** (Array maps).**
For and , define for as the element in corresponding to the segment along the branch of the embedded copy of in identified with . [Footnote 9: This is to be understood in the context of the recursive construction of from in Section 2.1.]
- We define as the map that sends an array of real numbers to the contracted array
[TABLE]
- We define as the linearization of around the zero array:
[TABLE]
- We define , i.e., the “error” of the linearization.
- For , and refer to the -fold composition of the maps and , respectively.
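For readers who wish to experiment numerically, the maps of Definition 6.2 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the article's own formulation: it assumes that the contraction averages, over the b branches, the product over the b segments of one plus the array entries, and then subtracts one, which is the usual form for diamond graphs with equal branching and segmenting parameters. The storage convention (a NumPy array with 2k axes of length b, finest generation last) and the names `Q`, `L`, `E`, and `iterate` are illustrative choices.

```python
import numpy as np


def Q(x: np.ndarray) -> np.ndarray:
    """Assumed contraction map: for each coarse edge, average over branches i
    of prod_j (1 + x[..., i, j]), minus one. Removes the last two axes."""
    return np.prod(1.0 + x, axis=-1).mean(axis=-1) - 1.0


def L(x: np.ndarray) -> np.ndarray:
    """Linearization of Q around the zero array: (1/b) * sum_{i,j} x[..., i, j]."""
    b = x.shape[-1]
    return x.sum(axis=(-2, -1)) / b


def E(x: np.ndarray) -> np.ndarray:
    """Error of the linearization, E = Q - L."""
    return Q(x) - L(x)


def iterate(f, x: np.ndarray, times: int) -> np.ndarray:
    """Apply the array map f repeatedly (f^times)."""
    for _ in range(times):
        x = f(x)
    return x


if __name__ == "__main__":
    b, k = 2, 3
    rng = np.random.default_rng(0)
    # A generation-k array: one entry per element of ({1..b} x {1..b})^k.
    x = 0.01 * rng.standard_normal((b,) * (2 * k))
    print("peak under Q^k:", iterate(Q, x, k))
    print("peak under L^k:", iterate(L, x, k))
```

Under this convention, k applications of `L` reduce to the sum of all entries divided by b^k, that is, a sum normalized by the square root of the number of edges.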
Remark 6.3**.**
Note the ambiguity of the notations , , since we use them to denote maps from to for any .
Remark 6.4**.**
For , our notational conventions imply that
[TABLE]
The following proposition relates the array map to the partition function . The proof is placed in Section 12.1.
Proposition 6.5**.**
The partition function in (2.1) can be written in terms of the map as
[TABLE]
Remark 6.6**.**
Let be an array of i.i.d. centered random variables with variance .
- (i) and are i.i.d. arrays of centered random variables with variance and , respectively. In particular, the operation preserves the variance of the array variables.
- (ii) For and , the random variables and are uncorrelated. Thus the variables in the array have variance .
- (iii) Moreover, the random variable can be written as the following sum of uncorrelated terms: .
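The variance bookkeeping behind (i) and (ii) can be checked by a short computation. The following is a sketch under an assumption: it takes the contraction to have the branch-averaged product form written in the first comment line, and it uses the illustrative labels \sigma^{2} for the common variance and e\times(i,j) for the j-th segment on the i-th branch of the coarse edge e; neither label is taken from the article.

```latex
% Assumed form (b = s):
%   (\mathcal{Q}X)_e = \frac{1}{b}\sum_{i=1}^{b}\Big[\prod_{j=1}^{b}\big(1+X_{e\times(i,j)}\big)-1\Big],
%   (\mathcal{L}X)_e = \frac{1}{b}\sum_{i,j=1}^{b}X_{e\times(i,j)}.
\begin{align*}
\mathrm{Var}\big((\mathcal{L}X)_e\big)
  &= \frac{1}{b^{2}}\cdot b^{2}\sigma^{2} = \sigma^{2},\\
\mathrm{Var}\big((\mathcal{Q}X)_e\big)
  &= \frac{1}{b^{2}}\sum_{i=1}^{b}\mathrm{Var}\Big(\prod_{j=1}^{b}\big(1+X_{e\times(i,j)}\big)\Big)
   = \frac{1}{b}\Big[(1+\sigma^{2})^{b}-1\Big],\\
\mathrm{Cov}\big((\mathcal{L}X)_e,(\mathcal{Q}X)_e\big)
  &= \frac{1}{b^{2}}\sum_{i,j=1}^{b}\mathbb{E}\Big[X_{e\times(i,j)}\prod_{j'=1}^{b}\big(1+X_{e\times(i,j')}\big)\Big]
   = \sigma^{2},\\
\mathrm{Var}\big((\mathcal{E}X)_e\big)
  &= \frac{1}{b}\Big[(1+\sigma^{2})^{b}-1\Big]-\sigma^{2}.
\end{align*}
```

The third line gives \mathrm{Cov}(\mathcal{L}X,\mathcal{E}X)=\mathrm{Cov}(\mathcal{L}X,\mathcal{Q}X)-\mathrm{Var}(\mathcal{L}X)=0, which is the uncorrelatedness asserted in (ii), and the fourth line then follows by subtracting the first from the second.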
The lemma below generalizes (iii) in Remark 6.6 and identifies the main source of uncorrelated terms found in this article. The proof follows easily from the multilinear polynomial forms of the maps , , .
Lemma 6.7**.**
Let be an array of independent centered random variables with finite second moments. If for , then the random variables and are uncorrelated when at least one of the following sets is nonempty:
[TABLE]
Proof.
Suppose that . The multilinear polynomial is a linear combination of monomials for which the set must contain a pair satisfying the following property: there exist and such that , , , and . On the other hand, the multilinear polynomial does not contain any monomials of this type, so and are uncorrelated. ∎
Remark 6.8**.**
Note that if is an array of i.i.d. centered random variables with variance , then has the form of a central limit-type normalized sum since . More generally, if , then is an array of central limit-type normalized sums since .
In the following, we define terminology for the multilayer arrays determined by repeated application of when starting from a given edge-labeled array.
Definition 6.9**.**
Let be defined as in Definition 6.2 and .
- A -pyramidic array is a finite sequence in of arrays of real numbers satisfying \big{\{}x_{a}^{(k-1,n)}\big{\}}_{a\in E_{k-1}}=\mathcal{Q}\big{\{}x_{a}^{(k,n)}\big{\}}_{a\in E_{k}} for all .
- When we condense the superscript as for . Moreover, \big{\{}x_{a}^{(k,n)}\big{\}}_{a\in E_{k}}=\mathcal{Q}^{n-k}\big{\{}x_{h}^{(n)}\big{\}}_{h\in E_{n}} is referred to as the -pyramidic array generated from \big{\{}x_{h}^{(n)}\big{\}}_{h\in E_{n}}.
Remark 6.10**.**
When we remove the subscript from since .
Remark 6.11**.**
To distinguish the entire -pyramidic array from one of its subarray layers, \big{\{}x_{a}^{(k,n)}\big{\}}_{a\in E_{k}}, we will sometimes write \big{\{}x_{a}^{(*,n)}\big{\}}_{a\in E_{*}}.
6.2 Regular sequences of -pyramidic arrays of random variables
Next we narrow our focus to sequences of -pyramidic arrays of random variables. The following definition characterizes the assumptions that we use in our limit theorem in the next subsection.
Definition 6.12**.**
A sequence \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} of -pyramidic arrays of random variables taking values in will be said to be regular with parameter if the sequence of generating arrays \big{(}\{X_{h}^{(n)}\}_{h\in E_{n}}\big{)}_{n\in\mathbb{N}} satisfies the properties below.
- (I) For each , the random variables in the array are centered and i.i.d.
- (II) The variance of the random variables in the array has the large asymptotics
[TABLE]
- (III) For each , the moment of the random variables in the array vanishes as .
Moreover, \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} is minimally regular if (I)-(II) hold, but (III) is only assumed for .
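The role of the logarithmic correction in the variance asymptotics of (II) can be illustrated numerically. The sketch below rests on several assumptions that are not taken from the text above: the one-step variance map M(x) = ((1+x)^b - 1)/b for an i.i.d. array under the contraction, the variance form \kappa^{2}(1/n + \eta\log n/n^{2} + r/n^{2}) as it appears in condition (I) of Theorem 7.3, and the constants \kappa^{2} = 2/(b-1) and \eta = (b+1)/(3(b-1)), which are the values that make this form self-consistent under M up to order 1/n^{3}. The function names are hypothetical.

```python
import math


def variance_map(x: float, b: int) -> float:
    """Assumed one-step variance map M(x) = ((1 + x)^b - 1) / b."""
    return ((1.0 + x) ** b - 1.0) / b


def target_variance(n: int, r: float, b: int, with_log: bool = True) -> float:
    """Assumed variance form kappa^2 * (1/n + eta*log(n)/n^2 + r/n^2).

    kappa^2 and eta below are assumptions chosen for self-consistency under M.
    """
    kappa2 = 2.0 / (b - 1)
    eta = (b + 1) / (3.0 * (b - 1))
    log_term = eta * math.log(n) / n**2 if with_log else 0.0
    return kappa2 * (1.0 / n + log_term + r / n**2)


def one_step_mismatch(n: int, r: float, b: int, with_log: bool = True) -> float:
    """|M(v_n) - v_{n-1}|: how far the variance form is from being preserved
    when one layer of the pyramid is contracted."""
    v_n = target_variance(n, r, b, with_log)
    v_prev = target_variance(n - 1, r, b, with_log)
    return abs(variance_map(v_n, b) - v_prev)


if __name__ == "__main__":
    for use_log in (True, False):
        print(use_log, one_step_mismatch(1000, 0.0, 2, with_log=use_log))
```

With the logarithmic term included, the one-step mismatch is of smaller order than without it, which is one way to see why a term of order log n / n^2 must accompany the leading 1/n behavior.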
Remark 6.13**.**
The first example of a regular sequence \big{(}\big{\{}X_{a}^{(*,n)}\big{\}}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} of -pyramidic arrays that we have in mind is when the random variables in the generating arrays \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} are defined as in (6.1) with having the large asymptotics (2.5) for some . The variance criterion (II) in Definition 6.12 holds by (2.7), and the higher even moment criterion (III) follows from the fact that vanishes as .
Proposition 6.14 generalizes the result (2.9) in Theorem 2.4 about the convergence of the higher centered moments of . We omit the proof, which is the same as that of part (i) of Theorem 3.3 of [10]; indeed, the argument given there implicitly proves Proposition 6.14.
Proposition 6.14**.**
Let \big{(}\big{\{}X_{a}^{(*,n)}\big{\}}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} be a sequence of -pyramidic arrays of random variables generated from a sequence of arrays \big{(}\{X_{h}^{(n)}\}_{h\in E_{n}}\big{)}_{n\in\mathbb{N}} satisfying properties (I)-(II) in Definition 6.12 for some . If the even moment of the random variables in the array \big{(}\{X_{h}^{(n)}\}_{h\in E_{n}}\big{)}_{n\in\mathbb{N}} vanishes as , then for each the moment of the random variables X^{(0,n)}=\mathcal{Q}^{n}\big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} converges to as , where is the function in Theorem 2.4.
The statement of the following lemma is formulated to emphasize the connection with the properties (I)-(III) in Theorem 6.16 below that we use to characterize the limit law emerging as .
Lemma 6.15**.**
The statements below hold for any regular sequence \big{(}\big{\{}X_{a}^{(*,n)}\big{\}}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} of -pyramidic arrays with parameter .
- (I) For each and , the variables in the array \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} are i.i.d.
- (II) For each and , the array \mathcal{Q}\big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} is equal to \big{\{}X_{a}^{(k-1,n)}\big{\}}_{a\in E_{k-1}}.
- (III) For each and , the variables in the array are centered, and the variables have finite moment that converges to as for every and .
The above hold for minimally regular sequences except the convergence in (III) is only for .
Proof.
Statements (I) and (II) of Lemma 6.15 are immediate consequences of the definition of the variable arrays \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}}. To see (III), note that for we have X_{a}^{(k,n)}=\mathcal{Q}^{n-k}\big{\{}X_{h}^{(n)}\big{\}}_{h\in a\cap E_{n}}. By definition, the random variables have variance satisfying the large asymptotics (2.7), which we can rewrite in the form
[TABLE]
Notice that (6.3) has the form (2.7) with and replaced by and , respectively. It follows from Proposition 6.14 that the moment of X_{a}^{(k,n)}=\mathcal{Q}^{n-k}\big{\{}X_{h}^{(n)}\big{\}}_{h\in a\cap E_{n}} converges to as for each .∎
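The re-indexing step in the proof above is a short Taylor expansion. As a sketch, assume that (2.7) has the form appearing in condition (I) of Theorem 7.3, namely \kappa^{2}\big(\frac{1}{n}+\frac{\eta\log n}{n^{2}}+\frac{r}{n^{2}}\big)+\mathit{o}\big(\frac{1}{n^{2}}\big); then for fixed k and large n:

```latex
\begin{align*}
\frac{1}{n} &= \frac{1}{n-k}-\frac{k}{(n-k)^{2}}+\mathit{O}\Big(\frac{1}{n^{3}}\Big),\\
\frac{\log n}{n^{2}} &= \frac{\log(n-k)}{(n-k)^{2}}+\mathit{O}\Big(\frac{\log n}{n^{3}}\Big),\qquad
\frac{r}{n^{2}} = \frac{r}{(n-k)^{2}}+\mathit{O}\Big(\frac{1}{n^{3}}\Big),\\
\kappa^{2}\Big(\frac{1}{n}+\frac{\eta\log n}{n^{2}}+\frac{r}{n^{2}}\Big)
 &= \kappa^{2}\Big(\frac{1}{n-k}+\frac{\eta\log(n-k)}{(n-k)^{2}}+\frac{r-k}{(n-k)^{2}}\Big)+\mathit{o}\Big(\frac{1}{n^{2}}\Big).
\end{align*}
```

The shift of the parameter from r to r-k comes entirely from the second-order term in the expansion of 1/n.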
6.3 A limit theorem for -pyramidic arrays
Theorems 6.16 & 6.23 below are the main technical results of this article, and they are jointly proved in Section 9.3. Theorem 6.16 characterizes the limiting law for the distributional convergence statement in Theorem 6.23.
Theorem 6.16** (Limit law).**
For any , there exists a unique law on sequences in of edge-labeled arrays of random variables, \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}}, taking values in and holding the properties (I)-(III) below.
- (I) For each , the variables in the array \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}} are i.i.d.
- (II) For each , the array \big{\{}\mathbf{X}_{a}^{(k-1)}\big{\}}_{a\in E_{k-1}} is equal to \mathcal{Q}\big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}}.
- (III) For each , the variables in the array \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}} are centered and have moment equal to for all .
Notation 6.17**.**
In the case of the random variables from Theorem 6.16, i.e., the peak of the infinite -pyramidic array of random variables, we will drop the scripts & and optionally attach the parameter as a subscript: .
Remark 6.18**.**
By hierarchical symmetry, the random variables in the arrays \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}} from Theorem 6.16 with parameter are equal in distribution to .
Remark 6.19**.**
The limit law in Theorem 2.7 is equal in distribution to .
Remark 6.20**.**
Let \big{(}\big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}}\big{)}_{k\in\mathbb{N}_{0}} be a sequence of arrays of random variables satisfying the properties in the statement of Theorem 6.16. For the purpose of proving the uniqueness in Theorem 6.16, it will be useful to make the trivial observation that the sequence of -pyramidic arrays \big{(}\big{\{}\mathbf{X}_{a}^{(*,n)}\big{\}}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} defined by \big{\{}\mathbf{X}_{a}^{(k,n)}\big{\}}_{a\in E_{k}}\equiv\big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}} for is regular with parameter .
In the sequel we will evaluate the distance between measures on using Wasserstein- & - metrics.
Definition 6.21** (Wasserstein distance).**
For two Borel probability measures and on , let be the set of joint measures on with marginals and . For assume that and satisfy and . We define the Wasserstein- distance between and as
[TABLE]
If and are random variables with distributional measures and , respectively, then we extend our notation through the interpretation .
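For intuition, the Wasserstein distance between two empirical measures on the real line can be computed directly. The sketch below relies on the standard fact that, in one dimension and for p at least one, the optimal coupling of two samples of equal size pairs their order statistics; the function name `empirical_wasserstein` is a hypothetical helper, and the restriction to real-valued samples of equal size is an assumption of the sketch.

```python
import numpy as np


def empirical_wasserstein(xs, ys, p: float = 2.0) -> float:
    """Wasserstein-p distance between the empirical measures of two
    equal-size real samples, using the sorted (monotone) coupling."""
    xs = np.sort(np.asarray(xs, dtype=float))
    ys = np.sort(np.asarray(ys, dtype=float))
    if xs.shape != ys.shape:
        raise ValueError("samples must have the same size")
    return float(np.mean(np.abs(xs - ys) ** p) ** (1.0 / p))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.standard_normal(10_000)
    b = rng.standard_normal(10_000) + 0.5  # same shape of law, shifted by 0.5
    print(empirical_wasserstein(a, b, p=2.0))
```

A pure translation of a sample by a constant c has distance exactly |c| from the original sample, which gives a quick sanity check of the implementation.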
We prove the following proposition on the distributional continuity of in Section 12.1.
Proposition 6.22**.**
Let be defined as in Notation 6.17. The law of is a locally -Hölder continuous function of with respect to the Wasserstein- metric.
By Remark 6.13 the limit theorem below implies Theorem 2.7.
Theorem 6.23**.**
Let \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} be a minimally regular sequence of -pyramidic arrays of random variables with parameter . For any and , the Wasserstein-2 distance between and vanishes as , and, in particular, the i.i.d. array \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} (viewed as taking values in ) converges in law to \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}} for each .
Remark 6.24**.**
The hierarchical symmetry of the model implies that it is sufficient to prove Theorem 6.23 for the case in which the arrays \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} and \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}} are single random variables and , respectively. The proof of Theorem 6.23 involves writing X^{(0,n)}=\mathcal{Q}^{N}\big{\{}X_{e}^{(N,n)}\big{\}}_{e\in E_{N}} and \mathbf{X}=\mathcal{Q}^{N}\big{\{}\mathbf{X}_{e}^{(N)}\big{\}}_{e\in E_{N}} for with and introducing arrays of random variables \big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} (Definition 9.5) for which we show that and in an appropriately strong sense that is characterized in Proposition 9.1.
7 Rate of convergence under stricter moment assumptions
In this section we will state an alternative version of the limit result in Theorem 6.23 that offers more explicit rates of distributional convergence as under stronger moment assumptions on the arrays of random variables from which the -pyramidic arrays are generated. The conditions of the limit theorem easily translate into conditions for checking that a family of regular sequences of -pyramidic arrays of random variables depending on an auxiliary parameter is uniformly convergent with respect to the Wasserstein- metric (Corollary 7.5). The following definition characterizes our new assumptions.
Definition 7.1**.**
Fix some . A regular sequence of -pyramidic arrays of random variables \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} with parameter is said to be -sharply regular if the sequence of generating arrays \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfies the following more restrictive forms of (II) and (III) in Definition 6.12:
- (II*) The variance of the random variables in the array \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} has the asymptotics (6.2) with \mathit{o}\big{(}\frac{1}{n^{2}}\big{)} replaced by \mathit{O}\big{(}\frac{1}{n^{2+\alpha}}\big{)}.
- (III*) For each , the moment of the random variables in the array \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} is as .
Remark 7.2**.**
The sequence \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} of -pyramidic arrays generated by arrays \big{\{}X_{h}^{(n)}\}_{h\in E_{n}} defined as in (6.1) where has the large asymptotics (2.5) with \mathit{o}\big{(}\frac{1}{n^{3/2}}\big{)} replaced by \mathit{O}\big{(}\frac{1}{n^{3/2+\alpha}}\big{)} is -sharply regular. Property (III*) holds since is \mathit{O}\big{(}\frac{1}{n^{1/2}}\big{)} as and property (II*) follows from the computation in Appendix A.
The following theorem, which we prove in Section 13, implies that if \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} is an -sharply regular sequence of -pyramidic arrays of random variables with parameter , then the Wasserstein- distance between (i.e., the peak of the -pyramidic array in the sequence) and the limit law vanishes with order as for any choice of . By hierarchical symmetry, this generalizes to the convergence of the random variables \big{\{}X^{(k,n)}_{a}\big{\}}_{a\in E_{k}} in the higher generation (i.e., ) array layers. The statement of Theorem 7.3 is formulated to provide easily verifiable conditions under which a family of -sharply regular sequences of -pyramidic arrays of random variables can be shown to be uniformly convergent in law; see Corollary 7.5.
Theorem 7.3**.**
Fix , , , and a bounded interval . Define . There exists a positive number such that for any , , and i.i.d. array of centered random variables \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfying
- (I) \left|\textup{Var}\big{(}X_{h}^{(n)}\big{)}\,-\,\kappa^{2}\big{(}\frac{1}{n}+\frac{\eta\log n}{n^{2}}+\frac{r}{n^{2}}\big{)}\right|\,<\,\frac{\mathbf{v}}{n^{2+\alpha}} and
- (II) \mathbb{E}\left[\big{|}X_{h}^{(n)}\big{|}^{2\mathfrak{p}}\right]\,<\,\frac{\varkappa}{n^{\mathfrak{p}}},
the peak, , of the -pyramidic array, \big{\{}X_{a}^{(*,n)}\big{\}}_{a\in E_{*}}, generated by \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} has distance less than from with respect to the Wasserstein-2 metric.
Remark 7.4**.**
Our proof of Theorem 7.3 follows essentially the same track as the proof of Theorem 6.23 except for the use of technical lemmas that fit with this particular formulation of the distributional convergence. Through a different proof method, it may be possible to extend the range of the exponent to a larger interval, e.g., .
The next corollary is a direct consequence of Theorem 7.3.
Corollary 7.5**.**
Fix , , , and a bounded interval . Let be a function from a set into . For some and all , let \big{\{}X_{h}^{(n)}(s)\big{\}}_{h\in E_{n}} be an i.i.d. array of random variables satisfying conditions (I)-(II) in Theorem 7.3 with parameter . The inequality below holds for the in Theorem 7.3.
[TABLE]
Fix and . The following example applies Corollary 7.5 to uniformly approximate the random variables for in the interval by applied to an i.i.d. array , where the variables are log-normal perturbations of the variables from Theorem 6.16. The construction below is used in the proof of Proposition 6.22 and is closely related to the Gaussian multiplicative chaos construction in (4.7).
Example 7.6**.**
Let the array of random variables be defined as in Theorem 6.16 for some parameter value and be an array of independent standard Brownian motions independent of . For define
[TABLE]
Note that when the random variable is equal in distribution to by (II) of Theorem 6.16. The variance of has the large asymptotic form
[TABLE]
where we have used (II) of Lemma 2.3. Moreover, the error term is uniformly bounded by a single multiple of for all . By writing as a sum of and \big{(}1+\mathbf{X}_{h}^{(n)}\big{)}\big{(}e^{\frac{\kappa}{n}\mathbf{B}^{h}_{t}-\frac{\kappa^{2}}{2n^{2}}t}-1\big{)}, the even moment of can be shown to be \mathit{O}\big{(}\frac{1}{n^{\mathfrak{p}}}\big{)} using that
[TABLE]
The approximation above for when is from (II) of Theorem 2.4. It follows that the arrays \big{\{}X_{h}^{(n)}(r,t)\big{\}}_{h\in E_{n}} satisfy the conditions (I)-(II) of Theorem 7.3 for any fixed and all and for large enough . By Corollary 7.5, the random variables converge uniformly to over with respect to the Wasserstein- metric as .
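The variance claim in Example 7.6 can be traced through a short computation. The sketch below uses only the decomposition stated in the example, which amounts to X_{h}^{(n)}(r,t)=\big(1+\mathbf{X}_{h}^{(n)}\big)G-1 with G=e^{\frac{\kappa}{n}\mathbf{B}^{h}_{t}-\frac{\kappa^{2}}{2n^{2}}t}; the symbols G and \sigma_{n}^{2}:=\mathrm{Var}\big(\mathbf{X}_{h}^{(n)}\big) are illustrative labels, and the asymptotic form assumed for \sigma_{n}^{2} is the one in condition (I) of Theorem 7.3.

```latex
\begin{align*}
\mathbb{E}[G] &= 1, \qquad \mathbb{E}\big[G^{2}\big] = e^{\kappa^{2}t/n^{2}},\\
\mathrm{Var}\big(X_{h}^{(n)}(r,t)\big)
 &= \big(1+\sigma_{n}^{2}\big)\,e^{\kappa^{2}t/n^{2}}-1
  = \sigma_{n}^{2}+\big(1+\sigma_{n}^{2}\big)\big(e^{\kappa^{2}t/n^{2}}-1\big)\\
 &= \sigma_{n}^{2}+\frac{\kappa^{2}t}{n^{2}}+\mathit{O}\Big(\frac{1}{n^{3}}\Big)
  = \kappa^{2}\Big(\frac{1}{n}+\frac{\eta\log n}{n^{2}}+\frac{r+t}{n^{2}}\Big)+\mathit{o}\Big(\frac{1}{n^{2}}\Big).
\end{align*}
```

The first equality in the second line uses the independence of G and \mathbf{X}_{h}^{(n)} together with \mathbb{E}\big[\mathbf{X}_{h}^{(n)}\big]=0. In this reading the log-normal perturbation shifts the parameter from r to r+t, which is consistent with the uniform convergence statement at the end of the example.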
8 Existence of a limiting -pyramidic array of random variables
In this section we prove the existence of the infinite -pyramidic array of random variables described in Theorem 6.16. The proof is based on a routine tightness argument involving nested subsequences.
Proof of Theorem 6.16 (existence).
Let \big{(}\big{\{}X_{a}^{(*,n)}\big{\}}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} be a regular sequence of -pyramidic arrays of random variables with parameter , e.g., of the form in Remark 6.13. For any , and , the variance of converges to as by Lemma 6.15. In particular, for any fixed the sequence of random arrays indexed by , viewed as a random vector in , is tight. We define inductively in as a nested sequence of subsequences as follows:
- Let be a subsequence of such that the single-element array \big{\{}X_{a}^{(0,\,\xi_{n}^{(0)})}\big{\}}_{a\in E_{0}} converges in law as to a limit \big{\{}\mathbf{X}_{a}^{(0)}\big{\}}_{a\in E_{0}}.
- If for the sequence has been chosen so that the array \big{\{}X_{a}^{(k,\,\xi_{n}^{(k)})}\big{\}}_{a\in E_{k}} converges in law as to a limiting array \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}}, then we choose to be a subsequence of such that \big{\{}X_{a}^{(k+1,\,\xi^{(k+1)}_{n})}\big{\}}_{a\in E_{k+1}} converges in law to some limit \big{\{}\mathbf{X}_{a}^{(k+1)}\big{\}}_{a\in E_{k+1}}.
With the sequence in of limiting array laws constructed above, we will next consider properties (I)-(III). When it comes to property (II), we will first verify the equality in a distributional sense—see (8.1)—because the arrays constructed above may be defined on different probability spaces for different .
Property (I) follows immediately from the construction since all of the arrays, \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}}, used in the construction are i.i.d. For property (II) notice that for any
[TABLE]
where the second equality follows from part (II) of Lemma 6.15, and the third holds by the continuity of the map . It follows that for each the -pyramidic array generated from \big{\{}\mathbf{X}_{a}^{(k-1)}\big{\}}_{a\in E_{k-1}} is equal in distribution to the top layers of the -pyramidic array generated by \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}}. By the Kolmogorov extension theorem, the sequence in of arrays of random variables \big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}} can be defined on a single probability space such that \big{\{}\mathbf{X}_{a}^{(k-1)}\big{\}}_{a\in E_{k-1}} is a.s. equal to \mathcal{Q}\big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}}, as required by property (II). For property (III), Lemma 6.15 implies that the moment of converges to the limit for any and . Since this holds for all , we have that \mathbb{E}\big{[}(\mathbf{X}_{a}^{(k)})^{m}\big{]}=R^{(m)}(r-k) for all by uniform integrability.
The limiting random variables take values in since the random variables \big{\{}1+X_{h}^{(n)}\big{\}}_{h\in E_{n}} are nonnegative by their definition (6.1), and the form of the map implies that the arrays \big{\{}1+X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} for \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}}:=\mathcal{Q}^{n-k}\big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} must also be nonnegative. ∎
9 Uniqueness of the limiting -pyramidic array and universality
The goal of this section is to prove Theorem 6.23 and, simultaneously, the uniqueness part of Theorem 6.16 after stating the key propositions that enter into the proof. Section 9.1 contains the statement of Proposition 9.1, which is central to the organization of our analysis. In Section 9.2, we heuristically motivate the definitions of the arrays of random variables that have a role in the proof of Theorem 6.23, which is in Section 9.3.
9.1 -bound for a contractive dynamics on arrays of random variables
The following proposition provides a condition template by which we can show that the random variables \mathcal{Q}^{N}\big{\{}U_{e}^{(N)}\big{\}}_{e\in E_{N}} and \mathcal{Q}^{N}\big{\{}V_{e}^{(N)}\big{\}}_{e\in E_{N}} are close together under the metric on random variables provided that \big{\{}\big{(}U_{e}^{(N)},V_{e}^{(N)}\big{)}\big{\}}_{e\in E_{N}} is an i.i.d. array of (-valued) random variables and the variables and are close together in . In loose terms, we are bounding the sensitivity of the “dynamics” on arrays generated by the map to the initial conditions.
Proposition 9.1**.**
Fix some , and let . There exist and depending only on such that the statements (i)-(ii) below hold for any i.i.d. array \big{\{}\big{(}U_{e}^{(N)},V_{e}^{(N)}\big{)}\big{\}}_{e\in E_{N}} of centered -valued random variables for which has the variance bound
[TABLE]
- (i) If \mathbb{E}\big{[}\big{(}V_{e}^{(N)}-U_{e}^{(N)}\big{)}^{2}\big{]}<\delta/N^{4}, then
[TABLE]
- (ii) If \mathbb{E}\big{[}\big{(}V_{e}^{(N)}-U_{e}^{(N)}\big{)}^{2}\big{]}<\delta/N^{2} and the variables and are uncorrelated, then
[TABLE]
Remark 9.2**.**
In particular, if \big{\{}\big{(}U_{e}^{(N)},V_{e}^{(N)}\big{)}\big{\}}_{e\in E_{N}} is a sequence in of arrays of random variables satisfying the conditions of Proposition 9.1 and \mathbb{E}\big{[}\big{(}V_{e}^{(N)}-U_{e}^{(N)}\big{)}^{2}\big{]}=\mathit{o}\big{(}1/N^{4}\big{)}, then the distance between \mathcal{Q}^{N}\big{\{}V_{e}^{(N)}\big{\}}_{e\in E_{N}} and \mathcal{Q}^{N}\big{\{}U_{e}^{(N)}\big{\}}_{e\in E_{N}} vanishes with large .
Remark 9.3**.**
By the asymptotics for as in (II) of Lemma 2.3, the right side of (9.1) is equal to R(s-N)+\mathit{o}\big{(}\frac{1}{N^{2}}\big{)}. The statement of Proposition 9.1 is equivalent if is replaced by .
9.2 Defining intermediary distributional approximations
After the heuristic discussion below, we will state Definition 9.5, which defines the arrays of random variables appearing in the proof of Theorem 6.23. Lemmas 9.7-9.9 in the next subsection state bounds for the distance/Wasserstein- distance between the random variables in these arrays, providing opportunities to apply Proposition 9.1.
Let be a minimally regular sequence in of -pyramidic arrays of random variables. Proposition 9.1 combined with Remark 6.24 suggests a path for proving Theorem 6.23 by showing that for and the distance between the random variables and is small for some coupling of the variables. To help orient the reader towards the framework of the analysis in coming sections, we will motivate the definitions of three distributional approximations for the random variable that have roles in the proof of Theorem 6.23. The analysis will be founded on the introduction of intermediary generational scales between and that allow us to identify two sources of central limit-type renormalized sums—see (I) and (II) below—within an approximation for . It suffices for us to take
[TABLE]
for a large enough choice of . [Footnote 10: For the purpose of proving Theorem 6.23, can also be replaced by for any choice of in the definitions of and ; however, this is not optimal for Theorem 7.3.] In particular, when
[TABLE]
For notational neatness, we will suppress the dependence of these generational parameters on : and .
Remark 9.4**.**
To enable the reader to distinguish at a glance between arrays having the four distinct generational parameters , we will maintain a rigid indexing convention in which the arrays with generation numbers , , , are respectively dummy indexed by the letters , , , :
[TABLE]
Recall from (ii) of Notation 6.1 that given an array and some with , then refers to the subarray labeled by all canonically embedded in . From Definition 6.9 we can write X^{(N,n)}_{e}=\mathcal{Q}^{n-N}\big{\{}X_{h}^{(n)}\big{\}}_{h\in e\cap E_{n}}. For any defined as above with , this equality can be rewritten using the identity as
[TABLE]
The braced expressions above are central limit-type normalized sums (recall Remark 6.8), and thus admit Gaussian approximations when and :
- (I) For the variables in the array \big{\{}Y_{f}^{N,n}\big{\}}_{f\in e\cap E_{\mathbf{\widehat{n}}}}:=\mathcal{L}^{\mathbf{n}-\mathbf{\widehat{n}}}\mathcal{Q}^{n-\mathbf{n}}\big{\{}X_{h}^{(n)}\big{\}}_{h\in e\cap E_{n}} are approximately distributed as
[TABLE]
because the variables in the array \mathcal{Q}^{n-\mathbf{n}}\big{\{}X_{h}^{(n)}\big{\}}_{h\in e\cap E_{n}} have variance approximately equal to when by Lemma 6.15.
- (II) For \displaystyle Z_{f}^{N,n}:=\sum_{k=1}^{\mathbf{n}-\mathbf{\widehat{n}}}\mathcal{L}^{k-1}\mathcal{E}\mathcal{L}^{\mathbf{n}-\mathbf{\widehat{n}}-k}\mathcal{Q}^{n-\mathbf{n}}\big{\{}X_{h}^{(n)}\big{\}}_{h\in f\cap E_{n}}, the variable \displaystyle\overline{Z}_{e}^{N,n}:=\mathcal{L}^{\mathbf{\widehat{n}}-N}\big{\{}Z_{f}^{N,n}\big{\}}_{f\in e\cap E_{\mathbf{\widehat{n}}}} has approximate distribution
[TABLE]
The variance is the asymptotic variance of as as will be shown in Lemma 11.3.
The above line of heuristic reasoning suggests that variables in the array \big{\{}X^{(N,n)}_{e}\}_{e\in E_{N}} are close in distribution to variables in the array \big{\{}\mathbf{\widetilde{X}}^{(N)}_{e}\}_{e\in E_{N}} defined in (iii) of Definition 9.5 below. The random variables and in (i) & (ii) of Definition 9.5 serve as distributional intermediaries between and ; see the Wasserstein- bounds for their differences in Lemmas 9.7-9.9. Note that in (i) is merely a different way of writing (9.2).
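The algebra behind the decomposition into the terms in (I) and (II) can be made explicit. The identity used in the display of this subsection is not reproduced here, so the following is offered only as a sketch of one telescoping identity that follows from \mathcal{E}=\mathcal{Q}-\mathcal{L} and the linearity of \mathcal{L} alone; the exponent m is an illustrative label.

```latex
% For m >= 1, with \mathcal{E} := \mathcal{Q} - \mathcal{L} and \mathcal{L} linear:
\begin{align*}
\mathcal{Q}^{m}
 &= \mathcal{L}\mathcal{Q}^{m-1}+\mathcal{E}\mathcal{Q}^{m-1}\\
 &= \mathcal{L}^{2}\mathcal{Q}^{m-2}+\mathcal{L}\mathcal{E}\mathcal{Q}^{m-2}+\mathcal{E}\mathcal{Q}^{m-1}\\
 &\;\;\vdots\\
 &= \mathcal{L}^{m}+\sum_{k=1}^{m}\mathcal{L}^{k-1}\mathcal{E}\mathcal{Q}^{m-k}.
\end{align*}
```

Taking m=\mathbf{n}-\mathbf{\widehat{n}} and applying both sides to the array \mathcal{Q}^{n-\mathbf{n}}\big{\{}X_{h}^{(n)}\big{\}}, the first term is the array Y^{N,n} of (I), and replacing \mathcal{Q}^{m-k} by \mathcal{L}^{m-k} inside the sum yields the expression defining Z_{f}^{N,n} in (II). That replacement is where a partial linear approximation enters, in contrast to the exact identity above.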
Definition 9.5**.**
Let be defined as in (9.2) for a given value of , and let the i.i.d. arrays of random variables \big{\{}Y_{f}^{N,n}\big{\}}_{f\in E_{\mathbf{\widehat{n}}}}, \big{\{}\overline{Z}_{e}^{N,n}\big{\}}_{e\in E_{N}}, \big{\{}\mathbf{Y}_{f}^{(N)}\big{\}}_{f\in E_{\mathbf{\widehat{n}}}} and \big{\{}\mathbf{Z}_{e}^{(N)}\big{\}}_{e\in E_{N}} be defined as in (I) and (II) above.
- (i) We define variables in the array \big{\{}\widehat{X}^{N,n}_{e}\big{\}}_{e\in E_{N}} as
[TABLE]
- (ii) For \big{\{}Y_{f}^{N,n}\big{\}}_{f\in E_{\mathbf{\widehat{n}}}} and \big{\{}\mathbf{Z}_{e}^{(N)}\big{\}}_{e\in E_{N}} independent, we define the i.i.d. array \big{\{}\mathbf{\widehat{X}}^{N,n}_{e}\big{\}}_{e\in E_{N}} to have variables with distribution
[TABLE]
- (iii) For \big{\{}\mathbf{Y}_{f}^{(N)}\big{\}}_{f\in E_{\mathbf{\widehat{n}}}} and \big{\{}\mathbf{Z}_{e}^{(N)}\big{\}}_{e\in E_{N}} independent, we define the i.i.d. array \big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} to have variables with distribution
[TABLE]
Remark 9.6**.**
The superscripts of the variables , , , , , , , and refer to their dependence on the underlying generational parameters with , whereas the superscript of (with the parenthesis and two indices) denotes more specifically that the random variable is an element of the layer of a -pyramidic array generated from a generation- array, .
9.3 Proof of Theorem 6.23
We will prove Theorem 6.23 and the uniqueness part of Theorem 6.16 after stating the crucial Lemmas 9.7-9.9, whose proofs in Section 11 form the core of our technical analysis.
For with and , let the random variables , , be defined as in Section 9.2 for a minimally regular sequence, \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}}, of -pyramidic arrays with parameter and a choice of the parameter in the equations (9.2) defining and . The lemmas below imply that the pairs \big{(}X^{(N,n)}_{e},\widehat{X}^{N,n}_{e}\big{)}, \big{(}\widehat{X}^{N,n}_{e},\mathbf{\widehat{X}}_{e}^{N,n}\big{)}, and \big{(}\mathbf{\widehat{X}}_{e}^{N,n},\mathbf{\widetilde{X}}^{(N)}_{e}\big{)} satisfy the conditions (i) or (ii) of Proposition 9.1 when after appropriate couplings of the variables for the latter two pairs. The constants in the statements of the next three lemmas depend on and the sequence \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}}.
Lemma 9.7, which is proved in Section 11.1, bounds the error in resulting from the partial linear approximation in (9.2).
Lemma 9.7**.**
The random variables and are uncorrelated. There is a positive number such that for any the inequality below holds for all large enough .
[TABLE]
Lemma 9.8 provides a bound for the error, when measured in terms of the Wasserstein-2 distance, of the Gaussian approximation heuristically motivated in (I) of Section 9.2. The proof is in Section 11.3 and uses a perturbative generalization of Stein’s method that is discussed in Section 11.2.
Lemma 9.8**.**
There exists a positive number such that for any the inequality below holds for all large enough .
[TABLE]
Lemma 9.9 bounds the Wasserstein-2 distance error resulting from the Gaussian approximation heuristically motivated in (II) of Section 9.2. The proof is in Section 11.4 and uses a bound (Lemma 11.6) that follows from the zero bias approach to Stein’s method, which is discussed in Appendix C.
Lemma 9.9**.**
There exists a positive number such that for any the inequality below holds for all large enough .
[TABLE]
Remark 9.10**.**
By definition of , Lemmas 9.8 & 9.9 imply that there are couplings \big{(}\widehat{X}_{e}^{N,n},\mathbf{\widehat{X}}_{e}^{N,n}\big{)} and \big{(}\mathbf{\widehat{X}}_{e}^{N,n},\mathbf{\widetilde{X}}_{e}^{(N)}\big{)} such that \mathbb{E}\big{[}\big{(}\widehat{X}_{e}^{N,n}-\mathbf{\widehat{X}}^{N,n}_{e}\big{)}^{2}\big{]} and \mathbb{E}\big{[}\big{(}\mathbf{\widehat{X}}_{e}^{N,n}-\mathbf{\widetilde{X}}^{(N)}_{e}\big{)}^{2}\big{]} are for large .
Remark 9.11**.**
When applying Proposition 9.1 in the proof of Theorem 6.23, we only need that the bounds , , and in Lemmas 9.7-9.9 are respectively , , and for which it is sufficient to assume that for and .
The following easy corollary of Lemmas 9.7 - 9.9 verifies the condition (9.1) in the statement of Proposition 9.1 for the pairs of random variables discussed above, and its proof is in Section 12.2.
Corollary 9.12**.**
Define . For any the inequality \mathbb{E}\big{[}\big{(}U_{e}^{(N)}\big{)}^{2}\big{]}<R(-N)+\frac{\kappa^{2}s}{N^{2}} holds for equal to , , and for large enough and .
Remark 9.13**.**
The relevant sense of a given statement holding “for large enough and ” will always be that there exists a constant and an increasing function such that the statement is true whenever and .
Let us temporarily assume Proposition 9.1, Lemmas 9.7 - 9.9, and Corollary 9.12 to complete the remainder of the proof of Theorem 6.23. As in Corollary 9.12, we will define for the reason explained in Remark 9.11.
Proof of Theorem 6.23 and Theorem 6.16 (uniqueness part).
Let \big{(}\{X_{a}^{(*,n)}\}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} be a minimally regular sequence of -pyramidic arrays of random variables with parameter . By Remark 6.24 it suffices for us to focus on distributional convergence in the case in which the array \big{\{}X^{(k,n)}_{a}\big{\}}_{a\in E_{k}} consists of a single random variable, . We have divided the analysis below into parts (a)-(d).
(a) Setting up: For let the arrays of random variables \big{\{}\widehat{X}_{e}^{N,n}\big{\}}_{e\in E_{N}}, \big{\{}\mathbf{\widehat{X}}_{e}^{N,n}\big{\}}_{e\in E_{N}}, and \big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} be defined as in Definition 9.5. We will show that the Wasserstein- distance between and \mathcal{Q}^{N}\big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} converges to zero as and grow. Writing X^{(0,n)}=\mathcal{Q}^{N}\big{\{}X_{e}^{(N,n)}\big{\}}_{e\in E_{N}} and applying the triangle inequality yields
[TABLE]
The random variables \mathcal{Q}^{N}\big{\{}\widehat{X}_{e}^{N,n}\big{\}}_{e\in E_{N}} and \mathcal{Q}^{N}\big{\{}X_{e}^{(N,n)}\big{\}}_{e\in E_{N}} are already defined in the same probability space, and we will not require any special coupling between them. Notice that the expressions on the right side above have the form of those expressions bounded in Proposition 9.1.
(b) Verifying the conditions of Proposition 9.1: By Lemma 9.7 the variables and are uncorrelated, and there is a positive sequence with such that
[TABLE]
for any fixed and large enough . By Lemmas 9.8 & 9.9 and Remark 9.11, there is a positive sequence with and i.i.d. couplings \big{\{}\big{(}\widehat{X}_{e}^{N,n},\mathbf{\widehat{X}}_{e}^{N,n}\big{)}\big{\}}_{e\in E_{N}} and \big{\{}\big{(}\mathbf{\widehat{X}}_{e}^{N,n},\mathbf{\widetilde{X}}_{e}^{(N)}\big{)}\big{\}}_{e\in E_{N}} such that
[TABLE]
for any fixed and large enough . Corollary 9.12 implies that the arrays \big{\{}\widehat{X}_{e}^{N,n}\big{\}}_{e\in E_{N}}, \big{\{}\mathbf{\widehat{X}}_{e}^{N,n}\big{\}}_{e\in E_{N}}, \big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} satisfy condition (9.1) of Proposition 9.1 for any and large and . Moreover, the above considerations imply that for large enough and we have the following:
- the array \big{\{}\big{(}\widehat{X}_{e}^{N,n},X_{e}^{(N,n)}\big{)}\big{\}}_{e\in E_{N}} satisfies the conditions for part (ii) of Proposition 9.1 with \big{(}\widehat{X}_{e}^{N,n},X_{e}^{(N,n)}\big{)}=\big{(}U_{e}^{(N)},V_{e}^{(N)}\big{)},
- the arrays \big{\{}\big{(}\widehat{X}_{e}^{N,n},\mathbf{\widehat{X}}_{e}^{N,n}\big{)}\big{\}}_{e\in E_{N}} satisfy the conditions for part (i) of Proposition 9.1, and
- the arrays \big{\{}\big{(}\mathbf{\widehat{X}}_{e}^{N,n},\mathbf{\widetilde{X}}_{e}^{(N)}\big{)}\big{\}}_{e\in E_{N}} satisfy the conditions for part (i) of Proposition 9.1.
(c) Returning to (9.7): Therefore with three applications of Proposition 9.1 to the right side of (9.7) there is a such that for large enough and we have the first inequality below.
[TABLE]
The second inequality holds by Lemmas 9.7 - 9.9. As the above goes to zero by the asymptotic properties of and .
(d) Connecting with the random array constructed in Section 8: We have established that the Wasserstein- distance between and \mathcal{Q}^{N}\big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} vanishes as and grow. Let \big{\{}\mathbf{X}^{(k)}_{a}\big{\}}_{a\in E_{k}} be the sequence in of arrays of random variables for parameter constructed in Section 8 through subsequential distributional limits of \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} as . As mentioned in Remark 6.20, the arrays \big{\{}\mathbf{X}^{(k)}_{a}\big{\}}_{a\in E_{k}} form a parameter- regular sequence of -pyramidic arrays of random variables with no dependence. Thus we can apply the distributional convergence result that we have just proved to the special case \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}}:=\big{\{}\mathbf{X}_{a}^{(k)}\big{\}}_{a\in E_{k}} to get that the Wasserstein- distance between and \mathcal{Q}^{N}\big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} converges to zero as . Therefore, \rho_{2}\big{(}X^{(0,n)},\mathbf{X}\big{)} vanishes with large and the law of must be unique. ∎
10 Proof of Proposition 9.1
Proof.
By Remark 9.3, the condition (9.1) is equivalent to assuming that the variance of is smaller than . For , define the i.i.d. arrays of random variables
[TABLE]
and . The variables , , have mean zero, and has variance
[TABLE]
for \big{(}\sigma^{(N)}\big{)}^{2}:=\mathbb{E}\big{[}\big{(}U_{e}^{(N)}\big{)}^{2}\big{]}, where the second equality above holds by Remark 6.6 (note that has the same law as a generation partition function). The inequality uses our assumption that the variance of is smaller than , and the last equality is property (I) of Lemma 2.3.
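The variance computation above relies on the fact that one hierarchical generation acts on variances through a deterministic map. As a rough numerical sketch, assume (this is not stated in the present section) that the map is M(x) = ((1+x)^b - 1)/b and that \kappa^{2} = 2/(b-1); the names `variance_map` and `iterate_variance` are hypothetical. Under these assumptions, iterating M from an initial variance \kappa^{2}/N behaves like x \mapsto \kappa^{2}/(N-k) after k steps, which is the 1/N-type decay that the bounds involving R(s-N) encode.

```python
def variance_map(x, b):
    """Assumed one-generation variance map M(x) = ((1 + x)^b - 1) / b."""
    return ((1.0 + x) ** b - 1.0) / b


def iterate_variance(x0, b, steps):
    """Apply the assumed variance map `steps` times starting from x0."""
    x = x0
    for _ in range(steps):
        x = variance_map(x, b)
    return x


b = 2
kappa_sq = 2.0 / (b - 1)          # assumed value of kappa^2
N = 10_000
steps = N // 2

x_after = iterate_variance(kappa_sq / N, b, steps)
heuristic = kappa_sq / (N - steps)  # leading-order prediction kappa^2 / (N - k)

# The two numbers agree up to a lower-order (logarithmic) correction.
print(x_after, heuristic)
```

Since M is increasing on the nonnegative reals, a strict inequality between initial variances is preserved under iteration, which is how the assumed bound on the variance of the generation-N variables propagates to generation zero.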
We have the following recursive relation for the variables
[TABLE]
Since the arrays are i.i.d. and centered, the recursive formula above shows, by induction, that if is uncorrelated with for then is uncorrelated with for all and . In particular if and are uncorrelated, then \mathcal{Q}^{N}\big{\{}U_{e}^{(N)}\big{\}}_{e\in E_{N}} and \mathcal{Q}^{N}\big{\{}V_{e}^{(N)}\big{\}}_{e\in E_{N}}-\mathcal{Q}^{N}\big{\{}U_{e}^{(N)}\big{\}}_{e\in E_{N}} are uncorrelated.
Define the multivariate polynomial
[TABLE]
The form of the polynomial implies that there exists a such that
[TABLE]
for all with and . To see the above inequality, notice that a single term has absolute value bounded by , and we have when and when since .
Let \big{(}\varrho_{k}^{(N)}\big{)}^{2} denote the second moment of , and define u_{k}^{(N)}:=\mathbb{E}\big{[}U_{a}^{(k,N)}W_{a}^{(k,N)}\big{]}. Taking the second moment of (10.2) yields
[TABLE]
where the last inequality follows from (10.1).
In the following analysis, we will temporarily assume that and that is sufficiently far in the negative direction so that for all , which is possible by the asymptotics for as in (II) of Lemma 2.3. These assumptions ensure that the terms in the sums over below are positive and that the denominator is bounded away from zero. Recall that \big{(}\varrho_{N}^{(N)}\big{)}^{2}:=\mathbb{E}\Big{[}\big{(}V_{e}^{(N)}-U_{e}^{(N)}\big{)}^{2}\Big{]} for . Suppose that \big{(}\varrho_{N}^{(N)}\big{)}^{2}<\delta/N^{2+\epsilon}, where
[TABLE]
Note that because property (II) in Lemma 2.3 implies that the series \sum_{\ell=1}^{\infty}\big{(}R(s-\ell)\,-\,\frac{\kappa^{2}}{\ell-s}\big{)} and \sum_{\ell=1}^{\infty}\big{(}R(s-\ell)\big{)}^{2} are summable and because the asymptotics for implies that the infimum above is finite.
Let be the smallest such that \big{(}\varrho_{k}^{(N)}\big{)}^{2}\leq\big{(}R(s-k)\big{)}^{2}. Note that the inequality \big{(}\varrho_{N}^{(N)}\big{)}^{2}\leq\big{(}R(s-N)\big{)}^{2} holds by the assumption \big{(}\varrho_{N}^{(N)}\big{)}^{2}<\delta/N^{2+\epsilon} and the definition of , and thus we must have . For with k+1\in\big{[}\mathbf{k}^{(N)},N\big{]}, we have the inequality
[TABLE]
Notice that \big{(}\varrho_{k}^{(N)}\big{)}^{2} is smaller than \big{(}R(s-k)\big{)}^{2} because \big{(}\varrho^{(N)}_{N}\big{)}^{2}<\delta/N^{2+\epsilon}. Hence, and by induction on we can deduce that . Therefore we can apply the above inequality with to get
[TABLE]
where . Since \big{(}\varrho^{(N)}_{N}\big{)}^{2}:=\mathbb{E}\big{[}\big{(}V_{e}^{(N)}-U_{e}^{(N)}\big{)}^{2}\big{]}, the proof is complete in the case when is sufficiently far in the negative direction, i.e., for all for some .
For the general case of , pick large enough so that . Our previous result for implies that there exist such that for any ,
[TABLE]
The above uses that \big{(}\varrho^{(N)}_{n}\big{)}^{2}:=\mathbb{E}\big{[}\big{(}U_{a}^{(n,N)}-V_{a}^{(n,N)}\big{)}^{2}\big{]}, where , are distributed as generation partition functions and that we can write the argument of in the inequality \mathbb{E}\big{[}\big{(}U_{e}^{(N)}\big{)}^{2}\big{]}<R(s-N) in the form for with . Through iterating (10.4) times, we can get an inequality of the form \big{(}\varrho_{0}^{(N)}\big{)}^{2}\,\leq\,Q_{s}\big{(}\big{(}\varrho_{n}^{(N)}\big{)}^{2}\big{)} for a degree polynomial with nonnegative coefficients that depend on and no constant term. Since is differentiable and , there are such that for all . Combining (10.5) with this inequality for yields that for any ,
[TABLE]
This implies that the desired inequalities hold in the general case of . ∎
11 The three approximation lemmas
In this section, we will prove Lemmas 9.7-9.9. Recall from Sections 9.2 & 9.3 that Lemma 9.7 involves bounding the error of the partial linearization (9.2) and Lemmas 9.8 & 9.9 are Gaussian approximations of the terms (I) and (II) in (9.4) driven by central limit-type normalized sums that occur at different generational scales.
As before, let \big{(}\big{\{}X_{a}^{(*,n)}\big{\}}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} denote a minimally regular sequence of -pyramidic arrays with parameter . For , , and , we will frequently use the notation
[TABLE]
Note that as by (III) of Lemma 6.15 with .
11.1 Proof of Lemma 9.7
Proof of Lemma 9.7.
The variables and are uncorrelated by Lemma 6.7 and have mean zero, so the square of the distance between and can be written as
[TABLE]
The proof is complete since as . ∎
11.2 A generalization of Stein’s auxiliary functions
Before moving to the proof of Lemma 9.8 we will discuss a generalized version of the auxiliary functions used in Stein’s method [27], which is a general strategy for proving the central limit theorem under the Wasserstein- metric. For random variables and with , the Wasserstein- distance has the dual form
[TABLE]
where is the collection of all Lipschitz functions on with Lipschitz constant . Given define the auxiliary function
[TABLE]
The function solves the differential equation
[TABLE]
and has the following convenient uniform bounds on its first two derivatives:
[TABLE]
Thus if is a random variable with finite variance and then
[TABLE]
A useful feature of the auxiliary function, , is that the Wasserstein- distance between the distributions of and can be reduced to a quantity only involving .
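To make the role of the auxiliary function concrete, the sketch below works with the classical Stein equation f'(x) - x f(x) = h(x) - E[h(Z)] for a standard normal Z, which may be normalized differently from (11.5)-(11.7). For the test function h(x) = |x| (our choice for illustration), the solution has a closed form, and the code checks the equation by finite differences together with the classical bound sup|f'| <= sqrt(2/pi) for Lipschitz constant one. The function names are hypothetical.

```python
import math

C = math.sqrt(2.0 / math.pi)  # E|Z| for a standard normal Z


def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def stein_solution_abs(x):
    """Closed-form solution of f'(x) - x f(x) = |x| - E|Z|; f is odd."""
    if x <= 0.0:
        return 1.0 - 2.0 * math.exp(0.5 * x * x) * normal_cdf(x)
    return -stein_solution_abs(-x)


def derivative(x, eps=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (stein_solution_abs(x + eps) - stein_solution_abs(x - eps)) / (2.0 * eps)


def stein_residual(x):
    """Left side minus right side of the Stein equation at x."""
    return derivative(x) - x * stein_solution_abs(x) - (abs(x) - C)


points = [-3.0, -1.0, -0.2, 0.5, 2.0, 4.0]
max_residual = max(abs(stein_residual(x)) for x in points)

grid = [-5.0 + 0.01 * i for i in range(1001)]
max_derivative = max(abs(derivative(x)) for x in grid)

print(max_residual, max_derivative, C)
```

In this example the derivative bound is attained at x = 0, where f'(0) = -sqrt(2/pi).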
We will require a perturbative generalization of Stein’s method that bounds the Wasserstein- distance between random variables of the form and for variables , , satisfying that is centered with and is independent of . In other words, we would like to show how to bound the error of replacing the random variable with a standard normal independent of . In this case we will define an auxiliary function for a given that satisfies the following partial differential equation analogous to (11.6):
[TABLE]
The following proposition, whose proof is in Section 12.3, provides bounds for the first- and second-order partial derivatives of in analogy to (11.7).
Proposition 11.1**.**
Define for through the formula
[TABLE]
For all ,
[TABLE]
The trivial corollary below generalizes Proposition 11.1 to arbitrary variance .
Corollary 11.2**.**
Define for through the formula
[TABLE]
The function solves the partial differential equation
[TABLE]
and for all ,
[TABLE]
Proof.
Define and . Notice that we can write as
[TABLE]
Since , it follows that the first- and second-order derivatives of have the bounds in Proposition 11.1. From the equation we see that the derivatives of have the desired bounds. ∎
11.3 Proof of Lemma 9.8
For with , we will maintain the usual convention that , , and . Recall that is defined in (9.5), is defined above (9.6), and , are defined in Definition 9.5. We will need the following lemma, which collects some statements about the second and fourth moments of these random variables. The proof is in Section 12.3.
Lemma 11.3**.**
Let the random variables , , , be defined in terms of a minimally regular sequence of -pyramidic arrays of random variables \big{(}\big{\{}X^{(*,n)}_{a}\big{\}}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} with parameter .
- (i) The variance of Y_{f}^{N,n}:=\mathcal{L}^{\mathbf{n}-\mathbf{\widehat{n}}}\big{\{}X_{g}^{(\mathbf{n},n)}\big{\}}_{g\in f\cap E_{\mathbf{n}}} is , and . Moreover, is bounded from above and below by constant multiples of for all with .
- (ii) The variance of Z_{f}^{N,n}:=\sum_{k=1}^{\mathbf{n}-\mathbf{\widehat{n}}}\mathcal{L}^{k-1}\mathcal{E}\mathcal{L}^{\mathbf{n}-\mathbf{\widehat{n}}-k}\big{\{}X_{g}^{(\mathbf{n},n)}\big{\}}_{g\in f\cap E_{\mathbf{n}}} has the large convergence
[TABLE]
Moreover, is bounded by a constant multiple of for all with .
- (iii) There is a such that the fourth moments of the random variables and are respectively bounded by and for all with .
- (iv) There is a such that the fourth moments of the random variables , , are bounded by for all with .
Remark 11.4**.**
For (ii) of Lemma 11.3, note that is bounded from below by a constant multiple of for all as a consequence of (II) of Lemma 2.3 and since for and .
The lemma below, whose proof is in Section 12.3, follows easily from Hölder’s inequality and the definition of Wasserstein- distance.
Lemma 11.5**.**
For , let and be random variables with finite absolute moments. We have the following bound on the Wasserstein- distance between and using the Wasserstein- distance:
[TABLE]
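The Hölder step behind a bound of this type can be illustrated numerically. The sketch below is only an illustration with exponents chosen by us (1, 2, and 4), not the exact statement of Lemma 11.5: for any coupling, E|X-Y|^2 <= (E|X-Y|)^{2/3} (E|X-Y|^4)^{1/3}, and on the real line the monotone (sorted) coupling is optimal for every order p >= 1, so the same interpolation holds between the corresponding Wasserstein distances. The sample distributions and the helper name `empirical_wasserstein` are hypothetical.

```python
import random


def empirical_wasserstein(xs, ys, p):
    """W_p between two equal-size empirical measures via the sorted coupling."""
    assert len(xs) == len(ys)
    diffs = [abs(a - b) for a, b in zip(sorted(xs), sorted(ys))]
    return (sum(d ** p for d in diffs) / len(diffs)) ** (1.0 / p)


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# A bimodal sample, chosen only so that the two laws differ visibly.
ys = [rng.choice((-1.0, 1.0)) + 0.1 * rng.gauss(0.0, 1.0) for _ in range(2000)]

w1 = empirical_wasserstein(xs, ys, 1)
w2 = empirical_wasserstein(xs, ys, 2)
w4 = empirical_wasserstein(xs, ys, 4)

# Interpolation: W_2^2 <= W_1^(2/3) * W_4^(4/3).
print(w2 ** 2, w1 ** (2.0 / 3.0) * w4 ** (4.0 / 3.0))
```

The practical point is that a Wasserstein-1 estimate obtained from Stein's method can be upgraded to a Wasserstein-2 estimate at the cost of a fractional power, provided higher moments are controlled.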
Proof of Lemma 9.8.
This proof is divided into parts (a)-(g).
(a) Notation: For we can write and in the forms
[TABLE]
where the random variables , , are defined as
[TABLE]
and recall that is the normal random variable (independent of and ) defined in (9.6).
(b) Stein’s method: Next we will use Stein’s method to bound the Wasserstein- distance between and . By definition of Wasserstein- distance,
[TABLE]
For a given with Lipschitz constant less than , define as in Corollary 11.2 with . Then is a solution to the partial differential equation
[TABLE]
where the expectation is w.r.t. \mathbf{Z}^{(N)}_{e}\sim\mathcal{N}\big{(}0,\varsigma_{N}^{2}\big{)}. By Corollary 11.2, the first-order partial derivatives of are bounded by and the second-order partial derivatives are bounded by .
By (11.10) and (11.12), to bound the expression in the supremum of (11.11), we must bound the absolute value of
[TABLE]
As in the usual implementation of Stein’s method, we would like to tease out cancellations between (I) and (II) by writing the random variable in (II) as a sum of a “large” term, , and a “small” term, , and then Taylor expanding (II). The complicating feature here is that is not independent of .
(c) Identifying the dependent factors: Next we seek to separate out the dependence of the random variables and on the random variable for a given . More precisely, we can define a term such that the statements (i)-(iii) below hold for the -valued random variable \Delta_{f}^{N,n}:=\frac{1}{b^{\mathbf{\widehat{n}}-N}}\big{(}Y_{f}^{N,n}+Y_{f}^{N,n}B_{f}^{N,n},\,Z^{N,n}_{f}\big{)}.
- (i) The random variables , , have mean zero.
- (ii) The random vector \big{(}\bar{V}^{N,n}_{e}-\Delta_{f}^{N,n},B_{f}^{N,n}\big{)} is independent of \big{(}Y_{f}^{N,n},Z^{N,n}_{f}\big{)}.
- (iii) The random variables and are uncorrelated. Thus with (ii) the random variables , , are pairwise uncorrelated.
For the definition of is as follows:
[TABLE]
where the function , which maps arrays into , is defined below. (Footnote 11: Recall that for the indexing set is canonically identifiable with .) The variable is a multilinear function, , of the array \big{\{}Y^{N,n}_{f}\big{\}}_{f\in e\cap E_{\mathbf{\widehat{n}}}}, where
[TABLE]
Moreover, the partial derivative of with respect to has the form
[TABLE]
where is the -element subset of consisting of elements with the following three - restrictions: , there is a path in that passes over both and , and there is an element in that contains both and . (Footnote 12: The elements correspond to the in the above expression for \mathcal{F}\big{\{}y_{a}\big{\}}_{a\in E_{\mathbf{\widehat{n}}-N}}.)
Next we justify statements (i)-(iii). For statement (i), note that the variables , , are multilinear polynomials of the array \big{\{}X^{(\mathbf{n},n)}_{g}\big{\}}_{g\in e\cap E_{\mathbf{n}}} that have no constant term, and consequently these variables have mean zero. Statement (iii) follows from Lemma 6.7 because the random variables have the forms Y_{f}^{N,n}:=\mathcal{L}^{\mathbf{n}-\mathbf{\widehat{n}}}\big{\{}X^{(\mathbf{n},n)}_{g}\big{\}}_{g\in f\cap E_{\mathbf{n}}} and Z^{N,n}_{f}:=\sum_{k=1}^{\mathbf{n}-\mathbf{\widehat{n}}}\mathcal{L}^{k-1}\mathcal{E}\mathcal{L}^{\mathbf{n}-\mathbf{\widehat{n}}-k}\big{\{}X_{g}^{(\mathbf{n},n)}\big{\}}_{g\in f\cap E_{\mathbf{n}}}. Note, in particular, that and are functions of the random variables with . The form (11.14) of the multilinear polynomial implies that only depends on variables in the array \big{\{}X^{(\mathbf{n},n)}_{g}\big{\}}_{g\in e\cap E_{\mathbf{n}}} with . Hence is independent of \big{(}Y_{f}^{N,n},Z^{N,n}_{f}\big{)}. By using that \bar{Y}_{e}^{N,n}=\mathcal{F}\big{\{}Y^{N,n}_{f}\big{\}}_{f\in e\cap E_{\mathbf{\widehat{n}}}}, the difference between the -valued random variables and can be written as
[TABLE]
The multilinearity of implies that \mathcal{G}\big{\{}y_{\widehat{f}}\big{\}}_{\widehat{f}\in e\cap E_{\mathbf{\widehat{n}}}}:=\mathcal{F}\big{\{}y_{\widehat{f}}\big{\}}_{\widehat{f}\in e\cap E_{\mathbf{\widehat{n}}}}-y_{f}\frac{\partial\mathcal{F}}{\partial y_{f}}\big{\{}y_{\widehat{f}}\big{\}}_{\widehat{f}\in e\cap E_{\mathbf{\widehat{n}}}} does not depend on the variable . The right side of the display above is a function of the variables \big{(}Y_{\widehat{f}}^{N,n},\,Z^{N,n}_{\widehat{f}}\big{)} with and , and thus is independent of \big{(}Y_{f}^{N,n},\,Z^{N,n}_{f}\big{)}. In fact, these observations imply that and are jointly independent of \big{(}Y_{f}^{N,n},Z^{N,n}_{f}\big{)}, i.e., (ii).
With (11.14) and the triangle inequality, we can bound the norm of for by
[TABLE]
The inequality holds for some and all as a consequence of part (i) of Lemma 11.3.
(d) Stein analysis: Now we are ready to begin an analysis of the expression (11.3). By Taylor’s theorem to second-order, the expression inside the expectation in (II) has the form
[TABLE]
where is the 2-tensor of second-order derivatives and is some value between [math] and depending on and . The expectation of the first expression on the right side of (11.3) is zero by observations (i)-(iii) in part (c) above. By definition of , the second term on the right side of (11.3) can be written as
[TABLE]
Again by observations (i)-(iii) in part (c), the expectation of the first expression on the right side of (11.3) is zero.
As a consequence of the above remarks, taking the expectation of (11.3) leaves us with
[TABLE]
where we have used that is independent of to factor the first expectation on the right. The right-most expectation on the top line of (11.3) is equal to
[TABLE]
For \varsigma_{N,n}:=\mathbb{E}\big{[}\big{(}Z^{N,n}_{f}\big{)}^{2}\big{]}^{1/2}, combining (11.3) and (11.19) with (11.3) yields the equality
[TABLE]
In the above we have used that the expressions (III) and (IV) do not depend on the choice of and that there are elements in . The first term on the right side of (11.3) vanishes as because is bounded by and by part (ii) of Lemma 11.3. We will bound the last two terms on the right side of (11.3) in (e) and (f) below.
(e) Second term on the right side of (11.3): For any , the norm of the 2-tensor is bounded by since its components are smaller than as a consequence of Corollary 11.2. Thus we have the second inequality below.
[TABLE]
By (11.3), Lemma 11.3, and Remark 11.4, the above is bounded for all with by
[TABLE]
As the above is asymptotically proportional to since .
(f) Third term on the right side of (11.3): To bound the third term on the right side of (11.3), we can use that the vector has norm less times , i.e., the bound for the second-order partial derivatives of , and apply the Cauchy-Schwarz inequality to get
[TABLE]
By (11.3), Lemma 11.3, and Remark 11.4, the above is bounded for all with by
[TABLE]
As the above is asymptotically proportional to .
(g) Extension to the Wasserstein- distance: Our results in parts (b)-(f) can be summarized by stating that there is a such that for all large with
[TABLE]
where \xi_{N}^{\prime}(n):=\sqrt{\frac{\pi}{2}}\big{|}\frac{\varsigma_{N,n}^{2}}{\varsigma_{N}}-\varsigma_{N}\big{|}. As mentioned below (11.3), vanishes as for any fixed . By applying Lemma 11.5 with , we have that
[TABLE]
The limit superior of the above as is bounded by a constant multiple of by (11.21) and part (iv) of Lemma 11.3. ∎
11.4 Proof of Lemma 9.9
The following lemma is a central limit theorem in which the distance between a normalized sum of i.i.d. random variables and a centered normal random variable of the same variance is measured in terms of the Wasserstein-1 distance. We include a proof using the zero bias transformation of Goldstein and Reinert [19] in Appendix C.
Lemma 11.6**.**
Let be i.i.d. centered random variables with variance and finite third absolute moment. Then for and
[TABLE]
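As a sanity check on the n^{-1/2} rate in a central limit theorem of this kind, the sketch below computes the Wasserstein-1 distance between a normalized sum of n Rademacher signs (our choice of example, with unit variance and unit third absolute moment) and a standard normal. It uses the one-dimensional identity that the Wasserstein-1 distance equals the integral of |F - G| for the two distribution functions, evaluated with a midpoint rule. The function name and the truncation parameters are hypothetical, and the constant in Lemma 11.6 is not asserted here.

```python
import math
from bisect import bisect_right


def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def w1_rademacher_sum_vs_normal(n, half_width=10.0, steps=40000):
    """Wasserstein-1 distance between S_n / sqrt(n) and N(0, 1).

    S_n is a sum of n independent +-1 signs, so its law is an explicit
    shifted binomial. The integral of |F_n - Phi| is truncated to
    [-half_width, half_width] and approximated by the midpoint rule.
    """
    atoms = [(2 * k - n) / math.sqrt(n) for k in range(n + 1)]
    cumulative = []
    total = 0.0
    for k in range(n + 1):
        total += math.comb(n, k) / 2.0 ** n
        cumulative.append(total)

    dx = 2.0 * half_width / steps
    acc = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        j = bisect_right(atoms, x)          # number of atoms <= x
        f_n = cumulative[j - 1] if j > 0 else 0.0
        acc += abs(f_n - normal_cdf(x)) * dx
    return acc


w16 = w1_rademacher_sum_vs_normal(16)
w64 = w1_rademacher_sum_vs_normal(64)
print(w16, w64, w16 / w64)
```

Quadrupling n roughly halves the distance, consistent with a bound proportional to the third absolute moment divided by the square root of n.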
The next corollary applies Lemma 11.5 to the above result. The proof is at the end of Appendix C.
Corollary 11.7**.**
Let us take the conditions of Lemma 11.6 and assume in addition that the fourth moment of the random variables is finite. Then for any
[TABLE]
Proof of Lemma 9.9.
For the variables and have the form
[TABLE]
for and , respectively, where \big{\{}Y_{f}^{N,n}\big{\}}_{f\in e\cap E_{\mathbf{\widehat{n}}}} and \big{\{}\mathbf{Y}_{f}^{(N)}\big{\}}_{f\in e\cap E_{\mathbf{\widehat{n}}}} are defined as in (9.5) and independent of . In the analysis below, we bound the Wasserstein- distance between and after choosing i.i.d. couplings \big{(}Y_{f}^{N,n},\mathbf{Y}_{f}^{(N)}\big{)} for .
(a) Using i.i.d. couplings to bound the Wasserstein- distance: For each , let be a coupling of the variables and such that
[TABLE]
With this coupling, we can bound the Wasserstein- distance between and as follows:
[TABLE]
where for the arrays within the expectations above are defined as
[TABLE]
(b) Bounding the inner summand on the second line of (11.4): Recall from (i) of Lemma 11.3 and (9.5), respectively, that the variables and have variances \textup{Var}\big{(}Y^{N,n}_{f}\big{)}=\sigma_{\mathbf{n},n}^{2} and \textup{Var}\big{(}\mathbf{Y}^{(N)}_{f}\big{)}=R(r-\mathbf{n}). Consequently, elements in the above arrays have variances \textup{Var}\big{(}\widetilde{Y}^{N,n}_{\mathbf{f}}\big{)}=\sigma_{\mathbf{n},n}^{2} and \textup{Var}\big{(}\mathbf{\widetilde{Y}}^{(N)}_{\mathbf{f}}\big{)}\,=\,R(r-\mathbf{n}) since preserves the variance of the array variables. For any , we can write the summand in (11.4) in the form
[TABLE]
(c) Going back to (11.4): The first term on the right side of (11.4) is equal to \big{(}\rho_{2}\big{(}Y^{N,n}_{f},\mathbf{Y}_{f}^{(N)}\big{)}\big{)}^{2} for any representative by the definition of the couplings in (11.22) and since \big{|}e\cap E_{\mathbf{\widehat{n}}}\big{|}=b^{2(\mathbf{\widehat{n}}-N)}. Similarly, as a consequence of (11.24), the second term on the right side of (11.4) is bounded from above by \mathbf{c}\frac{\mathbf{\widehat{n}}-N}{N}\big{(}\rho_{2}\big{(}Y^{N,n}_{f},\mathbf{Y}_{f}^{(N)}\big{)}\big{)}^{2}. Thus for all with
[TABLE]
where the second inequality holds for some since . Thus we have shown that \rho_{2}\big{(}\mathbf{\widehat{X}}_{e}^{N,n},\mathbf{\widetilde{X}}_{e}^{(N)}\big{)} is bounded by a constant multiple of \rho_{2}\big{(}Y^{N,n}_{f},\mathbf{Y}_{f}^{(N)}\big{)}.
(d) Bounding the right side of (11.25): Next we focus on bounding \rho_{2}\big{(}Y^{N,n}_{f},\mathbf{Y}_{f}^{(N)}\big{)}. Since has variance and has variance , it will be convenient to use the triangle inequality to get
[TABLE]
Using that \textup{Var}\big{(}\mathbf{Y}_{f}^{(N)}\big{)}=R(r-\mathbf{n}), the first term on the right side of (11.26) can simply be bounded by
[TABLE]
By definition, is a sum of the i.i.d. random variables over , which contains elements. Hence, by Corollary 11.7 we have the inequality below for the first term on the right side of (11.26).
[TABLE]
The second inequality holds for some since \mathbb{E}\big{[}\big{|}X^{(\mathbf{n},n)}_{g}\big{|}^{4}\big{]} is bounded from above by a constant multiple of for all with by (iv) of Lemma 11.3. The last term in (11.28) is asymptotically proportional to as since .
(e) Conclusion: The inequalities (11.25)-(11.28) show that there is a such that for all with
[TABLE]
where \varsigma_{N}^{\prime\prime}(n):=\mathfrak{c}\big{|}\sigma_{\mathbf{n},n}-\sqrt{R(r-\mathbf{n})}\big{|}. The term vanishes as since by (III) of Lemma 6.15 with , and hence the proof is complete. ∎
12 Miscellaneous proofs from Sections 6, 9, & 11
12.1 Proofs from Section 6
Proof of Proposition 6.5.
We will prove the identity (6.1) using induction starting from . When , the set contains a single element , and the identity follows immediately from the definitions:
[TABLE]
Suppose that the identity (6.1) holds for some . The hierarchical nesting that defines the sequence of diamond graphs implies that there is a one-to-one correspondence between the set of generation- paths, , crossing and the set of -tuples with and . Within this identification, labels the branch of that traces over and for is the trajectory of through the copy of along the branch. In particular, it follows that . Using this bijection, we can rewrite the partition function as
[TABLE]
Hence the identity (6.1) holds for all by induction.∎
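The bijection used in the induction step can be checked by brute force on small graphs. The sketch below is a toy model under our own conventions, not a transcription of (6.1): a generation-0 graph is a single edge weight, a generation-(n+1) graph is a b-by-b nested list (branch, then segment) of generation-n graphs, and the normalized partition function is the average over directed paths of the product of edge weights. The code compares the path average with the one-step recursion W_{n+1} = (1/b) \sum_i \prod_j W_n^{(i,j)} and confirms the count |\Gamma_{n+1}| = b|\Gamma_n|^b. All function names are hypothetical.

```python
import itertools
import math
import random


def random_weights(n, b, rng):
    """Nested b-by-b structure of positive edge weights for generation n."""
    if n == 0:
        return rng.uniform(0.5, 1.5)
    return [[random_weights(n - 1, b, rng) for _ in range(b)] for _ in range(b)]


def path_products(w):
    """Product of edge weights along every directed path, by enumeration."""
    if not isinstance(w, list):
        return [w]
    products = []
    for branch in w:  # choose a branch, then one sub-path per segment
        per_segment = [path_products(segment) for segment in branch]
        for combo in itertools.product(*per_segment):
            products.append(math.prod(combo))
    return products


def recursive_partition(w):
    """Normalized partition function via the one-generation recursion."""
    if not isinstance(w, list):
        return w
    b = len(w)
    return sum(
        math.prod(recursive_partition(segment) for segment in branch)
        for branch in w
    ) / b


rng = random.Random(0)
weights = random_weights(3, 2, rng)       # generation 3 with b = 2
products = path_products(weights)
path_average = sum(products) / len(products)
recursive_value = recursive_partition(weights)
print(len(products), path_average, recursive_value)
```

With b = 2 the path counts are 1, 2, 8, 128 for generations 0 through 3, so exhaustive enumeration is only feasible for very small graphs, whereas the recursion costs one pass over the edges.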
Proof of Proposition 6.22.
We will prove that the law of is a locally -Hölder continuous function of with respect to the Wasserstein- metric by showing that for all and
[TABLE]
where the function has a continuous—and thus locally bounded—derivative by Lemma 2.3. For any we can construct as , where the array of random variables is defined as in Theorem 6.16 for parameter . Let be an array of independent normal random variables with mean [math] and variance that is independent of . Define X_{n,r,t}^{\mathbf{B}}:=\mathcal{Q}^{n}\big{\{}X_{h}^{(n)}(r,t)\big{\}}_{h\in E_{n}} for X_{h}^{(n)}(r,t):=\big{(}1+\mathbf{X}_{h}^{(n)}\big{)}\textup{exp}\big{\{}\frac{\kappa}{n}\mathbf{B}^{h}_{t}-\frac{\kappa^{2}}{2n^{2}}t\big{\}}-1, i.e., as in Example 7.6. By the triangle inequality, we can bound the Wasserstein- distance between and by
[TABLE]
The second term on the right side of (12.2) converges to zero as by the discussion in Example 7.6. The random variables and are uncorrelated since is the conditional expectation of given . Thus, since and have mean zero,
[TABLE]
To see that \textup{Var}\big{(}X^{\mathbf{B}}_{n,r,t}\big{)} converges to as , notice that
[TABLE]
where the first and third equalities hold by part (i) of Remark 6.6 and Lemma 2.3, respectively. The second equality above follows from (7.2). Therefore we have established the inequality (12.1). ∎
12.2 Proofs from Section 9
Proof of Corollary 9.12.
The random variables and are uncorrelated as a consequence of Lemma 6.7, and thus
[TABLE]
where the convergence holds by (III) of Lemma 6.15 with . The equality holds for by the asymptotics for as in (II) of Lemma 2.3. If , then the right side above is smaller than for . Thus we have verified the desired condition in the case for any and large enough .
Next we extend our result to the case . By Lemma 9.8 and Remark 9.11, there are couplings between the random variables and such that the limit superior as of \mathbb{E}\big{[}\big{(}\widehat{X}^{N,n}_{e}-\mathbf{\widehat{X}}^{N,n}_{e}\big{)}^{2}\big{]} is \mathit{o}\big{(}\frac{1}{N^{4}}\big{)} for . By expanding the square and applying the Cauchy-Schwarz inequality, we get
[TABLE]
Since \limsup_{n\rightarrow\infty}\mathbb{E}\big{[}(\widehat{X}^{N,n}_{e})^{2}\big{]}\leq R(r-N) by (12.4) and is \mathit{O}\big{(}\frac{1}{N}\big{)} for as a consequence of (II) of Lemma 2.3, the limit superior of the middle term above as is \mathit{o}\big{(}\frac{1}{N^{5/2}}\big{)} with large . Thus \limsup_{n\rightarrow\infty}\mathbb{E}\big{[}(\mathbf{\widehat{X}}^{N,n}_{e})^{2}\big{]} is bounded by \limsup_{n\rightarrow\infty}\mathbb{E}\big{[}(\widehat{X}^{N,n}_{e})^{2}\big{]}+\mathit{o}\big{(}\frac{1}{N^{5/2}}\big{)}, which is smaller than when for any choice of . Hence we have extended our result to the case , and the same reasoning applies to . ∎
12.3 Proofs from Section 11
Proof of Proposition 11.1.
The bounds and are equivalent to (11.7), so we can focus on the partial derivatives , , and . Define and . We can rewrite in terms of as
[TABLE]
Moreover, we can rewrite in the form
[TABLE]
where is the kernel
[TABLE]
The results will follow by bounding \sup_{z\in{\mathbb{R}}}\int_{{\mathbb{R}}}\big{|}(\mathbf{d}G)(z,r)\big{|}dr for the derivatives \mathbf{d}\in\big{\{}\partial_{r},\partial_{r}^{2},\partial_{z}\partial_{r}\big{\}}.
The first partial derivative with respect to has the form
[TABLE]
For any , the equality \int_{{\mathbb{R}}}\big{|}\partial_{r}G(z,r)\big{|}dr\,=\,2\sqrt{2\pi}e^{\frac{z^{2}}{2}}\phi_{-}(z)\phi_{+}(z) holds, and the right side attains its maximum value, , when .
The second-order partial derivatives involving have the forms and , where
[TABLE]
Notice that and are nonnegative for all , and thus we simply have
[TABLE]
Therefore \sup_{z\in{\mathbb{R}}}\int_{{\mathbb{R}}}\big{|}(\mathbf{d}G)(z,r)\big{|}dr\leq 2 for and . ∎
The following proposition gives uniform bounds for the second and fourth moments of random variables from a minimally regular sequence of -pyramidic arrays. We prove Proposition 12.1 in Section 15.1 using techniques and an inequality from [10].
Proposition 12.1**.**
Let \big{(}\big{\{}X^{(*,n)}_{a}\big{\}}_{a\in E_{*}}\big{)}_{n\in\mathbb{N}} be a minimally regular sequence of -pyramidic arrays of random variables.
- (i)
The variances of the random variables are bounded from above and below by positive multiples of for all and . 2. (ii)
The fourth moments of the random variables are bounded from above by a multiple of for all and k\in\big{\{}0,\ldots,\lfloor n/2\rfloor\big{\}}.
We will prove the next lemma in Section 15.2. In a basic sense, the proof uses the same idea as the proof of Lemma 6.7 although the analysis is made more complex by the fourth moment.
Lemma 12.2**.**
For , let be an array of i.i.d. centered random variables with finite fourth moment. Define for . There is a not depending on the distribution of the variables such that the following inequality holds for all :
[TABLE]
Proof of Lemma 11.3.
Part (i): For the variance of Y_{f}^{N,n}\,=\,\mathcal{L}^{\mathbf{n}-\mathbf{\widehat{n}}}\big{\{}X_{g}^{(\mathbf{n},n)}\big{\}}_{g\in f\cap E_{\mathbf{n}}} is \sigma_{\mathbf{n},n}^{2}:=\textup{Var}\big{(}X_{g}^{(\mathbf{n},n)}\big{)} since the operation preserves the variance of the random variables in the array by Remark 6.6. The convergence of to as holds by (III) of Lemma 6.15 with . Finally, is bounded from above and below by constant multiples of for all with by Proposition 12.1 since .
Part (ii): Since terms in the sum Z^{N,n}_{f}=\sum_{k=\mathbf{\widehat{n}}+1}^{\mathbf{n}}\mathcal{L}^{k-\mathbf{\widehat{n}}-1}\mathcal{E}\mathcal{L}^{\mathbf{n}-k}\big{\{}X_{g}^{(\mathbf{n},n)}\big{\}}_{g\in f\cap E_{\mathbf{n}}} are uncorrelated by Lemma 6.7, we have the second equality below.
[TABLE]
By definition of , the above expression has the form .
Next we argue that is bounded from above by a constant multiple of . By (12.6), we have that \varsigma_{N,n}^{2}:=(\mathbf{n}-\mathbf{\widehat{n}})S\big{(}\sigma_{\mathbf{n},n}^{2}\big{)}, where the polynomial has no constant or linear terms. Since the lowest-order nonzero term in the polynomial is quadratic, part (i) of Proposition 12.1 implies that S\big{(}\sigma_{\mathbf{n},n}^{2}\big{)} is bounded by a constant multiple of for all with . The result then follows because for .
Part (iii): For , define \sigma^{(4)}_{\mathbf{n},n}\,:=\,\mathbb{E}\Big{[}\big{(}X_{g}^{(\mathbf{n},n)}\big{)}^{4}\Big{]}. Also, for and with , we define
[TABLE]
Note that \widetilde{\sigma}^{(2)}_{k,\mathbf{n},n}=\textup{Var}\big{(}X_{g}^{(\mathbf{n},n)}\big{)}=:\sigma^{2}_{\mathbf{n},n}, and Jensen’s inequality implies that
[TABLE]
The second inequality above holds for some and all with by (ii) of Proposition 12.1 and since for . Applying (12.7) with yields our desired bound for \mathbb{E}\big{[}\big{(}Y_{f}^{N,n}\big{)}^{4}\big{]}\,=\,\widetilde{\sigma}^{(4)}_{\mathbf{\widehat{n}},\mathbf{n},n}.
Let . By Lemma 12.2 the fourth moment of has the bound
[TABLE]
For , define \big{\{}\check{X}_{a}^{N,n}\big{\}}_{a\in f\cap E_{k}}:=\mathcal{L}^{\mathbf{n}-k}\big{\{}X_{g}^{(\mathbf{n},n)}\big{\}}_{g\in f\cap E_{\mathbf{n}}}. A single term from the sum in (12.8) has the bound
[TABLE]
The first inequality above is Jensen’s, and the second inequality holds for all with by (12.7). Thus by (12.8), (12.9), and the form of the polynomial , the fourth moment of is bounded from above by a multiple of .
Part (iv): Since for , an application of (ii) of Proposition 12.1 with yields that the fourth moment of is bounded by a constant multiple of for all with . The fourth moment bounds for and can be proven using the techniques in the proof of (iii). (Footnote 13: Also, see the proof of part (iv) of Lemma 13.4 in Section 13.3, which is an analogous result for general even moments under -sharp regularity-type assumptions.) ∎
Proof of Lemma 11.5.
Let be a coupling such that the -distance between the variables and is equal to . Since is an infimum of the distance over couplings,
[TABLE]
∎
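The step of passing from one coupling distance to another can be illustrated numerically. The sketch below checks the standard Hölder interpolation inequality ‖D‖₂ ≤ ‖D‖₁^{1/3}‖D‖₄^{2/3} for the difference D of a coupled pair. This is a generic inequality of the type used when a Wasserstein-2 distance is controlled through a Wasserstein-1 distance and fourth moments, and it is not claimed to be the exact statement of Lemma 11.5. The sample values and the function names `lp_norm` and `interpolation_bound` are hypothetical.

```python
def lp_norm(diffs, p):
    """Empirical L^p norm (mean of |d|^p)^(1/p) of a list of differences."""
    return (sum(abs(d) ** p for d in diffs) / len(diffs)) ** (1.0 / p)


def interpolation_bound(diffs):
    """Right side of the Hoelder interpolation ||D||_2 <= ||D||_1^(1/3) * ||D||_4^(2/3)."""
    return lp_norm(diffs, 1) ** (1.0 / 3.0) * lp_norm(diffs, 4) ** (2.0 / 3.0)


# Hypothetical coupled samples (x_i, y_i); any pairing defines a valid coupling.
xs = [0.10, -0.40, 0.25, 1.30, -0.90, 0.05]
ys = [0.12, -0.35, 0.20, 1.10, -1.00, 0.00]
diffs = [x - y for x, y in zip(xs, ys)]

l2 = lp_norm(diffs, 2)
bound = interpolation_bound(diffs)
print(l2, bound)  # the first number should not exceed the second
```

Because the Wasserstein-2 distance is an infimum over couplings, any upper bound on the L² distance under one particular coupling is also an upper bound on the Wasserstein-2 distance.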
13 Sharp regularity and rate of convergence
Next we focus on proving Theorem 7.3. To do this, we will use analogous technical results to those in Lemmas 9.7-9.9—see (i)-(iii) of Lemma 13.1 below—that assume sharp regularity-type conditions and provide bounds in terms of functions of the “microscopic” parameter rather than the “mesoscopic” parameter . With Lemma 13.1 in hand, the proof of Theorem 7.3 carries through with only minor modifications of the proof of Theorem 6.23. We prove Lemma 13.1 in Section 13.2, and in Section 13.3 we prove an analog of Lemma 11.3.
13.1 Proof of Theorem 7.3
We will prove Theorem 7.3 after stating two preliminary lemmas. Lemma 13.1 bounds the same quantities as in Lemmas 9.7-9.9, and its proof is in the next subsection.
Lemma 13.1.
Fix , , , and a bounded interval . Define and for . There exists a positive number such that for any , , and i.i.d. array of centered random variables \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfying
- (I) \left|\textup{Var}\big{(}X_{h}^{(n)}\big{)}-\kappa^{2}\big{(}\frac{1}{n}+\frac{\eta\log n}{n^{2}}+\frac{r}{n^{2}}\big{)}\right|<\frac{\mathbf{v}}{n^{2+\alpha}} and
- (II) \mathbb{E}\Big{[}\big{|}X_{h}^{(n)}\big{|}^{2\mathfrak{p}}\Big{]}<\frac{\varkappa}{n^{\mathfrak{p}}},
the following inequalities hold:
- (i) \mathbb{E}\Big{[}\big{(}X^{(N,n)}_{e}-\widehat{X}^{N,n}_{e}\big{)}^{2}\Big{]}^{1/2}\,<\,\mathbf{c}\frac{\log(n+1)}{n^{\alpha/3}},
- (ii) \rho_{2}\big{(}\widehat{X}^{N,n}_{e},\mathbf{\widehat{X}}^{N,n}_{e}\big{)}\,<\,\frac{\mathbf{c}}{n^{4\alpha/9+\upsilon}},
- (iii) \rho_{2}\big{(}\mathbf{\widehat{X}}_{e}^{N,n},\mathbf{\widetilde{X}}_{e}^{(N)}\big{)}\,<\,\frac{\mathbf{c}}{n^{8\alpha/9}},
where \big{\{}X^{(N,n)}_{e}\big{\}}_{e\in E_{N}} is the generation layer of the -pyramidic array generated from \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}}, and , , are defined as in Definition 9.5 with .
Remark 13.2.
In Lemma 13.1, any value of greater than yields the same result.
Recall that the random variables in the array \big{\{}\mathbf{X}_{h}^{(n)}\big{\}}_{h\in E_{n}} from Theorem 6.16 with parameter have moment given by , where the function is characterized in Lemma 2.3 and the functions for are characterized in Theorem 2.4. The following trivial lemma implies that the conditions of Lemma 13.1 are satisfied by for all and all in a bounded interval when are large enough.
Lemma 13.3.
Fix , , and a bounded interval . There exist such that (I)-(II) below hold for all and .
- (I) \left|R(r-n)-\kappa^{2}\big{(}\frac{1}{n}+\frac{\eta\log n}{n^{2}}+\frac{r}{n^{2}}\big{)}\right|<\frac{\mathbf{v}}{n^{2+\alpha}}
- (II)
Proof.
The inequalities (I)-(II) above hold for large enough and all and as a consequence of the asymptotics in (II) of Lemma 2.3 and (II) of Theorem 2.4, respectively. ∎
Proof of Theorem 7.3.
Let , , , , , , and be as in Lemma 13.1. By Lemma 13.1, there is a such that if , , and \big{\{}X^{(n)}_{h}\big{\}}_{h\in E_{n}} is an array of i.i.d. centered random variables satisfying conditions (I)-(II) in Theorem 7.3, then
[TABLE]
where for the third inequality we have used that is \mathit{O}\big{(}n^{-4\alpha/9-\upsilon}\big{)} as since . By the same reasoning as in parts (a)-(c) of the proof of Theorem 6.23, there are i.i.d. families of pair couplings \big{\{}\big{(}\widehat{X}^{N,n}_{e},\mathbf{\widehat{X}}^{N,n}_{e}\big{)}\big{\}}_{e\in E_{N}} and \big{\{}\big{(}\mathbf{\widehat{X}}_{e}^{N,n},\mathbf{\widetilde{X}}_{e}^{(N)}\big{)}\big{\}}_{e\in E_{N}} such that the first two inequalities below hold:
[TABLE]
where arises from an application of Proposition 9.1. For \mathbf{C}:=\mathbf{c}C\big{(}2+\sup_{u\in\mathbb{N}}\frac{\log(u+1)}{u^{\alpha/9-\upsilon}}\big{)}, the third inequality simply uses that .
By (13.1) the Wasserstein-2 distance between X^{(0,n)}=\mathcal{Q}^{n}\big{\{}X^{(n)}_{h}\big{\}}_{h\in E_{n}} and \mathcal{Q}^{N}\big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} is bounded by a multiple of for any i.i.d. array \big{\{}X^{(n)}_{h}\big{\}}_{h\in E_{n}} satisfying properties (I)-(II) in the statement of Theorem 7.3. Let the array of random variables \big{\{}\mathbf{X}^{(n)}_{h}\big{\}}_{h\in E_{n}} be defined as in Theorem 6.16 for parameter . By property (III) in Theorem 6.16, the positive integer moment of is , and thus by Lemma 13.3 the array \big{\{}\mathbf{X}^{(n)}_{h}\big{\}}_{h\in E_{n}} satisfies conditions (I)-(II) of Lemma 13.1 for all and with possibly larger values of . By substituting \big{\{}\mathbf{X}^{(n)}_{h}\big{\}}_{h\in E_{n}} for \big{\{}X^{(n)}_{h}\big{\}}_{h\in E_{n}} in our above analysis, we get that the Wasserstein- distance between \mathbf{X}=\mathcal{Q}^{n}\big{\{}\mathbf{X}^{(n)}_{h}\big{\}}_{h\in E_{n}} and \mathcal{Q}^{N}\big{\{}\mathbf{\widetilde{X}}_{e}^{(N)}\big{\}}_{e\in E_{N}} is bounded by a multiple of for all and . By the triangle inequality, we thus have the bound that we sought for the Wasserstein-2 distance between and .∎
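The final step above relies on the triangle inequality for the Wasserstein-2 distance. As a minimal numerical sketch, assuming real-valued samples of equal size, the empirical Wasserstein-p distance on the real line can be computed by pairing sorted samples, since the monotone coupling is optimal in one dimension for p ≥ 1. The three sample lists and the function name `empirical_wasserstein` are hypothetical stand-ins and are not taken from the paper.

```python
def empirical_wasserstein(xs, ys, p=2):
    """W_p distance between two equal-size empirical measures on the real line.

    Pairs the sorted samples (the monotone coupling), which is optimal
    in one dimension for p >= 1.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return (sum(abs(x - y) ** p for x, y in pairs) / len(xs)) ** (1.0 / p)


# Hypothetical samples standing in for three laws on the real line.
a = [0.0, 1.0, 2.0, 3.0]
b = [0.5, 1.5, 2.0, 3.5]
c = [1.0, 1.0, 3.0, 4.0]

w_ab = empirical_wasserstein(a, b)
w_bc = empirical_wasserstein(b, c)
w_ac = empirical_wasserstein(a, c)
print(w_ac, w_ab + w_bc)  # triangle inequality: first value <= second value
```

In the proof, the intermediate law plays the role of `b`: two bounds against a common reference variable combine into a bound between the two end variables.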
13.2 Proof of Lemma 13.1
Recall that there are steps in each of the proofs of Lemmas 9.7-9.9 in which we respectively identified sequences , , that vanish as for each fixed and for which the inequalities ()-() below hold for some and all with .
- () \mathbb{E}\Big{[}\big{(}X^{(N,n)}_{e}-\widehat{X}^{N,n}_{e}\big{)}^{2}\Big{]}\,\leq\,\mathfrak{c}\frac{\log^{2}(N+1)}{N^{3}}\,+\,\xi_{N}(n)
- () \rho_{1}\big{(}\widehat{X}^{N,n}_{e},\mathbf{\widehat{X}}^{N,n}_{e}\big{)}\,\leq\,\mathfrak{c}\frac{\log^{-\frac{1}{2}}(N+1)}{N^{\mathfrak{m}\log b}}\,+\,\xi_{N}^{\prime}(n)
- () \rho_{2}\big{(}\mathbf{\widehat{X}}_{e}^{N,n},\mathbf{\widetilde{X}}_{e}^{(N)}\big{)}\,\leq\,\frac{\mathfrak{c}}{N^{\frac{\mathfrak{m}}{3}\log b+\frac{1}{2}}}\,+\,\xi_{N}^{\prime\prime}(n)
The inequalities ()-() are from (11.4), (11.21), & (11.29). Also recall that the proofs of () & () rely on bounds from Lemma 11.3. The following lemma states analogous results to those in Lemma 11.3 under the conditions (I)-(II) of Lemma 13.1, and its proof is in Section 13.3. In the statement of Lemma 13.4, the random variables and are defined as in (9.5) & (9.6), is defined as in (11.1), and are defined as in Lemma 13.1.
Lemma 13.4.
Fix , , , and a bounded interval . For , define . There exist positive numbers and such that for any , , and i.i.d. array of centered random variables \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfying conditions (I)-(II) of Lemma 13.1, the inequalities below hold for the random variables , , , and the variances \sigma_{N,n}^{2}:=\textup{Var}\big{(}X_{e}^{(N,n)}\big{)} & \varsigma_{N,n}^{2}:=\textup{Var}\big{(}Z^{N,n}_{f}\big{)} defined through the -pyramidic array \big{\{}X_{a}^{(*,n)}\big{\}}_{a\in E_{*}} generated from \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}}.
- (i) is bounded from above by , and is bounded from below by provided that .
- (ii) is bounded from above by and satisfies the inequality
[TABLE]
- (iii) The fourth moments of the random variables and are bounded by and , respectively.
- (iv) The moments of the random variables , , and are bounded by .
The lemma below states that analogs of the inequalities ()-() hold for large enough when , , and are defined in terms of an array of random variables \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfying the conditions of Lemma 13.1. If we were only concerned with having a counterpart to the inequality (), then the constant would only depend on the bounded interval because the derivation of () in the proof of Lemma 9.7 is entirely based on properties of the function from Lemma 2.3. The counterparts to () & () can be shown by following the steps in the proofs of () & () and replacing each application of (i)-(iv) from Lemma 11.3 by an application of (i)-(iv) from Lemma 13.4. Thus we omit the proof of Lemma 13.5, which is a lengthy near-repetition of our previous line of arguments establishing ()-() in Section 11.
Lemma 13.5.
Fix , , and a bounded interval . Define . There exists a positive number such that for any , , and i.i.d. array of centered random variables \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfying conditions (I)-(II) of Lemma 13.1 for , the inequalities ()-() above hold, where \big{\{}X^{(N,n)}_{e}\big{\}}_{e\in E_{N}} is the generation layer of the -pyramidic array generated from \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}}, and , , are defined as in Definition 9.5.
Lemma 13.6 offers some control for the rate of convergence in the case of Lemma 6.15 under the -sharp regularity condition on the variance of the random variables in the generating array \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}}. The proof, which is placed in Section 15.3, borrows a technical result from [10].
Lemma 13.6.
Fix , , and a bounded interval . There exists a positive number such that for any , , and i.i.d. array of centered random variables \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfying condition (I) of Lemma 13.1, the inequality below holds for all :
[TABLE]
where \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} is the generation layer of the -pyramidic array generated from \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}}.
Proof of Lemma 13.1.
Fix , , , and a bounded interval . Define , , and . Let \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} be an i.i.d. array of random variables satisfying conditions (I)-(II) for , , , , and some . Since , Jensen’s inequality and condition (II) imply that
[TABLE]
Thus \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfies condition (II) with and . By Lemma 13.5, there is such that the inequalities ()-() hold. In parts (i)-(iii) below we will start from the inequalities ()-(), respectively, and focus on bounding the terms , , .
Part (i): By inequality (),
[TABLE]
where is the error of the approximation of (11.2) by the expression in (11.3), i.e.,
[TABLE]
The second inequality in (13.3) holds for some since . In the analysis below, we will show that is bounded by a multiple of , and consequently that the distance between and is bounded by a multiple of by (13.3).
Define the polynomial , in other words, as with the linear term removed. As in the proof of Lemma 9.7, we can use telescoping sums to write
[TABLE]
where we have used the identities and M\big{(}R(r-k)\big{)}=R(r-k+1). Thus can be written as
[TABLE]
It follows that
[TABLE]
By Lemma 13.6, there is a such that for all and
[TABLE]
The lowest-order nonzero term in the polynomial is quadratic, and thus the following is finite:
[TABLE]
Since for , (13.5) implies that the distance between S\big{(}\sigma_{k,n}^{2}\big{)} and S\big{(}R(r-k)\big{)} is bounded by
[TABLE]
By applying (13.6) to (13.4) and using that is an increasing function, we get that
[TABLE]
The supremum above is finite because for by Lemma 2.3. Since and , the inequality (13.7) implies that \big{|}\xi_{N}(n)\big{|} is bounded by a multiple of .
Part (ii): Since \xi^{\prime}_{N,n}:=\sqrt{\frac{\pi}{2}}\big{|}\frac{\varsigma_{N,n}^{2}}{\varsigma_{N}}-\varsigma_{N}\big{|}, the first inequality below is ():
[TABLE]
The second inequality holds for some by part (ii) of Lemma 13.4 for the second term and since and for the first term.
As in the proof of Lemma 9.8, we will use Lemma 11.5 to bound the Wasserstein- distance using the Wasserstein- distance. Applying Lemma 11.5 with yields
[TABLE]
The second inequality uses that . Note that the exponent is strictly greater than since , and thus the above shows that the Wasserstein- distance between and is bounded by a multiple of .
Part (iii): Since \xi^{\prime\prime}_{N,n}:=\mathfrak{c}\big{|}\sigma_{\mathbf{n},n}-\sqrt{R(r-\mathbf{n})}\big{|}, the inequality () gives us that
[TABLE]
Since and , the first term on the right side of (13.9) is bounded by a multiple of . By Lemma 13.6, we have the second inequality below:
[TABLE]
Since and , the above is bounded by a multiple of . ∎
13.3 Proof of Lemma 13.4
The following is an analog of Proposition 12.1 that provides bounds for the moments of the random variables in a -pyramidic array \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} generated from an i.i.d. array \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfying the conditions of Lemma 13.1. The proof uses techniques from [10] and is placed in Section 15.4.
Proposition 13.7.
Fix , , , and a bounded interval . There exists a positive number such that for any , , and i.i.d. array of centered random variables \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} satisfying conditions (I)-(II) of Lemma 13.1, the inequality below holds for all :
[TABLE]
where \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}} is the generation layer of the -pyramidic array generated from \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}}.
Proof of Lemma 13.4.
Part (i): By Lemma 13.6, there is a such that
[TABLE]
holds for any , , and i.i.d. array of random variables \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} with and satisfying condition (I) of Lemma 13.1, where we have used that for the second inequality. Since when and for , the supremum and infimum of for are respectively bounded from above and below by positive multiples and of :
[TABLE]
Thus is bounded from above by a constant multiple of . When N>\lambda:=\big{(}\frac{2C_{\mathcal{I},\mathbf{v},\alpha}}{c_{\mathcal{I}}}\big{)}^{2/7}, then is bounded from below by .
Part (ii): Define the polynomial , in other terms, as with the linear term removed. We can write and in the forms below:
[TABLE]
The first equality on the top line above uses (12.6) and that , and the first equality on the second line uses that M\big{(}R(s)\big{)}=R(s+1) by part (I) of Lemma 2.3.
We will first prove the bound for . By the same reasoning as in (13.6), there is a such that the inequality below holds
[TABLE]
From the relations (13.11) and (13.12), we get the first inequality below,
[TABLE]
The supremum above is finite since the lowest-order nonzero term in the polynomial is quadratic and as . The above shows that is bounded by a constant multiple of since .
Next we show that is bounded by a constant multiple of . By the triangle inequality and (13.12), we have that
[TABLE]
Note that by (II) of Lemma 2.3 since as . Thus, since the lowest-order nonzero term of the polynomial is quadratic, the first term on the right side of (13.13) is bounded by a constant multiple of . The second term on the right side of (13.13) is bounded by a constant multiple of because . Thus has the stated bound.
Part (iii): The proof follows through the same steps as the proof of part (iii) of Lemma 11.3 with each application of Proposition 12.1 replaced by an application of Proposition 13.7.
Part (iv): The bound for the moment of follows from Proposition 13.7 since . We will only prove the bound for the moment of since the analysis for is similar. By (9.2) the random variable can be written in the form
[TABLE]
It suffices to bound the moment of each of the terms (a) and (b) by a multiple of .
(a): Fix , and let . Since \mathcal{L}^{k}\big{\{}X_{g}^{(\mathbf{n},n)}\big{\}}_{g\in\mathbf{a}\cap E_{\mathbf{n}}} is an i.i.d. sum of random variables indexed by , the Marcinkiewicz-Zygmund inequality gives us the first inequality below for some universal constant .
[TABLE]
The third inequality uses that . Applying the inequality above with yields the sought-after bound for the moment of (a).
(b): For , define the array \big{\{}Y^{\ell,\mathbf{n},n}_{a}\big{\}}_{a\in e\cap E_{\mathbf{n}-\ell}}:=\mathcal{E}\mathcal{L}^{\ell-1}\big{\{}X_{g}^{(\mathbf{n},n)}\big{\}}_{g\in e\cap E_{\mathbf{n}}}. By the triangle inequality,
[TABLE]
We will show that the maximum above is bounded by a multiple of , which suffices to show that the moment of (b) has order since . Applying the Marcinkiewicz-Zygmund and Jensen inequalities as in (13.15) yields the following inequality for a representative :
[TABLE]
where is a linear combination of monomials with . It follows that (13.17) is bounded by a multiple of for all since an application of Jensen’s inequality and (13.15) yields
[TABLE]
∎
14 The site-disorder model
The goal of this section is to prove Theorem 3.1. As mentioned in Remark 3.3, the proof involves showing that \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} has a vanishing distance from a reduced partition function, \widetilde{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}, for which the disorder variables corresponding to vertices of generation less than have been integrated out (Lemma 14.2). Moreover, \widetilde{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} is the peak of a -pyramidic array of random variables with layers (Proposition 14.1). Lemmas 14.3 & 14.4 respectively verify the conditions (II) and (III) in Definition 6.12 for the large- behavior of the variance and higher moments of the random variables in the base layer of the -pyramidic array. We can then apply Theorem 6.23 to conclude that \widetilde{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}—and consequently also \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}—converges in distribution to as .
14.1 Proof of Theorem 3.1
We will prove Theorem 3.1 after stating the technical lemmas used in its proof. The proofs of the lemmas are placed in the next four subsections.
Recall that is canonically identifiable with a subset of and that under this identification is referred to as the set of generation- vertices. Thus, for , the set is all vertices on the diamond graph of generation greater than . The elementary proposition below, whose proof is in Section 14.2, states that the conditional expectation of the site-disorder partition function with respect to the -algebra generated by for can be expressed in terms of the array map .
Proposition 14.1.
Let , and assume . Define the -algebra \mathcal{F}_{n}^{k}:=\sigma\big{\{}\omega_{a}\,\big{|}\,a\in V_{n}\backslash V_{k}\big{\}}. The conditional expectation of with respect to can be written in the form
[TABLE]
where \big{\{}X_{h}(\beta)\big{\}}_{h\in E_{k}} is an array of independent copies of .
Lemma 14.2 states that the partition function \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} is not changed much by integrating out the disorder variables labeled by vertices of generation less than when is large. The proof is in Section 14.4.
Lemma 14.2.
For fixed , let the sequence have the large asymptotics (3.3). The distance between and \widetilde{W}_{n}^{\omega}(\widehat{\beta}_{n,r}):=\mathbb{E}\big{[}\widehat{W}_{n}^{\omega}(\widehat{\beta}_{n,r})\,\big{|}\,\mathcal{F}_{n}^{\lfloor\log n\rfloor}\big{]} vanishes as .
It follows from Proposition 14.1 and Lemma 14.2 that the distance between \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} and 1+\mathcal{Q}^{\lfloor\log n\rfloor}\big{\{}X_{h}(\widehat{\beta}_{n,r})\big{\}}_{h\in E_{\lfloor\log n\rfloor}} converges to zero as , where \big{\{}X_{h}(\widehat{\beta}_{n,r})\big{\}}_{h\in E_{\lfloor\log n\rfloor}} is an array of independent copies of \widehat{W}_{n-\lfloor\log n\rfloor}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}-1. The following lemma verifies the variance asymptotics in condition (II) of Definition 6.12—with replaced by —for the sequence in of -pyramidic arrays generated from the edge-labeled arrays \big{\{}X_{h}(\widehat{\beta}_{n,r})\big{\}}_{h\in E_{\lfloor\log n\rfloor}}. Our proof, which is in Section 14.3, refines an argument from the proof of [1, Lemma 5.16].
Lemma 14.3.
The variance of \widehat{W}_{n-\lfloor\log n\rfloor}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} has the large asymptotics
[TABLE]
Lemma 14.4 verifies the vanishing higher moment condition (III) of Definition 6.12 for random variables in the array \big{\{}X_{h}\big{(}\widehat{\beta}_{n,r}\big{)}\big{\}}_{h\in E_{\lfloor\log n\rfloor}}. The proof is in Section 14.5.
Lemma 14.4.
For each , the centered moment of \widehat{W}_{n-\lfloor\log n\rfloor}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} vanishes as .
Proof of Theorem 3.1.
For \big{\{}X_{h}\big{(}\widehat{\beta}_{n,r}\big{)}\big{\}}_{h\in E_{\lfloor\log n\rfloor}} defined as in Proposition 14.1, the distance between the generation- vertex-disorder partition function \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} and the effectively generation- edge-disorder partition function given by
[TABLE]
vanishes with large by Lemma 14.2, where the second equality above holds by Proposition 14.1. In particular, the Wasserstein- distance between \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}-1 and \mathcal{Q}^{\lfloor\log n\rfloor}\big{\{}X_{h}\big{(}\widehat{\beta}_{n,r}\big{)}\big{\}}_{h\in E_{\lfloor\log n\rfloor}} vanishes as . Thus it suffices to prove that the Wasserstein- distance between \mathcal{Q}^{\lfloor\log n\rfloor}\big{\{}X_{h}\big{(}\widehat{\beta}_{n,r}\big{)}\big{\}}_{h\in E_{\lfloor\log n\rfloor}} and converges to zero with large .
Notice that the statements (I)-(III) below hold.
- (I) By Proposition 14.1, the random variables in the array \big{\{}X_{h}\big{(}\widehat{\beta}_{n,r}\big{)}\big{\}}_{h\in E_{\lfloor\log n\rfloor}} are independent copies of \widehat{W}_{n-\lfloor\log n\rfloor}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}-1.
- (II) By Lemma 14.3 the variance of the random variable \widehat{W}_{n-\lfloor\log n\rfloor}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} has the large asymptotics
[TABLE]
- (III) By Lemma 14.4, the centered moment of \widehat{W}_{n-\lfloor\log n\rfloor}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} vanishes as for each .
Statements (I)-(III) imply that the sequence in of edge-labeled arrays \big{\{}X_{h}\big{(}\widehat{\beta}_{n,r}\big{)}\big{\}}_{h\in E_{\lfloor\log n\rfloor}} satisfies the conditions (I)-(III) in Definition 6.12. Thus, by Theorem 6.23, the Wasserstein- distance between and \mathcal{Q}^{\lfloor\log n\rfloor}\big{\{}X_{h}\big{(}\widehat{\beta}_{n,r}\big{)}\big{\}}_{h\in E_{\lfloor\log n\rfloor}} vanishes with large . (Footnote 14: Although the definition of a “regular” sequence of -pyramidic arrays formulated in Definition 6.12 assumes that the generation, , of the bottom layer of the -pyramidic array is , the conclusions of Theorem 6.23 remain valid when is any sequence that diverges to , such as .) Therefore, \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} converges in law to as . ∎
14.2 Proof of Proposition 14.1
As a preliminary, we will extend our observations and notations relating to the structure of the diamond hierarchical graphs. For recall that is the set of vertices on the diamond graph of generation greater than .
- (I) From the construction of the sequence of diamond graphs outlined in Section 2.1, we can see that has embedded copies of , which are in canonical one-to-one correspondence with elements of . The vertices in , viewed as a subset of , are roots of the embedded copies of , and the remaining vertices in are internal (non-root) to the embedded copies of . We denote the set of internal vertices on the copy of associated with by . (Footnote 15: This abuse of notation is similar to our previous use of to denote a subset of .) The collection is a partition of the set .
- (II) For , let denote the set of functions that are directed paths crossing the embedded copy of corresponding to . Thus each is a copy of .
- (III) For and , we write when sits internally (non-endpoint) along the path , i.e., when is incident to for some . A vertex is an element of if and only if there is an and a such that . (Footnote 16: This is equivalent to the remark in (I) that iff is an internal vertex to one of the subcopies of .)
- (IV) There is a canonical one-to-one correspondence between and the union of -fold product sets given by . In this association, each has a generation- coarse-graining and the component in the -tuple is the trajectory of through the embedded copy of corresponding to .
The following defines a restricted partition function for the embedded copy of within that corresponds to .
Definition 14.5.
Let , and assume . For , define the random variable
[TABLE]
where the set and the relation are defined as in (II) and (III) above, respectively.
Remark 14.6.
The random variable in Definition 14.5 is equal in distribution to .
Proof of Proposition 14.1.
Taking the conditional expectation of with respect to is equivalent to integrating out the variables with :
[TABLE]
The last equality is equivalent to what we proved in Proposition 6.5. ∎
14.3 Proof of Lemma 14.3
For and , let denote the variance of the partition function . As a consequence of the distributional identity (3.2), the sequence of variances \big{\{}\hat{\varrho}_{k}(\beta)\big{\}}_{k\in\mathbb{N}_{0}} satisfies the recursive equation
[TABLE]
where the map is defined by
[TABLE]
Of course, reduces to the map M(x)=\frac{1}{b}\big{[}(1+x)^{b}-1\big{]} when .
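To make the role of the map M(x) = (1/b)[(1+x)^b − 1] concrete, the following sketch iterates it numerically. By the binomial expansion, M(x) = x + ((b−1)/2)x² + O(x³), so 1/M(x) ≈ 1/x − (b−1)/2 for small x, and an initial variance of order 1/n takes on the order of n iterations to grow to order one. The normalization κ² = 2/(b−1) used below is an assumption made here so that the quadratic coefficient equals 1/κ², and the function names `variance_map` and `iterate_variance` are hypothetical.

```python
def variance_map(x, b):
    """M(x) = ((1 + x)**b - 1) / b for an integer branching parameter b >= 2."""
    return ((1.0 + x) ** b - 1.0) / b


def iterate_variance(x0, b, steps):
    """Apply the variance map `steps` times starting from x0."""
    x = x0
    for _ in range(steps):
        x = variance_map(x, b)
    return x


b = 2
kappa_sq = 2.0 / (b - 1)   # assumed normalization, not taken from this section
n = 1000
x0 = kappa_sq / n          # leading-order critical scaling kappa^2 / n

# After k steps, the variance is close to kappa^2 / (n - k), up to lower-order corrections.
k = 500
x_k = iterate_variance(x0, b, k)
print(x_k, kappa_sq / (n - k))
```

The small discrepancy between the two printed values reflects corrections beyond the leading 1/n term, which is the kind of effect that the logarithmic and r-dependent terms in the variance scaling are designed to track.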
The inverse temperature scaling (3.3) results in the following variance scaling, where a short computation at the end of Appendix A verifies (14.4) starting from (3.3):
[TABLE]
It will be convenient to write in the form for \mathbf{n}_{n,r}:=\frac{\pi\kappa}{2}\big{(}\frac{b}{b-1}\big{)}^{1/2}V_{n,r}^{-1/2}, which has the large asymptotics
[TABLE]
Proof of Lemma 14.3.
We separate the proof into parts (a)-(h).
(a) An approximation for the variance map: Since the variance \hat{\varrho}_{k}\big{(}\widehat{\beta}_{n,r}\big{)} of \widehat{W}_{k}\big{(}\widehat{\beta}_{n,r}\big{)} satisfies the recursive equation (14.2) in , we have that
[TABLE]
Let be defined through an approximation of the expression for in (14.3) around that is third-order in and first-order in :
[TABLE]
Define , in other terms, the error of the approximation of by . The error term has the bound below for some and all and :
[TABLE]
The above inequality follows by expanding the expression (14.3) in & and then applying Young’s inequality to the cross-terms, of which the lowest-order cross-term is .
(b) Transforming the variables: For and , define the sequence \big{\{}\mathbf{r}_{k}^{(n,r)}\big{\}}_{k\in\mathbb{N}_{0}} of numbers in the interval as
[TABLE]
Note that since . For notational neatness, we will identify , i.e., suppress the dependence on the superscript variables. The sequence \big{\{}\mathbf{r}_{k}^{(n,r)}\big{\}}_{k\in\mathbb{N}_{0}} converges monotonically to as , and it will suffice for us to show that
[TABLE]
To see the equivalence between (14.8) and (14.1), note that for large —and thus small —we get the second equality below through second-order Taylor expansions of f_{1}(x)=\sin\big{(}\frac{\pi}{2}x\big{)} and f_{2}(x)=\cos\big{(}\frac{\pi}{2}x\big{)} at :
[TABLE]
Finally, recall from (14.5) that for large . Thus we only need to prove (14.8).
(c) Rewriting the increments of using Taylor’s theorem: By writing \widehat{M}_{n,r}^{k+1}(0)=\widehat{M}_{n,r}\big{(}\widehat{M}_{n,r}^{k}(0)\big{)} and splitting into a sum of and the error term , we get the equality
[TABLE]
With (14.7), we can rewrite the equation above in terms of the variables and as below, where the bracketed expressions have combined to form the term.
[TABLE]
If , Taylor’s theorem applied to the function g(x)=\tan\big{(}\frac{\pi}{2}x\big{)} around the point with second-order error implies there is an such that
[TABLE]
Define as the difference between the terms and :
[TABLE]
By Taylor’s theorem applied to the function around the point x=\tan\big{(}\frac{\pi}{2}\mathbf{r}_{k+1}\big{)}, there is an between and such that
[TABLE]
(d) Bounds for the various terms in (14.11): The inequalities below hold for some and all and such that . (Footnote 18: The lower bound of by ensures that is well-defined by (14.10). When is sufficiently large, holds as a consequence of (14.5).)
- (i) 0\,\leq\,\mathbf{n}_{n,r}\mathscr{E}\left(\frac{\pi\kappa^{2}}{2\mathbf{n}_{n,r}}\tan\big{(}\frac{\pi}{2}\mathbf{r}_{k}\big{)},\mathbf{n}_{n,r}\right)\,\leq\,\frac{C}{n^{3}(1-\mathbf{r}_{k})^{4}}\,+\,\frac{C}{n^{5/3}}
- (ii)
- (iii) \Big{|}\Delta_{k}-\frac{2\eta}{\pi n^{2}(1-\mathbf{r}_{k})^{3}}\Big{|}\,\leq\,\frac{C}{n^{3}(1-\mathbf{r}_{k})^{4}}\,+\,\frac{C}{n^{5/3}}
- (iv) \big{|}\mathbf{r}_{k}+\frac{1}{\mathbf{n}_{n,r}}-\mathbf{r}_{k+1}\big{|}\,\leq\,\frac{C}{n^{2}(1-\mathbf{r}_{k})}\,+\,\frac{C}{n^{5/3}}
- (v) \Big{|}\frac{2}{\pi}\Delta_{k}^{2}\sin\big{(}\frac{\pi}{2}\mathbf{r}_{k}^{**}\big{)}\cos^{3}\big{(}\frac{\pi}{2}\mathbf{r}_{k}^{**}\big{)}\Big{|}\,\leq\,\frac{C}{n^{4}(1-\mathbf{r}_{k})^{3}}\,+\,\frac{C}{n^{10/3}}
- (vi) \Big{|}\frac{2}{\pi}\Delta_{k}\cos^{2}\big{(}\frac{\pi}{2}\mathbf{r}_{k+1}\big{)}\,-\,\frac{\eta}{n}\log\big{(}\frac{1-\mathbf{r}_{k}}{1-\mathbf{r}_{k+1}}\big{)}\Big{|}\,\leq\,\frac{C}{n^{3}(1-\mathbf{r}_{k})^{2}}\,+\,\frac{C}{n^{5/3}}
The terms above arise from (14.6) and are less important than the first bounding terms. Note that (vi) approximates the second term on the right side of (14.11) by an expression that conveniently telescopes when summed over , and (v) bounds the last term on the right side of (14.11).
The bound (i) follows from (14.6), that for by (14.5), and the estimates below for :
[TABLE]
The bound (ii) follows from (iii), so we will focus on (iii) next. The inequality and (14.5) imply the equalities below.
[TABLE]
It follows from (14.13) and (14.5) that the difference between and the braced expression (III) in part (d) is bounded by
[TABLE]
The last inequality holds by another application of Young’s inequality to get and since . Finally, (iii) follows by combining (14.14) with (i).
Note that (iii) implies that is positive for all with when is sufficiently large. Thus (14.3)-(14.11) imply that . The bound (iv) follows from applying (ii) to (14.11) and using that and are within a distance of from . The bounds (v) & (vi) follow from (ii) and (iii), respectively, using basic calculus estimates.
(e) A consequence of (iv): Before going to the estimates in part (f) below, we will point out an easy consequence of the bound (iv) in (d): if satisfies , then the spacing between the terms in the sequence has the large form
[TABLE]
where the errors, \mathit{O}\big{(}\frac{1}{n\log^{2}n}\big{)}, are uniformly bounded by a multiple of for all and . The second equality above holds since \mathbf{n}_{n,r}=n+\mathit{O}\big{(}1/\log n\big{)}. A Riemann sum approximation thus gives us
[TABLE]
(f) Applying the bounds to a key telescoping sum: Assume that satisfies and that holds so that (14.15) and the inequalities in part (d) are applicable. Since , the equality below results from a telescoping sum:
[TABLE]
(g) How we can make use of (14.16): We will temporarily assume that holds for sufficiently large to show that the asymptotics (14.8) follows. If , then the equality (14.16) holds with , which gives us
[TABLE]
Note that (14.8) holds provided that the bracketed term is for . Since , we can get an upper bound for by substituting in place of on the right side of (14.17):
[TABLE]
Thus is bounded from above and below by constant multiples of for . It follows that the bracketed term in (14.17) is , and hence we can conclude from (14.17) that 1-\mathbf{r}_{n-\lfloor\log n\rfloor}=\frac{\log n}{n}\big{(}1+\mathit{o}(1)\big{)}. Plugging these asymptotics for back into the right side of (14.17), however, yields that the bracketed term in (14.17) is \mathit{o}\big{(}1/n\big{)}, which proves (14.8) under the assumption that .
(h) Establishing the validity of (14.16) when : It remains to show that holds for large enough . Let be the smallest such that
[TABLE]
Since and by (iv) in part (d), we have the inequality for large enough . Thus the equality (14.16) will hold with when :
[TABLE]
Since is bounded by and the braced term on the right side of (14.3) is greater than for large , the first term on the right side of (14.3) must be negative when , and therefore . It follows that for large . ∎
14.4 Proof of Lemma 14.2
Since the random variables \mathbb{E}\big{[}\widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}\,\big{|}\,\mathcal{F}_{n}^{\lfloor\log n\rfloor}\big{]} and \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}-\mathbb{E}\big{[}\widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}\,\big{|}\,\mathcal{F}_{n}^{\lfloor\log n\rfloor}\big{]} are uncorrelated, the square of the distance between \widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)} and \mathbb{E}\big{[}\widehat{W}_{n}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}\,\big{|}\,\mathcal{F}_{n}^{\lfloor\log n\rfloor}\big{]} is equal to
[TABLE]
where the random variables are independent copies of \widehat{W}_{n-\lfloor\log n\rfloor}^{\omega}\big{(}\widehat{\beta}_{n,r}\big{)}. The equalities above use (14.2), Proposition 14.1, and (i) of Remark 6.6. It follows that Lemma 14.2 is a corollary of the following:
Lemma 14.7.
The difference between and M^{\lfloor\log n\rfloor}\big{(}\widehat{M}_{n,r}^{n-\lfloor\log n\rfloor}(0)\big{)} vanishes as .
Remark 14.8.
Note that M^{\lfloor\log n\rfloor}\big{(}\widehat{M}_{n,r}^{n-\lfloor\log n\rfloor}(0)\big{)} converges to as . This follows from Lemma 2.3 since , which is equal to the variance of \widehat{W}_{n-\lfloor\log n\rfloor}\big{(}\widehat{\beta}_{n,r}\big{)}, has the large- asymptotics (14.1) by Lemma 14.3.
In the proof of Lemma 14.7, we will use Lemma 14.9 below, which is a result from [10, Lemma 2.2(iv)]. Notice that applying the chain rule to the -fold composition of M(x)=\frac{1}{b}\big{[}(1+x)^{b}-1\big{]} yields
[TABLE]
where the function is defined by
[TABLE]
In the above, denotes the -fold composition of the function inverse of the map . The following lemma gives us uniform bounds for the sequence in of functions .
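The chain-rule structure behind the k-fold composition of M can be checked numerically. The following is a minimal sketch, not the paper's display: it only uses M(x)=\frac{1}{b}[(1+x)^{b}-1] from the text, whose derivative is M'(x)=(1+x)^{b-1}, and compares the chain-rule product with a central finite difference. The helper names `M_iter` and `dM_iter` and the sample values of b, x, and k are assumptions made for illustration.

```python
def M(x, b):
    """The map M(x) = ((1 + x)^b - 1) / b from the text."""
    return ((1.0 + x) ** b - 1.0) / b


def M_iter(x, b, k):
    """k-fold composition M^k(x) (hypothetical helper name)."""
    for _ in range(k):
        x = M(x, b)
    return x


def dM_iter(x, b, k):
    """Derivative of M^k at x via the chain rule.

    d/dx M^k(x) = prod_{j=0}^{k-1} M'(M^j(x)), with M'(y) = (1 + y)^(b - 1).
    """
    deriv = 1.0
    for _ in range(k):
        deriv *= (1.0 + x) ** (b - 1)  # factor M'(M^j(x))
        x = M(x, b)                    # advance to M^{j+1}(x)
    return deriv


if __name__ == "__main__":
    b, x, k, h = 2, 0.01, 20, 1e-7  # illustrative values only
    finite_diff = (M_iter(x + h, b, k) - M_iter(x - h, b, k)) / (2 * h)
    print(dM_iter(x, b, k), finite_diff)
```

Since every factor (1+M^{j}(x))^{b-1} is at least one for x\geq 0, the derivative of the composition is nondecreasing in k on the nonnegative half-line, which is the kind of quantity that uniform bounds such as Lemma 14.9 are meant to control.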
Lemma 14.9.
The sequence of functions converges uniformly over any bounded subinterval of to a limit function . In particular, is finite for any .
Proof of Lemma 14.7.
Define . By Remark 14.8, converges to as . For any , the definition of implies that
[TABLE]
We will return to (14.23) after obtaining bounds for the terms (I) and (II).
Bound for (I): The difference between the functions and has the bound,
[TABLE]
where the inequality holds for large enough since is vanishing. Thus for large
[TABLE]
Bound for (II): By the chain rule, the derivative of can be written in the form
[TABLE]
where the equality uses the definition (14.21) of the function . An application of (14.26) to the term gives us
[TABLE]
where the second inequality again uses that for all .
Returning to (14.23): Applying (14.25) and (14.27) to (14.23) gives us the first inequality below for all when is large enough.
[TABLE]
Let be the minimum of and the largest such that . Applying (14.28) with yields
[TABLE]
We can apply (14.29) to get the inequality below
[TABLE]
However, since for large , the inequality (14.30) precludes the possibility that when is large. It follows from the definition of that , and thus (14.29) implies that the difference between and vanishes with large . ∎
14.5 Proof of Lemma 14.4
Proof.
It suffices to show that the (uncentered) positive integer moments of \widehat{W}_{n-\lfloor\log n\rfloor}\big{(}\widehat{\beta}_{n,r}\big{)} all converge to one as . For , , , and define
[TABLE]
Note that since by definition, and by Jensen’s inequality. We obtain the following recursive equation in by evaluating the moment of both sides of the distributional equality (3.2):
[TABLE]
where is a polynomial with nonnegative coefficients that sum to . In particular, when evaluated at . Moreover, is a lower bound for \mathbf{P}_{m}\Big{(}\big{(}\mu_{n,r}^{(\ell)}(k)\big{)}^{b}\big{(}\nu_{n,r}^{(\ell)}\big{)}^{b}\,;\,\,\ell\in\{2,\ldots,m-1\}\Big{)} since .
We will use induction to prove that \max_{0\leq k\leq n-\lfloor\log n\rfloor}\big{|}\mu_{n,r}^{(m)}(k)-1\big{|} vanishes as for each . As a consequence of Lemma 14.3, \mu_{n,r}^{(2)}\big{(}n-\lfloor\log n\rfloor\big{)} converges to one as . Since \big{\{}\mu_{n,r}^{(2)}(k)\big{\}}_{k\in\mathbb{N}_{0}} is an increasing sequence and , it follows that \max_{0\leq k\leq n-\lfloor\log n\rfloor}\big{|}\mu_{n,r}^{(2)}(k)-1\big{|} vanishes as . Suppose for the purpose of a strong induction argument that
[TABLE]
for each . Note that converges to one as for each since vanishes with large . Fix some . Since is continuous and , we can choose large enough such that
[TABLE]
Let be the minimum of and the smallest such that
[TABLE]
By (14.31) and the definition of , we have the recursive inequality in below.
[TABLE]
Applying (14.35) times and using that yields
[TABLE]
The bracketed term converges to as by the same reasoning as for (14.33). We will show that holds for large enough by showing that the condition (14.34) cannot hold for when . Notice that
[TABLE]
Moreover, since , the following inequality holds for small :
[TABLE]
Thus does not satisfy (14.34) when is large, and therefore for large . Going back to (14.36) with , we get
[TABLE]
Since is arbitrary and , the sequence \big{\{}\max_{0\leq k\leq n-\lfloor\log n\rfloor}\big{|}\mu_{n,r}^{(m+1)}(k)-1\big{|}\big{\}}_{n\in\mathbb{N}} is vanishing. Therefore, by induction, \max_{0\leq k\leq n-\lfloor\log n\rfloor}\big{|}\mu_{n,r}^{(m)}(k)-1\big{|} converges to zero for each , which completes the proof. ∎
15 Miscellaneous proofs from Sections 12 & 13
15.1 Proof of Proposition 12.1
To prepare for the proof of Proposition 12.1, we will define some additional notation related to the recursive formulas governing the positive integer moments of random variables in a -pyramidic array generated from an i.i.d. array of random variables and cite a bound (Lemma 15.7) from [10].
Let \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} be an i.i.d. array of centered random variables with finite absolute moment for some and be the -pyramidic array generated from it. For and , we will use the notation
[TABLE]
and condense subscripts when as follows: . For note that is interchangeable with our previous notation from (11.1). By (i) of Remark 6.6, the recursive relation implies that for the polynomial . More generally, the multilinear form of the map implies that the vector of higher moments \big{(}\sigma^{(3)}_{k,n},\ldots,\sigma^{(m)}_{k,n}\big{)} obeys a recursive equation with as an additional input:
[TABLE]
where is a vector of polynomials (the same polynomials as those in (I) of Theorem 2.4):
[TABLE]
In the above, the variables are indexed according to the number of the moment, , that they correspond to. The polynomials have nonnegative coefficients and are thus nondecreasing in each variable for on the subdomain ; see Lemma 15.13 for some additional properties of these polynomials.
Let be defined as below for the limiting moment functions from Theorem 2.4:
[TABLE]
where and is the inverse of the variance function . In other terms, determines the vector of limiting higher moments with from the variance . In Definition 15.1, we use the functions and to construct functions from to that converge pointwise with large to when has small enough norm by [10, Lemma 3.2]. For the purpose of proving Proposition 12.1, the relevant properties of the functions are the identities in Remarks 15.2 & 15.3 below and the bound on their derivatives in Lemma 15.7. As before, denotes the -fold composition of the function inverse of .
Definition 15.1.
For , let be the vector of polynomials determined by (15.2). Given and , define such that for
[TABLE]
Define through the -fold composition of the maps given by
[TABLE]
We denote the -by- matrix of first-order derivatives of with respect to the variables for by .
Remark 15.2.
Let and . Since , the recursive relation (15.2) implies the identity
[TABLE]
Remark 15.3.
Note that R(r-k)=M^{n-k}\big{(}R(r-n)\big{)} by part (I) of Lemma 2.3. Hence part (I) of Theorem 2.4 implies that
[TABLE]
We will use the following simple vector notation.
Notation 15.4.
For let and be elements of and be a real-valued matrix.
- (i)
We write if the inequality holds component-wise, i.e., for all .
- (ii)
denotes the max norm of , i.e., .
- (iii)
is the operator norm with respect to the max norm on , i.e., .
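As a concrete illustration of this notation, the sketch below computes the max norm of a vector and the operator norm that it induces on matrices, using the standard fact that the induced norm equals the largest absolute row sum. The function names and the sample matrix are assumptions chosen for illustration and do not come from the paper.

```python
def leq_componentwise(x, y):
    """Component-wise order: x <= y means x_i <= y_i for every i."""
    return all(a <= b for a, b in zip(x, y))


def max_norm(x):
    """Max norm of a vector: the largest absolute entry."""
    return max(abs(v) for v in x)


def op_norm_max(A):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def matvec(A, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


if __name__ == "__main__":
    A = [[1.0, -2.0], [0.5, 3.0]]  # illustrative matrix
    x = [1.0, -1.0]
    # Submultiplicativity: ||Ax|| <= ||A|| ||x||
    print(max_norm(matvec(A, x)), op_norm_max(A) * max_norm(x))
```

The bound \|Ax\|_{\infty}\leq\|A\|_{\infty}\|x\|_{\infty} is attained at a sign vector matched to the row with the largest absolute sum, which is why the row-sum formula gives the exact operator norm rather than only an upper bound.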
Remark 15.5.
In the sense of (i) in Notation 15.4, we will refer to a function as being nondecreasing on a subdomain if holds for all with .
Remark 15.6.
Since the polynomials have nonnegative coefficients, is nondecreasing on . Since is increasing, it follows from the construction in Definition 15.1 that is also nondecreasing on .
The lemma below from [10, Eqn. 3.8] implies that the function is essentially independent of when and is restricted to a small region around the origin.
Lemma 15.7.
For any , there is an such that for all :
[TABLE]
Proof of Proposition 12.1.
Part (i): Pick any with . Since \sigma_{n}^{2}=\textup{Var}\big{(}X_{h}^{(n)}\big{)} has the large asymptotics (6.2) and has the asymptotics in (II) of Lemma 2.3 as , we have the following inequality for all larger than some
[TABLE]
Thus for any and the relations below hold:
[TABLE]
where we have used (I) of Lemma 2.3, the definition \sigma_{k,n}^{2}:=M^{n-k}\big{(}\sigma_{n}^{2}\big{)}, and that is increasing. Since for and takes values in , the terms and are respectively bounded from above and below by positive multiples, and , of . Thus we have for all and . Since there are only finitely many with , the inequalities can be extended to all by choosing the constants to be larger/smaller if needed.
Part (ii): Let be small enough to satisfy the conclusion of Lemma 15.7 with , and fix any . Let be large enough such that statements (a)-(c) below hold for all with .
- (a)
for all ,
- (b)
\max_{m\in\{3,4\}}\big{|}\sigma^{(m)}_{n}\big{|}<\epsilon, and
- (c)
.
To see that exists, notice the following: statement (a) holds for large by the reasoning leading to (15.4); statement (b) holds for large enough as a consequence of our minimal regularity assumption that the fourth moments vanish as ; statement (c) holds for large enough since vanishes as for each by (II) of Theorem 2.4.
Since there are only finitely many terms with , we can focus on the case that . For , let be the smallest element of such that . Note that must exist as a consequence of (c). Since converges to as for each by (III) of Lemma 6.15, the following is finite:
[TABLE]
Thus it suffices for us to assume that in the remainder of the proof.
Let and . The equality below is the case of the identity in Remark 15.2.
[TABLE]
The inequality above holds by statement (a) and Remark 15.6. Since statements (b) and (c) imply that , \big{\|}\big{(}\big{|}\sigma^{(3)}_{n}\big{|},\sigma^{(4)}_{n}\big{)}\big{\|}_{\infty}<\epsilon, and \big{\|}\big{(}R^{(3)}(r^{\uparrow}-n),R^{(4)}(r^{\uparrow}-n)\big{)}\big{\|}_{\infty}<\epsilon, we can apply Lemma 15.7 to get the first inequality below.
[TABLE]
By Remark 15.3, the bracketed term is equal to \Big{(}R^{(3)}(r^{\uparrow}-k),R^{(4)}(r^{\uparrow}-k)\Big{)}, and thus the inequality (15.7) implies that
[TABLE]
where refers to the vector in . Combining the vector inequalities (15.6) and (15.8) yields the following for the second components of the vectors:
[TABLE]
For the second inequality, we have used that and that . Since is \mathit{O}\big{(}\frac{1}{s^{2}}\big{)} for by part (II) of Theorem 2.4, is bounded by a constant multiple of for all . Also, \big{(}\frac{b+1}{2b}\big{)}^{n/2}, which decays exponentially in , is bounded from above by a multiple of for all . Thus we have the desired inequality when and , which completes the proof. ∎
15.2 Proof of Lemma 12.2
Let be an array of centered random variables with finite fourth moments, and define for . Recall that Lemma 12.2 states that \mathbb{E}\big{[}\big{(}\sum_{\ell=1}^{n}Y_{\ell}\big{)}^{4}\big{]} is bounded by a constant multiple of n\sum_{\ell=1}^{n}\mathbb{E}\big{[}Y_{\ell}^{4}\big{]}.
Notation 15.8.
For distinct , let denote the smallest value of such that there exist distinct with and . When , we define .
Remark 15.9.
Let and for . If and for all , then .
Remark 15.10.
Let . If is independent of and , then we define for and .
Proof of Lemma 12.2.
Let denote the variance of the variables , . By foiling, we get
[TABLE]
By applying Young’s inequality, , to the bracketed products above with and , respectively, we can bound the second and third terms on the right side of (15.2) by multiples of n\sum_{\ell=1}^{n}\mathbb{E}\big{[}Y_{\ell}^{4}\big{]}. In the analysis below, we will show that \mathbb{E}\big{[}Y_{\ell}Y_{l_{1}}Y_{l_{2}}Y_{l_{3}}\big{]}=0 when and thus that the last term on the right side of (15.2) is zero. We will also show that \mathbb{E}\big{[}Y_{\ell}^{2}Y_{l_{1}}Y_{l_{2}}\big{]} is bounded by a constant multiple of for all with , which implies that there are such that the inequalities below hold for all .
[TABLE]
The third inequality holds since M(x):=\frac{1}{b}\big{[}(1+x)^{b}-1\big{]}\geq x+\frac{b-1}{2}x^{2} for and . The equality holds by Remark 6.6 since , and the last inequality is Jensen’s. It follows that the fourth term on the right side of (15.2) is easily bounded by a constant multiple of n\sum_{\ell=1}^{n}\mathbb{E}\big{[}Y_{\ell}^{4}\big{]}.
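The quadratic lower bound on M used above can be verified exactly with rational arithmetic. The sketch below assumes x\geq 0 and an integer b\geq 2 (the conditions are elided in the text, so this range is an assumption) and checks that the gap M(x)-x-\frac{b-1}{2}x^{2} equals the nonnegative tail of the binomial expansion. The helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def M(x, b):
    """M(x) = ((1 + x)^b - 1) / b, evaluated exactly for Fraction inputs."""
    return ((1 + x) ** b - 1) / b


def quadratic_lower_bound(x, b):
    """The claimed lower bound x + (b - 1)/2 * x^2."""
    return x + Fraction(b - 1, 2) * x * x


def binomial_tail(x, b):
    """Terms of degree >= 3 in the expansion of M: sum_j C(b, j) x^j / b."""
    return sum(Fraction(comb(b, j), b) * x ** j for j in range(3, b + 1))


if __name__ == "__main__":
    for b in (2, 3, 4, 5):  # assumed integer branching values
        for x in (Fraction(0), Fraction(1, 10), Fraction(1, 2), Fraction(3)):
            gap = M(x, b) - quadratic_lower_bound(x, b)
            print(b, x, gap, gap == binomial_tail(x, b))
```

For b=2 the tail is empty, so the inequality holds with equality; for larger integer b the gap is a polynomial in x with nonnegative coefficients, which is why the bound needs x\geq 0.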
For and , define . The random variable can be written in the forms
[TABLE]
From (15.2) we see that is a degree- multilinear polynomial in the variables consisting of a linear combination of monomials for subsets of satisfying
- (I)
and
- (II)
for any distinct .
For numbers indexed by , let be a subset of satisfying (I)-(II) for . The product of the monomials can be written as
[TABLE]
where the exponent is defined by \lambda(a):=\big{|}\big{\{}j\in\{1,2,3,4\}\,\big{|}\,a\in B_{j}\big{\}}\big{|}. The expectation of (15.11) is zero if for some . The first case below implies \mathbb{E}\big{[}Y_{\ell}Y_{l_{1}}Y_{l_{2}}Y_{l_{3}}]=0 when .
Case : To reach a contradiction, suppose that and for all . Since satisfies properties (I)-(II) with , there must be distinct and distinct such that and . By our assumption that for , property (II) for implies that for each we have or (since otherwise there exist distinct with ). In particular, or is disjoint from the sets for at least two values of . Without loss of generality, we can assume that and consequently that . Since and , we must have to ensure that . Thus . Note that , by Remark 15.9, because , and satisfies property (II) with and . By properties (I)-(II) for , there exists with and . Since and , it follows from property (II) for that . Also since and . To summarize, , but for all . Therefore, , which is a contradiction.
Remark 15.11.
To summarize the above contradiction proof, both and need to have for at least two values of to avoid having with , however, this is inconsistent with being distinct and thus disjoint when viewed as subsets of .
Case : Let for satisfy properties (I)-(II) above respectively for with . There are two special types—see ()-() below—of configurations of the sets such that for all . For both types, and .
- ()
There exists and distinct such that and . The sets in the collection \mathscr{P}:=\big{\{}B_{\epsilon}\cap B_{\delta}\,\big{|}\,\epsilon\in\{1,2\},\,\delta\in\{3,4\}\big{\}} are pairwise disjoint, have cardinality one, and their union is equal to . In particular, and .
- ()
There exists and such that and . The sets , , , have cardinality one and . In particular, and .
The types () and () correspond to the cases of and , respectively. The possibility can be excluded because it results in multiple with by simpler reasoning than in the case discussed above.
To understand the type-() configuration, notice that the intersections for and contain at most one element since and satisfy property (II) for and , respectively. Thus and can each contribute at most one to each of the sums and . Since (because and for distinct ), it is only possible that for all if and the collection \mathscr{P}:=\big{\{}B_{\epsilon}\cap B_{\delta}\,\big{|}\,\epsilon\in\{1,2\},\,\delta\in\{3,4\}\big{\}} is a partition of comprised of single-element sets. Similarly, must be a subset of to avoid having an with . Since the sets in are disjoint and have union equal to , it follows that . Finally, and since sets in have cardinality one.
To derive the type-() configuration, suppose that there is a single such that . Since and satisfy property (II) with , the sets and contain at most one element. It follows that and can each contribute at most one to the sum . Since and satisfy property (II) respectively for and with , the set has at most one element. Under these constraints, it is only possible that for all if , , and the sets and have cardinality one and are disjoint. Since and satisfy property (II) with , the sets , jointly contribute at most one to each of the sums and . In order for for all , it must be that and . Hence . Since and satisfy property (II) respectively for and with , there exists and such that and .
Next we bound the expectation of when . Using the formula (15.2), we can write
[TABLE]
where the second equality holds by foiling the product over by our observations above. The type-() and type-() contributions to (15.12) both yield multiples of . The cases are similar, so we will discuss only the type-() case.
When the product over inside the expectation in (15.12) is foiled, only the terms with , , can be of type-() or type-() and thus nonzero. In the type-() case, there are distinct such that and , where and have the roles of and , respectively, in the statement of (). The type-() contribution has the form
[TABLE]
where we interpret inside the product , and is defined as
[TABLE]
Note that the sets and in the definition of both contain exactly two elements. There are respectively and choices for the functions and . When and are given, there are combinatorial possibilities for the pair of functions such that , where the factor of comes from the assignment choices for on the subdomain . For the purpose of evaluating (15.13), it will be convenient to reformulate the sums (ii)-(iii) as
[TABLE]
The summation (15.13) is equal to
[TABLE]
where the sum is independent of a particular choice of with . Moreover, the sum has terms and the summand is independent of because of the cancellation of between the numerator and the denominator. The product above is equal to . ∎
15.3 Proof of Lemma 13.6
In this section, we will prove the following lemma, which uses more restrictive assumptions on the asymptotics for in (2.8) to gain more explicit control of the error in the convergence of to as in Lemma 2.3. Recall from Remark 6.6 that if the random variables in an i.i.d. array \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} are centered with variance , then the random variables in the array \big{\{}X_{a}^{(k,n)}\big{\}}_{a\in E_{k}}:=\mathcal{Q}^{n-k}\big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} have variance . It follows that Lemma 15.12 below is equivalent to Lemma 13.6.
Lemma 15.12.
Fix , , and a bounded interval . There exists such that for any , , and satisfying the inequality
[TABLE]
the following inequality holds:
[TABLE]
The proof of Lemma 15.12 will rely on an application of Lemma 14.9.
Proof.
Let , , and be a bounded interval in . As a preliminary, note that the asymptotic form for as in (II) of Lemma 2.3 implies that there exists a such that for all and
[TABLE]
Let , , and be any values satisfying the condition (15.14), and let . By (I) of Lemma 2.3, we can rewrite the difference between and as
[TABLE]
Since the derivative of is increasing, the absolute value of (15.17) is bounded by
[TABLE]
Let be the smallest such that
[TABLE]
which exists because (15.19) is satisfied with by (15.14) and (15.16). Note that (15.19) implies that since is increasing. Thus with (15.18), for any , , satisfying (15.14), we have that
[TABLE]
We will show that whenever , where is defined by
[TABLE]
Suppose to reach a contradiction that and for some , , such that (15.14) holds. Using similar reasoning as in (15.18), the difference between and is bounded by
[TABLE]
where we have applied (15.20) in the second inequality. Since , the above is smaller than . Thus satisfies \big{|}M^{n-k}(x)\,-\,R(r-k)\big{|}\leq R(r), which contradicts that is the smallest element of satisfying (15.19). Therefore, when .
Since \big{|}M^{n-k}(x)\,-\,R(r-k)\big{|}\leq R(r)+C_{\mathcal{I},\alpha}+\mathbf{v} holds for all when , , satisfy (15.14) and , under these conditions on , , the inequality (15.18) yields
[TABLE]
Thus we have the inequality that we sought under the restriction . The remaining case when is smaller than is trivial. ∎
15.4 Proof of Proposition 13.7
For , let the polynomial be defined as in Section 15.1. The following lemma is from [10, Proposition 3.1].
Lemma 15.13.
The multivariate polynomial satisfies the properties below.
- (i)
has nonnegative coefficients, no constant term, and its only linear term is . In other words, there exist polynomials and with nonnegative coefficients such that
[TABLE]
where the polynomials and have no constant or linear terms.
- (ii)
The polynomial is a linear combination of monomials with
[TABLE]
The polynomial is a linear combination of monomials with .
The next lemma follows easily from (II) of Theorem 2.4.
Lemma 15.14.
For any and bounded interval , there is a positive number such that for all and
[TABLE]
We will use the notation \sigma^{(m)}_{k,n}:=\mathbb{E}\big{[}\big{(}X_{a}^{(k,n)}\big{)}^{m}\big{]} and from (15.1) throughout the following proof. The absolute moment of variables in the generating array \big{\{}X_{h}^{(n)}\big{\}}_{h\in E_{n}} will be denoted by .
Proof of Proposition 13.7.
Fix , , and a bounded interval . (Without loss of generality, we can assume rather than .) We will use induction in to show that there is a such that for any , , and i.i.d. array of centered random variables satisfying
- (I)
\Big{|}\sigma_{n}^{2}-\kappa^{2}\big{(}\frac{1}{n}+\frac{\eta\log n}{n^{2}}+\frac{r}{n^{2}}\big{)}\Big{|}<\frac{\mathbf{v}}{n^{2+\alpha}} and
- (II)
,
the following inequality holds for all
[TABLE]
Notice that the existence of follows from Lemma 15.14 with and Lemma 13.6. Assume for the purpose of a strong induction argument that there exist constants satisfying the statement above for each for some . Let , , and be an i.i.d. array of centered random variables satisfying (I)-(II) for . Note that for any Jensen’s inequality and condition (II) give us the first two inequalities below:
[TABLE]
The third inequality holds since . Thus satisfies condition (II) for each , and therefore (15.22) holds for all by our induction assumption. Define .
The last component of the recursive relation (15.2) implies that
[TABLE]
Since (15.22) holds for all , the term \Big{|}V_{\mathbf{m}}\Big{(}\sigma_{k,n}^{(2)},\ldots,\sigma_{k,n}^{(\mathbf{m}-1)}\Big{)}\Big{|} has the bound
[TABLE]
where is defined by c^{\prime}=\sup_{\ell\in\mathbb{N}_{0}}\,(\ell+1)^{\frac{\mathbf{m}}{2}}V_{\mathbf{m}}\Big{(}c(\ell+1)^{-1},\ldots,c(\ell+1)^{-\frac{\mathbf{m}-1}{2}}\Big{)}, and we have used that has nonnegative coefficients. The supremum above is finite as a consequence of part (ii) of Lemma 15.13.
Again invoking that (15.22) holds for all , the factor \big{|}U_{\mathbf{m}}\big{(}\sigma_{k,n}^{(2)},\ldots,\sigma_{k,n}^{(\mathbf{m})}\big{)}\big{|} in (15.24) has the bound
[TABLE]
The above also uses that the coefficients of the polynomial are nonnegative. Since the polynomial has no constant term by (i) of Lemma 15.13, there is a such that for all and
[TABLE]
Define . Note that when the inequalities below are satisfied for as a consequence of assumption (II) with :
[TABLE]
For define as the smallest satisfying (15.28). Note that for all
[TABLE]
by (15.26)-(15.28) and since and .
Assume . By the bounds (15.24), (15.25), and (15.29), we have the inequality below for all .
[TABLE]
Using (15.30) recursively, it follows that for any
[TABLE]
where , and we have used the crude bound . It follows from (15.31) that is bounded from above by defined by
[TABLE]
If n\geq\max\big{(}\widehat{k},N_{\varkappa,\mathbf{c}}\big{)}, then (15.31) has the form of our desired inequality (15.22) for and all . Since there are only finitely many remaining , we can use the recursive relation \sigma_{k-1,n}^{(\mathbf{m})}=P_{\mathbf{m}}\Big{(}\sigma_{k,n}^{(2)},\ldots,\sigma_{k,n}^{(\mathbf{m})}\Big{)} and our induction assumption to bound the remaining terms by a constant depending only on , , , , and . Finally, we can pick our constant large enough to extend the inequality to the finitely many with n<\max\big{(}\widehat{k},N_{\varkappa,\mathbf{c}}\big{)}. By induction this completes the proof. ∎
Appendix A Inverse temperature scaling
We will outline the calculation verifying that the variance scaling (2.7) determines the inverse temperature scaling in (2.5). In other terms, as for
[TABLE]
Recall that and . Since for , a computation shows that
[TABLE]
Another computation using the expansion (A.1) shows that for small
[TABLE]
Substituting V_{n,r}+\mathit{o}\big{(}\frac{1}{n^{2}}\big{)} in for on the right side of (A.2) yields
[TABLE]
which is the asymptotic form for in (2.5). Alternatively, if we substitute the sharper asymptotic form V_{n,r}+\mathit{O}\big{(}\frac{1}{n^{2+\alpha}}\big{)} in for on the right side of (A.2), then \mathit{o}\big{(}\frac{1}{n^{3/2}}\big{)} can be replaced by \mathit{O}\big{(}\frac{1}{n^{3/2+\alpha}}\big{)} on the right side of (A.3).
For the site-disorder model, the inverse temperature scaling (3.3) results in the variance scaling (14.4) since by (A.1) we have
[TABLE]
Appendix B Variance function consistency check
There is instructional value in implementing a consistency check between properties (I) and (II) in the statement of Lemma 2.3, i.e., between the claim that M\big{(}R(r)\big{)}=R(r+1) and the asymptotics
[TABLE]
where and . Fix some with and define for . We begin by writing as a telescoping sum
[TABLE]
We will analyze the expressions (a), (b), and (c) to verify that the right side of (B.2) has the asymptotics (B.1). The expression (c) is since the terms are bounded by a constant multiple of as a consequence of (B.1).
Applying (B.1) to in the expression (a) yields
[TABLE]
where we have used a trapezoidal Riemann approximation to get
[TABLE]
and right-hand Riemann approximations to get
[TABLE]
Again applying (B.1) to , foiling, and using that , the expression (b) is equal to
[TABLE]
where we have used the Riemann approximation
[TABLE]
Summing up (a), (b), and (c) gives the desired asymptotics (B.1) as a result of the cancellation between the bracketed terms above.
Appendix C The zero bias approach to Stein’s method
We will discuss the zero bias variation on Stein’s method introduced in [19], which provides an easy proof of Lemma 11.6 (restated in Lemma C.4).
C.1 Zero bias transformation
Let be a centered random variable with variance . The zero bias transformation, , of is the distribution satisfying
[TABLE]
for all absolutely continuous functions on . The right side above can be written as
[TABLE]
Thus if has distribution measure , then is constructed by choosing a number using the measure and then picking a number uniformly at random from the interval between 0 and . The normal distribution is the unique fixed point for the zero bias transformation:
Lemma C.1.
Let be a centered random variable with variance . Then iff .
Lemma C.2.
Let be a centered random variable with variance and finite absolute moment \varsigma_{n}:=\mathbf{E}\big{[}|X|^{n}\big{]} for some . The absolute moment of is finite and equal to .
Proof.
This follows easily from the definition of since
[TABLE]
∎
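The mixture construction of the zero bias distribution can be made concrete for a discrete centered variable. The sketch below is an illustration under stated assumptions, not the paper's argument: it takes a hypothetical two-point law (values -1 and 2 with probabilities 2/3 and 1/3), builds X^{*} by choosing a value y with weight proportional to p\,y^{2} and then sampling uniformly between 0 and y, and checks the defining identity \mathbb{E}[Xf(X)]=\sigma^{2}\mathbb{E}[f'(X^{*})] on monomials f(x)=x^{m+1}. In this example the same computation gives \mathbb{E}|X^{*}|^{n}=\mathbb{E}|X|^{n+2}/((n+1)\sigma^{2}) for the absolute moments.

```python
from fractions import Fraction


def moment(values, probs, m):
    """E[X^m] for a discrete law given by (values, probs)."""
    return sum(p * x ** m for x, p in zip(values, probs))


def zero_bias_moment(values, probs, m):
    """E[(X*)^m] from the mixture construction of the zero bias law.

    Pick y with probability p * y^2 / sigma^2, then draw uniformly between
    0 and y; a uniform point between 0 and y has m-th moment y^m / (m + 1).
    """
    var = moment(values, probs, 2)
    return sum(p * x ** 2 / var * x ** m / (m + 1) for x, p in zip(values, probs))


def zero_bias_abs_moment(values, probs, n):
    """E[|X*|^n], computed with the same mixture weights."""
    var = moment(values, probs, 2)
    return sum(p * x ** 2 / var * abs(x) ** n / (n + 1) for x, p in zip(values, probs))


if __name__ == "__main__":
    values = [Fraction(-1), Fraction(2)]        # hypothetical centered law
    probs = [Fraction(2, 3), Fraction(1, 3)]
    var = moment(values, probs, 2)
    for m in range(4):
        lhs = moment(values, probs, m + 2)                       # E[X f(X)], f = x^(m+1)
        rhs = var * (m + 1) * zero_bias_moment(values, probs, m)  # sigma^2 E[f'(X*)]
        print(m, lhs, rhs)
```

Exact rational arithmetic is used so that the two sides of the identity can be compared with equality rather than within a tolerance.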
The lemma below gives a key distributional identity for the zero bias transformation of a finite sum of independent random variables; see, for instance, Lemma 2.2 of [18] for the proof.
Lemma C.3.
Let be independent centered random variables with . Let i be a variable taking values in with probability \mathcal{P}\big{[}\textbf{i}=k\big{]}=\frac{\sigma_{k}^{2}}{\sigma_{1}^{2}+\cdots+\sigma_{n}^{2}}. The distribution of has the form
[TABLE]
where i is independent of the random variables and . In other terms, the variable in the sum is replaced by with probability .
C.2 Relation to Stein’s method
Recall that \rho_{1}(X,Y):=\sup_{h\in\textup{Lip}_{1}}\mathbb{E}\big{[}h(X)-h(Y)\big{]} for two random variables and with finite first absolute moments. Also, recall that the auxiliary function for a given in Stein’s method satisfies the differential equation
[TABLE]
and that the first- and second-order derivatives have the bounds and . In particular is absolutely continuous with Lipschitz constant . If is a centered random variable with variance and \mathcal{X}\sim\mathcal{N}\big{(}0,\sigma^{2}\big{)}, then by definition of we have
[TABLE]
Thus, by taking the supremum over above, we have the bound \rho_{1}(X,\mathcal{X})\,\leq\,2\rho_{1}\big{(}X,X^{*}\big{)} since . Therefore, the Wasserstein-1 distance between and the normal random variable is at most two times the Wasserstein-1 distance between and its zero bias transformation.
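The factor-of-two comparison can be sanity-checked on a simple example. The sketch below assumes X is a symmetric \pm 1 variable, whose zero bias transformation is the uniform distribution on [-1,1], and uses the one-dimensional representation of the Wasserstein-1 distance as the integral of the absolute difference of quantile functions. The midpoint-rule grid size and the function names are illustrative choices, and the computed values are numerical approximations.

```python
from statistics import NormalDist


def w1_from_quantiles(q1, q2, grid=20000):
    """Midpoint-rule approximation of the integral over (0, 1) of |q1(u) - q2(u)|.

    In one dimension this integral equals the Wasserstein-1 distance between
    the laws with quantile functions q1 and q2.
    """
    total = 0.0
    for i in range(grid):
        u = (i + 0.5) / grid
        total += abs(q1(u) - q2(u))
    return total / grid


def q_rademacher(u):
    """Quantile function of the symmetric +/-1 variable."""
    return -1.0 if u < 0.5 else 1.0


def q_uniform(u):
    """Quantile function of the uniform law on [-1, 1] (zero bias law of +/-1)."""
    return 2.0 * u - 1.0


q_normal = NormalDist(0.0, 1.0).inv_cdf  # standard normal quantile function

if __name__ == "__main__":
    d_zero_bias = w1_from_quantiles(q_rademacher, q_uniform)
    d_normal = w1_from_quantiles(q_rademacher, q_normal)
    print(d_normal, 2.0 * d_zero_bias)  # the first should not exceed the second
```

The midpoint rule avoids evaluating the normal quantile function at the endpoints 0 and 1, where it is infinite.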
Lemma C.4.
Let ,…, be i.i.d. variables with mean 0 and variance . For , we have the inequality
[TABLE]
for . Moreover, if \mathbb{E}\big{[}|X_{1}|^{3}\big{]}<\infty and \mathcal{Y}\sim\mathcal{N}\big{(}0,\sigma^{2}\big{)}, then
[TABLE]
Proof.
Let the pairs be i.i.d. couplings of the variables and such that
[TABLE]
Then is bounded as follows:
[TABLE]
and the last term is equal to by assumption. Next we simply observe that
[TABLE]
where the second inequality is by Lemma C.2. The result then holds because .∎
Proof of Lemma 11.7.
Lemma 11.5 gives us the inequality
[TABLE]
The second inequality above uses that \mathbb{E}\big{[}\overline{X}_{n}^{4}\big{]}=3\sigma^{4}\big{(}1-\frac{1}{n}\big{)}+\frac{1}{n}\mathbb{E}\big{[}X_{1}^{4}\big{]} is smaller than 3\mathbb{E}\big{[}X_{1}^{4}\big{]} and 2^{\frac{2}{3}}3^{\frac{1}{3}}\big{(}1+3^{\frac{1}{6}}\big{)}<6. ∎
References
- [1] T. Alberts, J. Clark, S. Kocic: The intermediate disorder regime for a directed polymer model on a hierarchical lattice, Stoch. Process. Appl. 127, 3291-3330 (2017).
- [2] T. Alberts, K. Khanin, J. Quastel: The intermediate disorder regime for directed polymers in dimension 1+1, Ann. Probab. 42, No. 3, 1212-1256 (2014).
- [3] T. Alberts, K. Khanin, J. Quastel: The continuum directed random polymer, J. Stat. Phys. 154, No. 1-2, 305-326 (2014).
- [4] L. Bertini and N. Cancrini: The two-dimensional stochastic heat equation: renormalizing a multiplicative noise, J. Phys. A: Math. Gen. 31, 615 (1998).
- [5] F. Caravenna, R. Sun, and N. Zygouras: Polynomial chaos and scaling limits of disordered systems, J. Eur. Math. Soc. 19, 1-65 (2017).
- [6] F. Caravenna, R. Sun, and N. Zygouras: Universality in marginally relevant disordered systems, Ann. Appl. Probab. 27, No. 5, 3050-3112 (2017).
- [7] F. Caravenna, R. Sun, N. Zygouras: The Dickman subordinator, renewal theorems, and disordered systems, Elect. Journ. Prob. 24, 1-48 (2019).
- [8] F. Caravenna, R. Sun, N. Zygouras: Scaling limits of disordered systems and disorder relevance, Proceedings of XVIII International Congress on Mathematical Physics, arXiv:1602.05825.
