Asymmetric scaling in large deviations for rare values bigger or smaller than the typical value

Cécile Monthus

arXiv:1904.02448·cond-mat.stat-mech·May 12, 2021


TL;DR

This paper investigates asymmetric large deviations in empirical observables derived from independent random variables, revealing how different scalings occur for rare events above or below typical values, with insights from Sanov's theorem and renormalization.

Contribution

It unifies the analysis of asymmetric large deviations for various empirical observables using Sanov's theorem and explores their physical interpretation through renormalization.

Findings

01

Asymmetric large deviations occur for empirical maxima, averages, and moments.

02

Sanov's theorem provides a unifying framework for analyzing these deviations.

03

The physical meaning of rate functions is discussed via renormalization.


Equations

(1) $P_{N}(u)\mathop{\simeq}_{N\to+\infty}\delta(u-u^{typ})$

(2) $u\mathop{\simeq}_{N\to+\infty}u^{typ}+\frac{v}{T_{N}}$

(3) $P_{N}(u)\mathop{\simeq}_{N\to+\infty}e^{-NI(u)}$

(4) $I(u^{typ})=0$

(5) $P_{N}(u)\mathop{\simeq}_{N\to+\infty}e^{-D_{N}^{+}I_{+}(u)}$ for $u\geq u^{typ}$ and $P_{N}(u)\mathop{\simeq}_{N\to+\infty}e^{-D_{N}^{-}I_{-}(u)}$ for $u\leq u^{typ}$

(6) $\pi^{exp}_{\alpha,\nu}(x)\mathop{\simeq}_{x\to+\infty}Kx^{\nu-1}e^{-x^{\alpha}}$

(7) $\pi^{power}_{\mu}(x)\mathop{\simeq}_{x\to+\infty}\frac{K}{x^{1+\mu}}$

(8) $p_{N}(x)\equiv\frac{1}{N}\sum_{i=1}^{N}\delta(x-x_{i})$

(9) $p_{N}^{typ}(x)=\pi(x)$

(10) $x_{N}^{max}\equiv\max_{1\leq i\leq N}(x_{i})$

(11) $N_{N}(x)\equiv\sum_{i=1}^{N}\theta(x_{i}-x)=N\int_{x}^{+\infty}dy\,p_{N}(y)$

(12) $0=N_{N}(x_{N}^{max}+0)=N\int_{x_{N}^{max}+0}^{+\infty}dy\,p_{N}(y)$ and $1=N_{N}(x_{N}^{max}-0)=N\int_{x_{N}^{max}-0}^{+\infty}dy\,p_{N}(y)$

(13) $M_{N}\equiv(x_{N}^{max})^{typ}$

(14) $1=N\int_{M_{N}}^{+\infty}dy\,p_{N}^{typ}(y)=N\int_{M_{N}}^{+\infty}dy\,\pi(y)\equiv NC(M_{N})$

(15) $C(x)\equiv\int_{x}^{+\infty}dx^{\prime}\,\pi(x^{\prime})$

(16) $r_{N}\equiv\frac{x_{N}^{max}}{M_{N}}=\frac{1}{M_{N}}\max_{1\leq i\leq N}(x_{i})$

(17) $r_{N}^{typ}=1$

(18) $C(x)\mathop{\simeq}_{x\to+\infty}\frac{K}{\alpha}x^{\nu-\alpha}e^{-x^{\alpha}}$

(19) $\frac{1}{N}=C(M_{N})\mathop{\simeq}_{N\to+\infty}\frac{K}{\alpha}M_{N}^{\nu-\alpha}e^{-M_{N}^{\alpha}}$

(20) $M_{N}\mathop{\simeq}_{N\to+\infty}\left[\ln N+\frac{\nu-\alpha}{\alpha}\ln(\ln N)-\ln\left(\frac{\alpha}{K}\right)\right]^{\frac{1}{\alpha}}$

(21) $C(x)=\int_{x}^{+\infty}dx^{\prime}\,\pi(x^{\prime})=\frac{K}{\mu x^{\mu}}$

(22) $M_{N}=\left(\frac{KN}{\mu}\right)^{\frac{1}{\mu}}$

(23) $G_{N}\equiv\frac{1}{N}\sum_{i=1}^{N}g(x_{i})=\int_{0}^{+\infty}dx\,g(x)p_{N}(x)$

(24) $a_{N}\equiv\frac{1}{N}\sum_{i=1}^{N}x_{i}=\int_{0}^{+\infty}dx\,x\,p_{N}(x)$

(25) $a_{N}^{(q)}\equiv\frac{1}{N}\sum_{i=1}^{N}x_{i}^{q}=\int_{0}^{+\infty}dx\,x^{q}p_{N}(x)$

(26) $G_{N}^{typ}=\int_{0}^{+\infty}dx\,g(x)p_{N}^{typ}(x)=\int_{0}^{+\infty}dx\,g(x)\pi(x)$

(27) $a_{N}^{typ}=\int_{0}^{+\infty}dx\,x\,p_{N}^{typ}(x)=\int_{0}^{+\infty}dx\,x\,\pi(x)=\overline{x}$

(28) $P_{N}[p_{N}(.)]\mathop{\simeq}_{N\to+\infty}\delta\left(1-\int dx\,p_{N}(x)\right)e^{-NS^{rel}(p_{N}(.)|\pi(.))}$

(29) $S^{rel}(p_{N}(.)|\pi(.))\equiv\int dx\,p_{N}(x)\ln\left(\frac{p_{N}(x)}{\pi(x)}\right)$


Full text


Institut de Physique Théorique, Université Paris Saclay, CNRS, CEA, 91191 Gif-sur-Yvette, France

Abstract

In various disordered systems or non-equilibrium dynamical models, the large deviations of some observables have been found to display different scalings for rare values bigger or smaller than the typical value. In the present paper, we revisit the simpler observables based on independent random variables, namely the empirical maximum, the empirical average, the empirical non-integer moments or other additive empirical observables, in order to describe the cases where asymmetric large deviations already occur. The unifying starting point to analyze the large deviations of these various empirical observables is given by the Sanov theorem for the large deviations of the empirical histogram : the rate function corresponds to the relative entropy with respect to the true probability distribution and it can be optimized in the presence of the appropriate constraints. Finally, the physical meaning of large deviations rate functions is discussed from the renormalization perspective.

I Introduction

In macroscopic systems with a large number $N$ of degrees of freedom, it is essential to understand how physical observables fluctuate as a function of the size $N$. When $u$ is some intensive variable, one usually distinguishes the following three levels of description for its probability distribution $P_{N}(u)$:

(i) in the thermodynamic limit $N\to+\infty$ , the probability distribution $P_{N}(u)$ becomes concentrated on the typical value $u^{typ}$ that does not depend on $N$

$$P_{N}(u)\mathop{\simeq}_{N\to+\infty}\delta(u-u^{typ}) \tag{1}$$

This statement is the analog of the law of large numbers for the empirical average of independent random variables. Another well-known example is the typical Lyapunov exponent for products of random matrices [1, 2].

(ii) zooming in on Eq. 1 reveals the order $\frac{1}{T_{N}}$ of the small typical fluctuations around the typical value $u^{typ}$. The appropriate scale $T_{N}$ grows with $N$, for instance like a power-law $N^{\chi}$ with some exponent $0<\chi<1$ or like a power of $\ln(N)$

$$u\mathop{\simeq}_{N\to+\infty}u^{typ}+\frac{v}{T_{N}} \tag{2}$$

and the rescaled variable $v\equiv T_{N}(u-u^{typ})$ is distributed with some universal limiting distribution $V(v)$. This statement is the analog of the Central Limit Theorem with the scale $T_{N}=\sqrt{N}$ and where $V(v)$ is the Gaussian distribution for the universality class of probability distributions whose first two moments are finite (if they are not finite, one obtains the other universality classes involving Lévy stable laws). Another famous example is given by the three universality classes Gumbel-Fréchet-Weibull of Extreme Value Statistics [3, 4], with many applications in various physics domains (see the reviews [5, 6, 7] and references therein).

(iii) in the field of large deviations, one is interested instead in evaluating how rare it is for large $N$ to observe some finite value $u$ different from $u^{typ}$ . The standard theory of large deviations is based on the exponential decay [8, 9, 10]

$$P_{N}(u)\mathop{\simeq}_{N\to+\infty}e^{-NI(u)} \tag{3}$$

where the rate function is positive $I(u)\geq 0$ and vanishes only for the typical value $u^{typ}$ of Eq 1

$$I(u^{typ})=0 \tag{4}$$

While the region (ii) of universal typical fluctuations has been traditionally the main focus of studies for various physical observables, the theory of large deviations (iii) is nowadays considered as the unifying language for the statistical physics of equilibrium, non-equilibrium and dynamical systems (see the reviews [8, 9, 10] and references therein). In particular, the large deviations with respect to the large time limit of dynamical trajectories have produced an appropriate statistical physics approach for various Markovian processes (see the reviews [11, 12, 13, 14, 15, 16, 17] and the PhD Theses [18, 19, 20, 21] and the HDR Thesis [22]).

However the recent huge activity on large deviations in the field of random matrices has shown that the maximal eigenvalue [23, 24, 25, 26, 27, 28] and many other observables involving the eigenvalues [29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39] display asymmetric scaling in large deviations : the probability $P_{N}(u)$ to observe bigger values than typical $u>u^{typ}$ and smaller values than typical $u<u^{typ}$ are governed by two different scalings $D_{N}^{\pm}$ (for instance two different power-laws $D_{N}^{\pm}=N^{\theta_{\pm}}$ ) and two rate functions $I_{\pm}(u)$

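$$P_{N}(u)\mathop{\simeq}_{N\to+\infty}e^{-D_{N}^{+}I_{+}(u)}\ \ \text{for}\ \ u\geq u^{typ}, \qquad P_{N}(u)\mathop{\simeq}_{N\to+\infty}e^{-D_{N}^{-}I_{-}(u)}\ \ \text{for}\ \ u\leq u^{typ} \tag{5}$$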

instead of the standard form of Eq. 3. For the maximal eigenvalue [23, 24, 25, 26, 27, 28], the physical interpretation of this asymmetry is that to push the maximal eigenvalue inside the Wigner sea, one needs to reorganize all the other eigenvalues, whereas to pull the maximal eigenvalue outside the Wigner sea, one may leave the other eigenvalues unchanged. Via mapping between models belonging to the Kardar-Parisi-Zhang universality class (see the list in the review [27] and references therein), these asymmetric large deviations properties for the biggest eigenvalue of some random matrices ensembles can be rephrased in many other frameworks, in particular :

(a) for the Asymmetric Exclusion process, which is one of the most studied models in the field of the non-equilibrium dynamics of interacting particles (see the reviews [11, 16, 17] and references therein), the interpretation of the asymmetric large deviations is that to slow down the traffic, it is sufficient to slow down a single particle, whereas to speed up the traffic, one needs to speed up all particles [40].

(b) for the Directed Polymer in random medium in dimension $d=2$, which is one of the simplest disordered models displaying a low temperature glassy frozen phase (see the review [41] and references therein), the interpretation is that an anomalously good ground state energy requires only $L$ anomalously good on-site energies along the polymer, while an 'anomalously bad' ground state energy requires $L^{d}$ bad on-site energies in the sample.

These examples and their very clear physical meanings show that asymmetric large deviations of Eq. 5 are likely to occur in many other problems in the fields of non-equilibrium dynamics or disordered systems, while they are not considered in the standard theory of large deviations [8, 9, 10] based on Eq. 3. As a consequence, it seems useful to revisit simpler observables based on independent random variables where asymmetric large deviations have been found to occur, in particular for the empirical maximum [42, 43], for the empirical average [44, 45, 46, 47], and in joint linear statistics [48]. Since these problems have already been studied in detail in these references by exact methods, the goal of the present paper is to give a unifying perspective based on the large deviation properties of the empirical histogram in the presence of constraints corresponding to the observables under study. This point of view also makes the link with the studies of large deviations in the field of random matrices [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39] where the Coulomb gas technique is based on the large deviations of the empirical histogram of eigenvalues, in the presence of constraints corresponding to the observables under study. The main difference is that the large deviations of the empirical histogram are governed by the Coulomb interaction energy in the case of random matrix eigenvalues, while they are governed by the relative entropy of the Sanov theorem for the case of independent variables [8, 9, 10]. This large deviation framework also makes the link with the Gibbs theory of ensembles in equilibrium statistical physics [8, 9, 10] and thus explains why it is natural to expect the possibility of phase transitions in large deviation rate functions (see the recent review [51] and references therein).

The paper is organized as follows. In section II, we recall how the empirical histogram of independent random variables allows one to reconstruct interesting observables like the empirical maximum, the empirical average, or other additive empirical observables, with the consequences for typical values. In section III, the large deviations of the empirical histogram are presented as the unifying starting point to analyze the large deviations of empirical observables. In section IV, the asymmetry in the large deviations of the empirical maximum is analyzed on various scales. In section V, the asymmetry in the large deviations of the empirical average is described for the case of stretched exponential decay or power-law decay of the initial distribution. In section VI, the generalization for the large deviations of arbitrary non-integer empirical moments is discussed. In section VII, these large deviations properties are analyzed from the renormalization perspective. Our conclusions are summarized in section VIII.

II Empirical observables for independent random variables

II.1 Notations

Our main goal is to analyze the asymmetry in large deviation properties that may occur for simple observables involving $N$ independent random variables $x_{i}$ drawn with some probability distribution $\pi(x)$ . To simplify the discussion, we will focus on the cases of positive variables $0\leq x<+\infty$ , where the decay of the probability distribution $\pi(x)$ for large $x\to+\infty$ is :

(i) either an exponential decay with some exponent $\alpha>0$

$$\pi^{exp}_{\alpha,\nu}(x)\mathop{\simeq}_{x\to+\infty}Kx^{\nu-1}e^{-x^{\alpha}} \tag{6}$$

with possibly some power-law prefactor if $\nu\neq 1$ , while $K$ is a constant amplitude.

(ii) or a power-law decay with some exponent $\mu>2$ (in order to ensure the existence of the first two moments $\overline{x}$ and $\overline{x^{2}}$ )

$$\pi^{power}_{\mu}(x)\mathop{\simeq}_{x\to+\infty}\frac{K}{x^{1+\mu}} \tag{7}$$

But of course these assumptions are not restrictive, and if one is interested in other cases, one can easily adapt the methods described below by considering the various possible tail behaviors for $x\to-\infty$ .

II.2 Empirical histogram

If one is not interested in the order of appearance of the variables $[x_{i}]_{1\leq i\leq N}$ (otherwise see the pedagogical introduction [52] and references therein), all the information is contained in the empirical histogram

$$p_{N}(x)\equiv\frac{1}{N}\sum_{i=1}^{N}\delta(x-x_{i}) \tag{8}$$

Its typical value is of course the ’true’ probability distribution $\pi(x)$

$$p_{N}^{typ}(x)=\pi(x) \tag{9}$$

The large deviations around this typical value will be discussed in section III. Let us first recall how the empirical histogram allows one to reconstruct the usual empirical observables of interest.

II.3 Empirical maximum

The information on the empirical maximum

$$x_{N}^{max}\equiv\max_{1\leq i\leq N}(x_{i}) \tag{10}$$

is contained in the empirical histogram of Eq. 8 as follows : the empirical maximum $x^{max}_{N}$ corresponds to the value $x$ where the empirical number of variables bigger than $x$

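$$N_{N}(x)\equiv\sum_{i=1}^{N}\theta(x_{i}-x)=N\int_{x}^{+\infty}dy\,p_{N}(y) \tag{11}$$

jumps from the value $0$ for $x$ bigger than $x^{max}_{N}$ to the value $1$ for $x$ slightly smaller than $x^{max}_{N}$

$$0=N_{N}(x_{N}^{max}+0)=N\int_{x_{N}^{max}+0}^{+\infty}dy\,p_{N}(y), \qquad 1=N_{N}(x_{N}^{max}-0)=N\int_{x_{N}^{max}-0}^{+\infty}dy\,p_{N}(y) \tag{12}$$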

The typical value of the empirical histogram of Eq. 9 yields that the typical value $M_{N}$ of the empirical maximum $x_{N}^{max}$ of Eq. 10

$$M_{N}\equiv(x_{N}^{max})^{typ} \tag{13}$$

is given by the typical position of the jump of Eq. 12

$$1=N\int_{M_{N}}^{+\infty}dy\,p_{N}^{typ}(y)=N\int_{M_{N}}^{+\infty}dy\,\pi(y)\equiv NC(M_{N}) \tag{14}$$

where we have introduced the complementary cumulative distribution function that measures the integrated tail above the threshold $x$

$$C(x)\equiv\int_{x}^{+\infty}dx^{\prime}\,\pi(x^{\prime}) \tag{15}$$

The large deviations properties of the ratio

$$r_{N}\equiv\frac{x_{N}^{max}}{M_{N}}=\frac{1}{M_{N}}\max_{1\leq i\leq N}(x_{i}) \tag{16}$$

of typical value unity

$$r_{N}^{typ}=1 \tag{17}$$

will be discussed in section IV. To be self-contained, let us now recall the behavior of the typical values $M_{N}$ as a function of $N$ for the two types of decay under study here.

II.3.1 Typical value $M_{N}$ of the maximum for the exponential decay

For the asymptotic behavior of Eq. 6, the asymptotic behavior of its primitive of Eq. 15 reads

$$C(x)\mathop{\simeq}_{x\to+\infty}\frac{K}{\alpha}x^{\nu-\alpha}e^{-x^{\alpha}} \tag{18}$$

Then Eq. 14 determining the typical value $M_{N}$ of the empirical maximum becomes for large $N$

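$$\frac{1}{N}=C(M_{N})\mathop{\simeq}_{N\to+\infty}\frac{K}{\alpha}M_{N}^{\nu-\alpha}e^{-M_{N}^{\alpha}} \tag{19}$$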

The inversion yields at leading order the well-known logarithmic behavior [3, 4]

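$$M_{N}\mathop{\simeq}_{N\to+\infty}\left[\ln N+\frac{\nu-\alpha}{\alpha}\ln(\ln N)-\ln\left(\frac{\alpha}{K}\right)\right]^{\frac{1}{\alpha}} \tag{20}$$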
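As a numerical illustration of Eqs 19 and 20 (a sketch that is not part of the original analysis), one may take the half-Gaussian density $\pi(x)=\frac{2}{\sqrt{\pi}}e^{-x^{2}}$ on $x\geq 0$. This is an assumed example corresponding to $\alpha=2$, $\nu=1$ and $K=\frac{2}{\sqrt{\pi}}$, for which $C(x)=\mathrm{erfc}(x)$ exactly. The function names are hypothetical. The code solves $C(M_{N})=\frac{1}{N}$ by bisection and compares the result with the leading-order formula of Eq. 20.

```python
import math


def typical_maximum_exact(n, lo=0.0, hi=10.0, iters=200):
    """Solve C(M) = 1/n by bisection for the assumed example C(x) = erfc(x)."""
    target = 1.0 / n
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # erfc is decreasing: if C(mid) is still too large, the root lies above mid.
        if math.erfc(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def typical_maximum_asymptotic(n, alpha=2.0, nu=1.0, K=2.0 / math.sqrt(math.pi)):
    """Leading-order inversion of Eq. 20 for a tail K x^(nu-1) exp(-x^alpha)."""
    log_n = math.log(n)
    inside = log_n + (nu - alpha) / alpha * math.log(log_n) - math.log(alpha / K)
    return inside ** (1.0 / alpha)


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, typical_maximum_exact(n), typical_maximum_asymptotic(n))
```

For $N=10^{6}$ the two values should agree to better than one percent, the remaining difference coming from subleading corrections to Eq. 20.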

II.3.2 Typical value $M_{N}$ of the maximum for the power-law decay

For the power-law decay of Eq. 7, the asymptotic behavior of its primitive of Eq. 15

$$C(x)=\int_{x}^{+\infty}dx^{\prime}\,\pi(x^{\prime})=\frac{K}{\mu x^{\mu}} \tag{21}$$

yields that the solution of Eq. 14 follows the well-known power-law [3, 4]

$$M_{N}=\left(\frac{KN}{\mu}\right)^{\frac{1}{\mu}} \tag{22}$$

II.4 Empirical additive observables

The empirical histogram of Eq. 8 allows one to reconstruct any additive observable $G_{N}$ involving some function $g(x)$

$$G_{N}\equiv\frac{1}{N}\sum_{i=1}^{N}g(x_{i})=\int_{0}^{+\infty}dx\,g(x)p_{N}(x) \tag{23}$$

The most studied observable in the whole history of probability is of course the empirical average

$$a_{N}\equiv\frac{1}{N}\sum_{i=1}^{N}x_{i}=\int_{0}^{+\infty}dx\,x\,p_{N}(x) \tag{24}$$

The empirical moments of arbitrary non-integer order $q$

$$a_{N}^{(q)}\equiv\frac{1}{N}\sum_{i=1}^{N}x_{i}^{q}=\int_{0}^{+\infty}dx\,x^{q}p_{N}(x) \tag{25}$$

have also been considered [24, 6] in order to interpolate between the case $q=1$ of the empirical average of Eq. 24 and the empirical maximum of Eq. 10 that should dominate the empirical moment of Eq. 25 for large $q\to+\infty$. Other examples include the exponential case $g(x)=e^{tx}$ considered in Ref [53] or the logarithmic case $g(x)=\ln(x)$.

The typical value of the empirical histogram of Eq. 9 yields that the typical values of additive observables of Eq. 23 are simply

$$G_{N}^{typ}=\int_{0}^{+\infty}dx\,g(x)p_{N}^{typ}(x)=\int_{0}^{+\infty}dx\,g(x)\pi(x) \tag{26}$$

In particular the typical value of the empirical average of Eq. 24 corresponds to the first moment $\overline{x}$ of $\pi(x)$

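$$a_{N}^{typ}=\int_{0}^{+\infty}dx\,x\,p_{N}^{typ}(x)=\int_{0}^{+\infty}dx\,x\,\pi(x)=\overline{x} \tag{27}$$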

The possibility of asymmetric large deviations properties around this typical value will be discussed in section V.

III Analysis based on the large deviations of the empirical histogram

III.1 Reminder on the Sanov theorem involving the relative entropy

The large deviations of the empirical histogram $p_{N}(x)$ of Eq. 8 around its typical value $p_{N}^{typ}(x)=\pi(x)$ of Eq 9 are described by the Sanov theorem (see the reviews [8, 9, 10] and the pedagogical introduction [52])

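$$P_{N}[p_{N}(.)]\mathop{\simeq}_{N\to+\infty}\delta\left(1-\int dx\,p_{N}(x)\right)e^{-NS^{rel}(p_{N}(.)|\pi(.))} \tag{28}$$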

The delta function appears in order to impose the normalization constraint of the empirical histogram $\int dxp_{N}(x)=1$ . The exponentially small term in $N$ involves the relative entropy of the empirical histogram $p_{N}(x)$ with respect to the true probability distribution $\pi(x)$

$$S^{rel}(p_{N}(.)|\pi(.))\equiv\int dx\,p_{N}(x)\ln\left(\frac{p_{N}(x)}{\pi(x)}\right) \tag{29}$$
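The Sanov form of Eqs 28 and 29 can be checked numerically in a discrete setting. The sketch below assumes, for illustration only, variables taking three possible values with probabilities $\pi=(0.5,0.3,0.2)$, so that the probability of an empirical histogram is an exact multinomial weight. All function names are hypothetical. The code compares $-\frac{1}{N}\ln P_{N}$ with the relative entropy of the observed histogram.

```python
import math


def relative_entropy(p, pi):
    """Discrete analog of Eq. 29: sum_j p_j ln(p_j / pi_j), with 0 ln 0 = 0."""
    return sum(pj * math.log(pj / qj) for pj, qj in zip(p, pi) if pj > 0.0)


def log_multinomial_probability(counts, pi):
    """Exact log-probability to observe the given counts among N = sum(counts) draws."""
    n = sum(counts)
    log_p = math.lgamma(n + 1)
    for c, q in zip(counts, pi):
        log_p += c * math.log(q) - math.lgamma(c + 1)
    return log_p


def empirical_rate(counts, pi):
    """Finite-N estimate -ln(P_N) / N of the rate function."""
    return -log_multinomial_probability(counts, pi) / sum(counts)


if __name__ == "__main__":
    pi = (0.5, 0.3, 0.2)
    p = (0.2, 0.3, 0.5)  # atypical histogram whose probability is evaluated
    for scale in (10, 100, 1000, 10000):
        counts = (2 * scale, 3 * scale, 5 * scale)
        print(sum(counts), empirical_rate(counts, pi), relative_entropy(p, pi))
```

The finite-size estimate should approach the relative entropy with corrections of order $\frac{\ln N}{N}$, which are invisible at the exponential level of Eq. 28.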

III.2 Exact generating function of the empirical histogram for finite $N$

Among the various derivations of Eq. 28, one is based on the exact generating function ${\cal Z}_{N}[\kappa(.)]$ of the empirical histogram for any finite $N$

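$${\cal Z}_{N}[\kappa(.)]\equiv\left\langle e^{N\int dx\,\kappa(x)p_{N}(x)}\right\rangle=\left\langle e^{\sum_{i=1}^{N}\kappa(x_{i})}\right\rangle=\left[\int dx\,\pi(x)e^{\kappa(x)}\right]^{N}\equiv e^{N\phi[\kappa(.)]} \tag{30}$$

where the scaled cumulant generating function

$$\phi[\kappa(.)]=\ln\left(\int dx\,\pi(x)e^{\kappa(x)}\right) \tag{31}$$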

is related to the relative entropy of Eq. 29 via the appropriate Legendre transform (see [52] for more details on the Legendre transforms in the two directions).

III.3 Constraint to reproduce the cumulative distribution of the maximum $x_{N}^{max}$

The probability distribution ${\cal X}_{N}(x^{max}_{N})$ of the empirical maximum $x^{max}_{N}$ of Eq. 10 is well known to be [3, 4]

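$${\cal X}_{N}(x^{max}_{N})=N\pi(x^{max}_{N})\left[1-C(x^{max}_{N})\right]^{N-1}=\frac{d}{dx^{max}_{N}}\left[1-C(x^{max}_{N})\right]^{N} \tag{32}$$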

in terms of the cumulative function $C(x)$ introduced in Eq. 15.

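Here it is instructive to mention how it can be reproduced from the large deviations of the empirical histogram of Eq. 28: the cumulative probability of the maximum $x_{N}^{max}$ amounts to replacing the normalisation constraint $\int dx\,p_{N}(x)=1$ by the two constraints

$$\int_{0}^{x_{N}^{max}}dx\,p_{N}(x)=1 \qquad \text{and} \qquad \int_{x_{N}^{max}}^{+\infty}dx\,p_{N}(x)=0 \tag{33}$$

leading to

$$\int_{0}^{x_{N}^{max}}dx\,{\cal X}_{N}(x)\mathop{\simeq}_{N\to+\infty}\int{\cal D}p_{N}(.)\,\delta\left(1-\int_{0}^{x_{N}^{max}}dx\,p_{N}(x)\right)e^{-N\int_{0}^{x_{N}^{max}}dx\,p_{N}(x)\ln\left(\frac{p_{N}(x)}{\pi(x)}\right)} \tag{34}$$

Introducing the Lagrange multiplier $\omega$ , one needs to optimize the functional

$${\cal L}[p_{N}(.)]=\int_{0}^{x_{N}^{max}}dx\,p_{N}(x)\ln\left(\frac{p_{N}(x)}{\pi(x)}\right)-\omega\left(\int_{0}^{x_{N}^{max}}dx\,p_{N}(x)-1\right) \tag{35}$$

over the empirical histogram $p_{N}(.)$

$$0=\frac{\delta{\cal L}[p_{N}(.)]}{\delta p_{N}(x)}=\ln\left(\frac{p_{N}(x)}{\pi(x)}\right)+1-\omega \tag{36}$$

The optimal solution is thus simply proportional to the true distribution $\pi(x)$ on $[0,x_{N}^{max}]$ (while it vanishes for $x>x_{N}^{max}$ )

$$p_{N}^{opt}(x)=e^{\omega-1}\pi(x)\,\theta(x_{N}^{max}-x) \tag{37}$$

where the normalization constraint determines the Lagrange multiplier $\omega$

$$1=e^{\omega-1}\int_{0}^{x_{N}^{max}}dx\,\pi(x)=e^{\omega-1}\left[1-C(x_{N}^{max})\right] \tag{38}$$

Plugging the corresponding optimal value of the functional of Eq. 35

$${\cal L}^{opt}=\int_{0}^{x_{N}^{max}}dx\,p_{N}^{opt}(x)\,(\omega-1)=\omega-1=-\ln\left[1-C(x_{N}^{max})\right] \tag{39}$$

into Eq. 34

$$\int_{0}^{x_{N}^{max}}dx\,{\cal X}_{N}(x)\mathop{\simeq}_{N\to+\infty}e^{-N{\cal L}^{opt}}=\left[1-C(x_{N}^{max})\right]^{N} \tag{40}$$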

thus allows to recover the exact cumulative distribution of Eq. 32. In this derivation, Eq. 34 thus corresponds to the entropic cost for the emptiness of the region $[x_{N}^{max},+\infty[$ . Section IV will be devoted to the asymmetric large deviations properties of this distribution.

III.4 Standard large deviations for additive empirical observables

The probability distribution ${\cal G}_{N}(G_{N})$ of the additive observable of Eq. 23

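$${\cal G}_{N}(G)\equiv\left[\prod_{i=1}^{N}\int_{0}^{+\infty}dx_{i}\,\pi(x_{i})\right]\delta\left(G-\frac{1}{N}\sum_{i=1}^{N}g(x_{i})\right) \tag{41}$$

can be directly characterized by its exact generating function for finite $N$ by applying Eq. 30 to the case $\kappa(x)=kg(x)$

$${\cal Z}_{N}(k)\equiv\int dG\,{\cal G}_{N}(G)\,e^{NkG}=\left[\int_{0}^{+\infty}dx\,\pi(x)e^{kg(x)}\right]^{N}\equiv e^{N\phi(k)} \tag{42}$$

with the scaled cumulant generating function $\phi(k)$

$$\phi(k)=\ln\left(\int_{0}^{+\infty}dx\,\pi(x)e^{kg(x)}\right) \tag{43}$$

The alternative evaluation of Eq. 42 based on the standard large deviation form for the probability

$${\cal G}_{N}(G)\mathop{\simeq}_{N\to+\infty}e^{-NI(G)} \tag{44}$$

yields

$$e^{N\phi(k)}={\cal Z}_{N}(k)\mathop{\simeq}_{N\to+\infty}\int dG\,e^{N\left[kG-I(G)\right]} \tag{45}$$

via the saddle-point method for the integral over $G$ that $\phi(k)$ is the Legendre transform of the rate function $I(G)$

$$\phi(k)=kG-I(G) \qquad \text{with} \qquad k=I^{\prime}(G) \tag{46}$$

with the reciprocal Legendre transform

$$I(G)=kG-\phi(k) \qquad \text{with} \qquad G=\phi^{\prime}(k) \tag{47}$$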

Another way to understand the physical meaning of these Legendre transforms consists in evaluating the probability ${\cal G}_{N}(G)$ via the addition of the sum constraint in the large deviations of the empirical histogram of Eq. 28 :

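$${\cal G}_{N}(G)\mathop{\simeq}_{N\to+\infty}\int{\cal D}p_{N}(.)\,\delta\left(1-\int_{0}^{+\infty}dx\,p_{N}(x)\right)\delta\left(G-\int_{0}^{+\infty}dx\,g(x)p_{N}(x)\right)e^{-NS^{rel}(p_{N}(.)|\pi(.))} \tag{48}$$

Introducing the two Lagrange multipliers $\omega$ and $k$ , one needs to optimize the functional

$${\cal L}[p_{N}(.)]=\int_{0}^{+\infty}dx\,p_{N}(x)\ln\left(\frac{p_{N}(x)}{\pi(x)}\right)-\omega\left(\int_{0}^{+\infty}dx\,p_{N}(x)-1\right)-k\left(\int_{0}^{+\infty}dx\,g(x)p_{N}(x)-G\right) \tag{49}$$

over the empirical histogram $p_{N}(.)$

$$0=\frac{\delta{\cal L}[p_{N}(.)]}{\delta p_{N}(x)}=\ln\left(\frac{p_{N}(x)}{\pi(x)}\right)+1-\omega-kg(x) \tag{50}$$

The optimal solution reads

$$p_{N}^{opt}(x)=\pi(x)\,e^{kg(x)+\omega-1} \tag{51}$$

where the Lagrange multipliers are fixed by the two constraints

$$1=e^{\omega-1}\int_{0}^{+\infty}dx\,\pi(x)e^{kg(x)}, \qquad G=e^{\omega-1}\int_{0}^{+\infty}dx\,g(x)\pi(x)e^{kg(x)} \tag{52}$$

i.e. in terms of the function $\phi(k)$ introduced in Eq. 43

$$1-\omega=\phi(k), \qquad G=\phi^{\prime}(k) \tag{53}$$

The corresponding optimal value of the functional of Eq. 49 using 53 thus involves the Legendre transform $I(G)$ of $\phi(k)$ (Eqs 46 and 47)

$${\cal L}^{opt}=\int_{0}^{+\infty}dx\,p_{N}^{opt}(x)\left[kg(x)+\omega-1\right]=kG-\phi(k)=I(G) \tag{54}$$

as it should to recover via Eq. 48

$${\cal G}_{N}(G)\mathop{\simeq}_{N\to+\infty}e^{-N{\cal L}^{opt}}=e^{-NI(G)} \tag{55}$$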

Here the analogy with the Gibbs theory of ensembles in equilibrium statistical physics is obvious : the effective distribution of Eq 51 for an individual random variable $x$ is the analog of the Boltzmann distribution in the canonical ensemble, where the Lagrange multiplier $k$ is conjugated to the quantity $g(x)$ , whose average $G$ over the $N$ variables is fixed.
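The Legendre structure of Eqs 46 and 47 can be illustrated numerically. The sketch below assumes the simple example $\pi(x)=e^{-x}$ with $g(x)=x$, for which $\phi(k)=-\ln(1-k)$ for $k<1$ and the rate function is known in closed form, $I(a)=a-1-\ln a$. The function names and the search interval are hypothetical choices. The maximization over $k$ uses a ternary search, which is justified here because $ka-\phi(k)$ is concave in $k$.

```python
import math


def phi(k):
    """Scaled cumulant generating function for the assumed pi(x) = exp(-x), g(x) = x (k < 1)."""
    return -math.log(1.0 - k)


def rate_function(a, k_min=-200.0, k_max=1.0 - 1e-12, iters=200):
    """Return (I(a), k*) with I(a) = max_k [k a - phi(k)], found by ternary search."""
    lo, hi = k_min, k_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        # Keep the part of the interval that still contains the maximum.
        if m1 * a - phi(m1) < m2 * a - phi(m2):
            lo = m1
        else:
            hi = m2
    k_star = 0.5 * (lo + hi)
    return k_star * a - phi(k_star), k_star


if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0):
        value, k_star = rate_function(a)
        # Closed form for comparison: I(a) = a - 1 - ln(a), reached at k* = 1 - 1/a.
        print(a, value, a - 1.0 - math.log(a), k_star)
```

In this example the optimal multiplier is $k=1-\frac{1}{a}$, so the tilted distribution $\pi(x)e^{kx-\phi(k)}$ is again exponential, with mean $a$, in agreement with the canonical-ensemble interpretation given above.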

Of course these computations make sense only if the integrals of Eq. 52 converge : depending on the function $g(x)$ defining the empirical observable $G_{N}$ under study, these integrals may diverge in some region of the Lagrange multiplier $k$ . The consequences for the non-standard large deviations properties will be discussed for the case of the empirical average in section V and for the empirical moments in section VI.

III.5 Large deviations for joint additive empirical observables

If one is interested in the large deviations of the joint probability of two additive empirical observables, one needs to add another constraint in Eq. 48, as described in detail in Ref [48] for the joint probability of the empirical average and of an empirical moment of order $q$ . More generally, one can add as many constraints as needed for the problem one is interested in.

IV Asymmetry in the Large deviations of the empirical maximum

IV.1 Probability distribution of the ratio $r_{N}=\frac{x_{N}^{max}}{M_{N}}$

Via the change of variables $r=\frac{x_{N}^{max}}{M_{N}}$ of Eq. 16, the probability distribution ${\cal X}_{N}(x^{max}_{N})$ of Eq. 32 becomes the probability distribution

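$${\cal R}_{N}(r)=M_{N}\,{\cal X}_{N}(rM_{N})=NM_{N}\,\pi(rM_{N})\left[1-C(rM_{N})\right]^{N-1} \tag{56}$$

Since the typical value $M_{N}$ is large as a consequence of the equation $C(M_{N})=\frac{1}{N}$ , while the ratio $r$ is finite, the value of $(rM_{N})$ is also large, i.e. the value of $C(rM_{N})$ is also small. Then Eq. 56 becomes

$${\cal R}_{N}(r)\mathop{\simeq}_{N\to+\infty}NM_{N}\,\pi(rM_{N})\,e^{-NC(rM_{N})} \tag{57}$$

It is thus more convenient to substitute $N=\frac{1}{C(M_{N})}$ in order to write everything in terms of the single scale $M_{N}$

$${\cal R}_{N}(r)\mathop{\simeq}_{N\to+\infty}\frac{M_{N}\,\pi(rM_{N})}{C(M_{N})}\,e^{-\frac{C(rM_{N})}{C(M_{N})}} \tag{58}$$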

IV.2 Case of the exponential decay

For the exponential decay of Eq. 6, the asymptotic behavior of the function $C(x)$ given in Eq. 18 yields in Eq. 58

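$${\cal R}_{N}(r)\mathop{\simeq}_{N\to+\infty}\alpha M_{N}^{\alpha}\,r^{\nu-1}e^{-M_{N}^{\alpha}(r^{\alpha}-1)}\exp\left[-r^{\nu-\alpha}e^{-M_{N}^{\alpha}(r^{\alpha}-1)}\right] \tag{59}$$

This means that the rescaled variable

$$y\equiv M_{N}^{\alpha}(r^{\alpha}-1)-(\nu-\alpha)\ln r \tag{60}$$

is distributed with the Gumbel probability distribution [3, 4]

$$G(y)=e^{-y-e^{-y}} \tag{61}$$

whose very strong asymmetry for the asymptotic behaviors for $y\to\pm\infty$ is well-known

$$G(y)\mathop{\simeq}_{y\to+\infty}e^{-y}, \qquad G(y)\mathop{\simeq}_{y\to-\infty}e^{-e^{|y|}} \tag{62}$$

As a consequence when $r$ is finite and different from the typical value $r^{typ}=1$ , the variable $y$ of Eq. 60 will be near $(\pm\infty)$ depending on the sign of $(r-1)$ : the asymptotic behaviors of Eqs 62 will thus produce completely different scalings in the region bigger than typical $r>1$ [42, 43]

$${\cal R}_{N}(r)\mathop{\simeq}_{N\to+\infty}e^{-M_{N}^{\alpha}(r^{\alpha}-1)} \qquad \text{for} \quad r>1 \tag{63}$$

and in the region smaller than typical $0<r<1$

$${\cal R}_{N}(r)\mathop{\simeq}_{N\to+\infty}\exp\left[-r^{\nu-\alpha}e^{M_{N}^{\alpha}(1-r^{\alpha})}\right] \qquad \text{for} \quad 0<r<1 \tag{64}$$

The link with the region of small typical fluctuations around the typical value $r^{typ}=1$ usually considered in the Extreme Value Statistics [3, 4] corresponds here to the Taylor expansion at first order of the variable $y$ of Eq. 60

$$y\mathop{\simeq}_{r\to 1}\alpha M_{N}^{\alpha}(r-1) \tag{65}$$

This yields that the appropriate rescaling to have a finite variable $y$ distributed with the Gumbel distribution $G(y)$ is

$$r\simeq 1+\frac{y}{\alpha M_{N}^{\alpha}} \tag{66}$$

or equivalently for the unrescaled maximum of Eq. 10

$$x_{N}^{max}=rM_{N}\simeq M_{N}+\frac{y}{\alpha M_{N}^{\alpha-1}} \tag{67}$$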

where the behavior of $M_{N}$ as a function of $N$ was recalled in Eq. 20.

IV.3 Case of the power-law decay

For the power-law decay of Eq. 7, the asymptotic behavior of its primitive of Eq. 21 yields in Eq. 58

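$${\cal R}_{N}(r)\mathop{\simeq}_{N\to+\infty}\frac{\mu}{r^{1+\mu}}\,e^{-\frac{1}{r^{\mu}}}\equiv F_{\mu}(r) \tag{68}$$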

where the Fréchet distribution $F_{\mu}(r)$ of parameter $\mu$ appears for any finite value $r$ , while the scale $M_{N}$ has completely disappeared, in contrast to the exponential decay case described above. So here the probability of any value $r\neq r^{typ}=1$ does not even decay with $N$ .
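The disappearance of the scale $M_{N}$ in the power-law case can be checked directly on the cumulative distribution. The sketch below assumes, for illustration, that the power-law form $C(x)=\frac{K}{\mu x^{\mu}}$ of Eq. 21 holds exactly over the range used, with the arbitrary choice $\mu=K=3$, so that $M_{N}=N^{\frac{1}{3}}$ by Eq. 22. The function names are hypothetical. The code compares $\left[1-C(rM_{N})\right]^{N}$ with the cumulative Fréchet form $e^{-r^{-\mu}}$.

```python
import math


def cumulative_max_ratio(r, n, mu=3.0, K=3.0):
    """P(x_max / M_N <= r) = [1 - C(r M_N)]^N for the assumed tail C(x) = K / (mu x^mu)."""
    m_n = (K * n / mu) ** (1.0 / mu)       # typical maximum of Eq. 22
    c = K / (mu * (r * m_n) ** mu)         # equals r^(-mu) / N
    return math.exp(n * math.log1p(-c))


def frechet_cumulative(r, mu=3.0):
    """Cumulative Frechet distribution exp(-r^(-mu))."""
    return math.exp(-r ** (-mu))


if __name__ == "__main__":
    for n in (10**3, 10**6):
        for r in (0.5, 1.0, 2.0):
            print(n, r, cumulative_max_ratio(r, n), frechet_cumulative(r))
```

The printed values should be nearly identical for the two sizes, which is the statement that the distribution of the ratio $r$ does not depend on $N$ at leading order.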

IV.4 Asymmetry beyond the regime of finite ratio $r$

In the following section, we will need the probability of the maximum of Eq 32 beyond the regime of finite ratio $r$ with respect to the typical value $M_{N}$ . In the region much bigger than typical $x^{max}_{N}\gg M_{N}$ where $C(x^{max}_{N})\ll C(M_{N})=\frac{1}{N}$ , the factor $\left[1-C(x^{max}_{N})\right]^{N-1}$ can be neglected in Eq 32 and one obtains the leading behavior

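$${\cal X}_{N}(x^{max}_{N})\mathop{\simeq}_{x^{max}_{N}\gg M_{N}}N\,\pi(x^{max}_{N}) \tag{69}$$

The physical meaning is that one just needs to draw the anomalously big value $x^{max}_{N}$ , while the other $(N-1)$ variables may remain typical and thus have no probabilistic cost.

On the contrary, in the region much smaller than typical $x^{max}_{N}\ll M_{N}$ where $C(x^{max}_{N})\gg C(M_{N})=\frac{1}{N}$ , the factor $\left[1-C(x^{max}_{N})\right]^{N-1}$ is the leading behavior in Eq 32 and produces an extensive cost in $N$ in the exponential

$${\cal X}_{N}(x^{max}_{N})\mathop{\simeq}_{x^{max}_{N}\ll M_{N}}e^{(N-1)\ln\left[1-C(x^{max}_{N})\right]} \tag{70}$$

This asymmetry is thus very strong, and reads for the exponential decay of Eq. 6 with Eq. 18,

$${\cal X}_{N}(x)\simeq NKx^{\nu-1}e^{-x^{\alpha}}\ \ \text{for}\ \ x\gg M_{N}, \qquad {\cal X}_{N}(x)\simeq e^{-N\frac{K}{\alpha}x^{\nu-\alpha}e^{-x^{\alpha}}}\ \ \text{for}\ \ 1\ll x\ll M_{N} \tag{71}$$

while for the power-law decay of Eq 7 with Eq 21, it is given by

$${\cal X}_{N}(x)\simeq\frac{NK}{x^{1+\mu}}\ \ \text{for}\ \ x\gg M_{N}, \qquad {\cal X}_{N}(x)\simeq e^{-\frac{NK}{\mu x^{\mu}}}\ \ \text{for}\ \ x\ll M_{N} \tag{72}$$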
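The two regimes can be compared with the exact density of the maximum, $N\pi(x)\left[1-C(x)\right]^{N-1}$, on a simple example. The sketch below assumes $\pi(x)=e^{-x}$, so that $C(x)=e^{-x}$ and $M_{N}=\ln N$. The function names are hypothetical. The approximation used far below the typical value, $N\pi(x)e^{-NC(x)}$, additionally assumes $\frac{1}{N}\ll C(x)\ll\frac{1}{\sqrt{N}}$ so that $\ln(1-C)$ can be replaced by $-C$.

```python
import math


def log_density_max_exact(x, n):
    """ln of N pi(x) [1 - C(x)]^(N-1) for the assumed pi(x) = C(x) = exp(-x)."""
    return math.log(n) - x + (n - 1) * math.log1p(-math.exp(-x))


def log_density_far_above(x, n):
    """Regime x >> M_N: only the single factor N pi(x) survives."""
    return math.log(n) - x


def log_density_far_below(x, n):
    """Regime x << M_N: cost exp(-N C(x)) for emptying the region above x."""
    return math.log(n) - x - n * math.exp(-x)


if __name__ == "__main__":
    n = 10**6
    for factor in (0.75, 1.0, 3.0):
        x = factor * math.log(n)
        print(factor, log_density_max_exact(x, n),
              log_density_far_above(x, n), log_density_far_below(x, n))
```

Above the typical value the cost is carried by one variable only, while below it the term $-NC(x)$ grows with $N$, which is the asymmetry discussed in this section.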

V Possible asymmetry in the large deviations of the empirical average

Whenever the first moment $\overline{x}$ and the variance $\sigma^{2}\equiv\overline{x^{2}}-(\overline{x})^{2}$ are finite, the Central Limit Theorem means that the empirical average of Eq 24 will display typical fluctuations of order $\frac{1}{\sqrt{N}}$ around the typical value $\overline{x}$

$$a_{N}\mathop{\simeq}_{N\to+\infty}\overline{x}+\frac{\sigma v}{\sqrt{N}} \tag{73}$$

where $v$ is a Gaussian random variable of zero mean and variance unity. In this section, we focus on the large deviations properties for the probability distribution ${\cal A}_{N}(a)$ of the empirical average $a$

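$${\cal A}_{N}(a)\equiv\left[\prod_{i=1}^{N}\int_{0}^{+\infty}dx_{i}\,\pi(x_{i})\right]\delta\left(a-\frac{1}{N}\sum_{i=1}^{N}x_{i}\right) \tag{74}$$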

to discuss how rare it is to observe $a\neq\overline{x}$ for large $N$ and when an asymmetry will occur.

V.1 Standard large deviation theory for the exponential decay with exponent $\alpha\geq 1$

The standard large deviation theory recalled in section III.4 for additive empirical observables can be applied to the empirical average $G=a$ with $g(x)=x$ : the probability to observe the value $a\neq a^{typ}=\overline{x}$ is exponentially small in $N$

$${\cal A}_{N}(a)\mathop{\simeq}_{N\to+\infty}e^{-NI(a)} \tag{75}$$

The rate function $I(a)$ can be either evaluated directly or can be computed as the Legendre transform (Eqs 46 and 47) of the scaled cumulant generating function of Eq. 43

$$\phi(k)=\ln\left(\int_{0}^{+\infty}dx\,\pi(x)e^{kx}\right) \tag{76}$$

For the exponential decay of Eq. 6 with an exponent $\alpha>1$ , the scaled cumulant generating function of Eq. 76 is defined for any $k\in]-\infty,+\infty[$ , and the above large deviation theory can be applied without worry.

For the exponential decay of Eq. 6 with an exponent $\alpha=1$ , the scaled cumulant generating function of Eq. 76 is defined only for $k\in]-\infty,1[$ , and it is thus useful to describe the example of the gamma distribution of parameter $\nu$

$$\gamma_{\nu}(x)=\frac{x^{\nu-1}e^{-x}}{\Gamma(\nu)} \tag{77}$$

of Laplace transform

$$\int_{0}^{+\infty}dx\,\gamma_{\nu}(x)e^{-sx}=\frac{1}{(1+s)^{\nu}} \tag{78}$$

So computing its power $N$ simply amounts to changing the parameter $\nu$ into $(N\nu)$ . As a consequence, the sum of $N$ variables $x_{i}$ is distributed with the gamma distribution $\gamma_{N\nu}(.)$ of parameter $(N\nu)$ that corresponds to the convolution of $N$ distributions $\gamma_{\nu}(.)$ . After the rescaling by $N$ , the probability distribution of the empirical average is thus exactly given by

$${\cal A}_{N}(a)=N\gamma_{N\nu}(Na)=\frac{N(Na)^{N\nu-1}e^{-Na}}{\Gamma(N\nu)} \tag{79}$$

For large $N$ , the Stirling approximation for $\Gamma(N\nu)$ yields the large deviation form of Eq. 75 where the rate function

$$I(a)=a-\nu-\nu\ln\left(\frac{a}{\nu}\right) \tag{80}$$

is well defined for $a\in]0,+\infty[$ and measures how rare it is to observe a value $a$ different from the typical value $a^{typ}=\nu$ .

The corresponding scaled cumulant generating function $\phi(k)$ of Eq. 76 is defined only for $k<1$

$$\phi(k)=\ln\left(\int_{0}^{+\infty}dx\,\gamma_{\nu}(x)e^{kx}\right)=-\nu\ln(1-k) \tag{81}$$

The correspondence between $k$ and $a$ via the Legendre transform of Eq. 46 is

$$k=I^{\prime}(a)=1-\frac{\nu}{a} \tag{82}$$

or equivalently via the reciprocal Legendre transform of Eq. 47

$$a=\phi^{\prime}(k)=\frac{\nu}{1-k} \tag{83}$$

So the region $k\in]-\infty,0[$ parametrizes the whole smaller than typical region $a\in]0,\nu[$ , while the region $k\in]0,1[$ parametrizes the whole bigger than typical region $a\in]\nu,+\infty[$ without problems.
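Since the distribution of the empirical average is known exactly for the gamma example, the convergence towards the large deviation form can be checked at finite $N$. The sketch below uses the exact density $N\gamma_{N\nu}(Na)$ and the rate function $I(a)=a-\nu-\nu\ln\left(\frac{a}{\nu}\right)$ obtained from the Stirling approximation; the function names and the choice $\nu=2$ are hypothetical.

```python
import math


def log_density_average_exact(a, n, nu):
    """ln of N gamma_{N nu}(N a), the exact density of the empirical average."""
    shape = n * nu
    return math.log(n) + (shape - 1.0) * math.log(n * a) - n * a - math.lgamma(shape)


def rate_function_gamma(a, nu):
    """Rate function I(a) = a - nu - nu ln(a / nu), vanishing at a = nu."""
    return a - nu - nu * math.log(a / nu)


if __name__ == "__main__":
    nu = 2.0
    for n in (10**2, 10**3, 10**4):
        for a in (1.0, 3.0):
            estimate = -log_density_average_exact(a, n, nu) / n
            print(n, a, estimate, rate_function_gamma(a, nu))
```

The difference between the two columns should decrease like $\frac{\ln N}{N}$, on both sides of the typical value $a^{typ}=\nu$, as expected for a standard symmetric scaling in $N$.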

V.2 Asymmetry in the large deviations for stretched exponential decay $0<\alpha<1$

V.2.1 Usual large deviation form in the region smaller than typical $a\leq a^{typ}=\overline{x}$

For the exponential decay of Eq. 6 with an exponent $0<\alpha<1$ , the scaled cumulant generating function of Eq. 76 is defined only for $k\in]-\infty,0]$ that corresponds to the region smaller than typical $a\leq a^{typ}=\overline{x}$ , where the usual large deviation form will thus be valid

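$${\cal A}_{N}(a)\mathop{\simeq}_{N\to+\infty}e^{-NI_{-}(a)} \qquad \text{for} \quad a\leq\overline{x} \tag{84}$$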

The rate function $I_{-}(a)$ for $a\leq a^{typ}=\overline{x}$ corresponds to the Legendre transform of the function $\phi(k)$ defined for $k\leq 0$ .

V.2.2 Unusual large deviation in the region $a>a^{typ}$

For $k>0$ that corresponds to the region bigger than the typical value $a>a^{typ}=\overline{x}$ , the function $\phi(k)$ as defined by Eq. 76 does not exist as a consequence of the divergence of the integral at $(+\infty)$ when $\pi(x)$ decays only as a stretched exponential with $0<\alpha<1$

[TABLE]

This suggests considering the strategy based on the maximum alone: $(N-1)$ variables have their typical sum $(N-1)\overline{x}$, which happens with probability one for large $N$, i.e. with no probabilistic cost, while the remaining variable, which has to coincide with the maximum $x_{N}^{max}$ of Eq. 10, should be anomalously big in order to satisfy the sum constraint

[TABLE]

So the cost of this strategy directly involves the probability ${\cal X}_{N}(x^{max}_{N})$ of Eqs 69 and 71 of the anomalously extensive value of Eq. 86

[TABLE]

that decays only as the stretched exponential of exponent $\alpha\in]0,1[$ . The corresponding rate function

$$I_{+}(a)=(a-\overline{x})^{\alpha}\ \ \ {\rm for}\ \ a>\overline{x} \qquad (88)$$

has been proven to be valid in the whole region $a>\overline{x}$ in Refs [44, 46].
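A simple way to see why a single anomalous variable dominates is to compare the cost of sharing the excess among several variables. The sketch below is a simplified illustration, not the proof of Refs [44, 46]: it assumes a tail $\pi(x)\sim e^{-x^{\alpha}}$ and restricts the comparison to an equal sharing of the excess among $m$ variables, for which the cost in the exponent is $m(\Delta/m)^{\alpha}$. The function names and the parameter `m_max` are hypothetical.

```python
def cost_of_sharing(excess, m, alpha):
    """Cost in the exponent when the excess is split equally among m variables.

    Each of the m anomalous variables carries excess/m and costs
    (excess/m)**alpha, assuming a tail pi(x) ~ exp(-x**alpha).
    """
    return m * (excess / m) ** alpha


def best_number_of_jumps(excess, alpha, m_max=50):
    """Return the m in 1..m_max that minimizes the total cost."""
    return min(range(1, m_max + 1), key=lambda m: cost_of_sharing(excess, m, alpha))


if __name__ == "__main__":
    excess = 100.0
    for alpha in (0.3, 0.5, 0.9, 2.0):
        m_star = best_number_of_jumps(excess, alpha)
        print(f"alpha={alpha}: cheapest number of anomalous variables = {m_star}")
```

For $0<\alpha<1$ the cost $m^{1-\alpha}\Delta^{\alpha}$ grows with $m$, so one big variable is the cheapest option, whereas for $\alpha>1$ the cost decreases with $m$ and the excess is better shared among many variables, as in the standard large deviation regime.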

V.3 Asymmetry in the large deviations for power-law decay

For the power-law decay of Eq. 7, one has the same scenario as for the stretched exponential case discussed above:

(i) in the region smaller than typical $a\leq a^{typ}=\overline{x}$ , the usual large deviation form is valid

[TABLE]

where the rate function $I_{-}(a)$ corresponds to the Legendre transform of the function $\phi(k)$ defined for $k\leq 0$ .

(ii) in the region bigger than the typical value $a>a^{typ}=\overline{x}$ where the function $\phi(k)$ of Eq. 76 does not exist as a consequence of the divergence of the integral, the strategy based on the anomalous maximum of Eq. 86 leads to the probability (using Eqs 69 and 72)

[TABLE]

that decays only as the power-law $N^{-\mu}$ . This phenomenon of ’condensation’ in the power-law case has been studied in great detail in the references [45, 47, 48], with motivations coming from the zero-range process (see explanations and references in [45, 47, 48]). Other physical applications can be found in [54, 55].
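The $N^{-\mu}$ scaling can be illustrated with an explicit power-law example. The sketch below assumes a Pareto density $\pi(x)=\mu x^{-1-\mu}$ on $x\geq 1$ with $\mu>1$ (so that the mean $\overline{x}=\mu/(\mu-1)$ is finite), and evaluates the weight $N\pi(x)$ of the largest variable at the leading-order anomalous value $x=N(a-\overline{x})$. Both the Pareto choice and the function names are assumptions made for illustration only.

```python
import math


def pareto_density(x, mu):
    """Pareto density mu * x**(-1-mu) on x >= 1 (assumed example of power-law decay)."""
    return mu * x ** (-1.0 - mu) if x >= 1.0 else 0.0


def big_jump_weight(n, a, mu):
    """Weight n * pi(x_max) of one anomalous variable at x_max = n * (a - mean)."""
    if mu <= 1.0:
        raise ValueError("a finite mean requires mu > 1")
    mean = mu / (mu - 1.0)
    if a <= mean:
        raise ValueError("the big-jump regime requires a > mean")
    x_max = n * (a - mean)
    return n * pareto_density(x_max, mu)


def local_exponent(n, a, mu):
    """Log-slope of the weight between n and 2n; expected to equal -mu."""
    ratio = big_jump_weight(2 * n, a, mu) / big_jump_weight(n, a, mu)
    return math.log(ratio) / math.log(2.0)


if __name__ == "__main__":
    for mu in (1.5, 2.5):
        print(f"mu={mu}: measured exponent = {local_exponent(1000, 5.0, mu):.6f}")
```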

VI Asymmetry in the large deviations of the empirical moment of order $q>0$

The analysis of the previous section concerning the empirical average can be directly generalized to obtain the large deviation properties of the empirical moment of arbitrary non-integer order $q>0$ of Eq. 25.

VI.1 Standard large deviation theory for the exponential decay with exponent $\alpha\geq q$

The standard large deviation theory recalled in section III.4 for additive empirical observables can be applied to the empirical moment $G_{N}=a^{(q)}_{N}$ of arbitrary non-integer order $q>0$ of Eq. 25 with $g(x)=x^{q}$: the probability to observe a value $a_{q}\neq\overline{x^{q}}$ is exponentially small in $N$

[TABLE]

and the rate function $I(a_{q})$ corresponds to the Legendre transform (Eqs 46 and 47) of the scaled cumulant generating function of Eq. 43

[TABLE]

which is well defined for any $k\in]-\infty,+\infty[$ when $\pi(x)$ displays the exponential decay of Eq. 6 with an exponent $\alpha>q$ .

VI.2 Asymmetry in the large deviations for stretched exponential decay $0<\alpha<q$

VI.2.1 Usual large deviation form in the region smaller than typical $a_{q}\leq\overline{x^{q}}$

For the exponential decay of Eq. 6 with an exponent $0<\alpha<q$ , the scaled cumulant generating function of Eq. 92 is defined only for $k\in]-\infty,0]$ that corresponds to the region smaller than typical $a_{q}\leq\overline{x^{q}}$ , where the usual large deviation form will thus be valid

[TABLE]

The rate function $I_{q}^{-}(a_{q})$ corresponds to the Legendre transform of the function $\phi_{q}(k)$ defined for $k\leq 0$ .

VI.2.2 Unusual large deviation in the region $a_{q}>\overline{x^{q}}$

For $k>0$, which corresponds to the region bigger than the typical value $a_{q}>\overline{x^{q}}$, the function $\phi_{q}(k)$ as defined by Eq. 92 does not exist. The strategy based on the maximum alone explained in the previous section can then be considered: $(N-1)$ variables $x_{i}^{q}$ have their typical sum $(N-1)\overline{x^{q}}$, which happens with probability one for large $N$, i.e. with no probabilistic cost, while the remaining variable has to coincide with the power $q$ of the maximum $x_{N}^{max}$ of Eq. 10. This maximum should be anomalously big in order to satisfy the sum constraint

[TABLE]

So the cost of this strategy reads in terms of the probability ${\cal X}_{N}(x^{max}_{N})$ of Eqs 69 and 71 of the anomalously big value of Eq. 94

[TABLE]

that decays only as the stretched exponential of exponent $\frac{\alpha}{q}\in]0,1[$ , with the corresponding rate function

[TABLE]

VI.2.3 Discussion

So for any exponential decay with exponent $\alpha>0$ in Eq. 6, only the empirical moments of order $q\leq\alpha$ display a standard form of large deviations, while the empirical moments of order $q>\alpha$ are characterized by asymmetric large deviations. For instance, for the gamma distribution of Eq. 77 corresponding to $\alpha=1$, the large deviations of the empirical average $a$ corresponding to $q=1$ are still standard (with the rate function of Eq. 80), but all the empirical moments of order $q>1$ have asymmetric large deviations, in particular the empirical second moment corresponding to $q=2$.
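This classification can be summarized in a few lines. The sketch below assumes a tail exactly of the form $\pi(x)\sim e^{-x^{\alpha}}$ (unit prefactor in the exponent), so that the integral of $e^{kx^{q}}\pi(x)$ converges at $+\infty$ for all $k$ when $q<\alpha$, for $k<1$ when $q=\alpha$, and only for $k\leq 0$ when $q>\alpha$. The function names are illustrative.

```python
import math


def scgf_upper_bound(alpha, q):
    """Supremum of the k for which the integral of exp(k*x**q - x**alpha) converges."""
    if q < alpha:
        return math.inf
    if q == alpha:
        return 1.0
    return 0.0


def right_tail_scaling_exponent(alpha, q):
    """Power of N in the exponent for values bigger than typical.

    Returns 1 for standard large deviations (q <= alpha) and alpha/q for
    the anomalous scaling produced by a single big variable (q > alpha).
    """
    return 1.0 if q <= alpha else alpha / q


if __name__ == "__main__":
    alpha = 1.0  # exponential tail, as for the gamma distribution
    for q in (0.5, 1.0, 2.0, 3.0):
        bound = scgf_upper_bound(alpha, q)
        exponent = right_tail_scaling_exponent(alpha, q)
        print(f"q={q}: phi_q(k) exists for k below {bound}, right tail scales as N^{exponent:.3f}")
```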

VII Renormalization interpretation of large deviations rate functions

The region of typical fluctuations around the typical value (see the Introduction around Eq. 2) has been analyzed in detail from the renormalization point of view, both for the sum of random variables [56, 57, 58] and for the maximum of random variables [59, 60, 61, 62, 63]. In this section, it is thus interesting to discuss the meaning of large deviations from the renormalization perspective.

VII.1 Merging two sets of $N$ variables

To see more clearly the renormalization meaning of large deviations, it is interesting to consider the merging of two sets of $N$ random variables :

(1) the first set $[x_{i}]_{1\leq i\leq N}$ of variables is drawn with the probability distribution $\pi_{1}(x)$ and is characterized by the empirical histogram

[TABLE]

(2) the second set $[x_{i}]_{N+1\leq i\leq 2N}$ of variables is drawn with the probability distribution $\pi_{2}(x)$ and is characterized by its empirical histogram

[TABLE]

VII.2 Renormalization for the large deviations of the empirical histogram

Each of these two sets labelled by $b=1,2$ is characterized by the large deviation properties of its empirical histogram $p^{(b)}_{N}(x)$ (Eq. 28 and 29)

[TABLE]

or its exact generating function ${\cal Z}^{(b)}_{N}[\kappa(.)]$ of Eq. 30 for any finite $N$

[TABLE]

Via the merging of the data of Eqs 97 and 98, the global histogram for the $(2N)$ variables is simply the average of the two histograms

[TABLE]

with the typical value

[TABLE]

Its generating function is simply the product of the generating functions of Eq. 100

[TABLE]

i.e. the scaled cumulant generating function follows the renormalization rule

[TABLE]

In particular, when the two sets are drawn with the same probability distribution $\pi_{1}=\pi_{2}=\pi$ , the scaled cumulant generating function $\Phi[\kappa(.)]$ is exactly conserved along the RG flow.

In terms of the large deviation form of Eq. 99, the probability of the histogram of Eq. 101

[TABLE]

corresponds to the optimization of the function $\left[-S^{rel}(p^{(1)}_{N}(.)|\pi_{1}(.))-S^{rel}(p^{(2)}_{N}(.)|\pi_{2}(.))\right]$ in the exponential in the presence of the constraints that can be taken into account via Lagrange multipliers. One obtains the optimal solution

[TABLE]

and the corresponding sum of the relative entropies in the exponential

[TABLE]

coincides with the relative entropy of the histogram $p_{2N}(x)$ with respect to its typical value of Eq. 102 as it should for consistency. In particular, when the two sets are drawn with the same probability distribution $\pi_{1}=\pi_{2}=\pi$ , the optimal solution to produce an anomalous empirical histogram $p_{2N}(x)$ consists in choosing the same anomalous empirical histogram for the two subsets (Eq. 106).

VII.3 Renormalization for the large deviations of the empirical maximum

For each set $b=1,2$ , the cumulative probability distribution of the empirical maximum of Eq. 32 can be interpreted as an exact large deviation form

[TABLE]

where the rate function reads

[TABLE]

in terms of the complementary cumulative distribution function (Eq 15) associated to each distribution $\pi_{b}(x)$

[TABLE]

The empirical maximum of the $(2N)$ variables is of course the maximum of the two maximal values associated to the two sets of Eqs 97 and 98

[TABLE]

So the corresponding cumulative distribution

[TABLE]

is written exactly in a large deviation form with the rate function

[TABLE]

When the two sets are drawn with the same probability distribution $\pi_{1}=\pi_{2}=\pi$ , one obtains that the rate function $J(x)$ is exactly conserved along the RG flow.
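The rule for the maximum can be verified exactly on a small example. The sketch below assumes two exponential distributions with complementary cumulative distributions $C_{b}(x)=e^{-\lambda_{b}x}$ and takes the rate function to be $J_{b}(x)=-\ln(1-C_{b}(x))$, so that the cumulative distribution of the maximum of $N$ variables is $e^{-NJ_{b}(x)}$. The merged rate function is then the average $[J_{1}(x)+J_{2}(x)]/2$. The rates $\lambda_{b}$ and the function names are illustrative assumptions.

```python
import math


def rate_max(x, rate):
    """J(x) = -ln(1 - C(x)) with C(x) = exp(-rate*x) (assumed exponential example)."""
    return -math.log1p(-math.exp(-rate * x))


def cdf_max(x, n, rate):
    """Exact cumulative distribution of the maximum of n exponential variables."""
    return (1.0 - math.exp(-rate * x)) ** n


def merged_rate(x, rate1, rate2):
    """Rate function of the maximum of the merged set of 2n variables."""
    return 0.5 * (rate_max(x, rate1) + rate_max(x, rate2))


if __name__ == "__main__":
    n, x = 50, 3.0
    exact = cdf_max(x, n, 1.0) * cdf_max(x, n, 0.5)
    from_rate = math.exp(-2 * n * merged_rate(x, 1.0, 0.5))
    print(f"product of the two cumulative distributions: {exact:.12e}")
    print(f"exp(-2N * merged rate function):             {from_rate:.12e}")
```

When the two rates coincide, `merged_rate` reduces to `rate_max`, which is the conservation of $J(x)$ along the flow.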

VII.4 Renormalization for the large deviations of the empirical average

VII.4.1 Case of standard large deviations

In the case of standard large deviations, each set $b=1,2$ is described by the large deviation form of Eq. 75 for its empirical average $a_{b}$

[TABLE]

where the rate function $I_{b}(a_{b})$ is the Legendre transform of the scaled cumulant generating function of Eq. 76

[TABLE]

involved in the generating function

[TABLE]

Via the merging of the data of Eqs 97 and 98, the empirical average of the $(2N)$ variables is simply the average of the two empirical averages of the two sets $b=1,2$

[TABLE]

Its generating function is simply the product of the generating functions of Eq. 116

[TABLE]

so the renormalization rule for the scaled cumulant generating function is simply

[TABLE]

In particular, when the two sets are drawn with the same probability distribution $\pi_{1}=\pi_{2}=\pi$ , the scaled cumulant generating function $\phi(k)$ is exactly conserved along the RG flow.
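As a numerical illustration, the sketch below assumes that the renormalization rule takes the form $\phi_{merged}(k)=[\phi_{1}(k)+\phi_{2}(k)]/2$ and tests it on two gamma distributions of parameters $\nu_{1}$ and $\nu_{2}$: since the sum of the $2N$ variables is then gamma distributed with parameter $N(\nu_{1}+\nu_{2})$, the merged function should coincide with that of a gamma distribution of parameter $(\nu_{1}+\nu_{2})/2$. The scaled cumulant generating function is computed by a plain midpoint quadrature; the function names, the cutoff `x_max` and the number of steps are arbitrary choices.

```python
import math


def scgf_numeric(k, nu, x_max=100.0, steps=100_000):
    """ln of the integral of exp(k*x) * gamma_nu(x), by midpoint quadrature.

    Assumes nu >= 1 (no singularity at x = 0) and k < 1 (convergence at infinity).
    """
    if k >= 1.0:
        raise ValueError("the integral diverges for k >= 1")
    h = x_max / steps
    log_norm = math.lgamma(nu)
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.exp((nu - 1.0) * math.log(x) - (1.0 - k) * x - log_norm)
    return math.log(total * h)


def merged_scgf(k, nu1, nu2):
    """Assumed renormalization rule: average of the two functions."""
    return 0.5 * (scgf_numeric(k, nu1) + scgf_numeric(k, nu2))


if __name__ == "__main__":
    nu1, nu2 = 1.0, 3.0
    for k in (-2.0, 0.3):
        expected = -0.5 * (nu1 + nu2) * math.log(1.0 - k)
        print(f"k={k}: merged={merged_scgf(k, nu1, nu2):.6f} expected={expected:.6f}")
```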

In terms of the large deviation form of Eq. 114, the probability of the empirical average reads for this case $\pi_{1}=\pi_{2}=\pi$

[TABLE]

The saddle-point evaluation of this integral requires finding the maximum of the function in the exponential

[TABLE]

The vanishing of the first derivative

[TABLE]

gives the symmetric solution $a_{1}=a=2a-a_{1}=a_{2}$ which is indeed a maximum if the second derivative is negative

[TABLE]

For instance for the gamma distribution of parameter $\nu$ of Eq. 77 the second derivative of the rate function of Eq. 80

$$I^{\prime\prime}(a)=\frac{\nu}{a^{2}}>0 \qquad (124)$$

satisfies this condition.
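The symmetric saddle point can be located by brute force. The sketch below assumes the gamma rate function $I(a)=a-\nu-\nu\ln(a/\nu)$ and scans $a_{1}$ on a regular grid of $]0,2a[$ to maximize $-I(a_{1})-I(2a-a_{1})$. The grid size and the function names are arbitrary choices.

```python
import math


def rate_gamma(a, nu):
    """Assumed gamma rate function I(a) = a - nu - nu*ln(a/nu)."""
    return a - nu - nu * math.log(a / nu)


def best_split(a, nu, points=2000):
    """Maximize -I(a1) - I(2a - a1) over a grid of a1 values in ]0, 2a[."""
    best_a1, best_value = None, -math.inf
    for i in range(1, points):
        a1 = 2.0 * a * i / points
        value = -rate_gamma(a1, nu) - rate_gamma(2.0 * a - a1, nu)
        if value > best_value:
            best_a1, best_value = a1, value
    return best_a1, best_value


if __name__ == "__main__":
    nu = 2.0
    for a in (0.5, 6.0):
        a1, value = best_split(a, nu)
        print(f"a={a}: optimal a1={a1:.4f}, exponent={value:.6f}, -2*I(a)={-2 * rate_gamma(a, nu):.6f}")
```

Because the rate function is strictly convex, the optimum sits at $a_{1}=a_{2}=a$ and the exponent $-2I(a)$ multiplied by $N$ reproduces $-(2N)I(a)$ for the merged set.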

VII.4.2 Case of large deviations with asymmetric scaling

It is now interesting to compare the above discussion with the case of the large deviations for stretched exponential decay $0<\alpha<1$ that display the asymmetric scaling (Eqs 84 and 87)

[TABLE]

Then Eq. 120 is replaced by the sum of four possible contributions of various orders with respect to $N$

[TABLE]

The fourth contribution of order $e^{-N^{\alpha}}$ will be the leading contribution whenever the domain of integration for $a_{1}$ is not empty, i.e. in the region $a>\overline{x}$ : then the saddle-point evaluation requires the maximization of the function involving the rate function $I_{+}(a)=(a-\overline{x})^{\alpha}$ of Eq. 88

[TABLE]

However the symmetric solution $a_{1}=a=2a-a_{1}=a_{2}$ is a minimum here as a consequence of the sign of the second derivative for any $0<\alpha<1$

[TABLE]

The maximization of Eq. 127 occurs instead at the boundaries $a_{1}=\overline{x}$ and $a_{2}=2a-\overline{x}$ (or vice versa) and one obtains the leading contribution in the region $a>\overline{x}$

[TABLE]

as it should for consistency with Eq. 125 in the region $a>\overline{x}$ .
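The boundary maximum can be checked in the same brute-force way. The sketch below assumes $I_{+}(a)=(a-\overline{x})^{\alpha}$ and scans $a_{1}$ over $[\overline{x},2a-\overline{x}]$ to maximize $-I_{+}(a_{1})-I_{+}(2a-a_{1})$. The numerical values of $a$, $\overline{x}$ and $\alpha$, as well as the function names, are illustrative.

```python
import math


def rate_plus(a, mean, alpha):
    """Assumed anomalous rate function I_+(a) = (a - mean)**alpha for a >= mean."""
    return max(a - mean, 0.0) ** alpha


def best_split_asymmetric(a, mean, alpha, points=2000):
    """Maximize -I_+(a1) - I_+(2a - a1) over a1 in [mean, 2a - mean]."""
    low, high = mean, 2.0 * a - mean
    best_a1, best_value = None, -math.inf
    for i in range(points + 1):
        a1 = low + (high - low) * i / points
        value = -rate_plus(a1, mean, alpha) - rate_plus(2.0 * a - a1, mean, alpha)
        if value > best_value:  # strict comparison keeps the first of the two tied boundaries
            best_a1, best_value = a1, value
    return best_a1, best_value


if __name__ == "__main__":
    a, mean, alpha = 3.0, 1.0, 0.5
    a1, value = best_split_asymmetric(a, mean, alpha)
    symmetric = -2.0 * rate_plus(a, mean, alpha)
    print(f"optimal a1={a1}, exponent={value:.6f}, symmetric point gives {symmetric:.6f}")
```

The optimum is reached when one set keeps its typical average while the other carries the whole excess, with exponent $-(2(a-\overline{x}))^{\alpha}$. Multiplied by $N^{\alpha}$, this equals $-(2N)^{\alpha}(a-\overline{x})^{\alpha}$, i.e. the same anomalous form for the merged set of $2N$ variables.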

VIII Conclusion

In this paper, we have revisited the empirical observables based on independent random variables, namely the empirical maximum, the empirical average, the empirical non-integer moments or other additive empirical observables, in order to describe the cases where asymmetric large deviations occur. We have stressed the analogy with equilibrium statistical mechanics : the Sanov theorem for the large deviations of the empirical histogram that involves as rate function the relative entropy with respect to the true probability distribution has been taken as the unifying starting point. The various empirical observables have been then analyzed by optimizing this relative entropy in the presence of the appropriate constraints.

Finally, we have discussed the physical meaning of large deviation rate functions from the renormalization perspective. While most renormalization procedures have been studied in the past at the level of their typical fluctuations, it will thus be interesting in the future to re-analyze them at the level of their large deviations, as in the recent study [64] concerning disordered directed polymers where asymmetric large deviations are known to occur.

Bibliography

[1] J.M. Luck, "Systèmes désordonnés unidimensionnels", Aléa Saclay (1992).
[2] A. Crisanti, G. Paladin, A. Vulpiani, "Products of random matrices in statistical physics", Springer Verlag (1993).
[3] E.J. Gumbel, "Statistics of extremes" (Columbia University Press, NY 1958).
[4] J. Galambos, "The asymptotic theory of extreme order statistics" (Krieger, Malabar, FL 1987).
[5] J. P. Bouchaud and M. Mézard, J. Phys. A: Math. Gen. 30, 7997 (1997).
[6] M. Clusel and E. Bertin, Int. J. Mod. Phys. B 22, 3311 (2008).
[7] J.Y. Fortin and M. Clusel, J. Phys. A: Math. Theor. 48, 183001 (2015).
[8] Y. Oono, Progress of Theoretical Physics Supplement 99, 165 (1989).
