Asymmetric scaling in large deviations for rare values bigger or smaller than the typical value
Cécile Monthus

TL;DR
This paper investigates asymmetric large deviations in empirical observables derived from independent random variables, revealing how different scalings occur for rare events above or below typical values, with insights from Sanov's theorem and renormalization.
Contribution
It unifies the analysis of asymmetric large deviations for various empirical observables using Sanov's theorem and explores their physical interpretation through renormalization.
Findings
Asymmetric large deviations occur for empirical maxima, averages, and moments.
Sanov's theorem provides a unifying framework for analyzing these deviations.
The physical meaning of rate functions is discussed via renormalization.
Asymmetric scaling in large deviations
for rare values bigger or smaller than the typical value
Cécile Monthus
Institut de Physique Théorique, Université Paris Saclay, CNRS, CEA, 91191 Gif-sur-Yvette, France
Abstract
In various disordered systems or non-equilibrium dynamical models, the large deviations of some observables have been found to display different scalings for rare values bigger or smaller than the typical value. In the present paper, we revisit the simpler observables based on independent random variables, namely the empirical maximum, the empirical average, the empirical non-integer moments or other additive empirical observables, in order to describe the cases where asymmetric large deviations already occur. The unifying starting point to analyze the large deviations of these various empirical observables is given by the Sanov theorem for the large deviations of the empirical histogram : the rate function corresponds to the relative entropy with respect to the true probability distribution and it can be optimized in the presence of the appropriate constraints. Finally, the physical meaning of large deviations rate functions is discussed from the renormalization perspective.
I Introduction
In macroscopic systems with a large number $N$ of degrees of freedom, it is essential to understand how physical observables fluctuate as a function of the size $N$. When $a$ is some intensive variable, one usually distinguishes the following three levels of description for its probability distribution $P_N(a)$:
(i) in the thermodynamic limit $N \to +\infty$, the probability distribution becomes concentrated on the typical value $a^{typ}$ that does not depend on $N$
$$ P_N(a) \mathop{\simeq}_{N \to +\infty} \delta(a - a^{typ}) \tag{1} $$
This statement is the analog of the law of large numbers for the empirical average of independent random variables. Another well-known example is the typical Lyapunov exponent for product of random matrices [1, 2].
(ii) zooming in on Eq. 1 reveals the order of the small typical fluctuations around the typical value $a^{typ}$. The appropriate scale $\lambda_N$ grows with $N$, for instance like a power-law $N^{\omega}$ with some exponent $\omega > 0$ or like a power of $\ln N$
$$ a = a^{typ} + \frac{u}{\lambda_N} \tag{2} $$
and the rescaled variable $u$ is distributed with some universal limiting distribution $G(u)$. This statement is the analog of the Central Limit Theorem with the scale $\lambda_N = \sqrt{N}$, where $G(u)$ is the Gaussian distribution for the universality class of probability distributions whose first two moments are finite (if they are not finite, one obtains the other universality classes involving Lévy stable laws). Another famous example is given by the three universality classes Gumbel-Fréchet-Weibull of Extreme Value Statistics [3, 4], with many applications in various physics domains (see the reviews [5, 6, 7] and references therein).
(iii) in the field of large deviations, one is interested instead in evaluating how rare it is for large $N$ to observe some finite value $a$ different from $a^{typ}$. The standard theory of large deviations is based on the exponential decay [8, 9, 10]
$$ P_N(a) \mathop{\propto}_{N \to +\infty} e^{- N I(a)} \tag{3} $$
where the rate function $I(a)$ is positive and vanishes only for the typical value $a^{typ}$ of Eq. 1
$$ I(a) \geq 0 \quad \text{with} \quad I(a^{typ}) = 0 \tag{4} $$
While the region (ii) of universal typical fluctuations has been traditionally the main focus of studies for various physical observables, the theory of large deviations (iii) is nowadays considered as the unifying language for the statistical physics of equilibrium, non-equilibrium and dynamical systems (see the reviews [8, 9, 10] and references therein). In particular, the large deviations with respect to the large time limit of dynamical trajectories has produced an appropriate statistical physics approach for various Markovian processes (see the reviews [11, 12, 13, 14, 15, 16, 17] and the PhD Theses [18, 19, 20, 21] and the HDR Thesis [22]).
However, the recent intense activity on large deviations in the field of random matrices has shown that the maximal eigenvalue [23, 24, 25, 26, 27, 28] and many other observables involving the eigenvalues [29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39] display asymmetric scaling in large deviations: the probabilities to observe values bigger than typical and values smaller than typical are governed by two different scalings (for instance two different power-laws $N^{\gamma_{\pm}}$) and two rate functions $I_{\pm}(a)$
$$ P_N(a) \mathop{\propto}_{N \to +\infty} \begin{cases} e^{- N^{\gamma_+} I_+(a)} & \text{for } a > a^{typ} \\ e^{- N^{\gamma_-} I_-(a)} & \text{for } a < a^{typ} \end{cases} \tag{5} $$
instead of the standard form of Eq. 3. For the maximal eigenvalue [23, 24, 25, 26, 27, 28], the physical interpretation of this asymmetry is that to push the maximal eigenvalue inside the Wigner sea, one needs to reorganize all the other eigenvalues, whereas to pull the maximal eigenvalue outside the Wigner sea, one may leave the other eigenvalues unchanged. Via mapping between models belonging to the Kardar-Parisi-Zhang universality class (see the list in the review [27] and references therein), these asymmetric large deviations properties for the biggest eigenvalue of some random matrices ensembles can be rephrased in many other frameworks, in particular :
(a) for the Asymmetric Exclusion process, which is one of the most studied models in the field of the non-equilibrium dynamics of interacting particles (see the reviews [11, 16, 17] and references therein), the interpretation of the asymmetric large deviations is that to slow down the traffic, it is sufficient to slow down a single particle, whereas to speed up the traffic, one needs to speed up all particles [40].
(b) for the Directed Polymer in random medium in dimension $1+1$, which is one of the simplest disordered models displaying a low-temperature glassy frozen phase (see the review [41] and references therein), the interpretation is that an anomalously good ground state energy requires only anomalously good on-site energies along the polymer, while an ’anomalously bad’ ground state energy requires bad on-site energies in the whole sample.
These examples and their very clear physical meanings show that the asymmetric large deviations of Eq. 5 are likely to occur in many other problems in the fields of non-equilibrium dynamics or disordered systems, while they are not considered in the standard theory of large deviations [8, 9, 10] based on Eq. 3. As a consequence, it seems useful to revisit simpler observables based on independent random variables where asymmetric large deviations have been found to occur, in particular for the empirical maximum [42, 43], for the empirical average [44, 45, 46, 47], and in joint linear statistics [48]. Since these problems have already been studied in detail in these references by exact methods, the goal of the present paper is to give a unifying perspective based on the large deviation properties of the empirical histogram in the presence of constraints corresponding to the observables under study. This point of view also makes the link with the studies of large deviations in the field of random matrices [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39], where the Coulomb gas technique is based on the large deviations of the empirical histogram of eigenvalues, in the presence of constraints corresponding to the observables under study. The main difference is that the large deviations of the empirical histogram are governed by the Coulomb interaction energy in the case of random matrix eigenvalues, while they are governed by the relative entropy of the Sanov theorem in the case of independent variables [8, 9, 10]. This large deviation framework also makes the link with the Gibbs theory of ensembles in equilibrium statistical physics [8, 9, 10] and thus explains why it is natural to expect the possibility of phase transitions in large deviation rate functions (see the recent review [51] and references therein).
The paper is organized as follows. In section II, we recall how the empirical histogram of independent random variables allows one to reconstruct interesting observables like the empirical maximum, the empirical average, or other additive empirical observables, with the consequences for typical values. In section III, the large deviations of the empirical histogram are presented as the unifying starting point to analyze the large deviations of empirical observables. In section IV, the asymmetry in the large deviations of the empirical maximum is analyzed on various scales. In section V, the asymmetry in the large deviations of the empirical average is described for the case of stretched exponential decay or power-law decay of the initial distribution. In section VI, the generalization to the large deviations of arbitrary non-integer empirical moments is discussed. In section VII, these large deviation properties are analyzed from the renormalization perspective. Our conclusions are summarized in section VIII.
II Empirical observables for independent random variables
II.1 Notations
Our main goal is to analyze the asymmetry in large deviation properties that may occur for simple observables involving $N$ independent random variables $x_i$ drawn with some probability distribution $p(x)$. To simplify the discussion, we will focus on the case of positive variables $x \in [0,+\infty[$, where the decay of the probability distribution $p(x)$ for large $x$ is:
(i) either an exponential decay with some exponent $\alpha > 0$
$$ p(x) \mathop{\simeq}_{x \to +\infty} K x^{\nu-1} e^{- x^{\alpha}} \tag{6} $$
with possibly some power-law prefactor if $\nu \neq 1$, while $K$ is a constant amplitude.
(ii) or a power-law decay with some exponent $\mu > 2$ (in order to ensure the existence of the first two moments)
$$ p(x) \mathop{\simeq}_{x \to +\infty} \frac{K}{x^{1+\mu}} \tag{7} $$
But of course these assumptions are not restrictive, and if one is interested in other cases, one can easily adapt the methods described below by considering the various possible tail behaviors for $p(x)$.
II.2 Empirical histogram
If one is not interested in the order of appearance of the variables (otherwise see the pedagogical introduction [52] and references therein), all the information is contained in the empirical histogram
$$ \hat p_N(x) \equiv \frac{1}{N} \sum_{i=1}^{N} \delta(x - x_i) \tag{8} $$
Its typical value is of course the ’true’ probability distribution $p(x)$
$$ \hat p^{typ}(x) = p(x) \tag{9} $$
The large deviations around this typical value will be discussed in section III. Let us first recall how the empirical histogram allows one to reconstruct the usual empirical observables of interest.
II.3 Empirical maximum
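The information on the empirical maximum
$$ x^{max}_N \equiv \max_{1 \leq i \leq N} x_i \tag{10} $$
is contained in the empirical histogram $\hat p_N(x)$ of Eq. 8 as follows: the empirical maximum corresponds to the value where the empirical number of variables bigger than $x$
$$ \hat n_N(x) \equiv N \int_x^{+\infty} dy \, \hat p_N(y) \tag{11} $$
jumps from the value $0$ for $x$ bigger than $x^{max}_N$ to the value $1$ for $x$ slightly smaller than $x^{max}_N$
$$ \hat n_N(x^{max}_N + \epsilon) = 0 \quad \text{and} \quad \hat n_N(x^{max}_N - \epsilon) = 1 \tag{12} $$
The typical value of the empirical histogram of Eq. 9 yields that the typical value of the empirical maximum of Eq. 10
$$ x^{typ}_N \equiv \left( x^{max}_N \right)^{typ} \tag{13} $$
is given by the typical position of the jump of Eq. 12
$$ 1 = N \int_{x^{typ}_N}^{+\infty} dy \, p(y) \equiv N \bar F(x^{typ}_N) \tag{14} $$
where we have introduced the complementary cumulative distribution function that measures the integrated tail above the threshold $x$
$$ \bar F(x) \equiv \int_x^{+\infty} dy \, p(y) \tag{15} $$
The large deviation properties of the ratio
$$ r \equiv \frac{x^{max}_N}{x^{typ}_N} \tag{16} $$
of typical value unity
$$ r^{typ} = 1 \tag{17} $$
will be discussed in section IV. To be self-contained, let us now recall the behavior of the typical value $x^{typ}_N$ as a function of $N$ for the two types of decay under study here.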
II.3.1 Typical value of the maximum for the exponential decay
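For the asymptotic behavior $p(x) \simeq K x^{\nu-1} e^{-x^{\alpha}}$ of Eq. 6, the asymptotic behavior of its primitive $\bar F(x)$ of Eq. 15 reads
$$ \bar F(x) \mathop{\simeq}_{x \to +\infty} \frac{K}{\alpha} x^{\nu-\alpha} e^{- x^{\alpha}} \tag{18} $$
Then Eq. 14 determining the typical value $x^{typ}_N$ of the empirical maximum becomes for large $N$
$$ \frac{1}{N} \simeq \frac{K}{\alpha} \left( x^{typ}_N \right)^{\nu-\alpha} e^{- \left( x^{typ}_N \right)^{\alpha}} \tag{19} $$
The inversion yields at leading order the well-known logarithmic behavior [3, 4]
$$ x^{typ}_N \mathop{\simeq}_{N \to +\infty} \left( \ln N \right)^{\frac{1}{\alpha}} \tag{20} $$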
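II.3.2 Typical value of the maximum for the power-law decay
For the power-law decay $p(x) \simeq K / x^{1+\mu}$ of Eq. 7, the asymptotic behavior of its primitive $\bar F(x)$ of Eq. 15
$$ \bar F(x) \mathop{\simeq}_{x \to +\infty} \frac{K}{\mu \, x^{\mu}} \tag{21} $$
yields that the solution of Eq. 14 follows the well-known power-law [3, 4]
$$ x^{typ}_N \mathop{\simeq}_{N \to +\infty} \left( \frac{K N}{\mu} \right)^{\frac{1}{\mu}} \propto N^{\frac{1}{\mu}} \tag{22} $$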
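As a quick numerical illustration, the condition that on average one variable out of $N$ lies above the typical maximum, $N \bar F(x^{typ}_N) = 1$, can be solved by bisection. The sketch below is only an illustration under stated assumptions: the two example tails (a pure exponential $\bar F(x) = e^{-x}$ and a Pareto tail $\bar F(x) = x^{-\mu}$ for $x \geq 1$ with $\mu = 3$) and the helper names are chosen here and are not taken from the paper.

```python
import math

def typical_maximum(tail, n, lo=0.0, hi=1e12, iters=200):
    """Solve n * tail(x) = 1 by bisection, for a non-increasing tail function."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if n * tail(mid) > 1.0:
            lo = mid          # more than one variable is still expected above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exp_tail(x):
    # Assumed example: complementary cumulative distribution of p(x) = exp(-x)
    return math.exp(-x)

def pareto_tail(x, mu=3.0):
    # Assumed example: Pareto tail with exponent mu on [1, +inf[
    return 1.0 if x < 1.0 else x ** (-mu)

for n in (10**2, 10**4, 10**6):
    # Columns: N, exponential case vs ln N, power-law case vs N^(1/mu)
    print(n, typical_maximum(exp_tail, n), math.log(n),
          typical_maximum(pareto_tail, n), n ** (1.0 / 3.0))
```

The exponential example grows only logarithmically with $N$, while the power-law example grows as $N^{1/\mu}$, in line with the two behaviors recalled above.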
II.4 Empirical additive observables
The empirical histogram $\hat p_N(x)$ of Eq. 8 allows one to reconstruct any additive observable involving some function $f(x)$
$$ A_N \equiv \frac{1}{N} \sum_{i=1}^{N} f(x_i) = \int dx \, f(x) \, \hat p_N(x) \tag{23} $$
The most studied observable in the whole history of probability is of course the empirical average, corresponding to $f(x) = x$
$$ a_N \equiv \frac{1}{N} \sum_{i=1}^{N} x_i = \int dx \, x \, \hat p_N(x) \tag{24} $$
The empirical moments of arbitrary non-integer order $q$
$$ m^{(q)}_N \equiv \frac{1}{N} \sum_{i=1}^{N} x_i^{q} = \int dx \, x^{q} \, \hat p_N(x) \tag{25} $$
have also been considered [24, 6] in order to interpolate between the case $q = 1$ of the empirical average of Eq. 24 and the empirical maximum of Eq. 10, which should dominate the empirical moment of Eq. 25 for large $q$. Other examples include the exponential case considered in Ref. [53] or the logarithmic case $f(x) = \ln x$.
The typical value of the empirical histogram of Eq. 9 yields that the typical values of the additive observables of Eq. 23 are simply
$$ A^{typ} = \int dx \, f(x) \, p(x) \tag{26} $$
In particular the typical value of the empirical average of Eq. 24 corresponds to the first moment of $p(x)$
$$ a^{typ} = \int dx \, x \, p(x) \tag{27} $$
The possibility of asymmetric large deviation properties around this typical value will be discussed in section V.
III Analysis based on the large deviations of the empirical histogram
III.1 Reminder on the Sanov theorem involving the relative entropy
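The large deviations of the empirical histogram $\hat p_N(x)$ of Eq. 8 around its typical value $p(x)$ of Eq. 9 are described by the Sanov theorem (see the reviews [8, 9, 10] and the pedagogical introduction [52])
$$ \mathcal{P}_N[\hat p(.)] \mathop{\propto}_{N \to +\infty} \delta\left( \int dx \, \hat p(x) - 1 \right) e^{- N S^{rel}[\hat p(.) | p(.)]} \tag{28} $$
The delta function imposes the normalization constraint of the empirical histogram $\hat p(x)$. The exponentially small term in $N$ involves the relative entropy of the empirical histogram $\hat p(x)$ with respect to the true probability distribution $p(x)$
$$ S^{rel}[\hat p(.) | p(.)] \equiv \int dx \, \hat p(x) \ln\left( \frac{\hat p(x)}{p(x)} \right) \tag{29} $$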
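The content of the Sanov theorem can be checked on a discrete toy example, where the exact probability of an empirical histogram is a multinomial probability. The sketch below is an illustration under assumptions chosen here (a three-letter alphabet, arbitrary values for the true distribution `p` and for the anomalous histogram `q`, hypothetical function names): it compares $-\frac{1}{N} \ln$ of the exact probability with the relative entropy.

```python
import math

def relative_entropy(q, p):
    """Relative entropy sum_x q(x) ln(q(x)/p(x)), with the convention 0 ln 0 = 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def log_multinomial_probability(counts, p):
    """Exact log-probability of the given counts in N = sum(counts) independent draws."""
    n = sum(counts)
    log_prob = math.lgamma(n + 1)
    for c, pi in zip(counts, p):
        log_prob += c * math.log(pi) - math.lgamma(c + 1)
    return log_prob

p = [0.5, 0.3, 0.2]   # 'true' distribution (arbitrary example)
q = [0.2, 0.3, 0.5]   # anomalous empirical histogram (arbitrary example)

for n in (10, 100, 1000, 10000):
    counts = [round(n * qi) for qi in q]
    # The first number should approach the second one as N grows
    print(n, -log_multinomial_probability(counts, p) / n, relative_entropy(q, p))
```

The difference between the two columns decays like $(\ln N)/N$, which is the usual sub-exponential prefactor left out of the large deviation form.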
III.2 Exact generating function of the empirical histogram for finite
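Among the various derivations of Eq. 28, one is based on the exact generating function of the empirical histogram for any finite $N$
$$ \left\langle e^{N \int dx \, \omega(x) \hat p_N(x)} \right\rangle = \left\langle e^{\sum_{i=1}^{N} \omega(x_i)} \right\rangle = \left[ \int dx \, p(x) e^{\omega(x)} \right]^{N} \equiv e^{N \phi[\omega(.)]} \tag{30} $$
where the scaled cumulant generating function
$$ \phi[\omega(.)] = \ln\left( \int dx \, p(x) e^{\omega(x)} \right) \tag{31} $$
is related to the relative entropy of Eq. 29 via the appropriate Legendre transform (see [52] for more details on the Legendre transforms in the two directions).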
III.3 Constraint to reproduce the cumulative distribution of the maximum
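The probability distribution of the empirical maximum $x^{max}_N$ of Eq. 10 is well known to be [3, 4]
$$ \text{Prob}\left( x^{max}_N \leq x \right) = \left[ 1 - \bar F(x) \right]^{N} \quad \text{i.e.} \quad P^{max}_N(x) = N p(x) \left[ 1 - \bar F(x) \right]^{N-1} \tag{32} $$
in terms of the complementary cumulative function $\bar F(x)$ introduced in Eq. 15.
Here it is instructive to mention how it can be reproduced from the large deviations of the empirical histogram of Eq. 28: the cumulative probability of the maximum amounts to replacing the normalisation constraint by the two constraints
$$ \int_0^{x} dy \, \hat p(y) = 1 \quad \text{and} \quad \int_x^{+\infty} dy \, \hat p(y) = 0 \tag{33} $$
leading to
$$ \text{Prob}\left( x^{max}_N \leq x \right) \propto \int \mathcal{D}\hat p(.) \, \delta\left( \int_0^{x} dy \, \hat p(y) - 1 \right) \delta\left( \int_x^{+\infty} dy \, \hat p(y) \right) e^{- N \int_0^{x} dy \, \hat p(y) \ln\left( \frac{\hat p(y)}{p(y)} \right)} \tag{34} $$
Introducing the Lagrange multiplier $\lambda$, one needs to optimize the functional
$$ \mathcal{L}[\hat p(.)] = - \int_0^{x} dy \, \hat p(y) \ln\left( \frac{\hat p(y)}{p(y)} \right) + \lambda \left( \int_0^{x} dy \, \hat p(y) - 1 \right) \tag{35} $$
over the empirical histogram
$$ 0 = \frac{\delta \mathcal{L}}{\delta \hat p(y)} = - \ln\left( \frac{\hat p(y)}{p(y)} \right) - 1 + \lambda \quad \text{for} \quad 0 \leq y \leq x \tag{36} $$
The optimal solution is thus simply proportional to the true distribution on $[0,x]$ (while it vanishes for $y > x$)
$$ \hat p^{opt}(y) = e^{\lambda - 1} \, p(y) \, \theta(x - y) \tag{37} $$
where the normalization constraint determines the Lagrange multiplier
$$ e^{1 - \lambda} = \int_0^{x} dy \, p(y) = 1 - \bar F(x) \tag{38} $$
Plugging the corresponding optimal value of the functional of Eq. 35
$$ \mathcal{L}^{opt} = - (\lambda - 1) = \ln\left[ 1 - \bar F(x) \right] \tag{39} $$
into Eq. 34
$$ \text{Prob}\left( x^{max}_N \leq x \right) \propto e^{N \mathcal{L}^{opt}} = \left[ 1 - \bar F(x) \right]^{N} \tag{40} $$
thus allows one to recover the exact cumulative distribution of Eq. 32. In this derivation, Eq. 34 thus corresponds to the entropic cost for the emptiness of the region $]x, +\infty[$. Section IV will be devoted to the asymmetric large deviation properties of this distribution.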
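The statement that the cumulative distribution of the maximum is the entropic cost of emptying the region above the threshold can be verified numerically. The sketch below is an illustration under assumptions made here: it takes the example density $p(y) = e^{-y}$ on $[0,+\infty[$, builds the histogram proportional to $p$ on $[0,x]$, and integrates its relative entropy with respect to $p$ with a midpoint rule (function names are hypothetical).

```python
import math

def p(y):
    return math.exp(-y)       # assumed example density on [0, +inf[

def tail(x):
    return math.exp(-x)       # its complementary cumulative distribution

def entropic_cost(x, steps=100000):
    """Relative entropy w.r.t. p of the truncated histogram p(y)/(1 - tail(x)) on [0, x]."""
    norm = 1.0 - tail(x)
    h = x / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h                 # midpoint rule
        q = p(y) / norm                   # histogram with no variable above x
        total += q * math.log(q / p(y)) * h
    return total

x, n = 3.0, 50
cost = entropic_cost(x)
print(cost, -math.log(1.0 - tail(x)))             # the two numbers should agree
print(math.exp(-n * cost), (1.0 - tail(x)) ** n)  # cumulative distribution of the maximum
```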
III.4 Standard large deviations for additive empirical observables
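The probability distribution of the additive observable $A_N$ of Eq. 23
$$ P_N(A) \equiv \left\langle \delta\left( A - \frac{1}{N} \sum_{i=1}^{N} f(x_i) \right) \right\rangle \tag{41} $$
can be directly characterized by its exact generating function for finite $N$ by applying Eq. 30 to the case $\omega(x) = k f(x)$
$$ \int dA \, P_N(A) \, e^{N k A} = \left\langle e^{k \sum_{i=1}^{N} f(x_i)} \right\rangle = e^{N \phi(k)} \tag{42} $$
with the scaled cumulant generating function
$$ \phi(k) = \ln\left( \int dx \, p(x) e^{k f(x)} \right) \tag{43} $$
The alternative evaluation of Eq. 42 based on the standard large deviation form for the probability
$$ P_N(A) \mathop{\propto}_{N \to +\infty} e^{- N I(A)} \tag{44} $$
yields
$$ e^{N \phi(k)} \mathop{\propto}_{N \to +\infty} \int dA \, e^{N \left[ k A - I(A) \right]} \tag{45} $$
via the saddle-point method for the integral over $A$, so that $\phi(k)$ is the Legendre transform of the rate function $I(A)$
$$ \phi(k) = \max_{A} \left[ k A - I(A) \right] \tag{46} $$
with the reciprocal Legendre transform
$$ I(A) = \max_{k} \left[ k A - \phi(k) \right] \tag{47} $$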
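Another way to understand the physical meaning of these Legendre transforms consists in evaluating the probability $P_N(A)$ via the addition of the sum constraint in the large deviations of the empirical histogram of Eq. 28:
$$ P_N(A) \propto \int \mathcal{D}\hat p(.) \, \delta\left( \int dx \, \hat p(x) - 1 \right) \delta\left( \int dx \, f(x) \hat p(x) - A \right) e^{- N \int dx \, \hat p(x) \ln\left( \frac{\hat p(x)}{p(x)} \right)} \tag{48} $$
Introducing the two Lagrange multipliers $\lambda$ and $k$, one needs to optimize the functional
$$ \mathcal{L}[\hat p(.)] = - \int dx \, \hat p(x) \ln\left( \frac{\hat p(x)}{p(x)} \right) + \lambda \left( \int dx \, \hat p(x) - 1 \right) + k \left( \int dx \, f(x) \hat p(x) - A \right) \tag{49} $$
over the empirical histogram
$$ 0 = \frac{\delta \mathcal{L}}{\delta \hat p(x)} = - \ln\left( \frac{\hat p(x)}{p(x)} \right) - 1 + \lambda + k f(x) \tag{50} $$
The optimal solution reads
$$ \hat p^{opt}(x) = p(x) \, e^{\lambda - 1 + k f(x)} \tag{51} $$
where the Lagrange multipliers are fixed by the two constraints
$$ 1 = e^{\lambda - 1} \int dx \, p(x) e^{k f(x)} \quad \text{and} \quad A = e^{\lambda - 1} \int dx \, f(x) p(x) e^{k f(x)} \tag{52} $$
i.e. in terms of the function $\phi(k)$ introduced in Eq. 43
$$ \lambda - 1 = - \phi(k) \quad \text{and} \quad A = \phi'(k) \tag{53} $$
The corresponding optimal value of the functional of Eq. 49 using Eq. 53 thus involves the Legendre transform of $\phi(k)$ (Eqs 46 and 47)
$$ \mathcal{L}^{opt} = - \int dx \, \hat p^{opt}(x) \left[ \lambda - 1 + k f(x) \right] = \phi(k) - k A = - I(A) \tag{54} $$
as it should to recover via Eq. 48
$$ P_N(A) \mathop{\propto}_{N \to +\infty} e^{N \mathcal{L}^{opt}} = e^{- N I(A)} \tag{55} $$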
Here the analogy with the Gibbs theory of ensembles in equilibrium statistical physics is obvious: the effective distribution of Eq. 51 for an individual random variable is the analog of the Boltzmann distribution in the canonical ensemble, where the Lagrange multiplier $k$ is conjugated to the quantity $f(x)$, whose average over the $N$ variables is fixed.
Of course these computations make sense only if the integrals of Eq. 52 converge: depending on the function $f(x)$ defining the empirical observable under study, these integrals may diverge in some region of the Lagrange multiplier $k$. The consequences for the non-standard large deviation properties will be discussed for the case of the empirical average in section V and for the empirical moments in section VI.
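The Legendre structure and the role of the convergence domain can be made concrete on a simple example. The sketch below is an illustration under assumptions chosen here: it takes $p(x) = e^{-x}$ and $f(x) = x$, for which the scaled cumulant generating function is $\phi(k) = -\ln(1-k)$ and exists only for $k < 1$, and evaluates $I(a) = \max_k [k a - \phi(k)]$ on a grid restricted to that domain. The grid bounds and function names are hypothetical choices.

```python
import math

def scgf(k):
    """phi(k) = ln int_0^inf dx e^{-x} e^{k x} = -ln(1 - k), defined only for k < 1."""
    return -math.log(1.0 - k)

def rate_function(a, k_min=-50.0, k_max=1.0, steps=200000):
    """I(a) = max_k [k a - phi(k)], evaluated on a grid of k values strictly below k_max."""
    best = -math.inf
    for i in range(steps):
        k = k_min + (k_max - k_min) * i / steps   # never reaches k_max
        best = max(best, k * a - scgf(k))
    return best

for a in (0.25, 0.5, 1.0, 2.0, 4.0):
    # Numerical Legendre transform versus the closed form a - 1 - ln(a)
    print(a, rate_function(a), a - 1.0 - math.log(a))
```

In this example the maximum is reached at $k = 1 - 1/a$, so negative $k$ describes values smaller than the typical value $a^{typ} = 1$ and $0 < k < 1$ describes bigger values; when the integral defining $\phi(k)$ diverges for all $k > 0$, this parametrization of the bigger-than-typical region is lost.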
III.5 Large deviations for joint additive empirical observables
If one is interested in the large deviations of the joint probability of two additive empirical observables, one needs to add another constraint in Eq. 48, as described in detail in Ref [48] for the joint probability of the empirical average and of an empirical moment of order . More generally, one can add as many constraints as needed for the problem one is interested in.
IV Asymmetry in the Large deviations of the empirical maximum
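IV.1 Probability distribution of the ratio $r = x^{max}_N / x^{typ}_N$
Via the change of variables $x = r \, x^{typ}_N$ of Eq. 16, the cumulative distribution of Eq. 32 becomes the cumulative distribution of the ratio $r$
$$ \text{Prob}\left( \frac{x^{max}_N}{x^{typ}_N} \leq r \right) = \left[ 1 - \bar F(r \, x^{typ}_N) \right]^{N} = e^{N \ln\left[ 1 - \bar F(r \, x^{typ}_N) \right]} \tag{56} $$
Since the typical value $x^{typ}_N$ is large as a consequence of the equation $N \bar F(x^{typ}_N) = 1$, while the ratio $r$ is finite, the value of $x = r \, x^{typ}_N$ is also large, i.e. the value of $\bar F(r \, x^{typ}_N)$ is also small. Then Eq. 56 becomes
$$ \text{Prob}\left( \frac{x^{max}_N}{x^{typ}_N} \leq r \right) \simeq e^{- N \bar F(r \, x^{typ}_N)} \tag{57} $$
It is thus more convenient to substitute $N = 1 / \bar F(x^{typ}_N)$ in order to write everything in terms of the single scale $x^{typ}_N$
$$ \text{Prob}\left( \frac{x^{max}_N}{x^{typ}_N} \leq r \right) \simeq \exp\left( - \frac{\bar F(r \, x^{typ}_N)}{\bar F(x^{typ}_N)} \right) \tag{58} $$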
IV.2 Case of the exponential decay
For the exponential decay of Eq. 6, the asymptotic behavior of the function given in Eq. 18 yields in Eq. 58
[TABLE]
This means that the rescaled variable
[TABLE]
is distributed with the Gumbel probability distribution [3, 4]
[TABLE]
whose very strong asymmetry for the asymptotic behaviors for is well-known
[TABLE]
As a consequence when is finite and different from the typical value , the variable of Eq. 60 will be near depending on the sign of : the asymptotic behaviors of Eqs 62 will thus produce completely different scalings in the region bigger than typical [42, 43]
[TABLE]
and in the region smaller than typical
[TABLE]
The link with the region of small typical fluctuations around the typical value usually considered in the Extreme Value Statistics [3, 4] corresponds here to the Taylor expansion at first order of the variable of Eq. 60
[TABLE]
This yields that the appropriate rescaling to have a finite variable distributed with the Gumbel distribution is
[TABLE]
or equivalently for the unrescaled maximum of Eq. 10
[TABLE]
where the behavior of as a function of was recalled in Eq. 20.
IV.3 Case of the power-law decay
For the power-law decay of Eq. 7, the asymptotic behavior $\bar F(x) \simeq K / (\mu x^{\mu})$ of its primitive of Eq. 21 yields in Eq. 58
$$ \text{Prob}\left( \frac{x^{max}_N}{x^{typ}_N} \leq r \right) \simeq e^{- r^{-\mu}} \tag{68} $$
where the Fréchet distribution of parameter $\mu$ appears for any finite value $r$, while the scale $x^{typ}_N$ has completely disappeared, in contrast to the exponential decay case described above. So here the probability of any value $r \neq 1$ does not even decay with $N$.
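The disappearance of the scale in the power-law case can be seen on an exactly solvable example. The sketch below is an illustration under assumptions chosen here (a pure Pareto tail $\bar F(x) = x^{-\mu}$ for $x \geq 1$ with $\mu = 3$, hypothetical function name): it evaluates the exact cumulative distribution of the ratio between the maximum and its typical value for increasing $N$ and compares it with the Fréchet form $e^{-r^{-\mu}}$.

```python
import math

def max_cdf_ratio(r, n, mu):
    """Exact Prob(x_max <= r * x_typ) for n Pareto variables with tail x**(-mu) on [1, +inf[."""
    x_typ = n ** (1.0 / mu)            # solves n * tail(x_typ) = 1
    x = r * x_typ
    tail = 1.0 if x < 1.0 else x ** (-mu)
    return (1.0 - tail) ** n

mu = 3.0
for r in (0.5, 1.0, 2.0):
    # Exact values for N = 10^2, 10^4, 10^6, followed by the N-independent Frechet limit
    print(r, [max_cdf_ratio(r, n, mu) for n in (10**2, 10**4, 10**6)],
          math.exp(-r ** (-mu)))
```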
IV.4 Asymmetry beyond the regime of finite ratio
In the following section, we will need the probability of the maximum of Eq 32 beyond the regime of finite ratio with respect to the typical value . In the region much bigger than typical where , the factor can be neglected in Eq 32 and one obtains the leading behavior
[TABLE]
The physical meaning is that one just needs to draw the anomalously big value , while the other variables may remain typical and thus have no probabilistic cost.
On the contrary, in the region much smaller than typical where , the factor is the leading behavior in Eq 32 and produces an extensive cost in in the exponential
[TABLE]
This asymmetry is thus very strong, and reads for the exponential decay of Eq. 6 with Eq. 18,
[TABLE]
while for the power-law decay of Eq 7 with Eq 21, it is given by
[TABLE]
V Possible asymmetry in the large deviations of the empirical average
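Whenever the first moment $a^{typ} = \int dx \, x \, p(x)$ and the variance $\sigma^2 = \int dx \, (x - a^{typ})^2 p(x)$ are finite, the Central Limit Theorem means that the empirical average of Eq. 24 will display typical fluctuations of order $1/\sqrt{N}$ around the typical value $a^{typ}$
$$ a_N \simeq a^{typ} + \frac{\sigma \, u}{\sqrt{N}} \tag{73} $$
where $u$ is a Gaussian random variable of zero mean and unit variance. In this section, we focus on the large deviation properties of the probability distribution of the empirical average
$$ P_N(a) \equiv \left\langle \delta\left( a - \frac{1}{N} \sum_{i=1}^{N} x_i \right) \right\rangle \tag{74} $$
to discuss how rare it is to observe $a \neq a^{typ}$ for large $N$ and when an asymmetry will occur.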
V.1 Standard large deviation theory for the exponential decay with exponent $\alpha \geq 1$
The standard large deviation theory recalled in section III.4 for additive empirical observables can be applied to the empirical average with $f(x) = x$: the probability to observe the value $a \neq a^{typ}$ is exponentially small in $N$
$$ P_N(a) \mathop{\propto}_{N \to +\infty} e^{- N I(a)} \tag{75} $$
The rate function $I(a)$ can be either evaluated directly or computed as the Legendre transform (Eqs 46 and 47) of the scaled cumulant generating function of Eq. 43
$$ \phi(k) = \ln\left( \int_0^{+\infty} dx \, p(x) e^{k x} \right) \tag{76} $$
For the exponential decay of Eq. 6 with an exponent $\alpha > 1$, the scaled cumulant generating function of Eq. 76 is defined for any $k \in ]-\infty, +\infty[$, and the above large deviation theory can be applied without worry.
For the exponential decay of Eq. 6 with an exponent $\alpha = 1$, the scaled cumulant generating function of Eq. 76 is defined only for $k < 1$, and it is thus useful to describe the example of the gamma distribution of parameter $\nu > 0$
$$ p_{\nu}(x) = \frac{x^{\nu-1}}{\Gamma(\nu)} e^{-x} \tag{77} $$
of Laplace transform
$$ \int_0^{+\infty} dx \, p_{\nu}(x) e^{- s x} = \frac{1}{(1+s)^{\nu}} \tag{78} $$
So computing its power $N$ simply amounts to changing the parameter $\nu$ into $N \nu$. As a consequence, the sum of $N$ variables is distributed with the gamma distribution $p_{N\nu}$ of parameter $N \nu$, which corresponds to the convolution of $N$ distributions $p_{\nu}$. After the rescaling by $N$, the probability distribution of the empirical average $a$ is thus exactly given by
$$ P_N(a) = N p_{N \nu}(N a) = \frac{N (N a)^{N \nu - 1}}{\Gamma(N \nu)} e^{- N a} \tag{79} $$
For large $N$, the Stirling approximation for $\Gamma(N \nu)$ yields the large deviation form of Eq. 75 where the rate function
$$ I(a) = a - \nu - \nu \ln\left( \frac{a}{\nu} \right) \tag{80} $$
is well defined for $a \in ]0, +\infty[$ and measures how rare it is to observe a value different from the typical value $a^{typ} = \nu$.
The corresponding scaled cumulant generating function of Eq. 76 is defined only for $k < 1$
$$ \phi(k) = - \nu \ln(1 - k) \tag{81} $$
The correspondence between $a$ and $k$ via the Legendre transform of Eq. 46 is
$$ k = I'(a) = 1 - \frac{\nu}{a} \tag{82} $$
or equivalently via the reciprocal Legendre transform of Eq. 47
$$ a = \phi'(k) = \frac{\nu}{1 - k} \tag{83} $$
So the region $k < 0$ parametrizes the whole smaller-than-typical region $0 < a < \nu$, while the region $0 < k < 1$ parametrizes the whole bigger-than-typical region $a > \nu$ without problems.
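Since the distribution of the empirical average is known exactly for gamma variables, the convergence towards the large deviation form can be checked directly. The sketch below is an illustration with hypothetical function names: it uses the facts that the sum of $N$ gamma variables of parameter $\nu$ is gamma-distributed with parameter $N\nu$, and that the resulting rate function is $I(a) = a - \nu - \nu \ln(a/\nu)$; the value $\nu = 2$ is an arbitrary choice.

```python
import math

def log_prob_average(a, n, nu):
    """Exact ln P_N(a) for the empirical average of n gamma(nu) variables."""
    return (math.log(n) + (n * nu - 1.0) * math.log(n * a)
            - n * a - math.lgamma(n * nu))

def rate_function(a, nu):
    """Rate function a - nu - nu ln(a/nu), vanishing at the typical value a = nu."""
    return a - nu - nu * math.log(a / nu)

nu = 2.0
for a in (0.5, 2.0, 5.0):
    for n in (10, 100, 10000):
        # -ln P_N(a) / N should approach I(a) as N grows
        print(a, n, -log_prob_average(a, n, nu) / n, rate_function(a, nu))
```

The remaining difference is of order $(\ln N)/N$ and comes from the sub-leading terms of the Stirling approximation.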
V.2 Asymmetry in the large deviations for stretched exponential decay
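V.2.1 Usual large deviation form in the region smaller than typical $a < a^{typ}$
For the exponential decay of Eq. 6 with an exponent $0 < \alpha < 1$, the scaled cumulant generating function $\phi(k)$ of Eq. 76 is defined only for $k \leq 0$, which corresponds to the region smaller than typical $a \leq a^{typ}$, where the usual large deviation form will thus be valid
$$ P_N(a) \mathop{\propto}_{N \to +\infty} e^{- N I_-(a)} \quad \text{for} \quad a < a^{typ} \tag{84} $$
The rate function $I_-(a)$ for $a \leq a^{typ}$ corresponds to the Legendre transform of the function $\phi(k)$ defined for $k \leq 0$.
V.2.2 Unusual large deviation in the region $a > a^{typ}$
For $k > 0$, which corresponds to the region bigger than the typical value $a > a^{typ}$, the function $\phi(k)$ as defined by Eq. 76 does not exist as a consequence of the divergence of the integral at $x \to +\infty$ when $p(x)$ decays only as a stretched exponential with $0 < \alpha < 1$
$$ \int^{+\infty} dx \, p(x) e^{k x} = +\infty \quad \text{for} \quad k > 0 \tag{85} $$
This suggests considering the strategy based on the maximum alone: one considers that $(N-1)$ variables have their typical sum $(N-1) a^{typ}$, which happens with probability one for large $N$, i.e. with no probabilistic cost, while the remaining variable, which will have to coincide with the maximum $x^{max}_N$ of Eq. 10, should be anomalously big in order to satisfy the sum constraint
$$ N a = (N-1) a^{typ} + x^{max}_N \quad \text{i.e.} \quad x^{max}_N \simeq N (a - a^{typ}) \tag{86} $$
So the cost of this strategy directly involves the probability of Eqs 69 and 71 of the anomalously extensive value of Eq. 86
$$ P_N(a) \mathop{\propto}_{N \to +\infty} e^{- \left[ N (a - a^{typ}) \right]^{\alpha}} = e^{- N^{\alpha} (a - a^{typ})^{\alpha}} \tag{87} $$
that decays only as the stretched exponential of exponent $\alpha$ in $N$. The corresponding rate function
$$ I_+(a) = (a - a^{typ})^{\alpha} \tag{88} $$
has been proven to be valid in the whole region $a > a^{typ}$ in Refs [44, 46].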
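The scaling of the single-big-jump strategy can be illustrated numerically. The sketch below is only a heuristic estimate under assumptions made here, not a proof: it takes the Weibull-type example $\bar F(x) = e^{-x^{\alpha}}$ with $\alpha = 1/2$ (whose mean is $\Gamma(1 + 1/\alpha) = 2$), estimates the probability that the empirical average exceeds $a$ by $N \bar F(N(a - a^{typ}))$, i.e. one of the $N$ variables carries the whole excess, and checks that the cost grows as $N^{\alpha}$ rather than $N$. The function name is hypothetical.

```python
import math

def big_jump_log_prob(a, n, alpha):
    """ln of n * Fbar(n (a - a_typ)) for the assumed tail Fbar(x) = exp(-x**alpha)."""
    a_typ = math.gamma(1.0 + 1.0 / alpha)     # mean of this example distribution
    return math.log(n) - (n * (a - a_typ)) ** alpha

alpha, a = 0.5, 3.0
for n in (10**2, 10**4, 10**6, 10**8):
    # Cost divided by N^alpha, compared with (a - a_typ)^alpha where a_typ = 2
    print(n, -big_jump_log_prob(a, n, alpha) / n ** alpha, (a - 2.0) ** alpha)
```

The combinatorial factor $N$ only contributes a $\ln N$ correction, which is negligible with respect to $N^{\alpha}$ for large $N$.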
V.3 Asymmetry in the large deviations for power-law decay
For the power-law decay of Eq. 7, one has the same scenario as for the stretched exponential case discussed above:
(i) in the region smaller than typical $a < a^{typ}$, the usual large deviation form is valid
$$ P_N(a) \mathop{\propto}_{N \to +\infty} e^{- N I_-(a)} \quad \text{for} \quad a < a^{typ} \tag{89} $$
where the rate function $I_-(a)$ corresponds to the Legendre transform of the function $\phi(k)$ defined for $k \leq 0$.
(ii) in the region bigger than the typical value $a > a^{typ}$, where the function $\phi(k)$ of Eq. 76 does not exist for $k > 0$ as a consequence of the divergence of the integral, the strategy based on the anomalous maximum $x^{max}_N \simeq N (a - a^{typ})$ of Eq. 86 leads to the probability (using Eqs 69 and 72)
$$ P_N(a) \mathop{\propto}_{N \to +\infty} N^2 \, p\left( N (a - a^{typ}) \right) \simeq \frac{K \, N^{1-\mu}}{(a - a^{typ})^{1+\mu}} \tag{90} $$
that decays only as the power-law $N^{1-\mu}$. This phenomenon of ’condensation’ in the power-law case has been studied in great detail in the references [45, 47, 48], with motivations coming from the zero-range process (see explanations and references in [45, 47, 48]). Other physical applications can be found in [54, 55].
VI Asymmetry in the large deviations of the empirical moment of order $q$
The analysis of the previous section concerning the empirical average can be directly generalized to obtain the large deviation properties of the empirical moment of arbitrary non-integer order $q$ of Eq. 25.
VI.1 Standard large deviation theory for the exponential decay with exponent $\alpha > q$
The standard large deviation theory recalled in section III.4 for additive empirical observables can be applied to the empirical moment of arbitrary non-integer order $q$ of Eq. 25 with $f(x) = x^{q}$: the probability to observe a value $m$ different from the typical value $m^{typ} = \int dx \, x^{q} p(x)$ is exponentially small in $N$
$$ P_N(m) \mathop{\propto}_{N \to +\infty} e^{- N I_q(m)} \tag{91} $$
and the rate function $I_q(m)$ corresponds to the Legendre transform (Eqs 46 and 47) of the scaled cumulant generating function of Eq. 43
$$ \phi_q(k) = \ln\left( \int_0^{+\infty} dx \, p(x) e^{k x^{q}} \right) \tag{92} $$
which is well defined for any $k$ when $p(x)$ displays the exponential decay of Eq. 6 with an exponent $\alpha > q$.
VI.2 Asymmetry in the large deviations for stretched exponential decay
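VI.2.1 Usual large deviation form in the region smaller than typical $m < m^{typ}$
For the exponential decay of Eq. 6 with an exponent $0 < \alpha < q$, the scaled cumulant generating function $\phi_q(k)$ of Eq. 92 is defined only for $k \leq 0$, which corresponds to the region smaller than typical $m \leq m^{typ}$, where the usual large deviation form will thus be valid
$$ P_N(m) \mathop{\propto}_{N \to +\infty} e^{- N I^{-}_q(m)} \quad \text{for} \quad m < m^{typ} \tag{93} $$
The rate function $I^{-}_q(m)$ corresponds to the Legendre transform of the function $\phi_q(k)$ defined for $k \leq 0$.
VI.2.2 Unusual large deviation in the region $m > m^{typ}$
For $k > 0$, which corresponds to the region bigger than the typical value $m > m^{typ}$, the function $\phi_q(k)$ as defined by Eq. 92 does not exist. The strategy based on the maximum alone explained in the previous section can then be considered: $(N-1)$ variables have their typical sum $(N-1) m^{typ}$, which happens with probability one for large $N$, i.e. with no probabilistic cost, while the remaining variable will have to coincide with the power $q$ of the maximum $x^{max}_N$ of Eq. 10. This maximum should be anomalously big in order to satisfy the sum constraint
$$ N m = (N-1) m^{typ} + \left( x^{max}_N \right)^{q} \quad \text{i.e.} \quad x^{max}_N \simeq \left[ N (m - m^{typ}) \right]^{\frac{1}{q}} \tag{94} $$
So the cost of this strategy reads in terms of the probability of Eqs 69 and 71 of the anomalously big value of Eq. 94
$$ P_N(m) \mathop{\propto}_{N \to +\infty} e^{- \left[ N (m - m^{typ}) \right]^{\frac{\alpha}{q}}} = e^{- N^{\frac{\alpha}{q}} (m - m^{typ})^{\frac{\alpha}{q}}} \tag{95} $$
that decays only as the stretched exponential of exponent $\alpha / q < 1$ in $N$, with the corresponding rate function
$$ I^{+}_q(m) = (m - m^{typ})^{\frac{\alpha}{q}} \tag{96} $$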
VI.2.3 Discussion
So for any exponential decay with exponent $\alpha$ in Eq. 6, only the empirical moments of order $q \leq \alpha$ display a standard form of large deviations, while the empirical moments of order $q > \alpha$ will be characterized by asymmetric large deviations. For instance for the gamma distribution of Eq. 77 corresponding to $\alpha = 1$, where the large deviations of the empirical average corresponding to $q = 1$ are still standard (with the rate function of Eq. 80), all the empirical moments of order $q > 1$ will have asymmetric large deviations, in particular the empirical second moment corresponding to $q = 2$.
VII Renormalization interpretation of large deviations rate functions
The region of typical fluctuations around the typical value (see the Introduction around Eq. 2) has been analyzed in detail from the renormalization point of view, both for the sum of random variables [56, 57, 58] and for the maximum of random variables [59, 60, 61, 62, 63]. In this section, it is thus interesting to discuss the meaning of large deviations from the renormalization perspective.
VII.1 Merging two sets of variables
To see more clearly the renormalization meaning of large deviations, it is interesting to consider the merging of two sets of random variables :
(1) the first set of variables is drawn with the probability distribution and is characterized by the empirical histogram
[TABLE]
(2) the second set of variables is drawn with the probability distribution and is characterized by its empirical histogram
[TABLE]
VII.2 Renormalization for the large deviations of the empirical histogram
Each of these two sets labelled by is characterized by the large deviation properties of its empirical histogram (Eq. 28 and 29)
[TABLE]
or its exact generating function of Eq. 30 for any finite
[TABLE]
Via the merging of the data of Eqs 97 and 98, the global histogram for the variables is simply the average of the two histograms
[TABLE]
with the typical value
[TABLE]
Its generating function is simply the product of the generating functions of Eq. 100
[TABLE]
i.e. the scaled cumulant generating function follows the renormalization rule
[TABLE]
In particular, when the two sets are drawn with the same probability distribution , the scaled cumulant generating function is exactly conserved along the RG flow.
In terms of large deviations form of Eq. 99, the probability of the histogram of Eq. 101
[TABLE]
corresponds to the optimization of the function in the exponential in the presence of the constraints that can be taken into account via Lagrange multipliers. One obtains the optimal solution
[TABLE]
and the corresponding sum of the relative entropies in the exponential
[TABLE]
coincides with the relative entropy of the histogram with respect to its typical value of Eq. 102 as it should for consistency. In particular, when the two sets are drawn with the same probability distribution , the optimal solution to produce an anomalous empirical histogram consists in choosing the same anomalous empirical histogram for the two subsets (Eq. 106).
VII.3 Renormalization for the large deviations of the empirical maximum
For each set , the cumulative probability distribution of the empirical maximum of Eq. 32 can be interpreted as an exact large deviation form
[TABLE]
where the rate function reads
[TABLE]
in terms of the complementary cumulative distribution function (Eq 15) associated to each distribution
[TABLE]
The empirical maximum of the variables is of course the larger of the two maximal values associated with the two sets of Eqs 97 and 98
[TABLE]
So the corresponding cumulative distribution
[TABLE]
is written exactly in a large deviation form with the rate function
[TABLE]
When the two sets are drawn with the same probability distribution , one obtains that the rate function is exactly conserved along the RG flow.
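The renormalization rule for the empirical maximum can be checked numerically. The sketch below assumes that the rate function has the form I(m) = -ln(1 - C(m)), with C the complementary cumulative distribution, so that the probability for the maximum of N variables to be smaller than m is exp(-N I(m)), and that the merged rate function for the 2N variables is the average of the two rate functions. The two exponential distributions and the values of N and m are hypothetical choices made only for illustration.

```python
import math

def rate_max(ccdf, m):
    """Assumed rate function I(m) = -ln(1 - C(m)), where C is the complementary CDF."""
    return -math.log1p(-ccdf(m))

def merged_rate_max(ccdf1, ccdf2, m):
    """Assumed rule for the merged set of 2N variables: average of the two rate functions."""
    return 0.5 * (rate_max(ccdf1, m) + rate_max(ccdf2, m))

# Hypothetical example: two exponential distributions with decay rates 1 and 2.
ccdf1 = lambda x: math.exp(-x)
ccdf2 = lambda x: math.exp(-2.0 * x)

N, m = 50, 3.0

# Exact probability that all 2N variables (N from each set) are smaller than m.
exact = ((1.0 - ccdf1(m)) * (1.0 - ccdf2(m))) ** N

# Same probability written as exp(-2N * merged rate function).
from_rate = math.exp(-2 * N * merged_rate_max(ccdf1, ccdf2, m))

# With identical distributions the merged rate function equals the original one.
conserved = merged_rate_max(ccdf1, ccdf1, m)
```

Under these assumptions the two expressions of the cumulative distribution agree for any finite N, and the rate function is unchanged when both sets share the same distribution.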
VII.4 Renormalization for the large deviations of the empirical average
VII.4.1 Case of standard large deviations
In the case of standard large deviations, each set is described by the large deviation form of Eq. 75 for its empirical average
[TABLE]
where the rate function is the Legendre transform of the scaled cumulant generating function of Eq. 76
[TABLE]
involved in the generating function
[TABLE]
Via the merging of the data of Eqs 97 and 98, the empirical average of the variables is simply the average of the two empirical averages of the two sets
[TABLE]
Its generating function is simply the product of the generating functions of Eq. 116
[TABLE]
so the renormalization rule for the scaled cumulant generating function is simply
[TABLE]
In particular, when the two sets are drawn with the same probability distribution , the scaled cumulant generating function is exactly conserved along the RG flow.
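The renormalization rule for the scaled cumulant generating function can be illustrated on a small discrete example. The sketch below assumes the convention that the generating function of the empirical average a of N variables is the expectation of exp(N k a) = exp(N phi(k)), so that merging two sets of N variables gives phi_merged(k) = [phi_1(k) + phi_2(k)] / 2. The two distributions on the values 0, 1, 2 and the parameters N and k are hypothetical.

```python
import numpy as np

def scgf(values, probs, k):
    """Scaled cumulant generating function phi(k) = ln E[exp(k x)] of one variable."""
    return np.log(np.sum(probs * np.exp(k * values)))

def sum_distribution(probs, n):
    """Exact distribution of the sum of n i.i.d. draws on the support 0..len(probs)-1."""
    dist = np.array([1.0])
    for _ in range(n):
        dist = np.convolve(dist, probs)
    return dist

# Hypothetical distributions of the two sets on the values 0, 1, 2.
values = np.arange(3)
p1 = np.array([0.5, 0.3, 0.2])
p2 = np.array([0.1, 0.2, 0.7])
N, k = 6, 0.4

# Exact distribution of the sum of the 2N merged variables (N from each set).
dist_total = np.convolve(sum_distribution(p1, N), sum_distribution(p2, N))
support = np.arange(dist_total.size)

# SCGF of the merged set computed directly from the exact distribution of the sum.
phi_merged = np.log(np.sum(dist_total * np.exp(k * support))) / (2 * N)

# SCGF predicted by the averaging rule.
phi_rule = 0.5 * (scgf(values, p1, k) + scgf(values, p2, k))
```

Because the variables are independent, the agreement between `phi_merged` and `phi_rule` holds for any finite N, not only asymptotically.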
In terms of the large deviation form of Eq. 114, the probability of the empirical average reads for this case
[TABLE]
The saddle-point evaluation of this integral requires finding the maximum of the function in the exponential
[TABLE]
The vanishing of the first derivative
[TABLE]
gives the symmetric solution which is indeed a maximum if the second derivative is negative
[TABLE]
For instance, for the gamma distribution of parameter of Eq. 77, the second derivative of the rate function of Eq. 80
[TABLE]
satisfies this condition.
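The symmetric saddle point can be checked on a grid. The sketch below assumes a gamma density with shape parameter alpha and unit scale, for which the Legendre transform of phi(k) = -alpha ln(1 - k) gives the rate function I(a) = a - alpha + alpha ln(alpha / a), with second derivative alpha / a^2 > 0. Whether this matches the exact normalization of Eqs 77 and 80 is an assumption. The exponent is maximal where the summed cost I(a1) + I(2a - a1) is minimal.

```python
import numpy as np

def gamma_rate(a, alpha):
    """Assumed rate function of the empirical average for a gamma law of shape alpha."""
    return a - alpha + alpha * np.log(alpha / a)

alpha, a = 2.0, 3.5  # hypothetical shape parameter and target value of the merged average

# Scan the empirical average a1 of the first set; the second set must have a2 = 2a - a1.
a1 = np.linspace(0.01, 2 * a - 0.01, 100001)
cost = gamma_rate(a1, alpha) + gamma_rate(2 * a - a1, alpha)

# Location of the minimal cost, i.e. of the maximal exponent.
a1_star = a1[np.argmin(cost)]

# Second derivative of the assumed rate function at the symmetric point.
second_derivative = alpha / a**2
```

With a convex rate function the minimum of the cost sits at a1 = a2 = a, so both subsets share the anomalous value equally.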
VII.4.2 Case of large deviations with asymmetric scaling
It is now interesting to compare the above discussion with the case of the large deviations for stretched exponential decay that display the asymmetric scaling (Eqs 84 and 87)
[TABLE]
Then Eq. 120 is replaced by the sum of four possible contributions of various orders with respect to
[TABLE]
The fourth contribution of order will be the leading contribution whenever the domain of integration for is not empty, i.e. in the region : then the saddle-point evaluation requires the maximization of the function involving the rate function of Eq. 88
[TABLE]
However the symmetric solution is a minimum here as a consequence of the sign of the second derivative for any
[TABLE]
The maximization of Eq 121 occurs instead at the boundaries and (or vice-versa) and one obtains the leading contribution in the region
[TABLE]
as it should for consistency with Eq. 125 in the region .
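The failure of the symmetric solution can be seen on a simple concave cost. The sketch below assumes that, above the typical value, the rate function behaves as x^alpha with 0 < alpha < 1, where x is the excess of the empirical average over its typical value. This is a simplified reading of Eq. 88, with prefactors dropped. The values of alpha and of the excess d of the merged average are hypothetical.

```python
import numpy as np

def excess_cost(x, alpha):
    """Assumed concave cost x**alpha (0 < alpha < 1) of an excess x above the typical value."""
    return x ** alpha

alpha, d = 0.5, 1.0  # hypothetical exponent and excess of the merged average

# The excesses of the two sets satisfy x1 + x2 = 2d with x1, x2 >= 0.
x1 = np.linspace(0.0, 2 * d, 2001)
total = excess_cost(x1, alpha) + excess_cost(2 * d - x1, alpha)

# Cost of sharing the excess equally between the two sets.
symmetric = 2 * excess_cost(d, alpha)

# Cost of putting the whole excess on a single set.
boundary = excess_cost(2 * d, alpha)

# Index of the cheapest configuration on the grid.
best_index = int(np.argmin(total))
```

Under this assumption the cheapest configuration leaves one set typical and places the excess 2d on the other. The resulting cost N^alpha (2d)^alpha equals (2N)^alpha d^alpha, which has the same form as the original rate for the 2N merged variables.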
VIII Conclusion
In this paper, we have revisited the empirical observables based on independent random variables, namely the empirical maximum, the empirical average, the empirical non-integer moments or other additive empirical observables, in order to describe the cases where asymmetric large deviations occur. We have stressed the analogy with equilibrium statistical mechanics: the Sanov theorem for the large deviations of the empirical histogram, whose rate function is the relative entropy with respect to the true probability distribution, has been taken as the unifying starting point. The various empirical observables have then been analyzed by optimizing this relative entropy in the presence of the appropriate constraints.
Finally, we have discussed the physical meaning of large deviation rate functions from the renormalization perspective. While most renormalization procedures have been studied in the past at the level of their typical fluctuations, it will be interesting in the future to re-analyze them at the level of their large deviations, as in the recent study [64] concerning disordered directed polymers, where asymmetric large deviations are known to occur.
References
- [1] J.M. Luck, "Systèmes désordonnés unidimensionnels", Aléa Saclay (1992).
- [2] A. Crisanti, G. Paladin, A. Vulpiani, "Products of random matrices in statistical physics", Springer Verlag (1993).
- [3] E.J. Gumbel, "Statistics of extremes", Columbia University Press, NY (1958).
- [4] J. Galambos, "The asymptotic theory of extreme order statistics", Krieger, Malabar, FL (1987).
- [5] J.P. Bouchaud and M. Mézard, J. Phys. A: Math. Gen. 30, 7997 (1997).
- [6] M. Clusel and E. Bertin, Int. J. Mod. Phys. B 22, 3311 (2008).
- [7] J.Y. Fortin and M. Clusel, J. Phys. A: Math. Theor. 48, 183001 (2015).
- [8] Y. Oono, Progress of Theoretical Physics Supplement 99, 165 (1989).
