Equilibrium large deviations for mean-field systems with translation invariance
Julien Reygner (CERMICS)

TL;DR
This paper establishes large deviation principles for mean-field particle systems with translation invariance, covering McKean-Vlasov and rank-based diffusions, with applications to capital distribution analysis.
Contribution
It introduces a framework for large deviations in translation-invariant mean-field systems, including new results for systems without external potential and in orbit spaces.
Findings
Large deviation principles are proved for equilibrium empirical measures.
Results apply to systems with and without external potential.
Application to atypical capital distribution is demonstrated.
Abstract
We consider particle systems with mean-field interactions whose distribution is invariant by translations. Under the assumption that the system seen from its centre of mass be reversible with respect to a Gibbs measure, we establish large deviation principles for its empirical measure at equilibrium. Our study covers the cases of McKean-Vlasov particle systems without external potential, and systems of rank-based interacting diffusions. Depending on the strength of the interaction, the large deviation principles are stated in the space of centered probability measures endowed with the Wasserstein topology of appropriate order, or in the orbit space of the action of translations on probability measures. An application to the study of atypical capital distribution is detailed.
Université Paris-Est, CERMICS (ENPC), F-77455 Marne-la-Vallée
Key words and phrases:
Large deviations, mean-field systems, McKean-Vlasov particle systems, rank-based interacting diffusions, free energy.
2010 Mathematics Subject Classification:
60F10, 60J60, 60K35
This work is partially supported by: the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement number 614492; the Chaire Risques Financiers, Fondation du Risque; and the French National Research Agency (ANR) under the programs ANR-12-BLAN Stab and ANR-17-CE40-0030 EFI.
1. Introduction
This work is dedicated to the study of the large deviations of the empirical measure of particle systems at equilibrium exhibiting the following formal features:
- (a) they are reversible with respect to an explicit Gibbs measure;
- (b) the particles are coupled through mean-field interactions;
- (c) their distribution is invariant under spatial translations.
The typical models that we aim to study include McKean-Vlasov particle systems without external potential, whose mean-field limit allows one to approximate the granular media equation [3, 4, 37, 14], and systems of one-dimensional diffusions interacting through their rank, which arise in the probabilistic interpretation of scalar nonlinear conservation laws [9, 10, 31, 44, 32]. Both models also appear in mathematical finance, in the modelling of inter-bank borrowing and lending [27] and of stable equity markets [26, 33], respectively.
For McKean-Vlasov particle systems with an external potential, which in general satisfy the conditions (a) and (b) but not (c), the large deviations of the empirical measure of the particle system under its equilibrium measure are governed by the free energy functional, which combines entropic and energetic contributions. Prefiguring the interpretation by Otto [29, 39] and Carrillo, McCann and Villani [12, 13] of (nonlinear) Fokker-Planck equations as functional gradient flows, Dawson and Gärtner [18, 19, 20] showed that, for such systems, the free energy plays the role of a quasipotential, in the sense of the Freidlin-Wentzell theory. Thus it not only describes the static large scale properties of the particle system, such as typical configurations or possible phase transitions, but it also sheds light on its large scale dynamics, providing both typical paths and fluctuation rates.
For models satisfying the condition (c), translation invariance generally prevents ergodicity, so that there is no equilibrium measure for the original particle system. Still, it was noted in [37] for McKean-Vlasov systems, and in [30, 40] for rank-based interacting diffusions, that under suitable assumptions on the interactions between the particles, a stationary behaviour can be observed for the particle system seen from its centre of mass. Centering the particle system induces a conserved quantity in its evolution, and the purpose of this article is to understand the effect of this constraint on its equilibrium large deviations. To the best of the author’s knowledge, this is the first study in this direction.
For such systems, a free energy functional can still be defined, with an energetic contribution depending only on the interaction between the particles. Thus it may be expected that, under the assumption that the centered particle system be ergodic, the large deviations of its empirical measure at equilibrium be described by this free energy functional, restricted to the space of centered probability measures. The first result of this article, Theorem 2.14, provides a rigorous formulation of this assertion; however, it only holds under the assumption that the interaction between particles be strong enough, in a sense to be made precise below — typically, for McKean-Vlasov systems with an interaction potential growing faster than linearly. In contrast, when this assumption is not satisfied, which turns out to be the case for systems of rank-based interacting diffusions, we show that the rate function may fail to have compact level sets, so that the expected large deviation principle does not hold. This is formally explained by the following two facts: the topology on which a large deviation principle can be expected to hold depends on the strength of the interaction; and on too weak topologies, the space of centered probability measures is not closed.
In order to connect the free energy functional to the equilibrium large deviations of the particle system without restriction on the strength of the interaction, and thereby cover the case of rank-based interacting diffusions, we avoid resorting to the notion of centered probability measures, and rather work at the level of the orbit of the empirical measure of the particle system at equilibrium, under the action of translations. This provides an equivalent description of the particle system; however, the quotient topology on the orbit space becomes weak enough for a large deviation principle to hold without any assumption on the strength of the interaction. This is the second main result of the article, Theorem 2.16, which is weaker than the first in the sense that it is implied by the first result, but holds under less restrictive assumptions.
The adaptations of these results to the specific examples of McKean-Vlasov particle systems and systems of rank-based interacting diffusions are stated as corollaries. For the latter example, the large deviation principle allows one to associate a notion of free energy to scalar nonlinear conservation laws, which complements, at the level of the stationary measure, the results by Dembo, Shkolnikov, Varadhan and Zeitouni on finite time intervals [21]. As an application, we discuss at the end of the article the estimation of the probability of an atypical capital distribution in the framework of Fernholz’ Stochastic Portfolio Theory [26].
Outline of the article
The notations and main results of the article are presented in Section 2. The proof of our two main theorems is based on the approximation of the particle system without external potential by a particle system with a small external potential. The large deviation results for this approximating system are presented in Section 3, and the control of these results when the external potential vanishes is studied in Section 4. The application of the main results to the particular cases of McKean-Vlasov particle systems, and systems of rank-based interacting diffusions, is detailed in Section 5. A technical result on the metrisability of the quotient topology is proved in Appendix A.
2. Notations and main results
2.1. Spaces of probability measures
For , we denote by the space of Borel probability measures on . It is endowed with the topology of weak convergence [7, Chapter 1, p. 7], which makes it a Polish space [7, Theorem 6.8, p. 73].
For all , we define the translation by as the operator such that, for all ,
[TABLE]
for all measurable and bounded functions . It is known that the operator is continuous on .
For all , we denote by the space of Borel probability measures on with a finite -th order moment. It is endowed with the Wasserstein topology of order [45, Definition 6.8, p. 96], which makes it a Polish space [45, Theorem 6.18, p. 104].
The Wasserstein topology is stronger than the topology induced on by the topology of weak convergence on , so that for any , the translation is continuous on .
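The continuity of translations in the Wasserstein topology can be illustrated numerically on empirical measures. The following is a minimal sketch, assuming dimension one and two empirical measures with the same number of atoms, in which case pairing the sorted atoms is an optimal coupling for any order at least 1. The helper name `wasserstein_1d` and the sample `atoms` are hypothetical and only serve the illustration.

```python
def wasserstein_1d(xs, ys, p=1.0):
    """W_p distance between two empirical measures on the real line.

    Both measures put mass 1/n on each of their n atoms. In dimension one and
    for p >= 1, pairing the sorted atoms gives an optimal coupling.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(a - b) ** p for a, b in pairs) / len(xs)
    return cost ** (1.0 / p)


# Hypothetical atoms of an empirical measure, and translates by a small vector y.
atoms = [-1.3, 0.2, 0.9, 2.4, 3.1]
for y in (1.0, 0.1, 0.01):
    shifted = [x + y for x in atoms]
    # The distance between a measure and its translate shrinks with |y|.
    print(y, wasserstein_1d(atoms, shifted, p=2.0))
```

In this setting the distance between a measure and its translate by y equals |y|, which is one way to see that translations act continuously.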
We denote by the subset of centered probability measures with a finite -th order moment, and define the centering operator by
[TABLE]
for all . It is easily checked that is continuous on , and that is a closed subset of , hence it is a Polish space itself.
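For empirical measures, translation and centering reduce to elementary operations on the atoms. The sketch below assumes that translating a measure by y means shifting each atom by y, and that centering means translating by minus the mean; the sign convention and the function names `translate`, `centre_of_mass` and `centre` are assumptions of this illustration.

```python
def centre_of_mass(points):
    """Mean of a list of points, each given as a tuple of d coordinates."""
    n = len(points)
    d = len(points[0])
    return tuple(sum(p[k] for p in points) / n for k in range(d))


def translate(points, y):
    """Atoms of the empirical measure translated by the vector y."""
    return [tuple(pk + yk for pk, yk in zip(p, y)) for p in points]


def centre(points):
    """Atoms of the centered empirical measure (translation by minus the mean)."""
    m = centre_of_mass(points)
    return translate(points, tuple(-mk for mk in m))


# Hypothetical configuration of four particles in dimension two.
config = [(0.0, 1.0), (2.0, -1.0), (1.5, 0.5), (-0.5, 3.5)]
centred = centre(config)
recentred = centre(translate(config, (10.0, -4.0)))
print(centre_of_mass(centred))
print(max(abs(a - b) for p, q in zip(centred, recentred) for a, b in zip(p, q)))
```

The second printed quantity compares the centered configuration with the centered version of a translate: up to rounding they agree, so the centering operator is constant along translation orbits.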
In the sequel of this article, we shall consider probability measures defined on the respective Borel -fields of the topological spaces , and .
2.2. Energy functional and Gibbs measure
Throughout the article, the temperature parameter is fixed.
The physical systems which we aim to study are described by an energy functional
[TABLE]
satisfying the following set of conditions:
- (TI) translation invariance: for all , for all , ;
- (F) -finiteness: if has compact support, then ;
- (LSC) the function is lower semicontinuous on ;
- (GC) growth control: there exist and such that if , and
[TABLE]
For all , the energy of a configuration of a system with particles is defined by
[TABLE]
where
[TABLE]
is the empirical measure of the configuration . Notice that Assumption (F) ensures that for any configuration .
The Gibbs density naturally associated with the energy function is never integrable on , because Assumption (TI) implies that is invariant under the translations , . However, introducing the linear subspace
[TABLE]
and denoting by the Lebesgue measure on , we get the following first result.
Lemma 2.1 (Finiteness of the partition function).
Let be an energy functional satisfying Assumptions (TI), (F), (LSC) and (GC). For all , we have
[TABLE]
where the function is defined by (1).
Proof.
The combination of Assumptions (F) and (LSC) ensures that the function is positive and measurable, so that is well-defined as an element of . Using (3), Assumption (GC) and the trivial bound
[TABLE]
we get the inequality
[TABLE]
whose right-hand side is proven to be finite by using the parametrisation of by . ∎
Definition 2.2 (Gibbs measure).
Under the assumptions of Lemma 2.1, we denote by the probability measure on with density
[TABLE]
with respect to the Lebesgue measure on .
By definition, for all , the probability measure gives full weight to the subspace .
2.3. Two specific examples
When is smooth enough to ensure the well-posedness of the system of stochastic differential equations
[TABLE]
with independent standard -valued Brownian motions, the Gibbs measure of Definition 2.2 is related to the long time behaviour of the diffusion process . We first give two explicit examples of such processes, for which the energy functional satisfies Assumption (TI).
Example 2.3 (mv-model).
Given a smooth, nonnegative and even interaction potential , the energy functional
[TABLE]
leads to the McKean-Vlasov particle system without external potential
[TABLE]
This particle system arises, for instance, in the probabilistic approximation of the granular media equation [3, 4, 37, 14], for which the choice is of particular physical interest [6, 5].
Example 2.4 (rb-model).
In dimension , given a and nonnegative function such that
[TABLE]
the energy functional
[TABLE]
where denotes the cumulative distribution function of , is associated with the system of rank-based interacting diffusions
[TABLE]
where for all , denotes the order statistics of , and
[TABLE]
This particle system serves as a model for large equity markets, and is also related to the probabilistic interpretation of nonlinear scalar conservation laws [26, 42]. For the latter reason, we shall call a flux function.
Remark 2.5 (Intersection between both classes of models).
Taking and in the mv-model yields the energy functional
[TABLE]
so that this model coincides with the rb-model for .
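The coincidence of the two models can be checked numerically on empirical measures. The sketch below assumes dimension one, the interaction potential |x| with the pairwise energy normalised by a factor 1/2, and the flux function u(1-u) integrated against the empirical cumulative distribution function; these normalisations are assumptions of this illustration, and the names `pairwise_energy` and `rank_energy` are hypothetical.

```python
def pairwise_energy(xs):
    """(1/2) * double integral of |x - y| against the empirical measure of xs."""
    n = len(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2.0 * n * n)


def rank_energy(xs, flux=lambda u: u * (1.0 - u)):
    """Integral of flux(F(x)) dx, where F is the empirical CDF of xs.

    Between the k-th and (k+1)-th order statistics, F is constant equal to k/n.
    """
    n = len(xs)
    s = sorted(xs)
    return sum(flux(k / n) * (s[k] - s[k - 1]) for k in range(1, n))


# Hypothetical one-dimensional configuration.
positions = [0.3, -1.2, 2.5, 0.9, -0.4]
print(pairwise_energy(positions), rank_energy(positions))

# Both expressions are unchanged when all particles are shifted by the same amount.
shifted = [x + 7.0 for x in positions]
print(pairwise_energy(shifted), rank_energy(shifted))
```

The agreement of the two values reflects the fact that the sum of pairwise distances only depends on the gaps between consecutive order statistics, weighted by the number of pairs that each gap separates.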
For both the mv-model and the rb-model, it is quickly observed that the centre of mass of the system
[TABLE]
is a Brownian motion in , which prevents from converging to an equilibrium probability measure. Following the remark made in [37] for the mv-model and in [30, 40] for the rb-model, we define the diffusion process on the linear subspace by
[TABLE]
which describes the particle system seen from its centre of mass. Under the assumptions of Lemma 2.1, this process turns out to be reversible with respect to the Gibbs measure .
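The behaviour of the centre of mass and of the centered system can be observed on a simulation. The following is a rough Euler-Maruyama sketch, assuming dimension one, a quadratic interaction potential, a drift given by the average of the pairwise forces and a constant noise amplitude `sigma`; the scaling of the drift and of the noise, as well as the function name `simulate`, are assumptions of this illustration and may differ from the normalisation used in the article.

```python
import math
import random


def simulate(n=50, steps=2000, dt=1e-2, sigma=1.0, seed=0):
    """Euler-Maruyama scheme for n particles with quadratic mean-field attraction.

    With W(x) = x**2 / 2, the averaged pairwise force on particle i is
    -(x_i - mean), so the drift terms sum to zero over the particles.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    initial_mean = sum(x) / n
    noise_scale = sigma * math.sqrt(dt)
    for _ in range(steps):
        mean = sum(x) / n
        x = [xi - (xi - mean) * dt + noise_scale * rng.gauss(0.0, 1.0) for xi in x]
    final_mean = sum(x) / n
    centred = [xi - final_mean for xi in x]
    return initial_mean, final_mean, centred


start, end, centred = simulate()
print("centre of mass moved from", start, "to", end)
print("empirical variance of the centered system:", sum(c * c for c in centred) / len(centred))
```

Because the drift terms cancel in the average, the centre of mass is only driven by the averaged noise, while the centered coordinates feel a restoring force; setting `sigma=0.0` freezes the centre of mass entirely.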
2.4. Free energy and large deviations
Under the assumptions of Lemma 2.1, the central object of our study is the sequence of probability measures defined by
[TABLE]
which describe the distribution of the empirical measure of the particle system, seen from its centre of mass, at equilibrium. Notice that, for all , the restriction of to defines a continuous mapping from to either or , for any ; in particular, it is measurable for both the topology of weak convergence and the Wasserstein topology. As a consequence, for all , the probability measure is well-defined on both the Borel -field of and the Borel -field of .
In order to study the large deviations of the sequence , we first introduce the following two functionals on .
Definition 2.6 (Boltzmann’s entropy).
For all , we let
[TABLE]
if and has a density with respect to the Lebesgue measure on , and
[TABLE]
otherwise.
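As a sanity check on the sign convention, which by Remark 2.9 is opposite to the physical one, the integral of p log p can be evaluated numerically for a Gaussian density and compared with its closed form. This is a minimal sketch using a trapezoid rule on a truncated interval; the function names and the truncation bounds are assumptions of this illustration.

```python
import math


def gaussian_density(x, mean=0.0, std=1.0):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def boltzmann_entropy(density, lo, hi, num=20001):
    """Trapezoid approximation of the integral of p * log(p) over [lo, hi]."""
    h = (hi - lo) / (num - 1)
    total = 0.0
    for i in range(num):
        p = density(lo + i * h)
        term = p * math.log(p) if p > 0.0 else 0.0
        weight = 0.5 if i in (0, num - 1) else 1.0
        total += weight * term
    return total * h


std = 2.0
numeric = boltzmann_entropy(lambda x: gaussian_density(x, 0.0, std), -40.0, 40.0)
# For a Gaussian with standard deviation std in dimension one,
# the integral of p log p equals -(1/2) * log(2 * pi * e * std**2).
closed_form = -0.5 * math.log(2.0 * math.pi * math.e * std ** 2)
print(numeric, closed_form)
```

With this convention, spreading the mass out (a larger `std`) lowers the entropy, which is why the entropic term in the free energy competes with the attractive interaction.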
Remark 2.7 (On the moment condition).
The requirement that ensures that the negative part of is integrable [1, Remark 9.3.7, p. 212], and therefore ensures that is well-defined as an element of .
Definition 2.8 (Free energy).
The free energy associated with an energy functional is defined by
[TABLE]
for all .
Remark 2.9 (Physical free energy).
In statistical physics, is usually assigned the value , where is the Boltzmann constant and is the temperature, and Boltzmann’s entropy is rather defined by . Therefore, to be consistent with the classical definition of the free energy
[TABLE]
one should rather define the free energy to be worth . The difference with (14) merely lies in the multiplicative constant, and we shall keep the latter definition as it alleviates some computations throughout the article.
If the energy functional satisfies the assumptions of Lemma 2.1, the free energy possesses the following properties.
Lemma 2.10 (Bounds on the free energy).
Let be an energy functional satisfying the assumptions of Lemma 2.1.
- (i) There exists such that .
- (ii) is bounded from below on .
Remark 2.11.
It is easily checked that the uniform distribution on any compact set has a finite Boltzmann entropy, which by Assumption (F) yields the statement (i) of Lemma 2.10.
The statement (ii) of Lemma 2.10 is proved in Subsection 4.1.
Under the assumptions of Lemma 2.10, we may define
[TABLE]
This quantity is sometimes referred to as Gibbs’ free energy [20].
Before stating our first result, we introduce two further assumptions on the energy functional :
- (SH) subhomogeneity: for all , for all , ;
- (CC) chaos compatibility: for all , if is a sequence of independent random variables with identical distribution on some probability space , then
[TABLE]
Remark 2.12 (On Assumption (SH)).
Unlike the remainder of the assumptions, Assumption (SH) is quite technical and is only employed once in the article, namely in the proof of the exponential estimates of Lemma 4.4. It may certainly be replaced by a variety of other similar assumptions, as long as they yield the same exponential estimates, but we believe that the present formulation achieves a reasonable balance between the generality of the models that it covers, and the relative simplicity of the computations that it requires to prove Lemma 4.4.
Remark 2.13 (On Assumptions (CC) and (LSC)).
If is a sequence of independent random variables with identical distribution , then by the Glivenko-Cantelli Lemma, the empirical measure converges to in , -almost surely. As a consequence, Assumption (LSC) and Fatou’s Lemma yield
[TABLE]
so that Assumption (CC) merely involves the limit superior of .
We are now ready to state the first main result of the article. We recall that, on a metric space, a good rate function is a proper function with compact level sets, and refer to [22] for introductory material on large deviation principles.
Theorem 2.14 (LDP for in Wasserstein spaces).
Let be an energy functional satisfying Assumptions (TI), (F), (LSC), (GC), (SH) and (CC). If the index given by Assumption (GC) is such that , then for all , the sequence satisfies a large deviation principle on with good rate function
[TABLE]
Notice that the large deviation principle holds only in Wasserstein topologies with order strictly smaller than the index of Assumption (GC), which for the mv-model coincides with the order of polynomial growth of the interaction potential . Furthermore, since the Wasserstein topology is stronger than the topology of weak convergence, the Contraction Principle [22, Theorem 4.2.1, p. 126] implies that under the assumptions of Theorem 2.14, the large deviation principle for also holds on the space , with good rate function defined by
[TABLE]
As far as the role of the topology in the large deviation principle is concerned, a parallel can be drawn with Sanov’s Theorem. Indeed, let denote the law of the empirical measure of independent random variables in , with identical distribution , where has a density proportional to , with . The standard Sanov Theorem [22, Theorem 6.2.10, p. 263] asserts that the sequence satisfies a large deviation principle on , and it was proved by Wang, Wang and Wu [46] that, if , then the large deviation principle actually holds on , for — but not for .
Keeping the analogy between Sanov’s Theorem and Theorem 2.14 in mind, one may therefore wonder, if Assumption (GC) in the latter theorem is only satisfied with , whether the large deviation principle continues to hold on , with the rate function defined by (15), for want of holding in a Wasserstein topology. We show that the answer is negative, by exhibiting an example for which the level sets of the function fail to be compact on , which prevents the large deviation principle from holding. As should be clear from the example, this is related to the lack of continuity of the centering operator on .
Example 2.15 (Counter-example to Theorem 2.14 when ).
We assume that and take the energy functional
[TABLE]
of Remark 2.5. It will be checked in Subsection 5.2 that this energy functional satisfies the assumptions of Theorem 2.14, except that Assumption (GC) is only satisfied with . This in fact occurs for any instance of the rb-model, and not only for the case corresponding to the energy functional chosen here.
Let be the density of the standard Gaussian distribution on , and for all , let us define the density
[TABLE]
For all , the probability measure with density is centered, and we have
[TABLE]
due to the convexity of , while
[TABLE]
where we have used the triangle inequality twice. As a consequence, the collection is contained in a level set of the rate function defined by (15). But on the other hand, converges weakly, when vanishes, to the Gaussian distribution centered at , at which takes the value . Therefore the level sets of are not closed, hence not compact, in .
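The mechanism behind this counter-example, namely that the set of centered measures is not closed for weak convergence, can be reproduced on a simple family. The sketch below uses a hypothetical two-component Gaussian mixture, which is not claimed to be the density of Example 2.15: most of the mass sits around 1, and a small mass placed far to the left keeps the mean equal to zero. All names and the specific mixture are assumptions of this illustration.

```python
import math


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mixture(eps):
    """(weight, mean) pairs of a unit-variance Gaussian mixture with mean zero."""
    return [(1.0 - eps, 1.0), (eps, -(1.0 - eps) / eps)]


def mixture_mean(eps):
    return sum(w * m for w, m in mixture(eps))


def mixture_cdf(eps, x):
    return sum(w * std_normal_cdf(x - m) for w, m in mixture(eps))


def mixture_abs_moment(eps):
    """First absolute moment, using the mean of a folded unit-variance Gaussian."""
    def folded(m):
        return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * m * m) + m * math.erf(m / math.sqrt(2.0))
    return sum(w * folded(m) for w, m in mixture(eps))


for eps in (0.1, 0.01, 0.001):
    # The mean stays zero, the CDF approaches that of a Gaussian centered at 1,
    # and the first absolute moment remains bounded.
    gap = abs(mixture_cdf(eps, 1.0) - std_normal_cdf(0.0))
    print(eps, mixture_mean(eps), gap, mixture_abs_moment(eps))
```

Every member of the family is centered and has a uniformly bounded first absolute moment, yet the weak limit is a Gaussian centered at 1, which is not centered. The first moment does not converge to that of the limit, so the convergence fails in the Wasserstein topology of order one.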
2.5. Large deviations in the quotient space
Let us denote by the orbit space of the group action
[TABLE]
and define
[TABLE]
the associated orbit map. The space is endowed with the quotient topology, which is defined as the strongest topology making the map continuous. It is proved in Appendix A that this topology is metrisable.
If a functional on is translation invariant, then it is constant on orbits and we may define the functional on by
[TABLE]
for any . Under the assumptions of Lemma 2.10, the functionals , and are translation invariant, and it is immediate that
[TABLE]
For all , we define the probability measure
[TABLE]
on the Borel -field of . The next theorem is the second main result of this article.
Theorem 2.16 (LDP for in the quotient space).
Let be an energy functional satisfying Assumptions (TI), (F), (LSC), (GC), (SH) and (CC). The sequence satisfies a large deviation principle on with good rate function
[TABLE]
Of course, in the case , Theorem 2.16 can be obtained by contraction from Theorem 2.14, but we will not take advantage of this remark and we will rather prove both theorems simultaneously.
Remark 2.17 (Large deviations in ).
Let be a standard -valued Brownian motion, and consider the occupation measure
[TABLE]
Because of the lack of ergodicity of the Brownian motion, the large deviations of , when , are not covered by the standard Donsker-Varadhan theory. Recently, Mukherjee and Varadhan [38] introduced a suitable compactification of the space , in which a large deviation principle can be stated for the orbit of . This result also yields estimates on translation invariant functionals, such as probability measures on the space of sample paths with density proportional to
[TABLE]
with respect to the Wiener measure, for some interaction potential .
Although we rely on the same idea of working in the orbit space in order to compensate the lack of ergodicity of our original process, our topological construction is quite distinct. In particular, no compactification of the orbit space is required for Theorem 2.16 to hold.
2.6. Sketch of the proof of Theorems 2.14 and 2.16
Our two main theorems are proved simultaneously. In Section 3, we first state a large deviation principle for the law of the empirical measure of a system with energy functional and a confining functional , the magnitude of which depends on a small parameter . This result can be considered standard, and our proof closely follows the lines of [24, Theorem 1.5]. For consistency when vanishes, we choose the external potential associated with to grow as , where is given by Assumption (GC). As a result, the large deviation principle for holds on , and if , on for any .
By contraction, we then obtain large deviation principles for the respective pushforward measures and of by and , respectively on , and if , on for any . The end of the proof, detailed in Section 4, then consists in checking that, when vanishes, and provide sufficiently good approximations of and , at the level of large deviations. This part can be considered as the main original contribution of the article.
2.7. Large deviations for the mv-model and the rb-model
We come back to the specific examples of the mv-model and rb-model introduced in Subsection 2.3, and state large deviation principles for these models which come as corollaries of Theorems 2.14 and 2.16.
2.7.1. mv-model
Let be an interaction potential which possesses the decomposition
[TABLE]
where the functions and satisfy the following respective assumptions.
- (mv-) The function is even and lower semicontinuous on , and there exist and such that, for all , , and for all , for all , .
- (mv-) The function is even, continuous on and, with given by Assumption (mv-) on :
  - if , then is bounded;
  - if , then there exists such that is bounded on .
Any polynomial function of , with nonnegative but possibly fractional powers, degree larger than or equal to , and positive leading coefficient, satisfies this set of assumptions, up to renormalisation of the constant term in order to ensure nonnegativity. This is in particular the case of the cubic potential corresponding to the granular media equation [6, 5]. However, singular potentials such as those involved in the particle approximation of the Keller-Segel equation [28, 15], or in the study of Coulomb gases [35], do not satisfy our set of assumptions.
Corollary 2.18 (LDP for the mv-model).
Let be an interaction potential possessing the decomposition (17), with functions and satisfying the respective Assumptions (mv-) and (mv-). Let us define the energy functional by the identity (7). The sequence of associated probability measures is well-defined, and letting , be defined by (12) and (16), respectively, we have the following results.
- (i) The sequence satisfies a large deviation principle on with good rate function defined by Theorem 2.16.
- (ii) If the index of Assumptions (mv-) and (mv-) is such that , then for all , the sequence satisfies a large deviation principle on with good rate function defined by Theorem 2.14.
The proof of Corollary 2.18 is presented in Subsection 5.1. If , then the energy functional actually satisfies the assumptions of Theorems 2.14 and 2.16, so that the result of Corollary 2.18 is straightforward. The case is treated as a perturbation of the previous case, thanks to the Laplace-Varadhan Lemma.
2.7.2. rb-model
Let be a flux function satisfying the condition (8), which ensures that the energy functional defined by (9) is not identically equal to . It is known [42] that the condition
[TABLE]
which is called Oleinik’s entropy condition in the vocabulary of conservation laws, ensures the ergodicity of the centered particle system introduced in Subsection 2.3. The combination of (8) and (18) implies that , and the stronger condition
[TABLE]
which is called Lax’ entropy condition, generally ensures better ergodic properties of both the particle system and its mean-field limit [30, 32, 33, 43]. Notice that if is assumed to be concave, then Oleinik’s and Lax’ conditions are equivalent, and hold as soon as is not identically zero.
We shall check in Subsection 5.2 that this set of conditions implies that the energy functional satisfies the assumptions of Theorem 2.16, and in particular Assumption (GC) with , which allows us to define the sequence associated with and leads to the following result.
Corollary 2.19 (LDP for the rb-model).
Let be a flux function satisfying the conditions (8), (18) and (19). Let be the energy functional associated with by (9). The sequence associated with is well-defined, and it satisfies a large deviation principle on , with good rate function given by Theorem 2.16.
In mathematical finance, systems of rank-based interacting diffusions are employed to model the evolution of the logarithmic capitalisations of stocks on an equity market [26, 2, 33]. In Subsection 5.3, we present an application of Corollary 2.19 to the study of atypical capital distribution in this framework.
3. Large deviations with a small external potential
Throughout this section, is an energy functional satisfying Assumptions (TI), (F), (LSC), (GC) and (CC), and is the index given by Assumption (GC). We do not repeat these assumptions in the statements of our results.
We first introduce a few notations. For all , we define
[TABLE]
for all , and let
[TABLE]
for all , as well as
[TABLE]
for all . Let
[TABLE]
and let be the probability measure on with density with respect to the Lebesgue measure on .
3.1. Relative entropy and Sanov’s Theorem
We recall that the relative entropy of with respect to is defined by
[TABLE]
The following lemma is straightforward.
Lemma 3.1 (From Boltzmann’s entropy to relative entropy).
For all ,
[TABLE]
The identity (24) holds in , in the sense that if , then ; while if , then and are simultaneously finite or equal to .
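The passage from Boltzmann's entropy to relative entropy can be checked numerically on a generic instance. The sketch below takes a reference measure with density proportional to exp(-V), with the hypothetical choice V(x) = x**2 / 2 in dimension one, and verifies that the relative entropy equals the integral of p log p, plus the mean of V, plus the logarithm of the normalising constant. It does not reproduce the specific potential or temperature scaling of the identity (24), which are not assumed here.

```python
import math


def trapezoid(f, lo, hi, num=20001):
    h = (hi - lo) / (num - 1)
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, num - 1))
    return total * h


m, s = 0.7, 1.5  # hypothetical mean and standard deviation of the measure mu


def log_p(x):
    """Log-density of mu, a Gaussian with mean m and standard deviation s."""
    z = (x - m) / s
    return -0.5 * z * z - math.log(s * math.sqrt(2.0 * math.pi))


def potential(x):
    return 0.5 * x * x


log_z = 0.5 * math.log(2.0 * math.pi)  # log of the integral of exp(-V)


def log_q(x):
    """Log-density of the reference measure, proportional to exp(-V)."""
    return -potential(x) - log_z


lo, hi = -30.0, 30.0
relative_entropy = trapezoid(lambda x: math.exp(log_p(x)) * (log_p(x) - log_q(x)), lo, hi)
entropy = trapezoid(lambda x: math.exp(log_p(x)) * log_p(x), lo, hi)
mean_potential = trapezoid(lambda x: math.exp(log_p(x)) * potential(x), lo, hi)
print(relative_entropy, entropy + mean_potential + log_z)
```

Working with log-densities avoids taking the logarithm of underflowing values in the tails, and the closed form for two Gaussians, `-log(s) + (s**2 + m**2) / 2 - 1/2`, gives an independent check of both sides.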
With the notations introduced above, let us define
[TABLE]
Proposition 3.2 (Sanov’s Theorem).
For all , the sequence satisfies a large deviation principle on , with good rate function . If , the large deviation principle holds on for all , with the same rate function.
The statement of the large deviation principle on is the usual formulation of Sanov’s Theorem [22, Theorem 6.2.10, p. 263]. Its extension to is due to Wang, Wang and Wu [46].
3.2. Large deviations in the interacting case
Owing to Assumption (F), we have
[TABLE]
and we denote by the probability measure on with density
[TABLE]
with respect to the Lebesgue measure on . We finally let
[TABLE]
and define the free energy functional by
[TABLE]
Large deviation principles for equilibrium mean-field systems with an external potential may be considered to be standard results in the literature [36, 20, 16, 24]. We however give a complete proof of the next statement, which is adapted to our assumptions on the energy functional , and follows closely the arguments of Dupuis, Laschos and Ramanan [24, Theorem 1.5].
Proposition 3.3 (LDP for the sequence ).
For all , the sequence satisfies a large deviation principle on with good rate function
[TABLE]
where
[TABLE]
If , the large deviation principle holds on for all , with the same rate function.
Notice that the same arguments as in Remark 2.11 show that . On the other hand, combining Lemma 3.1 with (27) yields
[TABLE]
so that the nonnegativity of both the relative entropy and the energy functional ensures that .
We may now proceed to the proof of Proposition 3.3.
Proof.
The proof relies on the so-called weak convergence approach to large deviations developed by Dupuis and Ellis [23]. Throughout the proof, we use the notation to refer to either of the topological spaces or , if and . We recall that both spaces are Polish.
As a first step, we invoke [23, Theorem 1.2.3, p. 7] to reduce the proof of Proposition 3.3 to the verification of the following two facts:
- (i) the function has compact level sets on ;
- (ii) for any continuous and bounded functional , the Laplace principle
[TABLE]
holds.
Proof of (i). Using Lemma 3.1, we rewrite
[TABLE]
so that it suffices to show that has compact level sets. As a consequence of Proposition 3.2, is a good rate function on and therefore has compact level sets. Since the functional is nonnegative and satisfies Assumption (LSC), then any level set of is a closed subset of a level set of , and therefore is compact.
Reformulation of (28). Let us first remark that, on account of the definitions of and ,
[TABLE]
As a consequence, the prelimit in (28) can be rewritten as
[TABLE]
so that it suffices to compute the limit of the first term in the right-hand side, and deduce the limit of the second by taking . The computation of such quantities is typically the object of Varadhan’s Lemma, which cannot be directly applied here since the functional is not assumed to be continuous and bounded.
Lower bound in the Laplace principle. Using the fact that is bounded from below and satisfies Assumption (LSC), the combination of Proposition 3.2 with the variant of Varadhan’s Lemma [22, Lemma 4.3.6, p. 138] provides the lower bound
[TABLE]
Upper bound in the Laplace principle. In order to obtain an upper bound of the same order as (32), we first introduce a few notations. For all , we define
[TABLE]
and for all ,
[TABLE]
The function is measurable and bounded on , so that the representation formula [23, Proposition 1.4.2, p. 27] (or dually the Donsker-Varadhan variational characterisation of the relative entropy [23, Lemma 1.4.3, p. 29]) shows that, for all probability measures on ,
[TABLE]
where the definition of the relative entropy of probability measures on is the same as (23) for probability measures on . Using the trivial bound on the one hand, and the fact that since is bounded from above on , the Dominated Convergence Theorem yields
[TABLE]
on the other hand, we deduce that
[TABLE]
which rewrites
[TABLE]
Let , and let be such that
[TABLE]
We evaluate the right-hand side of (33) with . On the one hand, it is easily seen that
[TABLE]
while on the other hand,
[TABLE]
where are independent random variables in with identical distribution on some probability space . By Assumption (CC),
[TABLE]
whereas to justify the convergence of to , we now show that
[TABLE]
and conclude by the Dominated Convergence Theorem using the fact that is continuous and bounded on .
- If refers to the topological space , then (34) is the Glivenko-Cantelli Lemma.
- If and refers to the topological space , with , then by the strong Law of Large Numbers,
[TABLE]
which, combined with the Glivenko-Cantelli Lemma, implies the -almost sure convergence in of to [45, Definition 6.8, p. 96].
As a consequence, we finally get
[TABLE]
Conclusion of the proof. Letting in (35) and combining the latter inequality with (32), we conclude that
[TABLE]
so that, taking (31) into account,
[TABLE]
By (29), the right-hand side above rewrites , which yields (28) and completes the proof. ∎
3.3. The measures and
Let us define the functional by
[TABLE]
Notice that if and only if , and that is translation invariant.
For all , we define the probability measures and , respectively on the Borel -fields of the topological spaces and , for any , by the identities
[TABLE]
Since the operators and are continuous, the following result is obtained from Proposition 3.3 by means of the Contraction Principle [22, Theorem 4.2.1, p. 126].
Corollary 3.4 (LDP for and ).
For all , the sequence satisfies a large deviation principle on with good rate function
[TABLE]
In addition, if , then for all , for all , the sequence satisfies a large deviation principle on with good rate function
[TABLE]
3.4. Alternative expression for
We denote by the orthogonal projection of onto the subspace , and for all , we define the probability measure on by
[TABLE]
Notice that . We also define the function by the identity
[TABLE]
where for all , we denote by the corresponding element of .
Lemma 3.5 (Relation between and ).
For all ,
[TABLE]
and the probability measure defined by (37) possesses the density
[TABLE]
with respect to the Lebesgue measure on . Besides, the probability measure defined by (36) satisfies
[TABLE]
Proof.
Let be a Borel subset of . By (37) and (25),
[TABLE]
Any admits the orthogonal decomposition , with and for some . As a consequence, rewrites
[TABLE]
where we have used the fact that , thanks to Assumption (TI), and the definition (38) of . This shows (39) and the fact that possesses the density . Last, (40) follows from the elementary relation on . ∎
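As an illustration of the orthogonal decomposition used in the proof above, the following Python sketch centres a configuration of points at its barycentre and checks that the centred part is orthogonal to the remaining diagonal part. It assumes that the subspace onto which we project is the set of configurations with zero barycentre, in line with the description of the system seen from its centre of mass; the helper names `center` and `inner` are hypothetical and only serve this illustration.

```python
def center(config):
    """Split a configuration of n points in R^d into its centred part and its barycentre.

    Assumption: the projection subspace is the set of configurations
    whose coordinates sum to zero in each of the d directions.
    """
    n = len(config)
    d = len(config[0])
    bary = [sum(p[k] for p in config) / n for k in range(d)]
    centered = [[p[k] - bary[k] for k in range(d)] for p in config]
    return centered, bary


def inner(u, v):
    """Euclidean inner product of two configurations viewed as vectors of (R^d)^n."""
    return sum(a * b for p, q in zip(u, v) for a, b in zip(p, q))


# Three points in R^2.
config = [[0.0, 1.0], [2.0, 3.0], [4.0, -1.0]]
centered, bary = center(config)

# Diagonal part: the barycentre repeated n times.
diagonal = [list(bary) for _ in config]

# Translating every point by the same vector leaves the centred part unchanged.
shifted = [[p[0] + 5.0, p[1] - 2.0] for p in config]
centered_shifted, _ = center(shifted)
```

Here `inner(centered, diagonal)` vanishes, and `centered_shifted` coincides with `centered`, which is the translation invariance exploited throughout this section.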
In Section 4, we shall rely on the following bounds on the function .
Lemma 3.6 (Bounds on ).
Let and . For all ,
[TABLE]
where we recall the definition (22) of .
Proof.
The upper bound follows from the convexity inequality
[TABLE]
while the lower bound follows from Jensen’s inequality
[TABLE]
since . ∎
4. Proof of Theorems 2.14 and 2.16
This section is dedicated to the proof of the large deviation principles contained in Theorems 2.14 and 2.16. We first check in Subsection 4.1 that, under the respective assumptions of these theorems, the functionals and are good rate functions. In Subsection 4.2, we obtain auxiliary results on the respective approximation of and by the measures and introduced in Section 3. These results allow us to prove large deviation upper and lower bounds in Subsection 4.3, thereby completing the proof of Theorems 2.14 and 2.16.
4.1. Rate functions
The purpose of this subsection is to prove the following result.
Lemma 4.1 (Goodness of rate functions).
Under the assumptions of Lemma 2.10, the functional has compact level sets on , and if the index given by Assumption (GC) is such that , then for all , the functional has compact level sets on .
Combining the results of Lemmas 2.10 and 4.1, we conclude that, under the respective assumptions of Theorems 2.14 and 2.16, the functionals and are good rate functions, respectively on and . We first state an auxiliary result.
Lemma 4.2 (Level sets on ).
Under the assumptions of Lemma 2.10, for all , the set
[TABLE]
is closed in . Besides, letting be given by Assumption (GC), we have and there exists such that
[TABLE]
Proof.
Since, by Remark 2.7, neither nor can take the value , any satisfies and , which by Assumption (GC) ensures that .
Let us now fix and define . For all , we recall the definitions of and from Section 3. By the translation invariance of , Lemma 3.1 and the definition (21) of ,
[TABLE]
Using the fact that the relative entropy is nonnegative and then Assumption (GC), we deduce that
[TABLE]
so that taking and recalling that yields
[TABLE]
which provides (41).
In order to show that is closed in , let us take a sequence in , which converges to some in , and prove that
[TABLE]
which implies . As a first step, we note that, according to the first part of the proof, for all , which allows us to define and notice that ; besides, by (41), the sequence of -th order moments of is bounded. Since the functional is nonnegative, the sequence is also bounded. Denoting by the density of , we then obtain from standard arguments [29, pp. 7-8] the existence of a probability density toward which converges weakly in , at least along a subsequence, and such that
[TABLE]
where we denote by the probability measure with density . Finally, since the orbit map is continuous, the series of identities
[TABLE]
in implies that , whence the conclusion. ∎
The inequality (42) shows that is bounded from below on , which proves the statement (ii) of Lemma 2.10. We may now complete the proof of Lemma 4.1.
Proof of Lemma 4.1.
We fix and first prove that the set
[TABLE]
is compact in . By Lemma A.1, this set is closed if and only if is closed in , which is the case since is easily seen to coincide with the set of Lemma 4.2. We now proceed to show that this set is sequentially compact. Let be a sequence of elements of . By Lemma 4.2, for all there exists such that , and we have the moment control
[TABLE]
given by (41). Markov’s inequality implies that the sequence is tight, so that by Prohorov’s Theorem [7, Theorem 5.1, p. 59], it possesses a converging subsequence. The continuity of the map then ensures that the sequence possesses a converging subsequence as well, which shows the sequential compactness of . Since we prove in Lemma A.3 that the quotient topology on is metrisable, [22, Theorem B.2, p. 345] allows us to conclude that is compact and obtain the first part of Lemma 4.1.
We now assume that , fix , and prove that the set
[TABLE]
is compact in . Since the Wasserstein topology is stronger than the topology of weak convergence, Lemma 4.2 implies that is closed in , and therefore is closed in . Now for all sequences of elements of , the moment control (43) ensures that possesses a subsequence, that we still index by for convenience, which converges to some in . To prove that the convergence actually holds in , we remark that since , the moment control (43) also ensures the uniform integrability of the -th order moment of , so that by [45, Definition 6.8, p. 96], converges to in , therefore is sequentially compact in . By [22, Theorem B.2, p. 345] again, we conclude that is compact in , whence the second part of Lemma 4.1. ∎
4.2. Exponential comparisons
This subsection contains two auxiliary results which will be used in the proof of the large deviation upper and lower bounds.
Lemma 4.3 (Exponential tilting of ).
Let be an energy functional satisfying Assumptions (TI), (F), (LSC) and (GC), and let .
- (i) For all , for all Borel sets of ,
[TABLE]
and
[TABLE]
- (ii) For all Borel sets of ,
[TABLE]
and
[TABLE]
Proof.
We first address the proof of the identities (44) and (46). The equality (44) is a straightforward consequence of the definition (36) of . To check the validity of (46), we recall that the respective definitions (36) and (26) of and yield
[TABLE]
Besides, since for all ,
[TABLE]
we have on . Hence we may substitute with in (48) to obtain
[TABLE]
thanks to (37). This equality immediately leads to (46).
We now address the proof of (45) and (47). For all , for all Borel sets of , (12) yields
[TABLE]
so that (45) follows from Lemma 3.5. Likewise, for all Borel sets of , (47) is obtained by the same chain of arguments, but starting with (16) in place of (12). ∎
Lemma 4.4 (Exponential moment control).
Let be an energy functional satisfying Assumptions (TI), (F), (LSC), (GC) and (SH). For all ,
[TABLE]
Proof.
Let us fix and . The proof is divided into three steps.
Step 1. In this step, we construct , depending on , such that for all , there exists which depends on such that, for all , for all , if
[TABLE]
then
[TABLE]
We first rewrite (51) under the equivalent formulation
[TABLE]
On the one hand, the upper bound of Lemma 3.6 yields, for all ,
[TABLE]
with
[TABLE]
on the other hand, Assumption (GC) yields, for all ,
[TABLE]
We deduce that (51) holds as soon as
[TABLE]
With the latter condition at hand, let us define
[TABLE]
and notice that, for all ,
[TABLE]
so that there exists , depending on , such that, for all ,
[TABLE]
and therefore
[TABLE]
As a conclusion, for all , if satisfies (50) then (52) holds, which leads to (51).
Step 2. Let us fix and , where and are given by Step 1. In this step, we give an upper bound on
[TABLE]
by studying this integral separately on the domains
[TABLE]
and on its complement. By the upper bound of Lemma 3.6,
[TABLE]
On the other hand, using Lemma 3.5 and Step 1 we obtain the chain of inequalities
[TABLE]
By Assumption (SH), . We now derive a similar bound for . The definition (38) yields
[TABLE]
where we have performed the change of variable and used the fact that . Thus,
[TABLE]
thanks to the change of variable . Injecting this inequality at the end of (53), we obtain
[TABLE]
so that we may conclude this step by stating that
[TABLE]
Step 3. We complete the proof by studying the asymptotic behaviour of . By Step 2 and the standard asymptotic subadditivity argument,
[TABLE]
from which we then deduce that
[TABLE]
We may now complete the proof of (49) by letting vanish. ∎
4.3. Large deviation upper and lower bounds
In this subsection, we complete the proof of Theorems 2.14 and 2.16 by addressing the large deviation upper and lower bounds.
Lemma 4.5 (Large deviation upper bound).
Let be an energy functional satisfying Assumptions (TI), (F), (LSC), (GC), (SH) and (CC).
- (i) For all closed sets of ,
[TABLE]
- (ii) If the index given by Assumption (GC) is such that , then for all , for all closed sets of ,
[TABLE]
Proof.
We shall prove both statements at once. Let refer to either or , (resp. ) refer to either (resp. ) or (resp. ), and so on. By Lemma 4.3, for all , for all ,
[TABLE]
where refers to either or .
By (39) and the lower bound of Lemma 3.6,
[TABLE]
whence
[TABLE]
for any .
Let us now fix such that . By Hölder’s inequality, for all ,
[TABLE]
By Lemma 4.3, for all ,
[TABLE]
and by Corollary 3.4,
[TABLE]
with referring to either or . Using Lemma 4.4, we thus deduce that
[TABLE]
from which we deduce that
[TABLE]
thanks to Lemma 4.7 stated below. Since is arbitrarily close to , the proof is completed. ∎
Lemma 4.6 (Large deviation lower bound).
Let be an energy functional satisfying Assumptions (TI), (F), (LSC), (GC), (SH) and (CC).
- (i) For all open sets of ,
[TABLE]
- (ii) If the index given by Assumption (GC) is such that , then for all , for all open sets of ,
[TABLE]
Proof.
We shall prove both statements at once, and use the same shortcut notations as in the proof of Lemma 4.5. Once again, we start from the fact that satisfies the identity (54). Noting that
[TABLE]
and then using Lemma 4.4 with , we first obtain
[TABLE]
We now combine the lower bound of Lemma 3.6 with Lemma 4.3 to write
[TABLE]
from which we deduce that
[TABLE]
thanks to Corollary 3.4. The conclusion follows from the application of Lemma 4.7, which is stated below. ∎
Lemma 4.7 (Convergence of rate functions).
Under the assumptions of either Theorem 2.16 or Theorem 2.14, let (resp. , for ) refer to either (resp. ) or (resp. ). Then for any subset of either or ,
[TABLE]
Proof.
The functions and write
[TABLE]
with obvious notations for and , and
[TABLE]
Thus, it is sufficient to prove that
[TABLE]
The fact that immediately yields
[TABLE]
Furthermore, for any ,
[TABLE]
and letting vanish in both sides of the inequality yields
[TABLE]
so that taking the infimum of the right-hand side over yields
[TABLE]
which completes the proof. ∎
5. Application to McKean-Vlasov and rank-based models
5.1. mv-model
This subsection presents the proof of Corollary 2.18. We first assume that, in the decomposition (17), .
Lemma 5.1 (Case ).
Let be an interaction potential satisfying Assumption (mv-). Then the associated energy functional defined by (7) with satisfies Assumptions (TI), (F), (LSC), (GC), (SH) and (CC); besides, Assumption (GC) holds with the index given by Assumption (mv-).
Proof.
Assumptions (TI) and (F) are straightforward. The continuity of the mapping on , combined with the fact that, by Assumption (mv-), is nonnegative and lower semicontinuous, and Fatou’s Lemma, yield Assumption (LSC).
Let be given by Assumption (mv-). By (7), for all ,
[TABLE]
which, by the Fubini-Tonelli Theorem, implies that if . On the other hand, if , then by Jensen’s Inequality,
[TABLE]
so that
[TABLE]
and satisfies Assumption (GC).
Assumption (SH) is a straightforward consequence of Assumption (mv-).
We finally let and take a sequence of independent random variables on some probability space with identical distribution . For all ,
[TABLE]
which leads to Assumption (CC) and completes the proof. ∎
We now address the general case , with . We decompose the energy functional , defined by (7), as
[TABLE]
with obvious definitions for and . By Lemma 5.1 and Theorems 2.16 and 2.14, the sequences and associated with satisfy the large deviation principles of Corollary 2.18, with respective rate functions denoted by and . On the other hand,
[TABLE]
with an obvious definition for .
If , then by Assumption (mv-), is a bounded and continuous functional on , so that the application of the Laplace-Varadhan Lemma [25, Theorem II.7.2, p. 52] is straightforward and yields the first part of Corollary 2.18.
Remark 5.2 (On the Laplace-Varadhan Lemma).
The statement of the Laplace-Varadhan Lemma in [25, Theorem II.7.2, p. 52] requires the state space to be Polish, which is not proved for in the present article. However, a careful examination of the proof of this theorem shows that this assumption is in fact not necessary. More generally, we refer to [22, Section 4.3] for an exposition of Varadhan’s Lemma and various developments on regular (and in particular metric) topological spaces, which are not necessarily Polish.
Let us now assume that , and fix , where is given by Assumption (mv-). The functional is continuous on , but not necessarily bounded, so that following [25, Theorem II.7.2, p. 52] and [22, Lemma 4.3.8, p. 138], we shall check the exponential moment condition
[TABLE]
for some — in fact, since any multiple of also satisfies Assumption (mv-), this condition should hold for any .
Taking (55) for granted, the Laplace-Varadhan Lemma [25, Theorem II.7.2, p. 52] allows us to transfer the large deviation principle from to on , for any . This result is then extended to , for any , and to , by the use of the Contraction Principle [22, Theorem 4.2.1, p. 126], which completes the proof of Corollary 2.18.
Proof of (55).
The argument is similar to the proof of Lemma 4.4. Let us fix , and rewrite
[TABLE]
Assumption (mv-) and Jensen’s Inequality imply that there exists such that, for all ,
[TABLE]
By Hölder’s Inequality and Assumption (mv-),
[TABLE]
so that
[TABLE]
and for any , there exists such that, for all , for all , the condition
[TABLE]
implies that
[TABLE]
Studying the integral in the numerator of the right-hand side of (56) separately on the domains and on its complement, we get the bound
[TABLE]
and the same change of variable as in the proof of Lemma 4.4 allows us to complete the proof of (55). ∎
5.2. rb-model
The next lemma allows us to deduce Corollary 2.19 from a straightforward application of Theorem 2.16.
Lemma 5.3 (Assumptions of Theorem 2.16 for the rb-model).
If the flux function satisfies the assumptions of Corollary 2.19, then the associated energy functional defined by (9) satisfies Assumptions (TI), (F), (LSC), (GC) with , (SH) and (CC).
Proof.
Assumptions (TI) and (F) are straightforward.
To check Assumption (LSC), we recall that the weak convergence of probability measures implies the convergence -almost everywhere of their cumulative distribution functions, so that the lower semicontinuity of follows from Fatou’s Lemma and the fact that is continuous and nonnegative.
Let us now define
[TABLE]
The combination of the conditions (8), (18) and (19) implies that . As a consequence, for all ,
[TABLE]
which by Lemma 5.1 implies Assumption (GC) with .
Assumption (SH) follows from the remark that
[TABLE]
so that .
We finally let and take a sequence of independent random variables on some probability space with identical distribution . If , then by Remark 2.13,
[TABLE]
On the contrary, if , let us write
[TABLE]
where is a short notation for . Denoting , we get
[TABLE]
where is the Wasserstein distance of order . That this distance coincides with the distance of cumulative distribution functions is a specific feature of probability measures on the real line, see [8, Theorem 2.9, p. 16]. By [8, Theorem 2.14, p. 20], converges to 0 when grows to infinity, which shows that satisfies Assumption (CC). ∎
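The identity between the Wasserstein distance of order 1 and the L1 distance of cumulative distribution functions on the real line, used at the end of the proof above, can be checked numerically on empirical measures. The following Python sketch is only an illustration, under the assumption that both measures are empirical measures with the same number of atoms; the function names `w1_sorted` and `w1_cdf` are hypothetical.

```python
from bisect import bisect_right


def w1_sorted(xs, ys):
    """W1 between two empirical measures with equally many atoms.

    On the real line the monotone coupling is optimal, so it suffices
    to pair the order statistics of both samples.
    """
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def w1_cdf(xs, ys):
    """L1 distance between the two empirical CDFs.

    Both CDFs are piecewise constant, so the integral of |F - G| is
    computed exactly interval by interval on the merged set of atoms.
    """
    xs, ys = sorted(xs), sorted(ys)
    grid = sorted(set(xs) | set(ys))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        f = bisect_right(xs, left) / len(xs)  # F on [left, right)
        g = bisect_right(ys, left) / len(ys)  # G on [left, right)
        total += abs(f - g) * (right - left)
    return total


xs = [0.0, 1.0, 3.0]
ys = [0.5, 2.0, 2.5]
```

On this example both `w1_sorted(xs, ys)` and `w1_cdf(xs, ys)` equal 2/3 up to rounding, in agreement with the identity.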
5.3. Application to the study of atypical capital distribution
It was proved in [43] that under the assumptions of Corollary 2.19, converges weakly, on for any , to the Dirac mass , where is the unique centered stationary measure of the nonlinear diffusion process describing the mean-field limit of (10) — we refer to [44, 32, 21] for associated propagation of chaos results in the space of sample-paths. This measure satisfies the stationary nonlinear Fokker-Planck equation
[TABLE]
which implies that it possesses a density with respect to the Lebesgue measure on , which solves the fixed point relation
[TABLE]
As a consequence, if we let be a random vector with distribution , and be independent random variables with identical distribution , then and satisfy the same weak law of large numbers, and converge to . However, the large deviations of these random empirical measures are respectively described by Corollary 2.19 (in the orbit space ), and by Sanov’s Theorem. We first examine the difference between the associated rate functions, and then detail an application of this result to the estimation of the probability of atypical capital distribution in the context of Stochastic Portfolio Theory.
5.3.1. Difference between rate functions
With the notations introduced above, let us define
[TABLE]
By Sanov’s Theorem and the Contraction Principle, the sequence satisfies a large deviation principle on , with good rate function
[TABLE]
Lemma 5.4 (Comparison of rate functions).
Under the assumptions of Corollary 2.19, we have, for all , for all such that ,
[TABLE]
where
[TABLE]
As a consequence, if is concave, then for all ,
[TABLE]
Proof.
As a preliminary remark, we observe that the convergence of to implies that converges weakly to , with , on . Combining this weak law of large numbers with Corollary 2.19, we get that is the unique zero of , and therefore that
[TABLE]
Notice that (57) is the optimality condition associated with the definition of .
We now let and be such that . If , then by Assumption (GC), so that ; besides, it is known that has exponential tails [32, 33] so that . Likewise, if is not absolutely continuous with respect to the Lebesgue measure on , then both and are infinite.
We now assume that and has a density with respect to the Lebesgue measure, and write
[TABLE]
Besides,
[TABLE]
By (58),
[TABLE]
which after the use of Fubini’s Theorem yields
[TABLE]
and leads to (59).
If is concave, then for all , so that, for all , (59) yields
[TABLE]
for all such that . Taking the infimum over of the right-hand side results in (60) and completes the proof. ∎
5.3.2. Capital distribution curves
In the framework of Stochastic Portfolio Theory [26, 2], systems of rank-based interacting diffusions of the form (10) serve as first-order approximations of stable equity markets, in the sense that on a market with companies, the process provides a good representation of the behaviour of the logarithmic capitalisation of the -th company. Thus, the proportion of the total capital held by this company is given by its market weight
[TABLE]
Using the reverse order statistics notation
[TABLE]
the capital distribution curve is defined as the log-log plot of the mapping and summarises in which manner the whole capital of a market is spread among the companies.
Notice that the market weights are invariant by translation of , so that the vector (and therefore the associated capital distribution curve) only depends on . Besides, empirical studies (see for instance [26, Figure 5.1]) show that the capital distribution curves are remarkably stable over long times. These remarks suggest studying the statistical distribution of the vector under the probability measure [2, 17, 33].
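The construction of the capital distribution curve and its invariance by translation can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual definition of the market weight of a company as the exponential of its logarithmic capitalisation divided by the sum of these exponentials over all companies; the function names and the numerical values are hypothetical.

```python
import math


def market_weights(log_caps):
    """Market weights exp(x_i) / sum_j exp(x_j), computed stably.

    Subtracting the maximum before exponentiating does not change the
    weights, which is precisely the translation invariance.
    """
    m = max(log_caps)
    exps = [math.exp(x - m) for x in log_caps]
    total = sum(exps)
    return [e / total for e in exps]


def capital_distribution_curve(log_caps):
    """Points (log k, log of the k-th largest market weight) of the log-log plot."""
    ranked = sorted(market_weights(log_caps), reverse=True)
    return [(math.log(k), math.log(w)) for k, w in enumerate(ranked, start=1)]


# Hypothetical logarithmic capitalisations of four companies.
log_caps = [2.0, 0.5, 1.0, -1.0]
# The same market after a common translation of all log-capitalisations.
shifted = [x + 10.0 for x in log_caps]

curve = capital_distribution_curve(log_caps)
curve_shifted = capital_distribution_curve(shifted)
```

The two curves `curve` and `curve_shifted` coincide, so the curve only depends on the configuration of logarithmic capitalisations up to translation.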
When grows to infinity, the law of large numbers for prescribes a deterministic form for the (suitably rescaled) capital distribution curve, which was observed to fit empirical data in [33]. This defines a distribution of capital which we call typical. If one wants to study the capital distribution without having to sample the high-dimensional vector from the distribution , then the discussion above shows that using independent random variables identically distributed according to as a surrogate model provides correct results concerning this typical behaviour. Such a surrogate model was for instance employed in [33] to evaluate the performance of diversity-weighted portfolios, and in [11, Section 3] to study hitting times and rank-rank correlations.
On the contrary, Lemma 5.4 shows that the fluctuations of the capital distribution far from its typical behaviour, due to finite-size effects, and which are described by the large deviations of , are not correctly captured by the surrogate model in general. In short: both sequences and concentrate around , but their rate functions differ. On a more quantitative level, if the flux function is concave, then at the level of large deviations, the probability of an atypical distribution of the capital is always underestimated by (the surrogate model) with respect to . In other words, the interaction between the stocks increases the probability of an atypical capital distribution.
For similar works on the study of the fluctuations of mean-field rank-based interacting diffusions around, or far from, their typical behaviour, we refer to the respective works by Kolli and Shkolnikov [34], and Dembo, Shkolnikov, Varadhan and Zeitouni [21]. We also mention that inequalities between rate functions for sequences of probability measures having the same law of large numbers, such as in Lemma 5.4, naturally provide comparisons between asymptotic variances in Monte-Carlo numerical methods. For more details in this direction, we refer to the work by Rey-Bellet and Spiliopoulos [41] and the references therein.
Appendix A Metrisability of the quotient topology on
By definition, the quotient topology on is the strongest topology making the orbit map continuous. The purpose of this appendix is to prove that this topology is metrisable, which is in general not the case for quotient topologies.
We first note that the definition of the quotient topology implies the following characterisation of open and closed sets.
Lemma A.1 (Open and closed sets in ).
A subset of is open (respectively closed) if and only if the set is open (respectively closed) in .
Our construction of a metric on is based on the Prohorov metric on , which following [7, Theorem 6.9, p. 74] can be defined by
[TABLE]
where the infimum is taken over all the couplings of and . We recall that a sequence of probability measures converges weakly to in if and only if converges to 0, so that the metric topology associated with the Prohorov metric coincides with the topology of weak convergence [7, Theorem 6.8, p. 73].
The following property of the Prohorov metric is immediate.
Lemma A.2 (Translation invariance of the Prohorov metric).
For all , for all ,
[TABLE]
For all , let us define
[TABLE]
For any such that and , it is a consequence of Lemma A.2 that rewrites
[TABLE]
Lemma A.3 (Metrisability of ).
The function is a metric on , and the associated metric topology is the same as the quotient topology.
We call the quotient Prohorov metric.
Proof.
It is obvious that is symmetric. To show that it satisfies the triangle inequality, we take and fix such that , and . By (62) and the triangle inequality for , for all ,
[TABLE]
so that taking the infimum of the right-hand side of the inequality over and using (63), we obtain
[TABLE]
We now take such that . Let such that and . By (63), for all there exists such that
[TABLE]
therefore converges to . By Ulam’s Theorem [7, Theorem 1.3, p. 8], is tight, hence there exists a centered ball , , such that
[TABLE]
Likewise, by Prohorov’s Theorem [7, Theorem 5.2, p. 60], the family is tight, so that there exists such that
[TABLE]
Assume that there exists an extracted sequence such that diverges to : then for large enough, the balls and are disjoint, so that the combination of (64) and (65) yields
[TABLE]
which is absurd. As a consequence, the sequence is bounded and therefore possesses a converging subsequence, that we still index by for convenience, and the limit of which is denoted . Using the continuity of the mapping , we get
[TABLE]
which implies that and completes the proof that is a metric.
As an immediate consequence of the definition (62) of , we have the inequality
[TABLE]
which implies that is continuous for the metric topology induced on by , so by definition of the quotient topology, the latter is stronger than the former. Now let be an open set in the quotient topology. By the definition of the quotient topology, the set is open in , so that for all , there exists such that , whence
[TABLE]
Since for any and , we may rewrite
[TABLE]
Introducing the notation
[TABLE]
we deduce from (63) that, for all ,
[TABLE]
so that
[TABLE]
As a consequence,
[TABLE]
therefore is an open set in the metric topology and the proof is completed. ∎
Acknowledgements
This work was motivated by several discussions with Freddy Bouchet on the large deviations of mean-field particle systems. The author is grateful to Cyril Labbé for his careful reading of this manuscript, and thanks the referee for correcting a mistake in the proof of Lemma A.3.
- [1] L. Ambrosio, N. Gigli, and G. Savaré. Gradient flows in metric spaces and in the space of probability measures. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel, second edition, 2008.
- [2] A. D. Banner, R. Fernholz, and I. Karatzas. Atlas models of equity markets. Ann. Appl. Probab., 15(4):2296–2330, 2005.
- [3] S. Benachour, B. Roynette, D. Talay, and P. Vallois. Nonlinear self-stabilizing processes. I. Existence, invariant probability, propagation of chaos. Stochastic Process. Appl., 75(2):173–201, 1998.
- [4] S. Benachour, B. Roynette, and P. Vallois. Nonlinear self-stabilizing processes. II. Convergence to invariant probability. Stochastic Process. Appl., 75(2):203–224, 1998.
- [5] D. Benedetto, E. Caglioti, J. A. Carrillo, and M. Pulvirenti. A non-Maxwellian steady distribution for one-dimensional granular media. J. Statist. Phys., 91(5-6):979–990, 1998.
- [6] D. Benedetto, E. Caglioti, and M. Pulvirenti. A kinetic equation for granular media. RAIRO Modél. Math. Anal. Numér., 31(5):615–641, 1997.
- [7] P. Billingsley. Convergence of probability measures. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons Inc., New York, second edition, 1999. A Wiley-Interscience Publication.
- [8] S. Bobkov and M. Ledoux. One-dimensional empirical measures, order statistics and Kantorovich transport distances. To appear in Mem. Amer. Math. Soc.
