An Introduction to Large Deviations and Equilibrium Statistical Mechanics for Turbulent Flows
Corentin Herbert

TL;DR
This paper introduces how large deviation theory and equilibrium statistical mechanics can be used to understand and predict the emergence of organized coherent structures in two-dimensional turbulent and geophysical flows.
Contribution
It applies principles of equilibrium statistical mechanics and large deviation theory to analyze and predict large-scale energy condensation and coherent structures in turbulent flows.
Findings
Energy condenses at large scales in 2D turbulence
Coherent structures can be predicted using statistical mechanics
Large deviation theory provides a framework for flow analysis
Abstract
Two-dimensional turbulent flows, and to some extent, geophysical flows, are systems with a large number of degrees of freedom, which, albeit fluctuating, exhibit some degree of organization: coherent structures emerge spontaneously at large scales. In this short course, we show how the principles of equilibrium statistical mechanics apply to this problem and predict the condensation of energy at large scales and allow for computing the resulting coherent structures. We focus on the structure of the theory using the language of large deviation theory.
Corentin Herbert, National Center for Atmospheric Research, Boulder, CO 80307, USA. Email: [email protected]
1 Introduction
Various characterizations of turbulent flows can be encountered; the components they usually entail are a chaotic dynamics on a strange attractor RuelleBook1989 , a large range of scales (i.e. a large number of degrees of freedom), and strong nonlinear effects due to the prevalence of inertia over molecular dissipation FalkovichBook . Such flows can be found in industrial problems, but also in nature, for instance in geophysical flows and astrophysical flows. The above mentioned properties typically mean that not much can be said about the system in a deterministic framework, and that one should try instead to predict statistical properties.
This is exactly the purpose of the field of statistical mechanics: given a dynamical system (or set of ordinary or partial differential equations) in a large phase space (the microscopic state), can we predict typical values for specific functions on phase space (the macroscopic observables) without knowing the exact trajectory in phase space? For a large class of systems, said to be in equilibrium, such typical values can be obtained by assuming that the microscopic variables are random and distributed according to probability measures built upon a few macroscopic quantities, the invariants of the dynamical system. A classical example is that of the ideal gas: the exact positions and velocities of the molecules matter little to us, but knowing the relations between macroscopic quantities such as temperature, pressure, energy and entropy is fundamental.
The ideas of statistical mechanics have been applied successfully to a large number of models of physical phenomena. An example of achievement of this approach is the theory of phase transitions, in which systems such as the Ising model, a toy-model of ferromagnetism, have been instrumental. However, turbulent flows present a number of difficulties: (i) they are directly formulated as continuous fields (infinite number of degrees of freedom) and have an infinity of conserved quantities, (ii) the interactions between constituents have a long range, (iii) in many practical applications, the system is driven out of equilibrium by external forces.
Although we shall not tackle issue (iii) at all in this chapter, we will try to show how (i) and (ii) are actually useful ingredients to make probabilistic predictions for the system. They are the cornerstones of a mean-field theory: interacting degrees of freedom can be treated as statistically independent random variables in the limit of a large number of degrees of freedom. A natural language to express these properties is that of large deviations theory LanfordBook ; RuelleBook ; EllisBook : the probability of the outcome of a given observable concentrates exponentially around a set of values when the size of the system goes to infinity. The focus of the chapter is on the presentation of the large deviation principles for carefully chosen observables for a discretized form of 2D turbulence. To show that the principles at work are very general, we shall underline the connection with simpler models such as variants of the Ising model of ferromagnetism. Although it is shown that the theory allows us to compute the equilibrium states of the system, we shall not dwell on the description of such equilibrium states; the reader is referred to the review articles Bouchet2012 ; Lucarini2014 on this topic. We shall also refrain from discussing the connections with earlier applications of statistical mechanics, like the point vortex approach of Onsager, reviewed in Eyink2006 , or the Kraichnan approach to Galerkin truncated flows Kraichnan1980 , only mentioned briefly in section 3.6.
These notes are based on lectures given at the Stochastic Equations for Complex Systems: Theory and Applications summer school organized at the University of Wyoming in June 2014. They mostly serve a pedagogical purpose, and we shall not give proofs of the results with the required mathematical rigor. However, we have tried as much as possible to provide the original references for the interested readers. The presentation adopted here owes much to the references Touchette2009 ; Bouchet2010 ; Potters2013 . Note that the ideas discussed here are applicable to many other systems with long range interactions DauxoisLRIbook ; Campa2009 and in particular gravitational systems Padmanabhan1990 ; Chavanis2006h , plasmas, cold atoms or toy models of statistical physics.
2 Models of turbulent flows
2.1 3D and 2D hydrodynamics
We are mainly interested here in the behavior of incompressible fluid flows, which is governed by the Navier-Stokes equations:
\[
\partial_t \mathbf{u} + \mathbf{u}\cdot\nabla\mathbf{u} = -\nabla p + \nu \Delta \mathbf{u}, \qquad \nabla\cdot\mathbf{u} = 0,
\]
where $\mathbf{u}$ is the velocity field, $p$ the pressure and $\nu$ the viscosity. The equations can be recast into non-dimensional form by introducing a velocity scale $U$, a length scale $L$, the corresponding time scale or eddy turnover time $T = L/U$, and the Reynolds number $Re = UL/\nu$. In other words, the Reynolds number measures the ratio of the nonlinear term and the dissipative term, or equivalently, of inertia and viscosity LandauFluidBook ; FalkovichBook . Since viscosity acts at small scales, it is also a measure of the range of scales characteristic of the flow: the smallest scale is the Kolmogorov scale $\eta = (\nu^3/\epsilon)^{1/4}$, where $\epsilon$ is the energy dissipation rate. Now, with $\epsilon \sim U^3/L$, we obtain $L/\eta \sim Re^{3/4}$. Hence, the effective number of degrees of freedom in 3D flows grows as $(L/\eta)^3 \sim Re^{9/4}$: flows with very large Reynolds numbers are not uncommon in nature (the atmosphere and the ocean for instance), leading to a very large typical number of degrees of freedom.
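The scaling argument above can be turned into a quick order-of-magnitude estimate. The following is a minimal sketch, assuming all dimensionless prefactors are equal to one; the function name `kolmogorov_estimates` is hypothetical and only serves the illustration.

```python
def kolmogorov_estimates(reynolds: float) -> dict:
    """Order-of-magnitude estimates from Kolmogorov scaling (prefactors set to 1)."""
    if reynolds <= 0:
        raise ValueError("Reynolds number must be positive")
    scale_ratio = reynolds ** 0.75   # L / eta ~ Re^(3/4)
    dof_3d = scale_ratio ** 3        # (L / eta)^3 ~ Re^(9/4)
    return {"scale_ratio": scale_ratio, "dof_3d": dof_3d}


# Illustrative Reynolds numbers only; these are not values quoted in the text.
for reynolds_number in (1e3, 1e6, 1e9):
    est = kolmogorov_estimates(reynolds_number)
    print(f"Re = {reynolds_number:.0e}: L/eta ~ {est['scale_ratio']:.1e}, "
          f"degrees of freedom ~ {est['dof_3d']:.1e}")
```

The steep exponent 9/4 is what makes direct deterministic treatment impractical and motivates a statistical description.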
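The Navier-Stokes equations can be recast in terms of the vorticity field $\boldsymbol{\omega} = \nabla\times\mathbf{u}$:
\[
\partial_t \boldsymbol{\omega} + \mathbf{u}\cdot\nabla\boldsymbol{\omega} = \boldsymbol{\omega}\cdot\nabla\mathbf{u} + \nu\Delta\boldsymbol{\omega}.
\]
In 2D, the vorticity reduces to a scalar field $\omega$ (its component normal to the plane of the flow) and the vortex stretching term $\boldsymbol{\omega}\cdot\nabla\mathbf{u}$ vanishes identically, so that
\[
\partial_t \omega + \mathbf{u}\cdot\nabla\omega = \nu\Delta\omega.
\]
In the inviscid limit, vorticity is thus simply advected by the flow.
This difference between 2D and 3D flows has important consequences for their respective behavior. While 3D flows tend to transfer energy from the large scales to the small scales, where it is dissipated by viscosity, in a process referred to as a direct energy cascade Kolmogorov1941a ; FrischBook (big vortices break up into smaller and smaller vortices), 2D flows, on the contrary, transfer energy from the small scales to the large scales, and this is called an inverse energy cascade Kraichnan1980 ; Sommeria2001b ; Tabeling2002 ; Boffetta2012 . In this inverse cascade process, vortices merge to form larger and larger vortices McWilliams1984 . Unless sufficient large scale dissipation (e.g. bottom friction) is present, the energy piles up at the largest available scales, forming a condensate which dominates the flow LSmith1993 ; Chertkov2007 ; Boffetta2012 .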
The physical problem we are interested in here is the inverse energy cascade and the emergence of large scale coherent structures.
2.2 Global invariants
The equations of motion for 3D and 2D hydrodynamics have a Hamiltonian structure, although non-canonical: there exists a Poisson structure, but it is degenerate OlverBook . This degeneracy leads to the existence of invariants, described in this section.
Inviscid 3D flows have two global invariants, the energy and the helicity Serre1984 :
\[
E = \frac{1}{2}\int_{\mathcal{D}} \mathbf{u}^2 \, d\mathbf{r}, \qquad H = \int_{\mathcal{D}} \mathbf{u}\cdot\boldsymbol{\omega} \, d\mathbf{r}.
\]
Helicity being sign indefinite, it does not in general constrain the nonlinear transfers sufficiently to hamper the direct energy cascade process Kraichnan1973 (see however, Biferale2012 ; Herbert2014a for particular cases). On the contrary, in 2D, vorticity conservation along fluid particle trajectories leads to a family of invariants in addition to the energy ($\psi$ being the stream function, defined by $\omega = \Delta\psi$):
\[
E = -\frac{1}{2}\int_{\mathcal{D}} \omega\psi \, d\mathbf{r}, \qquad \mathcal{C}_f = \int_{\mathcal{D}} f(\omega(\mathbf{r})) \, d\mathbf{r},
\]
where $f$ is an arbitrary function; the invariants $\mathcal{C}_f$ are the Casimir invariants. As a particular case, all the moments (or $L^n$ norms) of the vorticity field are conserved:
\[
\Gamma_n = \int_{\mathcal{D}} \omega(\mathbf{r})^n \, d\mathbf{r},
\]
including the $L^2$ norm of the vorticity field, $\Gamma_2$, referred to as the enstrophy. It was anticipated early on Kraichnan1967 ; Leith1968 ; Batchelor1969 that the existence of a second, positive-definite, quadratic invariant, in addition to the energy, is sufficient to invert the direction of the energy cascade. The basic idea is that enstrophy is stronger in the presence of small-scale activity: transferring energy towards the small scales while keeping the total energy constant cannot be done if we also need to conserve enstrophy. This loose statement was made more precise by a number of analytic arguments Fjortoft1953 ; Kraichnan1967 ; Leith1968 ; Batchelor1969 ; Merilees1975 ; NazarenkoBook , and verified in experiments Paret1997 ; Rutgers1998 and high-resolution numerical simulations Boffetta2012 . Statistical mechanics provides one of these analytical arguments (see section 3.6).
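The constraint imposed by the joint conservation of energy and enstrophy can be illustrated by a standard three-mode argument of the Fjortoft type. The sketch below is not taken from the text: it assumes a spectral description in which a mode of wavenumber $k$ carrying energy $E(k)$ carries enstrophy $k^2 E(k)$ (up to a normalization convention), and three modes with wavenumbers $k$, $2k$, $3k$ exchanging energy.

```latex
% Assumed notation: E(k) is the energy spectrum, k^2 E(k) the enstrophy spectrum.
E = \int E(k)\,dk, \qquad Z = \int k^2 E(k)\,dk .
% Energy and enstrophy conservation for modes k_1 = k, k_2 = 2k, k_3 = 3k:
\delta E_1 + \delta E_2 + \delta E_3 = 0, \qquad
k^2\,\delta E_1 + 4k^2\,\delta E_2 + 9k^2\,\delta E_3 = 0
\;\Longrightarrow\;
\delta E_1 = -\tfrac{5}{8}\,\delta E_2, \qquad
\delta E_3 = -\tfrac{3}{8}\,\delta E_2 .
```

If the intermediate mode loses energy ($\delta E_2 < 0$), a larger share (5/8) goes to the larger scale than to the smaller scale (3/8), which is the mechanism behind the inverse cascade in this simplified setting.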
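Conservation of the Casimir invariants can be formulated equivalently in terms of the moments of the vorticity field, as above, or in terms of the vorticity distribution. Indeed, the fraction of the domain area $\gamma(\sigma)\,d\sigma$ occupied by the vorticity level $\sigma$, which can be written as
\[
\gamma(\sigma) = \frac{1}{|\mathcal{D}|}\int_{\mathcal{D}} \delta(\omega(\mathbf{r}) - \sigma) \, d\mathbf{r},
\]
is conserved. We shall see that this form is particularly convenient in section 3, but note that the two formulations are connected by the formula
\[
\Gamma_n = |\mathcal{D}| \int \sigma^n \gamma(\sigma) \, d\sigma,
\]
where $\Gamma_n = \int_{\mathcal{D}} \omega^n \, d\mathbf{r}$ denotes the $n$-th moment of the vorticity field.
Finally, note that the vorticity distribution is normalized:
\[
\int \gamma(\sigma) \, d\sigma = 1.
\]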
2.3 Geophysical flows
Although 2D flows are interesting in themselves, part of the motivation for studying them comes from their common features with geophysical flows. Indeed, in addition to the small aspect ratio of the atmosphere and the ocean, their dynamics is subject to the effect of strong rotation and density stratification. These properties allow for an asymptotic regime which describes well the large-scale dynamics, the quasi-geostrophic regime VallisBook . This regime is very similar to 2D flows, because it reduces to a quantity, called potential vorticity, being advected by the flow, similarly to the vorticity (see (4)). In particular, the velocity field is purely horizontal. The only difference is that the fields also depend on the vertical coordinate, and whereas the vorticity is the Laplacian of the stream function in 2D, $\omega = \Delta\psi$, here the potential vorticity is related to the stream function by a slightly more complicated linear differential operator. The existence of Casimir invariants similar to those of 2D flows leads again to an inverse cascade of energy and the formation of coherent structures at large scales Charney1971 ; Rhines1979 ; SalmonBook . Therefore, the considerations presented here may apply to such flows as well, and attempts to extend the theory in this context have flourished over the past few years. For the sake of simplicity, we shall restrict ourselves here to the case of 2D flows; the interested reader may consult the literature on extensions to quasi-geostrophic flows in the barotropic case Bouchet2002 ; Naso2011 ; Venaille2011b ; Herbert2012b , the baroclinic case DiBattista2001a ; Venaille2012a ; Venaille2012b ; Herbert2014b , shallow-water equations Chavanis2002a ; Chavanis2006b , as well as the general references MajdaWangBook ; Bouchet2012 ; Lucarini2014 , for instance.
The quasi-geostrophic regime breaks down at smaller scales, and we enter an intermediate regime, often referred to as stratified turbulence Lilly1983 . In this regime, even though we can still define a potential vorticity which is a Lagrangian invariant, it does not put as strong a constraint on the system as in the 2D case. Indeed, the fields (velocity, density) can be decomposed into a balanced part which contributes to potential vorticity, and inertia-gravity waves, which do not. As a result, the organization of the system in terms of inertial ranges and energy cascades is not so simple. High-resolution numerical simulations have indicated the existence of two inertial ranges with a constant and opposite flux of energy Pouquet2013 . A possible interpretation is that the vortical modes are responsible for the inverse cascade of energy while the inertia-gravity waves have to do with the direct energy cascade. This interpretation is supported by a statistical mechanics argument Herbert2014c , which is an adaptation of the Kraichnan argument (see section 3.6) in the context of the restricted canonical ensemble Penrose1979 .
Independently of the constraining effect of rotation and stratification (which can be seen as forces breaking isotropy), another direction of generalization which has been considered is that of 3D flows with symmetries, and especially axisymmetric flows Leprovost2006 ; Naso2010b ; Thalabard2014 . This configuration is relevant for setups used in laboratory experiments, such as the von Karman experiment. It has been shown in particular that one could define a microcanonical measure using an approach analogous to that of section 3.1, with, however, some considerable complications to treat the fluctuations of the poloidal field Thalabard2014 .
2.4 Discretized form for 2D Euler flows and analogies with toy models of magnetic systems
Instead of the continuous vorticity field and the infinite dimensional phase space it belongs to, it may be more convenient to introduce finite dimensional models. Here we shall mostly consider a discretization on a square lattice with $N \times N$ sites equally spaced in the domain $\mathcal{D}$ (see Fig. 1), and the variables of interest are the values $\omega_i$ taken by vorticity at each site $i$.
In this form, the system can be related to some classical models of statistical physics.
Two-vorticity level system and long-range Ising model
The Ising model is one of the most famous models in statistical physics. It can be seen as a toy model of ferromagnetism, but it has served as a testbed for a very large number of ideas going far beyond this particular problem Cipra1987 . It consists of a finite number $N$ of spins $s_i \in \{-1, +1\}$ located on a lattice of arbitrary shape and dimension (although a square lattice is often considered) and interacting through a hamiltonian of the form:
\[
H[s] = -\frac{1}{2}\sum_{i,j=1}^{N} J_{ij}\, s_i s_j.
\]
In this form, the hamiltonian is just any quadratic function. A standard choice of interaction is the nearest-neighbor model: $J_{ij} = J$ if the sites $i$ and $j$ are connected in the lattice, and $J_{ij} = 0$ if they are not. That way, aligned neighboring spins will contribute a term $-J$ to the hamiltonian, while anti-aligned neighboring spins will contribute $+J$. If $J$ is positive the system is called ferromagnetic and if it is negative the system is called antiferromagnetic. An observable of interest is the magnetization:
\[
M[s] = \frac{1}{N}\sum_{i=1}^{N} s_i.
\]
When one finds about the same proportion of positive and negative spins, the magnetization should vanish. Applying an external magnetic field $h$, represented by a term of the form $-h\sum_i s_i$ in the hamiltonian, leads to alignment of spins, and therefore a non-vanishing magnetization. This is the standard behavior of so-called paramagnetic materials. By contrast, in ferromagnetic materials, spins may align spontaneously and yield unit magnetization (in absolute value) without imposing an external magnetic field (or, in experiments, the system retains its magnetization when the applied magnetic field is switched off). The Ising model can be seen as a toy model of the paramagnetic-ferromagnetic transition.
In the above case, the system has short-range interactions, since only neighboring spins interact. Versions with long-range interactions can be built by allowing non-vanishing $J_{ij}$ for distant sites $i$ and $j$. For instance, one may assume that all the spins interact with all the other spins with the same coupling constant: $J_{ij} = J/N^2$, where the factor $1/N^2$ ensures that the Hamiltonian is an intensive quantity. In this case, the Hamiltonian becomes a function of magnetization only:
\[
H[s] = -\frac{J}{2N^2}\Big(\sum_{i=1}^{N} s_i\Big)^2 = -\frac{J}{2}\, M[s]^2,
\]
up to an additive constant coming from the terms $i = j$, which vanishes for large $N$.
This version of the Ising model is referred to as mean-field, because it is tantamount to saying that each spin feels the effect of a magnetic field created by all the other spins rather than the individual effect of each of its neighbors. Indeed, let us consider a given spin $s_i$; it provides a contribution $-s_i \sum_j J_{ij} s_j$, which is the same as a non-interacting spin under external magnetic field $h_i = \sum_j J_{ij} s_j$ would. If we replace this magnetic field by the magnetization, we obtain the mean-field Hamiltonian. Note that the geometric shape (square, triangle, etc.) and the dimension of the lattice do not matter here since all the spins interact with the same intensity.
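The reduction of the all-to-all Hamiltonian to a function of the magnetization can be checked numerically. The following is a minimal sketch under stated assumptions: the convention $H = -\frac{1}{2}\sum_{i,j} J_{ij} s_i s_j$, the normalization $J_{ij} = J/N^2$ (chosen here so that $H$ is intensive), and the inclusion of the terms $i = j$ are assumptions made for the illustration, and the function names are hypothetical.

```python
import random


def quadratic_hamiltonian(spins, coupling):
    """H[s] = -(1/2) * sum over all (i, j) of J_ij * s_i * s_j."""
    n = len(spins)
    total = sum(coupling(i, j) * spins[i] * spins[j]
                for i in range(n) for j in range(n))
    return -0.5 * total


def mean_field_hamiltonian(spins, j_const):
    """H = -(J/2) * M**2, with M the magnetization per spin."""
    magnetization = sum(spins) / len(spins)
    return -0.5 * j_const * magnetization ** 2


random.seed(0)
n_spins, j_const = 200, 1.5
spins = [random.choice((-1.0, 1.0)) for _ in range(n_spins)]

# All-to-all coupling J_ij = J / N**2 (assumed normalization, i = j included).
h_full = quadratic_hamiltonian(spins, lambda i, j: j_const / n_spins ** 2)
h_mf = mean_field_hamiltonian(spins, j_const)
print(f"full double sum: {h_full:.12f}, function of M only: {h_mf:.12f}")
```

The two numbers agree to rounding error for any spin configuration, which is why the lattice geometry plays no role in the mean-field model.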
An advantage of the mean-field Ising model is that it has an exact solution in any dimension BaxterBook . On the contrary, exact solutions for the standard, short-range Ising models are only known for dimension one Ising1925 and two Onsager1944 .
The discretized version of 2D flows described above is related to the Ising model in the following way: rather than allowing the vorticity to take any real value, we can restrict it to a two-level set $\{-1, +1\}$. Then the system becomes analogous to the Ising model, with an interaction matrix $J_{ij}$ given by the Green function of the Laplacian on the lattice. On a plane, this amounts to interactions proportional to the logarithm of the distance between sites: $J_{ij} \propto \ln|\mathbf{r}_i - \mathbf{r}_j|$ for $i \neq j$. This is a kind of long-range interaction. The difference with the Ising model is the presence of the vorticity distribution conservation constraint. This would amount to fixing the number of spins $+1$ and the number of spins $-1$ in the Ising model.
Energy-enstrophy ensemble and the long-range spherical model
Another variant of the Ising model consists in letting the spins $s_i$ take any real value, while satisfying the global constraint $\sum_{i=1}^{N} s_i^2 = N$. Clearly, this constraint is satisfied in the standard Ising model with spins in $\{-1, +1\}$. The name spherical model was coined for this variant because of the form of the global constraint, which means that the set of all spin values lies on the surface of a sphere in $\mathbb{R}^N$. It was introduced by Berlin and Kac Berlin1952 as an attempt to patch the divergence arising from assuming that the spins are distributed according to a normal distribution (the Gaussian model) while remaining exactly solvable in any dimension BaxterBook . The observables of interest (Hamiltonian, magnetization) are the same as for the Ising model. Versions with short-range Berlin1952 or long-range Joyce1966 interactions can again be considered by choosing different quadratic forms $J_{ij}$.
In their discretized version, 2D flows resemble a long-range spherical model if we only retain one Casimir invariant: the enstrophy. Indeed, enstrophy conservation implies that $\sum_i \omega_i^2$ is fixed. Again, the interaction matrix is given by the Green function of the Laplacian on the lattice. This connection is further investigated in section 3.6. It has also been pointed out in a series of papers by Lim Lim2001c ; Lim2012 .
3 Mean-field theory for 2D flows
We provide here a heuristic presentation of the mean-field theory introduced independently by Miller Miller1990 ; Miller1992 , Robert and Sommeria Robert1991a ; Robert1991b , and further developed by many others. The presentation is inspired by the original work by Miller and the more recent references Bouchet2010 ; Bouchet2012 ; Potters2013 . More rigorous mathematical proofs can be found in the original papers by Robert and coworkers Robert1989 ; Robert1990 ; Robert1991a ; Robert1991b ; Michel1994a ; Michel1994b ; Robert2000 and Ellis, Turkington and coworkers Turkington1999 ; Boucher1999 ; Boucher2000 ; Ellis2000 ; Ellis2002 .
3.1 Microcanonical measure and large deviations for the energy and vorticity distribution
The general idea is to consider the vorticity field $\omega$, referred to as the microstate, as a random variable distributed according to the microcanonical distribution. In other words, we introduce a probability measure on the phase space $\Lambda$ of vorticity fields defined on $\mathcal{D}$, where $\mathcal{D}$ is a 2D domain (we shall mostly consider the case of a rectangular domain here). We are going to give a sketch of the construction of this measure as a limit of measures on finite-dimensional phase spaces corresponding to approximations of the continuous vorticity field. Then, we will be able to make predictions on the value of macrostates, i.e. observables $\mathcal{A}: \Lambda \to \mathbb{R}$ (or more generally $\mathcal{A}: \Lambda \to \mathcal{X}$ where the space $\mathcal{X}$ is macroscopic in some sense, e.g. has a dimension much lower than $\Lambda$) on phase space, which, as we shall see, satisfy large deviation properties: they concentrate in probability around some specific values, the equilibrium states.
To keep things simple, we shall consider a finite number $K$ of vorticity levels $\sigma_1, \ldots, \sigma_K$. This amounts to saying that the vorticity distribution has the form $\gamma(\sigma) = \sum_{k=1}^{K} \gamma_k \delta(\sigma - \sigma_k)$. We consider the discretized system with $N \times N$ sites on the square lattice introduced in section 2.4 (see Fig. 1), and define a microstate $\omega = (\omega_1, \ldots, \omega_{N^2})$ as being simply the value of the vorticity field at all the points of the lattice. Therefore the phase space is simply $\Lambda_N = \{\sigma_1, \ldots, \sigma_K\}^{N^2}$.
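Considering the conservation laws mentioned above, there are two observables of primary interest: the energy observable, i.e. the Hamiltonian, given (for a domain of unit area) by
\[
\mathcal{H}_N[\omega] = -\frac{1}{2N^4}\sum_{i,j=1}^{N^2} G_{ij}\, \omega_i \omega_j,
\]
where $G_{ij}$ is the Green function of the Laplacian on the lattice, and the vorticity distribution observables
\[
\mathcal{G}_N^{(k)}[\omega] = \frac{1}{N^2}\sum_{i=1}^{N^2} \delta_{\omega_i, \sigma_k}, \qquad 1 \leq k \leq K,
\]
which give the fraction of sites occupied by the vorticity level $\sigma_k$.
Note that the set of accessible energies (i.e. the values taken by the observable $\mathcal{H}_N$) is finite and depends both on the vorticity levels $\sigma_k$ and on the number of sites $N^2$. Ultimately, in the limit $N \to \infty$, we shall be interested in a continuum of energy levels. One approach to circumvent this difficulty is to consider in a first step energy shells with finite width $\Delta E$, large enough so that each shell is attained by the energy observable for some microstates Potters2013 . In the limit $N \to \infty$, the results will not depend on the value of $\Delta E$. To keep notations as simple as possible, we shall refrain from doing so here, but in all rigor one should understand $\mathcal{H}_N[\omega] \in [E, E + \Delta E]$ whenever we write $\mathcal{H}_N[\omega] = E$. In this framework, denoting by $\Lambda_N$ the discretized phase space, the set of microstates with vorticity distribution $\gamma = (\gamma_1, \ldots, \gamma_K)$ and energy $E$ is
\[
\Lambda_N(E, \gamma) = \left\{ \omega \in \Lambda_N : \mathcal{H}_N[\omega] = E, \ \mathcal{G}_N^{(k)}[\omega] = \gamma_k \ \text{for} \ 1 \leq k \leq K \right\}.
\]
This is a finite set whose cardinality we denote by $\Omega_N(E, \gamma)$.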
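We are going to introduce two probability measures on phase space: first, let us consider a prior measure $\mu_N$, which here is just the normalized counting measure: $\mu_N(\{\omega\}) = 1/|\Lambda_N| = K^{-N^2}$ if $\omega \in \Lambda_N$, where $K$ is the number of vorticity levels and $N^2$ the number of sites. This amounts to saying that all the microstates are equiprobable: for any observable $\mathcal{A}$, the probability $\mathbb{P}_N(\mathcal{A} = a)$ of the outcome $a$ is just the fraction of microstates $\omega$ for which $\mathcal{A}[\omega] = a$. Now, we want to restrict that statement to all the microstates with a fixed energy and vorticity distribution, while assigning vanishing probability to all the other microstates. Hence, we introduce the (finite-$N$) microcanonical measure $\mu_{N,E,\gamma}$: if $\omega \in \Lambda_N(E, \gamma)$, $\mu_{N,E,\gamma}(\{\omega\}) = 1/\Omega_N(E, \gamma)$, and $\mu_{N,E,\gamma}(\{\omega\}) = 0$ otherwise. Hence, for an observable $\mathcal{A}$, the probability law of the random variable is $\mathbb{P}_{N,E,\gamma}(\mathcal{A} = a) = \mu_{N,E,\gamma}(\{\omega : \mathcal{A}[\omega] = a\})$. Note that we have introduced indices $E$ and $\gamma$ to distinguish from probabilities computed with respect to the prior measure. Probabilities in the microcanonical ensemble are thus just conditional probabilities:
\[
\mathbb{P}_{N,E,\gamma}(\mathcal{A} = a) = \mathbb{P}_N(\mathcal{A} = a \mid \mathcal{H}_N = E, \mathcal{G}_N = \gamma) = \frac{\mathbb{P}_N(\mathcal{A} = a, \mathcal{H}_N = E, \mathcal{G}_N = \gamma)}{\mathbb{P}_N(\mathcal{H}_N = E, \mathcal{G}_N = \gamma)}.
\]
As mentioned above, observables of particular interest are the hamiltonian $\mathcal{H}_N$ and the vorticity distribution observables $\mathcal{G}_N$. The joint probability to observe an energy $E$ and a vorticity distribution $\gamma$, with respect to the prior measure, satisfies a large-deviation property, and the large deviation rate function is, up to the additive constant $\ln K$, the opposite of the entropy $S(E, \gamma)$:
\[
\mathbb{P}_N(\mathcal{H}_N = E, \mathcal{G}_N = \gamma) = \frac{\Omega_N(E, \gamma)}{K^{N^2}} \asymp e^{N^2 [S(E, \gamma) - \ln K]},
\]
with
\[
S(E, \gamma) = \lim_{N \to \infty} \frac{1}{N^2} \ln \Omega_N(E, \gamma),
\]
where $\asymp$ denotes equivalence on a logarithmic scale.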
3.2 Large deviations for the macrostates
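We now introduce a new class of observables associated with the coarse-graining of the vorticity field. We decompose the lattice into $M \times M$ cells, each containing $n \times n$ sites (so that $N = nM$). For a microstate $\omega$, we shall denote the components as $\omega_{I,i}$ where $1 \leq I \leq M^2$ is the index of the cell and $1 \leq i \leq n^2$ is the index of the site within the cell (see Fig. 1). The coarse-graining observable is given by
\[
\overline{\omega}_I[\omega] = \frac{1}{n^2}\sum_{i=1}^{n^2} \omega_{I,i}.
\]
More generally, we can define an observable $\hat{p}$ which corresponds to the distribution of vorticity levels in each cell. It is just the empirical vector
\[
\hat{p}_I^{(k)}[\omega] = \frac{1}{n^2}\sum_{i=1}^{n^2} \delta_{\omega_{I,i}, \sigma_k}, \qquad 1 \leq k \leq K.
\]
Note that $\sum_{k=1}^{K} \hat{p}_I^{(k)} = 1$. Besides, the observable $\overline{\omega}$ can be deduced from $\hat{p}$ since $\overline{\omega}_I = \sum_{k=1}^{K} \sigma_k \hat{p}_I^{(k)}$ for $1 \leq I \leq M^2$. Let us refer to the elements $p$ of the image of $\hat{p}$ as the macrostates. The set of microstates corresponding to a given macrostate $p$ is simply its pre-image $\hat{p}^{-1}(p)$. The number of microstates realizing a given macrostate will be denoted $\Omega_N(p)$. It is easily computed that:
\[
\Omega_N(p) = \prod_{I=1}^{M^2} \frac{(n^2)!}{\prod_{k=1}^{K} \big(n^2 p_I^{(k)}\big)!}.
\]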
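The vorticity distribution observables take a constant value over an equivalence class $\hat{p}^{-1}(p)$:
\[
\mathcal{G}_N^{(k)}[\omega] = \frac{1}{N^2}\sum_{I=1}^{M^2}\sum_{i=1}^{n^2} \delta_{\omega_{I,i}, \sigma_k} = \frac{1}{M^2}\sum_{I=1}^{M^2} \hat{p}_I^{(k)}[\omega],
\]
so that if $\omega, \omega'$ are such that $\hat{p}[\omega] = \hat{p}[\omega']$, then for $1 \leq k \leq K$, $\mathcal{G}_N^{(k)}[\omega] = \mathcal{G}_N^{(k)}[\omega']$. In other words, the equivalence kernel of the observable $\hat{p}$ is finer than that of any of the observables $\mathcal{G}_N^{(k)}$. In practice, this means that we need not worry about enforcing the vorticity distribution constraint when counting the number of microstates realizing a given macrostate. For the energy observable, the situation is slightly more subtle: denoting $G_{Ii,Jj}$ the Green function of the Laplacian on the lattice with the new indexing of the sites, the energy observable is given by:
\[
\mathcal{H}_N[\omega] = -\frac{1}{2N^4}\sum_{I,J=1}^{M^2}\sum_{i,j=1}^{n^2} G_{Ii,Jj}\, \omega_{I,i}\, \omega_{J,j} \approx -\frac{1}{2M^4}\sum_{I,J=1}^{M^2} G_{IJ}\, \overline{\omega}_I\, \overline{\omega}_J.
\]
The approximation relies on the long-range nature of the interaction: the Green function varies slowly on the scale of a cell, so that $G_{Ii,Jj}$ can be replaced by a value $G_{IJ}$ depending only on the cells, with an error that vanishes for large $n$. Asymptotically, the energy therefore depends on the microstate only through the macrostate.
The above results are sometimes restated by saying that we have an energy (and here, also vorticity distribution) representation function, denoted $\mathcal{E}[p]$ (respectively $\mathcal{G}[p]$) Touchette2009 (see Fig. 2). It allows us to obtain the most probable states with respect to the microcanonical measure by obtaining a large deviation property with respect to the prior (unconstrained) measure.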
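Indeed, the unconstrained probability of observing a macrostate $p$ is
\[
\mathbb{P}_N(\hat{p} = p) = \frac{\Omega_N(p)}{K^{N^2}} \asymp e^{N^2 (\mathcal{S}[p] - \ln K)}, \qquad \mathcal{S}[p] = -\frac{1}{M^2}\sum_{I=1}^{M^2}\sum_{k=1}^{K} p_I^{(k)} \ln p_I^{(k)},
\]
where $\Omega_N(p)$ is the number of microstates realizing $p$ and the Boltzmann-Gibbs entropy $\mathcal{S}[p]$ follows from the Stirling formula in the limit of a large number $n^2$ of sites per cell, the number $M^2$ of cells being fixed. The entropy again appears as a large deviation rate function (up to an additive constant and a minus sign), although this time it is a large deviation of an empirical vector (observable $\hat{p}$) rather than a sample mean (energy observable $\mathcal{H}_N$). Hence, the above result should in all rigor be seen as a consequence of the Sanov theorem.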
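The link between the count of microstates in a cell and the entropy of the empirical vector can be checked numerically. The following is a minimal sketch for a single cell, assuming three vorticity levels with illustrative fractions; the function names are hypothetical. It compares the logarithm of the multinomial coefficient per site with $-\sum_k p_k \ln p_k$.

```python
from math import lgamma, log


def log_multiplicity(counts):
    """ln of the multinomial coefficient n! / prod_k (n_k!) for one cell."""
    n = sum(counts)
    return lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)


def mixing_entropy(probs):
    """-sum_k p_k ln p_k, with the convention 0 ln 0 = 0."""
    return -sum(p * log(p) for p in probs if p > 0)


probs = (0.5, 0.3, 0.2)  # illustrative fractions of the three levels in the cell
for sites in (10, 1_000, 100_000):
    counts = [round(p * sites) for p in probs]
    per_site = log_multiplicity(counts) / sites
    print(f"sites = {sites:>6}: ln(multiplicity)/sites = {per_site:.5f}, "
          f"entropy = {mixing_entropy(probs):.5f}")
```

The per-site logarithm approaches the entropy from below as the number of sites grows, the difference being the subexponential Stirling correction that a large deviation statement ignores.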
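Now, in the microcanonical ensemble, the probability $\mathbb{P}_{N,E,\gamma}(\hat{p} = p)$ of a macrostate $p$ involves the joint (unconstrained) probability $\mathbb{P}_N(\hat{p} = p, \mathcal{H}_N = E, \mathcal{G}_N = \gamma)$. But due to the existence of the energy and vorticity distribution representation functions $\mathcal{E}[p]$ and $\mathcal{G}[p]$, we have:
\[
\mathbb{P}_N(\hat{p} = p, \mathcal{H}_N = E, \mathcal{G}_N = \gamma) =
\begin{cases}
\mathbb{P}_N(\hat{p} = p) & \text{if } \mathcal{E}[p] = E \text{ and } \mathcal{G}[p] = \gamma, \\
0 & \text{otherwise,}
\end{cases}
\]
therefore,
\[
\mathbb{P}_{N,E,\gamma}(\hat{p} = p) =
\begin{cases}
\dfrac{\mathbb{P}_N(\hat{p} = p)}{\mathbb{P}_N(\mathcal{H}_N = E, \mathcal{G}_N = \gamma)} & \text{if } \mathcal{E}[p] = E \text{ and } \mathcal{G}[p] = \gamma, \\
0 & \text{otherwise.}
\end{cases}
\]
It follows that the probability of a given macrostate also satisfies a large deviation result with respect to the microcanonical measure:
\[
\mathbb{P}_{N,E,\gamma}(\hat{p} = p) \asymp e^{-N^2 I_{E,\gamma}[p]},
\]
with the large deviation rate function
\[
I_{E,\gamma}[p] =
\begin{cases}
S(E, \gamma) - \mathcal{S}[p] & \text{if } \mathcal{E}[p] = E \text{ and } \mathcal{G}[p] = \gamma, \\
+\infty & \text{otherwise,}
\end{cases}
\]
where $\mathcal{S}[p]$ is the Boltzmann-Gibbs entropy of the macrostate and $S(E, \gamma)$ the entropy defined from the number of microstates with energy $E$ and vorticity distribution $\gamma$.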
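Hence, the most probable macrostates with respect to the microcanonical measure are those which minimize the large deviation rate function, i.e. those which maximize the entropy while satisfying the constraints on energy and vorticity distribution: they are solutions of a constrained variational problem. It is worth noting that the Boltzmann-Gibbs entropy $\mathcal{S}[p]$, defined in (37), evaluated at a solution of the variational problem, agrees with the entropy $S(E, \gamma)$ defined from the Boltzmann formula (24). This is not a coincidence, but a cornerstone of the mean-field approach. It can be understood in the language of large deviation theory as a contraction principle Touchette2009 . Roughly speaking, due to the existence of representation functions $\mathcal{E}[p]$ and $\mathcal{G}[p]$ for the energy and the vorticity distribution, the probability of observing an energy $E$ and a vorticity distribution $\gamma$ can be computed as the integral over all the macrostates (rather than the microstates) with these constraints: denoting
\[
\mathcal{P}(E, \gamma) = \left\{ p : \mathcal{E}[p] = E, \ \mathcal{G}[p] = \gamma \right\},
\]
we have
\[
\mathbb{P}_N(\mathcal{H}_N = E, \mathcal{G}_N = \gamma) = \sum_{p \in \mathcal{P}(E, \gamma)} \mathbb{P}_N(\hat{p} = p) \asymp \exp\Big( N^2 \Big[ \sup_{p \in \mathcal{P}(E, \gamma)} \mathcal{S}[p] - \ln K \Big] \Big),
\]
where $K$ is the number of vorticity levels, so that $S(E, \gamma) = \sup_{p \in \mathcal{P}(E, \gamma)} \mathcal{S}[p]$.
As a conclusion, the most probable macrostates with respect to the microcanonical measure satisfy $\mathcal{S}[p] = S(E, \gamma)$: they are solutions of the constrained variational problem:
\[
S(E, \gamma) = \max_{p} \left\{ \mathcal{S}[p] \ \middle| \ \mathcal{E}[p] = E, \ \mathcal{G}[p] = \gamma \right\}.
\]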
3.3 Thermodynamic limit and mean-field equation
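We are now interested in the macrostates obtained in the limit $n \to \infty$. Letting also $M \to \infty$, they are the probability distributions for fine-grained vorticity $\rho(\sigma, \mathbf{r})$: $\rho(\sigma, \mathbf{r})\,d\sigma$ is the probability that the vorticity at point $\mathbf{r}$ lies in the interval $[\sigma, \sigma + d\sigma]$. The local normalization condition $\int \rho(\sigma, \mathbf{r})\,d\sigma = 1$ must still be satisfied for each point $\mathbf{r} \in \mathcal{D}$. The coarse-grained vorticity field is now $\overline{\omega}(\mathbf{r}) = \int \sigma \rho(\sigma, \mathbf{r})\,d\sigma$. As explained above, the energy and vorticity distribution depend only on the macrostate $\rho$ (taking a domain of unit area for simplicity):
\[
\mathcal{E}[\rho] = -\frac{1}{2}\int_{\mathcal{D}} \overline{\omega}(\mathbf{r})\, \overline{\psi}(\mathbf{r}) \, d\mathbf{r}, \qquad \mathcal{G}_\sigma[\rho] = \int_{\mathcal{D}} \rho(\sigma, \mathbf{r}) \, d\mathbf{r},
\]
where $\overline{\psi}$ is the stream function of the coarse-grained field, $\Delta\overline{\psi} = \overline{\omega}$, and the entropy of the macrostate reads
\[
\mathcal{S}[\rho] = -\int_{\mathcal{D}} \int \rho(\sigma, \mathbf{r}) \ln \rho(\sigma, \mathbf{r}) \, d\sigma \, d\mathbf{r}.
\]
The most probable macrostates are now those maximizing the entropy (50) while satisfying the energy and vorticity distribution constraints. They are solutions of the microcanonical variational problem:
\[
S(E, \gamma) = \max_{\rho} \left\{ \mathcal{S}[\rho] \ \middle| \ \mathcal{E}[\rho] = E, \ \mathcal{G}_\sigma[\rho] = \gamma(\sigma), \ \int \rho \, d\sigma = 1 \right\}.
\]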
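The critical points of the variational problem are readily found: there exist Lagrange multipliers $\beta$ and $\alpha(\sigma)$ (together with a multiplier $\zeta(\mathbf{r})$ enforcing the local normalization) such that the first variations vanish:
\[
\delta\mathcal{S} - \beta\, \delta\mathcal{E} - \int \alpha(\sigma)\, \delta\mathcal{G}_\sigma \, d\sigma - \int_{\mathcal{D}} \zeta(\mathbf{r})\, \delta\Big( \int \rho \, d\sigma \Big) d\mathbf{r} = 0,
\]
which, with the energy $\mathcal{E}[\rho] = -\frac{1}{2}\int_{\mathcal{D}} \overline{\omega}\,\overline{\psi}\,d\mathbf{r}$ and the stream function defined by $\Delta\overline{\psi} = \overline{\omega}$, leads to the Gibbs states
\[
\rho(\sigma, \mathbf{r}) = \frac{1}{Z(\mathbf{r})}\, e^{\beta\sigma\overline{\psi}(\mathbf{r}) - \alpha(\sigma)}, \qquad Z(\mathbf{r}) = \int e^{\beta\sigma\overline{\psi}(\mathbf{r}) - \alpha(\sigma)} \, d\sigma.
\]
Taking the first moment of the Gibbs state yields a closed equation for the coarse-grained stream function:
\[
\Delta\overline{\psi} = \overline{\omega} = \frac{1}{Z}\int \sigma\, e^{\beta\sigma\overline{\psi} - \alpha(\sigma)} \, d\sigma = \frac{1}{\beta}\frac{d \ln Z}{d\overline{\psi}}.
\]
This is an (elliptic) partial differential equation, referred to as the mean-field equation, characterizing the most probable coarse-grained vorticity fields. Note that the equation is of the same form as the equation defining stationary states of the Euler equation, $\omega = F(\psi)$: equilibrium states form a subclass of steady-states for which the function $F$ relating vorticity and stream function is fixed by the invariants of the system.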
The equilibrium states of the system can thus be obtained by solving the mean-field equation (55). In general, this is a difficult task. Analytical solutions have been obtained in the limit of a linear function (the mean-field equation then reduces to a Helmholtz equation), using the method introduced by Chavanis and Sommeria Chavanis1996a , which consists in decomposing the vorticity field and stream function on a basis of Laplacian eigenfunctions. Numerical methods are also available: Turkington and Whitaker have proposed an algorithm to iteratively solve the variational problem described above Turkington1996 , while Robert and Sommeria Robert1992 have proposed relaxation equations where the dynamics maximize the entropy production rate, thereby reaching a maximum entropy state. We shall not describe these methods in detail here, nor the solutions they yield. Note, however, that in general, they correspond to large scale coherent structures, like vortices or unidirectional (e.g. zonal) flows, depending on the geometry of the domain: for instance dipole/monopole in a rectangular domain Chavanis1996a , dipole/unidirectional flow in a doubly periodic domain Bouchet2009 , Fofonoff flows on a beta-plane Naso2011 , and solid-body rotation/dipole/quadrupole/unidirectional flow on a sphere Herbert2012a ; Herbert2012b ; Herbert2013b ; Qi2014 .
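To make the structure of the Gibbs states concrete, the local relation between the coarse-grained vorticity and the stream function can be evaluated for a finite set of vorticity levels. The following is a minimal sketch, not one of the solution methods cited above: it assumes Gibbs weights of the form $e^{\beta\sigma_k\psi - \alpha_k}$ (a sign convention tied to $\omega = \Delta\psi$, taken here as an assumption), and the function names are hypothetical.

```python
from math import exp, tanh


def gibbs_state(psi, levels, alphas, beta):
    """Local probabilities of each vorticity level, proportional to exp(beta*sigma*psi - alpha)."""
    weights = [exp(beta * s * psi - a) for s, a in zip(levels, alphas)]
    partition = sum(weights)
    return [w / partition for w in weights]


def coarse_grained_vorticity(psi, levels, alphas, beta):
    """Mean vorticity under the local Gibbs state: the right-hand side of the mean-field equation."""
    probs = gibbs_state(psi, levels, alphas, beta)
    return sum(s * p for s, p in zip(levels, probs))


# Two symmetric levels with equal multipliers: the relation reduces to a tanh law.
psi, beta = 0.7, 2.0
omega_bar = coarse_grained_vorticity(psi, (-1.0, 1.0), (0.0, 0.0), beta)
print(f"omega_bar = {omega_bar:.6f}, tanh(beta * psi) = {tanh(beta * psi):.6f}")
```

In the two-level symmetric case the mean-field equation thus takes the form $\Delta\overline{\psi} = \tanh(\beta\overline{\psi})$, and linearizing for small $\beta\overline{\psi}$ gives the Helmholtz-type limit mentioned above.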
3.4 Non-equivalence of ensembles
Statistical ensembles and variational problems
So far we have been using exclusively the microcanonical measure
[TABLE]
and similarly in the thermodynamic limit $N \to \infty$. If we replace the microcanonical measure in section 3.2 by any of these two measures, we obtain mutatis mutandis a large deviation principle for the macrostates. In the thermodynamic limit, the most probable macrostates (i.e. the equilibrium states) are therefore solutions of the following variational problems:
[TABLE]
respectively for the microcanonical measure, the canonical measure and the grand-canonical measure. The maximized functions arise as large deviation rate functions, and the constraints stem from the definition of the ensembles as conditional probabilities and from the existence of representation functions. The entropy $S$, the free energy $F$ and the grand potential $J$ are referred to generically as thermodynamic potentials.
The existence of a large deviation principle for the macrostates does not depend on the particular choice of ensemble, but the most probable macrostates may depend on this choice. The task that we set out to investigate in this section is therefore how the different ensembles are related. The discussion closely follows the references Ellis2000; Touchette2004.
Ensemble equivalence at the macrostate level
First of all, it is clear from the structure of the variational problem and the Lagrange multiplier rule that they all have the same critical points. However, the critical points may be of different nature: a maximizer of one variational problem may be a saddle point of another variational problem for instance. Nevertheless, it is easily seen that a solution of a variational problem with a constraint relaxed (e.g. the canonical variational problem) is always a solution of the original constrained variational problem (e.g. the microcanonical problem). We can formalize this remark by introducing the sets of equilibrium states (i.e. solutions of the variational problems):
[TABLE]
As per the above remark, we always have,
[TABLE]
In particular,
[TABLE]
If the converse statements hold, i.e.
[TABLE]
we say, respectively, that the microcanonical and canonical ensembles are equivalent at the macrostate level or that the canonical and grand-canonical ensembles are equivalent at the macrostate level. It is straightforward to see that this relation is transitive, in the sense that if the microcanonical ensemble is equivalent to the canonical ensemble at the macrostate level, and if the canonical ensemble and the grand-canonical ensemble are equivalent at the macrostate level, then the microcanonical and the grand-canonical ensembles are equivalent at the macrostate level. Besides, if the grand-canonical ensemble is equivalent to the microcanonical ensemble at the macrostate level, then the canonical ensemble is equivalent to both the microcanonical and the grand-canonical ensembles at the macrostate level.
If the three ensembles are equivalent at the macrostate level, we have the equalities:
[TABLE]
Ensemble equivalence at the thermodynamic level
Due to the definition of the thermodynamic potentials through the variational problems, connections exist between them as well. For the free energy for instance, we have
[TABLE]
We know that the Legendre transform is an involution ArnoldMecaBook. This is not necessarily the case for the Legendre-Fenchel transform, because the Legendre-Fenchel transform of an arbitrary function is always a concave function, but it is true when the function is concave. In general, we only obtain the concave hull of the original function RockafellarBook. Hence, the free energy is always a concave function of and the grand-potential is always a concave function of its arguments, while is always a concave function of , and is the smallest concave function satisfying . The equality holds if is a concave function. Therefore, we say that the microcanonical and canonical ensembles are equivalent at the thermodynamic level if , or equivalently, if is a concave function of . Similarly, the grand-canonical and the canonical ensembles are equivalent at the thermodynamic level if , i.e. if is a concave function of .
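The concave-hull property can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the convention F(beta) = min over E of [beta E - S(E)] for the free energy and the same transform applied to F for the double transform S**, and it uses a hypothetical double-peaked entropy S(E) = -(E^2 - 1)^2 chosen only because it is non-concave between its two maxima. The names `legendre_fenchel`, `S_hull` and `inequivalent` are illustrative.

```python
import numpy as np

def legendre_fenchel(x, f, p):
    """Return g(p) = min over x of [p*x - f(x)], evaluated on the grid p."""
    return np.min(p[:, None] * x[None, :] - f[None, :], axis=1)

# Hypothetical entropy (an assumption for illustration): two maxima at
# E = -1 and E = 1, with a non-concave dip in between.
E = np.linspace(-2.0, 2.0, 801)
S = -(E**2 - 1.0) ** 2

# The beta grid must cover the range of slopes of S on the E grid (|S'| <= 24 here).
beta = np.linspace(-30.0, 30.0, 2401)

F = legendre_fenchel(E, S, beta)       # free energy F(beta): concave by construction
S_hull = legendre_fenchel(beta, F, E)  # double transform S**(E): concave hull of S

# Energies where S** exceeds S are those where the two ensembles are not
# equivalent at the thermodynamic level.
inequivalent = S_hull - S > 1e-3
print("S** >= S everywhere:", bool(np.all(S_hull >= S - 1e-12)))
print("non-equivalence range: %.2f < E < %.2f"
      % (E[inequivalent].min(), E[inequivalent].max()))
```

With this choice of S, the double transform is flat between the two maxima and coincides with S outside them, so the reported non-equivalence range is roughly |E| < 1. Any other non-concave function could be substituted, provided the beta grid covers its slopes.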
Again, we have a transitivity property: equivalence of the grand canonical and canonical ensembles on the one hand, and of the canonical and microcanonical ensembles on the other hand implies equivalence of the grand canonical and microcanonical ensembles. Besides, if the grand canonical and the microcanonical ensembles are equivalent, then the canonical ensemble is equivalent to both the grand canonical and the microcanonical ensembles. In both these cases, the entropy is a concave function of all its arguments.
Equivalence and Non-equivalence of statistical ensembles
The notions of ensemble equivalence at the macrostate level (section 3.4) and at the thermodynamic level (section 3.4) are connected. Indeed, the local concavity properties of the thermodynamic potential determine whether the relation with the Lagrange multiplier can be inverted, or in other words, whether the macrostates can be obtained by solving a relaxed variational problem. Following Ellis2000, let us examine the three possibilities in the context of the microcanonical and canonical ensembles. Let us fix , then one of the three following assertions holds:
- (i) Total Ensemble Equivalence: If and is not locally flat, then for .
- (ii) Marginal Ensemble Equivalence: If and is locally flat, then for .
- (iii) Ensemble Inequivalence: If , then .
3.5 Large deviations for the coarse-grained vorticity field
In section 3.2, we have considered how the probability of the outcome of a given observable (the distribution of fine-grained vorticity) behaves when the size of the system goes to infinity. We have found that it satisfies a large deviation property, which allows us to compute the most probable outcomes (see section 3.3). From there, we are able to deduce what the most probable coarse-grained vorticity fields are. But can we apply the same methods directly to the coarse-graining observable to compute the most probable coarse-grained vorticity fields? In other words, can we obtain a large deviation principle directly for the observable ?
In general this is not straightforward, because we do not have a representation function for the vorticity distribution in terms of the coarse-grained vorticity field. Let us give an example in the simple case where we have only three levels of vorticity: . We have represented on Fig. 3 two microstates which lead to the same coarse-grained vorticity field, with different vorticity distributions.
As a consequence, we cannot deduce a large deviation principle with respect to the microcanonical measure (or any of the other ensembles) from a large deviation principle with respect to the prior measure. In principle it remains possible to evaluate directly the probability of a coarse-grained vorticity field in the microcanonical ensemble, but this is a much more complicated combinatorial problem. However, in the special case of a two-level vorticity system, we do have a representation function for the vorticity distribution. We illustrate this in the following sections by making use of the analogy with the mean-field Ising model pointed out above.
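The absence of a representation function in the three-level case can be made concrete with a toy computation. The sketch below is only an illustration under stated assumptions: the level set (-1, 0, 1), the one-dimensional four-site lattice, the block size of two sites and the two microstates are all hypothetical choices, not those of Fig. 3.

```python
from collections import Counter
import numpy as np

LEVELS = (-1, 0, 1)  # hypothetical three-level vorticity set (assumption)

def coarse_grain(field, block):
    """Average the fine-grained field over consecutive blocks of `block` sites."""
    return np.asarray(field, dtype=float).reshape(-1, block).mean(axis=1)

def distribution(field):
    """Fraction of lattice sites carrying each vorticity level."""
    counts = Counter(field)
    return {level: counts.get(level, 0) / len(field) for level in LEVELS}

# Two hypothetical microstates on a 4-site lattice, coarse-grained by blocks of 2.
micro_a = [0, 0, 1, 1]
micro_b = [-1, 1, 1, 1]

print(coarse_grain(micro_a, 2), coarse_grain(micro_b, 2))  # identical coarse-grained fields
print(distribution(micro_a))  # half the sites at level 0, half at level 1
print(distribution(micro_b))  # a quarter at level -1, three quarters at level 1
```

Both microstates share the same coarse-grained field and the same mean vorticity, but their level distributions (and hence their second moments) differ, so the distribution cannot be written as a function of the coarse-grained field.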
Mean-field Ising model
Remember the mean-field Ising model described in section 2.4. We have mentioned above that there is a representation function for the energy in terms of the magnetization (Fig. 4). Therefore it is sufficient to obtain a large deviation principle for the magnetization with respect to the unconstrained measure. If (resp. ) is the number of (resp. ) spins, the magnetization is given by , and we have . In other words, . Hence, the unconstrained probability to observe a given magnetization is
[TABLE]
which proves that the magnetization observable satisfies a large deviation principle. It is customary to work in the canonical ensemble (see section 3.4), and the most probable states are therefore solutions of the variational problem:
[TABLE]
where is the free energy. Using and (73), it is easily shown that for smaller than a critical value (high temperature ), there is a unique solution , while for larger than the critical value (low temperature), there are two non-zero solutions . The most probable magnetization as a function of the temperature is represented on Fig. 4.
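The two statements above (the large deviation principle for the magnetization and the change of behaviour at a critical inverse temperature) can be checked numerically. This is a sketch under assumptions that are not fixed by the text here: spins take the values +1 and -1, the unconstrained measure is uniform, the energy per spin is normalized as h(m) = -m^2/2 (so that the critical inverse temperature is 1), and the canonical problem is solved by maximizing s(m) - beta h(m) on a grid. The function names are illustrative.

```python
import math
import numpy as np

def entropy(m):
    """Entropy per spin s(m) of +1/-1 spins at magnetization m, for -1 < m < 1."""
    p = (1.0 + m) / 2.0
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def log_prob_per_spin(N, n_plus):
    """(1/N) log of the uniform-measure probability of observing n_plus up spins."""
    log_binom = (math.lgamma(N + 1) - math.lgamma(n_plus + 1)
                 - math.lgamma(N - n_plus + 1))
    return (log_binom - N * math.log(2.0)) / N

def most_probable_m(beta):
    """Non-negative maximizer of s(m) - beta*h(m), with the assumed h(m) = -m**2/2."""
    m = np.linspace(0.0, 0.999, 1000)
    return float(m[np.argmax(entropy(m) + 0.5 * beta * m**2)])

# Large deviation check: (1/N) log P(m) approaches -(log 2 - s(m)) as N grows.
N, n_plus = 10_000, 6_000
m = 2.0 * n_plus / N - 1.0
print(log_prob_per_spin(N, n_plus), -(math.log(2.0) - entropy(m)))

# Canonical equilibria: zero magnetization below the critical beta, non-zero above.
for beta in (0.5, 1.5):
    print(beta, most_probable_m(beta))
```

Above the critical value, the grid maximizer agrees with the self-consistency condition m = tanh(beta m) up to the grid resolution, and by symmetry -m is a solution as well.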
Two-level system
We have noted above that when the vorticity level set is made of two opposite values, (with ), the system becomes analogous to the mean-field Ising model studied above. The only difference is the vorticity distribution conservation constraint (and the interaction coefficients). This amounts to keeping fixed the number of and spins in the Ising model, or equivalently, to fixing the magnetization. But the magnetization here is nothing but the circulation . Therefore, conservation of the Casimir invariants in the discretized two-level model reduces to conservation of the circulation.
Another way to see this is to show explicitly that there exists a representation function for the vorticity distribution in this case. The coarse-graining operator takes value in a discrete subset of : denoting , the image of the operator is . Here, corresponds to the number of sites with value in each coarse-graining cell. The relation between and can be inverted: , and we obtain
[TABLE]
Note that, as expected, (Eq. (13)) and (Eq. (11)). Now, it is an easy task to evaluate the unconstrained probability of a given coarse-grained vorticity field:
[TABLE]
The contraction principle ensures that, as can be checked explicitly,
[TABLE]
By the same token as in section 3.2, it follows that the most probable coarse-grained vorticity fields are solutions of the constrained variational problem:
[TABLE]
Straightforward computations show that the critical points of the variational problem are solutions of the equation:
[TABLE]
Fragile constraints and constrained Casimir variational problem
It has been observed by several authors that none of the Casimir invariants (moments of the vorticity field) except the first (circulation) can be obtained from the coarse-grained vorticity field . For this reason they are often referred to as fragile invariants, in the sense that they do not survive coarse-graining. This is exactly the same as saying that there is no representation function for the Casimir invariants (or, equivalently, for the vorticity distribution ) in terms of the coarse-grained vorticity field , except in the particular case mentioned above. However, with an arbitrary vorticity distribution, a large deviation principle can still be obtained by contraction, as illustrated above in the two-level case. This provides a variational problem for the most probable coarse-grained vorticity field, even though it still relies on an auxiliary maximization on the distribution for the vorticity levels.
Because of their fragile nature, and because an infinite number of invariants is difficult to handle in practice, it was suggested Ellis2002 ; Chavanis2003 to treat these invariants in a canonical ensemble, and to consider the Lagrange parameter as a prior vorticity distribution chosen based on physical intuition of the problem at hand. This provides a subset of solutions of the microcanonical variational problem, but not necessarily the full set (see section 3.4). However, this variational problem, expressed in terms of the distribution for the vorticity levels, is equivalent to minimizing, with respect to the coarse-grained vorticity field , the so-called Casimir functionals with fixed energy, where is a convex function, choosing for the Legendre-Fenchel transform of Bouchet2008 .
3.6 The energy-enstrophy measure
Gibbs measure for Galerkin truncated flows
In this section we investigate the statistical mechanics of the 2D Euler equations resulting from simplifying the conservation constraints: we retain only the energy and the enstrophy invariants. This was actually one of the starting points for statistical mechanics of turbulent flows: Lee in 3D Lee1952 and Kraichnan in 2D Kraichnan1967 considered Fourier series of the dynamical fields truncated at a given order . In 3D, the only invariants are the energy and the helicity, while in 2D, Kraichnan considered the energy:
[TABLE]
This is a Gaussian probability density, well-defined if for all . This condition leads to three possible regimes: (i) , (ii) , (iii) . In each case, Kraichnan considered the energy spectrum and computed its average value with respect to the Gibbs measure Kraichnan1967; Kraichnan1980: . In the negative temperature () regime, the spectrum peaks at the gravest mode ; there is even an infrared divergence when . This is classically interpreted as an indication that not only do nonlinear interactions in 2D flows tend to transfer energy towards the large scales (the inverse cascade), but energy also tends to accumulate in the gravest mode to form a condensate LSmith1993; Chertkov2007; Boffetta2012. Note that the average value of each vorticity mode vanishes by symmetry: , because a given vorticity field and its opposite have the same probability in the canonical ensemble. Of course, in reality, the system will spontaneously break the symmetry and choose a vorticity field, which can be computed in the limit of large using large deviations results for the macrostates as we did above. In the energy-enstrophy ensemble, averaging over the set of equilibrium states indeed yields a vanishing mean value, thereby showing that statistical mechanics is more about most probable states than average values.
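The behaviour of the mean spectrum in the different regimes can be explored with a few lines of code. The sketch below rests on assumptions about conventions that are not fixed by the text here: the Gibbs density is taken proportional to exp(-beta E - alpha G), with E the sum over modes of |omega_k|^2 / (2 k^2) and G the sum of |omega_k|^2 / 2, so that the mean energy per mode is proportional to 1 / (beta + alpha k^2) and the measure is normalizable only if beta + alpha k^2 > 0 for every retained wavenumber. Constant prefactors depend on the convention, and the number of modes per wavenumber shell is ignored; only the shape in k is meaningful.

```python
import numpy as np

def mean_energy_per_mode(k, alpha, beta):
    """Mean energy per mode under the assumed density exp(-beta*E - alpha*G)."""
    denom = beta + alpha * k**2
    if np.any(denom <= 0.0):
        raise ValueError("not normalizable: need beta + alpha*k**2 > 0 for all k")
    return 0.5 / denom

def gravest_fraction(k, alpha, beta):
    """Fraction of the total mean energy carried by the lowest wavenumber k[0]."""
    e = mean_energy_per_mode(k, alpha, beta)
    return float(e[0] / e.sum())

k = np.arange(1.0, 33.0)  # hypothetical truncation: k_min = 1, k_max = 32
alpha = 1.0
for beta in (1.0, -0.5, -0.99):  # beta < 0 is the negative-temperature regime
    print(beta, gravest_fraction(k, alpha, beta))
```

In this convention, the fraction of energy in the gravest mode grows as beta decreases towards -alpha k_min^2, which is the infrared divergence mentioned above; at beta = -alpha k_min^2 the function raises an error because the density is no longer normalizable.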
Large deviations in the microcanonical ensemble
Using the same notations as in the previous paragraph, one may assume that the truncated vorticity field is distributed according to the microcanonical measure
[TABLE]
Note that Bouchet and Corvellec Bouchet2010 have also checked with explicit computations that this entropy defined as the joint large deviation rate function for the energy and enstrophy observables (i.e. the Boltzmann formula , given in (92)) coincides with the entropy defined through the variational problem for the macrostates, as expected from the contraction principle (Eq. (47)). A similar computation of the structure function was carried out by Kastner and Schnetz Kastner2006 for the mean-field spherical model defined in section 2.4.
From the joint large deviation principle for the energy and enstrophy, we can deduce a large deviation principle for the energy spectrum observable Bouchet2010 :
[TABLE]
The large deviation rate functions are monotonic: is an increasing function of , and are decreasing functions of . Therefore, the most probable energy spectrum in the limit has all its energy in the gravest mode. This can be seen as the microcanonical counterpart of the Kraichnan argument presented in section 3.6. It provides further theoretical evidence for the spectral condensation in 2D turbulence.
The above discussion on the vanishing of the average truncated vorticity field also applies in the microcanonical ensemble. The mean-field theory allows us to compute the most probable macrostates: we find a linear mean-field equation for the coarse-grained vorticity field: , which is easily solved and yields , in agreement with the above prediction.
4 Conclusion
In this chapter, we have given a brief introduction to the methods of equilibrium statistical mechanics applied to models of turbulent flows, focusing on the case of two-dimensional flows. The main purpose of the course was to show, in the context of a lattice discretization of the system, how some well-chosen observables, such as the distribution of fine-grained vorticity levels, concentrate in probability around a set of equilibrium values. Such properties are conveniently expressed using the theory of large deviations. In fact, we have closely followed the principles of equilibrium statistical mechanics formulated in the language of large deviations, as presented, for instance, in Touchette2009. A major ingredient in deriving the large deviation results is the long-range character of the interactions, because it leads to the existence of a representation function for the energy. This is a major simplification, as it allows us to compute the probability of a macrostate with respect to the uniform measure and then deduce the probability with respect to the microcanonical measure. We have emphasized this point by considering another observable, the coarse-grained vorticity field (for which there is no representation function for the vorticity distribution) and by making the analogy with a simpler system, the mean-field Ising model.
The large deviation principle leads to a variational problem characterizing the most probable macrostates. This allows us to compute coarse-grained vorticity fields which should correspond in practice to the final state of the system, if ergodicity holds. This provides a statistical explanation of the spontaneous emergence of coherent structures in two-dimensional flows. The equilibrium states obtained may depend on the choice of probability measure in phase space: we have discussed the relations between the standard ensembles of statistical mechanics and given a connection with the concavity properties of the entropy.
In the simpler context of the energy-enstrophy measure, we have explained that the energy spectrum observable also satisfies a large deviation principle, which shows that the most probable state has all its energy condensed in the gravest mode. This is physically consistent with the familiar ideas of inverse cascade of energy and energy condensation for 2D flows.
References
- (1) Arnold, V.I.: Mathematical Methods of Classical Mechanics, 2nd edn. Springer (1989)
- (2) Batchelor, G.: Computation of the energy spectrum in homogeneous two-dimensional turbulence. Phys. Fluids 12 (Suppl. II), 233–239 (1969). DOI 10.1063/1.1692443
- (3) Baxter, R.J.: Exactly Solved Models in Statistical Mechanics. Academic Press (1982)
- (4) Berlin, T.H., Kac, M.: The spherical model of a ferromagnet. Phys. Rev. 86, 821 (1952). DOI 10.1103/PhysRev.86.821
- (5) Biferale, L., Musacchio, S., Toschi, F.: Inverse Energy Cascade in Three-Dimensional Isotropic Turbulence. Phys. Rev. Lett. 108 (16), 164501 (2012). DOI 10.1103/PhysRevLett.108.164501. URL http://link.aps.org/doi/10.1103/PhysRevLett.108.164501
- (6) Boffetta, G., Ecke, R.E.: Two-Dimensional Turbulence. Ann. Rev. Fluid Mech. 44, 427 (2012). DOI 10.1146/annurev-fluid-120710-101240
- (7) Boucher, C., Ellis, R.S., Turkington, B.: Spatializing random measures: Doubly indexed processes and the large deviation principle. Annals of Probability 27, 297–324 (1999)
- (8) Boucher, C., Ellis, R.S., Turkington, B.: Derivation of maximum entropy principles in two-dimensional turbulence via large deviations. J. Stat. Phys. 98 (5-6), 1235–1278 (2000)
