Irreversibility and typicality: A simple analytical result for the Ehrenfest model
Marco Baldovin, Lorenzo Caprini, Angelo Vulpiani

TL;DR
This paper uses the Ehrenfest model to analytically demonstrate that macroscopic irreversibility is a typical property of stochastic processes, showing most trajectories behave irreversibly and align with ensemble averages, confirmed by simulations and proofs.
Contribution
It provides a simple analytical framework clarifying the typicality of irreversibility in the Ehrenfest model, supported by rigorous proofs and numerical validation.
Findings
Most trajectories exhibit irreversible behavior
Trajectories stay close to ensemble averages
Rigorous proof of typicality in the thermodynamic limit
Abstract
With the aid of simple analytical computations for the Ehrenfest model, we clarify some basic features of macroscopic irreversibility. The stochastic character of the model allows us to give a non-ambiguous interpretation of the general idea that irreversibility is a typical property: for the vast majority of the realizations of the stochastic process, a single trajectory of a macroscopic observable behaves irreversibly, remaining "very close" to the deterministic evolution of its ensemble average, which can be computed using probability theory. The validity of the above scenario is checked through simple numerical simulations and a rigorous proof of the typicality is provided in the thermodynamic limit.
Marco Baldovin
Dipartimento di Fisica, Università di Roma “Sapienza”, P.le A.Moro 5 I-00185, Rome, Italy
Lorenzo Caprini
Gran Sasso Science Institute (GSSI), Via F.Crispi 7, I-67100 L’Aquila, Italy
Angelo Vulpiani
Dipartimento di Fisica, Università di Roma Sapienza, P.le Aldo Moro 5, 00185, Rome, Italy
Centro Linceo Interdisciplinare “B. Segre”, Accademia dei Lincei, Rome, Italy
I Introduction
Understanding irreversibility from first principles is an old and noble problem of physics. The technical reason for its difficulty is rather clear: on the one hand, the microscopic world is ruled by laws (Hamilton's equations) which are invariant under time reversal ($t \to -t$, $\mathbf{q} \to \mathbf{q}$, $\mathbf{p} \to -\mathbf{p}$, where $\mathbf{q}$ and $\mathbf{p}$ are the positions and momenta of the system); the macroscopic world, on the other hand, is described by irreversible equations, e.g. Fick's equation for diffusion Boltzmann [2003], Cercignani [1998], Emch and Liu [2002], Castiglione et al. [2008].
How is it possible to reconcile the two facts above? On this topic there is a long-standing debate, which started with Boltzmann's celebrated $H$-theorem and the well-known criticisms by Loschmidt (about reversibility) and Zermelo (about recurrence). Of course we cannot enter into a detailed discussion of this fascinating chapter of statistical mechanics Cercignani [1998], Zanghì [2005]. Boltzmann and Smoluchowski already understood that Zermelo's criticism is not a serious problem as long as macroscopic systems are considered: basically, because of Kac's lemma, in macroscopic systems the recurrence time is so large that it cannot be observed Cercignani [1998], Chibbaro et al. [2014], Lazarovici and Reichert [2015]. We can summarise Boltzmann's conclusions by saying that irreversibility describes an empirical regularity of macroscopic objects which is valid for a “vast majority” of the possible initial conditions. Such validity for the “vast majority” of initial conditions is often called typicality. According to Lebowitz [1993] (as well as many others), a certain behavior is typical if the set of microscopic states for which it occurs comprises a region whose volume fraction goes to one as the number of molecules grows. We can state that irreversibility is an emergent property Berry [1994], Chibbaro et al. [2014], Zanghì [2005] which appears as the number of degrees of freedom becomes (sufficiently) large; in such a limit, a single observation of the system is enough to determine its macroscopic properties.
Several mathematical results, as well as detailed numerical simulations, support the coherence of the scenario proposed by Boltzmann Cercignani [1998]. Among others, Lanford's work on rarefied gases is particularly important: he was able to prove, in a rigorous fashion, the validity of the Boltzmann equation for short times (of the order of the collision time) in the so-called Boltzmann-Grad limit Lanford [1981].
In spite of the above-mentioned results, irreversibility still remains a somewhat misinterpreted and controversial issue. The reader may appreciate the diversity of opinions from the comments Barnum et al. [1994] on a well-known paper by Lebowitz on Boltzmann's approach to irreversibility. For instance, Prigogine and his school claim that irreversibility is either true on all levels or on none: it cannot emerge as if out of nothing, on going from one level to another Bricmont [1995], Chibbaro et al. [2014], Barnum et al. [1994]. For others, irreversibility either results from (microscopic) chaotic dynamics or is a mere consequence of the interaction with the external environment.
One source of the controversy about the Boltzmann point of view, in particular among philosophers of science, is how to interpret typicality Frigg [2009].
This article aims at supplementing, mainly for pedagogical purposes, the basic aspects of Boltzmann's explanation of macroscopic irreversibility. In order to present a clear, unambiguous analysis we treat a stochastic system, i.e. the celebrated Ehrenfest model, which is nothing but a Markov chain. A simple and neat analytical computation for the model shows, in a precise way, that for a large number of particles each realization of the macroscopic observable is very close, at any time, to its average value (which can be easily computed).
The paper is organized as follows: in Sec. II we introduce the problem of typicality together with some remarks about ensembles and entropies. Sec. III is devoted to the Ehrenfest model, and to some numerical results. Then, we rigorously prove the typicality of a trajectory for such a model in Sec. IV. Finally we summarize the results in Sec. V.
II Remarks about ensembles, entropies and typicality
Traditionally, entropy has played an important role in the treatment of irreversibility; it seems to us that this central role is mainly based on historical grounds. In the present paper we do not discuss irreversibility in terms of entropy, for two main reasons. First, the word entropy can be a source of confusion: for instance the Gibbs entropy $S_G$, defined in terms of the probability distribution function in the $\Gamma$-space, has a completely different behavior from the Boltzmann entropy $S_B$, i.e. the entropy obtained from the probability density of a single particle ($\mu$-space); for a discussion on this point see Castiglione et al. [2008], Cerino et al. [2016], Lebowitz [1993]. Second, at a practical as well as at a conceptual level, for understanding irreversibility it is enough to observe that, if the system starts from a typical far-from-equilibrium initial state, the macroscopic observables stay close to their mean values during the evolution, and therefore they approach their equilibrium values. In Sec. IV we will discuss this point in a precise mathematical way.
Even if probability theory has great relevance for statistical mechanics, it is necessary to avoid mixing mathematics and physics. It is true that the standard formulation of statistical mechanics is built on statistical ensembles, but this approach can be seen as a mere stratagem, and it is ultimately unconvincing in the following sense: in experiments, as well as in numerical computations, we are forced to treat a single system, and we do not have access to a collection of identical systems Caprara and Vulpiani [2018], H. Zurek [2018]. At a physical level the relevant problem is: what is the link between the probabilistic computations (i.e. the averages over an ensemble) and the actual results obtained by looking at a single realization (or sample) of the system under investigation?
In particular one should be careful to avoid confusing irreversibility with the relaxation of the phase space probability distribution Cerino et al. [2016], Lebowitz [1993]. In the presence of “good chaotic properties” (mixing systems) the probability density, $\rho(\mathbf{x}, t)$, relaxes (in a suitable technical sense) to the invariant distribution for large times, i.e. $\rho(\mathbf{x}, t) \to \rho_{\rm inv}(\mathbf{x})$ as $t \to \infty$. This property is remarkable, and rather important in the dynamical systems context; still, it cannot be considered physical irreversibility. Actually, from a physical point of view, the true question is to show that a single macroscopic system shows an irreversible behavior, for a “generic” initial state. In crude terms the interesting point is to understand the cooling of a single (initially hot) pot and not the behavior of an ensemble of pots.
In deterministic systems a delicate point is how to interpret typicality, i.e. the precise mathematical meaning of “vast majority”. As already mentioned, for many scientists active in statistical mechanics “vast majority” means with probability close to $1$ with respect to the Lebesgue measure, i.e. in physical terms the microcanonical distribution Goldstein [2012], Lebowitz [1993], H. Zurek [2018]. Such an interpretation has been considered unconvincing by some authors, who criticized the privileged status of the Lebesgue measure Frigg [2009]. Although, in our opinion, there are very good reasons to privilege the microcanonical distribution, we do not insist on such a controversial aspect. In the following we will consider a well-known stochastic model Ehrenfest and Ehrenfest [2015], Kac [1957] where there is no ambiguity about the meaning of “vast majority”. An analysis of simplified stochastic models of this kind can help very much, in particular at a pedagogical level, for understanding irreversibility (see for instance Gottwald and Oliver [2009] for a discussion of the Kac ring model).
III The Ehrenfest model: heuristic results
III.1 Description and basic properties
To keep the paper self-contained, we briefly recall the Ehrenfest model Ehrenfest and Ehrenfest [2015]. We consider $N$ particles, labeled with an index $i = 1, \dots, N$, and two boxes, A and B: at the beginning, each particle can be placed either in box A or in box B. At every time step we randomly choose an integer between 1 and $N$, with a uniform distribution, and we move the corresponding particle from its box to the other one. The “macroscopic” state of the model at time $t$ is identified by $n_t$, the number of particles in A, while the corresponding “microscopic” configuration is defined by the (labeled) particles which actually are in box A at that time. The Markovian evolution of the macroscopic state is ruled by the transition probabilities for $n_t = n$ to become $n_{t+1} = n \pm 1$:
$$P_{n \to n-1} = \frac{n}{N}, \qquad P_{n \to n+1} = 1 - \frac{n}{N}. \tag{1}$$
As a consequence, for any starting value $n_0$, during $T$ steps the macroscopic state can realize up to $2^T$ different trajectories. Let us notice that in the Ehrenfest model detailed balance holds: in the stochastic context this property is, in some sense, the counterpart of time reversibility.
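The detailed balance property mentioned above can be checked directly. The following is a minimal sketch, assuming the binomial stationary distribution $\pi(n) = \binom{N}{n} 2^{-N}$ (a standard result for this chain, not written out in the text); the function names are illustrative.

```python
from math import comb

def transition_probs(n, N):
    """Return (P(n -> n-1), P(n -> n+1)) for the Ehrenfest chain."""
    return n / N, 1 - n / N

def stationary(n, N):
    """Assumed stationary distribution: binomial, pi(n) = C(N, n) / 2^N."""
    return comb(N, n) / 2**N

def detailed_balance_holds(N, tol=1e-12):
    """Check pi(n) P(n -> n+1) == pi(n+1) P(n+1 -> n) for every n."""
    for n in range(N):
        _, up = transition_probs(n, N)
        down, _ = transition_probs(n + 1, N)
        flow_up = stationary(n, N) * up
        flow_down = stationary(n + 1, N) * down
        if abs(flow_up - flow_down) > tol * max(flow_up, flow_down):
            return False
    return True

print(detailed_balance_holds(50))
```

The check compares the probability flow between each pair of neighbouring macroscopic states in the two directions, which is what detailed balance requires.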
This model can be seen as a crude description of a system with $N$ particles in two vessels (A and B), connected by a narrow pipe, as shown in Fig. 1. Of course in this case the true dynamics is deterministic, while in the Ehrenfest model the evolution is stochastic. As a link between the original Hamiltonian system and its Markovian counterpart, we can imagine associating with each realization of the Ehrenfest model a set of initial conditions in the deterministic system.
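The update rule of the model is simple enough to be simulated in a few lines. The sketch below follows only the macroscopic state (the number of particles in box A); the function name, the seed and the parameter values are illustrative choices, not taken from the paper.

```python
import random

def ehrenfest_trajectory(N, n0, T, rng):
    """Simulate n_t for t = 0..T: at each step one uniformly chosen particle switches box."""
    n = n0
    traj = [n]
    for _ in range(T):
        # The chosen particle sits in box A with probability n/N.
        if rng.random() < n / N:
            n -= 1  # it leaves box A
        else:
            n += 1  # it enters box A
        traj.append(n)
    return traj

rng = random.Random(0)
traj = ehrenfest_trajectory(N=1000, n0=1000, T=5000, rng=rng)
print(traj[0], traj[-1])
```

Tracking only the occupation number is enough here because the transition probabilities depend on the microscopic configuration only through that number.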
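The simplicity of the model allows us to study the statistical features of the evolution of an ensemble of (microscopic) initial conditions in the same (macroscopic) state, $n_0$, by computing the evolution of the first and second conditional moments of the state $n_t$, namely $\langle n_t | n_0 \rangle$ and $\langle n_t^2 | n_0 \rangle$; for the sake of simplicity, we omit the conditional argument in the average $\langle \cdot \rangle$. It is easy to show (see Appendix A) that:
$$\langle n_t \rangle = \frac{N}{2} + \left(1 - \frac{2}{N}\right)^t \left(n_0 - \frac{N}{2}\right), \tag{2}$$
$$\sigma_t^2 = \langle n_t^2 \rangle - \langle n_t \rangle^2 = \frac{N}{4}\left[1 - \left(1 - \frac{4}{N}\right)^t\right] + \Delta_0^2 \left[\left(1 - \frac{4}{N}\right)^t - \left(1 - \frac{2}{N}\right)^{2t}\right], \tag{3}$$
where $\Delta_0 = n_0 - N/2$. From Eq. (2) it is clear that $\langle n_t \rangle$ relaxes monotonically to the equilibrium (mean) value $N/2$, with an exponential decay ruled by the characteristic time $-1/\ln(1 - 2/N) \simeq N/2$. In a similar way the conditional standard deviation $\sigma_t$ tends to its equilibrium value $\sqrt{N}/2$ with a characteristic time of order $N$.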
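As a cross-check of the expressions for the conditional mean and variance, one can evolve the full probability distribution of $n_t$ with the master equation and compare its moments with the closed forms. This is a minimal sketch: the closed forms are written as they follow from the recursion of Appendix A, $\langle n_t \rangle = N/2 + (1-2/N)^t (n_0 - N/2)$ and $\sigma_t^2 = (N/4)[1-(1-4/N)^t] + (n_0 - N/2)^2[(1-4/N)^t - (1-2/N)^{2t}]$, and all names and parameter values are illustrative.

```python
def mean_closed(N, n0, t):
    """Closed-form conditional mean <n_t>."""
    return N / 2 + (1 - 2 / N) ** t * (n0 - N / 2)

def var_closed(N, n0, t):
    """Closed-form conditional variance sigma_t^2 (form assumed from Appendix A)."""
    d0 = n0 - N / 2
    a, b = (1 - 4 / N) ** t, (1 - 2 / N) ** (2 * t)
    return (N / 4) * (1 - a) + d0 ** 2 * (a - b)

def step(p, N):
    """One step of the master equation for p[n] = Prob(n_t = n)."""
    q = [0.0] * (N + 1)
    for n, pn in enumerate(p):
        if pn == 0.0:
            continue
        if n > 0:
            q[n - 1] += pn * n / N        # a particle leaves box A
        if n < N:
            q[n + 1] += pn * (1 - n / N)  # a particle enters box A
    return q

def exact_moments(N, n0, t):
    """Mean and variance of n_t from the exact distribution after t steps."""
    p = [0.0] * (N + 1)
    p[n0] = 1.0
    for _ in range(t):
        p = step(p, N)
    mean = sum(n * pn for n, pn in enumerate(p))
    var = sum((n - mean) ** 2 * pn for n, pn in enumerate(p))
    return mean, var

N, n0, t = 100, 100, 150
mean, var = exact_moments(N, n0, t)
print(mean, mean_closed(N, n0, t))
print(var, var_closed(N, n0, t))
```

Evolving the distribution avoids any sampling noise, so the comparison is limited only by floating-point rounding.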
III.2 First numerical clues of typicality
Let us note that Eqs. (2) and (3) provide a description of the process only at an average level, giving no information about the single realization. We will see that under the assumption $N \gg 1$ (somewhat equivalent to the thermodynamic limit in real physical systems) almost all actual realizations are arbitrarily close to the average evolution at any time, i.e. almost all trajectories are “typical”.
Simple numerical computations suggest that a single trajectory is typical in the sense discussed above, i.e. that behaviors very different from the average one are extremely rare. In Fig. 2, panels (a) and (b), we show the evolution in time of several single realizations of $n_t$ for the same $n_0$: the exponential behavior clearly emerges for each trajectory. Panels (c) and (d), on the other hand, display the same trajectories together with a confidence interval (light blue region), obtained by considering a stripe around the conditional average value $\langle n_t \rangle$, as computed in Eqs. (2) and (3). Each trajectory is contained in this stripe for almost all times: since the trajectories are closer to their mean value as $N$ increases, Fig. 2 can be seen as a first, rough numerical clue of the emergence of typicality in the limit $N \to \infty$.
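A numerical experiment in the spirit of the one described above can be sketched as follows. The stripe half-width of three standard deviations, the trajectory length $T = 5N$, the initial condition $n_0 = N$ and the seed are assumptions made for illustration; the text does not specify the values used for Fig. 2.

```python
import math
import random

def mean_closed(N, n0, t):
    """Conditional mean <n_t>."""
    return N / 2 + (1 - 2 / N) ** t * (n0 - N / 2)

def std_closed(N, n0, t):
    """Conditional standard deviation sigma_t (clipped at zero against rounding)."""
    d0 = n0 - N / 2
    a, b = (1 - 4 / N) ** t, (1 - 2 / N) ** (2 * t)
    return math.sqrt(max((N / 4) * (1 - a) + d0 ** 2 * (a - b), 0.0))

def stripe_statistics(N, T, k=3.0, seed=0):
    """Return (fraction of times inside the k-sigma stripe, max |n_t - <n_t>| / N)."""
    rng = random.Random(seed)
    n0 = n = N  # assumed far-from-equilibrium start: all particles in box A
    inside, max_rel = 0, 0.0
    for t in range(T + 1):
        dev = abs(n - mean_closed(N, n0, t))
        if dev <= k * std_closed(N, n0, t) + 1e-9:
            inside += 1
        max_rel = max(max_rel, dev / N)
        n += -1 if rng.random() < n / N else 1
    return inside / (T + 1), max_rel

for N in (100, 10_000):
    frac, max_rel = stripe_statistics(N, T=5 * N)
    print(N, round(frac, 3), round(max_rel, 4))
```

The second printed number is the largest deviation relative to $N$: it is expected to shrink as $N$ grows, since the fluctuations scale as $\sqrt{N}$ while the signal scales as $N$.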
III.3 Maximal deviations from the average
In the next Section, using just simple methods of probability theory, we will show that, starting from a far-from-equilibrium initial condition (e.g. $n_0 = N$), $n_t$ exhibits an irreversible behavior in the limit $N \to \infty$, namely it remains close to $\langle n_t \rangle$, which exponentially decays to its equilibrium value $N/2$. More precisely, defining the quantity
[TABLE]
i.e. the maximal deviation of $n_t$ from its average value along a trajectory of $T$ time-steps, we will show that
[TABLE]
where is , and both the constants and tend to zero in the limit .
Before we exhibit a mathematical proof of relation (5), let us provide numerical evidence of its validity. In Fig. 3 we show the maximal value of this deviation over a number of different trajectories of fixed length, as a function of the number of trajectories. The scaling of such a quantity with the number of realizations is particularly interesting, since the number of realizations we need in order to observe a certain maximal deviation from the average is an indication of its probability.
Using Kac's lemma for recurrence times Kac [1947, 1957], in Appendix B we derive the following scaling:
[TABLE]
under the assumption that the maximal deviation along a trajectory, for large values, is distributed as a Gaussian variable. Let us remark that detailed statistical features of extremal events can be established in the framework of Gumbel's theory Gumbel [1958]: however, since we are only interested in the scaling law of the maximal deviation vs the number of realizations, we can avoid the use of the complete theory by a direct application of Kac's lemma. Let us also remark that with a different distribution for the maximal deviation, still exponentially decaying to zero, only minor changes occur in the result. We observe that Eq. (6) is in fair agreement with the numerical data presented in Fig. 3.
Roughly speaking, Eq. (6) means that the maximal observed value of the deviation grows very slowly with the number of trials, implying that significant deviations from the average are extremely rare.
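The slow growth of the largest observed deviation with the number of trials can be explored with a short simulation. In this sketch the deviation is taken as the unnormalized $\max_t |n_t - \langle n_t \rangle|$, the initial condition is $n_0 = N$, and the values of $N$, $T$ and the numbers of trials are illustrative; the normalization and parameters used for Fig. 3 are not reproduced here.

```python
import random

def max_deviation(N, T, rng):
    """Largest |n_t - <n_t>| along one trajectory of T steps started from n_0 = N."""
    n0 = n = N
    decay, factor = 1.0, 1 - 2 / N
    worst = 0.0
    for _ in range(T):
        n += -1 if rng.random() < n / N else 1
        decay *= factor  # decay equals (1 - 2/N)^t after t steps
        mean = N / 2 + decay * (n0 - N / 2)
        worst = max(worst, abs(n - mean))
    return worst

def running_maxima(N, T, M_values, seed=0):
    """Largest deviation seen among the first M trajectories, for each M in M_values."""
    rng = random.Random(seed)
    maxima, current = {}, 0.0
    for m in range(1, max(M_values) + 1):
        current = max(current, max_deviation(N, T, rng))
        if m in M_values:
            maxima[m] = current
    return maxima

result = running_maxima(N=100, T=200, M_values=(1, 10, 100, 1000))
for M, delta in result.items():
    print(M, round(delta, 2))
```

Because the same stream of trajectories is reused for every value of the number of trials, the reported maxima are non-decreasing by construction, and only their rate of growth carries information.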
IV A simple analytical result
In order to understand how much a single realization of $n_t$ deviates from its average $\langle n_t \rangle$, i.e. how “typical” a single trajectory is in the sense discussed in the previous sections, we focus on the probability:
[TABLE]
where is a constant (it only depends on ) and by we mean . Our goal is to prove that
[TABLE]
In other words, we aim at showing that a single realization is almost surely contained in a stripe , at least up to times , being a quantity that grows slower than when tends to infinity.
First of all let us define the following sets:
[TABLE]
where and is the integer part of the real number ; is the set of the equidistant discrete times separated by , while the remaining times of the trajectory form the set . Defining as the event that , where is a constant, we can write the following lower bound for :
[TABLE]
that can be decomposed, using the definition of the conditional probability , as
[TABLE]
Because of the Markovian character of the process, the above inequality can be written as:
[TABLE]
We will see that, with a suitable choice of , such decomposition allows us to prove our statement (8). Now we need to study the two factors of the right hand side of Eq. (12): the first one can be estimated using the transition rules of the Markov chain (1), whereas for the second one we will apply the Chebyshev’s inequality.
Since at every time step the value of $n_t$ can only increase (or decrease) by 1, and $\langle n_t \rangle$ cannot change by more than 1 either, we can write the following inequality:
[TABLE]
i.e. during time steps, the distance between a particular trajectory and the average cannot spread more than . We could consider even stronger bounds on , but inequality (13) is enough to prove our result. In particular, if relation (13) implies
[TABLE]
Let us notice that if we consider:
[TABLE]
if is large enough, equation (14) holds as soon as
[TABLE]
In order to evaluate the product in the left hand side of Eq.(12) we use a different strategy: denoting with the complementary event of , for every Chebyshev’s inequality Gnedenko [1998] ensures that
[TABLE]
where we have used the bound $\sigma_t^2 \le N/4$ (Eq. (31) of Appendix A). Noting that
[TABLE]
from (17) we easily get
[TABLE]
Finally, using relation (19) to estimate the product in Eq.(12), we find:
[TABLE]
where is the cardinality of . If we choose to be proportional to , we can always find a constant , independent of , such that
[TABLE]
It is easy to show that, if the constraint (16) holds and, in addition, we choose such that:
[TABLE]
the right hand side of Eq. (20) approaches to one as when (note indeed that from inequalities (16) and (22) one has ). This completes the proof, since Eq. (15) ensures that in such limit.
As an example, we have that the couple and satisfies the relations (16) and (22), giving .
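The Chebyshev step of the proof can be illustrated numerically at a single fixed time. The sketch below computes the exact probability that $n_t$ deviates from its mean by at least a threshold $k$ and compares it with the Chebyshev bound $\sigma_t^2 / k^2$ and with the uniform bound $N / (4k^2)$ that follows from $\sigma_t^2 \le N/4$. The threshold $k$ and the parameter values are illustrative assumptions and do not correspond to the specific choices made in the proof.

```python
def step(p, N):
    """One step of the master equation for p[n] = Prob(n_t = n)."""
    q = [0.0] * (N + 1)
    for n, pn in enumerate(p):
        if pn == 0.0:
            continue
        if n > 0:
            q[n - 1] += pn * n / N
        if n < N:
            q[n + 1] += pn * (1 - n / N)
    return q

def chebyshev_check(N, n0, t, k):
    """Return (exact tail probability, Chebyshev bound, uniform bound N / (4 k^2))."""
    p = [0.0] * (N + 1)
    p[n0] = 1.0
    for _ in range(t):
        p = step(p, N)
    mean = sum(n * pn for n, pn in enumerate(p))
    var = sum((n - mean) ** 2 * pn for n, pn in enumerate(p))
    tail = sum(pn for n, pn in enumerate(p) if abs(n - mean) >= k)
    return tail, var / k ** 2, N / (4 * k ** 2)

tail, cheb, uniform = chebyshev_check(N=200, n0=200, t=300, k=20)
print(tail, cheb, uniform)
```

The exact tail is typically much smaller than the Chebyshev estimate, which is why the text notes that stronger bounds could be used; the weaker one is nevertheless sufficient for the argument.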
V Conclusions
Time irreversibility is an experimental fact whose validity must be accepted as a (quite obvious) empirical property of macroscopic systems. On the other hand it is not easy at all to build a coherent theory that reconciles such macroscopic behavior with the reversible nature of the laws at the microscopic level (i.e. Newton's equations). The difficult point is to give mathematical dignity to the great visionary conjecture of Boltzmann, stating that in macroscopic systems an irreversible behavior occurs for the overwhelming majority of the initial conditions.
One of the most important steps in the ambitious program of formalizing the idea that irreversibility is a typical property is due to Lanford: he was able to show the validity of Boltzmann's conjecture for rarefied gases in a suitable limit. Lanford proved his result only for a short time (of the order of one collision time); in addition, some authors claim that the use of the Lebesgue measure in the formulation of the idea of typicality is questionable.
In the present paper we give an additional contribution, mainly at a pedagogical level, in support of Boltzmann's conjecture. For the Ehrenfest model we show a result which shares the same philosophy as Lanford's work: in the limit $N \to \infty$, a single trajectory of $n_t$ is very close to its mean value $\langle n_t \rangle$, with probability close to $1$.
Due to the stochastic nature of our system, it is possible to obtain analytical results up to large times, and there is no ambiguity about the possible interpretations of typicality.
Appendix A Derivation of Eqs. (2) and (3)
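Let us start with the derivation of Eq. (2). Defining the variable $\xi_t$ as a random quantity which takes values $\pm 1$ with probabilities $1 - n_t/N$ and $n_t/N$, respectively, we have the recurrence relation:
$$n_{t+1} = n_t + \xi_t. \tag{23}$$
Taking the conditional average of the state $n_{t+1}$ for a given $n_0$, using Eq. (23), we get:
$$\langle n_{t+1} \rangle = \langle n_t \rangle + \langle \xi_t \rangle = \left(1 - \frac{2}{N}\right) \langle n_t \rangle + 1, \tag{24}$$
where we have just applied the definition of $\xi_t$. Defining the variable $\Delta_t = n_t - N/2$, we can replace Eq. (24) with a recursive relation for $\langle \Delta_t \rangle$:
$$\langle \Delta_{t+1} \rangle = \left(1 - \frac{2}{N}\right) \langle \Delta_t \rangle. \tag{25}$$
Fixing the initial state $\Delta_0$, corresponding to some $n_0$, we finally get:
$$\langle \Delta_t \rangle = \left(1 - \frac{2}{N}\right)^t \Delta_0, \tag{26}$$
which, in terms of $n_t$ and the initial state $n_0$, reads:
$$\langle n_t \rangle = \frac{N}{2} + \left(1 - \frac{2}{N}\right)^t \left(n_0 - \frac{N}{2}\right). \tag{27}$$
Analogous calculations allow us to compute Eq. (3). Using the same strategy as for Eq. (24), together with $\xi_t^2 = 1$ and $\langle \xi_t | n_t \rangle = -2\Delta_t/N$, we can obtain a relation for $\langle \Delta_t^2 \rangle$:
$$\langle \Delta_{t+1}^2 \rangle = \left(1 - \frac{4}{N}\right) \langle \Delta_t^2 \rangle + 1. \tag{28}$$
Applying this equation recursively and using that $\sum_{k=0}^{t-1} (1 - 4/N)^k = \frac{N}{4}\left[1 - (1 - 4/N)^t\right]$, we get:
$$\langle \Delta_t^2 \rangle = \left(1 - \frac{4}{N}\right)^t \Delta_0^2 + \frac{N}{4}\left[1 - \left(1 - \frac{4}{N}\right)^t\right]. \tag{29}$$
Using Eq. (27) (recall that $\langle \Delta_t \rangle = \langle n_t \rangle - N/2$) it is straightforward to derive the variance, conditioned to the initial value $n_0$:
$$\sigma_t^2 = \langle \Delta_t^2 \rangle - \langle \Delta_t \rangle^2 = \frac{N}{4}\left[1 - \left(1 - \frac{4}{N}\right)^t\right] + \Delta_0^2 \left[\left(1 - \frac{4}{N}\right)^t - \left(1 - \frac{2}{N}\right)^{2t}\right], \tag{30}$$
which leads to Eq. (3). As a remark, we note that, since $1 - 4/N \le (1 - 2/N)^2$, the second term of Eq. (30) is never positive, so that Eq. (30) is bounded by:
$$\sigma_t^2 \le \frac{N}{4}, \tag{31}$$
the stationary variance of the Ehrenfest model.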
Appendix B Scaling law for the maximal deviation
Let us consider the Markov chain (1) starting from a fixed initial condition $n_0$. For each trajectory, Eq. (4) defines the largest deviation from the average (within a given time), and we consider the maximal occurrence of this quantity in a number of independent realizations. Our goal is to find an (approximate) relation between this maximal value and the number of realizations. We assume an asymptotic behavior of the probability density function of the largest deviation for a single trajectory. For instance, we can assume that such a distribution decays with a Gaussian tail,
[TABLE]
The (average) number of independent attempts that are needed in order to observe a value of the deviation larger than a given threshold is simply given by Kac's lemma Kac [1957]:
[TABLE]
With the assumption (32), in the limit , the above relation can be written as
[TABLE]
Inserting into the above relation, we finally get the scaling law in Eq. (6). It is easy to understand that the above argument for the logarithmic dependence of the maximal deviation on the number of realizations is rather robust and does not depend on the details of the distribution: assuming a different, still exponentially decaying, asymptotic shape of the tail, one obtains
[TABLE]
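The logarithmic dependence discussed in this Appendix can be illustrated on the simplest case of independent standard Gaussian variables, for which the tail probability decays as $\exp(-x^2/2)$, so that the largest of $M$ samples is expected to grow roughly as $\sqrt{2 \ln M}$. This is a sketch under that assumption; the standard normal distribution, the symbol $M$ for the number of samples and the seed are illustrative choices and are not the quantities simulated in the paper.

```python
import math
import random

def max_of_gaussians(M, rng):
    """Largest value among M independent standard Gaussian samples."""
    return max(rng.gauss(0.0, 1.0) for _ in range(M))

rng = random.Random(1)
for M in (10**2, 10**3, 10**4, 10**5):
    observed = max_of_gaussians(M, rng)
    predicted = math.sqrt(2 * math.log(M))  # leading-order estimate for large M
    print(M, round(observed, 2), round(predicted, 2))
```

Increasing the number of samples by three orders of magnitude changes the estimate only from about 3 to about 4.8, which is the kind of slow growth the scaling argument describes.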
Acknowledgments
We thank M. Falcioni and R. Figari for a critical reading of the manuscript.
References
- [1] Boltzmann [2003] Ludwig Boltzmann. Lectures on Gas Theory. Dover Publications, New York, reprint edition, 2003.
- [2] Cercignani [1998] Carlo Cercignani. Ludwig Boltzmann: The Man Who Trusted Atoms. Oxford University Press, Oxford, 1998.
- [3] Emch and Liu [2002] Gerard G. Emch and Chuang Liu. The Logic of Thermostatistical Physics. Springer-Verlag, Berlin Heidelberg, 2002.
- [4] Castiglione et al. [2008] Patrizia Castiglione, Massimo Falcioni, Annick Lesne, and Angelo Vulpiani. Chaos and Coarse Graining in Statistical Mechanics. Cambridge University Press, Cambridge, UK, 2008.
- [5] Zanghì [2005] Nino Zanghì. I fondamenti concettuali dell’approccio statistico in fisica. In Valia Allori, Mauro Dorato, Federico Laudisa, and Nino Zanghì, editors, La natura delle cose: Introduzione ai fondamenti e alla filosofia della fisica, pages 202–247. Carocci Ed., Roma, 2005.
- [6] Chibbaro et al. [2014] Sergio Chibbaro, Lamberto Rondoni, and Angelo Vulpiani. Reductionism, Emergence and Levels of Reality: The Importance of Being Borderline. Springer International Publishing, 2014.
- [7] Lazarovici and Reichert [2015] Dustin Lazarovici and Paula Reichert. Typicality, Irreversibility and the Status of Macroscopic Laws. Erkenntnis, 80:689–716, 2015.
- [8] Lebowitz [1993] Joel L. Lebowitz. Boltzmann’s Entropy and Time’s Arrow. Physics Today, 46:32, 1993. doi: 10.1063/1.881363.
