Notions of the ergodic hierarchy for curved statistical manifolds
Ignacio S. Gomez

TL;DR
This paper extends the ergodic hierarchy to curved statistical manifolds using information geometry, linking statistical independence with Hamiltonian dynamics and illustrating with examples like harmonic oscillators and Gaussian ensembles.
Contribution
It introduces a geometric framework for ergodic properties on curved manifolds, connecting statistical models with physical systems and providing new tools for analyzing Hamiltonian dynamics.
Findings
Scalar curvature acts as a global indicator of dynamics.
Correlation between microvariables follows a power law in temperature.
The geometric approach applies to systems with phase transitions.
Notions of the ergodic hierarchy for curved statistical manifolds
Ignacio S. Gomez
IFLP, UNLP, CONICET, Facultad de Ciencias Exactas, Calle 115 y 49, 1900 La Plata, Argentina
Abstract
We present an extension of the ergodic, mixing, and Bernoulli levels of the ergodic hierarchy for statistical models on curved manifolds, making use of elements of information geometry. This extension focuses on the notion of statistical independence between the microscopic variables of the system. Moreover, we establish an intimate relationship between statistical models and families of probability distributions belonging to the canonical ensemble, which for the case of quadratic Hamiltonian systems provides a closed form for the correlations between the microvariables in terms of the temperature of the heat bath as a power law. From this we obtain an information geometric method for studying Hamiltonian dynamics in the canonical ensemble. We illustrate the results with two examples: a pair of interacting harmonic oscillators presenting phase transitions, and the Gaussian ensembles. In both examples the scalar curvature turns out to be a global indicator of the dynamics.
keywords:
Information geometry, Statistical models, 2D correlated model, Ergodic hierarchy, IGEH, Canonical ensemble
1 Introduction
The possibility that relevant features of the dynamics of a system could be obtained from the differential geometric structure of probability distributions gave rise to the first encounter between differential geometry and probability theory. In this sense, the geometrization of thermodynamics and statistical mechanics constituted the most important achievement in the subject, with several approaches such as the one based on the internal energy considered by Weinhold [1], or the Ruppeiner metric given by the second moments of thermodynamical fluctuations [2], among others. With the same Riemannian character, other formulations based on the thermodynamics of parameters were established in the field of statistical mechanics, such as the foundational works of Rao [3] and Amari [5], and also those of Ingarden [6] and Janiszek [7]. The successful application of this vast body of approaches to the characterization of several phenomena, such as phase transitions and their critical points in non-ideal gases, established them as a discipline within information theory, called Information Geometry. Curved statistical manifolds are the subject of study of information geometry, and they are endowed with the Fisher–Rao metric [3], which in turn is linked to the concepts of entropy and Fisher information. Generalized extensions of information geometry [8, 9, 10] with regard to the nonextensive formulation of statistical mechanics [11] have also been considered. The utility of information geometry is not limited to thermodynamics and statistical mechanics. For instance, it has been applied in quantum mechanics, leading to a quantum generalization of the Fisher metric [12], and recently also in nuclear plasmas [13]. In particular, an application of information geometry to chaos can be performed by considering complexity on curved manifolds [14, 15, 16, 17].
In this approach, asymptotic expressions for information measures are obtained by means of geodesic equations, leading to a criterion for characterizing global chaos on statistical manifolds [17]: the more negative the curvature, the more chaotic the dynamics. As usual, chaos can be characterized in terms of the divergence of initially nearby trajectories [18]. For statistical models this condition results in the divergence of geodesic paths on the statistical manifold and constitutes a local criterion for chaos.
Besides, in dynamical systems theory, the ergodic hierarchy (EH) characterizes chaotic behavior in terms of a type of correlation between subsets of the phase space [19, 20]. In the asymptotic limit of large times, the EH establishes that the dynamics is more chaotic when the correlation decays faster. According to the correlation decay, the four levels of the EH are, from the weakest to the strongest: ergodic, mixing, Kolmogorov, and Bernoulli. In particular, in mixing systems any two subsets sufficiently separated in time can be considered as “statistically independent”, which allows one to use a statistical description of the behavior of the system. In quantum chaos, statistical independence is present in the universal statistical properties of energy levels, which are given by the Gaussian ensembles [21, 22, 23, 24]. In Gaussian ensemble theory one assumes that in a fully chaotic quantum system the interactions are neglected in such a way that the Hamiltonian matrix elements can be considered statistically independent [25]. Related to this, in [26, 27, 28] a quantum extension of the EH was proposed, called the quantum ergodic hierarchy, which made it possible to characterize the chaotic behavior of the Casati–Prosen model [24] and the kicked rotator [21, 22, 23].
Inspired by the characterizations of quantum chaotic systems made in [26, 27, 28, 29, 30] and making use of curved statistical models, we propose a generalization of the ergodic, mixing and Bernoulli levels of the EH in the context of information geometry, which we call the Information Geometric Ergodic Hierarchy (IGEH). In order to use it, we define a distinguishability measure for a correlated model that allows us to give an upper bound for the correlation of the IGEH. Moreover, considering Hamiltonian systems belonging to the canonical ensemble, we also give a method for characterizing their dynamics in terms of the statistical parameters and the levels of the IGEH.
In this way, our main contribution is two–fold: 1) an intimate connection between statistical models and probability distributions of the canonical ensemble, which allows one to geometrize phase transitions, and 2) an information geometric version of the ergodic hierarchy as an alternative framework for studying chaotic dynamics in curved statistical models.
The paper is organized as follows. In Section 2, we give the notions and concepts of information geometry used throughout the paper, along with a brief description of a correlated model. In Section 3, we briefly review the levels of the ergodic hierarchy. Section 4 is devoted to an information geometric definition of the ergodic hierarchy, obtained by expressing the correlations in terms of probability distributions instead of subsets of phase space. Next, in Section 5 we define a distinguishability measure for the correlated model and give an upper bound for the correlation of the IGEH. In Section 6, we establish the connection between statistical models and the family of probability distributions belonging to the canonical ensemble. For quadratic Hamiltonian systems we show that their associated statistical models are the multivariate Gaussian ones, for which we obtain a closed form of the determinant of the covariance matrix in terms of the Hessian of the Hamiltonian. Here we also give an upper bound for the IG correlation where the temperature of the heat bath is considered as an external parameter. In Section 7, we illustrate the formalism with two examples: a pair of interacting harmonic oscillators presenting phase transitions in the canonical ensemble, and the Gaussian Orthogonal Ensemble (GOE). Also, a panoramic outlook of the IGEH is sketched. Finally, in Section 8 we draw some conclusions and outline future research directions.
2 Elements of information geometry
We begin by introducing some fundamentals and concepts, following the definitions given in [5].
2.1 Statistical models
Given an abstract set $\mathcal{X}$, one can consider the set $\mathcal{M}$ of all the probability density functions (PDFs) defined on $\mathcal{X}$, i.e.
$$\mathcal{M} = \left\{ p:\mathcal{X}\to\mathbb{R} \;\middle|\; p(x)\geq 0 \ \ \forall x\in\mathcal{X}, \ \int_{\mathcal{X}} p(x)\,dx = 1 \right\} \tag{1}$$
where $\mathbb{R}$ is the set of real numbers and the integration must be replaced by a sum when $\mathcal{X}$ is discrete. Consider a subset $\mathcal{S}\subseteq\mathcal{M}$ such that each element of $\mathcal{S}$ may be parameterized using an $M$–real vector $\theta=(\theta^1,\ldots,\theta^M)$ so that
$$\mathcal{S} = \left\{ p_\theta = p(x|\theta) \;\middle|\; \theta=(\theta^1,\ldots,\theta^M)\in\Theta \right\} \tag{2}$$
where $\Theta$ is a subset of $\mathbb{R}^M$. Then, if the mapping $\theta\mapsto p_\theta$ is injective, $\mathcal{S}$ is said to be a statistical model on $\mathcal{X}$. The dimension of the statistical model is that of the macrospace, i.e. it is $M$–dimensional. The physical interpretation of $\mathcal{X}$ and $\Theta$ is as follows. Generally, $x$ represents the microscopic variables of the system under study, which typically are difficult to control, for instance the positions of all the particles in a gas. Thus, $\mathcal{X}$ is called the microspace and $x$ are the microvariables. On the other hand, $\theta$ represents the macroscopic variables that can be measured in an experiment, like the mean value or the moments of the microvariables. It is said that $\Theta$ is the macrospace and $\theta$ are the macrovariables. Since the microspace is fixed by the system, one can only choose the macrospace, and in this way the statistical model is established. Statistical models are therefore system–specific: a statistical model that is useful for one system may not be useful for another.
2.2 Metric structure of the statistical manifold
The next step is to describe the behavior of a system by means of a previously and adequately chosen statistical model. For this, some kind of dynamics must be introduced on the statistical model. In information geometry this is accomplished by means of the Fisher–Rao tensor
$$g_{ij}(\theta) = \int_{\mathcal{X}} p(x|\theta)\,\frac{\partial \log p(x|\theta)}{\partial\theta^i}\,\frac{\partial \log p(x|\theta)}{\partial\theta^j}\,dx \tag{3}$$
where $p(x|\theta)$ is a generic element of $\mathcal{S}$. The metric tensor $g_{ij}$ endows the macrospace with a dynamics in terms of the geodesic equations for the macrovariables $\theta$, i.e. $\mathcal{S}$ becomes a statistical manifold. More precisely, $\mathcal{S}$ is a Riemannian manifold, and its statistical character is due to the fact that the elements of $\mathcal{S}$ are probability distributions.
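As a hedged illustration of the Fisher–Rao tensor, the following sketch evaluates the defining integral numerically for the univariate Gaussian family with macrovariables $(\mu,\sigma)$. This family, the midpoint-rule integration and the function names are assumptions chosen for illustration and are not taken from the text. The well known closed form for this family is $g = \mathrm{diag}(1/\sigma^2, 2/\sigma^2)$.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density p(x | mu, sigma)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def score(x, mu, sigma):
    """Partial derivatives of log p(x | mu, sigma) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma ** 2
    d_sigma = ((x - mu) ** 2 - sigma ** 2) / sigma ** 3
    return (d_mu, d_sigma)

def fisher_metric(mu, sigma, half_width=12.0, steps=20000):
    """Midpoint-rule estimate of g_ij = integral of p * d_i(log p) * d_j(log p)."""
    start = mu - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    g = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        x = start + (k + 0.5) * h
        p = gaussian_pdf(x, mu, sigma)
        s = score(x, mu, sigma)
        for i in range(2):
            for j in range(2):
                g[i][j] += p * s[i] * s[j] * h
    return g

g = fisher_metric(mu=0.3, sigma=1.5)
print(g)  # compare with diag(1/sigma^2, 2/sigma^2)
```

The off-diagonal components vanish by symmetry, so in this toy family the mean and the standard deviation are orthogonal coordinates of the statistical manifold.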
Thus, the main goal of statistical models is to obtain relevant information about the dynamics by means of the geodesic equations and geometrical quantities like the Ricci tensor, the scalar curvature, etc. In this sense, we will use a criterion given by Cafaro et al. [14, 15, 16] to characterize global chaos on statistical models: “the more negative the scalar curvature, the more chaotic the dynamics”. From the metric tensor (3) one can obtain the geodesic equations for the macrovariables along with the following geometrical quantities, which we will use throughout the paper.
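$$\Gamma^{k}_{ij} = \frac{1}{2}\,g^{km}\left(g_{mi,j} + g_{mj,i} - g_{ij,m}\right)$$
$$R_{iklm} = \frac{1}{2}\left(g_{im,kl} + g_{kl,im} - g_{il,km} - g_{km,il}\right) + g_{np}\left(\Gamma^{n}_{kl}\Gamma^{p}_{im} - \Gamma^{n}_{km}\Gamma^{p}_{il}\right)$$
$$R_{ik} = g^{lm}R_{limk}$$
$$R = g^{ik}R_{ik}$$
$$\frac{d^2\theta^k}{d\tau^2} + \Gamma^{k}_{ij}\,\frac{d\theta^i}{d\tau}\frac{d\theta^j}{d\tau} = 0$$
where the comma in the subindexes denotes the partial derivative operation (of first and second orders), $g^{ij}$ is the inverse of $g_{ij}$, and $\tau$ is a parameter that characterizes the geodesic curves.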
2.3 The correlated Gaussian model
Most statistical models used in the literature are the so-called Gaussian models, due to their wide versatility for describing multiple phenomena. These models are obtained by choosing the subfamily $\mathcal{S}$ as the set of multivariate Gaussian distributions. If $x_1,\ldots,x_N$ are the microvariables and there are no correlations between them, then $(\mu_1,\ldots,\mu_N,\sigma_1,\ldots,\sigma_N)$ is the set of macrovariables, where $\mu_i$ and $\sigma_i^2$ correspond to the mean value and the variance of the $i$–th microvariable. However, for a more realistic description of the system the correlations between the microvariables must be taken into account. Considering the family of bivariate (binormal) distributions, one obtains one of the lowest-dimensional Gaussian models that presents correlations, which is given by
$$p(x,y|\mu_x,\mu_y,\sigma_x,\sigma_y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\exp\left\{-\frac{1}{2(1-r^2)}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2r(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y}\right]\right\}, \qquad r = \frac{\sigma_{xy}}{\sigma_x\sigma_y} \tag{9}$$
where $\sigma_{xy}$ is the covariance between $x$ and $y$ and $r$ is the correlation coefficient, which takes values in the range $(-1,1)$. Here the microspace is $\{(x,y)\in\mathbb{R}^2\}$ and the macrospace is $\{(\mu_x,\mu_y,\sigma_x,\sigma_y)\}$. Nevertheless, one can still obtain a non-trivial description by adding the following macroscopic constraint
$$\sigma_x\sigma_y = \Sigma^2 \tag{10}$$
where $\Sigma^2$ is a constant belonging to $\mathbb{R}^+$. Mathematically, the effect of (10) is to restrict the dynamics to the submanifold of the macrospace on which (10) holds. Physically, (10) resembles the minimum uncertainty relation when one chooses $x$ as the position of a particle and $y$ as its conjugate variable. Moreover, this interpretation makes it possible to explain the phenomenon of “suppression of classical chaos by quantization” from an information geometric point of view [17]. For the sake of simplicity one can also fix the mean value of $y$ as zero. With the help of (10) one can then rewrite (9), thus obtaining the non–trivial correlated statistical model of lowest dimensionality
$$p(x,y|\mu_x,\sigma) = \frac{1}{2\pi\Sigma^2\sqrt{1-r^2}}\exp\left\{-\frac{1}{2(1-r^2)}\left[\frac{(x-\mu_x)^2}{\sigma^2} + \frac{\sigma^2y^2}{\Sigma^4} - \frac{2r(x-\mu_x)\,y}{\Sigma^2}\right]\right\} \tag{11}$$
where we renamed $\sigma_x$ as $\sigma$. This is the so-called correlated model [17], since it presents correlations between $x$ and $y$ by means of $r$, and its macrospace $\{(\mu_x,\sigma)\}$ is bidimensional. It should be noted that here $r$ is considered as an external parameter that does not belong to the macrospace. For this model, from Eqs. (3)–(8) one can obtain the Fisher tensor along with the following geometrical quantities.
[TABLE]
[TABLE]
[TABLE]
[TABLE]
From (17) one can see that the curvature attains its minimum value when $r=0$ (absence of correlations), while in the limit $r\to\pm1$ (maximally correlated case) it attains its maximum value. In terms of the criterion of global chaos this can be interpreted as follows: the dynamics of the uncorrelated case is more chaotic than that of the maximally correlated case. Moreover, the divergence of the metric observed for $r\to\pm1$ marks the maximally correlated case as a critical point of the dynamics.
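The dependence of the curvature on the correlation coefficient can be checked numerically. The sketch below is a minimal illustration under a stated assumption: it takes the Fisher metric of a bivariate Gaussian with means $(\mu_x, 0)$, standard deviations $(\sigma, \Sigma^2/\sigma)$ and fixed correlation coefficient $r$ to be diagonal, with $g_{\mu\mu} = 1/[\sigma^2(1-r^2)]$ and $g_{\sigma\sigma} = 4/[\sigma^2(1-r^2)]$. This form follows from the standard Gaussian Fisher-metric formula and is not quoted from the text. The curvature is then obtained with the classical formula for the Gaussian curvature of an orthogonal two-dimensional metric, using central finite differences; under this assumption the result is $R = -(1-r^2)/2$.

```python
def metric(mu, sigma, r):
    """Assumed diagonal Fisher metric of the correlated model: (g_mumu, g_sigmasigma)."""
    scale = 1.0 / (sigma ** 2 * (1.0 - r ** 2))
    return (scale, 4.0 * scale)

def scalar_curvature(mu, sigma, r, h=1e-3):
    """Scalar curvature R = 2K of the 2D diagonal metric E dmu^2 + G dsigma^2.

    Uses K = -[d_mu(G_mu / w) + d_sigma(E_sigma / w)] / (2 w), with w = sqrt(E G),
    and central finite differences of step h.
    """
    def w(m, s):
        e, g = metric(m, s, r)
        return (e * g) ** 0.5

    def a(m, s):  # (dG/dmu) / sqrt(EG)
        d_g = (metric(m + h, s, r)[1] - metric(m - h, s, r)[1]) / (2.0 * h)
        return d_g / w(m, s)

    def b(m, s):  # (dE/dsigma) / sqrt(EG)
        d_e = (metric(m, s + h, r)[0] - metric(m, s - h, r)[0]) / (2.0 * h)
        return d_e / w(m, s)

    d_a = (a(mu + h, sigma) - a(mu - h, sigma)) / (2.0 * h)
    d_b = (b(mu, sigma + h) - b(mu, sigma - h)) / (2.0 * h)
    gaussian_curvature = -(d_a + d_b) / (2.0 * w(mu, sigma))
    return 2.0 * gaussian_curvature

for r in (0.0, 0.6, 0.9):
    print(r, scalar_curvature(0.0, 1.0, r), -(1.0 - r ** 2) / 2.0)
```

Under the assumed metric the curvature is most negative at $r=0$ and approaches zero as $|r|\to1$, in agreement with the qualitative statement about the minimum and the maximum.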
3 The ergodic hierarchy
In classical chaos, the exponential instability implies continuous spectrum, and therefore, a decay of correlations in such a way that for large times the measure of the intersection between two sets of phase space (separated from each other in time) tends to the product of their measures. This is the well known mixing property, and constitutes one of the foundations of the statistical mechanics. The main feature of mixing is that it establishes the statistical independence of different parts of a trajectory, when sufficiently separated in time. This is the main reason for the application of probability theory in the classical domain, which allows one to calculate statistical features such as diffusion, relaxation and distribution functions [23]. Consequently, the description in terms of trajectories can be replaced by an equivalent one in terms of distribution functions, which, if not singular, represent not a single trajectory but a continuum of them.
In ergodic theory, any classical system is represented mathematically by a dynamical system $(X,\Sigma,\mu,\{T_t\})$, where $X$ is a set, $\Sigma$ is a sigma–algebra of subsets of $X$, $\mu$ is a measure defined over $\Sigma$, and $\{T_t\}$ is a group of measure–preserving transformations. The ergodic hierarchy ranks the chaos of a dynamical system according to a type of correlation between two subsets $A$ and $B$ of $X$ that are separated by a time $t$. This is defined as [19, 20]
$$C(T_tA,B) = \mu(T_tA\cap B) - \mu(A)\,\mu(B) \tag{18}$$
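The ergodic, mixing and Bernoulli levels of the EH are given in terms of (18) in the following way. Given two arbitrary sets $A,B\subseteq X$, it is said that the dynamical system is
1. ergodic if
$$\lim_{T\to\infty}\frac{1}{T}\int_0^T C(T_tA,B)\,dt = 0 \tag{19}$$
2. mixing if
$$\lim_{t\to\infty} C(T_tA,B) = 0 \tag{20}$$
3. Kolmogorov if for all integer $r$, for all $A_0,A_1,\ldots,A_r\subseteq X$, and for all $\varepsilon>0$ there exists a positive integer $n_0$ such that if $n\geq n_0$ one has
$$|C(B,A_0)| < \varepsilon \qquad \forall\, B\in\sigma_{n,r}(A_1,\ldots,A_r) \tag{21}$$
where $\sigma_{n,r}(A_1,\ldots,A_r)$ is the minimal $\sigma$–algebra generated by $\{T_kA_i : k\geq n;\ i=1,\ldots,r\}$.
4. Bernoulli if
$$C(T_tA,B) = 0 \qquad \forall\, t \tag{22}$$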
In ergodic systems the correlation vanishes “in time average” for large times, while in mixing systems it vanishes for $t\to\infty$. In Kolmogorov systems the correlations between an arbitrary set and another one belonging to the $\sigma$–algebra cancel for $n\to\infty$. In Bernoulli systems the correlation is zero for all times. These levels classify the dynamics according to Eqs. (19), (20), (21), and (22), from the weakest level (the ergodic) to the strongest (the Bernoulli). The following strict inclusions hold:
$$\text{ergodic} \supset \text{mixing} \supset \text{Kolmogorov} \supset \text{Bernoulli} \tag{23}$$
In order to express the correlation by means of probability distributions it is more convenient to use the definition (18) in terms of distribution functions, which is given by [20]
$$C(f\circ T_t, g) = \int_X (f\circ T_t)(x)\,g(x)\,d\mu - \int_X f(x)\,d\mu \int_X g(x)\,d\mu \tag{24}$$
where $f\circ T_t$ denotes the composition of $f$ and $T_t$, i.e. $(f\circ T_t)(x) = f(T_tx)$ for all $x\in X$, and now the role of $A,B$ is played by the functions $f,g$. Physically, $f$ represents any initial density function of the classical system, whose value at time $t$ is given by $f\circ T_t$, with $T_t$ the classical Liouville evolution (in Hamiltonian systems).
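The difference between the ergodic and mixing levels can be made concrete with a standard textbook system that is ergodic but not mixing: the rotation of the unit circle by an irrational angle. The sketch below is only an illustration. The rotation, the angle `ALPHA`, the choice $A = B = [0, 1/2)$ and the discrete-time average are assumptions introduced here, not part of the text. It evaluates the set correlation $\mu(T^nA\cap B)-\mu(A)\mu(B)$ exactly and compares its time average with its largest value.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation angle (illustrative choice)

def correlation(n, alpha=ALPHA):
    """mu(T^n A ∩ B) - mu(A) mu(B) for A = B = [0, 1/2) under x -> x + alpha mod 1."""
    shift = (n * alpha) % 1.0
    overlap = abs(0.5 - shift)  # Lebesgue measure of the rotated arc intersected with B
    return overlap - 0.25

def time_average(steps):
    """Discrete analogue of the time average of the correlation."""
    return sum(correlation(n) for n in range(steps)) / steps

values = [correlation(n) for n in range(2000)]
average = time_average(2000)
largest = max(abs(c) for c in values)
print(average, largest)
```

The time average is close to zero, as required for ergodicity, while the correlation itself keeps returning to values near its extremes, so it has no limit and the rotation is not mixing.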
4 An information geometric version of the ergodic hierarchy
Following the idea of characterizing chaos by means of the ergodic hierarchy [26, 27, 28], we now consider an extension of the EH within the context of information geometry. In principle, in information geometry one has probability distributions that depend on a set of parameters $\theta$, and the dynamics of the macrovariables takes place along the geodesics of the statistical manifold. Moreover, in the statistical manifold the role of the time variable of dynamical systems is played by a parameter $\tau$ along the geodesics.
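In order to introduce the tools of information geometry, we propose the following approach by defining a correlation between functions as the macrovariables evolve along the geodesics. Given $N$ functions $f_1(x_1),\ldots,f_N(x_N)$, each one of them in terms of the variable $x_i$ for all $i=1,\ldots,N$, we define the information geometric correlation (IG correlation) between $f_1,\ldots,f_N$ at the time–like parameter $\tau$ as
$$C(f_1,\ldots,f_N,\tau) = \int p(x_1,\ldots,x_N|\theta(\tau))\prod_{i=1}^{N} f_i(x_i)\,dx_1\cdots dx_N - \prod_{i=1}^{N}\int p_i(x_i|\theta(\tau))\,f_i(x_i)\,dx_i$$
where $\theta(\tau)$ is the M–dimensional vector of the macrovariables at “time” $\tau$ and
$$p_i(x_i|\theta(\tau)) = \int p(x_1,\ldots,x_N|\theta(\tau))\prod_{j\neq i}dx_j \tag{25}$$
are the marginal distributions of $p(x_1,\ldots,x_N|\theta(\tau))$. From this definition one can see that the IG correlation measures how independent the variables $x_1,\ldots,x_N$ are at the time–like parameter $\tau$. This can be considered as a sort of information geometric generalization of the EH correlation.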
Having established the IG correlation and taking into account the ergodic, mixing and Bernoulli levels given by Eqs. (19), (20) and (22), we define the information geometric ergodic hierarchy (IGEH) as follows. For the sake of simplicity, and since we focus mainly on the ergodicity and mixing properties (which are the fundamentals of statistical mechanics), in this contribution we do not extend the Kolmogorov level, which involves a $\sigma$–algebra. Given a set of arbitrary functions, we say that the statistical model is
1. IG ergodic if
[TABLE]
2. IG mixing if
[TABLE]
3. IG Bernoulli if
[TABLE]
As in the ergodic hierarchy, the following strict inclusions hold:
$$\text{IG ergodic} \supset \text{IG mixing} \supset \text{IG Bernoulli}$$
For instance, a statistical model that is IG ergodic can be given by assuming that is proportional to with . Making this replacement in (26) one obtains that is equal to zero. Since oscillates, then this model is not IG mixing nor IG Bernoulli. Examples of statistical models that are IG mixing and IG Bernoulli will be illustrated in Section 7. Our approach, thus, deals with ergodic hierarchy in statistical models from an information geometry viewpoint.
5 A measure of distinguishability for the correlated model
In order to use the levels of the IGEH for characterizing the dynamics of statistical models one should have a way of determining the decay of the correlation in Eqs. (26), (27) or (28). For the family of correlated probabilities of (11), we define a distinguishability measure given by
[TABLE]
where $p_x$ and $p_y$ are the marginal distributions of $p(x,y|\mu_x,\sigma)$. Furthermore, if $f(x)$ and $g(y)$ are arbitrary functions of $x$ and $y$, then we have
[TABLE]
Eq. (30) expresses that the distinguishability measure is an upper bound for the IG correlation. Therefore, it is convenient to find an analytic expression for (29). After some algebra one obtains (the demonstration can be found in the Appendix)
[TABLE]
The behavior of the measure, which is independent of $\mu_x$ and $\sigma$, is shown in Fig. 1. Two relevant regions, corresponding to the limiting cases $r\to0$ and $r\to\pm1$, can be well distinguished. The region $r\approx0$ corresponds to the zone where the statistical model is characterized by the IG mixing and IG Bernoulli levels, with the particularity that the variables of the microspace are uncorrelated. Moreover, one can see that near $r=0$ the decay is linear in $r$. The curve also shows that, if $r\to0$ when $\tau\to\infty$, then the statistical model is IG mixing.
In the region $r\to\pm1$ the measure diverges, corresponding to the maximally correlated case, which physically means that the system presents strong correlations between the variables of the microspace. Since the correlations are strong in this regime, the statistical model can be neither IG mixing nor IG Bernoulli.
Finally, it should be noted that the measure does not allow one to distinguish between two probability distributions having $r$ and $-r$ respectively. The symmetry with respect to the axis $r=0$ is due to the mathematical form of the infinity norm in the definition (29). That is, with other choices of the measure one could distinguish states (probability distributions) with correlation coefficients $r$ and $-r$.
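The qualitative behavior described above can be reproduced numerically. The following sketch is an illustration under explicit assumptions: it interprets the distinguishability measure as the supremum over $(x,y)$ of $|p(x,y) - p_x(x)p_y(y)|$, which is suggested by the reference to the infinity norm but is not copied from definition (29), and it uses standardized variables (zero means, unit variances, so $\Sigma = 1$). The supremum is estimated on a finite grid, so the numbers are approximations.

```python
import math

def joint(x, y, r):
    """Standardized bivariate Gaussian density with correlation coefficient r."""
    q = (x * x + y * y - 2.0 * r * x * y) / (1.0 - r * r)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))

def product_of_marginals(x, y):
    """Product of the two standard normal marginals."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

def distinguishability(r, half_width=5.0, points=201):
    """Grid estimate of sup |p(x, y) - p_x(x) p_y(y)| (assumed form of the measure)."""
    step = 2.0 * half_width / (points - 1)
    grid = [-half_width + k * step for k in range(points)]
    return max(abs(joint(x, y, r) - product_of_marginals(x, y))
               for x in grid for y in grid)

for r in (0.0, 0.01, 0.3, -0.3, 0.9):
    print(r, distinguishability(r))
```

In this sketch the measure vanishes at $r=0$, takes the same value for $r$ and $-r$, grows linearly for small $|r|$ with slope close to $1/(2\pi e)$, and increases sharply as $|r|$ approaches 1.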
6 Geometrizing the canonical ensemble by means of statistical models and the IGEH
On the basis of the above characterization of the dynamics of the macrospace in terms of the IGEH, our next aim is to give a method for studying the dynamics of a system belonging to the canonical ensemble.
6.1 Canonical ensembles in the context of the information geometry
Although the relationship between statistical models and statistical physics has already been established [5, 31], extensions in several directions have recently been introduced [32, 33, 34, 35, 36, 37, 38, 39], with a particular focus on the exponential families, since they mathematically represent the Liouville densities of the statistical ensembles. Relevant consequences of this research, such as the connection between Hessian structures and exponential families [38], nonextensive statistical models [34, 35, 36, 39] and other extensions [32, 33, 37], have been reported.
In order to apply the IGEH to the canonical ensemble, here we obtain some explicit formulas that relate the statistical parameters of multivariate Gaussian distributions (which are a special case of exponential family) with the physical parameters of quadratic Hamiltonians. In the present contribution we only focus on the multivariate Gaussian distributions and we will omit the definition of exponential families of a more general character. We begin by considering the family of probability density functions given by the classical canonical ensemble
$$p(\mathbf{q},\mathbf{p}|\theta) = \frac{e^{-\beta H(\mathbf{q},\mathbf{p};\theta)}}{Z(\theta)} \tag{32}$$
where $Z(\theta)$ is the well known partition function, $\beta = 1/(k_BT)$ is the Boltzmann factor, $H$ is the energy of the system expressed in terms of the phase space coordinates $(\mathbf{q},\mathbf{p})\in\Gamma$ (with $\Gamma$ the phase space), and $\theta$ are the macrovariables of the system. The function $p(\mathbf{q},\mathbf{p}|\theta)$ represents the probability density of the microstate $(\mathbf{q},\mathbf{p})$ of the system, corresponding to the macrostate given by the macroscopic parameters $\theta$, when the system is in thermal equilibrium with a heat bath at a fixed temperature $T$. In this way, one can see that there is a one-to-one correspondence between statistical models and statistical ensembles: each pair of microvariables $(\mathbf{q},\mathbf{p})$ corresponds to a microstate of the ensemble, and each value of the macrovariable $\theta$ is associated with a macrostate of the system.
Assuming a –dimensional phase space and a –dimensional macrospace , from (32) and using (3) one obtains the Fisher tensor for the canonical ensemble
[TABLE]
where $\partial_i$ and the angle brackets are shorthand for the partial derivative with respect to the $i$–th macrovariable and the average over the canonical distribution, respectively, and the suffix stands for the canonical ensemble. This formula is the expression of the canonical ensemble in the context of information geometry. All the dynamics over the macrospace can be derived by means of Eqs. (4)–(8), thus obtaining a geometrical characterization of the canonical ensemble.
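A consequence of inserting the canonical density into the Fisher–Rao tensor can be checked on a toy example. Since $\log p = -\beta H - \log Z$ and $\partial_i \log Z = -\beta\langle\partial_i H\rangle$, the score is $-\beta(\partial_i H - \langle\partial_i H\rangle)$, so the Fisher tensor equals $\beta^2$ times the covariance of $\partial_i H$ and $\partial_j H$. The sketch below verifies this for a hypothetical one-dimensional energy $H = k(x-\theta)^2/2$ with a single macrovariable $\theta$; the energy, the parameter names and the numerical scheme are assumptions made for illustration only. Both quantities should equal $\beta k$.

```python
import math

def log_pdf(x, theta, beta, k):
    """log p(x | theta) for p = exp(-beta * H) / Z with the toy energy H = k (x - theta)^2 / 2."""
    log_z = 0.5 * math.log(2.0 * math.pi / (beta * k))
    return -0.5 * beta * k * (x - theta) ** 2 - log_z

def fisher_two_ways(theta, beta, k, steps=20000, half_width=12.0, eps=1e-5):
    """Return (Fisher information from its definition, beta^2 * Var(dH/dtheta))."""
    sd = 1.0 / math.sqrt(beta * k)
    start = theta - half_width * sd
    h = 2.0 * half_width * sd / steps
    fisher = mean_dh = mean_dh2 = 0.0
    for n in range(steps):
        x = start + (n + 0.5) * h
        weight = math.exp(log_pdf(x, theta, beta, k)) * h
        # score: numerical derivative of log p with respect to theta
        score = (log_pdf(x, theta + eps, beta, k)
                 - log_pdf(x, theta - eps, beta, k)) / (2.0 * eps)
        dh = -k * (x - theta)  # analytic dH/dtheta
        fisher += weight * score ** 2
        mean_dh += weight * dh
        mean_dh2 += weight * dh ** 2
    return fisher, beta ** 2 * (mean_dh2 - mean_dh ** 2)

from_definition, from_energy = fisher_two_ways(theta=0.7, beta=2.0, k=3.0)
print(from_definition, from_energy)  # both should be close to beta * k
```

The check illustrates how the temperature enters the metric through $\beta$: in this toy case, lowering the temperature increases the Fisher information about the macrovariable.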
6.2 Dynamics of the canonical ensemble in terms of the IGEH levels
The particular dependence of the energy on the microvariables determines the correlations between them, which are reflected in the probability distribution. Since the IG correlation measures the degree of statistical independence between the microvariables of the microspace at the time–like parameter, for a given expression of the energy it is desirable to know the form of the IG correlation. We focus on a particular form of the energy, i.e. when it is a quadratic function of the microvariables. The physical relevance of this assumption lies, among other things, in the fact that it allows one to study the dynamics of the system near an equilibrium point. In this case, the energy is given by
[TABLE]
with G the Hessian of the energy evaluated at the expansion point of the phase space, and with the left factor being the transpose of the right one. Then, the probability distribution adopts the form
[TABLE]
with a normalization factor in front. From (35) one can see that this distribution is nothing but a multivariate Gaussian distribution on the microvariables. At this point it is convenient to recall the expression of the multivariate Gaussian distribution
[TABLE]
in which appear the vector of microvariables, the mean value vector, the covariance matrix and its inverse, and the determinant of the covariance matrix. Moreover, the marginals of this distribution are given by
[TABLE]
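where the variance of each marginal is the $i$–th diagonal matrix element of the covariance matrix.
Let us consider arbitrary functions of the microvariables. Then, replacing (36) and (37) in the definition of the IG correlation one has that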
[TABLE]
is the IG correlation for the family of multivariate Gaussian distributions. As in the correlated model, one can obtain an upper bound for it as follows.
[TABLE]
This inequality expresses an upper bound of the IG correlation for the family of multivariate Gaussian distributions, where the maximum is a measure of distinguishability, and the mean vector and the covariance matrix depend on the time–like parameter along the geodesics through the application of Eqs. (3)–(5) to this family.
Now we can express this upper bound in the language of the canonical ensemble. By simple inspection of Eqs. (35)–(37), if one makes the following replacements
[TABLE]
in the upper bound above, then one obtains
[TABLE]
This inequality is an upper bound of the IG correlation for a system in the canonical ensemble whose energy is a quadratic form, expanded around a point of the phase space. Since the macrovariables and G are functions of the time–like parameter, the utility of this bound is that, according to the way the IG correlation vanishes for large values of the time–like parameter, one can classify the statistical model as belonging to one of the levels of the IGEH. In this way, one can study phase transitions in terms of the IGEH levels as the macrovariables vary. For instance, if the dynamics in macrospace is such that the maximum in the bound tends to zero when the time–like parameter tends to infinity, then the canonical ensemble behaves as an IG mixing statistical model.
It should be noted that the multivariate Gaussian distribution only exists if the covariance matrix is positive definite, which implies that G must also be positive definite and, in particular, that its determinant is positive. The stable equilibrium points of the system satisfy this condition.
We are now in a position to reach one of the main contributions of this work. From the replacements above and the definition of the tensor G, one finally obtains
[TABLE]
This equation expresses an intimate relationship between the canonical ensemble and the statistical models, which one can express in words as follows:
“given a quadratic form of the energy, the determinant of the covariance matrix, which measures the correlations between the microvariables, is proportional to a power (equal to the dimension of the phase space) of the temperature of the heat bath”.
Moreover, since the determinant of covariance matrix is a decreasing function of the correlations, as the temperature of the heat bath increases the correlations tend to be suppressed, as expected statistically. This result will be useful for characterizing phase transitions, as we shall see below.
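The power law can be illustrated with a small numerical check. For a density proportional to $\exp[-(x-x_0)^T G\,(x-x_0)/(2k_BT)]$, the covariance matrix is $k_BT\,G^{-1}$, so its determinant equals $(k_BT)^n/\det G$ with $n$ the number of microvariables. The sketch below verifies this for $n=2$; the Hessian entries, the temperatures and the helper names are hypothetical values chosen for illustration and do not come from the text.

```python
def det_2x2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = det_2x2(m)
    return [[d / det, -b / det], [-c / det, a / det]]

def covariance_from_hessian(hessian, kT):
    """Covariance kT * G^{-1} of the Gaussian exp(-(x - x0)^T G (x - x0) / (2 kT))."""
    return [[kT * v for v in row] for row in inverse_2x2(hessian)]

hessian = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical positive definite Hessian
for kT in (0.5, 1.0, 2.0):
    cov = covariance_from_hessian(hessian, kT)
    print(kT, det_2x2(cov), kT ** 2 / det_2x2(hessian))
```

Doubling $k_BT$ multiplies the determinant by $2^n = 4$ in this two-variable example, which is the power-law dependence on the bath temperature stated above.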
7 Models and results
In order to illustrate the relevance of the IGEH we consider two examples belonging to different topics: an interacting bipartite system presenting phase transitions in the canonical ensemble, and the Gaussian orthogonal ensemble. Next, we give a panoramic outlook of the IGEH and of the relationship between statistical models and Liouville densities of the canonical ensemble.
7.1 Phase transitions in a pair of interacting harmonic oscillators
Let us consider a pair of one-dimensional interacting harmonic oscillators in the canonical ensemble whose total energy is given by
[TABLE]
where the parameters are the position, the momentum, the equilibrium position, the mass, and the frequency of the $i$–th particle with $i=1,2$. The total energy is the sum of the kinetic energy and the potential energy. The latter is composed of three terms: the first two are the potential energies of each oscillator separately, while the third represents the interaction between the oscillators, with a coefficient that measures the coupling strength. For instance, when this coefficient vanishes the oscillators are uncoupled, and therefore their motions are independent of each other.
Now, in order to use the analysis made about the correlated model one must reduce the number of microvariables and macrovariables, and also impose some type of constraints. In this sense, we fix the masses and set . Also, we consider the following constraint
[TABLE]
which involves a real constant, the Boltzmann constant $k_B$, and a reference temperature $T_0$ (for instance, the room temperature, 20 °C or 293.15 K). Here $T_0$ plays the role of a temperature that breaks up the correlations between the microvariables.
Since the correlations occur only between the position coordinates, one can reasonably neglect the momentum coordinates by integrating the probability distribution given by the canonical ensemble over the momenta. And since only the kinetic energy depends on the momenta, this is equivalent to considering a sort of marginal probability distribution with respect to the potential energy, i.e.
[TABLE]
With the help of (43) one can express the potential energy as
[TABLE]
Then, from (44) and (45) one has
[TABLE]
which is nothing but the probability distribution of the correlated model (11) by means of the identifications
[TABLE]
From the last of these identifications one obtains
[TABLE]
Let us show that this equation is a particular case of the formula (42): since the determinant of the covariance matrix is and the determinant of the Hessian of the potential energy evaluated at is
then one can replace both expressions in (42) with , thus obtaining
[TABLE]
With the help of (43) one can recast this equation as
[TABLE]
from which one obtains .
Considering the temperature $T$ as an external parameter one can study the phase transitions of the system as $T$ varies. Also, we set the reference temperature $T_0$ as the room temperature. Given that the time–like parameter $\tau$ is arbitrary, we choose it as
[TABLE]
This choice is convenient since one has
[TABLE]
In this way, the asymptotic limit $\tau\to\infty$ is identified with the limit $T\to T_0$, and therefore the transition towards high temperatures can be studied by means of the limit $\tau\to\infty$. This transition expresses the behavior of the correlations between the oscillators when the bath temperature passes from a finite value (which is lower than $T_0$) to the room temperature. Physically, at the room temperature it is expected that if the energy delivered by the bath to each oscillator is larger than their energies, then as a result of the thermal agitation the correlations between the oscillators tend to be canceled. Indeed, from (48) one can see that $r\to0$ when $T\to T_0$.
Now let us see that this phase transition is characterized in terms of the mixing level of the IGEH. When is vanishingly small, the following approximations hold:
[TABLE]
Using these approximations, and neglecting terms of order , in the formula (31) of one obtains that holds for . Replacing this inequality in (30) one has that
[TABLE]
In turn, since when this equation implies
[TABLE]
According to definition (27), the system is then IG mixing. This is the regime of null correlations corresponding to the region of the curve of around at , as can be seen in Fig. 1. When the probability distribution (7.1) can be factorized as the product of its marginals, and therefore, from Eq. (28), the system is IG Bernoulli.
Moreover, since the scalar curvature for the correlated model is then one obtains
[TABLE]
Formula (51) expresses the connection between the thermodynamics of the canonical ensemble and the information geometry of the system of coupled oscillators. It can be seen that the statistical model behaves as an “intermediary” between the thermodynamic parameters and the geometrical quantities of the statistical manifold. As a consequence, the determinant of the covariance matrix of the statistical model is determined by the Boltzmann factor, thus linking the temperature with all the geometrical quantities like the metric tensor, the scalar curvature, etc.
From Eq. (51) it follows that the scalar curvature decreases as the bath temperature increases, until it reaches a minimum value at the room temperature, where the correlation coefficient is zero. On the other hand, when the temperature tends to zero the scalar curvature takes a maximum value which corresponds to the maximally correlated case . This reflects the intuitive picture that, in the absence of thermal agitation, the correlations remain present.
Finally, it should be noted that this analysis is consistent with Cafaro’s criterion of global chaos: as the temperature grows the dynamics becomes more chaotic and the scalar curvature becomes more negative. From the point of view of the IGEH this is characterized in terms of the mixing level, in which the breaking up of the coupling between the oscillators at the room temperature is expressed by means of the cancellation of the IG correlation in the asymptotic limit.
7.2 Gaussian Orthogonal Ensembles (GOE)
In Gaussian Orthogonal Ensembles theory one deals with the probability distribution for the Hamiltonian matrix elements assuming that the are uncorrelated [21, 22]. Then in the framework of information geometry one could try to describe them by defining a microspace and a macrospace in a suitable way.
In order to characterize the GOE within a statistical model we study a correlated ensemble of matrices. We take the microspace as the Hamiltonian matrix elements and define the macrospace as follows. For the sake of simplicity, we choose the macrospace in such a way that only and are correlated, and that the mean values of all variables are zero, except for the mean value corresponding to , which is equal to . Also, we consider that the variances of , and are the same, denoted by . Moreover, in order to study how independent the diagonal Hamiltonian elements are, we restrict the dynamics by considering that is the correlation coefficient between and , and that the product of the covariances between and is a constant . Taking this into account, the resulting macrospace is . (Note that, since the GOE corresponds to the orthogonal class of Hamiltonians, one has that . However, in the formalism of random matrices and for the orthogonal case, the volume element (as if and were independent variables) is the real Lebesgue measure of and must be taken into account in order to normalize the probability distribution [21].) The correlated probability distribution is given by
[TABLE]
where the correlation coefficient is considered as an external parameter and, since the correlation between and is expressed in terms of , then can be taken as a fixed constant. In turn, given that and are the only microvariables correlated with each other, and since the transitions of the dynamics depend fundamentally on the correlations, one can reasonably neglect and . As was done for the pair of oscillators, one can integrate the correlated probability distribution over and , thus obtaining
[TABLE]
which is nothing but the correlated model, i.e. (11) and (52) are identical upon renaming and as and , respectively. Then, the nonvanishing components of the Ricci tensor and the Ricci scalar curvature are given by Eqs. (16) and (17)
[TABLE]
Three remarks follow. First, the statistical manifold has a curvature which is negative for all values of the correlation coefficient . Based on Cafaro’s criterion above, this simply means that the dynamics in macrospace is chaotic for all .
Second, the GOE case corresponds to and , thus having the minimum value of the scalar curvature
[TABLE]
In this case the correlated probability distribution is the product of its marginals and thus the model is IG Bernoulli. Therefore, one can see that the GOE corresponds to the strongest level of the IGEH, and this can be considered as a characterization of the Gaussian ensembles from an information geometric point of view.
Third, for the strongly correlated case that corresponds to one has
[TABLE]
which can be interpreted, by Cafaro’s criterion of global chaos, as the case in which the dynamics is the least chaotic of all.
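The correlated ensemble of this subsection can also be explored by direct sampling. The Python sketch below is a hedged illustration rather than the construction used in the paper: it assumes that the three microvariables are the independent elements of a real symmetric 2x2 matrix, here called H11, H12 and H22 (hypothetical labels), that all means are zero, that the three variances are equal, and that only H11 and H22 are correlated with coefficient `r`. The function names and the sample size are also assumptions.

```python
import numpy as np

def sample_correlated_ensemble(r, sigma=1.0, n_samples=200_000, seed=0):
    """Draw (H11, H12, H22) from a zero-mean Gaussian with equal variances,
    where only H11 and H22 are correlated (coefficient r, |r| < 1)."""
    rng = np.random.default_rng(seed)
    cov = sigma**2 * np.array([[1.0, 0.0, r],
                               [0.0, 1.0, 0.0],
                               [r,   0.0, 1.0]])
    return rng.multivariate_normal(np.zeros(3), cov, size=n_samples)

def diagonal_correlation(samples):
    """Empirical correlation coefficient between H11 and H22."""
    return float(np.corrcoef(samples[:, 0], samples[:, 2])[0, 1])

# Uncorrelated case (the one identified with the GOE in the text) ...
r_uncorrelated = diagonal_correlation(sample_correlated_ensemble(0.0))
# ... and a correlated ensemble with r = 0.7.
r_correlated = diagonal_correlation(sample_correlated_ensemble(0.7))

print(r_uncorrelated, r_correlated)
```

The empirical correlation of the diagonal elements recovers the chosen `r` up to sampling noise, which makes it easy to interpolate between the uncorrelated ensemble and strongly correlated ones.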
7.3 A panoramic outlook of the IGEH and of the canonical ensemble in curved statistical models
Here we summarize the aspects of our proposal that can provide innovative tools, within the context of information geometry, for characterizing the dynamics of a system in terms of the macrospace of the chosen statistical model. For this, below we provide two schematic diagrams, Figs. 2 and 3, showing the content of the two main contributions of this work and their physical relevance from a panoramic outlook.
8 Conclusions
We have proposed an extension of the ergodic, mixing and Bernoulli levels in the context of information geometry, which we call the information geometric ergodic hierarchy (IGEH), and we applied it to characterize: i) the phase transitions of a pair of interacting harmonic oscillators in the canonical ensemble and ii) the Gaussian Orthogonal Ensembles. The relevance and novelty of our main contributions, i.e. the IGEH and the information geometric characterization of the Liouville densities of the canonical ensemble expressed by the Eqs. (4)–(28) and (6.1)–(42) respectively, lie in the following remarks:
- 1. Statistical models provide a unified scenario for approaches involving correlations between microscopic variables. This was illustrated with a correlated GOE by adding correlations between two variables and showing that this modification attenuates the chaotic dynamics on macrospace by increasing the scalar curvature, according to Cafaro’s criterion of global chaos.
- 2. The IGEH generalizes the chaos characterization of the ergodic hierarchy by quantifying the statistical independence between the microvariables (instead of subsets of phase space) of the statistical model. This is performed in the asymptotic limit of large values of the time–like parameter which is expressed in terms of upper bounds of the IG correlation as the measure for the case of the correlated model.
- 3. The association between multivariate Gaussian distributions and quadratic Hamiltonians can be useful for studying the type of stability that the dynamics presents at its equilibrium points in the context of information geometry.
- 4. Geometrical notions and Cafaro’s criterion of global chaos can be related with the levels of the IGEH. The GOE case, belonging to the most chaotic level, the IG Bernoulli, has an associated minimum negative value of the scalar curvature .
- 5. By obtaining upper bounds on the IG correlation for a specific family of probability distributions, as exemplified by the curve of Fig. 1, one could study geometrical phase transitions moving along curves as an external parameter is varied.
Acknowledgments
This work was partially supported by CONICET and Universidad Nacional de La Plata, Argentina.
References
- [1] F. Weinhold, J. Chem. Phys. 63, 2479, 2488 (1975); 65, 559 (1976).
- [2] G. Ruppeiner, Phys. Rev. A 20, 1608 (1979); Rev. Mod. Phys. 67, 605 (1995); Erratum: 68, 313 (1996).
- [3] C. R. Rao, Differential Geometry in Statistical Inference. In chap. Differential metrics in probability spaces; Institute of Mathematical Statistics, Hayward, CA, 1987.
- [4] C. R. Rao, Bull. Calcutta Math. Soc. 37, 81 (1945).
- [5] S. Amari, H. Nagaoka. Methods of Information Geometry; Oxford University Press: Oxford, UK, 2000.
- [6] R. S. Ingarden, Tensor N.S. 30, 201 (1976).
- [7] H. Janyszek, Rep. Math. Phys. 24, 1, 11 (1986).
- [8] S. Abe, Phys. Rev. E 68, 031101 (2003).
- [9] J. Naudts, Open Sys. and Information Dyn. 12, 13 (2005).
- [10] M. Portesi, A. Plastino, F. Pennini, Phys. A 365, 173–176 (2006); Phys. A 373, 273–282 (2007).
- [11] C. Tsallis, J. Stat. Phys. 52, 479 (1988).
- [12] D. Bures, Trans. Am. Math. Soc. 135, 199 (1969).
- [13] G. Verdoolaege, AIP Conf. Proc. 1641, 564–571 (2014); Rev. Sci. Instrum. 85, 11E810 (2014).
- [14] C. Cafaro, Chaos Solitons & Fractals 41, 886–891 (2009).
- [15] C. Cafaro, S. Mancini, Phys. D 240, 607–618 (2011).
- [16] C. Cafaro, A. Giffin, C. Lupo, S. Mancini, Open Syst. Inf. Dyn. 19, 1250001 (2012).
- [17] A. Giffin, S. A. Ali, C. Cafaro, Entropy 15, 4622-4623 (2013).
- [18] A. J. Lichtenberg, M. A. Lieberman. Regular and Chaotic Dynamics (Applied Mathematical Sciences), Springer, Berlin, (2010).
- [19] J. Berkovitz, R. Frigg, F. Kronz, Stud. Hist. Phil. Mod. Phys. 37, 661–691 (2006).
- [20] A. Lasota, M. Mackey. Probabilistic properties of deterministic systems; Cambridge Univ. Press, Cambridge, 1985.
- [21] H. Stockmann. Quantum Chaos - An Introduction; Cambridge Univ. Press, Cambridge, 1999.
- [22] F. Haake. Quantum Signatures of Chaos, Springer-Verlag, Heidelberg, 2001.
- [23] G. Casati, B. Chirikov. Quantum Chaos: between order and disorder, Cambridge Univ. Press, Cambridge, 1995.
- [24] G. Casati, T. Prosen, Phys. Rev. A 72, 032111 (2005).
- [25] O. Bohigas, M. Giannoni, C. Schmit, Phys. Rev. Lett. 52, 1 (1984).
- [26] I. Gomez, M. Castagnino, Phys. A 393, 112–131 (2014).
- [27] I. Gomez, M. Castagnino, Chaos, Solitons & Fractals 68, 98–113 (2014).
- [28] I. Gomez, M. Castagnino, Chaos, Solitons & Fractals 70, 99–116 (2015).
- [29] I. Gomez, M. Losada, S. Fortin, M. Castagnino, M. Portesi, Int. Journ. Theor. Phys. 7, 2192–2203 (2015).
- [30] I. S. Gomez, M. Portesi, arXiv:1503.02751v3 [quant-ph] (2017).
- [31] O. E. Barndorff–Nielsen. Information and Exponential Families in Statistical Theory, J. Wiley and Sons, New York, 1978.
- [32] J. Naudts, J. Ineq. Pure Appl. Math. 5, 102 (2004).
- [33] P. D. Grünwald, A. P. Dawid, Ann. Stat. 32, 1367–1433 (2004).
- [34] A. Ohara, T. Wada, J. Phys. A 43 (2010).
- [35] J. Naudts, Entropy 10, 131–149 (2008).
- [36] S. Amari, A. Ohara, H. Matsuzoe, Phys. A 391, 4308–4319 (2012).
- [37] M. Molitor, J. Geom. Phys. 70, 54–80 (2013).
- [38] H. Matsuzoe, Diff. Geom. App. 35, 323–333 (2014).
- [39] F. Zang, Y. Shi, R. Wang, Phys. A 468, 552–565 (2017).
Appendix A Proof of the distinguishability measure (formula (31))
Proof.
Replacing (11) in the definition of one obtains
[TABLE]
Now, by defining the following dimensionless variables and one can recast as
[TABLE]
Therefore, in order to calculate , it is enough to study the maximum and minimum on of the function for each value of the parameter . For this, it is convenient to write with and . Then, one must find the critical points of from the equations
[TABLE]
From (54) it follows that , thus
[TABLE]
Since is an exponential then is always positive. Then one obtains
[TABLE]
If then it is clear that is equal to the product of its marginals and it follows that the infinity norm is zero. Assuming and replacing in the formula of , one obtains a function that only depends on , which we denote as . Then one has that
[TABLE]
Thus, the value of is given by the maximum value of with . For this, we calculate the critical points of by making its derivative equal to zero, thus obtaining
[TABLE]
So the critical points of satisfy
[TABLE]
Replacing (56) in (55) one has the value of in its critical points
[TABLE]
In order to find explicitly one must solve the second relation of Eq. (56). Taking the natural logarithm of both sides, one has
[TABLE]
That is,
[TABLE]
Then, from (58) and (57) one obtains the following values of in its critical points
[TABLE]
Now, given that the maximum and minimum values of are precisely those of subject to the restriction , and since the maximum value of is equal to , from (59) one deduces that is
[TABLE]
from which follows the desired result. ∎
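The closed form obtained in this proof can be cross-checked by brute force. The Python sketch below is a hedged illustration, not the procedure of the paper: it assumes that the distinguishability measure is the sup norm of the difference between the bivariate Gaussian of the correlated model and the product of its marginals, written in dimensionless variables `u` and `v` (so that an overall factor depending on the variance is dropped), and it approximates the supremum on a finite grid. The function name `sup_distance` and the grid parameters are hypothetical.

```python
import numpy as np

def sup_distance(r, half_width=6.0, n=601):
    """Grid approximation of sup |p(u, v) - p1(u) * p2(v)| for a standardized
    bivariate Gaussian with correlation coefficient r (|r| < 1)."""
    u = np.linspace(-half_width, half_width, n)
    U, V = np.meshgrid(u, u)
    # Joint density with unit variances and correlation r.
    joint = (np.exp(-(U**2 - 2.0 * r * U * V + V**2) / (2.0 * (1.0 - r**2)))
             / (2.0 * np.pi * np.sqrt(1.0 - r**2)))
    # Product of the two standard normal marginals.
    product = np.exp(-(U**2 + V**2) / 2.0) / (2.0 * np.pi)
    return float(np.max(np.abs(joint - product)))

for r in (0.0, 0.2, 0.8):
    print(r, sup_distance(r))
```

The grid value vanishes for r = 0, where the density factorizes, and it is larger for r = 0.8 than for r = 0.2. Comparing it with formula (31) for a few values of the correlation coefficient gives a quick sanity check of the result.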
