Notions of the ergodic hierarchy for curved statistical manifolds

Ignacio S. Gomez

arXiv:1703.03515·math-ph·June 20, 2018

TL;DR

This paper extends the ergodic hierarchy to curved statistical manifolds using information geometry, linking statistical independence with Hamiltonian dynamics and illustrating with examples like harmonic oscillators and Gaussian ensembles.

Contribution

It introduces a geometric framework for ergodic properties on curved manifolds, connecting statistical models with physical systems and providing new tools for analyzing Hamiltonian dynamics.

Findings

01

Scalar curvature acts as a global indicator of dynamics.

02

Correlation between microvariables follows a power law in temperature.

03

The geometric approach applies to systems with phase transitions.

Abstract

We present an extension of the ergodic, mixing, and Bernoulli levels of the ergodic hierarchy for statistical models on curved manifolds, making use of elements of information geometry. This extension focuses on the notion of statistical independence between the microscopic variables of the system. Moreover, we establish an intimate relationship between statistical models and families of probability distributions belonging to the canonical ensemble, which for the case of quadratic Hamiltonian systems provides a closed form for the correlations between the microvariables in terms of the temperature of the heat bath as a power law. From this we obtain an information geometric method for studying Hamiltonian dynamics in the canonical ensemble. We illustrate the results with two examples: a pair of interacting harmonic oscillators presenting phase transitions and the 2x2 Gaussian…

Equations

$$\textit{M}=\left\{p\,:\,p:X\to\mathbb{R},\ p(x)\geq 0,\ \int_{X}p(x)\,dx=1\right\}$$

$$S=\left\{p_{\theta}\in\textit{M}\,:\,p_{\theta}=p(x;\theta),\ \theta=(\theta_{1},\ldots,\theta_{m})\in\Theta\right\}$$

$$g_{ij}(\theta)=\int_{X}dx\,p(x;\theta)\,\frac{\partial\log p(x;\theta)}{\partial\theta_{i}}\frac{\partial\log p(x;\theta)}{\partial\theta_{j}},\qquad i,j=1,\ldots,m$$

$$\textrm{Geodesic equations}:\quad\frac{d^{2}\theta_{k}}{d\tau^{2}}+\Gamma_{ij}^{k}\frac{d\theta_{i}}{d\tau}\frac{d\theta_{j}}{d\tau}=0,\qquad\forall\,k=1,\ldots,m$$

$$\textrm{Christoffel symbols}:\quad\Gamma_{kl}^{i}=\frac{1}{2}g^{im}\left(g_{mk,l}+g_{ml,k}-g_{kl,m}\right)$$

$$\textrm{Riemann curvature tensor}:\quad R_{iklm}=\frac{1}{2}\left(g_{im,kl}+g_{kl,im}-g_{il,km}-g_{km,il}\right)+g_{np}\left(\Gamma_{kl}^{n}\Gamma_{im}^{p}-\Gamma_{km}^{n}\Gamma_{il}^{p}\right)$$

$$\textrm{Ricci tensor}:\quad R_{ik}=g^{lm}R_{limk}$$

$$\textrm{Scalar curvature}:\quad R=g^{ik}R_{ik}$$

$$p(x,y;\mu_{x},\mu_{y},\sigma_{x},\sigma_{y},r)=\frac{1}{2\pi\sigma_{x}\sigma_{y}\sqrt{1-r^{2}}}\exp\left(-\frac{1}{2(1-r^{2})}\left[\frac{(x-\mu_{x})^{2}}{\sigma_{x}^{2}}+\frac{(y-\mu_{y})^{2}}{\sigma_{y}^{2}}-\frac{2r(x-\mu_{x})(y-\mu_{y})}{\sigma_{x}\sigma_{y}}\right]\right)$$

$$\Sigma^{2}=\sigma_{x}\sigma_{y}$$

$$p(x,y;\mu_{x},\sigma,r)=\frac{1}{2\pi\Sigma^{2}\sqrt{1-r^{2}}}\exp\left(-\frac{1}{2(1-r^{2})}\left[\frac{(x-\mu_{x})^{2}}{\sigma^{2}}+\frac{y^{2}\sigma^{2}}{\Sigma^{4}}-\frac{2r(x-\mu_{x})y}{\Sigma^{2}}\right]\right)$$

$$\textrm{Fisher--Rao metric}:\quad g_{ij}(\theta)=\left(\begin{array}{cc}\frac{1}{\sigma^{2}(1-r^{2})}&0\\ 0&\frac{4}{\sigma^{2}(1-r^{2})}\end{array}\right)$$

$$\textrm{Non-vanishing Christoffel symbols}:\quad\Gamma_{12}^{1}=\Gamma_{21}^{1}=-\frac{1}{\sigma},\qquad\Gamma_{11}^{2}=\frac{1}{4\sigma},\qquad\Gamma_{22}^{2}=-\frac{1}{\sigma}$$

$$\textrm{Non-vanishing Ricci tensor components}:\quad R_{11}=-\frac{1}{4\sigma^{2}},\qquad R_{22}=-\frac{1}{\sigma^{2}}$$

$$\textrm{Scalar curvature}:\quad R(r)=-\frac{1}{2}(1-r^{2}),\qquad -1\leq r\leq 1$$

$$C(T_{t}A,B)=\mu(T_{t}A\cap B)-\mu(A)\mu(B)$$

$$\lim_{T\to\infty}\frac{1}{T}\int_{0}^{T}C(T_{t}A,B)\,dt=0$$

$$\lim_{t\to\infty}C(T_{t}A,B)=0$$

$$|C(A_{0},B)|<\varepsilon\quad\textrm{for all}\ B\in\sigma_{n,r}(A_{1},\ldots,A_{r})$$

$$C(T_{t}A,B)=0\quad\textrm{for all}\ t\geq 0$$

$$\textrm{ergodic}\supset\textrm{mixing}\supset\textrm{Kolmogorov}\supset\textrm{Bernoulli}$$

$$C(f\circ T_{t},g)=\int_{X}(f\circ T_{t})(x)\,g(x)\,dx-\int_{X}f(x)\,dx\int_{X}g(x)\,dx\qquad\forall\,f,g\in\mathbb{L}^{1}(X)$$

$$C(f_{1},\ldots,f_{N},\tau)\doteq\int p(x_{1},\ldots,x_{N};\theta(\tau))\,f_{1}(x_{1})\cdots f_{N}(x_{N})\,dx_{1}\cdots dx_{N}-\prod_{i=1}^{N}\int p_{i}(x_{i};\theta(\tau))\,f_{i}(x_{i})\,dx_{i}$$

$$p_{i}(x_{i};\theta(\tau))=\int p(x_{1},\ldots,x_{N};\theta(\tau))\prod_{j\neq i}dx_{j},\qquad i=1,\ldots,N$$

$$\lim_{T\to\infty}\frac{1}{T}\int_{0}^{T}C(f_{1},\ldots,f_{N},\tau)\,d\tau=0$$

$$\lim_{\tau\to\infty}C(f_{1},\ldots,f_{N},\tau)=0$$

$$C(f_{1},\ldots,f_{N},\tau)=0\quad\textrm{for all}\ \tau\in\mathbb{R}$$

$$\textrm{IG ergodic}\supset\textrm{IG mixing}\supset\textrm{IG Bernoulli}$$

$$F(p)\doteq\|p(x,y;\mu_{x},\sigma,r)-p_{1}(x)p_{2}(y)\|_{\infty}=\max_{(x,y)\in\mathbb{R}^{2}}|p(x,y;\mu_{x},\sigma,r)-p_{1}(x)p_{2}(y)|$$

$$|C(f_{1},f_{2},\tau)|=\left|\int_{\mathbb{R}^{2}}p(x,y;\mu_{x},\sigma,r)\,f_{1}(x)f_{2}(y)\,dx\,dy-\int_{\mathbb{R}}p_{1}(x)f_{1}(x)\,dx\int_{\mathbb{R}}p_{2}(y)f_{2}(y)\,dy\right|\leq F(p)\,\|f_{1}f_{2}\|_{1}$$

Full text

IFLP, UNLP, CONICET, Facultad de Ciencias Exactas, Calle 115 y 49, 1900 La Plata, Argentina

Abstract

We present an extension of the ergodic, mixing, and Bernoulli levels of the ergodic hierarchy for statistical models on curved manifolds, making use of elements of information geometry. This extension focuses on the notion of statistical independence between the microscopic variables of the system. Moreover, we establish an intimate relationship between statistical models and families of probability distributions belonging to the canonical ensemble, which for the case of quadratic Hamiltonian systems provides a closed form for the correlations between the microvariables in terms of the temperature of the heat bath as a power law. From this we obtain an information geometric method for studying Hamiltonian dynamics in the canonical ensemble. We illustrate the results with two examples: a pair of interacting harmonic oscillators presenting phase transitions and the $2\times 2$ Gaussian ensembles. In both examples the scalar curvature turns out to be a global indicator of the dynamics.

keywords:

Information geometry, Statistical models, 2D Correlated model, Ergodic hierarchy, IGEH, Canonical ensemble

1 Introduction

The possibility that relevant features of the dynamics of a system could be obtained from the differential geometric structure of probability distributions gave rise to the first encounter between differential geometry and probability theory. In this sense, the geometrization of thermodynamics and statistical mechanics constituted the most important achievement in the subject, with several approaches such as the one based on the internal energy, as considered by Weinhold [1], or the Ruppeiner metric given by the second moments of thermodynamical fluctuations [2], among others. With the same Riemannian character as these approaches, other formulations based on thermodynamic parameters were established in the field of statistical mechanics, like the foundational works of Rao [3] and Amari [5], and also those of Ingarden [6] and Janiszek [7]. The successful application of this vast body of approaches in characterizing several phenomena, such as phase transitions and their critical points in non-ideal gases, established them as a discipline within information theory, called Information Geometry. Curved statistical manifolds are the subject of study of information geometry, and they are equipped with the Fisher–Rao metric [3], which in turn is linked to the concepts of entropy and Fisher information. Generalized extensions of information geometry [8, 9, 10] with regard to the nonextensive formulation of statistical mechanics [11] have also been considered. The utility of information geometry is not limited to thermodynamics and statistical mechanics. For instance, it has been applied in quantum mechanics, leading to a quantum generalization of the Fisher metric [12], and recently also in nuclear plasmas [13]. In particular, an application of information geometry to chaos can be performed by considering complexity on curved manifolds [14, 15, 16, 17]. In this approach, asymptotic expressions for information measures are obtained by means of geodesic equations, leading to a criterion for characterizing global chaos on statistical manifolds [17]: the more negative the curvature, the more chaotic the dynamics. As usual, chaos can be characterized in terms of the divergence of initially nearby trajectories [18]. For statistical models this condition results in the divergence of geodesic paths on the statistical manifold and constitutes a local criterion for chaos.

Besides, in dynamical systems theory, the ergodic hierarchy (EH) characterizes chaotic behavior in terms of a type of correlation between subsets of the phase space [19, 20]. In the asymptotic limit of large times, the EH establishes that the dynamics is more chaotic when the correlation decays faster. According to the correlation decay, the four levels of the EH are, from the weakest to the strongest: ergodic, mixing, Kolmogorov, and Bernoulli. In particular, in mixing systems any two subsets sufficiently separated in time can be considered as "statistically independent", which allows one to use a statistical description of the behavior of the system. In quantum chaos, statistical independence is present in the universal statistical properties of energy levels, which are given by the Gaussian ensembles [21, 22, 23, 24]. In Gaussian ensembles theory one assumes that in a fully chaotic quantum system the interactions are neglected in such a way that the Hamiltonian matrix elements can be considered statistically independent [25]. Related to this, in [26, 27, 28] a quantum extension of the EH was proposed, called the quantum ergodic hierarchy, which made it possible to characterize the chaotic behaviors of the Casati–Prosen model [24] and the kicked rotator [21, 22, 23].

Inspired by the characterizations of quantum chaotic systems made in [26, 27, 28, 29, 30] and making use of curved statistical models, we propose a generalization of the ergodic, mixing and Bernoulli levels of the EH in the context of information geometry, which we call the Information Geometric Ergodic Hierarchy (IGEH). In order to use it, we define a distinguishability measure for a $2D$ correlated model that allows us to give an upper bound for the correlation of the IGEH. Moreover, considering Hamiltonian systems belonging to the canonical ensemble, we also give a method for characterizing their dynamics in terms of the statistical parameters and the levels of the IGEH.

In this way, our main contribution is two-fold: 1) an intimate connection between statistical models and probability distributions of the canonical ensemble, which allows one to geometrize phase transitions, and 2) an information geometric version of the ergodic hierarchy as an alternative framework for studying chaotic dynamics in curved statistical models.

The paper is organized as follows. In Section 2, we give the notions and concepts of information geometry used throughout the paper, along with a brief description of a $2D$ correlated model. In Section 3, we briefly review the levels of the ergodic hierarchy. Section 4 is devoted to an information geometric definition of the ergodic hierarchy, obtained by expressing the correlations in terms of probability distributions instead of subsets of phase space. Next, in Section 5 we define a distinguishability measure for the $2D$ correlated model and give an upper bound for the correlation of the IGEH. In Section 6, we establish the connection between statistical models and the family of probability distributions belonging to the canonical ensemble. For quadratic Hamiltonian systems we show that their associated statistical models are the multivariate Gaussian ones, for which we obtain a closed form of the determinant of the covariance matrix in terms of the Hessian of the Hamiltonian. Here we also give an upper bound for the IG correlation where the temperature of the heat bath is considered as an external parameter. In Section 7, we illustrate the formalism with two examples: a pair of interacting harmonic oscillators presenting phase transitions in the canonical ensemble and the $2\times 2$ Gaussian Orthogonal Ensemble (GOE). A panoramic outlook of the IGEH is also sketched. Finally, in Section 8 we draw some conclusions and outline future research directions.

2 Elements of information geometry

We begin by introducing some fundamentals and concepts, following the definitions given in [5].

2.1 Statistical models

Given an abstract set $X$, one can consider the set $\textit{M}$ of all the probability density functions (PDFs) $p$ defined on $X$, i.e.

$$\textit{M}=\left\{p\,:\,p:X\to\mathbb{R},\ p(x)\geq 0,\ \int_{X}p(x)\,dx=1\right\}\tag{1}$$

where $\mathbb{R}$ is the set of real numbers and the integration must be replaced by a sum when $X$ is discrete. Consider a subset $S\subset\textit{M}$ such that each element of $S$ may be parameterized using a real $m$-dimensional vector $\theta=(\theta_{1},\ldots,\theta_{m})$ so that

$$S=\left\{p_{\theta}\in\textit{M}\,:\,p_{\theta}=p(x;\theta),\ \theta=(\theta_{1},\ldots,\theta_{m})\in\Theta\right\}\tag{2}$$

where $\Theta$ is a subset of $\mathbb{R}^{m}$. Then, if the mapping $\theta\mapsto p_{\theta}$ is injective, $S$ is said to be a statistical model on $X$. The dimension of the statistical model is that of the macrospace, i.e. it is $m$-dimensional. The physical interpretation of $X$ and $\Theta$ is as follows. Generally, $X$ represents the microscopic variables of the system under study, which typically are difficult to control, for instance the positions of all the particles in a gas. Thus, $X$ is called the microspace and $x$ are the microvariables. On the other hand, $\Theta$ represents the macroscopic variables that can be measured in an experiment, like the mean value or the moments of the microvariables. $\Theta$ is called the macrospace and $\theta_{1},\ldots,\theta_{m}$ are the macrovariables. Since the microspace is fixed by the system, one can only choose the macrospace, and in this way the statistical model is established. Statistical models are therefore system-specific: a model that is useful for one system need not be useful for another.

2.2 Metric structure of the statistical manifold

The next step is to describe the behavior of a system by means of a previously and adequately chosen statistical model. For this, some kind of dynamics must be introduced on the statistical model. In information geometry this is accomplished by means of the Fisher–Rao tensor

$$g_{ij}(\theta)=\int_{X}dx\,p(x;\theta)\,\frac{\partial\log p(x;\theta)}{\partial\theta_{i}}\frac{\partial\log p(x;\theta)}{\partial\theta_{j}},\qquad i,j=1,\ldots,m\tag{3}$$

where $p(x;\theta)$ is a generic element of $S$. The metric tensor $g_{ij}$ endows the macrospace with a dynamics in terms of the geodesic equations for the macrovariables $\theta_{i}$, i.e. $S$ becomes a statistical manifold. More precisely, $S$ is a Riemannian manifold, and its statistical character is due to the fact that the elements of $S$ are probability distributions.
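As an illustration of the Fisher–Rao tensor (3), the following minimal sketch evaluates the integral numerically for a univariate Gaussian with $\theta=(\mu,\sigma)$. The choice of this model, the helper names `log_gauss` and `fisher_metric`, the grid size, and the finite-difference step are all assumptions made here for illustration and are not taken from the paper. For this model the known result is $g=\mathrm{diag}(1/\sigma^{2},2/\sigma^{2})$.

```python
import math

def log_gauss(x, mu, sigma):
    """Log-density of a univariate Gaussian N(mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def fisher_metric(log_p, theta, lo, hi, n=20001, eps=1e-5):
    """Approximate g_ij(theta) = int p * d_i(log p) * d_j(log p) dx on [lo, hi].

    The scores d_i(log p) are central finite differences of log_p and the
    integral is a trapezoidal sum over n grid points (illustrative choices).
    """
    m = len(theta)
    h = (hi - lo) / (n - 1)
    g = [[0.0] * m for _ in range(m)]
    for k in range(n):
        x = lo + k * h
        w = h if 0 < k < n - 1 else 0.5 * h  # trapezoid weights
        scores = []
        for i in range(m):
            up = list(theta)
            dn = list(theta)
            up[i] += eps
            dn[i] -= eps
            scores.append((log_p(x, *up) - log_p(x, *dn)) / (2.0 * eps))
        p = math.exp(log_p(x, *theta))
        for i in range(m):
            for j in range(m):
                g[i][j] += w * p * scores[i] * scores[j]
    return g

mu, sigma = 0.3, 1.7
g = fisher_metric(log_gauss, [mu, sigma], mu - 12.0 * sigma, mu + 12.0 * sigma)
print(g)  # compare with [[1/sigma**2, 0], [0, 2/sigma**2]]
```

The same routine can be pointed at any one-dimensional log-density with the signature `log_p(x, *theta)`; higher-dimensional microspaces would need a multidimensional quadrature instead.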

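Thus, the main goal of statistical models is to obtain relevant information about the dynamics by means of the geodesic equations and geometrical quantities like the Ricci tensor, the scalar curvature, etc. In this sense, we will use a criterion given by Cafaro et al. [14, 15, 16] to characterize global chaos on statistical models: "the more negative the scalar curvature, the more chaotic the dynamics". From the metric tensor (3) one can obtain the geodesic equations for the macrovariables $\theta_{1},\ldots,\theta_{m}$ along with the following geometrical quantities that we will use throughout the paper.

$$\textrm{Geodesic equations}:\quad\frac{d^{2}\theta_{k}}{d\tau^{2}}+\Gamma_{ij}^{k}\frac{d\theta_{i}}{d\tau}\frac{d\theta_{j}}{d\tau}=0,\qquad\forall\,k=1,\ldots,m\tag{4}$$

$$\textrm{Christoffel symbols}:\quad\Gamma_{kl}^{i}=\frac{1}{2}g^{im}\left(g_{mk,l}+g_{ml,k}-g_{kl,m}\right)\tag{5}$$

$$\textrm{Riemann curvature tensor}:\quad R_{iklm}=\frac{1}{2}\left(g_{im,kl}+g_{kl,im}-g_{il,km}-g_{km,il}\right)+g_{np}\left(\Gamma_{kl}^{n}\Gamma_{im}^{p}-\Gamma_{km}^{n}\Gamma_{il}^{p}\right)\tag{6}$$

$$\textrm{Ricci tensor}:\quad R_{ik}=g^{lm}R_{limk}\tag{7}$$

$$\textrm{Scalar curvature}:\quad R=g^{ik}R_{ik}\tag{8}$$

where the comma in the subindexes denotes partial differentiation (of first and second order), $g^{kl}$ is the inverse of $g_{ij}$, and $\tau$ is a parameter that characterizes the geodesic curves.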
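To make the role of the geodesic equations concrete, the sketch below integrates them for the $2D$ correlated model of Section 2.3, using the non-vanishing Christoffel symbols quoted for that model ($\Gamma_{12}^{1}=\Gamma_{21}^{1}=-1/\sigma$, $\Gamma_{11}^{2}=1/(4\sigma)$, $\Gamma_{22}^{2}=-1/\sigma$). The fourth-order Runge–Kutta integrator, the initial condition, and the function names are assumptions introduced here for illustration. As a sanity check, the squared speed $g_{ij}\dot{\theta}_{i}\dot{\theta}_{j}$ should stay constant along a geodesic.

```python
import math

def geodesic_rhs(state):
    """Geodesic equations in first-order form for state = (mu, sigma, dmu, dsigma).

    mu''    = -2 * Gamma^1_12 * mu' * sigma'                = 2 mu' sigma' / sigma
    sigma'' = -Gamma^2_11 * mu'^2 - Gamma^2_22 * sigma'^2   = -mu'^2/(4 sigma) + sigma'^2/sigma
    """
    mu, sigma, dmu, dsigma = state
    ddmu = 2.0 * dmu * dsigma / sigma
    ddsigma = -dmu * dmu / (4.0 * sigma) + dsigma * dsigma / sigma
    return (dmu, dsigma, ddmu, ddsigma)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_geodesic(state, tau_end, steps):
    """Integrate from tau = 0 to tau_end with a fixed number of RK4 steps."""
    h = tau_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

def squared_speed(state, r):
    """g_ij dtheta_i dtheta_j for the metric diag(1, 4) / (sigma^2 (1 - r^2))."""
    _, sigma, dmu, dsigma = state
    return (dmu ** 2 + 4.0 * dsigma ** 2) / (sigma ** 2 * (1.0 - r ** 2))

start = (0.0, 1.0, 1.0, 0.0)  # (mu, sigma, dmu/dtau, dsigma/dtau), illustrative
end = integrate_geodesic(start, 2.0, 2000)
print(end, squared_speed(start, 0.5), squared_speed(end, 0.5))
```

Because the Christoffel symbols of this model do not depend on $r$, the geodesic paths are the same for every correlation coefficient; $r$ only rescales the speed.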

2.3 The $2D$ correlated Gaussian model

Most statistical models used in the literature are the so-called Gaussian models, due to their wide versatility for describing multiple phenomena. These models are obtained by choosing the subfamily $S$ as the set of multivariate Gaussian distributions. If $(x_{1},\ldots,x_{n})\in\mathbb{R}^{n}$ are the microvariables and there are no correlations between them, then $(\mu_{1},\ldots,\mu_{n},\sigma_{1},\ldots,\sigma_{n})\in\mathbb{R}^{n}\times\mathbb{R}_{+}^{n}$ is the set of macrovariables, where $\mu_{i}$ and $\sigma_{i}^{2}$ correspond to the mean value and the variance of the $i$-th microvariable. However, for a more realistic description of the system the correlations between the microvariables must be taken into account. Considering the family of bivariate (binormal) distributions, one obtains one of the lowest-dimensional Gaussian models that presents correlations, which is given by

$$p(x,y;\mu_{x},\mu_{y},\sigma_{x},\sigma_{y},r)=\frac{1}{2\pi\sigma_{x}\sigma_{y}\sqrt{1-r^{2}}}\exp\left(-\frac{1}{2(1-r^{2})}\left[\frac{(x-\mu_{x})^{2}}{\sigma_{x}^{2}}+\frac{(y-\mu_{y})^{2}}{\sigma_{y}^{2}}-\frac{2r(x-\mu_{x})(y-\mu_{y})}{\sigma_{x}\sigma_{y}}\right]\right)\tag{9}$$

where $\sigma_{xy}=r\sigma_{x}\sigma_{y}$ is the covariance between $x$ and $y$, and $r$ is the correlation coefficient, which takes values in the range $-1\leq r\leq 1$. Here the microspace is $X=\{(x,y)\in\mathbb{R}^{2}\}$ and the macrospace is $\Theta=\{(\mu_{x},\mu_{y},\sigma_{x},\sigma_{y})\in\mathbb{R}\times\mathbb{R}\times\mathbb{R}_{+}\times\mathbb{R}_{+}\}$. Nevertheless, one can still obtain a non-trivial description by adding the following macroscopic constraint

$$\Sigma^{2}=\sigma_{x}\sigma_{y}\tag{10}$$

where $\Sigma$ is a constant belonging to $\mathbb{R}_{+}$. Mathematically, the effect of $\Sigma$ is to restrict the dynamics to the submanifold $\Theta\cap\{\Sigma^{2}=\sigma_{x}\sigma_{y}\}$. Physically, $\Sigma$ resembles the minimum uncertainty relation when one chooses $x$ as the position of a particle and $y$ as its conjugate variable. Moreover, this interpretation allows one to explain the phenomenon of "suppression of classical chaos by quantization" from an information geometric point of view [17]. For the sake of simplicity one can also fix the mean value $\mu_{y}$ of $y$ as zero. With the help of (10) one can then rewrite (9), thus obtaining the non-trivial correlated statistical model of lower dimensionality

$$p(x,y;\mu_{x},\sigma,r)=\frac{1}{2\pi\Sigma^{2}\sqrt{1-r^{2}}}\exp\left(-\frac{1}{2(1-r^{2})}\left[\frac{(x-\mu_{x})^{2}}{\sigma^{2}}+\frac{y^{2}\sigma^{2}}{\Sigma^{4}}-\frac{2r(x-\mu_{x})y}{\Sigma^{2}}\right]\right)\tag{11}$$

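where we renamed $\sigma_{x}$ as $\sigma$. This is the so-called $2D$ correlated model [17], since it presents correlations between $x$ and $y$ by means of $\Sigma^{2}=\sigma_{x}\sigma_{y}$ and its macrospace $\Theta=\{(\mu_{x},\sigma)\in\mathbb{R}\times\mathbb{R}_{+}\}$ is two-dimensional. It should be noted that here $r$ is considered as an external parameter that does not belong to the macrospace. For this model, from Eqs. (3)–(8) one can obtain the Fisher tensor along with the following geometrical quantities.

$$\textrm{Fisher--Rao metric}:\quad g_{ij}(\theta)=\left(\begin{array}{cc}\frac{1}{\sigma^{2}(1-r^{2})}&0\\ 0&\frac{4}{\sigma^{2}(1-r^{2})}\end{array}\right)$$

$$\textrm{Non-vanishing Christoffel symbols}:\quad\Gamma_{12}^{1}=\Gamma_{21}^{1}=-\frac{1}{\sigma},\qquad\Gamma_{11}^{2}=\frac{1}{4\sigma},\qquad\Gamma_{22}^{2}=-\frac{1}{\sigma}$$

$$\textrm{Non-vanishing Ricci tensor components}:\quad R_{11}=-\frac{1}{4\sigma^{2}},\qquad R_{22}=-\frac{1}{\sigma^{2}}$$

$$\textrm{Scalar curvature}:\quad R(r)=-\frac{1}{2}(1-r^{2}),\qquad -1\leq r\leq 1\tag{17}$$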

From (17) one can see that the curvature has a minimum value $R=-\frac{1}{2}$ when $r=0$ (absence of correlations), while for $|r|\rightarrow 1$ (maximally correlated case) it tends to its maximum value $R=0$. In terms of the criterion of global chaos this can be interpreted as follows: the dynamics of the uncorrelated case is more chaotic than that of the maximally correlated case. Moreover, the divergence of the metric observed for $|r|\rightarrow 1$ identifies the maximally correlated case as a critical point of the dynamics.
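The scalar curvature of the $2D$ correlated model can be cross-checked numerically from its metric alone. The sketch below is an independent check and not the paper's derivation: it assumes the diagonal metric $g_{11}=1/(\sigma^{2}(1-r^{2}))$, $g_{22}=4/(\sigma^{2}(1-r^{2}))$ stated for the model, uses the classical Gauss formula for the curvature $K$ of an orthogonal metric $E\,d\mu^{2}+G\,d\sigma^{2}$, takes $R=2K$ in two dimensions, and approximates the derivatives with central finite differences. Function names and step sizes are illustrative.

```python
import math

def metric_components(mu, sigma, r):
    """Return (E, G) = (g_11, g_22) of the 2D correlated model."""
    a = 1.0 / (1.0 - r * r)
    return a / sigma ** 2, 4.0 * a / sigma ** 2

def scalar_curvature(mu, sigma, r, h=1e-3):
    """R = 2K with K = -1/(2 sqrt(EG)) [d_mu(G_mu/sqrt(EG)) + d_sigma(E_sigma/sqrt(EG))]."""
    def inner_mu(m, s):
        # G_mu / sqrt(EG), with G_mu from a central difference
        E, G = metric_components(m, s, r)
        G_mu = (metric_components(m + h, s, r)[1]
                - metric_components(m - h, s, r)[1]) / (2.0 * h)
        return G_mu / math.sqrt(E * G)

    def inner_sigma(m, s):
        # E_sigma / sqrt(EG), with E_sigma from a central difference
        E, G = metric_components(m, s, r)
        E_s = (metric_components(m, s + h, r)[0]
               - metric_components(m, s - h, r)[0]) / (2.0 * h)
        return E_s / math.sqrt(E * G)

    d_mu = (inner_mu(mu + h, sigma) - inner_mu(mu - h, sigma)) / (2.0 * h)
    d_sigma = (inner_sigma(mu, sigma + h) - inner_sigma(mu, sigma - h)) / (2.0 * h)
    E, G = metric_components(mu, sigma, r)
    K = -(d_mu + d_sigma) / (2.0 * math.sqrt(E * G))
    return 2.0 * K

for r in (0.0, 0.5, 0.9):
    print(r, scalar_curvature(0.2, 1.5, r), -0.5 * (1.0 - r * r))
```

The numerical value should not depend on the point $(\mu_{x},\sigma)$ at which it is evaluated, in line with $R$ being a function of $r$ only.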

3 The ergodic hierarchy

In classical chaos, the exponential instability implies continuous spectrum, and therefore, a decay of correlations in such a way that for large times the measure of the intersection between two sets of phase space (separated from each other in time) tends to the product of their measures. This is the well known mixing property, and constitutes one of the foundations of the statistical mechanics. The main feature of mixing is that it establishes the statistical independence of different parts of a trajectory, when sufficiently separated in time. This is the main reason for the application of probability theory in the classical domain, which allows one to calculate statistical features such as diffusion, relaxation and distribution functions [23]. Consequently, the description in terms of trajectories can be replaced by an equivalent one in terms of distribution functions, which, if not singular, represent not a single trajectory but a continuum of them.

In ergodic theory, any classical system is represented mathematically by a dynamical system $(X,\Sigma,\mu,\{T_{t}\}_{t\in J})$ where $X$ is a set, $\Sigma$ is a sigma–algebra of $X$ , $\mu$ a measure defined over $\Sigma$ and $\{T_{t}\}_{t\in J}$ a group of measure–preserving transformations. The ergodic hierarchy ranks the chaos of a dynamical system according to a type of correlation $C(T_{t}A,B)$ between two subsets $A$ and $B$ of $X$ that are separated by a time $t$ . This is defined as [19, 20]

$$C(T_{t}A,B)=\mu(T_{t}A\cap B)-\mu(A)\mu(B)\tag{18}$$

The ergodic, mixing, Kolmogorov and Bernoulli levels of the EH are given in terms of (18) in the following way. Given two arbitrary sets $A,B\subseteq X$, it is said that $T_{t}$ is

$\bullet$ ergodic if

$$\lim_{T\to\infty}\frac{1}{T}\int_{0}^{T}C(T_{t}A,B)\,dt=0,\tag{19}$$

$\bullet$ mixing if

$$\lim_{t\to\infty}C(T_{t}A,B)=0,\tag{20}$$

$\bullet$ Kolmogorov if for all integer $r$, for all $A_{0},A_{1},\ldots,A_{r}\subseteq X$, and for all $\varepsilon>0$ there exists a positive integer $n_{0}>0$ such that if $n\geq n_{0}$ one has

$$|C(A_{0},B)|<\varepsilon\quad\textrm{for all}\ B\in\sigma_{n,r}(A_{1},\ldots,A_{r})\tag{21}$$

where $\sigma_{n,r}(A_{1},\ldots,A_{r})$ is the minimal $\sigma$-algebra generated by $\{T^{k}A_{i}:k\geq n\ ;\ i=1,\ldots,r\}$,

$\bullet$ Bernoulli if

$$C(T_{t}A,B)=0\quad\textrm{for all}\ t\geq 0.\tag{22}$$

In ergodic systems the correlation vanishes “in time average" for large times while in mixing systems $C(T_{t}A,B)$ vanishes for $t\rightarrow\infty$ . In Kolmogorov systems the correlations between an arbitrary set and another one belonging to the $\sigma$ –algebra $\sigma_{n,r}(A_{1},\ldots,A_{r})$ cancel for $n\rightarrow\infty$ . In Bernoulli systems the correlation is zero for all times. These levels classify the dynamics according to Eqs. (19), (20), (21), and (22), from the weakest level (the ergodic) to the strongest (the Bernoulli). The following strict inclusions hold:

$$\textrm{ergodic}\supset\textrm{mixing}\supset\textrm{Kolmogorov}\supset\textrm{Bernoulli}$$
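The gap between the ergodic and mixing levels can be seen in a textbook example that is not taken from the paper: the rotation of the circle $T(x)=x+\alpha \bmod 1$ with irrational $\alpha$, which is ergodic but not mixing with respect to the Lebesgue measure. The sketch below uses the discrete-time analogue of the correlation (18) with the assumed choice $A=B=[0,\frac{1}{2})$, for which $\mu(T^{n}A\cap B)=|\frac{1}{2}-s|$ with $s=n\alpha \bmod 1$. The time average of the correlation tends to zero while the correlation itself keeps oscillating.

```python
import math

def correlation_rotation(n, alpha):
    """C(T^n A, B) for T(x) = x + alpha mod 1 and A = B = [0, 1/2).

    The overlap of [0, 1/2) with its shift by s = n*alpha mod 1 has
    Lebesgue measure |1/2 - s|, and mu(A) * mu(B) = 1/4.
    """
    s = (n * alpha) % 1.0
    return abs(0.5 - s) - 0.25

alpha = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation angle (illustrative choice)
N = 100000
values = [correlation_rotation(n, alpha) for n in range(N)]

time_average = sum(values) / N                        # ergodic level: tends to 0
late_amplitude = max(abs(c) for c in values[-1000:])  # mixing would need this to vanish
print(time_average, late_amplitude)
```

A mixing system would additionally require `late_amplitude` to shrink as the window is moved to later times, which does not happen here.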

In order to express $C(T_{t}A,B)$ by means of probability distributions it is more convenient to use the definition (18) in terms of distribution functions, which is given by [20]

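$$C(f\circ T_{t},g)=\int_{X}(f\circ T_{t})(x)\,g(x)\,dx-\int_{X}f(x)\,dx\int_{X}g(x)\,dx\qquad\forall\,f,g\in\mathbb{L}^{1}(X)$$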

where $f\circ T_{t}$ denotes the composition of $f$ and $T_{t}$ , i.e. $f\circ T_{t}(x)=f(T_{t}(x))$ for all $x\in X$ and now the role of $A,B$ is played by the functions $f,g\in\mathbb{L}^{1}(X)$ . Physically, $f$ represents any initial density function of the classical system whose value at time $t$ is given by $f\circ T_{t}$ with $T_{t}$ the classical Liouville evolution (in Hamiltonian systems).

4 An information geometric version of the ergodic hierarchy

Following the idea of characterizing chaos by means of the ergodic hierarchy [26, 27, 28], now we consider an extension of the EH within the context of the information geometry. In principle, in information geometry one has probability distributions $p_{\theta}$ that depend on a set of parameters $\theta$ , and the dynamics of the macrovariables $\theta$ is performed along the geodesics of the statistical manifold. Moreover, in the statistical manifold the role of time variable $t$ of dynamical systems is played by a parameter $\tau$ along the geodesics.

In order to introduce the tools of information geometry, we propose the following approach, defining a correlation between functions as the macrovariables $\theta$ evolve along the geodesics. Given $N$ functions $f_{i}(x_{i})$, each one depending on the variable $x_{i}$ for $i=1,\ldots,N$, we define the information geometric correlation (IG correlation) $C(f_{1},\ldots,f_{N},\tau)$ between $f_{1},\ldots,f_{N}$ at the time-like parameter $\tau$ as

$$C(f_{1},\ldots,f_{N},\tau)\doteq\int p(x_{1},\ldots,x_{N};\theta(\tau))\,f_{1}(x_{1})\cdots f_{N}(x_{N})\,dx_{1}\cdots dx_{N}-\prod_{i=1}^{N}\int p_{i}(x_{i};\theta(\tau))\,f_{i}(x_{i})\,dx_{i}$$

where $\theta(\tau)=(\theta_{1}(\tau),\ldots,\theta_{M}(\tau))$ is the $M$-dimensional vector of the macrovariables at "time" $\tau$ and

$$p_{i}(x_{i};\theta(\tau))=\int p(x_{1},\ldots,x_{N};\theta(\tau))\prod_{j\neq i}dx_{j},\qquad i=1,\ldots,N$$

are the marginal distributions of $p(x_{1},\ldots,x_{N};\theta(\tau))$. From the definition above one can see that $C(f_{1},\ldots,f_{N},\tau)$ measures how independent the variables $x_{1},\ldots,x_{N}$ are at the time-like parameter $\tau$: it vanishes for every choice of $f_{1},\ldots,f_{N}$ when the joint distribution factorizes into the product of its marginals. This can be considered as a sort of information geometric generalization of the EH correlation.
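A small numerical sketch of the IG correlation for $N=2$ may help fix ideas. It assumes the $2D$ correlated model (11) at a fixed value of the macrovariables and the test functions $f_{1}(x)=x$, $f_{2}(y)=y$, chosen here only because the result is then the covariance $r\sigma_{x}\sigma_{y}=r\Sigma^{2}$ and can be checked in closed form. The midpoint quadrature, the grid size, and the function names are illustrative assumptions.

```python
import math

def p_joint(x, y, mu_x, sigma, r, Sigma):
    """2D correlated model, Eq. (11): sigma_x = sigma, sigma_y = Sigma^2/sigma, mu_y = 0."""
    q = ((x - mu_x) ** 2 / sigma ** 2
         + y ** 2 * sigma ** 2 / Sigma ** 4
         - 2.0 * r * (x - mu_x) * y / Sigma ** 2)
    norm = 2.0 * math.pi * Sigma ** 2 * math.sqrt(1.0 - r * r)
    return math.exp(-q / (2.0 * (1.0 - r * r))) / norm

def ig_correlation(f1, f2, mu_x, sigma, r, Sigma, n=400, width=8.0):
    """Midpoint-rule estimate of C(f1, f2) = <f1 f2>_p - <f1>_{p1} <f2>_{p2}."""
    sigma_y = Sigma ** 2 / sigma
    hx = 2.0 * width * sigma / n
    hy = 2.0 * width * sigma_y / n
    xs = [mu_x - width * sigma + (i + 0.5) * hx for i in range(n)]
    ys = [-width * sigma_y + (j + 0.5) * hy for j in range(n)]
    joint = mean1 = mean2 = 0.0
    for x in xs:
        for y in ys:
            w = p_joint(x, y, mu_x, sigma, r, Sigma) * hx * hy
            joint += w * f1(x) * f2(y)
            mean1 += w * f1(x)  # equals the integral of f1 against the marginal p1
            mean2 += w * f2(y)  # equals the integral of f2 against the marginal p2
    return joint - mean1 * mean2

c_corr = ig_correlation(lambda x: x, lambda y: y, 1.0, 2.0, 0.6, 1.5)
c_indep = ig_correlation(lambda x: x, lambda y: y, 1.0, 2.0, 0.0, 1.5)
print(c_corr, 0.6 * 1.5 ** 2, c_indep)
```

Evaluating the same function along a geodesic $\theta(\tau)$ would give the $\tau$ dependence used in the definitions of the IGEH levels.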

Having established $C(f_{1},\ldots,f_{N},\tau)$ and taking into account the ergodic, mixing and Bernoulli levels given by Eqs. (19), (20) and (22), we define the information geometric ergodic hierarchy (IGEH) as follows. For the sake of simplicity and since we focus mainly on the ergodicity and mixing properties (which are the fundamentals of statistical mechanics), in this contribution we do not extend the Kolmogorov level that involves a $\sigma$ –algebra. Given a set of $N$ arbitrary functions $f_{1}(x_{1}),\ldots,f_{N}(x_{N})$ we say that now the statistical model is

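$\bullet$ IG ergodic if

$$\lim_{T\to\infty}\frac{1}{T}\int_{0}^{T}C(f_{1},\ldots,f_{N},\tau)\,d\tau=0,\tag{26}$$

$\bullet$ IG mixing if

$$\lim_{\tau\to\infty}C(f_{1},\ldots,f_{N},\tau)=0,\tag{27}$$

$\bullet$ IG Bernoulli if

$$C(f_{1},\ldots,f_{N},\tau)=0\quad\textrm{for all}\ \tau\in\mathbb{R}.\tag{28}$$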

As in the ergodic hierarchy, the following strict inclusions hold:

$$\textrm{IG ergodic}\supset\textrm{IG mixing}\supset\textrm{IG Bernoulli}$$

For instance, a statistical model that is IG ergodic can be given by assuming that $C(f_{1},\ldots,f_{N},\tau)$ is proportional to $\sin(\alpha\tau)||f_{1}||_{1}\ldots||f_{N}||_{1}$ with $\alpha\in\mathbb{R}$, $\alpha\neq 0$. Making this replacement in (26) one obtains that $\lim_{T\rightarrow\infty}\frac{1}{T}\int_{0}^{T}C(f_{1},\ldots,f_{N},\tau)d\tau$ is equal to zero. Since $\sin(\alpha\tau)||f_{1}||_{1}\ldots||f_{N}||_{1}$ oscillates without decaying, this model is neither IG mixing nor IG Bernoulli. Examples of statistical models that are IG mixing and IG Bernoulli will be illustrated in Section 7. Our approach thus deals with the ergodic hierarchy in statistical models from an information geometry viewpoint.
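The IG ergodic example can be checked directly, since $\frac{1}{T}\int_{0}^{T}\sin(\alpha\tau)\,d\tau=\frac{1-\cos(\alpha T)}{\alpha T}$, which is bounded by $2/(|\alpha|T)$ and therefore tends to zero. The following sketch compares a numerical time average with this closed form. The value of $\alpha$, the quadrature, and the helper name `time_average` are illustrative assumptions, and the constant factor $||f_{1}||_{1}\ldots||f_{N}||_{1}$ is set to one.

```python
import math

def time_average(c, T, n=100000):
    """Midpoint-rule estimate of (1/T) * integral_0^T c(tau) dtau."""
    h = T / n
    return sum(c((k + 0.5) * h) for k in range(n)) * h / T

def exact_sine_average(alpha, T):
    """Closed form of (1/T) * integral_0^T sin(alpha * tau) dtau for alpha != 0."""
    return (1.0 - math.cos(alpha * T)) / (alpha * T)

alpha = 2.0

def oscillating(tau):
    """IG correlation of the example above, up to a constant factor."""
    return math.sin(alpha * tau)

for T in (10.0, 100.0, 1000.0):
    print(T, time_average(oscillating, T), exact_sine_average(alpha, T))
```

The averages shrink like $1/T$, while `oscillating(tau)` itself keeps reaching values close to $\pm 1$ for arbitrarily large $\tau$, which is why the IG mixing condition (27) fails.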

5 A measure of distinguishability for the $2D$ correlated model

In order to use the levels of the IGEH for characterizing the dynamics of statistical models, one needs a way of determining the decay of the correlation $C(f_{1},\ldots,f_{N},\tau)$ in Eqs. (26), (27) or (28). For the family of the $2D$ correlated probabilities $p(x,y;\mu_{x},\sigma,r)$ of (11), we define a distinguishability measure $F:\{p(x,y;\mu_{x},\sigma,r)\ |\ \mu_{x}\in(-\infty,\infty),\ \sigma\in(0,\infty),\ -1\leq r\leq 1\}\longmapsto\mathbb{R}$, given by

$$F(p)\doteq\|p(x,y;\mu_{x},\sigma,r)-p_{1}(x)p_{2}(y)\|_{\infty}=\max_{(x,y)\in\mathbb{R}^{2}}|p(x,y;\mu_{x},\sigma,r)-p_{1}(x)p_{2}(y)|\tag{29}$$

where $p_{1}(x),p_{2}(y)$ are the marginal distributions of $p(x,y;\mu_{x},\sigma,r)$. Furthermore, if $f_{1}(x),f_{2}(y)\in\mathbb{L}^{1}(\mathbb{R})$ are arbitrary functions of $x$ and $y$, then we have

$$|C(f_{1},f_{2},\tau)|=\left|\int_{\mathbb{R}^{2}}p(x,y;\mu_{x},\sigma,r)\,f_{1}(x)f_{2}(y)\,dx\,dy-\int_{\mathbb{R}}p_{1}(x)f_{1}(x)\,dx\int_{\mathbb{R}}p_{2}(y)f_{2}(y)\,dy\right|\leq F(p)\,\|f_{1}f_{2}\|_{1}\tag{30}$$

Eq. (30) expresses that $F(p)||f_{1}f_{2}||_{1}$ is an upper bound for $|C(f_{1},f_{2},\tau)|$. Therefore, it is convenient to find an analytic expression for (29). After some algebra (the demonstration can be found in the Appendix) one obtains

[TABLE]

The behavior of $F(p)$ , which is independent of $\mu_{x}$ and $\sigma$ , is shown in Fig. 1. Two relevant regions, corresponding to the limiting cases $r\rightarrow 0$ and $r\rightarrow\pm 1$ , can be well distinguished. The region $r\rightarrow 0$ corresponds to the zone where the statistical model is characterized by the IG mixing and IG Bernoulli levels, with the particularity that the variables of microspace are uncorrelated. Moreover, one can see that near to $r=0$ the decay is linear in $r$ . The curve $F(p)$ also shows that, if $r\rightarrow 0$ when $\tau\rightarrow\infty$ , then the statistical model is IG mixing.

In the region $r\rightarrow\pm 1$ the measure $F(p)$ diverges, corresponding to the maximally correlated case, which physically means that the system presents strong correlations between the variables of the microspace. Because the correlations are strong in this regime, the statistical model can be neither IG mixing nor IG Bernoulli.

Finally, it should be noted that $F(p)$ does not allow one to distinguish between two probability distributions having $r$ and $-r$ respectively. The symmetry with respect to the axis $r=0$ is due to the mathematical form of the supremum norm $||.||_{\infty}$ in the definition (29). That is, with other choices of $F(p)$ one could distinguish states (probability distributions) with correlation coefficients $r$ and $-r$.
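The qualitative properties of $F(p)$ described above (it vanishes at $r=0$, is symmetric under $r\rightarrow -r$, does not depend on $\mu_{x}$ or $\sigma$, and grows without bound as $|r|\rightarrow 1$) can be explored with a brute-force evaluation of definition (29) on a grid. This is only a numerical sketch under assumed grid parameters and does not reproduce the paper's analytic expression; the function names are hypothetical. The marginals are taken to be the Gaussians $N(\mu_{x},\sigma^{2})$ and $N(0,\sigma_{y}^{2})$ with $\sigma_{y}=\Sigma^{2}/\sigma$.

```python
import math

def p_joint(x, y, mu_x, sigma, r, Sigma):
    """2D correlated model, Eq. (11): sigma_x = sigma, sigma_y = Sigma^2/sigma, mu_y = 0."""
    q = ((x - mu_x) ** 2 / sigma ** 2
         + y ** 2 * sigma ** 2 / Sigma ** 4
         - 2.0 * r * (x - mu_x) * y / Sigma ** 2)
    norm = 2.0 * math.pi * Sigma ** 2 * math.sqrt(1.0 - r * r)
    return math.exp(-q / (2.0 * (1.0 - r * r))) / norm

def distinguishability(mu_x, sigma, r, Sigma, n=241, width=6.0):
    """Grid estimate of F(p) = max |p(x, y) - p1(x) p2(y)| over (x, y)."""
    sigma_y = Sigma ** 2 / sigma
    best = 0.0
    for i in range(n):
        u = -width + 2.0 * width * i / (n - 1)  # standardized x coordinate
        x = mu_x + sigma * u
        p1 = math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))
        for j in range(n):
            v = -width + 2.0 * width * j / (n - 1)  # standardized y coordinate
            y = sigma_y * v
            p2 = math.exp(-0.5 * v * v) / (sigma_y * math.sqrt(2.0 * math.pi))
            best = max(best, abs(p_joint(x, y, mu_x, sigma, r, Sigma) - p1 * p2))
    return best

Sigma = 1.0
for r in (0.0, 0.01, 0.5, -0.5, 0.9, 0.99):
    print(r, distinguishability(0.0, 1.0, r, Sigma))
```

A grid maximum can only underestimate the true supremum, so the printed values are lower bounds for $F(p)$, and the accuracy near $|r|=1$ depends on the grid resolving the narrow ridge of the joint density.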

6 Geometrizing the canonical ensemble by means of statistical models and the IGEH

On the basis of the above characterization of the dynamics of the macrospace in terms of the IGEH, our next aim is to give a method for studying the dynamics of a system belonging to the canonical ensemble.

6.1 Canonical ensembles in the context of the information geometry

Although the relationship between statistical models and statistical physics has already been established [5, 31], extensions in several directions have recently been introduced [32, 33, 34, 35, 36, 37, 38, 39], with a particular focus on the exponential families, since they represent mathematically the Liouville densities of the statistical ensembles. Relevant consequences of this research, such as the connection between Hessian structures and exponential families [38], nonextensive statistical models [34, 35, 36, 39] and other extensions [32, 33, 37], have been reported.

In order to apply the IGEH to the canonical ensemble, here we obtain some explicit formulas that relate the statistical parameters of multivariate Gaussian distributions (which are a special case of exponential family) with the physical parameters of quadratic Hamiltonians. In the present contribution we only focus on the multivariate Gaussian distributions and we will omit the definition of exponential families of a more general character. We begin by considering the family of probability density functions given by the classical canonical ensemble

$p(q,p;\theta)=\frac{1}{Z(\theta)}\exp\left(-\beta E(q,p;\theta)\right)$ (32)

where $Z(\theta)=\int\exp\left(-\beta E(q,p;\theta)\right)dqdp$ is the well known partition function, $\beta=\frac{1}{k_{B}T}$ is the inverse temperature, $E(q,p;\theta)$ is the energy of the system expressed in terms of the phase space coordinates $(q,p)\in\Gamma$ (with $\Gamma$ the phase space), and $\theta$ are the macrovariables of the system. The function $p(q,p;\theta)$ represents the probability density of the microstate $(q,p)$ of the system, corresponding to the macrostate given by the macroscopic parameters $\theta$, when the system is in thermal equilibrium with a heat bath at a fixed temperature $T$. In this way, one can see that there is a one-to-one correspondence between statistical models and statistical ensembles: each pair of microvariables $(q,p)$ corresponds to a microstate of the ensemble, and each value of the macrovariable $\theta$ is associated with a macrostate of the system.
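
As a minimal numerical sketch of this normalization, one can take a single one-dimensional harmonic oscillator, $E=\frac{p^{2}}{2m}+\frac{1}{2}m\omega^{2}q^{2}$, and compare the numerically integrated partition function with the closed form $Z=2\pi k_{B}T/\omega$ (no Planck-constant factor is included). The oscillator, the parameter values and the function names below are illustrative assumptions and do not come from the text.

```python
import math

def midpoint_integral(f, lo, hi, steps):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def partition_function(mass, omega, kT, steps=4000):
    """Z = integral of exp(-E/kT) dq dp for E = p^2/(2m) + m w^2 q^2 / 2.

    The energy is separable, so Z is the product of two 1D integrals.
    """
    sq = math.sqrt(kT / (mass * omega ** 2))  # thermal spread of q
    sp = math.sqrt(mass * kT)                 # thermal spread of p
    zq = midpoint_integral(
        lambda q: math.exp(-0.5 * mass * omega ** 2 * q * q / kT),
        -10.0 * sq, 10.0 * sq, steps)
    zp = midpoint_integral(
        lambda p: math.exp(-p * p / (2.0 * mass * kT)),
        -10.0 * sp, 10.0 * sp, steps)
    return zq * zp

def canonical_density(q, p, mass, omega, kT, Z):
    """Canonical probability density exp(-E/kT) / Z at the microstate (q, p)."""
    energy = p * p / (2.0 * mass) + 0.5 * mass * omega ** 2 * q * q
    return math.exp(-energy / kT) / Z

# Illustrative parameter values (arbitrary units, k_B absorbed into kT).
mass, omega, kT = 1.5, 2.0, 0.7
Z_numeric = partition_function(mass, omega, kT)
Z_closed_form = 2.0 * math.pi * kT / omega
density_at_origin = canonical_density(0.0, 0.0, mass, omega, kT, Z_numeric)
```

Here the macrovariables $\theta$ would be whatever subset of $(m,\omega)$ is allowed to vary; the sketch keeps them fixed and only checks the normalization.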

Assuming a $2n$-dimensional phase space $\Gamma$ and an $m$-dimensional macrospace $\Theta$, from (32) and using (3) one obtains the Fisher tensor for the canonical ensemble

[TABLE]

where $q$ and $p$ are a short notation for $(q_{1},\ldots,q_{n})$ and $(p_{1},\ldots,p_{n})$, and the subscript $CE$ stands for the canonical ensemble. The formula (6.1) is the expression of the canonical ensemble in the context of information geometry. All the dynamics over the macrospace can be derived by means of Eqs. (4)–(8), thus obtaining a geometrical characterization of the canonical ensemble.

6.2 Dynamics of the canonical ensemble in terms of the IGEH levels

The particular dependence of the energy $E(q,p;\theta)$ on the microvariables $(q,p)$ determines the correlations between them, which are reflected in the probability distribution $p(q,p;\theta)$. Since the IG correlation $C(f_{1},\ldots,f_{N},\tau)$ measures the degree of statistical independence between the microvariables of the microspace at the time-like parameter $\tau$, for a given expression of the energy it is desirable to know the form of $C(f_{1},\ldots,f_{N},\tau)$. We focus on a particular form of the energy, namely when $E(q,p;\theta)$ is a quadratic function of $(q,p)$. The physical relevance of this assumption lies, among other things, in the fact that it allows one to study the dynamics of the system near an equilibrium point. In this case, the formula of the energy is given by

[TABLE]

with G the Hessian of $E$ evaluated at a point $(q_{0},p_{0})$ and $(q-q_{0},p-p_{0})^{T}$ the transpose of $(q-q_{0},p-p_{0})$. Then, the probability distribution $p(q,p;\theta)$ adopts the form

[TABLE]

with $Z(\theta)^{-1}$ the normalization factor. From (35) one can see that $p(q,p;\theta)$ is nothing but a $2n$ –multivariate Gaussian distribution on the microvariables $q_{1},\ldots,q_{n},p_{1},\ldots,p_{n}$ . At this point it is convenient to recall the expression of the $2n$ –multivariate Gaussian distribution $p(x;\mu,\boldsymbol{\Sigma})$

[TABLE]

with $x=(x_{1},\ldots,x_{2n})$ the vector of microvariables, $\mu=(\mu_{1},\ldots,\mu_{2n})$ the mean value vector, $\boldsymbol{\Sigma}$ the covariance matrix and $\boldsymbol{\Sigma}^{-1}$ its inverse, and $|\boldsymbol{\Sigma}|$ the determinant of $\boldsymbol{\Sigma}$. Moreover, the marginals $p_{i}(x_{i};\mu_{i},\boldsymbol{\Sigma}_{ii})$ of $p(x;\mu,\boldsymbol{\Sigma})$ are given by

[TABLE]

where $\boldsymbol{\Sigma}_{ij}$ stands for the $ij$ –th matrix element of $\boldsymbol{\Sigma}$ with $i,j=1,\ldots,2n$ .
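
The marginal property can be checked numerically in the smallest case. The sketch below is restricted to two variables and assumes the standard result that the marginal of a multivariate Gaussian in $x_{i}$ is a one-dimensional Gaussian with mean $\mu_{i}$ and variance $\boldsymbol{\Sigma}_{ii}$; the mean vector, the covariance matrix and the function names are illustrative choices, not values from the text.

```python
import math

def bivariate_gaussian(x, y, mu, cov):
    """Density of a 2D Gaussian with mean mu and 2x2 covariance cov."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x - mu[0], y - mu[1]
    # Quadratic form (v - mu)^T cov^{-1} (v - mu), using the explicit 2x2 inverse.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def gaussian_1d(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def marginal_in_x(x, mu, cov, steps=4000, width=12.0):
    """Integrate the joint density over y with the midpoint rule."""
    sy = math.sqrt(cov[1][1])
    lo, hi = mu[1] - width * sy, mu[1] + width * sy
    h = (hi - lo) / steps
    return h * sum(bivariate_gaussian(x, lo + (k + 0.5) * h, mu, cov)
                   for k in range(steps))

# Illustrative mean vector and covariance matrix (assumed values).
mu = (0.3, -1.0)
cov = ((2.0, 0.9), (0.9, 1.0))
x0 = 1.1
numeric_marginal = marginal_in_x(x0, mu, cov)
closed_form_marginal = gaussian_1d(x0, mu[0], cov[0][0])
```

The off-diagonal element of the covariance matrix drops out of the marginal, which is why the IG correlation, built from the difference between the joint density and the product of marginals, is sensitive to it.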

Let us consider $2n$ arbitrary functions $f_{1}(x_{1}),\ldots,f_{2n}(x_{2n})\in\mathbb{L}^{1}(\mathbb{R})$. Then, replacing (36) and (37) in (4) one has that

[TABLE]

is the IG correlation for the family of $2n$ –multivariate Gaussian distributions. As in the $2D$ correlated model, one can obtain an upper bound for $C(f_{1},\ldots,f_{2n},\tau)$ as follows.

[TABLE]

This inequality expresses an upper bound of the IG correlation for the family of the $2n$ –multivariate Gaussian distributions, where the maximum is a measure of distinguishability, and the parameters $\mu=\mu(\tau)$ , $\boldsymbol{\Sigma}=\boldsymbol{\Sigma}(\tau)$ are dependent on $\tau$ along the geodesics by means of the application of Eqs. (3)–(5) to $p(x;\mu,\boldsymbol{\Sigma})$ .

Now we can set the upper bound of (6.2) in the language of the canonical ensemble. By simple inspection of Eqs. (6.2)–(37), if one makes the following replacements

[TABLE]

in (6.2) then one obtains

[TABLE]

The inequality (6.2) is an upper bound of the IG correlation for a system in the canonical ensemble whose energy is a quadratic form, expanded around a point $(q_{0},p_{0})$ of the phase space. Since the macrovariables $(q_{0},p_{0})$ and G are functions of the time-like parameter $\tau$, the utility of (6.2) is that, according to the way the IG correlation cancels for large values of $\tau$, one can classify the statistical model as belonging to one of the levels of the IGEH. In this way, one can study phase transitions in terms of the IGEH levels as the macrovariables vary. For instance, if the dynamics in macrospace is such that the maximum in (6.2) tends to zero when $\tau\rightarrow\infty$, then the canonical ensemble behaves as an IG mixing statistical model.

It should be noted that the multivariate Gaussian distribution only exists if the covariance matrix $\boldsymbol{\Sigma}$ is positive definite, which implies that $\boldsymbol{\Sigma}^{-1}$ must also be positive definite. Then, it follows that $\frac{\partial^{2}E}{\partial x_{i}^{2}}(q_{0},p_{0})>0$. In particular, the stable equilibrium points of the system satisfy this condition.

Now we are in a position to state one of the main contributions of this work. By Eq. (6.2) and using the definition of the tensor G, one finally obtains

[TABLE]

This equation expresses an intimate relationship between the canonical ensemble and the statistical models, which one can express in words as follows:

“given a quadratic form of the energy, the determinant of the covariance matrix, which measures the correlations between the microvariables, is proportional to a power (equal to the dimension of the phase space) of the temperature of the heat bath”.

Moreover, since the determinant of the covariance matrix is a decreasing function of the correlations, the correlations tend to be suppressed as the temperature of the heat bath increases, as expected statistically. This result will be useful for characterizing phase transitions, as we shall see below.
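
The power law can be illustrated with a small numerical sketch. It assumes that matching the quadratic canonical density with the Gaussian form identifies $\boldsymbol{\Sigma}^{-1}$ with $\beta$ times the Hessian, so that $\boldsymbol{\Sigma}=k_{B}T\,\mathrm{G}^{-1}$ and $|\boldsymbol{\Sigma}|=(k_{B}T)^{d}/|\mathrm{G}|$ with $d$ the dimension of the space of microvariables. The $2\times 2$ Hessian, the value of $k_{B}T$ and the helper names are illustrative assumptions.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix given as nested tuples."""
    d = det2(m)
    return ((m[1][1] / d, -m[0][1] / d),
            (-m[1][0] / d, m[0][0] / d))

def covariance_from_hessian(hessian, kT):
    """Covariance of the Gaussian canonical density: Sigma = kT * G^{-1}.

    Assumes the energy is E = (1/2) (x - x0)^T G (x - x0) with G positive definite.
    """
    g_inv = inv2(hessian)
    return tuple(tuple(kT * entry for entry in row) for row in g_inv)

# Illustrative positive-definite Hessian and temperature (assumed values).
G = ((3.0, 0.5), (0.5, 2.0))
kT = 0.8
dimension = 2

det_sigma = det2(covariance_from_hessian(G, kT))
power_law_prediction = kT ** dimension / det2(G)

# Doubling the temperature should multiply the determinant by 2**dimension.
det_sigma_doubled = det2(covariance_from_hessian(G, 2.0 * kT))
scaling_ratio = det_sigma_doubled / det_sigma
```

With two microvariables the determinant scales as $T^{2}$; for a full $2n$-dimensional phase space the same argument gives the exponent $2n$.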

7 Models and results

In order to illustrate the relevance of the IGEH we consider two examples belonging to different topics: an interacting bipartite system presenting phase transitions in the canonical ensemble, and the $2\times 2$ Gaussian orthogonal ensemble. Next, we give a panoramic outlook of the IGEH and of the relationship between statistical models and Liouville densities of the canonical ensemble.

7.1 Phase transitions in a pair of interacting harmonic oscillators

Let us consider a pair of one-dimensional interacting harmonic oscillators in the canonical ensemble whose total energy is given by

[TABLE]

where $q_{i}$, $p_{i}$, $q_{i0}$, $m_{i}$, and $\omega_{i}$ are the position, the momentum, the equilibrium position, the mass, and the frequency of the $i$-th particle with $i=1,2$. Here $T(p_{1},p_{2};m_{1},m_{2})$ is the kinetic energy and $V(q_{10},q_{20},m_{1},m_{2},\omega_{1},\omega_{2},r)$ is the potential energy, which is composed of three terms: the first two are the potential energies of each oscillator separately, while the term $-r\sqrt{m_{1}m_{2}}\omega_{1}\omega_{2}(q_{1}-q_{10})(q_{2}-q_{20})$ represents the interaction between the oscillators, and the coefficient $r\in[-1,1]$ measures the coupling strength. For instance, when $r=0$ the oscillators are uncoupled, and therefore their motions are independent of each other.

Now, in order to use the analysis of the $2D$ correlated model one must reduce the number of microvariables and macrovariables, and also impose some constraints. To this end, we fix the masses $m_{1},m_{2}$ and set $q_{20}=0$. Also, we consider the following constraint

[TABLE]

where $\Sigma$ is a real constant, $k_{B}$ is the Boltzmann constant, and $T_{0}$ is a reference temperature (for instance, the room temperature, $\sim 20^{\circ}$ C or 293.15 K). Here $T_{0}$ plays the role of a temperature that breaks up the correlations between the microvariables.

Since the correlations involve only the position coordinates $q_{1},q_{2}$, one can reasonably neglect the momentum coordinates $p_{1},p_{2}$ by integrating the probability distribution $p(q_{1},q_{2},p_{1},p_{2};\theta)$ given by the canonical ensemble, i.e. $\frac{1}{Z(\theta)}\exp\{-\beta E(q_{1},q_{2},p_{1},p_{2};\theta)\}$, over $p_{1}$ and $p_{2}$. And since only the kinetic energy depends on $p_{1}$ and $p_{2}$, this is equivalent to considering a sort of marginal probability distribution with respect to the potential energy, i.e.

[TABLE]

With the help of (43) one can express the potential energy as

[TABLE]

Then, from (44) and (45) one has

[TABLE]

which is nothing but the probability distribution of the $2D$ correlated model (11) by means of the identifications

[TABLE]

From the last line of (7.1) one obtains

[TABLE]

Let us show that this equation is a particular case of the formula (42). Since the determinant of the covariance matrix is $\Sigma^{4}(1-r^{2})$ and the determinant of the Hessian of the potential energy evaluated at $(q_{10},q_{20})$ is $(1-r^{2})m_{1}m_{2}\omega_{1}^{2}\omega_{2}^{2}$, one can replace both expressions in (42) with $n=1$, thus obtaining

[TABLE]

With the help of (43) one can recast this equation as

[TABLE]

from which one obtains $1-r^{2}=\frac{T}{T_{0}}$ .
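
The relation $1-r^{2}=\frac{T}{T_{0}}$ can be cross-checked numerically from the Hessian of the potential energy. The sketch below assumes that the position covariance is $k_{B}T$ times the inverse Hessian, and it assumes that the constraint relating $\Sigma$ and $T_{0}$ has the form $\Sigma^{2}\sqrt{m_{1}m_{2}}\,\omega_{1}\omega_{2}=k_{B}T_{0}$, with $\Sigma^{2}$ the product of the two standard deviations. That form is a guess that makes the stated determinants consistent; it is not quoted from the text, and all numerical values are illustrative.

```python
import math

def position_covariance(m1, m2, w1, w2, r, kT):
    """Covariance of (q1, q2): kT times the inverse Hessian of the potential.

    Hessian entries: a = m1 w1^2, b = m2 w2^2, c = -r sqrt(m1 m2) w1 w2.
    """
    a = m1 * w1 ** 2
    b = m2 * w2 ** 2
    c = -r * math.sqrt(m1 * m2) * w1 * w2
    det = a * b - c * c  # equals (1 - r^2) m1 m2 w1^2 w2^2
    return ((kT * b / det, -kT * c / det),
            (-kT * c / det, kT * a / det))

def correlation_coefficient(cov):
    """Pearson correlation coefficient of a 2x2 covariance matrix."""
    return cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])

# Illustrative parameters (arbitrary units, k_B absorbed into kT).
m1, m2, w1, w2, r, kT = 1.0, 2.5, 1.3, 0.7, 0.6, 0.4

cov = position_covariance(m1, m2, w1, w2, r, kT)
r_from_covariance = correlation_coefficient(cov)

# Product of the standard deviations, playing the role of Sigma^2.
sigma_squared = math.sqrt(cov[0][0] * cov[1][1])

# ASSUMED form of the constraint: Sigma^2 sqrt(m1 m2) w1 w2 = k_B T0.
kT0 = sigma_squared * math.sqrt(m1 * m2) * w1 * w2

temperature_ratio = kT / kT0
one_minus_r_squared = 1.0 - r ** 2
```

Under these assumptions the coupling coefficient $r$ of the potential coincides with the correlation coefficient of the positions, and the ratio $T/T_{0}$ reproduces $1-r^{2}$.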

Considering the temperature $T$ as an external parameter one can study the phase transitions of the system as $T$ varies. Also, we set the reference temperature $T_{0}$ as the room temperature. Given that the parameter $\tau$ is arbitrary, we choose $\tau$ as

[TABLE]

This choice for $\tau$ is convenient since one has

[TABLE]

In this way, the asymptotic limit $\tau\rightarrow\infty$ is identified with the limit $T\rightarrow T_{0}$, and therefore the transition towards high temperatures can be studied by means of the limit $\tau\rightarrow\infty$. This transition expresses the behavior of the correlations between the oscillators when the bath temperature passes from a finite value (which is lower than $T_{0}$) to the room temperature. Physically, at the room temperature it is expected that if the energy $k_{B}T_{0}$ delivered by the bath to each oscillator is larger than the energies of the oscillators (i.e., $k_{B}T_{0}\gg m_{1,2}\omega_{1,2}^{2}$), then the correlations between the oscillators tend to be canceled as a result of the thermal agitation. Indeed, from (48) one can see that $r\rightarrow 0$ when $T\rightarrow T_{0}$.

Now let us see that this phase transition is characterized in terms of the mixing level of the IGEH. When $r$ is vanishingly small, the following approximations hold:

[TABLE]

Using these approximations, and neglecting terms of order $r^{2}$ , in the formula (31) of $F(p)$ one obtains that $F(p)\lesssim|r|$ holds for $|r|\ll 1$ . Replacing this inequality in (30) one has that

[TABLE]

In turn, since $r\rightarrow 0$ when $\tau\rightarrow\infty$ this equation implies

[TABLE]

According to the definition (27), the system is then IG mixing. This is the regime of null correlations corresponding to the region of the curve of $F(p)$ around $r=0$, as can be seen in Fig. 1. When $r=0$ the probability distribution (7.1) can be factorized as the product of its marginals, and therefore from Eq. (28) one has that the system is IG Bernoulli.

Moreover, since the scalar curvature for the $2D$ correlated model is $R=-\frac{1}{2}(1-r^{2})$ then one obtains

[TABLE]

The formula (51) expresses the connection between the thermodynamics of the canonical ensemble and the information geometry of the system of coupled oscillators. It can be seen that the statistical model behaves as an “intermediary” between the thermodynamic parameters and the geometrical quantities of the statistical manifold. As a consequence, the determinant of the covariance matrix of the statistical model is determined by the Boltzmann factor, thus linking the temperature with all the geometrical quantities like the metric tensor, the scalar curvature, etc.

From Eq. (51) it follows that the scalar curvature decreases as the bath temperature increases, reaching a minimum value $R=-\frac{1}{2}$ at the room temperature, where the correlation coefficient is zero. On the other hand, when the temperature tends to zero the scalar curvature takes a maximum value $R=0$, which corresponds to the maximally correlated case $r\rightarrow\pm 1$. This reflects the intuitive picture that in the absence of thermal agitation the correlations remain present.
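
The temperature dependence described here can be tabulated with a short sketch. It combines the two relations stated in the text, $R=-\frac{1}{2}(1-r^{2})$ and $1-r^{2}=\frac{T}{T_{0}}$, which together give $R=-\frac{T}{2T_{0}}$ for $0\leq T\leq T_{0}$. The function names and the sampled temperatures are illustrative choices.

```python
import math

def correlation_magnitude(T, T0):
    """|r| from 1 - r^2 = T / T0, valid for 0 <= T <= T0."""
    if not 0.0 <= T <= T0:
        raise ValueError("the relation 1 - r^2 = T/T0 requires 0 <= T <= T0")
    return math.sqrt(1.0 - T / T0)

def scalar_curvature(T, T0):
    """Scalar curvature of the 2D correlated model, R = -(1/2)(1 - r^2)."""
    r = correlation_magnitude(T, T0)
    return -0.5 * (1.0 - r * r)

# Reference temperature taken as room temperature in kelvin, as in the text.
T0 = 293.15
temperatures = [0.0, 0.25 * T0, 0.5 * T0, 0.75 * T0, T0]

# Rows of (temperature, |r|, R), from the maximally correlated limit to r = 0.
curvature_table = [(T, correlation_magnitude(T, T0), scalar_curvature(T, T0))
                   for T in temperatures]
curvatures = [row[2] for row in curvature_table]
```

The rows run from $R=0$ at $T=0$ down to $R=-\frac{1}{2}$ at $T=T_{0}$, matching the two limits discussed above.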

Finally, it should be noted that this analysis is consistent with Cafaro’s criterion of global chaos: as the temperature grows the dynamics becomes more chaotic and the scalar curvature becomes more negative. From the point of view of the IGEH this is characterized in terms of the mixing level, in which the breaking up of the coupling between the oscillators at the room temperature is expressed by means of the cancellation of the IG correlation in the asymptotic limit.

7.2 $2\times 2$ Gaussian Orthogonal Ensembles (GOE)

In Gaussian Orthogonal Ensembles theory one deals with the probability distribution $p(H_{11},H_{12},\ldots,H_{nn})$ for the Hamiltonian matrix elements assuming that the $H_{ij}$ are uncorrelated [21, 22]. Then in the framework of information geometry one could try to describe them by defining a microspace $x_{1},x_{2},\ldots,x_{n}$ and a macrospace $\theta_{1},\theta_{2},\ldots,\theta_{m}$ in a suitable way.

In order to characterize the GOE within a statistical model we study a correlated ensemble of $2\times 2$ matrices. We take the microspace as the Hamiltonian matrix elements $\{H_{11},H_{22},H_{12},H_{21}\}$ and define the macrospace as follows. For the sake of simplicity, we choose the macrospace in such a way that only $H_{11}$ and $H_{22}$ are correlated, and that the mean values of all variables are zero, except for the mean value of $H_{11}$, which is equal to $\mu$. Also, we consider that the variances of $H_{11}$, $H_{12}$ and $H_{21}$ are the same, denoted by $\sigma$. Moreover, in order to study how independent the diagonal Hamiltonian elements are, we restrict the dynamics by considering that $r\in[-1,1]$ is the correlation coefficient between $H_{11}$ and $H_{22}$, and that the product of the covariances between $H_{11}$ and $H_{22}$ is a constant $\Sigma^{2}$. Taking this into account, the resulting macrospace is $\{(\mu,\sigma)\in\mathbb{R}\times(0,\infty)\}$. Note that, since the GOE corresponds to the orthogonal class of Hamiltonians, one has that $H_{12}=H_{21}$. However, in the formalism of random matrices and for the orthogonal case, the volume element $dH_{11}dH_{22}dH_{12}dH_{21}$ (as if $H_{12}$ and $H_{21}$ were independent variables) is the real Lebesgue measure of $\mathbb{R}^{4}$ and must be taken into account in order to normalize the probability distribution [21]. The correlated probability distribution is given by

[TABLE]

where the correlation coefficient $r$ is considered as an external parameter and, since the correlation between $H_{11}$ and $H_{22}$ is expressed in terms of $r$, $\Sigma$ can be taken as a fixed constant. In turn, given that $H_{11}$ and $H_{22}$ are the only microvariables correlated with each other and since the transitions of the dynamics depend fundamentally on the correlations, one can reasonably neglect $H_{12}$ and $H_{21}$. Analogously to what was done for the pair of oscillators, one can integrate the correlated probability distribution over $H_{12}$ and $H_{21}$, thus obtaining

[TABLE]

which is nothing but the $2D$ correlated model, i.e. (11) and (52) are identical upon renaming $H_{11}$ and $H_{22}$ as $x$ and $y$ respectively. Then, the non-vanishing components of the Ricci tensor $R_{ij}$ and the Ricci scalar curvature $R$ are given by Eqs. (16) and (17)

[TABLE]

Three remarks follow. First, the statistical manifold has a curvature which is negative for all values of the correlation coefficient $r\in(-1,1)$. Based on Cafaro’s criterion above, this simply means that the dynamics in the macrospace $(\mu,\sigma)$ is chaotic for all such $r$.

Second, the $2\times 2$ GOE case corresponds to $r=0$ and $\Sigma=\sigma$ , thus having the minimum value of the scalar curvature

[TABLE]

In this case the correlated probability distribution is the product of its marginals and thus the model is IG Bernoulli. Therefore, one can see that the GOE corresponds to the strongest level of the IGEH, and this can be considered as a characterization of the Gaussian ensembles from an information geometric point of view.

Third, for the strongly correlated case that corresponds to $|r|\sim 1$ one has

[TABLE]

which can be interpreted, by Cafaro’s criterion of global chaos, as the case in which the dynamics is the least chaotic of all.

7.3 A panoramic outlook of the IGEH and of the canonical ensemble in curved statistical models

Here we summarize the aspects of our proposal that can provide innovative tools within the context of the information geometry for characterizing the dynamics of a system in terms of the macrospace of the chosen statistical model. For this, below we provide two schematic diagrams, Figs. 2 and 3, showing the content of the two main contributions of this work and its physical relevance from a panoramic outlook.

8 Conclusions

We have proposed an extension of the ergodic, mixing and Bernoulli levels in the context of information geometry, which we called the information geometric ergodic hierarchy (IGEH), and we applied it to characterize: i) the phase transitions of a pair of interacting harmonic oscillators in the canonical ensemble and ii) the $2\times 2$ Gaussian Orthogonal Ensembles. The relevance and novelty of our main contributions, i.e. the IGEH and the information geometric characterization of the Liouville densities of the canonical ensemble expressed by Eqs. (4)–(28) and (6.1)–(42) respectively, lie in the following remarks:

1. Statistical models provide a unified scenario for approaches involving correlations between microscopic variables. This was illustrated with a correlated $2\times 2$ GOE by adding correlations between two variables and showing that this modification attenuates the chaotic dynamics on macrospace by increasing the scalar curvature, in accordance with Cafaro’s criterion of global chaos.

2. The IGEH generalizes the chaos characterization of the ergodic hierarchy by quantifying the statistical independence between the microvariables (instead of subsets of phase space) of the statistical model. This is performed in the asymptotic limit of large values of the time-like parameter, which is expressed in terms of upper bounds of the IG correlation, such as the measure $F(p)$ for the case of the $2D$ correlated model.

3. The association between multivariate Gaussian distributions and quadratic Hamiltonians can be useful for studying, in the context of information geometry, the type of stability that the dynamics presents at its equilibrium points.

4. Geometrical notions and Cafaro’s criterion of global chaos can be related to the levels of the IGEH. The $2\times 2$ GOE case, belonging to the most chaotic level, the IG Bernoulli, has an associated minimum negative value of the scalar curvature $R_{GOE}=-\frac{1}{2}$.

5. By obtaining upper bounds $F(p)$ on the IG correlation for a specific family of probability distributions, as exemplified by the curve of Fig. 1, one could study geometrical phase transitions moving along curves $F(p)$ as an external parameter $r$ is varied.

Acknowledgments

This work was partially supported by CONICET and Universidad Nacional de La Plata, Argentina.

References

[1] F. Weinhold, J. Chem. Phys. 63, 2479, 2488 (1975); 65, 559 (1976).
[2] G. Ruppeiner, Phys. Rev. A 20, 1608 (1979); Rev. Mod. Phys. 67, 605 (1995); Erratum: 68, 313 (1996).
[3] C. R. Rao, Differential Geometry in Statistical Inference. In chap. Differential metrics in probability spaces; Institute of Mathematical Statistics, Hayward, CA, 1987.
[4] C. R. Rao, Bull. Calcutta Math. Soc. 37, 81 (1945).
[5] S. Amari, H. Nagaoka. Methods of Information Geometry; Oxford University Press: Oxford, UK, 2000.
[6] R. S. Ingarden, Tensor N.S. 30, 201 (1976).
[7] H. Janyszek, Rep. Math. Phys. 24, 1, 11 (1986).
[8] S. Abe, Phys. Rev. E 68, 031101 (2003).
[9] J. Naudts, Open Sys. and Information Dyn. 12, 13 (2005).
[10] M. Portesi, A. Plastino, F. Pennini, Phys. A 365, 173–176 (2006); Phys. A 373, 273–282 (2007).
[11] C. Tsallis, J. Stat. Phys. 52, 479 (1988).
[12] D. Bures, Trans. Am. Math. Soc. 135, 199 (1969).
[13] G. Verdoolaege, AIP Conf. Proc. 1641, 564–571 (2014); Rev. Sci. Instrum. 85, 11E810 (2014).
[14] C. Cafaro, Chaos Solitons & Fractals 41, 886–891 (2009).
[15] C. Cafaro, S. Mancini, Phys. D 240, 607–618 (2011).
[16] C. Cafaro, A. Giffin, C. Lupo, S. Mancini, Open Syst. Inf. Dyn. 19, 1250001 (2012).
[17] A. Giffin, S. A. Ali, C. Cafaro, Entropy 15, 4622-4623 (2013).
[18] A. J. Lichtenberg, M. A. Lieberman. Regular and Chaotic Dynamics (Applied Mathematical Sciences), Springer, Berlin, (2010).
[19] J. Berkovitz, R. Frigg, F. Kronz, Stud. Hist. Phil. Mod. Phys. 37, 661–691 (2006).
[20] A. Lasota, M. Mackey. Probabilistic properties of deterministic systems; Cambridge Univ. Press, Cambridge, 1985.
[21] H. Stockmann. Quantum Chaos - An Introduction; Cambridge Univ. Press, Cambridge, 1999.
[22] F. Haake. Quantum Signatures of Chaos, Springer-Verlag, Heidelberg, 2001.
[23] G. Casati, B. Chirikov. Quantum Chaos: between order and disorder, Cambridge Univ. Press, Cambridge, 1995.
[24] G. Casati, T. Prosen, Phys. Lett. A 72, 032111 (2005).
[25] O. Bohigas, M. Giannoni, C. Schmit, Phys. Rev. Lett. 52, 1 (1984).
[26] I. Gomez, M. Castagnino, Phys. A 393, 112–131 (2014).
[27] I. Gomez, M. Castagnino, Chaos, Solitons & Fractals 68, 98–113 (2014).
[28] I. Gomez, M. Castagnino, Chaos, Solitons & Fractals 70, 99–116 (2015).
[29] I. Gomez, M. Losada, S. Fortin, M. Castagnino, M. Portesi, Int. Journ. Theor. Phys. 7, 2192–2203 (2015).
[30] I. S. Gomez, M. Portesi, arXiv:1503.02751v3 [quant-ph] (2017).
[31] O. E. Barndorff–Nielsen. Information and Exponential Families in Statistical Theory, J. Wiley and Sons, New York, 1978.
[32] J. Naudts, J. Ineq. Pure Appl. Math. 5, 102 (2004).
[33] P. D. Grünwald, A. P. Dawid, Ann. Stat. 32 1367–1433 (2004).
[34] A. Ohara, T. Wada, J. Phys. A 43 (2010).
[35] J. Naudts, Entropy 10, 131–149 (2008).
[36] S. Amari, A. Ohara, H. Matsuzoe, Phys. A 391, 4308–4319 (2012).
[37] M. Molitor, J. Geom. Phys. 70, 54–80 (2013).
[38] H. Matsuzoe, Diff. Geom. App. 35, 323–333 (2014).
[39] F. Zang, Y. Shi, R. Wang, Phys. A 468, 552–565 (2017).

Appendix A Proof of the distinguishability measure $F(p)$ (formula (31))

Proof.

Replacing (11) in the definition of $F(p)$ one obtains

[TABLE]

Now, by defining the dimensionless variables $\widetilde{x}=\frac{x-\mu_{x}}{\sigma}$ and $\widetilde{y}=y\frac{\sigma}{\Sigma^{2}}$ one can recast $F(p)$ as

[TABLE]

Therefore, in order to calculate $F(p)$, it is enough to study the maximum and minimum on $\mathbb{R}^{2}$ of the function $G_{r}(x,y)=\frac{1}{\sqrt{1-r^{2}}}\exp\left(\frac{-1}{2(1-r^{2})}\left(x^{2}+y^{2}-2rxy\right)\right)-\exp\left(-\frac{x^{2}+y^{2}}{2}\right)$ for each value of the parameter $r\in[-1,1]$. For this, it is convenient to write $G_{r}(x,y)=g_{1}(x,y)-g_{2}(x,y)$ with $g_{1}(x,y)=\frac{1}{\sqrt{1-r^{2}}}\exp\left(\frac{-1}{2(1-r^{2})}\left(x^{2}+y^{2}-2rxy\right)\right)$ and $g_{2}(x,y)=\exp\left(-\frac{x^{2}+y^{2}}{2}\right)$. Then, one must find the critical points of $G_{r}(x,y)$ from the equations

[TABLE]

From (54) it follows that $y\frac{\partial G_{r}(x,y)}{\partial x}-x\frac{\partial G_{r}(x,y)}{\partial y}=0$ , thus

[TABLE]

Since $g_{1}(x,y)$ is an exponential, it is always positive. Then one obtains

[TABLE]

If $r=0$ then it is clear that $p$ is equal to the product of its marginals, and it follows that the infinity norm is zero. Assuming $r\neq 0$ and replacing $y=\pm x$ in the formula of $G_{r}(x,y)$, one obtains a function that only depends on $x$, which we denote as $\varphi_{r}(x)$. Then one has that

[TABLE]

Thus, the value of $F(p)$ is given by the maximum value of $|\varphi_{r}(x)|$ with $x\in\mathbb{R}$ . For this, we calculate the critical points of $\varphi_{r}(x)$ by making its derivative equal to zero, thus obtaining

[TABLE]

So the critical points of $\varphi_{r}(x)$ satisfy

[TABLE]

Replacing (56) in (55) one has the value of $\varphi_{r}(x)$ in its critical points $x_{c}$

[TABLE]

In order to find $\exp\left(-x_{c}^{2}\right)$ explicitly one must solve the second relation of Eq. (56). Taking the natural logarithm of both sides, one has

[TABLE]

That is,

[TABLE]

Then, from (58) and (57) one obtains the following values of $\varphi_{r}(x_{c})$ in its critical points

[TABLE]

Now, given that the maximum and minimum values of $G_{r}(x,y)$ are precisely those of $\varphi_{r}(x)$ subject to the restriction $y=\pm x$, and since the maximum value of $|G_{r}(x,y)|$ is equal to $F(p)$, from (59) one deduces that $F(p)$ is

[TABLE]

from which follows the desired result. ∎
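
The key step of the proof, namely that the extrema of $G_{r}(x,y)$ lie on the lines $y=\pm x$, can be checked numerically with a brute-force grid search. The sketch below uses the expression of $G_{r}(x,y)$ given in the proof; the grid half-width, the number of grid points and the function names are illustrative choices, and the normalization prefactor relating $\max|G_{r}|$ to $F(p)$ is not reproduced here.

```python
import math

def G(r, x, y):
    """G_r(x, y) = g1(x, y) - g2(x, y) as defined in the proof, for |r| < 1."""
    s = 1.0 - r * r
    g1 = math.exp(-(x * x + y * y - 2.0 * r * x * y) / (2.0 * s)) / math.sqrt(s)
    g2 = math.exp(-(x * x + y * y) / 2.0)
    return g1 - g2

def grid_points(half_width, n):
    """n + 1 equally spaced points covering [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return [-half_width + i * h for i in range(n + 1)]

def sup_abs_on_grid(r, half_width=5.0, n=200):
    """Largest |G_r| over a square grid in the (x, y) plane."""
    pts = grid_points(half_width, n)
    return max(abs(G(r, x, y)) for x in pts for y in pts)

def sup_abs_on_diagonals(r, half_width=5.0, n=200):
    """Largest |G_r| restricted to the two lines y = x and y = -x."""
    pts = grid_points(half_width, n)
    return max(max(abs(G(r, x, x)), abs(G(r, x, -x))) for x in pts)

r = 0.5
sup_plane = sup_abs_on_grid(r)
sup_diagonals = sup_abs_on_diagonals(r)

# The sup norm cannot tell r from -r, and it vanishes at r = 0.
sup_plane_minus_r = sup_abs_on_grid(-r)
sup_uncorrelated = sup_abs_on_grid(0.0)
```

Up to the grid resolution, the search over the whole plane and the search restricted to the diagonals return the same value, and the result is unchanged under $r\rightarrow -r$, in line with the symmetry of $F(p)$ about $r=0$.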
