One-dimensional exponential families with constant Hessian scalar curvature
Mathieu Molitor

TL;DR
This paper classifies one-dimensional exponential families with finite sample spaces that have constant Hessian scalar curvature, revealing a specific relationship between the curvature and integer parameters, and highlighting the binomial distribution's role.
Contribution
It provides a complete classification of 1D exponential families with constant Hessian scalar curvature, identifying the curvature values and the significance of the binomial distribution.
Findings
Hessian scalar curvature is quantized as 2/k for positive integers k
Binomial distribution plays a central role in the classification
Explicit characterization of exponential families with constant curvature
Abstract
We give a complete classification of 1-dimensional exponential families $\mathcal{E}$ defined over a finite space $\Omega$ whose Hessian scalar curvature is constant. We observe an interesting phenomenon: if $\mathcal{E}$ has constant Hessian scalar curvature, say $\lambda$, then $\lambda=\tfrac{2}{k}$ for some positive integer $k$. We also discuss the central role played by the binomial distribution in this classification.
1 Introduction
Let $\Omega=\{x_0,\dots,x_m\}$ be a finite set endowed with the counting measure and let $\mathcal{E}$ be a 1-dimensional exponential family defined over $\Omega$, with elements of the form $p(x;\theta)=\exp\{C(x)+\theta F(x)-\psi(\theta)\}$, where $C,F:\Omega\to\mathbb{R}$ are functions, and $\theta\in\mathbb{R}$. We denote by $h_F$, $\nabla^{(e)}$ and $\nabla^{(m)}$, the Fisher metric, exponential connection and mixture connection, respectively.
As is well known, the dualistic structure $(h_F,\nabla^{(e)},\nabla^{(m)})$ is dually flat [AN00]. Therefore, the tangent bundle $T\mathcal{E}$ is naturally a Kähler manifold of real dimension 2 [Mol13, Shi07]. Let $\textup{Scal}:T\mathcal{E}\to\mathbb{R}$ be the corresponding scalar curvature.
In this paper, we classify all 1-dimensional exponential families, as described above, for which Scal is constant (see Theorem 7.9). Our proof is based on the particularly simple expression for the Ricci tensor in complex coordinates (since $T\mathcal{E}$ is Kähler), which implies that Scal factorizes as $\textup{Scal}=S\circ\pi$, where $\pi:T\mathcal{E}\to\mathcal{E}$ is the canonical projection and $S:\mathcal{E}\to\mathbb{R}$ is a globally well-defined function. Thus, solving the equation $\textup{Scal}\equiv\lambda$ amounts to solving the simpler equation $S\equiv\lambda$, which can be done by solving elementary differential equations in one variable.
An interesting consequence of the above classification is that if $T\mathcal{E}$ has constant scalar curvature, then $\textup{Scal}\equiv\tfrac{2}{k}$ for some positive integer $k$ satisfying $k\leq m$ (see Corollary 7.10). For instance, if $\mathcal{E}=\mathcal{B}(n)$ is the set of binomial distributions defined over $\{0,\dots,n\}$, then $\textup{Scal}\equiv\tfrac{2}{n}$.
The last section of the paper is devoted to analysing the “internal symmetries” of the problem, leading to a somewhat simpler reformulation of the classification discussed above that emphasizes the importance of the binomial distribution. For this purpose, we introduce an equivalence relation on the set of all 1-dimensional exponential families defined over the same set by declaring to be equivalent to if and only if they coincide as spaces of maps . In Proposition 8.4, we show that the set of equivalence classes is in one-to-one correspondence with the affine Grassmannian of 1-dimensional affine subspaces of , that is,
[TABLE]
Then, given , we introduce the reduced exponential family of , denoted by (see Definition 8.6). It is an exponential family defined over a finite set , with elements of the form . Its dualistic structure is isomorphic to that of , but in general , and is always strictly increasing. Then we reformulate the classification given in Theorem 7.9 as follows (see Proposition 8.8). If is a 1-dimensional exponential family defined over , then is constant if and only if , where is the cardinality of .
For the convenience of the reader, the paper gives a rather detailed discussion of the relation between Kähler geometry and statistics. The topics covered include: definition and examples of Kähler manifolds (Section 2), connections and connectors (Section 3), Dombrowski’s construction (Section 4), Ricci curvature (Section 5) and statistical manifolds (Section 6).
In Section 7, we classify all 1-dimensional exponential families with constant scalar curvature on (Theorem 7.9).
In Section 8, we reformulate the classification result obtained in the preceding section by using equivalence classes and reduced exponential families (Proposition 8.8).
Notations. If is a manifold, then will denote the space of vector fields on and the space of smooth real-valued functions on . will denote the tangent bundle of and the tangent bundle of the tangent bundle of (hence if , then ). The derivative of a smooth map between manifolds at a point will be denoted by .
2 Kähler manifolds
General references are [Bal06, Huy05, Mor07].
Definition 2.1**.**
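A complex manifold of complex dimension $n$ is a Hausdorff topological space $M$ together with a family of maps $\phi_{\alpha}:U_{\alpha}\to\mathbb{C}^{n}$, $\alpha\in A$, where each $U_{\alpha}\subseteq M$ is an open set, such that:
- $M=\bigcup_{\alpha\in A}U_{\alpha}$,
- $\phi_{\alpha}(U_{\alpha})$ is an open subset of $\mathbb{C}^{n}$ for all $\alpha\in A$,
- $\phi_{\alpha}:U_{\alpha}\to\phi_{\alpha}(U_{\alpha})$ is a homeomorphism onto its image for all $\alpha\in A$,
- $\phi_{\alpha}\circ\phi_{\beta}^{-1}$ is holomorphic for all $\alpha,\beta\in A$ (provided $U_{\alpha}\cap U_{\beta}\neq\emptyset$).
The family $\mathcal{A}:=\bigl\{(U_{\alpha},\phi_{\alpha})\,\big|\,\alpha\in A\bigr\}$ is called a complex atlas.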
Just as for smooth manifolds, one defines complex charts and complex coordinates on a complex manifold .
In what follows, we will often identify with via the map
[TABLE]
Upon this identification, every complex atlas of determines a smooth atlas of real dimension . Therefore, complex manifolds of complex dimension are naturally smooth manifolds of dimension .
Let be a complex manifold of complex dimension with complex atlas \mathcal{A}=\big{\{}(U_{\alpha},\phi_{\alpha})\,\big{|}\,\alpha\in A\big{\}}. For each and each , define by
[TABLE]
where is the linear map whose matrix representation in the canonical basis is
[TABLE]
It is easy to check that if , then . Thus we will use the notation instead of . Letting vary, we obtain a smooth tensor on the smooth manifold that satisfies .
The tensor is called the complex structure of the complex manifold .
Definition 2.2**.**
Let be a smooth manifold. A smooth tensor satisfying is called an almost complex structure.
Example 2.3**.**
The complex structure of a complex manifold is an almost complex structure.
An almost complex structure on a smooth manifold is said to be integrable if there exists a complex atlas on whose corresponding complex structure coincides with . The Newlander-Nirenberg Theorem asserts that an almost complex structure is integrable if and only if the Nijenhuis tensor, defined by
[TABLE]
vanishes identically for all vector fields on (see [NN57]).
Therefore, a complex manifold can be viewed as a pair , where is a smooth manifold and is an integrable almost complex structure.
Definition 2.4**.**
An almost Hermitian manifold is a triple , where is a smooth manifold, is a Riemannian metric and is an almost complex structure such that for all and all .
If is an almost Hermitian manifold, we define a 2-form on , called the fundamental form, by
[TABLE]
Definition 2.5**.**
A Kähler manifold is an almost Hermitian manifold $(M,g,J)$ satisfying the following analytical conditions:
- (i) the fundamental form $\omega$ is closed, that is, $d\omega=0$,
- (ii) $J$ is integrable.
Example 2.6**.**
The manifold endowed with the Euclidean metric and the almost complex structure is a Kähler manifold, whose fundamental form is
[TABLE]
where are linear coordinates of .
Example 2.7**.**
The unit sphere endowed with the round metric induced by , and complex structure (cross product) is a Kähler manifold with fundamental form given by
[TABLE]
where and .
Example 2.8**.**
The complex projective space is the set of all complex lines in passing through the origin. Let . We define a topology on by declaring to be open if and only if is open in . It can be shown that the family of maps \phi_{i}\,:\,\big{\{}[z_{1},...,z_{n}]\,\big{|}\,z_{i}\neq 0\big{\}}\rightarrow\mathbb{C}^{n-1},\,\,\,[z_{1},...,z_{n}]\mapsto\big{(}\tfrac{z_{1}}{z_{i}},...,\tfrac{z_{i-1}}{z_{i}},\tfrac{z_{i+1}}{z_{i}},...,\tfrac{z_{n}}{z_{i}}\big{)}, , defines a complex atlas on . The restriction of to the unit sphere yields a surjective submersion and hence there are tensors and on characterized by the formulas
[TABLE]
where is the inclusion, and are the real and imaginary parts of the standard Hermitian product on .
It can be shown that is a Kähler manifold, where is the associated complex structure, with fundamental form .
3 Connections and connectors
This section follows closely [Dom62]. Let be a manifold.
Definition 3.1**.**
A linear connection on is a map , , satisfying the following properties:
- (i)
, 2. (ii)
, 3. (iii)
,
for all vector fields and for all functions .
In local coordinates on , if and , then, by standard computations,
[TABLE]
where are the Christoffel symbols, defined by the formula
[TABLE]
Let be the canonical projection and let be a chart for with local coordinates . Define by
[TABLE]
Then is a chart for ; let be the corresponding local coordinates (in particular, for every .
Let u=\sum_{k=1}^{n}u_{k}\tfrac{\partial}{\partial x_{k}}\big{|}_{p}\in\pi^{-1}(U) be arbitrary. Define a linear map by
[TABLE]
Lemma 3.2**.**
Let and be vector fields on . Suppose . Then .
Proof.
By standard computations,
[TABLE]
so
[TABLE]
where we have used . Comparing the above formula with the local expression for in coordinates (see (3.1)), one obtains the desired formula. ∎
Clearly, vectors of the form , with , generate , and so the above lemma implies that the definition of is independent of the choice of the chart .
The map
[TABLE]
defined for by , is called the connector, or connection map, associated to .
The following result is an immediate consequence of the definition of .
Proposition 3.3**.**
Let be the connector associated to a connection on . The following hold.
- (i) For every pair of vector fields on , , where denotes the derivative of in the direction of .
- (ii) For every , the restriction of to is a linear map .
If is such that and , then a simple calculation using local coordinates shows that . Therefore,
Proposition 3.4**.**
Let be the connector associated to a connection on . Given , the map , defined by
[TABLE]
is a linear bijection.
Thus, given a linear connection , we can identify at any point the vector spaces and via the map (3.2).
4 Dombrowski’s construction
Let be a smooth manifold endowed with a connection . We will denote by the canonical projection.
By Proposition 3.4, there is an identification of vector spaces , where . If there is no danger of confusion, we will therefore regard an element of as a pair , where .
Let be a Riemannian metric on . The pair determines an almost Hermitian structure on via the following formulas:
[TABLE]
where .
The tensors are smooth (this will follow from their coordinate representation, see Proposition 4.5 below) and clearly, , and for all such that . Thus, is an almost Hermitian manifold with fundamental form . This is Dombrowski’s construction [Dom62].
We now review the analytical properties of Dombrowski’s construction. We begin with some definitions.
Definition 4.1**.**
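A dualistic structure on a manifold $M$ is a triple $(h,\nabla,\nabla^{*})$, where $h$ is a Riemannian metric and where $\nabla$ and $\nabla^{*}$ are linear connections satisfying
\[ X\,h(Y,Z)=h(\nabla_{X}Y,Z)+h(Y,\nabla^{*}_{X}Z) \]
for all vector fields $X,Y,Z$ on $M$. The connection $\nabla^{*}$ is called the dual connection of $\nabla$ (and vice versa).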
As the literature is not uniform, let us agree that the torsion and the curvature tensor of a connection are defined as
[TABLE]
where are vector fields on . By definition, a linear connection is flat if the torsion and curvature tensor are identically zero on . A manifold endowed with a flat linear connection is called an affine manifold.
Definition 4.2**.**
A dualistic structure is dually flat if both and are flat.
Proposition 4.3**.**
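Let $(h,\nabla,\nabla^{*})$ be a dualistic structure on $M$ and let $(g,J,\omega)$ be the almost Hermitian structure on $TM$ associated to $(h,\nabla)$ via Dombrowski’s construction. The following are equivalent.
- (i) $(TM,g,J,\omega)$ is a Kähler manifold.
- (ii) $(M,h,\nabla,\nabla^{*})$ is dually flat.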
Proof.
We now direct our attention to the coordinate expressions for and .
Definition 4.4**.**
Suppose is an affine manifold. An affine coordinate system is a coordinate system defined on some open set such that
[TABLE]
for all .
It can be shown that for every point in an affine manifold , there is an affine coordinate system defined on some neighborhood of (see [Shi07]).
Proposition 4.5**.**
Let be a dually flat structure on a manifold and let be the Kähler structure on associated to via Dombrowski’s construction. Let be an affine coordinate system with respect to on , and let denote the corresponding coordinates on , as described before Lemma 3.2. Then, in the coordinates ,
[TABLE]
where $h_{ij}=h\bigl(\tfrac{\partial}{\partial x_{i}},\tfrac{\partial}{\partial x_{j}}\bigr)$, $i,j=1,\dots,n$.
Proof.
See [Mol14]. ∎
Corollary 4.6**.**
Under the hypotheses of Proposition 4.5, if , , then are complex coordinates on the complex manifold .
5 Ricci curvature
Let be a Kähler manifold with Kähler metric . We denote by Ric the Ricci tensor of ,
[TABLE]
where are vector fields on and is the curvature tensor of .
On the complexified tangent bundle , we extend -linearly every tensor of at every point . For simplicity, we use the same symbols (, Ric, etc) to indicate the corresponding -linear extensions.
Regarding local computations and indices, Greek indices shall run over while capital letters shall run over . Let be a system of complex coordinates on . We denote by and the real and imaginary part of , i.e., . With this notation, the vectors
[TABLE]
where , form a basis for . Let be the components of the Ricci tensor in this basis. As it is well-known, these components are elegantly expressed via the following formulas:
[TABLE]
where is the associated Hermitian matrix.
We now specialize to the case , assuming that is the Kähler metric associated to a dually flat structure on via Dombrowski’s construction.
Fix an affine coordinate system on an open set with respect to , and let be the corresponding coordinates on , as described before Lemma 3.2, where is the canonical projection.
Given , define . Then are complex coordinates on . Applying (5.1), we obtain
[TABLE]
where is the determinant of the matrix . The second formula in (5.2) is the local expression for the Ricci tensor in the basis . Returning to the coordinates , a direct calculation using
[TABLE]
shows the following result (see [Mol14]).
Proposition 5.1**.**
Let be a dually flat structure on and let be the Kähler metric on associated to via Dombrowski’s construction. If is an affine coordinate system on with respect to , then in the coordinates , the matrix representation of the Ricci tensor of is
[TABLE]
and where is the determinant of the matrix $h_{\alpha\beta}=h\bigl(\tfrac{\partial}{\partial x_{\alpha}},\tfrac{\partial}{\partial x_{\beta}}\bigr)$.
Recall that the scalar curvature is, by definition, the trace of the Ricci tensor.
Corollary 5.2**.**
Under the hypotheses of Proposition 5.1, the scalar curvature of is given in the coordinates by
[TABLE]
where is the determinant of the matrix , and where are the coefficients of the inverse matrix of .
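Observe that the scalar curvature $\textup{Scal}$ on $TM$ can be written $\textup{Scal}=S\circ\pi$, where $\pi:TM\to M$ is the canonical projection and $S:M\to\mathbb{R}$ is a globally defined function whose local expression is given by the right hand side of (5.4). The function $S$ is called the Hessian scalar curvature (see [Shi07]).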
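For orientation, here is a sketch of what the Hessian scalar curvature looks like in dimension one. It assumes that, in the coordinates $(x,y)$ on $TM$ induced by an affine coordinate $x$, the Kähler metric is conformally flat, $g=h(x)\,(dx^{2}+dy^{2})$, and it uses the classical fact that the scalar curvature of a surface is twice its Gaussian curvature. The symbol $S$ and the sample metric below are illustrative choices made here, not quotations of formula (5.4).

```latex
% Gaussian curvature of g = h (dx^2 + dy^2), with h depending on x only:
%   K = -\frac{1}{2h}\,\Delta \ln h = -\frac{1}{2h}\,\frac{d^{2}}{dx^{2}}\ln h .
\[
  S \;=\; 2K \;=\; -\frac{1}{h(x)}\,\frac{d^{2}}{dx^{2}}\ln h(x).
\]
% Sample check with h(x) = \dfrac{n}{4\cosh^{2}(x/2)}:
\[
  \frac{d^{2}}{dx^{2}}\ln h(x) = -\frac{1}{2\cosh^{2}(x/2)}
  \quad\Longrightarrow\quad
  S = \frac{1}{2\cosh^{2}(x/2)}\cdot\frac{4\cosh^{2}(x/2)}{n} = \frac{2}{n}.
\]
```

The sample metric is the Fisher metric of the binomial family in its natural parameter, so the value $\tfrac{2}{n}$ agrees with the curvature reported for the binomial distribution later in the paper.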
6 Statistical manifolds
General references are [AJLS17, AN00, MR93].
Definition 6.1**.**
A statistical manifold is a pair , where is a manifold and where is an injective map from to the space of all probability density functions defined on a fixed measure space :
[TABLE]
If is a coordinate system on a statistical manifold , then we shall indistinctly write or for the probability density function determined by .
Given a “reasonable” statistical manifold , it is possible to define a metric and a family of connections on in the following way: for a chart of , define
[TABLE]
where denotes the mean, or expectation, with respect to the probability , and where is a shorthand for . It can be shown that if the above expressions are defined and smooth for every chart of , then is a well defined metric on called the Fisher metric, and that the ’s define a connection via the formula , which is called the -connection.
Among the $\alpha$-connections, the $(\pm 1)$-connections are particularly important; the 1-connection is usually referred to as the exponential connection, also denoted by $\nabla^{(e)}$, while the $(-1)$-connection is referred to as the mixture connection, denoted by $\nabla^{(m)}$.
In this paper, we will only consider statistical manifolds for which the Fisher metric and $\alpha$-connections are well defined.
Proposition 6.2**.**
Let be a statistical manifold. Then, is a dualistic structure on . In particular, is the dual connection of .
Proof.
See [AN00]. ∎
We now recall the definition of an exponential family.
Definition 6.3**.**
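An exponential family $\mathcal{E}$ on a measure space $(\Omega,dx)$ is a set of probability density functions $p(x;\theta)$ of the form
\[ p(x;\theta)=\exp\Bigl\{C(x)+\sum_{i=1}^{n}\theta_{i}F_{i}(x)-\psi(\theta)\Bigr\}, \]
where $C,F_{1},\dots,F_{n}$ are measurable functions on $\Omega$, $\theta=(\theta_{1},\dots,\theta_{n})$ is a vector varying in an open subset $\Theta$ of $\mathbb{R}^{n}$ and where $\psi$ is a function defined on $\Theta$.
In the above definition it is assumed that the family of functions $\{1,F_{1},\dots,F_{n}\}$ is linearly independent, so that the map $\theta\mapsto p(\,\cdot\,;\theta)$ becomes a bijection, hence defining a global chart for $\mathcal{E}$. The parameters $\theta_{1},\dots,\theta_{n}$ are called the natural or canonical parameters of the exponential family $\mathcal{E}$.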
Example 6.4** (Normal distribution).**
Normal distributions,
[TABLE]
form a 2-dimensional statistical manifold, denoted by $\mathcal{N}$, parameterized by $(\mu,\sigma)\in\mathbb{R}\times\mathbb{R}_{+}^{*}$, where $\mu$ is the mean and $\sigma$ is the standard deviation (here $\mathbb{R}_{+}^{*}:=\{x\in\mathbb{R}\mid x>0\}$). It is an exponential family, because $p(x;\mu,\sigma)=\exp\bigl\{\theta_{1}F_{1}(x)+\theta_{2}F_{2}(x)-\psi(\theta)\bigr\}$, where
[TABLE]
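As a quick numerical sanity check of the exponential form of the Gaussian density, the sketch below uses one standard choice of natural parameters and sufficient statistics: $\theta_1=\mu/\sigma^2$, $\theta_2=-1/(2\sigma^2)$, $F_1(x)=x$, $F_2(x)=x^2$ and $\psi(\theta)=-\theta_1^2/(4\theta_2)+\tfrac12\ln(-\pi/\theta_2)$. This convention is an assumption made for illustration (other sign conventions are possible), and all function names are hypothetical.

```python
import math

def normal_density(x, mu, sigma):
    """Usual Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def natural_parameters(mu, sigma):
    """Assumed convention: theta1 = mu / sigma^2, theta2 = -1 / (2 sigma^2)."""
    return mu / sigma ** 2, -1.0 / (2 * sigma ** 2)

def log_partition(theta1, theta2):
    """psi(theta) for the convention above (requires theta2 < 0)."""
    return -theta1 ** 2 / (4 * theta2) + 0.5 * math.log(-math.pi / theta2)

def exponential_form(x, mu, sigma):
    """exp{theta1*F1(x) + theta2*F2(x) - psi(theta)} with F1(x) = x, F2(x) = x^2."""
    t1, t2 = natural_parameters(mu, sigma)
    return math.exp(t1 * x + t2 * x ** 2 - log_partition(t1, t2))

if __name__ == "__main__":
    # The two columns should agree up to floating-point rounding.
    for x in (-1.0, 0.3, 2.5):
        print(x, normal_density(x, 1.0, 0.7), exponential_form(x, 1.0, 0.7))
```

Both functions evaluate the same density, so any discrepancy beyond rounding would indicate an inconsistent choice of $\theta$, $F$ or $\psi$.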
Example 6.5**.**
Given a finite set , define
[TABLE]
Elements of can be parametrized as follows: p(x;\theta)=\exp\big{\{}\sum_{i=1}^{n-1}\theta_{i}F_{i}(x)-\psi(\theta)\big{\}}, where
[TABLE]
Therefore is an exponential family of dimension $n-1$.
Example 6.6** (Binomial distribution).**
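The set of binomial distributions defined over $\{0,\dots,n\}$,
\[ p(k)=\binom{n}{k}q^{k}(1-q)^{n-k}, \]
where $k\in\{0,\dots,n\}$, is a 1-dimensional statistical manifold, denoted by $\mathcal{B}(n)$, parametrized by $q\in(0,1)$. It is an exponential family, because $p(k)=\exp\bigl\{C(k)+\theta F(k)-\psi(\theta)\bigr\}$, where
\[ \theta=\ln\Bigl(\frac{q}{1-q}\Bigr),\qquad C(k)=\ln\binom{n}{k},\qquad F(k)=k,\qquad \psi(\theta)=n\ln\bigl(1+e^{\theta}\bigr). \]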
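The exponential form of the binomial distribution can be checked numerically. The sketch below uses the standard choice $\theta=\ln\bigl(q/(1-q)\bigr)$, $C(k)=\ln\binom{n}{k}$, $F(k)=k$ and $\psi(\theta)=n\ln(1+e^{\theta})$; the function names are hypothetical.

```python
import math

def binomial_pmf(k, n, q):
    """Probability of k successes in n trials with success probability q."""
    return math.comb(n, k) * q ** k * (1 - q) ** (n - k)

def binomial_exponential_form(k, n, q):
    """Same probability written as exp{C(k) + theta*F(k) - psi(theta)}."""
    theta = math.log(q / (1 - q))             # natural parameter (log-odds)
    C = math.log(math.comb(n, k))             # C(k) = ln binom(n, k)
    F = k                                     # F(k) = k
    psi = n * math.log(1 + math.exp(theta))   # psi(theta) = n ln(1 + e^theta)
    return math.exp(C + theta * F - psi)

if __name__ == "__main__":
    n, q = 6, 0.3
    for k in range(n + 1):
        print(k, binomial_pmf(k, n, q), binomial_exponential_form(k, n, q))
```

The identity behind the check is $q^{k}(1-q)^{n-k}=e^{\theta k}(1-q)^{n}$ together with $1-q=(1+e^{\theta})^{-1}$.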
Proposition 6.7**.**
Let be an exponential family such as in Definition 6.3. Then is dually flat.
Proof.
See [AN00]. ∎
Corollary 6.8**.**
The tangent bundle of an exponential family is a Kähler manifold for the Kähler structure associated to via Dombrowski’s construction.
Proof.
Follows from Proposition 4.3. ∎
In the sequel, by the Kähler structure of , we will implicitly refer to the Kähler structure of described in Corollary 6.8.
Example 6.9** ([Mol12, Mol13]).**
Let be the statistical manifold defined in Example 6.5. For an appropriate normalization of the Fubini-Study metric and symplectic form, it can be shown that there exists a map
[TABLE]
where , with the following properties:
- (i)
is a universal covering map whose Deck transformation group is isomorphic to , 2. (ii)
is holomorphic and locally isometric.
In particular, if denotes the Deck transformation group of , then (isomorphism of Kähler manifolds).
Example 6.10** (Binomial distribution [Mol13]).**
Let be the set of binomial distributions defined over , as in Example 6.6. Let be the unit sphere in . Consider the map given by
[TABLE]
where are the coordinates on associated to the natural parameter , as described before Lemma 3.2.
It is easy to check that if the Kähler structure of (as described in Example 2.7) is multiplied by , then is a holomorphic and locally isometric universal covering map whose Deck transformation group is . Therefore (isomorphism of Kähler manifolds).
Example 6.11** (Normal distributions [Mol14]).**
Let be the set of Gaussian distributions, as defined in Example 6.4. As a complex manifold, is the product , where is the Poincaré upper half-plane. The metric of the space is the Kähler-Berndt metric , which can be described as follows. If and , then in the coordinates \bigl{(}u,v,x,y\bigr{)},
[TABLE]
This metric plays an important role in the context of Number Theory, in relation to the so-called Jacobi forms [BS98, EZ85].
We end this section with some technical results that we will use in the next section.
Let be an exponential family of dimension defined over the measure space , with elements of the form
[TABLE]
where are measurable functions on , is a vector in an open subset of and where is a function defined on .
Given , we define by
[TABLE]
The functions are called expectation parameters. Note that, if the functions are not measurable, the existence of the functions is not guaranteed. However, in the particular case where is finite, the functions exist and have good properties, as described in the following result.
Proposition 6.12**.**
Let $\mathcal{E}$ be an exponential family defined over a finite set $\Omega=\{x_{0},\dots,x_{m}\}$ endowed with the counting measure, with $C$, $F_{1},\dots,F_{n}$ and $\psi$ as above. The following hold.
- (i) The set $\Theta$ can be taken equal to $\mathbb{R}^{n}$.
- (ii) $(\eta_{1},\dots,\eta_{n})$ is a global system of affine coordinates with respect to $\nabla^{(m)}$.
- (iii) $\dfrac{\partial\psi}{\partial\theta_{i}}=\eta_{i}$ for all $i=1,\dots,n$,
- (iv) $\dfrac{\partial^{2}\psi}{\partial\theta_{i}\partial\theta_{j}}=\dfrac{\partial\eta_{i}}{\partial\theta_{j}}=h_{F}\Bigl(\dfrac{\partial}{\partial\theta_{i}},\dfrac{\partial}{\partial\theta_{j}}\Bigr)$, for all $i,j=1,\dots,n$.
- (v) $\psi(\theta)=\ln\Bigl\{\displaystyle\sum_{k=0}^{m}\exp\Bigl(C(x_{k})+\sum_{j=1}^{n}\theta_{j}F_{j}(x_{k})\Bigr)\Bigr\}$ for all $\theta\in\mathbb{R}^{n}$.
Proof.
See [AN00]. ∎
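The relations between $\psi$, the expectation parameter and the Fisher metric are easy to test numerically in dimension one. The sketch below is an illustration under stated assumptions: it takes $\psi(\theta)=\ln\sum_{k}\exp(C_k+\theta F_k)$, uses the standard facts that $\eta$ is the mean of $F$ and that the Fisher metric in the natural parameter is the variance of $F$, and compares them with central finite differences of $\psi$. The sample data and function names are hypothetical.

```python
import math

def log_partition(C, F, theta):
    """psi(theta) = ln sum_k exp(C_k + theta*F_k), computed stably."""
    exponents = [c + theta * f for c, f in zip(C, F)]
    top = max(exponents)
    return top + math.log(sum(math.exp(e - top) for e in exponents))

def probabilities(C, F, theta):
    """Probabilities p(x_k; theta) = exp(C_k + theta*F_k - psi(theta))."""
    psi = log_partition(C, F, theta)
    return [math.exp(c + theta * f - psi) for c, f in zip(C, F)]

def mean_and_variance(C, F, theta):
    """Mean (expectation parameter) and variance (Fisher metric) of F."""
    p = probabilities(C, F, theta)
    mean = sum(pk * f for pk, f in zip(p, F))
    var = sum(pk * (f - mean) ** 2 for pk, f in zip(p, F))
    return mean, var

def finite_differences(C, F, theta, step=1e-4):
    """Central differences approximating psi'(theta) and psi''(theta)."""
    lo, mid, hi = (log_partition(C, F, theta + s) for s in (-step, 0.0, step))
    return (hi - lo) / (2 * step), (hi - 2 * mid + lo) / step ** 2

if __name__ == "__main__":
    C, F, theta = [0.2, -1.0, 0.5, 0.0], [0.0, 1.0, 3.0, 4.0], 0.4
    print(mean_and_variance(C, F, theta))
    print(finite_differences(C, F, theta))
```

The two printed pairs should agree to several digits; the step size trades truncation error against rounding error and is not tuned.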
7 Constant scalar curvature
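Let $\Omega=\{x_{0},\dots,x_{m}\}$ be a finite set endowed with the counting measure. Let $\mathcal{E}$ be a 1-dimensional exponential family defined over $\Omega$, with elements of the form
\[ p(x;\theta)=\exp\bigl\{C(x)+\theta F(x)-\psi(\theta)\bigr\}, \]
where $C,F:\Omega\to\mathbb{R}$ are functions, $\theta\in\mathbb{R}$ and $\psi:\mathbb{R}\to\mathbb{R}$ is a function. We denote by $\eta:\mathbb{R}\to\mathbb{R}$ the expectation parameter.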
We will use the following notations:
- •
, , ,
- •
, ,
- •
,
- •
,
- •
.
Note that (since the functions and are assumed to be linearly independent). Note also that and .
Lemma 7.1**.**
We have
[TABLE]
In particular, $\eta$ is a bounded function.
Proof.
By Proposition 6.12, (iii) and (iv), we have
[TABLE]
Multiplying the numerator and denominator by yields
[TABLE]
where . If , then and so,
[TABLE]
Thus,
[TABLE]
Analogously,
[TABLE]
The lemma follows. ∎
Let be the Kähler structure on associated to \bigl{(}h_{F},\nabla^{(e)}\bigr{)} via Dombrowski’s construction. We denote by the corresponding scalar curvature.
Proposition 7.2**.**
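Suppose the scalar curvature of $T\mathcal{E}$ is constant and equal to $\lambda$. Then $\lambda\neq 0$ and there exist $a,b,r,s\in\mathbb{R}$, with $a\neq b$, such that
\[ \psi(\theta)=\tfrac{2}{\lambda}\ln\bigl\{e^{a\theta+r}+e^{b\theta+s}\bigr\} \]
for all $\theta\in\mathbb{R}$. Consequently, the coordinate expression for the Fisher metric with respect to $\theta$ is
\[ h_{F}(\theta)=\frac{(a-b)^{2}}{2\lambda\cosh^{2}\bigl(\tfrac{a-b}{2}\theta+\tfrac{r-s}{2}\bigr)}, \]
where $\cosh$ is the hyperbolic cosine function.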
Proof.
By Corollary 5.2,
[TABLE]
where $h_{F}(\theta):=h_{F}\bigl(\tfrac{\partial}{\partial\theta},\tfrac{\partial}{\partial\theta}\bigr)$, and so,
[TABLE]
where we have used $\tfrac{\partial\eta}{\partial\theta}=h_{F}(\theta)$ (see Proposition 6.12). Integrating we obtain
[TABLE]
which is equivalent to
[TABLE]
where . We conclude that if and only if there exist such that
[TABLE]
Because $\tfrac{\partial\eta}{\partial\theta}=h_{F}(\theta)>0$, Equation (7.2) implies that
[TABLE]
Therefore we can divide both sides of (7.2) by (7.3). This yields
[TABLE]
Hence it all boils down to integrating the function
[TABLE]
Let be the discriminant of the polynomial . We will consider 3 cases.
Case 1: . Integration of (7.4) yields:
[TABLE]
The left hand side of this equation is a bounded function, whereas the right hand side is not. Thus this case is not possible.
Case 2: . First, suppose that . Then the condition implies that , which also implies by (7.3) that . On the other hand, it follows from (7.2) that \eta\bigl{(}\theta\bigr{)}=b\theta+c. By Lemma 7.1, the function is bounded, whereas the function is not (since ). It follows that is impossible.
Now suppose that . In this case, there exists such that
[TABLE]
Integrating (7.4) we obtain
[TABLE]
Putting in (7.5) yields , which is not possible.
Case 3: . Suppose first that (in particular, this implies ). Then, by (7.4),
[TABLE]
and so, there exists such that
[TABLE]
which is not possible, since is bounded.
Suppose that . In this case, there exist with , such that
[TABLE]
Then, integration of (7.4) yields
[TABLE]
and so
[TABLE]
It follows from (7.6) that $\alpha,\beta\notin\textup{Im}(\eta)$. Therefore we have the following possibilities:
(i) $\textup{Im}(\eta)\subseteq(-\infty,\alpha)\cup(\beta,+\infty)$.
In this case, (7.6) becomes
[TABLE]
Putting we obtain which is a contradiction.
(ii) . In this case,
[TABLE]
from which it follows that
[TABLE]
Integrating again (remember that ) we obtain
[TABLE]
where \tfrac{2}{\lambda}\ln\bigl{(}e^{\omega}\bigr{)} is a constant. Then,
[TABLE]
Letting , , and , yields the desired result. ∎
Corollary 7.3**.**
If the scalar curvature of $T\mathcal{E}$ is constant and equal to $\lambda$, then $\lambda>0$.
Proof.
This follows immediately from the formula $h_{F}(\theta)=\tfrac{(a-b)^{2}}{2\lambda\cosh^{2}\bigl(\tfrac{a-b}{2}\theta+\tfrac{r-s}{2}\bigr)}$ and the fact that $h_{F}(\theta)>0$ for all $\theta\in\mathbb{R}$ (since $h_{F}$ is a metric). ∎
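The formula for $h_F$ can be cross-checked numerically. The sketch below assumes the one-dimensional expression $S=-(\ln h_F)''/h_F$ for the Hessian scalar curvature (a conformally flat surface computation, stated here as an assumption) and verifies two things with central finite differences: that $\psi''$ reproduces the $\cosh$ formula, and that the resulting curvature equals $\lambda$. The parameter values and function names are hypothetical.

```python
import math

def fisher_metric(theta, lam, a, b, r, s):
    """h_F(theta) = (a-b)^2 / (2*lam*cosh^2((a-b)*theta/2 + (r-s)/2))."""
    u = 0.5 * (a - b) * theta + 0.5 * (r - s)
    return (a - b) ** 2 / (2 * lam * math.cosh(u) ** 2)

def psi(theta, lam, a, b, r, s):
    """psi(theta) = (2/lam) * ln(exp(a*theta + r) + exp(b*theta + s))."""
    return (2.0 / lam) * math.log(math.exp(a * theta + r) + math.exp(b * theta + s))

def second_difference(f, x, step=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + step) - 2 * f(x) + f(x - step)) / step ** 2

def curvature_from_metric(theta, lam, a, b, r, s):
    """Assumed 1D formula S = -(ln h)'' / h, evaluated by finite differences."""
    def log_h(t):
        return math.log(fisher_metric(t, lam, a, b, r, s))
    return -second_difference(log_h, theta) / fisher_metric(theta, lam, a, b, r, s)

if __name__ == "__main__":
    params = dict(lam=0.5, a=1.3, b=-0.4, r=0.2, s=-0.1)
    for theta in (-1.0, 0.0, 2.0):
        hessian = second_difference(lambda t: psi(t, **params), theta)
        print(theta, hessian, fisher_metric(theta, **params),
              curvature_from_metric(theta, **params))
```

Analytically, $\ln h_F$ differs from $-2\ln\cosh(u)$ by a constant, with $u=\tfrac{a-b}{2}\theta+\tfrac{r-s}{2}$, so $(\ln h_F)''=-\tfrac{(a-b)^2}{2}\cosh^{-2}(u)$ and the quotient is exactly $\lambda$.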
Remark 7.4**.**
If $\psi(\theta)=\tfrac{2}{\lambda}\ln\bigl\{e^{a\theta+r}+e^{b\theta+s}\bigr\}$, with , then and . To see this, it suffices to compute and to compare with Lemma 7.1.
Lemma 7.5**.**
Let be a nonzero real number and . Given , let be the linear subspace of spanned by , ,…, (derivatives of ), that is,
[TABLE]
If , then .
Proof.
It is easy to see that if , then the family of linearly independent functions
[TABLE]
is contained in . ∎
Proposition 7.6**.**
Suppose that the scalar curvature of $T\mathcal{E}$ is constant and equal to $\lambda$. Then there exists a positive integer $k$ such that $\lambda=\tfrac{2}{k}$.
Proof.
From Proposition 6.12, there exist such that
[TABLE]
for all . On the other hand, it follows from Proposition 6.12 that
[TABLE]
Thus
[TABLE]
that is,
[TABLE]
Multiplying by we obtain
[TABLE]
where , , and . Since , this can be rewritten as
[TABLE]
where and . Note that (7.7) holds for all .
Consider the linear subspace of spanned by the functions , , that is,
[TABLE]
Observe that and that for every , the derivative of with respect to belongs to , that is, .
Let be the function defined by . Because of (7.7), belongs to , and by the observation above, so do its derivatives of all orders. Therefore is a linear subspace of for every integer , which implies for every . According to Lemma 7.5, this is only possible if , that is, if . ∎
In what follows, we will use the following notations:
- •
are the unique real numbers such that and
- •
, ,
- •
.
Note that and that and .
Lemma 7.7**.**
Assume that on , with . With the notation of Proposition 7.6, we have
[TABLE]
for every .
Proof.
We know from Proposition 7.2 and Proposition 7.6 that there are real numbers , with , and such that
[TABLE]
for all , where we have used and (see Remark 7.4). From Proposition 6.12, we also have that $\psi(\theta)=\ln\bigl(\sum_{k=0}^{m}e^{C_{k}+\theta F_{k}}\bigr)$. Therefore
[TABLE]
We compute the left and right hand sides of (7.8) separately:
[TABLE]
where we have used . The lemma follows by comparing the left and right hand sides. ∎
Lemma 7.8**.**
Let and be families of real numbers such that and . If
[TABLE]
for all , then and for every , and .
Proof.
By hypothesis, we have
[TABLE]
for all , where are functions that are easily seen to satisfy and . It follows that
[TABLE]
which forces and . The lemma is proved by repeating the same argument. ∎
Theorem 7.9**.**
Let be a 1-dimensional exponential family defined over a finite set , with elements of the form , where , and . Suppose that . Given , define via the formula
[TABLE]
Then the scalar curvature is constant if and only if there exist such that
[TABLE]
for all , and in that case, .
Proof.
The forward implication follows from Lemma 7.7 and Lemma 7.8. The converse can be proved by reversing the reasoning above. ∎
Corollary 7.10**.**
Let $\mathcal{E}$ be a 1-dimensional exponential family defined over a finite set $\Omega=\{x_{0},\dots,x_{m}\}$. If the scalar curvature of $T\mathcal{E}$ is constant and equal to $\lambda$, then
[TABLE]
Example 7.11**.**
(Binomial distribution). Recall that elements of are parametrized as follows
[TABLE]
where k\in\bigl{\{}0,...,n\bigr{\}} and . In this case, we have and
[TABLE]
for all . Clearly and are solutions of (7.11) with . Therefore the scalar curvature of $T\mathcal{B}(n)$ is constant and equal to $\tfrac{2}{n}$.
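The constancy of the curvature for the binomial family, and its failure for a generic family, can be observed numerically. The sketch below rests on two assumptions stated here rather than quoted from the paper: the one-dimensional formula $S=-(\ln h_F)''/h_F$, and the standard fact that the derivatives of $\psi$ are the cumulants of $F$, so that $h_F=\kappa_2$, $h_F'=\kappa_3$, $h_F''=\kappa_4$ and hence $S=(\kappa_3^2-\kappa_2\kappa_4)/\kappa_2^3$. Function names and the sample families are hypothetical.

```python
import math

def probabilities(C, F, theta):
    """Probabilities proportional to exp(C_k + theta*F_k), computed stably."""
    exponents = [c + theta * f for c, f in zip(C, F)]
    top = max(exponents)
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def hessian_scalar_curvature(C, F, theta):
    """Assumed 1D formula S = (k3^2 - k2*k4) / k2^3 in terms of cumulants of F."""
    p = probabilities(C, F, theta)
    mean = sum(pk * f for pk, f in zip(p, F))
    mu2, mu3, mu4 = (sum(pk * (f - mean) ** j for pk, f in zip(p, F)) for j in (2, 3, 4))
    k2, k3, k4 = mu2, mu3, mu4 - 3 * mu2 ** 2   # second, third, fourth cumulants
    return (k3 ** 2 - k2 * k4) / k2 ** 3

def binomial_family(n):
    """C_k = ln binom(n, k) and F_k = k for k = 0, ..., n."""
    return [math.log(math.comb(n, k)) for k in range(n + 1)], list(range(n + 1))

if __name__ == "__main__":
    C, F = binomial_family(5)
    # For the binomial family the values should all be close to 2/5.
    print([hessian_scalar_curvature(C, F, t) for t in (-2.0, 0.0, 1.5)])
    # A family with unevenly spaced values of F: the curvature varies with theta.
    print([hessian_scalar_curvature([0.0, 0.0, 0.0], [0.0, 1.0, 3.0], t) for t in (0.0, 3.0)])
```

Using exact cumulants avoids finite differences, so the binomial check is accurate to rounding error.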
8 Equivalent and reduced exponential families
The following notation will be used throughout this section. Given a finite set , let denote the space of maps (clearly there is a natural identification ). Given , let denote the 1-dimensional exponential family defined over with elements of the form
[TABLE]
where $x\in\Omega$, $\theta\in\mathbb{R}$ and $\psi_{C,F}(\theta)=\ln\bigl(\sum_{k=0}^{m}\exp(C(x_{k})+\theta F(x_{k}))\bigr)$. In the above notation, it is assumed that the function $F$ and the constant function $1$ are linearly independent (this guarantees that the map , is bijective). In other words, .
Definition 8.1**.**
Two 1-dimensional exponential families and defined over the same set are equivalent if the families of maps and coincide.
In order to characterize equivalent exponential families, we introduce the group of matrices
[TABLE]
Given an integer , the group acts on via the formula
[TABLE]
where and .
Proposition 8.2**.**
Two 1-dimensional exponential families $\mathcal{E}_{C,F}$ and $\mathcal{E}_{C',F'}$ defined over the same finite set $\Omega$ are equivalent if and only if there exists $g=\Bigl[\begin{smallmatrix}1&b&d\\ 0&a&c\\ 0&0&1\end{smallmatrix}\Bigr]\in G$ such that $(C,F)=g\cdot(C',F')$. In that case, the following hold.
- (i)
for all and all . 2. (ii)
for all .
Proof.
Suppose . Because the maps , and , are bijective, there exists a bijection such that
[TABLE]
for all and all Since and are linearly independent, there exist such that . Putting and in (8.1), we obtain the following system
[TABLE]
Subtracting, we obtain
[TABLE]
for all . Therefore there exist such that for all . Note that is necessarily nonzero. Taking the derivative in (8.1) with respect to and using the formula we find that
[TABLE]
for all and all This implies that there exists such that
[TABLE]
for all , and
[TABLE]
for all . Integrating the equation above, we obtain
[TABLE]
for all , where is some constant. Then, using (8.1), (8.3) and (8.4) we see that
[TABLE]
for all . It follows from (8.3) and (8.5) that $(C,F)=\Bigl[\begin{smallmatrix}1&b&d\\ 0&a&c\\ 0&0&1\end{smallmatrix}\Bigr]\cdot(C',F')$, which concludes one direction of the proof. Note that the computation above shows that if , then (i) and (ii) hold.
The converse is left as a simple exercise to the reader. ∎
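The easy direction of the proposition can be illustrated numerically. The sketch below assumes that the action of a matrix with entries $a\neq 0$, $b$, $c$, $d$ reads $g\cdot(C',F')=(C'+bF'+d,\;aF'+c)$; this is reconstructed from the matrix form and is an assumption, not a quotation. Under that assumption, the family built from $(C,F)=g\cdot(C',F')$ at parameter $\theta$ coincides with the family built from $(C',F')$ at parameter $a\theta+b$. Names and sample data are hypothetical.

```python
import math

def probabilities(C, F, theta):
    """Probabilities proportional to exp(C_k + theta*F_k) over a finite set."""
    exponents = [c + theta * f for c, f in zip(C, F)]
    top = max(exponents)
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def act(g, C, F):
    """Assumed action: (C, F) -> (C + b*F + d, a*F + c) for g = (a, b, c, d), a != 0."""
    a, b, c, d = g
    if a == 0:
        raise ValueError("the entry a must be nonzero")
    new_C = [ck + b * fk + d for ck, fk in zip(C, F)]
    new_F = [a * fk + c for fk in F]
    return new_C, new_F

if __name__ == "__main__":
    C_prime, F_prime = [0.1, -0.3, 0.7], [0.0, 1.0, 4.0]
    g = (2.0, -0.5, 1.5, 0.3)                  # (a, b, c, d)
    C, F = act(g, C_prime, F_prime)
    a, b = g[0], g[1]
    for theta in (-1.0, 0.2, 1.0):
        # Same distribution, reparametrized by theta -> a*theta + b.
        print(probabilities(C, F, theta))
        print(probabilities(C_prime, F_prime, a * theta + b))
```

The constants $d$ and $c\theta$ do not depend on the sample point, so they are absorbed by the normalization; only the affine reparametrization of $\theta$ survives.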
Definition 8.3**.**
Let be a finite dimensional real vector space and let be an integer satisfying . The affine Grassmannian, denoted by , is the set of all -dimensional affine subspaces of .
It can be shown that is a noncompact smooth manifold of dimension , where (see [LWY19]).
Given an integer , the space decomposes as the following direct sum:
[TABLE]
where is the orthogonal complement of in with respect to the usual inner product on , that is,
[TABLE]
Given , we will denote by the orthogonal projection of on .
Finally, given , we will denote by the corresponding equivalence class in the quotient space .
Proposition 8.4**.**
For every integer , the map
[TABLE]
given by is a bijection.
Proof.
By a direct verification. ∎
It follows from Proposition 8.2 and Proposition 8.4 that the set of equivalence classes of 1-dimensional exponential families defined over the same finite set is in one-to-one correspondence with .
Example 8.5**.**
All 1-dimensional exponential families defined over are equivalent, because is a single point.
Definition 8.6**.**
Let be a 1-dimensional exponential family defined over a finite set . Let and be the families of real numbers characterized by the following conditions:
- (i)
and , 2. (ii)
.
Let . Define by
[TABLE]
where . Then is a 1-dimensional exponential family defined over . We call it the reduced exponential family of .
Remark 8.7**.**
If is a 1-dimensional exponential family defined over a finite set , then for all .
Proposition 8.8**.**
Let be a 1-dimensional exponential family defined over a finite set . The following are equivalent.
- (i)
The scalar curvature of is constant. 2. (ii)
, where is the cardinality of .
Proof.
Let and let be defined by and Comparing with Example 6.6, we see that .
If the scalar curvature of is constant, then by Theorem 7.9 there are real numbers and such that
[TABLE]
for all , which implies that
[TABLE]
It follows from this and Proposition 8.2 that .
Conversely, if , then and hence there is $g=\Bigl[\begin{smallmatrix}1&b&d\\ 0&a&c\\ 0&0&1\end{smallmatrix}\Bigr]$ such that , which implies that and are solutions of (8.8) for all , provided and . By Theorem 7.9 again, this implies that the scalar curvature of is constant. ∎
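The criterion can be turned into a small test. The sketch below is a hedged illustration, not the paper's construction verbatim: it assumes that the reduced family is obtained by merging sample points that share the same value of $F$ (adding their weights $e^{C}$), and that the comparison with the binomial family is made up to the equivalence of Proposition 8.2, that is, the merged values of $F$ are equally spaced and $C'_k-\ln\binom{n}{k}$ is affine in $k$. Function names and tolerances are hypothetical.

```python
import math

def reduce_family(C, F):
    """Merge sample points with equal F; returns (reduced C, sorted distinct F)."""
    groups = {}
    for c, f in zip(C, F):
        groups.setdefault(f, []).append(c)   # exact equality of F values is assumed
    values = sorted(groups)
    reduced_C = [math.log(sum(math.exp(c) for c in groups[f])) for f in values]
    return reduced_C, values

def looks_binomial(reduced_C, values, tol=1e-9):
    """True if (reduced_C, values) is an affine modification of the binomial data."""
    n = len(values) - 1
    if n < 1:
        return False
    step = (values[-1] - values[0]) / n
    # The values of F must be equally spaced.
    if any(abs(values[k] - (values[0] + k * step)) > tol for k in range(n + 1)):
        return False
    # C_k - ln binom(n, k) must be an affine function of k.
    resid = [reduced_C[k] - math.log(math.comb(n, k)) for k in range(n + 1)]
    slope = (resid[-1] - resid[0]) / n
    return all(abs(resid[k] - (resid[0] + k * slope)) <= tol for k in range(n + 1))

if __name__ == "__main__":
    # Four points, two of which share F = 1: the sum is (1 + e^theta)^2.
    print(looks_binomial(*reduce_family([0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 2.0])))
    # Uniform weights on three equally spaced values: not binomial.
    print(looks_binomial(*reduce_family([0.0, 0.0, 0.0], [0.0, 1.0, 2.0])))
```

In the first example the merged data are $C'=(0,\ln 2,0)$ and $F'=(0,1,2)$, which is the binomial family with $n=2$, so the expected curvature is $\tfrac{2}{2}=1$.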
Remark 8.9**.**
As we saw in this paper, if the scalar curvature of the tangent bundle of a 1-dimensional exponential family defined over a finite set is constant, then . This is not true for more general exponential families. For example, if is the family of Gaussian distributions over (see Example 6.4), then is constant and equal to (see [Mol14]).
Acknowledgments
I am thankful to Caroline Santos Leite Ribeiro who carefully read and helped typing a preliminary version of this article.
References
- [AJLS17] Nihat Ay, Jürgen Jost, Hông Vân Lê, and Lorenz Schwachhöfer. Information geometry, volume 64 of Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics]. Springer, Cham, 2017.
- [AN00] Shun-ichi Amari and Hiroshi Nagaoka. Methods of information geometry, volume 191 of Translations of Mathematical Monographs. American Mathematical Society, Providence, RI; Oxford University Press, Oxford, 2000. Translated from the 1993 Japanese original by Daishi Harada.
- [Bal06] Werner Ballmann. Lectures on Kähler manifolds. ESI Lectures in Mathematics and Physics. European Mathematical Society (EMS), Zürich, 2006.
- [BS98] Rolf Berndt and Ralf Schmidt. Elements of the representation theory of the Jacobi group. Modern Birkhäuser Classics. Birkhäuser/Springer Basel AG, Basel, 1998. [2011 reprint of the 1998 original] [MR 1634977].
- [Dom62] Peter Dombrowski. On the geometry of the tangent bundle. J. Reine Angew. Math., 210:73–88, 1962.
- [EZ85] Martin Eichler and Don Zagier. The theory of Jacobi forms, volume 55 of Progress in Mathematics. Birkhäuser Boston, Inc., Boston, MA, 1985.
- [Huy05] Daniel Huybrechts. Complex geometry. Universitext. Springer-Verlag, Berlin, 2005. An introduction.
- [LWY19] Lek-Heng Lim, Ken Sze-Wai Wong, and Ke Ye. Numerical algorithms on the affine Grassmannian. SIAM J. Matrix Anal. Appl., 40(2):371–393, 2019.
