On a method to construct exponential families by representation theory
Koichi Tojo, Taro Yoshino

Abstract
Exponential families play an important role in information geometry. In arXiv:1811.01394, we introduced a method to construct an exponential family $\mathcal{P} = \{p_\theta\}_{\theta \in \Theta}$ on a homogeneous space $G/H$ from a pair $(V, v_0)$. Here $V$ is a representation of $G$ and $v_0$ is an $H$-fixed vector in $V$. Then the following questions naturally arise: (Q1) when is the correspondence $\theta \mapsto p_\theta$ injective? (Q2) when do distinct pairs $(V, v_0)$ and $(V', v_0')$ generate the same family? In this paper, we answer these two questions (Theorems 2.1 and 2.2). Moreover, in Section 3, we consider the case $(G, H) = (\mathbb{R}_{>0}, \{1\})$ with a certain representation on $\mathbb{R}^2$. Then we see that the family obtained by our method is essentially the family of generalized inverse Gaussian (GIG) distributions.
Koichi Tojo: RIKEN Center for Advanced Intelligence Project, Tokyo, Japan / Department of Mathematics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan. Email: [email protected]
Taro Yoshino: Graduate School of Mathematical Science, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan. Email: [email protected]
Keywords: exponential family · representation theory · homogeneous space · generalized inverse Gaussian distribution
1 Introduction
Let $G$ be a Lie group and $H$ a closed subgroup of $G$. In [TY18], we introduced a method to construct an exponential family on the homogeneous space $G/H$ from a pair $(V, v_0)$ consisting of a representation $V$ of $G$ and an $H$-fixed vector $v_0 \in V$. In this paper, we answer two natural questions on our method.
1.1 Correspondence between parameters and probability measures
In the theory of exponential families, the notion of a “minimal representation” is important ([BN70]). If an exponential family is realized by a minimal representation, then we obtain a one-to-one correspondence between the parameter space and the family of probability measures, which enables us to identify each distribution with its parameter. Moreover, from the perspective of information geometry, the correspondence is used as a coordinate system. We would therefore like to consider the following question.
Question 1
When is the following correspondence injective?
$$\Theta \to \mathcal{P}, \quad \theta \mapsto p_\theta. \tag{1.1}$$
We want to answer this question for families obtained by our method. We give a necessary and sufficient condition for the injectivity of (1.1) in Theorem 2.1. This condition is, however, somewhat difficult to check, so we also show that the following mutually equivalent conditions (A) and (B), which are easier to verify, are necessary (Proposition 1).
- (A) The orbit $G v_0$ is not contained in any proper affine subspace of $V$.
- (B) Both of the following hold:
  - (1) $v_0$ is cyclic,
  - (2) $V^\vee$ has no nonzero $G$-fixed vector.
In the case where $G$ is compact or connected semisimple, they are also sufficient (see Remark 2).
1.2 Equivalence relation
Our method in [TY18] constructs an exponential family from a pair $(V, v_0)$. In some cases, the same exponential family comes from distinct pairs $(V, v_0)$ and $(V', v_0')$. To narrow down the choice of $(V, v_0)$, it is useful to answer the following question.
Question 2
When do distinct pairs $(V, v_0)$ and $(V', v_0')$ generate the same family?
We give an answer to this question in Theorem 2.2. More precisely, we introduce an equivalence relation $\sim$ on the set of pairs and show that the two families obtained from $(V, v_0)$ and $(V', v_0')$ coincide if $(V, v_0) \sim (V', v_0')$.
2 Main theorems
2.1 Method introduced in [TY18]
Before stating our main results, we recall the method introduced in [TY18]. Let $G$ be a Lie group and $H$ a closed subgroup of $G$. Then the quotient space $X := G/H$ naturally carries a manifold structure and is called a homogeneous space of $G$.
Let $V$ be a finite-dimensional real vector space, and $\pi \colon G \to GL(V)$ a Lie group homomorphism. Then the pair $(\pi, V)$ is called a representation of $G$. We often use the simpler notation $gv$ for $\pi(g)v$, where $g \in G$ and $v \in V$.
A vector $v \in V$ is said to be $H$-fixed if $hv = v$ for any $h \in H$. We denote by $V^H$ the linear subspace consisting of all $H$-fixed vectors. Let $(V, v_0)$ be a pair of a representation $V$ of $G$ and an $H$-fixed vector $v_0 \in V^H$.
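We put
$$\Omega(G,H) := \{\chi \colon G \to \mathbb{R}_{>0} \,;\, \chi \text{ is a continuous group homomorphism},\ \chi|_H = 1\}, \tag{2.1}$$
$$\Omega_0(G,H) := \{\log\chi \colon G \to \mathbb{R} \,;\, \chi \in \Omega(G,H)\}. \tag{2.2}$$
Take a relatively $G$-invariant measure $\mu$ on $X = G/H$. Then we define a measure $\tilde{p}_\theta$ on $X$ parameterized by $\theta = (\xi, \chi) \in V^\vee \times \Omega(G,H)$ as follows:
$$d\tilde{p}_\theta(x) := \exp(-\langle \xi, x v_0 \rangle)\, \chi(x)\, d\mu(x), \tag{2.3}$$
where $x v_0 := g v_0$ and $\chi(x) := \chi(g)$ for $x = gH \in X$.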
Remark 1
Since $v_0$ is $H$-fixed, the notation $x v_0$ in (2.3) is well-defined. Owing to $\chi|_H = 1$, the notation $\chi(x)$ is also well-defined for $x \in X$.
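Then we consider the normalization of the measures above. Put
$$\Theta := \Bigl\{\theta \in V^\vee \times \Omega(G,H) \,;\, \int_X d\tilde{p}_\theta < \infty\Bigr\}, \qquad c_\theta := \Bigl(\int_X d\tilde{p}_\theta\Bigr)^{-1} \quad (\theta \in \Theta).$$
Then we obtain a family of distributions on $X$ as follows:
$$\mathcal{P} := \{p_\theta := c_\theta\, \tilde{p}_\theta\}_{\theta \in \Theta}.$$
This is an exponential family if $\Theta \neq \emptyset$ ([TY18]).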
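As a minimal sketch of the normalization step, the following Python snippet works through a toy case that is not taken from the paper: $G = \mathbb{R}_{>0}$ acting on $V = \mathbb{R}$ by multiplication, $H = \{1\}$, $v_0 = 1$, $\chi(g) = g^k$ and the invariant measure $dg/g$. It assumes that the unnormalized measure has the form $\exp(-\langle \xi, g v_0 \rangle)\,\chi(g)\,d\mu(g)$, which is how (2.3) is read here. The helper names `trapezoid`, `unnormalized_density` and `normalizer` are hypothetical. Under these assumptions the total mass should agree with the gamma integral $\Gamma(k)/\xi^k$.

```python
import math

def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule for f on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

def unnormalized_density(xi, k):
    """Density of exp(-xi * (g . v0)) * chi(g) dmu(g) in the chart g = e^t.

    Assumed toy setting: g . v = g * v, v0 = 1, chi(g) = g**k, mu = dg / g,
    so that dmu becomes dt and the density is exp(-xi * e^t + k * t).
    """
    def f(t):
        g = math.exp(t)
        return math.exp(-xi * g + k * t)
    return f

def normalizer(xi, k, lo=-40.0, hi=6.0, n=20000):
    """Numerical total mass of the unnormalized measure (needs xi > 0, k > 0)."""
    return trapezoid(unnormalized_density(xi, k), lo, hi, n)

xi, k = 2.0, 3.0
mass = normalizer(xi, k)
closed_form = math.gamma(k) / xi ** k  # integral of x^(k-1) e^(-xi x) over x > 0
print(mass, closed_form)
```

If the two printed numbers agree, this suggests that, in this toy setting, the construction yields the gamma family with shape $k$ and rate $\xi$.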
2.2 Correspondence
In this section, we give an answer to Question 1. Namely, we state a criterion for the injectivity of the correspondence (1.1). Moreover, we also give necessary conditions that are easy to check (Proposition 1).
Theorem 2.1
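In the setting of Section 2.1, the following three conditions are equivalent:
- (i) The correspondence (1.1) is injective.
- (ii) There does not exist $\xi \in V^\vee \setminus \{0\}$ such that $f_\xi \in \Omega_0(G,H) + \mathbb{R}\mathbf{1}$.
- (iii) There does not exist a triple $(\xi, \chi, c) \in (V^\vee \setminus \{0\}) \times \Omega(G,H) \times \mathbb{R}$ satisfying $\langle \xi, g v_0 \rangle = \log\chi(g) + c$ for any $g \in G$.
Here, $f_\xi(g) := \langle \xi, g v_0 \rangle$ for $g \in G$, and $\mathbf{1}$ denotes the constant function on $G$ with value $1$.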
We prove this theorem in Section 4.2.
Moreover, we also give necessary conditions for the injectivity of (1.1). To state them, we recall the notion of a cyclic vector.
Definition 1 (cyclic)
We say a vector $v_0 \in V$ is cyclic if $\operatorname{span}\{g v_0 \,;\, g \in G\} = V$.
Proposition 1
If the correspondence (1.1) is injective, then the following equivalent conditions (A) and (B) are satisfied. Namely, ((1.1) is injective) $\Rightarrow$ (A) $\Leftrightarrow$ (B).
- (A) The orbit $G v_0$ is not contained in any proper affine subspace of $V$.
- (B) Both of the following hold:
  - (1) $v_0$ is cyclic,
  - (2) $V^\vee$ has no nonzero $G$-fixed vector.
Here $V^\vee$ is the contragredient representation of $V$. Moreover, in the case where $\Omega(G,H) = \{1\}$, the converse implication also holds.
We prove this proposition in Section 4.3.
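Conditions (A) and (B)(1) can be probed numerically on sampled orbit points. The sketch below assumes, purely for illustration, the group $\mathbb{R}_{>0}$ acting on $\mathbb{R}^2$ by the matrix $\operatorname{diag}(g, g^{-1})$. The function names and the sample points are hypothetical. Finding independent sample points proves that the orbit spans the plane, whereas a negative answer on finitely many samples is only an indication.

```python
def act(g, v):
    """Assumed action of g > 0 on R^2 by the matrix diag(g, 1/g)."""
    return (g * v[0], v[1] / g)

def det2(u, w):
    """Determinant of the 2x2 matrix with columns u and w."""
    return u[0] * w[1] - u[1] * w[0]

def sampled_orbit(v0, samples=(0.5, 1.0, 2.0, 3.0)):
    """A few points g . v0 of the orbit of v0."""
    return [act(g, v0) for g in samples]

def spans_plane(points, tol=1e-12):
    """True if two of the points are linearly independent (linear span is R^2)."""
    return any(abs(det2(p, q)) > tol
               for i, p in enumerate(points) for q in points[i + 1:])

def affinely_spans_plane(points, tol=1e-12):
    """True if three of the points are not collinear (affine span is R^2)."""
    base = points[0]
    diffs = [(p[0] - base[0], p[1] - base[1]) for p in points[1:]]
    return spans_plane(diffs, tol)

for v0 in [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]:
    pts = sampled_orbit(v0)
    print(v0, spans_plane(pts), affinely_spans_plane(pts))
```

In this assumed example the vector $(1, 1)$ passes both checks, while $(1, 0)$ and $(0, 1)$ have orbits lying on a coordinate axis and therefore fail both.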
Remark 2
In the case where $G$ is compact or connected semisimple, we have $\Omega(G,H) = \{1\}$. See [TY18] for the details.
2.3 Equivalence
We use the same notation as in Section 2.1. In this subsection, we give an answer to Question 2. To state it, we introduce the notations and .
Definition 2
We put
[TABLE]
We say two such pairs $(V, v_0)$ and $(V', v_0')$ are equivalent if there exists a $G$-equivariant linear isomorphism $\phi \colon V \to V'$ such that $\phi(v_0) = v_0'$, and denote it by $(V, v_0) \sim (V', v_0')$. This is an equivalence relation on . By definition, this is also an equivalence relation on .
Theorem 2.2
Equivalent elements in generate the same family by our method.
We prove this theorem in Section 4.4.
Remark 3
From Theorem 2.2, in the special case $\dim V^H = 1$, the choice of $v_0$ is essentially unique. In the next section, we also see an example in which the choice of $v_0$ is essentially unique even if $\dim V^H \geq 2$.
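The mechanism behind Theorem 2.2 can be illustrated in a small case. The sketch below assumes the group $\mathbb{R}_{>0}$ acting on $\mathbb{R}^2$ by $\operatorname{diag}(g, g^{-1})$ (an illustrative choice) and two vectors related by the diagonal matrix $T = \operatorname{diag}(2, 5)$, which commutes with the action. The names `act`, `weight`, `xi_prime` and the numerical values are hypothetical. The check is that the exponential factor for $(\xi, v_0)$ equals the one for $({}^tT\xi, v_0')$, so the two pairs give the same functions of $g$ after a linear change of parameter.

```python
import math

def act(g, v):
    """Assumed action of g > 0 on R^2 by the matrix diag(g, 1/g)."""
    return (g * v[0], v[1] / g)

def weight(xi, v0, g):
    """exp(-<xi, g . v0>), the exponential factor of the unnormalized density."""
    w = act(g, v0)
    return math.exp(-(xi[0] * w[0] + xi[1] * w[1]))

v0 = (2.0, 5.0)        # both entries nonzero
v0_prime = (1.0, 1.0)  # T = diag(2, 5) sends v0_prime to v0 and commutes with act

xi = (0.3, 0.7)
# Transpose of T applied to xi gives the matching parameter for v0_prime.
xi_prime = (xi[0] * v0[0], xi[1] * v0[1])

for g in (0.1, 1.0, 7.5):
    print(g, weight(xi, v0, g), weight(xi_prime, v0_prime, g))
```

Each printed row should show two equal values, which is the pointwise identity that makes the two families coincide in this example.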
3 Generalized inverse Gaussian distribution
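Throughout this section, we put $G = \mathbb{R}_{>0}$, $H = \{1\}$ and $V = \mathbb{R}^2$, and consider a representation $\pi \colon G \to GL(V)$ given by $\pi(g) = \operatorname{diag}(g, g^{-1})$ for $g \in G$. We answer Questions 1 and 2 for this case.
We consider the following two cases.
(Case 1) In the case where $v_0 = {}^t(v_1, v_2)$ with $v_1 = 0$ or $v_2 = 0$:
The vectors ${}^t(v_1, 0)$ and ${}^t(0, v_2)$ are not cyclic. Therefore the obtained families have “unessential parameters”.
(Case 2) In the case where $v_0 = {}^t(v_1, v_2)$ with $v_1 \neq 0$ and $v_2 \neq 0$:
Proposition 2
The pairs $(V, v_0)$ with $v_1 \neq 0$ and $v_2 \neq 0$ are equivalent to each other. Moreover, we obtain the family of GIG (3.1) by applying our method to $(V, v_0)$, where $v_0 = {}^t(1, 1)$.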
Definition 3 (Generalized inverse Gaussian distribution. See [J82] for the details)
The following distribution on $\mathbb{R}_{>0}$ is called the generalized inverse Gaussian distribution.
[TABLE]
where denotes Lebesgue measure on , and satisfies one of the following three conditions:
[TABLE]
Here is the normalizing constant given as follows, respectively.
[TABLE]
where $K_\lambda$ is the modified Bessel function of the second kind with index $\lambda$.
Proof (Proposition 2)
Put . For , a -linear isomorphism gives , which implies the former part.
For the latter part, it is enough to show the case by Theorem 2.2. It is easily checked that . Take a relatively invariant measure on . We identify with by taking the standard inner product. Then we have
[TABLE]
We get . By normalizing these distributions, we obtain the desired family of GIG (3.1).
Finally, let us check the injectivity of the correspondence (1.1). For ,
[TABLE]
holds only if . Thus, the condition (iii) of Theorem 2.1 is satisfied.
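The link to GIG can be sanity-checked numerically. The sketch below assumes the action $\operatorname{diag}(g, g^{-1})$ on $\mathbb{R}^2$, $v_0 = {}^t(1, 1)$, $\xi = (a/2, b/2)$, $\chi(g) = g^\lambda$ and the invariant measure $dg/g$, together with the standard GIG parameterization $x^{\lambda - 1} \exp(-(ax + b/x)/2)$ with $a, b > 0$. These choices are assumptions and may differ from the exact conventions of (3.1). The helper names are hypothetical, and $K_\lambda$ is evaluated from the classical integral $\int_0^\infty e^{-z \cosh t} \cosh(\lambda t)\,dt$ rather than from a library. The total mass of the constructed measure should then match $2 (b/a)^{\lambda/2} K_\lambda(\sqrt{ab})$.

```python
import math

def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule for f on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

def bessel_k(nu, z, upper=20.0, n=20000):
    """K_nu(z) for moderate nu and z > 0 via int_0^inf exp(-z cosh t) cosh(nu t) dt."""
    return trapezoid(
        lambda t: math.exp(-z * math.cosh(t)) * math.cosh(nu * t), 0.0, upper, n)

def construction_mass(lam, a, b, lo=-20.0, hi=20.0, n=40000):
    """Total mass of exp(-<xi, g . v0>) * g**lam * dg/g in the chart g = e^t."""
    def integrand(t):
        g = math.exp(t)
        pairing = 0.5 * a * g + 0.5 * b / g  # <xi, g . v0> with v0 = (1, 1)
        return math.exp(-pairing + lam * t)
    return trapezoid(integrand, lo, hi, n)

lam, a, b = 1.5, 2.0, 3.0
mass = construction_mass(lam, a, b)
closed_form = 2.0 * (b / a) ** (lam / 2.0) * bessel_k(lam, math.sqrt(a * b))
print(mass, closed_form)
```

Both quantities are computed by quadrature on different grids, so their agreement is a consistency check of the Bessel-type normalizing constant under the stated assumptions, not a proof.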
4 Proof of main theorems
In this section, we give proofs of Theorems 2.1 and 2.2 and Proposition 1.
4.1 Preliminary
In this subsection, we prepare some notation for the proofs in the following sections. Let $G$ be a Lie group, $H$ a closed subgroup of $G$, and $V$ a finite-dimensional real vector space.
Notation 4.1
We denote by $C(G)$ the vector space consisting of all $\mathbb{R}$-valued continuous functions on $G$. The constant function $\mathbf{1}$ is an element of $C(G)$. The space $C(G)$ admits the left and right regular representations $L$, $R$, respectively. We put .
Remark 4
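The set $\Omega_0(G,H)$ is a subspace of $C(G)$ (see (2.2)). For $f \in C(G)$, the condition $f \in \Omega_0(G,H)$ is equivalent to the pair of the following conditions:
- (1) $f(g g') = f(g) + f(g')$ for any $g, g' \in G$,
- (2) $f(h) = 0$ for any $h \in H$.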
Notation 4.2
We denote by the evaluation map. We identify with canonically as follows:
[TABLE]
Let be a subspace of . Then we put
[TABLE]
Notation 4.3
For a representation , we denote the contragredient representation by . We often use simpler notation for and . Then, the following equality holds:
[TABLE]
4.2 Proof of Theorem 2.1
Proof (Theorem 2.1)
It suffices to show the three implications linking (ii), (iii) and (i) in a cycle, which we treat in turn below.
First, we see (ii)(iii). Take such that . Then there exists such that for any , so (iii) is proved.
Next, we see (iii)(i). Assume there exist , and satisfying for any . Take any and put . It is enough to show that and . This comes from .
Finally, we see (i)(ii). Assume two distinct elements and satisfy . Put . It is enough to show the following:
Claim
and .
From , we have for almost every ,
[TABLE]
Therefore we have
[TABLE]
From Remark 4, we have , that is, . Moreover, from (4.4) and , we obtain .
4.3 Proof of Proposition 1
In this subsection, we prove Proposition 1 by using Lemma 1 below.
Lemma 1
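For $\xi \in V^\vee$, we consider the following three conditions:
- (i) $g \xi = \xi$ for any $g \in G$,
- (ii) $f_\xi \in \mathbb{R}\mathbf{1}$ (see Theorem 2.1 for the definition of $f_\xi$),
- (iii) there exists $c \in \mathbb{R}$ satisfying $\langle \xi, g v_0 \rangle = c$ for any $g \in G$.
Then, we have (i) $\Rightarrow$ (ii) $\Leftrightarrow$ (iii). Moreover, under the assumption that $v_0$ is cyclic, the implication (iii) $\Rightarrow$ (i) also holds.
Proof
Since the implications (i) $\Rightarrow$ (ii) $\Leftrightarrow$ (iii) are easy, we prove only the implication (iii) $\Rightarrow$ (i) under the assumption that $v_0$ is cyclic. Take any . It is enough to show that for any . From (4.3), we have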
[TABLE]
Proof (Proposition 1)
First, note that we have the following three easy implications (a), (b) and (c):
- (a) (A) there exists satisfying Lemma 1(iii),
- (b) (B)(2) there exists satisfying Lemma 1(i),
- (c) (A) $\Rightarrow$ $v_0$ is cyclic.
Therefore, the equivalence (A) $\Leftrightarrow$ (B) comes from Lemma 1.
Next, the implication ((1.1) is injective) $\Rightarrow$ (A) follows from (a). In fact, the condition Theorem 2.1(ii) fails if there exists satisfying Lemma 1(ii).
Finally, assume . The converse implication above also holds. So, (A) implies the injectivity of (1.1).
4.4 Proof of Theorem 2.2
We show Theorem 2.2 by using Lemmas 2 and 3 below. We prove Lemma 2 in the next subsection.
Proof (Theorem 2.2)
It is enough to show that as a subspace of if are equivalent. This follows from Lemmas 2 and 3 below.
Lemma 2
Put
[TABLE]
The following map gives a one-to-one correspondence.
[TABLE]
where
[TABLE]
Lemma 3
Let be a closed subgroup of . Suppose corresponds to in Lemma 2. Then is -fixed if and only if any element is -fixed.
Proof
We have
[TABLE]
4.5 Proof of Lemma 2
In this subsection, we prove Lemma 2. To show this lemma, we use Lemmas 4 and 5 below.
Lemma 4 (property of )
The map defined in satisfies the following:
- (1) is a $G$-equivariant linear map,
- (2) $v_0$ is cyclic if and only if is injective,
- (3) , where and .
We give a proof of this lemma at the end of this subsection.
Lemma 5
Let be a finite dimensional -invariant subspace. Then is -cyclic in .
Proof
Put . It is enough to show . Take any function , then we have . Therefore, we obtain .
Proof (Lemma 2)
From Lemmas 4(1) and 5, the following maps are well-defined:
[TABLE]
Then it is enough to show the following:
- (a) in ,
- (b) ,
- (c) in for .
First, the condition (a) follows from Lemma 4(3).
Next, we show the condition (b). Let be an element of . Since we have , we get . Then, we have
[TABLE]
Therefore, we obtain .
Finally, we show the condition (c). Let be an element of . Put and . Since is a -linear isomorphism by Lemma 4(1) and (2), it is enough to show that . For any , we have
[TABLE]
Therefore, we obtain .
Proof (Lemma 4)
- (1) Clearly, is a linear map. The $G$-equivariance of follows from the definition of the contragredient representation.
- (2) Since is linear, it is enough to show that $v_0$ is cyclic if and only if . The condition means that for , for any implies . Therefore this is equivalent to the condition that $v_0$ is cyclic.
- (3) Take a $G$-equivariant linear isomorphism with . Then it is enough to show . For any and ,
[TABLE]
Acknowledgements
The authors would like to thank Dr. Frédéric Barbaresco for recommending us to submit a paper to the conference Geometric Science of Information 2019. The authors wish to thank referees for several helpful comments, particularly the comment concerning the condition (A) in Proposition 1.
References
- [BN70] O. E. Barndorff-Nielsen, Exponential families: Exact theory, Various Publication Series, No. 19, Matematisk Institut, Aarhus Universitet, Aarhus, 1970.
- [TY18] K. Tojo, T. Yoshino, A method to construct exponential families by representation theory, arXiv:1811.01394v2.
- [J82] B. Jørgensen, Statistical properties of the generalized inverse Gaussian distribution, Lecture Notes in Statistics 9, Springer-Verlag, New York-Berlin, 1982.
