On a method to construct exponential families by representation theory

Koichi Tojo; Taro Yoshino

arXiv:1907.04212 · math.RT · July 10, 2019


TL;DR

This paper investigates a method to construct exponential families on homogeneous spaces using representation theory, answering key questions about injectivity and uniqueness, and relates the construction to the generalized inverse Gaussian distribution.

Contribution

It provides criteria for when the constructed exponential family is injective and unique, and connects the method to known distributions like GIG.

Findings

1. Answered when the correspondence is injective.

2. Determined conditions for different pairs to generate the same family.

3. Linked the construction to the generalized inverse Gaussian distribution.

Abstract

Exponential families play an important role in information geometry. In arXiv:1811.01394, we introduced a method to construct an exponential family $\mathcal{P}=\{p_{\theta}\}_{\theta\in\Theta}$ on a homogeneous space $G/H$ from a pair $(V,v_{0})$. Here $V$ is a representation of $G$ and $v_{0}$ is an $H$-fixed vector in $V$. Then the following questions naturally arise: (Q1) when is the correspondence $\theta\mapsto p_{\theta}$ injective? (Q2) when do distinct pairs $(V,v_{0})$ and $(V^{\prime},v_{0}^{\prime})$ generate the same family? In this paper, we answer these two questions (Theorems 2.1 and 2.2). Moreover, in Section 3, we consider the case $(G,H)=(\mathbb{R}_{>0},\{1\})$ with a certain representation on $\mathbb{R}^{2}$. Then we see that the family obtained by our method is essentially the family of generalized inverse Gaussian distributions (GIG).

Equations

$\Theta\ni\theta\mapsto p_{\theta}\in\mathcal{P}.$

$\Omega_{0}(G,H)$

$\log\Omega_{0}(G,H)$

$d\tilde{p}_{\theta}(x)=d\tilde{p}_{\xi,\chi}(x):=\exp(-\langle\xi,xv_{0}\rangle)\chi(x)\,d\mu(x)\quad(x\in X),$

$\Theta$

$\varphi(\theta)$

$dp_{\theta}$

$\mathcal{P}:=\{p_{\theta}\}_{\theta\in\Theta}.$

$\tilde{\mathcal{V}}(G)$

$\tilde{\mathcal{V}}(G,H)$

$c_{a,b,\lambda}x^{\lambda-1}e^{-(ax+b/x)/2}\,dx\quad(x\in\mathbb{R}_{>0}),$

$\text{(i) }a>0,\ b>0,\qquad\text{(ii) }a>0,\ b=0,\ \lambda>0,\qquad\text{(iii) }a=0,\ b>0,\ \lambda<0.$

$\text{(i) }\dfrac{(a/b)^{\lambda/2}}{2K_{\lambda}(\sqrt{ab})},\qquad\text{(ii) }\dfrac{1}{\Gamma(\lambda)}\left(\dfrac{a}{2}\right)^{\lambda},\qquad\text{(iii) }\dfrac{1}{\Gamma(-\lambda)}\left(\dfrac{b}{2}\right)^{-\lambda},$

$d\tilde{p}_{a,b,\lambda}(x)=\exp(-(ax+bx^{-1})/2)\,x^{\lambda-1}\,dx.$

$axg+byg^{-1}=\lambda\log g+c\quad\text{for any }g\in G$

$V\to(V^{\vee})^{\vee},\quad x\mapsto\mathrm{ev}_{x}.$

$W^{\perp}:=\{f\in V^{\vee}\ |\ \langle f,w\rangle=0\text{ for any }w\in W\}.$

$\langle g^{\vee}\xi,v\rangle=\langle\xi,g^{-1}v\rangle\quad(g\in G,\ v\in V,\ \xi\in V^{\vee}).$

$\exp(-\langle\xi_{1},xv_{0}\rangle+\log\chi_{1}(x)-\varphi(\theta_{1})+\langle\xi_{2},xv_{0}\rangle-\log\chi_{2}(x)+\varphi(\theta_{2}))=\dfrac{dp_{\theta_{1}}}{dp_{\theta_{2}}}(x)=1.$

$\langle\xi,gv_{0}\rangle+\varphi(\theta_{2})-\varphi(\theta_{1})=\log\chi_{2}(g)-\log\chi_{1}(g)\in\log\Omega_{0}(G,H).$

$\langle g^{\vee}\xi,g^{\prime}v_{0}\rangle=\langle\xi,g^{-1}g^{\prime}v_{0}\rangle=c=\langle\xi,g^{\prime}v_{0}\rangle.$

$\mathcal{V}(G)$

$\mathcal{W}(G)$

$\mathcal{V}(G)\to\mathcal{W}(G),\quad(V,v_{0})\mapsto\eta(V^{\vee}),$

$\eta:=\eta_{V,v_{0}}\colon V^{\vee}\to C(G),\quad\xi\mapsto(g\mapsto\langle\xi,gv_{0}\rangle).$

$\iff$ the function $\eta(\xi)\colon G\to\mathbb{R}$ is $R_{H}$-fixed for any $\xi\in V^{\vee}$,

$\Phi\colon\tilde{\mathcal{V}}(G)$

$\Psi\colon\mathcal{W}(G)$

$\langle\xi,L_{g}^{\vee}(\mathrm{ev}_{e}|_{W})\rangle=(L_{g}^{\vee}(\mathrm{ev}_{e}|_{W}))(\xi)=(\mathrm{ev}_{e}|_{W})(L_{g^{-1}}\xi)=(L_{g^{-1}}\xi)(e)=\xi(g).$

$\langle\xi,\eta^{\vee}(\mathrm{ev}_{e}|_{W})\rangle=\langle\eta(\xi),\mathrm{ev}_{e}|_{W}\rangle=\eta(\xi)(e)=\langle\xi,v_{0}\rangle.$

$\eta\circ\psi^{\vee}(\xi^{\prime})(g)=\langle\psi^{\vee}\xi^{\prime},gv_{0}\rangle=\langle\xi^{\prime},\psi(gv_{0})\rangle=\langle\xi^{\prime},g\psi(v_{0})\rangle=\langle\xi^{\prime},gv_{0}^{\prime}\rangle=\eta^{\prime}(\xi^{\prime})(g).$


Taxonomy

Topics: graph theory and CDMA systems · Matrix Theory and Algorithms · Advanced Research in Systems and Signal Processing

Full text

On a method to construct exponential families by representation theory

Koichi Tojo (1), Taro Yoshino (2)

(1) RIKEN Center for Advanced Intelligence Project, Tokyo, Japan / Department of Mathematics, Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan

(2) Graduate School of Mathematical Science, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan

Abstract

Exponential families play an important role in information geometry. In [TY18], we introduced a method to construct an exponential family $\mathcal{P}=\{p_{\theta}\}_{\theta\in\Theta}$ on a homogeneous space $G/H$ from a pair $(V,v_{0})$. Here $V$ is a representation of $G$ and $v_{0}$ is an $H$-fixed vector in $V$. Then the following questions naturally arise: (Q1) when is the correspondence $\theta\mapsto p_{\theta}$ injective? (Q2) when do distinct pairs $(V,v_{0})$ and $(V^{\prime},v_{0}^{\prime})$ generate the same family? In this paper, we answer these two questions (Theorems 2.1 and 2.2). Moreover, in Section 3, we consider the case $(G,H)=(\mathbb{R}_{>0},\{1\})$ with a certain representation on $\mathbb{R}^{2}$. Then we see that the family obtained by our method is essentially the family of generalized inverse Gaussian distributions (GIG).

Keywords:

exponential family · representation theory · homogeneous space · generalized inverse Gaussian distribution

1 Introduction

Let $G$ be a Lie group and $H$ its closed subgroup. In [TY18], we introduced a method to construct an exponential family $\mathcal{P}=\{p_{\theta}\}_{\theta\in\Theta}$ on the homogeneous space $X:=G/H$ from $(V,v_{0})$ . In this paper, we answer two natural questions on our method.

1.1 Correspondence parameters and probability measures

In the theory of exponential families, the notion of a “minimal representation” is important ([BN70]). If an exponential family is realized by a “minimal representation”, then we obtain a one-to-one correspondence between the parameter space and the family of probability measures, which enables us to make use of the family. Moreover, from the perspective of information geometry, the correspondence is used as a coordinate system. We would therefore like to consider the following:

Question 1

When is the following correspondence injective?

$$\Theta\ni\theta\mapsto p_{\theta}\in\mathcal{P}.\tag{1.1}$$

We want to answer this question for families obtained by our method. We give a necessary and sufficient condition for the injectivity of (1.1) in Theorem 2.1. This condition is, however, somewhat difficult to check, so we also show that the following equivalent conditions (A) and (B), which are easier to verify, are necessary.

(A) The orbit $Gv_{0}$ is not contained in any proper affine subspace of $V$.

(B) (1) $v_{0}$ is cyclic, and (2) $V^{\vee}$ has no nonzero $G$-fixed vector.

In the case where $G$ is compact or connected semisimple, they are also sufficient (see Remark 2).

1.2 Equivalence relation

Our method in [TY18] constructs an exponential family from a pair $(V,v_{0})$ . In some cases, the same exponential family comes from distinct pairs $(V,v_{0})$ and $(V^{\prime},v_{0}^{\prime})$ . To reduce the choice of $(V,v_{0})$ , it is useful to give an answer to the following question.

Question 2

When do distinct pairs $(V,v_{0})$ and $(V^{\prime},v_{0}^{\prime})$ generate the same family?

We give an answer to this question in Theorem 2.2. More precisely, we introduce an equivalence relation on the set of pairs $\{(V,v_{0})\}$ and show that two families obtained by $(V,v_{0})$ , $(V^{\prime},v_{0}^{\prime})$ coincide if $(V,v_{0})\sim(V^{\prime},v_{0}^{\prime})$ .

2 Main theorems

2.1 Method introduced in [TY18]

Before stating our main results, we recall the method introduced in [TY18]. Let $G$ be a Lie group and $H$ its closed subgroup. Then the quotient space $X:=G/H$ naturally carries a manifold structure and is called a homogeneous space of $G$.

Let $V$ be a finite dimensional real vector space, and $\rho\colon G\to GL(V)$ a Lie group homomorphism. Then the pair $V:=(\rho,V)$ is called a representation of $G$ . We often use simpler notation $gv:=\rho(g)v$ for $g\in G$ and $v\in V$ .

A vector $v_{0}\in V$ is said to be $H$-fixed if $hv_{0}=v_{0}$ for any $h\in H$. We denote by $V^{H}$ the linear subspace consisting of all $H$-fixed vectors. Let $(V,v_{0})$ be a pair consisting of a representation $V$ of $G$ and an $H$-fixed vector $v_{0}\in V^{H}$.

We put

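$$\Omega_{0}(G,H):=\{\chi\colon G\to\mathbb{R}_{>0}\ |\ \chi\text{ is a continuous group homomorphism with }\chi|_{H}=1\},\tag{2.1}$$

$$\log\Omega_{0}(G,H):=\{\log\chi\ |\ \chi\in\Omega_{0}(G,H)\}.\tag{2.2}$$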

Take a relatively $G$ -invariant measure $\mu$ on $X$ . Then we define a measure $\tilde{p}_{\theta}$ on $X$ parameterized by $V^{\vee}\times\Omega_{0}(G,H)$ as follows:

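$$d\tilde{p}_{\theta}(x)=d\tilde{p}_{\xi,\chi}(x):=\exp(-\langle\xi,xv_{0}\rangle)\chi(x)\,d\mu(x)\quad(x\in X),\tag{2.3}$$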

where $\theta=(\xi,\chi)\in V^{\vee}\times\Omega_{0}(G,H)$ .

Remark 1

Since $v_{0}$ is $H$-fixed, the notation $xv_{0}$ in (2.3) is well-defined. Owing to $\chi|_{H}=1$, the notation $\chi(x)$ is also well-defined for $\chi\in\Omega_{0}(G,H)$.

Then we consider the normalization of the measures above. Put

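$$\Theta:=\Big\{\theta\in V^{\vee}\times\Omega_{0}(G,H)\ \Big|\ \int_{X}d\tilde{p}_{\theta}<\infty\Big\},\qquad\varphi(\theta):=\log\int_{X}d\tilde{p}_{\theta},\qquad dp_{\theta}:=e^{-\varphi(\theta)}d\tilde{p}_{\theta}\quad(\theta\in\Theta).$$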

Then we obtain a family of distributions on $X$ as follows:

$$\mathcal{P}:=\{p_{\theta}\}_{\theta\in\Theta}.$$

This is an exponential family if $\Theta\neq\emptyset$ ([TY18]).
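The construction above can be sketched numerically for the example treated later in Section 3. The following is a minimal illustration, not the authors' code: it assumes $G=\mathbb{R}_{>0}$, $H=\{1\}$, $\rho(g)=\mathrm{diag}(g,g^{-1})$, $v_{0}=\frac{1}{2}(1,1)$, $\chi(x)=x^{\lambda}$ and $\mu=dx/x$, and the function names (`act`, `pairing`, `log_partition`, and so on) are hypothetical. The integral defining $\varphi(\theta)$ is approximated by the trapezoid rule in $t=\log x$, which is adequate only for moderate values of $\lambda$.

```python
import math

V0 = (0.5, 0.5)  # assumed H-fixed vector v_0 = (1/2)(1, 1), as in Section 3


def act(g, v):
    """Action of g in G = R_{>0} on V = R^2 through rho(g) = diag(g, 1/g)."""
    return (g * v[0], v[1] / g)


def pairing(xi, v):
    """Pairing <xi, v>, identifying the dual of R^2 with R^2."""
    return xi[0] * v[0] + xi[1] * v[1]


def unnormalized_density(x, xi, lam):
    """Density of p~_theta w.r.t. mu = dx/x for theta = (xi, chi), chi(x) = x**lam."""
    return math.exp(-pairing(xi, act(x, V0))) * x ** lam


def log_partition(xi, lam, t_max=40.0, n=20000):
    """Approximate phi(theta) = log(integral of d p~_theta), trapezoid rule in t = log x.

    Under x = exp(t) the measure dx/x becomes dt, so we integrate over [-t_max, t_max].
    """
    h = 2.0 * t_max / n
    total = 0.0
    for k in range(n + 1):
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * unnormalized_density(math.exp(-t_max + k * h), xi, lam)
    return math.log(total * h)


def normalized_density(x, xi, lam):
    """Density of p_theta = exp(-phi(theta)) * p~_theta with respect to dx/x."""
    return math.exp(-log_partition(xi, lam)) * unnormalized_density(x, xi, lam)
```

For instance, `log_partition((2.0, 0.0), 3.0)` approximates $\log\int_{0}^{\infty}x^{2}e^{-x}\,dx=\log\Gamma(3)$, and a parameter belongs to $\Theta$ exactly when this integral is finite.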

2.2 Correspondence

In this section, we give an answer to Question 1. Namely, we state a criterion for the injectivity of the correspondence (1.1). Moreover, we also give necessary conditions which one can easily check (Proposition 1).

Theorem 2.1

In the setting as in Section 2.1, the following three conditions are equivalent:

$(i)$ The correspondence $\Theta\ni\theta\mapsto p_{\theta}\in\mathcal{P}$ is injective.

$(ii)$ There does not exist $\xi\in V^{\vee}\setminus\{0\}$ such that $f_{\xi}\in\log\Omega_{0}(G,H)$.

$(iii)$ There does not exist a triple $(\xi,\chi,c)\in(V^{\vee}\setminus\{0\})\times\Omega_{0}(G,H)\times\mathbb{R}$ satisfying $\langle\xi,gv_{0}\rangle=\log\chi(g)+c$ for any $g\in G$.

Here, $f_{\xi}(g):=\langle\xi,gv_{0}-v_{0}\rangle$ for $g\in G$ .

We prove this theorem in Section 4.2.

Moreover, we also give necessary conditions for the injectivity of (1.1). To state them, we introduce the notion of a cyclic vector.

Definition 1 (cyclic)

We say a vector $v\in V$ is cyclic if $\operatorname{span}\{gv\,|\,g\in G\}=V$ .

Proposition 1

If the correspondence (1.1) is injective, then the following equivalent conditions (A) and (B) are satisfied. Namely, ((1.1) is injective) $\Rightarrow$ (A) $\Leftrightarrow$ (B).

(A) The orbit $Gv_{0}$ is not contained in any proper affine subspace of $V$.

(B) (1) $v_{0}\in V$ is cyclic, and (2) $\rho^{\vee}:G\to GL(V^{\vee})$ has no nonzero $G$-fixed vector.

Here $\rho^{\vee}$ is the contragredient representation of $G$ . Moreover, in the case where $\Omega_{0}(G,H)=\{1\}$ , the converse implication also holds.

We prove this proposition in Section 4.3.
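To see how condition (A) can fail in practice, the sketch below uses a hypothetical enlargement of the Section 3 example that does not appear in the source: $V=\mathbb{R}^{3}$, $\rho(g)=\mathrm{diag}(g,g^{-1},1)$ and $v_{0}=(\frac{1}{2},\frac{1}{2},1)$. The orbit $Gv_{0}$ then lies in the affine plane where the third coordinate equals $1$, and $\xi=(0,0,1)$, $\chi=1$, $c=1$ is a triple of the kind excluded by Theorem 2.1 (iii). Numerically, changing the third component of $\xi$ leaves the normalized density unchanged, so that component is an unessential parameter. All names are illustrative.

```python
import math


def pairing_on_orbit(xi, g):
    """<xi, g v0> for the assumed rho(g) = diag(g, 1/g, 1) and v0 = (1/2, 1/2, 1)."""
    a, b, t = xi
    return 0.5 * (a * g + b / g) + t


def normalized_density(x, xi, lam, t_max=40.0, n=20000):
    """Density of p_theta w.r.t. dx/x, with chi(x) = x**lam.

    The normalizing integral is approximated by the trapezoid rule in log x.
    """
    h = 2.0 * t_max / n
    z = 0.0
    for k in range(n + 1):
        g = math.exp(-t_max + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        z += weight * math.exp(-pairing_on_orbit(xi, g)) * g ** lam
    z *= h
    return math.exp(-pairing_on_orbit(xi, x)) * x ** lam / z


# Two distinct parameters that differ only in the third component of xi.
d1 = normalized_density(1.3, (2.0, 3.0, 0.0), 0.5)
d2 = normalized_density(1.3, (2.0, 3.0, 5.0), 0.5)
print(abs(d1 - d2))  # the two normalized densities agree up to rounding
```

The factor $e^{-t}$ contributed by the third component cancels between the unnormalized density and its integral, which is the mechanism behind the implication $\lnot$(iii) $\Rightarrow\lnot$(i) in Theorem 2.1.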

Remark 2

In the case where $G$ is compact or connected semisimple, we have $\Omega_{0}(G,H)=\{1\}$ . See [TY18] for the details.

2.3 Equivalence

We use the same notation as in Section 2.1. In this subsection, we give an answer to Question 2. To state it, we introduce the notations $\tilde{\mathcal{V}}(G)$ and $\tilde{\mathcal{V}}(G,H)$ .

Definition 2

We put

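$$\tilde{\mathcal{V}}(G):=\{(V,v_{0})\ |\ V\text{ is a finite dimensional real representation of }G,\ v_{0}\in V\text{ is cyclic}\},$$

$$\tilde{\mathcal{V}}(G,H):=\{(V,v_{0})\in\tilde{\mathcal{V}}(G)\ |\ v_{0}\in V^{H}\}.$$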

We say elements $(V,v_{0})$ and $(V^{\prime},v_{0}^{\prime})$ in $\tilde{\mathcal{V}}(G)$ are equivalent if there exists a $G$ -equivariant linear isomorphism $\psi:V\to V^{\prime}$ such that $\psi(v_{0})=v_{0}^{\prime}$ and denote it by $(V,v_{0})\sim(V^{\prime},v_{0}^{\prime})$ . This is an equivalence relation on $\tilde{\mathcal{V}}(G)$ . By definition, this is also an equivalence relation on $\tilde{\mathcal{V}}(G,H)$ .

Theorem 2.2

Equivalent elements in $\tilde{\mathcal{V}}(G,H)$ generate the same family by our method.

We prove this theorem in Section 4.4.

Remark 3

From Theorem 2.2, in the special case $\dim V^{H}=1$ , the choice of $v_{0}$ is essentially unique. In the next section, we also see an example in which the choice of $v_{0}$ is essentially unique even if $\dim V^{H}>1$ .

3 Generalized inverse Gaussian distribution

Throughout this section, we put $G=\mathbb{R}_{>0}$ , $H=\{1\}$ and $V=\mathbb{R}^{2}$ , and consider a representation $\rho\colon G\to GL(V)$ given by $\rho(g)=\begin{pmatrix}g&\\ &g^{-1}\end{pmatrix}$ for $g\in G$ . We answer Questions 1 and 2 for this case.

We consider the following two cases.

(Case 1) In the case where $\begin{pmatrix}r\\ s\end{pmatrix}\in V^{H}=V$ with $r=0$ or $s=0$ :

Vectors $\begin{pmatrix}r\\ 0\end{pmatrix}$ , $\begin{pmatrix}0\\ s\end{pmatrix}$ are not cyclic. Therefore the obtained families have “unessential parameters”.
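The cyclicity claims in the two cases can be checked with a small sketch. For $\rho(g)=\mathrm{diag}(g,g^{-1})$, two orbit points $g_{1}v$ and $g_{2}v$ with $g_{1}\neq g_{2}$ have determinant $rs\,(g_{1}/g_{2}-g_{2}/g_{1})$, so they span $\mathbb{R}^{2}$ exactly when $rs\neq 0$. The function names and the sample points $g_{1}=1$, $g_{2}=2$ are arbitrary choices made for illustration.

```python
def act(g, v):
    """Action of g in R_{>0} on R^2 through rho(g) = diag(g, 1/g)."""
    return (g * v[0], v[1] / g)


def is_cyclic(v, g1=1.0, g2=2.0, tol=1e-12):
    """Test whether the orbit points g1*v and g2*v span R^2.

    For this particular representation the two-point test decides cyclicity:
    if r*s == 0 the whole orbit stays on a coordinate axis, otherwise the
    determinant r*s*(g1/g2 - g2/g1) is nonzero whenever g1 != g2.
    """
    p, q = act(g1, v), act(g2, v)
    det = p[0] * q[1] - p[1] * q[0]
    return abs(det) > tol


print(is_cyclic((3.0, 0.0)))   # Case 1: False
print(is_cyclic((0.0, -2.0)))  # Case 1: False
print(is_cyclic((0.5, 0.5)))   # Case 2: True
```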

(Case 2) In the case where $\begin{pmatrix}r\\ s\end{pmatrix}\in V^{H}$ with $r\neq 0$ and $s\neq 0$ :

Proposition 2

The pairs $(V,\begin{pmatrix}r\\ s\end{pmatrix})$ with $r\neq 0$ and $s\neq 0$ are equivalent to each other. Moreover, we obtain the family $\{dp_{a,b,\lambda}\}_{(a,b,\lambda)\in\Theta}$ of GIG (3.1) by applying our method to $(V,\begin{pmatrix}r\\ s\end{pmatrix})$, where $\Theta=\{(a,b,\lambda)\in\mathbb{R}^{3}\ |\ (a,b,\lambda)\text{ satisfies one of the conditions (i), (ii), (iii) in Definition 3}\}$.

Definition 3 (Generalized inverse Gaussian distribution. See [J82] for the details)

The following distribution on $\mathbb{R}_{>0}$ is called generalized inverse Gaussian distribution.

$$c_{a,b,\lambda}\,x^{\lambda-1}e^{-(ax+b/x)/2}\,dx\quad(x\in\mathbb{R}_{>0}),\tag{3.1}$$

where $dx$ denotes Lebesgue measure on $\mathbb{R}_{>0}$ , and $(a,b,\lambda)$ satisfies one of the following three conditions:

$$\text{(i) }a>0,\ b>0,\qquad\text{(ii) }a>0,\ b=0,\ \lambda>0,\qquad\text{(iii) }a=0,\ b>0,\ \lambda<0.$$

Here $c_{a,b,\lambda}$ is the normalizing constant given as follows, respectively.

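$$\text{(i) }\frac{(a/b)^{\lambda/2}}{2K_{\lambda}(\sqrt{ab})},\qquad\text{(ii) }\frac{1}{\Gamma(\lambda)}\left(\frac{a}{2}\right)^{\lambda},\qquad\text{(iii) }\frac{1}{\Gamma(-\lambda)}\left(\frac{b}{2}\right)^{-\lambda},$$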

where $K_{\lambda}$ is the modified Bessel function of the second kind with index $\lambda$ .
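As a consistency check on the normalizing constant in case (i), the following worked computation may help. It relies on the substitution $x=\sqrt{b/a}\,e^{t}$ and on the standard integral representation $K_{\lambda}(z)=\int_{0}^{\infty}e^{-z\cosh t}\cosh(\lambda t)\,dt$ for $z>0$, which is not stated in the source and is quoted here as a known fact.

```latex
\begin{aligned}
\int_{0}^{\infty} x^{\lambda-1} e^{-(ax+b/x)/2}\,dx
  &= \left(\frac{b}{a}\right)^{\lambda/2}
     \int_{-\infty}^{\infty} e^{\lambda t}\, e^{-\sqrt{ab}\,\cosh t}\,dt
     && \bigl(x=\sqrt{b/a}\,e^{t},\ ax+b/x = 2\sqrt{ab}\,\cosh t\bigr) \\
  &= 2\left(\frac{b}{a}\right)^{\lambda/2}
     \int_{0}^{\infty} \cosh(\lambda t)\, e^{-\sqrt{ab}\,\cosh t}\,dt \\
  &= 2\left(\frac{b}{a}\right)^{\lambda/2} K_{\lambda}\bigl(\sqrt{ab}\bigr),
\qquad\text{hence}\qquad
c_{a,b,\lambda} = \frac{(a/b)^{\lambda/2}}{2K_{\lambda}(\sqrt{ab})}.
\end{aligned}
```

Cases (ii) and (iii) reduce in the same way to the Gamma integral $\int_{0}^{\infty}y^{s-1}e^{-ky}\,dy=\Gamma(s)k^{-s}$, the latter after the substitution $y=1/x$.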

Proof (Proposition 2)

Put $v_{0}:=\frac{1}{2}\begin{pmatrix}1\\ 1\end{pmatrix}$ . For $r,s\neq 0$ , a $G$ -linear isomorphism $\begin{pmatrix}2r&0\\ 0&2s\end{pmatrix}\in GL(V)$ gives $(V,v_{0})\sim(V,\begin{pmatrix}r\\ s\end{pmatrix})$ , which implies the former part.
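The equivalence used in the first part of the proof can be sketched numerically. The snippet below is illustrative only (the names `psi` and `pullback` are not from the source): it checks that $\psi=\mathrm{diag}(2r,2s)$ commutes with the action, sends $v_{0}$ to $(r,s)$, and that $\langle\xi^{\prime},g\psi(v_{0})\rangle=\langle\psi^{\vee}\xi^{\prime},gv_{0}\rangle$, so both pairs produce the same functions on $G$ up to a linear change of the parameter $\xi$.

```python
def act(g, v):
    """Action of g in R_{>0} on R^2 through rho(g) = diag(g, 1/g)."""
    return (g * v[0], v[1] / g)


def pairing(xi, v):
    """Pairing <xi, v>, identifying the dual of R^2 with R^2."""
    return xi[0] * v[0] + xi[1] * v[1]


def psi(v, r, s):
    """The linear isomorphism diag(2r, 2s) of R^2 (requires r != 0 and s != 0)."""
    return (2.0 * r * v[0], 2.0 * s * v[1])


def pullback(xi_prime, r, s):
    """psi^vee(xi') = xi' composed with psi; diag(2r, 2s) is its own transpose."""
    return (2.0 * r * xi_prime[0], 2.0 * s * xi_prime[1])


r, s = 3.0, -0.5
v0 = (0.5, 0.5)
v0_prime = psi(v0, r, s)  # equals (r, s)
xi_prime = (0.7, -1.2)

for g in (0.3, 1.0, 7.5):
    # G-equivariance: psi(g v) == g psi(v), up to rounding
    lhs_vec, rhs_vec = psi(act(g, v0), r, s), act(g, v0_prime)
    assert abs(lhs_vec[0] - rhs_vec[0]) < 1e-12 and abs(lhs_vec[1] - rhs_vec[1]) < 1e-12
    # Same function on G after reparametrizing xi
    lhs = pairing(xi_prime, act(g, v0_prime))
    rhs = pairing(pullback(xi_prime, r, s), act(g, v0))
    assert abs(lhs - rhs) < 1e-12
```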

For the latter part, it is enough to show the case $(V,v_{0})$ by Theorem 2.2. It is easily checked that $\Omega_{0}(G,H)=\{x\mapsto x^{\lambda}\ |\ \lambda\in\mathbb{R}\}$ . Take a relatively invariant measure $\frac{dx}{x}$ on $\mathbb{R}_{>0}$ . We identify $(\mathbb{R}^{2})^{\vee}$ with $\mathbb{R}^{2}$ by taking the standard inner product. Then we have

$$d\tilde{p}_{a,b,\lambda}(x)=\exp(-(ax+bx^{-1})/2)\,x^{\lambda-1}\,dx.$$

We get $\Theta=\{\theta=(a,b,\lambda)\in\mathbb{R}^{3}\ |\ (a,b,\lambda)\text{ satisfies one of the conditions (i), (ii), (iii) in Definition 3}\}$. By normalizing these distributions, we obtain the desired family of GIG (3.1).

Finally, let us check the injectivity of the correspondence (1.1). For $(a,b,c,\lambda)\in\mathbb{R}^{4}$,

$$axg+byg^{-1}=\lambda\log g+c\quad\text{for any }g\in G$$

holds only if $(a,b,c,\lambda)=0$ . Thus, the condition (iii) of Theorem 2.1 is satisfied.
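The claim that the displayed identity forces $(a,b,c,\lambda)=0$ can be spelled out by comparing growth rates. The sketch below reads $x$ and $y$ as the components of the fixed vector and assumes $x\neq 0$ and $y\neq 0$, as in Case 2; this reading is an assumption, since the source does not define them at this point.

```latex
\begin{aligned}
&\text{Suppose } a x g + b y g^{-1} = \lambda \log g + c \ \text{ for all } g > 0, \text{ with } x \neq 0,\ y \neq 0. \\
&\text{Divide by } g \text{ and let } g \to \infty: \quad
  a x = \lim_{g\to\infty} \frac{\lambda \log g + c - b y g^{-1}}{g} = 0
  \ \Longrightarrow\ a = 0. \\
&\text{Multiply by } g \text{ and let } g \to 0: \quad
  b y = \lim_{g\to 0} g\,(\lambda \log g + c - a x g) = 0
  \ \Longrightarrow\ b = 0. \\
&\text{Then } \lambda \log g + c = 0 \text{ for all } g > 0: \quad
  g = 1 \text{ gives } c = 0, \quad g = e \text{ gives } \lambda = 0.
\end{aligned}
```

In other words, the functions $g$, $g^{-1}$, $\log g$ and $1$ are linearly independent on $\mathbb{R}_{>0}$, so no nonzero $\xi=(a,b)$ can appear in a triple as in Theorem 2.1 (iii).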

4 Proof of main theorems

In this section, we give proofs to Theorems 2.1 and 2.2 and Proposition 1.

4.1 Preliminary

In this subsection, we prepare some notation for the proofs in the following sections. Let $G$ be a Lie group, $H$ a closed subgroup of $G$ and $V$ a finite dimensional real vector space.

Notation 4.1

We denote by $C(G)$ the vector space consisting of all $\mathbb{R}$-valued continuous functions on $G$. The constant function $1$ is an element of $C(G)$. The space $C(G)$ admits the left and right regular representations $L$, $R:G\to GL(C(G))$, respectively. We put $C(G)^{H}:=\{f\in C(G)\ |\ R_{h}f=f\text{ for any }h\in H\}$.

Remark 4

The set $\log\Omega_{0}(G,H)$ is a subspace of $C(G)$ (see (2.2)). For $f\in C(G)$ , the condition $f\in\log\Omega_{0}(G,H)$ is equivalent to the pair of the following conditions:

$(a)$ $f(h)=0$ for any $h\in H$,

$(b)$ $f(gg^{\prime})=f(g)+f(g^{\prime})$ for any $g,g^{\prime}\in G$.

Notation 4.2

We denote by $\mathop{\mathrm{ev}}\nolimits$ the evaluation map. We identify $V$ with $(V^{\vee})^{\vee}$ canonically as follows:

$$V\to(V^{\vee})^{\vee},\quad x\mapsto\mathrm{ev}_{x}.\tag{4.1}$$

Let $W$ be a subspace of $V$ . Then we put

$$W^{\perp}:=\{f\in V^{\vee}\ |\ \langle f,w\rangle=0\text{ for any }w\in W\}.\tag{4.2}$$

Notation 4.3

For a representation $\rho:G\to GL(V)$ , we denote the contragredient representation by $\rho^{\vee}:G\to GL(V^{\vee})$ . We often use simpler notation $g^{\vee}\xi:=\rho^{\vee}(g)\xi$ for $g\in G$ and $\xi\in V^{\vee}$ . Then, the following equality holds:

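$$\langle g^{\vee}\xi,v\rangle=\langle\xi,g^{-1}v\rangle\quad(g\in G,\ v\in V,\ \xi\in V^{\vee}).\tag{4.3}$$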
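The defining identity of the contragredient representation can be checked concretely. The sketch below assumes $V=\mathbb{R}^{2}$ with $V^{\vee}$ identified with $\mathbb{R}^{2}$ via the standard inner product, so that $\rho^{\vee}(g)$ is the transpose of $\rho(g)^{-1}$, and uses the representation $\rho(g)=\mathrm{diag}(g,g^{-1})$ of Section 3 as the test case. The helper names are illustrative.

```python
def mat_vec(m, v):
    """Apply a 2x2 matrix (tuple of rows) to a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def transpose(m):
    return ((m[0][0], m[1][0]), (m[0][1], m[1][1]))


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def rho(g):
    """The representation of R_{>0} on R^2 from Section 3."""
    return ((g, 0.0), (0.0, 1.0 / g))


def rho_dual(g):
    """Contragredient representation: transpose of rho(g)^{-1}."""
    return transpose(inverse(rho(g)))


g, xi, v = 2.5, (1.0, -2.0), (0.3, 4.0)
lhs = dot(mat_vec(rho_dual(g), xi), v)      # <g^vee xi, v>
rhs = dot(xi, mat_vec(inverse(rho(g)), v))  # <xi, g^{-1} v>
print(abs(lhs - rhs))  # zero up to rounding
```

For this diagonal representation, $\rho^{\vee}(g)=\mathrm{diag}(g^{-1},g)$, which is $\rho$ with the two coordinates exchanged.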

4.2 Proof of Theorem 2.1

Proof (Theorem 2.1)

It is enough to show $\lnot$ (ii) $\Rightarrow\lnot$ (iii) $\Rightarrow\lnot$ (i) $\Rightarrow\lnot$ (ii).

First, we see $\lnot$ (ii) $\Rightarrow\lnot$ (iii). Take $\xi\in V^{\vee}\setminus\{0\}$ such that $f_{\xi}\in\log\Omega_{0}(G,H)$. Then there exists $\chi\in\Omega_{0}(G,H)$ such that $\langle\xi,gv_{0}-v_{0}\rangle=\langle\xi,gv_{0}\rangle-\langle\xi,v_{0}\rangle=\log\chi(g)$ for any $g\in G$. Hence the triple $(\xi,\chi,c)$ with $c:=\langle\xi,v_{0}\rangle$ satisfies $\langle\xi,gv_{0}\rangle=\log\chi(g)+c$ for any $g\in G$, so $\lnot$ (iii) is proved.

Next, we see $\lnot$ (iii) $\Rightarrow\lnot$ (i). Assume there exist $\xi\in V^{\vee}\setminus\{0\}$ , $c\in\mathbb{R}$ and $\chi\in\Omega_{0}(G,H)$ satisfying $\langle\xi,gv_{0}\rangle=\log\chi(g)+c$ for any $g\in G$ . Take any $\theta_{1}=(\xi_{1},\chi_{1})\in\Theta$ and put $\theta_{2}:=(\xi_{1}+\xi,\chi_{1}\chi)\in V^{\vee}\times\Omega_{0}(G,H)$ . It is enough to show that $\theta_{2}\in\Theta$ and $p_{\theta_{1}}=p_{\theta_{2}}$ . This comes from $d\tilde{p}_{\theta_{2}}(x)=e^{-\langle\xi_{1}+\xi,xv_{0}\rangle}\chi_{1}(x)\chi(x)d\mu(x)=e^{-\langle\xi,xv_{0}\rangle+\log\chi(x)}e^{-\langle\xi_{1},xv_{0}\rangle}\chi_{1}(x)d\mu(x)=e^{-c}d\tilde{p}_{\theta_{1}}(x)$ .

Finally, we see $\lnot$ (i) $\Rightarrow\lnot$ (ii). Assume two distinct elements $\theta_{1}=(\xi_{1},\chi_{1})$ and $\theta_{2}=(\xi_{2},\chi_{2})\in\Theta$ satisfy $p_{\theta_{1}}=p_{\theta_{2}}$ . Put $\xi:=\xi_{2}-\xi_{1}$ . It is enough to show the following:

Claim

$\xi\neq 0$ and $f_{\xi}\in\log\Omega_{0}(G,H)$ .

From $p_{\theta_{1}}=p_{\theta_{2}}$ , we have for almost every $x\in X$ ,

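$$\exp\bigl(-\langle\xi_{1},xv_{0}\rangle+\log\chi_{1}(x)-\varphi(\theta_{1})+\langle\xi_{2},xv_{0}\rangle-\log\chi_{2}(x)+\varphi(\theta_{2})\bigr)=\frac{dp_{\theta_{1}}}{dp_{\theta_{2}}}(x)=1.$$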

Therefore we have

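$$\langle\xi,gv_{0}\rangle+\varphi(\theta_{2})-\varphi(\theta_{1})=\log\chi_{2}(g)-\log\chi_{1}(g)\in\log\Omega_{0}(G,H).\tag{4.4}$$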

From Remark 4 $(a)$ , we have $\varphi(\theta_{2})-\varphi(\theta_{1})=-\langle\xi,v_{0}\rangle$ , that is, $f_{\xi}\in\log\Omega_{0}(G,H)$ . Moreover, from (4.4) and $\theta_{1}\neq\theta_{2}$ , we obtain $\xi\neq 0$ .

4.3 Proof of Proposition 1

In this subsection, we prove Proposition 1 by using Lemma 1 below.

Lemma 1

For $\xi\in V^{\vee}\setminus\{0\}$, we consider the following three conditions:

(i) $g^{\vee}\xi=\xi$ for any $g\in G$,

(ii) $f_{\xi}=0$ (see Theorem 2.1 for the definition of $f_{\xi}$),

(iii) there exists $c\in\mathbb{R}$ satisfying $Gv_{0}\subset\{v\in V\ |\ \langle\xi,v\rangle=c\}$.

Then, we have (i) $\Rightarrow$ (ii) $\Leftrightarrow$ (iii). Moreover, under the assumption that $v_{0}$ is cyclic, the implication (iii) $\Rightarrow$ (i) also holds.

Proof

Since the implications (i) $\Rightarrow$ (ii) $\Leftrightarrow$ (iii) are easy, we prove only the implication (iii) $\Rightarrow$ (i) under the assumption that $v_{0}$ is cyclic. Take any $g\in G$ . It is enough to show that $\langle g^{\vee}\xi,g^{\prime}v_{0}\rangle=\langle\xi,g^{\prime}v_{0}\rangle$ for any $g^{\prime}\in G$ . From (4.3), we have

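$$\langle g^{\vee}\xi,g^{\prime}v_{0}\rangle=\langle\xi,g^{-1}g^{\prime}v_{0}\rangle=c=\langle\xi,g^{\prime}v_{0}\rangle.$$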

Proof (Proposition 1)

First, note that we have the following three easy implications (a), (b) and (c):

(a) $\lnot$ (A) $\iff$ there exists $\xi\in V^{\vee}\setminus\{0\}$ satisfying Lemma 1(iii),

(b) $\lnot$ (B)(2) $\iff$ there exists $\xi\in V^{\vee}\setminus\{0\}$ satisfying Lemma 1(i),

(c) (A) $\implies$ $v_{0}$ is cyclic.

Therefore, the equivalence (A) $\Leftrightarrow$ (B) comes from Lemma 1.

Next, the implication ((1.1) is injective) $\Rightarrow$ (A) follows from (a). In fact, the condition Theorem 2.1(ii) fails if there exists $\xi\in V^{\vee}\setminus\{0\}$ satisfying Lemma 1(ii).

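Finally, assume $\Omega_{0}(G,H)=\{1\}$. Then $\log\Omega_{0}(G,H)=\{0\}$, so the condition Theorem 2.1(ii) fails only if there exists $\xi\in V^{\vee}\setminus\{0\}$ satisfying Lemma 1(ii). Hence the converse of the implication above also holds, that is, (A) implies the injectivity of (1.1).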

4.4 Proof of Theorem 2.2

We show Theorem 2.2 by using Lemmas 2 and 3 below. We prove Lemma 2 in the next subsection.

Proof (Theorem 2.2)

It is enough to show that $\{g\mapsto\langle\xi,gv_{0}\rangle\ |\ \xi\in V^{\vee}\}=\{g\mapsto\langle\xi^{\prime},gv_{0}^{\prime}\rangle\ |\ \xi^{\prime}\in V^{\prime\vee}\}$ as a subspace of $C(G)^{H}$ if $(V,v_{0}),(V^{\prime},v_{0}^{\prime})\in\tilde{\mathcal{V}}(G,H)$ are equivalent. This follows from Lemmas 2 and 3 below.

Lemma 2

Put

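$$\mathcal{V}(G):=\tilde{\mathcal{V}}(G)/{\sim},\qquad\mathcal{W}(G):=\{W\subset C(G)\ |\ W\text{ is a finite dimensional }L_{G}\text{-invariant subspace}\}.$$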

The following map gives a one-to-one correspondence.

$$\mathcal{V}(G)\to\mathcal{W}(G),\quad(V,v_{0})\mapsto\eta(V^{\vee}),$$

where

$$\eta:=\eta_{V,v_{0}}\colon V^{\vee}\to C(G),\quad\xi\mapsto(g\mapsto\langle\xi,gv_{0}\rangle).$$

Lemma 3

Let $H$ be a closed subgroup of $G$ . Suppose $(V,v_{0})\in\mathcal{V}(G)$ corresponds to $W\in\mathcal{W}(G)$ in Lemma 2. Then $v_{0}$ is $H$ -fixed if and only if any element $w\in W$ is $R_{H}$ -fixed.

Proof

We have

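$$\begin{aligned}v_{0}\text{ is }H\text{-fixed}&\iff ghv_{0}=gv_{0}\text{ for any }g\in G,\ h\in H\\&\iff\langle\xi,ghv_{0}\rangle=\langle\xi,gv_{0}\rangle\text{ for any }\xi\in V^{\vee},\ g\in G,\ h\in H\\&\iff\text{the function }\eta(\xi)\colon G\to\mathbb{R}\text{ is }R_{H}\text{-fixed for any }\xi\in V^{\vee}.\end{aligned}$$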

4.5 Proof of Lemma 2

In this subsection, we prove Lemma 2. To show this lemma, we use Lemmas 4 and 5 below.

Lemma 4 (property of $\eta$ )

The map $\eta:V^{\vee}\to C(G)$ defined in Lemma 2 satisfies the following:

$(1)$ $\eta$ is a $G$-equivariant linear map,

$(2)$ $v_{0}$ is cyclic if and only if $\eta$ is injective,

$(3)$ $(V,v_{0})\sim(V^{\prime},v_{0}^{\prime})\Rightarrow\eta(V^{\vee})=\eta^{\prime}(V^{\prime\vee})$, where $\eta=\eta_{V,v_{0}}$ and $\eta^{\prime}=\eta_{V^{\prime},v_{0}^{\prime}}$.

We give a proof of this lemma at the end of this subsection.

Lemma 5

Let $W\subset C(G)$ be a finite dimensional $L_{G}$ -invariant subspace. Then $v_{0}:=\mathop{\mathrm{ev}}\nolimits_{e}|_{W}\in W^{\vee}$ is $L_{G}^{\vee}$ -cyclic in $W^{\vee}$ .

Proof

Put $E:=\operatorname{span}\{L_{g}^{\vee}v_{0}\ |\ g\in G\}\subset W^{\vee}$ . It is enough to show $E^{\perp}=\{0\}$ . Take any function $f\in E^{\perp}$ , then we have $f(g)=(L_{g^{-1}}f)(e)=\langle v_{0},L_{g^{-1}}f\rangle=\langle L_{g}^{\vee}v_{0},f\rangle=0$ . Therefore, we obtain $f=0$ .

Proof (Lemma 2)

From Lemmas 4(1) and 5, the following maps are well-defined:

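$$\Phi\colon\tilde{\mathcal{V}}(G)\to\mathcal{W}(G),\quad(V,v_{0})\mapsto\eta_{V,v_{0}}(V^{\vee}),\qquad\Psi\colon\mathcal{W}(G)\to\tilde{\mathcal{V}}(G),\quad W\mapsto(W^{\vee},\mathrm{ev}_{e}|_{W}).$$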

Then it is enough to show the following:

(a) $(V,v_{0})\sim(V^{\prime},v_{0}^{\prime})$ in $\tilde{\mathcal{V}}(G)\Rightarrow\Phi(V,v_{0})=\Phi(V^{\prime},v_{0}^{\prime})$,

(b) $\Phi\circ\Psi=\operatorname{id}_{\mathcal{W}(G)}$,

(c) $\Psi\circ\Phi(V,v_{0})\sim(V,v_{0})$ in $\tilde{\mathcal{V}}(G)$ for $(V,v_{0})\in\tilde{\mathcal{V}}(G)$.

First, the condition (a) follows from Lemma 4(3).

Next, we show the condition (b). Let $W$ be an element of $\mathcal{W}(G)$ . Since we have $\Psi(W)=(W^{\vee},\mathop{\mathrm{ev}}\nolimits_{e}|_{W})$ , we get $\Phi\circ\Psi(W)=\{g\mapsto\langle\xi,L_{g}^{\vee}(\mathop{\mathrm{ev}}\nolimits_{e}|_{W})\rangle\ |\ \xi\in(W^{\vee})^{\vee}\}$ . Then, we have

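$$\langle\xi,L_{g}^{\vee}(\mathrm{ev}_{e}|_{W})\rangle=(L_{g}^{\vee}(\mathrm{ev}_{e}|_{W}))(\xi)=(\mathrm{ev}_{e}|_{W})(L_{g^{-1}}\xi)=(L_{g^{-1}}\xi)(e)=\xi(g).$$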

Therefore, we obtain $\Phi\circ\Psi(W)=W$ .

Finally, we show the condition (c). Let $(V,v_{0})$ be an element of $\tilde{\mathcal{V}}(G)$ . Put $W:=\eta(V^{\vee})$ and $(V^{\prime},v_{0}^{\prime}):=\Psi\circ\Phi(V,v_{0})=\Psi(W)=(W^{\vee},\mathop{\mathrm{ev}}\nolimits_{e}|_{W})$ . Since $\eta^{\vee}:W^{\vee}\to(V^{\vee})^{\vee}$ is a $G$ -linear isomorphism by Lemma 4(1) and (2), it is enough to show that $\eta^{\vee}(\mathop{\mathrm{ev}}\nolimits_{e}|_{W})=v_{0}$ . For any $\xi\in V^{\vee}$ , we have

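$$\langle\xi,\eta^{\vee}(\mathrm{ev}_{e}|_{W})\rangle=\langle\eta(\xi),\mathrm{ev}_{e}|_{W}\rangle=\eta(\xi)(e)=\langle\xi,v_{0}\rangle.$$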

Therefore, we obtain $\eta^{\vee}(\mathop{\mathrm{ev}}\nolimits_{e}|_{W})=v_{0}$ .

Proof (Lemma 4)

(1) Clearly, $\eta$ is a linear map. The $G$-equivariance of $\eta$ follows from the definition of the contragredient representation.

(2) Since $\eta$ is linear, it is enough to show that $v_{0}$ is cyclic if and only if $\ker\eta=\{0\}$. The condition $\ker\eta=\{0\}$ means that for $\xi\in V^{\vee}$, $\langle\xi,gv_{0}\rangle=0$ for any $g\in G$ implies $\xi=0$. Therefore this is equivalent to the condition that $v_{0}$ is cyclic.

(3) Take a $G$-equivariant linear isomorphism $\psi:V\to V^{\prime}$ with $\psi(v_{0})=v_{0}^{\prime}$. Then it is enough to show $\eta^{\prime}=\eta\circ\psi^{\vee}:V^{\prime\vee}\to C(G)$. For any $\xi^{\prime}\in V^{\prime\vee}$ and $g\in G$,

$$\eta\circ\psi^{\vee}(\xi^{\prime})(g)=\langle\psi^{\vee}\xi^{\prime},gv_{0}\rangle=\langle\xi^{\prime},\psi(gv_{0})\rangle=\langle\xi^{\prime},g\psi(v_{0})\rangle=\langle\xi^{\prime},gv_{0}^{\prime}\rangle=\eta^{\prime}(\xi^{\prime})(g).$$

Acknowledgements

The authors would like to thank Dr. Frédéric Barbaresco for recommending us to submit a paper to the conference Geometric Science of Information 2019. The authors wish to thank referees for several helpful comments, particularly the comment concerning the condition (A) in Proposition 1.

Bibliography

[BN70] O. E. Barndorff-Nielsen, Exponential families: Exact theory, Various Publication Series, No. 19, Matematisk Institut, Aarhus Universitet, Aarhus, 1970.

[TY18] K. Tojo, T. Yoshino, A method to construct exponential families by representation theory, arXiv:1811.01394v2.

[J82] B. Jørgensen, Statistical properties of the generalized inverse Gaussian distribution, Lecture Notes in Statistics 9, Springer-Verlag, New York-Berlin, 1982.
