Identifiability of parametric random matrix models

Tomohiro Hayase

arXiv:1812.10678·math.PR·June 7, 2021

TL;DR

This paper studies whether the parameters of certain random matrix models can be uniquely determined from their spectral distributions, using free probability theory to establish conditions for identifiability.

Contribution

It demonstrates that compound Wishart and signal-plus-noise models are identifiable up to rotation, advancing understanding of parameter recovery in spectral analysis.

Findings

01. Models are identifiable up to rotation.

02. Free probability theory is effective for analyzing identifiability.

03. Provides theoretical conditions for parameter uniqueness.

Abstract

We investigate parameter identifiability of spectral distributions of random matrices. In particular, we treat compound Wishart type and signal-plus-noise type. We show that each model is identifiable up to some kind of rotation of parameter space. Our method is based on free probability theory.

Equations

$W_{\mathrm{CW}}(D)$

$W_{\mathrm{SPN}}(A,\sigma)$

$\mu_{W}=\frac{1}{d}\sum_{k=1}^{d}\delta_{\lambda_{k}}$

$\mu_{\mathrm{CW}}^{\Box}(D),\ D\in M_{p}(\mathbb{C})$ (resp. $\mu_{\mathrm{SPN}}^{\Box}(A,\sigma),\ A\in M_{p,d}(\mathbb{C}),\ \sigma\in\mathbb{R}$)

$\mu_{D}=\mu_{D^{\prime}}$

$\mu_{A}=\mu_{A^{\prime}},\ \sigma^{2}=\sigma^{\prime 2}$

$\{v\in\mathbb{R}^{p}\mid v_{k}=v_{\pi(k)},\ \forall k=1,\dots,p,\ \forall\pi\in S_{p}\}$

$\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)=\mu_{\mathrm{SPN}}^{\Box}(B,\rho)\iff\begin{cases}\mu_{A^{*}A}=\mu_{B^{*}B},\\ \sigma^{2}=\rho^{2}.\end{cases}$

$\{v\in\mathbb{R}^{d}\mid v_{\pi(k)}=v_{k}\geq 0,\ \forall k=1,\dots,d,\ \forall\pi\in S_{d}\}\times\{v\in\mathbb{R}\mid v\geq 0\}$

$\int x^{k}\mu_{a}(dx)=\tau(a^{k}),\ k\in\mathbb{N}$

$\tau(a_{1}\cdots a_{n})=0$

$\mu_{s}(dx)=\frac{\sqrt{4-x^{2}}}{2\pi}\mathbf{1}_{[-2,2]}(x)\,dx$

$c=\sqrt{v}\,\frac{s_{1}+is_{2}}{\sqrt{2}}$

$G_{a}^{E}(Z):=E[(Z-a)^{-1}],\ Z\in\mathbb{H}^{+}(\mathfrak{B})$

$E(a_{1}\cdots a_{n})=0$

$\mathbb{E}[Z_{ij}]=0,\ \mathbb{E}[\overline{Z_{ij}}Z_{ij}]=v$

$W_{\mathrm{CW}}(D):=Z^{*}DZ,\ D\in M_{p}(\mathbb{K})$

$W_{\mathrm{SPN}}(A,\sigma):=(A+\sigma Z)^{*}(A+\sigma Z),\ A\in M_{p,d}(\mathbb{K}),\ \sigma\in\mathbb{R}$

$\tau(C_{ij})=0,\ \tau(C_{ij}^{*}C_{ij})=1/d$

$W_{\mathrm{CW}}^{\Box}(D)=C^{*}DC,\ D\in M_{p}(\mathbb{C})$

$\mu_{\mathrm{CW}}^{\Box}(D)=\mu_{W_{\mathrm{CW}}^{\Box}(D)}$

$W_{\mathrm{SPN}}^{\Box}(A,\sigma)=(A+\sigma C)^{*}(A+\sigma C),\ A\in M_{p,d}(\mathbb{C}),\ \sigma\in\mathbb{R}$

$\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)=\mu_{W_{\mathrm{SPN}}^{\Box}(A,\sigma)}$

$\mu_{\mathrm{CW}}^{\Box}(D)=\mu_{\mathrm{CW}}^{\Box}(D^{\prime})$

$\mathcal{R}(b,v)=\frac{1}{d}\sum_{k=1}^{p}\frac{v_{k}}{1-v_{k}b},\ b\in\mathbb{H}^{-}(\mathbb{C})$

$\mathcal{R}(b,v)=\mathcal{R}(b,v^{\prime}),\ b\in\mathbb{H}^{-}(\mathbb{C})$

$v_{\pi(k)}=v_{k}^{\prime},\ k=1,\dots,p$

$\mathfrak{D}_{2}=\left\{\begin{bmatrix}z_{1}I_{d}&0\\ 0&z_{2}I_{p}\end{bmatrix}\,\middle|\,z_{1},z_{2}\in\mathbb{C}\right\}\subseteq M_{p+d}(\mathbb{C})\subseteq M_{p+d}(\mathfrak{A})$

$\begin{bmatrix}z_{1}I_{d}&0\\ 0&z_{2}I_{p}\end{bmatrix}\mapsto(z_{1},z_{2})$

Taxonomy

Topics: Random Matrices and Applications · Bayesian Methods and Mixture Models · Advanced Combinatorial Mathematics

Full text

Identifiability of Parametric Random Matrix Models

Tomohiro Hayase

Graduate School of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo, 153-8914, Japan

Key words and phrases:

Identifiability, Random Matrix Theory, Free Probability Theory, Statistical Models

1 Introduction
2 Related Works
3 Preliminary
3.1 Freeness
3.2 Random Matrix Models and Free Deterministic Equivalents
4 Identifiability
4.1 Identifiability of CW Model
4.2 Identifiability of SPN Model
4.2.1 Analytic Part
4.2.2 Combinatorial Part
4.2.3 Free Poisson Distribution
4.2.4 Second Lemma
4.2.5 Proof of Identifiability
5 Acknowledgement

1. Introduction

Identifiability analysis is fundamental to the theoretical understanding of statistical models and of estimation procedures such as log-likelihood maximization. A parametric statistical model $(P_{\vartheta})_{\vartheta\in\Theta}$, that is, a parametric family of probability measures, is said to be identifiable if the map $\vartheta\mapsto P_{\vartheta}$ is injective. For a statistical model, identifiability is necessary for regularity. Under regularity conditions, the maximum likelihood estimator behaves well; for example, it is asymptotically normal. In general, the geometry of the log-likelihood is determined by the Fisher information matrix (see [2]), which is the expected Hessian of the log-likelihood with respect to the parameters. If a statistical model is non-identifiable, then the Fisher information matrix is singular, and the eigenspace for the zero eigenvalue is determined by the non-identifiable parameters. Therefore, determining the non-identifiable parameters is important in non-identifiable models.

In this paper, we investigate the identifiability of statistical models introduced for parameter estimation of random matrix models. In [8], two typical random matrix models, the compound Wishart model $W_{\mathrm{CW}}$ and the signal-plus-noise model $W_{\mathrm{SPN}}$, are treated. They are defined as follows:

$$W_{\mathrm{CW}}(D)=Z^{*}DZ,\qquad W_{\mathrm{SPN}}(A,\sigma)=(A+\sigma Z)^{*}(A+\sigma Z),$$

where $Z$ is a $p\times d$ matrix of independent and identically distributed Gaussian random variables with mean zero and variance $1/d$, $D$ and $A$ are deterministic matrices, and $\sigma\in\mathbb{R}$. For any self-adjoint matrix $W$, let us denote by $\mu_{W}$ the eigenvalue distribution defined as

$$\mu_{W}=\frac{1}{d}\sum_{k=1}^{d}\delta_{\lambda_{k}},$$

where $\lambda_{1}\leq\lambda_{2}\leq\dots\leq\lambda_{d}$ are the eigenvalues of $W$. The parameter estimation method introduced in [8] minimizes a modified KL-divergence between a statistical model

$$\mu_{\mathrm{CW}}^{\Box}(D),\ D\in M_{p}(\mathbb{C})\quad(\text{resp. }\mu_{\mathrm{SPN}}^{\Box}(A,\sigma),\ A\in M_{p,d}(\mathbb{C}),\ \sigma\in\mathbb{R})$$

and a sample of the empirical eigenvalue distribution $\mu_{W_{\mathrm{CW}}(D_{0})}$ (resp. $\mu_{W_{\mathrm{SPN}}(A_{0},\sigma_{0})}$), where the true parameters $D_{0}$, $A_{0}$, $\sigma_{0}$ are unknown. The definition of the statistical models $\mu^{\Box}_{x}$ is based on the free deterministic equivalent. The free deterministic equivalent, introduced in [14], is a deterministic and infinite-dimensional approximation of random matrices based on a central limit theorem for the eigenvalue distribution.

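It directly follows from the definition of $\mu_{x}^{\Box}$ that

$$\mu_{D}=\mu_{D^{\prime}}\ \Longrightarrow\ \mu_{\mathrm{CW}}^{\Box}(D)=\mu_{\mathrm{CW}}^{\Box}(D^{\prime}),\qquad\mu_{A}=\mu_{A^{\prime}},\ \sigma^{2}=\sigma^{\prime 2}\ \Longrightarrow\ \mu_{\mathrm{SPN}}^{\Box}(A,\sigma)=\mu_{\mathrm{SPN}}^{\Box}(A^{\prime},\sigma^{\prime}).$$

In particular, these statistical models are not identifiable. For the CW model, it is easy to show that the converse also holds:

$$\mu_{\mathrm{CW}}^{\Box}(D)=\mu_{\mathrm{CW}}^{\Box}(D^{\prime})\ \Longrightarrow\ \mu_{D}=\mu_{D^{\prime}}.$$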

In other words, if we replace the parameter set by the set of eigenvalue distributions, then this model becomes identifiable. Note that there is a bijection between the set of eigenvalue distributions and

$$\{v\in\mathbb{R}^{p}\mid v_{k}=v_{\pi(k)},\ \forall k=1,\dots,p,\ \forall\pi\in S_{p}\},$$

where $S_{p}$ is the permutation group of $p$ elements. However, it is not clear whether the converse holds for the SPN model.

The main theorem of this paper is as follows.

Theorem 1.1.

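Let $p,d\in\mathbb{N}$ with $p\geq d$. For $A,B\in M_{p,d}(\mathbb{C})$ and $\sigma,\rho\in\mathbb{R}$, the following holds:

$$\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)=\mu_{\mathrm{SPN}}^{\Box}(B,\rho)\iff\begin{cases}\mu_{A^{*}A}=\mu_{B^{*}B},\\ \sigma^{2}=\rho^{2}.\end{cases}$$

In particular, if we replace the parameter space by the direct product of the set of singular value distributions and the nonnegative real numbers, then this statistical model becomes identifiable. Note that there is a bijection between the direct product and

$$\{v\in\mathbb{R}^{d}\mid v_{\pi(k)}=v_{k}\geq 0,\ \forall k=1,\dots,d,\ \forall\pi\in S_{d}\}\times\{v\in\mathbb{R}\mid v\geq 0\}.$$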
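Theorem 1.1 says that the FDESPN distribution depends on $(A,\sigma)$ only through the singular values of $A$ and through $\sigma^{2}$. The following NumPy sketch illustrates that quotient map on the parameter side only; the function name `canonical_spn_parameter` is hypothetical and does not come from the paper.

```python
import numpy as np

def canonical_spn_parameter(A, sigma):
    """Return (sorted singular values of A, |sigma|).

    Illustrative representative of the class of (A, sigma) that
    Theorem 1.1 cannot distinguish; the name is an assumption.
    """
    singular_values = np.linalg.svd(A, compute_uv=False)  # length min(p, d)
    return np.sort(singular_values), abs(sigma)

rng = np.random.default_rng(0)
p, d = 5, 3
A = rng.standard_normal((p, d)) + 1j * rng.standard_normal((p, d))

# Random unitary matrices from QR decompositions of complex Gaussian matrices.
U, _ = np.linalg.qr(rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p)))
V, _ = np.linalg.qr(rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d)))

s1, t1 = canonical_spn_parameter(A, 0.7)
s2, t2 = canonical_spn_parameter(U @ A @ V, -0.7)  # rotated A, sign-flipped sigma
same_class = bool(np.allclose(s1, s2) and t1 == t2)
```

Here `same_class` records that $(A,\sigma)$ and $(UAV,-\sigma)$ share the same singular values and the same $\sigma^{2}$, which is the "rotation of parameter space" under which the model cannot distinguish parameters.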

Our proof consists of an analytic part based on operator-valued analytic free additive subordination [3] and a combinatorial part based on free multiplicative deconvolution [11, 12].

2. Related Works

The compound Wishart random matrix was introduced in [13]. It appears as the sample covariance matrix of correlated samples [4, 5, 7]. The signal-plus-noise random matrix appears in signal processing [11, 6, 15].

Free probability was invented by Voiculescu [16]. In free probability theory, which was motivated by a problem in operator algebras, some infinite-dimensional operators are described as infinite-dimensional limits of random matrices. The approximation is based on a central limit theorem, called the free central limit theorem, for the eigenvalue distribution of random matrices [17]. Conversely, the purpose of the free deterministic equivalent is to approximate fixed-size but large random matrix models by deterministic operators.

For analysis of non-identifiable models, generic identifiability was introduced in [1].

3. Preliminary

3.1. Freeness

First, we summarize some definitions from operator algebras and free probability theory. See [9] for details.

Definition 3.1.

(1)

A C∗-probability space is a pair $(\mathfrak{A},\tau)$ satisfying the following.

(a)

The set $\mathfrak{A}$ is a unital $C^{*}$-algebra, that is, a possibly non-commutative subalgebra of the algebra $B(\mathcal{H})$ of bounded $\mathbb{C}$-linear operators on a Hilbert space $\mathcal{H}$ over $\mathbb{C}$ satisfying the following conditions:

(i)

it is stable under the adjoint $*\colon a\mapsto a^{*}$, $a\in\mathfrak{A}$,

(ii)

it is closed in the operator norm topology of $B(\mathcal{H})$,

(iii)

it contains the identity operator $\operatorname{id}_{\mathcal{H}}$ as the unit $1_{\mathfrak{A}}$ of $\mathfrak{A}$.

(b)

The function $\tau$ on $\mathfrak{A}$ is a faithful tracial state, that is, a $\mathbb{C}$-valued linear functional with

(i)

$\tau(a)\geq 0$ for any $a\geq 0$, and equality holds if and only if $a=0$,

(ii)

$\tau(1_{\mathfrak{A}})=1$,

(iii)

$\tau(ab)=\tau(ba)$ for any $a,b\in\mathfrak{A}$.

(2)

A subalgebra $\mathfrak{B}$ of a C∗-algebra $\mathfrak{A}$ is called a $*$-subalgebra if it is stable under the adjoint operation $*$. Moreover, it is called a unital C∗-subalgebra if the $*$-subalgebra is closed in the operator norm topology and contains $1_{\mathfrak{A}}$ as its unit.

(3)

Two unital $C^{*}$-algebras are called $*$-isomorphic if there is a bijective linear map between them which preserves the $*$-operation and the multiplication.

(4)

Let us denote by $\mathfrak{A}_{\mathrm{s.a.}}$ the set of self-adjoint elements of $\mathfrak{A}$, that is, elements with $a=a^{*}$.

(5)

Write $\operatorname{Re}a:=(a+a^{*})/2$ and $\operatorname{Im}a:=(a-a^{*})/{2i}$ for any $a\in\mathfrak{A}$.

(6)

The distribution of $a\in\mathfrak{A}_{\mathrm{s.a.}}$ is the probability measure $\mu_{a}\in\mathcal{B}_{c}(\mathbb{R})$ determined by

$$\int x^{k}\,\mu_{a}(dx)=\tau(a^{k}),\quad k\in\mathbb{N}.$$

(7)

For $a\in\mathfrak{A}_{\mathrm{s.a.}}$, we define its Cauchy transform $G_{a}$ by $G_{a}(z):=\tau[(z-a)^{-1}]\ (z\in\mathbb{C}\setminus\mathbb{R})$; equivalently, $G_{a}:=G_{\mu_{a}}$.
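As a finite-dimensional illustration of items (6) and (7), one can take $\mathfrak{A}=M_{n}(\mathbb{C})$ with the normalized trace as $\tau$. Then $\mu_{a}$ is the eigenvalue distribution of a Hermitian matrix $a$. The sketch below checks this numerically; the helper names are hypothetical, and the choice of the matrix algebra is an assumption made only for the example.

```python
import numpy as np

def normalized_trace(m):
    """tau(m) = tr(m) / n on M_n(C)."""
    return np.trace(m) / m.shape[0]

def moment(a, k):
    """k-th moment tau(a^k) of the distribution of a."""
    return normalized_trace(np.linalg.matrix_power(a, k))

def cauchy_transform(a, z):
    """G_a(z) = tau[(z - a)^{-1}] for z off the real axis."""
    n = a.shape[0]
    return normalized_trace(np.linalg.inv(z * np.eye(n) - a))

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 4))
a = (x + x.T) / 2                      # real symmetric, hence self-adjoint
eigs = np.linalg.eigvalsh(a)
z = 0.3 + 1.0j

# Both quantities can also be read off from the eigenvalues.
moment_gap = abs(moment(a, 3) - np.mean(eigs ** 3))
cauchy_gap = abs(cauchy_transform(a, z) - np.mean(1.0 / (z - eigs)))
```

Both gaps vanish up to rounding, and $G_{a}$ maps the upper half-plane into the lower half-plane, in line with the operator-valued version used later.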

Definition 3.2.

A family of $*$-subalgebras $(\mathfrak{A}_{j})_{j\in J}$ of $\mathfrak{A}$ is said to be free if the following factorization rule holds: for any $n\in\mathbb{N}$ and indices $j_{1},j_{2},\dots,j_{n}\in J$ with $j_{1}\neq j_{2}\neq j_{3}\neq\cdots\neq j_{n}$, and $a_{l}\in\mathfrak{A}_{j_{l}}$ with $\tau(a_{l})=0$ $(l=1,\dots,n)$, it holds that

$$\tau(a_{1}\cdots a_{n})=0.$$

Let $(x_{j})_{j\in J}$ be a family of self-adjoint elements $x_{j}\in\mathfrak{A}_{\mathrm{s.a.}}$. For $j\in J$, let $\mathfrak{A}_{j}$ be the $*$-subalgebra of polynomials in $x_{j}$. Then $(x_{j})_{j\in J}$ is said to be free if the family $(\mathfrak{A}_{j})_{j\in J}$ is free.

We introduce special elements in a non-commutative probability space.

Definition 3.3.

Let $(\mathfrak{A},\tau)$ be a C∗-probability space.

(1)

An element $s\in\mathfrak{A}_{\mathrm{s.a.}}$ is called standard semicircular if its distribution is given by the standard semicircular law

$$\mu_{s}(dx)=\frac{\sqrt{4-x^{2}}}{2\pi}\,{\bf 1}_{[-2,2]}(x)\,dx,$$

where ${\bf 1}_{S}$ is the indicator function of a subset $S\subseteq\mathbb{R}$.

(2)

Let $v>0$. An element $c\in\mathfrak{A}$ is called circular of variance $v$ if

$$c=\sqrt{v}\,\frac{s_{1}+is_{2}}{\sqrt{2}},$$

where $(s_{1},s_{2})$ is a pair of free standard semicircular elements. In addition, $c$ is called a standard circular element if $v=1$.

(3)

A $*$-free circular family (resp. standard $*$-free circular family) is a family $\{c_{j}\mid j\in J\}$ of circular elements $c_{j}\in\mathfrak{A}$ such that $\bigcup_{j\in J}\{\operatorname{Re}c_{j},\operatorname{Im}c_{j}\}$ is free (resp. and each element is of variance $1$).
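A quick numerical check of the standard semicircular law: its density $\sqrt{4-x^{2}}/(2\pi)$ on $[-2,2]$ has total mass $1$, vanishing odd moments, and even moments given by the Catalan numbers $1,2,5,\dots$ (a standard fact, stated here as background rather than taken from the paper). The sketch uses a plain midpoint rule; the function name is hypothetical.

```python
import numpy as np

def semicircle_moment(k, n_points=200_000):
    """Approximate the k-th moment of the standard semicircular law
    with a midpoint rule on [-2, 2]."""
    h = 4.0 / n_points
    x = -2.0 + h * (np.arange(n_points) + 0.5)     # midpoints of the cells
    density = np.sqrt(4.0 - x ** 2) / (2.0 * np.pi)
    return float(np.sum(x ** k * density) * h)

# moments[k] approximates the k-th moment for k = 0, ..., 6
moments = [semicircle_moment(k) for k in range(7)]
```

In particular the second moment is $1$, which is the normalization behind the name "standard".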

Definition 3.4.

Let $(\mathfrak{A},\tau)$ be a C∗-probability space and $\mathfrak{B}$ be a unital C∗-subalgebra of $\mathfrak{A}$ . Recall that they share the unit: $I_{\mathfrak{A}}=I_{\mathfrak{B}}$ .

(1)

A linear operator $E\colon\mathfrak{A}\to\mathfrak{B}$ is called a conditional expectation onto $\mathfrak{B}$ if it satisfies the following conditions:

(a)

$E[b]=b$ for any $b\in\mathfrak{B}$,

(b)

$E[b_{1}ab_{2}]=b_{1}E[a]b_{2}$ for any $a\in\mathfrak{A}$ and $b_{1},b_{2}\in\mathfrak{B}$,

(c)

$E[a^{*}]=E[a]^{*}$ for any $a\in\mathfrak{A}$.

(2)

We write $\mathbb{H}^{+}(\mathfrak{B}):=\{W\in\mathfrak{B}\mid\operatorname{Im}W\geq\varepsilon I_{\mathfrak{A}}\text{ for some }\varepsilon>0\}$ and $\mathbb{H}^{-}(\mathfrak{B}):=-\mathbb{H}^{+}(\mathfrak{B})$.

(3)

Let $E\colon\mathfrak{A}\to\mathfrak{B}$ be a conditional expectation. For $a\in\mathfrak{A}_{\mathrm{s.a.}}$, we define the $E$-Cauchy transform as the map $G_{a}^{E}\colon\mathbb{H}^{+}(\mathfrak{B})\to\mathbb{H}^{-}(\mathfrak{B})$, where

$$G_{a}^{E}(Z):=E[(Z-a)^{-1}],\quad Z\in\mathbb{H}^{+}(\mathfrak{B}).$$

If there is no confusion, we also call $G_{a}^{E}$ a $\mathfrak{B}$-valued Cauchy transform.

Definition 3.5.

(Operator-valued Freeness) Let $(\mathfrak{A},\tau)$ be a C∗-probability space, and $E\colon\mathfrak{A}\to\mathfrak{B}$ be a conditional expectation. Let $(\mathfrak{B}_{j})_{j\in J}$ be a family of $*$-subalgebras of $\mathfrak{A}$ such that $\mathfrak{B}\subseteq\mathfrak{B}_{j}$. Then $(\mathfrak{B}_{j})_{j\in J}$ is said to be $E$-free if the following factorization rule holds: for any $n\in\mathbb{N}$ and indices $j_{1},j_{2},\dots,j_{n}\in J$ with $j_{1}\neq j_{2}\neq j_{3}\neq\cdots\neq j_{n}$, and $a_{l}\in\mathfrak{B}_{j_{l}}$ with $E(a_{l})=0$ $(l=1,\dots,n)$, it holds that

$$E(a_{1}\cdots a_{n})=0.$$

In addition, a family of elements $X_{j}\in\mathfrak{A}_{\mathrm{s.a.}}\ (j\in J)$ is called $E$-free if the family of $*$-subalgebras of $\mathfrak{B}$-coefficient polynomials in $X_{j}$ is $E$-free.

3.2. Random Matrix Models and Free Deterministic Equivalents

Definition 3.6.

Fix a probability space $(\Omega,\mathfrak{F},\mathbb{P})$. Write $\mathbb{E}[\cdot]=\int\ \cdot\ \mathbb{P}(d\omega)$. Let $p,d\in\mathbb{N}$. A real (resp. complex) $p\times d$ Ginibre random matrix of variance $v>0$ is defined as a $p\times d$ matrix of independent and identically distributed real (resp. complex) Gaussian random variables $Z_{ij}$ $(i=1,\dots,p,\ j=1,\dots,d)$ such that

$$\mathbb{E}[Z_{ij}]=0,\qquad\mathbb{E}[\overline{Z_{ij}}Z_{ij}]=v.$$

Definition 3.7.

Let $\mathbb{K}=\mathbb{R}$ (resp. $\mathbb{K}=\mathbb{C}$ ). Let us denote by $Z$ the real (resp. complex) $p\times d$ Ginibre random matrix of variance $1/d$ .

(1)

A real (resp. complex) compound Wishart model (CW model for short) of type $(p,d)$ is defined as a parametric family $W_{\mathrm{CW}}$, where

$$W_{\mathrm{CW}}(D):=Z^{*}DZ,\quad D\in M_{p}(\mathbb{K}).$$

(2)

A real (resp. complex) signal-plus-noise model (SPN model for short) of type $(p,d)$ is defined as a parametric family $W_{\mathrm{SPN}}$, where

$$W_{\mathrm{SPN}}(A,\sigma):=(A+\sigma Z)^{*}(A+\sigma Z),\quad A\in M_{p,d}(\mathbb{K}),\ \sigma\in\mathbb{R}.$$
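The two random matrix models can be sampled directly from Definitions 3.6 and 3.7. The sketch below is illustrative only: the function names are hypothetical, and in the complex case it assumes that the real and imaginary parts of each entry are independent Gaussians that each carry half of the variance, which is one common convention compatible with $\mathbb{E}[\overline{Z_{ij}}Z_{ij}]=v$.

```python
import numpy as np

def ginibre(p, d, variance, rng, complex_entries=True):
    """Sample a p x d Ginibre matrix with E[Z_ij] = 0 and E[|Z_ij|^2] = variance."""
    if complex_entries:
        # Assumption: real and imaginary parts each carry variance / 2.
        z = rng.standard_normal((p, d)) + 1j * rng.standard_normal((p, d))
        return np.sqrt(variance / 2.0) * z
    return np.sqrt(variance) * rng.standard_normal((p, d))

def compound_wishart(D, Z):
    """W_CW(D) = Z^* D Z, a d x d matrix."""
    return Z.conj().T @ D @ Z

def signal_plus_noise(A, sigma, Z):
    """W_SPN(A, sigma) = (A + sigma Z)^* (A + sigma Z), a d x d matrix."""
    Y = A + sigma * Z
    return Y.conj().T @ Y

rng = np.random.default_rng(2)
p, d = 6, 4
Z = ginibre(p, d, 1.0 / d, rng)            # variance 1/d as in Definition 3.7
D = np.diag(np.arange(1.0, p + 1.0))       # a self-adjoint parameter for the CW model
A = np.eye(p, d)                           # a simple p x d signal matrix

W_cw = compound_wishart(D, Z)
W_spn = signal_plus_noise(A, 0.5, Z)
spn_eigs = np.linalg.eigvalsh(W_spn)       # eigenvalues defining the empirical mu_W
```

Both outputs are $d\times d$ self-adjoint matrices, and setting $\sigma=0$ in the SPN model returns $A^{*}A$ exactly.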

Here we introduce the free deterministic equivalent of each random matrix model. Note that the free deterministic equivalent does not depend on the choice of the field $\mathbb{R}$ or $\mathbb{C}$.

Definition 3.8.

Let $p,d\in\mathbb{N}$. Fix a C∗-probability space $(\mathfrak{A},\tau)$. Let us denote by $C$ the $p\times d$ matrix of $*$-free circular elements in $(\mathfrak{A},\tau)$ such that

$$\tau(C_{ij})=0,\qquad\tau(C_{ij}^{*}C_{ij})=1/d.$$

(1)

The free deterministic equivalent of the CW model (FDECW model, for short) of type $(p,d)$ is defined as a parametric family $W^{\Box}_{\mathrm{CW}}$, where

$$W_{\mathrm{CW}}^{\Box}(D)=C^{*}DC,\quad D\in M_{p}(\mathbb{C}).$$

In addition, we denote by $\mu^{\Box}_{\mathrm{CW}}(D)$ the distribution of $W^{\Box}_{\mathrm{CW}}(D)$ in the C∗-probability space $(M_{d}(\mathfrak{A}),\mathrm{tr}_{d}\otimes\tau)$:

$$\mu_{\mathrm{CW}}^{\Box}(D)=\mu_{W_{\mathrm{CW}}^{\Box}(D)}.$$

(2)

The free deterministic equivalent of the SPN model (FDESPN model, for short) of type $(p,d)$ is defined as a parametric family $W^{\Box}_{\mathrm{SPN}}$, where

$$W_{\mathrm{SPN}}^{\Box}(A,\sigma)=(A+\sigma C)^{*}(A+\sigma C),\quad A\in M_{p,d}(\mathbb{C}),\ \sigma\in\mathbb{R}.$$

In addition, we denote by $\mu^{\Box}_{\mathrm{SPN}}(A,\sigma)$ the distribution of $W^{\Box}_{\mathrm{SPN}}(A,\sigma)$ in the C∗-probability space $(M_{d}(\mathfrak{A}),\mathrm{tr}_{d}\otimes\tau)$, that is,

$$\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)=\mu_{W_{\mathrm{SPN}}^{\Box}(A,\sigma)}.$$

4. Identifiability

4.1. Identifiability of CW Model

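First, we quickly check the identifiability of the CW model. Fix $p,d\in\mathbb{N}$. Let $D,D^{\prime}\in M_{p}(\mathbb{C})_{\mathrm{s.a.}}$ and let $v=(v_{1}\leq v_{2}\leq\dots\leq v_{p})$, $v^{\prime}=(v^{\prime}_{1}\leq\dots\leq v^{\prime}_{p})\in\mathbb{R}^{p}$ be the vectors of eigenvalues of $D$ and $D^{\prime}$, respectively. Assume that

$$\mu_{\mathrm{CW}}^{\Box}(D)=\mu_{\mathrm{CW}}^{\Box}(D^{\prime}).\tag{4.1}$$

Since $\mu_{\mathrm{CW}}^{\Box}(D)$ is a compound free Poisson law (see [10]), the $\mathcal{R}$-transform of $\mu_{\mathrm{CW}}^{\Box}(D)$ is given by

$$\mathcal{R}(b,v)=\frac{1}{d}\sum_{k=1}^{p}\frac{v_{k}}{1-v_{k}b},\quad b\in\mathbb{H}^{-}(\mathbb{C}).$$

By the assumption (4.1), it holds that

$$\mathcal{R}(b,v)=\mathcal{R}(b,v^{\prime}),\quad b\in\mathbb{H}^{-}(\mathbb{C}).$$

The poles of $\mathcal{R}(\cdot,v)$ lie at the points $1/v_{k}$ with $v_{k}\neq 0$, and the residue at each pole records the multiplicity of the corresponding eigenvalue. Since all poles of $\mathcal{R}(\cdot,v)$ are of order one, $v$ and $v^{\prime}$ are equal up to a permutation of entries, that is, there is a permutation $\pi\in S_{p}$ such that

$$v_{\pi(k)}=v_{k}^{\prime},\quad k=1,\dots,p.$$

Equivalently, we have

$$\mu_{D}=\mu_{D^{\prime}}.$$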
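The argument above rests on the rational function $\mathcal{R}(b,v)=\frac{1}{d}\sum_{k=1}^{p}\frac{v_{k}}{1-v_{k}b}$, which is symmetric in the entries of $v$ and changes when the multiset of eigenvalues changes. A small numerical sketch of both facts follows; the function name and the sample vectors are hypothetical.

```python
import numpy as np

def r_transform_cw(b, v, d):
    """Evaluate R(b, v) = (1/d) * sum_k v_k / (1 - v_k b)."""
    v = np.asarray(v, dtype=float)
    return np.sum(v / (1.0 - v * b)) / d

d = 3
v = np.array([0.5, 1.0, 2.0, 2.0])          # eigenvalues of D (p = 4)
v_perm = v[[2, 0, 3, 1]]                    # same eigenvalues, permuted
v_other = np.array([0.5, 1.0, 1.5, 2.5])    # same sum, different eigenvalues
b = 0.2 - 0.7j                              # a point in the lower half-plane

same = bool(np.isclose(r_transform_cw(b, v, d), r_transform_cw(b, v_perm, d)))
differs = not np.isclose(r_transform_cw(b, v, d), r_transform_cw(b, v_other, d))
```

Evaluating at a single point is of course only an illustration; the proof uses equality of the two functions on all of $\mathbb{H}^{-}(\mathbb{C})$.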

4.2. Identifiability of SPN Model

Next, we work on the SPN model. We prove the following identifiability of the statistical model $\mu^{\Box}_{\mathrm{SPN}}$ for the random matrix model $W_{\mathrm{SPN}}$ . The proof is divided into an analytic part and a combinatorial one.

Theorem 4.1.

Let $p,d\in\mathbb{N}$ with $p\geq d$ , $A,B\in M_{p,d}(\mathbb{C})$ , and $\sigma,\rho\in\mathbb{R}$ . Then ${\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)}={\mu_{\mathrm{SPN}}^{\Box}(B,\rho)}$ if and only if $\mu_{A^{*}A}=\mu_{B^{*}B}$ and $\sigma^{2}=\rho^{2}$ .

The proof is postponed to Section 4.2.5.

4.2.1. Analytic Part

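Write

$$\mathfrak{D}_{2}=\left\{\begin{bmatrix}z_{1}I_{d}&0\\ 0&z_{2}I_{p}\end{bmatrix}\,\middle|\,z_{1},z_{2}\in\mathbb{C}\right\}\subseteq M_{p+d}(\mathbb{C})\subseteq M_{p+d}(\mathfrak{A}).$$

We identify $\mathfrak{D}_{2}$ and $\mathbb{C}^{2}$ via the following isomorphism $\mathfrak{D}_{2}\simeq\mathbb{C}^{2}$:

$$\begin{bmatrix}z_{1}I_{d}&0\\ 0&z_{2}I_{p}\end{bmatrix}\mapsto(z_{1},z_{2}).$$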

We define a conditional expectation $E\colon M_{p+d}(\mathfrak{A})\to\mathbb{C}^{2}$ by

[TABLE]

where $X_{+,+}\in M_{d}(\mathfrak{A})$ is the $d\times d$ upper-left corner of $X\in M_{p+d}(\mathfrak{A})$ and $X_{-,-}\in M_{p}(\mathfrak{A})$ is the $p\times p$ lower-right corner of $X$. For $X\in M_{p+d}(\mathfrak{A})$ and $z\in\mathbb{H}^{+}(\mathbb{C}^{2})=\{(z_{1},z_{2})\in\mathbb{C}^{2}\mid\operatorname{Im}z_{1},\operatorname{Im}z_{2}>0\}$, we write
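To make the $\mathbb{C}^{2}$-valued conditional expectation concrete, the sketch below works in the scalar case $\mathfrak{A}=\mathbb{C}$ and assumes that $E$ sends $X$ to the pair of normalized traces of its two diagonal blocks, $(\mathrm{tr}_{d}(X_{+,+}),\mathrm{tr}_{p}(X_{-,-}))$. This form is an assumption for illustration, since the defining formula is not reproduced here; the code then checks the three properties of Definition 3.4 numerically. All names are hypothetical.

```python
import numpy as np

def embed(z1, z2, d, p):
    """The element z1 * I_d (+) z2 * I_p of D_2 inside M_{p+d}(C)."""
    return np.diag(np.concatenate([np.full(d, z1), np.full(p, z2)]))

def cond_exp(X, d, p):
    """Assumed form of E: normalized traces of the two diagonal blocks."""
    upper_left = X[:d, :d]       # X_{+,+}
    lower_right = X[d:, d:]      # X_{-,-}
    return np.trace(upper_left) / d, np.trace(lower_right) / p

rng = np.random.default_rng(3)
d, p = 2, 3
X = rng.standard_normal((p + d, p + d)) + 1j * rng.standard_normal((p + d, p + d))
b1 = embed(1 + 2j, -0.5j, d, p)
b2 = embed(0.3, 2 - 1j, d, p)

# Bimodule property E[b1 X b2] = b1 E[X] b2, written componentwise in C^2.
lhs = cond_exp(b1 @ X @ b2, d, p)
e = cond_exp(X, d, p)
rhs = ((1 + 2j) * e[0] * 0.3, (-0.5j) * e[1] * (2 - 1j))
```

The checks in this sketch cover $E[b]=b$ on $\mathfrak{D}_{2}$, the bimodule property, and compatibility with the adjoint.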

[TABLE]

For any rectangular matrix $Y\in M_{p,d}(\mathfrak{A})$ , write

[TABLE]

Let $z=(\alpha,\beta)\in\mathbb{C}^{2}$ . Then we have

[TABLE]

Applying $E$ , we have

[TABLE]

In particular, $G_{\Lambda(Y)}$ is determined by $\mu_{Y^{*}Y}$ . Let $C\in M_{p,d}(\mathfrak{A})$ be a matrix of $*$ -free standard circular elements. By [8, Proposition 5.30], $\Lambda(C)$ is a $\mathbb{C}^{2}$ -valued semicircular element (see [9, Section 9.1] for the definition) with the following variance mapping $\eta\colon\mathbb{C}^{2}\to\mathbb{C}^{2}$ :

[TABLE]

Hence the following equations hold for any $z\in\mathbb{H}^{+}(\mathbb{C}^{2})$ :

[TABLE]

Next, to prove a key lemma, we refer to an analytic free additive subordination formula based on [3].

Corollary 4.2.

Set $a:=\Lambda(A)$ and $s:=\sigma\Lambda(C)$. Then there exists a pair of Fréchet analytic (equivalently, holomorphic) mappings $\psi_{1},\psi_{2}\in\mathrm{Hol}(\mathbb{H}^{+}(\mathbb{C}^{2}))$ such that for all $z\in\mathbb{H}^{+}(\mathbb{C}^{2})$,

[TABLE]

Proof.

By [8, Proposition 5.30], the pair $(a,s)$ is $E$ -free. Then the assertion follows from [3, Theorem 2.7]. ∎

Lemma 4.3.

Let $p,d\in\mathbb{N}$ with $p\geq d$ . Let $A\in M_{p,d}(\mathbb{C})$ and $\sigma\in\mathbb{R}$ . Then we have the following equation between holomorphic mappings on $\mathbb{H}^{+}(\mathbb{C}^{2})$ :

[TABLE]

Proof.

Set $a:=\Lambda(A)$ and $s:=\Lambda(C)$. Pick the same holomorphic mappings $\psi_{1}$ and $\psi_{2}$ as in Corollary 4.2. Then for any $z\in\mathbb{H}^{+}(\mathbb{C}^{2})$,

[TABLE]

∎

We are now prepared to prove the first key lemma.

Lemma 4.4.

Fix $p,d\in\mathbb{N}$ with $p\geq d$ . Let $A,B\in M_{p,d}(\mathbb{C})$ and $\sigma\in\mathbb{R}$ . If ${\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)}={\mu_{\mathrm{SPN}}^{\Box}(B,0)}$ then $\sigma=0$ .

Proof.

Assume that ${\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)}={\mu_{\mathrm{SPN}}^{\Box}(B,0)}$ . Then $G_{\Lambda(A+\sigma C)}=G_{\Lambda(B)}$ since $G_{\Lambda(Y)}$ is determined by $\mu_{Y^{*}Y}$ for any $Y\in M_{p,d}(\mathfrak{A})$ .

In the case $B=0$ , it holds that $(A+\sigma C)^{*}(A+\sigma C)=0$ . Thus $A=-\sigma C$ and $A^{*}A=\sigma^{2}C^{*}C$ . Since $\mu_{C^{*}C}$ has no atom and $\mu_{A^{*}A}$ is a sum of delta measures, we have $\sigma=0$ .

Consider the case $B\neq 0$. Write $\beta:=\|B^{*}B\|^{1/2}>0$. Now for any $z\in\mathbb{H}^{+}(\mathbb{C}^{2})$, by the assumption and Lemma 4.3, the following holds:

[TABLE]

Let

[TABLE]

Then

[TABLE]

where $m\geq 1$ is the multiplicity of the eigenvalue $\beta$ of $\sqrt{B^{*}B}$ . Let $a_{1}\leq\dots\leq a_{d}$ be eigenvalues of $\sqrt{A^{*}A}$ . Then

[TABLE]

Now for any $k=1,\dots,d$ and $j=1,2$ ,

[TABLE]

Let $\gamma>0$ and $z_{1}=z_{2}=\beta+i\gamma$ . Then (4.32) converges to [math] as $\gamma\to+0$ by (4.30).

Assume that $\sigma\neq 0$ , then by $\eqref{align:rational}$ , it holds that

[TABLE]

In particular,

[TABLE]

By (4.27), this contradicts $\eqref{align:degree-one}$ . Therefore $\sigma=0$ . ∎

4.2.2. Combinatorial Part

We use the free multiplicative deconvolution introduced by [12, 11]. We quickly review the deconvolution.

First, we introduce a family of formal power series, since the deconvolution is defined as an operation between moment power series. Let us denote by $\Xi$ the set of formal power series without the constant term of the form

[TABLE]

with $\alpha_{n}\in\mathbb{C}(\forall n\in\mathbb{N})$ . Let $f\in\Xi$ be as in (4.35). For every $n\in\mathbb{N}$ we denote

[TABLE]

Second, we introduce the Kreweras complement and the boxed convolution. Here we only need the one-dimensional boxed convolution. See [10, Lectures 17, 18] for details. Let $n\in\mathbb{N}$ and $\pi\in\mathrm{NC}(n)$, the set of non-crossing partitions of $\{1,2,\dots,n\}$. Write $[n]=\{1,2,\dots,n\}$ and consider the disjoint union $[n]\coprod[n]$. We write the elements of the second copy as $\bar{k}\ (k\in[n])$, and write $[\bar{n}]=\{\bar{1},\bar{2},\dots,\bar{n}\}$. We define an order as follows:

[TABLE]

Then the set $[n]\coprod[n]$ is a totally ordered set. Let $\pi\in\mathrm{NC}(n)$ and

[TABLE]

Then $J$ has a largest element with respect to the following partial order on $\mathrm{NC}(n)$: for $\rho,\pi\in\mathrm{NC}(n)$, $\rho\leq\pi$ if for every $V\in\rho$ there exists $W\in\pi$ such that $V\subseteq W$. The Kreweras complement of $\pi$, denoted by $K(\pi)$, is defined as

[TABLE]
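For small $n$ the Kreweras complement can be computed by brute force. The sketch below relies on the standard permutation description from the free probability literature (an assumption here, not the interleaving definition used above): if $P_{\pi}$ is the permutation whose cycles are the blocks of $\pi$ in increasing order and $c=(1\,2\,\cdots\,n)$, then the blocks of $K(\pi)$ are the cycles of $i\mapsto P_{\pi}^{-1}(c(i))$. All function names are hypothetical.

```python
from itertools import combinations

def set_partitions(elements):
    """Yield all partitions of a list as lists of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition

def is_noncrossing(partition):
    """True if no two blocks V, W interleave as a < b < c < d."""
    for V, W in combinations(partition, 2):
        for a, c in combinations(sorted(V), 2):
            inside = any(a < b < c for b in W)
            outside = any(b < a or b > c for b in W)
            if inside and outside:
                return False
    return True

def noncrossing_partitions(n):
    return [q for q in set_partitions(list(range(1, n + 1))) if is_noncrossing(q)]

def kreweras_complement(partition, n):
    """Blocks of K(pi), computed as cycles of i -> P_pi^{-1}(c(i))."""
    perm = {}
    for block in partition:
        b = sorted(block)
        for i, x in enumerate(b):
            perm[x] = b[(i + 1) % len(b)]      # cycle through the block
    inverse = {image: x for x, image in perm.items()}
    comp = {i: inverse[i % n + 1] for i in range(1, n + 1)}  # c(i) = i mod n + 1
    seen, blocks = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        block, x = [], start
        while x not in seen:
            seen.add(x)
            block.append(x)
            x = comp[x]
        blocks.append(sorted(block))
    return sorted(blocks)
```

For example, `kreweras_complement([[1, 2], [3]], 3)` yields the blocks `[1]` and `[2, 3]`, and the one-block partition is sent to the partition into singletons. The block counts of $\pi$ and $K(\pi)$ always add up to $n+1$, an identity used in the proof of Lemma 4.13.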

For $n\in\mathbb{N}$ and $\pi\in\mathrm{NC}(n)$, we denote

[TABLE]

where $|V|$ is the number of elements in $V$. For $f,g\in\Xi$, the one-dimensional boxed convolution (boxed convolution, for short), denoted by $f\ \boxed{\star}\ g$, is defined as

[TABLE]

where $K(\pi)$ is the Kreweras complement (4.39). The operation $\boxed{\star}$ is associative and commutative [10, Proposition 17.5, Corollary 17.10]. In addition, let us denote by $\Delta$ the series in $\Xi$ defined as

[TABLE]

Then $\Delta$ is the unit of $(\Xi,\boxed{\star})$ [10, Proposition 17.5]. We denote by $\Xi^{\times}$ the set of invertible elements in $\Xi$ with respect to $\boxed{\star}$. For $f\in\Xi^{\times}$, we denote by $f^{-1}$ its inverse with respect to $\boxed{\star}$. Then by [10, Proposition 17.7],

[TABLE]

Third, we define the Zeta function as

[TABLE]

Clearly $\mathrm{Zeta}\in\Xi^{\times}$ . Then we define the R-transform of formal power series.

Definition 4.5.

(R-transform) Let $f\in\Xi$ . Let us define the R-transform of $f$ as

[TABLE]

For any probability measure $\mu$ on $\mathbb{R}$ with all moments finite, we denote by $M_{\mu}$ its moment formal power series:

[TABLE]

Let $(\mathfrak{A},\varphi)$ be a C∗-probability space, and let $a$ be an element of $\mathfrak{A}$. The moment power series of $a$, denoted by $M_{a}$, is the formal power series defined as

[TABLE]

We simply write

[TABLE]

Usually, the R-transform of $a\in\mathfrak{A}$ is defined as the formal power series whose coefficients are the free cumulants (see [10]). The compatibility of our definition (4.49) with the usual definition is proven in [10, Proposition 17.4]. In addition, the following holds.
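Since the coefficients of the R-transform are the free cumulants, they can be computed from the moments. The sketch below uses the standard moment-cumulant recursion $m_{n}=\sum_{s=1}^{n}\kappa_{s}\,[z^{n-s}]M(z)^{s}$ with $M(z)=1+\sum_{k\geq 1}m_{k}z^{k}$, which is taken from the general theory in [10] rather than from the boxed-convolution definition above. The function name is hypothetical.

```python
def free_cumulants(moments):
    """Given moments [m_1, ..., m_N], return free cumulants [kappa_1, ..., kappa_N].

    Uses m_n = sum_{s=1}^{n} kappa_s * [z^(n-s)] M(z)^s, M(z) = 1 + sum_k m_k z^k.
    """
    N = len(moments)
    M = [1.0] + [float(m) for m in moments]   # coefficients of M(z) up to z^N

    def mul(f, g):
        # Product of two polynomials, truncated at degree N.
        h = [0.0] * (N + 1)
        for i, fi in enumerate(f):
            for j, gj in enumerate(g):
                if i + j <= N:
                    h[i + j] += fi * gj
        return h

    powers = [[1.0] + [0.0] * N]              # powers[s] holds M(z)^s
    for _ in range(N):
        powers.append(mul(powers[-1], M))

    kappa = []
    for n in range(1, N + 1):
        # Contribution of the already known cumulants kappa_1, ..., kappa_{n-1}.
        known = sum(kappa[s - 1] * powers[s][n - s] for s in range(1, n))
        kappa.append(M[n] - known)            # [z^0] M(z)^n = 1
    return kappa

semicircle = free_cumulants([0, 1, 0, 2, 0, 5])   # moments of the standard semicircular law
catalan = free_cumulants([1, 2, 5, 14])           # Catalan moments
```

For the semicircular moments only the second cumulant is nonzero, and for the Catalan moments every cumulant equals $1$.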

Lemma 4.6.

Let $(\mathfrak{A},\varphi)$ be a C∗-probability space and $a,b\in\mathfrak{A}$ . Assume that $(a,b)$ is free. Then

[TABLE]

Proof.

This is a direct consequence of [10, Proposition 17.2]. ∎

Lastly, note that it holds that for $f\in\Xi$ ,

[TABLE]

since $\mathrm{Cf}_{1}(R_{f})=\mathrm{Cf}_{1}(f)$. We are now prepared to define the free multiplicative deconvolution.

Definition 4.7.

(free multiplicative deconvolution) For $f\in\Xi$ and $g\in\Xi^{\times}$ , the free multiplicative deconvolution of $f$ with $g$ is defined as

[TABLE]

Equivalently, $f\ \boxed{\smallsetminus}\ g$ is the unique formal power series in $\Xi$ determined by

$$R_{f}=R_{g}\ \boxed{\star}\ R_{(f\ \boxed{\smallsetminus}\ g)}.$$

Example 4.8.

Let $\beta\in\mathbb{R}$ and $\delta_{\beta}$ be the delta measure on $\mathbb{R}$ whose support is $\{\beta\}\subseteq\mathbb{R}$ . Then

[TABLE]

since

[TABLE]

Note that $K(\{\{1,2,\dots,n\}\})=\{\{1\},\{2\},\dots,\{n\}\}$ . Hence

[TABLE]

Then for any $f\in\Xi$ , we have

[TABLE]

In particular, if $f\in\Xi^{\times}$ , it holds that

[TABLE]

In the case $f=M[a]$ with $a\in\mathfrak{A}$ , it is easy to show that

[TABLE]

since each scalar is free from any element of $\mathfrak{A}$ .

Definition 4.9.

Let $f,g\in\Xi$ . Then their free additive convolution, denoted by $f\boxplus g\in\Xi$ , is defined as

[TABLE]

Equivalently, $f\boxplus g$ is the unique formal power series in $\Xi$ determined by

[TABLE]

Notation 4.10.

Let $(\mathfrak{A},\varphi)$ be a C∗-probability space. Let $q\in\mathfrak{A}$ be a non-zero projection, that is, $q=q^{*}=q^{2}$ . Then

[TABLE]

becomes a C∗-probability space. For $a\in\mathfrak{A}$ , we denote by $M^{q\mathfrak{A}q}[qaq]$ the moment power series of $qaq\ (a\in\mathfrak{A})$ in $(q\mathfrak{A}q,\varphi(q)^{-1}\varphi)$ :

[TABLE]

Proposition 4.11.

Let $(\mathfrak{A},\varphi)$ be a C∗-probability space. Assume that $a,c,q\in\mathfrak{A}$ satisfy the following conditions:

(1)

$a^{*}=a$,

(2)

$c$ is a circular element, that is,

[TABLE]

where $(s_{1},s_{2})$ is a pair of free standard semicircular elements in $(\mathfrak{A},\varphi)$ and $\sigma\in\mathbb{R}$,

(3)

$q$ is a projection, and

(4)

$(\{c,c^{*}\},\{a,q\})$ is a pair of free families.

Set $\lambda:=\varphi(q)$ and

[TABLE]

Then we have

[TABLE]

Proof.

This is a direct consequence of [12, Theorem 3.4]. ∎

4.2.3. Free Poisson Distribution

The formal power series $f_{\lambda}$ in Proposition 4.11 is the R-transform of a free Poisson distribution. We briefly review the free Poisson distribution.

Definition 4.12.

(Free Poisson Distribution) Let $\lambda>0$ , $\alpha\in\mathbb{R}$ . Then the free Poisson distribution with rate $\lambda$ and jump size $\alpha$ is defined as the probability measure on $\mathbb{R}$ determined by

[TABLE]

Usually, the free Poisson law is defined as the limit law in the free version of the law of small numbers [10, Definition 12.12]. The compatibility between our definition and the usual definition is given by [10, Proposition 12.11]. Note that $\nu_{\lambda,\alpha}$ is, in fact, a compactly supported probability measure. Moreover,

[TABLE]
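As a sanity check on the free Poisson law, its moments have a closed form in terms of Narayana numbers. The sketch below assumes the standard characterization from [10, Proposition 12.11], namely that the free cumulants of the free Poisson distribution with rate $\lambda$ and jump size $\alpha$ are $\kappa_{n}=\lambda\alpha^{n}$, which gives $m_{n}=\alpha^{n}\sum_{k=1}^{n}\frac{1}{n}\binom{n}{k}\binom{n}{k-1}\lambda^{k}$. The function name is hypothetical.

```python
from math import comb

def free_poisson_moment(n, rate, jump):
    """n-th moment of the free Poisson law with the given rate and jump size,
    assuming free cumulants kappa_n = rate * jump**n."""
    narayana_sum = sum(
        comb(n, k) * comb(n, k - 1) * rate ** k for k in range(1, n + 1)
    ) / n
    return jump ** n * narayana_sum

# Rate 1 and jump size 1 give the Catalan numbers 1, 2, 5, 14, 42.
unit_moments = [free_poisson_moment(n, 1.0, 1.0) for n in range(1, 6)]

# Second moment equals kappa_2 + kappa_1**2 = rate*jump**2 + (rate*jump)**2.
second = free_poisson_moment(2, 0.5, 2.0)
```

With rate $0.5$ and jump size $2$ the second moment evaluates to $0.5\cdot 4+1^{2}=3$, in agreement with $m_{2}=\kappa_{2}+\kappa_{1}^{2}$.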

Lemma 4.13.

Let $(\mathfrak{A},\varphi)$ be a C∗-probability space, $a\in\mathfrak{A}$ , and $q\in\mathfrak{A}$ be a non-zero projection free from $a$ . Then it holds that

[TABLE]

where $\lambda:=\varphi(q)$ .

This is well-known, but for the reader’s convenience, we sketch the proof.

Proof.

Note that $M^{q\mathfrak{A}q}[qaq]=\lambda^{-1}M[qaq].$ By the tracial condition and Lemma 4.6,

[TABLE]

By definition of the boxed convolution, we have

[TABLE]

Since $\#\pi+\#K(\pi)=n+1$ , this is equal to

[TABLE]

Thus

[TABLE]

∎

Example 4.14.

Let $q,c\in\mathfrak{A}$, where $q$ is a non-zero projection. Assume that $(\{q\},\{c,c^{*}\})$ is a free pair in $(\mathfrak{A},\tau)$ and $c$ is a standard circular element. Then by Lemma 4.13,

[TABLE]

4.2.4. Second Lemma

In this section, we convert the model to an operator of the form $qaq$, where $q$ is a projection. Let $(\mathfrak{A},\varphi)$ be a C∗-probability space. Let $p,d\in\mathbb{N}$ with $p\geq d$ and write $\lambda=d/p$. In this section and in the next one, we denote by $C^{p,d}$ a $p\times d$ matrix of $*$-free circular elements with

[TABLE]

Recall that

[TABLE]

Now we identify $C^{p,d}$ with the $d\times d$ upper-left corner of $C^{p,p}$ with a normalization as follows:

[TABLE]

Recall that the family $\{C^{p,p}_{ij}\mid 1\leq i,j\leq p\}$ is a $*$-free family of circular elements such that

[TABLE]

We write

[TABLE]

Then $C^{p,p}$ is a circular element in $(\mathfrak{C},\tau)$ , and it is standard, that is,

[TABLE]

We define a projection $\Pi\in M_{p}(\mathbb{C})\subseteq\mathfrak{C}$ as

[TABLE]

One has $\tau(\Pi)=\lambda$. For a $p\times d$ matrix $A$, let us denote by $\tilde{A}$ the $p\times p$ square matrix obtained by adding zeros to $A$:

[TABLE]
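The padding step is easy to check on scalar matrices. The sketch below assumes that the zeros are appended as extra columns, so that $\tilde{A}=[A\ \ 0]$, and that $\Pi$ is the diagonal projection onto the first $d$ coordinates; both are assumptions consistent with $\tau(\Pi)=\lambda=d/p$, but the exact formulas are not reproduced here. The function names are hypothetical.

```python
import numpy as np

def pad_to_square(A):
    """Return the p x p matrix [A 0] obtained by appending p - d zero columns."""
    p, d = A.shape
    A_tilde = np.zeros((p, p), dtype=A.dtype)
    A_tilde[:, :d] = A
    return A_tilde

def corner_projection(p, d):
    """Diagonal projection onto the first d coordinates of C^p."""
    return np.diag(np.concatenate([np.ones(d), np.zeros(p - d)]))

rng = np.random.default_rng(4)
p, d = 5, 2
A = rng.standard_normal((p, d)) + 1j * rng.standard_normal((p, d))
A_tilde = pad_to_square(A)
Pi = corner_projection(p, d)

# The d x d upper-left corner of A_tilde^* A_tilde recovers A^* A.
gram = A_tilde.conj().T @ A_tilde
```

Under these assumptions the normalized trace of $\Pi$ is $d/p$, and compressing $\tilde{A}^{*}\tilde{A}$ by $\Pi$ leaves exactly the $d\times d$ block $A^{*}A$, which is the mechanism that turns the rectangular model into an operator of the form $qaq$.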

Now by definition, we have

[TABLE]

Therefore, for any $m\in\mathbb{N}$ ,

[TABLE]

Equivalently, we have

[TABLE]

Recall that

[TABLE]

Lemma 4.15.

Let $\alpha\in\mathbb{R}$ . Then

[TABLE]

where $\delta_{\alpha}$ is the delta measure on $\mathbb{R}$ whose support is $\{\alpha\}$ .

Proof.

Now $\{C^{p,p}\}$ and $\{\tilde{A},\Pi\}$ are $*$-free in $(\mathfrak{C},\tau)$, since the entries of $\tilde{A}$ and $\Pi$ are scalars. By Lemma 4.13,

[TABLE]

Hence by (4.57),

[TABLE]

∎

Corollary 4.16.

Let $p,d\in\mathbb{N}$ , $A\in M_{p,d}(\mathbb{C})$ , $\sigma\in\mathbb{R}$ . Assume that $p\geq d$ and set $\lambda:=d/p$ . Then

[TABLE]

Proof.

By (4.86) and Proposition 4.11, the left-hand side is equal to

[TABLE]

Now

[TABLE]

By Lemma 4.15, it holds that

[TABLE]

Hence the assertion holds. ∎

Lemma 4.17.

Assume that $\alpha,\beta\in\mathbb{R}$ , and $f,g\in\Xi$ satisfy

[TABLE]

Then

[TABLE]

Proof.

Passing to R-transforms on both sides of the assumed identity, we have

$$\begin{aligned} R_{f}(z)+\alpha z&=R_{g}(z)+\beta z,\\ R_{f}(z)+(\alpha-\beta)z&=R_{g}(z).\end{aligned}$$

Applying $\boxed{\star}\,\mathrm{Zeta}$ to both sides, we obtain the claimed identity. ∎

Now we prove the second key lemma.

Lemma 4.18.

Let $p,d\in\mathbb{N}$, $\sigma,\rho\in\mathbb{R}$, and $A,B\in M_{p,d}(\mathbb{C})$. Assume that $\sigma^{2}\geq\rho^{2}$ and

$$\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)=\mu_{\mathrm{SPN}}^{\Box}(B,\rho).$$

Then

$$\mu_{\mathrm{SPN}}^{\Box}(A,\sqrt{\sigma^{2}-\rho^{2}})=\mu_{\mathrm{SPN}}^{\Box}(B,0).$$

Proof.

By Corollary 4.16 and the assumption, we have

$$(M[A^{*}A]\ \boxed{\smallsetminus}\ f_{\lambda})\boxplus M[\delta_{\sigma^{2}/\lambda}]=(M[B^{*}B]\ \boxed{\smallsetminus}\ f_{\lambda})\boxplus M[\delta_{\rho^{2}/\lambda}].$$

Thus, by Lemma 4.17, it holds that

$$(M[A^{*}A]\ \boxed{\smallsetminus}\ f_{\lambda})\boxplus M[\delta_{(\sigma^{2}-\rho^{2})/\lambda}]=M[B^{*}B]\ \boxed{\smallsetminus}\ f_{\lambda}.$$

Using Corollary 4.16 again, we have

$$M[\mu_{\mathrm{SPN}}^{\Box}(A,\sqrt{\sigma^{2}-\rho^{2}})]\ \boxed{\smallsetminus}\ f_{\lambda}=M[\mu_{\mathrm{SPN}}^{\Box}(B,0)]\ \boxed{\smallsetminus}\ f_{\lambda}.$$

Equivalently,

$$R[\mu_{\mathrm{SPN}}^{\Box}(A,\sqrt{\sigma^{2}-\rho^{2}})]\ \boxed{\star}\ R[f_{\lambda}]^{-1}=R[\mu_{\mathrm{SPN}}^{\Box}(B,0)]\ \boxed{\star}\ R[f_{\lambda}]^{-1}.$$

Applying $\boxed{\star}\,R[f_{\lambda}]\,\boxed{\star}\,\mathrm{Zeta}$ to both sides, we have

$$M[\mu_{\mathrm{SPN}}^{\Box}(A,\sqrt{\sigma^{2}-\rho^{2}})]=M[\mu_{\mathrm{SPN}}^{\Box}(B,0)].$$

Since any compactly supported probability measure is determined by its moments, the assertion holds. ∎
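The scalar cancellation step used above (for $f,g\in\Xi$ and point masses $\delta_{\alpha},\delta_{\beta}$) has a simple matrix-level analogue: free additive convolution with a point mass $\delta_{\alpha}$ is a translation by $\alpha$, which for a matrix means adding $\alpha I$. The following Python sketch is only an illustration of that analogue, not the paper's argument. The helper `moments` and the test matrices `x`, `y` are hypothetical, and `y` is constructed by hand so that the assumed identity holds.

```python
import numpy as np

def moments(x, order=6):
    """Normalized trace moments tr(X^k)/d, k = 1..order, of a symmetric matrix."""
    eig = np.linalg.eigvalsh(x)
    return np.array([np.mean(eig ** k) for k in range(1, order + 1)])

rng = np.random.default_rng(0)
d = 5
alpha, beta = 1.5, 0.4
identity = np.eye(d)

# Hypothetical test data: a random symmetric X, and a Y chosen so that
# X + alpha*I and Y + beta*I are orthogonally equivalent (same moments).
g = rng.standard_normal((d, d))
x = (g + g.T) / 2
q, _ = np.linalg.qr(rng.standard_normal((d, d)))
y = q @ (x + (alpha - beta) * identity) @ q.T

# Assumed identity: moments of X + alpha*I equal those of Y + beta*I.
lhs_assumed = moments(x + alpha * identity)
rhs_assumed = moments(y + beta * identity)

# Claimed identity: moments of X + (alpha - beta)*I equal those of Y.
lhs_claimed = moments(x + (alpha - beta) * identity)
rhs_claimed = moments(y)

print(np.allclose(lhs_assumed, rhs_assumed))
print(np.allclose(lhs_claimed, rhs_claimed))
```

The check is deliberately finite-dimensional: it only shows that subtracting the common scalar shift $\beta$ from both sides preserves equality of moments, which is the content of the R-transform computation $R_{f}(z)+(\alpha-\beta)z=R_{g}(z)$.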

4.2.5. Proof of Identifiability

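Proof of Theorem 4.1.

Without loss of generality, we may assume that $\sigma^{2}\geq\rho^{2}$. Suppose that $\mu_{\mathrm{SPN}}^{\Box}(A,\sigma)=\mu_{\mathrm{SPN}}^{\Box}(B,\rho)$. First, by Lemma 4.18, we have

$$\mu_{\mathrm{SPN}}^{\Box}(A,\sqrt{\sigma^{2}-\rho^{2}})=\mu_{\mathrm{SPN}}^{\Box}(B,0).$$

Second, Lemma 4.4 implies $\sqrt{\sigma^{2}-\rho^{2}}=0$, that is, $\sigma^{2}=\rho^{2}$. The displayed identity then reads $\mu_{\mathrm{SPN}}^{\Box}(A,0)=\mu_{\mathrm{SPN}}^{\Box}(B,0)$, which gives $\mu_{A^{*}A}=\mu_{B^{*}B}$ and completes the proof. ∎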
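The theorem says that the SPN spectral distribution depends on $A$ only through $\mu_{A^{*}A}$, that is, through the singular values of $A$. A finite-size sanity check of the "up to rotation" part can be sketched in Python. This is a minimal sketch under assumptions not stated in this section: the SPN matrix is taken to have the form $(A+\sigma Z)^{*}(A+\sigma Z)$ with real Gaussian $Z$ normalized by $1/\sqrt{d}$, and the function name `spn_eigs` is hypothetical. Replacing $(A,Z)$ by $(UAV,UZV)$ with orthogonal $U,V$ leaves the eigenvalues unchanged, and $UZV$ has the same law as $Z$.

```python
import numpy as np

def spn_eigs(a, sigma, z):
    """Eigenvalues of (A + sigma*Z)^T (A + sigma*Z), the assumed SPN form (real case)."""
    m = a + sigma * z
    return np.linalg.eigvalsh(m.T @ m)

rng = np.random.default_rng(1)
p, d, sigma = 6, 4, 0.3

a = rng.standard_normal((p, d))
z = rng.standard_normal((p, d)) / np.sqrt(d)  # assumed noise normalization

# Random orthogonal U (p x p) and V (d x d) obtained from QR factorizations.
u, _ = np.linalg.qr(rng.standard_normal((p, p)))
v, _ = np.linalg.qr(rng.standard_normal((d, d)))

eigs_original = spn_eigs(a, sigma, z)
eigs_rotated = spn_eigs(u @ a @ v, sigma, u @ z @ v)

# The rotation preserves the singular values of A, hence mu_{A^*A}.
sv_original = np.linalg.svd(a, compute_uv=False)
sv_rotated = np.linalg.svd(u @ a @ v, compute_uv=False)

print(np.allclose(eigs_original, eigs_rotated))
print(np.allclose(sv_original, sv_rotated))
```

For complex matrices the transposes would be conjugate transposes and $U,V$ unitary. The check covers only the easy direction (equal $\mu_{A^{*}A}$ and $\sigma^{2}$ give equal spectra); the converse is what the proof above establishes.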

5. Acknowledgement

We would like to thank Hiroaki Yoshida for discussions. We appreciate Yuichi Ike’s valuable comments on our manuscript.
