Eigenvalue Based Detection of a Signal in Colored Noise: Finite and Asymptotic Analyses

Lahiru D. Chamain, Prathapasinghe Dharmawansa, Saman Atapattu, and Chintha Tellambura

arXiv:1902.02483·cs.IT·February 8, 2019


TL;DR

This paper derives the finite-sample and asymptotic distribution of the largest generalized eigenvalue used for signal detection in colored noise, enabling improved ROC analysis and detection performance understanding.

Contribution

It provides the first finite-dimensional characterization of the eigenvalue distribution under the alternative hypothesis using orthogonal polynomial methods.

Findings

01

Finite-sample c.d.f. of the largest generalized eigenvalue derived.

02

Asymptotic c.d.f. and special cases analyzed.

03

Reliable detection possible when SNR scales with sample size.

Abstract

Signal detection in colored noise with an unknown covariance matrix has a myriad of applications in diverse scientific/engineering fields. The test statistic is the largest generalized eigenvalue (l.g.e.) of the whitened sample covariance matrix, which is constructed from $p$ signal-plus-noise samples and $n$ noise-only samples, each $m$-dimensional. A finite dimensional characterization of this statistic under the alternative hypothesis has hitherto been an open problem. We answer this problem by deriving the cumulative distribution function (c.d.f.) of this l.g.e. via the powerful orthogonal polynomial approach, exploiting the deformed Jacobi unitary ensemble (JUE). Two special cases and an asymptotic version of the c.d.f. are also derived. With this new c.d.f., we comprehensively analyze the receiver operating characteristics (ROC) of the detector. Importantly, when the…

Equations

\det\left[a_{i}\;\;b_{i,j}\right]_{\substack{i=1,2,\ldots,n\\ j=2,3,\ldots,n}}=\left|\begin{array}{ccccc}a_{1}&b_{1,2}&b_{1,3}&\ldots&b_{1,n}\\ a_{2}&b_{2,2}&b_{2,3}&\ldots&b_{2,n}\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ a_{n}&b_{n,2}&b_{n,3}&\ldots&b_{n,n}\end{array}\right|.

\mathbf{x}=\sqrt{\rho}\,\mathbf{h}s+\mathbf{n}

\mathcal{H}_{0}:\;\rho=0\quad\text{Signal is absent}

\mathcal{H}_{1}:\;\rho>0\quad\text{Signal is present}.

\mathbf{S}=\rho\mathbf{h}\mathbf{h}^{\dagger}+\boldsymbol{\Sigma},

\begin{array}{ll}\mathcal{H}_{0}:\;\mathbf{R}=\boldsymbol{\Sigma}&\text{Signal is absent}\\ \mathcal{H}_{1}:\;\mathbf{S}=\rho\mathbf{h}\mathbf{h}^{\dagger}+\boldsymbol{\Sigma}&\text{Signal is present}.\end{array}

\boldsymbol{\Psi}=\mathbf{R}^{-1}\mathbf{S}=\rho\boldsymbol{\Sigma}^{-1}\mathbf{h}\mathbf{h}^{\dagger}+\mathbf{I}_{m}.

\widehat{\mathbf{R}}=\frac{1}{n}\sum_{\ell=1}^{n}\mathbf{n}_{\ell}\mathbf{n}_{\ell}^{\dagger}

\widehat{\mathbf{S}}=\frac{1}{p}\sum_{k=1}^{p}\mathbf{x}_{k}\mathbf{x}_{k}^{\dagger}

\widehat{\boldsymbol{\Psi}}=\widehat{\mathbf{R}}^{-1}\widehat{\mathbf{S}}

n\widehat{\mathbf{R}}\sim\mathcal{CW}_{m}\left(n,\boldsymbol{\Sigma}\right)

p\widehat{\mathbf{S}}\sim\mathcal{CW}_{m}\left(p,\boldsymbol{\Sigma}+\rho\mathbf{h}\mathbf{h}^{\dagger}\right)

n\widehat{\mathbf{R}}\sim\mathcal{CW}_{m}\left(n,\mathbf{I}_{m}\right)

p\widehat{\mathbf{S}}\sim\mathcal{CW}_{m}\left(p,\mathbf{I}_{m}+\gamma\mathbf{u}\mathbf{u}^{\dagger}\right)

P_{D}(\gamma,\mu_{\text{th}})=\Pr\left(\hat{\lambda}_{\max}(\gamma)>\mu_{\text{th}}\mid\mathcal{H}_{1}\right)

P_{F}(\gamma,\mu_{\text{th}})=\Pr\left(\hat{\lambda}_{\max}(\gamma)>\mu_{\text{th}}\mid\mathcal{H}_{0}\right)

f(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})=\frac{\mathcal{K}_{1}(m,n,p)}{\text{det}^{p}(\boldsymbol{\Sigma})}\prod_{j=1}^{m}\lambda_{j}^{p-m}\,\Delta_{m}^{2}(\boldsymbol{\lambda})\;{}_{1}\widetilde{F}_{0}\left(p+n;-\boldsymbol{\Sigma}^{-1},\boldsymbol{\Lambda}\right)

\mathcal{K}_{1}(m,n,p)=\frac{\pi^{m(m-1)}\,\Gamma_{m}(n+p)}{\Gamma_{m}(m)\,\Gamma_{m}(n)\,\Gamma_{m}(p)}

\Gamma_{m}(n)=\pi^{\frac{1}{2}m(m-1)}\prod_{j=1}^{m}\Gamma(n-j+1).

P_{n}^{(a,b)}(x)=\sum_{k=0}^{n}\binom{n+a}{n-k}\binom{n+k+a+b}{k}\left(\frac{x-1}{2}\right)^{k}\quad\text{for }a,b>-1

P_{n}^{(a,b)}(x)=\binom{n+a}{n}\,{}_{2}F_{1}\left(-n,n+a+b+1;1+a;\frac{1-x}{2}\right)

\frac{{\rm d}^{k}}{{\rm d}x^{k}}P_{n}^{(a,b)}(x)=2^{-k}(n+a+b+1)_{k}\,P_{n-k}^{(a+k,b+k)}(x)

(-n)_{k}=\left\{\begin{array}{ll}\frac{(-1)^{k}n!}{(n-k)!}&\text{if }0\leq k\leq n\\ 0&\text{if }k>n.\end{array}\right.

\boldsymbol{\Sigma}=\mathbf{I}_{m}+\eta\mathbf{v}\mathbf{v}^{\dagger}=\mathbf{V}\,\text{diag}\left(1+\eta,1,1,\ldots,1\right)\mathbf{V}^{\dagger}

f(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})=f_{\text{uc}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})\,f_{\text{cor}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})

f_{\text{uc}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})=\mathcal{K}_{1}(m,n,p)\prod_{j=1}^{m}\frac{\lambda_{j}^{p-m}}{(1+\lambda_{j})^{p+n}}\,\Delta_{m}^{2}(\boldsymbol{\lambda}),

f_{\text{cor}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})=\frac{\mathcal{K}_{2}(m,n,p)}{\eta^{m-1}(1+\eta)^{p+1-m}}\prod_{j=1}^{m}(1+\lambda_{j})\sum_{k=1}^{m}\frac{(1+\lambda_{k})^{p+n-1}}{\prod_{\substack{j=1\\ j\neq k}}^{m}(\lambda_{k}-\lambda_{j})\left(1+\frac{\lambda_{k}}{\eta+1}\right)^{p+n+1-m}},

\mathcal{K}_{2}(m,n,p)=\frac{(m-1)!\,(p+n-m)!}{(p+n-1)!},

x_{j}=\frac{\lambda_{j}}{1+\lambda_{j}},\quad j=1,2,\ldots,m,

g(x_{1},x_{2},\ldots,x_{m})=\frac{\mathcal{K}_{3}(m,n,p)}{\eta^{m-1}(1+\eta)^{p+1-m}}\prod_{j=1}^{m}x_{j}^{p-m}(1-x_{j})^{n-m}\,\Delta_{m}^{2}(\mathbf{x})\times\sum_{k=1}^{m}\frac{1}{\prod_{\substack{j=1\\ j\neq k}}^{m}(x_{k}-x_{j})\left(1-\frac{\eta}{\eta+1}x_{k}\right)^{p+n+1-m}}

F_{x_{\max}}(t)=\Pr\left(x_{\max}\leq t\right)=\int_{0\leq x_{1}\leq x_{2}\leq\cdots\leq x_{m}\leq t}g(x_{1},x_{2},\ldots,x_{m})\,{\rm d}\mathbf{x}


Full text

Eigenvalue Based Detection of a Signal in Colored Noise: Finite and Asymptotic Analyses

Lahiru D. Chamain, Prathapasinghe Dharmawansa, Saman Atapattu, and Chintha Tellambura

L. D. Chamain is with the Department of Electrical and Computer Engineering, 2064 Kemper Hall, University of California Davis, 1 Shields Avenue, Davis, CA 95616. P. Dharmawansa is with the Department of Electronic and Telecommunication Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka. S. Atapattu is with the Department of Electrical and Electronic Engineering, University of Melbourne, Parkville, VIC 3010, Australia. C. Tellambura is with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2R3, Canada.

Abstract

Signal detection in colored noise with an unknown covariance matrix has a myriad of applications in diverse scientific/engineering fields. The test statistic is the largest generalized eigenvalue (l.g.e.) of the whitened sample covariance matrix, which is constructed from $p$ signal-plus-noise samples and $n$ noise-only samples, each $m$-dimensional. A finite dimensional characterization of this statistic under the alternative hypothesis has hitherto been an open problem. We answer this problem by deriving the cumulative distribution function (c.d.f.) of this l.g.e. via the powerful orthogonal polynomial approach, exploiting the deformed Jacobi unitary ensemble (JUE). Two special cases and an asymptotic version of the c.d.f. are also derived. With this new c.d.f., we comprehensively analyze the receiver operating characteristics (ROC) of the detector. Importantly, when the noise-only sample covariance matrix is nearly rank deficient (i.e., $m=n$), we show that (a) when $m$ and $p$ increase such that $m/p$ is fixed, at each fixed signal-to-noise ratio (SNR) there exists an optimal ROC profile, for which we also establish a tight approximation; and (b) asymptotically, reliable signal detection is always possible (no matter how weak the signal is) if the SNR scales with $m$.

Index Terms:

Colored noise, eigenvalue, $F$ -matrix, Hypergeometric function of two matrix arguments, Jacobi unitary ensemble, orthogonal polynomials, receiver operating characteristics (ROC), Wishart matrix

I Introduction

Eigenvalue based detection of a signal embedded in noise is a fundamental problem with a myriad of applications in diverse fields including signal processing, wireless communications, cognitive radio, bioinformatics and many more [1, 2, 3, 4, 5, 6, 7, 8]. Thus, sample eigenvalue (of the sample covariance matrix) based detection has gained prominence recently ([9, 10] and references therein). In this context, the largest sample eigenvalue based detection, also known as the Roy’s largest root test [11], has been popular among detection theorists. Under the common Gaussian setting with white noise, this amounts to the use of the largest eigenvalue of a Wishart matrix having a so-called spiked covariance [12, 13, 14, 15, 16, 17].

However, colored (or correlated) noise occurs in a multitude of applications [18, 19, 20, 21, 22, 8]. In this case, we can utilize the maximum eigenvalue of the matrix formed by whitening the signal-plus-noise sample covariance matrix with the noise-only sample covariance matrix. Nadakuditi and Silverstein [4] proposed a framework that uses the generalized eigenvalues of this whitened signal-plus-noise sample covariance matrix for detection. The assumption that a noise-only sample covariance matrix is available is realistic in many practical situations, as detailed in [4]. The fundamental high dimensional limits of generalized sample eigenvalue based detection in colored noise have been thoroughly investigated in [4]. However, to the best of our knowledge, a tractable finite dimensional analysis is not available in the literature. Thus, in this paper, we characterize the statistics of Roy’s largest root in the finite dimensional colored noise setting. Moreover, we investigate certain limiting behaviors of Roy’s largest root to deepen our understanding of the classical detection problem in colored noise. These limiting expressions are derived from their finite dimensional counterparts, whereas in the literature it is customary to use entirely different tools for finite and asymptotic analyses.

The Roy’s largest root of the generalized eigenvalue detection problem in the Gaussian setting amounts to finite dimensional characterization of the largest eigenvalue of the deformed Jacobi ensemble. Various asymptotic expressions (high dimensional and high signal-to-noise ratio) for it have been derived in [23, 24, 25, 26] for deformed Jacobi ensemble. However, finite dimensional expressions are available for Jacobi ensemble only (without deformation) [27, 28, 29]. Although finite dimensional, these expressions are not amenable to further manipulations. Therefore, in this paper, we present a simple and tractable closed form solution to the cumulative distribution function (c.d.f.) of the maximum eigenvalue of the deformed Jacobi ensemble. This expression further facilitates the analysis of the receiver operating characteristics (ROC) of the Roy’s largest root test. All these results are made possible due to a novel alternative joint eigenvalue density function that we have derived based on the contour integral approach due to [30, 31, 32, 33, 34].

The key results developed in this paper enable us to understand the joint effect of the system dimensionality ($m$), the number of signal-plus-noise samples ($p$), the number of noise-only samples ($n$), and the signal-to-noise ratio ($\gamma$) on the ROC. For instance, a larger disparity between $m$ and $n$ improves the ROC profile for fixed values of the other parameters. However, the general finite dimensional ROC expressions turn out to give little analytical insight. Therefore, to obtain more insight, we focus on the case in which the system dimensionality equals the number of noise-only samples (i.e., $m=n$). Since this equality is the minimum requirement for the validity of the whitening operation, from the ROC perspective it corresponds to the worst possible case when the other parameters are fixed. It turns out that, in this scenario, when $p$ increases for fixed $m$, $n$, and $\gamma$, the ROC profile improves and converges to a limiting profile as $p\to\infty$. In contrast, when we increase $p$ and $m$ simultaneously such that $m/p$ is a constant ($\leq 1$) for fixed $\gamma$, we observe an optimal ROC profile for some special values of $p$ and $m$. However, as $p,m,n\to\infty$ such that $m/p$ approaches a constant ($\leq 1$) and $m/n=1$ (the high dimensional limit) for fixed $\gamma$, the maximum eigenvalue tends to lose its detection power. This phenomenon amounts to stating that the maximum eigenvalue has no power below the phase transition, as observed in the random matrix theory literature [35, 26, 36, 4, 37]. The most interesting result that emerges from our analysis is that, when $\gamma$ scales with $m$ under the latter assumptions, the ROC attains a finite limit. In other words, the maximum eigenvalue retains its detection power in high dimensions when $\gamma$ scales with $m$ as $m\to\infty$. For instance, under Rayleigh fading, $\gamma$ scales with $m$ as $m\to\infty$ (due to the strong law of large numbers). Therefore, this insight can be of paramount importance in designing future wireless communication systems (5G and beyond).
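The Rayleigh fading remark can be illustrated numerically. The sketch below assumes $\gamma=\rho||\mathbf{h}||^{2}/\sigma^{2}$ (the definition used in Section II) and a channel $\mathbf{h}$ with i.i.d. $\mathcal{CN}(0,1)$ entries; the function name `snr_over_m` and the values of `rho` and `sigma2` are illustrative assumptions, not taken from the paper.

```python
import random

def snr_over_m(m, rho=1.0, sigma2=1.0, seed=0):
    """Return gamma / m for one Rayleigh-fading draw h ~ CN(0, I_m) (assumed model)."""
    rng = random.Random(seed)
    std = 0.5 ** 0.5  # real and imaginary parts each have variance 1/2
    # ||h||^2 is a sum of m independent unit-mean terms |h_i|^2.
    norm_sq = sum(rng.gauss(0.0, std) ** 2 + rng.gauss(0.0, std) ** 2
                  for _ in range(m))
    gamma = rho * norm_sq / sigma2
    return gamma / m

# gamma / m settles near rho / sigma2 as m grows, i.e. gamma grows linearly in m.
for m in (10, 1000, 100000):
    print(m, round(snr_over_m(m), 3))
```

By the strong law of large numbers, $||\mathbf{h}||^{2}/m\to 1$ almost surely under this model, so the printed ratio approaches $\rho/\sigma^{2}$.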

The remainder of this paper is organized as follows. In Section II, we formulate the classical detection problem in unknown colored noise. A new c.d.f. expression for the maximum eigenvalue (i.e., Roy’s largest root) of the deformed Jacobi unitary ensemble is derived in Section III. It also gives certain particularizations of the general c.d.f. expression. Subsequently, Section IV investigates the ROC characteristics of the Roy’s largest root test in the light of the c.d.f. derived in Section III. Moreover, the interplay between the system dimensionality, the number of signal-plus-noise samples, and the noise-only samples has been analytically characterized in Section IV. Finally, conclusions are drawn in Section V.

The following notation is used throughout this paper. The superscript $(\cdot)^{\dagger}$ indicates the Hermitian transpose, $\text{det}(\cdot)$ denotes the determinant of a square matrix, $\text{tr}(\cdot)$ represents the trace of a square matrix, and $\text{etr}(\cdot)$ stands for $\exp\left(\text{tr}(\cdot)\right)$ . The $n\times n$ identity matrix is represented by $\mathbf{I}_{n}$ and the Euclidean norm of a vector $\mathbf{w}$ is denoted by $||\mathbf{w}||$ . A diagonal matrix with the diagonal entries $a_{1},a_{2},\ldots,a_{n}$ is denoted by $\text{diag}(a_{1},a_{2},\ldots,a_{n})$ . We denote the $m\times m$ unitary group by $U(m)$ . Finally, we use the following notation to compactly represent the determinant of an $n\times n$ block matrix:

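$$\det\left[a_{i}\;\;b_{i,j}\right]_{\substack{i=1,2,\ldots,n\\ j=2,3,\ldots,n}}=\left|\begin{array}{ccccc}a_{1}&b_{1,2}&b_{1,3}&\ldots&b_{1,n}\\ a_{2}&b_{2,2}&b_{2,3}&\ldots&b_{2,n}\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ a_{n}&b_{n,2}&b_{n,3}&\ldots&b_{n,n}\end{array}\right|.$$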

II Problem formulation

Consider the following generic signal detection problem in colored Gaussian noise

$$\mathbf{x}=\sqrt{\rho}\,\mathbf{h}s+\mathbf{n}$$

where $\mathbf{x},\mathbf{h}\in\mathbb{C}^{m}$ are $m$-dimensional complex vectors, $\rho>0$ is a signal power measure, $s\sim\mathcal{CN}(0,1)$ is a complex Gaussian transmit symbol, and $\mathbf{n}\sim\mathcal{CN}_{m}(\mathbf{0},\boldsymbol{\Sigma})$ is a random complex Gaussian noise vector with covariance matrix $\boldsymbol{\Sigma}$, which may or may not be known at the detector. The classical signal detection problem amounts to the following hypothesis testing problem:

$$\begin{array}{ll}\mathcal{H}_{0}:\;\rho=0&\text{Signal is absent}\\ \mathcal{H}_{1}:\;\rho>0&\text{Signal is present}.\end{array}$$

Noting that the covariance matrix of $\mathbf{x}$ can be written as

$$\mathbf{S}=\rho\mathbf{h}\mathbf{h}^{\dagger}+\boldsymbol{\Sigma},$$

where $(\cdot)^{\dagger}$ denotes the conjugate transpose, we can have the following equivalent form

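$$\begin{array}{ll}\mathcal{H}_{0}:\;\mathbf{R}=\boldsymbol{\Sigma}&\text{Signal is absent}\\ \mathcal{H}_{1}:\;\mathbf{S}=\rho\mathbf{h}\mathbf{h}^{\dagger}+\boldsymbol{\Sigma}&\text{Signal is present}.\end{array}$$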

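If the signal-plus-noise covariance matrix $\mathbf{S}$ and the noise covariance matrix $\boldsymbol{\Sigma}$ were known, we could compute the matrix

$$\boldsymbol{\Psi}=\mathbf{R}^{-1}\mathbf{S}=\rho\boldsymbol{\Sigma}^{-1}\mathbf{h}\mathbf{h}^{\dagger}+\mathbf{I}_{m}.$$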

Denote the eigenvalues of $\boldsymbol{\Psi}$ by $\lambda_{1}\leq\lambda_{2}\leq\ldots\leq\lambda_{m}$. These eigenvalues are in fact the generalized eigenvalues of the matrix pair $(\mathbf{S},\mathbf{R})$. Since the rank of $\mathbf{h}\mathbf{h}^{\dagger}$ is one, $m-1$ of the eigenvalues are equal to one ($\lambda_{1}=\lambda_{2}=\ldots=\lambda_{m-1}=1$), while the remaining maximum eigenvalue of $\boldsymbol{\Psi}$ ($\lambda_{m}$) is strictly greater than one. Thus, the maximum eigenvalue of $\boldsymbol{\Psi}$ could be used to detect the presence of a signal [4].
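The eigenvalue structure described above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the vector `h`, the matrix `Sigma`, and the helper `generalized_eigenvalues` are hypothetical illustrations. It also uses the standard rank-one fact that the single non-unit eigenvalue equals $1+\rho\mathbf{h}^{\dagger}\boldsymbol{\Sigma}^{-1}\mathbf{h}$.

```python
import numpy as np

def generalized_eigenvalues(S, R):
    """Eigenvalues of R^{-1} S for Hermitian S and Hermitian positive definite R."""
    L = np.linalg.cholesky(R)            # R = L L^H
    Y = np.linalg.solve(L, S)            # L^{-1} S
    M = np.linalg.solve(L, Y.conj().T)   # L^{-1} S L^{-H}: Hermitian, similar to R^{-1} S
    return np.linalg.eigvalsh(M)         # returned in ascending order

rng = np.random.default_rng(0)
m, rho = 5, 2.0

# Hypothetical channel vector h and noise covariance Sigma (illustrative values only).
h = rng.standard_normal(m) + 1j * rng.standard_normal(m)
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Sigma = A @ A.conj().T + m * np.eye(m)

S = rho * np.outer(h, h.conj()) + Sigma
eigs = generalized_eigenvalues(S, Sigma)
top_predicted = 1.0 + rho * np.real(h.conj() @ np.linalg.solve(Sigma, h))

print(np.round(eigs, 6))               # m - 1 values at one, then the largest eigenvalue
print(round(float(top_predicted), 6))  # matches the largest eigenvalue
```

Whitening through a Cholesky factor keeps the problem Hermitian, which avoids the small imaginary parts a general eigenvalue solver may return for $\mathbf{R}^{-1}\mathbf{S}$.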

In most practical settings, $\mathbf{R}$ and $\bf{S}$ matrices are unknown. To circumvent this difficulty, we may replace $\mathbf{R}$ and $\mathbf{S}$ by their sample estimates. To this end, we assume the availability of $p>1$ i.i.d. signal-plus-noise samples $\{\mathbf{x}_{1},\mathbf{x}_{2},\ldots,\mathbf{x}_{p}\}$ , and $n$ i.i.d. noise-only samples $\{\mathbf{n}_{1},\mathbf{n}_{2},\ldots,\mathbf{n}_{n}\}$ . Thus, the sample estimates of $\mathbf{R}$ and $\mathbf{S}$ become

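$$\widehat{\mathbf{R}}=\frac{1}{n}\sum_{\ell=1}^{n}\mathbf{n}_{\ell}\mathbf{n}_{\ell}^{\dagger},\qquad\widehat{\mathbf{S}}=\frac{1}{p}\sum_{k=1}^{p}\mathbf{x}_{k}\mathbf{x}_{k}^{\dagger}$$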

where we assume that $n,p\geq m$ (this ensures that both $\widehat{\mathbf{R}}$ and $\widehat{\mathbf{S}}$ are positive definite with probability $1$ [38, 37]). Consequently, following [4], we form the matrix

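$$\widehat{\boldsymbol{\Psi}}=\widehat{\mathbf{R}}^{-1}\widehat{\mathbf{S}}$$

and focus on its maximum eigenvalue as the test statistic (this is also known as Roy’s largest root test, which is a consequence of Roy’s union-intersection principle [11]). As such, we have

$$n\widehat{\mathbf{R}}\sim\mathcal{CW}_{m}\left(n,\boldsymbol{\Sigma}\right),\qquad p\widehat{\mathbf{S}}\sim\mathcal{CW}_{m}\left(p,\boldsymbol{\Sigma}+\rho\mathbf{h}\mathbf{h}^{\dagger}\right).$$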

Noting that the eigenvalues of $\widehat{\boldsymbol{\Psi}}$ do not change under the simultaneous transformations $\widehat{\mathbf{R}}\mapsto\boldsymbol{\Sigma}^{-1/2}\widehat{\mathbf{R}}\boldsymbol{\Sigma}^{-1/2}$ , and $\widehat{\mathbf{S}}\mapsto\boldsymbol{\Sigma}^{-1/2}\widehat{\mathbf{S}}\boldsymbol{\Sigma}^{-1/2}$ , without loss of generality we assume that $\boldsymbol{\Sigma}=\sigma^{2}\mathbf{I}_{m}$ . Therefore, in what follows we focus on the maximum eigenvalue of $\widehat{\boldsymbol{\Psi}}$ , where

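$$n\widehat{\mathbf{R}}\sim\mathcal{CW}_{m}\left(n,\mathbf{I}_{m}\right),\qquad p\widehat{\mathbf{S}}\sim\mathcal{CW}_{m}\left(p,\mathbf{I}_{m}+\gamma\mathbf{u}\mathbf{u}^{\dagger}\right)$$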

with $\gamma=\rho||\mathbf{h}||^{2}/\sigma^{2}$ and $\mathbf{u}=\mathbf{h}/||\mathbf{h}||$ being a unit vector.
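The invariance argument can be seen in a small experiment. The sketch below, which assumes NumPy, computes the largest eigenvalue of $\widehat{\mathbf{R}}^{-1}\widehat{\mathbf{S}}$ from white-noise samples and again after coloring both sample sets with the same matrix. The helper names, the coloring matrix `C`, and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def complex_gaussian(rng, rows, cols):
    """Matrix with i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2.0)

def largest_generalized_eigenvalue(X, N):
    """Largest eigenvalue of R_hat^{-1} S_hat; the columns of X and N are samples."""
    p, n = X.shape[1], N.shape[1]
    S_hat = X @ X.conj().T / p
    R_hat = N @ N.conj().T / n
    L = np.linalg.cholesky(R_hat)        # requires n >= m so that R_hat is invertible
    Y = np.linalg.solve(L, S_hat)        # L^{-1} S_hat
    M = np.linalg.solve(L, Y.conj().T)   # L^{-1} S_hat L^{-H}, Hermitian
    return np.linalg.eigvalsh(M)[-1]

rng = np.random.default_rng(1)
m, n, p, gamma = 4, 12, 10, 3.0
u = np.zeros(m)
u[0] = 1.0                               # hypothetical unit signal direction

# White-noise model: x_k = sqrt(gamma) * u * s_k + n_k.
s = complex_gaussian(rng, 1, p)[0]
X = np.sqrt(gamma) * np.outer(u, s) + complex_gaussian(rng, m, p)
N = complex_gaussian(rng, m, n)

# Color both sample sets with the same matrix C, which corresponds to Sigma = C C^H.
C = complex_gaussian(rng, m, m) + 2.0 * np.eye(m)
lam_white = largest_generalized_eigenvalue(X, N)
lam_colored = largest_generalized_eigenvalue(C @ X, C @ N)
print(lam_white, lam_colored)            # the two values agree up to rounding error
```

Because the coloring acts as a similarity transformation on $\widehat{\mathbf{R}}^{-1}\widehat{\mathbf{S}}$, the statistic does not depend on the unknown $\boldsymbol{\Sigma}$, which is why the analysis can proceed with white noise.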

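Let us denote the maximum eigenvalue of $\widehat{\boldsymbol{\Psi}}$ as $\hat{\lambda}_{\max}(\gamma)$. Now, in order to assess the performance of the maximum-eigenvalue-based detector, we need to evaluate the detection probability (also known as the power of the test) and the false alarm probability. They may be expressed as

$$P_{D}(\gamma,\mu_{\text{th}})=\Pr\left(\hat{\lambda}_{\max}(\gamma)>\mu_{\text{th}}\mid\mathcal{H}_{1}\right)$$

and

$$P_{F}(\gamma,\mu_{\text{th}})=\Pr\left(\hat{\lambda}_{\max}(\gamma)>\mu_{\text{th}}\mid\mathcal{H}_{0}\right)$$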

where $\mu_{\text{th}}$ is the threshold. The $(P_{D},P_{F})$ pair characterizes the detector and is called the ROC profile.
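An empirical ROC profile can be traced by Monte Carlo simulation. This is a rough sketch assuming NumPy and the white-noise model with a rank-one spike of strength $\gamma$; it is a simulation aid, not the analytical method of the paper, and the function names and parameter values are hypothetical.

```python
import numpy as np

def cn(rng, shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

def lambda_max(rng, m, n, p, gamma):
    """One draw of the largest eigenvalue of R_hat^{-1} S_hat with spike strength gamma."""
    X = cn(rng, (m, p))
    X[0, :] *= np.sqrt(1.0 + gamma)      # sample covariance target: I_m + gamma * e_1 e_1^H
    N = cn(rng, (m, n))
    S_hat = X @ X.conj().T / p
    R_hat = N @ N.conj().T / n
    return np.linalg.eigvals(np.linalg.solve(R_hat, S_hat)).real.max()

def empirical_roc(m, n, p, gamma, trials=2000, seed=0):
    """Return (P_F, P_D) estimated on a grid of thresholds taken from the H0 draws."""
    rng = np.random.default_rng(seed)
    h0 = np.sort([lambda_max(rng, m, n, p, 0.0) for _ in range(trials)])
    h1 = np.sort([lambda_max(rng, m, n, p, gamma) for _ in range(trials)])
    thresholds = h0                      # sweep mu_th over the sorted H0 statistics
    # Fraction of draws strictly above each threshold.
    p_f = 1.0 - np.searchsorted(h0, thresholds, side="right") / trials
    p_d = 1.0 - np.searchsorted(h1, thresholds, side="right") / trials
    return p_f, p_d

p_f, p_d = empirical_roc(m=4, n=8, p=8, gamma=10.0, trials=500)
print(np.round(p_f[::100], 3))
print(np.round(p_d[::100], 3))
```

Plotting `p_d` against `p_f` gives the empirical ROC curve; both arrays are nonincreasing in the threshold by construction.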

Our main challenge is to characterize the maximum eigenvalue of $\widehat{\boldsymbol{\Psi}}$ under the alternative $\mathcal{H}_{1}$. This particular matrix is also referred to as the multivariate $F$ matrix in the statistics literature [38]. It is also related to the so-called Jacobi ensemble in random matrix theory [39], [40]. The joint eigenvalue distribution of the $F$ matrix (equivalently, the Jacobi ensemble) has been well documented in the literature [38], [39], [41]. The extreme eigenvalues of $F$ under the null have been characterized in [27, 28, 29] in terms of the hypergeometric function of one matrix argument. To gain more insight into the behavior of the extreme eigenvalues, the focus has shifted to various asymptotic domains (high dimensionality or high SNR). In this respect, various asymptotic expressions for the extreme eigenvalues under the null have been established in [42, 43, 23, 24]. Recently, capitalizing on new contour integral representations of hypergeometric functions of matrix arguments by [33, 44, 31, 32, 30], several new asymptotic results (including phase transition phenomena) for the maximum eigenvalue under the alternative have been established [25]. Also, the authors of [4, 26, 36] have employed the Stieltjes transform technique to relax the Gaussian assumption, thereby establishing the universality of the above results. Despite those asymptotic results, a finite-dimensional characterization of the maximum eigenvalue under the alternative hypothesis has been an open problem. Therefore, in this paper, we attack this problem by exploiting orthogonal polynomial techniques due to Mehta [39] to obtain a closed-form solution. In particular, we derive an expression containing a determinant whose dimension depends on the difference between $m$ and $n$. Consequently, this property is used to establish an interesting asymptotic result on the maximum eigenvalue under the alternative hypothesis.

III C.D.F. of the Maximum Eigenvalue

Before proceeding further, we present some fundamental results pertaining to the joint eigenvalue distribution of an $F$ -matrix and Jacobi polynomials.

III-A Preliminaries

Definition 1

Let $\mathbf{W}_{1}\sim\mathcal{W}_{m}\left(p,\boldsymbol{\Sigma}\right)$ and $\mathbf{W}_{2}\sim\mathcal{W}_{m}\left(n,\mathbf{I}_{m}\right)$ be two independent Wishart matrices with $p,n\geq m$ . Then the joint eigenvalue density of the ordered eigenvalues, $\lambda_{1}\leq\lambda_{2}\leq\ldots\leq\lambda_{m}$ , of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ is given by [41]

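$$f(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})=\frac{\mathcal{K}_{1}(m,n,p)}{\text{det}^{p}(\boldsymbol{\Sigma})}\prod_{j=1}^{m}\lambda_{j}^{p-m}\,\Delta_{m}^{2}(\boldsymbol{\lambda})\;{}_{1}\widetilde{F}_{0}\left(p+n;-\boldsymbol{\Sigma}^{-1},\boldsymbol{\Lambda}\right)\tag{8}$$

where ${}_{1}\widetilde{F}_{0}\left(\cdot;\cdot,\cdot\right)$ is the generalized complex hypergeometric function of two matrix arguments, $\Delta_{m}^{2}(\boldsymbol{\lambda})=\prod_{1\leq i<j\leq m}\left(\lambda_{j}-\lambda_{i}\right)^{2}$ is the square of the Vandermonde determinant, $\boldsymbol{\Lambda}=\text{diag}\left(\lambda_{m},\ldots,\lambda_{1}\right)$, and

$$\mathcal{K}_{1}(m,n,p)=\frac{\pi^{m(m-1)}\,\Gamma_{m}(n+p)}{\Gamma_{m}(m)\,\Gamma_{m}(n)\,\Gamma_{m}(p)}$$

where the complex multivariate gamma function is written in terms of the classical gamma function $\Gamma(\cdot)$ as

$$\Gamma_{m}(n)=\pi^{\frac{1}{2}m(m-1)}\prod_{j=1}^{m}\Gamma(n-j+1).$$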
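For numerical work, the normalization constant is best evaluated in the log domain, since the gamma functions overflow quickly. The sketch below codes the complex multivariate gamma function $\Gamma_{m}(a)=\pi^{m(m-1)/2}\prod_{j=1}^{m}\Gamma(a-j+1)$ and $\mathcal{K}_{1}(m,n,p)=\pi^{m(m-1)}\Gamma_{m}(n+p)/(\Gamma_{m}(m)\Gamma_{m}(n)\Gamma_{m}(p))$ as they appear in the equation list; the function names are hypothetical.

```python
import math

def log_complex_multigamma(m, a):
    """log of Gamma_m(a) = pi^{m(m-1)/2} * prod_{j=1}^{m} Gamma(a - j + 1)."""
    return (0.5 * m * (m - 1) * math.log(math.pi)
            + sum(math.lgamma(a - j + 1) for j in range(1, m + 1)))

def log_K1(m, n, p):
    """log of K_1(m, n, p) = pi^{m(m-1)} Gamma_m(n+p) / (Gamma_m(m) Gamma_m(n) Gamma_m(p))."""
    return (m * (m - 1) * math.log(math.pi)
            + log_complex_multigamma(m, n + p)
            - log_complex_multigamma(m, m)
            - log_complex_multigamma(m, n)
            - log_complex_multigamma(m, p))

# Sanity check for m = 1: K_1 reduces to 1 / B(n, p), the beta-prime normalization.
print(math.exp(log_K1(1, 3, 4)))   # Gamma(7) / (Gamma(3) Gamma(4)), approximately 60
print(math.exp(log_K1(2, 3, 4)))   # the powers of pi cancel, approximately 3600
```

The $m=1$ case is a useful check because the joint density then collapses to the scalar beta-prime density $\lambda^{p-1}(1+\lambda)^{-(p+n)}/B(p,n)$.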

Definition 2

Jacobi polynomials can be defined as follows [45, eq. 5.112]

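$$P_{n}^{(a,b)}(x)=\sum_{k=0}^{n}\binom{n+a}{n-k}\binom{n+k+a+b}{k}\left(\frac{x-1}{2}\right)^{k}\quad\text{for }a,b>-1$$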

where $\binom{n}{k}=\frac{n!}{(n-k)!k!}$ with $n\geq k\geq 0$ .

We may alternatively express the Jacobi polynomial as [45]

$$P_{n}^{(a,b)}(x)=\binom{n+a}{n}\,{}_{2}F_{1}\left(-n,n+a+b+1;1+a;\frac{1-x}{2}\right)$$

where ${}_{2}F_{1}(\cdot;\cdot;\cdot)$ is the Gauss hypergeometric function. Following (10), the successive derivatives of the Jacobi polynomial can be written as

$$\frac{{\rm d}^{k}}{{\rm d}x^{k}}P_{n}^{(a,b)}(x)=2^{-k}(n+a+b+1)_{k}\,P_{n-k}^{(a+k,b+k)}(x)$$

where $(a)_{k}=a(a+1)\ldots(a+k-1)$ with $(a)_{0}=1$ denotes the Pochhammer symbol. It is noteworthy that, for a negative integer $-n$ with $n\in\mathbb{Z}^{+}$ , we have [45]

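$$(-n)_{k}=\left\{\begin{array}{ll}\frac{(-1)^{k}n!}{(n-k)!}&\text{if }0\leq k\leq n\\ 0&\text{if }k>n.\end{array}\right.$$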
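The Jacobi polynomial identities above are easy to cross-check numerically. The following sketch uses only the Python standard library: it evaluates the finite-sum form $P_{n}^{(a,b)}(x)=\sum_{k=0}^{n}\binom{n+a}{n-k}\binom{n+k+a+b}{k}\left(\frac{x-1}{2}\right)^{k}$ and the terminating ${}_{2}F_{1}$ form, then compares a finite-difference derivative with the $k=1$ case of the derivative formula. The function names and test values are hypothetical.

```python
import math

def gen_binom(x, k):
    """Generalized binomial coefficient C(x, k) for real x and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (x - i) / (i + 1)
    return out

def pochhammer(a, k):
    """Rising factorial (a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

def jacobi_sum(n, a, b, x):
    """P_n^{(a,b)}(x) from the finite sum over binomial coefficients."""
    return sum(gen_binom(n + a, n - k) * gen_binom(n + k + a + b, k)
               * ((x - 1.0) / 2.0) ** k for k in range(n + 1))

def jacobi_2f1(n, a, b, x):
    """P_n^{(a,b)}(x) from the terminating Gauss hypergeometric series."""
    z = (1.0 - x) / 2.0
    series = sum(pochhammer(-n, k) * pochhammer(n + a + b + 1, k)
                 / (pochhammer(1 + a, k) * math.factorial(k)) * z ** k
                 for k in range(n + 1))
    return gen_binom(n + a, n) * series

n, a, b, x, step = 4, 0.5, 1.5, 0.2, 1e-5
direct = jacobi_sum(n, a, b, x)
via_2f1 = jacobi_2f1(n, a, b, x)
# Central difference versus (1/2)(n + a + b + 1) P_{n-1}^{(a+1, b+1)}(x).
numeric_derivative = (jacobi_sum(n, a, b, x + step) - jacobi_sum(n, a, b, x - step)) / (2 * step)
formula_derivative = 0.5 * (n + a + b + 1) * jacobi_sum(n - 1, a + 1, b + 1, x)
print(direct, via_2f1)
print(numeric_derivative, formula_derivative)
```

With $a=b=0$ the routine reduces to the Legendre polynomials, which gives an independent check; the Pochhammer helper also returns zero for $(-n)_{k}$ whenever $k>n$, in line with the case distinction above.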

III-B Finite Dimensional Analysis of the C.D.F.

Armed with these preliminary definitions, we now focus on deriving the new c.d.f. for the maximum eigenvalue of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ when the covariance matrix $\boldsymbol{\Sigma}$ takes the so-called rank-$1$ spiked form. That is, the covariance matrix can be decomposed as

$$\boldsymbol{\Sigma}=\mathbf{I}_{m}+\eta\mathbf{v}\mathbf{v}^{\dagger}=\mathbf{V}\,\text{diag}\left(1+\eta,1,1,\ldots,1\right)\mathbf{V}^{\dagger}\tag{12}$$

where $\mathbf{V}=\left(\mathbf{v}\;\mathbf{v}_{2}\;\ldots\;\mathbf{v}_{m}\right)\in\mathbb{C}^{m\times m}$ is a unitary matrix and $\eta\geq 0$. Before developing our method, it is important to highlight the difficulty of a direct solution via (8). Following Khatri [46], the hypergeometric function of two matrix arguments in the joint density (8) can be written as a ratio between the determinants of two $m\times m$ square matrices. Since the eigenvalues of $\boldsymbol{\Sigma}^{-1}$ are $1/(1+\eta)$ with algebraic multiplicity one and $1$ with algebraic multiplicity $m-1$, the resultant ratio takes an indeterminate form. Therefore, one has to apply L’Hospital’s rule repeatedly to obtain a determinate expression. However, the resulting expression is not amenable to Mehta’s [39] orthogonal polynomial technique. To apply that technique, we first derive an alternative joint eigenvalue density expression. This alternative derivation technique has been used earlier in [30] to derive a single contour integral representation for the joint eigenvalue density when the matrices are real (however, when the matrices are real, the hypergeometric function of two matrix arguments does not admit such a determinant representation). The following corollary gives the alternative joint density expression.

Corollary 3

Let $\mathbf{W}_{1}\sim\mathcal{W}_{m}(p,\mathbf{I}_{m}+\eta\mathbf{v}\mathbf{v}^{\dagger})$ and $\mathbf{W}_{2}\sim\mathcal{W}_{m}(n,\mathbf{I}_{m})$ be independent Wishart matrices with $m\leq p,n$ and $\eta\geq 0$ . Then the joint density of the ordered eigenvalues $0\leq\lambda_{1}\leq\lambda_{2}\leq\cdots\leq\lambda_{m}<\infty$ of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ is given by

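$$f(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})=f_{\text{uc}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})\,f_{\text{cor}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})\tag{13}$$

where

$$f_{\text{uc}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})=\mathcal{K}_{1}(m,n,p)\prod_{j=1}^{m}\frac{\lambda_{j}^{p-m}}{(1+\lambda_{j})^{p+n}}\,\Delta_{m}^{2}(\boldsymbol{\lambda}),$$

$$f_{\text{cor}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{m})=\frac{\mathcal{K}_{2}(m,n,p)}{\eta^{m-1}(1+\eta)^{p+1-m}}\prod_{j=1}^{m}(1+\lambda_{j})\sum_{k=1}^{m}\frac{(1+\lambda_{k})^{p+n-1}}{\prod_{\substack{j=1\\ j\neq k}}^{m}(\lambda_{k}-\lambda_{j})\left(1+\frac{\lambda_{k}}{\eta+1}\right)^{p+n+1-m}},$$

and

$$\mathcal{K}_{2}(m,n,p)=\frac{(m-1)!\,(p+n-m)!}{(p+n-1)!}.$$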

Proof: See Appendix A.

Remark 4

It is worth noting that the function $f_{\text{uc}}(\lambda_{1},\lambda_{2},\cdots,\lambda_{m})$ denotes the joint density of the ordered eigenvalues of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ corresponding to the case $\mathbf{W}_{1}\sim\mathcal{W}_{m}(p,\mathbf{I}_{m})$ and $\mathbf{W}_{2}\sim\mathcal{W}_{m}(n,\mathbf{I}_{m})$ .

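To facilitate further analysis, noting that the continuous mapping $h:x\mapsto\frac{x}{x+1},\;x\geq 0$ is strictly increasing (i.e., order preserving), we use the variable transformations

$$x_{j}=\frac{\lambda_{j}}{1+\lambda_{j}},\quad j=1,2,\ldots,m,$$

with $0\leq x_{1}\leq x_{2}\leq\cdots\leq x_{m}<1$ in (13) to obtain

$$g(x_{1},x_{2},\ldots,x_{m})=\frac{\mathcal{K}_{3}(m,n,p)}{\eta^{m-1}(1+\eta)^{p+1-m}}\prod_{j=1}^{m}x_{j}^{p-m}(1-x_{j})^{n-m}\,\Delta_{m}^{2}(\mathbf{x})\times\sum_{k=1}^{m}\frac{1}{\prod_{\substack{j=1\\ j\neq k}}^{m}(x_{k}-x_{j})\left(1-\frac{\eta}{\eta+1}x_{k}\right)^{p+n+1-m}}$$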

where $\mathcal{K}_{3}(m,n,p)=\mathcal{K}_{1}(m,n,p)\mathcal{K}_{2}(m,n,p)$ .
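The effect of the substitution $\lambda_{j}=x_{j}/(1-x_{j})$ on the density of Corollary 3 can be tracked factor by factor. The following is a bookkeeping sketch based on the expressions for $f_{\text{uc}}$ and $f_{\text{cor}}$ in the equation list, assuming $\Delta_{m}^{2}$ denotes the squared Vandermonde product; the resulting exponent count is a reconstruction and should be checked against the published paper.

$$\begin{aligned}
1+\lambda_{j}&=\frac{1}{1-x_{j}}, & {\rm d}\lambda_{j}&=\frac{{\rm d}x_{j}}{(1-x_{j})^{2}},\\
\lambda_{k}-\lambda_{j}&=\frac{x_{k}-x_{j}}{(1-x_{k})(1-x_{j})}, & 1+\frac{\lambda_{k}}{\eta+1}&=\frac{1-\frac{\eta}{\eta+1}x_{k}}{1-x_{k}}.
\end{aligned}$$

Collecting the powers of $(1-x_{j})$ gives

$$\underbrace{(n+m)}_{\lambda_{j}^{p-m}(1+\lambda_{j})^{-(p+n)}}\;\underbrace{-\,2(m-1)}_{\Delta_{m}^{2}(\boldsymbol{\lambda})}\;\underbrace{-\,2}_{\text{Jacobian}}\;\underbrace{-\,1}_{\prod_{j}(1+\lambda_{j})}\;\underbrace{+\,1}_{k\text{-th summand}}\;=\;n-m,$$

where the last contribution arises because each summand of $f_{\text{cor}}$ produces the common factor $\prod_{j=1}^{m}(1-x_{j})$ after the substitution. Under these assumptions, the transformed density carries the Jacobi weight $\prod_{j=1}^{m}x_{j}^{p-m}(1-x_{j})^{n-m}$ together with $\Delta_{m}^{2}(\mathbf{x})$, multiplied by a sum over $k$ whose terms are $\left[\prod_{j\neq k}(x_{k}-x_{j})\left(1-\frac{\eta}{\eta+1}x_{k}\right)^{p+n+1-m}\right]^{-1}$.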

The transformed joint eigenvalue density $g(x_{1},x_{2},\ldots,x_{m})$ in turn facilitates the use of Mehta’s orthogonal polynomial approach in our subsequent c.d.f. analysis.

Remark 5

Alternatively, $g(x_{1},x_{2},\ldots,x_{m})$ represents the joint density of the ordered eigenvalues of the deformed Jacobi ensemble, $\mathbf{W}_{1}(\mathbf{W}_{2}+\mathbf{W}_{1})^{-1}$, with $\mathbf{W}_{1}\sim\mathcal{W}_{m}(p,\mathbf{I}_{m}+\eta\mathbf{vv}^{\dagger})$ and $\mathbf{W}_{2}\sim\mathcal{W}_{m}(n,\mathbf{I}_{m})$.

We now consider the main contribution of this paper, namely, the derivation of the c.d.f. of the maximum eigenvalue. By definition, the c.d.f. of $x_{\max}$ (i.e., $x_{m}$) can be written as

$$F_{x_{\max}}(t)=\Pr\left(x_{\max}\leq t\right)=\int_{0\leq x_{1}\leq x_{2}\leq\cdots\leq x_{m}\leq t}g(x_{1},x_{2},\ldots,x_{m})\,{\rm d}\mathbf{x}$$

where, for notational concision, we have used ${\rm d}\mathbf{x}={\rm d}x_{1}{\rm d}x_{2}\ldots{\rm d}x_{m}$. By evaluating the above Selberg-type integral, we obtain the c.d.f. of $x_{\max}$ and hence the c.d.f. of $\lambda_{\max}$, which is given by the following theorem.

Theorem 6

Let $\mathbf{W}_{1}\sim\mathcal{W}_{m}(p,\mathbf{I}_{m}+\eta\mathbf{vv}^{\dagger})$ and $\mathbf{W}_{2}\sim\mathcal{W}_{m}(n,\mathbf{I}_{m})$ be independent with $m\leq p,n$ and $\eta\geq 0$ . Then the c.d.f. of the maximum eigenvalue $\lambda_{\max}$ of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ is given by

[TABLE]

where

[TABLE]

and

[TABLE]

with $\alpha=n-m$ and $\beta=p-m$ .

Proof: See Appendix B.

Remark 7

Alternatively, $\Phi_{\text{i}}(t,\eta)$ can be expressed in terms of Gauss hypergeometric function as follows

[TABLE]

The new exact c.d.f. expression for the maximum eigenvalue of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ contains the determinant of a square matrix whose dimension depends on the difference $\alpha=n-m$. It is therefore highly desirable when the difference between $m$ and $n$ is small, irrespective of their individual magnitudes. For instance, when $n=m$ ($\alpha=0$) the determinant reduces to a single term and we obtain a scalar result. This concise result is one of the many advantages of using the orthogonal polynomial approach. This key representation also facilitates the derivation of the limiting distribution of the maximum eigenvalue (when $m,n\to\infty$ such that $m-n$ is fixed).

For some special values of $\alpha$ and $\eta$ , the c.d.f. expression (18) admits the following simple forms.

Corollary 8

The exact c.d.f. of the maximum eigenvalue of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ when $\eta=0$ is given by

[TABLE]

Proof: Following (7), it is easy to see that, when $\eta=0$, all the elements in the first column of the determinant in (18) become zero except the first entry, which is $(p-1)!(n+p-1)!/(m+p-1)!$. Therefore, we expand the determinant along its first column and shift the indices $i$ and $j$ to conclude the proof.

Alternative expressions for the c.d.f. and p.d.f. of $x_{\max}$ ($x_{\max}=\lambda_{\max}/(1+\lambda_{\max})$) in the same scenario ($\eta=0$) are given in [27] and [28], respectively. However, these results are structurally different from our expression (20), since they contain complex hypergeometric functions of one matrix argument. In particular, the matrix argument in [27] assumes the form $t\mathbf{I}_{m}$, whereas the matrix argument in [28] takes the form $t\mathbf{I}_{\alpha-1}$. Further simplification of these expressions requires the repeated application of L’Hospital’s rule followed by the evaluation of the resultant determinants, which is a cumbersome process. In contrast, the c.d.f. expression (20) does not suffer from these drawbacks.

Corollary 9

The exact c.d.f. of the maximum eigenvalue of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ when $\alpha=0$ is given by ( $t\geq 0$ )

[TABLE]

Proof: When $\alpha=0$ , the determinant in (18) reduces to a single term given by

[TABLE]

Noting that ${}_{2}F_{1}(a,b;b;z)={}_{1}F_{0}(a;z)=(1-z)^{-a}$, some algebraic manipulations conclude the proof.
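The hypergeometric reduction used in this proof can be confirmed with a truncated Gauss series. The sketch below uses plain Python; the helper name `hyp2f1_series`, the truncation length, and the test values are illustrative assumptions, and the series is only valid for $|z|<1$.

```python
def hyp2f1_series(a, b, c, z, terms=200):
    """Partial sum of the Gauss series sum_k (a)_k (b)_k / ((c)_k k!) z^k for |z| < 1."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Ratio of consecutive series terms.
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * z
    return total

a, b, z = 2.5, 4.0, 0.3
series_value = hyp2f1_series(a, b, b, z)   # second and third parameters coincide
closed_form = (1.0 - z) ** (-a)
print(series_value, closed_form)
```

When the second and third parameters coincide, the factors $(b)_{k}/(b)_{k}$ cancel in every term and the series becomes the binomial series of $(1-z)^{-a}$.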

In the sequel, this remarkably simple result (21) is used to establish an important high dimensional limit for the maximum eigenvalue. Also, we have, for $\eta_{2}>\eta_{1}>0$ ,

[TABLE]

Having established the finite dimensional c.d.f. results, we now focus on the asymptotic characterization of the maximum eigenvalue.

III-C Asymptotic Analysis of the C.D.F.

Here we characterize the asymptotic behavior of the maximum eigenvalue of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ by deriving various limiting c.d.f. expressions for (18). In particular, we focus on the suitably centered and scaled maximum eigenvalue in the following two important scenarios:

1. As $m,n,p\to\infty$ such that $\alpha$, $\beta$, and $\eta$ are fixed.

2. As $m,n,p,\eta\to\infty$ such that $\frac{m}{n}\to 1$, $\frac{m}{p}\to c\in(0,1]$, and $\frac{\eta}{m}\to\theta\geq 0$.

The asymptotic behavior of the Jacobi ensemble has been thoroughly studied in the literature ([42], [40], [43] and references therein). For instance, Johnstone [42] has shown that, for a large class of Jacobi ensembles, the properly centered and scaled maximum eigenvalue (in the high dimensional limit) admits a Tracy-Widom type limiting distribution. Recently, Ioana [28] has derived a new limiting p.d.f. expression for the maximum and minimum eigenvalues of the Jacobi ensemble for certain new asymptotic regimes. Despite the differences in the asymptotic regimes of their choice, one common feature of all the above-mentioned investigations is that $\mathbf{W}_{1}$ and $\mathbf{W}_{2}$ are white Wishart matrices. In contrast, more recently, the high dimensional limit of the maximum eigenvalue (including the so-called universality) has been established when $\mathbf{W}_{1}$ has certain spiked covariance structures (akin to the structure given in (12)) [25], [4], [35], [36], [26]. Most importantly, those authors have observed a so-called phase transition (also known as the BBP phase transition) phenomenon associated with the maximum eigenvalue. In a nutshell, phase transition means that, in the high dimensional limit, when $\eta$ is below a certain critical threshold, the maximum eigenvalue does not separate from the rest of the eigenvalues (to be precise, it converges almost surely to the upper support of the limiting spectral density [4], [36], [25]), whereas when $\eta$ is above the threshold, it separates from the rest of the eigenvalues (it converges almost surely to a location above the upper support of the limiting spectral density [25], [36]). Despite all these efforts, the behavior of the maximum eigenvalue in the above two asymptotic regimes has not been addressed in the literature. Therefore, in what follows we give limiting c.d.f. expressions pertaining to the above two scenarios.

Theorem 10

As $m$ , $p$ and $n$ tend to $\infty$ such that $\alpha=n-m$ , $\beta=p-m$ , and $\eta$ are fixed, the centered and scaled maximum eigenvalue $\displaystyle(1+\lambda_{\max})/m^{2}$ converges in distribution to a random variable $X$ with the c.d.f. $F^{(\alpha)}_{X}(x;\eta)$ . In particular, we have

[TABLE]

where $\mathcal{I}_{k}(z)$ is the $k$ -th order modified Bessel function of the first kind.

Proof: See Appendix D.

It is interesting to see that the limiting c.d.f. is independent of $\eta$ . Due to this independence, (22) should be the limiting c.d.f. for $\eta=0$ as well. However, an alternative expression for the limiting p.d.f. of $x_{\max}$ when $\eta=0$ has been given in [28]. That particular expression contains a hypergeometric function of one matrix argument, and therefore does not admit a simple form. In contrast, the limiting c.d.f. (22) is simple from the representation as well as numerical evaluation perspectives. Since (22) has the same form under both hypotheses, the maximum eigenvalue based test does not have power in this particular regime.

The following theorem characterizes the maximum eigenvalue in one of the most important high dimensional settings, namely the second scenario outlined above.

Theorem 11

As $m$ , $p$ , $n$ , and $\eta$ tend to $\infty$ such that $m/n\to 1$ , $m/p\to c\in(0,1]$ , and $\eta/m\to\theta\geq 0$ , the centered and scaled maximum eigenvalue $\displaystyle(1+\lambda_{\max})/m^{2}$ converges in distribution to a random variable $X$ with the c.d.f. $F_{X}(x;c,\theta)$ . In particular, we have

[TABLE]

Proof: Following (21), we take $\alpha=0$ and $p=m/c$ to yield

[TABLE]

from which we obtain, noting that $\eta=\theta m$ ,

[TABLE]

The final result now follows by evaluating the limits as $m\to\infty$ .

This remarkably simple limiting c.d.f. sheds some new light on the behavior of the maximum eigenvalue in this particular asymptotic domain. Following [47], [4], we can easily show that, for $m/n\to 1$ and $m/p\to c\in(0,1]$ , the upper support of the limiting spectral density diverges to infinity for fixed $\eta$ (following [47], [48] we can show that the exact limiting spectral density takes the form $\frac{\sqrt{x-a}}{\pi x(x+c)}$ , where $a=(1-c)^{2}/4\leq x<\infty$ ). Therefore, under this scaling, the operational regime is below the phase transition, where the maximum eigenvalue has no detection power [25], [4]. In contrast, when $\eta$ also scales with $m$ , it turns out (see the next section) that the maximum eigenvalue has detection power, as shown in Theorem 11. The reason is that all the earlier results treated $\eta$ as a constant when dealing with the high dimensional limits. This new simple result shows that, when $n,p$ and $\eta$ scale with $m$ , an interesting new phenomenon occurs.
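As a quick sanity check on the limiting spectral density quoted above, $\frac{\sqrt{x-a}}{\pi x(x+c)}$ on $x\geq a=(1-c)^{2}/4$ , the following worked integral (a sketch added here, not part of the original derivation; the substitution variable $u$ is introduced only for this check) confirms that it integrates to one for every $c\in(0,1]$ :

```latex
\begin{align*}
\int_{a}^{\infty}\frac{\sqrt{x-a}}{\pi x(x+c)}\,{\rm d}x
&=\frac{2}{\pi}\int_{0}^{\infty}\frac{u^{2}}{(u^{2}+a)(u^{2}+a+c)}\,{\rm d}u
&&(x=a+u^{2})\\
&=\frac{2}{\pi c}\int_{0}^{\infty}\left[\frac{a+c}{u^{2}+a+c}-\frac{a}{u^{2}+a}\right]{\rm d}u\\
&=\frac{1}{c}\left(\sqrt{a+c}-\sqrt{a}\right)
=\frac{1}{c}\left(\frac{1+c}{2}-\frac{1-c}{2}\right)=1 .
\end{align*}
```

The integrand decays only like $x^{-3/2}/\pi$ as $x\to\infty$ , so the support is unbounded, which is consistent with the statement that the upper support edge sits at infinity.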

Armed with the finite and asymptotic characteristics of the maximum eigenvalue of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ , we next focus on the ROC curve of the maximum eigenvalue based detector.

IV ROC of the Maximum Eigenvalue of $\widehat{\bf{\Psi}}$

We now investigate the behavior of detection and false alarm probabilities of the maximum eigenvalue based test. To this end, noting that the eigenvalues of $\widehat{\boldsymbol{\Psi}}$ and $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ are related by $\hat{\lambda}_{j}=(n/p)\lambda_{j}$ , for $j=1,2,\ldots,m$ , we represent the c.d.f. of the maximum eigenvalue corresponding to $\widehat{\boldsymbol{\Psi}}$ as $F_{\lambda_{\max}}^{(\alpha)}(\kappa x;\gamma)$ , where $\kappa=p/n$ . For convenient presentation, we treat the finite dimensional and asymptotic behaviors of the ROC in two separate subsections.
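For completeness, the rescaling step behind $\kappa=p/n$ can be spelled out in one line. Because $n/p>0$ , the map $\lambda\mapsto(n/p)\lambda$ preserves the ordering of the eigenvalues, so the largest eigenvalue of $\widehat{\boldsymbol{\Psi}}$ is $(n/p)\lambda_{\max}$ (the symbol $\hat{\lambda}_{\max}$ below is shorthand introduced for this note):

```latex
\begin{align*}
\Pr\left(\hat{\lambda}_{\max}\leq x\right)
=\Pr\left(\tfrac{n}{p}\,\lambda_{\max}\leq x\right)
=\Pr\left(\lambda_{\max}\leq\kappa x\right)
=F_{\lambda_{\max}}^{(\alpha)}(\kappa x;\gamma),
\qquad \kappa=\frac{p}{n}.
\end{align*}
```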

IV-A Finite Dimensional Analysis

We first consider the case where the matrix dimensions ( $m,n$ , and $p$ ) are finite. Now following Theorem 6 and Corollary 8 along with (6), (7), the detection and false alarm probabilities can be written, respectively, as

[TABLE]

In general, deriving a functional relationship between $P_{D}$ and $P_{F}$ by eliminating the parametric dependency on $\mu_{\text{th}}$ is challenging. However, when $\alpha=0$ , an explicit relationship between them is specified in Corollary 12.

Corollary 12

For notational brevity, we suppress the parameters $\gamma$ and $\mu_{\text{th}}$ and represent the detection and false alarm probabilities, simply as $P_{D}$ and $P_{F}$ . Then, when $\alpha=0$ , $P_{D}$ and $P_{F}$ are functionally related as

[TABLE]

From (26), taking $P_{D}$ as a function of $\gamma$ , we can easily see that, for $\gamma_{1}>\gamma_{2}$ ,

[TABLE]

This confirms the common observation that the SNR is positively correlated with the detection probability for a fixed value of $P_{F}$ .
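The monotonicity in $\gamma$ can be illustrated numerically. The sketch below assumes the $\alpha=0$ closed form $P_{D}=1-(1-P_{F})/\left[1+\gamma\left(1-(1-P_{F})^{1/mp}\right)\right]^{p}$ . This expression is inferred from the low-SNR coefficient quoted later in this section and is an assumption of this note, not a reproduction of (26); the function name `pd_alpha0` is likewise hypothetical.

```python
import math


def pd_alpha0(pf, gamma, m, p):
    """Detection probability for m = n (alpha = 0).

    ASSUMED closed form (inferred, not copied from the paper):
        P_D = 1 - (1 - P_F) / [1 + gamma * (1 - (1 - P_F)**(1/(m*p)))]**p
    """
    if not 0.0 < pf < 1.0:
        raise ValueError("pf must lie strictly between 0 and 1")
    # 1 - (1 - pf)**(1/(m*p)), evaluated with expm1/log1p for accuracy
    one_minus_u = -math.expm1(math.log1p(-pf) / (m * p))
    return 1.0 - (1.0 - pf) * math.exp(-p * math.log1p(gamma * one_minus_u))


# P_D at P_F = 0.1, m = n = 10, p = 15 for increasing linear SNR values
for snr in (0.0, 1.0, 5.0, 10.0):
    print(snr, pd_alpha0(0.1, snr, 10, 15))
```

At $\gamma=0$ the function returns $P_{F}$ itself, and the printed values increase with $\gamma$ , in line with the ordering stated above.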

The ROC curves corresponding to different parameter settings are shown in Figs. 1 and 2, together with the power profile as a function of SNR for different $P_{F}$ values. As can be seen, for a fixed $P_{F}$ , the power increases with the SNR, which is consistent with our intuition. The ROC of maximum eigenvalue based detection is shown in Fig. 1b for several SNR ( $\gamma$ ) values, which clearly shows that the ROC profile improves with increasing SNR. Since the next important parameter determining the ROC profile is the dimensionality of the covariance matrices, we investigate its effect on the ROC profile. To this end, Fig. 2a shows the effect of $m/n$ for $m/p=1$ . As can be seen, a larger disparity between $m$ and $n$ improves the ROC profile. The reason behind this observation is that the quality of the sample covariance matrix improves when the length of the data record ( $n$ ) increases in comparison with the dimensionality of the receiver ( $m$ ). Since the minimum requirement for $\bf{\widehat{R}}$ to be invertible is $m=n$ , we observe that the worst ROC performance corresponds to $m/n=1$ . Therefore, the effect of $m/p$ on the ROC for $m/n=1$ is shown in Fig. 2b. As can be seen, for constant $p$ , increasing $m$ degrades the ROC profile. Since we have a closed-form ROC equation for $m/n=1$ , we conduct a deeper investigation on the joint effect of $m$ and $p$ on the ROC.

The joint effect of $m$ and $p$ is characterized in two scenarios. In particular, we consider i) varying $p$ for fixed $m$ and ii) $m$ and $p$ both vary such that $m/p=\nu$ , where $\nu>0$ is a constant. Since $p$ and $m$ take integer values only, the analysis is intractable. To circumvent this difficulty, we let $p$ and $m$ be continuous. We can thus write the derivative of $P_{D}$ with respect to $p$ as

[TABLE]

from which we obtain using the inequality $\ln z\geq 1-1/z$ , $\frac{{\rm d}P_{D}}{{\rm d}p}>0$ . This in turn reveals that $P_{D}$ increases with $p$ for all $\gamma$ and $P_{F}$ , which is consistent with our intuition. The next immediate question of whether $P_{D}$ is bounded as $p\to\infty$ is answered in the sequel.
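The sign argument can be sketched explicitly under one assumption: that the $\alpha=0$ ROC has the closed form $P_{D}=1-q\,[1+\gamma(1-u)]^{-p}$ with $q=1-P_{F}$ and $u=q^{1/(mp)}$ . This form is inferred from the low-SNR coefficient given later in the section rather than copied from (26), and the shorthand symbols $q$ , $u$ , and $g$ are introduced only for this note.

```latex
\begin{align*}
g &= 1+\gamma(1-u), \qquad u=q^{1/(mp)}\in(0,1), \qquad q=1-P_{F},\\
\frac{{\rm d}P_{D}}{{\rm d}p}
&= q\,g^{-p}\left[\ln g+\frac{\gamma\,u\ln u}{g}\right]
\qquad\left(\text{using } p\,\frac{{\rm d}u}{{\rm d}p}=-u\ln u\right),\\
\ln g &\geq 1-\frac{1}{g}=\frac{\gamma(1-u)}{g},
\qquad u\ln u\geq u-1,\\
\frac{{\rm d}P_{D}}{{\rm d}p}
&\geq q\,g^{-p}\,\frac{\gamma(1-u)+\gamma(u-1)}{g}=0 .
\end{align*}
```

Both bounds follow from $\ln z\geq 1-1/z$ (applied at $z=g$ and $z=u$ ), and each is strict unless $z=1$ , so the derivative is strictly positive whenever $\gamma>0$ and $0<P_{F}<1$ .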

We now focus on the second scenario. As such, noting that $m/p=\nu$ , we can write the derivative of $P_{D}$ as a function of $p$ to yield

[TABLE]

A careful inspection of the right hand expression reveals that it has only one stationary point. However, the direct evaluation of the stationary point based on the above expression does not yield any closed-form solution. Therefore, to gain insights into the $p$ value which maximizes/minimizes $P_{D}$ , in what follows, we derive a tight bound for the stationary point. To this end, first we concentrate on the $p$ values for which $\frac{{\rm d}P_{D}}{{\rm d}p}<0$ for all $\gamma$ and $P_{F}$ . As such, we use the inequalities [49]

[TABLE]

and $z\ln z<z(z-1)$ , $z>0$ to obtain

[TABLE]

Therefore, $\frac{{\rm d}P_{D}}{{\rm d}p}<0$ is strict in the regime where

[TABLE]

Again, using the inequalities [49], $\ln(1+z)>2z/(2+z),\;z>0$ and $\ln z>(1-z)/\sqrt{z},\;0<z<1$ , we have

[TABLE]

This in turn gives that $\frac{{\rm d}P_{D}}{{\rm d}p}>0$ for

[TABLE]

Thus, we conclude that $P_{D}$ attains its maximum at $p=p^{*}$ , where

[TABLE]

Having obtained the upper and lower bounds on $p^{*}$ , a good approximation of $p^{*}$ can be written as follows (in general, any convex combination of the upper and lower bounds can be a candidate for $p^{*}$ ):

[TABLE]

To further highlight the accuracy of the proposed approximation, in Fig. 3 we compare the optimal ROC profiles evaluated based on (30) and by numerically optimizing (26). As can be seen from the figure, the disparity between the proposed approximation and the exact optimal solution is insignificant. Therefore, when $m=n$ , under the second scenario, we can choose $p$ as per (30) for fixed $P_{F}$ , $\gamma$ , and $\nu$ in view of maximizing the detection probability.
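The existence of an interior maximizer in the second scenario can be illustrated by brute force. The sketch below is not the bound-based approximation (30), whose expression is not reproduced here; it simply scans a grid of continuous $p$ values with $m=\nu p$ , using an assumed $\alpha=0$ closed form for $P_{D}$ that is inferred from the low-SNR coefficient quoted later in the section. The names `pd_alpha0` and `best_p` and the grid limits are hypothetical choices.

```python
import math


def pd_alpha0(pf, gamma, m, p):
    """ASSUMED alpha = 0 closed form:
    P_D = 1 - (1 - P_F) / [1 + gamma * (1 - (1 - P_F)**(1/(m*p)))]**p."""
    if not 0.0 < pf < 1.0:
        raise ValueError("pf must lie strictly between 0 and 1")
    one_minus_u = -math.expm1(math.log1p(-pf) / (m * p))
    return 1.0 - (1.0 - pf) * math.exp(-p * math.log1p(gamma * one_minus_u))


def best_p(pf, gamma, nu, p_lo=0.5, p_hi=20.0, step=0.01):
    """Grid search for the continuous p that maximizes P_D when m = nu * p."""
    best = (p_lo, pd_alpha0(pf, gamma, nu * p_lo, p_lo))
    steps = int(round((p_hi - p_lo) / step))
    for k in range(1, steps + 1):
        p = p_lo + k * step
        value = pd_alpha0(pf, gamma, nu * p, p)
        if value > best[1]:
            best = (p, value)
    return best


# Example: P_F = 0.1, linear SNR gamma = 100, nu = m/p = 0.5
p_star, pd_star = best_p(0.1, 100.0, 0.5)
print(p_star, pd_star)
```

As in the text, $p$ and $m$ are treated as continuous, so the maximizer found this way would still need to be rounded to admissible integer dimensions in practice.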

The detection of a very weak signal embedded in noise is particularly challenging. In this respect, it is of paramount importance to understand the behavior of $P_{D}$ as a function of SNR in the low SNR regime. To this end, we need to analytically characterize $P_{D}$ around $\gamma=0$ , which is the focus of Corollary 13.

Corollary 13

As $\gamma\to 0$ , for a fixed value of $P_{F}$ , $P_{D}(\gamma)$ admits the following form

[TABLE]

where

[TABLE]

with

[TABLE]

and $G(z)$ being the inverse function of $F_{\lambda_{\max}}^{(\alpha)}(z;0)$ .

The proof follows by obtaining the Taylor expansion of $P_{D}(\gamma)$ in the vicinity of $\gamma=0$ .

Let us now examine the factors affecting weak signal detection with the proposed scheme. Since the ROC curve for the case $n>m$ is too complicated, we confine ourselves to the scenario $m=n$ . Moreover, as we have already seen, this scenario may result in the worst possible ROC and hence serves as a benchmark. Therefore, any improvement in this case will further enhance other ROC curves. Clearly, for very low SNR values, the most critical factor which determines the power is the coefficient of $\gamma$ given by $p\left[1-\left(1-P_{F}\right)^{1/mp}\right]\left(1-P_{F}\right)$ . Since this coefficient depends on two parameters $m$ and $p$ for fixed $P_{F}$ , we investigate the power profile when these parameters are related as follows: i) fixed $m$ , $p$ varies, ii) $m$ and $p$ both vary such that $m/p=k\in(0,1]$ , and iii) $m$ and $p$ both vary such that $p-m$ is a constant. It is easy to show that, under both options (ii) and (iii), the coefficient degrades when we increase both $p$ and $m$ . In contrast, when $m$ is fixed, the coefficient gradually improves when we increase $p$ . To show this, we rewrite the above coefficient, omitting the factor $(1-P_{F})$ , as a function of $p$ to yield

[TABLE]

Now we treat $p$ as a continuous variable and differentiate $a(p)$ over $p$ to yield

[TABLE]

Noting the inequality $\ln z\geq 1-1/z$ , we can easily show that $\frac{{\rm d}}{{\rm d}p}a(p)\geq 0$ for all $p,m$ . This in turn establishes that $a(p)$ is a non-decreasing function of $p$ . The next natural question is whether there exists an upper bound for $a(p)$ as $p$ grows large. A simple limiting argument involving L’Hôpital’s rule will then give

[TABLE]

Therefore, we can conclude that a power enhancement is expected in the low SNR regime if we increase $p$ for fixed $m$ and $P_{F}$ . In particular, in the low SNR regime (i.e., as $\gamma\to 0$ ), we have

[TABLE]
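The behavior of the low-SNR slope can be checked numerically. The sketch below evaluates $a(p)=p\left[1-(1-P_{F})^{1/mp}\right]$ , which is the coefficient stated in the text without the factor $(1-P_{F})$ , and compares it with the candidate large-$p$ limit $-\ln(1-P_{F})/m$ . That limit is derived here from $e^{z}\approx 1+z$ and is offered as a cross-check, not as a transcription of the paper's displayed result; the function names are hypothetical.

```python
import math


def a_coeff(p, m, pf):
    """a(p) = p * [1 - (1 - P_F)**(1/(m*p))], computed stably with expm1."""
    return -p * math.expm1(math.log1p(-pf) / (m * p))


def a_limit(m, pf):
    """Candidate large-p limit of a(p): -ln(1 - P_F) / m (derived for this note)."""
    return -math.log1p(-pf) / m


m, pf = 10, 0.1
for p in (10, 20, 50, 100, 1000):
    print(p, a_coeff(p, m, pf), a_limit(m, pf))
```

The printed values of $a(p)$ grow with $p$ and stay below the limit, matching the non-decreasing behavior established above.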

To further assess the quality of the derived first order approximations, here we numerically evaluate the relative error between the exact $P_{D}(\gamma)$ and the corresponding first order expansions given in (33). To be precise, we define the relative error as

[TABLE]

where $P_{D}^{\text{f.o.}}(\gamma)$ stands for the first order expansions given in (33). Figure 4a depicts the behavior of the relative error as a function of $P_{F}$ for a set of small values of $\gamma$ . The other parameters have been chosen as $m=n=10$ and $p=15$ . Fig. 4a shows that decreasing $\gamma$ reduces the relative error, which is anticipated. Fig. 4b shows the relative error versus $P_{F}$ curve for a set of small values of $\gamma$ when $m=n=10$ and $p=20$ . Although we can observe the general trend of the relative error shrinking as $\gamma$ decreases, for a given $\gamma$ , the relative error is maximized at a certain value of $P_{F}$ . However, the analytical determination of this value seems an arduous task. The reduction in relative error with increasing $p$ is depicted in Fig. 5. It is interesting to observe that the relative error does not deviate much from the corresponding asymptotic limit even for small finite values of $p$ when $\gamma$ is moderately low.
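A rough numerical counterpart of this comparison is sketched below for $m=n=10$ and $p=15$ . Two things are assumed rather than taken from the source: the $\alpha=0$ closed form for the exact $P_{D}$ (inferred from the first order coefficient in the text), and the relative error definition $|P_{D}-P_{D}^{\text{f.o.}}|/P_{D}$ , since the displayed definition is not reproduced here. All function names are hypothetical.

```python
import math


def one_minus_u(pf, m, p):
    """1 - (1 - P_F)**(1/(m*p)), evaluated stably."""
    return -math.expm1(math.log1p(-pf) / (m * p))


def pd_exact(pf, gamma, m, p):
    """ASSUMED alpha = 0 closed form for the detection probability."""
    s = one_minus_u(pf, m, p)
    return 1.0 - (1.0 - pf) * math.exp(-p * math.log1p(gamma * s))


def pd_first_order(pf, gamma, m, p):
    """First order expansion P_F + gamma * p * [1 - (1-P_F)**(1/mp)] * (1-P_F)."""
    return pf + gamma * p * one_minus_u(pf, m, p) * (1.0 - pf)


def relative_error(pf, gamma, m, p):
    """ASSUMED definition: |exact - first order| / exact."""
    exact = pd_exact(pf, gamma, m, p)
    return abs(exact - pd_first_order(pf, gamma, m, p)) / exact


for snr in (0.1, 0.05, 0.01):
    print(snr, relative_error(0.1, snr, 10, 15))
```

Under these assumptions the error shrinks as $\gamma$ decreases, which mirrors the trend described for Fig. 4a.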

Having completed the finite-dimensional analysis, we now examine the ROC behavior in the asymptotic regime.

IV-B Asymptotic Analysis

Here we analyze the ROC profile in three important asymptotic regimes. In particular, we consider the following three regimes:

1. As $m,n,p\to\infty$ such that $\alpha,\beta$ and $\gamma$ are fixed,

2. As $p\to\infty$ such that $m=n$ and $\gamma$ are fixed,

3. As $m,n,p,\gamma\to\infty$ such that $\frac{m}{n}\to 1$ , $\frac{m}{p}\to c\in(0,1]$ , and $\frac{\gamma}{m}\to\theta\geq 0$ .

Following Theorem 10, we can easily see that the maximum eigenvalue has no detection power in the first regime. Therefore, we now turn our attention to the second and third regimes. The asymptotic ROC pertaining to the second scenario can be obtained with the help of Corollary 12 as

[TABLE]

It is noteworthy that this convergence is uniform in $\gamma$ . The asymptotic ROC corresponding to the third regime is given by the following corollary.

Corollary 14

As $m,n,p,\gamma\to\infty$ such that $\frac{m}{n}\to 1$ , $\frac{m}{p}\to c\in(0,1]$ , and $\frac{\gamma}{m}\to\theta\geq 0$ , the ROC admits the following asymptotic limit

[TABLE]

Since the above asymptotic ROC profile is independent of $c$ , this expression should be valid for $c=0$ as well. Therefore, we can extend the domain of $c$ such that $c\in[0,1]$ . Clearly, when $\theta=0$ ( $\gamma$ does not scale with $m$ ), the maximum eigenvalue has no detection power in the high dimension. This is consistent with what has been reported in [35] on the power of the maximum eigenvalue below the phase transition. In contrast, when $\gamma$ scales with $m$ , in the high dimension, the maximum eigenvalue still retains its detection power. For instance, when $\theta\to 0$ (the signal component is extremely weak), we have

[TABLE]

This valuable insight is of paramount importance in detecting signals over fading channels. For instance, for Rayleigh fading, which is the most commonly used statistical model in the literature, $\mathbf{h}$ takes the form $\mathbf{h}\sim\mathcal{CN}_{m}\left(\mathbf{0},\mathbf{I}_{m}\right)$ . Now, by invoking the strong law of large numbers, we obtain

[TABLE]

This in turn shows that $\gamma\propto m$ as $m\to\infty$ for Rayleigh fading channels. This is a clear testament to the utility of our new asymptotic ROC profile given in Corollary 14 in wireless applications.
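The law-of-large-numbers step can be illustrated with a small simulation. The sketch below draws $\mathbf{h}\sim\mathcal{CN}_{m}\left(\mathbf{0},\mathbf{I}_{m}\right)$ by sampling independent real and imaginary parts with variance $1/2$ and reports $||\mathbf{h}||^{2}/m$ . It only shows the concentration of the normalized channel energy around one; how $\gamma$ depends on $\mathbf{h}$ is defined earlier in the paper and is not modeled here. The function name and the seed are arbitrary choices.

```python
import random


def normalized_channel_energy(m, rng):
    """Return ||h||^2 / m for h ~ CN_m(0, I_m).

    Each entry has independent real and imaginary parts distributed N(0, 1/2),
    so that E|h_i|^2 = 1.
    """
    sigma = 0.5 ** 0.5
    total = 0.0
    for _ in range(m):
        re = rng.gauss(0.0, sigma)
        im = rng.gauss(0.0, sigma)
        total += re * re + im * im
    return total / m


rng = random.Random(2019)  # fixed seed so the run is repeatable
for m in (10, 100, 1000, 10000):
    print(m, normalized_channel_energy(m, rng))
```

Since each $|h_{i}|^{2}$ has unit mean and unit variance, the fluctuation of the average is of order $1/\sqrt{m}$ , which is why the ratio settles near one as $m$ grows.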

The above dynamics are depicted in Figs. 6 and 7. In particular, Fig. 6 compares the analytical ROC profiles with the numerical results for an increasing sequence of $m$ values when $\alpha=1,\beta=2$ , and $\gamma=5\;{\rm dB}$ are fixed. As can be seen from the figure, when $m$ increases the ROC profiles move arbitrarily close to the $P_{D}=P_{F}$ curve, thereby demonstrating the loss of power of the test. This observation is consistent with what we have analytically shown for the regime where $\alpha$ and $\beta$ are fixed with $\gamma=5\;{\rm dB}$ . The effect of increasing $p$ on the ROC profile is depicted in Fig. 7. The analytical curves are based on (37), and the close match between the analytical and simulation results can be seen from the figure. This in turn shows that the analytical asymptotic result (as $p\to\infty$ ) derived in (37) serves as a good approximation for finite values of $p$ as well. Finally, Fig. 7 compares the analytical asymptotic result for the third regime, where $m,n,p,\gamma\to\infty$ such that $\frac{m}{n}\to 1$ , $\frac{m}{p}\to c\in(0,1]$ , and $\frac{\gamma}{m}\to\theta\geq 0$ , with the simulation results. Again, the close match between the two reveals that our asymptotic analytical expression serves as a good approximation to the finite dimensional case as well. These results clearly indicate that, when $\gamma$ scales with $m$ , the maximum eigenvalue retains its detection power, whereas it loses the detection power when $\gamma$ does not scale with $m$ .
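The second and third regimes can also be explored numerically under an assumed $\alpha=0$ closed form for $P_{D}$ (inferred from the low-SNR coefficient in Section IV-A, not copied from the paper). Under that assumption the large-$p$ limit is $1-(1-P_{F})^{1+\gamma/m}$ and the jointly scaled limit is $1-(1-P_{F})^{1+\theta}$ ; both limits are derived for this sketch and should be read as candidates to compare against (37) and Corollary 14, which are not reproduced here. The third regime is evaluated with $m=n$ exactly, a special case of $m/n\to 1$ .

```python
import math


def pd_alpha0(pf, gamma, m, p):
    """ASSUMED alpha = 0 closed form:
    P_D = 1 - (1 - P_F) / [1 + gamma * (1 - (1 - P_F)**(1/(m*p)))]**p."""
    one_minus_u = -math.expm1(math.log1p(-pf) / (m * p))
    return 1.0 - (1.0 - pf) * math.exp(-p * math.log1p(gamma * one_minus_u))


def pd_limit_large_p(pf, gamma, m):
    """Candidate limit as p -> infinity with m = n and gamma fixed."""
    return 1.0 - (1.0 - pf) ** (1.0 + gamma / m)


def pd_limit_scaled(pf, theta):
    """Candidate limit when gamma / m -> theta; it does not depend on c."""
    return 1.0 - (1.0 - pf) ** (1.0 + theta)


pf, theta = 0.1, 1.0
for m in (10, 100, 1000):
    for c in (0.25, 1.0):
        # p = m / c and gamma = theta * m mimic the third regime at finite size
        print(m, c, pd_alpha0(pf, theta * m, m, m / c), pd_limit_scaled(pf, theta))
```

Setting $\theta=0$ in `pd_limit_scaled` returns $P_{F}$ , which corresponds to the loss of detection power when $\gamma$ does not scale with $m$ .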

V Conclusion

This paper investigates the signal detection problem in colored noise with unknown covariance matrix. The presence of a signal is detected by using the maximum generalized eigenvalue of the whitened sample covariance matrix. Equivalently, we need to determine the distribution of the maximum eigenvalue of the deformed Jacobi unitary ensemble. To this end, we exploited the powerful orthogonal polynomial approach to develop a new c.d.f. expression of the maximum eigenvalue of the deformed JUE. Subsequently, we used it to determine the ROC of the detector. It turns out that, for a fixed SNR, when $m$ (i.e., the dimensionality of the detector), $n$ (i.e., the number of noise-only samples), and $p$ (i.e., the number of signal-plus-noise samples) increase over finite values such that $m=n$ and $m/p$ is constant, we obtain an optimal ROC profile corresponding to specific $m,n$ , and $p$ values. In contrast, in the above setting, when $m,p$ , and $n$ increase asymptotically, the maximum eigenvalue gradually loses its detection power. This is not surprising, since under the above asymptotic setting, the detector operates below the so called phase transition where the maximum eigenvalue has no detection power. However, when the SNR scales with $m$ , in the same asymptotic regime, the maximum eigenvalue retains its detection power. This fact is of paramount importance in detecting a signal in colored noise over fading channels (Rayleigh fading) where the SNR scales with the dimensionality of the system. Clearly, $m=n$ is the minimum requirement for the noise-only covariance matrix to be full rank (or nearly rank deficient). Therefore, some of the key results developed in this paper related to the setting $m=n$ shed some light on the regime where the noise-only covariance matrix is nearly rank deficient. However, the analysis pertaining to the regime where the latter matrix is fully rank deficient remains an important open problem.

Appendix A Proof of the joint density of the eigenvalues

Following James [41], we can write the joint density of the eigenvalues of $\mathbf{W}_{1}\mathbf{W}_{2}^{-1}$ as

[TABLE]

where $\alpha=p+n$ and ${\rm{d}}\mathbf{U}$ is the invariant measure on the unitary group $U(m)$ , normalized to make the total measure unity. Let us now focus on simplifying the above matrix integral. To this end, we use (12) to rewrite

[TABLE]

where $\boldsymbol{\bar{\Lambda}}=\boldsymbol{\Lambda}(\mathbf{I}_{m}+\boldsymbol{\Lambda})^{-1}=\text{diag}\left(\bar{\lambda}_{m},\cdots,\bar{\lambda}_{1}\right)=\text{diag}\left(\frac{\lambda_{m}}{1+\lambda_{m}},\cdots,\frac{\lambda_{1}}{1+\lambda_{1}}\right)$ . Therefore, after some algebra, we obtain

[TABLE]

where ${\rm d}\mathbf{H}$ is the invariant measure on the unitary group $U(m)$ , normalized to make the total measure unity. Since $\boldsymbol{\Lambda}_{\eta}$ is rank one, we can further simplify the above matrix integral to yield

[TABLE]

Now it is worth observing that

[TABLE]

This in turn enables us to utilize the relation

[TABLE]

to express the above matrix integral as

[TABLE]

where

[TABLE]

and we have taken the liberty of changing the order of integration. Noting the fact that

[TABLE]

we may use the splitting formula [41, Eq. 92] to yield

[TABLE]

Following [34], we can show that

[TABLE]

from which we obtain upon substituting into (46) with some algebra

[TABLE]

Finally, using (A) in (41) with some algebraic manipulation we obtain (13), which concludes the proof.

Appendix B Proof of the c.d.f. of the maximum eigenvalue

By exploiting the symmetry, the ordered region of integration in (17) can be rearranged as an unordered region to yield

[TABLE]

where $[0,t]^{m}=[0,t]\times[0,t]\times\ldots\times[0,t]$ with $\times$ denoting the Cartesian product. Since each term in the above summation contributes the same amount to the final solution, it can be further simplified as

[TABLE]

where,

[TABLE]

Here we have relabeled the variables as $\alpha=n-m$ , $\beta=p-m$ and $\gamma=m+\alpha+\beta+1$ for notational concision. To facilitate further analysis, let us decompose the Vandermonde determinant as

[TABLE]

and relabel the variables $x_{1}=y$ and $x_{j}=z_{j-1}$ , $j=2,3,...,m$ , to obtain

[TABLE]

where $\textbf{z}\in\mathbb{R}^{m-1}$ . Now we apply the variable transformations $y=tx$ and $z_{j}=ts_{j}$ , $j=1,2,...,m-1$ , to make the region of integration independent of $t$ in (53). Consequently we have after some algebraic manipulations

[TABLE]

where,

[TABLE]

Following Appendix C, we can solve the above multidimensional integral to yield

[TABLE]

where

[TABLE]

and $h_{t}=\frac{2}{t}-1$ . Using (56) in (54) with some algebraic manipulation we have

[TABLE]

Having observed that only the first column of the determinant in the integrand depends on $x$ , we can rewrite the above integral as

[TABLE]

For clarity, let us focus on the integral in the above equation. In this respect, we may use the relation (10) followed by the variable transformation $y=1-x$ to arrive at

[TABLE]

which can be solved using [50, eq. 399.6] to obtain

[TABLE]

To facilitate further analysis, noting that $\frac{\eta t}{1+\eta}<1$ , we may replace the hypergeometric function with its equivalent infinite series expansion to yield

[TABLE]

Since the Gamma function has poles at negative integer values including zero, the above series is nonzero if the argument of $\Gamma\left(3-m-i+k\right)=\Gamma(3-m-i)\left(3-m-i\right)_{k}$ is a positive integer. To this end, $k$ should satisfy the inequality $k\geq m+i-2$ . Therefore, by relabeling summation index $k$ as $j=k-m-i+2$ , we obtain

[TABLE]

The above infinite series can be rearranged by using the addition formula $(a)_{n+k}=(a)_{n}(a+n)_{k}$ with some algebraic manipulations to yield

[TABLE]

where $a_{i}=\beta+m+i-1$ , $b_{i}=\gamma+m+i-2$ , and $c_{i}=\beta+2m+2i-2$ . Now we substitute (64) into (B) followed by some algebraic manipulations to obtain the c.d.f. of $x_{\max}$ as

[TABLE]
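The Pochhammer addition formula $(a)_{n+k}=(a)_{n}(a+n)_{k}$ used in the rearrangement above can be verified exactly with rational arithmetic. The helper below is a minimal sketch added for illustration; the name `pochhammer` is a hypothetical choice.

```python
from fractions import Fraction


def pochhammer(a, k):
    """Rising factorial (a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    result = Fraction(1)
    for j in range(k):
        result *= a + j
    return result


# Exact check of (a)_{n+k} = (a)_n (a+n)_k on a few rational values of a
for a in (Fraction(1, 2), Fraction(3), Fraction(-5, 3)):
    for n in range(4):
        for k in range(4):
            lhs = pochhammer(a, n + k)
            rhs = pochhammer(a, n) * pochhammer(a + n, k)
            print(a, n, k, lhs == rhs)
```

Because `Fraction` arithmetic is exact, the comparison is an identity check rather than a floating point approximation.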

Now (18) with $\Phi(t,\eta)$ given by (7) follows by transforming the variable $x_{\max}$ to $\lambda_{\max}$ using the functional relation $\lambda_{\max}=x_{\max}/(1-x_{\max})$ . Finally, noting that $c_{i}-b_{i}$ is a negative integer, we may use the hypergeometric transformation [51, eq. 15.3.4],

[TABLE]

to arrive at the finite series form of $\Phi(t,\eta)$ , thereby concluding the proof.

Appendix C

Let us change the region of integration in (55) from $[0,1]^{m}$ to $[-1,1]^{m}$ by using the variable transformation $s_{j}=\frac{1+z_{j}}{2}$ , $j=1,2,...,m$ , to yield

[TABLE]

where

[TABLE]

with $h_{t}=\frac{2}{t}-1$ and $\textbf{z}\in\mathbb{R}^{m}$ . Our strategy is to start with a related integral given in [39, Eqs. 22.4.2, 22.4.11] as

[TABLE]

where

[TABLE]

and $C_{k}(x)$ are monic polynomials orthogonal with respect to the weight $(1+x)^{\beta}$ , over $-1\leq x\leq 1$ . Since Jacobi polynomials are orthogonal with respect to the preceding weight, we use $C_{k}(x)=2^{k}\frac{(k+\beta)!(k)!}{(2k+\beta)!}P_{k}^{(0,\beta)}(x)$ in (69) to obtain

[TABLE]

where,

[TABLE]
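The normalization of $C_{k}(x)$ stated above can be checked exactly. The sketch below computes the leading coefficient of $P_{k}^{(a,b)}(x)$ from the standard explicit sum $P_{k}^{(a,b)}(x)=\sum_{s}\binom{k+a}{k-s}\binom{k+b}{s}\left(\frac{x-1}{2}\right)^{s}\left(\frac{x+1}{2}\right)^{k-s}$ and confirms that multiplying $P_{k}^{(0,\beta)}$ by $2^{k}\frac{(k+\beta)!\,k!}{(2k+\beta)!}$ gives a monic polynomial. The function names are hypothetical, and integer $\beta\geq 0$ is assumed so that factorials apply.

```python
from fractions import Fraction
from math import comb, factorial


def jacobi_leading_coeff(k, a, b):
    """Leading coefficient of P_k^{(a,b)}(x) for non-negative integers a, b.

    Every term of the explicit sum contributes x^k / 2^k times a product of
    binomial coefficients, so the leading coefficient is their sum over 2^k.
    """
    total = sum(comb(k + a, k - s) * comb(k + b, s) for s in range(k + 1))
    return Fraction(total, 2 ** k)


def monic_factor(k, beta):
    """Factor 2^k (k+beta)! k! / (2k+beta)! quoted in the text."""
    numerator = 2 ** k * factorial(k + beta) * factorial(k)
    return Fraction(numerator, factorial(2 * k + beta))


for k in range(5):
    for beta in range(3):
        print(k, beta, jacobi_leading_coeff(k, 0, beta) * monic_factor(k, beta))
```

Every printed product equals one, which follows from the Vandermonde identity $\sum_{s}\binom{k}{k-s}\binom{k+\beta}{s}=\binom{2k+\beta}{k}$ .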

In the above, $r_{i}$ s are generally distinct parameters. Nevertheless, if we choose $r_{i}$ such that

[TABLE]

then the left side of (70) coincides with the multidimensional integral of our interest in (68). Under the above parameter selection, however, the right side of (70) takes the indeterminate form $0/0$ . Therefore, we have to evaluate the following limit:

[TABLE]

To this end, following Khatri [46], we have

[TABLE]

Now the determinant in the denominator of (73) simplifies as

[TABLE]

The numerator can be rewritten with the help of (11) as

[TABLE]

Substituting the above two expressions into (73) and then the result into (72) gives

[TABLE]

Appendix D Proof of the microscopic limit of the c.d.f. of the maximum eigenvalue

Let us rewrite (B), keeping in mind $\alpha=n-m$ , $\beta=p-m$ , and $\gamma=m+\alpha+\beta+1$ , as

[TABLE]

where

[TABLE]

Following (10), the Jacobi polynomial $P_{m+i-j}^{(j-2,\beta+j-2)}$ can be written as

[TABLE]

from which we obtain

[TABLE]

To facilitate further analysis, we need to eliminate the dependence of summation upper limit on $i$ . To this end, we decompose the two Pochhammer symbols in the numerator of the above summation as

[TABLE]

and

[TABLE]

Therefore, we obtain

[TABLE]

where,

[TABLE]

and

[TABLE]

Now we substitute (80) into (74) with some algebraic manipulation to yield

[TABLE]

from which we obtain after some rearrangements

[TABLE]

For convenience, let us rewrite the above equation as

[TABLE]

where

[TABLE]

Further manipulation of $\mathcal{V}_{i}(m,\alpha,\beta,\eta,t)$ in its current form is an arduous task due to the presence of the hypergeometric function. To this end, noting that $(\alpha+\beta+2m+i-1)-(\beta+2m+2i-2)=-(\alpha+1-i)$ , which is a negative integer, we use the hypergeometric transformation (66) to arrive at

[TABLE]

A careful inspection of (85) reveals that the suitable scaling as $m\to\infty$ would be to consider the scaled $t$ given by $t=1-\dfrac{x}{m^{2}}$ . Consequently, we can write (85) as

[TABLE]

Now taking the limit of both sides of (88) as $m\to\infty$ yields

[TABLE]

Towards taking the limit inside the determinant, let us first consider the $\displaystyle\lim_{m\to\infty}\mathcal{V}_{i}\left(m,\alpha,\beta,\eta,1-\dfrac{x}{m^{2}}\right)$ . To this end, noting that $\displaystyle\lim_{m\to\infty}\frac{(m+i-1)_{\beta}}{(m-1)_{\beta}}=1$ and $\displaystyle\lim_{m\to\infty}\frac{(2m+\beta+2i-2)_{\alpha-i+1}}{(m+\beta+i-1)_{\alpha-i+1}}=2^{\alpha-i+1}$ , we may determine the limit of (87) as

[TABLE]

where $\mathcal{T}_{i}(\eta)=2^{\alpha}\left(\frac{\eta}{2}\right)^{i-1}\left(1+\frac{\eta}{2}\right)^{\alpha-i+1}$
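The two Pochhammer ratio limits quoted above can be checked numerically for large $m$ . The sketch below uses exact integer products and is only an illustration of the stated limits; the function names and the sample values of $\alpha$ , $\beta$ , and $m$ are arbitrary choices.

```python
def pochhammer(a, k):
    """Rising factorial (a)_k for integer a and k >= 0, computed exactly."""
    result = 1
    for j in range(k):
        result *= a + j
    return result


def ratio_first(m, i, beta):
    """(m+i-1)_beta / (m-1)_beta, which should approach 1 as m grows."""
    return pochhammer(m + i - 1, beta) / pochhammer(m - 1, beta)


def ratio_second(m, i, alpha, beta):
    """(2m+beta+2i-2)_{alpha-i+1} / (m+beta+i-1)_{alpha-i+1},
    which should approach 2**(alpha-i+1) as m grows."""
    k = alpha - i + 1
    return pochhammer(2 * m + beta + 2 * i - 2, k) / pochhammer(m + beta + i - 1, k)


alpha, beta = 3, 2
for m in (10, 1000, 100000):
    for i in range(1, alpha + 2):
        print(m, i, ratio_first(m, i, beta), ratio_second(m, i, alpha, beta),
              2 ** (alpha - i + 1))
```

Each ratio is a product of a fixed number of factors of the form $(m+\text{const})/(m+\text{const})$ or $(2m+\text{const})/(m+\text{const})$ , which is why the limits are $1$ and $2^{\alpha-i+1}$ respectively.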

Let us now consider the other columns of the determinant in (89). Following (82), we may rewrite $\mathcal{U}(m,\alpha,\beta)$ as

[TABLE]

where $c_{j}=m+\alpha-j-k_{j}+1$ and $\Delta_{m}=2m+\alpha+\beta-1$ . Consequently, the terms in the determinant in (89) can be rearranged as

[TABLE]

Towards making the determinant independent of $\Delta_{m}$ , we perform the following row operations

[TABLE]

on each row, starting from the second row, to yield

[TABLE]

To facilitate further simplification, noting that

[TABLE]

set the $1^{st}$ element of the $1^{st}$ column to $1$ and

[TABLE]

we apply the row operation $R_{i}\to R_{i}+R_{i-1}$ , for $i=3,4,...,\alpha+1$ , repeatedly to obtain

[TABLE]

Here the exact forms of the $*$ marked entries are omitted, since they do not contribute to the evaluation of the determinant. As such, by expanding the determinant using the first column, we have

[TABLE]

The above determinant can be simplified using [52, Lemma A.1] to yield

[TABLE]

where $\tilde{\Delta}_{\alpha}(\tilde{\textbf{c}})=\prod_{1\leq j<i\leq\alpha}\left(\tilde{c}_{i}-\tilde{c}_{j}\right)$ with $\tilde{\textbf{c}}=\left\{\tilde{c}_{1}(k_{2}),\tilde{c}_{2}(k_{3}),\cdots,\tilde{c}_{\alpha}(k_{\alpha+1})\right\}$ and $\tilde{c}_{j}(k_{j+1})=j+k_{j+1}$ . Now we substitute the above result into (89) to obtain

[TABLE]

For notational convenience, the index $j$ is shifted forward by one unit to yield

[TABLE]

where,

[TABLE]

and $\Delta_{\alpha}({\textbf{c}})=\prod_{1\leq j<i\leq\alpha}\left({c}_{i}-{c}_{j}\right)$ with ${\textbf{c}}=\left\{{c}_{1}(k_{1}),{c}_{2}(k_{2}),\cdots,{c}_{\alpha}(k_{\alpha})\right\}$ and ${c}_{j}(k_{j})=j+k_{j}$ . Having noted that $\Delta_{\alpha}({\textbf{c}})$ is independent of $m$ and

[TABLE]

we evaluate the limit of $\mathcal{S}_{k_{j}}\left(1-\dfrac{x}{m^{2}}\right)$ as

[TABLE]

Therefore, (90) simplifies to

[TABLE]

from which we obtain using [53, Appendix B]

[TABLE]

The above result implies that,

[TABLE]

Finally, noting that

[TABLE]

we may use the continuous mapping theorem [54] to obtain (22), which concludes the proof.

References

[1] H. L. V. Trees, Detection, Estimation, and Modulation Theory. New York Chichester: Wiley, 2001.
[2] H. L. V. Trees, Optimum Array Processing. New York: Wiley-Interscience, 2002.
[3] R. R. Nadakuditi and A. Edelman, “Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples,” IEEE Trans. Signal Process., vol. 56, no. 7, pp. 2625–2638, Jul. 2008.
[4] R. R. Nadakuditi and J. W. Silverstein, “Fundamental limit of sample generalized eigenvalue based detection of signals in noise using relatively few signal-bearing and noise-only samples,” IEEE J. Sel. Topics Signal Process., vol. 4, no. 3, pp. 468–480, Jun. 2010.
[5] P. Bianchi, M. Debbah, M. Maida, and J. Najim, “Performance of statistical tests for single-source detection using random matrix theory,” IEEE Trans. Inf. Theory, vol. 57, no. 4, pp. 2400–2419, Apr. 2011.
[6] R. Couillet and W. Hachem, “Fluctuations of spiked random matrix models and failure diagnosis in sensor networks,” IEEE Trans. Inf. Theory, vol. 59, no. 1, pp. 509–525, Jan. 2013.
[7] N. Asendorf and R. R. Nadakuditi, “The performance of a matched subspace detector that uses subspaces estimated from finite, noisy, training data,” IEEE Trans. Signal Process., vol. 61, no. 8, pp. 1972–1985, Apr. 2013.
[8] N. Asendorf and R. R. Nadakuditi, “Improved detection of correlated signals in low-rank-plus-noise type data sets using informative canonical correlation analysis (ICCA),” IEEE Trans. Inf. Theory, vol. 63, no. 6, pp. 3451–3467, Jun. 2017.
