Optimal Covariance Estimation for Condition Number Loss in the Spiked Model

David L. Donoho; Behrooz Ghorbani

arXiv:1810.07403·math.ST·October 18, 2018·6 cites

TL;DR

This paper develops an asymptotically optimal nonlinear shrinker for covariance matrix estimation under condition number loss in the spiked model, with applications to multi-user covariance estimation and linear discriminant analysis.

Contribution

It introduces a nonlinear eigenvalue shrinker for the spiked covariance model that is asymptotically optimal, among orthogonally-equivariant procedures, under relative condition number loss.

Findings

01

Optimal shrinker depends on data aspect ratio and top eigenvalue.

02

Large aspect ratio leads to substantial eigenvalue shrinkage.

03

Diagonal covariance matrices can be optimal even with large eigenvalues.

Abstract

We study estimation of the covariance matrix under relative condition number loss $\kappa(\Sigma^{-1/2} \hat{\Sigma} \Sigma^{-1/2})$, where $\kappa(\Delta)$ is the condition number of matrix $\Delta$, and $\hat{\Sigma}$ and $\Sigma$ are the estimated and theoretical covariance matrices. Optimality in $\kappa$-loss provides optimal guarantees in two stylized applications: Multi-User Covariance Estimation and Multi-Task Linear Discriminant Analysis. We assume the so-called spiked covariance model for $\Sigma$, and exploit recent advances in understanding that model, to derive a nonlinear shrinker which is asymptotically optimal among orthogonally-equivariant procedures. In our asymptotic study, the number of variables $p$ is comparable to the number of observations $n$. The form of the optimal nonlinearity depends on the aspect ratio $\gamma = p / n$ of the data matrix and on the top…
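To make the loss in the abstract concrete, here is a minimal NumPy sketch of the relative condition number loss and of a generic orthogonally-equivariant estimator (keep the sample eigenvectors, transform the sample eigenvalues). The function names, the simulation settings, and in particular `toy_eta` are assumptions made for illustration. `toy_eta` is a simple placeholder rule, not the optimal shrinker derived in the paper, whose formula is not given on this page.

```python
import numpy as np


def relative_condition_number_loss(sigma, sigma_hat):
    """kappa(Sigma^{-1/2} Sigma_hat Sigma^{-1/2}) for symmetric positive definite inputs."""
    # Symmetric inverse square root of Sigma from its eigendecomposition.
    evals, evecs = np.linalg.eigh(sigma)
    inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.T
    delta = inv_sqrt @ sigma_hat @ inv_sqrt
    # Symmetrize to guard against round-off before taking eigenvalues.
    d = np.linalg.eigvalsh((delta + delta.T) / 2.0)
    # Condition number = largest eigenvalue / smallest eigenvalue.
    return d[-1] / d[0]


def shrink_eigenvalues(sample_cov, eta):
    """Orthogonally-equivariant estimator: same eigenvectors, eigenvalues mapped by eta."""
    evals, evecs = np.linalg.eigh(sample_cov)
    return (evecs * eta(evals)) @ evecs.T


def simulate(p=50, n=100, spike=10.0, seed=0):
    """Compare the raw sample covariance with a toy shrinker in a one-spike model."""
    rng = np.random.default_rng(seed)
    gamma = p / n
    # Spiked covariance: one large eigenvalue, all others equal to 1.
    sigma = np.eye(p)
    sigma[0, 0] = spike
    x = rng.standard_normal((n, p)) @ np.sqrt(sigma)
    sample_cov = x.T @ x / n

    # Hypothetical placeholder rule (NOT the paper's shrinker): eigenvalues at or
    # below the Marchenko-Pastur bulk edge (1 + sqrt(gamma))^2 are set to 1.
    bulk_edge = (1.0 + np.sqrt(gamma)) ** 2

    def toy_eta(lam):
        return np.where(lam > bulk_edge, lam, 1.0)

    loss_raw = relative_condition_number_loss(sigma, sample_cov)
    loss_toy = relative_condition_number_loss(
        sigma, shrink_eigenvalues(sample_cov, toy_eta)
    )
    return loss_raw, loss_toy


if __name__ == "__main__":
    raw, toy = simulate()
    print(f"kappa-loss, sample covariance: {raw:.2f}")
    print(f"kappa-loss, toy shrinker:      {toy:.2f}")
```

Because the loss is a ratio of eigenvalues, it is unchanged when $\hat{\Sigma}$ is rescaled by a positive constant, and it equals 1 exactly when $\hat{\Sigma}$ is proportional to $\Sigma$.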

Taxonomy

Topics: Random Matrices and Applications · Statistical Methods and Inference · Statistical Methods and Bayesian Inference