Active covariance estimation by random sub-sampling of variables

Eduardo Pavez; Antonio Ortega

arXiv:1804.01620 · stat.ML · April 6, 2018

TL;DR

This paper introduces an active covariance estimation method for partially observed vectors, optimizing sub-sampling probabilities to improve estimation accuracy in high-dimensional settings with limited observations.

Contribution

It develops an unbiased covariance estimator for partially observed data and proposes an optimal sub-sampling strategy within an active learning framework.

Findings

01. Derived an error bound relating sub-sampling probabilities to covariance entries
02. Proposed an active covariance estimation algorithm with optimal sub-sampling design
03. Demonstrated improved estimation efficiency in high-dimensional, limited-observation scenarios

Abstract

We study covariance matrix estimation for the case of partially observed random vectors, where different samples contain different subsets of vector coordinates. Each observation is the product of the variable of interest with a $0$-$1$ Bernoulli random variable. We analyze an unbiased covariance estimator under this model, and derive an error bound that reveals relations between the sub-sampling probabilities and the entries of the covariance matrix. We apply our analysis in an active learning framework, where the expected number of observed variables is small compared to the dimension of the vector of interest, and propose a design of optimal sub-sampling probabilities and an active covariance matrix estimation algorithm.
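
The observation model in the abstract can be illustrated with a minimal sketch. It assumes zero-mean data, known sampling probabilities $p_i > 0$, and Bernoulli masks drawn independently across coordinates and samples; these assumptions, and the function names `subsample` and `unbiased_covariance`, are illustrative and not taken from the paper. Under them, a zero-filled observation $y$ satisfies $E[y_i y_j] = p_i p_j \Sigma_{ij}$ for $i \neq j$ and $E[y_i^2] = p_i \Sigma_{ii}$, so dividing the empirical second moment entrywise by those factors gives an unbiased estimate.

```python
import random


def subsample(x, probs, rng):
    """Keep coordinate i with probability probs[i]; replace it with 0 otherwise."""
    return [xi if rng.random() < p else 0.0 for xi, p in zip(x, probs)]


def unbiased_covariance(observations, probs):
    """Rescaled second moment of zero-filled, sub-sampled observations.

    Assumes zero-mean data, probs[i] > 0, and independent Bernoulli masks.
    """
    n = len(probs)
    num_samples = len(observations)
    est = [[0.0] * n for _ in range(n)]
    # Accumulate the sum of outer products y y^T.
    for y in observations:
        for i in range(n):
            for j in range(n):
                est[i][j] += y[i] * y[j]
    # Undo the attenuation caused by the masks:
    # p_i on the diagonal, p_i * p_j off the diagonal.
    for i in range(n):
        for j in range(n):
            scale = probs[i] if i == j else probs[i] * probs[j]
            est[i][j] /= num_samples * scale
    return est


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.8, 0.5]  # hypothetical sampling probabilities
    observations = []
    for _ in range(5000):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # x = (z1, z1 + z2) has true covariance [[1, 1], [1, 2]].
        observations.append(subsample([z1, z1 + z2], probs, rng))
    # The estimate should be close to the true covariance above.
    print(unbiased_covariance(observations, probs))
```

Lowering a probability $p_i$ reduces how often coordinate $i$ is observed but enlarges the rescaling factor, and with it the variance of the affected entries. This is the kind of trade-off a design of sub-sampling probabilities has to balance.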
