Missing values: sparse inverse covariance estimation and an extension to sparse regression

Nicolas Städler; Peter Bühlmann

arXiv:0903.5463·stat.ME·February 28, 2012·Stat. Comput.

TL;DR

This paper introduces an l1-regularized likelihood approach for estimating sparse inverse covariance matrices from data with values missing at random, and extends it to sparse regression. Optimization uses an efficient EM algorithm with provable numerical convergence properties.

Contribution

It presents an EM-based method for inverse covariance estimation in the high-dimensional multivariate normal model when data are missing at random, together with an extension to sparse regression. The EM algorithm comes with provable numerical convergence properties.

Findings

01 Handles values missing at random in the high-dimensional multivariate normal model

02 Both methods are demonstrated on simulated and real data

03 Provides an efficient EM algorithm for optimizing the non-convex observed negative log-likelihood

Abstract

We propose an l1-regularized likelihood method for estimating the inverse covariance matrix in the high-dimensional multivariate normal model in the presence of missing data. Our method is based on the assumption that the data are missing at random (MAR), which also covers the missing completely at random case. The implementation of the method is non-trivial, as the observed negative log-likelihood is generally a complicated and non-convex function. We propose an efficient EM algorithm for optimization with provable numerical convergence properties. Furthermore, we extend the methodology to handle missing values in a sparse regression context. We demonstrate both methods on simulated and real data.
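To make the EM idea in the abstract concrete, here is a minimal NumPy sketch of the E-step for a multivariate normal model with missing entries (encoded as NaN), wrapped in a plain EM loop. It is an illustration under stated assumptions, not the authors' implementation: the function names `e_step` and `em_gaussian` and the `m_step` hook are hypothetical, and the default M-step here is unpenalized (it simply takes the expected covariance). An l1-penalized variant would, under the assumption that the penalty acts on the inverse covariance, replace that step with a sparse inverse covariance solver applied to the same matrix `S`.

```python
import numpy as np


def e_step(X, mu, Sigma):
    """Expected sufficient statistics of a Gaussian given rows with NaNs.

    Returns the updated mean and the expected covariance matrix S
    that an M-step would consume.
    """
    n, p = X.shape
    X_hat = X.copy()
    C = np.zeros((p, p))  # summed conditional covariances of missing blocks
    for i in range(n):
        m = np.isnan(X[i])  # missing coordinates of row i
        if not m.any():
            continue
        o = ~m              # observed coordinates of row i
        if o.any():
            # B = Sigma_mo Sigma_oo^{-1}: regression of missing on observed.
            B = np.linalg.solve(Sigma[np.ix_(o, o)], Sigma[np.ix_(o, m)]).T
            X_hat[i, m] = mu[m] + B @ (X[i, o] - mu[o])
            C[np.ix_(m, m)] += Sigma[np.ix_(m, m)] - B @ Sigma[np.ix_(o, m)]
        else:
            # Nothing observed: fall back to the marginal distribution.
            X_hat[i, m] = mu[m]
            C[np.ix_(m, m)] += Sigma[np.ix_(m, m)]
    mu_new = X_hat.mean(axis=0)
    R = X_hat - mu_new
    S = (R.T @ R + C) / n
    return mu_new, (S + S.T) / 2  # symmetrize against rounding error


def em_gaussian(X, n_iter=50, m_step=None):
    """EM for a Gaussian with missing values.

    m_step is a hypothetical hook: a callable mapping S to a covariance
    estimate (e.g. a penalized solver). None means the unpenalized update.
    """
    mu = np.nanmean(X, axis=0)
    Sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        mu, S = e_step(X, mu, Sigma)
        Sigma = S if m_step is None else m_step(S)
    return mu, Sigma


# Usage on simulated data with entries removed completely at random.
rng = np.random.default_rng(0)
Sigma_true = 0.5 ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
X = rng.multivariate_normal(np.zeros(4), Sigma_true, size=200)
X[rng.random(X.shape) < 0.2] = np.nan
mu_hat, Sigma_hat = em_gaussian(X)
print(np.round(Sigma_hat, 2))
```

The E-step does more than impute conditional means: the matrix `C` adds back the conditional covariance of the missing coordinates, which plain mean imputation would drop and which would otherwise bias the covariance estimate downward. With fully observed data the loop reduces to the sample mean and the maximum-likelihood sample covariance.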
