Robust High Dimensional Expectation Maximization Algorithm via Trimmed Hard Thresholding

Di Wang; Xiangyu Guo; Shi Li; Jinhui Xu

arXiv:2010.09576·stat.ML·October 20, 2020

TL;DR

This paper introduces a robust high-dimensional EM algorithm with trimming and hard thresholding steps, capable of handling arbitrarily corrupted samples in sparse latent variable models, with proven convergence and applicability to multiple models.

Contribution

It proposes a novel robust EM algorithm with trimming and hard thresholding, providing theoretical guarantees and broad applicability to high-dimensional corrupted data scenarios.

Findings

01

Algorithm converges geometrically under mild conditions.

02

Applies to three canonical models: mixture of Gaussians, mixture of regressions, and linear regression with missing covariates.

03

Supports corruption levels up to $\frac{1}{\sqrt{n}}$.

Abstract

In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space (i.e., $d \gg n$) where the underlying parameter is assumed to be sparse. Specifically, we propose a method called Trimmed (Gradient) Expectation Maximization which adds a trimming gradients step and a hard thresholding step to the Expectation step (E-step) and the Maximization step (M-step), respectively. We show that under some mild assumptions and with an appropriate initialization, the algorithm is corruption-proofing and converges to the (near) optimal statistical rate geometrically when the fraction of the corrupted samples $\epsilon$ is bounded by $\tilde{O}(\frac{1}{\sqrt{n}})$. Moreover, we apply our general framework to three canonical models: mixture of Gaussians, mixture of regressions and linear regression with missing covariates. Our…
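The two added steps named in the abstract (trimming the gradients, then hard thresholding) can be illustrated with a minimal sketch for a symmetric two-component mixture of Gaussians. This is not the authors' reference implementation. It assumes a coordinate-wise trimmed mean as the trimming rule, a known noise level `sigma`, and a known sparsity level `s`. The function names and the parameters `trim_frac`, `step`, and `iters` are hypothetical, and the paper's exact rule and constants may differ.

```python
import numpy as np

def trimmed_mean(grads, trim_frac):
    """Coordinate-wise trimmed mean (assumed trimming rule).

    For each coordinate, drop the k largest and k smallest values,
    k = floor(trim_frac * n), and average what is left.
    """
    n = grads.shape[0]
    k = int(np.floor(trim_frac * n))
    sorted_g = np.sort(grads, axis=0)
    return sorted_g[k:n - k].mean(axis=0)

def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    if s > 0:
        idx = np.argpartition(np.abs(v), -s)[-s:]
        out[idx] = v[idx]
    return out

def gmm_per_sample_grads(Y, beta, sigma=1.0):
    """Per-sample gradients of the EM surrogate for y = z * beta + noise, z in {-1, +1}.

    The posterior weight w satisfies 2w - 1 = tanh(<beta, y> / sigma^2),
    so sample i contributes tanh(<beta, y_i> / sigma^2) * y_i - beta.
    """
    resp = np.tanh(Y @ beta / sigma**2)
    return resp[:, None] * Y - beta

def trimmed_gradient_em(Y, beta0, s, trim_frac, step=1.0, iters=50, sigma=1.0):
    """Gradient EM with a trimmed gradient and a hard-thresholded update."""
    beta = hard_threshold(beta0, s)
    for _ in range(iters):
        g = trimmed_mean(gmm_per_sample_grads(Y, beta, sigma), trim_frac)
        beta = hard_threshold(beta + step * g, s)
    return beta
```

To use it, pass an `n × d` sample matrix `Y`, an initial guess `beta0` reasonably close to the true parameter, and a `trim_frac` at least as large as the suspected corruption fraction (and below 0.5). Sorting each coordinate independently keeps the trimming step simple, at a cost of `O(n d log n)` per iteration.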

Taxonomy

Methods: Linear Regression