Statistical analysis for a penalized EM algorithm in high-dimensional mixture linear regression model

Ning Wang; Xin Zhang; Qing Mai

arXiv:2307.11405·math.ST·July 24, 2023·J. Mach. Learn. Res.·2 cites

TL;DR

This paper introduces a novel group lasso penalized EM algorithm for high-dimensional mixture linear regression, providing theoretical guarantees without sample-splitting and demonstrating strong numerical performance.

Contribution

The paper develops a new penalized EM algorithm for high-dimensional mixture regression that avoids sample-splitting and extends to multivariate responses, with proven statistical properties.

Findings

01 The algorithm performs well in numerical experiments.

02 Theoretical analysis confirms statistical consistency.

03 No sample-splitting is required for convergence.

Abstract

The expectation-maximization (EM) algorithm and its variants are widely used in statistics. In high-dimensional mixture linear regression, the model is assumed to be a finite mixture of linear regressions and the number of predictors is much larger than the sample size. The standard EM algorithm, which attempts to find the maximum likelihood estimator, becomes infeasible for such a model. We devise a group lasso penalized EM algorithm and study its statistical properties. Existing theoretical results for regularized EM algorithms often rely on dividing the sample into many independent batches and using a fresh batch of samples in each iteration of the algorithm. Our algorithm and theoretical analysis do not require sample-splitting, and can be extended to multivariate response cases. The proposed methods also show encouraging performance in numerical studies.
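To make the idea of a group lasso penalized EM iteration concrete, here is a minimal sketch in Python. It is an illustration, not the authors' algorithm. It rests on several assumptions that the abstract does not state: a K-component Gaussian mixture of linear regressions with a known, shared noise variance `sigma2`; a penalty that groups each predictor's coefficients across the components (one row of the coefficient matrix `B` per group); and an M-step approximated by a few proximal gradient steps. The names `group_soft_threshold`, `penalized_em`, `lam`, `n_iter`, and `n_prox` are hypothetical. The full sample is reused in every iteration, with no sample-splitting.

```python
import numpy as np

def group_soft_threshold(B, t):
    """Shrink the l2 norm of each row of B by t, zeroing rows with norm <= t."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return B * scale

def penalized_em(X, y, B0, lam, n_iter=50, n_prox=20, sigma2=1.0):
    """Illustrative group lasso penalized EM for a mixture of linear regressions.

    X: (n, p) design, y: (n,) response, B0: (p, K) initial coefficients.
    sigma2 is treated as known (an assumption made for this sketch only).
    """
    n, p = X.shape
    K = B0.shape[1]
    B = B0.astype(float).copy()
    pi = np.full(K, 1.0 / K)
    # Step size 1/L, where L = ||X||_2^2 / n bounds the gradient's Lipschitz
    # constant because all responsibilities lie in [0, 1].
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each sample.
        resid = y[:, None] - X @ B
        log_w = np.log(np.clip(pi, 1e-12, None)) - 0.5 * resid**2 / sigma2
        log_w -= log_w.max(axis=1, keepdims=True)  # numerical stabilization
        gamma = np.exp(log_w)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: update mixing proportions in closed form.
        pi = gamma.mean(axis=0)
        # M-step: proximal gradient on the weighted least squares loss
        # (1/2n) * sum_ik gamma_ik * resid_ik^2 + lam * sum_j ||B[j, :]||_2.
        for _ in range(n_prox):
            resid = y[:, None] - X @ B
            grad = -X.T @ (gamma * resid) / n
            B = group_soft_threshold(B - step * grad, step * lam)
    return B, pi
```

Calling `penalized_em(X, y, B0, lam)` returns the estimated `(p, K)` coefficient matrix and the mixing proportions. Rows of the coefficient matrix that are exactly zero correspond to predictors dropped from every component, which is the effect of grouping across components in this sketch. The result depends on the initial value `B0` and on `lam`, both of which must be chosen by the user.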

Taxonomy

Topics: Bayesian Methods and Mixture Models · Survey Sampling and Estimation Techniques · Crystallization and Solubility Studies