Better subset regression

Shifeng Xiong

arXiv:1212.0634 · stat.ME · March 20, 2013

Open Access

TL;DR

This paper introduces an EM-based algorithm for subset selection in high-dimensional linear regression. It shows that, under sparsity, subsets with better model fit also screen variables better, and reports that the method outperforms popular screening methods in simulations.

Contribution

It proposes an EM algorithm for best subset regression, called orthogonalizing subset screening, together with an accelerated version, and supports the resulting "better fitting, better screening" rule with theoretical and simulation results.

Findings

01

The method improves variable screening accuracy.

02

It outperforms popular screening methods in simulations.

03

The algorithms have a monotonicity property ensuring better model fitting.

Abstract

To find efficient screening methods for high-dimensional linear regression models, this paper studies the relationship between model fitting and screening performance. Under a sparsity assumption, we show that, in a general asymptotic setting, a subset that includes the true submodel always yields a smaller residual sum of squares (i.e., has better model fitting) than any subset that does not. This indicates that, for screening important variables, we could follow a "better fitting, better screening" rule, i.e., pick a "better" subset that has better model fitting. To seek such a better subset, we consider the optimization problem associated with best subset regression. An EM algorithm, called orthogonalizing subset screening, and its accelerated version are proposed for searching for the best subset. Although the two algorithms cannot guarantee that a subset they yield is the best, their…
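The "better fitting, better screening" rule can be illustrated on a toy problem. The sketch below is not the paper's orthogonalizing subset screening algorithm. It swaps in a brute-force search over all subsets of a fixed size d, which solves the best subset problem (minimize the residual sum of squares subject to using at most d variables) exactly but is feasible only for small p. The simulated design, the coefficient values, the noise level, and the helper names rss and best_subset are all assumptions made for illustration.

```python
import itertools
import numpy as np

def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit on the given columns."""
    Xs = X[:, list(subset)]
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ coef
    return float(resid @ resid)

def best_subset(X, y, d):
    """Exhaustive search over all size-d subsets (hypothetical helper, small p only)."""
    p = X.shape[1]
    return min(itertools.combinations(range(p), d), key=lambda s: rss(X, y, s))

# Simulated sparse model; every number here is an illustrative assumption.
rng = np.random.default_rng(0)
n, p, d = 100, 8, 4
true_support = {0, 1, 2}
beta = np.zeros(p)
beta[list(true_support)] = 3.0
X = rng.standard_normal((n, p))
y = X @ beta + 0.5 * rng.standard_normal(n)

# Split all size-d subsets by whether they contain the true submodel.
subsets = list(itertools.combinations(range(p), d))
covering = [rss(X, y, s) for s in subsets if true_support <= set(s)]
missing = [rss(X, y, s) for s in subsets if not true_support <= set(s)]

chosen = best_subset(X, y, d)
print("selected subset:", chosen)
print("largest RSS among subsets containing the true submodel:", max(covering))
print("smallest RSS among subsets missing a true variable:", min(missing))
```

With a strong signal and independent predictors, as assumed here, every subset that contains the true submodel fits better than every subset that does not, so the best-fitting subset also screens correctly. The exhaustive search grows combinatorially in p, which is the motivation for iterative algorithms such as the ones the abstract describes.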

Taxonomy

Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Control Systems and Identification