On best subset regression

Shifeng Xiong

arXiv:1112.0918 · stat.ME · March 20, 2013

TL;DR

This paper studies the theory and computation of best subset regression: it shows that the method is consistent for variable selection and proposes OSS (orthogonalizing subset selection), an EM-based iterative algorithm, to address the computation.

Contribution

It introduces the OSS algorithm for best subset regression and proves the algorithm's convergence properties and its effectiveness for variable selection in high-dimensional settings.

Findings

01. Sparse estimator retains important variables asymptotically.

02. OSS improves model fit while maintaining sparsity.

03. Simulation and real data show the method's effectiveness.

Abstract

In this paper we discuss the variable selection method from $\ell_0$-norm constrained regression, which is equivalent to the problem of finding the best subset of a fixed size. Our study focuses on two aspects, consistency and computation. We prove that the sparse estimator from such a method can retain all of the important variables asymptotically for even exponentially growing dimensionality under regularity conditions. This indicates that the best subset regression method can efficiently shrink the full model down to a submodel of a size less than the sample size, which can be analyzed by well-developed regression techniques for such cases in a follow-up study. We provide an iterative algorithm, called orthogonalizing subset selection (OSS), to address computational issues in best subset regression. OSS is an EM algorithm, and thus possesses the monotonicity property. For any sparse…
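
As a rough illustration of the problem the abstract describes (least squares restricted to a subset of fixed size), the sketch below enumerates every size-k subset of predictors and keeps the one with the smallest residual sum of squares. This is plain exhaustive search, not the paper's OSS algorithm, whose details are not given on this page. The function name `best_subset`, the simulated data, and all parameter values are hypothetical choices made for the example.

```python
from itertools import combinations

import numpy as np


def best_subset(X, y, k):
    """Exhaustive best subset search (illustrative, not OSS).

    Fits ordinary least squares on every size-k subset of columns of X
    and returns the subset with the smallest residual sum of squares.
    """
    n, p = X.shape
    best_rss, best_idx, best_coef = np.inf, None, None
    for idx in combinations(range(p), k):
        cols = list(idx)
        # Least squares fit restricted to the candidate columns.
        coef, _, _, _ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        resid = y - X[:, cols] @ coef
        rss = float(resid @ resid)
        if rss < best_rss:
            best_rss, best_idx, best_coef = rss, idx, coef
    # Embed the fitted coefficients in a length-p vector with at most k nonzeros.
    beta = np.zeros(p)
    beta[list(best_idx)] = best_coef
    return best_idx, beta, best_rss


# Hypothetical simulated data: 3 of 8 predictors carry signal.
rng = np.random.default_rng(0)
n, p, k = 60, 8, 3
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[[0, 3, 5]] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(n)

support, beta_hat, rss = best_subset(X, y, k)
print("selected columns:", support)
print("estimated coefficients:", np.round(beta_hat, 3))
```

The loop visits p-choose-k subsets, so this approach is only practical for small p. That combinatorial cost is the kind of computational issue the abstract says an iterative method such as OSS is meant to address.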

Taxonomy

Topics: Statistical Methods and Inference · Control Systems and Identification · Sparse and Compressive Sensing Techniques