The Loss Rank Criterion for Variable Selection in Linear Regression Analysis

Minh-Ngoc Tran

arXiv:1011.1373·stat.ME·February 26, 2014

TL;DR

This paper proposes the loss rank criterion for choosing the best variable subset among candidates preselected by a regularization procedure such as the Lasso. Consistency is proven for a fixed number of covariates d, and simulations suggest it also holds when d far exceeds the sample size. The method is also applied to a real data set.

Contribution

A model selection criterion that picks the best subset from a set preselected by a regularization algorithm, giving a fast variable selection procedure, especially in high-dimensional settings.

Findings

01

Model selection consistency is proven when the number of covariates d is fixed.

02

Simulations suggest the criterion remains consistent when d is much larger than the sample size.

03

Performs well compared to existing methods in simulations.

Abstract

Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model selection criterion is proposed to select the best one among this preselected set. The approach leads to a fast and efficient procedure for variable selection, especially in high-dimensional settings. Model selection consistency of the suggested criterion is proven when the number of covariates d is fixed. Simulation studies suggest that the criterion still enjoys model selection consistency when d is much larger than the sample size. The simulations also show that our approach for variable selection works surprisingly well in comparison with existing competitors. The method is also applied to a real data set.
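The two-stage idea in the abstract (a regularization algorithm proposes candidate subsets, then a criterion picks one) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the loss rank formula, so BIC is used here as a stand-in criterion, and the candidate list is hand-written to mimic what a Lasso path might return. The names `solve`, `rss`, `bic`, and `select_subset` are hypothetical helpers, not taken from the paper.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit on the given columns."""
    if not subset:
        return sum(v * v for v in y)
    cols = [[row[j] for row in X] for j in subset]
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve(gram, rhs)
    resid = [yi - sum(bk * row[j] for bk, j in zip(beta, subset))
             for row, yi in zip(X, y)]
    return sum(r * r for r in resid)


def bic(X, y, subset):
    """Stand-in criterion (BIC), NOT the paper's loss rank criterion."""
    n = len(y)
    return n * math.log(rss(X, y, subset) / n) + len(subset) * math.log(n)


def select_subset(X, y, candidates):
    """Return the candidate subset with the smallest criterion value."""
    return min(candidates, key=lambda s: bic(X, y, s))


# Synthetic data (assumption): only columns 0 and 2 carry signal.
random.seed(0)
n, d = 200, 6
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [2.0 * row[0] - 1.5 * row[2] + random.gauss(0, 0.5) for row in X]

# Hypothetical nested candidates, as a regularization path might produce.
candidates = [(0,), (0, 2), (0, 2, 4), (0, 1, 2, 4), (0, 1, 2, 3, 4, 5)]
best = select_subset(X, y, candidates)
print("selected subset:", best)
```

Because the criterion is only evaluated on the preselected candidates rather than on all 2^d subsets, the cost grows with the length of the candidate list, which is what makes this kind of procedure practical when d is large.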
