High-dimensional regression and variable selection using CAR scores

Verena Zuber; Korbinian Strimmer

arXiv:1007.5516·stat.ME·July 20, 2011

TL;DR

The paper introduces the CAR score, a new variable ranking criterion for high-dimensional linear regression that improves variable selection and prediction accuracy, especially in genomic data analysis.

Contribution

It proposes the CAR score, a novel variable importance measure based on Mahalanobis-decorrelation of the explanatory variables, and shows in simulations that it compares favorably with modern regression techniques.

Findings

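01 In simulations, variable selection by CAR scores compares favorably with elastic net and boosting

02 CAR scores are effective at selecting relevant variables in genomic data

03 Prediction errors and true and false positive rates compare favorably with those of modern regression techniques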

Abstract

Variable selection is a difficult problem that is particularly challenging in the analysis of high-dimensional genomic data. Here, we introduce the CAR score, a novel and highly effective criterion for variable ranking in linear regression based on Mahalanobis-decorrelation of the explanatory variables. The CAR score provides a canonical ordering that encourages grouping of correlated predictors and down-weights antagonistic variables. It decomposes the proportion of variance explained and it is an intermediate between marginal correlation and the standardized regression coefficient. As a population quantity, any preferred inference scheme can be applied for its estimation. Using simulations we demonstrate that variable selection by CAR scores is very effective and yields prediction errors and true and false positive rates that compare favorably with modern regression techniques such as…
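To make the definition concrete, here is a minimal sketch in Python with NumPy. It assumes the standard form of the CAR score, omega = P^(-1/2) P_Xy, where P is the correlation matrix of the predictors and P_Xy is the vector of marginal correlations between each predictor and the response; this formula is not spelled out in the abstract above. The function name `car_scores` and the toy data are hypothetical, and the plain empirical correlations used here only work when there are more samples than variables and P is invertible. A high-dimensional setting would need a regularized estimate instead.

```python
import numpy as np

def car_scores(X, y):
    """Empirical CAR scores, assuming omega = P^(-1/2) @ P_Xy (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = X.shape[1]
    # Joint correlation matrix of predictors and response
    R = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    P = R[:d, :d]      # correlations among predictors
    P_xy = R[:d, d]    # marginal correlations with the response
    # Inverse matrix square root via eigendecomposition of the symmetric matrix P
    evals, evecs = np.linalg.eigh(P)
    P_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return P_inv_sqrt @ P_xy

# Toy data (hypothetical): two correlated predictors plus one irrelevant predictor
rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 3))
X = Z.copy()
X[:, 1] = 0.8 * Z[:, 0] + 0.6 * Z[:, 1]   # predictor 1 is correlated with predictor 0
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n)

omega = car_scores(X, y)
ranking = np.argsort(-omega ** 2)          # rank variables by squared CAR score
r_squared = float(np.sum(omega ** 2))      # squared scores sum to the variance explained
print(ranking, r_squared)
```

Under this definition the squared scores add up to the proportion of variance explained, which is the decomposition property the abstract mentions. If the predictors are uncorrelated, P is the identity and the CAR scores reduce to the marginal correlations.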
