High-dimensional variable selection via tilting

Haeran Cho; Piotr Fryzlewicz

arXiv:1611.08640 · stat.ME · November 29, 2016

TL;DR

This paper introduces a tilting method for variable selection in high-dimensional linear regression. The method measures each variable's contribution to the response while accounting for high, possibly spurious, correlations among the covariates, and the paper examines both its theoretical properties and its practical performance.

Contribution

It proposes an adaptive tilting procedure that chooses, for each variable, between marginal correlation and tilted correlation, based on the hard-thresholded sample correlation matrix of the design.

Findings

01 The tilting method effectively discriminates relevant variables.

02 The iterative screening algorithm shows strong practical performance.

03 Theoretical conditions ensure successful variable selection.

Abstract

The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly spurious) high correlations between the variables, which result in marginal correlation being unreliable as a measure of association between the variables and the response. We propose a new way of measuring the contribution of each variable to the response which takes into account high correlations between the variables in a data-driven way. The proposed tilting procedure provides an adaptive choice between the use of marginal correlation and tilted correlation for each variable, where the choice is made depending on the values of the hard thresholded sample correlation of the design matrix. We study the conditions under which this measure can successfully…
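The adaptive choice described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "tilting" a variable means projecting it and the response onto the orthogonal complement of the covariates that survive hard thresholding, and it assumes one simple rescaling (division by the squared norm of the projected covariate). The function name `tilted_scores` and the user-supplied `threshold` parameter are hypothetical; the abstract says only that the choice is made in a data-driven way.

```python
import numpy as np

def tilted_scores(X, y, threshold):
    """Return (scores, used_tilting) for each column of X.

    Sketch only: the rescaling and the fixed threshold are assumptions,
    not details taken from the paper.
    """
    n, p = X.shape
    # Center and scale to unit norm so inner products are sample correlations.
    Xc = X - X.mean(axis=0)
    Xc = Xc / np.linalg.norm(Xc, axis=0)
    yc = y - y.mean()
    yc = yc / np.linalg.norm(yc)

    C = Xc.T @ Xc  # sample correlation matrix of the design
    scores = np.empty(p)
    used_tilting = np.zeros(p, dtype=bool)

    for j in range(p):
        # Hard thresholding: keep only covariates highly correlated with X_j.
        mask = np.abs(C[j]) > threshold
        mask[j] = False
        if not mask.any():
            # No strong correlates: fall back to marginal correlation.
            scores[j] = Xc[:, j] @ yc
            continue
        # Project X_j and y onto the orthogonal complement of the correlates.
        Z = Xc[:, mask]
        B = np.column_stack([Xc[:, j], yc])
        R = B - Z @ np.linalg.lstsq(Z, B, rcond=None)[0]
        rx, ry = R[:, 0], R[:, 1]
        denom = rx @ rx
        # Assumed rescaling: divide by the squared norm of the projected X_j.
        scores[j] = (rx @ ry) / denom if denom > 1e-12 else 0.0
        used_tilting[j] = True

    return scores, used_tilting
```

In this sketch, a covariate that is correlated with the response only through a highly correlated neighbour receives a score near zero after tilting, even when its marginal correlation is large.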
