Variable Importance Assessments and Backward Variable Selection for High-Dimensional Data
Liuhua Peng, Long Qu, Dan Nettleton

TL;DR
This paper introduces a novel distance-based variable importance measure and a backward selection algorithm for high-dimensional data, improving variable selection accuracy in genomic analysis and similar fields.
Contribution
It proposes a new variable importance assessment inspired by MRPP and a backward selection method tailored for high-dimensional variable selection.
Findings
Effective in identifying important variables in high-dimensional data
Outperforms existing methods in simulations and real data
Demonstrates good properties and advantages over other approaches
Abstract
Variable selection in high-dimensional scenarios is of great interest in statistics. One application involves identifying differentially expressed genes in genomic analysis. Existing methods for addressing this problem have limitations. In this paper, we propose distance-based variable importance measures, inspired by the Multi-Response Permutation Procedure (MRPP), to address these limitations. The proposed variable importance assessments can effectively measure the importance of an individual dimension by quantifying its influence on the differences between multivariate distributions. We also develop a backward selection algorithm that can be used to discover important variables in high-dimensional variable selection. Both simulations and real data applications demonstrate that the proposed method has good properties and advantages over other methods.
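To make the idea in the abstract concrete, the following is a minimal Python sketch of an MRPP-style importance score combined with backward elimination. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not give the exact importance formula, so this sketch assumes Euclidean distance, scales the within-group distance by the overall pairwise distance, and scores a variable by how much that ratio rises when the variable is dropped. All function names (separation_ratio, drop_one_importance, backward_select) and the toy data are hypothetical.

```python
import numpy as np


def mean_pairwise_distance(X):
    """Average Euclidean distance over all pairs of rows of X."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    upper = np.triu_indices(len(X), k=1)  # each unordered pair once
    return dists[upper].mean()


def separation_ratio(X, labels):
    """MRPP-style weighted within-group distance, scaled by the overall distance.

    Values well below 1 suggest the groups differ; values near 1 suggest not.
    The scaling is an assumption of this sketch, not taken from the paper.
    """
    n = len(labels)
    within = 0.0
    for g in np.unique(labels):
        Xg = X[labels == g]
        if len(Xg) >= 2:  # a single observation has no pairs
            within += (len(Xg) / n) * mean_pairwise_distance(Xg)
    return within / mean_pairwise_distance(X)


def drop_one_importance(X, labels, active):
    """Score each active variable by the rise in the ratio when it is dropped."""
    base = separation_ratio(X[:, active], labels)
    scores = {}
    for j in active:
        rest = [k for k in active if k != j]
        scores[j] = separation_ratio(X[:, rest], labels) - base
    return scores


def backward_select(X, labels, n_keep):
    """Repeatedly remove the least important variable until n_keep remain."""
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        scores = drop_one_importance(X, labels, active)
        active.remove(min(scores, key=scores.get))
    return active


# Toy data (hypothetical): two groups of 20, ten variables,
# with only variables 0 and 1 shifted in the second group.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 10))
X[labels == 1, :2] += 3.0
selected = backward_select(X, labels, n_keep=2)
```

In this sketch the stopping rule is a fixed number of variables to keep, which is a simplification chosen for brevity; a permutation-based criterion, in the spirit of MRPP, would be a natural alternative.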
Taxonomy
Topics: Gene expression and cancer classification · Statistical Methods and Inference · Bayesian Methods and Mixture Models
