Searching for the core variables in principal components analysis
Yanina Gimenez, Guido Giussani

TL;DR
This paper presents a new variable selection procedure for principal components analysis that identifies a small, informative subset of variables to improve interpretability of underlying data structures.
Contribution
The paper introduces a nonparametric variable selection method for PCA. Applied after an initial PCA on all variables, it improves interpretability by reducing the analysis to a small set of core variables.
Findings
The method identifies a small subset of the original variables that best explains the principal components
The asymptotic behavior of the method is analyzed
Examples illustrate how the selected variables help interpret the underlying structure
Abstract
In this article, we introduce a procedure for selecting variables in principal components analysis. The procedure was developed to identify a small subset of the original variables that best explain the principal components through nonparametric relationships. There are usually some noisy uninformative variables in a dataset, and some variables that are strongly related to each other because of their general interdependence. The procedure is designed to be used following the satisfactory initial use of a principal components analysis with all variables, and its aim is to help to interpret underlying structures. We analyze the asymptotic behavior of the method and provide some examples.
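To make the idea in the abstract concrete, here is a minimal sketch of one way to look for variables that explain the principal components through a nonparametric relationship. It is not the authors' procedure: the greedy forward search, the leave-one-out k-nearest-neighbor regression, the R^2 criterion, and all function names and parameters (select_core_variables, knn_loo_r2, n_select, k) are assumptions chosen only for illustration.

```python
import numpy as np

def pc_scores(X, n_components):
    """Scores of the first n_components principal components of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def knn_loo_r2(Z, Y, k):
    """R^2 of a leave-one-out k-NN regression of Y on Z (illustrative criterion)."""
    # Pairwise squared Euclidean distances between rows of Z.
    d = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)          # a point is never its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]   # indices of the k nearest neighbors
    pred = Y[idx].mean(axis=1)           # average the neighbors' PC scores
    ss_res = ((Y - pred) ** 2).sum()
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def select_core_variables(X, n_components=1, n_select=2, k=5):
    """Greedy forward search (an assumption, not the paper's algorithm)."""
    # Standardize columns; assumes no column of X is constant.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = pc_scores(Xs, n_components)      # PCA on all variables first
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_select):
        # Add the variable whose inclusion best predicts the PC scores.
        best = max(remaining,
                   key=lambda j: knn_loo_r2(Xs[:, selected + [j]], Y, k))
        selected.append(best)
        remaining.remove(best)
    return selected

if __name__ == "__main__":
    # Synthetic data: three related variables plus three pure-noise variables.
    rng = np.random.default_rng(0)
    t = rng.normal(size=200)
    X = np.column_stack(
        [t + 0.1 * rng.normal(size=200) for _ in range(3)]
        + [rng.normal(size=200) for _ in range(3)]
    )
    print(select_core_variables(X, n_components=1, n_select=1))
```

In the synthetic example, the first three columns share a common latent factor and the last three are noise, so a sensible criterion should pick one of the first three as the single variable that best reproduces the first component. The choice of k and of the stopping rule (here a fixed n_select) is left open in this sketch.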
