Searching for the core variables in principal components analysis

Yanina Gimenez; Guido Giussani

arXiv:1308.6626·math.ST·January 31, 2017

TL;DR

This paper presents a new variable selection procedure for principal components analysis that identifies a small, informative subset of variables to improve interpretability of underlying data structures.

Contribution

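The paper introduces a variable selection method for PCA, intended to be applied after a satisfactory initial PCA on all variables. It improves interpretability by identifying a small set of core variables that explain the principal components through nonparametric relationships.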

Findings

01

The procedure identifies a small subset of the original variables that best explains the principal components

02

The asymptotic behavior of the method is analyzed

03

Examples illustrate how the selected variables help interpret the underlying structure

Abstract

In this article, we introduce a procedure for selecting variables in principal components analysis. The procedure was developed to identify a small subset of the original variables that best explain the principal components through nonparametric relationships. There are usually some noisy uninformative variables in a dataset, and some variables that are strongly related to each other because of their general interdependence. The procedure is designed to be used following the satisfactory initial use of a principal components analysis with all variables, and its aim is to help to interpret underlying structures. We analyze the asymptotic behavior of the method and provide some examples.
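
To make the idea in the abstract concrete, here is a minimal sketch of one way a subset of variables could be scored by how well it explains a principal component through a nonparametric fit. It is an illustration under stated assumptions, not the authors' procedure. The abstract does not specify the estimator or the search strategy, so the k-nearest-neighbour regression, the leave-one-out error criterion, the exhaustive search, the restriction to the first component, and all function names are hypothetical choices made for this example.

```python
import itertools
import numpy as np


def first_pc_scores(X):
    """Scores of the first principal component of the centered data."""
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]


def knn_loo_mse(Z, y, k=5):
    """Leave-one-out MSE of a k-nearest-neighbour regression of y on Z.

    Assumption: k-NN stands in for "a nonparametric relationship";
    the paper may use a different estimator.
    """
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point may not be its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]
    return float(np.mean((y - y[idx].mean(axis=1)) ** 2))


def select_core_variables(X, size, k=5):
    """Exhaustive search for the `size` columns that best predict PC1.

    Assumption: exhaustive search is only practical for few variables.
    """
    y = first_pc_scores(X)
    subsets = itertools.combinations(range(X.shape[1]), size)
    return min(subsets, key=lambda s: knn_loo_mse(X[:, list(s)], y, k))


# Synthetic data: two strongly related informative columns, two noise columns.
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
X = np.column_stack([
    signal + 0.05 * rng.normal(size=n),  # informative
    signal + 0.05 * rng.normal(size=n),  # near-duplicate of column 0
    0.1 * rng.normal(size=n),            # low-variance noise
    0.1 * rng.normal(size=n),            # low-variance noise
])
core = select_core_variables(X, size=1)
print("selected columns:", core)
```

In this toy setting a single informative column is enough to recover the first component, which mirrors the situation the abstract describes: noisy variables contribute nothing, and strongly interdependent variables are redundant with one another.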
