An Approach to Variable Clustering: K-means in Transposed Data and its Relationship with Principal Component Analysis
Victor Saquicela, Kenneth Palacio-Baus, Mario Chifla

TL;DR
This paper applies PCA to the original data and K-means to the transposed data, then relates the resulting clusters of variables to the principal components through the variable loadings.
Contribution
It proposes a new approach to relate variable clusters from K-means on transposed data with PCA components, filling a gap in multivariate analysis methods.
Findings
Provides a quantitative measure of variable cluster contributions to principal components
Enables exploration of variable clustering effects on data variation
Bridges the gap between variable clustering and PCA interpretation
Abstract
Principal Component Analysis (PCA) and K-means constitute fundamental techniques in multivariate analysis. Although they are frequently applied independently or sequentially to cluster observations, the relationship between them, especially when K-means is used to cluster variables rather than observations, has been scarcely explored. This study seeks to address this gap by proposing an innovative method that analyzes the relationship between clusters of variables obtained by applying K-means on transposed data and the principal components of PCA. Our approach involves applying PCA to the original data and K-means to the transposed data set, where the original variables are converted into observations. The contribution of each variable cluster to each principal component is then quantified using measures based on variable loadings. This process provides a tool to explore and understand…
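The pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract is truncated and does not give the exact contribution measure, so the sum of squared loadings per cluster is assumed here. Loadings are taken to be the unit-norm eigenvector coefficients, the data are standardized before both steps, and a plain NumPy Lloyd iteration stands in for a library K-means. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def pca_loadings(X, n_components):
    """Standardize X and return (Z, loadings), loadings shaped variables x components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes; each has unit Euclidean norm.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z, Vt[:n_components].T

def kmeans(points, k, n_iter=100, seed=0):
    """Basic Lloyd's algorithm; returns one cluster label per row of points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def cluster_contributions(loadings, labels, k):
    """Assumed measure: per-cluster sum of squared loadings for each component."""
    sq = loadings ** 2
    return np.vstack([sq[labels == j].sum(axis=0) for j in range(k)])

# Synthetic data (hypothetical): three variables driven by one latent factor,
# two variables driven by another.
rng = np.random.default_rng(42)
n = 200
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack([
    f1 + 0.1 * rng.normal(size=n),
    f1 + 0.1 * rng.normal(size=n),
    f1 + 0.1 * rng.normal(size=n),
    f2 + 0.1 * rng.normal(size=n),
    f2 + 0.1 * rng.normal(size=n),
])

Z, loadings = pca_loadings(X, n_components=2)   # PCA on the original data
labels = kmeans(Z.T, k=2)                       # K-means on the transposed data
contrib = cluster_contributions(loadings, labels, k=2)
print(contrib)  # rows: variable clusters, columns: principal components
```

Because each loading vector has unit norm and every variable belongs to exactly one cluster, each column of `contrib` sums to 1, so the entries can be read as the share of a component attributable to each variable cluster under this assumed measure.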
Taxonomy
Topics: Statistical Methods and Applications · Advanced Clustering Algorithms Research · Sensory Analysis and Statistical Methods
