On the clustering of correlated random variables
Zenon Gniazdowski, Dawid Kaliszewski

TL;DR
This paper investigates clustering correlated random variables with k-means and spectral algorithms, comparing dissimilarity measures and the effect of initial point selection across four datasets.
Contribution
It introduces methods for clustering correlated variables based on correlation and determination coefficients, comparing spectral and k-means approaches.
Findings
Spectral methods using correlation and determination matrices are effective.
Initial point diversity impacts k-means clustering efficiency.
Different dissimilarity measures influence clustering outcomes.
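To make the point about initial points concrete, here is a minimal sketch of k-means applied to variables rather than observations. Everything specific in it is an assumption for illustration and is not taken from the paper: each variable is represented by its row of squared correlations, the initial centers are chosen by a farthest-first rule (the hypothetical helper `farthest_first_init`), and the data are synthetic.

```python
import numpy as np

def farthest_first_init(points, k):
    """Pick k mutually distant rows of `points` as initial centers (assumed rule)."""
    chosen = [0]  # arbitrary starting row
    while len(chosen) < k:
        # Distance from every point to its nearest already-chosen center.
        diff = points[:, None, :] - points[chosen][None, :, :]
        nearest = np.min(np.linalg.norm(diff, axis=2), axis=1)
        chosen.append(int(np.argmax(nearest)))
    return points[chosen].copy()

def kmeans(points, centers, n_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic data: two groups of three variables, each driven by one latent factor.
rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)
data = np.column_stack([f1, f1, -f1, f2, f2, -f2]) + 0.3 * rng.normal(size=(n, 6))

# Represent each variable by its squared correlations with all variables.
features = np.corrcoef(data, rowvar=False) ** 2
km_labels, _ = kmeans(features, farthest_first_init(features, k=2))
```

Replacing `farthest_first_init` with two nearly identical starting rows is a simple way to observe how less diverse initial points change the number of iterations or the final partition.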
Abstract
This work examines the possibility of clustering correlated random variables, both by their mutual similarity and by their similarity to the principal components. The k-means algorithm and spectral algorithms were used for clustering. For the spectral methods, two similarity matrices were used: the matrix of a relation established at a given level of correlation, and the matrix of coefficients of determination. For four different datasets, different ways of measuring the dissimilarity of variables were analyzed, along with the impact of the diversity of initial points on the efficiency of the k-means algorithm.
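The spectral approach with a matrix of coefficients of determination can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' implementation: it uses an unnormalized graph Laplacian, splits the variables into only two groups by the sign of the Fiedler vector, and runs on synthetic data. The function names `determination_matrix` and `spectral_bipartition` are hypothetical.

```python
import numpy as np

def determination_matrix(data):
    """Squared Pearson correlations between the columns of `data`."""
    r = np.corrcoef(data, rowvar=False)
    return r ** 2

def spectral_bipartition(similarity):
    """Split variables into two groups by the sign of the Fiedler vector."""
    w = similarity.copy()
    np.fill_diagonal(w, 0.0)              # ignore self-similarity
    laplacian = np.diag(w.sum(axis=1)) - w
    # eigh returns eigenvalues in ascending order for symmetric matrices;
    # the eigenvector of the second-smallest eigenvalue is the Fiedler vector.
    _, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs[:, 1] > 0).astype(int)

# Synthetic data: two groups of three variables, each driven by one latent factor.
rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)
data = np.column_stack([f1, f1, -f1, f2, f2, -f2]) + 0.3 * rng.normal(size=(n, 6))

spectral_labels = spectral_bipartition(determination_matrix(data))
```

Because the coefficient of determination is the squared correlation, negatively correlated variables (such as the third column here) count as similar to their group. The label values themselves are arbitrary, since the sign of an eigenvector is not fixed; only the grouping is meaningful.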
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Rough Sets and Fuzzy Logic
