On the clustering of correlated random variables

Zenon Gniazdowski; Dawid Kaliszewski

arXiv:1909.03332·cs.LG·September 10, 2019·1 cites

Open Access

TL;DR

This paper investigates the clustering of correlated random variables with k-means and spectral algorithms, analyzing how different dissimilarity measures and the diversity of initial points affect the results on four data sets.

Contribution

It introduces methods for clustering correlated variables based on correlation and determination coefficients, comparing spectral and k-means approaches.

Findings

1. Spectral methods using correlation and determination matrices are effective.

2. Initial point diversity impacts k-means clustering efficiency.

3. Different dissimilarity measures influence clustering outcomes.
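
To make the notion of a dissimilarity between variables concrete, here is a minimal sketch in Python with NumPy. The two measures shown, 1 - |r| and 1 - r^2, are common choices derived from the correlation coefficient r and are assumptions for illustration; the page does not say which measures the paper actually compares. The function name `correlation_dissimilarities` and the synthetic data are hypothetical.

```python
import numpy as np

def correlation_dissimilarities(X):
    """Return two dissimilarity matrices between the columns (variables) of X."""
    R = np.corrcoef(X, rowvar=False)   # Pearson correlation between variables
    d_abs = 1.0 - np.abs(R)            # assumed measure: 1 - |r|
    d_det = 1.0 - R ** 2               # assumed measure: 1 - r^2 (uses the coefficient of determination)
    return d_abs, d_det

# Synthetic data (illustrative only): two strongly anti-correlated variables
# and one independent variable.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
X = np.column_stack([
    z + 0.1 * rng.normal(size=500),
    -z + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])
d_abs, d_det = correlation_dissimilarities(X)
print(np.round(d_abs, 3))
print(np.round(d_det, 3))
```

Both measures ignore the sign of the correlation, so the first two variables come out as close to each other even though they are negatively correlated. Since |r| <= 1, the second measure is never smaller than the first, which changes how strongly weak correlations are penalized.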

Abstract

This work examines the clustering of correlated random variables, both by their mutual similarity and by their similarity to the principal components. The k-means algorithm and spectral algorithms were used for clustering. For the spectral methods, two similarity matrices were used: the matrix of a relation established on the level of correlation, and the matrix of coefficients of determination. For four different data sets, different ways of measuring the dissimilarity of variables were analyzed, as was the impact of the diversity of initial points on the efficiency of the k-means algorithm.
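
The spectral approach described in the abstract can be sketched as follows, using the matrix of coefficients of determination (squared correlations) as the similarity matrix. This is an illustration under stated assumptions, not the authors' implementation: the normalized-affinity formulation, the farthest-first choice of initial k-means centers, the helper names `kmeans` and `spectral_cluster_variables`, and the synthetic data are all hypothetical choices made here.

```python
import numpy as np

def kmeans(points, k, n_iter=100):
    """Plain k-means with farthest-first initial centers (assumed init scheme)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def spectral_cluster_variables(X, k):
    """Cluster the columns of X using squared correlations as similarity."""
    R = np.corrcoef(X, rowvar=False)
    S = R ** 2                                    # coefficients of determination
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    A = S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # D^{-1/2} S D^{-1/2}
    _, eigvecs = np.linalg.eigh(A)                # eigenvalues in ascending order
    U = eigvecs[:, -k:]                           # eigenvectors of the k largest eigenvalues
    U = U / np.linalg.norm(U, axis=1, keepdims=True)    # row-normalize the embedding
    return kmeans(U, k)

# Synthetic data (illustrative only): six variables driven by two latent factors.
rng = np.random.default_rng(0)
n = 400
z1, z2 = rng.normal(size=n), rng.normal(size=n)
noise = lambda: 0.3 * rng.normal(size=n)
X = np.column_stack([z1 + noise(), -z1 + noise(), z1 + noise(),
                     z2 + noise(), z2 + noise(), -z2 + noise()])
labels = spectral_cluster_variables(X, k=2)
print(labels)
```

Squaring the correlations makes the similarity insensitive to sign, so a variable and its negated counterpart land in the same cluster. Starting k-means from well-separated centers is one simple way to obtain diverse initial points; other initialization schemes would fit the same skeleton.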

Taxonomy

Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Rough Sets and Fuzzy Logic