Determining the Number of Clusters via Iterative Consensus Clustering
Shaina Race, Carl Meyer, Kevin Valakuzhy

TL;DR
This paper introduces an iterative consensus clustering method that uses a random walk and eigenvalue analysis on a consensus matrix to accurately determine the number of clusters, especially in noisy or high-dimensional data.
Contribution
The paper proposes a novel iterative approach to refine consensus matrices for spectral clustering, improving cluster number estimation in challenging data scenarios.
Findings
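The refined consensus matrix is generally superior to existing similarity matrices for this type of spectral analysis.
The eigenvalues of the random-walk transition probability matrix are used to determine the number of clusters.
For noisy or high-dimensional data, iterative refinement pushes the consensus matrix toward a block-diagonal form.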
Abstract
We use a cluster ensemble to determine the number of clusters, k, in a group of data. A consensus similarity matrix is formed from the ensemble using multiple algorithms and several values for k. A random walk is induced on the graph defined by the consensus matrix and the eigenvalues of the associated transition probability matrix are used to determine the number of clusters. For noisy or high-dimensional data, an iterative technique is presented to refine this consensus matrix in a way that encourages a block-diagonal form. It is shown that the resulting consensus matrix is generally superior to existing similarity matrices for this type of spectral analysis.
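The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several assumptions: a plain NumPy k-means stands in for the "multiple algorithms" of the ensemble, the consensus entry for a pair of points is simply the number of runs that place them in the same cluster, and k is read off with a largest-eigengap heuristic. The function names (`kmeans_labels`, `consensus_matrix`, `transition_eigenvalues`, `estimate_k`) are hypothetical, and the iterative refinement step is not shown because the abstract does not specify it.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means; a stand-in for any base clustering algorithm."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels

def consensus_matrix(X, ks, runs_per_k, rng):
    """Entry (i, j) counts how many ensemble runs co-cluster points i and j."""
    n = len(X)
    M = np.zeros((n, n))
    for k in ks:
        for _ in range(runs_per_k):
            labels = kmeans_labels(X, k, rng)
            M += (labels[:, None] == labels[None, :])
    return M

def transition_eigenvalues(M):
    """Eigenvalues of the random-walk matrix P = D^{-1} M, sorted descending."""
    d = M.sum(axis=1)
    # D^{-1/2} M D^{-1/2} is symmetric and shares its eigenvalues with P.
    S = M / np.sqrt(np.outer(d, d))
    return np.linalg.eigvalsh(S)[::-1]

def estimate_k(eigenvalues):
    """Assumed heuristic: count the eigenvalues above the largest gap."""
    gaps = eigenvalues[:-1] - eigenvalues[1:]
    return int(np.argmax(gaps)) + 1

# Toy data (an assumption for illustration): three well-separated 2-D blobs.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((20, 2)) for c in blob_centers])

M = consensus_matrix(X, ks=range(2, 7), runs_per_k=5, rng=rng)
eigs = transition_eigenvalues(M)
print("leading eigenvalues:", np.round(eigs[:6], 3))
print("estimated k:", estimate_k(eigs))
```

When the consensus matrix is close to block-diagonal, the transition matrix has one eigenvalue near 1 for each block, so counting the eigenvalues above the largest gap gives an estimate of k. The sketch uses the symmetric form of the matrix so that `np.linalg.eigvalsh` returns real eigenvalues directly.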
