Generalization Performance of Ensemble Clustering: From Theory to Algorithm
Xu Zhang, Haoye Qiu, Weixuan Liang, Hui Liu, Junhui Hou, Yuheng Jia

TL;DR
This paper provides a theoretical analysis of ensemble clustering's generalization performance, deriving bounds and conditions for consistency, and introduces a new algorithm that improves clustering accuracy based on these insights.
Contribution
It establishes theoretical bounds for ensemble clustering's generalization error, proves conditions for consistency, and develops a novel algorithm that enhances performance by balancing bias and diversity.
Findings
Derived convergence and excess risk bounds of order $\mathcal{O}(\frac{\log n}{m})$
Proved ensemble clustering is consistent when $m,n \to \infty$ and $m \gg \log n$
Achieved average improvements of 6.1%, 7.3%, and 6.0% in NMI, ARI, and Purity, respectively, across 10 datasets
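For readers unfamiliar with the three scores named in the findings, the following is a minimal, standard-library sketch of how NMI, ARI, and Purity are commonly computed from ground-truth labels and predicted cluster labels. It is an illustration, not the paper's evaluation code: the function names are hypothetical, and the NMI shown uses arithmetic-mean normalization, which is an assumption since the summary does not say which variant the authors used.

```python
from collections import Counter
from math import comb, log


def contingency(true_labels, pred_labels):
    """Count samples in each (true class, predicted cluster) pair."""
    return Counter(zip(true_labels, pred_labels))


def purity(true_labels, pred_labels):
    """Fraction of samples that belong to the majority class of their cluster."""
    best = {}
    for (_, p), c in contingency(true_labels, pred_labels).items():
        best[p] = max(best.get(p, 0), c)
    return sum(best.values()) / len(true_labels)


def adjusted_rand_index(true_labels, pred_labels):
    """Rand index corrected for chance; assumes at least two samples."""
    n = len(true_labels)
    joint = sum(comb(c, 2) for c in contingency(true_labels, pred_labels).values())
    pairs_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = pairs_true * pairs_pred / comb(n, 2)
    max_index = (pairs_true + pairs_pred) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (joint - expected) / (max_index - expected)


def entropy(labels):
    """Shannon entropy (natural log) of a label vector."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def nmi(true_labels, pred_labels):
    """Mutual information divided by the arithmetic mean of the two entropies
    (assumed normalization; other variants exist)."""
    n = len(true_labels)
    ct, cp = Counter(true_labels), Counter(pred_labels)
    mi = sum(
        c / n * log(c * n / (ct[t] * cp[p]))
        for (t, p), c in contingency(true_labels, pred_labels).items()
    )
    denom = (entropy(true_labels) + entropy(pred_labels)) / 2
    return mi / denom if denom > 0 else 1.0


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    pred = [1, 1, 0, 0, 0, 0]  # cluster ids are arbitrary; only the grouping matters
    print(purity(truth, pred), adjusted_rand_index(truth, pred), nmi(truth, pred))
```

All three scores depend only on how samples are grouped, not on the cluster ids, so a prediction that matches the ground truth up to relabeling gets the maximum score.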
Abstract
Ensemble clustering has demonstrated great success in practice; however, its theoretical foundations remain underexplored. This paper examines the generalization performance of ensemble clustering, focusing on generalization error, excess risk, and consistency. We derive a convergence rate for both the generalization error bound and the excess risk bound, with $n$ and $m$ being the numbers of samples and base clusterings. Based on this, we prove that when $m$ and $n$ approach infinity and $m$ is significantly larger than $\log n$, i.e., $m, n \to \infty$ and $m \gg \log n$, ensemble clustering is consistent. Furthermore, recognizing that $m$ and $n$ are finite in practice, the generalization error cannot be reduced to zero. Thus, by assigning varying weights to finite clusterings, we minimize the error between the empirical average clusterings…
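To make the idea of "assigning varying weights to finite clusterings" concrete, here is a minimal sketch of a weighted average of base clusterings, represented as co-association matrices (entry (i, j) is 1 when samples i and j share a cluster). This is a generic construction under stated assumptions, not the paper's algorithm: the abstract is truncated before it explains how the weights are chosen, so the weights below are arbitrary illustrative values and the function name is hypothetical.

```python
def weighted_co_association(base_clusterings, weights=None):
    """Weighted average of the co-association matrices of the base clusterings.

    base_clusterings: list of m label vectors, each of length n.
    weights: hypothetical per-clustering weights; None means uniform weights,
             which gives the plain empirical average.
    """
    m = len(base_clusterings)
    n = len(base_clusterings[0])
    if weights is None:
        weights = [1.0 / m] * m
    total = sum(weights)  # normalize so the weights sum to one
    avg = [[0.0] * n for _ in range(n)]
    for labels, w in zip(base_clusterings, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    avg[i][j] += w / total
    return avg


# Three base clusterings of four samples; the second one disagrees with the others.
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
uniform = weighted_co_association(base)
# Illustrative weights that down-weight the disagreeing clustering (an assumption).
reweighted = weighted_co_association(base, weights=[0.45, 0.10, 0.45])
```

With uniform weights, samples 1 and 2 (zero-based) are co-clustered with strength 1/3 because one of three base clusterings groups them; down-weighting that clustering lowers the value to 0.10. A consensus partition would then typically be obtained by running a clustering method on the averaged matrix.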
Taxonomy
Topics: Advanced Clustering Algorithms Research · Neural Networks and Applications
