Understanding the Generalization Performance of Spectral Clustering Algorithms
Shaojie Li, Sheng Ouyang, Yong Liu

TL;DR
This paper analyzes the generalization performance of spectral clustering algorithms, providing excess risk bounds, and introduces new algorithms that improve out-of-sample clustering without re-eigendecomposition.
Contribution
It offers the first theoretical excess risk bounds for spectral clustering and proposes algorithms that enhance out-of-sample clustering efficiency.
Findings
Excess risk bounds between the empirical and population-level continuous solutions converge at a rate of O(1/√n), where n is the sample size.
Algorithms can be designed to reduce the key quantity influencing excess risk.
Proposed algorithms effectively cluster out-of-sample data without re-eigendecomposition.
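To make the out-of-sample idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes a Gaussian affinity, the standard unnormalized Laplacian L = D - W for relaxed RatioCut, and a generic extension rule obtained by rearranging the eigen-equation (D - W)u = λu into u_i = Σ_j W_ij u_j / (d_i - λ). The names rbf_affinity, fit_relaxed_ratiocut, and extend_embedding, the gamma value, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def rbf_affinity(A, B, gamma):
    """Gaussian similarity exp(-gamma * ||a - b||^2) between rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def fit_relaxed_ratiocut(X, k, gamma):
    """Return the k smallest eigenpairs of the unnormalized Laplacian L = D - W."""
    W = rbf_affinity(X, X, gamma)
    np.fill_diagonal(W, 0.0)                # no self-loops
    L = np.diag(W.sum(axis=1)) - W
    eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
    return eigvals[:k], eigvecs[:, :k]

def extend_embedding(X_new, X_train, eigvals, U, gamma):
    """Embed new points from the stored eigenpairs, with no new eigendecomposition.

    Assumed rule (illustrative, not taken from the paper):
        u(x) = sum_j w(x, x_j) u_j / (d(x) - lambda),  d(x) = sum_j w(x, x_j).
    It needs d(x) > lambda, which holds here because the kept eigenvalues are small.
    """
    w = rbf_affinity(X_new, X_train, gamma)        # shape (m, n)
    degree = w.sum(axis=1, keepdims=True)          # shape (m, 1)
    return (w @ U) / (degree - eigvals[None, :])   # shape (m, k)

# Toy data: two Gaussian blobs, 20 training points each.
rng = np.random.default_rng(0)
X_train = np.vstack([
    rng.normal([0.0, 0.0], 0.3, size=(20, 2)),
    rng.normal([4.0, 0.0], 0.3, size=(20, 2)),
])

eigvals, U = fit_relaxed_ratiocut(X_train, k=2, gamma=0.5)
# With two clusters, the sign of the second eigenvector splits the data.
labels_train = (U[:, 1] > 0).astype(int)

# Out-of-sample points, one near each blob.
X_new = np.array([[0.1, -0.1], [3.9, 0.2]])
U_new = extend_embedding(X_new, X_train, eigvals, U, gamma=0.5)
labels_new = (U_new[:, 1] > 0).astype(int)
```

Each new point costs one pass over the training set, and the eigendecomposition is computed only once. The sign-based labeling works only for two clusters; for more clusters, a discretization step such as k-means on the embedding rows would be needed.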
Abstract
The theoretical analysis of spectral clustering mainly focuses on consistency, while there is relatively little research on its generalization performance. In this paper, we study the excess risk bounds of the popular spectral clustering algorithms: relaxed RatioCut and relaxed NCut. Firstly, we show that their excess risk bounds between the empirical continuous optimal solution and the population-level continuous optimal solution have an O(1/√n) convergence rate, where n is the sample size. Secondly, we show the fundamental quantity influencing the excess risk between the empirical discrete optimal solution and the population-level discrete optimal solution. At the empirical level, algorithms can be designed to reduce this quantity. Based on our theoretical analysis, we propose two novel algorithms that can not only penalize this quantity, but also…
Taxonomy
Topics: Face and Expression Recognition · Advanced Computing and Algorithms · Advanced Clustering Algorithms Research
Methods: Spectral Clustering
