A Unified Framework for Tuning Hyperparameters in Clustering Problems
Xinjie Fan, Yuguang Yue, Purnamrita Sarkar, Y. X. Rachel Wang

TL;DR
This paper introduces a unified framework with provable guarantees for tuning hyperparameters in clustering problems. It applies to several models, including mixture models and network models, and is reported to outperform existing tuning methods in simulations and on real data.
Contribution
The paper presents a single framework with provable guarantees for hyperparameter tuning across multiple clustering models, including the subgaussian mixture model and network models, so that tuning problems usually handled separately are covered by one method.
Findings
The framework outperforms existing tuning methods in simulations and on real data.
It provides theoretical guarantees for hyperparameter selection.
It applies to both i.i.d. and non-i.i.d. data models.
Abstract
Selecting hyperparameters for unsupervised learning problems is challenging in general due to the lack of ground truth for validation. Despite the prevalence of this issue in statistics and machine learning, especially in clustering problems, there are not many methods for tuning these hyperparameters with theoretical guarantees. In this paper, we provide a framework with provable guarantees for selecting hyperparameters in a number of distinct models. We consider both the subgaussian mixture model and network models to serve as examples of i.i.d. and non-i.i.d. data. We demonstrate that the same framework can be used to choose the Lagrange multipliers of penalty terms in semi-definite programming (SDP) relaxations for community detection, and the bandwidth parameter for constructing kernel similarity matrices for spectral clustering. By incorporating a cross-validation procedure, we…
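To make the bandwidth-selection setting concrete, here is a minimal Python sketch of tuning the bandwidth of a Gaussian kernel similarity matrix for spectral clustering over a candidate grid. Everything in it is an illustrative assumption rather than the paper's algorithm: the function names (`gaussian_kernel`, `spectral_labels`, `trace_score`, `select_bandwidth`) are hypothetical, the number of clusters `k` is taken as known, and the scoring rule (the inner product between a fixed Gram matrix of the centered data and the normalized clustering matrix) is one plausible criterion chosen for this example, since the excerpt above does not state the criterion the authors use.

```python
import numpy as np

def gaussian_kernel(points, bandwidth):
    """Gaussian kernel similarity matrix exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq = np.sum(points ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * points @ points.T, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def kmeans(rows, k, rng, n_iter=50):
    """Plain Lloyd iterations; adequate for a small illustration."""
    centers = rows[rng.choice(len(rows), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((rows[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = rows[labels == j].mean(axis=0)
    return labels

def spectral_labels(kernel, k, rng):
    """Spectral clustering: top-k eigenvectors of the degree-normalized kernel."""
    deg = kernel.sum(axis=1)
    normalized = kernel / np.sqrt(np.outer(deg, deg))
    _, vecs = np.linalg.eigh(normalized)  # eigenvalues in ascending order
    top = vecs[:, -k:]
    norms = np.maximum(np.linalg.norm(top, axis=1, keepdims=True), 1e-12)
    return kmeans(top / norms, k, rng)

def trace_score(similarity, labels, k):
    """<S, X> where X = Z (Z^T Z)^{-1} Z^T is the normalized clustering matrix."""
    total = 0.0
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size:
            total += similarity[np.ix_(members, members)].sum() / members.size
    return total

def select_bandwidth(points, k, bandwidths, seed=0):
    """Return the candidate bandwidth (and its labels) with the highest score."""
    rng = np.random.default_rng(seed)
    centered = points - points.mean(axis=0)
    similarity = centered @ centered.T  # assumed fixed reference similarity
    best = None
    for bw in bandwidths:
        labels = spectral_labels(gaussian_kernel(points, bw), k, rng)
        score = trace_score(similarity, labels, k)
        if best is None or score > best[0]:
            best = (score, bw, labels)
    return best[1], best[2]
```

The reference similarity is held fixed across candidates so that scores for different bandwidths are comparable; scoring each clustering against its own kernel matrix would tend to favor large bandwidths regardless of cluster quality.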
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Complex Network Analysis Techniques
