Bagged $k$-Distance for Mode-Based Clustering Using the Probability of Localized Level Sets
Hanyuan Hang

TL;DR
This paper introduces BDMBC, an ensemble clustering algorithm built on a new measurement, the probability of localized level sets (PLLS), which identifies clusters of varying densities with a single global threshold. It comes with optimal convergence rates for mode estimation and supporting numerical experiments.
Contribution
The paper presents a novel ensemble clustering method based on PLLS, with theoretical guarantees and promising empirical accuracy and efficiency for mode and level set estimation.
Findings
Achieves optimal convergence rates for mode estimation.
Effectively finds localized level sets for varying densities.
Demonstrates promising accuracy and efficiency in experiments.
Abstract
In this paper, we propose an ensemble learning algorithm named \textit{bagged $k$-distance for mode-based clustering} (\textit{BDMBC}) by putting forward a new measurement called the \textit{probability of localized level sets} (\textit{PLLS}), which enables us to find all clusters for varying densities with a global threshold. On the theoretical side, we show that with a properly chosen number of nearest neighbors $k_D$ in the bagged $k$-distance, the sub-sample size $s$, the bagging rounds $B$, and the number of nearest neighbors $k_L$ for the localized level sets, BDMBC can achieve optimal convergence rates for mode estimation. It turns out that with a relatively small $B$, the sub-sample size $s$ can be much smaller than the number of training data $n$ at each bagging round, and the number of nearest neighbors $k_D$ can be reduced simultaneously. Moreover, we establish optimal…
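The bagging step named in the abstract can be illustrated with a minimal sketch. It assumes the bagged $k$-distance at a point is the plain average, over the bagging rounds, of the distance to the $k$-th nearest neighbor within a sub-sample drawn without replacement; the paper's exact definition (for example, any weighting of neighbor distances) may differ. The function name `bagged_k_distance` and its parameters are hypothetical, and the PLLS and clustering steps are not covered.

```python
import numpy as np

def bagged_k_distance(X, queries, k, s, B, seed=0):
    """Average k-th nearest-neighbor distance over B random sub-samples.

    Illustrative assumption: an unweighted mean over bagging rounds,
    with sub-samples of size s drawn without replacement from X.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    queries = np.asarray(queries, dtype=float)
    n = X.shape[0]
    if not (1 <= k <= s <= n):
        raise ValueError("need 1 <= k <= s <= n")

    total = np.zeros(queries.shape[0])
    for _ in range(B):
        # Draw one sub-sample of size s for this bagging round.
        sub = X[rng.choice(n, size=s, replace=False)]
        # Euclidean distances from every query to every sub-sampled point.
        d = np.linalg.norm(queries[:, None, :] - sub[None, :, :], axis=2)
        # k-th smallest distance per query. A query that is itself in the
        # sub-sample counts as its own nearest neighbor (distance 0).
        total += np.partition(d, k - 1, axis=1)[:, k - 1]
    return total / B
```

A small bagged $k$-distance indicates a dense region, so points near a mode should score lower than points far from the data. With `s` equal to the sample size and `B = 1`, the sketch reduces to the ordinary $k$-distance.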
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Clustering Algorithms Research · Face and Expression Recognition
