Exponentially Consistent Nonparametric Linkage-Based Clustering of Data Sequences
Bhupender Singh, Ananth Ram Rajagopalan, Srikrishna Bhashyam

TL;DR
This paper demonstrates that the SLINK clustering algorithm is exponentially consistent for a broader class of data sequences than previously known, and introduces a sequential version that improves sample efficiency.
Contribution
The paper extends the theoretical understanding of SLINK clustering's exponential consistency to less restrictive conditions and proposes a sequential algorithm with better sample efficiency.
Findings
SLINK is exponentially consistent whenever d_I < d_H, a weaker condition than the one assumed in earlier results.
SLINK outperforms k-medoids in certain clustering scenarios.
The sequential algorithm, SLINK-SEQ, needs fewer samples than fixed-sample-size SLINK.
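The gap between the old and new conditions is easiest to see on a toy example. The sketch below is illustrative only and not taken from the paper: scalar values stand in for distributions, absolute difference stands in for the distance between distributions, and the distance between two sub-clusters is assumed to be the smallest pairwise distance across them (the single-linkage convention). The names `bipartitions` and `distance_summary` are hypothetical, and d_L is used here for the maximum intra-cluster distance.

```python
from itertools import combinations

def bipartitions(cluster):
    """Yield each split of `cluster` into two non-empty sub-clusters exactly once."""
    first, rest = cluster[0], cluster[1:]
    for r in range(len(rest)):  # `left` always holds `first` and never the whole cluster
        for extra in combinations(rest, r):
            left = (first,) + extra
            right = tuple(x for x in cluster if x not in left)
            yield left, right

def distance_summary(points, clusters):
    """Return (d_L, d_H, d_I) for scalar stand-ins grouped into index clusters."""
    d = lambda a, b: abs(points[a] - points[b])
    # d_L: largest distance between two members of the same cluster.
    d_L = max(d(a, b) for c in clusters for a, b in combinations(c, 2))
    # d_H: smallest distance between members of different clusters.
    d_H = min(d(a, b) for c1, c2 in combinations(clusters, 2) for a in c1 for b in c2)
    # d_I: largest gap between two sub-clusters that partition a cluster,
    # with the gap taken as the smallest pairwise distance (assumption).
    d_I = max(
        min(d(a, b) for a in left for b in right)
        for c in clusters
        for left, right in bipartitions(c)
    )
    return d_L, d_H, d_I

# A chain-shaped cluster {0, 1, 2, 3} next to a compact cluster {5, 6}.
points = [0.0, 1.0, 2.0, 3.0, 5.0, 6.0]
clusters = [(0, 1, 2, 3), (4, 5)]
d_L, d_H, d_I = distance_summary(points, clusters)
print(d_L, d_H, d_I)
```

In this toy case the chain-shaped cluster is wider than the gap to its neighbour, so the maximum intra-cluster distance exceeds d_H, yet every way of splitting a cluster in two leaves only a small gap, so d_I < d_H still holds.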
Abstract
In this paper, we consider nonparametric clustering of independent and identically distributed (i.i.d.) data sequences generated from unknown distributions. The distributions of the data sequences belong to underlying distribution clusters. Existing results on exponentially consistent nonparametric clustering algorithms, like single linkage-based (SLINK) clustering and k-medoids distribution clustering, assume that the maximum intra-cluster distance (d_L) is smaller than the minimum inter-cluster distance (d_H). First, in the fixed sample size (FSS) setting, we show that exponential consistency can be achieved for SLINK clustering under a less strict assumption, d_I < d_H, where d_I is the maximum distance between any two sub-clusters of a cluster that partition the cluster. Note that d_I < d_L in general. Thus, our results show that SLINK is exponentially…
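A minimal sketch of fixed-sample-size single-linkage clustering of data sequences follows. It is an illustration under stated assumptions, not the authors' implementation: the Kolmogorov-Smirnov distance between empirical CDFs is assumed as the distance between sequences, the number of clusters is assumed known, and the uniform distributions, offsets, sample size, and function names (`ks_distance`, `slink`) are all hypothetical choices made for the example.

```python
import random
from bisect import bisect_right
from itertools import combinations

def ks_distance(x, y):
    """Kolmogorov-Smirnov distance between the empirical CDFs of two samples."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )

def slink(dist, num_clusters):
    """Merge the two closest clusters until `num_clusters` remain (single linkage)."""
    clusters = [{i} for i in range(len(dist))]
    while len(clusters) > num_clusters:
        # Linkage between two clusters = smallest pairwise distance across them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: min(dist[i][j] for i in clusters[p[0]] for j in clusters[p[1]]),
        )
        clusters[a] |= clusters[b]  # a < b, so deleting index b is safe
        del clusters[b]
    return clusters

# Hypothetical setup: each sequence is drawn from Uniform(o, o + 1).
# Offsets 0.0..1.2 form a chain-shaped cluster; 1.9 and 2.3 form a second cluster.
# In true KS distance, neighbours within a cluster are 0.4 apart, the clusters
# are 0.7 apart, and the two ends of the chain are 1.0 apart.
rng = random.Random(0)
n = 2000
offsets = [0.0, 0.4, 0.8, 1.2, 1.9, 2.3]
sequences = [[rng.uniform(o, o + 1.0) for _ in range(n)] for o in offsets]

m = len(sequences)
dist = [[ks_distance(sequences[i], sequences[j]) for j in range(m)] for i in range(m)]
found = slink(dist, num_clusters=2)
print(sorted(sorted(c) for c in found))
```

With this many samples per sequence the empirical distances sit close to their true values, so the run is expected to group sequences 0 to 3 together and sequences 4 and 5 together, even though the maximum intra-cluster distance exceeds the minimum inter-cluster distance in this setup.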
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Data Stream Mining Techniques
