AdaCluster: Adaptive Clustering for Heterogeneous Data
Mehmet Emin Basbug, Barbara Engelhardt

TL;DR
AdaCluster introduces an adaptive clustering method using parametrized Bregman divergences, effectively handling heterogeneous data with varying attribute topologies and dispersions, outperforming traditional fixed-divergence approaches.
Contribution
The paper presents AdaCluster, an EM-based algorithm that adaptively learns divergence parameters when clustering heterogeneous data, extending beyond fixed-divergence methods.
Findings
AdaCluster outperforms Gaussian mixture models on synthetic and UCI datasets.
Adaptively learning the divergence improves clustering accuracy for heterogeneous data.
The proposed hard clustering method is competitive with k-means.
Abstract
Clustering algorithms start with a fixed divergence, which captures the possibly asymmetric distance between a sample and a centroid. In the mixture model setting, the sample distribution plays the same role. When all attributes have the same topology and dispersion, the data are said to be homogeneous. If the prior knowledge of the distribution is inaccurate or the set of plausible distributions is large, an adaptive approach is essential. The motivation is more compelling for heterogeneous data, where the dispersion or the topology differs among attributes. We propose an adaptive approach to clustering using classes of parametrized Bregman divergences. We first show that the density of a steep exponential dispersion model (EDM) can be represented with a Bregman divergence. We then propose AdaCluster, an expectation-maximization (EM) algorithm to cluster heterogeneous data using…
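To make the notion of a possibly asymmetric divergence between a sample and a centroid concrete, here is a minimal sketch of a scalar Bregman divergence, D_phi(x, mu) = phi(x) - phi(mu) - phi'(mu)(x - mu). It is not the paper's implementation: the function names (`bregman`, `sq_euclid`, `gen_kl`) are hypothetical, and the two generators shown are the standard textbook choices associated with the Gaussian and Poisson families, not the parametrized classes that AdaCluster learns.

```python
import math

def bregman(phi, grad_phi, x, mu):
    """Scalar Bregman divergence D_phi(x, mu) for a convex generator phi.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    return phi(x) - phi(mu) - grad_phi(mu) * (x - mu)

def sq_euclid(x, mu):
    # phi(t) = t^2 / 2 yields half the squared Euclidean distance,
    # the divergence implicitly used by k-means and Gaussian mixtures.
    return bregman(lambda t: 0.5 * t * t, lambda t: t, x, mu)

def gen_kl(x, mu):
    # phi(t) = t log t - t yields the generalized KL divergence
    # x log(x / mu) - x + mu, defined for x, mu > 0 (Poisson-type data).
    return bregman(lambda t: t * math.log(t) - t, lambda t: math.log(t), x, mu)

if __name__ == "__main__":
    # The squared Euclidean case is symmetric in its arguments...
    print(sq_euclid(3.0, 1.0), sq_euclid(1.0, 3.0))
    # ...while the generalized KL case is not.
    print(gen_kl(3.0, 1.0), gen_kl(1.0, 3.0))
```

Swapping the generator changes which points count as close to a centroid, which is why committing to a single divergence in advance can be a poor fit when attributes differ in topology or dispersion.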
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Gaussian Processes and Bayesian Inference
