Bayesian Distance Clustering
Leo L Duan, David B Dunson

TL;DR
This paper introduces Bayesian distance clustering, a robust method that models pairwise distances instead of raw data, improving cluster inference especially when traditional kernels are misspecified.
Contribution
It proposes a novel Bayesian distance clustering framework that enhances robustness by focusing on pairwise distances, bridging distance- and model-based clustering approaches.
Findings
Significant improvement in clustering robustness over traditional methods.
Effective in identifying clusters poorly represented by standard kernels.
Validated through a simulation study and an application to brain genome expression data.
Abstract
Model-based clustering is widely used in a variety of application areas. However, fundamental concerns remain about robustness. In particular, results can be sensitive to the choice of kernel representing the within-cluster data density. Leveraging properties of pairwise differences between data points, we propose a class of Bayesian distance clustering methods, which rely on modeling the likelihood of the pairwise distances in place of the original data. Although some information in the data is discarded, we gain substantial robustness to modeling assumptions. The proposed approach represents an appealing middle ground between distance- and model-based clustering, drawing advantages from each of these canonical approaches. We illustrate dramatic gains in the ability to infer clusters that are not well represented by the usual choices of kernel. A simulation study is included to…
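To make the idea of "modeling the likelihood of the pairwise distances in place of the original data" concrete, here is a minimal Python sketch. It scores a candidate cluster assignment using only within-cluster pairwise distances. Everything specific in it is an assumption made for illustration and is not taken from this summary: the Gamma density for distances, the `shape` and `rate` values, the division of each cluster's contribution by its size, and the function names `pairwise_distances`, `gamma_logpdf`, and `distance_loglik`. It is not the authors' implementation and does not perform posterior inference.

```python
import math
import numpy as np


def pairwise_distances(x):
    """Euclidean distance matrix for the rows of x (shape n x p)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def gamma_logpdf(d, shape, rate):
    """Log density of Gamma(shape, rate) evaluated at distances d > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * np.log(d) - rate * d)


def distance_loglik(x, labels, shape=1.5, rate=1.0):
    """Log pseudo-likelihood of within-cluster pairwise distances.

    Assumption (illustrative only): distances within a cluster follow a
    Gamma(shape, rate) density, and each cluster's sum over pairs is
    divided by the cluster size to temper the redundancy among pairs.
    """
    dist = pairwise_distances(x)
    total = 0.0
    for h in np.unique(labels):
        idx = np.flatnonzero(labels == h)
        n_h = idx.size
        if n_h < 2:
            continue  # a singleton cluster contributes no pairs
        sub = dist[np.ix_(idx, idx)]
        upper = sub[np.triu_indices(n_h, k=1)]  # each pair counted once
        total += gamma_logpdf(upper, shape, rate).sum() / n_h
    return float(total)
```

Under these assumptions, an assignment that groups nearby points together should score higher than one that mixes distant points, because large within-cluster distances are heavily penalized by the Gamma tail. The raw coordinates enter only through the distance matrix, which is the sense in which some information in the data is discarded.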
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gene expression and cancer classification · Advanced Clustering Algorithms Research
