Search Algorithms and Loss Functions for Bayesian Clustering
David B. Dahl, Devin J. Johnson, Peter Mueller

TL;DR
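This paper introduces a stochastic greedy search algorithm for Bayesian clustering that estimates a partition from posterior Monte Carlo samples by minimizing the posterior expected loss under various loss functions. Because the search is stochastic, it does not guarantee a global optimum. It is reported to outperform existing methods in accuracy and speed.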
Contribution
The paper presents a novel randomized greedy search algorithm for Bayesian clustering that handles complex loss functions and improves computational efficiency.
Findings
The method produces clustering estimates with lower posterior expected loss than existing search methods.
It is faster than existing search algorithms for partition estimation.
The approach handles loss functions such as Binder loss and variation of information, including versions with unequal misclassification penalties.
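To make the two loss functions named above concrete, here is a minimal sketch of each as a function of two label vectors. It is an illustration under stated assumptions, not the paper's reference implementation. The function names, the penalty names `a` and `b`, and the choice of base-2 logarithms are all assumptions. Here `a` is charged when a pair that belongs together is split, and `b` when a pair that belongs apart is merged. Only the original, equal-penalty variation of information is shown.

```python
from collections import Counter
from itertools import combinations
from math import log2


def binder_loss(truth, estimate, a=1.0, b=1.0):
    """Pairwise Binder loss with (possibly unequal) penalties a and b."""
    loss = 0.0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_est = estimate[i] == estimate[j]
        if same_truth and not same_est:
            loss += a  # pair wrongly separated
        elif same_est and not same_truth:
            loss += b  # pair wrongly merged
    return loss


def entropy(labels):
    """Shannon entropy (base 2) of the cluster-size distribution."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())


def variation_of_information(x, y):
    """VI(x, y) = 2 H(x, y) - H(x) - H(y); zero iff the partitions agree."""
    joint = entropy(list(zip(x, y)))
    return 2 * joint - entropy(x) - entropy(y)
```

For example, with `truth = [0, 0, 1, 1]` and `estimate = [0, 0, 0, 1]`, one pair is wrongly separated and two are wrongly merged, so the loss is `a + 2*b`. With equal penalties the two kinds of error are indistinguishable, and unequal penalties let them be weighted differently.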
Abstract
We propose a randomized greedy search algorithm to find a point estimate for a random partition based on a loss function and posterior Monte Carlo samples. Given the large size and awkward discrete nature of the search space, the minimization of the posterior expected loss is challenging. Our approach is a stochastic search based on a series of greedy optimizations performed in a random order and is embarrassingly parallel. We consider several loss functions, including Binder loss and variation of information. We note that criticisms of Binder loss are the result of using equal penalties of misclassification and we show an efficient means to compute Binder loss with potentially unequal penalties. Furthermore, we extend the original variation of information to allow for unequal penalties and show no increased computational costs. We provide a reference implementation of our algorithm.…
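The idea in the abstract, greedy optimizations performed in a random order and repeated over independent runs, can be sketched as follows for Binder loss. This is a deliberately simplified illustration, not the authors' algorithm or reference implementation. The singleton starting point, the sweep limit, the number of runs, and all function names are assumptions. The expected loss is recomputed from scratch for each candidate move, for clarity rather than speed.

```python
import random
from itertools import combinations


def coclustering_matrix(samples):
    """p[i][j] = fraction of posterior samples placing i and j together."""
    n = len(samples[0])
    p = [[0.0] * n for _ in range(n)]
    for z in samples:
        for i in range(n):
            for j in range(n):
                if z[i] == z[j]:
                    p[i][j] += 1.0 / len(samples)
    return p


def expected_binder(labels, p, a=1.0, b=1.0):
    """Monte Carlo estimate of the posterior expected Binder loss."""
    total = 0.0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            total += b * (1.0 - p[i][j])  # merged, but may belong apart
        else:
            total += a * p[i][j]          # split, but may belong together
    return total


def greedy_run(p, rng, a=1.0, b=1.0, max_sweeps=20):
    """One run: sweep items in random order, moving each to its best cluster."""
    n = len(p)
    labels = list(range(n))  # assumed start: every item in its own cluster
    for _ in range(max_sweeps):
        changed = False
        order = list(range(n))
        rng.shuffle(order)
        for i in order:
            best_label = labels[i]
            best_loss = expected_binder(labels, p, a, b)
            fresh = max(labels) + 1  # label for opening a new cluster
            others = set(labels[:i] + labels[i + 1:]) | {fresh}
            for c in sorted(others - {labels[i]}):
                trial = labels[:i] + [c] + labels[i + 1:]
                loss = expected_binder(trial, p, a, b)
                if loss < best_loss - 1e-12:  # move only on strict improvement
                    best_label, best_loss = c, loss
            if best_label != labels[i]:
                labels[i] = best_label
                changed = True
        if not changed:
            break
    return labels, expected_binder(labels, p, a, b)


def stochastic_greedy_search(samples, n_runs=8, seed=0, a=1.0, b=1.0):
    """Keep the best of several independent runs (each could run in parallel)."""
    p = coclustering_matrix(samples)
    rng = random.Random(seed)
    runs = (greedy_run(p, rng, a, b) for _ in range(n_runs))
    return min(runs, key=lambda r: r[1])
```

Calling `stochastic_greedy_search(samples)` on a list of sampled label vectors returns a `(labels, expected_loss)` pair. Since the runs share no state beyond the co-clustering matrix, they could be distributed across workers, which is the sense in which such a search is embarrassingly parallel.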
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Gaussian Processes and Bayesian Inference
