Bayesian Agglomerative Clustering with Coalescents
Yee Whye Teh, Hal Daumé III, Daniel Roy

TL;DR
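This paper presents a Bayesian hierarchical clustering model that uses Kingman's coalescent as the prior over trees, together with new greedy and sequential Monte Carlo inference algorithms that experimentally outperform competing methods. The approach is demonstrated on document clustering and phylolinguistics.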
Contribution
It introduces a novel Bayesian model for hierarchical clustering based on coalescent processes and develops efficient greedy and Monte Carlo inference algorithms.
Findings
The proposed greedy and sequential Monte Carlo algorithms outperform the clustering methods they are compared against in experiments.
The approach is demonstrated on document clustering.
The approach is also demonstrated on phylolinguistics.
Abstract
We introduce a new Bayesian model for hierarchical clustering based on a prior over trees called Kingman's coalescent. We develop novel greedy and sequential Monte Carlo inferences which operate in a bottom-up agglomerative fashion. We show experimentally the superiority of our algorithms over others, and demonstrate our approach in document clustering and phylolinguistics.
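To make the tree prior concrete, here is a minimal sketch of drawing one tree from Kingman's n-coalescent in its standard form: while k lineages remain, the waiting time to the next merge is exponential with rate k(k-1)/2, and the merging pair is chosen uniformly at random. The function name `sample_kingman_coalescent`, the merge-tuple layout, and the convention of measuring time as a nonnegative duration back from the present are assumptions made for illustration. This samples from the prior only and is not the paper's inference algorithm.

```python
import random


def sample_kingman_coalescent(n, seed=None):
    """Sample a binary tree over n leaves from Kingman's n-coalescent.

    Hypothetical helper for illustration. Returns a list of merges
    (time, left, right, parent), where leaves are labelled 0..n-1,
    internal nodes are labelled n..2n-2, and time is the duration
    back from the present at which the merge happens.
    """
    rng = random.Random(seed)
    active = list(range(n))  # lineages that have not yet coalesced
    next_id = n              # label for the next internal node
    t = 0.0
    merges = []
    while len(active) > 1:
        k = len(active)
        # Each of the k*(k-1)/2 pairs coalesces at rate 1, so the time
        # to the first coalescence is exponential with the total rate.
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)
        # The pair that coalesces is uniform over all pairs.
        i, j = rng.sample(range(k), 2)
        left, right = active[i], active[j]
        # Remove the larger index first so the smaller stays valid.
        for idx in sorted((i, j), reverse=True):
            active.pop(idx)
        active.append(next_id)
        merges.append((t, left, right, next_id))
        next_id += 1
    return merges
```

Calling `sample_kingman_coalescent(5, seed=0)` yields four merges with increasing times. A bottom-up agglomerative inference procedure would build a tree of this same shape, but would choose pairs and times using the data rather than the prior alone.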
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Data Management and Algorithms
