Repulsive Mixtures

Francesca Petralia; Vinayak Rao; David B. Dunson

arXiv:1204.5243·stat.ME·September 21, 2012

TL;DR

This paper introduces a repulsive process for generating mixture components, producing fewer, better-separated, and more interpretable clusters than mixtures whose component parameters are drawn independently from a common prior, which tends to yield redundant, overlapping components.

Contribution

The paper proposes a repulsive prior on mixture component parameters, characterizes it theoretically, and gives a Markov chain Monte Carlo sampling algorithm for posterior computation.

Findings

01. Fewer, better-separated clusters in simulated data

02. More interpretable clusters in real datasets

03. Theoretical properties of the repulsive prior established

Abstract

Discrete mixture models are routinely used for density estimation and clustering. While conducting inferences on the cluster-specific parameters, current frequentist and Bayesian methods often encounter problems when clusters are placed too close together to be scientifically meaningful. Current Bayesian practice generates component-specific parameters independently from a common prior, which tends to favor similar components and often leads to substantial probability assigned to redundant components that are not needed to fit the data. As an alternative, we propose to generate components from a repulsive process, which leads to fewer, better separated and more interpretable clusters. We characterize this repulsive prior theoretically and propose a Markov chain Monte Carlo sampling algorithm for posterior computation. The methods are illustrated using simulated data as well as real…
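To make the contrast in the abstract concrete, the following is a minimal sketch of how a repulsive prior can differ from an independent one. It is not the paper's construction: the standard normal base density, the repulsion factor exp(-tau / d**nu), and the names `log_base_density`, `log_repulsion`, `log_repulsive_prior`, `log_independent_prior`, `tau`, and `nu` are all assumptions chosen for illustration. The sketch multiplies independent base densities by a pairwise factor that shrinks toward zero as two component locations approach each other.

```python
import math
from itertools import combinations


def log_base_density(mu):
    """Log density of a standard normal base prior (an assumed choice)."""
    return -0.5 * mu * mu - 0.5 * math.log(2.0 * math.pi)


def log_repulsion(d, tau=1.0, nu=2.0):
    """Log of a hypothetical repulsion factor exp(-tau / d**nu).

    The factor tends to 0 (log tends to -inf) as the distance d tends to 0,
    and tends to 1 (log tends to 0) as d grows.
    """
    if d == 0.0:
        return -math.inf
    return -tau / d ** nu


def log_independent_prior(mus):
    """Log prior when component locations are drawn independently."""
    return sum(log_base_density(m) for m in mus)


def log_repulsive_prior(mus, tau=1.0, nu=2.0):
    """Unnormalized log prior: independent base terms plus pairwise repulsion."""
    repulsion = sum(
        log_repulsion(abs(a - b), tau, nu) for a, b in combinations(mus, 2)
    )
    return log_independent_prior(mus) + repulsion


close = [0.0, 0.1]   # two nearly identical component locations
apart = [-1.0, 1.0]  # two well-separated component locations

# The independent prior prefers the nearly identical pair,
# while the repulsive prior prefers the separated pair.
print(log_independent_prior(close) > log_independent_prior(apart))
print(log_repulsive_prior(close) < log_repulsive_prior(apart))
```

Under these assumed choices, the independent prior assigns higher density to the two nearly coincident locations because both sit near the mode of the base density, which is the behavior the abstract describes as favoring similar components. The pairwise factor reverses that preference. The density here is left unnormalized, which is enough for comparing configurations with the same number of components.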

Taxonomy

Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Advanced Clustering Algorithms Research