Model-based Clustering using Automatic Differentiation: Confronting Misspecification and High-Dimensional Data
Siva Rajesh Kasa, Vaibhav Rajan

TL;DR
This paper compares gradient descent (GD) with automatic differentiation (AD) to expectation-maximization (EM) for Gaussian mixture model clustering, addressing misspecification and high-dimensional data challenges, and introduces a new penalized likelihood method.
Contribution
It proposes a novel penalized likelihood approach with AD for improved clustering, especially in high-dimensional settings, and analyzes performance under misspecification.
Findings
GD outperforms EM on high-dimensional data
EM performs better under model misspecification
The penalized likelihood improves cluster interpretation
Abstract
We study two practically important cases of model-based clustering using Gaussian Mixture Models: (1) when there is misspecification and (2) on high-dimensional data, in the light of recent advances in Gradient Descent (GD) based optimization using Automatic Differentiation (AD). Our simulation studies show that EM has better clustering performance, measured by Adjusted Rand Index, compared to GD in cases of misspecification, whereas on high-dimensional data GD outperforms EM. We observe that with both EM and GD there are many solutions with high likelihood but poor cluster interpretation. To address this problem, we design a new penalty term for the likelihood based on the Kullback-Leibler divergence between pairs of fitted components. Closed-form expressions for the gradients of this penalized likelihood are difficult to derive, but AD can be done effortlessly, illustrating the…
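To make the building block of the penalty concrete, the sketch below computes the standard closed-form Kullback-Leibler divergence between two multivariate Gaussians and collects it for every ordered pair of fitted components. The abstract as shown here does not state how these pairwise values are combined into the penalty, so this is only an illustration of the quantity involved, not the authors' method. The function names `gaussian_kl` and `pairwise_kl` are hypothetical, and NumPy is an assumed dependency.

```python
import numpy as np


def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) ) for full covariances."""
    d = mu0.shape[0]
    diff = mu1 - mu0
    # tr(cov1^{-1} cov0), computed with a linear solve instead of an explicit inverse
    trace_term = np.trace(np.linalg.solve(cov1, cov0))
    # Mahalanobis-type term (mu1 - mu0)^T cov1^{-1} (mu1 - mu0)
    maha_term = diff @ np.linalg.solve(cov1, diff)
    # log-determinants via slogdet for numerical stability
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (trace_term + maha_term - d + logdet1 - logdet0)


def pairwise_kl(means, covs):
    """Matrix whose (i, j) entry is KL(component i || component j); diagonal is 0."""
    k = len(means)
    out = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            if i != j:
                out[i, j] = gaussian_kl(means[i], covs[i], means[j], covs[j])
    return out


# Illustrative (made-up) fitted components in two dimensions
means = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
kl_matrix = pairwise_kl(means, covs)
```

The divergence is asymmetric in general, so the matrix holds both directions for each pair. In an AD framework, the same arithmetic written with differentiable array operations would let gradients of any penalty built from these values be obtained without deriving them by hand, which is the convenience the abstract points to.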
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Statistical Methods and Inference
