Parsimonious Skew Mixture Models for Model-Based Clustering and Classification

Irene Vrbik; Paul D. McNicholas

arXiv:1302.2373·stat.ME·November 12, 2013·Comput. Stat. Data Anal.

TL;DR

This paper introduces parsimonious mixture models based on skew-t and skew-normal distributions for clustering and classification of asymmetric data. Parsimony comes from an eigenvalue decomposition, and models are selected using the BIC.

Contribution

The paper develops skew-t and skew-normal mixture models with a parsimonious structure, extending the GPCM family, and compares them with existing models in both unsupervised and semi-supervised classification settings.
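As an illustration of the eigenvalue decomposition behind the GPCM (Gaussian parsimonious clustering models) family, a covariance matrix can be written as Σ = λ D A Dᵀ, where λ is a volume constant, D holds the eigenvectors (orientation), and A is a diagonal matrix with unit determinant (shape). Constraining these pieces to be equal or variable across components is what gives the family its parsimony. The sketch below is a minimal NumPy version for a single positive-definite matrix; the function name `gpcm_decompose` and the example matrix are hypothetical and are not taken from the paper.

```python
import numpy as np

def gpcm_decompose(sigma):
    """Split a positive-definite matrix into (lam, D, A) with
    sigma = lam * D @ A @ D.T and det(A) = 1.
    Hypothetical helper for illustration only."""
    eigvals, eigvecs = np.linalg.eigh(sigma)      # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]             # sort eigenvalues in decreasing order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    p = sigma.shape[0]
    lam = np.prod(eigvals) ** (1.0 / p)           # volume: det(sigma)^(1/p)
    A = np.diag(eigvals / lam)                    # shape: diagonal, det(A) = 1
    return lam, eigvecs, A

# Illustrative 2x2 covariance matrix (made up for this example).
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
lam, D, A = gpcm_decompose(sigma)
reconstructed = lam * D @ A @ D.T                 # should recover sigma
```

Holding λ, D, or A fixed across mixture components, or setting D or A to the identity, yields the different members of the family; the sketch only shows the decomposition itself, not the constrained estimation.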

Findings

01. The proposed models are reported to outperform existing approaches on benchmark data.

02. Parameters are estimated effectively via the EM algorithm.

03. The models successfully handle asymmetric data distributions.

Abstract

In recent work, robust mixture modelling approaches using skewed distributions have been explored to accommodate asymmetric data. We introduce parsimony by developing skew-t and skew-normal analogues of the popular GPCM family that employ an eigenvalue decomposition of a positive-semidefinite matrix. The methods developed in this paper are compared to existing models in both an unsupervised and semi-supervised classification framework. Parameter estimation is carried out using the expectation-maximization algorithm and models are selected using the Bayesian information criterion. The efficacy of these extensions is illustrated on simulated and benchmark clustering data sets.
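To make the model-selection step concrete, the following sketch computes the BIC for candidate models and picks the best one. It assumes the larger-is-better convention BIC = 2 log L − ρ log n, where ρ is the number of free parameters and n the sample size; some references use the opposite sign. The parameter counts shown are the standard ones for Gaussian mixtures with four covariance structures (EII, VII, EEE, VVV); skew-t and skew-normal analogues would carry additional skewness (and, for skew-t, degrees-of-freedom) parameters that are not counted here. The function names and the log-likelihood values are placeholders for illustration, not results from the paper.

```python
import math

def covariance_param_count(model, G, p):
    """Free covariance parameters for a few GPCM-style structures
    (G components, p dimensions). Hypothetical helper."""
    full = p * (p + 1) // 2
    counts = {"EII": 1, "VII": G, "EEE": full, "VVV": G * full}
    return counts[model]

def bic(loglik, n_params, n):
    """BIC in the larger-is-better form: 2 log L - rho log n."""
    return 2.0 * loglik - n_params * math.log(n)

n, G, p = 200, 2, 2
base = (G - 1) + G * p        # mixing proportions + component means (Gaussian case)

# Placeholder maximized log-likelihoods, invented purely for illustration.
logliks = {"EII": -530.0, "VVV": -505.0}

scores = {
    m: bic(ll, base + covariance_param_count(m, G, p), n)
    for m, ll in logliks.items()
}
best = max(scores, key=scores.get)   # model with the largest BIC
```

In practice the log-likelihoods would come from EM fits of each candidate structure and number of components, and the same comparison would run over the whole grid of fitted models.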
