Variable selection for model-based clustering using the integrated complete-data likelihood

Marbac Matthieu; Sedki Mohammed

arXiv:1501.06314·stat.ME·December 23, 2016·Stat. Comput.


TL;DR

This paper introduces a model selection criterion, based on the integrated complete-data likelihood, for variable selection in model-based clustering. The criterion can be evaluated without estimating model parameters during selection.

Contribution

It proposes a new, computationally efficient information criterion for variable selection that avoids parameter estimation and improves over classical methods.

Findings

01 Outperforms classical variable selection methods on simulated data

02 Efficient model selection without parameter estimation

03 Applicable to Gaussian mixture models with independence assumptions

Abstract

Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between the clustering accuracy and the number of selected variables by using a lasso-type penalty. However, the calibration of the penalty term is open to criticism. Model selection methods are an efficient alternative, yet they require a difficult optimization of an information criterion which involves combinatorial problems. First, most of these optimization algorithms are based on a suboptimal procedure (e.g., a stepwise method). Second, the algorithms are often greedy because they need multiple calls to an EM algorithm. Here we propose to use a new information criterion based on the integrated complete-data likelihood. It does not require any parameter estimate, and its maximization is simple and computationally efficient. The original contribution of…
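To illustrate why a criterion built on an integrated likelihood needs no parameter estimate, here is a minimal sketch, not the authors' implementation. It assumes a conjugate Normal-Inverse-Gamma prior on each Gaussian mean and variance, takes the partition `z` as given, and treats variables as independent within clusters. The hyperparameters `m0`, `k0`, `a0`, `b0` and the function names are hypothetical choices made for this example. Under these assumptions the parameters integrate out in closed form, so a variable can be scored by comparing its marginal likelihood with cluster-specific parameters against its marginal likelihood with a single shared set of parameters.

```python
import math

def log_marginal_gaussian(x, m0=0.0, k0=1.0, a0=1.0, b0=1.0):
    """Closed-form log marginal likelihood of a univariate Gaussian sample
    under an assumed Normal-Inverse-Gamma prior (hyperparameters are
    illustrative, not taken from the paper)."""
    n = len(x)
    if n == 0:
        return 0.0  # an empty cluster contributes nothing
    xbar = sum(x) / n
    ss = sum((v - xbar) ** 2 for v in x)
    kn = k0 + n
    an = a0 + n / 2.0
    bn = b0 + 0.5 * ss + k0 * n * (xbar - m0) ** 2 / (2.0 * kn)
    return (-0.5 * n * math.log(2.0 * math.pi)
            + 0.5 * math.log(k0 / kn)
            + a0 * math.log(b0) - an * math.log(bn)
            + math.lgamma(an) - math.lgamma(a0))

def is_relevant(x, z, n_clusters, **prior):
    """Compare, for one variable and a fixed partition z, the integrated
    likelihood with cluster-specific parameters against the integrated
    likelihood with one shared set of parameters."""
    groups = [[v for v, k in zip(x, z) if k == c] for c in range(n_clusters)]
    log_relevant = sum(log_marginal_gaussian(g, **prior) for g in groups)
    log_irrelevant = log_marginal_gaussian(x, **prior)
    return log_relevant > log_irrelevant

# Toy data: one variable separates the two clusters, the other does not.
z = [0, 0, 0, 0, 1, 1, 1, 1]
separating = [-5.1, -4.9, -5.0, -5.2, 5.0, 5.1, 4.9, 5.2]
noise = [0.1, -0.2, 0.05, -0.1, 0.15, -0.05, 0.0, -0.15]
print(is_relevant(separating, z, 2), is_relevant(noise, z, 2))
```

No EM run appears anywhere in this sketch: each variable is scored from sufficient statistics alone. The search over partitions, which the abstract's maximization step would also involve, is deliberately left out.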
