Full Model Estimation for Non-Parametric Multivariate Finite Mixture Models
Marie Du Roy de Chaumaray, Matthieu Marbac

TL;DR
This paper introduces a consistent method for full model estimation in non-parametric multivariate finite mixture models, focusing on selecting the number of components and relevant variables using discretization and penalization techniques.
Contribution
It proposes discretizing each variable into bins and penalizing the resulting log-likelihood, so that the number of mixture components and the subset of discriminative variables are selected jointly.
Findings
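The model estimator (number of components and subset of relevant variables) is consistent under a suitable choice of penalty, with the number of bins growing with the sample size.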
Method performs well on simulated data.
Effective variable and component selection demonstrated on benchmark data.
Abstract
This paper addresses the problem of full model estimation for non-parametric finite mixture models. It presents an approach for selecting the number of components and the subset of discriminative variables (i.e., the subset of variables having different distributions among the mixture components). The proposed approach considers a discretization of each variable into B bins and a penalization of the resulting log-likelihood. Considering that the number of bins tends to infinity as the sample size tends to infinity, we prove that our estimator of the model (number of components and subset of relevant variables for clustering) is consistent under a suitable choice of the penalty term. The interest of our proposal is illustrated on simulated and benchmark data.
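To make the two ingredients of the abstract concrete, here is a minimal Python sketch of a penalized criterion computed on discretized data. It is an illustration under stated assumptions, not the authors' implementation: the quantile-based binning, the EM fit, the light smoothing of the bin probabilities, the BIC-like penalty, and every function name (`discretize`, `max_loglik`, `penalized_criterion`) are choices made here for the example. The abstract only states that each variable is cut into B bins and that the penalty must be chosen suitably.

```python
import numpy as np

def discretize(X, B):
    """Map each column to bin indices 0..B-1 (quantile cut points: an assumption)."""
    Z = np.empty(X.shape, dtype=int)
    for j in range(X.shape[1]):
        cuts = np.quantile(X[:, j], np.linspace(0, 1, B + 1)[1:-1])  # B-1 interior cuts
        Z[:, j] = np.searchsorted(cuts, X[:, j], side="right")
    return Z

def max_loglik(Z, B, K, relevant, n_iter=100, seed=0):
    """EM for a K-component mixture of independent multinomials on binned data.
    Non-relevant variables share a single distribution across components."""
    rng = np.random.default_rng(seed)
    n = Z.shape[0]
    onehot = np.eye(B)[Z]                          # shape (n, d, B)
    shared = onehot.mean(axis=0) + 1e-6            # pooled bin frequencies, shape (d, B)
    shared /= shared.sum(axis=1, keepdims=True)
    resp = rng.dirichlet(np.ones(K), size=n)       # random soft assignments, shape (n, K)
    for _ in range(n_iter):
        # M-step: proportions and (lightly smoothed) per-component bin probabilities
        pi = np.clip(resp.mean(axis=0), 1e-12, None)
        pi /= pi.sum()
        theta = np.einsum("nk,njb->kjb", resp, onehot) + 1e-6
        theta /= theta.sum(axis=2, keepdims=True)
        theta[:, ~relevant, :] = shared[~relevant]
        # E-step: posterior membership probabilities (log-sum-exp for stability)
        log_p = np.log(pi) + np.einsum("njb,kjb->nk", onehot, np.log(theta))
        m = log_p.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
        resp = np.exp(log_p - log_norm)
    return float(log_norm.sum())

def penalized_criterion(X, B, K, relevant):
    """Log-likelihood of the binned data minus a BIC-like penalty (assumption)."""
    relevant = np.asarray(relevant, dtype=bool)
    n, d = X.shape
    loglik = max_loglik(discretize(X, B), B, K, relevant)
    n_rel = int(relevant.sum())
    # Free parameters: proportions, K bin distributions per relevant variable,
    # one bin distribution per non-relevant variable
    n_params = (K - 1) + (B - 1) * (K * n_rel + (d - n_rel))
    return loglik - 0.5 * n_params * np.log(n)

# Toy data: two components, variables 0-2 discriminative, variable 3 pure noise
rng = np.random.default_rng(1)
n = 600
labels = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 4))
X[:, :3] += 3.0 * labels[:, None]

crit_one = penalized_criterion(X, B=4, K=1, relevant=[False] * 4)
crit_true = penalized_criterion(X, B=4, K=2, relevant=[True, True, True, False])
crit_full = penalized_criterion(X, B=4, K=2, relevant=[True] * 4)
print(crit_one, crit_true, crit_full)
```

In a full search, one would evaluate this criterion over a grid of component counts and variable subsets and keep the maximizer. The toy example uses three discriminative variables because a mixture fitted to a single binned variable cannot improve on the one-component fit.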
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference
