Analysis of Estimating the Bayes Rule for Gaussian Mixture Models with a Specified Missing-Data Mechanism
Ziyang Lyu

TL;DR
This paper shows that, for Gaussian mixture models fitted to partially classified samples, a Bayes rule classifier that models the missing-label mechanism can outperform a fully supervised classifier, particularly when class overlap and the proportion of missing labels are moderate to low, or when overlap is large but few labels are missing.
Contribution
It analyses a Bayes rule classifier that incorporates a missing-data mechanism, following the generative framework of Ahfock and McLachlan (2020), and shows that it can improve on standard classifiers in Gaussian mixture models.
Findings
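In a two-class homoscedastic normal model, the classifier with a missing-data mechanism can outperform a fully supervised classifier, especially when both the overlap and the proportion of missing labels are moderate to low, or when the overlap is large but few labels are missing.
It outperforms a classifier with no missing-data mechanism regardless of the overlap or the proportion of missing class labels.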
Simulation and real data examples validate the effectiveness of the proposed method.
Abstract
Semi-supervised learning (SSL) approaches have been successfully applied in a wide range of engineering and scientific fields. This paper investigates the generative model framework with a missingness mechanism for unclassified observations, as introduced by Ahfock and McLachlan (2020). We show that in a partially classified sample, a classifier using the Bayes rule of allocation with a missing-data mechanism can surpass a fully supervised classifier in a two-class normal homoscedastic model, especially with moderate to low overlap and proportion of missing class labels, or with large overlap but few missing labels. It also outperforms a classifier with no missing-data mechanism regardless of the overlap region or the proportion of missing class labels. Our exploration of two- and three-component normal mixture models with unequal covariances through simulations further corroborates our…
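To make the setting in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It computes the Bayes rule of allocation for a two-class homoscedastic normal model and pairs it with an illustrative missingness mechanism. The summary above does not state the form of the mechanism, so the logistic model in the log-entropy of the posterior probabilities used here is an assumption, and the names `bayes_rule_two_class`, `missing_prob`, `xi0`, and `xi1` are hypothetical.

```python
import numpy as np

def bayes_rule_two_class(y, pi1, mu1, mu2, sigma):
    """Posterior probability of class 1 and Bayes allocation for a
    two-class normal model with a common covariance matrix sigma."""
    beta = np.linalg.inv(sigma) @ (mu1 - mu2)
    beta0 = np.log(pi1 / (1.0 - pi1)) - 0.5 * (mu1 + mu2) @ beta
    d = beta0 + y @ beta                 # log posterior odds (linear in y)
    tau1 = 1.0 / (1.0 + np.exp(-d))      # posterior probability of class 1
    labels = np.where(d > 0, 1, 2)       # allocate to class 1 when odds > 1
    return tau1, labels

def entropy(tau1, eps=1e-12):
    """Shannon entropy of the two-class posterior probabilities."""
    t = np.clip(tau1, eps, 1.0 - eps)
    return -(t * np.log(t) + (1.0 - t) * np.log(1.0 - t))

def missing_prob(tau1, xi0, xi1):
    """ASSUMED mechanism: probability that a label is missing, modelled
    as logistic in the log-entropy (xi0, xi1 are hypothetical names)."""
    z = xi0 + xi1 * np.log(entropy(tau1))
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative parameters only; they are not taken from the paper.
mu1, mu2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
sigma = np.eye(2)
y = np.array([[0.0, 0.0], [1.0, 0.0], [-3.0, 0.0]])

tau1, labels = bayes_rule_two_class(y, 0.5, mu1, mu2, sigma)
q = missing_prob(tau1, xi0=0.0, xi1=2.0)
print(labels, np.round(q, 3))
```

With a positive `xi1`, points near the decision boundary (high entropy) are the most likely to be unlabelled. That is the kind of informative missingness under which modelling the mechanism can add information about the class parameters.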
Taxonomy
Topics: Bayesian Methods and Mixture Models · Genetic and phenotypic traits in livestock · Machine Learning and Data Classification
