Flexible High-Dimensional Unsupervised Learning with Missing Data
Yuhong Wei, Yang Tang, Paul D. McNicholas

TL;DR
This paper extends the mixture of generalized hyperbolic factor analyzers (MGHFA) model to high-dimensional data with missing values, provides an efficient estimation algorithm, and demonstrates its effectiveness on simulated and real datasets.
Contribution
The paper extends the MGHFA model to accommodate missing data and develops a computationally efficient alternating expectation conditional maximization algorithm for parameter estimation in high-dimensional unsupervised learning.
Findings
Effective handling of missing data in high-dimensional settings.
Improved clustering and imputation performance demonstrated.
Algorithm shows computational efficiency on real and simulated data.
Abstract
The mixture of factor analyzers (MFA) model is a well-known mixture model-based approach for unsupervised learning with high-dimensional data. It can be useful, inter alia, in situations where the data dimensionality far exceeds the number of observations. In recent years, the MFA model has been extended to non-Gaussian mixtures to account for clusters with heavier tail weight and/or asymmetry. The mixture of generalized hyperbolic factor analyzers (MGHFA) model is one such extension, which leads to a flexible modelling paradigm that accounts for both heavier tail weight and cluster asymmetry. In many practical applications, missing values complicate data analyses. A generalization of the MGHFA is presented to accommodate missing values. Under a missing-at-random mechanism, we develop a computationally efficient alternating expectation conditional maximization algorithm for…
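To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a single Gaussian component rather than a generalized hyperbolic one, and shows (a) the factor-analytic covariance structure, Sigma = Lambda Lambda^T + Psi, that lets MFA-type models cope with high dimensions, and (b) conditional-mean imputation of missing entries given the observed ones, the kind of quantity an expectation step needs under a missing-at-random assumption. The function names `fa_covariance` and `conditional_mean_impute` and all parameter values are hypothetical, chosen for illustration only.

```python
import numpy as np

def fa_covariance(Lambda, psi):
    """Factor-analytic covariance: Sigma = Lambda Lambda^T + diag(psi)."""
    return Lambda @ Lambda.T + np.diag(psi)

def conditional_mean_impute(x, mu, Sigma):
    """Fill NaN entries of x with E[x_mis | x_obs] under N(mu, Sigma).

    Gaussian simplification (an assumption of this sketch); the MGHFA
    model in the paper uses generalized hyperbolic components instead.
    """
    mis = np.isnan(x)
    if not mis.any():
        return x.copy()
    obs = ~mis
    S_oo = Sigma[np.ix_(obs, obs)]  # covariance among observed entries
    S_mo = Sigma[np.ix_(mis, obs)]  # cross-covariance missing vs. observed
    filled = x.copy()
    filled[mis] = mu[mis] + S_mo @ np.linalg.solve(S_oo, x[obs] - mu[obs])
    return filled

# Illustrative values only: p = 6 variables, q = 2 latent factors.
rng = np.random.default_rng(0)
p, q = 6, 2
Lambda = rng.normal(size=(p, q))  # hypothetical factor loadings
psi = np.full(p, 0.5)             # hypothetical diagonal noise variances
mu = np.zeros(p)
Sigma = fa_covariance(Lambda, psi)

x = rng.multivariate_normal(mu, Sigma)
x_missing = x.copy()
x_missing[[1, 4]] = np.nan        # mask two coordinates as missing
x_filled = conditional_mean_impute(x_missing, mu, Sigma)
```

The loading matrix has p × q entries instead of the p(p + 1)/2 free parameters of an unconstrained covariance, which is why factor-analytic structures remain estimable when the dimension is large relative to the sample size.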
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gene expression and cancer classification · Statistical Methods and Bayesian Inference
