TL;DR
This paper introduces HI-VAE, a variational autoencoder framework for modeling and imputing incomplete, heterogeneous (mixed continuous and discrete) data. It also achieves competitive predictive performance in supervised tasks when trained on data with missing values.
Contribution
The paper presents a general framework for designing VAEs that fit incomplete heterogeneous data, with likelihood models for real-valued, positive real-valued, interval, categorical, ordinal and count attributes, extending VAEs to real-world datasets with missing entries.
Findings
HI-VAE accurately imputes missing data across multiple data types.
HI-VAE outperforms supervised models when trained on incomplete data.
The framework shows competitive predictive performance in supervised tasks.
Abstract
Variational autoencoders (VAEs), as well as other generative models, have been shown to be efficient and accurate for capturing the latent structure of vast amounts of complex high-dimensional data. However, existing VAEs still cannot directly handle data that are heterogeneous (mixed continuous and discrete) or incomplete (with data missing at random), both of which are common in real-world applications. In this paper, we propose a general framework to design VAEs suitable for fitting incomplete heterogeneous data. The proposed HI-VAE includes likelihood models for real-valued, positive real-valued, interval, categorical, ordinal and count data, and allows accurate estimation (and potentially imputation) of missing data. Furthermore, HI-VAE presents competitive predictive performance in supervised tasks, outperforming supervised models when trained on incomplete data.
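The idea of attaching a separate likelihood model to each attribute type, and letting only observed entries contribute, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `None` convention for missing values, and the specific distributions (Gaussian for real-valued, log-normal for positive, Poisson for count, softmax for categorical) are assumptions chosen for illustration, and the interval and ordinal types are omitted. In a VAE, the parameters passed in below would come from the decoder network rather than being fixed by hand.

```python
import math

def gaussian_logpdf(x, mean, std):
    """Log density of a real-valued attribute under N(mean, std^2)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def lognormal_logpdf(x, mean, std):
    """Log density of a positive attribute whose logarithm is N(mean, std^2)."""
    return gaussian_logpdf(math.log(x), mean, std) - math.log(x)

def poisson_logpmf(k, rate):
    """Log probability of a count k under Poisson(rate)."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def categorical_logpmf(k, logits):
    """Log probability of class index k under a softmax over logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[k] - log_norm

# Hypothetical type tags mapped to their log-likelihood functions.
LOG_LIKELIHOODS = {
    "real": gaussian_logpdf,
    "positive": lognormal_logpdf,
    "count": poisson_logpmf,
    "categorical": categorical_logpmf,
}

def masked_log_likelihood(row, types, params):
    """Sum per-attribute log-likelihoods, skipping missing (None) entries."""
    total = 0.0
    for value, kind, p in zip(row, types, params):
        if value is None:  # missing entry: excluded from the objective
            continue
        total += LOG_LIKELIHOODS[kind](value, *p)
    return total

# Usage: one record with four attributes, the second of which is missing.
types = ["real", "positive", "count", "categorical"]
params = [(0.0, 1.0), (0.0, 1.0), (1.0,), ([0.0, 0.0],)]
row = [0.0, None, 2, 1]
observed_ll = masked_log_likelihood(row, types, params)
```

Because missing entries are skipped rather than filled with a placeholder, a record with gaps still yields a well-defined training signal from whatever attributes were observed; imputation would then amount to reading off the decoder's predicted distribution for the skipped attribute.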
