IVGAE: Handling Incomplete Heterogeneous Data with a Variational Graph Autoencoder
Youran Zhou, Mohamed Reda Bouadjenek, Sunil Aryal

TL;DR
IVGAE introduces a graph-based autoencoder framework that effectively imputes missing heterogeneous data by modeling structural dependencies and missingness patterns, outperforming existing methods across various real-world datasets.
Contribution
The paper proposes IVGAE, a novel variational graph autoencoder with a dual-decoder and Transformer-based embeddings for improved handling of incomplete heterogeneous data.
Findings
IVGAE outperforms existing methods in RMSE and F1 scores.
Effective under all three standard missingness mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR).
Achieves consistent improvements across 16 real-world datasets.
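To make the three missingness mechanisms concrete, the following is a minimal sketch of one common way to simulate them on a numerical table and to score an imputation with RMSE on the masked entries only. It is not the paper's evaluation protocol: the function names `make_missing_mask` and `masked_rmse`, the logistic form of the missingness probability, and the choice of column 0 as the always-observed MAR driver are all assumptions made for illustration.

```python
import numpy as np


def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_missing_mask(X, mechanism="MCAR", rate=0.3, seed=0):
    """Return a boolean mask (True = missing). Hypothetical helper, illustration only."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if mechanism == "MCAR":
        # Every cell is dropped with the same probability, independent of the data.
        return rng.random((n, d)) < rate
    # Standardize columns so the logistic probabilities are comparable.
    z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)
    if mechanism == "MAR":
        # Missingness depends only on column 0, which is assumed to stay observed.
        p = _sigmoid(z[:, [0]])
        p = np.clip(p * rate / p.mean(), 0.0, 1.0)
        mask = rng.random((n, d)) < p  # (n, 1) broadcasts across columns
        mask[:, 0] = False
        return mask
    if mechanism == "MNAR":
        # Self-masking: larger values are more likely to be missing themselves.
        p = _sigmoid(z)
        p = np.clip(p * rate / p.mean(axis=0), 0.0, 1.0)
        return rng.random((n, d)) < p
    raise ValueError(f"unknown mechanism: {mechanism}")


def masked_rmse(X_true, X_imputed, mask):
    """RMSE computed only over the entries that were masked out."""
    diff = (np.asarray(X_true) - np.asarray(X_imputed))[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Usage: score a column-mean baseline under MNAR on synthetic data.
X = np.random.default_rng(42).normal(size=(500, 4))
mask = make_missing_mask(X, "MNAR", rate=0.3)
X_obs = np.where(mask, np.nan, X)
X_mean = np.where(mask, np.nanmean(X_obs, axis=0), X)
print(masked_rmse(X, X_mean, mask))
```

Under MNAR the column-mean baseline is biased, because the observed values are systematically smaller than the missing ones; this is the kind of setting where modeling the missingness pattern itself can help.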
Abstract
Handling missing data remains a fundamental challenge in real-world tabular datasets, especially when data are heterogeneous with both numerical and categorical features. Existing imputation methods often fail to capture complex structural dependencies and handle heterogeneous data effectively. We present IVGAE, a Variational Graph Autoencoder framework for robust imputation of incomplete heterogeneous data. IVGAE constructs a bipartite graph to represent sample-feature relationships and applies graph representation learning to model structural dependencies. A key innovation is its dual-decoder architecture, where one decoder reconstructs feature embeddings and the other models missingness patterns, providing structural priors aware of missing mechanisms. To better encode categorical variables, we introduce a Transformer-based heterogeneous embedding module that avoids…
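The bipartite sample-feature graph mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' construction: the helper name `table_to_bipartite` is hypothetical, the table is assumed to be purely numerical with missing cells stored as NaN, and one edge is created per observed cell, with the cell value as the edge attribute. Categorical features and the Transformer-based embedding module are not covered here.

```python
import numpy as np


def table_to_bipartite(X):
    """Turn an (n, d) table with NaNs into a bipartite edge list.

    Sample nodes are numbered 0..n-1 and feature nodes n..n+d-1.
    Hypothetical helper for illustration; the paper's exact graph may differ.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    observed = ~np.isnan(X)                  # True where a cell has a value
    rows, cols = np.nonzero(observed)        # row-major order over observed cells
    edge_index = np.stack([rows, n + cols])  # shape (2, num_observed)
    edge_value = X[rows, cols]               # observed value carried on each edge
    return edge_index, edge_value, observed


# Usage: two samples, three features, two missing cells.
X = np.array([[1.0, np.nan, 3.0],
              [np.nan, 5.0, 6.0]])
edge_index, edge_value, observed = table_to_bipartite(X)
print(edge_index.tolist())  # [[0, 0, 1, 1], [2, 4, 3, 4]]
print(edge_value.tolist())  # [1.0, 3.0, 5.0, 6.0]
```

In this representation, imputing a missing cell amounts to predicting the attribute of an absent edge, and the `observed` matrix is the kind of missingness pattern that a second decoder could be trained to reconstruct.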
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Healthcare · Domain Adaptation and Few-Shot Learning
