Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs
Navindu Leelarathna, Andrei Margeloiu, Mateja Jamnik, Nikola Simidjievski

TL;DR
This paper introduces a novel divide-and-conquer ensemble VAE method that improves representation learning in high-dimensional, small-sample tabular data, enhancing downstream classification, disentanglement, and robustness.
Contribution
It proposes an ensemble of lightweight VAEs with a new joint posterior factorization for better HDLSS representation learning, data augmentation, and robustness.
Findings
Better latent representations in HDLSS settings
Higher accuracy in downstream classification
Improved disentanglement and robustness
Abstract
Variational Autoencoders and their many variants have displayed impressive ability to perform dimensionality reduction, often achieving state-of-the-art performance. Many current methods, however, struggle to learn good representations in High Dimensional, Low Sample Size (HDLSS) tasks, which is an inherently challenging setting. We address this challenge by using an ensemble of lightweight VAEs to learn posteriors over subsets of the feature-space, which get aggregated into a joint posterior in a novel divide-and-conquer approach. Specifically, we present an alternative factorisation of the joint posterior that induces a form of implicit data augmentation that yields greater sample efficiency. Through a series of experiments on eight real-world datasets, we show that our method learns better latent representations in HDLSS settings, which leads to higher accuracy in a downstream…
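The divide-and-conquer idea in the abstract (separate posteriors over feature subsets, then aggregation into one joint posterior) can be illustrated with a rough sketch. The abstract does not specify the paper's factorisation, so the code below substitutes a standard product of diagonal Gaussians as the aggregation rule; this is an assumption, not the authors' method. The function names `split_features` and `product_of_gaussians`, the contiguous feature split, and the random linear maps standing in for trained VAE encoders are all hypothetical.

```python
import numpy as np


def split_features(n_features, n_subsets):
    """Partition feature indices into contiguous, near-equal subsets (assumed scheme)."""
    return np.array_split(np.arange(n_features), n_subsets)


def product_of_gaussians(mus, logvars):
    """Combine diagonal Gaussian posteriors by precision-weighted averaging.

    This is a generic product-of-experts rule used as a stand-in; it is NOT
    the alternative factorisation proposed in the paper.
    """
    mus = np.asarray(mus)
    precisions = np.exp(-np.asarray(logvars))      # 1 / sigma^2 for each expert
    joint_var = 1.0 / precisions.sum(axis=0)       # precisions add under a product
    joint_mu = joint_var * (precisions * mus).sum(axis=0)
    return joint_mu, joint_var


rng = np.random.default_rng(0)
n_samples, n_features, n_subsets, latent_dim = 5, 1000, 4, 8  # HDLSS-like toy sizes
x = rng.normal(size=(n_samples, n_features))
subsets = split_features(n_features, n_subsets)

# Hypothetical stand-in encoders: one random linear map per feature subset.
mus, logvars = [], []
for idx in subsets:
    w_mu = rng.normal(scale=0.01, size=(len(idx), latent_dim))
    w_lv = rng.normal(scale=0.01, size=(len(idx), latent_dim))
    mus.append(x[:, idx] @ w_mu)
    logvars.append(x[:, idx] @ w_lv)

joint_mu, joint_var = product_of_gaussians(mus, logvars)
```

Each encoder in this sketch sees only its own subset of the features, which is what would let the individual models stay lightweight. Under the assumed product rule, the joint variance is always smaller than that of any single subset posterior.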
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Cancer-related molecular mechanisms research
