Multimodal Deep Generative Model for Semi-Supervised Learning under Class Imbalance

Heegeon Yoon; Heeyoung Kim

arXiv:2605.06289 · stat.ML · May 8, 2026

TL;DR

This paper introduces a multimodal deep generative model designed for semi-supervised learning on imbalanced datasets, leveraging separate modality encoders, Student's t-distributions, and a novel training objective to improve classification accuracy.

Contribution

It proposes a new multimodal generative framework with heavy-tailed priors and a specialized training objective to effectively handle class imbalance and partial supervision.

Findings

01 Outperforms baseline methods on benchmark datasets.

02 Achieves superior classification accuracy with imbalanced, partially labeled multimodal data.

03 Demonstrates robustness across real-world datasets.

Abstract

When modeling class-imbalanced data, it is crucial to address the imbalance, as models trained on such data tend to be biased towards the majority classes. This problem is amplified under partial supervision, where pseudo-labels for unlabeled data are predicted based on imbalanced labeled data, propagating the bias. While recent semi-supervised models address class imbalance, they typically assume single-modal input data. However, with the growing availability of multimodal data, it is essential to leverage complementary modalities. In this article, we propose a multimodal deep generative model for semi-supervised learning under class imbalance. Our approach uses separate encoders for each modality, sharing latent variables across modalities, and simplifies joint posterior computation with a product-of-experts method. To further address class imbalance, we replace typical Gaussian…
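The product-of-experts step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each modality encoder outputs a diagonal Gaussian over the shared latent variable, in which case the joint posterior has a closed form: precisions add, and the joint mean is the precision-weighted average of the expert means. The abstract is cut off where it describes replacing the Gaussian, so this covers only the standard Gaussian case and should not be read as the authors' implementation. The function name `product_of_gaussian_experts` and the `include_prior` flag are hypothetical, introduced only for illustration.

```python
import numpy as np


def product_of_gaussian_experts(means, variances, include_prior=True):
    """Fuse per-modality diagonal Gaussians into one joint Gaussian.

    means, variances: array-likes of shape (num_modalities, latent_dim).
    include_prior: if True, a standard normal N(0, I) is added as an
    extra expert (an assumption of this sketch, not taken from the paper).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    if include_prior:
        latent_dim = means.shape[1]
        means = np.vstack([np.zeros((1, latent_dim)), means])
        variances = np.vstack([np.ones((1, latent_dim)), variances])

    # For Gaussian experts, precisions (inverse variances) add up.
    precisions = 1.0 / variances
    joint_var = 1.0 / precisions.sum(axis=0)

    # The joint mean is the precision-weighted average of the expert means.
    joint_mean = joint_var * (precisions * means).sum(axis=0)
    return joint_mean, joint_var


# Two modalities, two latent dimensions, equal confidence, no prior expert.
mu, var = product_of_gaussian_experts(
    means=[[1.0, 0.0], [3.0, 0.0]],
    variances=[[1.0, 1.0], [1.0, 1.0]],
    include_prior=False,
)
print(mu, var)
```

With equally confident experts the joint mean falls midway between the expert means and the variance halves, which is the sense in which combining modalities sharpens the posterior. A missing modality can be handled by leaving its row out of `means` and `variances`.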
