VAE-Inf: A statistically interpretable generative paradigm for imbalanced classification

Hongfei Wu; Ruijian Han; Yancheng Yuan

arXiv:2604.25334·cs.LG·April 29, 2026


TL;DR

VAE-Inf introduces a two-stage, interpretable generative framework for imbalanced classification that combines deep representation learning with hypothesis testing to improve minority class detection.

Contribution

The paper proposes a VAE-based method that pairs a distribution-aware loss with hypothesis testing to identify minority-class samples when minority data are scarce.

Findings

01

Achieves finite-sample control of the false positive rate without parametric assumptions.

02

Constructs a global Gaussian reference model for the majority class as a Wasserstein barycenter of the VAE's latent posteriors.

03

Demonstrates competitive performance on real-world benchmarks.
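
As an illustration of what distribution-free, finite-sample false positive rate control (finding 01) can look like, here is a minimal Python sketch of a rank-based (conformal-style) p-value. This is an assumption for illustration only: the listing does not describe the paper's actual test, and the names `conformal_p_value`, `flag_minority`, and `calib_scores` are hypothetical. The sketch assumes each sample has an anomaly score (larger means less like the majority class) and that a held-out set of majority-class scores is exchangeable with the score of a new majority-class sample. Under that assumption the probability of wrongly flagging a majority sample is at most `alpha`, with no parametric model of the scores.

```python
import numpy as np


def conformal_p_value(calib_scores, test_score):
    """Rank-based p-value of test_score against held-out majority scores.

    Hypothetical helper, not the paper's procedure. Larger scores are
    assumed to indicate samples that look less like the majority class.
    """
    calib_scores = np.asarray(calib_scores, dtype=float)
    n = calib_scores.size
    # Count calibration scores at least as extreme as the test score;
    # the +1 terms account for the test point itself.
    return (1 + np.sum(calib_scores >= test_score)) / (n + 1)


def flag_minority(calib_scores, test_score, alpha=0.05):
    """Reject the 'majority class' null hypothesis when the p-value <= alpha."""
    return bool(conformal_p_value(calib_scores, test_score) <= alpha)
```

With `n` calibration scores the smallest attainable p-value is `1 / (n + 1)`, so the calibration set must contain at least `1 / alpha - 1` samples before any point can be flagged at level `alpha`.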

Abstract

Imbalanced classification remains a pervasive challenge in machine learning, particularly when minority samples are too scarce to provide a robust discriminative boundary. In such extreme scenarios, conventional models often suffer from unstable decision boundaries and a lack of reliable error control. To bridge the gap between generative modeling and discriminative classification, we propose a two-stage framework, VAE-Inf, that integrates deep representation learning with statistically interpretable hypothesis testing. In the first stage, we adopt a one-class modeling perspective by training a variational autoencoder (VAE) exclusively on majority-class data to capture the underlying reference distribution. The resulting latent posteriors are aggregated via a Wasserstein barycenter to construct a global Gaussian reference model, providing a geometrically principled baseline for…
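
The barycenter step in the abstract can be sketched in a few lines of Python under one simplifying assumption: that each latent posterior is a Gaussian with diagonal covariance, as in a standard VAE encoder. In that case the 2-Wasserstein barycenter has a closed form (the weighted mean of the means and the weighted mean of the per-dimension standard deviations). Whether the paper uses diagonal posteriors or uniform weights is not stated in this excerpt, and the function names below are hypothetical.

```python
import numpy as np


def gaussian_w2_barycenter_diag(mus, sigmas, weights=None):
    """2-Wasserstein barycenter of diagonal Gaussians N(mu_i, diag(sigma_i^2)).

    mus, sigmas: arrays of shape (n_samples, latent_dim); sigmas hold
    standard deviations (not variances). Assumes diagonal covariances,
    for which the barycenter is Gaussian with averaged means and
    averaged standard deviations.
    """
    mus = np.asarray(mus, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    n = mus.shape[0]
    if weights is None:
        weights = np.full(n, 1.0 / n)  # uniform weights (an assumption)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    bary_mu = weights @ mus        # weighted mean of the means
    bary_sigma = weights @ sigmas  # weighted mean of the standard deviations
    return bary_mu, bary_sigma


def squared_mahalanobis(z, mu, sigma):
    """Squared distance of latent code(s) z from the diagonal reference Gaussian."""
    z = np.asarray(z, dtype=float)
    return np.sum(((z - mu) / sigma) ** 2, axis=-1)
```

A distance such as `squared_mahalanobis` is one possible way to score how far a new sample's latent code lies from the majority reference model; the score the paper actually uses is not given in the truncated abstract.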
