ScarceGAN: Discriminative Classification Framework for Rare Class Identification for Longitudinal Data with Weak Prior
Surajit Chakrabarty, Rukma Talwadker, Tridib Mukherjee

TL;DR
ScarceGAN is a novel semi-supervised GAN framework designed to identify extremely rare classes in longitudinal data with weak labels, outperforming existing models in recall and establishing new benchmarks.
Contribution
It introduces a modified semi-supervised GAN that leverages weak negative labels and positive samples to improve rare class detection in highly imbalanced, longitudinal datasets.
Findings
Achieves over 85% recall on scarce classes, a 60% improvement over a vanilla semi-supervised GAN.
Outperforms recent GAN-based models in rare attack class identification in intrusion datasets.
Establishes new benchmarks for rare class detection in longitudinal telemetry data.
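The findings above are stated as recall on the scarce class. The following is a minimal sketch of why per-class recall, rather than overall accuracy, is the informative metric under heavy class imbalance. The labels, class names, and helper functions are hypothetical toy illustrations, not data or code from the paper.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, where recall = correct predictions / actual members of the class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data (an assumption for illustration): 990 common, 10 scarce samples.
y_true = ["common"] * 990 + ["scarce"] * 10

# A degenerate classifier that always predicts the majority class.
y_majority = ["common"] * 1000

# A classifier that mislabels 20 common samples but finds 9 of the 10 scarce ones.
y_better = ["scarce"] * 20 + ["common"] * 970 + ["scarce"] * 9 + ["common"] * 1

for name, y_pred in [("majority", y_majority), ("better", y_better)]:
    print(name, accuracy(y_true, y_pred), per_class_recall(y_true, y_pred)["scarce"])
```

In this toy setup the majority-class predictor has the higher accuracy yet never identifies a scarce sample, while the second classifier gives up a little accuracy to recover most of them. That gap is why rare-class work is typically judged on recall for the scarce class.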
Abstract
This paper introduces ScarceGAN which focuses on identification of extremely rare or scarce samples from multi-dimensional longitudinal telemetry data with small and weak label prior. We specifically address: (i) severe scarcity in positive class, stemming from both underlying organic skew in the data, as well as extremely limited labels; (ii) multi-class nature of the negative samples, with uneven density distributions and partially overlapping feature distributions; and (iii) massively unlabelled data leading to tiny and weak prior on both positive and negative classes, and possibility of unseen or unknown behavior in the unlabelled set, especially in the negative class. Although related to PU learning problems, we contend that knowledge (or lack of it) on the negative class can be leveraged to learn the complement of it (i.e., the positive class) better in a semi-supervised manner.…
