Regression Augmentation With Data-Driven Segmentation
Shayan Alahyari, Shiva Mehdipour Ghobadlou, Mike Domaratzki

TL;DR
This paper introduces a novel GAN-based data augmentation method for imbalanced regression that automatically identifies minority samples using data-driven modeling, leading to improved performance over existing techniques.
Contribution
It presents a fully data-driven, GAN-based augmentation framework utilizing Mahalanobis-GMM and nearest-neighbour matching to better identify and enrich rare regions in imbalanced regression.
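To make the identification step concrete, here is a minimal sketch of one way a Mahalanobis-plus-GMM split could work. It rests on assumptions not confirmed by this summary: distances are measured from the global mean of the joint feature-target matrix, and a two-component 1-D Gaussian mixture is fitted to those distances with plain EM. The function names (mahalanobis_distances, fit_gmm_1d, minority_mask) and the toy data are hypothetical, and the paper's actual formulation may differ.

```python
import numpy as np

def mahalanobis_distances(Z):
    """Mahalanobis distance of each row of Z from the sample mean of Z."""
    diff = Z - Z.mean(axis=0)
    # Pseudo-inverse guards against a singular covariance matrix.
    cov_inv = np.linalg.pinv(np.cov(Z, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))

def fit_gmm_1d(d, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture to d with plain EM."""
    # Assumption: start one component at each extreme so one begins in the tail.
    mu = np.array([d.min(), d.max()], dtype=float)
    var = np.full(2, d.var() + 1e-9)
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each distance.
        dens = w * np.exp(-0.5 * (d[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: update weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / len(d)
        mu = (resp * d[:, None]).sum(axis=0) / nk
        var = (resp * (d[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
    return mu, resp

def minority_mask(X, y):
    """Flag samples that the larger-mean mixture component claims."""
    Z = np.column_stack([X, y])          # joint feature-target space
    d = mahalanobis_distances(Z)
    mu, resp = fit_gmm_1d(d)
    far = int(np.argmax(mu))             # component centred on large distances
    return resp[:, far] > 0.5

# Toy data (hypothetical): a dense linear trend plus five off-trend samples.
rng = np.random.default_rng(0)
x_bulk = rng.normal(0.0, 1.0, 300)
y_bulk = 2.0 * x_bulk + rng.normal(0.0, 0.5, 300)
x_rare = rng.normal(3.0, 0.1, 5)
y_rare = rng.normal(-6.0, 0.1, 5)
X = np.concatenate([x_bulk, x_rare])[:, None]
y = np.concatenate([y_bulk, y_rare])
mask = minority_mask(X, y)
print("flagged as minority:", int(mask.sum()), "of", len(mask))
```

No cut-off on the distance is set by hand here: the boundary between "common" and "rare" falls wherever the two fitted components exchange dominance, which is the sense in which the split is data-driven.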
Findings
Outperforms state-of-the-art augmentation methods on 32 datasets.
Automatically identifies truly rare samples without preset thresholds.
Enhances model performance on skewed target distributions.
Abstract
Imbalanced regression arises when the target distribution is skewed, causing models to focus on dense regions and struggle with underrepresented (minority) samples. Despite its relevance across many applications, few methods have been designed specifically for this challenge. Existing approaches often rely on fixed, ad hoc thresholds to label samples as rare or common, overlooking the continuous complexity of the joint feature-target space and failing to represent the true underlying rare regions. To address these limitations, we propose a fully data-driven GAN-based augmentation framework that uses Mahalanobis-Gaussian Mixture Modeling (GMM) to automatically identify minority samples and employs deterministic nearest-neighbour matching to enrich sparse regions. Rather than relying on preset thresholds, our method lets the data determine which observations are truly rare. Evaluation on 32 benchmark…
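The deterministic nearest-neighbour matching mentioned in the abstract can be illustrated with a short sketch. The matching direction (each real minority sample picks its closest synthetic candidate), the Euclidean metric, and the name match_nearest are all assumptions made for illustration; the candidates array is a hand-written stand-in for GAN output, not data from the paper.

```python
import numpy as np

def match_nearest(minority, candidates):
    """For each minority row, return the index of its nearest candidate row."""
    # Pairwise squared Euclidean distances, shape (n_minority, n_candidates).
    diff = minority[:, None, :] - candidates[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    # argmin returns the first minimum, so ties resolve the same way every run.
    return d2.argmin(axis=1)

# Hypothetical rows in the joint feature-target space.
minority = np.array([[0.0, 0.0], [5.0, 5.0]])
candidates = np.array([[4.0, 4.5], [0.2, -0.1], [9.0, 9.0], [0.5, 0.5]])

idx = match_nearest(minority, candidates)
selected = candidates[idx]   # synthetic rows kept to enrich the sparse region
print(idx.tolist())          # [1, 0]
```

Because the selection involves no random sampling, the same minority set and candidate pool always yield the same augmented rows, which is one plausible reading of "deterministic" in this context.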
Taxonomy
Topics: Imbalanced Data Classification Techniques · Bayesian Methods and Mixture Models · Face and Expression Recognition
