Intrinsic Dimensionality as a Model-Free Measure of Class Imbalance

Çağrı Eser; Zeynep Sonat Baltacı; Emre Akbaş; Sinan Kalkan

arXiv:2511.10475 · cs.LG · January 22, 2026

TL;DR

This paper proposes data Intrinsic Dimensionality (ID) as a simple, model-free measure of class imbalance. ID outperforms cardinality-based re-weighting and re-sampling and can be incorporated into existing imbalance mitigation methods.

Contribution

The paper proposes Intrinsic Dimensionality as a measure of class imbalance that can be incorporated into existing mitigation methods, improving performance across five datasets with a diverse range of imbalance ratios.

Findings

01 ID consistently outperforms cardinality-based re-weighting and re-sampling techniques.

02 Combining ID with cardinality can further improve performance.

03 ID is easy to compute and model-free, so it can be incorporated into various imbalance mitigation methods.

Abstract

Imbalance in classification tasks is commonly quantified by the cardinalities of examples across classes. This, however, disregards the presence of redundant examples and inherent differences in the learning difficulties of classes. Alternatively, one can use complex measures such as training loss and uncertainty, which, however, depend on training a machine learning model. Our paper proposes using data Intrinsic Dimensionality (ID) as an easy-to-compute, model-free measure of imbalance that can be seamlessly incorporated into various imbalance mitigation methods. Our results across five different datasets with a diverse range of imbalance ratios show that ID consistently outperforms cardinality-based re-weighting and re-sampling techniques used in the literature. Moreover, we show that combining ID with cardinality can further improve performance. Our code and models are available at…
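
To make the idea of a model-free, per-class ID measure concrete, here is a minimal sketch. The abstract does not say which ID estimator or weighting rule the authors use, so everything specific below is an assumption: the estimator is a TwoNN-style maximum-likelihood estimate based on nearest-neighbour distance ratios, the weights are simply proportional to each class's estimated ID, and the names `twonn_id` and `id_class_weights` are hypothetical. It uses only the standard library to stay self-contained.

```python
import heapq
import math
import random


def twonn_id(points):
    """Estimate intrinsic dimension from nearest-neighbour distance ratios.

    For each point, mu = r2 / r1 compares the distances to its second and
    first nearest neighbours; the maximum-likelihood estimate is N / sum(log mu).
    """
    log_mu = []
    for i, p in enumerate(points):
        r1, r2 = heapq.nsmallest(
            2, (math.dist(p, q) for j, q in enumerate(points) if j != i)
        )
        if r1 > 0:  # exact duplicates leave the ratio undefined, so skip them
            log_mu.append(math.log(r2 / r1))
    return len(log_mu) / sum(log_mu)


def id_class_weights(samples_by_class):
    """Weight each class in proportion to its estimated ID (mean weight is 1).

    Assumption (not from the paper): a higher ID marks a harder class that
    should receive a larger weight.
    """
    ids = {c: twonn_id(pts) for c, pts in samples_by_class.items()}
    mean_id = sum(ids.values()) / len(ids)
    return {c: d / mean_id for c, d in ids.items()}


# Toy data: one class lies on a 2-D subspace of an 8-D space,
# the other spreads over all 8 dimensions.
rng = random.Random(0)
flat = [[rng.gauss(0, 1), rng.gauss(0, 1)] + [0.0] * 6 for _ in range(300)]
full = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(300)]
weights = id_class_weights({"flat": flat, "full": full})
print(weights)
```

Both toy classes contain 300 samples, so a cardinality-based scheme would weight them equally, while the ID-based weights differ because the two classes occupy spaces of different dimension. In practice the estimate would be computed on real features, and the mapping from ID to weight is a design choice.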

Taxonomy

Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)