Appropriateness of Performance Indices for Imbalanced Data Classification: An Analysis
Sankha Subhra Mullick, Shounak Datta, Sourish Gunesh Dhekane, and Swagatam Das

TL;DR
This paper critically examines performance indices for imbalanced data classification, identifying conditions for their robustness, analyzing common indices theoretically, proposing modifications, and recommending best practices based on empirical and theoretical insights.
Contribution
It identifies two fundamental conditions that a performance index should satisfy under class imbalance, analyzes existing binary and multi-class indices against them, and provides guidelines for their appropriate use and modification.
Findings
Certain indices are sensitive to class distribution changes.
Modified indices better satisfy robustness conditions.
Recommendations improve classifier evaluation in imbalanced scenarios.
Abstract
Indices quantifying the performance of classifiers under class imbalance often suffer from distortions depending on the constitution of the test set or the class-specific classification accuracy, creating difficulties in assessing the merit of the classifier. We identify two fundamental conditions that a performance index must satisfy to be resilient, respectively, to changes in the number of testing instances from each class and in the number of classes in the test set. In light of these conditions, under the effect of class imbalance, we theoretically analyze four indices commonly used for evaluating binary classifiers and five popular indices for multi-class classifiers. For indices violating any of the conditions, we also suggest remedial modification and normalization. We further investigate the capability of the indices to retain information about the classification performance over all the…
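The distortion described in the abstract can be illustrated with a small sketch. It assumes a hypothetical binary classifier whose class-wise rates (true positive rate and true negative rate) stay fixed while the number of negative test instances grows. The function name `binary_indices` and the indices shown (accuracy, precision, and the geometric mean of the class-wise rates) are chosen here for illustration only; they are not necessarily the four binary indices the paper analyzes.

```python
import math

def binary_indices(tpr, tnr, n_pos, n_neg):
    """Expected index values for fixed class-wise rates and a given test-set makeup.

    tpr, tnr: assumed per-class accuracies of a hypothetical classifier.
    n_pos, n_neg: number of positive and negative test instances.
    """
    tp = tpr * n_pos          # expected true positives
    fn = n_pos - tp           # expected false negatives
    tn = tnr * n_neg          # expected true negatives
    fp = n_neg - tn           # expected false positives

    accuracy = (tp + tn) / (n_pos + n_neg)
    precision = tp / (tp + fp)
    # Geometric mean of the two class-wise recalls, computed from the counts.
    gmean = math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))
    return {"accuracy": accuracy, "precision": precision, "gmean": gmean}

# Same classifier, two test sets that differ only in the number of negatives.
balanced = binary_indices(tpr=0.9, tnr=0.8, n_pos=100, n_neg=100)
imbalanced = binary_indices(tpr=0.9, tnr=0.8, n_pos=100, n_neg=1000)

for name in ("accuracy", "precision", "gmean"):
    print(f"{name:9s} balanced={balanced[name]:.3f} imbalanced={imbalanced[name]:.3f}")
```

In this sketch, accuracy and precision change when only the class proportions of the test set change, whereas the geometric mean of the class-wise rates does not, because it depends only on the per-class accuracies.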
