Fair Overlap Number of Balls (Fair-ONB): A Data-Morphology-based Undersampling Method for Bias Reduction

José Daniel Pascual-Triana; Alberto Fernández; Paulo Novais; Francisco Herrera

arXiv:2407.14210·cs.LG·March 13, 2025

Open Access

TL;DR

Fair-ONB is a novel data-morphology-based undersampling technique designed to reduce bias in classification tasks involving protected features, improving fairness with minimal impact on accuracy.

Contribution

This paper introduces Fair-ONB, a new undersampling method that leverages data morphology to target overlap areas, enhancing fairness in AI models.

Findings

1. Improves fairness metrics in classification models.
2. Maintains high predictive performance.
3. Effectively reduces bias in datasets with protected features.

Abstract

One of the key issues regarding classification problems in Trustworthy Artificial Intelligence is ensuring Fairness in the prediction of different classes when protected (sensitive) features are present. Data quality is critical in these cases, as biases in training data can be reflected in machine learning, impacting human lives and failing to comply with current regulations. One strategy to improve data quality and avoid these problems is preprocessing the dataset. Instance selection via undersampling can foster balanced learning of classes and protected feature values. Performing undersampling in class overlap areas close to the decision boundary should bolster the impact on the classifier. This work proposes Fair Overlap Number of Balls (Fair-ONB), an undersampling method that harnesses the data morphology of the different data groups (obtained from the combination of classes and…
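
The idea of undersampling in overlap areas, per group formed by class and protected feature value, can be illustrated with a minimal sketch. This is not the Fair-ONB algorithm: it swaps in a plain k-nearest-neighbour overlap score for the paper's data-morphology analysis, and it assumes every group is trimmed to the size of the smallest one. The function names `group_labels`, `overlap_scores`, and `undersample_overlap`, as well as the parameter `k`, are hypothetical and chosen for illustration only.

```python
import numpy as np

def group_labels(y, s):
    """Map each (class, protected value) pair to an integer group id."""
    pairs = [(a, b) for a, b in zip(np.asarray(y).tolist(), np.asarray(s).tolist())]
    ids = {p: i for i, p in enumerate(sorted(set(pairs)))}
    return np.array([ids[p] for p in pairs])

def overlap_scores(X, y, k=5):
    """Fraction of each sample's k nearest neighbours with a different class.

    Assumption: this k-NN score is a simple stand-in for an overlap measure,
    not the morphology-based criterion used by Fair-ONB.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)          # a sample is not its own neighbour
    nn = np.argsort(dist, axis=1)[:, :k]    # indices of the k nearest samples
    return (y[nn] != y[:, None]).mean(axis=1)

def undersample_overlap(X, y, s, k=5):
    """Return sorted indices of the samples to keep.

    Assumption: each group is cut down to the smallest group's size,
    discarding its highest-overlap samples first.
    """
    groups = group_labels(y, s)
    scores = overlap_scores(X, y, k)
    target = np.bincount(groups).min()
    keep = []
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        order = idx[np.argsort(scores[idx], kind="stable")]  # low overlap first
        keep.extend(order[:target].tolist())
    return np.sort(np.array(keep))
```

Calling `undersample_overlap(X, y, s)` on a feature matrix `X`, class labels `y`, and protected feature values `s` yields indices that can be used as `X[keep]`, `y[keep]`, `s[keep]`. Balancing to the smallest group is the bluntest possible target and can discard a lot of data when one group is very small.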

Taxonomy

Topics: Sports Analytics and Performance