Auditing and Mitigating Bias in Gender Classification Algorithms: A Data-Centric Approach

Tadesse K Bahiru, Natnael Tilahun Sinshaw, Teshager Hailemariam Moges, and Dheeraj Kumar Singh

arXiv:2510.17873 · cs.CV · January 23, 2026

Open Access

TL;DR

This paper audits gender classification datasets for bias, reveals significant underrepresentation, and introduces BalancedFace, a balanced dataset that substantially reduces bias in classifiers with minimal accuracy loss.

Contribution

It provides a comprehensive bias audit of existing datasets and creates BalancedFace, a new balanced dataset to mitigate bias in gender classification algorithms.

Findings

01. Bias exists in all audited datasets, especially for intersectional groups.
02. Training on BalancedFace reduces racial subgroup bias by over 50%.
03. BalancedFace improves fairness metrics with minimal impact on accuracy.

Abstract

Gender classification systems often inherit and amplify demographic imbalances in their training data. We first audit five widely used gender classification datasets, revealing that all suffer from significant intersectional underrepresentation. To measure the downstream impact of these flaws, we train identical MobileNetV2 classifiers on the two most balanced of these datasets, UTKFace and FairFace. Our fairness evaluation shows that even these models exhibit significant bias, misclassifying female faces at a higher rate than male faces and amplifying existing racial skew. To counter these data-induced biases, we construct BalancedFace, a new public dataset created by blending images from FairFace and UTKFace, supplemented with images from other collections to fill missing demographic gaps. It is engineered to equalize subgroup shares across 189 intersections of age, race, and gender…
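The two kinds of measurement the abstract describes, subgroup representation in a dataset and per-group misclassification rates of a trained classifier, can be illustrated with a minimal sketch. This is not the authors' pipeline or their fairness metric: the record fields (`age`, `race`, `gender`, `label`, `pred`), the function names, and the toy data are all hypothetical, chosen only to show the shape of such an audit.

```python
from collections import Counter


def subgroup_shares(records, keys=("age", "race", "gender")):
    """Share of records in each observed intersection of the given attributes."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def error_rate_by_group(records, key):
    """Misclassification rate for each value of `key`.

    Each record needs a true 'label' and a predicted 'pred' (hypothetical fields).
    """
    totals, errors = Counter(), Counter()
    for r in records:
        totals[r[key]] += 1
        errors[r[key]] += int(r["label"] != r["pred"])
    return {group: errors[group] / totals[group] for group in totals}


# Toy records, invented for illustration only (not data from the paper).
records = [
    {"age": "20-29", "race": "A", "gender": "F", "label": "F", "pred": "M"},
    {"age": "20-29", "race": "A", "gender": "F", "label": "F", "pred": "F"},
    {"age": "20-29", "race": "A", "gender": "M", "label": "M", "pred": "M"},
    {"age": "30-39", "race": "B", "gender": "M", "label": "M", "pred": "M"},
]

shares = subgroup_shares(records)
rates = error_rate_by_group(records, "gender")
# Gap between the worst-served and best-served group on this attribute.
gap = max(rates.values()) - min(rates.values())
```

On this toy input, half of the records fall in one intersection and the female error rate exceeds the male one, which is the pattern of imbalance and disparity the abstract reports at scale. A perfectly balanced dataset in this sense would give every intersection the same share.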

Taxonomy

Topics: Ethics and Social Impacts of AI · Face recognition and analysis · Authorship Attribution and Profiling