When majority rules, minority loses: bias amplification of gradient descent
François Bachoc (LPP), Jérôme Bolte (TSE-R), Ryan Boustany (TSE-R), Jean-Michel Loubes (IMT)

TL;DR
This paper develops a theoretical framework for bias amplification in machine learning, showing how gradient descent can favor majority groups and neglect minorities, and illustrates the results with deep learning experiments on tabular and image classification tasks.
Contribution
It introduces a formal analysis of bias amplification, revealing how population and variance imbalance influence model bias and training requirements.
Findings
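The "full-data" predictor lies close to the stereotypical predictor, which neglects minority-specific features.
In a dominant region, training the entire model tends to learn only the majority traits.
There is a lower bound on the additional training required to go beyond the stereotypical predictor.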
Abstract
Despite growing empirical evidence of bias amplification in machine learning, its theoretical foundations remain poorly understood. We develop a formal framework for majority-minority learning tasks, showing how standard training can favor majority groups and produce stereotypical predictors that neglect minority-specific features. Assuming population and variance imbalance, our analysis reveals three key findings: (i) the close proximity between "full-data" and stereotypical predictors, (ii) the dominance of a region where training the entire model tends to merely learn the majority traits, and (iii) a lower bound on the additional training required. Our results are illustrated through experiments in deep learning for tabular and image classification tasks.
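The mechanism described in the abstract can be illustrated with a toy linear regression. This is a minimal sketch under assumptions of our own, not the paper's model or experiments: the function name `run_toy`, the group sizes, the feature scale, and the learning rate are all hypothetical choices. A shared feature is present in both groups, while a minority-specific feature is zero for the majority and has low variance within the minority (population and variance imbalance). Gradient descent on the pooled loss fits the shared feature quickly but barely moves the minority-specific weight.

```python
import numpy as np


def run_toy(n_major=950, n_minor=50, minor_scale=0.3, lr=0.1, steps=200, seed=0):
    """Toy majority-minority regression trained by full-batch gradient descent.

    All names and default values are illustrative assumptions, not taken
    from the paper.
    """
    rng = np.random.default_rng(seed)

    # Column 0: shared feature, present in both groups.
    # Column 1: minority-specific feature, zero for the majority and
    # low-variance for the minority.
    x_major = np.column_stack([rng.normal(size=n_major), np.zeros(n_major)])
    x_minor = np.column_stack(
        [rng.normal(size=n_minor), minor_scale * rng.normal(size=n_minor)]
    )

    # Noise-free labels from the same ground-truth weights for both groups.
    w_true = np.array([1.0, 1.0])
    y_major = x_major @ w_true
    y_minor = x_minor @ w_true

    x_all = np.vstack([x_major, x_minor])
    y_all = np.concatenate([y_major, y_minor])

    # Gradient descent on the pooled mean squared error, starting from zero.
    w = np.zeros(2)
    for _ in range(steps):
        grad = x_all.T @ (x_all @ w - y_all) / len(y_all)
        w -= lr * grad

    def mse(x, y):
        return float(np.mean((x @ w - y) ** 2))

    return w, mse(x_major, y_major), mse(x_minor, y_minor)


if __name__ == "__main__":
    w, major_mse, minor_mse = run_toy()
    print("learned weights:", w)
    print("majority MSE:", major_mse)
    print("minority MSE:", minor_mse)
```

In this sketch the curvature of the pooled loss along the minority-specific direction scales with the minority fraction times the feature variance, so that weight needs many more steps to converge. Increasing `steps` by a few orders of magnitude should close the gap, which loosely mirrors the "additional training required" in the abstract.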
Taxonomy
Topics: Neural Networks and Applications
