Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective
Firas Laakom, Haobo Chen, Jürgen Schmidhuber, Yuheng Bu

TL;DR
This paper introduces an information-theoretic framework to analyze and bound fairness overfitting in machine learning, providing theoretical guarantees and empirical validation for fairness generalization.
Contribution
It develops a novel bounding technique based on the Efron-Stein inequality to derive tight fairness generalization bounds in terms of Mutual Information (MI) and Conditional Mutual Information (CMI), advancing the understanding of fairness overfitting.
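For reference, the standard textbook statement of the Efron-Stein inequality is given below. This is the general form of the inequality, not the paper's specific derivation; the symbols $f$, $X_i$, $X_i'$, $Z$, and $Z^{(i)}$ are notation chosen here for illustration and may differ from the paper's.

```latex
% Efron-Stein inequality (standard form).
% Assumptions: X_1, ..., X_n are independent random variables,
% X_1', ..., X_n' are independent copies of them, and f is a
% measurable function such that Z has finite variance.
\[
  Z = f(X_1, \dots, X_n), \qquad
  Z^{(i)} = f(X_1, \dots, X_{i-1}, X_i', X_{i+1}, \dots, X_n),
\]
\[
  \operatorname{Var}(Z) \;\le\; \frac{1}{2} \sum_{i=1}^{n}
  \mathbb{E}\!\left[ \bigl( Z - Z^{(i)} \bigr)^{2} \right].
\]
```

Intuitively, the variance of a function of independent inputs is controlled by how much the output changes when a single input is resampled, which suggests why such an inequality is a natural tool for quantities that depend on the whole training sample.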
Findings
The derived fairness generalization bounds are tight and practically relevant across various learning algorithms.
The framework offers insights for designing algorithms with better fairness generalization.
Empirical results validate the theoretical bounds.
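To make the quantity being bounded more concrete, the following is a minimal sketch of how a fairness generalization gap could be measured empirically. It assumes a binary sensitive attribute and uses the demographic parity gap as the fairness loss; this summary does not say which fairness loss the paper uses, so that choice, along with the helper names `positive_rate`, `demographic_parity_gap`, and `fairness_generalization_gap`, is hypothetical.

```python
def positive_rate(preds, groups, g):
    """Fraction of positive (1) predictions among samples in group g."""
    selected = [p for p, a in zip(preds, groups) if a == g]
    if not selected:
        raise ValueError(f"no samples for group {g}")
    return sum(selected) / len(selected)


def demographic_parity_gap(preds, groups):
    """Assumed fairness loss: absolute difference in positive-prediction
    rates between group 0 and group 1."""
    return abs(positive_rate(preds, groups, 0) - positive_rate(preds, groups, 1))


def fairness_generalization_gap(train_preds, train_groups, test_preds, test_groups):
    """Held-out fairness loss minus training fairness loss.
    A large positive value indicates fairness overfitting."""
    return (demographic_parity_gap(test_preds, test_groups)
            - demographic_parity_gap(train_preds, train_groups))


# Toy data (fabricated for illustration only, not from the paper).
train_preds, train_groups = [1, 0, 1, 0], [0, 0, 1, 1]  # both groups: rate 0.5
test_preds, test_groups = [1, 1, 1, 0], [0, 0, 1, 1]    # rates 1.0 vs 0.5

gap = fairness_generalization_gap(train_preds, train_groups,
                                  test_preds, test_groups)
print(gap)
```

In this toy case the model looks perfectly fair on the training set but shows a disparity on held-out data, which is the kind of discrepancy that a fairness generalization bound aims to control.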
Abstract
Despite substantial progress in promoting fairness in high-stakes applications using machine learning models, existing methods often modify the training process, such as through regularizers or other interventions, but lack formal guarantees that fairness achieved during training will generalize to unseen data. Although overfitting with respect to prediction performance has been extensively studied, overfitting in terms of fairness loss has received far less attention. This paper proposes a theoretical framework for analyzing fairness generalization error through an information-theoretic lens. Our novel bounding technique is based on the Efron-Stein inequality, which allows us to derive tight information-theoretic fairness generalization bounds with both Mutual Information (MI) and Conditional Mutual Information (CMI). Our empirical results validate the tightness and practical relevance of…
Taxonomy
Topics: Ethics and Social Impacts of AI · Mobile Crowdsensing and Crowdsourcing · Adversarial Robustness in Machine Learning
