Fairness Sample Complexity and the Case for Human Intervention
Ananth Balashankar, Alyssa Lees

TL;DR
This paper investigates the sample complexity needed for fair classification across subgroups, emphasizing the importance of human intervention when subgroup data is insufficient for equitable machine learning.
Contribution
It introduces lower bounds on subgroup sample complexity for metric-fair learning and advocates for human intervention when data scarcity prevents fair classification.
Findings
Subgroup sample size critically affects fairness in ML models.
Aligning model dimensionality with subgroup distributions improves fairness.
Human intervention can help achieve fairness when data is limited.
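The link between subgroup size and the decision to escalate to a human can be illustrated with a minimal sketch. It does not use the paper's metric-fair bounds; it assumes a generic two-sided Hoeffding bound as a stand-in, and the names `hoeffding_min_samples` and `flag_small_subgroups`, the record layout, and the default tolerances are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def hoeffding_min_samples(epsilon, delta):
    """Smallest n with 2 * exp(-2 * n * epsilon**2) <= delta.

    With n samples, an empirical rate (for example subgroup accuracy) is
    within epsilon of its true value with probability at least 1 - delta.
    This is a generic concentration bound, not the bound from the paper.
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def flag_small_subgroups(records, sensitive_keys, epsilon=0.05, delta=0.05):
    """Return intersectional subgroups with too few samples, plus the threshold.

    records: iterable of dicts (hypothetical layout), one per example.
    sensitive_keys: names of the sensitive attributes to intersect.
    """
    needed = hoeffding_min_samples(epsilon, delta)
    # Each distinct combination of sensitive values is one subgroup.
    counts = Counter(tuple(r[k] for k in sensitive_keys) for r in records)
    flagged = {group: n for group, n in counts.items() if n < needed}
    return flagged, needed


if __name__ == "__main__":
    data = (
        [{"gender": "a", "region": "x"}] * 1000
        + [{"gender": "b", "region": "y"}] * 10
    )
    flagged, needed = flag_small_subgroups(data, ["gender", "region"], epsilon=0.1)
    print(needed, flagged)
```

Under these assumptions, any subgroup returned in `flagged` lacks enough data for its estimated metric to be trusted at the chosen tolerance, which is the kind of case where routing predictions to a human reviewer might be considered. Because each added sensitive attribute multiplies the number of intersectional subgroups, the count per subgroup tends to shrink quickly.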
Abstract
With the aim of building machine learning systems that incorporate standards of fairness and accountability, we explore explicit subgroup sample complexity bounds. The work is motivated by the observation that classifier predictions for real-world datasets often demonstrate drastically different metrics, such as accuracy, when subdivided by specific sensitive variable subgroups. The reasons for these discrepancies are varied and include, but are not limited to, the influence of mitigating variables, institutional bias, underlying population distributions, and sampling bias. Among the numerous definitions of fairness that exist, we argue that, at a minimum, principled ML practices should ensure that classification predictions are able to mirror the underlying sub-population distributions. However, as the number of sensitive variables increases, populations meeting at the intersectionality of these…
Taxonomy
Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
