Size-adaptive Hypothesis Testing for Fairness
Antonio Ferrara, Francesco Cozzi, Alan Perotti, André Panisson, Francesco Bonchi

TL;DR
This paper introduces a size-adaptive hypothesis testing framework for fairness that provides statistically rigorous decisions for both large and small demographic groups, addressing issues of sampling error and intersectionality.
Contribution
The paper develops a unified approach that combines analytic confidence intervals for large groups with Bayesian credible intervals for small groups, making fairness assessments more reliable.
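To make the size-adaptive idea concrete, here is a minimal sketch and not the authors' implementation. It assumes the fairness metric is the difference in positive-outcome rates between a subgroup and a reference group. It uses a normal-approximation (Wald) interval when both groups are large, and a Monte Carlo credible interval under uniform Beta(1, 1) priors otherwise. The function names, the `min_n=30` switch point, the `tolerance=0.1` threshold, and the three-way verdict are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def wald_interval(pos_a, n_a, pos_b, n_b, level=0.95):
    """Normal-approximation interval for the rate difference p_a - p_b.

    Assumes n_a > 0 and n_b > 0.
    """
    p_a, p_b = pos_a / n_a, pos_b / n_b
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff - z * se, diff + z * se


def beta_credible_interval(pos_a, n_a, pos_b, n_b, level=0.95,
                           draws=20000, seed=0):
    """Equal-tailed credible interval for p_a - p_b, estimated by sampling.

    Assumption: independent Beta(1, 1) priors on both rates.
    """
    rng = random.Random(seed)
    diffs = sorted(
        rng.betavariate(1 + pos_a, 1 + n_a - pos_a)
        - rng.betavariate(1 + pos_b, 1 + n_b - pos_b)
        for _ in range(draws)
    )
    lo = diffs[int((1 - level) / 2 * draws)]
    hi = diffs[int((1 + level) / 2 * draws) - 1]
    return lo, hi


def fairness_decision(pos_a, n_a, pos_b, n_b, tolerance=0.1, min_n=30):
    """Pick the interval by group size, then compare it to the tolerance band."""
    small = min(n_a, n_b) < min_n  # hypothetical switch point
    interval_fn = beta_credible_interval if small else wald_interval
    lo, hi = interval_fn(pos_a, n_a, pos_b, n_b)
    if -tolerance <= lo and hi <= tolerance:
        verdict = "fair"          # whole interval inside the band
    elif lo > tolerance or hi < -tolerance:
        verdict = "unfair"        # whole interval outside the band
    else:
        verdict = "inconclusive"  # interval straddles a band edge
    return ("bayesian" if small else "analytic"), (lo, hi), verdict


if __name__ == "__main__":
    print(fairness_decision(700, 1000, 400, 1000))  # two large groups
    print(fairness_decision(3, 8, 500, 1000))       # small subgroup
```

In this sketch a subgroup of eight people yields an interval too wide to support either verdict, so the result is reported as inconclusive rather than forced into a pass or fail from a single point estimate.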
Findings
Provides guaranteed error control for large subgroups.
Calibrates credible intervals for small intersectional groups.
Demonstrates effectiveness on benchmark datasets.
Abstract
Determining whether an algorithmic decision-making system discriminates against a specific demographic typically involves comparing a single point estimate of a fairness metric against a predefined threshold. This practice is statistically brittle: it ignores sampling error and treats small demographic subgroups the same as large ones. The problem intensifies in intersectional analyses, where multiple sensitive attributes are considered jointly, giving rise to a larger number of smaller groups. As these groups become more granular, the data representing them becomes too sparse for reliable estimation, and fairness metrics yield excessively wide confidence intervals, precluding meaningful conclusions about potential unfair treatments. In this paper, we introduce a unified, size-adaptive, hypothesis-testing framework that turns fairness assessment into an evidence-based statistical…
Taxonomy
Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Mobile Crowdsensing and Crowdsourcing
