Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness
Stephen R. Pfohl, Natalie Harris, Chirag Nagpal, David Madras, Vishwali Mhasawade, Olawale Salaudeen, Awa Dieng, Shannon Sequeira, Santiago Arciniegas, Lillian Sung, Nnamdi Ezeanochie, Heather Cole-Lewis, Katherine Heller, Sanmi Koyejo, Alexander D'Amour

TL;DR
Disaggregated evaluation is essential for fairness assessment, but its results can mislead practitioners unless they account for data representativeness, causal assumptions, and selection bias.
Contribution
The paper uses causal graphical models to analyze the limitations of disaggregated evaluation under different data generating processes, and suggests complementing such evaluations with explicit causal assumptions and analysis.
Findings
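Equal performance across subgroups is an unreliable measure of fairness when data are representative but reflect real-world disparities.
Under selection bias, both disaggregated evaluation and conditional independence testing may be invalid without explicit assumptions about the bias mechanism.
Explicit causal assumptions and analysis help control for confounding and distribution shift.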
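To make the first finding concrete, here is a minimal sketch of a disaggregated evaluation on toy data. The function name `disaggregate`, the metrics reported, and the data are illustrative assumptions, not the paper's code or experiments. The two groups have identical accuracy but different base rates and true positive rates, so a single equal metric says little on its own.

```python
from collections import defaultdict

def disaggregate(y_true, y_pred, groups):
    """Return per-subgroup size, accuracy, base rate, and true positive rate."""
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "pos": 0, "tp": 0})
    for y, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["correct"] += int(y == p)
        c["pos"] += int(y == 1)
        c["tp"] += int(y == 1 and p == 1)
    report = {}
    for g, c in counts.items():
        report[g] = {
            "n": c["n"],
            "accuracy": c["correct"] / c["n"],
            "base_rate": c["pos"] / c["n"],
            # TPR is undefined when a group has no positive labels.
            "tpr": c["tp"] / c["pos"] if c["pos"] else float("nan"),
        }
    return report

# Hypothetical toy data: group "a" has a higher base rate than group "b".
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

report = disaggregate(y_true, y_pred, groups)
for g, metrics in sorted(report.items()):
    print(g, metrics)
```

Reporting the base rate next to each metric is a deliberate choice here: it exposes the population difference that an accuracy comparison alone would hide.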
Abstract
Disaggregated evaluation across subgroups is critical for assessing the fairness of machine learning models, but its uncritical use can mislead practitioners. We show that equal performance across subgroups is an unreliable measure of fairness when data are representative of the relevant populations but reflective of real-world disparities. Furthermore, when data are not representative due to selection bias, both disaggregated evaluation and alternative approaches based on conditional independence testing may be invalid without explicit assumptions regarding the bias mechanism. We use causal graphical models to characterize fairness properties and metric stability across subgroups under different data generating processes. Our framework suggests complementing disaggregated evaluations with explicit causal assumptions and analysis to control for confounding and distribution shift,…
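The selection bias point in the abstract can be illustrated with a small deterministic sketch. Everything here is a hypothetical construction for illustration (the population, the selection rule `selected`, and the helper names), not the paper's setup. In the full population the model has the same accuracy in both groups; the evaluation sample drops negative-label records from group "b" only, and the per-group accuracies then diverge.

```python
def accuracy_by_group(records):
    """Compute accuracy per group from (group, label, prediction) tuples."""
    totals, correct = {}, {}
    for group, label, pred in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + int(label == pred)
    return {g: correct[g] / totals[g] for g in totals}

def make_group(name):
    """Build 10 records: 5 positives all predicted correctly,
    5 negatives of which 3 are predicted correctly."""
    positives = [(name, 1, 1)] * 5
    negatives = [(name, 0, 0)] * 3 + [(name, 0, 1)] * 2
    return positives + negatives

# Assumed population: the model behaves identically in both groups.
population = make_group("a") + make_group("b")

def selected(record):
    """Assumed selection mechanism: depends on group and label only.
    Negative-label records from group "b" never enter the sample."""
    group, label, _ = record
    return not (group == "b" and label == 0)

sample = [r for r in population if selected(r)]

population_acc = accuracy_by_group(population)
sample_acc = accuracy_by_group(sample)
print("population:", population_acc)
print("sample:    ", sample_acc)
```

In this construction the gap in the sample comes entirely from the selection rule, which is why interpreting it requires an explicit assumption about how records entered the data.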
