Assessing Algorithmic Fairness with Unobserved Protected Class Using Data Combination
Nathan Kallus, Xiaojie Mao, Angela Zhou

TL;DR
This paper develops methods to assess algorithmic fairness when protected class membership is unobserved, combining the primary data with an auxiliary dataset and using optimization to characterize the range of disparities consistent with the data, with applications in lending and healthcare.
Contribution
The paper introduces a framework that uses proxy variables and optimization to bound the true disparities when protected class membership is not observed in the data being assessed.
Findings
Common disparity measures are often not identifiable from proxy data: more than one value of the disparity can be consistent with the same observed data.
The paper provides algorithms to compute and visualize bounds on disparities.
Case studies demonstrate practical application in lending and medicine.
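The idea of bounding a disparity rather than estimating a single value can be sketched with the classical Fréchet-Hoeffding inequalities. This is a minimal illustration, not the authors' implementation: it assumes a binary decision, a discrete proxy Z, and that the primary data gives P(decision = 1 | Z) while the auxiliary data gives P(class = a | Z). The function name and the example numbers are hypothetical.

```python
from typing import Sequence, Tuple


def frechet_bounds_group_rate(
    p_z: Sequence[float],
    p_yhat_given_z: Sequence[float],
    p_a_given_z: Sequence[float],
) -> Tuple[float, float]:
    """Bound P(decision = 1 | class = a) when class is never observed
    jointly with the decision, only through a shared proxy Z.

    p_z[i]            : P(Z = z_i)
    p_yhat_given_z[i] : P(decision = 1 | Z = z_i), from the primary data
    p_a_given_z[i]    : P(class = a | Z = z_i), from the auxiliary data
    """
    # Marginal share of the group, P(class = a).
    p_a = sum(pz * pa for pz, pa in zip(p_z, p_a_given_z))
    if p_a <= 0:
        raise ValueError("group has zero probability under the given inputs")

    lower = 0.0
    upper = 0.0
    for pz, py, pa in zip(p_z, p_yhat_given_z, p_a_given_z):
        # Frechet-Hoeffding bounds on P(decision = 1, class = a | Z = z).
        lower += pz * max(0.0, py + pa - 1.0)
        upper += pz * min(py, pa)

    # Dividing the joint probability by P(class = a) gives the group rate.
    return lower / p_a, upper / p_a


# Hypothetical two-valued proxy (for example, two neighborhoods).
lo, hi = frechet_bounds_group_rate(
    p_z=[0.5, 0.5],
    p_yhat_given_z=[0.8, 0.4],
    p_a_given_z=[0.9, 0.2],
)
print(round(lo, 3), round(hi, 3))  # roughly 0.636 0.909
```

In this example the group's approval rate is only known to lie in an interval, which is the sense in which the measure is not identified. The interval collapses to a point when the proxy determines the class exactly, that is, when every P(class = a | Z) is 0 or 1.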
Abstract
The increasing impact of algorithmic decisions on people's lives compels us to scrutinize their fairness and, in particular, the disparate impacts that ostensibly-color-blind algorithms can have on different groups. Examples include credit decisioning, hiring, advertising, criminal justice, personalized medicine, and targeted policymaking, where in some cases legislative or regulatory frameworks for fairness exist and define specific protected classes. In this paper we study a fundamental challenge to assessing disparate impacts in practice: protected class membership is often not observed in the data. This is particularly a problem in lending and healthcare. We consider the use of an auxiliary dataset, such as the US census, to construct models that predict the protected class from proxy variables, such as surname and geolocation. We show that even with such data, a variety of common…
Taxonomy
Topics: Insurance, Mortality, Demography, Risk Management · Health Systems, Economic Evaluations, Quality of Life · Advanced Causal Inference Techniques
