Observing Context Improves Disparity Estimation when Race is Unobserved
Kweku Kwegyir-Aggrey, Naveen Durvasula, Jennifer Wang, Suresh Venkatasubramanian

TL;DR
This paper introduces two new contextual proxy models that incorporate additional features to improve race estimates, enabling less biased disparity analysis when race is not directly observed.
Contribution
The paper presents novel contextual proxy models that enhance race estimation by integrating contextual features, addressing bias issues in existing proxy methods.
Findings
Significant performance improvements in disparity estimation on real-world home loan and voter data.
Contextual proxies achieve more accurate race estimates than traditional proxies.
Unbiased disparity estimates depend on a mean-consistency calibration condition.
Abstract
In many domains, it is difficult to obtain the race data that is required to estimate racial disparity. To address this problem, practitioners have adopted the use of proxy methods which predict race using non-protected covariates. However, these proxies often yield biased estimates, especially for minority groups, limiting their real-world utility. In this paper, we introduce two new contextual proxy models that advance existing methods by incorporating contextual features in order to improve race estimates. We show that these algorithms demonstrate significant performance improvements in estimating disparities on real-world home loan and voter data. We establish that achieving unbiased disparity estimates with contextual proxies relies on mean-consistency, a calibration-like condition.
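To make the abstract's setting concrete, the sketch below shows one common way a probabilistic race proxy can be turned into a disparity estimate: each individual's outcome is weighted by the proxy's estimated group-membership probability. This is an illustrative assumption, not the paper's contextual proxy models and not its definition of mean-consistency; the function names and the toy data are hypothetical.

```python
import numpy as np


def weighted_disparity(outcomes, proxy_probs):
    """Proxy-weighted difference in mean outcome between group 1 and group 0.

    proxy_probs[i] is the proxy's estimated probability that individual i
    belongs to group 1 (an assumed binary-group simplification).
    """
    y = np.asarray(outcomes, dtype=float)
    p = np.asarray(proxy_probs, dtype=float)
    rate_g1 = np.sum(p * y) / np.sum(p)            # weighted outcome rate, group 1
    rate_g0 = np.sum((1 - p) * y) / np.sum(1 - p)  # weighted outcome rate, group 0
    return rate_g1 - rate_g0


def true_disparity(outcomes, groups):
    """Difference in mean outcome when group labels are actually observed."""
    y = np.asarray(outcomes, dtype=float)
    g = np.asarray(groups, dtype=bool)
    return y[g].mean() - y[~g].mean()


# Hypothetical toy data: six individuals, binary outcome, binary group label.
outcomes = [1, 1, 0, 1, 0, 0]
groups = [1, 1, 1, 0, 0, 0]

perfect_proxy = [float(g) for g in groups]   # proxy that recovers every label
uninformative_proxy = [0.5] * len(groups)    # proxy that carries no signal

print(true_disparity(outcomes, groups))
print(weighted_disparity(outcomes, perfect_proxy))
print(weighted_disparity(outcomes, uninformative_proxy))
```

In this toy setup a proxy that recovers every label reproduces the true disparity, while an uninformative proxy pulls the estimate to zero, which illustrates why the quality of the proxy's probabilities matters for the resulting estimate.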
Taxonomy
Topics: Machine Learning and Data Classification · Data-Driven Disease Surveillance · Racial and Ethnic Identity Research
