TL;DR
This paper examines how imputation strategies for missing healthcare data affect algorithmic fairness. It shows that current practices can worsen disparities between groups and proposes a framework for choosing imputation strategies more equitably.
Contribution
It critically analyzes existing imputation methods in healthcare, highlighting their limitations for fairness, and introduces a new empirical framework to guide fair imputation strategy selection.
Findings
Current imputation practices lack solid theoretical foundations.
Group-specific imputation can worsen prediction disparities.
The proposed framework helps select imputation strategies that balance accuracy and fairness.
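To make the contrast between population-level and group-specific imputation concrete, here is a minimal sketch in plain Python. It is not the paper's framework or code. The function names, the two groups "A" and "B", and every number are hypothetical, chosen only to show the mechanics of comparing imputation strategies through per-group error. The toy data assumes that the missing value in group B is atypical of the observed values in that group, which is one way group-specific imputation could hurt a subgroup.

```python
from statistics import mean


def population_mean_impute(values):
    """Fill each missing entry (None) with the mean of all observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def group_mean_impute(values, groups):
    """Fill each missing entry (None) with the observed mean of its own group."""
    fills = {}
    for g in sorted(set(groups)):
        fills[g] = mean(
            v for v, gg in zip(values, groups) if gg == g and v is not None
        )
    return [fills[g] if v is None else v for v, g in zip(values, groups)]


def per_group_error(imputed, truth, groups, original):
    """Mean absolute imputation error per group, over originally missing entries."""
    errors = {}
    for g in sorted(set(groups)):
        errs = [
            abs(i - t)
            for i, t, gg, o in zip(imputed, truth, groups, original)
            if gg == g and o is None
        ]
        errors[g] = mean(errs)
    return errors


# Hypothetical toy data (NOT from the paper): one feature, two groups.
values = [1.0, 2.0, None, 3.0, 10.0, None, 14.0]
groups = ["A", "A", "A", "A", "B", "B", "B"]
# Hypothetical ground truth; the missing value in group B is unlike
# the observed group B values (missingness related to the value itself).
truth = [1.0, 2.0, 2.0, 3.0, 10.0, 4.0, 14.0]

pop_err = per_group_error(population_mean_impute(values), truth, groups, values)
grp_err = per_group_error(group_mean_impute(values, groups), truth, groups, values)

print("population mean imputation:", pop_err)
print("group-specific imputation: ", grp_err)
print("error gap (population):", abs(pop_err["A"] - pop_err["B"]))
print("error gap (group-specific):", abs(grp_err["A"] - grp_err["B"]))
```

In this constructed example the gap in imputation error between the two groups is larger under group-specific imputation than under population mean imputation. That outcome follows entirely from the made-up numbers and does not reproduce any result from the paper. The point is only that strategies should be compared on per-group metrics rather than on a single aggregate score.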
Abstract
Machine learning risks reinforcing biases present in data and, as we argue in this work, in what is absent from data. In healthcare, societal and decision biases shape patterns in missing data, yet the algorithmic fairness implications of group-specific missingness are poorly understood. The way we address missingness in healthcare can have detrimental impacts on downstream algorithmic fairness. Our work questions current recommendations and practices aimed at handling missing data with a focus on their effect on algorithmic fairness, and offers a path forward. Specifically, we consider the theoretical underpinnings of existing recommendations as well as their empirical predictive performance and corresponding algorithmic fairness measured through subgroup performances. Our results show that current practices for handling missingness lack principled foundations, are disconnected from…
