TL;DR
This empirical study examines how penalizing group fairness violations in clinical risk prediction models affects both model performance and fairness measures. It finds widespread performance degradation and argues that fairness metrics alone are insufficient without an understanding of the clinical and social context.
Contribution
The paper provides a comprehensive empirical analysis of fairness interventions in healthcare ML models, highlighting their limitations and advocating for broader sociotechnical considerations.
Findings
Fairness penalties often degrade model performance across multiple metrics.
Effects on fairness measures vary across clinical outcomes and sensitive attributes.
Contextual and causal understanding is crucial for fair healthcare ML applications.
Abstract
The use of machine learning to guide clinical decision making has the potential to worsen existing health disparities. Several recent works frame the problem as that of algorithmic fairness, a framework that has attracted considerable attention and criticism. However, the appropriateness of this framework is unclear due to both ethical and technical considerations, the latter of which include trade-offs between measures of fairness and model performance that are not well understood for predictive models of clinical outcomes. To inform the ongoing debate, we conduct an empirical study to characterize the impact of penalizing group fairness violations on an array of measures of model performance and group fairness. We repeat the analyses across multiple observational healthcare databases, clinical outcomes, and sensitive attributes. We find that procedures that penalize differences…
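To make the idea of "penalizing group fairness violations" concrete, the following is a minimal illustrative sketch, not the paper's actual procedure. It assumes a logistic model, a binary sensitive attribute, and a squared demographic-parity gap as the penalty; the function name `penalized_loss` and the weight `lam` are hypothetical choices made for this example only.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping scores to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

def penalized_loss(w, X, y, group, lam):
    """Illustrative objective (an assumption, not the paper's formulation):
    mean logistic loss plus lam times the squared gap in mean predicted
    risk between two groups (a demographic-parity style penalty).

    Returns (total, bce, gap) so the two terms can be inspected separately.
    """
    p = sigmoid(X @ w)
    eps = 1e-12  # guards against log(0)
    bce = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    gap = p[group == 1].mean() - p[group == 0].mean()
    return bce + lam * gap ** 2, bce, gap

# Hypothetical toy data: 4 patients, 2 features, binary sensitive attribute.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([1, 0, 1, 0])
group = np.array([0, 0, 1, 1])
w = np.array([2.0, -1.0])

for lam in (0.0, 1.0, 10.0):
    total, bce, gap = penalized_loss(w, X, y, group, lam)
    print(f"lam={lam}: total={total:.4f} bce={bce:.4f} gap={gap:.4f}")
```

Raising `lam` places more weight on closing the between-group gap relative to fitting the outcome, which is the kind of trade-off between fairness measures and model performance that the study characterizes empirically.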
