Worst-Group Equalized Odds Regularization for Multi-Attribute Fair Medical Image Classification
Nikhil Cherian Kurian, Victor Caquilpan Parra, Abin Shoby, Luke Whitbread, Lauren Oakden-Rayner, Robert Vandersluis, Jessica Schrouff, Lyle J. Palmer, Mark Jenkinson

TL;DR
This paper introduces a training-time regularizer that improves fairness in medical image classification by reducing subgroup disparities in true and false positive rates at a fixed inference-time operating point.
Contribution
It proposes a worst-group equalized-odds margin regularizer that targets subgroup-level fairness without sacrificing overall diagnostic accuracy.
Findings
Reduces disparities in Equalized Odds and Equal Opportunity across demographic groups.
Maintains high diagnostic AUC while improving fairness.
Effective across multiple medical imaging datasets with multi-label settings.
Abstract
Diagnostic performance in medical AI varies systematically across demographic groups, yet subgroup AUC can mask clinically important disparities. At a fixed inference-time operating point, some groups may exhibit over-diagnostic behaviour, characterized by elevated true and false positive rates, while others show under-diagnostic patterns with reduced true and false positive rates. These opposing tendencies can cancel in aggregate AUCs while producing meaningful inequities in clinical decision-making. Motivated by the need to assess and mitigate such disparities at the operating point and across multiple demographic attributes simultaneously, we propose a worst-group equalized-odds margin regularizer. The proposed regularizer explicitly targets subgroup-level deviations on both the true positive and false positive sides at inference. At each update, the method identifies subgroups…
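To make the operating-point view concrete, the following is a minimal sketch of how per-subgroup true and false positive rate deviations could be measured at a fixed threshold. It is an illustrative evaluation helper, not the authors' regularizer: the function names `group_rates` and `worst_group_eo_gap`, the 0.5 threshold, the use of the pooled population as the reference, and the toy data are all assumptions made for this example. It also assumes every subgroup contains at least one positive and one negative case.

```python
import numpy as np

def group_rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels and binary predictions."""
    pos = y_true == 1
    neg = y_true == 0
    return y_pred[pos].mean(), y_pred[neg].mean()

def worst_group_eo_gap(y_true, scores, groups, threshold=0.5):
    """Signed (TPR, FPR) deviation of each subgroup from the pooled rates.

    Hypothetical helper: the pooled population is used as the reference,
    which is an assumption of this sketch, not a detail from the paper.
    """
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(scores) >= threshold).astype(float)
    groups = np.asarray(groups)

    ref_tpr, ref_fpr = group_rates(y_true, y_pred)
    gaps = {}
    for g in np.unique(groups):
        mask = groups == g
        tpr, fpr = group_rates(y_true[mask], y_pred[mask])
        gaps[g.item()] = (tpr - ref_tpr, fpr - ref_fpr)

    # The worst group has the largest absolute deviation on either side.
    worst = max(gaps, key=lambda k: max(abs(gaps[k][0]), abs(gaps[k][1])))
    return gaps, worst

# Toy data (made up): group A is over-diagnosed, group B under-diagnosed.
y_true = [1, 1, 0, 0] * 3
scores = [0.9, 0.8, 0.7, 0.2,   # A: extra false positive
          0.4, 0.3, 0.2, 0.1,   # B: both positives missed
          0.9, 0.6, 0.3, 0.1]   # C: all cases classified correctly
groups = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

gaps, worst = worst_group_eo_gap(y_true, scores, groups)
for name, (d_tpr, d_fpr) in gaps.items():
    print(f"{name}: dTPR={d_tpr:+.3f} dFPR={d_fpr:+.3f}")
print("worst group:", worst)
```

Positive deviations on both sides indicate over-diagnostic behaviour and negative deviations indicate under-diagnostic behaviour, which is the pattern the abstract notes can cancel out in aggregate AUC.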
