TL;DR
This paper introduces Fairboard, a framework and dashboard for assessing and monitoring equity in medical AI models. Applied to brain tumour segmentation, the evaluation reveals patient-specific biases and the clinical factors that affect performance.
Contribution
It evaluates the equity of 18 open-source brain tumour segmentation models along univariate, Bayesian multivariate, spatial, and representational dimensions, and releases an open-source tool for ongoing fairness assessment in healthcare AI.
Findings
Patient identity explains more performance variance than model choice.
Clinical factors (molecular diagnosis, tumour grade, extent of resection) predict segmentation accuracy more strongly than model architecture.
Newer models tend toward greater equity but lack formal fairness guarantees.
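To make the first finding concrete, here is a minimal sketch of what "patient identity explains more variance than model choice" means for a patients-by-models table of scores. It is not the paper's method (the abstract describes Bayesian multivariate analyses); it is a plain two-way sum-of-squares decomposition run on synthetic data. The function name `variance_shares` and the effect sizes (0.10, 0.02, 0.03) are assumptions chosen for illustration, with the patient effect deliberately made larger, so the output shows how the comparison is computed and is not evidence for the claim. Only the table shape (648 patients, 18 models) comes from the abstract.

```python
import random


def variance_shares(scores):
    """Split total variance of a patients-by-models score table into
    patient, model, and residual shares (balanced two-way layout)."""
    n_patients = len(scores)
    n_models = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_patients * n_models)

    # Mean score per patient (row) and per model (column).
    patient_means = [sum(row) / n_models for row in scores]
    model_means = [
        sum(row[j] for row in scores) / n_patients for j in range(n_models)
    ]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_patient = n_models * sum((m - grand) ** 2 for m in patient_means)
    ss_model = n_patients * sum((m - grand) ** 2 for m in model_means)
    ss_residual = ss_total - ss_patient - ss_model

    return {
        "patient": ss_patient / ss_total,
        "model": ss_model / ss_total,
        "residual": ss_residual / ss_total,
    }


# Synthetic scores only: effect sizes below are illustrative assumptions.
rng = random.Random(0)
n_patients, n_models = 648, 18  # table shape taken from the abstract
patient_effect = [rng.gauss(0.0, 0.10) for _ in range(n_patients)]
model_effect = [rng.gauss(0.0, 0.02) for _ in range(n_models)]
scores = [
    [0.8 + patient_effect[i] + model_effect[j] + rng.gauss(0.0, 0.03)
     for j in range(n_models)]
    for i in range(n_patients)
]

shares = variance_shares(scores)
for source, share in shares.items():
    print(f"{source}: {share:.3f}")
```

On real data, `scores[i][j]` would hold a per-inference metric such as the Dice coefficient for patient `i` and model `j`. A large patient share relative to the model share would indicate that which patient is being segmented matters more than which model is used.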
Abstract
Despite there now being more than 1,000 FDA-authorised AI medical devices, formal equity assessments -- whether model performance is uniform across patient subgroups -- are rare. Here, we evaluate the equity of 18 open-source brain tumour segmentation models across 648 glioma patients from two independent datasets (n = 11,664 model inferences) along distinct univariate, Bayesian multivariate, spatial, and representational dimensions. We find that patient identity consistently explains more performance variance than model choice, with clinical factors, including molecular diagnosis, tumour grade, and extent of resection, predicting segmentation accuracy more strongly than model architecture. A voxel-wise spatial meta-analysis identifies neuroanatomically localised biases that are compartment-specific yet often consistent across models. Within a high-dimensional latent space of lesion…
