Fairness Hub Technical Briefs: AUC Gap
Jinsook Lee, Chris Brooks, Renzhe Yu, Rene Kizilcec

TL;DR
This paper introduces AUC Gap, a fairness metric measuring disparities in model performance across subgroups, to help teams assess and improve fairness in educational AI models.
Contribution
It proposes a versatile, easy-to-compute fairness measure, AUC Gap, applicable across models and subgroups, facilitating benchmarking and bias mitigation strategies.
Findings
AUC Gap summarizes subgroup performance disparity as the absolute difference between the highest and lowest subgroup test AUC.
The measure is model-agnostic and suitable for intersectional groups.
It provides a common benchmark for fairness in educational AI.
Abstract
To measure bias, we encourage teams to consider using AUC Gap: the absolute difference between the highest and lowest test AUC for subgroups (e.g., gender, race, SES, prior knowledge). It is agnostic to the AI/ML algorithm used and it captures the disparity in model performance for any number of subgroups, which enables non-binary fairness assessments such as for intersectional identity groups. The teams use a wide range of AI/ML models in pursuit of a common goal of doubling math achievement in low-income middle schools. Ensuring that the models, which are trained on datasets collected in many different contexts, do not introduce or amplify biases is important for achieving the goal. We offer here a versatile and easy-to-compute measure of model bias for all the teams in order to create a common benchmark and an analytical basis for sharing what strategies have worked for different…
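The definition in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `roc_auc` and `auc_gap` and the toy data are assumptions made for illustration. AUC is computed here with the standard pairwise formulation (the fraction of positive/negative pairs ranked correctly, with ties counted as half), and the gap is the highest subgroup AUC minus the lowest.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_gap(labels, scores, groups):
    """Return (gap, per-group AUCs) for any number of subgroups.

    `groups` holds one hashable subgroup key per example; a tuple such as
    ("female", "low-SES") can stand in for an intersectional group.
    """
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    aucs = {g: roc_auc(ys, ss) for g, (ys, ss) in by_group.items()}
    return max(aucs.values()) - min(aucs.values()), aucs


# Toy test-set predictions (invented for illustration only).
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.3, 0.6, 0.7, 0.4, 0.2]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap, aucs = auc_gap(labels, scores, groups)
print(aucs)  # per-group AUC
print(gap)   # AUC Gap across groups
```

Because the function only needs labels, scores, and a group key per example, it does not depend on which model produced the scores. Each subgroup must contain both classes in the test set, otherwise its AUC is undefined and the sketch raises an error.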
Taxonomy
Topics: Human-Automation Interaction and Safety
