TL;DR
This paper analyzes the statistical properties of the ABROCA fairness metric, showing that its sampling distribution is highly skewed and that observed values therefore need careful interpretation in bias detection tasks.
Contribution
It provides a detailed examination of ABROCA's distributional behavior under various conditions, informing better use in fairness evaluations.
Findings
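ABROCA distributions are highly skewed, and the degree of skewness depends on sample size, AUC differences, and class imbalance.
This skewness can inflate ABROCA values by chance, signaling bias even when the groups are drawn from populations with equivalent ROC curves.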
Careful interpretation of ABROCA is necessary, especially with imbalanced classes.
Abstract
Algorithmic bias continues to be a key concern of learning analytics. We study the statistical properties of the Absolute Between-ROC Area (ABROCA) metric. This fairness measure quantifies group-level differences in classifier performance through the absolute difference in ROC curves. ABROCA is particularly useful for detecting nuanced performance differences even when overall Area Under the ROC Curve (AUC) values are similar. We sample ABROCA under various conditions, including varying AUC differences and class distributions. We find that ABROCA distributions exhibit high skewness dependent on sample sizes, AUC differences, and class imbalance. When assessing whether a classifier is biased, this skewness inflates ABROCA values by chance, even when data is drawn (by simulation) from populations with equivalent ROC curves. These findings suggest that ABROCA requires careful…
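To make the chance-inflation effect concrete, here is a minimal sketch of how ABROCA can be computed and sampled under a null of no bias. It is not the authors' simulation design. The function names (`roc_points`, `abroca`, `simulate_null_abroca`), the binormal score model (negatives from N(0, 1), positives from N(1, 1)), the 1001-point FPR grid, and the sample sizes are all assumptions made for illustration.

```python
import numpy as np


def roc_points(y_true, scores):
    """Empirical ROC curve as (fpr, tpr) arrays, starting at (0, 0).

    Assumes binary labels in {0, 1} with both classes present.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="mergesort")  # descending by score
    y_sorted = y_true[order]
    s_sorted = scores[order]
    tps = np.cumsum(y_sorted == 1)
    fps = np.cumsum(y_sorted == 0)
    # Keep one ROC point per distinct score so tied scores move together.
    last_of_tie = np.r_[np.nonzero(np.diff(s_sorted))[0], len(s_sorted) - 1]
    tpr = np.r_[0.0, tps[last_of_tie] / tps[-1]]
    fpr = np.r_[0.0, fps[last_of_tie] / fps[-1]]
    return fpr, tpr


def abroca(y_a, s_a, y_b, s_b, grid_size=1001):
    """Approximate area between the two groups' ROC curves.

    Both curves are linearly interpolated onto a shared FPR grid and the
    absolute TPR difference is integrated with the trapezoid rule.
    """
    grid = np.linspace(0.0, 1.0, grid_size)
    tpr_a = np.interp(grid, *roc_points(y_a, s_a))
    tpr_b = np.interp(grid, *roc_points(y_b, s_b))
    diff = np.abs(tpr_a - tpr_b)
    return float(np.sum((diff[1:] + diff[:-1]) / 2.0 * np.diff(grid)))


def simulate_null_abroca(n_per_group, pos_rate, n_reps, rng):
    """Sample ABROCA when both groups share one score model (no true bias).

    Hypothetical setup: labels are fixed at the requested positive rate and
    scores are drawn as N(label, 1), so the population ROC curves coincide.
    """
    n_pos = min(max(1, round(n_per_group * pos_rate)), n_per_group - 1)
    labels = np.r_[np.ones(n_pos, dtype=int),
                   np.zeros(n_per_group - n_pos, dtype=int)]
    values = np.empty(n_reps)
    for i in range(n_reps):
        s_a = rng.normal(loc=labels.astype(float), scale=1.0)
        s_b = rng.normal(loc=labels.astype(float), scale=1.0)
        values[i] = abroca(labels, s_a, labels, s_b)
    return values
```

Calling `simulate_null_abroca(50, 0.2, 200, np.random.default_rng(0))` returns 200 ABROCA values that are all non-negative, so their mean sits above zero even though the two groups are statistically identical. Repeating the call with a larger `n_per_group` or a different `pos_rate` gives a rough feel for how sample size and class imbalance shape the null distribution.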
