Privilege Scores
Ludwig Bothmann, Philip A. Boustani, Jose M. Alvarez, Giuseppe Casalicchio, Bernd Bischl, Susanne Dandl

TL;DR
This paper introduces privilege scores (PS) to quantify and interpret privilege related to a protected attribute (PA) in machine learning models, with the aim of supporting bias correction and policy formulation.
Contribution
It presents a formulation of privilege scores, methods for estimating them, and privilege score contributions (PSCs), an interpretation method that attributes privilege to mediating features and direct effects.
Findings
PS measure privilege at both the individual and the global level.
Methods reveal gender and racial privilege in real-world data.
Confidence intervals support the reliability of PS and PSC estimates.
Abstract
Bias-transforming methods of fairness-aware machine learning aim to correct a non-neutral status quo with respect to a protected attribute (PA). Current methods, however, lack an explicit formulation of what drives non-neutrality. We introduce privilege scores (PS) to measure PA-related privilege by comparing the model predictions in the real world with those in a fair world in which the influence of the PA is removed. At the individual level, PS can identify individuals who qualify for affirmative action; at the global level, PS can inform bias-transforming policies. After presenting estimation methods for PS, we propose privilege score contributions (PSCs), an interpretation method that attributes the origin of privilege to mediating features and direct effects. We provide confidence intervals for both PS and PSCs. Experiments on simulated and real-world data demonstrate the broad…
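The comparison described in the abstract can be illustrated with a minimal sketch. It assumes that a privilege score is the real-world prediction minus the fair-world prediction (the sign convention is an assumption, not taken from the paper), and that fair-world predictions are already available; the numbers below are made-up placeholders. The function names `privilege_scores` and `mean_with_ci` are hypothetical, and the interval is a plain normal approximation for the mean, not necessarily the confidence interval construction the authors use.

```python
import math
from statistics import mean, stdev


def privilege_scores(real_preds, fair_preds):
    """Individual scores as real-world minus fair-world prediction.

    The sign convention (positive = privileged) is an assumption.
    """
    if len(real_preds) != len(fair_preds):
        raise ValueError("prediction lists must have the same length")
    return [r - f for r, f in zip(real_preds, fair_preds)]


def mean_with_ci(values, z=1.96):
    """Mean with a normal-approximation interval (illustrative only)."""
    m = mean(values)
    half_width = z * stdev(values) / math.sqrt(len(values))
    return m, (m - half_width, m + half_width)


# Placeholder predictions for four individuals (not data from the paper).
real = [0.80, 0.65, 0.30, 0.55]
fair = [0.70, 0.65, 0.45, 0.50]

ps = privilege_scores(real, fair)      # individual-level scores
global_ps, ci = mean_with_ci(ps)       # global-level summary with interval
```

Under this convention, a positive individual score marks someone whose real-world prediction exceeds the fair-world one, and a negative score marks the opposite; the mean summarizes the sample as a whole.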
Taxonomy
Topics: Health and Medical Studies
