fAux: Testing Individual Fairness via Gradient Alignment
Giuseppe Castiglione, Ga Wu, Christopher Srinivasa, Simon Prince

TL;DR
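fAux is a new method for testing individual fairness in machine learning models. It compares the derivatives of the tested model's predictions with those of an auxiliary model that predicts the protected variable, so it needs neither domain knowledge for choosing metrics nor input transformations that risk generating out-of-domain samples.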
Contribution
It introduces a novel gradient alignment criterion and a practical testing approach for individual fairness that does not require domain-specific metrics or input transformations.
Findings
Effectively detects discrimination in synthetic datasets
Performs well on real-world datasets
Offers advantages over existing fairness testing methods
Abstract
Machine learning models are vulnerable to biases that result in unfair treatment of individuals from different populations. Recent work that aims to test a model's fairness at the individual level either relies on domain knowledge to choose metrics, or on input transformations that risk generating out-of-domain samples. We describe a new approach for testing individual fairness that does not have either requirement. We propose a novel criterion for evaluating individual fairness and develop a practical testing method based on this criterion which we call fAux (pronounced fox). This is based on comparing the derivatives of the predictions of the model to be tested with those of an auxiliary model, which predicts the protected variable from the observed data. We show that the proposed method effectively identifies discrimination on both synthetic and real-world datasets, and has…
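To make the derivative comparison in the abstract concrete, here is a minimal sketch. It is not the authors' criterion, which this page does not state. It assumes that both the tested model and the auxiliary model are logistic regressions with known weights, and it uses the absolute cosine similarity of their input gradients as an illustrative alignment score. The names `logistic_input_gradient`, `gradient_alignment`, and all weights and inputs are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_input_gradient(w, b, x):
    """Gradient of sigmoid(w @ x + b) with respect to the input x."""
    p = sigmoid(w @ x + b)
    return p * (1.0 - p) * w


def gradient_alignment(grad_target, grad_aux, eps=1e-12):
    """Absolute cosine similarity between two input gradients.

    Assumption: this score is chosen for illustration only; it is not
    taken from the paper.
    """
    denom = np.linalg.norm(grad_target) * np.linalg.norm(grad_aux) + eps
    return abs(grad_target @ grad_aux) / denom


# Hypothetical toy setup with three observed features.
# The auxiliary model predicts the protected variable using feature 0 only.
w_aux, b_aux = np.array([2.0, 0.0, 0.0]), 0.0

# One tested model leans heavily on feature 0; the other ignores it.
w_biased, b_biased = np.array([1.5, 0.5, 0.0]), 0.0
w_fair, b_fair = np.array([0.0, 0.5, 1.0]), 0.0

x = np.array([0.2, -0.1, 0.4])  # a single in-domain test point

g_aux = logistic_input_gradient(w_aux, b_aux, x)
score_biased = gradient_alignment(logistic_input_gradient(w_biased, b_biased, x), g_aux)
score_fair = gradient_alignment(logistic_input_gradient(w_fair, b_fair, x), g_aux)

print(f"biased model alignment: {score_biased:.3f}")
print(f"fair model alignment:   {score_fair:.3f}")
```

In this toy setup, the model that relies on the proxy feature gets a high alignment score and the model that ignores it gets a score near zero. Both gradients are evaluated at the original input, so no transformed or out-of-domain sample is needed.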
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI
Methods: Test
