Counterfactual Reasoning for Bias Evaluation and Detection in a Fairness under Unawareness setting
Giandomenico Cornacchia, Vito Walter Anelli, Fedelucio Narducci, Azzurra Ragone, Eugenio Di Sciascio

TL;DR
This paper introduces a counterfactual reasoning approach to detect hidden biases in machine learning models that omit sensitive features, revealing potential discrimination through analysis of counterfactual examples and external classifiers.
Contribution
It proposes a novel method to uncover bias in models trained without sensitive features by analyzing counterfactuals and feature deviations, addressing fairness under unawareness.
Findings
Models often retain biases despite excluding sensitive features.
Counterfactual analysis can reveal hidden proxy biases.
The method effectively detects discrimination in black-box models.
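To make the first finding concrete, here is a minimal sketch, not taken from the paper, of how a predictor that never sees the sensitive attribute can still produce unequal outcomes through a correlated proxy. The toy records, the feature names (`proxy`, `income`), the scoring rule, and the group labels "A" and "B" are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical toy records: (proxy, income, sensitive_group).
# The sensitive group is stored only for auditing; the predictor never sees it.
records: List[Tuple[int, int, str]] = [
    (1, 30, "A"), (1, 45, "A"), (1, 50, "A"), (0, 40, "A"),
    (0, 30, "B"), (0, 45, "B"), (0, 50, "B"), (1, 40, "B"),
]


def predict(proxy: int, income: int) -> int:
    """Hypothetical black-box predictor that uses only non-sensitive features."""
    return int(income + 20 * proxy >= 50)


def positive_rate(group: str) -> float:
    """Share of positive outcomes among the records of one sensitive group."""
    rows = [(p, i) for p, i, g in records if g == group]
    return sum(predict(p, i) for p, i in rows) / len(rows)


# Demographic parity gap: difference in positive rates between the groups.
dp_gap = positive_rate("A") - positive_rate("B")
print(positive_rate("A"), positive_rate("B"), dp_gap)
```

Because the proxy is more common in group A, the positive rates differ even though the group label is never an input to `predict`.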
Abstract
Current AI regulations require discarding sensitive features (e.g., gender, race, religion) in the algorithm's decision-making process to prevent unfair outcomes. However, even without sensitive features in the training set, algorithms can persist in discrimination. Indeed, when sensitive features are omitted (fairness under unawareness), they could be inferred through non-linear relations with the so-called proxy features. In this work, we propose a way to reveal the potential hidden bias of a machine learning model that can persist even when sensitive features are discarded. This study shows that it is possible to unveil whether the black-box predictor is still biased by exploiting counterfactual reasoning. In detail, when the predictor provides a negative classification outcome, our approach first builds counterfactual examples for a discriminated user category to obtain a positive…
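The abstract is cut off after the counterfactual-building step, so the following is only a hedged sketch of how such a check might look, not the authors' implementation. It assumes, based on the TL;DR's mention of external classifiers, that a separate classifier predicts the sensitive group from non-sensitive features and that one then checks whether that prediction changes between a rejected sample and its counterfactual. The functions `predictor`, `sensitive_clf`, and `counterfactual`, the brute-force search grid, and the cost function are all hypothetical.

```python
from itertools import product
from typing import List, Optional, Tuple

Sample = Tuple[int, int]  # (proxy, income): hypothetical non-sensitive features


def predictor(x: Sample) -> int:
    """Hypothetical black-box predictor trained without the sensitive feature."""
    proxy, income = x
    return int(income + 20 * proxy >= 50)


def sensitive_clf(x: Sample) -> str:
    """Hypothetical external classifier guessing the sensitive group from proxies."""
    return "A" if x[0] == 1 else "B"


def counterfactual(x: Sample) -> Optional[Sample]:
    """Brute-force search for the cheapest change that yields a positive outcome."""
    proxy, income = x
    candidates = product((0, 1), range(income, income + 31, 5))
    valid = [c for c in candidates if predictor(c) == 1]
    if not valid:
        return None
    # Assumed cost: flipping the proxy costs as much as 10 units of income.
    return min(valid, key=lambda c: abs(c[0] - proxy) + abs(c[1] - income) / 10)


# Hypothetical samples from the unprivileged group.
unprivileged: List[Sample] = [(0, 30), (0, 45), (0, 50), (1, 40)]
rejected = [x for x in unprivileged if predictor(x) == 0]
cfs = [counterfactual(x) for x in rejected]

# A flip means the external classifier assigns the counterfactual to another group.
flips = [sensitive_clf(x) != sensitive_clf(cf)
         for x, cf in zip(rejected, cfs) if cf is not None]
flip_rate = sum(flips) / len(flips)
print(cfs, flip_rate)
```

Under these assumptions, a high flip rate would suggest that obtaining a positive outcome often requires looking like a member of the other group, which hints at proxy-driven bias. A real audit would replace the brute-force search with a proper counterfactual generator and a trained sensitive-feature classifier.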
Taxonomy
Topics: Ethics and Social Impacts of AI
Methods: Counterfactual Explanations
