Fairness Interventions: A Study in AI Explainability
Thomas Souverain, Johnathan Nguyen, Nicolas Meric, Paul Égré

TL;DR
This paper explores fairness interventions in AI, emphasizing the importance of explainability and transparency in corrective methods, and introduces FairDream, a tool that reveals biases and trade-offs in fairness criteria.
Contribution
It provides a philosophical and experimental analysis of fairness corrections, introduces the FairDream package, and clarifies the relationship between fairness criteria with a focus on transparency.
Findings
FairDream tends towards Equalized Odds even when the user intends to achieve Demographic Parity.
Transparency in fairness interventions reveals inherent biases and trade-offs.
The study links fairness criteria with epistemological insights, including Simpson's paradox.
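To make the contrast between the two criteria concrete, here is a minimal sketch in plain Python. The helper names (`positive_rate`, `demographic_parity_gap`, `equalized_odds_gap`) and the toy data are assumptions made for illustration and are not part of FairDream. Demographic Parity compares positive-prediction rates across groups, while Equalized Odds compares true-positive and false-positive rates, so only the latter depends on the true labels.

```python
def positive_rate(preds):
    """Share of positive (1) predictions in a list of 0/1 values."""
    return sum(preds) / len(preds)


def demographic_parity_gap(y_pred, group):
    """Largest between-group difference in positive-prediction rate."""
    rates = []
    for grp in sorted(set(group)):
        preds = [p for p, g in zip(y_pred, group) if g == grp]
        rates.append(positive_rate(preds))
    return max(rates) - min(rates)


def equalized_odds_gap(y_true, y_pred, group):
    """Largest between-group difference in TPR (label 1) or FPR (label 0)."""
    gaps = []
    for label in (1, 0):
        rates = []
        for grp in sorted(set(group)):
            preds = [p for t, p, g in zip(y_true, y_pred, group)
                     if g == grp and t == label]
            rates.append(positive_rate(preds))
        gaps.append(max(rates) - min(rates))
    return max(gaps)


# Hypothetical toy data: group "a" has a 75% base rate, group "b" has 25%.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_pred = list(y_true)  # a perfectly accurate classifier
group = ["a"] * 4 + ["b"] * 4

dp_gap = demographic_parity_gap(y_pred, group)
eo_gap = equalized_odds_gap(y_true, y_pred, group)
print(dp_gap, eo_gap)
```

On this toy data the perfectly accurate classifier has no Equalized Odds gap but a Demographic Parity gap of 0.5, because the groups differ in their base rates. This is one way to see why a correction that tracks true labels can drift away from Demographic Parity.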
Abstract
This paper presents a philosophical and experimental study of fairness interventions in AI classification, centered on the explainability of corrective methods. We argue that ensuring fairness requires not only satisfying a target criterion, but also explaining which variables constrain its realization. When corrections are used to mitigate advantage transparently, they must remain sensitive to the distribution of true labels. To illustrate this approach, we built FairDream, a fairness package whose mechanism is made transparent for lay users: it increases the weight the model gives to errors on disadvantaged groups. While a user may intend to achieve Demographic Parity through the correction method, experiments show that FairDream tends towards Equalized Odds, revealing a conservative bias inherent to the data environment. We clarify the relationship between these fairness criteria, analyze…
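The idea of weighting errors on disadvantaged groups more heavily can be sketched as follows. This is an illustrative assumption rather than FairDream's actual implementation: the function `error_weights`, its `boost` parameter, and the toy data are hypothetical.

```python
def error_weights(y_true, y_pred, group, disadvantaged, boost=2.0):
    """Return one training weight per sample.

    Misclassified samples from the `disadvantaged` group get weight `boost`;
    every other sample keeps weight 1.0. (Hypothetical scheme, for illustration.)
    """
    weights = []
    for t, p, g in zip(y_true, y_pred, group):
        is_error = t != p
        weights.append(boost if (g == disadvantaged and is_error) else 1.0)
    return weights


# Hypothetical toy data: one error in group "b", one error in group "a".
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 1]
group = ["a", "b", "b", "a"]

weights = error_weights(y_true, y_pred, group, disadvantaged="b")
print(weights)
```

Such weights could then be passed to a retraining step that accepts per-sample weights. Because an error is defined relative to the true label, a scheme of this kind is label-sensitive by construction, which is consistent with the abstract's observation that the correction tends towards Equalized Odds rather than Demographic Parity.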
Taxonomy
Methods: Sparse Evolutionary Training
