Bias Mitigation Framework for Intersectional Subgroups in Neural Networks
Narine Kokhlikyan, Bilal Alsallakh, Fulton Wang, Vivek Miglani, Oliver Aobo Yang, David Adkins

TL;DR
This paper introduces a simple, generic framework that mitigates intersectional subgroup bias in neural networks by reducing the mutual information between protected attributes and the output variable, improving fairness with little or no drop in accuracy.
Contribution
The authors propose a bias mitigation method that prevents models from learning relationships between protected attributes and the output variable, making trained networks fairer and causally insensitive to the values of protected attributes.
Findings
Effective bias reduction with minimal accuracy loss
Models become causally fair and insensitive to protected attributes
Significant reduction in feature interactions between protected and non-protected attributes
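One way to probe the second finding is a counterfactual flip test: change only the protected attribute of each input and count how often the prediction changes. The sketch below is illustrative and is not the authors' evaluation protocol, which this summary does not describe. The names `flip_rate`, `predict`, `protected_key`, and `values` are hypothetical, and inputs are assumed to be plain dictionaries.

```python
def flip_rate(predict, rows, protected_key, values):
    """Fraction of rows whose prediction changes when only the
    protected attribute is replaced by some other allowed value.

    predict       -- hypothetical callable mapping a dict row to a label
    rows          -- list of dict rows
    protected_key -- key of the protected attribute in each row
    values        -- all values the protected attribute may take
    """
    if not rows:
        return 0.0
    changed = 0
    for row in rows:
        base = predict(row)
        for v in values:
            if v == row[protected_key]:
                continue  # skip the factual value
            # Copy the row and intervene on the protected attribute only.
            if predict({**row, protected_key: v}) != base:
                changed += 1
                break  # count each row at most once
    return changed / len(rows)


rows = [{"group": "a", "score": 3}, {"group": "b", "score": 7}]
# A toy model that ignores the protected attribute entirely.
insensitive = lambda r: int(r["score"] > 5)
# A toy model that depends only on the protected attribute.
sensitive = lambda r: int(r["group"] == "a")
print(flip_rate(insensitive, rows, "group", ["a", "b"]))
print(flip_rate(sensitive, rows, "group", ["a", "b"]))
```

A flip rate near zero is consistent with insensitivity to the protected attribute on the tested inputs, though it says nothing about proxies for that attribute among the other features.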
Abstract
We propose a fairness-aware learning framework that mitigates intersectional subgroup bias associated with protected attributes. Prior research has primarily focused on mitigating one kind of bias by incorporating complex fairness-driven constraints into optimization objectives or designing additional layers that focus on specific protected attributes. We introduce a simple and generic bias mitigation approach that prevents models from learning relationships between protected attributes and the output variable by reducing mutual information between them. We demonstrate that our approach is effective in reducing bias with little or no drop in accuracy. We also show that the models trained with our learning framework become causally fair and insensitive to the values of protected attributes. Finally, we validate our approach by studying feature interactions between protected and non-protected…
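The quantity the framework aims to reduce can be made concrete with a small diagnostic. The sketch below computes a plug-in (empirical frequency) estimate of the mutual information between a discrete protected attribute and discrete model outputs. It is an illustration of the measured quantity only, not the authors' training procedure, which this summary does not specify. The function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(protected, outputs):
    """Plug-in estimate of I(A; Y) in nats for two equal-length
    sequences of discrete (hashable) values."""
    if len(protected) != len(outputs):
        raise ValueError("sequences must have the same length")
    n = len(protected)
    joint = Counter(zip(protected, outputs))  # counts of (a, y) pairs
    count_a = Counter(protected)              # marginal counts of a
    count_y = Counter(outputs)                # marginal counts of y
    mi = 0.0
    for (a, y), c in joint.items():
        # p(a, y) * log( p(a, y) / (p(a) * p(y)) ), written with counts
        mi += (c / n) * math.log(c * n / (count_a[a] * count_y[y]))
    return mi


# Intersectional subgroups can be encoded as tuples of attribute values.
groups = [("f", "x"), ("f", "y"), ("m", "x"), ("m", "y")] * 2
independent_preds = [0, 0, 0, 0, 1, 1, 1, 1]  # same mix in every subgroup
dependent_preds = [0, 0, 1, 1, 0, 0, 1, 1]    # determined by first attribute
print(mutual_information(groups, independent_preds))
print(mutual_information(groups, dependent_preds))
```

An estimate of zero means the empirical output distribution is identical across subgroups; larger values mean the outputs carry more information about subgroup membership. With small samples and many subgroups the plug-in estimate is biased upward, so it is best read comparatively.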
Taxonomy
Topics: Adversarial Robustness in Machine Learning
