Information-Theoretic Bounds on The Removal of Attribute-Specific Bias From Neural Networks

Jiazhi Li; Mahyar Khayatkhoei; Jiageng Zhu; Hanchen Xie; Mohamed E. Hussein; Wael AbdAlmageed

arXiv:2310.04955 · cs.LG · November 17, 2023 · 1 citation

TL;DR

This paper establishes an information-theoretic limit on removing attribute bias from neural networks, showing that existing methods are ineffective against strong bias, especially in small datasets, and highlights the need for more robust solutions.

Contribution

It introduces a theoretical upper bound on bias removal performance based on bias strength and empirically verifies this limit across various datasets.

Findings

1. Existing bias removal methods fail under strong bias conditions.
2. Theoretical bounds predict practical limitations of current methods.
3. Strong bias in small datasets cannot be effectively mitigated with current techniques.

Abstract

Ensuring a neural network is not relying on protected attributes (e.g., race, sex, age) for predictions is crucial in advancing fair and trustworthy AI. While several promising methods for removing attribute bias in neural networks have been proposed, their limitations remain under-explored. In this work, we mathematically and empirically reveal an important limitation of attribute bias removal methods in the presence of strong bias. Specifically, we derive a general non-vacuous information-theoretical upper bound on the performance of any attribute bias removal method in terms of the bias strength. We provide extensive experiments on synthetic, image, and census datasets to verify the theoretical bound and its consequences in practice. Our findings show that existing attribute bias removal methods are effective only when the inherent bias in the dataset is relatively weak, thus cautioning…
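
The abstract states the bound in terms of "bias strength" but this listing does not give its formula. As a rough illustration only, the sketch below computes two standard information-theoretic quantities that could serve as a proxy for how strongly a target label Y is tied to a protected attribute A: the conditional entropy H(Y | A) and the mutual information I(Y; A). Using these as the measure of bias strength is an assumption made here, not a definition taken from the paper, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Empirical Shannon entropy H(labels) in bits."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy(y, a):
    """Empirical H(Y | A) in bits: entropy of y within each group of a,
    weighted by the group's share of the data."""
    n = len(y)
    groups = {}
    for yi, ai in zip(y, a):
        groups.setdefault(ai, []).append(yi)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def mutual_information(y, a):
    """Empirical I(Y; A) = H(Y) - H(Y | A) in bits."""
    return entropy(y) - conditional_entropy(y, a)


# Hypothetical toy data (assumption): binary target and binary attribute.
y = [0, 0, 1, 1]
a_strong = [0, 0, 1, 1]  # attribute determines the target exactly
a_none = [0, 1, 0, 1]    # attribute carries no information about the target

strong_mi = mutual_information(y, a_strong)
none_mi = mutual_information(y, a_none)
```

Under this proxy, a low H(Y | A) (equivalently, a high I(Y; A)) means the protected attribute nearly determines the target, which is the "strong bias" regime the abstract warns about.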

Code & Models

Repositories

jiazhi412/strong_attribute_bias
pytorch

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education