NeuronFair: Interpretable White-Box Fairness Testing through Biased Neuron Identification
Haibin Zheng, Zhiqing Chen, Tianyu Du, Xuhong Zhang, Yao Cheng, Shouling Ji, Jingyi Wang, Yue Yu, and Jinyin Chen

TL;DR
NeuronFair is a novel framework for fairness testing of deep neural networks that offers interpretability, efficiency, and broad applicability, significantly improving instance generation and aiding in fairness enhancement.
Contribution
It introduces an interpretable, effective, and generic fairness testing framework that outperforms existing methods in instance generation speed and diversity across various datasets.
Findings
Generates ~5.84 times more instances than previous methods
Achieves an average speedup of 534.56% in testing
Instances can be used to improve DNN fairness
Abstract
Deep neural networks (DNNs) have demonstrated strong performance in various domains. However, this raises a social concern about whether DNNs can produce reliable and fair decisions, especially when they are applied to sensitive domains involving valuable resource allocation, such as education, loans, and employment. It is crucial to conduct fairness testing before DNNs are deployed to such sensitive domains, i.e., to generate as many instances as possible to uncover fairness violations. However, existing testing methods are still limited in three aspects: interpretability, performance, and generalizability. To overcome these challenges, we propose NeuronFair, a new DNN fairness testing framework that differs from previous work in several key aspects: (1) interpretable - it quantitatively interprets DNNs' fairness violations for the biased decision; (2) effective - it uses the…
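The abstract frames fairness testing as generating instances that uncover fairness violations. The sketch below illustrates one common way such a violation is checked, assuming the individual-discrimination notion widely used in this line of work: an instance counts as a violation if changing only its sensitive attribute changes the model's decision. This is not NeuronFair's neuron-based generation procedure, which the excerpt above does not describe. The names `is_discriminatory` and `toy_predict`, and the two-feature layout, are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence


def is_discriminatory(
    predict: Callable[[Sequence[int]], int],
    x: Sequence[int],
    sensitive_idx: int,
    sensitive_values: Sequence[int],
) -> bool:
    """Return True if changing only the sensitive attribute changes the prediction."""
    base = predict(x)
    for value in sensitive_values:
        if value == x[sensitive_idx]:
            continue  # skip the instance's own sensitive value
        x_alt = list(x)
        x_alt[sensitive_idx] = value
        if predict(x_alt) != base:
            return True
    return False


def toy_predict(x: Sequence[int]) -> int:
    """Hypothetical biased classifier over [income_bracket, gender]."""
    # The decision depends on the sensitive attribute x[1], which is the bias.
    return int(x[0] + 2 * x[1] >= 3)


# Candidate inputs; a real tester would generate these rather than list them.
candidates: List[List[int]] = [[0, 0], [1, 0], [2, 0], [3, 0]]
violations = [
    x for x in candidates
    if is_discriminatory(toy_predict, x, sensitive_idx=1, sensitive_values=[0, 1])
]
print(violations)
```

A testing framework in this setting would be judged by how many such violating instances it finds and how quickly, which is what the instance-count and speedup figures in the Findings measure.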
Taxonomy
Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
