MAFT: Efficient Model-Agnostic Fairness Testing for Deep Neural Networks via Zero-Order Gradient Search

Zhaohui Wang; Min Zhang; Jingran Yang; Bojie Shao; Min Zhang

arXiv:2412.20086·cs.LG·December 31, 2024

TL;DR

This paper introduces MAFT, a scalable black-box fairness testing method for deep neural networks that matches white-box effectiveness and significantly outperforms existing black-box approaches in discovering fairness violations.

Contribution

The paper presents MAFT, a novel model-agnostic black-box fairness testing technique that uses gradient estimation and attribute perturbation for improved scalability and effectiveness.
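To make the attribute-perturbation idea concrete, here is a minimal sketch of a black-box check for individual discrimination. It assumes the common definition (an input is discriminatory if changing only its protected attributes changes the prediction), which the summary above does not spell out. The names `is_discriminatory`, `toy_predict`, `protected_idx`, and `protected_values` are hypothetical and not taken from the paper.

```python
from itertools import product


def is_discriminatory(predict, x, protected_idx, protected_values):
    """Return True if changing only the protected attributes of x changes predict(x).

    predict          -- black-box function mapping a feature list to a label
    x                -- feature list of the instance under test
    protected_idx    -- indices of the protected attributes in x
    protected_values -- one list of admissible values per protected index
    """
    original = list(x)
    base_label = predict(original)
    # Enumerate every combination of protected-attribute values.
    for combo in product(*protected_values):
        candidate = list(original)
        for idx, value in zip(protected_idx, combo):
            candidate[idx] = value
        # Skip the unchanged instance; flag any prediction flip.
        if candidate != original and predict(candidate) != base_label:
            return True
    return False


def toy_predict(x):
    """Hypothetical stand-in model: feature 1 plays the role of a protected attribute."""
    return int(x[0] + 2 * x[1] > 3)
```

The check only queries `predict`, so it needs no access to model internals. With the toy model, `[2, 0]` is flagged because flipping feature 1 changes the label, while `[5, 0]` is not.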

Findings

01 MAFT achieves effectiveness comparable to white-box methods.

02 MAFT is approximately 14.69 times more effective than existing black-box approaches.

03 MAFT is approximately 32.58 times more efficient than existing black-box approaches.

Abstract

Deep neural networks (DNNs) have shown powerful performance in various applications and are increasingly used in decision-making systems. However, concerns about the fairness of DNNs persist. Several efficient white-box methods for testing individual fairness have been proposed. Nevertheless, the development of black-box methods has stagnated, and the performance of existing black-box methods lags far behind that of white-box methods. In this paper, we propose a novel black-box individual fairness testing method called Model-Agnostic Fairness Testing (MAFT). By leveraging MAFT, practitioners can effectively identify and address discrimination in DL models, regardless of the specific algorithm or architecture employed. Our approach adopts lightweight procedures such as gradient estimation and attribute perturbation rather than non-trivial procedures like symbolic execution, rendering…
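The gradient estimation mentioned in the abstract can be illustrated with a zero-order (query-only) finite-difference sketch. This is a generic forward-difference estimator, assumed here for illustration; the estimator actually used by MAFT may differ. The names `estimate_gradient`, `toy_score`, and the step size `h` are hypothetical.

```python
def estimate_gradient(f, x, h=1e-4):
    """Estimate the gradient of a black-box scalar function f at x.

    Uses forward differences: one query at x plus one query per coordinate,
    so the cost is len(x) + 1 calls to f and no access to model internals.
    """
    base = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h  # perturb a single coordinate
        grad.append((f(shifted) - base) / h)
    return grad


def toy_score(x):
    """Hypothetical stand-in for a model's output score."""
    return 3.0 * x[0] - 2.0 * x[1] + 0.5 * x[2] ** 2


approx = estimate_gradient(toy_score, [1.0, 2.0, 4.0])
```

For the toy function, the analytic gradient at `[1.0, 2.0, 4.0]` is `[3.0, -2.0, 4.0]`, and the estimate agrees up to an error on the order of `h`. A smaller `h` reduces truncation error but amplifies floating-point noise, so the step size is a tuning choice.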
