TL;DR
ProF is a fairness repair framework for deep neural networks that provides provable guarantees by combining interval bound propagation with a mixed-integer linear programming (MILP) problem, so that fairness holds over entire input sets rather than only at individual samples.
Contribution
ProF introduces a verification-based fairness repair method with provable guarantees, capable of generalizing fairness corrections to unseen samples and multiple sensitive attributes.
Findings
Achieves up to 95.93% fairness repair on benchmark datasets.
Provides provable fairness guarantees over the whole input set around each biased sample.
Supports multiple sensitive attributes and fairness definitions.
Abstract
Deep neural networks (DNNs) suffer from ethical issues such as individual discrimination. In response, numerous NN repair techniques have been developed to adjust models and mitigate such undesired behaviors. However, existing fairness repair methods are typically data-centric and often lack provable guarantees and generalization to unseen samples. To overcome these limitations, we propose ProF, a novel fairness repair framework with provable guarantees. The key intuition of ProF is to leverage interval bound propagation (a widely used NN verification technique) to soundly capture model outputs over the whole set around a biased sample. The derived bounds are used to guide fairness repair, which encourages the model to produce consistent outputs on that set. Specifically, we integrate fairness constraints and model modifications into a…
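The interval bound propagation step mentioned in the abstract can be illustrated with a minimal sketch. This is not ProF's implementation: the toy networks, the function names (`affine_bounds`, `relu_bounds`, `ibp`, `certified_consistent`), and the input encoding `[feature, sensitive]` are all assumptions made for illustration, and the MILP-based repair step is not shown. The sketch only demonstrates how a box of inputs, in which the sensitive attribute ranges over all its values, is pushed through a ReLU network to obtain sound output bounds, and how those bounds can certify that the decision is the same everywhere in the box.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]


def affine_bounds(W: Matrix, b: Vector, lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """Propagate the box [lo, hi] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = h = bias
        for w, x_lo, x_hi in zip(row, lo, hi):
            # A positive weight maps the lower input to the lower output;
            # a negative weight swaps the roles of the two endpoints.
            if w >= 0:
                l += w * x_lo
                h += w * x_hi
            else:
                l += w * x_hi
                h += w * x_lo
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def relu_bounds(lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to each endpoint directly."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def ibp(layers: List[Layer], lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """Sound output bounds for a ReLU network (no ReLU after the last layer)."""
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(W, b, lo, hi)
        if i < len(layers) - 1:
            lo, hi = relu_bounds(lo, hi)
    return lo, hi


def certified_consistent(layers: List[Layer], lo: Vector, hi: Vector) -> bool:
    """True if the sign of the single output logit is provably constant on the box."""
    out_lo, out_hi = ibp(layers, lo, hi)
    return out_lo[0] > 0.0 or out_hi[0] < 0.0


# Hypothetical input box: feature within 0.1 of 1.0, sensitive attribute anywhere in [0, 1].
box_lo, box_hi = [0.9, 0.0], [1.1, 1.0]

# Toy network whose decision barely depends on the sensitive attribute.
fair_net = [([[1.0, 0.5], [-1.0, 0.5]], [0.0, 0.0]), ([[1.0, -1.0]], [0.2])]

# Toy network whose decision is dominated by the sensitive attribute.
biased_net = [([[0.1, 2.0]], [-1.0])]

fair_ok = certified_consistent(fair_net, box_lo, box_hi)
biased_ok = certified_consistent(biased_net, box_lo, box_hi)
```

In this sketch `fair_ok` is true because the output interval lies entirely above zero, while `biased_ok` is false because the interval straddles zero. Interval bounds are sound but can be loose, so a failed certificate does not by itself prove discrimination; it only means the bounds cannot rule it out.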
