Fair Classifiers that Abstain without Harm
Tongxin Yin, Jean-François Ton, Ruocheng Guo, Yuanshun Yao, Mingyan Liu, Yang Liu

TL;DR
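This paper introduces a post-hoc method that makes an existing classifier selectively abstain so that it meets group fairness targets without reducing accuracy for any sub-population ("no harm"). Abstention decisions are assigned on training samples by an Integer Programming procedure and generalized to test samples by a surrogate model.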
Contribution
It presents the first theoretical analysis linking fairness constraints, abstention rates, and accuracy, and offers a practical framework for fair abstention without sacrificing overall accuracy.
Findings
Outperforms existing fairness methods at similar abstention rates.
Provides theoretical bounds on abstention rates based on fairness constraints.
Demonstrates the feasibility of achieving fairness without harming accuracy.
Abstract
In critical applications, it is vital for classifiers to defer decision-making to humans. We propose a post-hoc method that makes existing classifiers selectively abstain from predicting certain samples. Our abstaining classifier is incentivized to maintain the original accuracy for each sub-population (i.e., no harm) while achieving a set of group fairness definitions to a user-specified degree. To this end, we design an Integer Programming (IP) procedure that assigns abstention decisions for each training sample to satisfy a set of constraints. To generalize the abstaining decisions to test samples, we then train a surrogate model to learn the abstaining decisions based on the IP solutions in an end-to-end manner. We analyze the feasibility of the IP procedure to determine the possible abstention rate for different levels of unfairness tolerance and accuracy constraint for achieving no…
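The Integer Programming stage described in the abstract can be illustrated with a toy sketch. This is not the paper's formulation, which is not reproduced on this page. It assumes demographic parity (gap in positive-prediction rate among kept samples) as the fairness notion, minimizes the number of abstentions, and replaces an IP solver with brute-force enumeration, which is only viable for a handful of samples. The names `group_stats` and `solve_abstention` and the data are hypothetical.

```python
from itertools import product

TOL = 1e-9  # tolerance for floating-point comparisons


def group_stats(y, yhat, z, abstain, group):
    """Accuracy and positive-prediction rate over the kept samples of one group."""
    kept = [i for i in range(len(y)) if z[i] == group and not abstain[i]]
    if not kept:
        return None  # the whole group was abstained on
    acc = sum(y[i] == yhat[i] for i in kept) / len(kept)
    pos_rate = sum(yhat[i] for i in kept) / len(kept)
    return acc, pos_rate


def solve_abstention(y, yhat, z, eps):
    """Enumerate binary abstention vectors; return one with the fewest abstentions
    that satisfies the no-harm and disparity constraints, or None if infeasible."""
    n = len(y)
    groups = sorted(set(z))
    no_abstain = (0,) * n
    # Baseline per-group accuracy of the original classifier.
    base_acc = {g: group_stats(y, yhat, z, no_abstain, g)[0] for g in groups}

    best = None
    for abstain in product((0, 1), repeat=n):
        stats = {g: group_stats(y, yhat, z, abstain, g) for g in groups}
        if any(s is None for s in stats.values()):
            continue  # assumption: every group must keep at least one sample
        # No harm: accuracy on kept samples must not drop below the baseline.
        if any(stats[g][0] < base_acc[g] - TOL for g in groups):
            continue
        # Fairness (assumed here: demographic parity gap at most eps).
        rates = [stats[g][1] for g in groups]
        if max(rates) - min(rates) > eps + TOL:
            continue
        if best is None or sum(abstain) < sum(best):
            best = abstain
    return best


# Hypothetical toy data: two groups of four samples each.
y    = [1, 1, 0, 0, 1, 0, 0, 0]   # true labels
yhat = [1, 1, 1, 0, 1, 0, 0, 0]   # predictions of the existing classifier
z    = [0, 0, 0, 0, 1, 1, 1, 1]   # group membership

best = solve_abstention(y, yhat, z, eps=0.2)
print("abstention vector:", best)
print("abstention rate:", sum(best) / len(best))
```

In this sketch, tightening `eps` can only keep the minimum number of abstentions the same or push it up, and a small enough `eps` can make the problem infeasible. That is the kind of trade-off between unfairness tolerance and abstention rate that the abstract says the paper analyzes. A surrogate model would then be trained to predict the resulting abstention vector from features, which this sketch does not cover.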
Peer Reviews
Decision: ICLR 2024 poster
1. The paper is well structured and easy to read.
2. The problem's scope and methodology are well defined.
3. The proposed method seems well motivated; it includes the "no-harm" constraint and gives feasibility conditions for disparity thresholds.
4. The method is performant on the tasks considered.
The reviewer is not convinced on the feasibility of the IP and the ability of surrogate to learn the patterns in AB or FB. Not a weakness as such, but would like to see a discussion from the authors.
The work appears to be the first to consider the problem of selective classification under fairness, abstention rate, and no-harm constraints at the same time. The proposed approach seems quite interesting, especially for being flexible about the type of fairness constraints that one may impose. In addition, achieving fairness guarantees without sacrificing accuracy seems of great importance for real-world applications. The paper appears very well structured and nicely written. The authors cl
Even though the paper is nicely and clearly written, there are a few points that could confuse the reader: In the paragraph "Stage I: Integer Programming. We approximate $h_A$ and $h_F$…", the word "approximate" is confusing, as $h_A$ and $h_F$ are already defined as binary parameters. In the optimization problem in Section 3.1, the abstention rate and the no-harm constraints are not defined for any $z \in \mathcal{Z}$, whereas in IP-Main these constraints are defined for each $z \in \mathcal{Z}$. If
1. The paper is quite well-written. Most design choices are appropriately motivated. Terminology is clean and easy to understand despite the large number of components involved.
2. The experimental results are encouraging.
3. The paper solves a mix of problems that are all quite useful: fairness, abstaining from making decisions, and not reducing accuracy for groups in the data. All of these components are individually addressed elsewhere in prior work, but putting them all together is a nice
I think the paper needs to address a couple of points before it is ready for publication: 1. The paper claims to provide hard constraint satisfaction guarantees but does not discuss how these guarantees are supposed to hold when replacing AB and FB modules with surrogate models, and when replacing the true label predictor with a surrogate model. Does the generalization ability of these surrogate models not affect the constraint satisfaction? If yes, how? Or is that the guarantees only hold when
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Free Will and Agency
