TL;DR
This paper introduces a novel approach for identifying, explaining, and reducing biases in training data to improve fairness in machine learning models, emphasizing data transparency and minimal utility loss.
Contribution
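It proposes algorithms for measuring sample bias and explaining it at the sample level, together with data-editing strategies for mitigating unfairness. This addresses a gap in existing FairML research, which has focused mostly on bias in model predictions rather than on tracing bias in the training data.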
Findings
Effective bias attribution and explanation at the sample level
Successful mitigation of group and individual unfairness
Minimal or zero utility loss during bias mitigation
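To make "sample-level bias attribution" concrete, here is a minimal sketch of one simple way a per-sample score could be computed. It is not the paper's criterion or algorithm, which this summary does not spell out. It assumes a binary label, a binary group attribute, and a leave-one-out change in the demographic parity gap as the score; the names `parity_gap` and `leave_one_out_scores` and the toy data are hypothetical.

```python
def parity_gap(labels, groups):
    """Absolute difference in positive-label rate between group 0 and group 1."""
    rates = []
    for g in (0, 1):
        member_labels = [y for y, s in zip(labels, groups) if s == g]
        if not member_labels:
            return 0.0  # assumption: treat the gap as 0 when a group is empty
        rates.append(sum(member_labels) / len(member_labels))
    return abs(rates[0] - rates[1])


def leave_one_out_scores(labels, groups):
    """Hypothetical bias score: how much the gap shrinks when sample i is removed.

    A positive score means removing the sample reduces the gap;
    a negative score means removing it widens the gap.
    """
    full_gap = parity_gap(labels, groups)
    scores = []
    for i in range(len(labels)):
        rest_labels = labels[:i] + labels[i + 1:]
        rest_groups = groups[:i] + groups[i + 1:]
        scores.append(full_gap - parity_gap(rest_labels, rest_groups))
    return scores


# Toy data (made up): group 0 is mostly labeled positive, group 1 mostly negative.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
scores = leave_one_out_scores(labels, groups)
```

In this toy setup, samples that follow the majority pattern of their group get positive scores and the two counter-pattern samples get negative ones. A data-editing strategy in this spirit would remove or relabel the highest-scoring samples and then check that model accuracy is largely unchanged.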
Abstract
Data collected in the real world often encapsulates historical discrimination against disadvantaged groups and individuals. Existing fair machine learning (FairML) research has predominantly focused on mitigating discriminative bias in the model prediction, with far less effort dedicated towards exploring how to trace biases present in the data, despite its importance for the transparency and interpretability of FairML. To fill this gap, we investigate a novel research problem: discovering samples that reflect biases/prejudices from the training data. Grounding on the existing fairness notions, we lay out a sample bias criterion and propose practical algorithms for measuring and countering sample bias. The derived bias score provides intuitive sample-level attribution and explanation of historical bias in data. On this basis, we further design two FairML strategies via…
