Fairness via Adversarial Attribute Neighbourhood Robust Learning
Qi Qi, Shervin Ardeshir, Yi Xu, Tianbao Yang

TL;DR
This paper introduces RAAN, an adversarial loss function that promotes fairness across sensitive attribute groups by reducing the differences between their biased representations, using per-sample weights defined on adversarial attribute neighbourhoods. The accompanying optimization framework comes with theoretical guarantees.
Contribution
The paper proposes RAAN, a new loss function for fair representation learning, and develops SCRAAN, an efficient optimization framework with theoretical guarantees.
Findings
RAAN improves fairness across sensitive groups on benchmark datasets.
SCRAAN achieves efficient optimization with theoretical guarantees.
Empirical results confirm the effectiveness of the proposed method.
Abstract
Improving fairness between privileged and less-privileged sensitive attribute groups (e.g., race, gender) has attracted much attention. To ensure that the model performs uniformly well across different sensitive attribute groups, we propose a principled Robust Adversarial Attribute Neighbourhood (RAAN) loss to debias the classification head and promote a fairer representation distribution across different sensitive attribute groups. The key idea of RAAN is to mitigate the differences between the biased representations of different sensitive attribute groups by assigning each sample an adversarial robust weight, which is defined on the representations of its adversarial attribute neighbors, i.e., the samples from different protected groups. To provide efficient optimization algorithms, we cast RAAN as a sum of coupled compositional functions and propose…
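The key idea in the abstract (weighting each sample through the representations of its neighbours from other protected groups) can be illustrated with a minimal sketch. This is not the paper's formulation: the cosine similarity, the softmax scoring rule, the `temperature` parameter, and the function names `neighbourhood_weights` and `robust_loss` are all assumptions made here for illustration only.

```python
import numpy as np


def neighbourhood_weights(z, losses, groups, temperature=1.0):
    """Hypothetical per-sample weights over adversarial attribute neighbours.

    Row i holds a distribution over the samples whose protected group
    differs from that of sample i. The scoring rule (cosine similarity
    plus neighbour loss, passed through a softmax) is an assumption,
    not the rule used in the paper.
    """
    z = np.asarray(z, dtype=float)
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    if len(np.unique(groups)) < 2:
        raise ValueError("need at least two protected groups")

    # Cosine similarity between all pairs of representations.
    z_norm = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)
    sim = z_norm @ z_norm.T

    # Neighbours of sample i: samples from a different protected group.
    mask = groups[:, None] != groups[None, :]

    # Assumed scoring: similar, high-loss neighbours receive more weight.
    scores = (sim + losses[None, :]) / temperature
    scores = np.where(mask, scores, -np.inf)

    # Numerically stable row-wise softmax; non-neighbours get weight 0.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def robust_loss(z, losses, groups, temperature=1.0):
    """Average of the neighbour-weighted losses (illustrative only)."""
    weights = neighbourhood_weights(z, losses, groups, temperature)
    return float((weights @ np.asarray(losses, dtype=float)).mean())


# Toy data: four samples, two protected groups, 2-D representations.
z = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
losses = [0.1, 0.9, 0.2, 0.4]
groups = [0, 0, 1, 1]
weights = neighbourhood_weights(z, losses, groups)
value = robust_loss(z, losses, groups)
```

In this toy setup every row of `weights` sums to one and places zero mass on samples from the same group, so the loss attributed to each sample depends only on how the model treats its cross-group neighbours.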
Taxonomy
Topics: Ethics and Social Impacts of AI · Psychology of Moral and Emotional Judgment
