Towards Equal Opportunity Fairness through Adversarial Learning

Xudong Han; Timothy Baldwin; Trevor Cohn

arXiv:2203.06317 · cs.CL · May 17, 2022

TL;DR

This paper introduces an augmented adversarial training method that explicitly models equal opportunity fairness in NLP, leading to improved bias mitigation and better performance-fairness trade-offs.

Contribution

The paper proposes an augmented discriminator for adversarial training. The discriminator takes the target class as an additional input, so that equal opportunity is modeled explicitly during bias mitigation.
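To make the idea of a target-conditioned discriminator concrete, here is a minimal NumPy sketch. It assumes the simplest possible form of conditioning: the hidden representation is concatenated with a one-hot encoding of the target class before a single linear layer predicts the protected attribute. The function names, dimensions, and the concatenation scheme are all illustrative assumptions; the paper's actual augmentation architecture may differ, and the official fairlib repository is implemented in PyTorch.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (batch, num_classes) one-hot matrix for integer labels."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


def augmented_discriminator_logits(hidden, target, num_classes, weight, bias):
    """Score protected-attribute classes from hidden states plus the target class.

    Assumption: conditioning is done by plain concatenation. This is a
    hypothetical simplification, not the architecture from the paper.
    """
    features = np.concatenate([hidden, one_hot(target, num_classes)], axis=1)
    return features @ weight + bias


# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
batch, hidden_dim, num_classes, num_groups = 4, 8, 3, 2

hidden = rng.normal(size=(batch, hidden_dim))   # encoder outputs
target = np.array([0, 2, 1, 2])                 # main-task labels
weight = rng.normal(size=(hidden_dim + num_classes, num_groups))
bias = np.zeros(num_groups)

logits = augmented_discriminator_logits(hidden, target, num_classes, weight, bias)
print(logits.shape)
```

A standard discriminator would see only `hidden`. Giving it `target` as well lets it pick up protected-attribute leakage that is specific to each target class, which is the kind of leakage equal opportunity is concerned with.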

Findings

01 Substantial improvement over standard adversarial debiasing methods on two datasets.

02 Better performance-fairness trade-off than standard adversarial debiasing.

03 Equal opportunity is modeled explicitly by giving the discriminator the target class as input.

Abstract

Adversarial training is a common approach for bias mitigation in natural language processing. Although most work on debiasing is motivated by equal opportunity, it is not explicitly captured in standard adversarial training. In this paper, we propose an augmented discriminator for adversarial training, which takes the target class as input to create richer features and more explicitly model equal opportunity. Experimental results over two datasets show that our method substantially improves over standard adversarial debiasing methods, in terms of the performance-fairness trade-off.
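Equal opportunity is commonly defined as equal true positive rates across protected groups. As a rough illustration of how a violation could be measured, the sketch below computes the per-group true positive rate and reports the largest gap between groups. The function names and the toy data are hypothetical, and the paper's own fairness metric may aggregate gaps differently (for example, across multiple target classes).

```python
from collections import defaultdict


def true_positive_rates(y_true, y_pred, groups, positive=1):
    """Return {group: TPR}, counting only instances whose gold label is positive."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for gold, pred, group in zip(y_true, y_pred, groups):
        if gold == positive:
            totals[group] += 1
            hits[group] += int(pred == positive)
    return {group: hits[group] / totals[group] for group in totals}


def equal_opportunity_gap(y_true, y_pred, groups, positive=1):
    """Largest difference in TPR between any two groups (0.0 means parity)."""
    rates = true_positive_rates(y_true, y_pred, groups, positive)
    return max(rates.values()) - min(rates.values())


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b"]

rates = true_positive_rates(y_true, y_pred, groups)
gap = equal_opportunity_gap(y_true, y_pred, groups)
print(rates)  # {'a': 0.75, 'b': 0.5}
print(gap)    # 0.25
```

Because the rate is computed only over instances with a positive gold label, the measure is conditioned on the target class. This is the dependency that a discriminator seeing only hidden representations cannot model directly.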

Code & Models

Repositories

HanXudong/fairlib
PyTorch · Official

Taxonomy

Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)