Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations
Alex Beutel, Jilin Chen, Zhe Zhao, Ed H. Chi

TL;DR
This paper explores how adversarial training can be used to learn fair representations when the sensitive attribute is unavailable at serving time and expensive to obtain during training, and shows that the choice of data for the adversary shapes the resulting fairness outcomes.
Contribution
It investigates how the choice of data affects adversarial fairness training, finding that a small amount of data is enough to train the adversary and that the distribution of that data influences which fairness measure the model ends up satisfying.
Findings
A small sample of data is sufficient to train the adversary effectively.
The distribution of the data given to the adversary significantly influences the notion of fairness the model learns.
Adversarially trained models can improve fairness without access to the sensitive attribute at serving time.
Abstract
How can we learn a classifier that is "fair" for a protected or sensitive group, when we do not know if the input to the classifier belongs to the protected group? How can we train such a classifier when data on the protected group is difficult to attain? In many settings, finding out the sensitive input attribute can be prohibitively expensive even during model training, and sometimes impossible during model serving. For example, in recommender systems, if we want to predict if a user will click on a given recommendation, we often do not know many attributes of the user, e.g., race or age, and many attributes of the content are hard to determine, e.g., the language or topic. Thus, it is not feasible to use a different classifier calibrated based on knowledge of the sensitive attribute. Here, we use an adversarial training procedure to remove information about the sensitive attribute…
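The abstract describes removing information about the sensitive attribute through an adversarial training procedure. One common way to set this up is a shared encoder with two heads, where the encoder is updated to help the label head and to hurt the adversary head (gradient reversal). The sketch below is a minimal illustration under that assumption, not the authors' implementation; the function `train_adversarial`, the parameters `lam` and `s_mask`, and the tiny tanh encoder are all hypothetical choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_adversarial(X, y, s, s_mask, hidden=8, lam=1.0, lr=0.1, steps=200, seed=0):
    """Hypothetical sketch: shared tanh encoder, a label head, and an adversary head.

    X: (n, d) features; y: (n,) binary labels; s: (n,) binary sensitive attribute,
    used only where s_mask is True (the small sample with known attribute).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, hidden))  # encoder weights
    u = np.zeros(hidden)                         # label head weights
    v = np.zeros(hidden)                         # adversary head weights
    Xs, ss = X[s_mask], s[s_mask]

    for _ in range(steps):
        # Main task: forward and backward pass on all examples (mean log loss).
        H = np.tanh(X @ W)
        p = sigmoid(H @ u)
        g_logit = (p - y) / n
        grad_u = H.T @ g_logit
        grad_W_task = X.T @ (np.outer(g_logit, u) * (1.0 - H ** 2))

        # Adversary: predicts s from the representation, on the small subset only.
        Hs = np.tanh(Xs @ W)
        q = sigmoid(Hs @ v)
        a_logit = (q - ss) / len(ss)
        grad_v = Hs.T @ a_logit
        grad_W_adv = Xs.T @ (np.outer(a_logit, v) * (1.0 - Hs ** 2))

        # Each head descends its own loss. The encoder descends the task loss
        # and ascends the adversary loss (gradient reversal, scaled by lam).
        u -= lr * grad_u
        v -= lr * grad_v
        W -= lr * (grad_W_task - lam * grad_W_adv)

    return W, u, v
```

In this sketch, `s_mask` selects the examples whose sensitive attribute is known, so varying how many examples it selects, and how they are distributed across groups and labels, is one way to probe the data decisions the paper studies. At serving time only `W` and `u` are needed, so the sensitive attribute is never an input to the prediction.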
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
