Learning Fair Representations via Rate-Distortion Maximization
Somnath Basu Roy Chowdhury, Snigdha Chaturvedi

TL;DR
This paper introduces FaRM, a method for learning fair text representations that removes demographic information by maximizing a rate-distortion function. It can handle multiple protected attributes at once and reduces the information recoverable by probing attacks.
Contribution
FaRM is a new debiasing technique that removes protected attribute information by making the representations of instances in the same protected attribute class uncorrelated, using rate-distortion maximization. It works with or without a target task.
Findings
Achieves state-of-the-art debiasing performance on multiple datasets.
Learned representations leak significantly less protected attribute information when attacked by a non-linear probing network.
Effective for multiple protected attributes simultaneously.
Abstract
Text representations learned by machine learning models often encode undesirable demographic information of the user. Predictive models based on these representations can rely on such information, resulting in biased decisions. We present a novel debiasing technique, Fairness-aware Rate Maximization (FaRM), that removes protected information by making representations of instances belonging to the same protected attribute class uncorrelated, using the rate-distortion function. FaRM is able to debias representations with or without a target task at hand. FaRM can also be adapted to remove information about multiple protected attributes simultaneously. Empirical evaluations show that FaRM achieves state-of-the-art performance on several datasets, and learned representations leak significantly less protected attribute information against an attack by a non-linear probing network.
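To make the central quantity more concrete, here is a minimal sketch of a rate-distortion (coding rate) estimate for a batch of representations. The abstract does not state the exact formula, so this sketch assumes the log-determinant form common in rate-reduction work, R(Z, eps) = 1/2 log det(I + d / (n eps^2) Z^T Z). The function name `coding_rate`, the value of `eps`, and the toy data are illustrative assumptions, not the authors' implementation, and no training loop is shown.

```python
import numpy as np


def coding_rate(Z, eps=0.5):
    """Log-det coding rate estimate for an (n, d) matrix of representations.

    Assumed form: 0.5 * logdet(I + d / (n * eps^2) * Z^T Z).
    `eps` is a hypothetical distortion tolerance chosen for illustration.
    """
    n, d = Z.shape
    gram = Z.T @ Z  # (d, d) second-moment matrix of the batch
    # slogdet returns (sign, log|det|); the matrix is positive definite here.
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * gram)
    return 0.5 * logdet


rng = np.random.default_rng(0)

# Toy "same protected class" batch whose vectors all lie along one direction.
direction = rng.normal(size=(1, 8))
collapsed = rng.normal(size=(200, 1)) @ direction

# Toy batch whose vectors are spread over all 8 dimensions.
spread = rng.normal(size=(200, 8))

# Match total energy (Frobenius norm) so only the geometry differs.
collapsed *= np.linalg.norm(spread) / np.linalg.norm(collapsed)

rate_collapsed = coding_rate(collapsed)
rate_spread = coding_rate(spread)
print(rate_collapsed, rate_spread)
```

At equal total energy, the spread batch has the higher rate. Under the assumptions above, maximizing this quantity for instances that share a protected attribute value pushes their representations apart, which matches the abstract's description of making same-class representations uncorrelated.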
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning
