Towards classification parity across cohorts
Aarsh Patel, Rahul Gupta, Mukund Harakere, Satyapriya Krishna, Aman Alok, Peng Liu

TL;DR
This paper explores achieving fairness in machine learning by ensuring classification parity across both explicit sensitive attributes and implicit cohorts derived from language usage, introducing a loss function modification to improve fairness.
Contribution
It introduces a method to define implicit cohorts via language embeddings and proposes a loss function adjustment to enhance classification fairness across all cohort types.
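The summary does not spell out the exact form of the loss adjustment, so the following is only a minimal sketch of one way such a modification could look, not the authors' formulation. It assumes a binary classifier, that each example already carries a cohort id (explicit, or implicit, e.g. a cluster index over language embeddings), and a hypothetical penalty equal to the gap between the worst and best per-cohort mean loss. The function name `cohort_parity_loss` and the weight `lam` are illustrative assumptions.

```python
import math


def cohort_parity_loss(probs, labels, cohorts, lam=1.0):
    """Mean binary cross-entropy plus lam * (worst - best per-cohort mean loss).

    ASSUMPTION: the gap penalty is an illustrative choice; the paper's actual
    loss modification is not specified in this summary.
    """
    eps = 1e-12  # guards against log(0)
    per_example = [
        -(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps)))
        for p, y in zip(probs, labels)
    ]

    # Group per-example losses by cohort id.
    by_cohort = {}
    for loss, cohort in zip(per_example, cohorts):
        by_cohort.setdefault(cohort, []).append(loss)
    cohort_means = {c: sum(v) / len(v) for c, v in by_cohort.items()}

    base = sum(per_example) / len(per_example)
    gap = max(cohort_means.values()) - min(cohort_means.values())
    return base + lam * gap, cohort_means


# Usage: cohort 1 is predicted less confidently than cohort 0,
# so the penalty term raises the total loss.
total, means = cohort_parity_loss(
    probs=[0.9, 0.8, 0.6, 0.4],
    labels=[1, 1, 1, 1],
    cohorts=[0, 0, 1, 1],
    lam=1.0,
)
```

With `lam=0` this reduces to ordinary cross-entropy; larger values trade average loss for smaller differences between cohorts.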
Findings
Discovered performance disparities across explicit and implicit cohorts.
Improved classification parity with a modified loss function.
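As a rough illustration of how a performance disparity across cohorts might be measured, the sketch below computes accuracy per cohort and the gap between the best and worst cohort. The metric choice (accuracy) and the helper names `accuracy_by_cohort` and `parity_gap` are assumptions made for this example; the summary does not state which metric the paper reports.

```python
from collections import defaultdict


def accuracy_by_cohort(preds, labels, cohorts):
    """Return {cohort_id: accuracy} for predictions grouped by cohort."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for pred, label, cohort in zip(preds, labels, cohorts):
        totals[cohort] += 1
        hits[cohort] += int(pred == label)
    return {c: hits[c] / totals[c] for c in totals}


def parity_gap(per_cohort_accuracy):
    """Difference between the best and worst cohort accuracy (0 means parity)."""
    values = per_cohort_accuracy.values()
    return max(values) - min(values)


# Usage with two hypothetical cohorts "a" and "b".
acc = accuracy_by_cohort(
    preds=[1, 0, 1, 1],
    labels=[1, 0, 0, 1],
    cohorts=["a", "a", "b", "b"],
)
gap = parity_gap(acc)
```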
Abstract
Recently, there has been a lot of interest in ensuring algorithmic fairness in machine learning, where the central question is how to prevent sensitive information (e.g., knowledge about the ethnic group of an individual) from adding "unfair" bias to a learning algorithm (Feldman et al. (2015), Zemel et al. (2013)). This has led to several debiasing algorithms for word embeddings (Qian et al. (2019), Bolukbasi et al. (2016)), coreference resolution (Zhao et al. (2018a)), semantic role labeling (Zhao et al. (2017)), etc. Most existing work deals with explicit sensitive features such as gender, occupation, or race, an approach that does not work for data where such features are not captured due to privacy concerns. In this research work, we aim to achieve classification parity across explicit as well as implicit sensitive features. We define explicit cohorts as groups of people based on…
Taxonomy
Topics: Ethics and Social Impacts of AI · Artificial Intelligence in Healthcare and Education · Hate Speech and Cyberbullying Detection
