A Survey on Bias and Fairness in Natural Language Processing
Rajas Bansal

TL;DR
This survey reviews the origins of bias in NLP models, discusses fairness definitions, and evaluates mitigation strategies, emphasizing the importance of addressing social biases as NLP systems become more embedded in daily life.
Contribution
It provides a comprehensive overview of bias sources, fairness concepts, and mitigation techniques in NLP, guiding future research towards bias eradication.
Findings
Biases originate from data and model design.
Various fairness definitions exist and are applied in NLP.
Mitigation strategies include data balancing and model adjustments.
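To make the last two findings concrete, here is a minimal Python sketch of one common fairness measure and one simple data-balancing scheme. Neither is taken from the survey itself: demographic parity is assumed here as an illustrative fairness definition, inverse-frequency reweighting is assumed as an illustrative form of data balancing, and the function names and toy data are hypothetical.

```python
from collections import Counter


def demographic_parity_gap(predictions, groups):
    """Return (gap, rates): the largest difference in positive-prediction
    rate between any two groups, plus the per-group rates.

    Assumption: binary predictions (1 = positive) and one group label
    per example. A gap of 0 means every group receives positive
    predictions at the same rate.
    """
    totals = Counter(groups)
    positives = Counter(g for p, g in zip(predictions, groups) if p == 1)
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates


def balancing_weights(groups):
    """Return per-example weights so that each group contributes the
    same total weight (inverse-frequency reweighting)."""
    totals = Counter(groups)
    n_examples = len(groups)
    n_groups = len(totals)
    return [n_examples / (n_groups * totals[g]) for g in groups]


# Hypothetical toy data: classifier outputs for two groups, "a" and "b".
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = demographic_parity_gap(preds, groups)

# Hypothetical imbalanced dataset: three examples of "a", one of "b".
weights = balancing_weights(["a", "a", "a", "b"])
```

In the toy data, group "a" receives positive predictions at a rate of 0.75 and group "b" at 0.25, so the gap is 0.5. The reweighting gives the single "b" example the same total weight as the three "a" examples combined.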
Abstract
As NLP models become more integrated with the everyday lives of people, it becomes important to examine the social effect that the usage of these systems has. While these models understand language and have increased accuracy on difficult downstream tasks, there is evidence that these models amplify gender, racial and cultural stereotypes and lead to a vicious cycle in many settings. In this survey, we analyze the origins of biases, the definitions of fairness, and how different subfields of NLP mitigate bias. We finally discuss how future studies can work towards eradicating pernicious biases from NLP algorithms.
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
