Using Adversarial Debiasing to Remove Bias from Word Embeddings
Dana Kenna

TL;DR
This paper investigates the effectiveness of Adversarial Debiasing in removing societal biases from word embeddings, showing it may be more effective than existing superficial bias removal methods.
Contribution
The paper provides experimental evidence suggesting Adversarial Debiasing can more thoroughly reduce bias in word embeddings compared to prior approaches.
Findings
Adversarial Debiasing shows deeper bias removal than previous methods
Experimental results indicate improved bias mitigation effectiveness
Motivates further research into Adversarial Debiasing utility
Abstract
Word Embeddings have been shown to contain the societal biases present in the original corpora. Existing methods to deal with this problem have been shown to only remove superficial biases. The method of Adversarial Debiasing was presumed to be similarly superficial, but this was not verified in previous works. Using the experiments that demonstrated the shallow removal in other methods, I show results that suggest Adversarial Debiasing is more effective at removing bias and thus motivate further investigation into the utility of Adversarial Debiasing.
Taxonomy
Topics: Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection · Topic Modeling
