TL;DR
This paper extends a state-of-the-art monolingual debiasing method for word embeddings to the multilingual setting and reports state-of-the-art results for Hindi, Bengali, Telugu, and English. The authors expect this to support less biased downstream NLP applications.
Contribution
It adapts the current state-of-the-art debiasing method for monolingual word embeddings so that it generalizes to multilingual embeddings, with a focus on three Indian languages alongside English.
Findings
State-of-the-art debiasing performance for Hindi, Bengali, Telugu, and English.
Several bias-quantification methods and debiasing approaches are compared in both monolingual and multilingual settings.
The effect of bias mitigation is demonstrated on downstream NLP applications.
Abstract
In this paper, we advance the current state-of-the-art method for debiasing monolingual word embeddings so that it generalizes well to a multilingual setting. We consider different methods to quantify bias and different debiasing approaches for both monolingual and multilingual settings. We demonstrate the significance of our bias-mitigation approach on downstream NLP applications. Our proposed methods establish state-of-the-art performance for debiasing multilingual embeddings for three Indian languages (Hindi, Bengali, and Telugu) in addition to English. We believe that our work will open up new opportunities for building unbiased downstream NLP applications, which inherently depend on the quality of the word embeddings used.
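The abstract does not specify which bias metric or debiasing procedure the paper uses. Purely as an illustration of the general idea, the sketch below shows one widely used technique, projection-based neutralization (often called hard debiasing): estimate a bias direction from definitional word pairs, score a word by its cosine with that direction, and remove the component along it. The toy vectors, the word pair, and all function names are assumptions made for this example and are not taken from the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def bias_direction(pairs, emb):
    """Unit vector along the summed differences of definitional pairs."""
    dim = len(next(iter(emb.values())))
    total = [0.0] * dim
    for a, b in pairs:
        total = [t + (x - y) for t, x, y in zip(total, emb[a], emb[b])]
    return normalize(total)

def bias_score(word, direction, emb):
    """Cosine between the word vector and the bias direction."""
    return dot(normalize(emb[word]), direction)

def neutralize(word, direction, emb):
    """Remove the component of the word vector along the bias direction."""
    v = emb[word]
    proj = dot(v, direction)
    return [a - proj * d for a, d in zip(v, direction)]

# Hypothetical 3-dimensional toy embeddings, for illustration only.
toy_emb = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "doctor": [0.3, 0.9, 0.1],
}

direction = bias_direction([("he", "she")], toy_emb)
before = bias_score("doctor", direction, toy_emb)

debiased_emb = dict(toy_emb, doctor=neutralize("doctor", direction, toy_emb))
after = bias_score("doctor", direction, debiased_emb)

print(f"bias before: {before:.3f}, after: {after:.3f}")
```

In a multilingual setting, the same projection could in principle be applied to embeddings aligned into a shared space, but whether and how the paper does this is not stated in the abstract.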
