TL;DR
This paper shows that grammatical gender shapes the word embeddings of inanimate nouns in gender-marking languages, and that neutralizing grammatical gender signals in the training context removes this effect and improves embedding quality.
Contribution
It introduces a language-specific approach that neutralizes grammatical gender signals in the context used to train word embeddings, and shows that this removes the grammatical gender bias and improves embedding quality.
Findings
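Grammatical gender affects the embeddings of inanimate nouns: nouns with the same gender lie closer to each other than nouns with different genders.
Standard "embedding debiasing" methods fail to remove this effect, whereas neutralizing grammatical gender signals in the words' context during training removes it.
Removing the grammatical gender signal improves embedding quality in both monolingual and cross-lingual settings.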
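One way to quantify the effect of grammatical gender on inanimate noun embeddings is to compare the average cosine similarity of same-gender noun pairs with that of different-gender pairs. The sketch below is an illustration only, not the paper's evaluation protocol: the function name `gender_gap`, the toy three-dimensional vectors, and the Spanish example nouns are all assumptions made for the example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_gap(embeddings, genders):
    """Mean similarity of same-gender pairs minus mean similarity of
    different-gender pairs. A positive value means same-gender nouns
    sit closer together on average."""
    same, diff = [], []
    for w1, w2 in combinations(sorted(embeddings), 2):
        sim = cosine(embeddings[w1], embeddings[w2])
        (same if genders[w1] == genders[w2] else diff).append(sim)
    return sum(same) / len(same) - sum(diff) / len(diff)


# Hypothetical toy vectors, chosen by hand so that gender dominates.
toy_embeddings = {
    "mesa": [0.9, 0.1, 0.3],   # feminine
    "silla": [0.8, 0.2, 0.4],  # feminine
    "libro": [0.1, 0.9, 0.3],  # masculine
    "vaso": [0.2, 0.8, 0.4],   # masculine
}
toy_genders = {"mesa": "f", "silla": "f", "libro": "m", "vaso": "m"}

gap = gender_gap(toy_embeddings, toy_genders)
print(f"same-gender minus different-gender similarity: {gap:.3f}")
```

With real embeddings, a gap close to zero after retraining would suggest that the grammatical gender signal has been reduced, under the assumption that the noun sample is balanced across genders and topics.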
Abstract
Many natural languages assign grammatical gender also to inanimate nouns in the language. In such languages, words that relate to the gender-marked nouns are inflected to agree with the noun's gender. We show that this affects the word representations of inanimate nouns, resulting in nouns with the same gender being closer to each other than nouns with different gender. While "embedding debiasing" methods fail to remove the effect, we demonstrate that a careful application of methods that neutralize grammatical gender signals from the words' context when training word embeddings is effective in removing it. Fixing the grammatical gender bias yields a positive effect on the quality of the resulting word embeddings, both in monolingual and cross-lingual settings. We note that successfully removing gender signals, while achievable, is not trivial to do and that a language-specific…
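As a rough illustration of what neutralizing gender signals in a word's context could look like, the sketch below replaces gender-marked determiners with gender-neutral placeholder tokens before a corpus is passed to an embedding trainer. This is a deliberately simplified assumption, not the procedure used in the paper: the mapping `NEUTRAL_FORMS`, the placeholder tokens, and the function `neutralize_context` are hypothetical, and a real language would also need adjectives, pronouns, and other agreeing words handled.

```python
# Hypothetical mapping from gender-marked Spanish determiners to
# placeholders that keep number and definiteness but drop gender.
NEUTRAL_FORMS = {
    "el": "DET_SG",
    "la": "DET_SG",
    "los": "DET_PL",
    "las": "DET_PL",
    "un": "INDEF_SG",
    "una": "INDEF_SG",
}


def neutralize_context(tokens, mapping=NEUTRAL_FORMS):
    """Replace gender-marked context tokens with neutral placeholders.
    Tokens that are not in the mapping are returned unchanged."""
    return [mapping.get(tok.lower(), tok) for tok in tokens]


sentence = "La mesa y el libro".split()
neutral = neutralize_context(sentence)
print(neutral)  # ['DET_SG', 'mesa', 'y', 'DET_SG', 'libro']
```

After this step, "mesa" and "libro" share the same determiner token in their contexts, so a context-based embedding model no longer sees the article as evidence of gender. The coverage of such a mapping differs from language to language, which is consistent with the abstract's remark that removing gender signals is not trivial.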
