gENder-IT: An Annotated English-Italian Parallel Challenge Set for Cross-Linguistic Natural Gender Phenomena
Eva Vanmassenhove, Johanna Monti

TL;DR
This paper introduces gENder-IT, an English-Italian challenge set for natural gender phenomena in translation, intended to support the identification and resolution of gender ambiguity in machine translation (MT) systems.
Contribution
gENder-IT pairs word-level gender tags on the English source side with multiple gendered translations, where needed, on the Italian target side. The authors present it as the first such resource for English-Italian, supporting research on gender ambiguity resolution in MT.
Findings
Creates a new challenge set for gender resolution in translation
Provides annotated gender tags and multiple translations
Supports development of gender-aware MT systems
Abstract
Languages differ in terms of the absence or presence of gender features, the number of gender classes and whether and where gender features are explicitly marked. These cross-linguistic differences can lead to ambiguities that are difficult to resolve, especially for sentence-level MT systems. The identification of ambiguity and its subsequent resolution is a challenging task for which currently there aren't any specific resources or challenge sets available. In this paper, we introduce gENder-IT, an English-Italian challenge set focusing on the resolution of natural gender phenomena by providing word-level gender tags on the English source side and multiple gender alternative translations, where needed, on the Italian target side.
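To make the structure described in the abstract concrete, here is a minimal Python sketch of how one entry of such a challenge set could be represented: English tokens carrying optional word-level gender tags, plus one or more Italian translations. Everything in it is an assumption for illustration only: the class names, the tag labels ("F", "M", "A" for ambiguous), and the example sentences are hypothetical and are not taken from the gENder-IT release, whose actual file format may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical tag labels; the real dataset may use a different scheme.
FEMININE, MASCULINE, AMBIGUOUS = "F", "M", "A"


@dataclass
class TaggedToken:
    """One English source token with an optional word-level gender tag."""
    text: str
    gender: Optional[str] = None  # None means the token carries no gender tag


@dataclass
class ChallengeEntry:
    """An English source sentence with its Italian translation(s)."""
    source: List[TaggedToken]
    # Maps a gender label to the Italian translation for that reading.
    translations: Dict[str, str] = field(default_factory=dict)

    def is_ambiguous(self) -> bool:
        """True if any source token is tagged as gender-ambiguous."""
        return any(tok.gender == AMBIGUOUS for tok in self.source)


# Invented example: "friend" is ambiguous in English, so two Italian
# alternatives are stored.
ambiguous_entry = ChallengeEntry(
    source=[
        TaggedToken("My"),
        TaggedToken("friend", AMBIGUOUS),
        TaggedToken("is"),
        TaggedToken("tired"),
    ],
    translations={
        FEMININE: "La mia amica è stanca",
        MASCULINE: "Il mio amico è stanco",
    },
)

# Invented example: "She" fixes the gender, so one translation suffices.
resolved_entry = ChallengeEntry(
    source=[TaggedToken("She", FEMININE), TaggedToken("is"), TaggedToken("tired")],
    translations={FEMININE: "Lei è stanca"},
)
```

Under these assumptions, an evaluation script could use `is_ambiguous()` to check that a system is only expected to produce several gendered alternatives when the source sentence itself leaves the gender open.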
Taxonomy
Topics: Natural Language Processing Techniques · Gender Studies in Language · Text Readability and Simplification
