Translate With Care: Addressing Gender Bias, Neutrality, and Reasoning in Large Language Model Translations

Pardis Sadat Zahraei; Ali Emami

arXiv:2506.00748·cs.CL·June 3, 2025


TL;DR

This paper introduces the TWC dataset to evaluate gender bias and reasoning in machine translation, revealing widespread biases in current models and demonstrating that fine-tuning can significantly reduce these biases and improve translation quality.

Contribution

The paper presents the TWC dataset for assessing gender bias in translation and shows that fine-tuning models on this data reduces bias and enhances translation fairness and accuracy.

Findings

01. Models prefer masculine pronouns in gender-neutral contexts.

02. Fine-tuning on TWC reduces gender bias and stereotyping.

03. Open-source models can outperform proprietary systems after fine-tuning.

Abstract

Addressing gender bias and maintaining logical coherence in machine translation remains challenging, particularly when translating between natural gender languages, like English, and genderless languages, such as Persian, Indonesian, and Finnish. We introduce the Translate-with-Care (TWC) dataset, comprising 3,950 challenging scenarios across six low- to mid-resource languages, to assess translation systems' performance. Our analysis of diverse technologies, including GPT-4, mBART-50, NLLB-200, and Google Translate, reveals a universal struggle in translating genderless content, resulting in gender stereotyping and reasoning errors. All models preferred masculine pronouns when gender stereotypes could influence choices. Google Translate and GPT-4 showed particularly strong bias, favoring male pronouns 4-6 times more than feminine ones in leadership and professional success contexts.…
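To make the "4-6 times" figure concrete, here is a minimal sketch of how a masculine-to-feminine pronoun ratio could be computed over a batch of English outputs. It is not the paper's evaluation code: the pronoun sets, the function names `pronoun_counts` and `masculine_to_feminine_ratio`, and the sample sentences are all assumptions made for illustration, and it presumes the system outputs are English sentences translated from a genderless source.

```python
import re

# Hypothetical pronoun sets; the paper's own protocol may use different ones.
MASCULINE = {"he", "him", "his", "himself"}
FEMININE = {"she", "her", "hers", "herself"}
NEUTRAL = {"they", "them", "their", "theirs", "themself", "themselves"}


def pronoun_counts(translations):
    """Count masculine, feminine, and neutral pronouns across output sentences."""
    counts = {"masculine": 0, "feminine": 0, "neutral": 0}
    for text in translations:
        # Lowercase and split into alphabetic tokens (apostrophes kept).
        for token in re.findall(r"[a-z']+", text.lower()):
            if token in MASCULINE:
                counts["masculine"] += 1
            elif token in FEMININE:
                counts["feminine"] += 1
            elif token in NEUTRAL:
                counts["neutral"] += 1
    return counts


def masculine_to_feminine_ratio(counts):
    """Return masculine/feminine; infinity if only masculine pronouns occur."""
    if counts["feminine"] == 0:
        return float("inf") if counts["masculine"] else 0.0
    return counts["masculine"] / counts["feminine"]


# Made-up outputs, not data from the paper.
sample_outputs = [
    "He is a great leader.",
    "He said his team won.",
    "She is a nurse.",
    "They finished their work.",
]
counts = pronoun_counts(sample_outputs)
ratio = masculine_to_feminine_ratio(counts)
```

A ratio well above 1 on sources that carry no gender information would point to a masculine default. Token counting of this kind ignores context, so it is only a rough proxy for the per-scenario judgments a benchmark like TWC is built to make.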


Code & Models

Repositories

pardissz/translate-with-care (Official)

Videos

Translate With Care: Addressing Gender Bias, Neutrality, and Reasoning in Large Language Model Translations (Underline)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Linear Layer · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Adam · Softmax · Label Smoothing · Multi-Head Attention · Attention Is All You Need · Dropout