Towards Massive Multilingual Holistic Bias
Xiaoqing Ellen Tan, Prangthip Hansanti, Carleigh Wood, Bokai Yu, Christophe Ropers, Marta R. Costa-jussà

TL;DR
This paper introduces the MASSIVE MULTILINGUAL HOLISTICBIAS (MMHB) dataset, with approximately 6 million sentences covering 13 demographic axes in an initial eight languages, as a benchmark for evaluating and mitigating demographic biases in multilingual language models.
Contribution
It presents a scalable, multilingual dataset construction methodology and demonstrates its use in analyzing gender bias and toxicity in machine translation.
Findings
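Gender robustness: machine translation performance is, on average, +4 chrF points higher for masculine sentences.
Models overgeneralize to masculine forms, with a +12 chrF point difference when evaluated against masculine rather than feminine references.
MMHB sentences trigger added toxicity of up to 2.3% in translation.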
Abstract
In the current landscape of automatic language generation, there is a need to understand, evaluate, and mitigate demographic biases as existing models are becoming increasingly multilingual. To address this, we present the initial eight languages from the MASSIVE MULTILINGUAL HOLISTICBIAS (MMHB) dataset and benchmark consisting of approximately 6 million sentences representing 13 demographic axes. We propose an automatic construction methodology to further scale up MMHB sentences in terms of both language coverage and size, leveraging limited human annotation. Our approach utilizes placeholders in multilingual sentence construction and employs a systematic method to independently translate sentence patterns, nouns, and descriptors. Combined with human translation, this technique carefully designs placeholders to dynamically generate multiple sentence variations and significantly reduces…
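The placeholder approach described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the Spanish toy lexicon, the `{noun}` and `{descriptor}` placeholder names, and the `generate` helper are hypothetical and are not taken from the MMHB release. The idea shown is only that patterns, nouns, and descriptors are translated once each, with their gendered forms, and then combined automatically.

```python
from itertools import product

# Hypothetical toy data (not from MMHB): each unit is translated once,
# independently, with one surface form per grammatical gender.
PATTERNS = ["Soy {noun} {descriptor}.", "Conozco a {noun} {descriptor}."]
NOUNS = [
    {"masculine": "un amigo", "feminine": "una amiga"},
    {"masculine": "un abuelo", "feminine": "una abuela"},
]
DESCRIPTORS = [
    {"masculine": "sordo", "feminine": "sorda"},
    {"masculine": "joven", "feminine": "joven"},  # invariant form
]


def generate(patterns, nouns, descriptors):
    """Fill every pattern with every noun/descriptor pair, keeping the
    noun and descriptor in gender agreement."""
    sentences = []
    for pattern, noun, descriptor in product(patterns, nouns, descriptors):
        for gender in ("masculine", "feminine"):
            text = pattern.format(noun=noun[gender], descriptor=descriptor[gender])
            sentences.append((gender, text))
    return sentences


sentences = generate(PATTERNS, NOUNS, DESCRIPTORS)
for gender, text in sentences[:4]:
    print(gender, text)
```

In this toy setting, six translated units (two patterns, two nouns, two descriptors) expand into 16 sentences, which is the kind of multiplication that lets limited human translation cover a large sentence set.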
Taxonomy
Topics: Linguistics and terminology studies · Interpreting and Communication in Healthcare · Translation Studies and Practices
