Debiasing Multilingual LLMs in Cross-lingual Latent Space

Qiwei Peng; Guimin Hu; Yekun Chai; Anders Søgaard

arXiv:2508.17948·cs.CL·August 26, 2025

TL;DR

This paper debiases multilingual large language models in a joint cross-lingual latent space rather than directly on the model's representations, which improves both debiasing effectiveness and transferability across languages.

Contribution

The paper constructs a well-aligned cross-lingual latent space with an autoencoder trained on parallel TED talk scripts, then applies debiasing techniques within that space to improve cross-lingual debiasing performance.

Findings

01

Autoencoders effectively create aligned cross-lingual latent spaces.

02

Debiasing in the latent space improves overall debiasing effectiveness.

03

Cross-lingual transferability of debiasing techniques is significantly enhanced.
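
One common way to check whether a latent space is "well-aligned" across languages is parallel-sentence retrieval: each sentence's latent vector should be closest to the vector of its own translation. The sketch below illustrates that idea on synthetic vectors. The metric, the function name `retrieval_accuracy`, and the toy data are assumptions for illustration; this summary does not say how the paper measures alignment.

```python
import numpy as np

def retrieval_accuracy(src, tgt):
    """Fraction of rows in `src` whose cosine nearest neighbour in `tgt`
    is the row with the same index (i.e. its parallel sentence)."""
    src_n = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt_n = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sims = src_n @ tgt_n.T                      # (n_src, n_tgt) cosine matrix
    nearest = np.argmax(sims, axis=1)
    return float(np.mean(nearest == np.arange(len(src))))

# Synthetic stand-ins for latent vectors of 50 parallel sentences.
rng = np.random.default_rng(0)
en = rng.normal(size=(50, 16))
fr_aligned = en + 0.01 * rng.normal(size=en.shape)   # near-identical latents
fr_misaligned = fr_aligned[::-1]                     # pairing deliberately broken

acc_aligned = retrieval_accuracy(en, fr_aligned)
acc_misaligned = retrieval_accuracy(en, fr_misaligned)
```

A score near 1.0 means translations land close together in the latent space; a score near chance means the space does not align the two languages.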

Abstract

Debiasing techniques such as SentDebias aim to reduce bias in large language models (LLMs). Previous studies have evaluated their cross-lingual transferability by directly applying these methods to LLM representations, revealing their limited effectiveness across languages. In this work, we therefore propose to perform debiasing in a joint latent space rather than directly on LLM representations. We construct a well-aligned cross-lingual latent space using an autoencoder trained on parallel TED talk scripts. Our experiments with Aya-expanse and two debiasing techniques across four languages (English, French, German, Dutch) demonstrate that a) autoencoders effectively construct a well-aligned cross-lingual latent space, and b) applying debiasing techniques in the learned cross-lingual latent space significantly improves both the overall debiasing performance and cross-lingual…
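
The pipeline the abstract describes (encode into a shared latent space, debias there, decode back) can be sketched as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: the autoencoder is a closed-form linear one (PCA) rather than a trained network, the "LLM representations" are synthetic, and the bias-subspace step follows the general SentDebias recipe of taking principal directions of paired counterfactual differences. All function names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, latent_dim):
    """Closed-form linear autoencoder (PCA). Assumption: stands in for the
    trained autoencoder. Returns the data mean and an orthonormal basis W."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:latent_dim].T              # W has shape (hidden, latent)

def encode(X, mean, W):
    return (X - mean) @ W

def decode(Z, mean, W):
    return Z @ W.T + mean

def bias_subspace(z_a, z_b, k=1):
    """Top-k principal directions of pair-centred latents. For a pair (a, b)
    the centred vectors are +/-(a - b)/2, so an SVD of the half-differences
    yields the same directions."""
    diffs = (z_a - z_b) / 2.0
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k]                               # orthonormal rows, (k, latent)

def remove_bias(Z, V):
    """Project Z onto the orthogonal complement of the bias subspace V."""
    return Z - (Z @ V.T) @ V

# Synthetic "representations": low-rank content plus a +/- bias direction.
rng = np.random.default_rng(0)
hidden, n = 12, 200
content = rng.normal(size=(n, 4)) @ rng.normal(size=(4, hidden))
g = rng.normal(size=hidden)
g /= np.linalg.norm(g)
X_a, X_b = content + g, content - g             # counterfactual pairs

mean, W = fit_linear_autoencoder(np.vstack([X_a, X_b]), latent_dim=5)
Z_a, Z_b = encode(X_a, mean, W), encode(X_b, mean, W)
V = bias_subspace(Z_a, Z_b, k=1)
Z_a_deb, Z_b_deb = remove_bias(Z_a, V), remove_bias(Z_b, V)
X_a_deb = decode(Z_a_deb, mean, W)              # back to representation space
```

In this toy setup the two members of each pair differ only along the bias direction, so after projection their latent vectors coincide. With real model representations the bias is not confined to a single clean direction, and the subspace size `k` becomes a choice to tune.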
