Mitigating Translationese Bias in Multilingual LLM-as-a-Judge via Disentangled Information Bottleneck
Hongbin Zhang, Kehai Chen, Xuefen Bai, Youcheng Pan, Yang Xiang, Jinpeng Wang, and Min Zhang

TL;DR
This paper introduces DIBJudge, a fine-tuning framework that reduces translationese bias in multilingual LLM evaluation by disentangling bias factors from judgment-critical information, leading to fairer and more accurate assessments.
Contribution
The paper proposes a novel disentangled information bottleneck approach with a bias suppression mechanism to mitigate translationese bias in multilingual LLM evaluation.
Findings
DIBJudge significantly reduces translationese bias in evaluations.
The method outperforms existing baselines on multilingual benchmarks.
Disentanglement improves the robustness of LLM-based evaluation metrics.
Abstract
Large language models (LLMs) have become a standard for multilingual evaluation, yet they exhibit a severe, systematic translationese bias. In this paper, translationese bias is characterized as LLMs systematically favoring machine-translated text over human-authored references, particularly in low-resource languages. We attribute this bias to spurious correlations with (i) latent manifold alignment with English and (ii) cross-lingual predictability. To mitigate this bias, we propose DIBJudge, a robust fine-tuning framework that learns a minimally sufficient, judgment-critical representation via variational information compression, while explicitly isolating spurious factors into a dedicated bias branch. Furthermore, we incorporate a cross-covariance penalty that explicitly suppresses statistical dependence between robust and bias representations, thereby encouraging effective…
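The abstract names two regularizers without giving their formulas: a variational compression term and a cross-covariance penalty between the robust and bias representations. The sketch below shows one common form of each, in NumPy. It is an illustration under assumptions, not the paper's loss: the function names, the standard-normal prior, the squared Frobenius norm, and the toy dimensions are all hypothetical choices made here. Note also that a zero cross-covariance only rules out linear dependence between the two branches.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), averaged over the batch.

    Assumed form of the compression term: a diagonal Gaussian posterior
    pulled toward a standard-normal prior, as in a variational bottleneck.
    """
    kl_per_example = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)
    return float(np.mean(kl_per_example))


def cross_covariance_penalty(z_robust, z_bias):
    """Squared Frobenius norm of the batch cross-covariance matrix.

    z_robust: (batch, d_r) array, z_bias: (batch, d_b) array.
    The penalty is zero when every robust dimension is linearly
    uncorrelated with every bias dimension over the batch.
    """
    n = z_robust.shape[0]
    zr = z_robust - z_robust.mean(axis=0, keepdims=True)  # center each branch
    zb = z_bias - z_bias.mean(axis=0, keepdims=True)
    cross_cov = zr.T @ zb / (n - 1)  # shape (d_r, d_b)
    return float(np.sum(cross_cov**2))


# Toy data (hypothetical sizes): one bias branch unrelated to the robust
# branch, and one that copies part of it with a little noise.
rng = np.random.default_rng(0)
z_robust = rng.normal(size=(256, 8))
z_bias_independent = rng.normal(size=(256, 4))
z_bias_entangled = z_robust[:, :4] + 0.1 * rng.normal(size=(256, 4))

penalty_independent = cross_covariance_penalty(z_robust, z_bias_independent)
penalty_entangled = cross_covariance_penalty(z_robust, z_bias_entangled)
```

In this toy setup the entangled bias branch yields a much larger penalty than the independent one, which is the signal a training objective would minimize to push the two branches apart.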
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
