Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models

Zixiang Xu; Yanbo Wang; Yue Huang; Xiuying Chen; Jieyu Zhao; Meng Jiang; Xiangliang Zhang

arXiv:2505.18673 · cs.CL · May 27, 2025

TL;DR

This paper presents a new method that uses beam search and LLM-based simulation to find cross-lingual weaknesses in multilingual large language models, revealing accuracy drops of over 50% in target languages relative to English.

Contribution

The paper introduces an efficient methodology and a dataset of over 6,000 bilingual question pairs across 16 languages for probing cross-lingual weaknesses in LLMs, and highlights the impact of linguistic similarity and the potential for targeted improvements.

Findings

01 Accuracy drops of over 50% in target languages across a wide range of models

02 Linguistically related languages show similar weakness patterns

03 The method reveals weaknesses even in state-of-the-art models

Abstract

Large Language Models (LLMs) have achieved remarkable success in Natural Language Processing (NLP), yet their cross-lingual performance consistency remains a significant challenge. This paper introduces a novel methodology for efficiently identifying inherent cross-lingual weaknesses in LLMs. Our approach leverages beam search and LLM-based simulation to generate bilingual question pairs that expose performance discrepancies between English and target languages. We construct a new dataset of over 6,000 bilingual pairs across 16 languages using this methodology, demonstrating its effectiveness in revealing weaknesses even in state-of-the-art models. Extensive experiments demonstrate that our method precisely and cost-effectively pinpoints cross-lingual weaknesses, consistently revealing over 50% accuracy drops in target languages across a wide range of models. Moreover, further…
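To make the search idea in the abstract concrete, here is a minimal sketch of a generic beam search over candidate questions. It is not the authors' pipeline, which this page does not describe in detail. The names `beam_search`, `toy_expand`, and `toy_gap_score` are hypothetical, and the toy scorer merely stands in for an LLM-based simulation that would estimate how much worse a model answers a question in a target language than in English.

```python
from typing import Callable, List, Tuple


def beam_search(
    seeds: List[str],
    expand: Callable[[str], List[str]],
    gap_score: Callable[[str], float],
    beam_width: int = 2,
    depth: int = 2,
) -> List[Tuple[float, str]]:
    """Generic beam search: keep the `beam_width` best-scoring candidates per step.

    `gap_score` is an assumed stand-in for an estimate of the
    English-vs-target-language performance gap of a question.
    """
    beam = sorted(((gap_score(q), q) for q in seeds), reverse=True)[:beam_width]
    for _ in range(depth):
        candidates = list(beam)  # survivors may stay in the beam
        for _, question in beam:
            candidates.extend((gap_score(c), c) for c in expand(question))
        # Deduplicate, then keep the highest-scoring candidates.
        beam = sorted(set(candidates), reverse=True)[:beam_width]
    return beam


# Toy stand-ins (assumptions for illustration only).
def toy_expand(question: str) -> List[str]:
    """Hypothetical perturbation step: produce two variants of a question."""
    return [question + " a", question + " b"]


def toy_gap_score(question: str) -> float:
    """Hypothetical gap estimate: pretend each 'b' widens the gap by 0.1."""
    return question.count("b") / 10


result = beam_search(["q"], toy_expand, toy_gap_score, beam_width=2, depth=2)
best_score, best_question = result[0]
```

In a real setting, `expand` would propose rewritten or perturbed questions and `gap_score` would be replaced by whatever gap estimate the method actually uses. The beam width controls how many candidates are evaluated per step, and therefore the cost of the search.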

Code & Models

Repositories

xzx34/cross-lingual-pitfalls (Official)

Videos

Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques