The Reasoning Lingua Franca: A Double-Edged Sword for Multilingual AI
Alan Saji, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully

TL;DR
This paper investigates multilingual reasoning in large reasoning models (LRMs). Reasoning in English generally yields higher accuracy than reasoning in the language of the question, but it can introduce translation errors and raises interpretability concerns.
Contribution
It systematically compares an LRM's reasoning in English with its reasoning in the language of the question, highlighting the strengths and vulnerabilities of each strategy.
Findings
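English reasoning traces exhibit a substantially higher presence of cognitive behaviors than native-language traces.
Reasoning in English generally yields higher final-answer accuracy, and the gap widens as tasks become more complex.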
Translation errors can cause significant reasoning failures.
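To make the accuracy comparison in the findings concrete, here is a minimal sketch of how a per-language gap between English and native-language reasoning could be computed. It is not the authors' evaluation code: the record layout, the function name `accuracy_gap`, and the toy data are all assumptions made for illustration, and the numbers are placeholders rather than results from the paper.

```python
from collections import defaultdict


def accuracy_gap(records):
    """Return {question_language: English accuracy - native accuracy}.

    `records` is an iterable of (question_lang, reasoning_mode, is_correct)
    tuples, where reasoning_mode is "english" or "native".
    This layout is hypothetical, not taken from the paper.
    """
    # Per language and mode, track [number correct, number attempted].
    totals = defaultdict(lambda: {"english": [0, 0], "native": [0, 0]})
    for lang, mode, correct in records:
        bucket = totals[lang][mode]
        bucket[0] += int(correct)
        bucket[1] += 1

    gaps = {}
    for lang, modes in totals.items():
        en_correct, en_total = modes["english"]
        na_correct, na_total = modes["native"]
        # Skip languages that lack data for one of the two modes.
        if en_total == 0 or na_total == 0:
            continue
        gaps[lang] = en_correct / en_total - na_correct / na_total
    return gaps


# Made-up placeholder records, not data from the paper.
toy = [
    ("de", "english", True), ("de", "english", True),
    ("de", "native", True), ("de", "native", False),
]
print(accuracy_gap(toy))  # {'de': 0.5}
```

A positive value means English reasoning was more accurate for that question language; computing the gap separately per benchmark would show whether it grows with task difficulty.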
Abstract
Large Reasoning Models (LRMs) achieve strong performance on mathematical, scientific, and other question-answering tasks, but their multilingual reasoning abilities remain underexplored. When presented with non-English questions, LRMs often default to reasoning in English, raising concerns about interpretability and the handling of linguistic and cultural nuances. We systematically compare an LRM's reasoning in English versus the language of the question. Our evaluation spans two tasks: MGSM and GPQA Diamond. Beyond measuring answer accuracy, we also analyze cognitive attributes in the reasoning traces. We find that English reasoning traces exhibit a substantially higher presence of these cognitive behaviors, and that reasoning in English generally yields higher final-answer accuracy, with the performance gap increasing as tasks become more complex. However, this English-centric…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
