ReasonXL: Shifting LLM Reasoning Language Without Sacrificing Performance
Daniil Gurgurov, Tom Röhr, Sebastian von Rohrscheidt, Josef van Genabith, Alexander Löser, Simon Ostermann

TL;DR
This paper introduces ReasonXL, a large-scale multilingual reasoning dataset, and demonstrates how LLMs can be adapted to reason in different languages with minimal performance loss.
Contribution
It provides the first large-scale parallel corpus of reasoning traces across five European languages, a two-stage adaptation pipeline, and an analysis of how reasoning representations shift across model layers.
Findings
Models can be adapted to reason in a new target language with minimal knowledge loss.
RLVR achieves greater behavioral divergence with smaller parameter updates than SFT.
Early layers encode language identity, while upper layers adapt reasoning capabilities.
Abstract
Despite advances in multilingual capabilities, most large language models (LLMs) remain English-centric in their training and, crucially, in their production of reasoning traces. Even when tasked with non-English problems, these models predominantly reason in English, creating a fundamental mismatch for non-English usage scenarios. We address this disparity directly with three contributions. (i) We introduce ReasonXL, the first large-scale parallel corpus of cross-domain reasoning traces spanning five European languages (English, German, French, Italian, and Spanish), with over two million aligned samples per language, each comprising prompts, reasoning traces, and final outputs, enabling direct supervision of language-specific reasoning. (ii) Using ReasonXL, we demonstrate that LLMs can be adapted to reason entirely in a desired target language, using a simple two-stage pipeline of…
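To make the corpus description concrete, here is a minimal sketch of how one aligned sample might be represented. The abstract only states that each sample comprises a prompt, a reasoning trace, and a final output, aligned across the five languages; the class names, field names, language codes, and the `supervision_pair` helper below are hypothetical illustrations, not the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumed ISO-style codes for the five languages named in the abstract.
LANGUAGES = ("en", "de", "fr", "it", "es")


@dataclass
class ReasoningSample:
    """One language version of a sample (field names are hypothetical)."""
    prompt: str
    reasoning_trace: str
    final_output: str


@dataclass
class ParallelRecord:
    """The same sample rendered in several languages, keyed by language code."""
    sample_id: str
    by_language: Dict[str, ReasoningSample]

    def is_fully_aligned(self) -> bool:
        # True only if every one of the five languages is present.
        return set(self.by_language) == set(LANGUAGES)

    def supervision_pair(self, target_lang: str) -> Tuple[str, str]:
        # Input is the prompt; the target is the trace followed by the
        # final output, both in the requested language.
        sample = self.by_language[target_lang]
        target = sample.reasoning_trace + "\n" + sample.final_output
        return sample.prompt, target


# Toy record with placeholder text standing in for real corpus content.
record = ParallelRecord(
    sample_id="toy-0",
    by_language={
        lang: ReasoningSample(
            prompt=f"[{lang}] prompt",
            reasoning_trace=f"[{lang}] trace",
            final_output=f"[{lang}] answer",
        )
        for lang in LANGUAGES
    },
)
```

Keying each record by language code is one way to express "direct supervision of language-specific reasoning": the same sample can yield a training pair in any target language without re-aligning the data.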
