Do Language Models Reason Across Languages?

Yan Meng; Wafaa Mohammed; Christof Monz

arXiv:2601.06644·cs.CL·January 13, 2026

TL;DR

This paper investigates whether multilingual language models can reason step by step across languages. It finds that their reasoning often does not follow the intermediate steps faithfully, and it proposes a prompting method that improves multi-hop question answering accuracy.

Contribution

The paper introduces a two-hop multilingual question answering setting, analyzes where models' reasoning fails, and proposes SUBQ prompting to improve multi-step reasoning accuracy.

Findings

01

Models are more sensitive to language variation in answer-span documents than in documents that provide bridging information.

02

In up to 33% of multilingual cases, models fail to infer the bridging information in the first step yet still answer the overall question correctly.

03

SUBQ prompting improves multi-hop reasoning accuracy from 10.1% to 66.5%.

Abstract

Real-world information sources are inherently multilingual, which naturally raises the question of whether language models can synthesize information across languages. In this paper, we introduce a simple two-hop question answering setting, where answering a question requires making inferences over two multilingual documents. We find that language models are more sensitive to language variation in answer-span documents than in those providing bridging information, even though both documents are equally important for answering a question. Under a step-by-step sub-question evaluation, we further show that in up to 33% of multilingual cases, models fail to infer the bridging information in the first step yet still answer the overall question correctly. This indicates that reasoning in language models, especially in multilingual settings, does not follow a faithful step-by-step…
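
The step-by-step sub-question evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the record type `TwoHopResult`, the function `unfaithful_correct_rate`, and the toy data are all hypothetical, and the choice to divide by all evaluated cases is an assumption, since the abstract does not state the denominator behind the 33% figure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TwoHopResult:
    """Hypothetical record for one two-hop question under sub-question evaluation."""
    bridge_correct: bool  # first-hop sub-question (bridging information) answered correctly
    final_correct: bool   # overall two-hop question answered correctly


def unfaithful_correct_rate(results: Sequence[TwoHopResult]) -> float:
    """Fraction of cases where the final answer is right although the bridge step is wrong.

    Assumption: the denominator is all evaluated cases.
    """
    if not results:
        return 0.0
    hits = sum(1 for r in results if r.final_correct and not r.bridge_correct)
    return hits / len(results)


# Toy data, invented purely for illustration.
toy = [
    TwoHopResult(bridge_correct=True, final_correct=True),    # faithful and correct
    TwoHopResult(bridge_correct=False, final_correct=True),   # correct without the bridge
    TwoHopResult(bridge_correct=False, final_correct=False),  # wrong at both steps
    TwoHopResult(bridge_correct=True, final_correct=False),   # bridge found, final wrong
]
rate = unfaithful_correct_rate(toy)
print(f"correct despite a failed bridge step: {rate:.0%}")
```

A high value of this rate is what the abstract reads as evidence that the final answer was not produced by a faithful two-step chain.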

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Text Readability and Simplification