What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features

Dayeon Ki; Kevin Duh; Marine Carpuat

arXiv:2604.04720·cs.CL·April 7, 2026

TL;DR

This paper investigates what characterizes effective multilingual reasoning in Large Reasoning Models (LRMs). It finds that the strength of association between reasoning features and answer accuracy varies across languages, which challenges English-centric assumptions.

Contribution

It defines measurable reasoning features, uses logistic regression to quantify how each feature associates with final answer accuracy, trains sparse autoencoders to discover latent reasoning concepts, and evaluates the features across multiple languages and models.

Findings

01. Most features positively correlate with accuracy

02. Association strength varies across languages

03. English-centric reward designs may be insufficient

Abstract

Large Reasoning Models (LRMs) still exhibit large performance gaps between English and other languages, yet much current work assumes these gaps can be closed simply by making reasoning in every language resemble English reasoning. This work challenges this assumption by asking instead: what actually characterizes effective reasoning in multilingual settings, and to what extent do English-derived reasoning features genuinely help in other languages? We first define a suite of measurable reasoning features spanning multilingual alignment, reasoning step, and reasoning flow aspects of reasoning traces, and use logistic regression to quantify how each feature associates with final answer accuracy. We further train sparse autoencoders over multilingual traces to automatically discover latent reasoning concepts that instantiate or extend these features. Finally, we use the features as…
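The logistic regression step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the language labels `lang_a` and `lang_b`, the toy feature values, the correctness labels, and the helper names `standardize` and `fit_logistic`. The sketch fits one single-feature logistic regression per language and compares the coefficients, which is one simple way to read "association strength varies across languages".

```python
import math


def standardize(values):
    """Rescale values to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against a constant feature
    return [(v - mean) / std for v in values]


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(correct) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: one feature score per reasoning trace, paired with
# whether the final answer was correct (1) or not (0). Not from the paper.
feature = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
traces = {
    "lang_a": (feature, [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]),
    "lang_b": (feature, [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]),
}

coefs = {}
for lang, (xs, ys) in traces.items():
    w, _ = fit_logistic(standardize(xs), ys)
    coefs[lang] = w
    # exp(w) is the odds ratio for a one-standard-deviation feature increase.
    print(f"{lang}: coefficient={w:.3f}, odds ratio={math.exp(w):.3f}")
```

A positive coefficient means higher feature values go with higher odds of a correct answer in that language. In this toy setup the feature is more strongly associated with correctness in `lang_a` than in `lang_b`. A real analysis would use many traces, several features fit jointly, and significance estimates.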
