Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap
Hyunwoo Ko, Guijin Son, Dasol Choi

TL;DR
This paper introduces HRM8K, an English-Korean bilingual math benchmark, and proposes UST (Understand, Solve, and Translate), a method that uses English as an anchor language for reasoning in LLMs, narrowing the performance gap between high-resource languages and Korean.
Contribution
The paper presents HRM8K, a benchmark of 8,011 English-Korean parallel math problems for evaluating multilingual math reasoning, and UST, an approach that improves reasoning accuracy by using English as the anchor language for reasoning and solution generation.
Findings
UST improves Korean math reasoning performance by 10.91%.
The multilingual performance gap narrows from 11.6% to 0.7%.
The method generalizes across different Korean domains.
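The gap figures above can be read in two ways, and the short calculation below makes the difference explicit. It is only an arithmetic sketch using the two numbers quoted in the findings; the variable names are illustrative and do not come from the paper.

```python
# Gap values quoted in the findings, in percent.
gap_before = 11.6
gap_after = 0.7

# Absolute reduction, in percentage points.
absolute_reduction = round(gap_before - gap_after, 1)

# Relative reduction, as a share of the original gap.
relative_reduction = round(100 * (gap_before - gap_after) / gap_before, 1)

print(f"Absolute reduction: {absolute_reduction} percentage points")
print(f"Relative reduction: {relative_reduction}% of the original gap")
```

The absolute reduction is 10.9 percentage points, while the relative reduction is about 94% of the original gap.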
Abstract
Large language models (LLMs) demonstrate exceptional performance on complex reasoning tasks. However, despite their strong reasoning capabilities in high-resource languages (e.g., English and Chinese), a significant performance gap persists in other languages. To investigate this gap in Korean, we introduce HRM8K, a benchmark comprising 8,011 English-Korean parallel bilingual math problems. Through systematic analysis of model behaviors, we identify a key finding: these performance disparities stem primarily from difficulties in comprehending non-English inputs, rather than limitations in reasoning capabilities. Based on these findings, we propose UST (Understand, Solve, and Translate), a method that strategically uses English as an anchor for reasoning and solution generation. By fine-tuning the model on 130k synthetically generated data points, UST achieves a 10.91% improvement on the…
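The three stages named in UST can be pictured as a simple prompting pipeline. The sketch below is an assumption-laden illustration, not the authors' implementation: the paper fine-tunes a model on synthetic data, whereas this example only chains three prompts. The function `ust_pipeline`, the `generate` callable, and the prompt wording are all hypothetical.

```python
from typing import Callable, Dict


def ust_pipeline(problem_ko: str, generate: Callable[[str], str]) -> Dict[str, str]:
    """Illustrative Understand -> Solve -> Translate chain.

    `generate` is a hypothetical stand-in for any text-generation call
    (prompt in, completion out). The prompts are assumptions, not taken
    from the paper.
    """
    # Understand: restate the Korean problem in English (the anchor language).
    understanding = generate(
        "Restate the following Korean math problem in English, "
        "listing the given quantities and the question.\n\n" + problem_ko
    )

    # Solve: reason and produce the solution in English.
    solution = generate(
        "Solve the following problem step by step in English.\n\n" + understanding
    )

    # Translate: render the English solution back into Korean.
    translation = generate(
        "Translate the following solution into Korean, "
        "keeping all numbers unchanged.\n\n" + solution
    )

    return {
        "understanding": understanding,
        "solution": solution,
        "translation": translation,
    }
```

Passing the model call in as a parameter keeps the sketch independent of any particular inference library; any function that maps a prompt string to a completion string can be supplied.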
Taxonomy
Topics: Mathematics Education and Teaching Techniques
