Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap

Hyunwoo Ko; Guijin Son; Dasol Choi

arXiv:2501.02448·cs.CL·February 3, 2025

Open Access

TL;DR

This paper introduces HRM8K, an English-Korean bilingual math benchmark, and proposes UST, a method that uses English as an anchor language for reasoning in LLMs, narrowing the multilingual performance gap from 11.6% to 0.7%.

Contribution

The paper presents HRM8K for evaluating multilingual math reasoning and UST, a novel approach that enhances reasoning accuracy by leveraging English as an anchor language.

Findings

01 UST improves Korean math reasoning performance by 10.91%.

02 The multilingual performance gap narrows from 11.6% to 0.7%.

03 The method generalizes across different Korean domains.

Abstract

Large language models (LLMs) demonstrate exceptional performance on complex reasoning tasks. However, despite their strong reasoning capabilities in high-resource languages (e.g., English and Chinese), a significant performance gap persists in other languages. To investigate this gap in Korean, we introduce HRM8K, a benchmark comprising 8,011 English-Korean parallel bilingual math problems. Through systematic analysis of model behaviors, we identify a key finding: these performance disparities stem primarily from difficulties in comprehending non-English inputs, rather than limitations in reasoning capabilities. Based on these findings, we propose UST (Understand, Solve, and Translate), a method that strategically uses English as an anchor for reasoning and solution generation. By fine-tuning the model on 130k synthetically generated data points, UST achieves a 10.91% improvement on the…
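The three stages named in the abstract (understand, solve, translate) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper fine-tunes a model on synthetic data, whereas the sketch below simply chains three prompts. The `generate` callable, the prompt wording, and the function name `ust_pipeline` are all hypothetical, introduced only for illustration.

```python
from typing import Callable, Dict

# Hypothetical prompt templates; the wording is an assumption, not taken from the paper.
UNDERSTAND_PROMPT = "Restate the following math problem in English:\n{question}"
SOLVE_PROMPT = "Solve this problem step by step in English:\n{understanding}"
TRANSLATE_PROMPT = "Translate this solution into {language}:\n{solution}"


def ust_pipeline(
    question: str,
    generate: Callable[[str], str],
    language: str = "Korean",
) -> Dict[str, str]:
    """Run an understand -> solve -> translate chain with English as the anchor.

    `generate` is any function mapping a prompt string to a model response
    (a stand-in for a real LLM call, which this sketch does not provide).
    """
    # Stage 1: restate the non-English problem in English.
    understanding = generate(UNDERSTAND_PROMPT.format(question=question))
    # Stage 2: reason and solve in English, the anchor language.
    solution = generate(SOLVE_PROMPT.format(understanding=understanding))
    # Stage 3: translate the English solution back into the target language.
    translation = generate(
        TRANSLATE_PROMPT.format(language=language, solution=solution)
    )
    return {
        "understanding": understanding,
        "solution": solution,
        "translation": translation,
    }
```

Each stage consumes only the output of the previous one, so the reasoning step never sees the original non-English input directly. That mirrors the abstract's claim that the gap stems mainly from comprehending non-English inputs rather than from reasoning ability.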

Code & Models

Datasets

HAERAE-HUB/HRM8K
dataset· 583 dl

Videos

Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap

Taxonomy

Topics: Mathematics Education and Teaching Techniques