Bridging the Culture Gap: A Framework for LLM-Driven Socio-Cultural Localization of Math Word Problems in Low-Resource Languages

Israel Abebe Azime; Tadesse Destaw Belay; Dietrich Klakow; Philipp Slusallek; Anshuman Chhabra

arXiv:2508.14913·cs.CL·April 21, 2026

TL;DR

This paper presents a framework that uses large language models to create culturally localized math word problem datasets in low-resource languages, addressing biases and improving multilingual reasoning.

Contribution

The authors introduce an automated framework for socio-cultural localization of math problems, generating native entities and reducing English-centric biases in multilingual benchmarks.
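The localization step described above can be illustrated with a minimal sketch. This is not the authors' implementation: in the paper an LLM proposes the native entities, whereas here a hand-written dictionary stands in for that step. The names `ENTITY_MAP`, `localize_entities`, and `numbers_preserved`, as well as the example entities, are hypothetical and chosen only for illustration. The sketch assumes that numeric quantities are left untouched so the original answer remains valid.

```python
import re

# Hypothetical entity map. In the paper an LLM supplies native entities;
# a fixed dictionary is used here only to keep the sketch self-contained.
ENTITY_MAP = {
    "James": "Abebe",
    "Walmart": "Shoa Supermarket",
    "dollars": "birr",
}

_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def localize_entities(problem: str, entity_map: dict[str, str]) -> str:
    """Replace each source entity with its localized counterpart (whole words only)."""
    if not entity_map:
        return problem
    # Longest keys first so that overlapping entities match greedily.
    keys = sorted(entity_map, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: entity_map[m.group(1)], problem)


def numbers_preserved(original: str, localized: str) -> bool:
    """Check that numeric quantities are unchanged, so the gold answer still holds."""
    return _NUMBER.findall(original) == _NUMBER.findall(localized)


problem = (
    "James buys 3 shirts at Walmart for 12 dollars each. "
    "How many dollars does he spend?"
)
localized = localize_entities(problem, ENTITY_MAP)
print(localized)
print(numbers_preserved(problem, localized))
```

A check like `numbers_preserved` is one simple way to guard against a localization step accidentally altering the arithmetic of a problem; a real pipeline would likely need further validation, for example of grammatical agreement in the target language.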

Findings

1. Localized datasets reveal true multilingual math ability.

2. Framework mitigates entity bias and enhances robustness.

3. Experiments show improved performance with native entities.

Abstract

Large language models (LLMs) have demonstrated significant capabilities in solving mathematical problems expressed in natural language. However, multilingual and culturally grounded mathematical reasoning in low-resource languages lags behind English due to the scarcity of socio-cultural task datasets that reflect accurate native entities such as person names, organization names, and currencies. Existing multilingual benchmarks are predominantly produced via translation and typically retain English-centric entities, owing to the high cost of human annotator-based localization. Moreover, automated localization tools are limited, and hence truly localized datasets remain scarce. To bridge this gap, we introduce a framework for LLM-driven cultural localization of math word problems that automatically constructs datasets with native names, organizations, and currencies from…
