MindMerger: Efficient Boosting LLM Reasoning in non-English Languages

Zixian Huang; Wenhao Zhu; Gong Cheng; Lei Li; Fei Yuan

arXiv:2405.17386·cs.CL·May 28, 2024

TL;DR

MindMerger enhances multilingual reasoning in large language models by integrating external language understanding capabilities, improving accuracy, especially in low-resource languages, without updating LLM parameters.

Contribution

The paper introduces MindMerger, a novel method that merges LLMs with external multilingual models and employs a two-step training scheme to boost reasoning in non-English languages.
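To make the idea of "merging" concrete, here is a minimal sketch under stated assumptions. This page does not say how the merge is performed, so the linear mapping and the sequence-level concatenation below are illustrative guesses, not the authors' confirmed design. The names `project`, `merge_inputs`, and `weight` are hypothetical. The sketch only shows how representations from an external multilingual encoder could be brought into the input space of an LLM whose own parameters stay untouched.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def project(states: Matrix, weight: Matrix) -> Matrix:
    """Map each encoder state (size d_enc) to the LLM embedding size (d_llm).

    `weight` is a hypothetical d_enc x d_llm mapping matrix; in this sketch
    it is the only component that would be trained.
    """
    d_enc, d_llm = len(weight), len(weight[0])
    projected = []
    for state in states:
        if len(state) != d_enc:
            raise ValueError("encoder state size does not match mapping matrix")
        projected.append(
            [sum(state[i] * weight[i][j] for i in range(d_enc)) for j in range(d_llm)]
        )
    return projected


def merge_inputs(encoder_states: Matrix, llm_embeddings: Matrix, weight: Matrix) -> Matrix:
    """Assumed merge: prepend projected encoder states to the LLM's own embeddings.

    The LLM embeddings are passed through unchanged, reflecting the claim
    that LLM parameters are not updated.
    """
    return project(encoder_states, weight) + llm_embeddings


# Toy example: one encoder token of size 3, two LLM tokens of size 2.
encoder_states = [[1.0, 2.0, 3.0]]
llm_embeddings = [[0.5, 0.5], [0.1, 0.2]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

merged = merge_inputs(encoder_states, llm_embeddings, weight)
print(len(merged), len(merged[0]))  # sequence length and embedding size of the merged input
```

In this toy setup the merged sequence has three positions: one projected from the external encoder and two taken directly from the LLM. A real system would feed such a sequence to the frozen LLM; the two-step training scheme itself is not sketched here because this page does not describe its steps.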

Findings

01 Outperforms all baselines on three multilingual reasoning datasets.

02 Achieves 6.7% average accuracy improvement across all languages.

03 Achieves 8.0% accuracy improvement in low-resource languages.

Abstract

Reasoning capabilities are crucial for Large Language Models (LLMs), yet a notable gap exists between English and non-English languages. To bridge this disparity, some works fine-tune LLMs to relearn reasoning capabilities in non-English languages, while others replace non-English inputs with an external model's outputs such as English translation text to circumvent the challenge of LLM understanding non-English. Unfortunately, these methods often underutilize the built-in skilled reasoning and useful language understanding capabilities of LLMs. In order to better utilize the minds of reasoning and language understanding in LLMs, we propose a new method, namely MindMerger, which merges LLMs with the external language understanding capabilities from multilingual models to boost the multilingual reasoning performance. Furthermore, a two-step training scheme is introduced to first train to…

Code & Models

Repositories

cone-mt/mindmerger (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques