IRIS: Interleaved Reinforcement with Incremental Staged Curriculum for Cross-Lingual Mathematical Reasoning

Navya Gupta; Rishitej Reddy Vyalla; Avinash Anand; Chhavi Kirtani; Erik Cambria; Zhengchen Zhang; Zhengkui Wang; Timothy Liu; Aik Beng Ng; Simon See; Rajiv Ratn Shah

arXiv:2604.24114 · cs.CL · April 28, 2026

TL;DR

IRIS introduces a dual-axis curriculum learning framework combining supervised fine-tuning and reinforcement learning to enhance multilingual mathematical reasoning, especially in low-resource languages.

Contribution

IRIS proposes an interleaved reinforcement learning and staged curriculum approach, together with a new dataset, CL-Math, to improve cross-lingual mathematical reasoning.

Findings

01. IRIS outperforms baseline models on multilingual math reasoning benchmarks.

02. The approach yields significant gains in low-resource language settings.

03. The CL-Math dataset contains 29,000 problems with step-level annotations in English, Hindi, and Marathi.

Abstract

Curriculum learning helps language models tackle complex reasoning by gradually increasing task difficulty. However, it often fails to generate consistent step-by-step reasoning, especially in multilingual and low-resource settings where cross-lingual transfer from English to Indian languages remains limited. We propose IRIS: Interleaved Reinforcement with Incremental Staged Curriculum, a two-axis framework that combines Supervised Fine-Tuning on progressively harder problems (vertical axis) with Reverse Curriculum Reinforcement Learning to reduce reliance on step-by-step guidance (horizontal axis). We design a composite reward combining correctness, step-wise alignment, continuity, and numeric incentives, optimized via Group Relative Policy Optimization (GRPO). We release CL-Math, a dataset of 29k problems with step-level annotations in English, Hindi, and Marathi. Across standard…
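
The abstract names three mechanisms without giving their details here: a composite reward, GRPO optimization, and a reverse curriculum that withdraws step-by-step guidance. The sketch below is one possible reading, not the authors' implementation. The weighted-sum form of the reward, the weight values, the linear hint schedule, and the names `RewardWeights`, `composite_reward`, `group_relative_advantages`, and `revealed_steps` are all assumptions made for illustration. The only part taken from the general GRPO recipe is normalizing each sampled completion's reward against the mean and standard deviation of its own group.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not state how the terms are combined.
    correctness: float = 1.0
    alignment: float = 0.5
    continuity: float = 0.25
    numeric: float = 0.25


def composite_reward(correct, alignment, continuity, numeric, w=RewardWeights()):
    """Weighted sum of the four reward terms named in the abstract (assumed form).

    Each argument is assumed to be a score in [0, 1] computed elsewhere.
    """
    return (w.correctness * correct
            + w.alignment * alignment
            + w.continuity * continuity
            + w.numeric * numeric)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def revealed_steps(reference_steps, stage, num_stages):
    """Assumed reverse-curriculum schedule: show fewer reference steps at later stages.

    Stage 0 reveals every step; the last stage reveals none. Requires num_stages >= 2.
    """
    remaining = num_stages - 1 - stage
    k = (len(reference_steps) * remaining) // (num_stages - 1)
    return reference_steps[:k]


if __name__ == "__main__":
    group = [
        composite_reward(1.0, 0.5, 1.0, 0.0),
        composite_reward(0.0, 0.5, 1.0, 0.0),
    ]
    print(group_relative_advantages(group))
    print(revealed_steps(["s1", "s2", "s3", "s4"], stage=1, num_stages=3))
```

Under these assumptions, a completion is rewarded only relative to its siblings for the same prompt, so a group in which every completion scores the same contributes no learning signal. The hint schedule leaves the model to produce progressively more of the solution on its own.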
