MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning

Jinhao Chen; Zhen Yang; Jianxin Shi; Tianyu Wo; Jie Tang

arXiv:2511.06805·cs.AI·November 11, 2025


Open Access

TL;DR

MathSE is a self-evolving framework that combines iterative reflection with reward-guided fine-tuning. It improves multimodal mathematical reasoning by repeatedly refining the model through inference, reflection, and reward feedback, and is reported to outperform leading open-source multimodal mathematical reasoning models.

Contribution

It proposes a novel iterative fine-tuning framework for multimodal models that improves reasoning by incorporating reflection and reward feedback, addressing limitations of static datasets.
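The loop summarized above (inference, reflection, reward feedback, fine-tuning) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: every name here (`Sample`, `self_evolve`, the `generate`, `reward`, `reflect`, and `fine_tune` callables, the `threshold` and `rounds` parameters) is hypothetical, and the assumptions that reflection sees a reference answer and that samples are filtered by a reward threshold are ours, not confirmed by the summary.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Sample:
    question: str
    reasoning: str
    answer: str


def self_evolve(
    questions: List[Tuple[str, str]],           # (question, reference answer) pairs
    generate: Callable[[str], Sample],          # hypothetical: inference with the current model
    reward: Callable[[Sample, str], float],     # hypothetical: reward score for one attempt
    reflect: Callable[[Sample, str], Sample],   # hypothetical: revise a low-reward attempt
    fine_tune: Callable[[List[Sample]], None],  # hypothetical: update the model on kept samples
    rounds: int = 3,
    threshold: float = 0.5,
) -> List[Sample]:
    """Run a toy inference -> reflection -> reward-filter -> fine-tune loop."""
    pool: List[Sample] = []
    for _ in range(rounds):
        accepted: List[Sample] = []
        for question, reference in questions:
            attempt = generate(question)
            # Assumption: only low-reward attempts go through reflection.
            if reward(attempt, reference) < threshold:
                attempt = reflect(attempt, reference)
            # Keep an attempt only if it now clears the reward threshold.
            if reward(attempt, reference) >= threshold:
                accepted.append(attempt)
        # The model is updated on its own accepted outputs, so the
        # training data changes from round to round instead of staying fixed.
        fine_tune(accepted)
        pool.extend(accepted)
    return pool
```

The point of the sketch is the contrast with a fixed teacher-distilled dataset: the training pool is regenerated from the current model in every round, so later rounds see data shaped by earlier updates.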

Findings

01. Significant performance improvements on mathematical reasoning benchmarks.

02. Outperforms leading open-source multimodal mathematical reasoning models.

03. Effective in handling complex and novel mathematical questions.

Abstract

Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in vision-language answering tasks. Despite their strengths, these models often struggle with complex reasoning tasks such as mathematical problem-solving. Previous works have focused on fine-tuning on specialized mathematical datasets. However, these datasets are typically distilled directly from teacher models, so they capture only static reasoning patterns and leave substantial gaps relative to student models. This reliance on fixed teacher-derived datasets not only restricts the model's ability to adapt to novel or more intricate questions that extend beyond the confines of the training data, but also lacks the iterative depth needed for robust generalization. To overcome these limitations, we propose MathSE, a Mathematical Self-Evolving…


Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Intelligent Tutoring Systems and Adaptive Learning