M2A: Synergizing Mathematical and Agentic Reasoning in Large Language Models

Junjian Wang; Xin Zhou; Qiran Xu; Kun Zhan

arXiv:2605.09879 · cs.AI · May 12, 2026

TL;DR

M2A merges models directly in parameter space so that mathematical and agentic reasoning reinforce each other, improving reasoning depth and task performance.

Contribution

The paper proposes a novel parameter-space merging method that combines mathematical and agentic reasoning without additional training, improving reasoning capabilities.

Findings

01. M2A improves reasoning depth in real-world coding tasks.

02. Applying M2A to Qwen3-8B increases the verified resolved rate from 44.0% to 51.2%, a gain of 7.2 percentage points.

03. The method requires no gradient updates and uses a simple merging coefficient.
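To make "no gradient updates and a simple merging coefficient" concrete, here is a minimal sketch of one common training-free merge: a task-vector style update, in which the difference between a math-tuned model and its base is scaled and added to an agentic model. This is an illustration under stated assumptions, not the rule used by M2A. The summary on this page does not specify how M2A selects or combines parameters, and the names `merge_task_vector`, `lam`, and the toy `layer.weight` entries are hypothetical.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def merge_task_vector(
    base: StateDict,
    math_model: StateDict,
    agent_model: StateDict,
    lam: float,
) -> StateDict:
    """Return agent + lam * (math - base) for every parameter.

    Generic task-vector merge (an assumption for illustration); it uses only
    element-wise arithmetic, so no gradients or training data are involved.
    """
    merged: StateDict = {}
    for name, agent_w in agent_model.items():
        base_w = base[name]
        math_w = math_model[name]
        merged[name] = [
            a + lam * (m - b) for a, m, b in zip(agent_w, math_w, base_w)
        ]
    return merged


# Toy weights, chosen only so the arithmetic is easy to follow.
base = {"layer.weight": [0.0, 1.0, 2.0]}
math_model = {"layer.weight": [1.0, 1.0, 4.0]}
agent_model = {"layer.weight": [0.0, 2.0, 2.0]}

merged = merge_task_vector(base, math_model, agent_model, lam=0.5)
print(merged)  # {'layer.weight': [0.5, 2.0, 3.0]}
```

With `lam=0.0` the sketch returns the agentic model unchanged, and larger values pull in more of the math-specific update, so the single coefficient is the only quantity to tune.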

Abstract

While reasoning has become a central capability of large language models (LLMs), the reasoning patterns required for different scenarios are often misaligned. Mathematical reasoning typically relies on intrinsic logic to solve closed-world problems in a single response, whereas agentic reasoning requires not only internal reasoning but also multi-turn interaction with external environments, interleaving thought and action. This misalignment prevents mathematical and agentic reasoning from effectively benefiting from each other, often yielding unstable reasoning behavior and only limited performance gains under multi-task learning. In this paper, we propose M2A, a novel paradigm that synergizes mathematical and agentic reasoning via model merging. To avoid overfitting to superficial reasoning patterns under joint training, M2A operates directly in parameter space: it identifies the…

Code & Models

Repositories

laplucky/M2A.git
github
