M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning

Inclusion AI: Fudong Wang; Jiajia Liu; Jingdong Chen; Jun Zhou; Kaixiang Ji; Lixiang Ru; Qingpei Guo; Ruobing Zheng; Tianqi Li; Yi Yuan; Yifan Mao; Yuting Xiao; Ziping Ma

arXiv:2507.08306·cs.AI·July 14, 2025

TL;DR

M2-Reasoning-7B is a multimodal model that pairs a new data-generation pipeline with dynamic multi-task training to handle both general and spatial reasoning, reaching state-of-the-art results on 8 reasoning benchmarks.

Contribution

The paper introduces a novel data pipeline and dynamic multi-task training strategy to enhance reasoning capabilities in multimodal large language models.

Findings

1. Achieved SOTA performance on 8 reasoning benchmarks.
2. Generated 294.2K high-quality reasoning samples.
3. Effectively integrated spatial and general reasoning in a single model.

Abstract

Recent advancements in Multimodal Large Language Models (MLLMs), particularly through Reinforcement Learning with Verifiable Rewards (RLVR), have significantly enhanced their reasoning abilities. However, a critical gap persists: these models struggle with dynamic spatial interactions, a capability essential for real-world applications. To bridge this gap, we introduce M2-Reasoning-7B, a model designed to excel in both general and spatial reasoning. Our approach integrates two key innovations: (1) a novel data pipeline that generates 294.2K high-quality data samples (168K for cold-start fine-tuning and 126.2K for RLVR), which feature logically coherent reasoning trajectories and have undergone comprehensive assessment; and (2) a dynamic multi-task training strategy with step-wise optimization to mitigate conflicts between data, and task-specific rewards for delivering tailored incentive…
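
As a rough illustration of what "task-specific rewards" can mean in an RLVR setup, the sketch below dispatches a different verifiable reward function per task type. It is a minimal sketch under stated assumptions, not the paper's implementation: the task names (`general`, `spatial_numeric`), the function names, and the linear tolerance shape are all hypothetical, and the actual reward definitions are not given in this excerpt.

```python
from typing import Callable, Dict


def exact_match_reward(prediction: str, reference: str) -> float:
    """Binary verifiable reward: 1.0 if the normalized strings agree, else 0.0."""
    return 1.0 if prediction.strip().lower() == reference.strip().lower() else 0.0


def numeric_tolerance_reward(prediction: str, reference: str,
                             rel_tol: float = 0.1) -> float:
    """Graded reward for numeric answers (e.g. an estimated distance).

    Hypothetical shape: full credit at zero relative error, falling
    linearly to 0.0 once the relative error reaches rel_tol.
    """
    try:
        pred, ref = float(prediction), float(reference)
    except ValueError:
        return 0.0  # unparseable answers earn nothing
    if ref == 0.0:
        return 1.0 if pred == 0.0 else 0.0
    rel_err = abs(pred - ref) / abs(ref)
    return max(0.0, 1.0 - rel_err / rel_tol)


# Assumed task labels; each maps to the reward suited to its answer format.
REWARD_BY_TASK: Dict[str, Callable[[str, str], float]] = {
    "general": exact_match_reward,
    "spatial_numeric": numeric_tolerance_reward,
}


def compute_reward(task: str, prediction: str, reference: str) -> float:
    """Look up the task's reward function and score one model answer."""
    return REWARD_BY_TASK[task](prediction, reference)
```

For example, `compute_reward("general", " B ", "b")` gives full credit for a matching multiple-choice answer, while `compute_reward("spatial_numeric", "3.0", "2.0")` gives none because the estimate is 50% off. Keeping the rewards behind a per-task lookup is one simple way to let a single training loop mix general and spatial samples.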

Code & Models

Models

inclusionAI/M2-Reasoning (Hugging Face model; 29 downloads, 37 likes)
