Milestones over Outcome: Unlocking Geometric Reasoning with Sub-Goal Verifiable Reward

Jianlong Chen; Daocheng Fu; Shengze Xu; Jiawei Chen; Yuan Feng; Yue Yang; Junchi Yan; Hongyuan Zha; Renqiu Xia

arXiv:2601.05073 · cs.LG · January 9, 2026

TL;DR

This paper introduces the GeoGoal benchmark and the SGVR training framework, which improve geometric reasoning in multimodal large language models by verifying intermediate subgoals rather than only the final answer. The authors report a +9.7% gain on geometry, with gains transferring to general math (+8.0%).

Contribution

It presents the GeoGoal benchmark, which converts formal proofs into verifiable numeric subgoals, and the Sub-Goal Verifiable Reward (SGVR) framework, which trains multimodal large language models with dense subgoal-level rewards instead of a sparse outcome signal.

Findings

01

SGVR improves geometric reasoning accuracy by 9.7%.

02

Models trained with SGVR transfer gains to general math (+8.0%).

03

Enhanced reasoning generalizes to other domains (+2.8%).

Abstract

Multimodal Large Language Models (MLLMs) struggle with complex geometric reasoning, largely because "black box" outcome-based supervision fails to distinguish between lucky guesses and rigorous deduction. To address this, we introduce a paradigm shift towards subgoal-level evaluation and learning. We first construct GeoGoal, a benchmark synthesized via a rigorous formal verification data engine, which converts abstract proofs into verifiable numeric subgoals. This structure reveals a critical divergence between reasoning quality and outcome accuracy. Leveraging this, we propose the Sub-Goal Verifiable Reward (SGVR) framework, which replaces sparse signals with dense rewards based on the Skeleton Rate. Experiments demonstrate that SGVR not only enhances geometric performance (+9.7%) but also exhibits strong generalization, transferring gains to general math (+8.0%) and other general…
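The contrast the abstract draws between a sparse outcome signal and a dense subgoal-level reward can be illustrated with a small sketch. The abstract does not define the Skeleton Rate, so the code below assumes it is the fraction of reference numeric subgoals that a model's reasoning reproduces within a tolerance. The function names `skeleton_rate` and `outcome_reward`, the tolerance values, and the example numbers are all hypothetical and are not taken from the paper.

```python
import math
from typing import Optional, Sequence


def skeleton_rate(
    predicted: Sequence[Optional[float]],
    reference: Sequence[float],
    rel_tol: float = 1e-3,
) -> float:
    """Assumed definition: share of reference subgoals matched numerically.

    `predicted[i]` is the model's value for subgoal i, or None if the model
    never stated it. Subgoals missing from `predicted` count as misses.
    """
    if not reference:
        raise ValueError("reference must contain at least one subgoal")
    hits = sum(
        1
        for p, r in zip(predicted, reference)
        if p is not None and math.isclose(p, r, rel_tol=rel_tol, abs_tol=1e-6)
    )
    return hits / len(reference)


def outcome_reward(
    predicted: Sequence[Optional[float]],
    reference: Sequence[float],
    rel_tol: float = 1e-3,
) -> float:
    """Sparse baseline: 1.0 only when the final value matches, else 0.0."""
    if not predicted or not reference or predicted[-1] is None:
        return 0.0
    ok = math.isclose(predicted[-1], reference[-1], rel_tol=rel_tol, abs_tol=1e-6)
    return 1.0 if ok else 0.0


# Hypothetical problem with three numeric milestones; the last is the answer.
reference = [5.0, 12.0, 60.0]
rigorous = [5.0, 12.0, 60.0]      # every intermediate value is correct
lucky_guess = [None, 17.0, 60.0]  # wrong intermediates, right final answer

for name, trace in [("rigorous", rigorous), ("lucky_guess", lucky_guess)]:
    print(name, outcome_reward(trace, reference), skeleton_rate(trace, reference))
```

Under these assumptions both traces receive the same outcome reward, while the subgoal-level score separates them, which is the divergence between reasoning quality and outcome accuracy that the abstract describes.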

Code & Models

Datasets

carpe002/GeoGoal-SGVR
dataset · 33 dl

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Graph Neural Networks