GeoSym127K: Scalable Symbolically-verifiable Synthesis for Multimodal Geometric Reasoning

Jinhao Jing; Zheng Ma; Jinwei Liang; Qiannian Zhao; Shawn Chen; Jing Yang; Por Lip Yee; Prayag Tiwari; Jingjing Bai; Benyou Wang; Lewei Lu; Zhan Su

arXiv:2605.16371 · cs.CV · May 19, 2026

TL;DR

This paper introduces the GeoSym Engine, a scalable neuro-symbolic framework that derives exact symbolic ground truths for geometry problems. It also presents the GeoSym127K dataset built with that engine and the GeoSym-Bench evaluation suite, and reports improved geometric reasoning in multimodal models trained on the data.

Contribution

The paper presents the GeoSym Engine for deriving exact symbolic ground truths, uses it to construct the GeoSym127K dataset, and shows improved reasoning performance through supervised fine-tuning (SFT) and reinforcement learning.

Findings

01 GeoSym127K contains 127K questions with symbolic ground truths.

02 Supervised fine-tuning (SFT) on GeoSym improves model accuracy on geometry tasks.

03 Reinforcement learning with verifiable rewards (RLVR) with structural SFT checkpoints boosts performance over zero-shot methods.

Abstract

Large Multimodal Models (LMMs) often struggle with geometric reasoning due to visual hallucinations and a lack of mathematically precise Chain-of-Thought (CoT) data. To address this, we propose the GeoSym Engine, an automated and scalable neuro-symbolic framework. By leveraging a type-conditional grammar and an analytic SymGT Solver, it derives exact symbolic ground truths and seamlessly integrates with a robust rendering pipeline to produce high-precision geometric diagrams. Using this engine, we construct GeoSym127K, a difficulty-stratified dataset featuring 51K high-resolution images, 127K questions with symbolic ground truths, and 55K answer-verified CoT QA pairs. We also introduce GeoSym-Bench, an expert-curated suite of 511 complex samples for rigorous evaluation. Through extensive supervised fine-tuning (SFT), we demonstrate that GeoSym drives concentrated improvements…
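
To make the idea of an exact ground truth and an answer-verified pair concrete, here is a minimal sketch in Python. It is not the authors' SymGT Solver or pipeline: the polygon-area question, the `shoelace_area` helper, and the `verify_answer` check are all hypothetical and chosen only for illustration. The sketch assumes rational coordinates so that the standard library's `Fraction` type can stand in for a symbolic engine.

```python
from fractions import Fraction


def shoelace_area(vertices):
    """Exact area of a simple polygon (vertices listed in order), via the shoelace formula.

    Hypothetical stand-in for a symbolic solver: rational arithmetic keeps the
    result exact instead of a rounded float.
    """
    total = Fraction(0)
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += Fraction(x1) * Fraction(y2) - Fraction(x2) * Fraction(y1)
    return abs(total) / 2


def verify_answer(model_answer: str, ground_truth: Fraction) -> bool:
    """Return True only if the final answer parses and equals the exact ground truth."""
    try:
        return Fraction(model_answer.strip()) == ground_truth
    except (ValueError, ZeroDivisionError):
        # Unparseable answers (e.g. "six" or "1/0") count as wrong.
        return False


# Hypothetical question: area of the triangle with these vertices.
triangle = [(0, 0), (4, 0), (1, 3)]
ground_truth = shoelace_area(triangle)

print(ground_truth)                          # exact value, not a float
print(verify_answer("12/2", ground_truth))   # equivalent form of the same value
print(verify_answer("5.99", ground_truth))   # close but not exact
```

Comparing exact values rather than rounded floats means an answer is accepted only when it matches the ground truth exactly, which is the kind of binary signal an answer-verification filter or a verifiable reward can rely on.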

Code & Models

Repositories

Tomie56/GeoSym127K
github

Datasets

Tomie0506/GeoSym127K
dataset· 479 dl
