SOLIDGEO: Measuring Multimodal Spatial Math Reasoning in Solid Geometry
Peijie Wang, Chao Yang, Zhong-Zhi Li, Fei Yin, Dekang Ran, Mi Tian, Zhilong Ji, Jinfeng Bai, Cheng-Lin Liu

TL;DR
SolidGeo is a comprehensive benchmark designed to evaluate multimodal large language models on complex solid geometry reasoning tasks, revealing significant performance gaps and providing insights into their spatial reasoning capabilities.
Contribution
This paper introduces SolidGeo, the first large-scale benchmark specifically targeting solid geometry reasoning in multimodal models, filling a critical gap in existing mathematical evaluation tools.
Findings
MLLMs perform significantly worse than humans on SolidGeo tasks.
Analysis reveals specific error patterns and inference inefficiencies in MLLMs.
SolidGeo covers diverse 3D reasoning topics, challenging current models.
Abstract
Geometry is a fundamental branch of mathematics and plays a crucial role in evaluating the reasoning capabilities of multimodal large language models (MLLMs). However, existing multimodal mathematics benchmarks mainly focus on plane geometry and largely ignore solid geometry, which requires spatial reasoning and is more challenging than plane geometry. To address this critical gap, we introduce SolidGeo, the first large-scale benchmark specifically designed to evaluate the performance of MLLMs on mathematical reasoning tasks in solid geometry. SolidGeo consists of 3,113 real-world K-12 and competition-level problems, each paired with visual context and annotated with difficulty levels and fine-grained solid geometry categories. Our benchmark covers a wide range of 3D reasoning subjects such as projection, unfolding, spatial measurement, and spatial vector, offering a rigorous testbed…
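Because each problem is annotated with a difficulty level and a fine-grained category, results on such a benchmark can be broken down per group rather than reported as a single number. The following is a minimal sketch of that breakdown, not the authors' evaluation code: the `Problem` record layout, the `accuracy_by` helper, the exact-match scoring rule, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Problem:
    # Hypothetical record layout; the actual SolidGeo schema may differ.
    category: str    # e.g. "projection", "unfolding"
    difficulty: int  # hypothetical integer difficulty level
    answer: str      # reference answer


def accuracy_by(problems, predictions, key):
    """Return {group: accuracy}, where group = key(problem)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for problem, predicted in zip(problems, predictions):
        group = key(problem)
        total[group] += 1
        # Assumption: exact string match after trimming whitespace.
        # A real evaluation would likely need more lenient answer matching.
        if predicted.strip() == problem.answer.strip():
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Toy data invented purely for illustration (not taken from the benchmark).
problems = [
    Problem("projection", 1, "A"),
    Problem("projection", 2, "C"),
    Problem("unfolding", 1, "B"),
    Problem("spatial vector", 3, "12"),
]
predictions = ["A", "B", "B", "12 "]

per_category = accuracy_by(problems, predictions, key=lambda p: p.category)
per_difficulty = accuracy_by(problems, predictions, key=lambda p: p.difficulty)
print(per_category)
print(per_difficulty)
```

Passing the grouping rule as `key` lets the same helper report accuracy by category, by difficulty, or by any other annotation without duplicating the counting logic.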
Taxonomy
Topics: Spatial Cognition and Navigation · Constraint Satisfaction and Optimization
