AutoGPS: Automated Geometry Problem Solving via Multimodal Formalization and Deductive Reasoning

Bowen Ping; Minnan Luo; Zhuohang Dang; Chenxi Wang; Chengyou Jia

arXiv:2505.23381·cs.AI·February 16, 2026

Open Access · 3 Reviews

TL;DR

AutoGPS introduces a neuro-symbolic framework that combines neural multimodal understanding with symbolic deductive reasoning to solve geometry problems reliably and interpretably, achieving state-of-the-art results.

Contribution

The paper presents AutoGPS, a novel neuro-symbolic system that formalizes geometry problems and performs rigorous deductive reasoning, improving reliability and interpretability over existing methods.

Findings

01

Achieves state-of-the-art performance on benchmark datasets.

02

Attains 99% logical coherence in human evaluations.

03

Provides minimal, human-readable solutions.

Abstract

Geometry problem solving presents distinctive challenges in artificial intelligence, requiring exceptional multimodal comprehension and rigorous mathematical reasoning capabilities. Existing approaches typically fall into two categories, neural-based and symbolic-based methods, both of which exhibit limitations in reliability and interpretability. To address this challenge, we propose AutoGPS, a neuro-symbolic collaborative framework that solves geometry problems with concise, reliable, and human-interpretable reasoning processes. Specifically, AutoGPS employs a Multimodal Problem Formalizer (MPF) and a Deductive Symbolic Reasoner (DSR). The MPF uses neural cross-modal comprehension to translate geometry problems into structured formal language representations, refined collaboratively with feedback from the DSR. The DSR takes the formalization as input and formulates geometry problem solving as…
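The reviews describe the DSR as building a hypergraph over the formalized problem and searching it for a solution path. The toy sketch below illustrates that general idea only and is not the authors' implementation: the `HyperEdge` class, the function names, the string-encoded facts, and the example problem are all hypothetical. Each hyperedge links a set of premise facts to one conclusion, forward chaining fires edges until nothing new appears, and a backward trace from the goal keeps only the steps the goal depends on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperEdge:
    """Hypothetical hyperedge: several premise facts jointly yield one conclusion."""
    premises: frozenset
    rule: str        # illustrative theorem name
    conclusion: str


def forward_chain(given, edges):
    """Fire edges until no new fact appears; record the edge that first derived each fact."""
    known = set(given)
    derived_by = {}
    changed = True
    while changed:
        changed = False
        for edge in edges:
            if edge.conclusion not in known and edge.premises <= known:
                known.add(edge.conclusion)
                derived_by[edge.conclusion] = edge
                changed = True
    return known, derived_by


def trace_solution(goal, given, derived_by):
    """Walk back from the goal and return only the steps it depends on, or None if unreachable."""
    if goal not in given and goal not in derived_by:
        return None
    steps, seen = [], set()

    def visit(fact):
        if fact in given or fact in seen:
            return
        seen.add(fact)
        edge = derived_by[fact]
        for premise in sorted(edge.premises):
            visit(premise)
        steps.append(edge)  # appended after its premises, so steps are in dependency order

    visit(goal)
    return steps


# Assumed toy problem: facts are plain strings with values already computed.
given = {"angle A = 60", "angle B = 50", "AB = 5", "M is midpoint of AB"}
edges = [
    HyperEdge(frozenset({"angle C = 70"}), "linear pair", "exterior angle at C = 110"),
    HyperEdge(frozenset({"angle A = 60", "angle B = 50"}), "triangle angle sum", "angle C = 70"),
    HyperEdge(frozenset({"AB = 5", "M is midpoint of AB"}), "midpoint definition", "AM = 2.5"),
]

known, derived_by = forward_chain(given, edges)
steps = trace_solution("exterior angle at C = 110", given, derived_by)
for step in steps:
    print(f"{step.rule}: {sorted(step.premises)} => {step.conclusion}")
```

In this example the midpoint fact is derived during forward chaining but left out of the traced solution, which is one simple way to keep the reported steps short. The trace follows the first derivation found for each fact, so it is not guaranteed to be the shortest possible proof, and a real reasoner would also need theorem matching and algebraic solving rather than pre-computed string facts.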

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 5

Strengths

1. The design of the framework is reasonable. AutoGPS first parses the diagram into a symbolic language, then constructs a hypergraph of the given problem and searches for a solution path on the graph to obtain the solution steps, which ensures interpretability. 2. Compared to the previous works mentioned in the paper, AutoGPS improves problem-solving accuracy over the baselines, which shows the effectiveness of the proposed method. 3. In the human evaluation, AutoGPS output stepwise so

Weaknesses

1. As far as I know, using a hypergraph to solve plane geometry problems (PGPs) was already explored by FGeo-HyperGNet [1]. This paper does not give an appropriate discussion of this previous work, in particular how the hypergraph in AutoGPS differs from the one in HyperGNet. Meanwhile, HyperGNet achieved 91.99% on Geometry3K, which is higher than AutoGPS with 81.6. 2. To this end, the contribution of this paper is limited, as the diagram parsing mainly relies on PGDP and the hypergraph has already been used in oth

Reviewer 02 · Rating 4 · Confidence 4

Strengths

1. The design of the overall neuro-symbolic architecture is reasonable, and the MPF and DSR modules are well defined. 2. The introduction of the multimodal alignment stage in MPF is effective in filling the missing semantic information that is not captured by the pixel-level diagram parser. 3. The experimental results are convincing and demonstrate the effectiveness of the AutoGPS framework. 4. The writing and structure of the paper are clear and easy to follow.

Weaknesses

1. My main concern lies in the originality of this paper. First, in MPF, the text parser $M_t$ should properly cite InterGPS (Line 214). Second, in Line 274, the authors state that "solving algebraic relations remains out of its scope". However, in my understanding, AlphaGeometry adopts a DD + AR symbolic reasoning engine, where AR stands for algebraic reasoning. Therefore, the DSR module appears highly similar to the DD + AR reasoning engine of AlphaGeometry, as well as to the hypergraph expans

Reviewer 03 · Rating 8 · Confidence 5

Strengths

1. Innovative integration of modalities: The combination of natural language, diagram understanding, and formal reasoning is both technically challenging and conceptually elegant. The multimodal formalization module is well-motivated and executed with a clear architecture. 2. Clear reasoning pipeline: The formal-to-symbolic transition is described thoroughly, including explicit steps for entity detection, relation extraction, and logical grounding. The modular design supports interpretability a

Weaknesses

1. Limited scalability and automation. The formalization process still relies partly on rule-based heuristics for entity alignment and relation mapping. It is unclear how the system scales to more complex, noisy, or real-world diagrams with ambiguous geometry. 2. Dataset construction bias. The training and evaluation datasets appear to be semi-synthetic or curated from well-structured geometry problems. There is limited discussion on how AutoGPS performs on imperfect or non-standard problem sta

Taxonomy

Topics: Advanced Graph Neural Networks · Polynomial and algebraic computation · Multimodal Machine Learning Applications