Plane Geometry Problem Solving with Multi-modal Reasoning: A Survey
Seunghyuk Cho, Zhenyue Qin, Yang Liu, Youngbin Choi, Seungbeom Lee, Dongwoo Kim

TL;DR
This survey reviews recent advances in plane geometry problem solving using multi-modal reasoning, categorizing methods, analyzing architectures, and highlighting challenges like hallucination and data leakage.
Contribution
The survey provides a systematic overview of plane geometry problem solving (PGPS) methods, classifies their encoder and decoder architectures, and discusses key challenges and future research directions.
Findings
Categorization of PGPS methods under an encoder-decoder framework, with a summary of the output formats used by encoders and decoders
Analysis of architectural designs of encoders and decoders
Identification of hallucination during the encoding phase and of data leakage in benchmarks
Abstract
Plane geometry problem solving (PGPS) has recently gained significant attention as a benchmark to assess the multi-modal reasoning capabilities of large vision-language models. Despite the growing interest in PGPS, the research community still lacks a comprehensive overview that systematically synthesizes recent work in PGPS. To fill this gap, we present a survey of existing PGPS studies. We first categorize PGPS methods into an encoder-decoder framework and summarize the corresponding output formats used by their encoders and decoders. Subsequently, we classify and analyze these encoders and decoders according to their architectural designs. Finally, we outline major challenges and promising directions for future research. In particular, we discuss the hallucination issues arising during the encoding phase within encoder-decoder architectures, as well as the problem of data leakage in…
Taxonomy
Topics: Constraint Satisfaction and Optimization · Manufacturing Process and Optimization · Advanced Numerical Analysis Techniques
Methods: Softmax · Attention Is All You Need
