GeoFM: Enhancing Geometric Reasoning of MLLMs via Synthetic Data Generation through Formal Language
Yuhao Zhang, Dingxin Hu, Tinghao Yu, Hao Liu, Yiting Liu

TL;DR
This paper introduces GeoFM, a formal-language-based method for generating high-quality synthetic geometric data that improves the geometric reasoning of multi-modal large language models (MLLMs) and outperforms existing data generation approaches.
Contribution
GeoFM is a novel formal language-driven approach that synthesizes diverse, high-fidelity geometric problems with correctness guarantees, enhancing geometric reasoning in MLLMs.
Findings
Models trained on synthetic data generated by GeoFM outperform models trained on data from existing generation methods.
Models trained with GeoFM data surpass GPT-4o on geometry tasks.
GeoFM improves performance on MathVista and GeoQA benchmarks.
Abstract
Multi-modal Large Language Models (MLLMs) have gained significant attention in both academia and industry for their capabilities in handling multi-modal tasks. However, these models face challenges in mathematical geometric reasoning due to the scarcity of high-quality geometric data. To address this issue, synthetic geometric data has become an essential strategy. Current methods for generating synthetic geometric data involve rephrasing or expanding existing problems and utilizing predefined rules and templates to create geometric images and problems. However, these approaches often produce data that lacks diversity or is prone to noise. Additionally, the geometric images synthesized by existing methods tend to exhibit limited variation and deviate significantly from authentic geometric diagrams. To overcome these limitations, we propose GeoFM, a novel method for synthesizing…
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Polynomial and algebraic computation · Mathematics Education and Teaching Techniques
