GeoLoom: High-quality Geometric Diagram Generation from Textual Input
Xiaojing Wei, Ting Zhang, Wei He, Jingdong Wang, Hua Huang

TL;DR
GeoLoom is a novel framework that converts natural language geometric descriptions into high-quality, accurate diagrams using formal language translation and Monte Carlo optimization, outperforming existing methods.
Contribution
The paper introduces GeoLoom, which combines autoformalization with coordinate solving, along with a new dataset (GeoNF) and a constraint-based evaluation metric for geometric diagram generation.
Findings
Outperforms state-of-the-art in structural fidelity
Uses formal language and Monte Carlo optimization for accuracy
Provides a new dataset and evaluation metric for geometric diagram generation
Abstract
High-quality geometric diagram generation presents both a challenge and an opportunity: it demands strict spatial accuracy while offering well-defined constraints to guide generation. Inspired by recent advances in geometry problem solving that employ formal languages and symbolic solvers for enhanced correctness and interpretability, we propose GeoLoom, a novel framework for text-to-diagram generation in geometric domains. GeoLoom comprises two core components: an autoformalization module that translates natural language into GeoLingua, a purpose-built, generation-oriented formal language, and a coordinate solver that maps formal constraints to precise coordinates using efficient Monte Carlo optimization. To support this framework, we introduce GeoNF, a dataset aligning natural language geometric descriptions with formal GeoLingua descriptions. We further propose a…
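To make the coordinate-solving idea concrete, here is a minimal sketch of a Monte Carlo style solver. It is not GeoLoom's solver and does not use GeoLingua: the constraint encoding (plain Python functions returning a residual), the names `monte_carlo_solve`, `length`, and `dot_at`, the greedy acceptance rule, and the step-size range are all assumptions made for illustration. The example places three points so that AB = 3, AC = 4, and AB is perpendicular to AC.

```python
import math
import random


def length(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def dot_at(a, b, c):
    """Dot product of vectors AB and AC; zero when AB is perpendicular to AC."""
    return (b[0] - a[0]) * (c[0] - a[0]) + (b[1] - a[1]) * (c[1] - a[1])


def total_loss(points, constraints):
    """Sum of squared constraint residuals; zero means every constraint holds."""
    return sum(residual(points) ** 2 for residual in constraints)


def monte_carlo_solve(names, constraints, iters=20000, seed=0):
    """Greedy random search (an assumed scheme, not the paper's algorithm)."""
    rng = random.Random(seed)
    # Start from random coordinates in a small square.
    points = {n: (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for n in names}
    best = total_loss(points, constraints)
    for _ in range(iters):
        name = rng.choice(names)
        # Draw the step size log-uniformly so both coarse and fine moves occur.
        scale = 10.0 ** rng.uniform(-4.0, 0.0)
        x, y = points[name]
        candidate = dict(points)
        candidate[name] = (x + rng.gauss(0.0, scale), y + rng.gauss(0.0, scale))
        cand_loss = total_loss(candidate, constraints)
        if cand_loss < best:  # keep only moves that reduce the loss
            points, best = candidate, cand_loss
    return points, best


# Hypothetical constraint set: each function returns a signed residual.
NAMES = ["A", "B", "C"]
CONSTRAINTS = [
    lambda p: length(p["A"], p["B"]) - 3.0,      # AB = 3
    lambda p: length(p["A"], p["C"]) - 4.0,      # AC = 4
    lambda p: dot_at(p["A"], p["B"], p["C"]),    # AB perpendicular to AC
]

points, loss = monte_carlo_solve(NAMES, CONSTRAINTS)
print(points, loss)
```

If the search converges, the distance BC should come out close to 5 by the Pythagorean theorem even though it was never imposed directly, which gives a simple sanity check on the solved coordinates. A greedy random search like this one also shows why iteration counts matter: each proposal moves only one point, so more points and constraints generally need more samples.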
Peer Reviews
Decision · Submitted to ICLR 2026
- **Significance & Motivation:** The paper tackles a well-defined and significant problem. The authors correctly observe that general-purpose text-to-image models are unsuitable for domains requiring high structural and spatial accuracy, such as mathematical diagrams. An automated tool for this task would have clear applications in education, research, and engineering.
- **Methodological Clarity:** The proposed two-stage, "parse-then-solve" pipeline is logical, interpretable, and well-structured
Despite its strengths, the paper suffers from several weaknesses, primarily concerning the novelty of the paradigm, the scalability of the solver, and a lack of depth in key areas.
- **Missing Comparison to Key Baselines:** The core idea of a text-to-diagram pipeline based on formal language and constraint-based optimization is not new. The related work discusses GANs/diffusion and vector graphics generation, but omits a critical category: constraint-based diagram generation systems. A well-kno
1. For the first time, the "formal language + symbolic solving" paradigm from geometric problem solving is transferred to image generation, which is a novel approach.
2. The purpose-built GeoLingua explicitly encodes the construction sequence and constraints, facilitating subsequent coordinate calculations.
3. Quantifiable indicators such as LCI and ADI are proposed, providing an objective evaluation benchmark for the community.
4. The two-stage process supports two modes: training-free
1. The method applies only to 2D Euclidean geometry. Non-Euclidean, 3D, or dynamic geometry would require rewriting the constraints and solvers.
2. Monte Carlo random sampling relies on a large number of iterations, often taking over 50 seconds for complex graphs, and convergence is slow.
3. The grammar of the formal language is fixed. Descriptions containing advanced concepts such as "similar" or "trajectory" cannot be expressed.
4. The dataset size is still small, and it only comes from mi
1. **Strong and novel method.** GeoLingua presents a principled framework for geometric diagram generation. This suggested architectural design is well-supported by strong empirical results: GeoLoom achieves top performance in constraint satisfaction metrics (LCI, ADI), manual accuracy, and user preference in both diagram quality and alignment.
2. **Clear writing and presentation.** The paper is clearly written, with informative diagrams, making the method easy to follow.
1. **Some geometric constraints are hard to understand.** The paper introduces several evaluation equations (e.g., Eq. 2 and Eq. 5 for length/angle relations) using Iverson bracket notation and conditional terms, but does not provide intuition or derivations. It’s unclear how these metrics correspond to geometric correctness or what theoretical justification supports their formulation.
2. **No ablation study on GeoLingua components or constraint types.** The impact of individual constraint types
Taxonomy
Topics: Data Visualization and Analytics · 3D Shape Modeling and Analysis · Constraint Satisfaction and Optimization
