Geo-Code: A Code Framework for Reverse Code Generation from Geometric Images Based on Two-Stage Multi-Agent Evolution
Zhenyu Wu, Yanxi Long, Jian Li, Hua Huang

TL;DR
Geo-code is an inverse programming framework that uses a multi-agent system to reconstruct complex geometric images as program code, improving geometric fidelity and supporting multimodal reasoning.
Contribution
The paper presents the first inverse programming framework for geometric images based on multi-agent systems, combining pixel-wise modeling and a closed-loop code evolution process.
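As a rough illustration of what a closed-loop, metric-driven code evolution process can look like, the toy sketch below repeatedly edits a candidate "program", re-renders it, and keeps the edit only if a similarity metric against the target image does not get worse. Everything in it is an assumption made for illustration and is not taken from the paper: the program is just a list of line segments, the metric is pixel IoU, and a seeded random perturbation stands in for the agents that would propose code edits. The names `render`, `iou`, and `evolve` are hypothetical.

```python
import random

# Hypothetical "program": a list of line segments (x0, y0, x1, y1) on a small grid.
Segment = tuple[int, int, int, int]


def render(segments: list[Segment], size: int = 16) -> set[tuple[int, int]]:
    """Rasterize the segments into a set of lit pixels (Bresenham's algorithm)."""
    pixels: set[tuple[int, int]] = set()
    for x0, y0, x1, y1 in segments:
        dx, dy = abs(x1 - x0), -abs(y1 - y0)
        sx = 1 if x0 < x1 else -1
        sy = 1 if y0 < y1 else -1
        err = dx + dy
        x, y = x0, y0
        while True:
            if 0 <= x < size and 0 <= y < size:
                pixels.add((x, y))
            if x == x1 and y == y1:
                break
            e2 = 2 * err
            if e2 >= dy:
                err += dy
                x += sx
            if e2 <= dx:
                err += dx
                y += sy
    return pixels


def iou(a: set, b: set) -> float:
    """Intersection-over-union of two pixel sets (assumed stand-in metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def evolve(target: set, program: list[Segment], steps: int = 2000,
           size: int = 16, seed: int = 0) -> tuple[list[Segment], float]:
    """Closed loop: propose an edit, re-render, score, keep it if not worse."""
    rng = random.Random(seed)
    best = list(program)
    best_score = iou(render(best, size), target)
    for _ in range(steps):
        if best_score == 1.0:
            break
        candidate = list(best)
        i = rng.randrange(len(candidate))
        seg = list(candidate[i])
        j = rng.randrange(4)
        # Random one-pixel nudge; a real system would have an agent propose the edit.
        seg[j] = min(size - 1, max(0, seg[j] + rng.choice((-1, 1))))
        candidate[i] = tuple(seg)
        score = iou(render(candidate, size), target)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score


# Target: a triangle. Initial guess: the same triangle with perturbed vertices,
# standing in for a first-pass reconstruction with imprecise coordinates.
target = render([(2, 2, 12, 2), (12, 2, 7, 11), (7, 11, 2, 2)])
initial = [(3, 3, 11, 1), (11, 1, 8, 12), (8, 12, 3, 3)]
refined, score = evolve(target, initial)
print(f"initial IoU: {iou(render(initial), target):.2f}, refined IoU: {score:.2f}")
```

Because an edit is accepted only when the score does not drop, the refined score can never be lower than the initial one; whether it reaches a perfect match depends on the starting guess and the number of steps.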
Findings
Achieves higher geometric reconstruction accuracy.
Maintains visual consistency and core geometric semantics.
Enables effective multimodal reasoning with reconstructed images.
Abstract
Program code serves as a bridge linking vision and logic, providing a feasible supervisory approach for enhancing the multimodal reasoning capability of large models through geometric operations such as auxiliary line construction and perspective transformation. Nevertheless, current inverse graphics methods face tremendous challenges in accurately reconstructing complex geometric details, which often results in the loss of key geometric constraints or structural distortion. To address this bottleneck, we propose Geo-coder -- the first inverse programming framework for geometric images based on a multi-agent system. Our method innovatively decouples the process into geometric modeling via pixel-wise anchoring and metric-driven code evolution: Stage 1 leverages the complementary advantages of visual operators and large models to achieve precise capture of pixel coordinates and visual…
Taxonomy
Topics: Multimodal Machine Learning Applications · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
