GGPT: Geometry-Grounded Point Transformer
Yutong Chen, Yiming Wang, Xucong Zhang, Sergey Prokudin, Siyu Tang

TL;DR
GGPT introduces a geometry-grounded point transformer that combines geometric priors with dense feed-forward predictions to improve 3D reconstruction accuracy, consistency, and detail recovery from sparse RGB views.
Contribution
The paper presents a novel framework integrating geometric guidance with a point transformer for enhanced 3D reconstruction, including an improved SfM pipeline and explicit partial-geometry supervision.
Findings
Outperforms state-of-the-art models in 3D reconstruction accuracy.
Produces geometrically consistent and spatially complete reconstructions.
Generalizes well across different datasets and architectures.
Abstract
Recent feed-forward networks have achieved remarkable progress in sparse-view 3D reconstruction by predicting dense point maps directly from RGB images. However, they often suffer from geometric inconsistencies and limited fine-grained accuracy due to the absence of explicit multi-view constraints. We introduce the Geometry-Grounded Point Transformer (GGPT), a framework that augments feed-forward reconstruction with reliable sparse geometric guidance. We first propose an improved Structure-from-Motion pipeline based on dense feature matching and lightweight geometric optimisation to efficiently estimate accurate camera poses and partial 3D point clouds from sparse input views. Building on this foundation, we propose a geometry-guided 3D point transformer that refines dense point maps under explicit partial-geometry supervision using an optimised guidance encoding. Extensive experiments…
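The "explicit multi-view constraints" mentioned in the abstract can be illustrated with a small example. The following is a minimal sketch, not the authors' pipeline: it assumes two known camera projection matrices and one matched pixel pair, and recovers the 3D point with standard linear (DLT) triangulation, the kind of step an SfM system uses to turn feature matches into a partial point cloud. The function names `triangulate_dlt` and `project`, the intrinsics `K`, and the toy camera layout are all hypothetical choices made for illustration.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X (shape (3,)) with a 3x4 projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 projection matrices (assumed known here).
    x1, x2: matched pixel coordinates (u, v) in each view.
    """
    # Each view contributes two linear constraints on the homogeneous point.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]
    return X_h[:3] / X_h[3]

# Toy setup (assumed values): shared intrinsics, second camera shifted along x.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 4.0])
x1, x2 = project(P1, X_true), project(P2, X_true)

X_est = triangulate_dlt(P1, P2, x1, x2)
reproj_error = np.linalg.norm(project(P1, X_est) - x1)
```

With noise-free matches the triangulated point coincides with the ground truth up to numerical precision. With real matches, the reprojection error is the quantity a geometric optimisation step would typically minimise.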
Taxonomy
Topics: 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
