Ref-SAM3D: Bridging SAM3D with Text for Reference 3D Reconstruction
Yun Zhou, Yaoting Wang, Guangquan Jie, Jinyu Liu, Henghui Ding

TL;DR
Ref-SAM3D extends SAM3D with textual descriptions, enabling zero-shot 3D reconstruction of a text-specified object from a single RGB image, guided only by natural language.
Contribution
We introduce Ref-SAM3D, an extension of SAM3D that incorporates text as a high-level prior for reference-guided 3D reconstruction from a single RGB image.
Findings
Achieves high-fidelity zero-shot 3D reconstruction guided by text.
Effectively bridges 2D visual cues and 3D geometric understanding.
Demonstrates competitive performance with only natural language and a single view.
Abstract
SAM3D has garnered widespread attention for its strong 3D object reconstruction capabilities. However, a key limitation remains: SAM3D cannot reconstruct specific objects referred to by textual descriptions, a capability that is essential for practical applications such as 3D editing, game development, and virtual environments. To address this gap, we introduce Ref-SAM3D, a simple yet effective extension to SAM3D that incorporates textual descriptions as a high-level prior, enabling text-guided 3D reconstruction from a single RGB image. Through extensive qualitative experiments, we show that Ref-SAM3D, guided only by natural language and a single 2D view, delivers competitive and high-fidelity zero-shot reconstruction performance. Our results demonstrate that Ref-SAM3D effectively bridges the gap between 2D visual cues and 3D geometric understanding, offering a more flexible and…
Taxonomy
Topics: 3D Shape Modeling and Analysis · Interactive and Immersive Displays · 3D Surveying and Cultural Heritage
