S-VGGT: Structure-Aware Subscene Decomposition for Scalable 3D Foundation Models
Xinze Li, Pengxu Chen, Yiyuan Wang, Weifeng Su, Wentao Cheng

TL;DR
S-VGGT introduces a structure-aware subscene decomposition method that reduces the computational cost of 3D foundation models by leveraging scene graphs and shared reference frames, enabling scalable and efficient processing.
Contribution
The paper presents a scene-graph-based subscene partitioning approach that reduces the cost of global attention in 3D foundation models and complements existing token-level acceleration techniques.
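To make the cost argument concrete, here is a rough back-of-the-envelope sketch of how splitting frames into subscenes changes the number of query-key pairs in one global attention layer. It is not taken from the paper: the even split, the choice to append the shared reference frames to every subscene, and the numbers (100 frames, 1000 tokens per frame, 4 subscenes) are all assumptions for illustration.

```python
def attention_pair_count(num_frames: int, tokens_per_frame: int) -> int:
    """Query-key pairs in one global attention layer over all given frames."""
    total_tokens = num_frames * tokens_per_frame
    return total_tokens * total_tokens  # quadratic in the token count


def partitioned_pair_count(num_frames: int, tokens_per_frame: int,
                           num_subscenes: int, shared_refs: int = 1) -> int:
    """Pairs when attention runs separately inside each subscene.

    Assumption: frames are split as evenly as possible, and every subscene
    additionally contains `shared_refs` shared reference frames.
    """
    base, extra = divmod(num_frames, num_subscenes)
    total = 0
    for k in range(num_subscenes):
        frames_here = base + (1 if k < extra else 0) + shared_refs
        total += attention_pair_count(frames_here, tokens_per_frame)
    return total


# Hypothetical workload: 100 frames, 1000 tokens each, 4 subscenes.
global_pairs = attention_pair_count(100, 1000)
split_pairs = partitioned_pair_count(100, 1000, num_subscenes=4)
print(global_pairs, split_pairs, round(global_pairs / split_pairs, 2))
```

Under these assumptions the pair count drops by a little less than the number of subscenes, because each subscene pays for its copy of the reference frame. Real savings depend on how uneven the subscenes are and on how much of the model is global attention.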
Findings
Significant reduction in computational cost for 3D models.
Compatible with token-level acceleration methods for compounded speedups.
Maintains high reconstruction fidelity despite structural decomposition.
Abstract
Feed-forward 3D foundation models face a key challenge: the quadratic computational cost introduced by global attention, which severely limits scalability as input length increases. Concurrent acceleration methods, such as token merging, operate at the token level. While they offer local savings, the required nearest-neighbor searches introduce undesirable overhead. Consequently, these techniques fail to tackle the fundamental issue of structural redundancy dominant in dense capture data. In this work, we introduce S-VGGT, a novel approach that addresses redundancy at the structural frame level, drastically shifting the optimization focus. We first leverage the initial features to build a dense scene graph, which characterizes structural scene redundancy and guides the subsequent scene partitioning. Using this graph, we softly assign frames to a small number of subscenes,…
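The graph-then-partition idea described in the abstract can be sketched in a few lines. The abstract is truncated and does not specify the similarity measure or the assignment rule, so everything concrete here is an assumption rather than the authors' method: cosine similarity as the edge weight, hand-picked anchor frames, a softmax with a hypothetical `temperature`, and a hypothetical membership `threshold`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dense_scene_graph(frame_features):
    """Dense graph as an N x N matrix of pairwise frame similarities."""
    n = len(frame_features)
    return [[cosine(frame_features[i], frame_features[j]) for j in range(n)]
            for i in range(n)]


def soft_assign(graph, anchor_ids, temperature=0.1):
    """Softmax over each frame's similarity to the anchor frames (assumed rule)."""
    weights = []
    for row in graph:
        logits = [row[a] / temperature for a in anchor_ids]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def subscenes_from_weights(weights, threshold=0.3):
    """A frame joins every subscene whose weight reaches the threshold."""
    num_subscenes = len(weights[0])
    return [[i for i, w in enumerate(weights) if w[c] >= threshold]
            for c in range(num_subscenes)]


# Toy per-frame features: a camera sweeping from one view direction to another.
features = [[1.0, 0.0], [0.9, 0.1], [0.7, 0.7], [0.1, 0.9], [0.0, 1.0]]
graph = dense_scene_graph(features)
weights = soft_assign(graph, anchor_ids=[0, 4])
groups = subscenes_from_weights(weights)
print(groups)
```

In this toy example the middle frame is equally similar to both anchors, so the soft assignment places it in both subscenes. That overlap is the practical difference between a soft and a hard partition.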
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Computational Geometry and Mesh Generation
