TL;DR
GlobalSplat introduces an approach to 3D Gaussian splatting built on global scene tokens, producing a compact, consistent representation that achieves high-quality novel-view synthesis with faster inference and fewer primitives.
Contribution
GlobalSplat proposes a global scene representation that encodes the multi-view input before any Gaussians are decoded, improving efficiency and consistency over local, heuristic-driven allocation strategies.
Findings
Achieves competitive view synthesis with only 16K Gaussians.
Runs inference in under 78 milliseconds in a single forward pass.
Keeps the model footprint at 4 MB while maintaining high reconstruction quality.
Abstract
The efficient spatial allocation of primitives serves as the foundation of 3D Gaussian Splatting, as it directly dictates the synergy between representation compactness, reconstruction speed, and rendering fidelity. Previous solutions, whether based on iterative optimization or feed-forward inference, suffer from significant trade-offs between these goals, mainly due to the reliance on local, heuristic-driven allocation strategies that lack global scene awareness. Specifically, current feed-forward methods are largely pixel-aligned or voxel-aligned. By unprojecting pixels into dense, view-aligned primitives, they bake redundancy into the 3D asset. As more input views are added, the representation size increases and global consistency becomes fragile. To this end, we introduce GlobalSplat, a framework built on the principle of align first, decode later. Our approach learns a compact,…
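The growth in representation size described in the abstract can be made concrete with a little arithmetic. The sketch below is an illustration only, not the authors' code: it assumes a pixel-aligned method emits one Gaussian per input pixel, uses an illustrative 256x256 input resolution, and reads "16K" as 16,000. The function names are hypothetical.

```python
# Illustrative arithmetic only; the resolution, the one-Gaussian-per-pixel rule,
# and the reading of "16K" as 16,000 are assumptions, not values from the paper.

FIXED_BUDGET = 16_000  # assumed interpretation of "16K Gaussians"


def pixel_aligned_count(num_views: int, height: int, width: int,
                        per_pixel: int = 1) -> int:
    """Primitives produced when every input pixel is unprojected to a Gaussian."""
    return num_views * height * width * per_pixel


def redundancy_ratio(num_views: int, height: int, width: int) -> float:
    """How many times larger the pixel-aligned set is than the fixed budget."""
    return pixel_aligned_count(num_views, height, width) / FIXED_BUDGET


if __name__ == "__main__":
    for views in (2, 4, 8, 16):
        count = pixel_aligned_count(views, 256, 256)
        ratio = redundancy_ratio(views, 256, 256)
        print(f"{views:>2} views: {count:>9,} pixel-aligned vs "
              f"{FIXED_BUDGET:,} fixed ({ratio:.1f}x)")
```

Under these assumptions the pixel-aligned count scales linearly with the number of views, while a fixed global budget stays constant regardless of how many views are supplied.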
