Feat2GS: Probing Visual Foundation Models with Gaussian Splatting
Yue Chen, Xingyu Chen, Anpei Chen, Gerard Pons-Moll, Yuliang Xiu

TL;DR
This paper introduces Feat2GS, a novel framework that probes the 3D understanding of visual foundation models using Gaussian splatting, enabling texture and geometry analysis without 3D data, and advancing view synthesis capabilities.
Contribution
We propose Feat2GS, a new method for assessing 3D awareness in VFMs through Gaussian splatting, which does not require 3D ground-truth data and allows separate analysis of geometry and texture.
Findings
Feat2GS effectively probes 3D awareness in VFMs.
Our variants achieve state-of-the-art results on diverse datasets.
The method enables novel view synthesis without 3D supervision.
Abstract
Given that visual foundation models (VFMs) are trained on extensive datasets but often limited to 2D images, a natural question arises: how well do they understand the 3D world? With the differences in architecture and training protocols (i.e., objectives, proxy tasks), a unified framework to fairly and comprehensively probe their 3D awareness is urgently needed. Existing works on 3D probing suggest single-view 2.5D estimation (e.g., depth and normal) or two-view sparse 2D correspondence (e.g., matching and tracking). Unfortunately, these tasks ignore texture awareness and require 3D data as ground truth, which limits the scale and diversity of their evaluation set. To address these issues, we introduce Feat2GS, which reads out 3D Gaussian attributes from VFM features extracted from unposed images. This allows us to probe 3D awareness for geometry and texture via novel view synthesis,…
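The readout idea in the abstract can be illustrated with a minimal sketch. It assumes per-pixel VFM features are already available as plain vectors and maps each one to a set of 3D Gaussian attributes through two separate linear heads, one for geometry and one for texture. The linear heads, the attribute layout (`GEOMETRY_DIMS`, `TEXTURE_DIMS`), the activations, and all function names are hypothetical choices made for illustration; the summary above does not specify the actual readout architecture, and no rendering or training step is shown.

```python
import math
import random

# Hypothetical attribute layout: geometry and texture are read out separately.
GEOMETRY_DIMS = {"position": 3, "opacity": 1, "scale": 3, "rotation": 4}
TEXTURE_DIMS = {"color": 3}


def make_linear(in_dim, out_dim, rng):
    """Build an illustrative linear head: random weights (out_dim x in_dim), zero bias."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def apply_linear(layer, feature):
    """Compute W @ feature + b with plain Python lists."""
    weights, bias = layer
    return [sum(w * x for w, x in zip(row, feature)) + b for row, b in zip(weights, bias)]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def split(vector, dims):
    """Slice a flat output vector into named attribute chunks."""
    out, start = {}, 0
    for name, size in dims.items():
        out[name] = vector[start:start + size]
        start += size
    return out


def readout_gaussian(feature, geometry_layer, texture_layer):
    """Map one feature vector to one Gaussian's attributes (assumed activations)."""
    geo = split(apply_linear(geometry_layer, feature), GEOMETRY_DIMS)
    tex = split(apply_linear(texture_layer, feature), TEXTURE_DIMS)
    norm = math.sqrt(sum(q * q for q in geo["rotation"])) or 1.0
    return {
        "position": geo["position"],                      # unconstrained 3D center
        "opacity": sigmoid(geo["opacity"][0]),            # squashed into (0, 1)
        "scale": [math.exp(s) for s in geo["scale"]],     # strictly positive extents
        "rotation": [q / norm for q in geo["rotation"]],  # unit quaternion
        "color": [sigmoid(c) for c in tex["color"]],      # RGB in (0, 1)
    }


rng = random.Random(0)
feature_dim = 8  # stand-in for the VFM feature dimension
geometry_layer = make_linear(feature_dim, sum(GEOMETRY_DIMS.values()), rng)
texture_layer = make_linear(feature_dim, sum(TEXTURE_DIMS.values()), rng)

# Random vectors stand in for features extracted from image pixels.
features = [[rng.gauss(0.0, 1.0) for _ in range(feature_dim)] for _ in range(4)]
gaussians = [readout_gaussian(f, geometry_layer, texture_layer) for f in features]
```

Keeping the geometry and texture heads separate is what would let a probe feed VFM features to one head while the other is handled independently, which is one plausible way to analyze the two aspects in isolation.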
Taxonomy
Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
Methods: Attentive Walk-Aggregating Graph Neural Network
