MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
Muyu Xu, Fangneng Zhan, Xiaoqin Zhang, Ling Shao, Shijian Lu

TL;DR
MuSASplat is a lightweight, efficient framework for sparse-view 3D Gaussian splatting that maintains high rendering quality while drastically reducing training costs through multi-scale adaptation and feature fusion techniques.
Contribution
Introduces MuSASplat, a framework that combines a lightweight Multi-Scale Adapter with a feature fusion method for efficient, high-quality 3D Gaussian splatting without full fine-tuning of large ViT backbones.
Findings
Achieves state-of-the-art rendering quality on multiple datasets.
Significantly reduces the number of trainable parameters and the GPU cost of training.
Maintains high fidelity with sparse input views.
Abstract
Sparse-view 3D Gaussian splatting seeks to render high-quality novel views of 3D scenes from a limited set of input images. While recent pose-free feed-forward methods leveraging pre-trained 3D priors have achieved impressive results, most of them rely on full fine-tuning of large Vision Transformer (ViT) backbones and incur substantial GPU costs. In this work, we introduce MuSASplat, a novel framework that dramatically reduces the computational burden of training pose-free feed-forward 3D Gaussian splatting models with little compromise in rendering quality. Central to our approach is a lightweight Multi-Scale Adapter that enables efficient fine-tuning of ViT-based architectures with only a small fraction of trainable parameters. This design avoids the prohibitive GPU overhead associated with previous full-model adaptation techniques while maintaining high fidelity in novel view synthesis,…
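To give a rough sense of what "only a small fraction of trainable parameters" can mean, the sketch below counts parameters for a frozen ViT backbone with small adapters attached to every block. It is a back-of-the-envelope illustration, not the paper's architecture: the bottleneck-adapter layout, the choice of three bottleneck widths as a stand-in for "multi-scale", and all function names are assumptions made here. The ViT-Large-like sizes (width 1024, depth 24) are likewise only an example.

```python
# Illustrative parameter count: frozen ViT blocks plus small trainable adapters.
# The adapter design below is a hypothetical placeholder, NOT the MuSASplat adapter.


def vit_block_params(dim: int, mlp_ratio: int = 4) -> int:
    """Parameters in one standard pre-norm ViT block (weights and biases)."""
    attn = 4 * dim * dim + 4 * dim  # fused qkv projection plus output projection
    hidden = mlp_ratio * dim
    mlp = dim * hidden + hidden + hidden * dim + dim  # two linear layers
    norms = 2 * 2 * dim  # two LayerNorms, each with scale and shift
    return attn + mlp + norms


def bottleneck_adapter_params(dim: int, rank: int) -> int:
    """Parameters in a down-projection / up-projection adapter (assumed design)."""
    return dim * rank + rank + rank * dim + dim


def trainable_fraction(dim: int, depth: int, ranks: tuple) -> float:
    """Share of all parameters that train when only the adapters are updated."""
    backbone = depth * vit_block_params(dim)  # frozen
    adapters = depth * sum(bottleneck_adapter_params(dim, r) for r in ranks)
    return adapters / (backbone + adapters)


if __name__ == "__main__":
    # Example sizes only: width 1024, 24 blocks, three assumed adapter widths.
    fraction = trainable_fraction(dim=1024, depth=24, ranks=(16, 32, 64))
    print(f"trainable share of parameters: {fraction:.2%}")
```

Under these assumed sizes the adapters account for only a few percent of the total parameter count. The actual savings reported for MuSASplat depend on its real adapter design and backbone, which this sketch does not reproduce.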
Taxonomy
Topics: Advanced Vision and Imaging · Video Coding and Compression Technologies · Computer Graphics and Visualization Techniques
