Masks make discriminative models great again!
Tianshi Cao, Marie-Julie Rakotosaona, Ben Poole, Federico Tombari, Michael Niemeyer

TL;DR
This paper introduces Image2GS, a method that improves 3D scene reconstruction from a single image by focusing on the visible region and using visibility masks, leading to better quality in visible areas and competitive overall results.
Contribution
It proposes decoupling image-to-3D lifting from completion, trains with visibility masks derived from optimized 3D Gaussian splats, and demonstrates improved reconstruction quality in visible regions.
Findings
Improved visible region reconstruction quality.
Competitive performance on complete scenes.
Evidence that separating lifting from completion is important.
Abstract
We present Image2GS, a novel approach that addresses the challenging problem of reconstructing photorealistic 3D scenes from a single image by focusing specifically on the image-to-3D lifting component of the reconstruction process. By decoupling the lifting problem (converting an image to a 3D model representing what is visible) from the completion problem (hallucinating content not present in the input), we create a more deterministic task suitable for discriminative models. Our method employs visibility masks derived from optimized 3D Gaussian splats to exclude areas not visible from the source view during training. This masked training strategy significantly improves reconstruction quality in visible regions compared to strong baselines. Notably, despite being trained only on masked regions, Image2GS remains competitive with state-of-the-art discriminative models trained on full…
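The masked training idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `masked_l1_loss`, the choice of an L1 photometric loss, and the array shapes are all assumptions made for illustration. The sketch only shows how a per-pixel visibility mask can exclude regions not visible from the source view from a reconstruction loss.

```python
import numpy as np


def masked_l1_loss(rendered, target, visibility_mask, eps=1e-8):
    """Mean absolute error restricted to visible pixels (hypothetical helper).

    rendered, target: float arrays of shape (H, W, 3).
    visibility_mask:  array of shape (H, W); 1 where the pixel shows content
                      visible from the source view, 0 elsewhere.
    """
    # Broadcast the mask over the colour channels: (H, W) -> (H, W, 1).
    mask = visibility_mask.astype(rendered.dtype)[..., None]
    # Errors at masked-out pixels are zeroed, so they contribute nothing.
    abs_err = np.abs(rendered - target) * mask
    # Normalise by the number of visible pixel-channels, not the image size.
    denom = mask.sum() * rendered.shape[-1] + eps
    return abs_err.sum() / denom


# Toy example: a 2x2 image in which only the top-left pixel is visible.
rendered = np.zeros((2, 2, 3))
target = np.ones((2, 2, 3))
target[1, 1] = 100.0  # large error in a region the mask excludes
mask = np.array([[1, 0], [0, 0]])
loss = masked_l1_loss(rendered, target, mask)
```

In this toy example the large error at the masked-out pixel does not affect the loss, which is the behaviour that makes the task more deterministic: the model is never penalised for content it could not have observed.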
