Mode-GS: Monocular Depth Guided Anchored 3D Gaussian Splatting for Robust Ground-View Scene Rendering

Yonghan Lee; Jaehoon Choi; Dongki Jung; Jaeseong Yun; Soohyun Ryu; Dinesh Manocha; and Suyong Yeon

arXiv:2410.04646·cs.CV·October 8, 2024·2 cites

TL;DR

Mode-GS introduces a novel neural rendering approach that leverages monocular depth-guided Gaussian splats with scale calibration, significantly improving 3D scene rendering accuracy in ground-robot datasets.

Contribution

The paper proposes a new anchored Gaussian splatting method that integrates monocular depth cues and scale calibration to enhance robustness and accuracy in ground-view scene rendering.

Findings

01. Achieves state-of-the-art performance on R3LIVE and Tanks and Temples datasets.

02. Improves rendering metrics such as PSNR, SSIM, and LPIPS.

03. Effectively handles scene complexity and monocular depth scale ambiguity.
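
For readers unfamiliar with the metrics named in the findings, PSNR and SSIM are higher-is-better scores, while LPIPS is a lower-is-better perceptual distance. The sketch below computes PSNR only, using its standard definition over flattened pixel intensities. It is an illustration, not the evaluation code used in the paper; the function name `psnr` and the assumption that intensities lie in [0, max_value] are ours.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio (in dB) between two flattened images.

    `rendered` and `reference` are equal-length sequences of pixel
    intensities in [0, max_value]. This layout is an assumption made
    for the illustration.
    """
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # A uniform error of 0.1 on a [0, 1] scale gives roughly 20 dB.
    print(psnr([0.0, 0.0, 0.0], [0.1, 0.1, 0.1]))
```

A higher value means the rendered view is closer to the reference image; SSIM and LPIPS require windowed statistics and a pretrained network respectively, so they are not sketched here.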

Abstract

We present a novel-view rendering algorithm, Mode-GS, for ground-robot trajectory datasets. Our approach is based on using anchored Gaussian splats, which are designed to overcome the limitations of existing 3D Gaussian splatting algorithms. Prior neural rendering methods suffer from severe splat drift due to scene complexity and insufficient multi-view observation, and can fail to fix splats on the true geometry in ground-robot datasets. Our method integrates pixel-aligned anchors from monocular depths and generates Gaussian splats around these anchors using residual-form Gaussian decoders. To address the inherent scale ambiguity of monocular depth, we parameterize anchors with per-view depth-scales and employ scale-consistent depth loss for online scale calibration. Our method results in improved rendering performance, based on PSNR, SSIM, and LPIPS metrics, in ground scenes with free…
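
To make the idea of pixel-aligned anchors with per-view depth scales concrete, the following minimal sketch back-projects a monocular depth map into 3D anchor positions, with one scalar scale applied per view. It is a sketch under stated assumptions rather than the authors' implementation: the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the pixel-centre offset of 0.5, the camera-to-world pose layout, and the function name `backproject_anchors` are all hypothetical. The residual-form Gaussian decoders and the scale-consistent depth loss described in the abstract are not shown; in the paper the scale is calibrated online, whereas here it is simply passed in.

```python
def backproject_anchors(depth, fx, fy, cx, cy, scale=1.0,
                        rotation=((1.0, 0.0, 0.0),
                                  (0.0, 1.0, 0.0),
                                  (0.0, 0.0, 1.0)),
                        translation=(0.0, 0.0, 0.0)):
    """Lift every pixel of a depth map to one 3D anchor in world space.

    depth       : list of rows of monocular depth values (arbitrary scale)
    fx, fy      : focal lengths in pixels (assumed pinhole camera)
    cx, cy      : principal point in pixels
    scale       : per-view depth scale; a plain input in this sketch
    rotation    : 3x3 camera-to-world rotation (assumed convention)
    translation : camera centre in world coordinates
    """
    anchors = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            # Scaled depth along the optical axis.
            z = scale * d
            # Pinhole back-projection through the pixel centre.
            x = (u + 0.5 - cx) / fx * z
            y = (v + 0.5 - cy) / fy * z
            p_cam = (x, y, z)
            # Rigid transform from camera frame to world frame.
            p_world = tuple(
                sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
                for i in range(3)
            )
            anchors.append(p_world)
    return anchors


if __name__ == "__main__":
    toy_depth = [[3.0, 3.0], [3.0, 3.0]]
    for anchor in backproject_anchors(toy_depth, 2.0, 2.0, 1.0, 1.0, scale=2.0):
        print(anchor)
```

Because the anchors are tied to pixels and a single scale per view, changing `scale` moves every anchor of that view along its own camera ray, which is what allows the scale ambiguity of monocular depth to be corrected with one parameter per image.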

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · 3D Surveying and Cultural Heritage