MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps

Yating Xu; Chen Li; Gim Hee Lee

arXiv:2410.21566·cs.CV·October 30, 2024

TL;DR

MVSDet uses plane sweeps for multi-view indoor 3D object detection. It places pixel features in the 3D volume through probabilistic sampling with soft weighting, regularizes depth prediction with pixel-aligned Gaussian Splatting, and outperforms NeRF-based methods.

Contribution

The paper proposes a novel plane sweep method with probabilistic sampling and Gaussian Splatting for more accurate geometry-aware 3D detection in indoor scenes.

Findings

01. Outperforms NeRF-based methods on ScanNet and ARKitScenes datasets.

02. Achieves high detection accuracy with low computational overhead.

03. Demonstrates robustness across multiple indoor datasets.

Abstract

The key challenge of multi-view indoor 3D object detection is to infer accurate geometry from images for precise 3D detection. Prior work relies on NeRF for geometry reasoning. However, the geometry extracted from NeRF is generally inaccurate, which leads to sub-optimal detection performance. In this paper, we propose MVSDet, which uses plane sweep for geometry-aware 3D object detection. To avoid needing a large number of depth planes for accurate depth prediction, we design a probabilistic sampling and soft weighting mechanism to decide where pixel features are placed in the 3D volume. For each pixel, we select the multiple locations that score highest in the probability volume and use their probability scores to indicate confidence. We further apply recent pixel-aligned Gaussian Splatting to regularize depth prediction and improve detection performance…
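
The sampling and soft weighting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`top_k_depths`, `backproject`, `place_feature`), the softmax normalization over depth planes, the pinhole camera model, and the sparse dictionary used as a voxel volume are all assumptions made for illustration. For one pixel, the sketch keeps the k most probable depth candidates and adds the pixel feature to each corresponding voxel, scaled by that candidate's probability.

```python
import math


def softmax(scores):
    """Normalize raw per-plane scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_depths(depth_scores, depth_values, k):
    """Return the k most probable (depth, probability) pairs for one pixel."""
    probs = softmax(depth_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(depth_values[i], probs[i]) for i in ranked[:k]]


def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at `depth` into camera coordinates."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def place_feature(volume, feature, u, v, candidates, intrinsics, voxel_size):
    """Add probability-weighted copies of a pixel feature to a sparse voxel dict.

    Assumption: `volume` maps integer voxel indices to accumulated feature lists.
    """
    fx, fy, cx, cy = intrinsics
    for depth, prob in candidates:
        x, y, z = backproject(u, v, depth, fx, fy, cx, cy)
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        acc = volume.setdefault(key, [0.0] * len(feature))
        for channel, value in enumerate(feature):
            acc[channel] += prob * value  # soft weighting by depth confidence
    return volume


# Hypothetical toy input: four depth planes and a 2-channel pixel feature.
depth_values = [1.0, 2.0, 3.0, 4.0]
depth_scores = [0.0, 2.0, 1.0, -1.0]
candidates = top_k_depths(depth_scores, depth_values, k=2)
volume = place_feature({}, [1.0, 2.0], u=2, v=2, candidates=candidates,
                       intrinsics=(1.0, 1.0, 2.0, 2.0), voxel_size=0.5)
```

In this toy case the pixel lies on the optical axis, so its feature lands in two voxels along the viewing ray (at depths 2.0 and 3.0), each weighted by its probability. A full pipeline would also transform the points into a shared world frame with camera extrinsics and aggregate over all views, which this sketch leaves out.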

Code & Models

Repositories

pixie8888/mvsdet · jax · Official

Videos

MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps · SlidesLive

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization