MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection
Junkai Xu, Liang Peng, Haoran Cheng, Hao Li, Wei Qian, Ke Li, Wenxiao Wang, Deng Cai

TL;DR
MonoNeRD introduces a novel monocular 3D detection framework using NeRF-like dense 3D scene representations via volume rendering, improving 3D perception especially for distant and occluded objects.
Contribution
To the authors' knowledge, it is the first work to apply volume rendering with implicit scene representations to monocular 3D detection, building dense 3D representations without back-projecting an estimated depth map into 3D space.
Findings
Outperforms existing methods on KITTI-3D and Waymo datasets.
Produces dense 3D reconstructions from monocular images.
Demonstrates the effectiveness of implicit scene modeling in 3D detection.
Abstract
In the field of monocular 3D detection, it is common practice to utilize scene geometric clues to enhance the detector's performance. However, many existing works adopt these clues explicitly such as estimating a depth map and back-projecting it into 3D space. This explicit methodology induces sparsity in 3D representations due to the increased dimensionality from 2D to 3D, and leads to substantial information loss, especially for distant and occluded objects. To alleviate this issue, we propose MonoNeRD, a novel detection framework that can infer dense 3D geometry and occupancy. Specifically, we model scenes with Signed Distance Functions (SDF), facilitating the production of dense 3D representations. We treat these representations as Neural Radiance Fields (NeRF) and then employ volume rendering to recover RGB images and depth maps. To the best of our knowledge, this work is the first…
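The abstract's pipeline (SDF values turned into a radiance-field-like density, then volume-rendered into color and depth) can be illustrated for a single camera ray. The following is a minimal sketch under stated assumptions, not the authors' implementation: the logistic SDF-to-density mapping, the `alpha` sharpness parameter, and the function names `sdf_to_density` and `render_ray` are all hypothetical choices made for illustration. Only the compositing step follows the standard NeRF volume rendering quadrature.

```python
import math


def sdf_to_density(sdf, alpha=0.1):
    """Map a signed distance to a non-negative density.

    ASSUMPTION: a logistic form, sigmoid(-sdf / alpha) / alpha, chosen for
    illustration only. Density is near zero in free space (sdf > 0) and
    approaches 1 / alpha inside surfaces (sdf < 0).
    """
    x = max(-60.0, min(60.0, sdf / alpha))  # clamp to keep exp() well-behaved
    return (1.0 / alpha) / (1.0 + math.exp(x))


def render_ray(ts, sdfs, colors, alpha=0.1):
    """Composite samples along one ray with standard volume rendering.

    ts:     increasing sample distances along the ray
    sdfs:   signed distance at each sample
    colors: (r, g, b) tuple at each sample
    Returns (rgb, depth, weights).
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    rgb = [0.0, 0.0, 0.0]
    depth = 0.0
    weights = []
    for i, (t, s, c) in enumerate(zip(ts, sdfs, colors)):
        # Interval length to the next sample; the last sample reuses the
        # previous interval (or 1.0 if there is only one sample).
        if i + 1 < len(ts):
            delta = ts[i + 1] - t
        else:
            delta = ts[i] - ts[i - 1] if i > 0 else 1.0
        sigma = sdf_to_density(s, alpha)
        opacity = 1.0 - math.exp(-sigma * delta)
        w = transmittance * opacity  # contribution of this sample
        weights.append(w)
        depth += w * t
        for k in range(3):
            rgb[k] += w * c[k]
        transmittance *= 1.0 - opacity
    return rgb, depth, weights


# Toy ray: a flat surface at distance 5, so the SDF along the ray is 5 - t.
ts = [float(t) for t in range(1, 10)]
sdfs = [5.0 - t for t in ts]
colors = [(0.2, 0.6, 0.9)] * len(ts)
rgb, depth, weights = render_ray(ts, sdfs, colors)
print(depth)  # close to 5.0, where the SDF crosses zero
```

Because every sample along the ray receives a density, the representation stays dense in 3D, while the rendered depth and color can still be compared against 2D supervision. That is the property the abstract contrasts with back-projecting a single estimated depth per pixel.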
Videos
MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection · YouTube
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization
