TL;DR
MonoRec is a semi-supervised monocular dense reconstruction method that predicts depth maps from a single moving camera. It handles dynamic scenes by detecting moving objects, and it generalizes across datasets.
Contribution
The paper introduces MonoRec, a semi-supervised multi-view stereo architecture with a MaskModule that predicts moving object masks, together with a multi-stage training scheme whose loss does not require LiDAR depth values.
Findings
Achieves state-of-the-art performance on the KITTI dataset
Generalizes well to the Oxford RobotCar and TUM-Mono datasets
Effectively reconstructs static and moving objects
Abstract
In this paper, we propose MonoRec, a semi-supervised monocular dense reconstruction architecture that predicts depth maps from a single moving camera in dynamic environments. MonoRec is based on a multi-view stereo setting which encodes the information of multiple consecutive images in a cost volume. To deal with dynamic objects in the scene, we introduce a MaskModule that predicts moving object masks by leveraging the photometric inconsistencies encoded in the cost volumes. Unlike other multi-view stereo methods, MonoRec is able to reconstruct both static and moving objects by leveraging the predicted masks. Furthermore, we present a novel multi-stage training scheme with a semi-supervised loss formulation that does not require LiDAR depth values. We carefully evaluate MonoRec on the KITTI dataset and show that it achieves state-of-the-art performance compared to both multi-view and…
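The cost volume mentioned in the abstract can be illustrated with a small plane-sweep sketch. This is not the authors' implementation: the function name `photometric_cost_volume`, the nearest-neighbor sampling, and the absolute intensity difference used as the photometric measure are all assumptions made for illustration, and the summary above does not state which measure the paper uses. The camera intrinsics `K`, the relative pose `R`, `t` (assumed to map keyframe coordinates into the source frame), and the depth hypotheses are likewise made-up toy values.

```python
import numpy as np

def photometric_cost_volume(key_img, src_img, K, R, t, depths):
    """Hypothetical helper: for each depth hypothesis, warp keyframe pixels
    into the source frame and store the absolute intensity difference.
    Assumption: R, t map keyframe camera coordinates into the source frame."""
    h, w = key_img.shape
    # Homogeneous pixel grid of the keyframe, shape (3, h*w).
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([us.ravel(), vs.ravel(), np.ones(h * w)]).astype(np.float64)
    rays = np.linalg.inv(K) @ pix  # viewing rays at unit depth
    volume = np.full((len(depths), h, w), np.nan)
    for i, d in enumerate(depths):
        pts = R @ (rays * d) + t.reshape(3, 1)  # 3D points in the source frame
        proj = K @ pts
        z = proj[2]
        in_front = z > 1e-9
        safe_z = np.where(in_front, z, 1.0)  # avoid division by zero
        u = np.rint(proj[0] / safe_z).astype(int)
        v = np.rint(proj[1] / safe_z).astype(int)
        valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        cost = np.full(h * w, np.nan)  # NaN marks pixels that leave the image
        cost[valid] = np.abs(key_img.ravel()[valid] - src_img[v[valid], u[valid]])
        volume[i] = cost.reshape(h, w)
    return volume

# Toy static scene: a fronto-parallel plane at depth 2.0 seen by a camera
# that translates sideways, which shifts the image by exactly one pixel.
rng = np.random.default_rng(0)
key_img = rng.random((6, 8))
src_img = np.roll(key_img, 1, axis=1)
K = np.array([[10.0, 0.0, 4.0], [0.0, 10.0, 3.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.2, 0.0, 0.0])
depths = [0.5, 1.0, 2.0]

volume = photometric_cost_volume(key_img, src_img, K, R, t, depths)
mean_cost = np.nanmean(volume, axis=(1, 2))
best_depth = depths[int(np.argmin(mean_cost))]
```

In this static toy scene the cost vanishes at the true depth hypothesis and stays high at the others. A moving object breaks that pattern, because no single depth makes its appearance consistent across frames; this is the kind of photometric inconsistency the abstract says the MaskModule leverages.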
