MFuseNet: Robust Depth Estimation with Learned Multiscopic Fusion

Weihao Yuan; Rui Fan; Michael Yu Wang; Qifeng Chen

arXiv:2108.02448 · cs.CV · August 21, 2021

TL;DR

This paper introduces MFuseNet, a multiscopic vision system that captures aligned images with a low-cost monocular RGB camera and fuses the resulting cost volumes, yielding better depth estimates than traditional two-frame stereo matching.

Contribution

The paper presents a multiscopic vision system, two methods for fusing multiple cost volumes (a new heuristic method and a robust learning-based method), and a synthetic multiscopic dataset for training.

Findings

1. Outperforms traditional two-frame stereo matching in depth estimation

2. Fusing multiple cost volumes between the reference image and its surrounding images improves depth estimation

3. Works on the real-world Middlebury dataset and in a real robot demonstration

Abstract

We design a multiscopic vision system that utilizes a low-cost monocular RGB camera to acquire accurate depth estimation. Unlike multi-view stereo with images captured at unconstrained camera poses, the proposed system controls the motion of a camera to capture a sequence of images in horizontally or vertically aligned positions with the same parallax. In this system, we propose a new heuristic method and a robust learning-based method to fuse multiple cost volumes between the reference image and its surrounding images. To obtain training data, we build a synthetic dataset with multiscopic images. The experiments on the real-world Middlebury dataset and real robot demonstration show that our multiscopic vision system outperforms traditional two-frame stereo matching methods in depth estimation. Our code and dataset are available at https://sites.google.com/view/multiscopic.
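
The idea of fusing cost volumes from equally spaced views can be illustrated with a minimal sketch on a single scanline. It is not the paper's heuristic or learned fusion method: the absolute-difference matching cost, the element-wise minimum used for fusion, the sign convention for disparity, and all function names (`cost_volume`, `fuse_min`, `winner_take_all`) are assumptions made for illustration. Because the views share the same parallax, a pixel with disparity d in the reference view is assumed to appear at x + d in one neighbor and at x - d in the other, so both cost volumes are indexed by the same d and can be combined entry by entry.

```python
INVALID = float("inf")  # cost assigned when the matching pixel falls outside the image


def cost_volume(ref, other, max_disp, direction):
    """Absolute-difference cost volume for one scanline.

    volume[d][x] = |ref[x] - other[x + direction * d]|
    direction is +1 or -1 depending on which side the neighbor view lies
    (the sign convention here is an assumption of this sketch).
    """
    width = len(ref)
    volume = []
    for d in range(max_disp + 1):
        row = []
        for x in range(width):
            xs = x + direction * d
            row.append(abs(ref[x] - other[xs]) if 0 <= xs < width else INVALID)
        volume.append(row)
    return volume


def fuse_min(volumes):
    """Fuse cost volumes by taking the element-wise minimum (illustrative stand-in)."""
    return [[min(costs) for costs in zip(*rows)] for rows in zip(*volumes)]


def winner_take_all(volume):
    """Pick, for each pixel, the disparity with the lowest cost."""
    width = len(volume[0])
    return [min(range(len(volume)), key=lambda d: volume[d][x]) for x in range(width)]


# Synthetic scanline whose true disparity is 2 everywhere.
# -1 marks pixels that the neighbor view cannot see.
ref = [10, 40, 20, 90, 30, 70, 50, 80]
left = [-1, -1, 10, 40, 20, 90, 30, 70]   # ref[x] appears at x + 2
right = [20, 90, 30, 70, 50, 80, -1, -1]  # ref[x] appears at x - 2

left_volume = cost_volume(ref, left, max_disp=3, direction=+1)
right_volume = cost_volume(ref, right, max_disp=3, direction=-1)
fused_volume = fuse_min([left_volume, right_volume])

left_only = winner_take_all(left_volume)
fused = winner_take_all(fused_volume)
```

In this toy case the single-pair estimate fails near the image border, where the matching pixel leaves the neighbor's field of view, while the fused volume recovers the correct disparity at every pixel because each border pixel is still visible in the opposite view.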
