TL;DR
This paper introduces Neural Ray Surfaces, a novel neural network approach that learns accurate depth and ego-motion from raw videos across various camera types without prior camera model knowledge.
Contribution
We propose Neural Ray Surfaces (NRS), a differentiable convolutional model that approximates pixel-wise projection rays for diverse camera systems, enabling self-supervised depth and ego-motion learning without known camera parameters.
Findings
NRS can accurately model different camera types including fisheye and catadioptric.
The method achieves state-of-the-art results on benchmark datasets.
NRS enables self-supervised learning without prior camera calibration.
Abstract
Self-supervised learning has emerged as a powerful tool for depth and ego-motion estimation, leading to state-of-the-art results on benchmark datasets. However, one significant limitation shared by current methods is the assumption of a known parametric camera model -- usually the standard pinhole geometry -- leading to failure when applied to imaging systems that deviate significantly from this assumption (e.g., catadioptric cameras or underwater imaging). In this work, we show that self-supervision can be used to learn accurate depth and ego-motion estimation without prior knowledge of the camera model. Inspired by the geometric model of Grossberg and Nayar, we introduce Neural Ray Surfaces (NRS), convolutional networks that represent pixel-wise projection rays, approximating a wide range of cameras. NRS are fully differentiable and can be learned end-to-end from unlabeled raw videos.…
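To make the idea of pixel-wise projection rays concrete, here is a minimal NumPy sketch of how a per-pixel ray surface and a depth map combine into 3D points. It is an illustration under stated assumptions, not the paper's implementation: the function names `pinhole_ray_surface` and `lift_to_3d` are hypothetical, depth is assumed to mean distance along a unit-length ray, and the rays are computed from pinhole intrinsics only to provide a familiar special case. In NRS the ray surface would instead be predicted by a network and learned from video.

```python
import numpy as np

def pinhole_ray_surface(height, width, fx, fy, cx, cy):
    """Unit viewing ray per pixel for an ideal pinhole camera (illustrative special case)."""
    # u indexes columns, v indexes rows; both have shape (height, width).
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    rays = np.stack(
        [(u - cx) / fx, (v - cy) / fy, np.ones_like(u, dtype=float)],
        axis=-1,
    )
    # Normalize so each ray has unit length.
    return rays / np.linalg.norm(rays, axis=-1, keepdims=True)

def lift_to_3d(depth, ray_surface):
    """Scale each pixel's unit ray by its depth (assumed here: distance along the ray)."""
    return depth[..., None] * ray_surface

# Hypothetical 4x6 image with made-up intrinsics and a constant depth of 2.
rays = pinhole_ray_surface(4, 6, fx=100.0, fy=100.0, cx=2.0, cy=1.0)
points = lift_to_3d(np.full((4, 6), 2.0), rays)
```

Because the lifting step only needs a ray per pixel, swapping the pinhole rays for a learned (H, W, 3) array leaves `lift_to_3d` unchanged, which is what lets a ray-based representation cover cameras that no single parametric model describes.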
