TL;DR
This paper introduces VLocNet, a multitask CNN architecture for visual localization and odometry that outperforms existing deep learning methods and rivals traditional feature-based techniques.
Contribution
The paper presents a novel multitask CNN and a new loss function that uses relative pose information as an auxiliary training signal to improve pose estimation accuracy.
Findings
VLocNet exceeds state-of-the-art deep architectures in global localization.
The model achieves competitive visual odometry performance.
Multitask learning with the Geometric Consistency Loss enhances localization accuracy.
Abstract
Localization is an indispensable component of a robot's autonomy stack that enables it to determine where it is in the environment, essentially making it a precursor for any action execution or planning. Although convolutional neural networks have shown promising results for visual localization, they are still grossly outperformed by state-of-the-art local feature-based techniques. In this work, we propose VLocNet, a new convolutional neural network architecture for 6-DoF global pose regression and odometry estimation from consecutive monocular images. Our multitask model incorporates hard parameter sharing, thus being compact and enabling real-time inference, in addition to being end-to-end trainable. We propose a novel loss function that utilizes auxiliary learning to leverage relative pose information during training, thereby constraining the search space to obtain consistent pose…
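As a rough illustration of how relative pose information could constrain a global pose estimate during training, the sketch below adds a relative-motion term to a per-frame pose loss. It is not the paper's exact formulation: the function names, the `beta` weight, the Euclidean norms, and the simplified world-frame relative translation are all assumptions made for illustration.

```python
import numpy as np

def quat_conj(q):
    # Conjugate of a quaternion stored as (w, x, y, z); equals the inverse for unit quaternions.
    w, x, y, z = q
    return np.array([w, -x, -y, -z])

def quat_mul(a, b):
    # Hamilton product of two quaternions stored as (w, x, y, z).
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])

def relative_pose(t_prev, q_prev, t_curr, q_curr):
    # Assumption: relative translation is taken as a plain world-frame difference,
    # and relative rotation as q_prev^{-1} * q_curr (unit quaternions assumed).
    return t_curr - t_prev, quat_mul(quat_conj(q_prev), q_curr)

def pose_loss(t_pred, q_pred, t_gt, q_gt, beta=1.0):
    # Euclidean translation error plus a beta-weighted quaternion error.
    # Note: this ignores the q / -q sign ambiguity of quaternions.
    return np.linalg.norm(t_pred - t_gt) + beta * np.linalg.norm(q_pred - q_gt)

def consistency_loss(pred_prev, pred_curr, gt_prev, gt_curr, beta=1.0):
    # Each pose argument is a (translation, quaternion) tuple.
    # Global term: error of the current-frame pose prediction.
    global_term = pose_loss(*pred_curr, *gt_curr, beta=beta)
    # Relative term: the motion implied by two consecutive predictions
    # should match the ground-truth relative motion.
    rel_pred = relative_pose(*pred_prev, *pred_curr)
    rel_gt = relative_pose(*gt_prev, *gt_curr)
    relative_term = pose_loss(*rel_pred, *rel_gt, beta=beta)
    return global_term + relative_term

# Toy usage with identity rotations (hypothetical values).
identity = np.array([1.0, 0.0, 0.0, 0.0])
gt_prev = (np.array([0.0, 0.0, 0.0]), identity)
gt_curr = (np.array([1.0, 0.0, 0.0]), identity)
pred_curr = (np.array([1.5, 0.0, 0.0]), identity)

# Previous prediction is exact, so the 0.5 m error shows up in both terms.
loss_inconsistent = consistency_loss(gt_prev, pred_curr, gt_prev, gt_curr)

# Previous prediction shares the same 0.5 m offset, so only the global term is non-zero.
pred_prev_shifted = (np.array([0.5, 0.0, 0.0]), identity)
loss_consistent = consistency_loss(pred_prev_shifted, pred_curr, gt_prev, gt_curr)
```

In this toy setup the relative term penalizes predictions whose implied frame-to-frame motion disagrees with the ground truth, on top of the per-frame global error. In a real training setup the two terms would typically be weighted and computed on network outputs in an autodiff framework.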
