VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry
Noha Radwan, Abhinav Valada, Wolfram Burgard

TL;DR
VLocNet++ is a multitask deep learning architecture that jointly learns semantic understanding, 6-DoF pose regression, and odometry, improving localization accuracy and robustness in urban outdoor environments by integrating geometric and semantic cues.
Contribution
The paper introduces VLocNet++, a novel multitask network that combines semantic segmentation, pose regression, and odometry with a new fusion layer and self-supervised warping for enhanced localization.
Findings
Outperforms state-of-the-art methods on Microsoft 7-Scenes and DeepLoc datasets.
Achieves robust localization in challenging urban outdoor scenarios.
Effectively integrates semantic and geometric information for improved pose estimation.
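To make the notion of 6-DoF localization accuracy concrete, the sketch below computes a translation error and a rotation error for one predicted pose. It assumes poses are given as a 3-vector position plus a quaternion in (w, x, y, z) order. The function name `pose_error` and this representation are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def pose_error(t_pred, q_pred, t_true, q_true):
    """Return (translation error, rotation error in degrees) for one pose.

    Hypothetical helper: positions are 3-vectors, orientations are
    quaternions in (w, x, y, z) order. Not the authors' evaluation code.
    """
    t_pred = np.asarray(t_pred, dtype=float)
    t_true = np.asarray(t_true, dtype=float)
    q_pred = np.asarray(q_pred, dtype=float)
    q_true = np.asarray(q_true, dtype=float)

    # Normalize so both quaternions represent valid rotations.
    q_pred = q_pred / np.linalg.norm(q_pred)
    q_true = q_true / np.linalg.norm(q_true)

    # Translation error: Euclidean distance between the two positions.
    trans_err = float(np.linalg.norm(t_pred - t_true))

    # Rotation error: angle of the relative rotation. q and -q encode the
    # same rotation, so take the absolute value of the dot product.
    d = np.clip(abs(np.dot(q_pred, q_true)), 0.0, 1.0)
    rot_err = float(np.degrees(2.0 * np.arccos(d)))
    return trans_err, rot_err
```

Calling `pose_error` on every test frame and taking the median of each output gives one common way to summarize localization accuracy over a sequence.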
Abstract
Semantic understanding and localization are fundamental enablers of robot autonomy that have for the most part been tackled as disjoint problems. While deep learning has enabled recent breakthroughs across a wide spectrum of scene understanding tasks, its applicability to state estimation tasks has been limited due to the direct formulation that renders it incapable of encoding scene-specific constraints. In this work, we propose the VLocNet++ architecture that employs a multitask learning approach to exploit the inter-task relationship between learning semantics, regressing 6-DoF global pose and odometry, for the mutual benefit of each of these tasks. Our network overcomes the aforementioned limitation by simultaneously embedding geometric and semantic knowledge of the world into the pose regression network. We propose a novel adaptive weighted fusion layer to aggregate motion-specific…
