VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry

Noha Radwan; Abhinav Valada; Wolfram Burgard

arXiv:1804.08366·cs.RO·October 12, 2018

TL;DR

VLocNet++ is a multitask deep learning architecture that jointly learns semantic understanding, 6-DoF pose regression, and odometry, improving localization accuracy and robustness in urban outdoor environments by integrating geometric and semantic cues.

Contribution

The paper introduces VLocNet++, a novel multitask network that combines semantic segmentation, pose regression, and odometry with a new fusion layer and self-supervised warping for enhanced localization.
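
As a rough illustration of what an adaptive weighted fusion of two feature maps could look like, the NumPy sketch below scales each input per channel by a weight, concatenates the results, mixes them with a 1x1-convolution-style linear map, and applies a ReLU. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name `adaptive_weighted_fusion`, the per-channel weighting scheme, the tensor shapes, and all parameter names are hypothetical, and in a real network the weights would be learned rather than passed in by hand.

```python
import numpy as np


def adaptive_weighted_fusion(feat_a, feat_b, w_a, w_b, mix, bias):
    """Fuse two feature maps of shape (C, H, W) into one of shape (C_out, H, W).

    Hypothetical sketch (assumed design, not taken from the paper):
      w_a, w_b : per-channel scaling weights, shape (C,)
      mix      : channel-mixing matrix, shape (C_out, 2 * C), acting like a 1x1 convolution
      bias     : per-output-channel bias, shape (C_out,)
    """
    # Scale every channel of each input by its own weight.
    scaled_a = feat_a * w_a[:, None, None]
    scaled_b = feat_b * w_b[:, None, None]

    # Stack the two weighted maps along the channel axis: (2C, H, W).
    stacked = np.concatenate([scaled_a, scaled_b], axis=0)

    # Mix channels at every spatial location, then add the bias.
    mixed = np.einsum("oc,chw->ohw", mix, stacked) + bias[:, None, None]

    # ReLU non-linearity.
    return np.maximum(mixed, 0.0)


# Toy usage with hand-picked (not learned) parameters.
feat_a = np.ones((2, 3, 3))
feat_b = 2.0 * np.ones((2, 3, 3))
w_a = np.array([1.0, 0.0])
w_b = np.array([0.5, 0.0])
mix = np.array([[1.0, 0.0, 1.0, 0.0],
                [0.0, 1.0, 0.0, 1.0]])
bias = np.array([0.0, -1.0])

fused = adaptive_weighted_fusion(feat_a, feat_b, w_a, w_b, mix, bias)
```

The point of the per-channel weights in this sketch is that channels from either input can be emphasized or suppressed before mixing, so the fusion can adapt to which source is more informative for a given channel.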

Findings

01

Outperforms state-of-the-art methods on Microsoft 7-Scenes and DeepLoc datasets.

02

Achieves robust localization in challenging urban outdoor scenarios.

03

Effectively integrates semantic and geometric information for improved pose estimation.

Abstract

Semantic understanding and localization are fundamental enablers of robot autonomy that have for the most part been tackled as disjoint problems. While deep learning has enabled recent breakthroughs across a wide spectrum of scene understanding tasks, its applicability to state estimation tasks has been limited due to the direct formulation that renders it incapable of encoding scene-specific constraints. In this work, we propose the VLocNet++ architecture that employs a multitask learning approach to exploit the inter-task relationship between learning semantics, regressing 6-DoF global pose and odometry, for the mutual benefit of each of these tasks. Our network overcomes the aforementioned limitation by simultaneously embedding geometric and semantic knowledge of the world into the pose regression network. We propose a novel adaptive weighted fusion layer to aggregate motion-specific…
