Large Scale Joint Semantic Re-Localisation and Scene Understanding via Globally Unique Instance Coordinate Regression

Ignas Budvytis; Marvin Teichmann; Tomas Vojir; Roberto Cipolla

arXiv:1909.10239·cs.CV·September 24, 2019

TL;DR

This paper introduces a novel joint approach for semantic localisation and scene understanding that predicts 3D geometry and camera pose simultaneously, outperforming existing methods on real and synthetic datasets.

Contribution

It proposes a two-step neural network method for scene coordinate regression that scales to larger maps and integrates object recognition with local coordinate prediction.

Findings

1. Achieves smaller pose estimation errors than state-of-the-art methods.

2. Effectively predicts accurate 3D geometry of static objects.

3. Scales to maps several orders of magnitude larger than previous approaches.

Abstract

In this work we present a novel approach to joint semantic localisation and scene understanding. Our work is motivated by the need for localisation algorithms which not only predict 6-DoF camera pose but also simultaneously recognise surrounding objects and estimate 3D geometry. Such capabilities are crucial for computer-vision-guided systems which interact with the environment: autonomous driving, augmented reality and robotics. In particular, we propose a two-step procedure. During the first step we train a convolutional neural network to jointly predict per-pixel globally unique instance labels and corresponding local coordinates for each instance of a static object (e.g. a building). During the second step we obtain scene coordinates by combining object center coordinates and local coordinates and use them to perform 6-DoF camera pose estimation. We evaluate our approach on real…
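The combination in the second step can be illustrated with a minimal sketch. It assumes the network output is available as a per-pixel instance label map and a per-pixel local coordinate map, and that a table of object center coordinates exists for every instance in the map. The function name `scene_coordinates`, the data layout, and the convention that label 0 marks pixels with no static object are all hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def scene_coordinates(
    instance_labels: List[List[int]],
    local_coords: List[List[Vec3]],
    instance_centers: Dict[int, Vec3],
) -> List[List[Optional[Vec3]]]:
    """Per-pixel scene coordinate = center of the predicted instance + local coordinate.

    Pixels whose label has no entry in `instance_centers` (assumed here to be
    non-static or unknown content) yield None.
    """
    scene: List[List[Optional[Vec3]]] = []
    for label_row, local_row in zip(instance_labels, local_coords):
        row: List[Optional[Vec3]] = []
        for label, local in zip(label_row, local_row):
            center = instance_centers.get(label)
            if center is None:
                row.append(None)
            else:
                # Element-wise sum of the instance center and the local offset.
                row.append(tuple(c + l for c, l in zip(center, local)))
        scene.append(row)
    return scene


# Toy 2x2 "image": two pixels on instance 1, one on instance 2, one unlabelled.
labels = [[1, 1], [2, 0]]
local = [
    [(0.5, 0.0, -1.0), (1.0, 2.0, 0.0)],
    [(0.0, 0.0, 0.5), (0.0, 0.0, 0.0)],
]
centers = {1: (100.0, 20.0, 5.0), 2: (-40.0, 7.0, 3.5 - 0.5)}
coords = scene_coordinates(labels, local, centers)
```

Each non-None entry pairs a 2D pixel location with a 3D scene point. Such 2D-3D correspondences are the usual input to a 6-DoF camera pose solver; the abstract excerpt does not state which solver is used, so that step is left out of the sketch.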
