Semantically-Guided Representation Learning for Self-Supervised Monocular Depth

Vitor Guizilini; Rui Hou; Jie Li; Rares Ambrus; Adrien Gaidon

arXiv:2002.12319·cs.CV·February 28, 2020·47 cites

Open Access · 1 Repo

TL;DR

This paper introduces a novel self-supervised monocular depth estimation method that uses fixed pretrained semantic segmentation networks to guide representation learning, improving accuracy across all pixels and semantic categories.

Contribution

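The paper proposes a new architecture in which fixed pretrained semantic segmentation networks guide self-supervised depth representation learning through pixel-adaptive convolutions, together with a two-stage training process that uses resampling to counter semantic bias on dynamic objects.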

Findings

01. Improves upon state-of-the-art self-supervised methods in depth prediction accuracy.

02. Improves depth estimation for fine-grained details and across semantic categories.

03. Mitigates a common semantic bias on dynamic objects through two-stage training with resampling.

Abstract

Self-supervised learning is showing great promise for monocular depth estimation, using geometry as the only source of supervision. Depth networks are indeed capable of learning representations that relate visual appearance to 3D properties by implicitly leveraging category-level patterns. In this work we investigate how to leverage more directly this semantic structure to guide geometric representation learning, while remaining in the self-supervised regime. Instead of using semantic labels and proxy losses in a multi-task approach, we propose a new architecture leveraging fixed pretrained semantic segmentation networks to guide self-supervised representation learning via pixel-adaptive convolutions. Furthermore, we propose a two-stage training process to overcome a common semantic bias on dynamic objects via resampling. Our method improves upon the state of the art for self-supervised…
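
To make the idea of guiding depth features with semantic features via pixel-adaptive convolutions more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a single-channel feature map, zero padding, and a fixed unit-variance Gaussian kernel on the guidance features, and the function name `pixel_adaptive_conv` and the toy inputs are hypothetical. Each spatial filter weight is rescaled by how similar the guidance (semantic) features are at the center pixel and at the neighbor, so information is mixed mostly within regions that the segmentation network treats as alike.

```python
import math


def pixel_adaptive_conv(values, guidance, weights, bias=0.0):
    """Single-channel pixel-adaptive convolution (illustrative sketch).

    values:   H x W nested list of floats (features to be filtered).
    guidance: H x W nested list of feature vectors (e.g. semantic features).
    weights:  k x k spatial filter with odd k, shared by all pixels.
    """
    h, w = len(values), len(values[0])
    r = len(weights) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = bias
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ni, nj = i + di, j + dj
                    if not (0 <= ni < h and 0 <= nj < w):
                        continue  # zero padding: skip out-of-bounds neighbors
                    # Assumed Gaussian kernel on guidance-feature distance.
                    d2 = sum(
                        (a - b) ** 2
                        for a, b in zip(guidance[i][j], guidance[ni][nj])
                    )
                    adapt = math.exp(-0.5 * d2)
                    acc += adapt * weights[di + r][dj + r] * values[ni][nj]
            out[i][j] = acc
    return out


# Toy input: two regions with different feature values (1.0 and 5.0).
values = [[1.0, 1.0, 5.0, 5.0] for _ in range(3)]
# Hypothetical guidance features: the two regions are far apart.
semantic = [[[0.0], [0.0], [10.0], [10.0]] for _ in range(3)]
# Constant guidance turns the operation back into a standard convolution.
constant = [[[0.0]] * 4 for _ in range(3)]
box = [[1.0 / 9.0] * 3 for _ in range(3)]

guided = pixel_adaptive_conv(values, semantic, box)
plain = pixel_adaptive_conv(values, constant, box)

print(round(guided[1][1], 3))  # 0.667: only same-region neighbors contribute
print(round(plain[1][1], 3))   # 2.333: the box filter blurs across the boundary
```

With constant guidance the adaptive term is 1 everywhere and the operation reduces to an ordinary convolution, which is why the second result mixes the two regions. In a real network both the spatial filter and the depth features would be learned, and only the segmentation network producing the guidance would stay fixed.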

Code & Models

Repositories

TRI-ML/packnet-sfm
PyTorch · Official

Taxonomy

Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Image Processing Techniques and Applications