Semantically-Guided Representation Learning for Self-Supervised Monocular Depth
Vitor Guizilini, Rui Hou, Jie Li, Rares Ambrus, Adrien Gaidon

TL;DR
This paper introduces a novel self-supervised monocular depth estimation method that uses fixed pretrained semantic segmentation networks to guide representation learning, improving accuracy across all pixels and semantic categories.
Contribution
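Proposes a new architecture in which fixed pretrained semantic segmentation networks guide self-supervised depth representation learning via pixel-adaptive convolutions, together with a two-stage training process that counters a common semantic bias on dynamic objects.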
Findings
Improves upon the state of the art for self-supervised monocular depth prediction.
Improves depth estimates on fine-grained details and within individual semantic categories.
Mitigates a common semantic bias on dynamic objects through two-stage training with resampling.
Abstract
Self-supervised learning is showing great promise for monocular depth estimation, using geometry as the only source of supervision. Depth networks are indeed capable of learning representations that relate visual appearance to 3D properties by implicitly leveraging category-level patterns. In this work we investigate how to leverage more directly this semantic structure to guide geometric representation learning, while remaining in the self-supervised regime. Instead of using semantic labels and proxy losses in a multi-task approach, we propose a new architecture leveraging fixed pretrained semantic segmentation networks to guide self-supervised representation learning via pixel-adaptive convolutions. Furthermore, we propose a two-stage training process to overcome a common semantic bias on dynamic objects via resampling. Our method improves upon the state of the art for self-supervised…
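The abstract names pixel-adaptive convolutions as the mechanism by which semantic features guide the depth network. As a rough illustration of that general idea, the sketch below modulates an ordinary convolution with a Gaussian similarity kernel computed on per-pixel guidance features, so that neighbours with dissimilar guidance contribute little. This is a minimal NumPy sketch and not the authors' implementation: the function name `pixel_adaptive_conv`, the tensor layouts, the Gaussian kernel choice, and the zero padding are all assumptions made for illustration.

```python
import numpy as np


def pixel_adaptive_conv(v, f, w, b=0.0):
    """Illustrative pixel-adaptive convolution (hypothetical helper).

    v: (H, W, C_in) input features (e.g. depth-decoder activations)
    f: (H, W, D) guidance features (e.g. from a fixed semantic network)
    w: (k, k, C_in, C_out) spatial filter shared across all pixels, k odd
    b: scalar or (C_out,) bias
    """
    k = w.shape[0]
    r = k // 2
    H, W, _ = v.shape
    # Zero-pad so every pixel has a full k x k neighbourhood (assumption).
    vp = np.pad(v, ((r, r), (r, r), (0, 0)))
    fp = np.pad(f, ((r, r), (r, r), (0, 0)))
    out = np.zeros((H, W, w.shape[3]))
    for dy in range(k):
        for dx in range(k):
            v_n = vp[dy:dy + H, dx:dx + W]  # neighbour features at this offset
            f_n = fp[dy:dy + H, dx:dx + W]  # neighbour guidance at this offset
            # Gaussian similarity between centre and neighbour guidance, (H, W, 1).
            kern = np.exp(-0.5 * np.sum((f - f_n) ** 2, axis=-1, keepdims=True))
            # Weight the neighbour by similarity, then apply the shared filter tap.
            out += (kern * v_n) @ w[dy, dx]
    return out + b


# Usage with random inputs; all shapes are arbitrary.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 8, 4))
guidance = rng.normal(size=(8, 8, 3))
weights = rng.normal(size=(3, 3, 4, 2))
guided = pixel_adaptive_conv(features, guidance, weights)
```

With constant guidance the similarity kernel equals 1 everywhere and the operation reduces to a standard zero-padded convolution. With guidance that changes sharply across a semantic boundary, contributions from the other side of the boundary are suppressed.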
Taxonomy
Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Image Processing Techniques and Applications
