Improved monocular depth prediction using distance transform over pre-semantic contours with self-supervised neural networks

Marwane Hariat; Antoine Manzanera; David Filliat

arXiv:2605.08320 · eess.IV · May 14, 2026

TL;DR

This paper introduces a self-supervised monocular depth estimation method that applies a distance transform over pre-semantic contours to improve accuracy in low-texture areas, and reports better results than existing self-supervised methods on KITTI, Cityscapes, Waymo, NYUv2, and ScanNet.

Contribution

Integrating a distance transform over pre-semantic contours into self-supervised monocular depth estimation enriches spatial information in uniform regions and yields more effective training losses for depth and ego-motion.

Findings

01. Outperforms existing self-supervised methods on the KITTI, Cityscapes, Waymo, NYUv2, and ScanNet datasets.

02. Theoretically proves the optimality of the distance transform for variance augmentation.

03. Improves depth and ego-motion estimation in low-texture regions.

Abstract

Monocular depth estimation (MDE) with self-supervised training struggles in low-texture areas, where photometric losses may lead to ambiguous depth predictions. To address this, we propose a novel technique that enhances spatial information by applying a distance transform over pre-semantic contours, augmenting discriminative power in low-texture regions. Our approach jointly estimates pre-semantic contours, depth, and ego-motion. The pre-semantic contours are used to produce new input images whose variance in uniform areas is augmented by the distance transform. This results in more effective loss functions, enhancing the training process for depth and ego-motion. We demonstrate theoretically that the distance transform is the optimal variance-augmenting technique in this context. Through extensive experiments on KITTI, Cityscapes, Waymo, NYUv2, and ScanNet, our…
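
As a rough illustration of why a distance transform adds variance to a uniform area, the sketch below builds a perfectly flat toy patch, marks a square outline as its contour, and compares the variance of the raw patch with that of its distance map. This is not the authors' pipeline: the contour map is hand-made rather than predicted by a network, the helper names (`distance_transform`, `make_square_contour`) are hypothetical, and the brute-force search is used only to keep the example dependency-free. In practice a library routine such as `scipy.ndimage.distance_transform_edt` would normally be used instead.

```python
import math
from statistics import pvariance


def distance_transform(contours):
    """Brute-force Euclidean distance from each pixel to the nearest contour pixel.

    `contours` is a 2D list of 0/1 values, where 1 marks a contour pixel.
    (Hypothetical helper for illustration; quadratic cost, toy sizes only.)
    """
    h, w = len(contours), len(contours[0])
    edge_pixels = [(r, c) for r in range(h) for c in range(w) if contours[r][c]]
    if not edge_pixels:
        raise ValueError("contour map has no contour pixels")
    return [
        [min(math.hypot(r - er, c - ec) for er, ec in edge_pixels) for c in range(w)]
        for r in range(h)
    ]


def make_square_contour(size):
    """Toy contour map (assumption): the outline of a size x size square is marked 1."""
    last = size - 1
    return [
        [1 if r in (0, last) or c in (0, last) else 0 for c in range(size)]
        for r in range(size)
    ]


size = 9
image = [[0.5] * size for _ in range(size)]  # uniform, textureless patch
contours = make_square_contour(size)
dist = distance_transform(contours)

# Compare variance inside the region enclosed by the contour.
interior = [(r, c) for r in range(1, size - 1) for c in range(1, size - 1)]
var_image = pvariance([image[r][c] for r, c in interior])
var_dist = pvariance([dist[r][c] for r, c in interior])

print(f"variance of raw patch:    {var_image:.3f}")
print(f"variance of distance map: {var_dist:.3f}")
```

The raw patch carries no signal that a photometric loss could use to distinguish one pixel from its neighbour, whereas the distance map varies smoothly from the contour towards the centre of the region, so every interior pixel gets a value tied to its position.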

Videos

Improved Monocular Depth Prediction Using Distance Transform Over Pre-semantic Contours with Self-supervised Neural Networks (SlidesLive)