MonoDVPS: A Self-Supervised Monocular Depth Estimation Approach to Depth-aware Video Panoptic Segmentation
Andra Petrovai, Sergiu Nedevschi

TL;DR
MonoDVPS is a multi-task network for depth-aware video panoptic segmentation. It combines self-supervised monocular depth estimation with semi-supervised video panoptic segmentation trained on unlabeled video, and adds panoptic-guided depth losses and a panoptic masking scheme for moving objects.
Contribution
A multi-task method that jointly performs monocular depth estimation and video panoptic segmentation, using self-supervision for depth and pseudo-labels for segmentation, together with panoptic-guided depth losses and a panoptic masking scheme for moving objects.
Findings
Achieves competitive results on Cityscapes-DVPS and SemKITTI-DVPS datasets.
Demonstrates fast inference speed.
Improves depth prediction with panoptic-guided losses and masking.
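To illustrate why masking helps, here is a minimal NumPy sketch of a photometric loss that skips pixels on moving instances. It is not the authors' implementation: the function name `masked_photometric_loss`, its arguments, and the assumption that the set of moving instance ids is already known are all hypothetical. The summary does not say how moving objects are identified.

```python
import numpy as np

def masked_photometric_loss(photo_error, panoptic, moving_ids):
    """Mean photometric error over pixels that are not on moving instances.

    photo_error: (H, W) per-pixel reprojection error (assumed precomputed).
    panoptic:    (H, W) integer instance/segment ids.
    moving_ids:  ids assumed to belong to moving objects (hypothetical input).
    """
    # True where the pixel belongs to a segment that is not flagged as moving.
    static = ~np.isin(panoptic, list(moving_ids))
    if not static.any():
        # Every pixel is masked out, so there is no training signal.
        return 0.0
    return float(photo_error[static].mean())
```

Self-supervised depth training typically assumes a static scene, so pixels on independently moving objects violate that assumption. Excluding them keeps their reprojection error out of the average.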
Abstract
Depth-aware video panoptic segmentation tackles the inverse projection problem of restoring panoptic 3D point clouds from video sequences, where the 3D points are augmented with semantic classes and temporally consistent instance identifiers. We propose a novel solution with a multi-task network that performs monocular depth estimation and video panoptic segmentation. Since acquiring ground truth labels for both depth and image segmentation has a relatively large cost, we leverage the power of unlabeled video sequences with self-supervised monocular depth estimation and semi-supervised learning from pseudo-labels for video panoptic segmentation. To further improve the depth prediction, we introduce panoptic-guided depth losses and a novel panoptic masking scheme for moving objects to avoid corrupting the training signal. Extensive experiments on the Cityscapes-DVPS and SemKITTI-DVPS…
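The inverse projection mentioned in the abstract can be sketched with the standard pinhole camera model, where a pixel (u, v) with depth d maps to the 3D point d · K⁻¹ · [u, v, 1]ᵀ. The NumPy sketch below is illustrative only and is not taken from the paper: the function name `backproject_panoptic` and the intrinsics matrix `K` are assumptions, and it handles a single frame without temporal instance tracking.

```python
import numpy as np

def backproject_panoptic(depth, panoptic, K):
    """Lift an (H, W) depth map to one 3D point per pixel, with panoptic ids.

    depth:    (H, W) predicted depth along the camera z axis.
    panoptic: (H, W) integer panoptic ids.
    K:        (3, 3) pinhole intrinsics matrix (assumed known).
    """
    h, w = depth.shape
    # Pixel grid in homogeneous coordinates, shape (3, H*W), row-major order.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u.ravel(), v.ravel(), np.ones(h * w)], axis=0)
    # Ray through each pixel at unit depth, then scaled by the predicted depth.
    rays = np.linalg.inv(K) @ pixels
    points = (rays * depth.ravel()).T  # (H*W, 3)
    return points, panoptic.ravel()
```

Each returned point carries the panoptic id of its source pixel, which gives the labeled point cloud for one frame. Keeping instance ids consistent across frames is the job of the video panoptic segmentation branch.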
Taxonomy
Topics: Advanced Vision and Imaging · Remote Sensing and LiDAR Applications · Industrial Vision Systems and Defect Detection
