Copy-Pasting Coherent Depth Regions Improves Contrastive Learning for Urban-Scene Segmentation

Liang Zeng; Attila Lengyel; Nergis Tömen; Jan van Gemert

arXiv:2211.14074·cs.CV·November 28, 2022

TL;DR

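This paper introduces a contrastive learning approach for urban-scene segmentation that copy-pastes coherent depth regions, built from estimated depth, to learn context-invariant representations. For unsupervised semantic segmentation it surpasses the previous state-of-the-art baseline by +7.14% mIoU on Cityscapes and +6.65% on KITTI, without pre-training on ImageNet or COCO.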

Contribution

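The paper proposes using estimated depth to group pixels into coherent depth regions and copy-pasting those regions into different contexts, so that contrastive learning builds cross-context correspondences and yields a context-invariant representation for urban-scene segmentation.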

Findings

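01 Surpasses the previous state-of-the-art baseline for unsupervised semantic segmentation by +7.14% mIoU on Cityscapes

02 Surpasses the previous state-of-the-art baseline for unsupervised semantic segmentation by +6.65% mIoU on KITTI

03 Does not require pre-training on ImageNet or COCO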
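
The findings above are reported in mIoU (mean intersection over union). As a reminder of what that metric measures, here is a minimal NumPy sketch of the standard definition. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration, not details taken from the paper's evaluation code.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average per-class IoU between two integer label maps of equal shape."""
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            # Assumption: a class missing from both maps is left out of the mean.
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```

For example, with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.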

Abstract

In this work, we leverage estimated depth to boost self-supervised contrastive learning for segmentation of urban scenes, where unlabeled videos are readily available for training self-supervised depth estimation. We argue that the semantics of a coherent group of pixels in 3D space is self-contained and invariant to the contexts in which they appear. We group coherent, semantically related pixels into coherent depth regions given their estimated depth and use copy-paste to synthetically vary their contexts. In this way, cross-context correspondences are built in contrastive learning and a context-invariant representation is learned. For unsupervised semantic segmentation of urban scenes, our method surpasses the previous state-of-the-art baseline by +7.14% in mIoU on Cityscapes and +6.65% on KITTI. For fine-tuning on Cityscapes and KITTI segmentation, our method is competitive with…
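
The grouping and copy-paste step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the depth map is split into regions by simple quantile binning, which stands in for however the paper actually forms coherent depth regions, and the names `depth_regions`, `copy_paste_region`, and `num_bins` are hypothetical.

```python
import numpy as np

def depth_regions(depth, num_bins=4):
    """Label each pixel with a depth bin (assumed stand-in for coherent depth regions)."""
    # Interior quantile edges, so each bin holds roughly the same number of pixels.
    edges = np.quantile(depth, np.linspace(0.0, 1.0, num_bins + 1)[1:-1])
    # np.digitize keeps the input shape and returns labels in [0, num_bins - 1].
    return np.digitize(depth, edges)

def copy_paste_region(src_img, src_regions, region_id, dst_img):
    """Paste the pixels of one source region onto a copy of the destination image."""
    mask = src_regions == region_id   # (H, W) boolean mask of the chosen region
    out = dst_img.copy()              # leave the destination image untouched
    out[mask] = src_img[mask]         # same pixels, new surrounding context
    return out, mask
```

Pasting the same region into two different destination images gives two views in which the masked pixels are identical but their surroundings differ. In a contrastive setup along the lines the abstract describes, features at those masked locations would presumably be treated as positives across the two views.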

Code & Models

Repositories

leungtsang/cpcdr (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Video Surveillance and Tracking Methods · Robotics and Sensor-Based Localization

Methods: Simple Copy-Paste · Contrastive Learning