Warp-Refine Propagation: Semi-Supervised Auto-labeling via Cycle-consistency
Aditya Ganeshan, Alexis Vallet, Yasunori Kudo, Shin-ichi Maeda, Tommi Kerola, Rares Ambrus, Dennis Park, Adrien Gaidon

TL;DR
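This paper introduces Warp-Refine Propagation, a semi-supervised method that combines semantic and geometric cues, using cycle consistency across time, to auto-label video frames. It improves label propagation by 13.1 mIoU on ApolloScape, and training on the auto-labelled frames gives competitive semantic segmentation results.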
Contribution
It presents a novel label propagation technique that refines geometrically-warped labels with semantic priors using cycle consistency, advancing semi-supervised video annotation.
Findings
Improves label propagation by 13.1 mIoU on ApolloScape
Achieves state-of-the-art results on NYU-V2 and KITTI datasets
Matches best results on Cityscapes dataset
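The findings above are reported in mIoU (mean intersection-over-union). As a reminder of what that metric measures, here is a minimal NumPy sketch of the standard definition. The function name `mean_iou`, the `ignore_index` value of 255, and the choice to average only over classes that appear in either map are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes present in prediction or ground truth.

    Assumes every prediction at a valid pixel lies in [0, num_classes).
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    # Drop pixels without a ground-truth label (hypothetical convention).
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground truth, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Average only over classes that occur at least once.
    present = union > 0
    return float((intersection[present] / union[present]).mean())

pred = [[0, 1], [1, 1]]
target = [[0, 1], [0, 1]]
score = mean_iou(pred, target, num_classes=2)
print(round(score, 4))
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. A gain of 13.1 mIoU corresponds to a 0.131 increase in this quantity when it is expressed as a fraction.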
Abstract
Deep learning models for semantic segmentation rely on expensive, large-scale, manually annotated datasets. Labelling is a tedious process that can take hours per image. Automatically annotating video sequences by propagating sparsely labeled frames through time is a more scalable alternative. In this work, we propose a novel label propagation method, termed Warp-Refine Propagation, that combines semantic cues with geometric cues to efficiently auto-label videos. Our method learns to refine geometrically-warped labels and infuse them with learned semantic priors in a semi-supervised setting by leveraging cycle consistency across time. We quantitatively show that our method improves label-propagation by a noteworthy margin of 13.1 mIoU on the ApolloScape dataset. Furthermore, by training with the auto-labelled frames, we achieve competitive results on three semantic-segmentation…
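To make the two ingredients named in the abstract more concrete, the sketch below shows (1) a purely geometric label warp driven by a dense displacement field and (2) a forward-then-backward cycle check that scores how well labels survive the round trip. This is an illustrative assumption-laden toy, not the authors' method: the paper learns a refinement of the warped labels, which is not modelled here, and the names `warp_labels`, `cycle_agreement`, the `(H, W, 2)` flow layout, and the fill value 255 are all hypothetical.

```python
import numpy as np

def warp_labels(labels, flow_to_src, fill=255):
    """Nearest-neighbour warp: out[y, x] = labels[y + dy, x + dx].

    flow_to_src[..., 0] is dx and flow_to_src[..., 1] is dy, pointing from
    each output pixel to the source pixel it should copy its label from.
    """
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow_to_src[..., 0]).astype(np.int64)
    src_y = np.rint(ys + flow_to_src[..., 1]).astype(np.int64)

    # Pixels whose source falls outside the image stay unlabeled.
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.full((h, w), fill, dtype=labels.dtype)
    out[inside] = labels[src_y[inside], src_x[inside]]
    return out

def cycle_agreement(labels, flow_there, flow_back, fill=255):
    """Warp labels to another frame and back; return the fraction of
    still-labeled pixels that match the original labels."""
    there = warp_labels(labels, flow_there, fill)
    back = warp_labels(there, flow_back, fill)
    valid = back != fill
    if not valid.any():
        return 0.0
    return float((back[valid] == labels[valid]).mean())

# Toy example: three vertical stripes and a one-pixel horizontal shift.
labels = np.tile(np.array([0, 1, 2]), (3, 1))
shift = np.zeros((3, 3, 2))
shift[..., 0] = 1.0

consistent = cycle_agreement(labels, shift, -shift)
inconsistent = cycle_agreement(labels, shift, np.zeros((3, 3, 2)))
```

When the return warp exactly undoes the forward warp, the agreement is perfect on the pixels that remain labeled; when it does not, the disagreement with the known labels is the kind of signal a cycle-consistency objective can exploit without needing annotations on the intermediate frames.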
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
